Add test gym utils play. Fix #2729 #2743

gianlucadecola · 2022-04-10T15:55:49Z

Referring to #2729: to add tests for gym.utils.play, I have refactored the play() script and extracted a PlayableGame class for better separation and easier testing.
Running the whole game headless is possible, but pygame events do not seem to be available in that mode (https://www.pygame.org/wiki/HeadlessNoWindowsNeeded), so I prefer to mock them in tests and test the PlayableGame class directly.
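
For reference, the linked wiki page selects SDL's dummy video driver through an environment variable before pygame is initialised. A minimal sketch of just that step (nothing here is specific to this PR):

```python
import os

# Select SDL's "dummy" video driver so that no window is opened.
# This has to happen before pygame initialises its display module.
os.environ["SDL_VIDEODRIVER"] = "dummy"
```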

pseudo-rnd-thoughts · 2022-04-11T18:27:41Z

Thanks for making the PR, couple of things

Would you be able to add type hints or docstrings to PlayableGame, so that people using the class know what information it expects?

pseudo-rnd-thoughts · 2022-04-11T18:31:35Z

Also, you don't seem to have any tests for the primary play function. Could you add that, please

gianlucadecola · 2022-04-11T22:20:01Z

Added type hint 😃
As for testing the play() function, I'm quite unsure how to break out of the while loop, since it does not seem possible to inject events in headless mode. One solution might be to add a testing parameter to play() and tweak the function a little, but that seems like a hacky solution to me 🤔

pseudo-rnd-thoughts · 2022-04-12T12:48:08Z

I'm not experienced with pygame so this may not work
My idea would be to use a custom testing callback function with the play function that would add a new event action to the event queue that would be processed next by the game.
This could allow you to run the play function forward n time steps with a known input then compare the resulting game observation at n to a standard environment at n
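
The idea can be sketched without pygame. In the toy loop below, a plain deque stands in for pygame's event queue (a real test would presumably post events with pygame.event.post from inside the callback). The names run_loop and scripted_callback, and the two-argument callback signature, are hypothetical and made up for illustration; they are not the actual play() API:

```python
from collections import deque
from typing import Callable, Deque, List, Tuple

Event = Tuple[str, int]  # (event type, key code); stand-in for a pygame event
KEYDOWN, KEYUP, QUIT = "keydown", "keyup", "quit"


def run_loop(
    queue: Deque[Event], callback: Callable[[int, Deque[Event]], None]
) -> List[List[int]]:
    """Toy stand-in for play(): drain events, record pressed keys, call back."""
    pressed: List[int] = []
    history: List[List[int]] = []
    step = 0
    running = True
    while running:
        while queue:
            kind, key = queue.popleft()
            if kind == KEYDOWN:
                pressed.append(key)
            elif kind == KEYUP and key in pressed:
                pressed.remove(key)
            elif kind == QUIT:
                running = False
        if running:
            history.append(sorted(pressed))
            callback(step, queue)  # the callback injects the next events
            step += 1
    return history


def scripted_callback(step: int, queue: Deque[Event]) -> None:
    """Post a known input for the next step, then ask the loop to stop."""
    script = {
        0: [(KEYDOWN, ord("a"))],
        1: [(KEYUP, ord("a")), (KEYDOWN, ord("d"))],
        2: [(QUIT, 0)],
    }
    queue.extend(script.get(step, []))


history = run_loop(deque(), scripted_callback)
```

Because the callback decides when to post the quit event, the loop runs for a known number of steps and terminates on its own, which is what makes the comparison against a reference run possible.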

pseudo-rnd-thoughts · 2022-04-12T12:53:27Z

Looking at the blame of the play.py, there is a __main__ script at the end that appears to be at least 5 years old (42a42fd)
@jkterry1 Would it be ok to remove this __main__ script?

gianlucadecola · 2022-04-12T22:22:23Z

> I'm not experienced with pygame so this may not work. My idea would be to use a custom testing callback function with the play function that would add a new event action to the event queue that would be processed next by the game. This could allow you to run the play function forward n time steps with a known input then compare the resulting game observation at n to a standard environment at n

Well, it actually worked pretty well 🎉 Really good hint

pseudo-rnd-thoughts · 2022-04-12T23:27:47Z

Wow, I was half expecting that to not work, thanks for doing that

You probably have a reason but

1. Why are you using dummy env rather than an actual gym environment, i.e. cartpole?
2. Why are you using a dummy key event rather than an actual pygame event?

Extra test ideas
3. Could you extend the current test_play to run longer and assert that the final observations are equal? Checking only that the reward hasn't changed at all (its initial value is 0) doesn't seem to test the function properly to me
4. This may be too much, but we could run the play for all of the spec_list environments with random actions being generated in the callback (key down for the new action, key up for the old action)

Finally, do you think it is possible to add a test for PlayPlot? There doesn't seem to be much to test; maybe that the data saved is correct, particularly for len(plot_names) > 1.
Plus we should add a doc string and type hinting for the class if possible

gianlucadecola · 2022-04-13T00:13:23Z

> Wow, I was half expecting that to not work, thanks for doing that
>
> You probably have a reason but
> 1. Why are you using dummy env rather than an actual gym environment, i.e. cartpole?

I didn't like the idea of the test being tied to a particular environment. It seems more generic to me to have the dummy env just for the purpose of testing the play function

> 2. Why are you using a dummy key event rather than an actual pygame event?

This can actually be removed and replaced with an actual pygame event, I will commit the fix

> Extra test ideas
> 3. Could you extend the current test_play to be longer and assert the final observation are equal as checking that the reward hasn't changed at all (as the initial value is 0) doesn't seem to test the function properly to me
> 4. This may be too much, but we could run the play for all of the spec_list environments with random actions being generated in the callback (key down for the new action, key up for the old action)
>
> Finally, if you think it is possible to add a test for PlayPlot? There doesn't seem much to test, maybe the data saved is correct, particularly for len(plot_names) > 1. Plus we should add a doc string and type hinting for the class if possible

Good point on 3.
4 seems like overkill to me; maybe it could be done in another PR 🤔

I will try to look into PlayPlot, also 👍

pseudo-rnd-thoughts · 2022-04-13T15:21:33Z

I understand your point about using an abstract implementation (a dummy environment) rather than an actual implementation (i.e. cartpole), but I see two possible problems:

1. If someone changes the actual implementation of an environment in the future, they would have to know about this dummy environment and update it as well
2. It assumes that the dummy implementation is equivalent to an actual environment, which is possible, but that is easier to check if we just use an actual environment

Thanks for making those changes and looking into PlayPlot tests

jkterry1 · 2022-04-13T17:52:39Z

to answer an above question, removing old code that doesn't do anything is fine

RedTachyon

Overall looks pretty good, left a bunch of comments. Have you tried actually playing this with some environments? That is a "full" test which can't really be easily automated, but it's helpful to do it as a human.

RedTachyon · 2022-04-14T19:04:25Z

gym/utils/play.py



+class PlayableGame:
+    def __init__(self, env: Env, keys_to_action: dict = None, zoom: float = None):


keys_to_action: Optional[dict]

Even better if we can specify the key/value types on the dict (I guess values are arbitrary actions, but keys should be keypresses? how are they represented?)

It's Dict[Tuple[int, ...], int], where each key is a tuple of key codes such as (ord(key),)

gym/gym/envs/classic_control/mountain_car.py

Lines 240 to 242 in 5ae6bf9

    def get_keys_to_action(self):
        # Control with left and right arrow keys.
        return {(): 1, (276,): 0, (275,): 2, (275, 276): 1}

reference of the actions:

gym/gym/envs/classic_control/mountain_car.py

Lines 50 to 54 in 5ae6bf9

| Num | Observation             | Value | Unit         |
|-----|-------------------------|-------|--------------|
| 0   | Accelerate to the left  | Inf   | position (m) |
| 1   | Don't accelerate        | Inf   | position (m) |
| 2   | Accelerate to the right | Inf   | position (m) |

gym/gym/utils/play.py

Lines 125 to 135 in 9a1d4b6

    keys_to_action: dict: tuple(int) -> int or None
        Mapping from keys pressed to action performed.
        For example if pressed 'w' and space at the same time is supposed
        to trigger action number 2 then key_to_action dict would look like this:
            {
                # ...
                sorted(ord('w'), ord(' ')) -> 2
                # ...
            }
        If None, default key_to_action mapping for that env is used, if provided.
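
To make the key/value types concrete, here is a small sketch of such a mapping and of the lookup that play() performs with tuple(sorted(pressed_keys)). The KeysToAction alias and the make_key helper are hypothetical names used only for illustration, and the action numbers are arbitrary:

```python
from typing import Dict, List, Tuple

# Each key is a sorted tuple of key codes; each value is a discrete action.
KeysToAction = Dict[Tuple[int, ...], int]


def make_key(*chars: str) -> Tuple[int, ...]:
    """Build a dictionary key from the characters that are held down together."""
    return tuple(sorted(ord(c) for c in chars))


keys_to_action: KeysToAction = {
    make_key("w", " "): 2,  # 'w' and space pressed at the same time
    make_key("a"): 0,
    make_key("d"): 1,
}

# The order in which the keys were pressed does not matter after sorting.
pressed_keys: List[int] = [ord("w"), ord(" ")]
action = keys_to_action.get(tuple(sorted(pressed_keys)), 0)
```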

RedTachyon · 2022-04-14T19:05:41Z

gym/utils/play.py

+        self.pressed_keys = []
+        self.running = True
+
+    def _get_relevant_keys(self, keys_to_action: dict) -> set:


Optional[dict]

RedTachyon · 2022-04-14T19:07:35Z

gym/utils/play.py

+            elif hasattr(self.env.unwrapped, "get_keys_to_action"):
+                keys_to_action = self.env.unwrapped.get_keys_to_action()
+            else:
+                assert False, (


Explicitly raise some Exception instead of a failing assert, you can either use one of the existing exceptions like AttributeError or something; or make a new one in this file to make it more descriptive.
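
A minimal sketch of that suggestion, assuming a dedicated exception type. The names MissingKeysToAction and resolve_keys_to_action, as well as the two stand-in environment classes, are hypothetical and only illustrate the shape of the change:

```python
from typing import Dict, Optional, Tuple

KeysToAction = Dict[Tuple[int, ...], int]


class MissingKeysToAction(Exception):
    """Raised when an environment offers no key-to-action mapping."""


def resolve_keys_to_action(
    env: object, keys_to_action: Optional[KeysToAction] = None
) -> KeysToAction:
    """Return the explicit mapping, the env's default one, or raise."""
    if keys_to_action is not None:
        return keys_to_action
    unwrapped = getattr(env, "unwrapped", env)
    if hasattr(unwrapped, "get_keys_to_action"):
        return unwrapped.get_keys_to_action()
    raise MissingKeysToAction(
        f"{type(unwrapped).__name__} does not have explicit key to action "
        "mapping, please specify one manually"
    )


class EnvWithMapping:
    """Stand-in environment that provides a default mapping."""

    def get_keys_to_action(self) -> KeysToAction:
        return {(ord("a"),): 0}


class EnvWithoutMapping:
    """Stand-in environment without any mapping."""
```

Unlike assert False, an explicit exception is still raised when Python runs with -O, and a test can target it precisely with pytest.raises.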

RedTachyon · 2022-04-14T19:12:00Z

gym/utils/play.py

+                    + " does not have explicit key to action mapping, "
+                    + "please specify one manually"
+                )
+        relevant_keys = set(sum(map(list, keys_to_action.keys()), []))


map is nice (functional programming is great), but in Python it's typically more readable to use a comprehension, so it'd be something like
set(sum((list(key) for key in keys_to_action), []))

But is this actually what we want this to do? I'm not sure what each key is meant to be (see previous comment about types), so take another look at the logic here. Intuition tells me that you might have wanted to just do set(keys_to_action.keys()) or something like that (which might be equivalent to just set(keys_to_action))

I'm going to explore it in more depth. This logic was already there in the old play(); it also seems to me that set(keys_to_action.keys()) may be fine
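
A quick check of what the two expressions produce, using the MountainCar mapping quoted earlier in this review. Since the event handler compares an integer event.key against relevant_keys, the flattened set of key codes appears to be what the loop needs, whereas set(keys_to_action) yields the tuples themselves. itertools.chain is shown only as one possible alternative spelling:

```python
from itertools import chain

keys_to_action = {(): 1, (276,): 0, (275,): 2, (275, 276): 1}

# Flatten every key tuple into individual key codes.
flattened = set(sum((list(key) for key in keys_to_action), []))

# The same result without building intermediate lists.
via_chain = set(chain.from_iterable(keys_to_action))

# A set of the dictionary keys keeps the tuples intact.
tuples_only = set(keys_to_action)
```

With tuples_only, a check such as 275 in tuples_only is False, so no single key press would ever be recognised as relevant.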

RedTachyon · 2022-04-14T19:12:31Z

gym/utils/play.py

+        return relevant_keys
+
+    def _get_video_size(self, zoom: float = None) -> Tuple[int, int]:
+        rendered = self.env.render(mode="rgb_array")


Can you put like a TODO here so that we remember to update this when the render API change goes through?

RedTachyon · 2022-04-14T19:13:37Z

gym/utils/play.py

+        if event.type == pygame.KEYDOWN:
+            if event.key in self.relevant_keys:
+                self.pressed_keys.append(event.key)
+            elif event.key == 27:


What is 27? Ideally replace this with some constant that already might exist in pygame, or define it yourself, or at the very least add a comment

The whole _get_relevant_keys() was extracted "as is" from the old play(). I didn't want to change it before having all the tests in place, but this one is safe enough to replace with a pygame constant 😅 (27 should be the Escape key)

RedTachyon · 2022-04-14T19:16:24Z

gym/utils/play.py

+            self.screen = pygame.display.set_mode(self.video_size)
+
+
 def display_arr(screen, arr, video_size, transpose):


Can you add type hints here and to play()?

RedTachyon · 2022-04-14T19:18:19Z

gym/utils/play.py

            obs = env.reset()
        else:
-            action = keys_to_action.get(tuple(sorted(pressed_keys)), 0)
+            action = keys_to_action.get(tuple(sorted(game.pressed_keys)), 0)


Can we always safely assume that 0 is a good default action? (if I'm understanding this correctly)
Maybe this should be part of the specification?

In particular this seems like it'd crash with any continuous-action environments?

Yes it is actually the default action if no relevant keys are pressed; I have the same doubt regarding 0, I will explore it

@RedTachyon About this: I think we can safely say that, since a continuous-action environment cannot be mapped to discrete keyboard presses, it is OK to leave 0 as the default, because play() will crash earlier anyway when there is no keys_to_action. (Maybe we could add an explicit message for continuous-action environments?)
An example of a keys-to-action mapping is this one:

gym/gym/envs/classic_control/mountain_car.py

Lines 240 to 242 in 5ae6bf9

    def get_keys_to_action(self):
        # Control with left and right arrow keys.
        return {(): 1, (276,): 0, (275,): 2, (275, 276): 1}
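
To illustrate why the fallback value matters, here is a small sketch in which the default action is a parameter instead of a hard-coded 0. The select_action helper and its noop argument are hypothetical, not part of play(); the action numbers follow the MountainCar table quoted above (0 accelerates left, 1 does nothing):

```python
from typing import Dict, Iterable, Tuple

KeysToAction = Dict[Tuple[int, ...], int]


def select_action(
    keys_to_action: KeysToAction, pressed_keys: Iterable[int], noop: int = 0
) -> int:
    """Look up the action for the pressed keys, falling back to `noop`."""
    return keys_to_action.get(tuple(sorted(pressed_keys)), noop)


# The full MountainCar mapping covers the "no key pressed" case explicitly.
full_mapping: KeysToAction = {(): 1, (276,): 0, (275,): 2, (275, 276): 1}

# A mapping without the () entry relies entirely on the fallback value.
partial_mapping: KeysToAction = {(276,): 0, (275,): 2}

idle_full = select_action(full_mapping, [])
idle_partial = select_action(partial_mapping, [])
idle_partial_noop = select_action(partial_mapping, [], noop=1)
```

With the partial mapping and the default fallback, releasing all keys would select action 0, which accelerates left in MountainCar rather than doing nothing.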

RedTachyon · 2022-04-14T19:23:18Z

tests/utils/test_play.py

+def test_play_relevant_keys():
+    env = DummyPlayEnv()
+    game = PlayableGame(env, dummy_keys_to_action())
+    assert game.relevant_keys == {97, 100}


No magic numbers please
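
One way to avoid the bare numbers, sketched with plain ord() calls; pygame.K_a and pygame.K_d hold the same values (97 and 100), so they could be used instead where pygame is imported. The action values in dummy_keys_to_action are assumed here, since the helper's body is not shown in this thread:

```python
from typing import Dict, Tuple

# Named key codes instead of bare 97 and 100.
KEY_A = ord("a")
KEY_D = ord("d")


def dummy_keys_to_action() -> Dict[Tuple[int, ...], int]:
    """Assumed shape of the test helper: one action per key."""
    return {(KEY_A,): 0, (KEY_D,): 1}


# Same flattening as in PlayableGame, written as a set comprehension.
relevant_keys = {key for combo in dummy_keys_to_action() for key in combo}
```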

RedTachyon · 2022-04-14T19:26:24Z

tests/utils/test_play.py

+    assert game.pressed_keys == []
+
+
+def test_play_loop():


Is it possible to do the same test with some actual environment? Say, CartPole, define the game, input a bunch of random actions in a loop just to see that it doesn't crash.

For a more "advanced" version, if possible, you could instantiate one PlayableGame of an env with a fixed seed, and one normal env with the same fixed seed. You input the same sequence of actions through the game interface, and through the regular env.step, and check that the outcome is the same.

Yes, following what @pseudo-rnd-thoughts suggested, I'm adding a test with an actual environment. It's a little tricky to manage the KEYDOWN and KEYUP events with pygame, but I'm on the way
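
The "advanced" version can be sketched with a toy environment, so that the structure of the comparison is visible without gym or pygame. ToyEnv and play_with_keys are hypothetical stand-ins: the first for a seeded environment such as CartPole, the second for driving the env through the key interface:

```python
import random
from typing import Dict, List, Tuple


class ToyEnv:
    """Stand-in for a gym environment that is deterministic given a seed."""

    def __init__(self, seed: int) -> None:
        self._rng = random.Random(seed)
        self.state = 0.0

    def reset(self) -> float:
        self.state = self._rng.uniform(-0.05, 0.05)
        return self.state

    def step(self, action: int) -> float:
        direction = 1.0 if action == 1 else -1.0
        self.state += direction + self._rng.uniform(-0.01, 0.01)
        return self.state


def play_with_keys(
    env: ToyEnv,
    keys_to_action: Dict[Tuple[int, ...], int],
    key_frames: List[List[int]],
) -> List[float]:
    """Step the env once per frame, translating pressed keys into actions."""
    observations = [env.reset()]
    for pressed in key_frames:
        action = keys_to_action.get(tuple(sorted(pressed)), 0)
        observations.append(env.step(action))
    return observations


keys_to_action = {(ord("a"),): 0, (ord("d"),): 1}
key_frames = [[ord("d")], [ord("d")], [ord("a")], []]
actions = [1, 1, 0, 0]  # the actions those key frames should translate to

played = play_with_keys(ToyEnv(seed=42), keys_to_action, key_frames)

reference_env = ToyEnv(seed=42)
reference = [reference_env.reset()] + [reference_env.step(a) for a in actions]
```

If the key handling translates every frame into the intended action, both runs consume the same random numbers in the same order and the two observation lists are identical.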

gianlucadecola · 2022-04-14T21:06:03Z

> Overall looks pretty good, left a bunch of comments. Have you tried actually playing this with some environments? That is a "full" test which can't really be easily automated, but it's helpful to do it as a human.

Hey! Thanks for the reply, yes I had some fun trying to balance the cartpole myself 😄 I'm going to look into the comments 👍

gianlucadecola · 2022-04-15T14:19:41Z

Pushed a bunch of updates; still need to assess some things like set(sum((list(key) for key in keys_to_action), [])) and explore this:

> Can we always safely assume that 0 is a good default action? (if I'm understanding this correctly) Maybe this should be part of the specification?
>
> In particular this seems like it'd crash with any continuous-action environments?

gianlucadecola · 2022-04-16T10:12:55Z

@RedTachyon I think everything is in place now

pseudo-rnd-thoughts · 2022-04-17T17:28:55Z

Thanks @gianlucadecola I think the play function changes are complete.
Do you want to add a PlayPlot docstring, type hinting and test in this PR or another?

gianlucadecola · 2022-04-17T21:03:22Z

I think we can merge this, and then I'll submit another PR addressing only the PlayPlot part as soon as I can manage to work on it 👍

pseudo-rnd-thoughts · 2022-04-18T09:37:02Z

@jkterry1 Looks good to merge

gianlucadecola added 6 commits April 10, 2022 10:26

refactoring play function. Tests for keys to action mapping.

87f34da

Add mocking pygame events.

7982347

partial event processing in class.

2a73571

pre-commit.

02b81e5

quit pygame after tests.

7c65be7

fix typos in functions names.

1b3485f

Add type hint.

48d5f07

Add test for play function.

3b9e08d

remove mockKeyEvent.

9a1d4b6

remove unused main code.

a6a7cd9

RedTachyon reviewed Apr 14, 2022

gianlucadecola added 5 commits April 15, 2022 12:33

Adding type hints.

42b5ead

catch custom exception in tests.

7dabd13

Fix magic numbers.

97fa4e9

Add test with an actual environment.

808d0a1

fix comment.

bf4d2d8

gianlucadecola added 2 commits April 16, 2022 00:07

Add TODO memo on env.render.

52b37cc

change map with list comprehension.

5a8f6a4

gianlucadecola added 2 commits April 16, 2022 00:27

remove unused imports.

b5aae1b

Add type hint.

e9d333b

gianlucadecola added 2 commits April 16, 2022 12:14

typo.

b0081c2

docstring.

a02323c

jkterry1 merged commit 36a7fe5 into openai:master Apr 18, 2022

gianlucadecola mentioned this pull request Apr 29, 2022

Adding docstring and tests to PlayPlot #2783

Closed


