Add test gym utils play. Fix #2729 #2743
|
Thanks for making the PR. A couple of things: would you be able to add type hints or docstrings to PlayableGame, so that people using the class know what information it expects? |
|
Also, you don't seem to have any tests for the primary |
|
Added type hint 😃 |
|
I'm not experienced with pygame so this may not work |
Well, it actually worked pretty well 🎉 Really good hint |
|
Wow, I was half expecting that to not work, thanks for doing that You probably have a reason but
Extra test ideas Finally, if you think it is possible to add a test for |
I didn't like the idea of the tests being tied to a particular environment. It seems more generic to me to have a dummy environment purely for testing the play function.
This can actually be removed and replaced with an actual pygame event, I will commit the fix
Good point on .3; I will try to look into PlayPlot, also 👍 |
|
I understand your point about using an abstract implementation (a dummy environment) rather than an actual implementation (e.g. CartPole), but I see two possible problems
Thanks for making those changes and looking into PlayPlot tests |
|
To answer a question above: removing old code that doesn't do anything is fine |
RedTachyon
left a comment
Overall looks pretty good, left a bunch of comments. Have you tried actually playing this with some environments? As a "full" test which can't really be easily automated, but it's helpful to do it as a human.
gym/utils/play.py
Outdated
class PlayableGame:
    def __init__(self, env: Env, keys_to_action: dict = None, zoom: float = None):
keys_to_action: Optional[dict]
Even better if we can specify the key/value types on the dict (I guess values are arbitrary actions, but keys should be keypresses? how are they represented?)
It's Dict[Tuple[int], int] where the key is (ord(key),)
gym/gym/envs/classic_control/mountain_car.py
Lines 240 to 242 in 5ae6bf9
def get_keys_to_action(self):
    # Control with left and right arrow keys.
    return {(): 1, (276,): 0, (275,): 2, (275, 276): 1}
reference of the actions:
gym/gym/envs/classic_control/mountain_car.py
Lines 50 to 54 in 5ae6bf9
| Num | Observation | Value | Unit |
|-----|-------------------------------------------------------------|---------|------|
| 0 | Accelerate to the left | Inf | position (m) |
| 1 | Don't accelerate | Inf | position (m) |
| 2 | Accelerate to the right | Inf | position (m) |
Lines 125 to 135 in 9a1d4b6
keys_to_action: dict: tuple(int) -> int or None
    Mapping from keys pressed to action performed.
    For example if pressed 'w' and space at the same time is supposed
    to trigger action number 2 then key_to_action dict would look like this:
        {
            # ...
            sorted(ord('w'), ord(' ')) -> 2
            # ...
        }
    If None, default key_to_action mapping for that env is used, if provided.
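To make the type discussed above concrete, here is a minimal sketch of a typed mapping. The alias `KeysToAction` and the helper `key_combo` are hypothetical names, not part of gym. `Tuple[int, ...]` is used rather than `Tuple[int]` on the assumption that combinations of more than one key should be allowed, as in the `(275, 276)` entry quoted above.

```python
from typing import Dict, Tuple

# Hypothetical alias: sorted tuple of key codes -> discrete action index.
KeysToAction = Dict[Tuple[int, ...], int]


def key_combo(*keys: str) -> Tuple[int, ...]:
    """Build a lookup key from single characters, sorted by key code."""
    return tuple(sorted(ord(k) for k in keys))


keys_to_action: KeysToAction = {
    key_combo("w"): 1,
    key_combo("w", " "): 2,  # 'w' and space pressed together
}

print(key_combo("w", " "))  # (32, 119)
```

Sorting inside the helper keeps the lookup independent of the order in which keys were pressed, which matches the `tuple(sorted(...))` lookup used by `play()`.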
gym/utils/play.py
Outdated
    self.pressed_keys = []
    self.running = True

def _get_relevant_keys(self, keys_to_action: dict) -> set:
Optional[dict]
gym/utils/play.py
Outdated
elif hasattr(self.env.unwrapped, "get_keys_to_action"):
    keys_to_action = self.env.unwrapped.get_keys_to_action()
else:
    assert False, (
Explicitly raise some Exception instead of a failing assert, you can either use one of the existing exceptions like AttributeError or something; or make a new one in this file to make it more descriptive.
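A minimal sketch of what that suggestion could look like, assuming a new exception defined in the same file. The names `MissingKeysToActionError` and `resolve_keys_to_action` are hypothetical, and the start of the error message is an assumption, because the diff only shows its tail.

```python
class MissingKeysToActionError(Exception):
    """Hypothetical exception: no key-to-action mapping could be found."""


def resolve_keys_to_action(env, keys_to_action=None):
    """Return the explicit mapping, else the env's own mapping, else raise."""
    if keys_to_action is not None:
        return keys_to_action
    # gym envs expose `unwrapped`; fall back to the object itself otherwise.
    unwrapped = getattr(env, "unwrapped", env)
    if hasattr(unwrapped, "get_keys_to_action"):
        return unwrapped.get_keys_to_action()
    # Assumption: the class name stands in for whatever the real message
    # uses to identify the environment.
    raise MissingKeysToActionError(
        f"{type(unwrapped).__name__} does not have explicit key to action "
        "mapping, please specify one manually"
    )
```

Unlike `assert False`, the raise still fires when Python runs with `-O`, and callers can catch the specific exception type.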
gym/utils/play.py
Outdated
        + " does not have explicit key to action mapping, "
        + "please specify one manually"
    )
relevant_keys = set(sum(map(list, keys_to_action.keys()), []))
map is nice (functional programming is great), but in Python it's typically more readable to use a comprehension, so it'd be something like
set(sum((list(key) for key in keys_to_action), []))
But is this actually what we want this to do? I'm not sure what each key is meant to be (see previous comment about types), so take another look at the logic here. Intuition tells me that you might have wanted to just do set(keys_to_action.keys()) or something like that (which might be equivalent to just set(keys_to_action))
I'm going to explore it in more depth. This logic was already there in the old play(); it also seems to me that set(keys_to_action.keys()) may be fine
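For reference while exploring this, a small sketch of how the two options differ, using the mountain_car mapping quoted earlier in this thread. It assumes the keys are tuples of integer key codes, as stated above; under that assumption the flattened set and the set of tuples are not interchangeable, since `event.key` is compared against individual key codes.

```python
# Mapping copied from the mountain_car snippet quoted earlier in the thread.
keys_to_action = {(): 1, (276,): 0, (275,): 2, (275, 276): 1}

# Flattened: every individual key code that appears in any combination.
relevant_keys = {key for combo in keys_to_action for key in combo}

# Not flattened: a set of the tuples themselves.
combos = set(keys_to_action)

print(sorted(relevant_keys))                # [275, 276]
print(276 in relevant_keys, 276 in combos)  # True False
```

The set comprehension produces the same result as the original `set(sum(map(list, ...), []))` expression without building intermediate lists.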
    return relevant_keys

def _get_video_size(self, zoom: float = None) -> Tuple[int, int]:
    rendered = self.env.render(mode="rgb_array")
Can you put like a TODO here so that we remember to update this when the render API change goes through?
gym/utils/play.py
Outdated
if event.type == pygame.KEYDOWN:
    if event.key in self.relevant_keys:
        self.pressed_keys.append(event.key)
    elif event.key == 27:
What is 27? Ideally replace this with some constant that already might exist in pygame, or define it yourself, or at the very least add a comment
The whole _get_relevant_keys() was extracted "as is" from the old play(). I didn't want to change it before having all tests in place, but this is safe enough to replace with a pygame constant 😅 (it should be the exit key)
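A small sketch of the named-constant version. It deliberately avoids importing pygame so it runs anywhere; in play.py the natural choice would be `pygame.K_ESCAPE`, which has the value 27. The helper `should_quit` and the use of `SimpleNamespace` as a fake event are illustrative assumptions only.

```python
from types import SimpleNamespace

# Stand-in for pygame.K_ESCAPE (value 27), so the sketch needs no pygame.
ESCAPE_KEY = 27


def should_quit(event) -> bool:
    """Return True when the event's key is the escape key."""
    return event.key == ESCAPE_KEY


print(should_quit(SimpleNamespace(key=27)))        # True
print(should_quit(SimpleNamespace(key=ord("a"))))  # False
```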
gym/utils/play.py
Outdated
    self.screen = pygame.display.set_mode(self.video_size)


def display_arr(screen, arr, video_size, transpose):
Can you add type hints here and to play()?
      obs = env.reset()
  else:
-     action = keys_to_action.get(tuple(sorted(pressed_keys)), 0)
+     action = keys_to_action.get(tuple(sorted(game.pressed_keys)), 0)
Can we always safely assume that 0 is a good default action? (if I'm understanding this correctly)
Maybe this should be part of the specification?
In particular this seems like it'd crash with any continuous-action environments?
Yes it is actually the default action if no relevant keys are pressed; I have the same doubt regarding 0, I will explore it
@RedTachyon About this, I think we can safely say that, since a continuous-action environment cannot be mapped to discrete keyboard presses, it is OK to leave 0 as the default: play() will fail earlier because there is no keys_to_action. (Maybe we could add an explicit message for continuous-action environments?)
An example of a keys-to-action mapping is this:
gym/gym/envs/classic_control/mountain_car.py
Lines 240 to 242 in 5ae6bf9
def get_keys_to_action(self):
    # Control with left and right arrow keys.
    return {(): 1, (276,): 0, (275,): 2, (275, 276): 1}
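One possible way to make the default part of the specification, sketched under assumptions: `select_action` and its `noop` parameter are hypothetical and not part of the PR, which hard-codes 0. With the mountain_car mapping above, the empty tuple is mapped explicitly, so the fallback only applies to combinations that are not in the mapping.

```python
from typing import Dict, Iterable, Tuple


def select_action(
    keys_to_action: Dict[Tuple[int, ...], int],
    pressed_keys: Iterable[int],
    noop: int = 0,  # hypothetical parameter; the PR hard-codes 0
) -> int:
    """Look up the action for the pressed keys, else fall back to noop."""
    return keys_to_action.get(tuple(sorted(pressed_keys)), noop)


mountain_car_keys = {(): 1, (276,): 0, (275,): 2, (275, 276): 1}

print(select_action(mountain_car_keys, []))          # 1
print(select_action(mountain_car_keys, [276, 275]))  # 1
print(select_action(mountain_car_keys, [275]))       # 2
print(select_action({(97,): 1}, []))                 # 0
```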
tests/utils/test_play.py
Outdated
def test_play_relevant_keys():
    env = DummyPlayEnv()
    game = PlayableGame(env, dummy_keys_to_action())
    assert game.relevant_keys == {97, 100}
No magic numbers please
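A sketch of the same assertion without literals. The body of `dummy_keys_to_action()` is not shown in this thread, so the version below is a hypothetical stand-in chosen to be consistent with the `{97, 100}` check.

```python
# Hypothetical stand-in for the test's dummy_keys_to_action() helper;
# the real helper is not shown in this thread.
def dummy_keys_to_action():
    return {(ord("a"),): 0, (ord("d"),): 1}


relevant_keys = {key for combo in dummy_keys_to_action() for key in combo}

# Same check as `== {97, 100}`, but the intent is readable.
assert relevant_keys == {ord("a"), ord("d")}
```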
    assert game.pressed_keys == []


def test_play_loop():
Is it possible to do the same test with some actual environment? Say, CartPole, define the game, input a bunch of random actions in a loop just to see that it doesn't crash.
For a more "advanced" version, if possible, you could instantiate one PlayableGame of an env with a fixed seed, and one normal env with the same fixed seed. You input the same sequence of actions through the game interface, and through the regular env.step, and check that the outcome is the same.
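The shape of that "advanced" test could look roughly like the sketch below. To stay self-contained it uses a hypothetical `ToyEnv` with seeded noise instead of a real gym environment, and `rollout_via_keys` only mimics the key lookup of the play loop; a real test would drive PlayableGame and CartPole instead.

```python
import random


class ToyEnv:
    """Hypothetical seeded environment standing in for a real gym env."""

    def __init__(self, seed: int):
        self.rng = random.Random(seed)
        self.state = 0.0

    def step(self, action: int):
        # The next state depends on both the action and seeded noise.
        self.state += (action - 1) + self.rng.uniform(-0.1, 0.1)
        done = abs(self.state) > 5
        return self.state, -abs(self.state), done


def rollout_direct(env, actions):
    """Step the env with explicit actions."""
    return [env.step(a) for a in actions]


def rollout_via_keys(env, keys_to_action, key_presses):
    """Mimic the play loop: pressed keys -> action -> env.step."""
    return [env.step(keys_to_action[tuple(sorted(keys))]) for keys in key_presses]


keys_to_action = {(): 1, (ord("a"),): 0, (ord("d"),): 2}
key_presses = [(ord("a"),), (), (ord("d"),), (ord("d"),)]
actions = [keys_to_action[keys] for keys in key_presses]

via_keys = rollout_via_keys(ToyEnv(42), keys_to_action, key_presses)
direct = rollout_direct(ToyEnv(42), actions)
assert via_keys == direct
```

Both rollouts start from a fresh environment with the same seed, so any difference in the results would point at the key-to-action path rather than at randomness.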
Yes, following what @pseudo-rnd-thoughts suggested, I'm adding a test with an actual environment. It's a little tricky to manage the KEYDOWN and KEYUP events with pygame, but I'm working on it
Hey! Thanks for the reply, yes I had some fun trying to balance the cartpole myself 😄 I'm going to look into the comments 👍 |
|
Pushed a bunch of updates; still need to assess some things like
|
|
@RedTachyon I think everything is in place now |
|
Thanks @gianlucadecola I think the play function changes are complete. |
|
I think we can merge this and then I'll submit another PR assessing only the PlayPlot part as soon as I can manage to work on it 👍 |
|
@jkterry1 Looks good to merge |
Referring to #2729: to add tests for gym.utils.play, I have refactored the play() script and extracted a PlayableGame class for better separation and easier testing.
Running the whole game headless is possible, but it seems that pygame events are not available in that mode (https://www.pygame.org/wiki/HeadlessNoWindowsNeeded), so I prefer to mock them in tests and test the PlayableGame class
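As a rough illustration of the mocking approach, the sketch below feeds fake key events into a simplified key tracker. Everything in it is an assumption made for illustration: `MockKeyEvent` and `KeyTracker` are hypothetical, and the `KEYDOWN`/`KEYUP` strings stand in for `pygame.KEYDOWN` and `pygame.KEYUP` so that the example runs without pygame or a display.

```python
from dataclasses import dataclass

# Hypothetical stand-ins for pygame.KEYDOWN / pygame.KEYUP; a real test
# would use the pygame constants instead.
KEYDOWN, KEYUP = "keydown", "keyup"


@dataclass
class MockKeyEvent:
    type: str
    key: int


class KeyTracker:
    """Simplified stand-in for the key handling in PlayableGame."""

    def __init__(self, relevant_keys):
        self.relevant_keys = set(relevant_keys)
        self.pressed_keys = []

    def process_event(self, event):
        if event.key not in self.relevant_keys:
            return  # ignore keys that map to no action
        if event.type == KEYDOWN:
            self.pressed_keys.append(event.key)
        elif event.type == KEYUP and event.key in self.pressed_keys:
            self.pressed_keys.remove(event.key)


tracker = KeyTracker({ord("a"), ord("d")})
tracker.process_event(MockKeyEvent(KEYDOWN, ord("a")))
tracker.process_event(MockKeyEvent(KEYDOWN, ord("x")))  # not relevant
assert tracker.pressed_keys == [ord("a")]
tracker.process_event(MockKeyEvent(KEYUP, ord("a")))
assert tracker.pressed_keys == []
```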