@0x404
Collaborator

0x404 commented Apr 15, 2025

Currently, the naive reward manager returns reward_extra_info when a reward function returns a dict and return_dict is set to True in the manager.

This PR enables PrimeRewardManager to behave the same way, keeping it compatible with the naive manager. It also fixes the case where the prime manager is used with a reward function that returns a dict.
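
For readers unfamiliar with the behavior being matched, here is a minimal sketch of the idea, not the actual verl code. The names `score_fn` and `compute_rewards`, the `"score"` and `"rewards"` keys, and the plain-list return type are all assumptions made for illustration; only `reward_extra_info` and `return_dict` come from the description above.

```python
from collections import defaultdict


def score_fn(response: str, ground_truth: str) -> dict:
    """Hypothetical reward function that returns a dict instead of a float."""
    correct = response.strip() == ground_truth.strip()
    return {"score": 1.0 if correct else 0.0, "exact_match": correct}


def compute_rewards(responses, ground_truths, reward_fn, return_dict=False):
    """Hypothetical manager loop: collect scalar scores plus any extra fields."""
    rewards = []
    reward_extra_info = defaultdict(list)

    for response, truth in zip(responses, ground_truths):
        result = reward_fn(response, truth)
        if isinstance(result, dict):
            # Assumed convention: the scalar reward lives under "score".
            rewards.append(float(result["score"]))
            # Keep every field so callers can log it per sample.
            for key, value in result.items():
                reward_extra_info[key].append(value)
        else:
            # Plain numeric rewards pass through unchanged.
            rewards.append(float(result))

    if return_dict:
        return {"rewards": rewards, "reward_extra_info": dict(reward_extra_info)}
    return rewards


if __name__ == "__main__":
    out = compute_rewards(["4", "5"], ["4", "6"], score_fn, return_dict=True)
    print(out["rewards"])
    print(out["reward_extra_info"])
```

With `return_dict=False` the caller gets only the scalar rewards, so a manager written this way stays usable with reward functions that return either a float or a dict.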

Fix #1070

@TranSirius

Great, this snippet of code works for me.

@0x404
Collaborator Author

0x404 commented Apr 24, 2025

Hi @eric-haibin-lin, could you help take a look at this PR?

Development

Successfully merging this pull request may close these issues.

PRIME reward manager implementation out of sync
