[rollout] fix: add missing extra_reward_info to AgentLoopOutput #3194
Code Review
This pull request aims to add extra_reward_info to the AgentLoopOutput for metrics calculation. The changes are propagated through the agent loop, reward managers, and tests. While the overall direction is correct, I've found two critical issues in verl/experimental/agent_loop/agent_loop.py that could lead to runtime errors or incorrect behavior due to improper handling of aggregated data from multiple sources. My review provides specific fixes for these issues.
…ngine#3194) ### What does this PR do? Fix volcengine#3055, add missing `extra_reward_info` to AgentLoopOuput, which is needed by metrics calculation.
What does this PR do?
Fixes #3055 by adding the missing `extra_reward_info` field to `AgentLoopOutput`, which the metrics calculation needs.
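To make the change concrete, here is a minimal sketch of carrying a per-sample `extra_reward_info` dict on an output object and merging it across a batch for metrics. It is not the verl implementation: `AgentLoopOutputSketch`, its other fields, and `collect_extra_reward_info` are hypothetical names chosen for illustration. The sketch also assumes that different samples may report different keys, which is one way aggregation across multiple sources can go wrong.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class AgentLoopOutputSketch:
    """Hypothetical stand-in for AgentLoopOutput; not the verl definition."""

    response_ids: List[int]
    reward_score: Optional[float] = None
    # Extra values a reward function may report per sample (assumed shape).
    extra_reward_info: Dict[str, Any] = field(default_factory=dict)


def collect_extra_reward_info(
    outputs: List[AgentLoopOutputSketch],
) -> Dict[str, List[Any]]:
    """Merge per-sample dicts into per-key lists aligned with the batch.

    Keys missing from a sample are filled with None so that every list
    has exactly one entry per sample.
    """
    keys = sorted({k for out in outputs for k in out.extra_reward_info})
    return {k: [out.extra_reward_info.get(k) for out in outputs] for k in keys}


outputs = [
    AgentLoopOutputSketch([1, 2], 1.0, {"acc": 1.0}),
    AgentLoopOutputSketch([3], 0.0, {"acc": 0.0, "format_ok": True}),
    AgentLoopOutputSketch([4, 5, 6]),  # no extra info reported
]
merged = collect_extra_reward_info(outputs)
```

Padding with `None` keeps each list the same length as the batch, so a downstream metrics step can index by sample position. Whether verl pads, drops, or rejects mismatched keys is not stated here.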