Skip to content

Conversation

@manjavacas
Copy link
Contributor

@manjavacas manjavacas commented Jan 12, 2023

Description

According to #346, the Polyak update rate (tau) has been included as an argument to dqn.py and dqn_atari.py implementations.

It will allow soft updates of the target network, as already done for dqn_jax.py.

Types of changes

  • Bug fix
  • New feature
  • New algorithm
  • Documentation

Checklist:

  • I've read the CONTRIBUTION guide (required).
  • I have ensured pre-commit run --all-files passes (required).
  • I have updated the documentation and previewed the changes via mkdocs serve.
  • I have updated the tests accordingly (if applicable).

If you are adding new algorithm variants or your change could result in performance difference, you may need to (re-)run tracked experiments. See #137 as an example PR.

  • I have contacted vwxyzjn to obtain access to the openrlbenchmark W&B team (required).
  • I have tracked applicable experiments in openrlbenchmark/cleanrl with --capture-video flag toggled on (required).
  • I have added additional documentation and previewed the changes via mkdocs serve.
    • I have explained note-worthy implementation details.
    • I have explained the logged metrics.
    • I have added links to the original paper and related papers (if applicable).
    • I have added links to the PR related to the algorithm variant.
    • I have created a table comparing my results against those from reputable sources (i.e., the original paper or other reference implementation).
    • I have added the learning curves (in PNG format).
    • I have added links to the tracked experiments.
    • I have updated the overview sections at the docs and the repo
  • I have updated the tests accordingly (if applicable).

@vercel
Copy link

vercel bot commented Jan 12, 2023

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated
cleanrl ✅ Ready (Inspect) Visit Preview 💬 Add your feedback Jan 16, 2023 at 9:37PM (UTC)

@manjavacas manjavacas marked this pull request as ready for review January 12, 2023 16:24
Copy link
Owner

@vwxyzjn vwxyzjn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this PR. Could you add a note in implementation details section in the documentation /docs/rl-algorithms/dqn.md about that we implement target network update as a polyak update?

@manjavacas
Copy link
Contributor Author

Sure! I have updated the documentation. Please let me know if you need anything else 😃

@manjavacas manjavacas requested a review from vwxyzjn January 13, 2023 11:43
Copy link
Owner

@vwxyzjn vwxyzjn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for adding the docs. I just realized this — would you mind adding the same --tau argument to dqn_jax.py and dqn_atari_jax.py? I think you just need to replace

q_state = q_state.replace(target_params=optax.incremental_update(q_state.params, q_state.target_params, 1))

 q_state = q_state.replace(target_params=optax.incremental_update(q_state.params, q_state.target_params, args.tau)) 

Similarly, please update docs to reflect this change.

@vwxyzjn
Copy link
Owner

vwxyzjn commented Jan 16, 2023

Thanks for the contribution! @manjavacas

@vwxyzjn
Copy link
Owner

vwxyzjn commented Jan 16, 2023

This PR closes #346.

@vwxyzjn vwxyzjn merged commit d43e70d into vwxyzjn:master Jan 16, 2023
@vwxyzjn vwxyzjn mentioned this pull request Jan 16, 2023
1 task
@manjavacas
Copy link
Contributor Author

Thanks for the contribution! @manjavacas

Thanks to you for this great project!
Looking forward to continue contributing :-)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants