Fix DQN target update frequency #323

qgallouedec · 2022-11-22T14:51:50Z

Description

Closes #322

Types of changes

Bug fix
New feature
New algorithm
Documentation

Checklist:

I've read the CONTRIBUTION guide (required).
I have ensured pre-commit run --all-files passes (required).
I have updated the documentation and previewed the changes via mkdocs serve.
I have updated the tests accordingly (if applicable).

If you are adding new algorithm variants or your change could result in performance difference, you may need to (re-)run tracked experiments. See #137 as an example PR.

vercel · 2022-11-22T14:51:55Z

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name	Status	Preview	Updated
cleanrl	✅ Ready (Inspect)	Visit Preview	Nov 22, 2022 at 3:14PM (UTC)

vwxyzjn

@qgallouedec thanks so much for the fix!

cleanrl/dqn.py

vwxyzjn

LGTM. Given that this change does not impact the performance of our algorithm variants using their default parameters, we do not need to re-run the benchmark for these variants.
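One way to see why the defaults would be unaffected is sketched below. The thread does not show the diff, so the sketch assumes that before the fix the target-sync check was nested inside the training-step check, and that after the fix it runs on its own once learning has started. The function `sync_steps` and its `nested` flag are hypothetical names used only for this illustration; they are not part of cleanrl. The values 10 and 500 in the usage note are assumed to match the `dqn.py` defaults for `train_frequency` and `target_network_frequency`.

```python
def sync_steps(total_steps, learning_starts, train_frequency,
               target_network_frequency, nested):
    """Return the global steps at which the target network would be synced.

    nested=True  models the ASSUMED pre-fix behaviour: the sync check is only
                 reached on steps that are also training steps.
    nested=False models the ASSUMED post-fix behaviour: the sync check runs
                 on every step after learning has started.
    """
    steps = []
    for global_step in range(total_steps):
        if global_step <= learning_starts:
            continue  # no training and no target sync before learning starts
        is_train_step = global_step % train_frequency == 0
        is_sync_step = global_step % target_network_frequency == 0
        if nested:
            if is_train_step and is_sync_step:
                steps.append(global_step)
        elif is_sync_step:
            steps.append(global_step)
    return steps


# Sync frequency is a multiple of the train frequency: both schemes agree.
same_a = sync_steps(2000, 100, 10, 500, nested=True)
same_b = sync_steps(2000, 100, 10, 500, nested=False)

# Sync frequency is NOT a multiple of the train frequency: the nested scheme
# only syncs on common multiples, so the effective period becomes longer.
diff_a = sync_steps(100, 0, 4, 10, nested=True)
diff_b = sync_steps(100, 0, 4, 10, nested=False)
```

Under these assumptions, the two schemes only diverge when `target_network_frequency` is not a multiple of `train_frequency`, which would be consistent with the remark that variants run with default parameters do not need to be re-benchmarked.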

vwxyzjn · 2022-11-22T15:35:08Z

Thanks @qgallouedec for this catch.

* add draft of SAC discrete implementation
* run pre-commit
* Use log softmax instead of author's log-pi code
* Revert to cleanrl SAC delay implementation (it's more stable)
* Remove docstrings and duplicate code
* Use correct clipreward wrapper
* fix bug in log softmax calculation
* adhere to cleanrl log_prob naming
* fix bug in entropy target calculation
* change layer initialization to match existing cleanrl codebase
* working minimal diff version
* implement original learning update frequency
* parameterize the entropy scale for autotuning
* add benchmarking script
* rename target entropy factor and set new default value
* add docs draft
* fix SAC-discrete links to work pre merge
* add preliminary result table for SAC-discrete
* clean up todos and add header
* minimize diff between sac_atari and sac_continuous
* add sac-discrete end2end test
* SAC-discrete docs rework
* Update SAC-discrete @100k results
* Fix doc links and unify naming in code
* update docs
* fix target update frequency (see PR #323)
* clarify comment regarding CNN encoder sharing
* fix benchmark installation
* fix eps in minimal diff version and improve code readability
* add docs for eps and finalize code
* use no_grad for actor Q-vals and re-use action-probs & log-probs in alpha loss
* update docs for new code and settings
* fix links to point to main branch
* update sac-discrete training plots
* new sac-d training plots
* update results table and fix link
* fix pong chart title
* add Jimmy Ba name as exception to code spell check
* change target_entropy_scale default value to same value as experiments
* remove blank line at end of pre-commit

Co-authored-by: Costa Huang <[email protected]>

Fix target update freq

be0e514

vercel bot deployed to Preview November 22, 2022 14:52 View deployment

vwxyzjn requested changes Nov 22, 2022

cleanrl/dqn.py (outdated, resolved)

Ensure target is updated after learning start

7c7e272

vercel bot deployed to Preview November 22, 2022 15:14 View deployment

vwxyzjn approved these changes Nov 22, 2022


vwxyzjn merged commit c515aef into vwxyzjn:master Nov 22, 2022

qgallouedec deleted the fix-target-update-freq branch November 22, 2022 15:36

timoklein added a commit to timoklein/cleanrl that referenced this pull request Nov 24, 2022

fix target update frequency (see PR vwxyzjn#323)

3a3f41b

timoklein mentioned this pull request Nov 24, 2022

SAC-discrete implementation #270

Merged

20 tasks

sdpkjc mentioned this pull request Dec 2, 2022

Fix sync_target sdpkjc/abcdrl#28

Merged

6 tasks


2 participants
