| ✅ Proximal Policy Optimization (PPO) | :material-github: ppo.py, :material-file-document: docs |
| | :material-github: ppo_atari.py, :material-file-document: docs |
| | :material-github: ppo_continuous_action.py, :material-file-document: docs |
| | :material-github: ppo_atari_lstm.py, :material-file-document: docs |
| | :material-github: ppo_atari_envpool.py, :material-file-document: docs |
| | :material-github: ppo_atari_envpool_xla_jax.py, :material-file-document: docs |
| | :material-github: ppo_atari_envpool_xla_jax_scan.py, :material-file-document: docs |
| | :material-github: ppo_procgen.py, :material-file-document: docs |
| | :material-github: ppo_atari_multigpu.py, :material-file-document: docs |
| | :material-github: ppo_pettingzoo_ma_atari.py, :material-file-document: docs |
| | :material-github: ppo_continuous_action_isaacgym.py, :material-file-document: docs |
| ✅ Deep Q-Learning (DQN) | :material-github: dqn.py, :material-file-document: docs |
| | :material-github: dqn_atari.py, :material-file-document: docs |
| | :material-github: dqn_jax.py, :material-file-document: docs |
| | :material-github: dqn_atari_jax.py, :material-file-document: docs |
| ✅ Categorical DQN (C51) | :material-github: c51.py, :material-file-document: docs |
| | :material-github: c51_atari.py, :material-file-document: docs |
| | :material-github: c51_jax.py, :material-file-document: docs |
| | :material-github: c51_atari_jax.py, :material-file-document: docs |
| ✅ Soft Actor-Critic (SAC) | :material-github: sac_continuous_action.py, :material-file-document: docs |
| | :material-github: sac_atari.py, :material-file-document: docs |
| ✅ Deep Deterministic Policy Gradient (DDPG) | :material-github: ddpg_continuous_action.py, :material-file-document: docs |
| | :material-github: ddpg_continuous_action_jax.py, :material-file-document: docs |
| ✅ Twin Delayed Deep Deterministic Policy Gradient (TD3) | :material-github: td3_continuous_action.py, :material-file-document: docs |
| | :material-github: td3_continuous_action_jax.py, :material-file-document: docs |
| ✅ Phasic Policy Gradient (PPG) | :material-github: ppg_procgen.py, :material-file-document: docs |
| ✅ Random Network Distillation (RND) | :material-github: ppo_rnd_envpool.py, :material-file-document: docs |
| ✅ Qdagger | :material-github: qdagger_dqn_atari_impalacnn.py, :material-file-document: docs |
| | :material-github: qdagger_dqn_atari_jax_impalacnn.py, :material-file-document: docs |