This repository implements the classic deep reinforcement learning algorithms:
- Deep Q-Network (DQN)
- Double DQN (DDQN)
- Dueling Network Architecture
- Deep Deterministic Policy Gradient (DDPG)
- Normalized Advantage Function (NAF)
- Asynchronous Advantage Actor-Critic (A3C)
- Trust Region Policy Optimization (TRPO)
- Proximal Policy Optimization (PPO)
- Actor Critic using Kronecker-Factored Trust Region (ACKTR)
I have already implemented five of these algorithms and will implement the rest, keeping them updated in the future.
In this repository, actions are sampled from a Beta distribution, which can improve performance in continuous control tasks; see [2], The Beta Policy for Continuous Control Reinforcement Learning.
However, I have not been able to back-propagate through the Beta distribution's entropy. If you have a solution, please contact me.
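Below is a minimal sketch of a Beta policy head, assuming PyTorch (the layer sizes, the `scale_action` helper, and the action bounds are illustrative assumptions, not this repository's actual code). It samples actions in (0, 1) and rescales them to the environment's range; note that `torch.distributions.Beta` provides a differentiable `entropy()`, which may be relevant to the entropy back-propagation issue mentioned above.

```python
# Illustrative sketch of a Beta policy head (assumes PyTorch; sizes are arbitrary).
import torch
import torch.nn as nn
from torch.distributions import Beta


class BetaPolicy(nn.Module):
    def __init__(self, obs_dim, act_dim, hidden=64):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(obs_dim, hidden), nn.Tanh())
        self.alpha_head = nn.Linear(hidden, act_dim)
        self.beta_head = nn.Linear(hidden, act_dim)

    def forward(self, obs):
        h = self.body(obs)
        # softplus(.) + 1 keeps alpha, beta > 1 so the Beta density stays unimodal.
        alpha = nn.functional.softplus(self.alpha_head(h)) + 1.0
        beta = nn.functional.softplus(self.beta_head(h)) + 1.0
        return Beta(alpha, beta)


def scale_action(a01, low, high):
    # Map a sample in (0, 1) to the environment's action range [low, high].
    return low + (high - low) * a01


# Usage: sample an action and compute the entropy; entropy() is differentiable,
# so gradients flow back through alpha and beta into the network parameters.
policy = BetaPolicy(obs_dim=3, act_dim=1)
dist = policy(torch.randn(1, 3))
action = scale_action(dist.sample(), low=-2.0, high=2.0)
entropy = dist.entropy().sum(dim=-1)
```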
Instructions are provided for each algorithm in its own directory. In the future, I will revise them to use a common format.
[1] A Brief Survey of Deep Reinforcement Learning
[2] The Beta Policy for Continuous Control Reinforcement Learning
[3] Playing Atari with Deep Reinforcement Learning
[4] Deep Reinforcement Learning with Double Q-learning
[5] Dueling Network Architectures for Deep Reinforcement Learning
[6] Continuous control with deep reinforcement learning
[7] Continuous Deep Q-Learning with Model-based Acceleration
[8] Asynchronous Methods for Deep Reinforcement Learning
[9] Trust Region Policy Optimization
[10] Proximal Policy Optimization Algorithms
[11] Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation