This branch contains the code for the paper
Mean-Variance Policy Iteration for Risk-Averse Reinforcement Learning
Shangtong Zhang, Bo Liu, Shimon Whiteson (AAAI 2021)
.
├── Dockerfile # Dependencies
├── requirements.txt # Dependencies
├── template_jobs.py # Entry point for the experiments; see the usage sketch below
│   ├── mvpi_td3_continuous # Entry for MVPI-TD3 / TD3
│   ├── var_ppo_continuous # Entry for TRVO
│   ├── mvp_continuous # Entry for MVP
│   ├── risk_a2c_continuous # Entry for the Prashanth baseline
│   ├── tamar_continuous # Entry for the Tamar baseline
│   ├── off_policy_mvpi # Entry for offline MVPI
├── deep_rl/agent/MVPITD3_agent.py # MVPI-TD3 / TD3 implementation
├── deep_rl/agent/VarPPO_agent.py # TRVO implementation
├── deep_rl/agent/MVP_agent.py # MVP implementation
├── deep_rl/agent/RiskA2C_agent.py # Prashanth baseline implementation
├── deep_rl/agent/Tamar_agent.py # Tamar baseline implementation
├── deep_rl/agent/OffPolicyMVPI_agent.py # Offline MVPI implementation
└── template_plot.py # Plotting
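
For reference, a minimal sketch of launching one experiment via an entry function in template_jobs.py. The keyword arguments below (`game`, `run`) are assumptions about the entry-function signatures, so verify them against the actual code before running:

```python
# Hypothetical launch script. The keyword arguments ('game', 'run') are
# assumptions about the entry-function signatures in template_jobs.py;
# check the actual signatures before use.
from template_jobs import mvpi_td3_continuous

if __name__ == '__main__':
    mvpi_td3_continuous(
        game='HalfCheetah-v2',  # assumed MuJoCo environment id
        run=0,                  # assumed run / random-seed index
    )
```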
I can send the data for plotting via email upon request.
This branch is based on the DeepRL codebase and has been left unchanged since I completed the paper. Algorithm implementations not used in the paper may be broken and should never be used. Rebasing onto or merging with the master branch may therefore take extra effort.