Co-Adaptation of Algorithmic and Implementational Innovations in Inference-based Deep Reinforcement Learning (NeurIPS2021)

frt03/inference-based-rl

[arxiv], Accepted at NeurIPS 2021.

This codebase implements inference-based off-policy algorithms of two kinds: KL control methods (SAC) and EM control methods (MPO, AWR, AWAC).

If you use this codebase for your research, please cite the paper:

@inproceedings{furuta2021inference,
  title={Co-Adaptation of Algorithmic and Implementational Innovations in Inference-based Deep Reinforcement Learning},
  author={Hiroki Furuta and Tadashi Kozuno and Tatsuya Matsushima and Yutaka Matsuo and Shixiang Shane Gu},
  booktitle={Advances in Neural Information Processing Systems},
  year={2021}
}

Dependencies

We recommend using Docker. See README for setup instructions.

Examples

See examples for details.

python train_sac.py exp=HalfCheetah-v2 seed=0 gpu=0
python train_mpo.py exp=HalfCheetah-v2 seed=0 gpu=0
python train_awr.py exp=HalfCheetah-v2 seed=0 gpu=0
python train_awac.py exp=HalfCheetah-v2 seed=0 gpu=0
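
To repeat a run over several seeds, the commands above can be generated programmatically. The following is a minimal sketch, not part of the repository: `ALGORITHMS` and `build_command` are hypothetical helper names, and it assumes the four scripts accept the `exp=`, `seed=`, and `gpu=` overrides exactly as shown above. It only prints the commands; each argument list could be passed to `subprocess.run` to launch the run.

```python
import shlex

# Script names are taken from the commands above; the mapping itself is a
# hypothetical convenience and does not exist in the repository.
ALGORITHMS = {
    "sac": "train_sac.py",
    "mpo": "train_mpo.py",
    "awr": "train_awr.py",
    "awac": "train_awac.py",
}


def build_command(algo, exp, seed, gpu=0):
    """Return the argument list for one training run."""
    return ["python", ALGORITHMS[algo], f"exp={exp}", f"seed={seed}", f"gpu={gpu}"]


if __name__ == "__main__":
    # Print one SAC command per seed instead of launching it.
    for seed in range(3):
        print(shlex.join(build_command("sac", "HalfCheetah-v2", seed)))
```

Building an argument list rather than a single string avoids shell quoting issues if the list is later handed to `subprocess.run`.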

For the ablation experiments (ELU or LayerNorm), use the following commands:

python train_sac2.py gpu=0 seed=0 env=Ant-v2 actor.nn_size=256 critic.nn_size=256 agent.architecture='nn2' agent.activation='elu' agent.use_layer_norm=False

python train_sac2.py gpu=0 seed=0 env=Ant-v2 actor.nn_size=256 critic.nn_size=256 agent.architecture='nn2' agent.activation='relu' agent.use_layer_norm=True

python train_sac2.py gpu=0 seed=0 env=Ant-v2 actor.nn_size=256 critic.nn_size=256 agent.architecture='nn2' agent.activation='elu' agent.use_layer_norm=True
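
The three ablation commands differ only in `agent.activation` and `agent.use_layer_norm`. A minimal sketch that enumerates the same three settings, assuming the overrides behave as shown above; `ABLATIONS` and `ablation_command` are hypothetical names introduced for illustration, not repository functions:

```python
# (activation, use_layer_norm) pairs exactly as they appear in the commands above.
ABLATIONS = [("elu", False), ("relu", True), ("elu", True)]


def ablation_command(activation, use_layer_norm, env="Ant-v2", seed=0, gpu=0, nn_size=256):
    """Return the argument list for one train_sac2.py ablation run."""
    return [
        "python", "train_sac2.py",
        f"gpu={gpu}", f"seed={seed}", f"env={env}",
        f"actor.nn_size={nn_size}", f"critic.nn_size={nn_size}",
        # The shell strips the quotes in agent.architecture='nn2', so the
        # script receives the bare value.
        "agent.architecture=nn2",
        f"agent.activation={activation}",
        f"agent.use_layer_norm={use_layer_norm}",
    ]


if __name__ == "__main__":
    # Print the three ablation commands instead of launching them.
    for activation, use_layer_norm in ABLATIONS:
        print(" ".join(ablation_command(activation, use_layer_norm)))
```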

For MPO without ELU and LayerNorm:

python train_mpo2.py gpu=0 seed=0 exp=Ant-v2

Reference

This codebase is based on PFRL.
