# A PyTorch Reimplementation of MBPO

A PyTorch reimplementation of MBPO ("When to Trust Your Model: Model-Based Policy Optimization").
**Note:** The owner of this repo has graduated, and this repo is no longer maintained. Please refer to the new MBPO PyTorch re-implementation, a submodule of the Unstable Baselines project maintained by researchers from the same lab. That re-implementation strictly follows the original TensorFlow implementation and has been tested on several MuJoCo tasks.
## Requirements

Please refer to `./requirements.txt`.

## Installation

```sh
pip install -e .
```
## Usage

```sh
# default hyperparameters are in ./configs/mbpo.yaml;
# remember to CHANGE proj_dir to your actual directory
python ./mbpo_pytorch/scripts/run_mbpo.py

# you can also override hyperparameters by passing args, e.g.
python ./mbpo_pytorch/scripts/run_mbpo.py --set seed=0 verbose=1 device="'cuda:0'" env.env_name='FixedHopper'
```
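The `--set` overrides above use dotted keys (e.g. `env.env_name`) to reach into the nested YAML config. As a rough illustration only (this is a hypothetical sketch, not code from this repo — the actual parsing logic may differ), such overrides can be applied to a nested config dict like so:

```python
# Hypothetical sketch of applying dotted "key=value" overrides
# (e.g. from --set) to a nested, YAML-style config dict.
import ast

def apply_override(config, assignment):
    """Apply one 'dotted.key=value' override to config in place."""
    key, _, raw = assignment.partition("=")
    node = config
    parts = key.split(".")
    for part in parts[:-1]:
        node = node.setdefault(part, {})   # walk/create nested dicts
    try:
        value = ast.literal_eval(raw)      # parse numbers, strings, bools
    except (ValueError, SyntaxError):
        value = raw                        # fall back to the raw string
    node[parts[-1]] = value

config = {"seed": 1, "env": {"env_name": "Hopper"}}
for override in ["seed=0", "env.env_name='FixedHopper'"]:
    apply_override(config, override)
print(config)  # {'seed': 0, 'env': {'env_name': 'FixedHopper'}}
```

This also explains the doubled quoting in `device="'cuda:0'"`: the outer quotes are consumed by the shell, so the value the script receives is still a quoted string literal.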