
Reproducing experiments


This wiki provides instructions on how to reproduce most of the experiments presented in the NeurIPS 2022 Offline RL Workshop paper "Towards Data-Driven Offline Simulations for Online Reinforcement Learning" by Shengpu Tang, Felipe Vieira Frujeri, Dipendra Misra, Alex Lamb, John Langford, Paul Mineiro, and Sebastian Kochman.

Figure 2: Illustrative Example of Evaluation Protocol


Figure 2 in the paper illustrates the fidelity vs. efficiency trade-off between different simulations. See Appendix B.1 in the paper for details.

To see how this figure was produced, refer to the notebook notebooks/metrics.ipynb.
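As a rough illustration of the kind of comparison behind such a figure, the sketch below scores a simulation by how well it ranks candidate policies relative to the real environment. The function name, metric choice (Spearman rank correlation), and the data are illustrative assumptions, not the notebook's actual implementation.

```python
# Illustrative sketch only: the real metric definitions live in
# notebooks/metrics.ipynb. Rank correlation between simulated and real
# policy returns is an assumed stand-in for a fidelity measure.
import numpy as np
from scipy.stats import spearmanr

def rank_fidelity(sim_returns, env_returns):
    """Higher is better: does the simulation rank policies the same way
    the real environment does?"""
    return spearmanr(sim_returns, env_returns).correlation

# Hypothetical returns for five candidate policies
sim_returns = np.array([0.8, 0.6, 0.9, 0.3, 0.5])
env_returns = np.array([0.7, 0.5, 0.95, 0.2, 0.55])
print(rank_fidelity(sim_returns, env_returns))  # close to 1.0 => high fidelity
```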

Figure 4: State Encoder


The steps below cover: collecting Continuous Grid data with a random policy, visualizing state visitation in the collected dataset, and using (or training from scratch) the observation encoder to generate latent states. You can run only the steps you need.

Collecting Continuous Grid data with a random policy

python examples/continuous_grid/random_agent_rollout.py
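The command above runs the data-collection script. As a rough sketch of what such a script does, the loop below rolls out a uniformly random policy and records transitions; the gym-style reset/step API and the tuple-based dataset format are assumptions, and the actual logic is in examples/continuous_grid/random_agent_rollout.py.

```python
# Minimal sketch of random-policy data collection, assuming a gym-style
# environment API; see examples/continuous_grid/random_agent_rollout.py
# for the real script and dataset format.
def collect_random_rollouts(env, num_episodes=1000, max_steps=100):
    dataset = []
    for _ in range(num_episodes):
        obs = env.reset()
        for _ in range(max_steps):
            action = env.action_space.sample()          # uniformly random policy
            next_obs, reward, done, info = env.step(action)
            dataset.append((obs, action, reward, next_obs, done))
            obs = next_obs
            if done:
                break
    return dataset
```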

Visualizing state visitation in the collected dataset

  • To visualize the state visitation in your dataset, use the notebook; a rough plotting sketch follows below.
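A rough plotting sketch, assuming the collected dataset exposes the visited 2-D (x, y) positions of the Continuous Grid; the notebook contains the actual visualization code.

```python
# Sketch of a state-visitation heatmap over visited (x, y) positions.
import numpy as np
import matplotlib.pyplot as plt

def plot_state_visitation(positions, bins=50):
    """positions: array of shape (N, 2) with visited (x, y) states."""
    positions = np.asarray(positions)
    plt.hist2d(positions[:, 0], positions[:, 1], bins=bins, cmap="viridis")
    plt.colorbar(label="visit count")
    plt.xlabel("x")
    plt.ylabel("y")
    plt.title("State visitation under the random policy")
    plt.show()
```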

Using the observation encoder (state decoder) to generate latent states

  • We have made a model checkpoint for the HOMER-based encoder available here; a rough sketch of loading a checkpoint and generating latent states follows the training command below.

  • To visualize the latent state representation captured by the encoder, use the same notebook as before.

  • To train the encoder from scratch, run the script below with the following configuration:

python examples/continuous_grid/train_homer_encoder.py --num_epochs=1000 --seed=0 --batch_size=64 --latent_size=50 --hidden_size=64 --lr=1e-3 --weight_decay=0.0 --temperature_decay=False --output_dir='outputs/models' --num_samples=100000
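Once a checkpoint exists (either the one linked above or one produced by the training command), generating latent states amounts to running observations through the encoder. The checkpoint filename, loading call, and argmax readout below are assumptions for illustration, not the repository's exact API.

```python
# Hedged sketch: load a trained encoder and map observations to discrete
# latent-state ids. The path under --output_dir and the argmax readout
# over --latent_size logits are assumptions.
import torch

encoder = torch.load("outputs/models/homer_encoder.pt", map_location="cpu")
encoder.eval()

def encode_observations(encoder, observations):
    """Return one latent-state id per observation by taking the argmax
    over the encoder's latent logits."""
    with torch.no_grad():
        logits = encoder(torch.as_tensor(observations, dtype=torch.float32))
        return logits.argmax(dim=-1)
```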
