Reproducibility study on Performative Reinforcement Learning

This repository contains code to reproduce and extend the results of the paper Performative Reinforcement Learning.

The original repository is available here: https://github.com/gradanovic/icml2023-performative-rl-paper-code.

Overview

This repository extends the original codebase with the following features:

  • a sampling mode that estimates the rewards and transition probabilities from i.i.d. samples of the occupancy measure at each iteration, without generating trajectories (a minimal sketch of this estimation appears after this list):
python run_experiment.py --sampling --occupancy_iid
  • additional plots that show the state-space coverage (the iterations to plot are specified in run_experiment.py; by default the first 10 iterations are plotted)

  • additional plots that show the trajectory length

  • additional files that provide the transition probabilities for the main agent and the follower agents; these files are only generated for the last iteration
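
The sketch below illustrates the idea of estimating a tabular model from i.i.d. occupancy-measure samples. It is not the repository's implementation: the helper callables sample_occupancy and env_step and the tabular dimensions are hypothetical placeholders used only for illustration.

    import numpy as np

    def estimate_model(sample_occupancy, env_step, n_states, n_actions, n_samples, rng):
        # Empirical transition counts and accumulated rewards per (state, action) pair.
        counts = np.zeros((n_states, n_actions, n_states))
        reward_sum = np.zeros((n_states, n_actions))
        for _ in range(n_samples):
            s, a = sample_occupancy(rng)      # (s, a) drawn i.i.d. from the occupancy measure
            s_next, r = env_step(s, a, rng)   # one environment transition from (s, a)
            counts[s, a, s_next] += 1.0
            reward_sum[s, a] += r
        visits = counts.sum(axis=2)
        # Normalize counts into probabilities; unvisited (s, a) pairs fall back to a uniform model.
        P_hat = np.where(visits[..., None] > 0,
                         counts / np.maximum(visits[..., None], 1.0),
                         1.0 / n_states)
        R_hat = reward_sum / np.maximum(visits, 1.0)
        return P_hat, R_hat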

Structure of the repository

  • src/ : This folder contains all the source code files required for generating the experiments' data and figures.
  • data/ : This folder is where all the data will be generated.
  • figures/ : This folder is where all the figures will be generated.
  • limiting_envs/ : This folder is for storing visualizations of the environment.

Prerequisites

Python 3 with the following third-party packages:

matplotlib
seaborn
numpy
cvxpy
cvxopt
click
joblib
tqdm

The remaining imports used by the code (copy, itertools, time, multiprocessing, statistics, json, contextlib, os, cmath) are part of the Python standard library and require no separate installation.
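
Assuming a standard pip setup (this command is not part of the original repository), the third-party packages can be installed with:

pip install matplotlib seaborn numpy cvxpy cvxopt click joblib tqdm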

Running the code

To replicate the paper's results exactly as we did, please run run_experiment.py with the following configurations.

Repeated Policy Optimization (Fig. 2)

python run_experiment.py --fbeta=10

Repeated Gradient Ascent (Fig. 3)

python run_experiment.py --gradient

Repeated Policy Optimization with Finite Samples (Fig. 4a)

python run_experiment.py --sampling

Solving Lagrangian (Fig. 4b)

python run_experiment.py --sampling --lagrangian

Sampling from the occupancy measure (Fig. 5)

python run_experiment.py --sampling --occupancy_iid
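
For convenience, all five configurations can be run one after another with a small Python script like the following sketch (not part of the original repository; it only reuses the flags listed above):

    # run_all.py -- runs the five configurations above sequentially.
    import subprocess

    CONFIGS = [
        ["--fbeta=10"],                     # Repeated Policy Optimization (Fig. 2)
        ["--gradient"],                     # Repeated Gradient Ascent (Fig. 3)
        ["--sampling"],                     # Repeated Policy Optimization with Finite Samples (Fig. 4a)
        ["--sampling", "--lagrangian"],     # Solving Lagrangian (Fig. 4b)
        ["--sampling", "--occupancy_iid"],  # Sampling from the occupancy measure (Fig. 5)
    ]

    for flags in CONFIGS:
        subprocess.run(["python", "run_experiment.py", *flags], check=True)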

Results

After running the above commands, the plots are generated in the figures directory; the output data and the transition probabilities are generated in the data directory.

Contact Details

For any questions or comments, contact [email protected] or [email protected].
