To compile the tile-coding library:

    cd tile-coding
    make
    export PYTHONPATH=$PYTHONPATH:pathto/tile-coding

A collection of notebooks written by the students of McGill COMP-767, Intro to RL. We used a "bring your own assignment" model in which students created their own "assignment" related to the course material. The assignments generally took the form of a Jupyter notebook exploring some questions empirically and/or theoretically.
In no particular order, a few awesome notebooks:
- Empirical study of $ETD(\lambda,\beta)$
- An Empirical Exploration of Provisional Temporal Difference Learning
- Synthetic gradients for REINFORCE
- A tile coder in theano for Reinforcement Learning tasks
- An implementation of LSTD(lambda), TD(0) and RLSTD(0) for the Boyan Chain problem
- Bias-Variance trade-off for Monte-Carlo methods
- An analysis of bias-variance tradeoff of Sarsa, Expected Sarsa, Double Sarsa, and Double Expected Sarsa
- Convergence visualization in the eigenbasis of $P_\pi$
The instructions are provided in the README.md in :
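The example code relies on memory overcommitment, which is useful to know about. The overcommit mode is exposed through `/proc/sys/vm/overcommit_memory`: read it with `cat /proc/sys/vm/overcommit_memory`, and set it by writing a new value to the same file (root required). From the kernel documentation: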
> 1 - Always overcommit. Appropriate for some scientific applications. Classic example is code using sparse arrays and just relying on the virtual memory consisting almost entirely of zero pages.

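
Before running the example code, it can help to check which overcommit mode the machine is in. The following is a minimal sketch, not part of the original notebooks: the helper names `read_overcommit_mode` and `describe_overcommit_mode` are hypothetical, and the mode summaries are paraphrased from the kernel documentation. It assumes a Linux system and returns `None` when the file is absent or unreadable.

```python
from pathlib import Path
from typing import Optional

# Standard Linux location of the overcommit setting.
OVERCOMMIT_PATH = Path("/proc/sys/vm/overcommit_memory")

# Short paraphrases of the three modes listed in the kernel documentation.
MODE_SUMMARY = {
    0: "heuristic overcommit (kernel default)",
    1: "always overcommit",
    2: "never overcommit beyond the configured limit",
}


def read_overcommit_mode(path: Path = OVERCOMMIT_PATH) -> Optional[int]:
    """Return the overcommit mode as an int, or None if it cannot be read."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        # Missing file (e.g. not Linux), permission error, or unexpected content.
        return None


def describe_overcommit_mode(mode: Optional[int]) -> str:
    """Map a mode value to a short human-readable summary."""
    if mode is None:
        return "unknown (not Linux, or the file is unreadable)"
    return MODE_SUMMARY.get(mode, f"unrecognized mode {mode}")


if __name__ == "__main__":
    mode = read_overcommit_mode()
    print(f"overcommit_memory = {mode}: {describe_overcommit_mode(mode)}")
```

If the reported mode is not 1, it can be changed as root with `sysctl vm.overcommit_memory=1`, or by writing `1` to the file above.
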
- OpenAI Baselines
- The original Mountain Car code written by Richard Sutton: Mountain Car Software
- The tile coding library used: the RLAI Tile Coding Software