This repo was designed to be run in the pytorch/pytorch Docker image.
Clone the repo:
git clone https://github.com/persuck/memory-pixel-interp.git
Install deps:
./setup.sh
Or run the commands yourself:
pip install -r requirements.txt
# To get video logging to work you may need to install video codecs:
# check if "libx264" is installed:
ffmpeg -encoders | grep 264
# install them if necessary
# conda install -y -c conda-forge x264
conda install -y -c conda-forge x264=='1!164.3095' ffmpeg=6.1.1
Optional - make a virtual env before pip install:
python -m venv env && source ./env/bin/activate && pip install -r requirements.txt
VSCode:
- ⌘⇧P
- Extensions: Show Recommended Extensions
- ☁ Install Workspace Recommended Extensions
- Select default python env:
python 3.10.13 ('base')
Core components of this project:
- memory agent: Uses a simplified view of the internal memory (raw RAM) of the game "breakout" as state.
- pixel agent: Uses downscaled, greyscale screen pixels of the game "breakout" as state.
- state extractor: The raw memory of breakout is a very large state space, so a script extracts key variables from it by keeping only the variables that change between frames. These are likely to be useful parts of the game state rather than constants. Any information about the screen pixels (aka the "frame buffer") must be removed, so that the memory agent learns strictly from the variables that make up the state and not from the pixel data, which is supposed to be the sole domain of the pixel agent.
- Atari breakout: Emulation of the game "breakout" for the Atari, provided by gym.
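As a rough illustration of the state extractor idea described above (keep only the memory addresses that change between frames, and drop anything belonging to the frame buffer), here is a minimal sketch. It is not the script from this repo: the function names `changing_addresses` and `extract_state`, the `exclude` parameter, and the toy 6-byte "RAM" are all assumptions made for the example.

```python
from typing import Iterable, List, Sequence


def changing_addresses(
    snapshots: Sequence[bytes], exclude: Iterable[int] = ()
) -> List[int]:
    """Return addresses whose value differs from the first snapshot at least once.

    `exclude` is a hypothetical way to drop known frame-buffer addresses so
    that no pixel data leaks into the memory agent's state.
    """
    if not snapshots:
        return []
    first = snapshots[0]
    size = len(first)
    if any(len(s) != size for s in snapshots):
        raise ValueError("all RAM snapshots must have the same length")
    excluded = set(exclude)
    return [
        addr
        for addr in range(size)
        if addr not in excluded
        and any(s[addr] != first[addr] for s in snapshots[1:])
    ]


def extract_state(ram: bytes, addresses: Sequence[int]) -> List[int]:
    """Reduce a full RAM snapshot to the values at the selected addresses."""
    return [ram[addr] for addr in addresses]


# Toy example: 6 bytes of "RAM" over three frames.
# Addresses 1, 4 and 5 change; address 5 is treated as frame-buffer data.
frames = [
    bytes([7, 10, 0, 255, 3, 9]),
    bytes([7, 11, 0, 255, 3, 8]),
    bytes([7, 12, 0, 255, 2, 7]),
]
addresses = changing_addresses(frames, exclude=[5])
state = extract_state(frames[-1], addresses)
```

In practice the snapshots would come from the emulator rather than hand-written bytes, and the excluded range would depend on the memory layout of the emulated machine.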
[X] Create state extractor: Write a script to extract useful variables from emulator memory by tracking which variables change between frames (i.e. remove constants)
[X] Create memory agent for GBC breakout: Train RL model with PPO to play breakout using the extracted state
[X] Create pixel agent for GBC breakout: Train RL model with PPO to play breakout using downscaled, greyscale screen pixels from emulated breakout
[X] Compare performance (loss, and time taken to converge) of memory agent against pixel agent on GBC breakout
[ ] Once the memory agent has identified key variables in GBC breakout RAM, identify function vectors in the pixel agent: randomly sample these variables, generate the corresponding screen pixels, feed them into the pixel agent, and measure the activations
[ ] Visualise all activations in the pixel agent, labelled with the variable (RAM address) they represent
[ ] Repeat the Atari breakout experiment across a wider variety of Atari games and extract features for all games
[ ] Use the pixel agent's function vectors as a guide to which RAM addresses map to which parts of the screen, and annotate the activations chart with pictures of the screen
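The pixel agent items above rely on "downscaled, greyscale screen pixels". A minimal sketch of that kind of preprocessing is shown below, assuming frames arrive as rows of (R, G, B) tuples. The luminance weights (ITU-R BT.601) and the block-averaging downscale are assumptions chosen for the example; the repo may well use a library routine or an environment wrapper instead.

```python
from typing import List, Sequence, Tuple

RGB = Tuple[int, int, int]


def to_greyscale(frame: Sequence[Sequence[RGB]]) -> List[List[float]]:
    """Convert an RGB frame to luminance using the BT.601 weights (assumed)."""
    return [
        [0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
        for row in frame
    ]


def downscale(grey: Sequence[Sequence[float]], factor: int) -> List[List[float]]:
    """Shrink a greyscale frame by averaging non-overlapping factor x factor blocks."""
    height, width = len(grey), len(grey[0])
    if factor <= 0 or height % factor or width % factor:
        raise ValueError("frame dimensions must be divisible by a positive factor")
    out = []
    for top in range(0, height, factor):
        row = []
        for left in range(0, width, factor):
            block = [
                grey[y][x]
                for y in range(top, top + factor)
                for x in range(left, left + factor)
            ]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


# Toy 2x4 frame: left half white, right half black.
WHITE, BLACK = (255, 255, 255), (0, 0, 0)
frame = [
    [WHITE, WHITE, BLACK, BLACK],
    [WHITE, WHITE, BLACK, BLACK],
]
small = downscale(to_greyscale(frame), factor=2)  # one row, two pixels
```

Averaging blocks keeps the example dependency-free; a real pipeline would more likely resize with an image library before feeding frames to the agent.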