Async-RL

This is a repository where I attempt to reproduce the results of "Asynchronous Methods for Deep Reinforcement Learning" (http://arxiv.org/abs/1602.01783). It is still a work in progress, and the results do not yet match those of the original paper.

Any feedback is welcome :)

Current Status

I trained A3C on ALE's Breakout with 8 processes for about 2 days and 5 hours. Scores from test runs during training are plotted below; one test run was performed every 100000 training steps (as counted by the global shared counter).

A3C scores on Breakout

You can watch the trained model play Breakout with the following command:

python demo_a3c_ale.py <path-to-breakout-rom> trained_model/breakout_48100000.h5

Some Hyperparameters

  • RMSprop
  • learning rate: initialized to 3.5e-4 for the policy and 7e-4 for the value function, linearly annealed to zero over training
  • epsilon: 0.1 (note that epsilon is added inside the square root)
  • alpha: 0.99

Requirements

  • Python 3.5.1
  • chainer 1.8.1
  • cached-property 1.3.0
  • h5py 2.5.0
  • Arcade-Learning-Environment

Training

python a3c_ale.py <number-of-processes> <path-to-atari-rom>

a3c_ale.py will save best-so-far models and test scores into the output directory.
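The global shared counter mentioned above can be sketched with Python's `multiprocessing` module. This is a simplified, hypothetical illustration of how asynchronous workers advance one shared step counter, not the actual training loop in a3c_ale.py; the worker function, argument names, and step sizes are assumptions.

```python
import multiprocessing as mp

def worker(global_t, t_max=5, total_steps=100):
    # Hypothetical worker loop: each process repeatedly runs a local
    # rollout of t_max steps and advances the global shared counter.
    while True:
        with global_t.get_lock():
            if global_t.value >= total_steps:
                break
            global_t.value += t_max
        # ... compute gradients locally and apply them to shared
        # parameters here (omitted in this sketch) ...

if __name__ == "__main__":
    global_t = mp.Value("l", 0)  # global shared step counter
    procs = [mp.Process(target=worker, args=(global_t,)) for _ in range(8)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    print(global_t.value)  # training stops once the counter reaches total_steps
```

Testing (and, in the paper's setup, learning-rate annealing) is keyed to this single counter rather than to any one worker's local step count.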

Evaluation

python demo_a3c_ale.py <path-to-atari-rom> <trained-model>

Similar Projects
