Actor-Critic algorithms with Ray core

Asynchronous Advantage Actor-Critic [1]

In A3C, multiple agents run asynchronously in parallel to generate data. This provides a more practical alternative to experience replay, since parallelization also diversifies and decorrelates the data [2].
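To make the parallel data generation concrete, below is a minimal sketch using Ray core with Gymnasium environments. The RolloutWorker actor, the random placeholder policy, and all names are illustrative assumptions, not this repository's code:

```python
import ray
import gymnasium as gym

ray.init()

@ray.remote
class RolloutWorker:
    """Each worker owns a private copy of the environment."""

    def __init__(self, env_name: str, seed: int):
        self.env = gym.make(env_name)
        self.obs, _ = self.env.reset(seed=seed)

    def rollout(self, n_steps: int):
        """Collect a short trajectory; a random policy stands in for the actor."""
        transitions = []
        for _ in range(n_steps):
            action = self.env.action_space.sample()
            next_obs, reward, terminated, truncated, _ = self.env.step(action)
            transitions.append((self.obs, action, reward, next_obs))
            self.obs = next_obs
            if terminated or truncated:
                self.obs, _ = self.env.reset()
        return transitions

# Four workers step their environments concurrently; distinct seeds help
# decorrelate the collected trajectories.
workers = [RolloutWorker.remote("CartPole-v1", seed=i) for i in range(4)]
batches = ray.get([w.rollout.remote(20) for w in workers])
```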

There is a global network and many worker agents, each with its own set of parameters. Each agent interacts with its own copy of the environment at the same time as the other agents, and each updates the shared network independently of the others' execution.

We use a parameter server to hold the global network, following the parameter-server example in the Ray documentation [3].
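As a rough sketch of that pattern (loosely adapted from the Ray documentation's parameter-server example [3]), the ParameterServer actor below holds the global network and applies gradients as workers push them. The ActorCritic module and the compute_gradients placeholder are assumptions standing in for a real rollout and A3C loss:

```python
import ray
import torch
import torch.nn as nn

ray.init()

class ActorCritic(nn.Module):
    """Shared torso with policy and value heads (illustrative sizes)."""

    def __init__(self, obs_dim: int = 4, n_actions: int = 2):
        super().__init__()
        self.torso = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU())
        self.policy = nn.Linear(64, n_actions)
        self.value = nn.Linear(64, 1)

def compute_gradients(model):
    """Placeholder: a real worker would backpropagate an actor-critic loss
    computed from an environment rollout."""
    model.zero_grad()
    h = model.torso(torch.randn(1, 4))
    loss = model.value(h).pow(2).mean() - model.policy(h).log_softmax(-1).mean()
    loss.backward()
    return [p.grad.clone() for p in model.parameters()]

@ray.remote
class ParameterServer:
    """Holds the global network and applies incoming gradients."""

    def __init__(self, lr: float = 1e-3):
        self.model = ActorCritic()
        self.optimizer = torch.optim.Adam(self.model.parameters(), lr=lr)

    def apply_gradients(self, grads):
        self.optimizer.zero_grad()
        for p, g in zip(self.model.parameters(), grads):
            p.grad = g
        self.optimizer.step()
        return self.get_weights()

    def get_weights(self):
        return self.model.state_dict()

@ray.remote
def worker_loop(ps, n_updates: int):
    model = ActorCritic()
    weights = ray.get(ps.get_weights.remote())
    for _ in range(n_updates):
        model.load_state_dict(weights)
        grads = compute_gradients(model)
        # Push gradients and pull fresh weights on this worker's own schedule.
        weights = ray.get(ps.apply_gradients.remote(grads))

ps = ParameterServer.remote()
ray.get([worker_loop.remote(ps, 100) for _ in range(4)])
```

Because each worker calls apply_gradients on its own schedule, updates to the global network arrive asynchronously, which is the defining property of A3C.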

Todo

  • Implement n-step evaluation
  • Implement continuous action spaces
  • Implement a config for hyperparameter configuration
  • Add an explanation of A3C

Footnotes

  1. Mnih et al., Asynchronous Methods for Deep Reinforcement Learning, 2016.

  2. Distributed Deep Reinforcement Learning: An Overview.

  3. Parameter Server example, Ray documentation.
