Actor-Critic algorithms with Ray core

Asynchronous Advantage Actor-Critic ¹

In A3C, multiple agents run asynchronously in parallel to generate data. This approach provides a practical alternative to experience replay, since parallelization also diversifies and decorrelates the data ².

There is a global network and many worker agents, each with its own set of parameters. Each agent interacts with its own copy of the environment while the other agents interact with theirs, and it updates the shared global network independently of the other agents' execution.

We use a parameter server to hold the global network, following ³.
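The pattern described above can be sketched roughly as follows. This is a minimal illustration, not the code in this repository: it uses standard-library threads in place of Ray actors so that it runs without extra dependencies, and the names `ParameterServer`, `toy_gradient`, `worker`, and `run` are hypothetical. The toy quadratic gradient is only a stand-in for the actual actor-critic loss gradient.

```python
import random
import threading


class ParameterServer:
    """Holds the global parameters; workers pull copies and push gradients.

    Hypothetical sketch: a lock stands in for the serialization that a
    Ray actor would provide for its method calls.
    """

    def __init__(self, n_params, lr=0.1):
        self.params = [0.0] * n_params
        self.lr = lr
        self.n_updates = 0
        self._lock = threading.Lock()

    def get_params(self):
        # Return a copy so each worker keeps its own local parameters.
        with self._lock:
            return list(self.params)

    def apply_gradients(self, grads):
        # Plain gradient-descent step on the global parameters.
        with self._lock:
            self.params = [p - self.lr * g for p, g in zip(self.params, grads)]
            self.n_updates += 1


def toy_gradient(params, rng):
    # Assumption: stand-in for the A3C loss gradient. This is the gradient
    # of 0.5 * (p - 1)^2 with a little noise, so the optimum is p = 1.
    return [(p - 1.0) + rng.gauss(0.0, 0.01) for p in params]


def worker(server, worker_id, n_steps):
    rng = random.Random(worker_id)  # each worker sees different "data"
    for _ in range(n_steps):
        local_params = server.get_params()        # sync with the global network
        grads = toy_gradient(local_params, rng)   # compute gradients locally
        server.apply_gradients(grads)             # push without waiting for others


def run(n_workers=4, n_steps=50):
    server = ParameterServer(n_params=3)
    threads = [
        threading.Thread(target=worker, args=(server, i, n_steps))
        for i in range(n_workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return server


if __name__ == "__main__":
    server = run()
    print(server.n_updates, server.get_params())
```

With Ray core, the server would typically be a class decorated with `@ray.remote`, and each worker would call its methods through `.remote()` and fetch results with `ray.get`, but the pull, compute, push cycle stays the same.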

Todo

Implement the evaluation in n-step
Implement continuous mode
Implement a config for hyperparameters
Add an explanation of A3C
