
Deep Reinforcement Learning Methods

This repository contains implementations of several popular DRL methods.

Methods

  • Deep Q-Networks (DQN)

  • Vanilla Policy Gradient (VPG)

  • Vanilla Actor Critic (VAC)

  • Advantage Actor Critic (A2C)

  • Natural Policy Gradient (NPG)

  • Proximal Policy Optimization (PPO)

  • Deep Deterministic Policy Gradient (DDPG)

  • Twin Delayed DDPG (TD3)

  • Asynchronous Advantage Actor Critic (A3C)

  • Soft Actor Critic (SAC)

    NOTE: These methods are not optimized. Only PPO and DDPG work on continuous action spaces.
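
To give a flavour of the kind of update these agents implement, below is a minimal sketch of the one-step DQN temporal-difference loss in PyTorch. The names and batch layout (q_net, target_net, the tuple of tensors) are illustrative assumptions and do not necessarily match the code in this repository.

import torch
import torch.nn.functional as F

def dqn_loss(q_net, target_net, batch, gamma=0.99):
    # batch: tensors (obs [B, obs_dim], act [B], rew [B], next_obs [B, obs_dim], done [B])
    obs, act, rew, next_obs, done = batch
    q = q_net(obs).gather(1, act.long().unsqueeze(1)).squeeze(1)   # Q(s, a) for the taken actions
    with torch.no_grad():
        next_q = target_net(next_obs).max(dim=1).values            # max_a' Q_target(s', a')
        target = rew + gamma * (1.0 - done) * next_q               # no bootstrap at terminal states
    return F.mse_loss(q, target)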

Environments

  • Classic OpenAI Gym

    Environment                Observation Space   Action Space
    CartPole-v1                Box, 4              Discrete, 2
    Pendulum-v0                Box, 3              Box, 1
    MountainCar-v0             Box, 2              Discrete, 3
    MountainCarContinuous-v0   Box, 2              Box, 1
    Acrobot-v1                 Box, 6              Discrete, 3
  • MuJoCo

    Environment                 Observation Space   Action Space
    Ant-v2                      Box, 111            Box, 8
    HalfCheetah-v2              Box, 17             Box, 6
    Hopper-v2                   Box, 11             Box, 3
    Humanoid-v2                 Box, 376            Box, 17
    HumanoidStandup-v2          Box, 376            Box, 17
    InvertedDoublePendulum-v2   Box, 11             Box, 1
    InvertedPendulum-v2         Box, 4              Box, 1
    Reacher-v2                  Box, 11             Box, 2
    Swimmer-v2                  Box, 8              Box, 2
    Walker2d-v2                 Box, 17             Box, 6
  • Robotics (Not implemented)
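
The space details in the tables above can be checked directly through the Gym API. A quick sketch, assuming the older gym package (the -v0/-v2 environment IDs above predate Gymnasium) and, for the MuJoCo IDs, a working mujoco-py installation:

import gym

for env_id in ["CartPole-v1", "Pendulum-v0", "HalfCheetah-v2"]:
    env = gym.make(env_id)
    print(env_id, env.observation_space, env.action_space)
    env.close()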

Installation

For MuJoCo installation, refer to these links:

Usage

$ python main.py --help

main is powered by Hydra.

== Configuration groups ==
Compose your configuration from those groups (group=option)

agent: dqn, vpg


== Config ==
Override anything in the config (foo.bar=value)

agent:
  _target_: agent.dqn.DQNAgent
  obs_dim: ???
  act_dim: ???
  buffer_size: 10000
  min_size: 100
  batch_size: 32
  epsilon: 1
  epsilon_decay: 0.95
  min_epsilon: 0.01
  gamma: 0.99
  alpha: 0.001
  target_update: 100
env: CartPole-v0
num_episodes: 1000
device: cuda
plot: false


Powered by Hydra (https://hydra.cc)
Use --hydra-help to view Hydra specific help
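
Because the configuration is composed by Hydra, any group or field shown above can be overridden from the command line. For example (the keys are taken from the config shown above; the values here are only illustrative):

$ python main.py agent=vpg env=Pendulum-v0 num_episodes=500 device=cpu
$ python main.py agent=dqn agent.epsilon_decay=0.99 agent.batch_size=64 plot=true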


Todo

  • Implement SAC
  • Unit testing
  • Better logs, plots
