Maramy93/Deep_RL_Project

Problem understanding

Environment description

A pole is attached by an un-actuated joint to a cart, which moves along a frictionless track. The system is controlled by applying a force of +1 or -1 to the cart. The pendulum starts upright, and the goal is to prevent it from falling over. A reward of +1 is provided for every timestep that the pole remains upright. The episode ends when the pole is more than 15 degrees from vertical, or the cart moves more than 2.4 units from the center. More information can be found here. The figure below illustrates the environment:
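The termination and reward rules described above can be sketched in a few lines of Python. This is a minimal illustration, not the environment's actual source: the function names `episode_over` and `total_reward` are hypothetical, and the thresholds are simply copied from the description above.

```python
import math

# Thresholds copied from the description above (assumption: not read from
# the environment's source code).
ANGLE_LIMIT_DEG = 15.0
POSITION_LIMIT = 2.4


def episode_over(cart_position, pole_angle_rad):
    """Return True when the stated termination rule is met."""
    too_tilted = abs(math.degrees(pole_angle_rad)) > ANGLE_LIMIT_DEG
    off_track = abs(cart_position) > POSITION_LIMIT
    return too_tilted or off_track


def total_reward(trajectory):
    """Count +1 for each (position, angle) state before the first terminating one."""
    reward = 0
    for cart_position, pole_angle_rad in trajectory:
        if episode_over(cart_position, pole_angle_rad):
            break
        reward += 1
    return reward
```

For example, a trajectory whose third state has the cart at position 3.0 would yield a total reward of 2 under this sketch.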

Goal of the paper

In this paper, we use a Deep Q-Network (DQN) agent and a Dueling DQN agent to control the CartPole v1 system. The main papers we rely on are: DQN and Dueling DQN.
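To make the difference between the two agents concrete, here is a minimal sketch of the two core formulas from the cited papers: the one-step DQN target and the dueling aggregation of a state value with per-action advantages. The function names, the default `gamma`, and the plain-list inputs are assumptions chosen for illustration; they are not taken from this repository's code.

```python
def dqn_target(reward, next_q_values, gamma=0.99, done=False):
    """One-step TD target: r + gamma * max_a' Q(s', a'), or just r at episode end."""
    if done:
        return reward
    return reward + gamma * max(next_q_values)


def dueling_q_values(state_value, advantages):
    """Dueling aggregation: Q(s, a) = V(s) + A(s, a) - mean over a' of A(s, a')."""
    mean_advantage = sum(advantages) / len(advantages)
    return [state_value + a - mean_advantage for a in advantages]
```

Subtracting the mean advantage keeps the value and advantage streams identifiable, which is the aggregation variant recommended in the Dueling DQN paper.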

The problem is to prevent the pole from falling by moving the cart left or right (these two moves form the action space). To solve the problem (see the CartPole v1 description), the agent needs to receive an average total reward greater than or equal to $475$ over $100$ consecutive episodes. The figure below shows this:
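The solving criterion can be checked with a rolling window over episode returns. The sketch below is one possible way to do it; the class name `SolvedTracker` and its methods are hypothetical and not part of this project.

```python
from collections import deque


class SolvedTracker:
    """Track episode returns and report when the rolling average reaches the threshold."""

    def __init__(self, threshold=475.0, window=100):
        self.threshold = threshold
        # deque with maxlen drops the oldest return automatically.
        self.returns = deque(maxlen=window)

    def add(self, episode_return):
        """Record one episode's total reward and return the current solved status."""
        self.returns.append(episode_return)
        return self.is_solved()

    def is_solved(self):
        # Require a full window so early lucky episodes do not count as solved.
        if len(self.returns) < self.returns.maxlen:
            return False
        return sum(self.returns) / len(self.returns) >= self.threshold
```

In a training loop, calling `tracker.add(total_reward)` after each episode and stopping when it returns `True` would implement the criterion stated above.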

Install the Project

$ git clone https://github.com/benjaminbenteke/Deep_RL_Project.git 
$ cd Deep_RL_Project

Virtual environment

Mac OS

Create virtual environment

$ python3 -m venv ENV_NAME

Activate your environment

$ source ENV_NAME/bin/activate

Linux OS

Create virtual environment

$ conda create -n ENV_NAME python

Activate your environment

$ conda activate ENV_NAME

Requirement installations

Before running the project, install all the requirements:

$ pip install -r requirements.txt 

Running the model

$ python main.py --model MODEL_NAME

Example of running models

$ python main.py --model dqn
$ python main.py --model dueling
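For readers unfamiliar with this kind of command-line switch, the sketch below shows how a `--model` flag with the two values used above could be parsed with `argparse`. It is purely illustrative: the actual contents of `main.py` are not shown here, and the function name `parse_args` and the default value are assumptions.

```python
import argparse


def parse_args(argv=None):
    """Parse a --model flag; a hypothetical sketch, not the project's main.py."""
    parser = argparse.ArgumentParser(description="Train a CartPole agent (illustrative).")
    parser.add_argument(
        "--model",
        choices=["dqn", "dueling"],  # the two values used in the examples above
        default="dqn",               # assumed default, chosen for illustration
        help="which agent to train",
    )
    return parser.parse_args(argv)
```

With `choices` set, an unsupported model name makes `argparse` exit with a usage message instead of failing later in training.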

Results Presentation

DQN result

Dueling DQN result


Contributors
