A pole is attached by an un-actuated joint to a cart, which moves along a frictionless track. The system is controlled by applying a force of +1 or -1 to the cart. The pole starts upright, and the goal is to prevent it from falling over. A reward of +1 is provided for every timestep that the pole remains upright. The episode ends when the pole is more than 15 degrees from vertical, or the cart moves more than 2.4 units from the center. More information can be found here. The figure below illustrates the setup:
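The termination rule and reward described above can be sketched in a few lines of Python. This is only an illustration: the two thresholds are copied from the description, while the function names and the `(cart_position, pole_angle_rad)` state layout are assumptions made for this example and are not taken from the repository or from the environment's source code.

```python
import math

# Thresholds taken from the description above.
ANGLE_LIMIT_DEG = 15.0
POSITION_LIMIT = 2.4


def episode_over(cart_position, pole_angle_rad):
    """Return True when the pole has tilted too far or the cart has left the track."""
    too_tilted = abs(math.degrees(pole_angle_rad)) > ANGLE_LIMIT_DEG
    out_of_bounds = abs(cart_position) > POSITION_LIMIT
    return too_tilted or out_of_bounds


def total_reward(trajectory):
    """Count +1 for every state in which the pole is still upright.

    `trajectory` is a hypothetical sequence of (cart_position, pole_angle_rad)
    pairs, one per timestep.
    """
    reward = 0
    for cart_position, pole_angle_rad in trajectory:
        if episode_over(cart_position, pole_angle_rad):
            break
        reward += 1
    return reward
```

For example, a trajectory whose third state has a pole angle of 0.3 rad (about 17 degrees) ends there, so only the first two timesteps are rewarded.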
In this project, we used Deep Q-Network (DQN) and Dueling DQN agents to control the CartPole v1 system. The main papers that we used are: DQN and Dueling DQN.
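To make the difference between the two agents concrete, here is a minimal, framework-free sketch of the two formulas involved: the one-step DQN target, and the way a dueling head combines a state value with per-action advantages (using the mean-advantage form). The function names and the default discount factor `gamma=0.99` are assumptions chosen for illustration; the networks and hyperparameters actually used in this repository may differ.

```python
def td_target(reward, next_q_values, done, gamma=0.99):
    """One-step DQN target: r + gamma * max_a' Q(s', a'), or just r at episode end.

    gamma=0.99 is an illustrative default, not a value taken from this project.
    """
    if done:
        return reward
    return reward + gamma * max(next_q_values)


def dueling_q_values(state_value, advantages):
    """Dueling aggregation: Q(s, a) = V(s) + A(s, a) - mean_a' A(s, a')."""
    mean_advantage = sum(advantages) / len(advantages)
    return [state_value + a - mean_advantage for a in advantages]
```

CartPole has two actions (push left, push right), so `advantages` would hold two numbers here. Subtracting the mean advantage keeps the value and advantage streams identifiable, since otherwise a constant could be shifted between them without changing the Q-values.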
The problem is to prevent the vertical pole from falling by moving the cart left or right (these two moves form the action space). To solve the problem (see the CartPole v1 description), the agent needs to receive an average total reward greater than or equal to
$ git clone https://github.com/benjaminbenteke/Deep_RL_Project.git
$ cd Deep_RL_Project
$ python3 -m venv ENV_NAME
$ source ENV_NAME/bin/activate
Alternatively, with conda:
$ conda create -n ENV_NAME python
$ conda activate ENV_NAME
Before running, make sure to install all the requirements:
$ pip install -r requirements.txt
$ python main.py --model MODEL_NAME
$ python main.py --model dqn
$ python main.py --model dueling
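The `--model` flag used in the commands above could be handled roughly as follows. This is a hypothetical sketch based only on the two values shown (`dqn` and `dueling`); the `parse_args` helper and the help text are assumptions, and the actual `main.py` in the repository may parse its arguments differently.

```python
import argparse


def parse_args(argv=None):
    """Parse the command line; `argv=None` means use sys.argv (argparse default)."""
    parser = argparse.ArgumentParser(description="Train an agent on CartPole v1")
    parser.add_argument(
        "--model",
        choices=["dqn", "dueling"],  # the two values shown in the commands above
        required=True,
        help="which agent to train",
    )
    return parser.parse_args(argv)


if __name__ == "__main__":
    args = parse_args()
    print(f"Selected model: {args.model}")
```

Restricting the flag with `choices` makes a typo such as `--model duelling` fail immediately with a usage message instead of surfacing later during training.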
Dueling DQN result