Interactive Learning with Corrective Feedback for Policies based on Deep Neural Networks

Code for the paper "Interactive Learning with Corrective Feedback for Policies based on Deep Neural Networks", to be presented at ISER 2018.

This code is based on the following publication:

  1. Interactive Learning with Corrective Feedback for Policies based on Deep Neural Networks (Preprint)

Authors: Rodrigo Pérez-Dattari, Carlos Celemin, Javier Ruiz-del-Solar, Jens Kober.

Link to paper video: Paper Video

Installation

To use the code, first install the gym toolkit (release v0.9.6): https://github.com/openai/gym

Then, copy the files in the gym folder of this repository into the installed gym folder on your PC, replacing files that already exist. Two gym environments are modified:

  1. Continuous-CartPole: a continuous-action version of the Gym CartPole environment.

  2. CarRacing: the Gym CarRacing environment with some bug fixes and a modified main loop for database generation.

To download and install some pretrained networks (CarRacing autoencoder, CarRacing simulated teacher, CartPole simulated teacher) and a CarRacing database for training the autoencoder, run (inside the D-COACH folder):

sh install_comp_files.sh

Requirements

  • setuptools==38.5.1
  • numpy==1.13.3
  • opencv_python==3.4.0.12
  • matplotlib==2.2.2
  • tensorflow==1.4.0
  • pyglet==1.3.2
  • gym==0.9.6

Usage

  1. To run the main program, type in the terminal (inside the D-COACH folder):
python main.py --config-file <environment>

The default configuration files are car_racing and cartpole; replace <environment> with one of these names.

To be able to give feedback to the agent, the environment rendering window must be selected/clicked.
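
For reference, the sketch below shows one way keyboard feedback can be read from gym's pyglet rendering window, which is why that window needs focus. This is only an illustration: the LEFT/RIGHT key bindings are assumptions for the example and not necessarily the exact mechanism used in this repository.

import gym
import numpy as np
from pyglet.window import key

feedback = np.zeros(1)  # corrective signal in {-1, 0, +1}

def on_key_press(k, mod):
    # Hypothetical key bindings, used here only for illustration.
    if k == key.LEFT:
        feedback[0] = -1.0
    elif k == key.RIGHT:
        feedback[0] = 1.0

env = gym.make('CartPole-v0')
env.reset()
env.render()  # creates the pyglet window; it must have focus to receive key events
env.unwrapped.viewer.window.on_key_press = on_key_press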

To train the autoencoder for the high-dimensional state environments, run (inside the D-COACH folder):

python autoencoder.py
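
For orientation, here is a minimal sketch of a convolutional autoencoder in TensorFlow 1.x, the kind of model used to compress high-dimensional observations into a low-dimensional state. It is not the architecture used in this repository; the 64x64 grayscale input shape and the layer sizes are assumptions for the example.

import tensorflow as tf

# Input batch of frames, assumed here to be 64x64 grayscale images in [0, 1].
observations = tf.placeholder(tf.float32, shape=[None, 64, 64, 1])

# Encoder: compress each frame into a small latent feature map.
conv1 = tf.layers.conv2d(observations, filters=16, kernel_size=3, strides=2,
                         padding='same', activation=tf.nn.relu)
conv2 = tf.layers.conv2d(conv1, filters=8, kernel_size=3, strides=2,
                         padding='same', activation=tf.nn.relu)
latent = tf.layers.conv2d(conv2, filters=4, kernel_size=3, strides=2,
                          padding='same', activation=tf.nn.relu)

# Decoder: reconstruct the frame from the latent representation.
deconv1 = tf.layers.conv2d_transpose(latent, filters=8, kernel_size=3, strides=2,
                                     padding='same', activation=tf.nn.relu)
deconv2 = tf.layers.conv2d_transpose(deconv1, filters=16, kernel_size=3, strides=2,
                                     padding='same', activation=tf.nn.relu)
reconstruction = tf.layers.conv2d_transpose(deconv2, filters=1, kernel_size=3, strides=2,
                                            padding='same', activation=tf.nn.sigmoid)

# Train by minimizing the pixel-wise reconstruction error.
loss = tf.losses.mean_squared_error(labels=observations, predictions=reconstruction)
train_op = tf.train.AdamOptimizer(1e-3).minimize(loss)
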
  2. To generate a database for the CarRacing environment, run the (replaced) file car_racing.py in the downloaded gym repository.

To modify the dimensions of the images in the generated database, place the database in the D-COACH folder and, from that folder, run:

python tools/transform_database_dim.py
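
For reference, the kind of resizing this tool performs can be sketched with OpenCV as below. The file name database.npy, the target 64x64 size, and the [N, H, W] array layout are assumptions for the example, not the repository's actual format.

import cv2
import numpy as np

frames = np.load('database.npy')  # assumed shape: [N, H, W]
resized = np.stack([cv2.resize(frame, (64, 64)) for frame in frames])
np.save('database_64x64.npy', resized)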

Comments

The D-COACH algorithm is designed to work with continuous-action problems. Given that the CartPole environment of gym was designed to work with a discrete action space, a modified continuous version of this environment is used.
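
As an illustration of this kind of modification (not the repository's actual implementation), a discrete CartPole can be exposed with a one-dimensional continuous action in [-1, 1] by scaling the applied force with the action's magnitude and choosing its direction with the action's sign:

import gym
import numpy as np
from gym import spaces


class ContinuousCartPole(object):
    """Wraps the discrete Gym CartPole so it accepts a scalar action in [-1, 1]."""

    def __init__(self):
        self.env = gym.make('CartPole-v0')
        self.observation_space = self.env.observation_space
        self.action_space = spaces.Box(low=-1.0, high=1.0, shape=(1,))
        self._max_force = self.env.unwrapped.force_mag

    def reset(self):
        return self.env.reset()

    def render(self):
        return self.env.render()

    def step(self, action):
        a = float(np.clip(action, -1.0, 1.0))
        # Scale the force magnitude by |a| and pick its direction with the sign of a
        # (in the underlying environment, action 1 pushes right and 0 pushes left).
        self.env.unwrapped.force_mag = self._max_force * abs(a)
        return self.env.step(1 if a >= 0.0 else 0)


env = ContinuousCartPole()
state = env.reset()
state, reward, done, info = env.step(np.array([0.3]))  # push right with 30% of the maximum force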

This code has been tested on Ubuntu 16.04 with Python >= 3.5.

Tests with the CartPole environment using the classic COACH algorithm can be run with the following repository: COACH

Troubleshooting

If you run into problems of any kind, don't hesitate to open an issue on this repository. It is quite possible that you have run into some bug we are not aware of.