A final project of (5SC28) Machine Learning for System and Control 2021/2022 course at TU Eindhoven

grafaelw/5SC28-ML4SC

Machine Learning for System and Control

A final project for the 5SC28 Machine Learning for System and Control 2021/2022 course at TU Eindhoven. The project covers modelling an unbalanced disk and controlling it so that it swings up to the $180^{\circ}$ position and, for the multi-target policy, also reaches targets at $\pm10^{\circ}$ from the swing-up position.

Setup

Data-driven modelling (System Identification)

The modelling uses the NARX model structure, implemented both as a Gaussian Process (with scikit-learn) and as an Artificial Neural Network (with PyTorch). The Gaussian Process uses exact inference, so training may take several hours. The grid-search Gaussian Process implementation also gives a good approximation.
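As a rough illustration of the NARX structure, the sketch below builds the regressor matrix in which each output sample is paired with a window of past inputs and past outputs. It is a minimal example, not the code used in this repository: the function name `make_narx_regressors` and the lag orders `na` and `nb` are assumptions chosen for illustration.

```python
def make_narx_regressors(u, y, na=2, nb=2):
    """Build NARX training pairs (hypothetical helper, not from this repo).

    Each target y[k] is paired with the regressor
    [u[k-nb], ..., u[k-1], y[k-na], ..., y[k-1]].
    """
    if len(u) != len(y):
        raise ValueError("u and y must have the same length")
    start = max(na, nb)  # first index with enough history available
    X, Y = [], []
    for k in range(start, len(y)):
        past_u = list(u[k - nb:k])  # nb most recent inputs
        past_y = list(y[k - na:k])  # na most recent outputs
        X.append(past_u + past_y)
        Y.append(y[k])
    return X, Y
```

The resulting `X` and `Y` can then be passed to any regressor, for example scikit-learn's `GaussianProcessRegressor` or a small PyTorch network, to learn the one-step-ahead map.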

Data-driven control (Reinforcement Learning)

The overall objective is to learn a policy that swings up an unbalanced disk (see the Gym Unbalanced Disk library by Gerben Beintema). We implemented the following methods:

  1. DQN (Deep Q-Network) with stable-baselines3
  2. A2C (Advantage Actor Critic) with PyTorch. Use a2c_image.ipynb to generate figures and a2c_eval.py to evaluate the model.
  3. SAC (Soft Actor-Critic) with stable-baselines3
  4. Classical Q-Learning (Tabular Q-learning)
  5. Multi_SAC for the multi-target policy ($\pm10^{\circ}$), using the SAC method with stable-baselines3
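To make the tabular Q-learning entry above concrete, here is a minimal sketch of the update rule on a toy problem. It does not use the unbalanced disk environment: the 1-D chain, the function name `q_learning_chain`, and all hyperparameter values are assumptions for illustration only. On the disk, the states would instead come from a discretised angle and angular velocity.

```python
import random


def q_learning_chain(n_states=5, episodes=500, alpha=0.5, gamma=0.9,
                     epsilon=0.2, max_steps=200, seed=0):
    """Tabular Q-learning on a hypothetical 1-D chain (not the disk).

    The agent starts in state 0 and gets reward 1 on reaching the last state.
    """
    rng = random.Random(seed)
    Q = [[0.0, 0.0] for _ in range(n_states)]  # actions: 0 = left, 1 = right
    goal = n_states - 1
    for _ in range(episodes):
        s = 0
        for _ in range(max_steps):
            # Epsilon-greedy action selection with random tie-breaking.
            if rng.random() < epsilon:
                a = rng.randrange(2)
            else:
                best = max(Q[s])
                a = rng.choice([i for i in range(2) if Q[s][i] == best])
            s_next = min(s + 1, goal) if a == 1 else max(s - 1, 0)
            done = s_next == goal
            r = 1.0 if done else 0.0
            # Q-learning target: bootstrap from the greedy value of s_next.
            target = r if done else r + gamma * max(Q[s_next])
            Q[s][a] += alpha * (target - Q[s][a])
            s = s_next
            if done:
                break
    return Q
```

After training, the greedy policy `max(range(2), key=lambda a: Q[s][a])` should choose "right" in every non-terminal state, since that is the shortest path to the reward.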