This repository contains projects developed for the course Intelligent Systems - 366 at the University of Alberta (Edmonton, Canada), taught by Richard Sutton.
All projects use the RL-Glue interface (http://glue.rl-community.org/wiki/Main_Page). However, the environment and agent for every problem were developed entirely by the student (Thiago Mayllart Macedo Silva). The code was written solely to solve the problems set as assignments during the course.
This code is not authorized for "copy and paste", and its use by other students enrolled in Intelligent Systems may constitute plagiarism.
All solutions and techniques follow the algorithms described in the book "Reinforcement Learning: An Introduction".
-
- Bandit Task Programming: recreation of the learning curves for the optimistic bandit agent and the epsilon-greedy agent in Figure 2.3 of "Reinforcement Learning: An Introduction". https://github.com/thiagomayllart/Reinforcement-Learning---UofA/tree/master/Bandit%20Task%20Programming
-
- On-policy Monte Carlo Control with Exploring Starts for action values (described in Section 5.3) on the Gambler’s problem described in Chapter 4 (Example 4.3). There are two changes from the problem specification in the book: pr(heads) is set to 0.55, and a bet of zero is not allowed. https://github.com/thiagomayllart/Reinforcement-Learning---UofA/tree/master/On-policy%20Monte%20Carlo%20Control%20with%20Exploring%20Starts%20for%20action%20values
-
- Windy Gridworld with King’s Moves. https://github.com/thiagomayllart/Reinforcement-Learning---UofA/tree/master/Windy%20Gridworld%20with%20King%E2%80%99s%20Moves
-
- Dyna-Q on the grid world described in Example 8.1 of the "Reinforcement Learning: An Introduction" textbook. https://github.com/thiagomayllart/Reinforcement-Learning---UofA/tree/master/Dyna-Q%20on%20the%20grid%20world
-
- Three prediction agents based on TD(0) in RL-Glue, each using a different function approximation scheme: tabular feature encoding, tile coding features, and state aggregation. https://github.com/thiagomayllart/Reinforcement-Learning---UofA/tree/master/Prediction%20agents%20based%20on%20TD(0)
-
- Solving Mountain Car in RL-Glue: a car learns how to climb a mountain given the mountain slope, a starting position, and a goal position (the peak). It relies on the physical properties of the problem to learn and to climb the mountain. https://github.com/thiagomayllart/Reinforcement-Learning---UofA/tree/master/Solving%20Mountain%20Car%20in%20RL-Glue
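
For readers unfamiliar with the Gambler’s problem, the two changes listed above (pr(heads) = 0.55 and no zero bet) can be illustrated with a minimal sketch. This is not the code from this repository and it does not use RL-Glue; the names `GOAL`, `P_HEADS`, `legal_bets`, and `step` are hypothetical and chosen only for illustration. The goal of 100 and the reward of +1 on reaching it follow Example 4.3 of the book.

```python
import random

GOAL = 100       # capital at which the gambler wins (Example 4.3 of the book)
P_HEADS = 0.55   # modified probability of heads, as stated in the list above


def legal_bets(capital):
    """Stakes allowed for a given capital; a bet of zero is excluded.

    Hypothetical helper: returns an empty list in the terminal states 0 and GOAL.
    """
    return list(range(1, min(capital, GOAL - capital) + 1))


def step(capital, bet, rng=random):
    """Flip the coin once and return (reward, next_capital, terminal).

    `rng` only needs a random() method returning a float in [0, 1).
    """
    if bet not in legal_bets(capital):
        raise ValueError(f"illegal bet {bet} for capital {capital}")
    heads = rng.random() < P_HEADS
    next_capital = capital + bet if heads else capital - bet
    terminal = next_capital in (0, GOAL)
    reward = 1.0 if next_capital == GOAL else 0.0
    return reward, next_capital, terminal
```

A Monte Carlo agent with exploring starts would, under these assumptions, pick a random starting capital between 1 and 99 and a random first bet from `legal_bets(capital)`, then call `step` repeatedly until `terminal` is true.
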
My sincere thanks to Richard Sutton (https://www.ualberta.ca/science/about-us/contact-us/faculty-directory/rich-sutton) for sharing so much of his knowledge about Reinforcement Learning, including many techniques he developed himself. An incredible professor and academic.