A collection of general reinforcement learning (RL) algorithms, implemented as a training exercise. Work in progress.
Reinforcement learning is a type of machine learning that is concerned with teaching agents how to make decisions in an environment. The agent learns to achieve a goal in an uncertain, potentially complex environment.
Do not run any of these on a CPU; training is meant for a GPU.
https://arxiv.org/abs/1509.02971
DDPG (Deep Deterministic Policy Gradient) is an off-policy, model-free reinforcement learning algorithm that combines the deterministic policy gradient with experience replay in an actor-critic architecture. It is designed for continuous action spaces.
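To make the experience replay part concrete, here is a minimal sketch of a replay buffer using only the standard library. It is not taken from this repository: the `ReplayBuffer` and `Transition` names, the field layout, and the `seed` parameter are all assumptions chosen for illustration.

```python
import random
from collections import deque, namedtuple

# One environment step. Field names are illustrative, not this project's own.
Transition = namedtuple(
    "Transition", ["state", "action", "reward", "next_state", "done"]
)


class ReplayBuffer:
    """Fixed-size FIFO store of transitions with uniform random sampling."""

    def __init__(self, capacity, seed=None):
        # deque(maxlen=...) evicts the oldest transition once capacity is reached.
        self.memory = deque(maxlen=capacity)
        self.rng = random.Random(seed)  # hypothetical seed argument, for reproducibility

    def push(self, state, action, reward, next_state, done):
        self.memory.append(Transition(state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform sampling without replacement breaks the temporal correlation
        # between consecutive steps of the same episode.
        batch = self.rng.sample(list(self.memory), batch_size)
        # Regroup into one tuple per field: states, actions, rewards, ...
        return Transition(*zip(*batch))

    def __len__(self):
        return len(self.memory)
```

In an off-policy setup like DDPG, the agent would typically `push` every step it takes and, once the buffer holds at least one batch, `sample` from it to update the actor and critic.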
DDPG is used in this project to solve the Lunar Lander problem from Gymnasium (the maintained fork of the deprecated OpenAI Gym), which is considered solved at a score of +200. This model reached a 100-episode running average of +200 after 891 episodes.
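The "solved" criterion can be checked with a short helper like the one below. This is a sketch, not the project's own logging code: the function name is hypothetical, and it assumes the running average only counts once a full 100-episode window is available.

```python
from collections import deque

SOLVED_SCORE = 200.0  # threshold quoted above
WINDOW = 100          # number of most recent episodes averaged


def first_solved_episode(scores, window=WINDOW, threshold=SOLVED_SCORE):
    """Return the 1-based episode where the running average first reaches
    the threshold, or None if it never does."""
    recent = deque(maxlen=window)
    for episode, score in enumerate(scores, start=1):
        recent.append(score)
        # Assumption: require a full window, so a few lucky early
        # episodes cannot count as "solved".
        if len(recent) == window and sum(recent) / window >= threshold:
            return episode
    return None
```

Passing the list of per-episode returns from a training run gives the episode count to report.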
Requires Python 3, PyTorch, NumPy, Matplotlib and Gymnasium. Make sure the installed versions are compatible with each other.
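One way to see which versions are installed before checking compatibility is the small script below. It is a sketch that uses only the standard library; the package names are the PyPI distribution names, and the `installed_versions` helper is hypothetical, not part of this project.

```python
from importlib.metadata import PackageNotFoundError, version

# PyPI distribution names for the dependencies listed above.
PACKAGES = ["torch", "numpy", "matplotlib", "gymnasium"]


def installed_versions(packages=PACKAGES):
    """Map each package name to its installed version string, or None if missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions().items():
        print(f"{name}: {ver or 'not installed'}")
```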
https://arxiv.org/abs/1801.01290
TODO
https://arxiv.org/abs/1707.06347
TODO
https://arxiv.org/abs/1706.02275
TODO
https://arxiv.org/abs/1803.11485
TODO