Some basic examples for reinforcement learning

Installing Anaconda and Gymnasium

Download and install Anaconda from the official Anaconda website
Install the essential dev libraries on Linux or WSL (Windows Subsystem for Linux)

sudo apt-get update
sudo apt-get install build-essential

Create a conda env to manage dependencies, then activate it

conda create -n conda_env python=3.10
conda activate conda_env

Install gymnasium (packages installed with pip also go into the active conda env). The quotes keep shells such as zsh from expanding the square brackets.

pip install "gymnasium[all]"
pip install "gymnasium[atari]"
pip install "gymnasium[accept-rom-license]"

# Try the next line if box2d-py fails to install.
conda install swig

Install ai2thor if you want to run navigation_agent.py

pip install ai2thor==2.4.10

Install torch with either conda or pip (run only one of the two commands below)

conda install pytorch torchvision torchaudio pytorch-cuda=12.1 -c pytorch -c nvidia

pip install torch torchvision torchaudio

Install other dependencies

pip install numpy pandas matplotlib

Examples

Play with the environment and visualize the agent behaviour

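import gymnasium as gym
render = True # set to False to run without opening a window
if render:
    env = gym.make('CartPole-v0', render_mode='human')
else:
    env = gym.make('CartPole-v0')
observation, info = env.reset(seed=0)
for _ in range(1000):
    # take a random action
    observation, reward, terminated, truncated, info = env.step(env.action_space.sample())
    if terminated or truncated:
        # the episode is over, so start a new one before stepping again
        observation, info = env.reset()
env.close()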

Random play with CartPole-v0

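import gymnasium as gym
env = gym.make('CartPole-v0')
for i_episode in range(20):
    # reset() returns (observation, info) in Gymnasium
    observation, info = env.reset()
    for t in range(100):
        print(observation)
        action = env.action_space.sample()
        observation, reward, terminated, truncated, info = env.step(action)
        # stop stepping once the episode has ended
        if terminated or truncated:
            print(f"Episode {i_episode} finished after {t + 1} timesteps")
            break
env.close()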
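
The random-play loop above can be wrapped in a function that also records the episode return, which is the number most later experiments compare. This is a minimal sketch under stated assumptions: run_episode and random_policy are illustrative names, and CountdownEnv is a hypothetical stand-in that only mimics the Gymnasium reset/step signatures so the block runs without extra installs.

```python
import random


class CountdownEnv:
    """Hypothetical stand-in mimicking the Gymnasium reset/step signatures.

    It pays reward 1.0 per step and terminates after `length` steps.
    """

    def __init__(self, length=5):
        self.length = length
        self.t = 0

    def reset(self, seed=None):
        self.t = 0
        return 0, {}  # (observation, info)

    def step(self, action):
        self.t += 1
        terminated = self.t >= self.length
        # (observation, reward, terminated, truncated, info)
        return self.t, 1.0, terminated, False, {}


def run_episode(env, policy, max_steps=100):
    """Run one episode and return (total_reward, steps_taken)."""
    observation, info = env.reset()
    total_reward, steps = 0.0, 0
    for _ in range(max_steps):
        action = policy(observation)
        observation, reward, terminated, truncated, info = env.step(action)
        total_reward += reward
        steps += 1
        if terminated or truncated:
            break
    return total_reward, steps


def random_policy(observation):
    # CartPole has two discrete actions (0 and 1); the observation is ignored.
    return random.randint(0, 1)


if __name__ == "__main__":
    returns = [run_episode(CountdownEnv(), random_policy)[0] for _ in range(20)]
    print("mean return:", sum(returns) / len(returns))
```

With Gymnasium installed, the same run_episode should work unchanged on gym.make('CartPole-v0'), since it relies only on reset() returning (observation, info) and step() returning the five-element tuple.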

Example code for random play (Pong-ram-v0, Acrobot-v1, Breakout-v0)

python my_random_agent.py Pong-ram-v0

Very naive learnable agent playing CartPole-v0 or Acrobot-v1

python my_learning_agent.py CartPole-v0
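
The script itself is not reproduced here. As an illustration of what a very naive learnable agent can look like, the sketch below uses random search over the weights of a linear policy. This is an assumption chosen for simplicity and may differ from what my_learning_agent.py does; random_search, linear_policy, and the toy objective are all illustrative names.

```python
import random


def random_search(evaluate, dim, iterations=200, scale=1.0, seed=0):
    """Sample random weight vectors and keep the best-scoring one.

    `evaluate` maps a list of `dim` floats to a score (higher is better),
    for example the return of one episode played with those weights.
    """
    rng = random.Random(seed)
    best_weights, best_score = None, float("-inf")
    for _ in range(iterations):
        weights = [rng.uniform(-scale, scale) for _ in range(dim)]
        score = evaluate(weights)
        if score > best_score:
            best_weights, best_score = weights, score
    return best_weights, best_score


def linear_policy(weights, observation):
    """Pick action 1 if the weighted sum of the observation is positive, else 0."""
    return 1 if sum(w * x for w, x in zip(weights, observation)) > 0 else 0


if __name__ == "__main__":
    # Toy objective standing in for an episode return; it peaks at (0.5, -0.5).
    toy = lambda w: -((w[0] - 0.5) ** 2 + (w[1] + 0.5) ** 2)
    print(random_search(toy, dim=2))
```

For CartPole, evaluate would play one episode with linear_policy(weights, observation) choosing each action and return the summed reward, with dim=4 to match the four-dimensional observation.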

Playing Pong on CPU (accompanied by a great blog post). A pretrained model, pong_model_bolei.p (trained for 20,000 episodes), is included; to load it, set save_file in the script to point at that file.

python pg-pong.py
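
Policy-gradient training of the kind the script name suggests typically weights each action by the discounted return that followed it. The sketch below shows that one step in isolation. It is a generic version under the assumption of a single episode and a fixed discount factor; the function name and the default gamma are illustrative rather than taken from pg-pong.py.

```python
def discounted_returns(rewards, gamma=0.99):
    """Return G_t = r_t + gamma * r_{t+1} + gamma^2 * r_{t+2} + ... for every t."""
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step reuses the return already computed for t + 1.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


if __name__ == "__main__":
    # A reward arriving only at the last step is propagated back with decay.
    print(discounted_returns([0.0, 0.0, 1.0], gamma=0.5))  # [0.25, 0.5, 1.0]
```

In a game like Pong, where a nonzero reward marks the end of a rally, implementations often reset the running sum at those points so that credit does not leak across rallies.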

Random navigation agent in AI2THOR

python navigation_agent.py

Training PPO agent to control car with MetaDrive and Stable-Baselines3:

https://metadrive-simulator.readthedocs.io/en/latest/training.html

Training PPO agent to control robot dog (quadruped robot) with Genesis and rsl_rl:

https://genesis-world.readthedocs.io/en/latest/user_guide/getting_started/locomotion.html

