I am currently working on a project in which Reinforcement Learning is used in combination with V-REP. My V-REP server runs on Windows 10.
The reinforcement learning algorithms are implemented using the Nervana Systems Coach, which runs on Ubuntu; more specifically, I run it on the Windows Subsystem for Linux.
To allow communication between the AI and the simulation, the simulation is wrapped as an OpenAI Gym environment.
For this reason I have cloned vrep_env, Gym, and Coach, so that I can document my changes, make it easier to reinstall this project on other computer systems, and manage versions and changes.
V-REP integrated with OpenAI Gym. This project aims to provide a superclass for V-REP Gym environments, analogous to MuJoCo-env for MuJoCo.
In order to smooth the installation process, define the variables VREP_PATH and VREP_SCENES_PATH, which should point to your V-REP installation folder and your V-REP scenes folder, respectively.
Example:
```sh
export VREP_PATH=/example/some/path/to/V-REP_PRO_EDU_V3_4_0_Linux/
export VREP_SCENES_PATH=/example/again/V-REP_PRO_EDU_V3_4_0_Linux/scenes/
```
These variables will be used as default if the respective argument is not provided. Next, simply install via pip:
```sh
pip3 install --upgrade git+https://github.com/ycps/vrep-env.git#egg=vrep_env
```
In order to create your own V-REP Gym environments, simply extend the `VrepEnv` class and fill in the gaps. You may use the `ExampleVrepEnv` as a template base or check the fully functional `HopperVrepEnv` (similar to the MuJoCo / Roboschool Hopper); a minimal sketch is also shown below.
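The following is only a rough sketch of what such a subclass might look like. The constructor arguments and helper method names used here (`scene_path`, `obj_get_handle`, `obj_get_joint_angle`, `obj_set_velocity`, `step_simulation`, and so on) are illustrative assumptions; check `ExampleVrepEnv` for the names the superclass actually provides.

```python
# Illustrative sketch only -- method and argument names are assumptions;
# consult ExampleVrepEnv in the vrep_env repository for the real template.
import numpy as np
from gym import spaces
from vrep_env import vrep_env


class MyRobotVrepEnv(vrep_env.VrepEnv):
    def __init__(self, server_addr='127.0.0.1', server_port=19997,
                 scene_path='/path/to/my_scene.ttt'):
        # The superclass connects to the already-running V-REP instance.
        vrep_env.VrepEnv.__init__(self, server_addr, server_port, scene_path)

        # Hypothetical scene object: one revolute joint to observe and actuate.
        self.joint_handle = self.obj_get_handle('my_joint')

        # Define observation and action spaces as in any Gym environment.
        self.observation_space = spaces.Box(low=-np.inf, high=np.inf, shape=(2,))
        self.action_space = spaces.Box(low=-1.0, high=1.0, shape=(1,))

    def _make_observation(self):
        # Read whatever scene state makes up your observation vector.
        angle = self.obj_get_joint_angle(self.joint_handle)
        self.observation = np.array([np.cos(angle), np.sin(angle)])

    def _make_action(self, action):
        # Apply the agent's action to the simulation.
        self.obj_set_velocity(self.joint_handle, float(action[0]))

    def step(self, action):
        self._make_action(action)
        self.step_simulation()          # advance V-REP by one simulation step
        self._make_observation()
        reward = 0.0                    # task-specific reward goes here
        done = False                    # task-specific termination condition
        return self.observation, reward, done, {}

    def reset(self):
        if self.sim_running:
            self.stop_simulation()
        self.start_simulation()
        self._make_observation()
        return self.observation
```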
Before starting your environment, an instance of V-REP should already be running. It uses port 19997 by default, but this can be overridden in the class initialization.
Check the `HopperVrepEnv` for a simple running example. It can be run as:
```sh
python3 hopper_vrep_env.py
```
If everything was installed correctly, you should see a random agent struggling to hop:
You may have to register the envs as follows:
```python
from gym.envs.registration import register

register(id='VrepCartPole-v0', entry_point='cartpole_vrep_env:CartPoleVrepEnv', max_episode_steps=200, reward_threshold=195.0)
register(id='VrepCartPoleContinuous-v0', entry_point='cartpole_continuous_vrep_env:CartPoleContinuousVrepEnv', max_episode_steps=200, reward_threshold=195.0)
register(id='VrepHopper-v0', entry_point='hopper_vrep_env:HopperVrepEnv', max_episode_steps=1000)
```
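Once registered, the environments can be instantiated through the standard Gym API. The following minimal random-agent loop assumes the `register(...)` calls above have already been executed and a V-REP instance is running on the default port:

```python
import gym

# Create one of the registered environments (registration must have run first).
env = gym.make('VrepCartPole-v0')

observation = env.reset()
done = False
while not done:
    action = env.action_space.sample()                    # random agent
    observation, reward, done, info = env.step(action)    # advance the simulation
env.close()
```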
| Environment Id | Observation Space | Action Space | Max Episode Steps | Based On |
|---|---|---|---|---|
| VrepCartPole-v0 | Box(4) | Discrete(2) | 200 | CartPole-v1 |
| VrepCartPoleContinuous-v0 | Box(4) | Box(1) | 200 | CartPole-v1 |
| VrepHopper-v0 | Box(25) | Box(3) | 1000 | Hopper-v* |
| VrepAnt-v0 | Box(28) | Box(8) | 1000 | Ant-v* |
VrepCartPole-v0: Based on Gym CartPole-v1 (the cart-pole problem described by Barto, Sutton, and Anderson). An agent trained in CartPole-v1 may be able to succeed in VrepCartPole-v0 without additional training.
VrepCartPoleContinuous-v0: Similar to VrepCartPole-v0, but with continuous action values.
VrepHopper-v0: Loosely based on the MuJoCo/Roboschool/PyBullet Hopper, but the dynamics behave numerically differently. (Warning: it is not known whether this env is learnable, nor whether the model is capable of hopping.)
VrepAnt-v0: Based on the MuJoCo/Roboschool/PyBullet Ant; the dynamics behave numerically similarly (though not identically). An agent trained in the original Ant envs may be able to succeed in VrepAnt-v0 with little or no additional training.
There are other similar projects that attempt to wrap V-REP and expose a Gym interface. Some of these projects also contain interesting scenes for learning different tasks.