-
Hi, I noticed that MAPPO is supported in 0.11.0, and I'm really eager to use the MAPPO algorithm in NVIDIA Isaac Sim, but this feature does not seem to be available yet. Could you please tell me when I will be able to use MAPPO? In addition, if I implement a MAPPO class in skrl.agents.torch myself, would it work?
-
Hi @394262597
Multi-agent reinforcement learning is one of the features I am working on for the next (major) release. However, you can try it with skrl by switching to or cloning the multi-agent branch. The multi-agent documentation (https://skrl.readthedocs.io/en/multi-agent/) contains more information and an example for the Bi-DexHands ShadowHandOver environment (created on top of Isaac Gym preview 4). It should be straightforward to adapt it for MAPPO and Omniverse Isaac Gym.
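For reference, below is a minimal training sketch that loosely follows the pattern of skrl's documented Bi-DexHands + MAPPO example. It is written against the module paths of the current stable skrl API (`skrl.multi_agents.torch.mappo`, `load_bidexhands_env`, `wrap_env`), which may differ on the development branch; the network sizes and hyperparameters are placeholder values, not tuned settings.

```python
import torch
import torch.nn as nn

from skrl.envs.loaders.torch import load_bidexhands_env
from skrl.envs.wrappers.torch import wrap_env
from skrl.memories.torch import RandomMemory
from skrl.models.torch import DeterministicMixin, GaussianMixin, Model
from skrl.multi_agents.torch.mappo import MAPPO, MAPPO_DEFAULT_CONFIG
from skrl.trainers.torch import SequentialTrainer

# load the Bi-DexHands environment and wrap it with skrl's multi-agent wrapper
env = load_bidexhands_env(task_name="ShadowHandOver")
env = wrap_env(env, wrapper="bidexhands")
device = env.device

# stochastic policy (decentralized actor)
class Policy(GaussianMixin, Model):
    def __init__(self, observation_space, action_space, device):
        Model.__init__(self, observation_space, action_space, device)
        GaussianMixin.__init__(self)
        self.net = nn.Sequential(nn.Linear(self.num_observations, 256), nn.ELU(),
                                 nn.Linear(256, self.num_actions))
        self.log_std_parameter = nn.Parameter(torch.zeros(self.num_actions))

    def compute(self, inputs, role):
        return self.net(inputs["states"]), self.log_std_parameter, {}

# deterministic value function (centralized critic, fed with the shared observation)
class Value(DeterministicMixin, Model):
    def __init__(self, observation_space, action_space, device):
        Model.__init__(self, observation_space, action_space, device)
        DeterministicMixin.__init__(self)
        self.net = nn.Sequential(nn.Linear(self.num_observations, 256), nn.ELU(),
                                 nn.Linear(256, 1))

    def compute(self, inputs, role):
        return self.net(inputs["states"]), {}

# one model pair and one memory per agent, keyed by agent name
models, memories = {}, {}
for name in env.possible_agents:
    models[name] = {"policy": Policy(env.observation_space(name), env.action_space(name), device),
                    "value": Value(env.shared_observation_space(name), env.action_space(name), device)}
    memories[name] = RandomMemory(memory_size=24, num_envs=env.num_envs, device=device)

cfg = MAPPO_DEFAULT_CONFIG.copy()
cfg["rollouts"] = 24  # placeholder hyperparameter

agent = MAPPO(possible_agents=env.possible_agents,
              models=models,
              memories=memories,
              cfg=cfg,
              observation_spaces=env.observation_spaces,
              action_spaces=env.action_spaces,
              device=device,
              shared_observation_spaces=env.shared_observation_spaces)

trainer = SequentialTrainer(cfg={"timesteps": 10000}, env=env, agents=agent)
trainer.train()
```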
-
Hi @394262597
The training/evaluation of multi-agent RL algorithms using skrl requires the environment (wrapped environment) to have a specific interface.
The wrapped environment interface follows the Farama PettingZoo API, as shown in https://skrl.readthedocs.io/en/multi-agent/api/envs/multi_agents_wrapping.html
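Concretely, a wrapped multi-agent environment is driven with dictionaries keyed by agent name, as in the PettingZoo parallel API. A rough interaction sketch (the exact return signature is defined in the page linked above; older API versions may return a single done flag instead of terminated/truncated):

```python
import torch

# env is assumed to be a skrl-wrapped multi-agent environment
observations, infos = env.reset()
for _ in range(100):
    # random actions for illustration: one tensor per agent, batched over num_envs
    actions = {name: 2 * torch.rand((env.num_envs, *env.action_space(name).shape), device=env.device) - 1
               for name in env.agents}
    observations, rewards, terminated, truncated, infos = env.step(actions)
```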
In your case, it is necessary to program the wrapper (inheriting from skrl's MultiAgentEnvWrapper base class)...
or (better!?) design your environment to follow the Bi-DexHands interface, so you can just use skrl's Bi-DexHands wrapper.
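For the first option, a skeleton might look like the sketch below. Note that this is hypothetical: the import path and the exact set of methods/properties to override are defined by skrl's MultiAgentEnvWrapper itself, so check the base class source before relying on it.

```python
# import path is an assumption; locate MultiAgentEnvWrapper in your skrl version
from skrl.envs.wrappers.torch.base import MultiAgentEnvWrapper


class CustomMultiAgentWrapper(MultiAgentEnvWrapper):
    """Hypothetical wrapper mapping a custom Isaac Sim environment to skrl's multi-agent interface."""

    def reset(self):
        # translate the underlying env's reset output into dicts keyed by agent name
        observations = self._env.reset()
        return {name: observations[name] for name in self.possible_agents}, {}

    def step(self, actions):
        # actions: dict of torch tensors keyed by agent name; return the per-agent
        # observation, reward, terminated, truncated and info dictionaries
        return self._env.step(actions)

    def state(self):
        # shared (global) observation used by centralized critics such as MAPPO's
        return self._env.state()

    def render(self, *args, **kwargs):
        return self._env.render(*args, **kwargs)

    def close(self):
        self._env.close()
```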
In the second case, your Omniverse Isaac Gym environment must have the following properties:
- num_envs: int
- num_agents: int
- observation_space: …

Also, the observation, shared observation, reward, done and action tensors must have the following shape: …
I hope it will be useful :)
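To make the expected interface concrete, here is a hypothetical skeleton of such an environment. The per-agent space lists and the (num_envs, num_agents, dim) tensor layout reflect one reading of the Bi-DexHands convention and should be verified against skrl's Bi-DexHands wrapper source; all class/attribute names and dimensions are placeholders.

```python
import torch
from gym import spaces


class MultiAgentTask:
    """Hypothetical Omniverse Isaac Gym task exposing a Bi-DexHands-style interface."""

    def __init__(self, num_envs=16, num_agents=2, obs_dim=8, state_dim=16, act_dim=4):
        self.num_envs = num_envs      # int
        self.num_agents = num_agents  # int
        # one space per agent (assumed layout, following Bi-DexHands)
        self.observation_space = [spaces.Box(-1.0, 1.0, (obs_dim,)) for _ in range(num_agents)]
        self.share_observation_space = [spaces.Box(-1.0, 1.0, (state_dim,)) for _ in range(num_agents)]
        self.action_space = [spaces.Box(-1.0, 1.0, (act_dim,)) for _ in range(num_agents)]
        self._obs_dim, self._state_dim = obs_dim, state_dim

    def reset(self):
        # per-agent observations and shared observations, batched over environments
        observations = torch.zeros(self.num_envs, self.num_agents, self._obs_dim)
        shared_observations = torch.zeros(self.num_envs, self.num_agents, self._state_dim)
        return observations, shared_observations, None  # available_actions: unused here

    def step(self, actions):
        # actions: tensor of shape (num_envs, num_agents, act_dim)
        observations, shared_observations, _ = self.reset()  # placeholder dynamics
        rewards = torch.zeros(self.num_envs, self.num_agents, 1)
        dones = torch.zeros(self.num_envs, self.num_agents, dtype=torch.bool)
        return observations, shared_observations, rewards, dones, [{}] * self.num_envs, None
```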