Commit c11ac05

Minor Documentation improvements in HumanoidStandup (#1284)
1 parent e732459 commit c11ac05

1 file changed

+3
-3
lines changed

gymnasium/envs/mujoco/humanoidstandup_v5.py

@@ -195,11 +195,11 @@ class HumanoidStandupEnv(MujocoEnv, utils.EzPickle):
 A reward for moving up (trying to stand up).
 This is not a relative reward, measuring how far up the robot has moved since the last timestep,
 but an absolute reward measuring how far up the Humanoid has moved up in total.
-It is measured as $w{uph} \times (z_{after action} - 0)/dt$,
-where $z_{after action}$ is the z coordinate of the torso after taking an action,
+It is measured as $w_{uph} \times \frac{z_{after\_action} - 0}{dt}$,
+where $z_{after\_action}$ is the z coordinate of the torso after taking an action,
 and $dt$ is the time between actions, which depends on the `frame_skip` parameter (default is $5$),
 and `frametime`, which is $0.01$ - so the default is $dt = 5 \times 0.01 = 0.05$,
-and $w_{uph}$ is `uph_cost_weight`.
+and $w_{uph}$ is `uph_cost_weight` (default is $1$).
 - *quad_ctrl_cost*:
 A negative reward to penalize the Humanoid for taking actions that are too large.
 $w_{quad\_control} \times \|action\|_2^2$,
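
For reference, the two formulas in this docstring hunk can be sketched in a few lines of Python. This is only an illustration of the documented arithmetic, not the environment's actual implementation: the function names `uph_cost` and `quad_ctrl_cost` are hypothetical helpers, and `w_quad_control` is left as a required argument because this hunk does not state its default.

```python
import numpy as np

# Values taken from the docstring in this diff.
FRAME_SKIP = 5    # default `frame_skip`
FRAMETIME = 0.01  # `frametime`


def uph_cost(z_after_action, uph_cost_weight=1.0,
             frame_skip=FRAME_SKIP, frametime=FRAMETIME):
    """Hypothetical helper: w_uph * (z_after_action - 0) / dt."""
    dt = frame_skip * frametime  # time between actions, 0.05 by default
    return uph_cost_weight * (z_after_action - 0.0) / dt


def quad_ctrl_cost(action, w_quad_control):
    """Hypothetical helper: w_quad_control * ||action||_2^2.

    Returns the magnitude of the penalty; the docstring describes this
    term as a negative reward, so a caller would subtract it.
    """
    action = np.asarray(action, dtype=float)
    return w_quad_control * float(np.sum(np.square(action)))
```

With the documented defaults, a torso height of 0.5 gives `uph_cost(0.5)` of roughly 0.5 / 0.05 = 10, which shows why this absolute (not relative) term grows directly with torso height.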
