Add safety limits on relative action target #373
Conversation
Thanks @alexander-soare, I just tried it and it works on my arm.
I was wondering, however, if we shouldn't put that action capping logic in the robot's config & class rather than in the record script. My reasons for wanting this are:
- I'd like to have that safety feature in general when operating the robot, not just when running a policy. For instance, if I start the teleop script while my follower arm is in resting position and the leader is at a 180° position, the follower is going to do a sudden jump, which is not very safe (both for the user and for the motors/arm). Another example I can think of is if I (a human) make a sudden move during teleop, for any reason.
- The values you defined in `policy_action_safety_cap` seem to be fundamentally associated with the motors/robot rather than the policy.
What do you think?
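Not from the PR itself, but to make the suggestion concrete: a minimal sketch of capping inside the robot class, assuming torch tensors for positions (the `make_safe` name and placement are hypothetical):

```python
import torch

def make_safe(goal_pos: torch.Tensor, present_pos: torch.Tensor, max_relative_target: torch.Tensor) -> torch.Tensor:
    # Clamp the goal so that no joint is commanded further than
    # max_relative_target away from its present position.
    diff = torch.clamp(goal_pos - present_pos, -max_relative_target, max_relative_target)
    return present_pos + diff
```

If this lived on the robot class, teleoperation and policy rollouts would both go through the same safety path.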
Same comment as Simon. We should move it to `dynamixel.py`, no?
```python
this_action = action[from_idx:to_idx]

if self.config.max_relative_target is not None:
    max_relative_target = torch.tensor(self.config.max_relative_target)
```
```diff
@@ -164,6 +165,7 @@ class KochRobotConfig:
     leader_arms: dict[str, MotorsBus] = field(default_factory=lambda: {})
     follower_arms: dict[str, MotorsBus] = field(default_factory=lambda: {})
     cameras: dict[str, Camera] = field(default_factory=lambda: {})
+    max_relative_target: list[float] | None = None
```
```diff
@@ -474,21 +498,53 @@ def capture_observation(self):
         obs_dict[f"observation.images.{name}"] = torch.from_numpy(images[name])
         return obs_dict

-    def send_action(self, action: torch.Tensor):
-        """The provided action is expected to be a vector."""
+    def send_action(self, action: torch.Tensor, follower_names: list[str] | None = None):
```
```python
if self.config.max_relative_target is not None:
    max_relative_target = torch.tensor(self.config.max_relative_target)
    # Cap relative action target magnitude for safety.
    current_pos = torch.tensor(self.follower_arms[name].read("Present_Position"))
```
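A hedged guess at how the snippet continues; the clamping lines themselves are not shown in this excerpt:

```python
# Assumed continuation: keep the goal within max_relative_target of the present position.
diff = this_action - current_pos
this_action = current_pos + torch.clamp(diff, -max_relative_target, max_relative_target)
```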
```python
Args:
    action: tensor containing the concatenated joint positions for the follower arms.
    follower_names: Pass follower arm names to only control a subset of all the follower arms.
```
Could we allow having just one value, i.e. `max_relative_target: int | list[int]`, instead of
`max_relative_target=[10, 10, 10, 10, 10, 15]`?
That way it's a bit more automated.
@Cadene I found that the gripper needs a higher limit, which is also okay because there's little load on it. I also found that in general I sometimes want to tweak the other motors. Mostly because of the gripper, though, I usually don't want to use a single value. Anyway, I'll put it in, but I won't suggest it in the docs.
```python
for name in follower_names:
    to_idx += len(self.follower_arms[name].motor_names)
    follower_goal_pos[name] = action[from_idx:to_idx].numpy()
    from_idx = to_idx
```
I have resolved this by adding a `__post_init__` validation to the config class.
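For illustration, a minimal sketch of what such a `__post_init__` validation could look like, assuming the scalar-or-list type discussed above; the PR's actual implementation may differ:

```python
from dataclasses import dataclass

@dataclass
class KochRobotConfig:
    # ... other fields elided for brevity ...
    max_relative_target: list[float] | float | None = None

    def __post_init__(self):
        # Accept either a single positive limit (applied to every motor)
        # or a list of positive per-motor limits.
        if self.max_relative_target is None:
            return
        values = (
            [self.max_relative_target]
            if isinstance(self.max_relative_target, (int, float))
            else list(self.max_relative_target)
        )
        if not values or any(v <= 0 for v in values):
            raise ValueError(
                "max_relative_target must be a positive number or a list of "
                f"positive numbers, got {self.max_relative_target!r}"
            )
```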
lerobot/configs/robot/koch.yaml
```diff
@@ -37,3 +37,4 @@ cameras:
     fps: 30
     width: 640
     height: 480
+max_relative_target: [10.0, 10.0, 10.0, 10.0, 10.0, 15.0]
```
I am sorry, I was not specific enough: one value should be the default. What do you think?
LGTM
Ideally we can add a little test. But not a blocker.
What this does
At the script level, this adds a mechanism and an option for capping the size of the relative joint-space target provided by the policy. This prevents a wild policy from burning out the motors.
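For example, with a per-motor limit of 10, if a joint is currently at position p and the policy requests p + 25, the command actually sent is capped at p + 10 (and a request of p - 25 is capped at p - 10).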
For a future PR
We decided to leave the default value of the safety limit at None (meaning no limit) in order to avoid friction for users who don't know about this parameter. Here @Cadene suggests a follow-up in case we want to default the safety limit to a reasonable value.

How it was tested / how to check out and try
Run a policy rollout like:
Then observe the output. You might see lines like:
especially if you go out of distribution (OOD).