Implementation of Voxel-based vision for SERL #77

Open · wants to merge 343 commits into base: main

Conversation


@nisutte commented on Oct 25, 2024

Hi all,

This is the pull request mentioned here. The goal was to implement voxel-based vision for a box-picking task with a UR5 robot arm.

New implementations:

  • serl_robot_infra/robot_controllers/ur5_controller.py: a UR5 impedance controller.
  • serl_robot_infra/ur_env: a new environment for picking boxes.
  • examples/box_picking_drq: new experiment files for the same purpose.
  • serl_launcher/vision/resnet_v1_18.py: a ResNet18 model pre-trained on ImageNet, ported from PyTorch.
  • serl_launcher/vision/voxel_grid_encoders.py: implementation of a 3D-conv-based voxel encoder (a minimal sketch follows this list).
  • serl_launcher/wrappers: new wrappers for (1) scaling observations and (2) logging observation statistics to wandb.
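
For orientation, here is a minimal sketch of what a 3D-conv voxel encoder can look like in Flax. The class name, channel schedule, strides, and input grid shape are illustrative assumptions, not the actual hyperparameters in voxel_grid_encoders.py:

```python
import flax.linen as nn
import jax
import jax.numpy as jnp

class VoxelEncoder(nn.Module):
    """3D-conv encoder over a voxel grid (channel schedule is an assumption)."""
    features: tuple = (32, 64, 128)

    @nn.compact
    def __call__(self, voxels: jnp.ndarray) -> jnp.ndarray:
        # voxels: (batch, D, H, W, C) occupancy/feature grid
        x = voxels
        for feat in self.features:
            # a 3-tuple kernel_size makes flax.linen.Conv a 3D convolution
            x = nn.Conv(feat, kernel_size=(3, 3, 3), strides=(2, 2, 2))(x)
            x = nn.relu(x)
        return x.reshape((x.shape[0], -1))  # flatten to a feature vector

# Example: encode a batch of 32^3 single-channel voxel grids.
params = VoxelEncoder().init(jax.random.PRNGKey(0), jnp.zeros((1, 32, 32, 32, 1)))
```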

Modifications to SERL:

These modifications improved performance in my case; you can choose which of them to include or change.

  • serl_launcher/agents/continuous/sac.py: uses the original jax_rl implementation with backup_entropy enabled. Also, the target entropy is set to -dim(action_space) instead of -dim(action_space)/2.
  • serl_launcher/agents/continuous/drq.py: added a classmethod create_voxel_drq that uses the voxel encoders. Additionally, implemented a 3D augmentation for voxels analogous to batched_random_crop for images (see the sketch after this list).
  • serl_launcher/common/encoding.py: added a masked EncodingWrapper to easily ignore selected proprioceptive states (in our experiments, forces/torques were too noisy and degraded performance).
  • serl_launcher/vision/resnet_v1.py: fixed a dropout bug and added an optional num_kp parameter that controls the number of keypoints.
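
As a rough illustration of the voxel analogue of batched_random_crop, here is a hedged sketch of a per-sample random 3D shift/crop in JAX. The function names, the padding amount, and the edge-padding mode are assumptions, not the PR's exact implementation:

```python
import jax
import jax.numpy as jnp

def random_crop_3d(key, voxels, padding=2):
    # voxels: (D, H, W, C); pad the spatial dims, then crop back to the
    # original size at a random offset, i.e. a random small 3D shift.
    padded = jnp.pad(voxels, ((padding, padding),) * 3 + ((0, 0),), mode="edge")
    offsets = jax.random.randint(key, (3,), 0, 2 * padding + 1)
    return jax.lax.dynamic_slice(
        padded,
        (offsets[0], offsets[1], offsets[2], 0),
        voxels.shape,
    )

def batched_random_crop_3d(key, voxels, padding=2):
    # voxels: (B, D, H, W, C); one independent shift per batch element
    keys = jax.random.split(key, voxels.shape[0])
    return jax.vmap(random_crop_3d, in_axes=(0, 0, None))(keys, voxels, padding)
```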

Additional library requirements

  • ur-rtde: implements the API for the Universal Robots RTDE real-time interface (a small usage sketch follows this list).
  • open3d: used to visualize the voxel perception of the end effector.
  • clu: Common Loop Utils, for a quick parameter overview of the model (not required, but useful).
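
For context, here is a minimal sketch of how a controller can read state and stream poses through ur-rtde. The robot IP and the servo parameters are placeholders; this is not the PR's ur5_controller.py:

```python
# Minimal ur-rtde usage sketch; the IP address and servo parameters are
# placeholders, not values from the PR's UR5 impedance controller.
import rtde_control
import rtde_receive

ROBOT_IP = "192.168.1.100"  # hypothetical robot address
rtde_c = rtde_control.RTDEControlInterface(ROBOT_IP)
rtde_r = rtde_receive.RTDEReceiveInterface(ROBOT_IP)

pose = rtde_r.getActualTCPPose()     # [x, y, z, rx, ry, rz]
wrench = rtde_r.getActualTCPForce()  # forces/torques (noisy; see masked wrapper above)

# Stream a servo target at a 500 Hz control period (0.002 s):
rtde_c.servoL(pose, 0.0, 0.0, 0.002, 0.1, 300)
rtde_c.servoStop()
rtde_c.disconnect()
```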

Videos showing the voxel-based policies can be seen here.

@jianlanluo (Collaborator)

Hi, thanks for the work. Since this is quite different from the main branch, I think it would make more sense if you pushed it to its own branch and wrote documentation on how to use it.
