Implementation of Voxel-based vision for SERL #77

Open
wants to merge 343 commits into
base: main
Commits
9e07430
changed pressure to be within [0, 1]
Apr 4, 2024
2b3796c
addd gripper action span
Apr 4, 2024
800668a
minor imporvements
Apr 4, 2024
ef252ef
added RelativeFrame
Apr 4, 2024
9511f76
added Reward Transformer
Apr 4, 2024
5a9a71a
more streamlined reward
Apr 4, 2024
7620cbd
soft actor critic worked with this, so I commit :)
Apr 9, 2024
fafa19b
added evaluation to SAC actor
Apr 9, 2024
ace4624
bugfix
Apr 9, 2024
840d74f
namechange
Apr 9, 2024
071374f
namechange
Apr 9, 2024
17d1fab
new actor and learner config
Apr 9, 2024
059cc05
more cleanup when terminated
Apr 9, 2024
4ac40e1
params for boxes with rough surfaces
Apr 9, 2024
9be84db
boxes tried out
Apr 9, 2024
6ea821a
higher reset height
Apr 11, 2024
8ed035f
added step cost and ignore xy cost
Apr 11, 2024
77f9244
catch if actor is terminated early
Apr 11, 2024
6e914a1
set original pose (without shift) as the relative pose
Apr 11, 2024
ca58a20
pose shift and multiple reset poses implementation
Apr 11, 2024
6c95ca3
some controller cleanup
Apr 12, 2024
35daf1c
bugfix
Apr 12, 2024
aec9b32
RelativeFrame Wrapper has been customized for Robotiq Env
Apr 12, 2024
f4e9e2c
clean a few TODO's
Apr 12, 2024
660b62a
Merge branch 'rail-berkeley:main' into develop
nisutte Apr 12, 2024
30e3a6a
added suction cost
Apr 19, 2024
7e7f984
new actor and learner params
Apr 19, 2024
e9c0739
added camera support (not tested yet)
Apr 19, 2024
e4c77ac
Merge remote-tracking branch 'origin/develop' into develop
Apr 19, 2024
8b97499
name change
Apr 23, 2024
8856153
added camera to environment
Apr 23, 2024
fe8bedc
added camera to environment
Apr 23, 2024
dadf758
cleanup
Apr 23, 2024
361ae1d
force as moving average in the controller (too much fluctuations othe…
Apr 23, 2024
63799eb
added realtime plotter to env
Apr 23, 2024
9a03482
drq tools for the box lifting task
Apr 23, 2024
6f9ceb8
new reset pose
Apr 23, 2024
323cfc9
readme update
Apr 23, 2024
93ee00a
bugfix of RuntimeWarning
Apr 24, 2024
412fe44
output encoder definition
Apr 24, 2024
c9361d1
misc
Apr 24, 2024
836d8e0
Added depth only for realsense cameras
Apr 25, 2024
dbc6b89
Merge branch 'rail-berkeley:main' into develop
nisutte Apr 29, 2024
c5fbceb
Merge remote-tracking branch 'origin/develop' into develop
Apr 29, 2024
ee1760a
faulty spacemouse action fix
Apr 29, 2024
b736ff2
added depth images as option
May 3, 2024
b654ecb
new configs
May 3, 2024
6d8de72
added running reward info
May 3, 2024
7a8b49b
added cost plotter
May 3, 2024
116c573
changed grip status representation
May 3, 2024
089199b
added wrapper to scale the obs space
May 3, 2024
a1a457d
small optimizations and modified env for SAC backwards compatibility
May 6, 2024
92f0a93
controller optimization (force check)
May 7, 2024
6f4d3ea
added encoded observation output (through print and parse)
May 7, 2024
b7e4fa4
try to use mean and std from demo images
May 7, 2024
cb222d5
DRQ TODO's
May 7, 2024
6f2d628
streamlined reward
May 7, 2024
a2f11bc
controller corrections
May 16, 2024
4441f1a
backup entropy as in original SAC paper and slight loss change
May 16, 2024
bf60dd4
translation scale change
May 16, 2024
9781215
small changes to drq policy
May 16, 2024
6483203
reward change
May 16, 2024
5cc4779
reward change
May 16, 2024
66e0972
better lightning conditions for the camera sensor
May 17, 2024
e5012db
new cameras
May 17, 2024
be846e0
run that got drq working
May 17, 2024
9bd136e
new observation space definition (rgb, depth, both)
May 17, 2024
4175d83
new observation space definition (rgb, depth, both)
May 17, 2024
62661a8
run that got drq working (new loggers)
May 17, 2024
78fecca
run that make drq work
May 17, 2024
613c208
cleanup
May 17, 2024
5f46b81
bugfix
May 17, 2024
a7004b9
depth images implemented in gym environment
May 21, 2024
9586022
bugfixes
May 21, 2024
c881102
bugfixes
May 21, 2024
0a68a2e
obs space bugfix
May 21, 2024
b267884
added function to print params & small changes
May 22, 2024
6af4d13
added return for encoder == None
May 22, 2024
6c0f712
shape check in ResNet
May 22, 2024
87db0f6
added some model examination tools
May 23, 2024
aa7ff86
cleanup
May 23, 2024
f68fc2e
added model examination tools and cleanup
May 23, 2024
4a8bfcd
new reward computation (orientation & pose cost)
May 23, 2024
c83a497
added evaluation script
May 23, 2024
e4eae40
bugfix 2
May 23, 2024
defa690
examination if trajs==0
May 23, 2024
1abfd96
do not log anymore
May 23, 2024
3d8efe6
added env with 5 different boxes
May 31, 2024
10fa294
depth to pointcloud calculation
May 31, 2024
1cdaba1
get camera intrinsics for the calculation of the pointcloud
May 31, 2024
811a745
added pretrained ResNet18 from pytorch (converted)
Jun 5, 2024
84c8aca
improved hyperparameter handling
Jun 5, 2024
82f6951
pretrained is from pytorch
Jun 6, 2024
a756c16
do not use statistics stored in batch_stats (too much effort to imple…
Jun 6, 2024
c56e91f
controller can somewhat ignore singularities (with human input & time)
Jun 6, 2024
05e018c
downgraded RealSense firmware to 5.13
Jun 6, 2024
4b5209c
new training and actor params for ResNet18
Jun 6, 2024
be49bdc
name change
Jun 10, 2024
77957f4
bugfixes
Jun 10, 2024
2d3d383
prototype distance sensor
Jun 18, 2024
d134d2c
added num keypoints in spatial softmax
Jun 18, 2024
1405e55
cam change
Jun 19, 2024
77f1d4b
cleanup
Jun 19, 2024
ce376b1
implemented pointcloud fusion & calibration (multi camera setup)
Jun 21, 2024
c9e87c6
implemented pointcloud visualization and finalized pointcloud fusion …
Jun 26, 2024
8bf54b8
continuous box change instead of random (better setup, easier to repr…
Jun 27, 2024
fb79d9f
some bugfixes regarding pc calibration
Jun 27, 2024
dc54e44
added new X & Y max range for the robot arm in the final configuration
Jun 27, 2024
bbf8f64
more bugfixes
Jun 27, 2024
b49c0e6
Important Relative Env fix (do not use adjoint with matrix with acti…
Jun 28, 2024
854d007
Voxel Grid Env setup (for now too slow, optimize or use gpu)
Jun 28, 2024
96905df
New config file for Box setup
Jun 28, 2024
c530b78
new cost config
Jun 28, 2024
a7c5e83
implemented image space dynamic adaption (if multi or single cam)
Jun 28, 2024
c077c36
updated config
Jun 28, 2024
deef286
pointcloud and voxel bugfixes
Jun 28, 2024
d08b205
new actor learner configs
Jun 28, 2024
c66abd2
made now Env config to test policy on
Jun 28, 2024
dec750f
test boxes at an angle
Jun 28, 2024
7a2c2c2
new name for wandb
Jun 28, 2024
480a564
added Observation statistics wrapper to investigate episodes
Jun 28, 2024
e63063f
no invert
Jul 2, 2024
6f21a35
bugfix
Jul 2, 2024
d9c1328
added cost to info stats
Jul 2, 2024
3288fc0
better structure of cost info
Jul 2, 2024
8d161d7
cleanup
Jul 2, 2024
426b671
bugfix for starting resetq
Jul 2, 2024
060d46b
implemented grey camera mode
Jul 2, 2024
20d668a
grey bugfix
Jul 2, 2024
59f5f1a
grey bugfix 2
Jul 2, 2024
327684b
range sensor bugfix
Jul 2, 2024
affc0ef
move away from open3d and do pointcloud fusion and voxelization in pu…
Jul 2, 2024
434aeff
new implementation of VoxNet encoder (conv3d on downsampled voxelgrid)
Jul 8, 2024
c035ada
some slight improvements
Jul 8, 2024
15dc1c4
added all the costs and total cost
Jul 10, 2024
5912c87
slight change in voxel dim and visualization
Jul 10, 2024
8ab5bbc
improved memory efficiency of pointcloud data
Jul 10, 2024
0c5f2e3
small improvements
Jul 10, 2024
2f97694
added voxel augmentation (3d shift)
Jul 11, 2024
c24cb2d
do not use bool compacting anymore (interferes with augmentation & no…
Jul 11, 2024
fcfb9e4
bugfix and new gripper release fix
Jul 11, 2024
c630963
changing pressure and grip status (more distinct)
Jul 11, 2024
6d3866c
16 features are enough (faster training)
Jul 11, 2024
05f636f
bugfix with releasing the gripper
Jul 15, 2024
5ed726d
cost for the suction reward fix
Jul 16, 2024
5327f9a
bugfix
Jul 16, 2024
ecfab63
timing change in getting the current position of the robot arm
Jul 16, 2024
7720de4
do use damping, so the controller does not run into things that easil…
Jul 16, 2024
695ac81
bugfix in downward force check
Jul 16, 2024
26683d0
new: use conv instead of maxpool and reduce over z dim
Jul 16, 2024
bc4737f
bugfix for using dropout in the critic network
Jul 16, 2024
fcac56c
cleanup
Jul 17, 2024
95cc778
dropout successful run
Jul 17, 2024
4dff73c
added voxnet final activation choice
Jul 17, 2024
8905469
tests done
Jul 18, 2024
dfd7268
new PointCloudFusion starting parameters
Jul 18, 2024
8929184
implemented 4-fold voxel augmentation (randomly rotate the state, vox…
Jul 19, 2024
30db461
augmentation bugfixes
Jul 19, 2024
f0439ed
reset position bugfix for controller
Jul 19, 2024
9a9b036
transform force and torque to relative frame as well
Jul 22, 2024
a4aaf84
rlds path fix
Jul 22, 2024
43cb564
action needs to be roatated as well in augmentation
Jul 22, 2024
b19f939
voxel rotation needs to be clockwise (tcp frame is upside down)
Jul 22, 2024
6180ac9
drq bugfix (actions were not updated, stupid me...)
Jul 22, 2024
4448ffd
deactivate batch rotation for now
Jul 23, 2024
8c0f018
adjoint matrix should not be used at all, bug found after some veloci…
Jul 24, 2024
59092eb
added a scale factor to scale voxel input in between [0, 1]
Jul 24, 2024
63e0d44
transform the velocity relative to the start rotation and not the cur…
Jul 24, 2024
53f40d3
implemented real rot90 (orientation rotation is different because of …
Jul 24, 2024
5ba0752
bugfixes and TODO removal
Jul 24, 2024
570b0bd
bugfixe 2
Jul 24, 2024
c3925e1
consider observation and action scaling in augmentation
Jul 25, 2024
d15015d
consider observation and action scaling in augmentation
Jul 25, 2024
e55c228
bugfixes for augmentation
Jul 25, 2024
0d75fa5
added basic PointNet Encoder (not tested yet)
Jul 25, 2024
2222c6c
NaN bugfix
Jul 26, 2024
9404660
bugfix to prevent starting velocities
Jul 26, 2024
e26c50a
implemented 3d Spatial Soft Argmax
Jul 31, 2024
c1e665a
added sampling rotation (and inverse after) for the actor sampling. A…
Jul 31, 2024
b424c28
implemented 180 deg rotation and ignoring force/torque
Jul 31, 2024
d51499f
ignoring force and torque
Jul 31, 2024
2707a74
use pretrained first Conv3D Layer in Voxnet (ugly for now, but it works)
Jul 31, 2024
ea7aa9a
added 3d kernel plot and pretrained VoxNet kernel weights
Jul 31, 2024
06a19de
pretrained Conv3D setup
Jul 31, 2024
ff68b4f
abnormal frame wait time checker
Jul 31, 2024
0895695
changed environment step and safety margins to mrp instead of euler, …
Aug 6, 2024
f08c816
changed environment step and safety margins to mrp instead of euler, …
Aug 6, 2024
23306b2
added pretrained voxnet params loader
Aug 6, 2024
f15cf98
optimized voxnet params loader
Aug 6, 2024
e3aa2fa
actually use mrp wrapper that was created
Aug 6, 2024
0c41372
print cleanup
Aug 6, 2024
b56c04d
added a new wrapper that rotates every observation within the first q…
Aug 8, 2024
701aecc
use new mrp wrapper & some cleanup
Aug 8, 2024
36b141a
voxel grid configuration for 6 aug run (and relu after layernorm)
Aug 8, 2024
d9fa6d7
added observation rotation wrapper
Aug 9, 2024
4b0ac11
some cleanup
Aug 9, 2024
c643022
bugfix
Aug 9, 2024
ef684e3
added a state mask to easily ignore observation states
Aug 9, 2024
e1dd104
re_added force and torque but ignore them with state mask
Aug 9, 2024
66030cd
Added layernorm params and bugfix
Aug 12, 2024
a409efd
cleanup & small improvements
Aug 13, 2024
c923e61
added new flags for clearer policy definition
Aug 13, 2024
8d2c818
name change
Aug 13, 2024
def5167
new VoxNet architecture
Aug 13, 2024
0aced4f
final VoxNet hyperparams (not pretrained)
Aug 14, 2024
ac47335
name change and bugfixes
Aug 15, 2024
7c49e73
some bugfixes in the augmentation handling
Aug 15, 2024
405206e
added the current action to the observation
Aug 15, 2024
154e15b
random rotation in x y and z, triangular (so probability is higher ne…
Aug 20, 2024
5649041
bugfixes and cleanup
Aug 20, 2024
ef30711
added trajectory save and some minor changes
Aug 20, 2024
a0622f8
IMPORTANT CHANGE: proprio-latent-dim is not used anymore, substantial…
Aug 20, 2024
61a80d7
bash files for the different experiments
Aug 20, 2024
219b08d
deactivate evaluation if period==0
Aug 20, 2024
db3cdfd
created demos for the experiment (rgb, depth and pointcloud)
Aug 20, 2024
75d9466
cleaned some inconsistencies and created policy with only VoxNet
Aug 20, 2024
b17855d
name change
Aug 20, 2024
b6f1dfb
change costs to be positive (better plots)
Aug 20, 2024
52b1088
implemented behavior tree and behavioral cloning
Aug 22, 2024
b41fb04
new runfiles for the experiment
Aug 22, 2024
9e95f4c
encoder kwargs bugfix
Aug 22, 2024
12fead7
wrong name in config file
Aug 22, 2024
b1c4dbc
cleanup
Aug 23, 2024
0c08a19
implemented Temporal Action Ensemble for smoother action sequences in…
Aug 23, 2024
8dab130
added evaluation bash files
Aug 23, 2024
87ec541
added wandb evaluation logger for convenience
Aug 23, 2024
6d8c167
bugfixes
Aug 23, 2024
e1c50a8
fix the seed for the evaluation
Aug 23, 2024
82a6c56
preolad bias as well
Aug 24, 2024
e43788f
slight changes and bugfixes
Aug 24, 2024
5a2ffdf
bugfix
Aug 28, 2024
cf851e8
new config for evaluation on unseen boxes
Aug 28, 2024
ab7fb8c
new scale for depth images
Aug 28, 2024
64310fd
use small encoder for depth images
Aug 28, 2024
fd972a8
random without numpy to not mess with seeded run
Sep 9, 2024
ced305f
final experiment files
Sep 9, 2024
7ce1336
send eval metrics to wandb
Sep 9, 2024
5f17545
seed change
Sep 9, 2024
1a244c8
final config
Sep 9, 2024
33ab188
bt and bc changes to evaluate
Sep 9, 2024
25f35a3
results calculation to latex table from experiments
Sep 9, 2024
8101b61
Readme update
Oct 4, 2024
407d090
name change from robotiq to ur and general cleanup
Oct 9, 2024
c2c601f
readme updates
Oct 9, 2024
75badcc
fix
Oct 9, 2024
7ead273
bugfix
Oct 16, 2024
8ab7f59
cleanup for potential pr
Oct 22, 2024
6501276
upstream merge
Oct 22, 2024
f52fa79
cleanup
Oct 25, 2024
ed6957a
more cleanup
Oct 25, 2024
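
Several commits above mention computing a point cloud from depth images (10fa294) and voxelizing it (854d007, affc0ef). The PR's own implementation lives in files that are not rendered here, so the following is only a minimal NumPy sketch of the general technique, assuming a pinhole camera model and a binary occupancy grid. The function names `depth_to_pointcloud` and `voxelize`, the intrinsics arguments, and the grid bounds are all hypothetical and are not taken from the PR.

```python
import numpy as np


def depth_to_pointcloud(depth, fx, fy, cx, cy):
    """Back-project a depth image to an (N, 3) point cloud (pinhole model).

    Hypothetical helper: fx, fy, cx, cy are the camera intrinsics in pixels,
    and depth holds one range value per pixel (0 means "no reading").
    """
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))  # pixel coordinates
    z = depth.astype(np.float64)
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    points = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    return points[points[:, 2] > 0]  # drop pixels without a depth reading


def voxelize(points, lower, upper, grid_shape):
    """Mark every cell of a regular grid that contains at least one point."""
    lower = np.asarray(lower, dtype=np.float64)
    upper = np.asarray(upper, dtype=np.float64)
    shape = np.asarray(grid_shape)
    # Keep only points inside the axis-aligned box [lower, upper).
    inside = np.all((points >= lower) & (points < upper), axis=1)
    idx = np.floor((points[inside] - lower) / (upper - lower) * shape).astype(int)
    idx = np.minimum(idx, shape - 1)  # guard against floating-point round-up
    grid = np.zeros(grid_shape, dtype=np.uint8)
    grid[idx[:, 0], idx[:, 1], idx[:, 2]] = 1
    return grid
```

A grid built this way is the kind of input a 3D convolutional encoder can consume. Fusing several cameras would additionally require transforming each cloud into a common frame before voxelizing, which this sketch leaves out.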
660 changes: 660 additions & 0 deletions examples/box_picking_drq/drq_policy.py

Large diffs are not rendered by default.

131 changes: 131 additions & 0 deletions examples/box_picking_drq/record_demo.py
@@ -0,0 +1,131 @@
import gym
from tqdm import tqdm
import numpy as np
import copy
import pickle as pkl
import datetime
import os
import threading
from pynput import keyboard

from ur_env.envs.relative_env import RelativeFrame
from ur_env.envs.wrappers import (
    SpacemouseIntervention,
    Quat2MrpWrapper,
    ObservationRotationWrapper,
)

from serl_launcher.wrappers.serl_obs_wrappers import (
    SERLObsWrapper,
    ScaleObservationWrapper,
)
from serl_launcher.wrappers.chunking import ChunkingWrapper

import ur_env

exit_program = threading.Event()


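def on_space(key, info_dict):
    # Print the collected robot info whenever the space bar is pressed.
    if key == keyboard.Key.space:
        for name, item in info_dict.items():
            print(f"{name}: {item}", end=" ")
        print()


def on_esc(key):
    # Ask the recording loop to stop and clean up.
    if key == keyboard.Key.esc:
        exit_program.set()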


if __name__ == "__main__":
    env = gym.make(
        "box_picking_camera_env",
        camera_mode="pointcloud",
        max_episode_length=100,
    )
    env = SpacemouseIntervention(env)
    env = RelativeFrame(env)
    env = Quat2MrpWrapper(env)
    env = ScaleObservationWrapper(env)
    # env = ObservationRotationWrapper(env)  # if it should be enabled
    env = SERLObsWrapper(env)
    env = ChunkingWrapper(env, obs_horizon=1, act_exec_horizon=None)

    obs, _ = env.reset()

    transitions = []
    success_count = 0
    success_needed = 20
    total_count = 0
    pbar = tqdm(total=success_needed)

    info_dict = {
        "state": env.unwrapped.curr_pos,
        "gripper_state": env.unwrapped.gripper_state,
        "force": env.unwrapped.curr_force,
        "reset_pose": env.unwrapped.curr_reset_pose,
    }
    listener_1 = keyboard.Listener(
        daemon=True, on_press=lambda event: on_space(event, info_dict=info_dict)
    )
    listener_1.start()

    listener_2 = keyboard.Listener(on_press=on_esc, daemon=True)
    listener_2.start()

    uuid = datetime.datetime.now().strftime("%Y-%m-%d_%H-%M-%S")
    file_name = f"box_picking_{success_needed}_demos_{uuid}.pkl"
    file_dir = os.path.dirname(os.path.realpath(__file__))  # same dir as this script
    file_path = os.path.join(file_dir, file_name)

    if not os.access(file_dir, os.W_OK):
        raise PermissionError(f"No permission to write to {file_dir}")

    try:
        running_reward = 0.0
        while success_count < success_needed:
            if exit_program.is_set():
                raise KeyboardInterrupt  # stop program, but clean up before

            # The zero action is a placeholder; the recorded action is the
            # spacemouse intervention reported in info.
            next_obs, rew, done, truncated, info = env.step(action=np.zeros((7,)))
            actions = info["intervene_action"]

            transition = copy.deepcopy(
                dict(
                    observations=obs,
                    actions=actions,
                    next_observations=next_obs,
                    rewards=rew,
                    masks=1.0 - done,
                    dones=done,
                )
            )
            transitions.append(transition)

            obs = next_obs
            running_reward += rew

            if done or truncated:
                # An episode counts as a success if its final reward exceeds 0.99.
                success_count += int(rew > 0.99)
                total_count += 1
                print(
                    f"{rew}\tGot {success_count} successes of {total_count} trials. {success_needed} successes needed."
                )
                pbar.update(int(rew > 0.99))
                obs, _ = env.reset()
                print("Reward total:", running_reward)
                running_reward = 0.0

        with open(file_path, "wb") as f:
            pkl.dump(transitions, f)
            print(f"saved {success_needed} demos to {file_path}")

    except KeyboardInterrupt as e:
        print("\nProgram was interrupted, cleaning up...", str(e))

    finally:
        pbar.close()
        env.close()
        listener_1.stop()
        listener_2.stop()
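
The script above pickles a flat list of transition dicts with the keys `observations`, `actions`, `next_observations`, `rewards`, `masks` and `dones`. As a hedged illustration of how such a file could be inspected before passing it to the learner via `--demo_path`, here is a small sketch. The helper name `summarize_demos` and its `success_threshold` argument are hypothetical and not part of the PR. Episodes that ended by truncation have `dones=False` in the recorded data, so they are not counted as terminated here.

```python
import pickle as pkl

import numpy as np


def summarize_demos(path, success_threshold=0.99):
    """Load a demo pickle written by record_demo.py and return basic counts.

    Hypothetical helper: assumes the file holds a list of dicts that each
    contain scalar "rewards" and "dones" entries, as in the script above.
    """
    with open(path, "rb") as f:
        transitions = pkl.load(f)
    rewards = np.array([float(t["rewards"]) for t in transitions])
    dones = np.array([bool(t["dones"]) for t in transitions])
    return {
        "transitions": len(transitions),
        "episodes_terminated": int(dones.sum()),
        # Same success criterion as the recording loop: final reward > 0.99.
        "successes": int(np.sum(dones & (rewards > success_threshold))),
    }
```

Calling `summarize_demos("box_picking_20_demos_<timestamp>.pkl")` on a recorded file would return the three counts as a dict; the file name here is only a placeholder following the pattern used by the script.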
20 changes: 20 additions & 0 deletions examples/box_picking_drq/run_actor.sh
@@ -0,0 +1,20 @@
export XLA_PYTHON_CLIENT_PREALLOCATE=false && \
export XLA_PYTHON_CLIENT_MEM_FRACTION=.1 && \
python drq_policy.py "$@" \
    --actor \
    --env box_picking_camera_env \
    --max_traj_length 100 \
    --exp_name=box_picking \
    --camera_mode pointcloud \
    --seed 1 \
    --max_steps 20000 \
    --random_steps 0 \
    --training_starts 500 \
    --utd_ratio 8 \
    --batch_size 128 \
    --eval_period 1000 \
    --encoder_type voxnet-pretrained \
    --state_mask all \
    --encoder_bottleneck_dim 128 \
    # --enable_obs_rotation_wrapper \
    # --debug
18 changes: 18 additions & 0 deletions examples/box_picking_drq/run_evaluation.sh
@@ -0,0 +1,18 @@
export XLA_PYTHON_CLIENT_PREALLOCATE=false && \
export XLA_PYTHON_CLIENT_MEM_FRACTION=.2 && \
python drq_policy.py "$@" \
    --actor \
    --env box_picking_camera_env \
    --exp_name=drq_evaluation \
    --camera_mode pointcloud \
    --batch_size 128 \
    --max_traj_length 100 \
    --checkpoint_path "checkpoint folder path here" \
    --eval_checkpoint_step 10000 \
    --eval_n_trajs 20 \
    \
    --encoder_type voxnet-pretrained \
    --state_mask all \
    --encoder_bottleneck_dim 128 \
    # --enable_obs_rotation_wrapper \
    # --debug
22 changes: 22 additions & 0 deletions examples/box_picking_drq/run_learner.sh
@@ -0,0 +1,22 @@
export XLA_PYTHON_CLIENT_PREALLOCATE=false && \
export XLA_PYTHON_CLIENT_MEM_FRACTION=.3 && \
python drq_policy.py "$@" \
    --learner \
    --env box_picking_camera_env \
    --exp_name=box_picking \
    --camera_mode pointcloud \
    --max_traj_length 100 \
    --seed 1 \
    --max_steps 25000 \
    --random_steps 0 \
    --training_starts 500 \
    --utd_ratio 8 \
    --batch_size 128 \
    --eval_period 20000 \
    --checkpoint_period 1000 \
    --encoder_type voxnet-pretrained \
    --state_mask all \
    --encoder_bottleneck_dim 128 \
    --demo_path "demo path here *.pkl" \
    # --enable_obs_rotation_wrapper \
    # --debug