Releases · irom-princeton/dppo
v0.7
Major changes
- [Faster pre-training] Fix the issue of infrequent EMA updates: the update was only triggered at the epoch level, not at the batch level (see the sketch after this list). Saved EMA checkpoints from much earlier epochs can now be used, e.g., epoch 3000 for robomimic state input and epoch 1000 for robomimic pixel input.
- [Faster pre-training] Update pre-training configs for all tasks, generally using fewer epochs now that the EMA update issue has been fixed
- [Faster fine-tuning] Update fine-tuning configs for robomimic tasks, using a higher learning rate and, in some cases, a higher update ratio for better sample efficiency
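For reference, here is a minimal sketch of what a batch-level EMA update looks like in a PyTorch training loop. The helper names (`make_ema`, `update_ema`, `decay`) are illustrative, not the exact DPPO implementation:

```python
# Illustrative sketch of a per-batch EMA update (hypothetical helpers,
# not the exact DPPO code).
import copy
import torch

def make_ema(model):
    """Create a frozen copy of the model to hold the EMA weights."""
    ema_model = copy.deepcopy(model)
    for p in ema_model.parameters():
        p.requires_grad_(False)
    return ema_model

@torch.no_grad()
def update_ema(ema_model, model, decay=0.995):
    """Blend current weights into the EMA copy: ema = decay * ema + (1 - decay) * theta."""
    for ema_p, p in zip(ema_model.parameters(), model.parameters()):
        ema_p.mul_(decay).add_(p, alpha=1 - decay)

# Before the fix, a call like update_ema(...) ran once per epoch; moving it
# inside the batch loop updates the EMA after every gradient step:
# for batch in loader:
#     loss = compute_loss(model, batch)
#     loss.backward(); optimizer.step(); optimizer.zero_grad()
#     update_ema(ema_model, model)
```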
Minor changes
- Clean up D3IL data pre-processing
- Fix data normalization bug in robomimic pre-processing (does not affect existing experiment results)
- Allow saving full observations for plotting in eval agent
- Add a simple implementation of ViT + UNet and provide pre-trained checkpoints on Google Drive
- Fix the isaacgym download path
v0.6
The DPPO update now uses transitions randomly sampled over both environment steps and denoising steps, instead of only over environment steps (i.e., the original update uses the entire denoising chain of each sampled environment step); a minimal sketch of the difference is shown below. We find a minor improvement in training stability and final performance with the new update, and the config files are mostly updated (with a larger `train.batch_size` now).
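The sketch below contrasts the two sampling schemes with hypothetical buffer shapes and variable names; the actual sampling lives inside the DPPO fine-tuning agent:

```python
# Sketch of how PPO minibatch indices are drawn (illustrative shapes/names).
import numpy as np

n_steps, n_envs, n_denoise = 500, 40, 10   # env steps, parallel envs, denoising steps
batch_size = 50000                         # larger train.batch_size in the new configs

# Old update: sample environment steps only; each sampled step brings along
# its entire denoising chain.
flat_env = np.random.permutation(n_steps * n_envs)[: batch_size // n_denoise]

# New update: flatten the buffer over (env step, env, denoising step) and
# sample transitions uniformly across all three axes.
flat_all = np.random.permutation(n_steps * n_envs * n_denoise)[:batch_size]
step_idx, env_idx, k_idx = np.unravel_index(
    flat_all, (n_steps, n_envs, n_denoise)
)  # indices used to gather observations, denoising actions, log-probs, advantages, etc.
```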
We also add configs for experiments with the Franka Kitchen environments from D4RL and the Robomimic MH dataset.
In progress: finish updating all configs and update the arXiv paper with new experiment results
v0.5
Major updates since initial release
- Fix double critic initialization; always use the target critic
- Fix MC return calculation in RWR (following RLPD)
- Switch to using `terminated` and `truncated` instead of `done` (see the sketch after this list)
- Add SAC, RLPD, Cal-QL, and IBRL implementation, tested with halfcheetah results
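As a reference for the `terminated`/`truncated` change above, a small Gymnasium-style sketch (the `td_target` helper and variable names are illustrative): bootstrapping is masked only on true termination, while truncation still ends the episode.

```python
# terminated: the MDP reached a true terminal state -> do not bootstrap
# truncated:  the episode was cut off (e.g., time limit) -> still bootstrap
import numpy as np

def td_target(reward, next_value, terminated, gamma=0.99):
    """One-step TD target; the bootstrap term is zeroed only on termination."""
    return reward + gamma * next_value * (1.0 - terminated)

r = np.array([1.0, 1.0])
v = np.array([5.0, 5.0])
term = np.array([0.0, 1.0])
print(td_target(r, v, term))  # bootstraps only where terminated == 0
# The episode ends (and the env resets) when either flag is set:
# done = terminated or truncated
```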
Minor changes
- Log training steps
- Rename `transition_dim` to `action_dim`
- Fix robomimic lowdim rendering issue
In progress (v1.0)
- Updating baseline results
- Modifications to DPPO updates with potential performance improvement
v0.1 - Initial release
Initial release with minor updates until Sep 2024