Releases: irom-princeton/dppo

v0.7

20 Nov 20:59

Major changes

  • [Faster pre-training] Fix infrequent EMA updates: the update was triggered only once per epoch instead of once per batch. With the fix, saved EMA checkpoints from much earlier epochs can be used, such as epoch 3000 for robomimic state input and epoch 1000 for robomimic pixel input.
  • [Faster pre-training] Update pre-training configs for all tasks, generally using fewer epochs now that the EMA update issue is fixed
  • [Faster fine-tuning] Update fine-tuning configs for robomimic tasks, using a higher learning rate and possibly a higher update ratio for better sample efficiency
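
To illustrate why the EMA fix above allows earlier checkpoints, here is a minimal toy sketch, not the repository's training code. The function names, the decay of 0.99, and the scalar "weight" that ramps linearly from 0 to 1 are all assumptions made for illustration. It compares how far the EMA lags behind the raw weight when it is updated once per batch versus once per epoch.

```python
def ema_update(ema_value, value, decay):
    """One EMA step: ema <- decay * ema + (1 - decay) * value."""
    return decay * ema_value + (1.0 - decay) * value


def train(num_epochs, batches_per_epoch, decay, update_every_batch):
    """Toy loop (hypothetical): a scalar weight ramps linearly from 0 to 1."""
    total_steps = num_epochs * batches_per_epoch
    weight, ema, step = 0.0, 0.0, 0
    for _ in range(num_epochs):
        for _ in range(batches_per_epoch):
            step += 1
            weight = step / total_steps  # stand-in for an optimizer step
            if update_every_batch:
                ema = ema_update(ema, weight, decay)  # batch-level update
        if not update_every_batch:
            ema = ema_update(ema, weight, decay)  # epoch-level update only
    return weight, ema


if __name__ == "__main__":
    for per_batch in (True, False):
        weight, ema = train(10, 100, 0.99, per_batch)
        print(f"update_every_batch={per_batch}: weight={weight:.3f}, ema={ema:.3f}")
```

In this toy setting the per-batch EMA ends close to the final weight, while the per-epoch EMA has received only ten updates and remains far behind. That is consistent with needing many more epochs before an epoch-level EMA checkpoint becomes useful.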

Minor changes

  • Clean up D3IL data pre-processing
  • Fix data normalization bug in robomimic pre-processing (does not affect existing experiment results)
  • Allow saving full observations for plotting in eval agent
  • Add a simple implementation of ViT + UNet and provide pre-trained checkpoints on Google Drive
  • Fix the isaacgym download path

v0.6

31 Oct 00:07
dc8e0c9

The DPPO update now uses transitions that are randomly sampled over both environment steps and denoising steps, instead of over environment steps only (i.e., the original update used the entire denoising chain of each sampled environment step). We find a minor improvement in training stability and final performance with the new update. Most config files have been updated accordingly (now with a larger train.batch_size).
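
The difference between the two sampling schemes can be sketched as follows. This is an illustrative sketch under stated assumptions, not the repository's implementation: the function names, the index-only "buffer", and the sizes used are hypothetical. Each transition is identified by a pair (environment step, denoising step).

```python
import random


def sample_full_chains(num_env_steps, num_denoising_steps, num_env_samples, rng):
    """Original scheme: sample environment steps, keep each full denoising chain."""
    env_ids = rng.sample(range(num_env_steps), num_env_samples)
    return [(t, k) for t in env_ids for k in range(num_denoising_steps)]


def sample_joint(num_env_steps, num_denoising_steps, batch_size, rng):
    """New scheme: sample (env step, denoising step) pairs uniformly at random."""
    total = num_env_steps * num_denoising_steps
    flat_ids = rng.sample(range(total), batch_size)
    # divmod maps a flattened index back to (env step, denoising step)
    return [divmod(i, num_denoising_steps) for i in flat_ids]


if __name__ == "__main__":
    rng = random.Random(0)
    old_batch = sample_full_chains(100, 10, 4, rng)
    new_batch = sample_joint(100, 10, 40, rng)
    print("distinct env steps (full chains):", len({t for t, _ in old_batch}))
    print("distinct env steps (joint):", len({t for t, _ in new_batch}))
```

With the same number of transitions per batch, the joint scheme typically draws from many more distinct environment steps, since it is no longer tied to whole denoising chains.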

We also add configs for experiments with the Franka Kitchen environments from D4RL and the Robomimic MH dataset.

In progress: finish updating all configs and update the arXiv paper with the new experiment results

v0.5

07 Oct 20:38
e0842e7

Major updates since initial release

  • Fix double critic initialization; always use the target critic
  • Fix MC return calculation in RWR (following RLPD)
  • Switch to using terminated and truncated instead of done
  • Add SAC, RLPD, Cal-QL, and IBRL implementations, tested on halfcheetah
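
For context on the switch from done to terminated and truncated, the sketch below shows the usual convention for the two flags: a terminated transition stops value bootstrapping, while a truncated one (for example, a time limit) ends the episode but still bootstraps. This is the common Gymnasium-style reading, assumed here for illustration. The function names are hypothetical and the release notes do not confirm that the repository applies the flags in exactly this way.

```python
def td_target(reward, next_value, terminated, gamma=0.99):
    """One-step TD target; bootstrap from next_value unless the state is terminal."""
    not_terminal = 0.0 if terminated else 1.0
    return reward + gamma * not_terminal * next_value


def episode_ended(terminated, truncated):
    """The environment needs a reset if either flag is set."""
    return terminated or truncated


if __name__ == "__main__":
    # Time-limit truncation: the episode ends, but the target still bootstraps.
    print(td_target(1.0, 10.0, terminated=False, gamma=0.5), episode_ended(False, True))
    # True termination: no bootstrapping beyond the terminal state.
    print(td_target(1.0, 10.0, terminated=True, gamma=0.5), episode_ended(True, False))
```

With a single done flag, both cases would be treated the same way, which drops the bootstrap term on time-limit truncations.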

Minor updates

  • Log training steps
  • Rename transition_dim to action_dim
  • Fix robomimic lowdim rendering issue

In progress (v1.0)

  • Updating baseline results
  • Modifications to DPPO updates with potential performance improvement

v0.1 - Initial release

06 Oct 21:09

Initial release with minor updates until Sep 2024