v0.6

@allenzren released this on 31 Oct 00:07
dc8e0c9

The DPPO update now uses transitions that are randomly sampled over both environment steps and denoising steps, instead of only over environment steps (the original update used the entire denoising chain of each sampled environment step). We find a minor improvement in training stability and final performance with the new update, and most config files have been updated accordingly (now with a larger train.batch_size). A minimal sketch contrasting the two sampling schemes is shown below.
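The snippet below is an illustrative sketch only, not the repository's actual code; the buffer layout, tensor names, and sizes are assumptions, with the rollout buffer assumed to store one transition per denoising step.

```python
import torch

# Placeholder rollout buffer: one stored transition per denoising step,
# shape (n_env_steps, n_denoise_steps, obs_dim). Sizes are illustrative.
n_env_steps, n_denoise_steps, obs_dim = 512, 10, 11
obs = torch.randn(n_env_steps, n_denoise_steps, obs_dim)
batch_size = 500

# Old update: sample environment steps only, then use the entire
# denoising chain of each sampled environment step.
env_idx = torch.randint(n_env_steps, (batch_size // n_denoise_steps,))
old_batch = obs[env_idx].reshape(-1, obs_dim)   # (batch_size, obs_dim)

# New update: flatten (environment step, denoising step) pairs and sample
# uniformly over both, so a minibatch mixes denoising steps from many
# different environment steps.
flat = obs.reshape(n_env_steps * n_denoise_steps, obs_dim)
pair_idx = torch.randint(flat.shape[0], (batch_size,))
new_batch = flat[pair_idx]                      # (batch_size, obs_dim)
```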

We also add configs for experiments with the Franka Kitchen environments from D4RL and the Robomimic MH dataset.

In progress: finishing updates to the remaining configs and updating the arXiv paper with the new experiment results.