A few questions about this implementation #6

Open
sebbyjp opened this issue Feb 3, 2024 · 0 comments
sebbyjp commented Feb 3, 2024

  1. Are the past images in a video used to condition the hidden layers, as in eDiff-I (https://deepimagination.cc/eDiff-I/)? (image)

  2. Why are you predicting the actions for each frame of the video (output shape (b, f, action dim, vocab_size)) instead of the expected (b, action dim, vocab_size) for next-action prediction? The cross-entropy loss for the final action prediction (labeled "single eval loss") seems rather high, although it is still an improvement over the rt1x released by Google and over Octo:

image
  3. Additionally, the training cross-entropy loss over the entire frame prediction seems to saturate before reaching 0 for the LR schedules I tried:
image
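
To make the shape question in item 2 concrete, here is a minimal, framework-agnostic sketch of the two losses being compared: cross entropy averaged over every frame versus cross entropy on the last frame only. It is not code from this repository. The `cross_entropy` helper and the sizes `b`, `f`, `action_dim`, `vocab_size` are assumptions chosen for illustration, and the logits are random stand-ins for a model output.

```python
import numpy as np

def cross_entropy(logits, targets):
    """Mean cross entropy.

    logits:  float array of shape (..., vocab_size)
    targets: integer array of shape (...), holding token ids
    """
    # Numerically stable log-softmax over the vocab axis.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    # Pick the log-probability of the target token at every position.
    picked = np.take_along_axis(log_probs, targets[..., None], axis=-1)
    return float(-picked.mean())

# Hypothetical sizes, for illustration only.
b, f, action_dim, vocab_size = 2, 6, 11, 256

rng = np.random.default_rng(0)
logits = rng.normal(size=(b, f, action_dim, vocab_size))   # stand-in for model output
targets = rng.integers(0, vocab_size, size=(b, f, action_dim))

# Loss over the whole (b, f, action dim, vocab_size) output: every frame contributes.
all_frames_loss = cross_entropy(logits, targets)

# Loss on the final frame only, i.e. a (b, action dim, vocab_size) next-action prediction.
last_frame_loss = cross_entropy(logits[:, -1], targets[:, -1])
```

As a reference point when judging whether a loss is "high": a model that outputs uniform logits scores exactly ln(vocab_size) under this definition, so any useful predictor should sit below that value for its own vocabulary size.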

Additional info:
- I'm only able to run a batch size of 16 on my GPUs, so maybe that is the issue. Alternatively, the data augmentation from https://github.com/octo-models/octo/blob/main/examples/06_pytorch_oxe_dataloader.py may be the cause.

- I am using a pre-trained MaxViT from PyTorch with your classifier_free_guidance layers, as seen here: https://github.com/kyegomez/RT-X/blob/031e6edb1734774e772f497b11fb49df634fef8d/rtx/rtx1.py#L402 (I'm happy to make a pull request to add this option here as well).

- I am using https://github.com/sebbyjp/robo_transformers for comparison against the official rt1x and Octo baselines.
