
3D VAE finetune #111

Closed
Simona0212 opened this issue Aug 11, 2024 · 7 comments

Simona0212 commented Aug 11, 2024

Our research project aims to modify and fine-tune the 3D VAE. We tried to adapt the original module's training code and found it somewhat difficult.
Could you open-source the training code and configuration files for the 3D VAE module?
Thanks to the CogVideoX team!

Simona0212 commented Aug 11, 2024

I am going to modify and fine-tune the 3D VAE for our research project. We have tried to modify the training code of the original module and found it a little difficult. I would highly appreciate it if you could provide the code and configuration files required to train and fine-tune the 3D VAE module (for video encoding and video decoding). We look forward to receiving feedback from your team. Thank you very much!
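In case it helps others asking the same thing: below is a minimal sketch of what a reconstruction-only fine-tuning loop for the 3D VAE could look like. It assumes the `AutoencoderKLCogVideoX` class from `diffusers` and uses a random placeholder batch instead of a real video dataloader; it is not the official training recipe.

```python
# Minimal sketch (not the official recipe): reconstruction-only fine-tuning of
# the CogVideoX 3D VAE via the diffusers AutoencoderKLCogVideoX class.
import torch
import torch.nn.functional as F
from diffusers import AutoencoderKLCogVideoX

device = "cuda"
vae = AutoencoderKLCogVideoX.from_pretrained(
    "THUDM/CogVideoX-2b", subfolder="vae", torch_dtype=torch.float32
).to(device)
vae.train()

optimizer = torch.optim.AdamW(vae.parameters(), lr=1e-5)

for step in range(100):
    # Placeholder batch: (B, C, F, H, W) in [-1, 1]; replace with a real video dataloader.
    videos = torch.rand(1, 3, 9, 256, 256, device=device) * 2 - 1

    posterior = vae.encode(videos).latent_dist
    latents = posterior.sample()
    recon = vae.decode(latents).sample

    rec_loss = F.l1_loss(recon, videos)  # pixel-level reconstruction
    # KL regularizer computed from the posterior's mean / (log)variance
    kl_loss = 0.5 * torch.mean(
        posterior.mean.pow(2) + posterior.var - 1.0 - posterior.logvar
    )
    loss = rec_loss + 1e-6 * kl_loss

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

A faithful reproduction would presumably also need perceptual and adversarial losses plus the original data pipeline, so treat this only as a starting point.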

@zRzRzRzRzRzRzR
Member

Fine-tuning the VAE alone doesn't seem very meaningful. If your goal is to fine-tune it together with the Transformer while reducing memory usage in the VAE encoder, we will continue working on that optimization. You can check for updates here:
#194 huggingface/diffusers#9302
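For reference, the memory reduction on the VAE side should be reachable through the slicing/tiling switches on the diffusers VAE. A minimal sketch, assuming the `AutoencoderKLCogVideoX` class and its `enable_slicing`/`enable_tiling` methods (exact availability depends on your diffusers version):

```python
# Minimal sketch: cut 3D VAE memory use at inference via slicing and tiling.
import torch
from diffusers import AutoencoderKLCogVideoX

vae = AutoencoderKLCogVideoX.from_pretrained(
    "THUDM/CogVideoX-2b", subfolder="vae", torch_dtype=torch.bfloat16
).to("cuda")
vae.eval()

vae.enable_slicing()  # encode/decode the batch one sample at a time
vae.enable_tiling()   # split frames into spatial tiles and blend the seams

with torch.no_grad():
    # Placeholder clip: (B, C, F, H, W) in [-1, 1]
    video = torch.rand(1, 3, 9, 480, 720, dtype=torch.bfloat16, device="cuda") * 2 - 1
    latents = vae.encode(video).latent_dist.sample()
    recon = vae.decode(latents).sample
```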

serend1p1ty commented Oct 11, 2024

@Simona0212 I recommend using Open-Sora-Plan's code. Using its codebase along with many tricks, I have trained a VAE stronger than CogVideoX's.

vanche commented Nov 18, 2024

@serend1p1ty I'm really interested in the tricks you've used and how much you've improved the VAE's performance. Could you share more details about the techniques you applied and the results you achieved?

@serend1p1ty

@vanche Sorry, I cannot share more technical details. Here are my results.
[results attached as an image]
There is still a lot of room for improvement (the loss has not converged).
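For anyone who wants to make such comparisons concrete, here is a minimal sketch of a reconstruction-PSNR check in plain PyTorch; `vae_a`, `vae_b`, and `clips` are placeholders you would supply yourself:

```python
# Minimal sketch: compare two VAEs by reconstruction PSNR on the same held-out clips.
# Tensors are assumed to be in [-1, 1] with shape (B, C, F, H, W).
import torch

def reconstruction_psnr(original: torch.Tensor, recon: torch.Tensor) -> float:
    """Average PSNR in dB over the batch."""
    mse = torch.mean((original - recon) ** 2).clamp_min(1e-12)
    peak = 2.0  # dynamic range of data in [-1, 1]
    return float(10.0 * torch.log10(peak ** 2 / mse))

# Usage (vae_a, vae_b, clips are placeholders):
# with torch.no_grad():
#     recon_a = vae_a.decode(vae_a.encode(clips).latent_dist.sample()).sample
#     recon_b = vae_b.decode(vae_b.encode(clips).latent_dist.sample()).sample
# print(reconstruction_psnr(clips, recon_a), reconstruction_psnr(clips, recon_b))
```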

vanche commented Nov 19, 2024

@serend1p1ty Thank you for sharing the performance results. Do you plan to publish a paper or technical report?

@serend1p1ty

@vanche Basically, these are just some incremental improvements, not enough to support a paper.
