Cogview support with c and uc different length (Not work now) #10649

Open
wants to merge 29 commits into
base: main

zRzRzRzRzRzRzR
Contributor

What does this PR do?

This PR adapts CogView for diffusers, adding support for conditional (c) and unconditional (uc) inputs of different lengths.

We have reproduced the algorithm implementation, but this PR still requires further refinement. Currently, the output is pure green noise, so this PR remains in draft status and requires help from @a-r-r-o-w and @yiyixuxu.

zRzRzRzRzRzRzR and others added 29 commits January 14, 2025 20:27
Implement the basic CogView4 pipeline structure with the following changes:
- Add CogView4 pipeline implementation
- Implement DDIM scheduler for CogView4
- Add CogView3Plus transformer architecture
- Update embedding models

Current limitations:
- CFG implementation uses padding for sequence length alignment
- Need to verify transformer inference alignment with Megatron

TODO:
- Consider separate forward passes for condition/uncondition
  instead of padding approach
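
For readers unfamiliar with the padding approach mentioned under the limitations, here is a minimal sketch of what aligning sequence lengths might look like. It is not the code from this PR: `pad_to_length`, the tensor shapes, and the zero pad value are all assumptions chosen for illustration, and NumPy stands in for the real tensor library.

```python
import numpy as np

def pad_to_length(embeds, target_len, pad_value=0.0):
    """Right-pad a (seq_len, dim) embedding to target_len rows and return a validity mask."""
    seq_len, dim = embeds.shape
    padded = np.full((target_len, dim), pad_value, dtype=embeds.dtype)
    padded[:seq_len] = embeds
    mask = np.zeros(target_len, dtype=bool)
    mask[:seq_len] = True  # True marks real tokens, False marks padding
    return padded, mask

# Hypothetical embeddings: 7 conditional tokens, 3 unconditional tokens, dim 4.
cond = np.ones((7, 4), dtype=np.float32)
uncond = np.ones((3, 4), dtype=np.float32)

# Pad both to the longer length so they can share one batched forward pass.
target_len = max(cond.shape[0], uncond.shape[0])
cond_padded, cond_mask = pad_to_length(cond, target_len)
uncond_padded, uncond_mask = pad_to_length(uncond, target_len)

batch = np.stack([uncond_padded, cond_padded])   # shape (2, 7, 4)
batch_mask = np.stack([uncond_mask, cond_mask])  # shape (2, 7)
```

The drawback is that the padded positions enter the model unless the mask is respected everywhere, which is one reason the TODO considers separate forward passes instead.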
…n CogView4 pipeline

Split the forward pass for conditional and unconditional predictions in the CogView4 pipeline to match the original implementation. The noise prediction is now done separately for each case before combining them for guidance. However, the results still need improvement.

This is a work in progress: the generated images do not yet match the expected quality.
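
The split described in this commit can be sketched roughly as follows. This is an illustrative toy, not the pipeline code: `toy_denoiser` is a hypothetical stand-in for the transformer, the shapes are invented, and the combination step assumes the standard classifier-free guidance formula, which the commit message does not spell out.

```python
import numpy as np

def toy_denoiser(latents, prompt_embeds):
    """Hypothetical stand-in for the transformer; accepts any prompt length."""
    # Mean-pool over tokens so the output shape is independent of sequence length.
    context = prompt_embeds.mean(axis=0)  # shape (dim,)
    return latents + context.mean()

def guided_noise_pred(latents, cond_embeds, uncond_embeds, guidance_scale):
    """Run two separate forward passes, then combine them for guidance."""
    noise_cond = toy_denoiser(latents, cond_embeds)      # conditional pass
    noise_uncond = toy_denoiser(latents, uncond_embeds)  # unconditional pass
    # Assumed standard classifier-free guidance combination.
    return noise_uncond + guidance_scale * (noise_cond - noise_uncond)

latents = np.zeros((4, 8, 8), dtype=np.float32)
cond = np.full((7, 16), 2.0, dtype=np.float32)    # 7 conditional tokens
uncond = np.full((3, 16), 1.0, dtype=np.float32)  # 3 unconditional tokens

noise_pred = guided_noise_pred(latents, cond, uncond, guidance_scale=5.0)
```

Because each pass sees only its own embeddings, no padding or mask is needed, at the cost of two forward passes per denoising step instead of one batched pass.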
@a-r-r-o-w
Member

@zRzRzRzRzRzRzR @OleehyO Thanks for the PR! I'm going to take a look soon and try to help with debugging Megatron -> Diffusers. I see some additional changes made to files that are not relevant to CogView (currently more than 200 files have been changed 😅). Could you revert those changes by doing something like `git restore -s main examples/ scripts/`?

3 participants