How to improve text-conditioned generation? #17

Open
Nikita-Sherstnev opened this issue Jun 30, 2024 · 1 comment

Comments

@Nikita-Sherstnev

I see that the model is not very good at text-conditioned generation. How can this be improved? Maybe by training the CLIP model itself, or just training the LDM for longer?

@explainingai-code
Owner

explainingai-code commented Jul 1, 2024

When I trained this on Celeb captions, I also found that the trained text-conditioned diffusion model performed very well on captions that are very common (like hair colour), but for words that weren't as frequent, the model wasn't honouring them at all.
I suspect training the LDM longer (or getting more images for the infrequent captions) should indeed improve the generation results for them. One cheap way to give infrequent captions more weight is to oversample them during training, as in the sketch below.
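A rough sketch of that oversampling idea using PyTorch's `WeightedRandomSampler` (not code from this repo; `dataset` and `captions` are illustrative placeholders, with `captions[i]` being the caption for `dataset[i]`):

```python
import collections
from torch.utils.data import DataLoader, WeightedRandomSampler

def make_balanced_loader(dataset, captions, batch_size=16):
    # Weight each sample inversely to its caption's frequency, so images
    # with infrequent captions are drawn more often per epoch.
    counts = collections.Counter(captions)
    weights = [1.0 / counts[c] for c in captions]
    sampler = WeightedRandomSampler(weights,
                                    num_samples=len(captions),
                                    replacement=True)
    # sampler and shuffle are mutually exclusive, so shuffle is not passed.
    return DataLoader(dataset, batch_size=batch_size, sampler=sampler)
```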
You can definitely try training CLIP as well, but unless you have very rare words in your captions (or words very different from what CLIP was trained on), I feel that training the LDM for longer will be more fruitful than training the CLIP model. If you do want to experiment with it, the sketch below shows one way to toggle fine-tuning of the CLIP text encoder alongside the LDM.
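A minimal sketch of frozen-vs-trainable CLIP text conditioning (this assumes the Hugging Face `transformers` CLIP classes rather than this repo's own CLIP code; `unet` stands in for the diffusion model, and all names are illustrative):

```python
import torch
from transformers import CLIPTokenizer, CLIPTextModel

def setup_text_conditioning(unet, finetune_text_encoder=False):
    tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-base-patch32")
    text_encoder = CLIPTextModel.from_pretrained("openai/clip-vit-base-patch32")

    if not finetune_text_encoder:
        # Default: keep CLIP frozen and spend the compute on the LDM.
        text_encoder.requires_grad_(False)
        text_encoder.eval()

    # Only add the text encoder's parameters to the optimizer if we
    # actually intend to fine-tune it.
    params = list(unet.parameters())
    if finetune_text_encoder:
        params += list(text_encoder.parameters())
    optimizer = torch.optim.AdamW(params, lr=1e-4)

    def encode(captions):
        tokens = tokenizer(captions, padding="max_length",
                           truncation=True, return_tensors="pt")
        # Per-token embeddings, shape (B, 77, 512) for this checkpoint,
        # which the LDM would consume via cross-attention.
        return text_encoder(**tokens).last_hidden_state

    return text_encoder, optimizer, encode
```

Keeping the text encoder in eval mode with gradients off is the usual default; flipping `finetune_text_encoder` to `True` only tends to pay off when the captions are far from CLIP's training distribution.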
