How to improve text-conditioned generation? #17

Open
Nikita-Sherstnev opened this issue Jun 30, 2024 · 1 comment

Comments

@Nikita-Sherstnev

I see that the model is not very good at text-conditioned generation. How can this be improved? Maybe by training the CLIP model itself, or just training the LDM for longer?

@explainingai-code
Owner

explainingai-code commented Jul 1, 2024

When I trained this on Celeb captions, I also found that the trained text-conditioned diffusion model performed very well on captions that are very common (like hair colour), but for words that weren't as frequent, the model wasn't honouring them at all.
I suspect training the LDM longer (or getting more images for the infrequent captions) should indeed improve the generation results for them. One cheap way to give infrequent captions more weight is to oversample them during training, as in the sketch below.
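A rough sketch of that oversampling idea using PyTorch's `WeightedRandomSampler` (not code from this repo; `dataset` and `captions` are illustrative placeholders, with `captions[i]` being the caption for `dataset[i]`):

```python
import collections
from torch.utils.data import DataLoader, WeightedRandomSampler

def make_balanced_loader(dataset, captions, batch_size=16):
    # Weight each sample inversely to its caption's frequency, so images
    # with infrequent captions are drawn more often per epoch.
    counts = collections.Counter(captions)
    weights = [1.0 / counts[c] for c in captions]
    sampler = WeightedRandomSampler(weights,
                                    num_samples=len(captions),
                                    replacement=True)
    # sampler and shuffle are mutually exclusive, so shuffle is not passed.
    return DataLoader(dataset, batch_size=batch_size, sampler=sampler)
```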
You can definitely try training CLIP as well, but unless you have very rare words in your captions (or words very different from what CLIP was trained on), I feel that training the LDM for longer will be more fruitful than training the CLIP model. If you do want to experiment with it, the sketch below shows one way to toggle fine-tuning of the CLIP text encoder alongside the LDM.
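A minimal sketch of frozen-vs-trainable CLIP text conditioning (this assumes the Hugging Face `transformers` CLIP classes rather than this repo's own CLIP code; `unet` stands in for the diffusion model, and all names are illustrative):

```python
import torch
from transformers import CLIPTokenizer, CLIPTextModel

def setup_text_conditioning(unet, finetune_text_encoder=False):
    tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-base-patch32")
    text_encoder = CLIPTextModel.from_pretrained("openai/clip-vit-base-patch32")

    if not finetune_text_encoder:
        # Default: keep CLIP frozen and spend the compute on the LDM.
        text_encoder.requires_grad_(False)
        text_encoder.eval()

    # Only add the text encoder's parameters to the optimizer if we
    # actually intend to fine-tune it.
    params = list(unet.parameters())
    if finetune_text_encoder:
        params += list(text_encoder.parameters())
    optimizer = torch.optim.AdamW(params, lr=1e-4)

    def encode(captions):
        tokens = tokenizer(captions, padding="max_length",
                           truncation=True, return_tensors="pt")
        # Per-token embeddings, shape (B, 77, 512) for this checkpoint,
        # which the LDM would consume via cross-attention.
        return text_encoder(**tokens).last_hidden_state

    return text_encoder, optimizer, encode
```

Keeping the text encoder in eval mode with gradients off is the usual default; flipping `finetune_text_encoder` to `True` only tends to pay off when the captions are far from CLIP's training distribution.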
