Estimated Number of Epochs Required for Training #9
Comments
I did not intend to close the issue. Any suggestions or insights would be appreciated! Thank you! @zh-ding @Mq-Zhang1
Hello, thanks for your interest. Can you share more details on how you train the latents stage? It looks a bit weird to me, since the images from training and testing should look similar. I can see from your results that the latent training stage doesn't learn the data latent distribution at all, so I'm wondering if there is any mismatch in this process. BTW, why is the batch size only 1 for the latent training?
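To make the diagnosis above concrete, here is a minimal sketch of one way to check whether the latent-stage sampler matches the encoder's latent distribution. The names `encoder`, `latent_sampler`, and `train_loader` are placeholders for whatever the repo actually exposes, not its real API.

```python
import torch

@torch.no_grad()
def compare_latent_stats(encoder, latent_sampler, train_loader, n_samples=1024, device="cuda"):
    # Encode real training images into semantic latents.
    encoded, total = [], 0
    for imgs, *_ in train_loader:
        z = encoder(imgs.to(device))
        encoded.append(z)
        total += z.shape[0]
        if total >= n_samples:
            break
    encoded = torch.cat(encoded)[:n_samples]

    # Draw the same number of latents from the learned latent sampler.
    sampled = latent_sampler(n_samples).to(device)

    # If the latent stage has learned the data latent distribution,
    # these statistics should roughly agree.
    print("encoded latents: mean=%.3f std=%.3f" % (encoded.mean().item(), encoded.std().item()))
    print("sampled latents: mean=%.3f std=%.3f" % (sampled.mean().item(), sampled.std().item()))
```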
Hi @zh-ding, Thank you so much for the prompt reply. Yes, I agree that the latent code was not sampled properly to match the training latent code distribution. Sorry for the confusion; I did not use a batch size of 1. The batch size during the latents stage was 128 (the same as in the original GitHub code). For training the latents stage, I set the model_path to the last model from the training stage. I kept most of the code the same. However, I did modify the code on line 432,
After the above modification, tensorboard would no longer log the images during the latent stage. Thus, I used
P.S. I wonder if the number of training images is not enough for the latent code sampler to learn the latent distribution? I only used 5000 256x256 CelebA images for training. Did you ever use 5000 images for training and have any luck with them? In your paper, I think the smallest dataset you used (Nature 21K) has 21K images. Please let me know if you have any insights or suggestions! I really appreciate it! Thank you.
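For context, here is a minimal sketch of how the latent-stage run described above might be wired up, assuming a DiffAE-style config object. `TrainConfig`, its field names, and `train_latent_stage` are placeholders for illustration, not the repo's actual API; the only details taken from the thread are the batch size of 128 and pointing `model_path` at the last checkpoint of the base training stage.

```python
from dataclasses import dataclass

@dataclass
class TrainConfig:
    # Placeholder config, loosely modeled on DiffAE-style codebases.
    train_mode: str = "latent"                       # train the latent DDIM instead of the base model
    model_path: str = "checkpoints/base/last.ckpt"   # last checkpoint from the base training stage
    batch_size: int = 128                            # same as the original GitHub code for this stage
    latent_znormalize: bool = True                   # z-normalize latents before latent diffusion

def train_latent_stage(conf: TrainConfig) -> None:
    # Stand-in for the repo's entry point: load the frozen base model from
    # conf.model_path, encode the training set into latents, optionally
    # z-normalize them, and fit the latent DDIM on those latents.
    ...

if __name__ == "__main__":
    train_latent_stage(TrainConfig())
```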
Hi @WeiyunJiang, Thank you for all the details provided! The problem may be caused by conf.latent_znormalize being set to True, so the sampler learns a normalized latent distribution instead of the original one. I will fix this normalization config problem for latent training and sampling immediately to make it clearer. Thanks for raising this! Please feel free to let me know if the results are still weird.
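For readers hitting the same issue, here is a hedged illustration of the mismatch described above: if latents are z-normalized for latent-DDIM training but the corresponding de-normalization is skipped at sampling time, the decoder is conditioned on latents from the wrong distribution. The function names and explicit mean/std arguments are illustrative, not the repo's exact implementation.

```python
import torch

def normalize_latents(z: torch.Tensor, mean: torch.Tensor, std: torch.Tensor) -> torch.Tensor:
    # Applied before latent-DDIM training when latent_znormalize is True.
    return (z - mean) / std

def denormalize_latents(z_norm: torch.Tensor, mean: torch.Tensor, std: torch.Tensor) -> torch.Tensor:
    # Must be applied to latents drawn from the latent DDIM at sampling time;
    # otherwise the decoder sees N(0, 1)-like latents instead of the
    # encoder's original latent distribution.
    return z_norm * std + mean
```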
Hi @Mq-Zhang1 and @zh-ding, Thanks again for the prompt response. After the fix, it works like a charm. THANK YOU! Fantastic work! :)
Hi Zheng and Mengqi,
Thank you for your amazing work and for making your code public! I wonder if you could kindly provide some insights on my experiments below?
I am training on a truncated CelebA dataset, which only has 5000 256x256 images, and I am using the CLIP embedding. My batch size is 96.
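As a side note on the question in the title, the step/epoch bookkeeping for this setup is simple arithmetic; the 500k-step schedule below is only a hypothetical example, not a number from the thread.

```python
# With 5000 images and a batch size of 96, one epoch is ~53 optimizer steps.
num_images = 5000
batch_size = 96

steps_per_epoch = -(-num_images // batch_size)   # ceiling division -> 53
print(steps_per_epoch)

# Hypothetical: a 500k-step diffusion training schedule would then correspond
# to roughly 500_000 / 53 ≈ 9.4k passes over this small dataset.
print(500_000 / steps_per_epoch)
```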