Missing Scaling factor #33

sunly92 · 2024-11-23T00:07:52Z

Hi,

This is a really good repo for learning Stable Diffusion from scratch. However, I noticed that the scaling factor that should be applied to the latent $z$ before the U-Net is missing. It is said to keep the variance of the latent close to 1, which could facilitate training. A detailed discussion can be found at:
huggingface/diffusers#437

Cheers,
Liyan

explainingai-code · 2024-11-23T03:09:37Z

Hello @sunly92 ,

Thank you :)
You are right that the scaling factor is not present, but the authors use this scaling only for the VAE and not for the VQVAE.
You can see the scale_by_std parameter defined in this config but not here
From paper - "Note that the VQ-regularized space has a variance close to 1, such that it does not have to be rescaled."
Since I only provided the code for VQVAE training in this repo, this scaling was not required.

Interestingly, when I computed the latent std for both datasets, I ended up with a std close to 1 (not just for the VQVAE but also for the VAE), so latent scaling would not have made any difference even for the VAE. That is why, even in other repos where I have also trained a VAE, I decided to skip it rather than add the logic to compute the std for every batch item and do the scaling.
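
For anyone who wants to check this on their own latents, here is a minimal sketch of the idea, not code from this repo. It assumes the scale factor is taken as 1 / std over a batch of encoder outputs; the helper name `estimate_scale_factor` and the random stand-in latents are hypothetical, and NumPy is used only to keep the example self-contained.

```python
import numpy as np


def estimate_scale_factor(latents: np.ndarray) -> float:
    """Return 1 / std computed over all elements of a batch of latents."""
    return 1.0 / float(latents.std())


# Stand-in for autoencoder outputs (assumption): shape (batch, channels, h, w),
# deliberately drawn with a std of about 4 so the effect of scaling is visible.
rng = np.random.default_rng(0)
z = rng.normal(loc=0.0, scale=4.0, size=(8, 4, 32, 32))

scale_factor = estimate_scale_factor(z)

# Scale before the latents are noised and passed to the U-Net ...
z_scaled = z * scale_factor

# ... and undo the scaling before decoding sampled latents back to images.
z_restored = z_scaled / scale_factor

print(f"std before scaling: {z.std():.3f}")
print(f"std after scaling:  {z_scaled.std():.3f}")
```

If the measured std is already close to 1, as described above, `scale_factor` comes out close to 1 and the scaling is effectively a no-op.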

Hope this helps.
