Negative total correlation loss for btc-vae #54
Comments
I don't remember having that issue, but it's important to note that while the KL is always non-negative, its estimate might not be. Try increasing the batch size, which should improve the estimate.
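To illustrate the point about estimates, here is a toy sketch (not the estimator used in this repository): a plain Monte Carlo estimate of a small but strictly positive KL between two unit-variance Gaussians. The function names `kl_estimate` and `fraction_negative`, the shift of 0.1, and the batch sizes are all assumptions chosen for illustration.

```python
import numpy as np

def kl_estimate(batch_size, rng, shift=0.1):
    """Monte Carlo estimate of KL(q || p) with q = N(0, 1), p = N(shift, 1)."""
    x = rng.standard_normal(batch_size)   # samples from q
    # Both densities share the same variance, so the normalising
    # constants cancel in the log-ratio and are omitted here.
    log_q = -0.5 * x ** 2
    log_p = -0.5 * (x - shift) ** 2
    return float(np.mean(log_q - log_p))

def fraction_negative(batch_size, n_trials=2000, seed=0):
    """Fraction of independent batches whose KL estimate is below zero."""
    rng = np.random.default_rng(seed)
    estimates = np.array([kl_estimate(batch_size, rng) for _ in range(n_trials)])
    return float(np.mean(estimates < 0))

true_kl = 0.5 * 0.1 ** 2  # closed form for equal-variance Gaussians
print("true KL:", true_kl)
for batch_size in (64, 4096):
    print(batch_size, fraction_negative(batch_size))
```

In this toy setting a noticeable share of small batches gives a negative estimate even though the true KL is positive, and the share shrinks as the batch grows. The minibatch estimators used for the total correlation are also biased, so a larger batch may not be enough on its own.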
I see the same thing on dSprites: if beta is set too high, tc_loss becomes negative, even with large batch sizes (O(1e3)). It also seems to hurt disentanglement, at least judging by visual inspection of latent traversals.
I also noticed that for Beta-TC VAE, the mutual information estimated with MSS can be clearly positive (around 40 or 50), whereas with MWS it is close to 0. For a vanilla VAE, the mutual information estimated by MWS is also close to 0. This was tested on MNIST only.
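For readers unfamiliar with MWS (minibatch weighted sampling), here is a minimal NumPy sketch of how a total correlation estimate can be formed with it. It assumes diagonal Gaussian posteriors and is not taken from this repository; the names `log_gaussian`, `logsumexp`, and `tc_mws` and the shapes used are hypothetical.

```python
import numpy as np

def log_gaussian(z, mu, logvar):
    """Elementwise log density of N(mu, exp(logvar)) evaluated at z."""
    return -0.5 * (np.log(2 * np.pi) + logvar + (z - mu) ** 2 / np.exp(logvar))

def logsumexp(a, axis):
    """Numerically stable log(sum(exp(a))) along one axis."""
    m = np.max(a, axis=axis, keepdims=True)
    return np.squeeze(m, axis=axis) + np.log(np.sum(np.exp(a - m), axis=axis))

def tc_mws(z, mu, logvar, n_data):
    """MWS-style estimate of the total correlation of q(z).

    z, mu, logvar: arrays of shape (batch, latent_dim); n_data: dataset size.
    """
    batch = z.shape[0]
    # pairwise[i, j, d] = log q(z_i[d] | x_j)
    pairwise = log_gaussian(z[:, None, :], mu[None, :, :], logvar[None, :, :])
    log_norm = np.log(batch * n_data)
    # log q(z_i): sum over dimensions first, then mix over the batch
    log_qz = logsumexp(pairwise.sum(axis=2), axis=1) - log_norm
    # log prod_d q(z_i[d]): mix over the batch per dimension, then sum
    log_prod_qzd = (logsumexp(pairwise, axis=1) - log_norm).sum(axis=1)
    return float(np.mean(log_qz - log_prod_qzd))

rng = np.random.default_rng(0)
mu = rng.standard_normal((128, 10))
logvar = rng.standard_normal((128, 10)) - 2.0
z = mu + np.exp(0.5 * logvar) * rng.standard_normal((128, 10))  # reparameterised sample
print(tc_mws(z, mu, logvar, n_data=60000))
```

Nothing in this expression forces the result to be non-negative: it is a difference of two log-density estimates, each computed from the same minibatch.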
It is also negative in the results you have posted: https://github.com/YannDubs/disentangling-vae/blob/master/results/btcvae_dsprites/train_losses.log For FactorVAE, tc_loss is positive: https://github.com/YannDubs/disentangling-vae/blob/master/results/factor_dsprites/train_losses.log
Closing in favour of #60.
Nice work!
I tried the beta-TC VAE, but I found that tc_loss is negative. This term is a KL divergence, which should always be non-negative.
I am confused about it.
Thanks!