
why mixture loss is so large? #6

Open
hdmjdp opened this issue Nov 14, 2018 · 3 comments

Comments

@hdmjdp

hdmjdp commented Nov 14, 2018

100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 126/126 [02:48<00:00, 1.34s/it]
epoch:0, running loss:186126176.4375, average loss:1477191.8764880951, current lr:0.0001
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 126/126 [02:41<00:00, 1.28s/it]
epoch:1, running loss:167085797.546875, average loss:1326077.7583085317, current lr:0.0001
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 126/126 [02:45<00:00, 1.31s/it]
epoch:2, running loss:167197651.21875, average loss:1326965.4858630951, current lr:0.0001
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 126/126 [02:43<00:00, 1.30s/it]
epoch:3, running loss:162484302.359375, average loss:1289557.955233135, current lr:0.0001

@G-Wang
Owner

G-Wang commented Nov 14, 2018

The mixture of logistics I pulled from r9y9's wavenet implementation, including the log loss and sampling code. It really doesn't work very well with the current settings. I don't have enough compute to run ablation studies to see what training setup works. I recommend sticking to raw bits, and beta/gaussian for now.

@hdmjdp
Author

hdmjdp commented Nov 15, 2018

@G-Wang Thanks for replying. I think 9 bits may be the best configuration.

@chaiyujin

The mixture of logistics loss from r9y9's implementation is reduced by 'sum'. If you use 'mean' to reduce the loss, it will be around 6~11.
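
To illustrate the point above: a 'sum' reduction adds the per-timestep negative log-likelihoods over the whole batch, so the loss scales with batch size times sequence length, while a 'mean' reduction stays on the order of the per-timestep NLL. A minimal sketch (the batch size, sequence length, and per-timestep NLL value here are illustrative assumptions, not taken from the repo):

```python
import numpy as np

# Hypothetical per-timestep negative log-likelihoods for one batch:
# 16 sequences of 16000 samples, each timestep contributing ~8 nats.
nll = np.full((16, 16000), 8.0)

sum_loss = nll.sum()    # 'sum' reduction: scales with batch * timesteps
mean_loss = nll.mean()  # 'mean' reduction: stays at the per-timestep NLL

print(sum_loss)   # 2048000.0 -- millions, like the training log above
print(mean_loss)  # 8.0 -- single digits, as described in this comment
```

So both reductions reflect the same underlying likelihood; only the scale of the reported number differs.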
