
Question about loss #4

Open
yuyiyi opened this issue Feb 6, 2018 · 1 comment
yuyiyi commented Feb 6, 2018

Thanks for posting the code! It is very useful.
I have some confusion about the loss function. The loss function used in this method is the Kullback-Leibler divergence, which is supposed to be non-negative. However, I'm getting negative loss values. Would you be able to comment on that? Thank you very much!

dimitri-yatsenko (Member) commented Feb 8, 2018

The loss function in Eq. 9 of the paper is indeed the Kullback-Leibler divergence. It is computed relative to the ground truth and is non-negative. This is what is plotted in row 5 of Figure 1. However, it cannot be used in empirical studies because the ground truth is unavailable.

In contrast, the validation loss in Eq. 10 is measured with respect to a validation sample and omits the unavailable constant term, so it is no longer guaranteed to be non-negative. That is what is plotted in row 6 of Figure 1. Note that in those plots the values are shown relative to those of the correct model family; the actual values are not shown.
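To make the sign issue concrete, here is a generic sketch of the decomposition (in the spirit of the paper, not its exact notation, with p the ground-truth density and q the model density):

```latex
D_{\mathrm{KL}}(p \,\|\, q)
  = \underbrace{\mathbb{E}_{x \sim p}\!\left[\log p(x)\right]}_{\text{constant, unknown without ground truth}}
  \;-\; \underbrace{\mathbb{E}_{x \sim p}\!\left[\log q(x)\right]}_{\text{estimable from a validation sample}}
  \;\ge\; 0.
```

The validation loss keeps only the second term, estimated as -(1/n) * sum_i log q(x_i) over validation points x_i. For continuous distributions, densities can exceed 1, so this term alone can be negative, even though it differs from the full KL divergence only by a constant.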

The derivation in Eq. 11 demonstrates that minimizing the validation loss in Eq. 10 is equivalent to minimizing the true loss in Eq. 9, a property that does not hold for many other definitions of loss.
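A minimal numeric sketch of this point in Python, using hypothetical Gaussian densities for illustration (this is not code from the repository):

```python
import numpy as np
from scipy.stats import norm

# Hypothetical setup: ground truth p = N(0, 0.10^2), candidate model q = N(0, 0.12^2).
p = norm(loc=0.0, scale=0.10)
q = norm(loc=0.0, scale=0.12)

# Validation sample drawn from the ground truth.
x = p.rvs(size=10_000, random_state=0)

# Eq. 10-style validation loss: average negative log-likelihood under q.
# The unknown constant E_p[log p] is omitted, so nothing forces this to be >= 0;
# with scale 0.1 the densities exceed 1, and the value comes out around -0.85.
val_loss = -np.mean(q.logpdf(x))
print(f"validation loss: {val_loss:.3f}")

# Eq. 9-style true loss: adding the constant back recovers KL(p || q) >= 0
# (estimated here by Monte Carlo; about 0.03 for these parameters).
kl = val_loss + np.mean(p.logpdf(x))
print(f"estimated KL(p || q): {kl:.3f}")
```

The two quantities differ only by the constant E_p[log p], so they order candidate models identically and share the same minimizer, which is why the validation loss remains a valid model-selection criterion despite taking negative values.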
