Having trouble replicating your results #3

Open
chrisorm opened this issue Sep 1, 2018 · 2 comments

chrisorm commented Sep 1, 2018

Hi Kaspar,

I tried replicating the results from your post in PyTorch, and I'm unable to get even close to the kind of results you display on your blog. I'm sure there is an error on my end somewhere, but I have pored over the paper, your code, and your blog post, and I'm unable to see anything that could be behind it. A friend looked over my implementation too, and they were unable to spot anything substantively different.

I have tried as closely as possible to follow the architecture and setup of your experiment 1, but I see very different behavior. The code is very simple; it is, after all, only a handful of small neural networks:

https://github.com/chrisorm/Machine-Learning/blob/ngp/Neural%20GP.ipynb
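For context, here is roughly the shape of the model I have in mind; a minimal, hypothetical sketch (layer sizes, activations, and class names are my own placeholders, not the exact notebook code):

```python
import torch
import torch.nn as nn

class Encoder(nn.Module):
    """Maps each context pair (x_i, y_i) to a representation r_i."""
    def __init__(self, dim_r=2, dim_h=8):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2, dim_h), nn.Sigmoid(), nn.Linear(dim_h, dim_r))

    def forward(self, x, y):
        # x, y: (n, 1) -> r_i: (n, dim_r); aggregate with r_i.mean(0) outside
        return self.net(torch.cat([x, y], dim=-1))

class RToZ(nn.Module):
    """Maps the aggregated representation r to q(z) = N(mu, sigma^2)."""
    def __init__(self, dim_r=2, dim_z=2):
        super().__init__()
        self.mu = nn.Linear(dim_r, dim_z)
        self.log_sigma = nn.Linear(dim_r, dim_z)

    def forward(self, r):
        return self.mu(r), torch.exp(self.log_sigma(r))

class Decoder(nn.Module):
    """Maps a latent z and target inputs x* to predicted values y*."""
    def __init__(self, dim_z=2, dim_h=8):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim_z + 1, dim_h), nn.Sigmoid(), nn.Linear(dim_h, 1))

    def forward(self, z, x_star):
        # z: (dim_z,), x_star: (m, 1) -> y*: (m, 1)
        z_rep = z.expand(x_star.size(0), -1)
        return self.net(torch.cat([z_rep, x_star], dim=-1))
```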

Some things I observe that you don't seem to see (shown in the notebook):

- My q distribution concentrates (i.e. its std goes to 0). Relatedly, I see no substantial difference between function samples when extrapolating outside of the data, whereas you do.
- My prior function samples display substantially less variance than yours seem to.

The first led me to suspect an error in my KLD term, but that does not seem to be the case: I unit-tested my implementation and I think it is correct. The loss looks good and the network clearly converges.
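For concreteness, this is the closed-form KL I am testing against; a sketch assuming both q and the prior are diagonal Gaussians parameterised by means and standard deviations (the function name is mine):

```python
import torch

def kl_diag_gaussians(mu_q, sigma_q, mu_p, sigma_p):
    """KL( N(mu_q, diag(sigma_q^2)) || N(mu_p, diag(sigma_p^2)) ).

    All arguments have the same shape; sigma_q and sigma_p are
    standard deviations (not variances or log-variances).
    """
    var_ratio = (sigma_q / sigma_p) ** 2
    mean_term = ((mu_q - mu_p) / sigma_p) ** 2
    return 0.5 * torch.sum(var_ratio + mean_term - 1.0 - torch.log(var_ratio))

# Sanity check from my unit tests: KL of a distribution with itself is 0.
mu, sigma = torch.randn(2), torch.rand(2) + 0.1
assert kl_diag_gaussians(mu, sigma, mu, sigma).abs() < 1e-6
```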

The second is a bit stranger: do you perhaps use some particular initialization of the weights to draw these samples, over and above setting z ~ N(0, 1)?
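To be explicit about how I'm drawing prior samples; a sketch under the assumption of a freshly initialized decoder (reusing the hypothetical Decoder class sketched above) with z drawn from a standard normal:

```python
import torch

# Draw prior function samples: default-initialized decoder, z ~ N(0, I).
decoder = Decoder(dim_z=2)  # the hypothetical sketch class from above
x_grid = torch.linspace(-4, 4, 100).unsqueeze(-1)

with torch.no_grad():
    samples = [decoder(torch.randn(2), x_grid) for _ in range(10)]
# Each element of `samples` is one draw of the prior over functions on x_grid.
```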

Would you happen to have any insights as to what may be behind this difference?

Thanks for taking the time to write your post; it has some really great insights into the method!

Chris

kasparmartens (Owner) commented

Hi Chris,

Sorry for my slow response.

You say that you see very different behaviour from what I saw, but I don't think you can conclude much from the example where we are learning a single fixed function.

I agree that initialisation can sometimes have an effect, and that in this particular case out-of-sample uncertainty can disappear when training for a large number of iterations (after all, we are learning a single function on a grid, and I think it is non-obvious what sort of behaviour to expect from the model outside this set of points).

That you have experienced q collapse seems plausible given the model formulation, but I didn't experience it myself in my experiments.

Kaspar

kasparmartens (Owner) commented

Sorry, I accidentally clicked "close"

kasparmartens reopened this Sep 17, 2018