
Clarification about implementation #13

Open · nlgranger opened this issue Jun 6, 2018 · 6 comments

@nlgranger

Sorry to use the bug tracker for this; it's really more of a question.
How did you interpret the concatenation of the hidden state and the readout in equation 3 of the paper?
It seems to me the state has twice the required shape after the concatenation; how is one supposed to handle that?
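
For reference, here is how I transcribe equations (3) and (4) (from memory, so worth double-checking against the paper):

```latex
\hat{h}_k,\, c_k = \mathrm{LSTM}\!\big(f'(\hat{x}),\ [h_{k-1}, r_{k-1}],\ c_{k-1}\big) \tag{3} \\
h_k = \hat{h}_k + f'(\hat{x}) \tag{4}
```

If both h and r have the embedding width d, the concatenated state [h, r] in (3) has width 2d, hence my question.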

@AntreasAntoniou (Owner)

Your initial state should contain twice the number of zeros; then you can concatenate and end up with the expected size.
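
A minimal sketch of what I mean, assuming a PyTorch `nn.LSTMCell` (hypothetical names, not the code in this repo):

```python
import torch
import torch.nn as nn

d = 64  # embedding width of f'(x̂), h and r

# The LSTM's hidden size must match the concatenated state [h, r],
# i.e. twice the embedding width.
cell = nn.LSTMCell(input_size=d, hidden_size=2 * d)

f_x = torch.randn(1, d)    # f'(x̂), the query embedding
h = torch.zeros(1, d)      # h_0
r = torch.zeros(1, d)      # r_0
c = torch.zeros(1, 2 * d)  # c_0, sized to match the hidden state

state = torch.cat([h, r], dim=1)  # (1, 2*d): "twice the number of zeros"
h, c = cell(f_x, (state, c))      # h now has shape (1, 2*d)
```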

@AntreasAntoniou (Owner)

It appears I have to change my implementation a bit. I just noticed a minor discrepancy, which shouldn't affect results much; it only concerns the full context embeddings case.

@AntreasAntoniou (Owner)

Actually, scratch what I said before. In practice that doesn't work as intended, since the state size keeps increasing at every step. I have replaced the concatenation with a summation for now.
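
Roughly like this (a sketch of the workaround, again assuming `nn.LSTMCell`; the readout update is stubbed out):

```python
import torch
import torch.nn as nn

d = 64
cell = nn.LSTMCell(input_size=d, hidden_size=d)

f_x = torch.randn(1, d)  # f'(x̂), the query embedding
h = torch.zeros(1, d)    # hidden state h_0
c = torch.zeros(1, d)    # cell state c_0
r = torch.zeros(1, d)    # readout r_0 over the support set

for _ in range(5):  # K processing steps
    # Summation keeps the state width fixed at d across steps,
    # unlike torch.cat([h, r], dim=1), which grows by d every step.
    h_hat, c = cell(f_x, (h + r, c))
    h = h_hat + f_x  # eq. 4: skip connection back to the input
    # r would be recomputed here from attention over the support set;
    # it is left at zero in this toy sketch.
```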

@nlgranger (Author)

Thanks for looking into it. I think the paper is missing some details needed for a faithful reimplementation.

For what it's worth, the paper H. Altae-Tran, B. Ramsundar, A. S. Pappu, and V. Pande, “Low Data Drug Discovery with One-Shot Learning,” ACS Central Science, vol. 3, no. 4, pp. 283–293, 2017, has an interpretation of how f should work that I think makes sense. They propose a refined version, but I imagine the vanilla matching network would have an equation 3 like:

[image: proposed form of equation 3]

Basically, the hidden state/output of the LSTM is an additive correction over the original input vector (as implied by eq. 4).
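
Concretely, my reading of that interpretation is roughly (my own transcription, not a formula copied from either paper):

```latex
\hat{h}_k,\, c_k = \mathrm{LSTM}\!\big([f'(\hat{x}), r_{k-1}],\ h_{k-1},\ c_{k-1}\big), \qquad h_k = \hat{h}_k + f'(\hat{x})
```

so the readout is concatenated onto the LSTM input rather than onto the hidden state, and the state width stays fixed.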

@nlgranger (Author)

I have put an implementation of this method here if you want to try it out. I haven't run it on Omniglot, but on my data the fully conditional embedding brings no benefit whatsoever.
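
A minimal self-contained sketch of that method (my own paraphrase in PyTorch, with made-up names, not the linked code):

```python
import torch
import torch.nn as nn

def fce_query(f_x, support, cell, steps=5):
    """Fully conditional embedding of a query under the additive reading.

    f_x: (1, d) query embedding f'(x̂); support: (n, d) embedded support set.
    """
    d = f_x.size(1)
    h = torch.zeros(1, d)
    c = torch.zeros(1, d)
    r = torch.zeros(1, d)
    for _ in range(steps):
        # Readout concatenated onto the input; hidden state keeps width d.
        h_hat, c = cell(torch.cat([f_x, r], dim=1), (h, c))
        h = h_hat + f_x  # eq. 4: additive correction over the input
        attn = torch.softmax(h @ support.t(), dim=1)  # a(h, g(x_i))
        r = attn @ support  # attention readout over the support set
    return h

d = 64
cell = nn.LSTMCell(input_size=2 * d, hidden_size=d)
query = torch.randn(1, d)
support = torch.randn(10, d)
print(fce_query(query, support, cell).shape)  # torch.Size([1, 64])
```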

@nlgranger (Author) commented Aug 14, 2018

Sorry to bump this, but have you had any time to look into it?
