How to reproduce this visualization from the VQVAE 2 paper #8

Closed
theAdamColton opened this issue Feb 13, 2023 · 1 comment


The original paper had this cool graphic in it, which I believe shows decoded representations from different levels of the network. But I don't understand how, in practice, you could obtain a decoded image using only the top-level representation from the FFHQ encoder. In the three-level FFHQ model, the final decoder is applied to a concatenation of the upscaled middle level and the twice-upscaled top level, and expects 192 input channels.

Is there a way, only using information from the top level encoded quantized representation, to get an image out of the network?

[Image: vqvae2]

vvvm23 (Owner) commented Feb 13, 2023

This is something I also never quite understood about the original paper, nor have I explored it myself, so I can't really answer this question. It could be something as naive as passing a zero tensor as a substitute for the lower-level codes in the final decoder, but only the original authors know.

Thanks for bringing this to my attention. I am currently working on a refactor of this repo (see #5), so I might investigate this once that is done. Quite a lot of details in the paper are unclear, and we may never know for sure how some of its results were produced.
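For anyone who wants to try the zero-tensor idea mentioned above, here is a minimal sketch of how the final decoder's input could be assembled from the top-level codes alone. It is only a guess at the mechanism, not the paper's confirmed method and not this repo's API. Everything named here is hypothetical: the helper names, the split of the 192 channels into three 64-channel blocks, the 2x stride between levels, the concatenation order, and the use of nearest-neighbour upsampling in place of whatever learned upsampling a real model uses.

```python
import numpy as np


def upsample_nearest(x, factor):
    """Nearest-neighbour upsampling of a (C, H, W) array by an integer factor."""
    return x.repeat(factor, axis=1).repeat(factor, axis=2)


def top_only_decoder_input(top_q, mid_channels=64, bottom_channels=64):
    """Assemble a final-decoder input using only the top-level quantized codes.

    top_q: quantized top-level feature map, shape (C_top, H, W).
    The middle and bottom slots are filled with zeros. This is an assumption
    about what "top level only" could mean, not a confirmed procedure.
    """
    _, h, w = top_q.shape
    # Two 2x upsampling steps (top -> middle -> bottom resolution) = 4x overall.
    top_up = upsample_nearest(top_q, 4)
    mid_zeros = np.zeros((mid_channels, 4 * h, 4 * w), dtype=top_q.dtype)
    bottom_zeros = np.zeros((bottom_channels, 4 * h, 4 * w), dtype=top_q.dtype)
    # Hypothetical channel order: bottom, middle, top.
    return np.concatenate([bottom_zeros, mid_zeros, top_up], axis=0)


# Stand-in for real quantized top-level codes.
top_q = np.random.default_rng(0).standard_normal((64, 8, 8)).astype(np.float32)
x = top_only_decoder_input(top_q)
print(x.shape)  # (192, 32, 32)
```

The resulting array has the 192 channels the final decoder expects, so it could in principle be passed straight through. Whether the decoded image would look anything like the paper's figure is an open question, since the decoder never saw all-zero lower-level inputs during training.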
