
request for pixelsnail pretrained .pt files #7

Open
robot0321 opened this issue Dec 30, 2022 · 2 comments

Comments

robot0321 commented Dec 30, 2022

Hi @vvvm23,
I saw your code today; it is novel and inspiring, and I benefited a lot from it.
I'm sorry to bother you, but could you upload the .pt file you trained with PixelSnail?
I want to test the decoder alone, so I followed your very kind instructions, but training takes a very long time because my GPU is not powerful enough. 😅

Thanks for your time!

vvvm23 (Owner) commented Jan 10, 2023

Hi, I also didn't have the resources to train the PixelSnail component at the time, so no such file exists.

I am actively working on a refactor of the repository with weights, so weights should be available after that (maybe not PixelSnail, but something that does the same thing).

shibbit commented May 25, 2023

You don't necessarily need the pretrained PixelCNN to test the decoder, though you will need it to produce the best results (otherwise they will be blurry). In my opinion, which might not be right, the main purpose of inserting the PixelCNN is to add more variety to the generated results. The biggest difference between VQ-VAE and a vanilla VAE is that the PixelCNN used here is trained to fit a prior distribution over the codebook, which is a low-dimensional representation of the input, whereas the vanilla VAE simply assumes the prior is Gaussian; both priors are used to control the generated results. However, training an autoregressive model like PixelCNN carries an enormous computational cost, so VQ-VAE combines the codebook and the PixelCNN to make the whole pipeline feasible, and this does improve image quality a lot.
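
For concreteness, here is a minimal sketch of that generation path in PyTorch. It assumes a hypothetical `prior` (an autoregressive model over codebook indices, e.g. PixelCNN/PixelSNAIL, returning per-position logits over `K` codes) and a hypothetical `vqvae` exposing a `decode_code` method; these names are illustrative, not necessarily this repository's actual API:

```python
import torch

@torch.no_grad()
def sample_image(prior, vqvae, grid=(32, 32), device="cuda"):
    # Start from an all-zero grid of codebook indices.
    codes = torch.zeros(1, *grid, dtype=torch.long, device=device)
    # Sample one index at a time in raster order; each step conditions on
    # everything sampled so far (this is why autoregressive sampling is slow).
    for i in range(grid[0]):
        for j in range(grid[1]):
            logits = prior(codes)  # hypothetical: (1, K, H, W) logits over K codes
            probs = torch.softmax(logits[0, :, i, j], dim=0)
            codes[0, i, j] = torch.multinomial(probs, 1)
    # Look the sampled indices up in the codebook and decode to pixel space.
    return vqvae.decode_code(codes)  # hypothetical method
```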
So, to summarize: if your task is GENERATION, where variety and image quality are the targets, you will need the pretrained PixelCNN; but if your task is something like RECONSTRUCTION or DENOISING, you can run inference directly, though you have to supply the inputs yourself.
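
And here is a sketch of that reconstruction path, which needs no prior at all. Again, `encode`, `quantize`, and `decode` are assumed method names for illustration, not this repository's confirmed API:

```python
import torch

@torch.no_grad()
def reconstruct(vqvae, x):
    # Encode the input to continuous latents, snap each one to its nearest
    # codebook entry, then decode straight back to pixels.
    z = vqvae.encode(x)              # hypothetical: (B, D, H, W) latents
    z_q, _codes = vqvae.quantize(z)  # hypothetical: quantized latents + indices
    return vqvae.decode(z_q)         # deterministic; no prior or sampling involved
```

Because nothing is sampled here, the output is fully determined by the input you designate, which is why no PixelCNN checkpoint is needed for this path.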
