ResidualLFQ was successful, but ResidualVQ failed severely! #73

Open
fighting-Zhang opened this issue Mar 19, 2024 · 2 comments

@fighting-Zhang

When training the MeshAutoencoder, I compared ResidualLFQ and ResidualVQ.
ResidualLFQ is your default option, which can rebuild a reasonable structure.

However, when I use ResidualVQ (without changing any of your default parameters), the reconstruction has significant errors: the validation loss gradually increases, and the reconstruction yields only a few faces (e.g., 3 faces).
I'm not quite sure what the reason is.

```python
if use_residual_lfq:
    self.quantizer = ResidualLFQ(
        dim = dim_codebook,
        num_quantizers = num_quantizers,
        codebook_size = codebook_size,
        commitment_loss_weight = 1.,
        **rlfq_kwargs,
        **rq_kwargs
    )
else:
    self.quantizer = ResidualVQ(
        dim = dim_codebook,
        num_quantizers = num_quantizers,
        codebook_size = codebook_size,
        shared_codebook = True,
        commitment_weight = 1.,
        stochastic_sample_codes = rvq_stochastic_sample_codes,
        # sample_codebook_temp = 0.1, # temperature for stochastically sampling codes, 0 would be equivalent to non-stochastic
        **rvq_kwargs,
        **rq_kwargs
    )
```
Some default parameters:

```python
use_residual_lfq = True,   # whether to use the latest lookup-free quantization
rq_kwargs: dict = dict(
    quantize_dropout = True,
    quantize_dropout_cutoff_index = 1,
    quantize_dropout_multiple_of = 1,
),
rvq_kwargs: dict = dict(
    kmeans_init = True,
    threshold_ema_dead_code = 2,
),
rlfq_kwargs: dict = dict(
    frac_per_sample_entropy = 1.
),
rvq_stochastic_sample_codes = True,
```
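To isolate the ResidualVQ path, here is a minimal standalone sketch of that branch (a sketch only: the size hyperparameters are placeholders rather than the repo's actual defaults, and it assumes vector-quantize-pytorch forwards these kwargs and returns the usual `(quantized, indices, commit_loss)` tuple):

```python
import torch
from vector_quantize_pytorch import ResidualVQ

# standalone version of the ResidualVQ branch above;
# dim / num_quantizers / codebook_size are placeholder values
rvq = ResidualVQ(
    dim = 192,
    num_quantizers = 2,
    codebook_size = 16384,
    shared_codebook = True,
    commitment_weight = 1.,
    stochastic_sample_codes = True,
    kmeans_init = True,                   # from rvq_kwargs
    threshold_ema_dead_code = 2,
    quantize_dropout = True,              # from rq_kwargs
    quantize_dropout_cutoff_index = 1,
    quantize_dropout_multiple_of = 1,
)

x = torch.randn(1, 1024, 192)             # (batch, seq, dim)
quantized, indices, commit_loss = rvq(x)  # indices: (1, 1024, num_quantizers)
```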

Loss curve: the red curve is ResidualVQ, and the grey curve is ResidualLFQ.

[image: training loss curves for ResidualVQ vs ResidualLFQ]

@fighting-Zhang (Author)

LFQ does not seem to support shared_codebook, which would mean the same index in the mesh codes corresponds to different meanings at different quantizer levels. Could this affect the model's learning?
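For context, here is what `shared_codebook = True` means in a residual VQ, reduced to a toy loop (illustrative pseudocode, not the library's implementation):

```python
import torch

# toy residual quantization step (illustration only, not library code)
def quantize(residual, codebook):
    dists = torch.cdist(residual, codebook)  # distances to every code
    idx = dists.argmin(dim = -1)             # nearest-code index per vector
    return codebook[idx], idx

num_levels, codebook_size, dim = 2, 16, 8
x = torch.randn(4, dim)

# one codebook reused at every level: index k denotes the same vector
# everywhere, so codes have a level-independent meaning; with per-level
# codebooks the same integer can map to a different vector at each level
shared = torch.randn(codebook_size, dim)

residual, codes = x, []
for _ in range(num_levels):
    q, idx = quantize(residual, shared)
    codes.append(idx)
    residual = residual - q                  # next level quantizes the residual
```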

Additionally, the paper uses an RVQ-VAE, which suggests that RVQ should also be capable here. In practical training, however, RVQ performs very poorly. Could this be an issue with the code?

@lucidrains (Owner) commented Mar 28, 2024

@fighting-Zhang i think scalar quantization is the future. you aren't the only one reporting great results without loss of generalization

LFQ has a fixed codebook, so it doesn't matter whether it is shared or not
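To unpack that: in lookup-free quantization each latent dimension is quantized to its sign, so the codebook is the fixed set {-1, +1}^d and the index is just the binary encoding of the sign pattern. A minimal sketch (illustrative only, not the vector-quantize-pytorch internals):

```python
import torch

# LFQ stripped to its core (illustration only): the "codebook" is the
# fixed, parameter-free set {-1, +1}^d, so every residual level uses the
# same codebook by construction -- there is nothing learnable to share
def lfq_quantize(x):
    codes = torch.where(x > 0, 1., -1.)                     # sign quantization
    bits = (codes > 0).long()
    pows = 2 ** torch.arange(x.shape[-1], device = x.device)
    indices = (bits * pows).sum(dim = -1)                   # sign pattern -> index
    return codes, indices

x = torch.randn(4, 8)    # 8 dims -> implicit codebook size 2^8 = 256
codes, indices = lfq_quantize(x)
```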
