Regarding the issue with the new version #14

manxswl opened this issue Aug 13, 2024 · 10 comments

@manxswl

manxswl commented Aug 13, 2024

In the first stage of the new version, the reconstructed signal appears smoother than in the old version. The GT and reconstructed signals almost overlap, but is this result correct?
[image: signal]

@danelee2601
Collaborator

Yes, the improvement comes from using the Snake activation function instead of ReLU.
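
For reference, here is a minimal PyTorch sketch of the Snake activation, snake_a(x) = x + (1/a)·sin²(ax) (Ziyin et al., 2020). The class name, the per-channel learnable `a`, and the example block below are illustrative assumptions, not necessarily how this repo implements it:

```python
import torch
import torch.nn as nn

class Snake(nn.Module):
    """Snake activation: x + (1/a) * sin^2(a*x).

    `a` controls the frequency of the periodic term; a learnable
    per-channel parameter is one common choice (assumed here).
    """
    def __init__(self, channels: int, a_init: float = 1.0):
        super().__init__()
        # shape (1, C, 1) broadcasts over inputs of shape (batch, channels, length)
        self.a = nn.Parameter(torch.full((1, channels, 1), a_init))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + (1.0 / (self.a + 1e-9)) * torch.sin(self.a * x) ** 2

# e.g., swapping ReLU for Snake in a 1D conv block:
block = nn.Sequential(nn.Conv1d(64, 64, kernel_size=3, padding=1), Snake(64))
```

Unlike ReLU, the sin² term gives the network a built-in bias toward oscillatory signals, which is one plausible reason the reconstructions look smoother.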

@danelee2601
Collaborator

danelee2601 commented Aug 13, 2024

I guess it's often the case that we gain hindsight about simple little things to tweak for further improvement.
But I'm not planning to release a v2, since it's not a major update; I've just put the implementation modification notes in the repo README.

@manxswl
Author

manxswl commented Aug 13, 2024

The idea behind your work is very interesting, so I tried to reproduce the paper. In the process, I found that the reconstruction results of the previous version and the newer version differ greatly, and I'm not sure which one is correct. The result produced by the old version is shown in the image below. Is the activation function the only reason for this difference?
[image: old_version]

@manxswl
Author

manxswl commented Aug 13, 2024

Also, I noticed that you have made some changes to both the encoder and decoder, and you've removed the loss handling related to zero padding. I don't quite understand why this was done.

@manxswl
Author

manxswl commented Aug 13, 2024

The idea behind your work is very interesting, so I tried to reproduce the paper. In the process, I found that the reconstruction results of the previous version and the newer version differ greatly, and I'm not sure which one is correct. The result produced by the old version is shown in the image below. Is the activation function the only reason for this difference? [image: old_version]

I apologize, it was an issue with my dataset. This work is very interesting. Thank you for your explanations.

@manxswl
Author

manxswl commented Aug 13, 2024

In the first stage of the new version, the reconstructed signal appears smoother than in the old version. The GT and reconstructed signals almost overlap, but is this result correct? [image: signal]

The result after loading the dataset correctly is shown below.
[image: new_signal.jpg]

@manxswl
Author

manxswl commented Aug 13, 2024

After reproducing the results, I found that the reconstruction error for low-frequency signals is larger than before the update, and it's unstable.
[image: new_signal]

@danelee2601
Collaborator

danelee2601 commented Aug 13, 2024

Hmm, I've noticed that training on the FordA dataset generally results in an unsatisfactory validation loss, which is what your figure (reconstruction on a test sample) shows. Have you tried other datasets? Additionally, the same model with the ReLU activation results in poorer optimization (i.e., slower training and worse convergence). You can try it out yourself and see.

I've personally experimented quite extensively to confirm the positive effect of the current VQVAE. I found it more stable with faster and better convergence.

@danelee2601
Collaborator

danelee2601 commented Aug 13, 2024

Also, I noticed that you have made some changes to both the encoder and decoder, and you've removed the loss handling related to zero padding. I don't quite understand why this was done.

I've simplified the code by keeping only the essentials, so that end users can read and use it more easily.
The performance wasn't affected much.
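
For illustration only, here is a minimal sketch of what that simplification means for the reconstruction loss; the tensor shapes, function names, and the `pad_mask` argument are assumptions for this example, not the repo's actual code:

```python
import torch
import torch.nn.functional as F

def recon_loss_masked(x_hat: torch.Tensor, x: torch.Tensor, pad_mask: torch.Tensor) -> torch.Tensor:
    """MSE computed only over valid timesteps.

    pad_mask: (B, 1, L) tensor with 1 for real samples and 0 for zero padding.
    """
    se = (x_hat - x) ** 2 * pad_mask
    return se.sum() / pad_mask.sum().clamp(min=1)

def recon_loss_plain(x_hat: torch.Tensor, x: torch.Tensor) -> torch.Tensor:
    """Simplified variant: plain MSE over the whole sequence, padding included."""
    return F.mse_loss(x_hat, x)
```

If the padded portion of each sequence is short, the two losses are numerically close, which is consistent with the performance not being affected much.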

@manxswl
Author

manxswl commented Aug 14, 2024

I didn't use any other dataset. I just put all the commands from the README into a Slurm script at once without paying close attention, and I got the dataset wrong. I noticed that the log generated in the second stage looks like this (see image). Is this correct? It seems a bit strange.
I just saw in the README that the dataset has been changed to Wafer, so I'll try again with that.
[image: visual comp (X_test vs Xhat)]
