
ReACGAN + ADC has lower performance than plain ReACGAN #184

Open

Adversarian opened this issue Apr 14, 2023 · 4 comments

Adversarian commented Apr 14, 2023

Hi,
first of all I would like to thank you tremendously for your work. I'm certain that this repository alone has saved countless hours for researchers and developers all across the world.

I am conducting a series of experiments on image generation using ADCGAN and ReACGAN, and I just noticed that, against my expectations, ReACGAN + ADC actually performs worse than plain ReACGAN on CIFAR100, yielding a lower Inception Score and a higher FID.

I'm using the default configuration files provided within the repo, namely ReACGAN-ADC-DiffAug.yaml and ReACGAN-DiffAug.yaml. I also ran the training procedure twice, at 200k steps each, to make sure I hadn't hit an odd outlier the first time.

I would like to inquire why this could be happening, since it seems counterintuitive that adding an ADC component to ReACGAN should hurt its performance; from a theoretical standpoint, there appear to be no obvious downsides to this augmentation. I would appreciate your insight on the matter.
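
For context, here is a minimal sketch of what I understand the ADC component to add, based on my reading of the ADCGAN paper rather than the exact StudioGAN implementation (the function names are mine):

```python
import torch
import torch.nn.functional as F

# Sketch of the ADC auxiliary losses (my reading of ADCGAN, not the exact
# StudioGAN code). The discriminative classifier head outputs 2*C logits:
# indices [0, C) mean "real with label c", [C, 2C) mean "fake with label c".

def adc_classifier_loss(logits_real, logits_fake, labels, num_classes):
    # Real images should be classified as "real, class c" ...
    loss_real = F.cross_entropy(logits_real, labels)
    # ... and generated images as "fake, class c".
    loss_fake = F.cross_entropy(logits_fake, labels + num_classes)
    return loss_real + loss_fake

def adc_generator_loss(logits_fake, labels, num_classes):
    # The generator pushes its samples toward "real, class c" and away
    # from "fake, class c".
    log_probs = F.log_softmax(logits_fake, dim=1)
    idx = torch.arange(labels.size(0))
    pos = log_probs[idx, labels]                # log p(real, c | G(z, c))
    neg = log_probs[idx, labels + num_classes]  # log p(fake, c | G(z, c))
    return -(pos - neg).mean()
```

This auxiliary term sits on top of the adversarial loss, so any instability it introduces compounds with the base objective.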

Thank you in advance and please do let me know if you need any further information.

@mingukkang (Collaborator)

> I am conducting a series of experiments on image generation using ADCGAN and ReACGAN, and I just noticed that, against my expectations, ReACGAN + ADC actually performs worse than plain ReACGAN on CIFAR100, yielding a lower Inception Score and a higher FID.

We have encountered the same issue, and I am glad that you brought it up.

How about the results on CIFAR10? My assumption is that the increased number of classes may have contributed to the training instability, given that training ReACGAN with the ADC trick is generally more challenging than vanilla ReACGAN training. Despite this, I still believe there should be no theoretical issues, so I am interested in exploring why the results did not meet our expectations :)
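
To make the class-count intuition concrete, a quick back-of-envelope sketch (dataset sizes are the standard training splits; the 2C-way head follows the ADCGAN formulation):

```python
# How the ADC classification problem scales with the number of classes C:
# the head distinguishes 2*C joint labels (real/fake x class), while the
# number of real images per class shrinks as C grows.
datasets = {
    "CIFAR10":      (10, 50_000),
    "CIFAR100":     (100, 50_000),
    "TinyImageNet": (200, 100_000),
}
for name, (num_classes, train_images) in datasets.items():
    per_class = train_images // num_classes
    print(f"{name}: {2 * num_classes}-way ADC head, "
          f"~{per_class} real images per class")
```

So on CIFAR100 the auxiliary classifier already faces a 200-way problem with only ~500 real images per class, which would be consistent with the instability showing up there and not on CIFAR10.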

@Adversarian (Author)

Due to certain constraints on my training configuration, I was not able to evaluate the results using InceptionV3_TF and have fallen back on the Torch version, which I'm assuming gives results in the same ballpark as the TensorFlow version (more on this caveat after the numbers below). With that said, here are my best results on CIFAR10 with ReACGAN-ADC-DiffAug and ReACGAN-DiffAug:

ReACGAN + ADC + DiffAug: (evaluated at best checkpoint 198000/200000)

  • Best IS: 8.797917366027832
  • Best FID: 2.614202591374294

ReACGAN + DiffAug: (evaluated at best checkpoint 198000/200000)

  • Best IS: 8.668767929077148
  • Best FID: 2.626044013052365

Both of these conform to my expectations.
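
On the backbone caveat above: the PyTorch and TensorFlow InceptionV3 backbones use different weights, so IS/FID values from the two are close but not interchangeable, and scores should only be compared across runs evaluated with the same backbone. A minimal illustrative sketch with torchmetrics (assuming a recent version; this is not the StudioGAN evaluation path):

```python
import torch
from torchmetrics.image.fid import FrechetInceptionDistance
from torchmetrics.image.inception import InceptionScore

# Illustrative only: PyTorch-Inception metrics, not the TF-Inception numbers
# most papers report. normalize=True expects float images in [0, 1].
fid = FrechetInceptionDistance(feature=2048, normalize=True)
inception = InceptionScore(normalize=True)

real = torch.rand(64, 3, 299, 299)  # stand-in batches
fake = torch.rand(64, 3, 299, 299)

fid.update(real, real=True)
fid.update(fake, real=False)
inception.update(fake)

print("FID:", fid.compute().item())
print("IS (mean, std):", inception.compute())
```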

@Adversarian (Author)

It might be that applying DRA would bring the results in line with our expectations, given the authors' claims that DRA alleviates the stability and mode-collapse issues via a regret-minimization objective. I'm not entirely sure, since I'm not well versed in the theory behind the DRA regularization algorithm or how it might affect these particular GAN architectures, but I will try to redo my experiments with DRA applied to both configurations if I have time.
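
For what it's worth, here is a rough sketch of the penalty I have in mind, assuming DRA refers to the DRAGAN-style (deep regret analytic) gradient penalty around perturbed real samples; the constants are commonly used defaults, not values verified against StudioGAN:

```python
import torch

# DRAGAN-style penalty (assumption: this is what "DRA" denotes here).
# It penalizes the discriminator's gradient norm in a noisy neighborhood
# of the real data, which is motivated by regret minimization.
def dragan_penalty(discriminator, real_images, lambda_gp=10.0, c=0.5):
    noise = c * real_images.std() * torch.rand_like(real_images)
    perturbed = (real_images + noise).requires_grad_(True)
    out = discriminator(perturbed)
    grads = torch.autograd.grad(
        outputs=out.sum(), inputs=perturbed, create_graph=True
    )[0]
    grad_norm = grads.flatten(1).norm(2, dim=1)
    return lambda_gp * ((grad_norm - 1.0) ** 2).mean()
```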

@Adversarian (Author) commented Apr 24, 2023

I just finished training both models on Tiny ImageNet, and the differences are even starker. Although plain ReACGAN collapsed after about 100k steps, it still achieved far better results than ReACGAN + ADC did after 200k steps.

It seems, as you pointed out earlier, that there is some correlation between the number of classes (200 in the case of Tiny ImageNet) and how far these two configurations diverge.
