About the performance of MAE not matching the paper results on ImageNet #313
Comments
Hey, thanks for letting us know about this. I'm a bit busy until probably next year, but I'll try to check before then. In the meantime, please check whether the parameters we use match the original paper, as we might have missed some. I'll look into it myself as soon as I can.
Thanks for your help ~
Update: Hello, over the past few weeks we found that the start_lr (warmup_start_lr in my config) should be set to exactly 0 in the fine-tuning stage, since the default is a small nonzero number. This change adds about 2% top-1 acc (from 77.4%). So the reproduced version of MAE in solo-learn currently reaches 79.6% top-1 acc on ImageNet for 100ep pretraining (100ep fine-tuning). The wandb links (the other run was deleted) are provided for pretraining and fine-tuning, respectively. However, the accuracy reported for the official code is 82.1%, so some hyperparameters still need to be tuned.
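For reference, a minimal sketch of a linear-warmup + cosine schedule showing why warmup_start_lr matters; this is not the solo-learn scheduler itself, and the parameter names just mirror the config keys discussed above:

```python
import math

def lr_at_epoch(epoch, base_lr=1e-3, warmup_start_lr=0.0,
                warmup_epochs=5, max_epochs=100, min_lr=0.0):
    if epoch < warmup_epochs:
        # Linear warmup: starts at warmup_start_lr and ramps up to base_lr.
        return warmup_start_lr + (base_lr - warmup_start_lr) * epoch / warmup_epochs
    # Cosine decay from base_lr down to min_lr after warmup.
    progress = (epoch - warmup_epochs) / (max_epochs - warmup_epochs)
    return min_lr + 0.5 * (base_lr - min_lr) * (1 + math.cos(math.pi * progress))

# With warmup_start_lr=0 the very first steps use a near-zero LR;
# a small nonzero default (e.g. 3e-5) changes the early fine-tuning trajectory.
print(lr_at_epoch(0, warmup_start_lr=0.0), lr_at_epoch(0, warmup_start_lr=3e-5))
```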
Glad to hear that. I still haven't found the time to look into it. One experiment that might be very interesting (resources permitting) is to see how much a model pretrained with the official code differs from a model pretrained with solo-learn. I would suggest pretraining a model with the official code and then running our fine-tuning; that way we can tell whether the problem is in the pretraining or the fine-tuning.
Another thing to consider is that MAE is very sensitive to the JPEG decoding library that you use. For instance, if you are using pillow-simd you can expect a ~1% loss in accuracy w.r.t. normal Pillow.
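If it helps, here is a quick way to check which Pillow build is active in the environment; the ".post" version-suffix heuristic for pillow-simd is an assumption, not something Pillow guarantees:

```python
import PIL

print("Pillow version:", PIL.__version__)
# pillow-simd releases typically carry a ".postN" suffix, e.g. "9.0.0.post1".
if "post" in PIL.__version__:
    print("Likely pillow-simd")
else:
    print("Likely stock Pillow")
```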
TL;DR (optional reading): One of our members re-ran the 100ep pretraining with the official code using the official MAE 1600ep config, and we only got 80.62%. He also found a paper reporting 81.2% accuracy for 100ep pretraining. So we believe the reported 82.1% top-1 acc must involve some parameter tuning. On the other hand, the gap between 80.62% and 81.2% could be attributed to the random seed (variance: 0.6%). Thanks for your help; if you have free time to look into this, we'll keep waiting and leave this issue open.
To speed up the data augmentation, we applied DALI, which indeed decodes JPEGs with its own routines. So, do you suggest that we disable DALI and use the default image-folder loader (which decodes with Pillow) instead?
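For illustration, a minimal torchvision ImageFolder pipeline that decodes JPEGs through Pillow; the dataset path and transform values are placeholders, and this is not the exact solo-learn loader:

```python
import torch
from torchvision import datasets, transforms

transform = transforms.Compose([
    transforms.RandomResizedCrop(224),      # decode happens in Pillow, then crop
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize(mean=(0.485, 0.456, 0.406),
                         std=(0.229, 0.224, 0.225)),
])

dataset = datasets.ImageFolder("/path/to/imagenet/train", transform=transform)
loader = torch.utils.data.DataLoader(dataset, batch_size=256,
                                     num_workers=8, pin_memory=True)
```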
@HuangChiEn Hi, I wonder if you also tried to benchmark MAE on CIFAR-10. I ran the training script using the config file in the repo and got 83% top-1 evaluation accuracy. Does this look right?
I'm sorry, but we don't have any experience pretraining/fine-tuning MAE on CIFAR-10. However, I think 83% is a bit lower than expected if you have pretrained and fine-tuned it. Also, from a practical point of view, a small-scale supervised model can easily surpass this accuracy, so it may not be the main target of SSL research. Besides, I believe the contributors of solo-learn have already provided well-tuned configs for CIFAR-10/CIFAR-100 here.
Yes. This is the config I used to pretrain MAE. This is the run on wandb: https://wandb.ai/chobitstian/solo-learn. It's the first one, named "mae-cifar10"; please ignore all the other runs.
I have quickly scanned your pretraining config; may I suggest that you follow the configuration given by solo-learn, which I believe is well-tested.
Also, note that this version of the solo-learn config is more straightforward. Since few benchmarks directly pretrain on CIFAR-10 and then fine-tune, I cannot judge this accuracy either. However, I believe it should be near MoCo v3's performance (93.10/99.80); at the least, DeepCluster V2 is a lower bound (88.85/99.58).
Uh, that's interesting. The current config (the one you mentioned before) doesn't mention warmup epochs. I think it might also be caused by the effect of DDP: when you simply increase the number of GPUs, the effective batch size becomes larger, but I don't think the current code adjusts the learning rate for that. So I will try a single GPU with the old config file you suggested. I'll let you know the result. Thanks for the help!
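For context, the MAE paper scales the learning rate linearly with the effective batch size; a minimal sketch of that rule, with hypothetical variable names:

```python
def scaled_lr(base_lr, batch_size_per_gpu, num_gpus, accum_steps=1):
    # Effective batch size = per-GPU batch size x number of GPUs x accumulation steps.
    effective_batch_size = batch_size_per_gpu * num_gpus * accum_steps
    # Linear scaling rule: lr = base_lr * effective_batch_size / 256.
    return base_lr * effective_batch_size / 256

# e.g. base_lr=1.5e-4 (the MAE pretraining default), 256 images/GPU on 8 GPUs
print(scaled_lr(1.5e-4, 256, 8))  # -> 0.0012
```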
Good morning, thanks to DonkeyShot21's suggestion, we're glad to have found a suitable configuration for MAE in both pretraining and fine-tuning. The final accuracy reaches 81.6% top-1, 95.5% top-5 (yes, solo-learn works slightly better than the official code with the same config). The following wandb link provides the detailed configuration for pretraining MAE for 100ep on the ImageNet dataset:
I also provide the configurations in easy_configer format: pretraining 100ep on ImageNet: MAE
fine-tuning 100ep
@zeyuyun1
Sorry, I am a little confused about your comment. You said "The final accuracy reaches 81.6% top-1, 95.5% top-5"; is this for CIFAR-10? 81.6% top-1 acc is pretty low, right?
Oh, haha ~ (81.6% top-1 acc, 95.5% top-5 acc) is ImageNet accuracy, actually. Yeah, I think CIFAR-10 should be higher than that ~
@HuangChiEn Thanks for providing that! So the issue was fixed by using the default image folder instead of DALI? I'll convert your config to our configuration format and open a PR. Can you also provide the pretrained and fine-tuned checkpoints?
Yes, I think we only modified the following part:
About the checkpoints, we can provide them, but it may take a while to prepare.
Sure. I'll update the config files. Thanks for the help. Let me know when you have the checkpoints and I'll add the results/checkpoints to the README.
Morning, the resulting checkpoints can be found at the following Google Drive link. Let's leave this issue open for 2 days; if you encounter any issue downloading, can't find the ckpt, etc., please tag me ~
@HuangChiEn Added the checkpoints to our zoo and added the results in #321.
Everything looks good ~
@HuangChiEn Thanks again for debugging it for us and providing the checkpoints/results :)
Firstly, thanks for releasing such an amazing framework, which covers almost all of the SOTA SSL methods.
Would you mind looking into why the performance of MAE does not match the paper results on ImageNet? The paper reports 82.1% top-1 acc for 100-epoch pretraining of the ViT-Base architecture with a 4096 batch size on the ImageNet dataset (100 epochs of fine-tuning).
They mention the results come from running the official code for 100, 300, and 1600 epochs. For the 300-epoch and 1600-epoch runs, we also find the accuracy matches other papers, so we think the 100-epoch result is likewise verified.
On the other hand, we used this version of solo-learn to run the MAE pretraining; the pretraining configuration as well as the run tracking can be found in the link.
We kept exactly the same configuration, but the resulting top-1 accuracy is only 77.4%, which is about 4.7% lower than the aforementioned 82.1%, and I think that is well beyond the random seed and acceptable experimental variance.
In addition, the fine-tuning configuration as well as the fine-tuning run tracking can be found in the link.
All the above fine-tuning configurations match the scripts of the official implementation (except the number of epochs).
Any suggestions are appreciated!!