This repository contains the official PyTorch implementation of the following paper:
Masked Frequency Modeling for Self-Supervised Visual Pre-Training,
Jiahao Xie, Wei Li, Xiaohang Zhan, Ziwei Liu, Yew Soon Ong, Chen Change Loy
In: International Conference on Learning Representations (ICLR), 2023
[arXiv][Project Page][Bibtex]
- [04/2023] Code and models of SR, Deblur, Denoise and MFM are released.
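MFM corrupts an image by masking out a portion of its frequency spectrum (via a 2D FFT) and trains the network to predict the missing frequencies; SR, Deblur, and Denoise are the image-degradation pre-training baselines studied alongside it in the paper. Below is a minimal sketch of the frequency-masking step, assuming a circular low-/high-pass mask; the function name and all parameters are illustrative, not the repository's API.

```python
import torch


def mfm_corrupt(x: torch.Tensor, radius: float = 16.0, low_pass: bool = True) -> torch.Tensor:
    """Zero out a band of frequency components and return the corrupted image.

    x: (B, C, H, W) float tensor. `radius` and `low_pass` control a circular
    mask on the centered spectrum (names are illustrative assumptions).
    """
    _, _, H, W = x.shape
    # Per-channel 2D FFT; shift the zero-frequency component to the center.
    freq = torch.fft.fftshift(torch.fft.fft2(x, norm="ortho"), dim=(-2, -1))
    # Distance of every spectral position from the spectrum center.
    yy = torch.arange(H, device=x.device, dtype=torch.float32) - H / 2
    xx = torch.arange(W, device=x.device, dtype=torch.float32) - W / 2
    dist = (yy[:, None] ** 2 + xx[None, :] ** 2).sqrt()
    # Keep low frequencies (inside the circle) or high frequencies (outside).
    mask = (dist <= radius) if low_pass else (dist > radius)
    freq = freq * mask.to(freq.dtype)
    # Back to the spatial domain; any imaginary residue is numerical noise.
    return torch.fft.ifft2(torch.fft.ifftshift(freq, dim=(-2, -1)), norm="ortho").real
```

The model then regresses the masked frequencies on the spectrum; see the paper and PRETRAIN.md for the actual objective and masking schedule.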
ImageNet-1K Pre-trained and Fine-tuned Models (ViT-B/16)

Method | Backbone | Pre-train epochs | Fine-tune epochs | Top-1 acc (%) | Pre-train config | Pre-trained model | Fine-tune config | Fine-tuned model
---|---|---|---|---|---|---|---|---
SR | ViT-B/16 | 300 | 100 | 82.4 | config | model | config | model
Deblur | ViT-B/16 | 300 | 100 | 81.7 | config | model | config | model
Denoise | ViT-B/16 | 300 | 100 | 82.7 | config | model | config | model
MFM | ViT-B/16 | 300 | 100 | 83.1 | config | model | config | model
ImageNet-1K Pre-trained and Fine-tuned Models (ResNet-50)

Method | Backbone | Pre-train epochs | Fine-tune epochs | Top-1 acc (%) | Pre-train config | Pre-trained model | Fine-tune config | Fine-tuned model
---|---|---|---|---|---|---|---|---
SR | ResNet-50 | 300 | 100 | 77.9 | config | model | config | model
Deblur | ResNet-50 | 300 | 100 | 78.0 | config | model | config | model
Denoise | ResNet-50 | 300 | 100 | 77.5 | config | model | config | model
MFM | ResNet-50 | 300 | 100 | 78.5 | config | model | config | model
MFM | ResNet-50 | 300 | 300 | 80.1 | config | model | config | model
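A fine-tuned checkpoint from the tables above can typically be loaded into a timm backbone for evaluation. The sketch below assumes a local file name and a checkpoint dictionary nested under a "model" key; both are assumptions for illustration and may not match the released files.

```python
import timm
import torch

# Hypothetical local path; the actual checkpoints are linked in the tables above.
ckpt = torch.load("mfm_vit_base_finetuned.pth", map_location="cpu")
# Released checkpoints are often nested under a key such as "model" (assumption).
state = ckpt.get("model", ckpt) if isinstance(ckpt, dict) else ckpt

model = timm.create_model("vit_base_patch16_224", num_classes=1000)
missing, unexpected = model.load_state_dict(state, strict=False)
model.eval()
print(f"missing keys: {len(missing)}, unexpected keys: {len(unexpected)}")
```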
Please refer to INSTALL.md for installation and dataset preparation.
Please refer to PRETRAIN.md for pre-training instructions.
Please refer to FINETUNE.md for fine-tuning instructions.
If you find our work useful for your research, please consider giving a star ⭐ and citation 🍺:
@inproceedings{xie2023masked,
  title={Masked Frequency Modeling for Self-Supervised Visual Pre-Training},
  author={Xie, Jiahao and Li, Wei and Zhan, Xiaohang and Liu, Ziwei and Ong, Yew Soon and Loy, Chen Change},
  booktitle={ICLR},
  year={2023}
}
This code is built using the timm library, the BEiT repository, and the SimMIM repository.