The training code can be downloaded from https://drive.google.com/file/d/1jaf4iB66mAn_J5iDwqvha2xAj4AVyx3y/view?usp=sharing
Classification on ImageNet with PyTorch.
- PyTorch-compatible GPU
- Python 3.7
- PyTorch >= 1.2.0
- opencv-python 4.1.1
- libjpeg-turbo 2.0.3
- jpeg2dct
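Before running anything, it can help to confirm that the Python-side requirements above are importable. The following is a minimal sketch, not part of the repository: the helper name `missing_modules` is hypothetical, and it only checks that the modules can be found (`torch` for PyTorch, `cv2` for opencv-python, `jpeg2dct`). It does not verify versions or the libjpeg-turbo installation.

```python
import importlib.util
import sys

# Import names for the packages listed above:
# PyTorch -> torch, opencv-python -> cv2, jpeg2dct -> jpeg2dct.
REQUIRED_MODULES = ["torch", "cv2", "jpeg2dct"]


def missing_modules(names):
    """Return the module names that cannot be found in this environment.

    Hypothetical helper for illustration; it does not check versions.
    """
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("Python", ".".join(map(str, sys.version_info[:3])))
    missing = missing_modules(REQUIRED_MODULES)
    print("Missing modules:", ", ".join(missing) if missing else "none")
```

An empty result only means the modules are discoverable; the version constraints listed above still need to be checked separately.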
- Install PyTorch
- Clone this repo recursively:
  `git clone --recursive https://github.com/calmevtime1990/supp`
- Install required packages:
  `pip install -r requirements.txt`
- Install libjpeg-turbo:
  `bash install_libjpegturbo.sh`
- Download pretrained models and extract to `pretrained`. The folder structure should look like this:
pretrained
├── resnet50dct_upscaled_static_24
│   ├── log.txt
│   └── model_best.pth.tar
└── resnet50dct_upscaled_static_64
    ├── log.txt
    └── model_best.pth.tar
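The layout above can be checked with a few lines of Python after extracting the archives. This is a minimal sketch rather than a script shipped with the repository: the helper name `missing_pretrained_files` is hypothetical, and the file list is copied from the folder structure shown above.

```python
from pathlib import Path

# Expected files, taken from the folder structure listed above.
EXPECTED_FILES = [
    "resnet50dct_upscaled_static_24/log.txt",
    "resnet50dct_upscaled_static_24/model_best.pth.tar",
    "resnet50dct_upscaled_static_64/log.txt",
    "resnet50dct_upscaled_static_64/model_best.pth.tar",
]


def missing_pretrained_files(root="pretrained"):
    """Return the expected files that are absent under `root`.

    Hypothetical helper for illustration; it only checks that files exist,
    not that the checkpoints are valid.
    """
    root = Path(root)
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]


if __name__ == "__main__":
    missing = missing_pretrained_files()
    if missing:
        print("Missing files:", *missing, sep="\n  ")
    else:
        print("pretrained/ layout looks complete.")
```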
- Prepare datasets. It is recommended to symlink the dataset root to `data`. The folder structure should look like this:
data
├── train
├── val
└── README.md
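The recommended symlink can be created from Python as sketched below. This is an illustrative helper, not part of the repository: the name `link_dataset` and the example path are assumptions. It creates the link and reports which of the `train` and `val` subfolders shown above are missing.

```python
import os
from pathlib import Path


def link_dataset(imagenet_dir, link_name="data"):
    """Symlink `link_name` to `imagenet_dir` and return missing subfolders.

    Hypothetical helper for illustration. Raises if the source directory
    does not exist or if `link_name` is already taken.
    """
    src = Path(imagenet_dir).expanduser().resolve()
    link = Path(link_name)
    if not src.is_dir():
        raise FileNotFoundError(f"Dataset root not found: {src}")
    if link.is_symlink() or link.exists():
        raise FileExistsError(f"{link} already exists")
    os.symlink(src, link, target_is_directory=True)
    # The layout above expects train/ and val/ under the dataset root.
    return [name for name in ("train", "val") if not (link / name).is_dir()]


# Example usage (the path is a placeholder, not a real location):
# missing = link_dataset("/path/to/imagenet")
# print("Missing subfolders:", missing or "none")
```

Creating symlinks on Windows may require elevated privileges; on Linux and macOS the call works for regular users.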
Run `resnet_upscaled_static.sh` to start testing. Set the `--data $imagenet_dir` argument to the location of the ImageNet dataset.
bash scripts/resnet_upscaled_static.sh 24
bash scripts/resnet_upscaled_static.sh 64
| ResNet-50 | #Channels | Size Per Channel | Top-1 (%) | Top-5 (%) | Normalized Input Size |
|---|---|---|---|---|---|
| RGB | 3 | 224x224 | 75.780 | 92.650 | 1.0 |
| DCT-24 (ours) | 24 | 56x56 | 76.792 | 93.254 | 0.5 |
| DCT-64 (ours) | 64 | 56x56 | 77.160 | 93.474 | 1.3 |
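The "Normalized Input Size" column is consistent with the number of input values (channels × height × width) divided by that of the RGB baseline (3 × 224 × 224). The sketch below reproduces the table values under that assumption; the function name `normalized_input_size` is hypothetical and not taken from the repository.

```python
def normalized_input_size(channels, height, width, base=(3, 224, 224)):
    """Input element count relative to a baseline input (assumed definition).

    The default baseline is the RGB input from the table: 3 x 224 x 224.
    """
    base_c, base_h, base_w = base
    return (channels * height * width) / (base_c * base_h * base_w)


print(normalized_input_size(3, 224, 224))           # 1.0 (RGB baseline)
print(normalized_input_size(24, 56, 56))            # 0.5 (DCT-24)
print(round(normalized_input_size(64, 56, 56), 1))  # 1.3 (DCT-64)
```

For DCT-24, 24 × 56 × 56 = 75,264 values against 150,528 for RGB, which gives exactly 0.5; DCT-64 gives 200,704 / 150,528 ≈ 1.33.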

| MobileNetV2 | #Channels | Size Per Channel | Top-1 (%) | Top-5 (%) |
|---|---|---|---|---|
| RGB | 3 | 224x224 | 71.702 | 90.415 |
| DCT-24 (ours) | 24 | 112x112 | 72.364 | 90.606 |
| DCT-32 (ours) | 32 | 112x112 | 72.282 | 90.592 |

If you use our code/models in your research, please cite our paper:
@InProceedings{Xu_2020_CVPR,
author = {Xu, Kai and Qin, Minghai and Sun, Fei and Wang, Yuhao and Chen, Yen-Kuang and Ren, Fengbo},
title = {Learning in the Frequency Domain},
booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2020}
}