This folder contains the implementation of ConvMAE transfer learning for object detection on COCO.
| Models | Pretrain | Pretrain Epochs | Finetune Epochs | #Params (M) | FLOPs (T) | box AP | mask AP | logs/weights |
|---|---|---|---|---|---|---|---|---|
| ConvMAE-B | IN1K w/o labels | 1600 | 25 | 104 | 0.9 | 53.2 | 47.1 | log/weight |
- Clone this repo:

```bash
git clone https://github.com/Alpha-VL/ConvMAE
cd ConvMAE/DET
```
- Create a conda environment and activate it:

```bash
conda create -n mimdet python=3.9
conda activate mimdet
```
- Install `torch==1.9.0` and `torchvision==0.10.0` (one possible install sequence is sketched after this list).
- Install `Detectron2==0.6`, following the d2 doc.
- Install `timm==0.4.12`, following the timm doc.
- Install `einops`, following the einops repo.
- Prepare the `COCO` dataset, following the d2 doc.
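A minimal install sketch for the pinned dependencies above, assuming CUDA 11.1 wheels; adjust both wheel indexes to your CUDA version (the d2 install doc lists builds for other torch/CUDA combinations):

```bash
# Assumes CUDA 11.1 builds; adjust the wheel indexes for your setup
pip install torch==1.9.0+cu111 torchvision==0.10.0+cu111 \
    -f https://download.pytorch.org/whl/torch_stable.html
# Pre-built Detectron2 0.6 for torch 1.9 / cu111, per the d2 install doc
pip install detectron2==0.6 \
    -f https://dl.fbaipublicfiles.com/detectron2/wheels/cu111/torch1.9/index.html
pip install timm==0.4.12 einops
```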
You can download COCO-2017 here and arrange it in the following format:
```
├── data
│   ├── coco
│   │   ├── annotations
│   │   ├── train2017
│   │   ├── val2017
│   │   ├── test2017
```
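If you do not already have COCO locally, the official archives can be fetched from cocodataset.org; a minimal sketch that produces the layout above:

```bash
# Official COCO-2017 archives
mkdir -p data/coco && cd data/coco
wget http://images.cocodataset.org/zips/train2017.zip
wget http://images.cocodataset.org/zips/val2017.zip
wget http://images.cocodataset.org/zips/test2017.zip
wget http://images.cocodataset.org/annotations/annotations_trainval2017.zip
for f in *.zip; do unzip -q "$f"; done
```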
It is suggested to point Detectron2 at the data path:

```bash
export DETECTRON2_DATASETS=/path/to/data
```
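Alternatively, Detectron2 by default looks for a `datasets/` directory under the current working directory, so a symlink works as well (a sketch assuming the layout above):

```bash
# Equivalent setup via symlink instead of the environment variable
mkdir -p datasets
ln -s /path/to/data/coco datasets/coco
```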
Download the finetuned model here.
```bash
# inference
python lazyconfig_train_net.py --config-file <CONFIG_FILE> --num-gpus <GPU_NUM> --eval-only train.init_checkpoint=<MODEL_PATH>
```
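For example, evaluating the fine-tuned ConvMAE-B model on 8 GPUs might look like the following; the config and checkpoint names are illustrative placeholders, so substitute the actual files from this repo and the download link above:

```bash
# Placeholder config/checkpoint names; substitute the real files
python lazyconfig_train_net.py \
    --config-file configs/convmae_base_mask_rcnn.py \
    --num-gpus 8 \
    --eval-only train.init_checkpoint=convmae_base_coco_finetuned.pth
```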
Download the pretrained ConvMAE model here.
```bash
# single-machine training
python lazyconfig_train_net.py --config-file <CONFIG_FILE> --num-gpus <GPU_NUM> model.backbone.bottom_up.pretrained=<PRETRAINED_MODEL_PATH>

# multi-machine training
python lazyconfig_train_net.py --config-file <CONFIG_FILE> --num-gpus <GPU_NUM> --num-machines <MACHINE_NUM> --master_addr <MASTER_ADDR> --master_port <MASTER_PORT> model.backbone.bottom_up.pretrained=<PRETRAINED_MODEL_PATH>
```
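As a concrete illustration, a single-machine fine-tuning run on 8 GPUs could be launched as follows; again, the config name and checkpoint path are placeholders, not files guaranteed to exist in this repo:

```bash
# Placeholder names; point to the real config and the pretrained
# ConvMAE-B checkpoint downloaded above
python lazyconfig_train_net.py \
    --config-file configs/convmae_base_mask_rcnn.py \
    --num-gpus 8 \
    model.backbone.bottom_up.pretrained=convmae_base_pretrain_1600ep.pth
```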