This folder contains the implementation of ConvMAE transfer learning for object detection on COCO.
| Models | Pretrain | Pretrain Epochs | Finetune Epochs | #Params (M) | FLOPs (T) | box AP | mask AP | logs/weights |
|---|---|---|---|---|---|---|---|---|
| ConvMAE-B | IN1K w/o labels | 1600 | 25 | 104 | 0.9 | 53.2 | 47.1 | log/weight |
- Clone this repo:

```bash
git clone https://github.com/Alpha-VL/ConvMAE
cd ConvMAE/DET
```
- Create a conda environment and activate it:

```bash
conda create -n mimdet python=3.9
conda activate mimdet
```
- Install `torch==1.9.0` and `torchvision==0.10.0` (one possible install sequence is sketched after this list).
- Install `Detectron2==0.6`, following the d2 doc.
- Install `timm==0.4.12`, following the timm doc.
- Install `einops`, following the einops repo.
- Prepare the `COCO` dataset, following the d2 doc.
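A minimal install sketch for the pinned dependencies above, assuming CUDA 11.1 wheels; adjust both wheel indexes to your CUDA version (the d2 install doc lists builds for other torch/CUDA combinations):

```bash
# Assumes CUDA 11.1 builds; adjust the wheel indexes for your setup
pip install torch==1.9.0+cu111 torchvision==0.10.0+cu111 \
    -f https://download.pytorch.org/whl/torch_stable.html
# Pre-built Detectron2 0.6 for torch 1.9 / cu111, per the d2 install doc
pip install detectron2==0.6 \
    -f https://dl.fbaipublicfiles.com/detectron2/wheels/cu111/torch1.9/index.html
pip install timm==0.4.12 einops
```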
You can download COCO-2017 here and arrange it in the following format:
```
├── data
│   ├── coco
│   │   ├── annotations
│   │   ├── train2017
│   │   ├── val2017
│   │   ├── test2017
```
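If you do not already have COCO locally, the official archives can be fetched from cocodataset.org; a minimal sketch that produces the layout above:

```bash
# Official COCO-2017 archives
mkdir -p data/coco && cd data/coco
wget http://images.cocodataset.org/zips/train2017.zip
wget http://images.cocodataset.org/zips/val2017.zip
wget http://images.cocodataset.org/zips/test2017.zip
wget http://images.cocodataset.org/annotations/annotations_trainval2017.zip
for f in *.zip; do unzip -q "$f"; done
```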
It is suggested to point Detectron2 at the data path:

```bash
export DETECTRON2_DATASETS=/path/to/data
```
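Alternatively, Detectron2 by default looks for a `datasets/` directory under the current working directory, so a symlink works as well (a sketch assuming the layout above):

```bash
# Equivalent setup via symlink instead of the environment variable
mkdir -p datasets
ln -s /path/to/data/coco datasets/coco
```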
Download the finetuned model here.
```bash
# inference
python lazyconfig_train_net.py --config-file <CONFIG_FILE> --num-gpus <GPU_NUM> --eval-only train.init_checkpoint=<MODEL_PATH>
```
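For example, evaluating the fine-tuned ConvMAE-B model on 8 GPUs might look like the following; the config and checkpoint names are illustrative placeholders, so substitute the actual files from this repo and the download link above:

```bash
# Placeholder config/checkpoint names; substitute the real files
python lazyconfig_train_net.py \
    --config-file configs/convmae_base_mask_rcnn.py \
    --num-gpus 8 \
    --eval-only train.init_checkpoint=convmae_base_coco_finetuned.pth
```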
Download the pretrained ConvMAE model here.
```bash
# single-machine training
python lazyconfig_train_net.py --config-file <CONFIG_FILE> --num-gpus <GPU_NUM> model.backbone.bottom_up.pretrained=<PRETRAINED_MODEL_PATH>

# multi-machine training
python lazyconfig_train_net.py --config-file <CONFIG_FILE> --num-gpus <GPU_NUM> --num-machines <MACHINE_NUM> --master_addr <MASTER_ADDR> --master_port <MASTER_PORT> model.backbone.bottom_up.pretrained=<PRETRAINED_MODEL_PATH>
```
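As a concrete illustration, a single-machine fine-tuning run on 8 GPUs could be launched as follows; again, the config name and checkpoint path are placeholders, not files guaranteed to exist in this repo:

```bash
# Placeholder names; point to the real config and the pretrained
# ConvMAE-B checkpoint downloaded above
python lazyconfig_train_net.py \
    --config-file configs/convmae_base_mask_rcnn.py \
    --num-gpus 8 \
    model.backbone.bottom_up.pretrained=convmae_base_pretrain_1600ep.pth
```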