# SeMask FPN

This repo contains the code for our paper SeMask: Semantically Masked Transformers for Semantic Segmentation. It is based on mmsegmentation.

*(Figure: SeMask overview)*

## Contents

1. Results
2. Setup Instructions
3. Demo
4. Citing SeMask

## 1. Results

### ADE20K

| Method | Backbone | Crop Size | mIoU | mIoU (ms+flip) | #params | config | Checkpoint |
| --- | --- | --- | --- | --- | --- | --- | --- |
| SeMask-T FPN | SeMask Swin-T | 512x512 | 42.11 | 43.16 | 35M | config | TBD |
| SeMask-S FPN | SeMask Swin-S | 512x512 | 45.92 | 47.63 | 56M | config | checkpoint |
| SeMask-B FPN | SeMask Swin-B | 512x512 | 49.35 | 50.98 | 96M | config | checkpoint |
| SeMask-L FPN | SeMask Swin-L | 640x640 | 51.89 | 53.52 | 211M | config | checkpoint |

### Cityscapes

| Method | Backbone | Crop Size | mIoU | mIoU (ms+flip) | #params | config | Checkpoint |
| --- | --- | --- | --- | --- | --- | --- | --- |
| SeMask-T FPN | SeMask Swin-T | 768x768 | 74.92 | 76.56 | 34M | config | checkpoint |
| SeMask-S FPN | SeMask Swin-S | 768x768 | 77.13 | 79.14 | 56M | config | checkpoint |
| SeMask-B FPN | SeMask Swin-B | 768x768 | 77.70 | 79.73 | 96M | config | checkpoint |
| SeMask-L FPN | SeMask Swin-L | 768x768 | 78.53 | 80.39 | 211M | config | checkpoint |

### COCO-Stuff 10k

| Method | Backbone | Crop Size | mIoU | mIoU (ms+flip) | #params | config | Checkpoint |
| --- | --- | --- | --- | --- | --- | --- | --- |
| SeMask-T FPN | SeMask Swin-T | 512x512 | 37.53 | 38.88 | 35M | config | checkpoint |
| SeMask-S FPN | SeMask Swin-S | 512x512 | 40.72 | 42.27 | 56M | config | checkpoint |
| SeMask-B FPN | SeMask Swin-B | 512x512 | 44.63 | 46.30 | 96M | config | checkpoint |
| SeMask-L FPN | SeMask Swin-L | 640x640 | 47.47 | 48.54 | 211M | config | checkpoint |

## 2. Setup Instructions

### Installation

### Inference

```shell
# single-gpu testing
python tools/test.py <CONFIG_FILE> <SEG_CHECKPOINT_FILE> --eval mIoU

# multi-gpu testing
tools/dist_test.sh <CONFIG_FILE> <SEG_CHECKPOINT_FILE> <GPU_NUM> --eval mIoU

# multi-gpu, multi-scale testing
tools/dist_test.sh <CONFIG_FILE> <SEG_CHECKPOINT_FILE> <GPU_NUM> --aug-test --eval mIoU
```
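As a concrete illustration, the sketch below fills in the placeholders for a multi-scale evaluation of SeMask-T FPN on Cityscapes, using the config shipped in this repo. The checkpoint path and GPU count are hypothetical stand-ins for your own setup; the script only assembles and prints the command so you can inspect it before running:

```shell
# Config file from this repo; checkpoint path is a hypothetical placeholder --
# substitute the file you actually downloaded from the results table.
CONFIG=configs/semask_swin/cityscapes/semfpn_semask_swin_tiny_patch4_window7_768x768_80k_cityscapes.py
CHECKPOINT=checkpoints/semask_tiny_fpn_cityscapes.pth
GPUS=4

# Multi-scale (ms+flip) evaluation; drop --aug-test for single-scale numbers.
echo tools/dist_test.sh "$CONFIG" "$CHECKPOINT" "$GPUS" --aug-test --eval mIoU
```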

### Training

To train with pre-trained models, run:

```shell
# single-gpu training
python tools/train.py <CONFIG_FILE> --options model.pretrained=<PRETRAIN_MODEL> [model.backbone.use_checkpoint=True] [other optional arguments]

# multi-gpu training
tools/dist_train.sh <CONFIG_FILE> <GPU_NUM> --options model.pretrained=<PRETRAIN_MODEL> [model.backbone.use_checkpoint=True] [other optional arguments]
```

For example, to train a Semantic-FPN model with a SeMask Swin-T backbone on 8 GPUs, run:

```shell
tools/dist_train.sh configs/semask_swin/cityscapes/semfpn_semask_swin_tiny_patch4_window7_768x768_80k_cityscapes.py 8 --options model.pretrained=<PRETRAIN_MODEL>
```

Notes:

- `use_checkpoint` is used to save GPU memory. Please refer to this page for more details.
- The default learning rate and training schedule are as follows:
  - ADE20K: 2 GPUs with 8 imgs/gpu. For the Large variant, we use 4 GPUs with 4 imgs/gpu.
  - Cityscapes: 2 GPUs with 4 imgs/gpu. For the Large variant, we use 4 GPUs with 2 imgs/gpu.
  - COCO-Stuff 10k: 4 GPUs with 4 imgs/gpu. For the Base and Large variants, we use 8 GPUs with 2 imgs/gpu.
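One way to read these schedules: within each dataset, the Large-variant setting trades more GPUs for fewer images per GPU, keeping the effective (global) batch size unchanged. A quick sanity check of that arithmetic:

```shell
# Effective batch size = num GPUs x imgs/gpu, using the numbers from the notes above.
ADE20K=$((2 * 8));       ADE20K_LARGE=$((4 * 4))        # both 16
CITYSCAPES=$((2 * 4));   CITYSCAPES_LARGE=$((4 * 2))    # both 8
COCOSTUFF=$((4 * 4));    COCOSTUFF_LARGE=$((8 * 2))     # both 16
echo "ADE20K=$ADE20K Cityscapes=$CITYSCAPES COCO-Stuff=$COCOSTUFF"
```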

## 3. Demo

To save the predictions, run the following command:

```shell
python tools/test.py <CONFIG_FILE> <SEG_CHECKPOINT_FILE> --eval mIoU --show-dir visuals
```

*(Figure: demo)*

## 4. Citing SeMask

```bibtex
@article{jain2021semask,
  title={SeMask: Semantically Masking Transformer Backbones for Effective Semantic Segmentation},
  author={Jitesh Jain and Anukriti Singh and Nikita Orlov and Zilong Huang and Jiachen Li and Steven Walton and Humphrey Shi},
  journal={arXiv},
  year={2021}
}
```