Applying PVT to Semantic Segmentation

This example uses MMSegmentation v0.13.0 to apply PVT as the backbone of Semantic FPN.

For details, see Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions.

If you use this code for a paper, please cite:

PVTv1

@misc{wang2021pyramid,
      title={Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions}, 
      author={Wenhai Wang and Enze Xie and Xiang Li and Deng-Ping Fan and Kaitao Song and Ding Liang and Tong Lu and Ping Luo and Ling Shao},
      year={2021},
      eprint={2102.12122},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

PVTv2

@misc{wang2021pvtv2,
      title={PVTv2: Improved Baselines with Pyramid Vision Transformer}, 
      author={Wenhai Wang and Enze Xie and Xiang Li and Deng-Ping Fan and Kaitao Song and Ding Liang and Tong Lu and Ping Luo and Ling Shao},
      year={2021},
      eprint={2106.13797},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Usage

Install MMSegmentation.

Data preparation

Prepare ADE20K according to the guidelines in MMSegmentation.

Results and models

| Method | Backbone | Pretrain | Iters | mIoU (code) | mIoU (paper) | Config | Download |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Semantic FPN | PVT-Tiny | ImageNet-1K | 40K | 36.6 | 35.7 | config | log & model |
| Semantic FPN | PVT-Small | ImageNet-1K | 40K | 41.9 | 39.8 | config | log & model |
| Semantic FPN | PVT-Medium | ImageNet-1K | 40K | 43.5 | 41.6 | config | log & model |
| Semantic FPN | PVT-Large | ImageNet-1K | 40K | 43.5 | 42.1 | config | log & model |
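
To see at a glance how far the released code is from the paper numbers, the two mIoU columns can be subtracted. The snippet below is only a small illustration: the values are copied from the table above, and the names `results` and `gaps` are made up for this example.

```python
# mIoU values copied from the results table: (code, paper)
results = {
    "PVT-Tiny": (36.6, 35.7),
    "PVT-Small": (41.9, 39.8),
    "PVT-Medium": (43.5, 41.6),
    "PVT-Large": (43.5, 42.1),
}

# Difference between the released code and the paper, rounded to one decimal
gaps = {name: round(code - paper, 1) for name, (code, paper) in results.items()}

for name, gap in gaps.items():
    print(f"{name}: +{gap} mIoU")
```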

Evaluation

To evaluate PVT-Small + Semantic FPN on a single node with 8 GPUs, run:

dist_test.sh configs/sem_fpn/PVT/fpn_pvt_s_ade20k_40k.py /path/to/checkpoint_file 8 --out results.pkl --eval mIoU
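
The `--eval mIoU` option reports mean intersection-over-union. As a rough illustration of what that metric measures, here is a minimal pure-Python sketch. It is not MMSegmentation's implementation: the function name `mean_iou`, the flat label lists, and the ignore label of 255 are assumptions made for this example.

```python
def mean_iou(pred, target, num_classes, ignore_index=255):
    """Mean IoU over the classes that occur in pred or target.

    pred and target are flat sequences of integer class labels.
    Pixels whose target equals ignore_index (assumed 255) are skipped.
    """
    intersect = [0] * num_classes
    pred_area = [0] * num_classes
    target_area = [0] * num_classes

    for p, t in zip(pred, target):
        if t == ignore_index:
            continue
        pred_area[p] += 1
        target_area[t] += 1
        if p == t:
            intersect[p] += 1

    ious = []
    for c in range(num_classes):
        union = pred_area[c] + target_area[c] - intersect[c]
        if union > 0:  # skip classes absent from both pred and target
            ious.append(intersect[c] / union)

    return sum(ious) / len(ious) if ious else float("nan")


# Tiny example: 4 pixels, 2 classes
score = mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2)
print(f"mIoU = {score:.4f}")
```

In this toy case class 0 has IoU 1/2 and class 1 has IoU 2/3, so the mean is 7/12. A real evaluation accumulates the per-class counts over the whole validation set before dividing.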

Training

To train PVT-Small + Semantic FPN on a single node with 8 GPUs, run:

dist_train.sh configs/sem_fpn/PVT/fpn_pvt_s_ade20k_40k.py 8

License

This repository is released under the Apache 2.0 license as found in the LICENSE file.