This repo implements state-of-the-art fast semantic segmentation models on road-scene datasets (Cityscapes, Mapillary, CamVid).
Check out our fast segmentation framework in SFSegNets: SFNet (ECCV 2020) and SFNet-Lite (IJCV 2023).
The repo is meant for running experiments that verify ideas in fast semantic segmentation, and it also provides several fast models.
Our ICNet implementation achieves 74.5% mIoU, about 5 percentage points higher than the original paper. Here: model
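For readers unfamiliar with the metric, here is a minimal sketch of how mIoU (mean intersection over union) is usually computed from flattened label arrays. It is not the evaluation code of this repo: the function names are hypothetical, and the `ignore_index=255` default is an assumption based on the common Cityscapes convention for unlabeled pixels.

```python
def confusion_matrix(gt, pred, num_classes, ignore_index=255):
    """Count (ground truth, prediction) pairs; rows are ground truth, columns are predictions."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for g, p in zip(gt, pred):
        if g == ignore_index:  # assumed convention: skip unlabeled pixels
            continue
        matrix[g][p] += 1
    return matrix


def mean_iou(matrix):
    """Average per-class IoU = TP / (TP + FP + FN), skipping classes that never occur."""
    n = len(matrix)
    ious = []
    for c in range(n):
        tp = matrix[c][c]
        fn = sum(matrix[c]) - tp
        fp = sum(matrix[r][c] for r in range(n)) - tp
        denom = tp + fp + fn
        if denom > 0:
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example with 3 classes and one ignored pixel.
gt = [0, 0, 1, 1, 2, 255]
pred = [0, 1, 1, 1, 2, 0]
miou = mean_iou(confusion_matrix(gt, pred, num_classes=3))
print(f"mIoU = {miou:.4f}")
```

In practice the confusion matrix is accumulated over the whole validation set before the per-class IoU is taken, rather than averaging per-image scores.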
GALD-net provides implementations of some state-of-the-art, accuracy-oriented methods.
- ICNet: ICNet for Real-Time Semantic Segmentation on High-Resolution Images. ECCV-2018, paper
- DF-Net: Partial Order Pruning: for Best Speed/Accuracy Trade-off in Neural Architecture Search. CVPR-2019, paper
- Bi-Seg: Bilateral Segmentation Network for Real-time Semantic Segmentation. ECCV-2018, paper
- DFA-Net: Deep Feature Aggregation for Real-Time Semantic Segmentation. CVPR-2019, paper
- ESP-Net: Efficient Spatial Pyramid of Dilated Convolutions for Semantic Segmentation. ECCV-2018, paper
- SwiftNet: In Defense of Pre-trained ImageNet Architectures for Real-time Semantic Segmentation of Road-driving Images. CVPR-2019, paper
- Real-Time Semantic Segmentation via Multiply Spatial Fusion Network (Face++). arXiv, paper
- Fast-SCNN: Fast Semantic Segmentation Network. BMVC-2019, paper
- Use train_distribute.py for training. For example, use the scripts in the exp folder for training and evaluation.
- Use prediction_test_different_size.py for prediction with different input sizes.
- You can download the [Cityscapes dataset](https://www.cityscapes-dataset.com/). Note: please download leftImg8bit_trainvaltest.zip (11GB) and gtFine_trainvaltest.zip (241MB).
- You can download the CamVid dataset from here.
- You can download pretrained XceptionA (RGB input), ResNet18 (BGR input), and ResNet50 (BGR input) from this [link](https://pan.baidu.com/s/1mM_Lc44iX9CT1nPq6tjOAA) (password: bnfv), or via these links: resnet50-deep.pth, icnet_final.pth, resnet18-deep-caffe.pth, xceptiona_imagenet.pth
- Use synchronized batch normalization (sync-BN, via apex).
- Use a batch size of at least 8.
- Use a deeply supervised loss for easier optimization.
- Use a large crop size during training.
- Train small models for longer (60,000 iterations or more).
- Pretrain on Mapillary data to boost performance.
- The deep-base ResNet runs slower than the torch pretrained ResNet but reaches higher accuracy.
- Small networks do not need ImageNet pretraining if they are trained longer on Cityscapes (see the Fast-SCNN paper).
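The deeply supervised loss mentioned in the tips can be illustrated with a small framework-free sketch: the loss on the final prediction is combined with down-weighted losses on auxiliary (intermediate) predictions. This is only an illustration of the idea in plain Python, not the loss used in this repo. The function names are hypothetical, and the auxiliary weight of 0.4 is an assumed value, not one taken from the training scripts.

```python
import math


def cross_entropy(logits, target):
    """Cross-entropy for one pixel: -log softmax(logits)[target], computed stably."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum_exp - logits[target]


def mean_cross_entropy(pixel_logits, targets):
    """Average the per-pixel cross-entropy over all pixels."""
    losses = [cross_entropy(l, t) for l, t in zip(pixel_logits, targets)]
    return sum(losses) / len(losses)


def deep_supervised_loss(main_logits, aux_logits_list, targets, aux_weight=0.4):
    """Main loss plus aux_weight times each auxiliary head's loss (aux_weight is an assumption)."""
    total = mean_cross_entropy(main_logits, targets)
    for aux_logits in aux_logits_list:
        total += aux_weight * mean_cross_entropy(aux_logits, targets)
    return total


# Two pixels, two classes: a confident main head and an uninformative auxiliary head.
targets = [0, 1]
main_logits = [[2.0, 0.0], [0.0, 2.0]]
aux_logits = [[0.0, 0.0], [0.0, 0.0]]
loss = deep_supervised_loss(main_logits, [aux_logits], targets)
print(f"total loss = {loss:.4f}")
```

The auxiliary heads are typically used only during training and dropped at inference, so they add no cost to the deployed model.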
(a) test image | (b) ground truth | (c) predicted result |
---|---|---|
This project is released under the Apache 2.0 license.
Thanks to the previous open-source repos: Encoding, CCNet, TorchSeg, pytorchseg.