DiffCut: Catalyzing Zero-Shot Semantic Segmentation with Diffusion Features and Recursive Normalized Cut
conda create -n diffcut python=3.10
conda activate diffcut
pip install -r requirements.txt
For evaluation, also install detectron2:
python -m pip install 'git+https://github.com/facebookresearch/detectron2.git'
Try DiffCut by running the notebook diffcut.ipynb.
Visualize the semantic coherence of vision encoders (SD, CLIP, DINO, ...) with semantic_coherence.ipynb.
In the paper, we evaluate DiffCut on six benchmarks: PASCAL VOC (20 classes + background), PASCAL Context (59 classes + background), COCO-Object (80 classes + background), COCO-Stuff (27 classes), Cityscapes (27 classes), and ADE20K (150 classes). For dataset setup, see Preparing Datasets for DiffCut.
python eval_diffcut.py --dataset_name Cityscapes --tau 0.5 --alpha 10 --refinement
python eval_diffcut_openvoc.py --dataset_name VOC20 --tau 0.5 --alpha 10 --refinement
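The --tau and --alpha flags are not explained in this README. As a rough, generic illustration of recursive normalized cut (not the code in eval_diffcut.py), the sketch below assumes that alpha is an exponent that sharpens a cosine-similarity affinity and that tau is the largest normalized-cut cost at which a region is still split. Both roles, the function names, and the toy 2-D features are assumptions made for illustration; refer to the paper for the actual formulation.

```python
import numpy as np


def ncut_bipartition(W):
    """Split a graph in two with the spectral relaxation of normalized cut.

    W is a symmetric, non-negative affinity matrix with positive row sums.
    Returns a boolean mask for one side and the normalized-cut cost.
    """
    d = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(d)
    # Symmetric normalized Laplacian: I - D^{-1/2} W D^{-1/2}
    lap = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(lap)  # eigenvalues come back in ascending order
    # Second-smallest eigenvector, mapped back to the generalized problem
    # (D - W) y = lambda * D y, then thresholded at zero.
    mask = vecs[:, 1] * d_inv_sqrt > 0
    if mask.all() or not mask.any():
        return mask, np.inf  # degenerate split: treat as "do not cut"
    cut = W[mask][:, ~mask].sum()
    return mask, cut / d[mask].sum() + cut / d[~mask].sum()


def recursive_ncut(features, tau=0.5, alpha=10):
    """Assign a segment label to each feature vector (one row per patch).

    ASSUMPTION: affinity = clipped cosine similarity raised to the power alpha;
    a region is split only while its normalized-cut cost stays below tau.
    """
    x = features / np.linalg.norm(features, axis=1, keepdims=True)
    W = np.clip(x @ x.T, 0.0, None) ** alpha
    labels = np.zeros(len(x), dtype=int)
    next_label = 1
    stack = [np.arange(len(x))]  # index sets still waiting to be examined
    while stack:
        idx = stack.pop()
        if len(idx) < 2:
            continue
        mask, cost = ncut_bipartition(W[np.ix_(idx, idx)])
        if cost >= tau:
            continue  # cutting here is too expensive: keep the region whole
        labels[idx[mask]] = next_label  # one side receives a fresh label
        next_label += 1
        stack += [idx[mask], idx[~mask]]
    return labels


# Toy input: two groups of unit vectors, roughly 60 degrees apart.
angles = np.deg2rad([0, 3, -3, 60, 63, 57])
features = np.stack([np.cos(angles), np.sin(angles)], axis=1)
labels = recursive_ncut(features, tau=0.5, alpha=10)
```

In this sketch, lowering tau stops the recursion earlier and yields fewer segments, while a larger alpha suppresses weak similarities so that cuts between loosely related regions become cheaper.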
@misc{couairon2024zeroshot,
      title={Zero-Shot Image Segmentation via Recursive Normalized Cut on Diffusion Features},
      author={Paul Couairon and Mustafa Shukor and Jean-Emmanuel Haugeard and Matthieu Cord and Nicolas Thome},
      year={2024},
      eprint={2406.02842},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}
This repo relies on the following projects:
Diffuse, Attend, and Segment: Unsupervised Zero-Shot Segmentation using Stable Diffusion
Emergent Correspondence from Image Diffusion
Open Vocabulary Semantic Segmentation with Patch Aligned Contrastive Learning
Convolutions Die Hard: Open-Vocabulary Segmentation with Single Frozen Convolutional CLIP
Cut and Learn for Unsupervised Image & Video Object Detection and Instance Segmentation