PyTorch implementation of SMFNet: "Unleashing the Power of Motion and Depth: A Selective Fusion Strategy for RGB-D Video Salient Object Detection".
Requirements:
- Python 3.7.0
- PyTorch 1.7.1
- Torchvision 0.8.2
- CUDA 11.0
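The pinned wheels are available from the PyTorch archive (e.g. `pip install torch==1.7.1+cu110 torchvision==0.8.2+cu110 -f https://download.pytorch.org/whl/torch_stable.html`). A minimal sanity check for the environment, assuming the versions above:

```python
# Quick check that the installed environment matches the pinned versions.
import torch
import torchvision

print(torch.__version__)          # expect 1.7.1 (+cu110 for the CUDA build)
print(torchvision.__version__)    # expect 0.8.2
print(torch.cuda.is_available())  # expect True on a CUDA 11.0 machine
```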
Training on RGB-D VSOD:
- Download the datasets (RDVS and DVisal) from Baidu Drive (PSW: d4ew) and save them at './dataset/' (a sketch of the expected layout follows this list).
- Download the pretrained RGB, depth, and flow stream models from Baidu Drive (PSW: lm6d) to './checkpoints/'.
- Run `python train.py` in the terminal.
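With the default paths above, the working tree before training would look roughly like this (the exact file and folder names follow whatever the downloads unpack to):

```
./dataset/
├── RDVS/
└── DVisal/
./checkpoints/
└── ...   # pretrained RGB, depth, and flow stream models
```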
Training on VSOD:
- Download the VSOD datasets from Baidu Drive (PSW: hveg) and save the training datasets (DAVIS, DAVSOD, FBMS) at './vsod_dataset/train' (see the layout sketch after this list).
- Download the pretrained RGB, depth, and flow stream models from Baidu Drive (PSW: 3c48) to './checkpoints/'.
- Run `python train.py` in the terminal.
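Similarly, the VSOD training data would sit under './vsod_dataset/train' (again, exact subfolder names follow the downloads):

```
./vsod_dataset/train/
├── DAVIS/
├── DAVSOD/
└── FBMS/
```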
Pretraining:
- Run `python pretrain.py` in the terminal. When pretraining the RGB stream, we additionally use DUTS-TR (Baidu Drive, PSW: h5sn) and the pretrained ResNet34 (Baidu Drive, PSW: mthj).
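As a rough illustration of how the downloaded ResNet34 weights would plug into a backbone (a sketch only; the checkpoint filename and format here are assumptions, not the repository's actual loading code):

```python
import torch
import torchvision.models as models

# Build the backbone without torchvision's ImageNet weights, then load the
# downloaded checkpoint. 'resnet34.pth' is a hypothetical filename; this
# assumes the file is a plain state dict.
backbone = models.resnet34(pretrained=False)
state = torch.load('./checkpoints/resnet34.pth', map_location='cpu')
backbone.load_state_dict(state)
```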
Testing on RGB-D VSOD:
- Download the trained model from Baidu Drive (PSW: hgm3) to './checkpoints/'.
- Run `python test.py` in the terminal.
Testing on VSOD:
- Download the trained model from Baidu Drive (PSW: p2q0) to './checkpoints/'.
- Run `python test.py` in the terminal.
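Once the saliency maps are written out, a standard metric such as MAE can be computed against the ground truth. A minimal sketch, assuming per-frame grayscale predictions and masks with matching filenames (the directory paths below are hypothetical, not the repository's evaluation code):

```python
import os
import numpy as np
from PIL import Image

def mae(pred_dir, gt_dir):
    """Mean absolute error between normalized saliency maps and GT masks."""
    errors = []
    for name in sorted(os.listdir(gt_dir)):
        gt = np.asarray(Image.open(os.path.join(gt_dir, name)).convert('L'),
                        dtype=np.float32) / 255.0
        pred = Image.open(os.path.join(pred_dir, name)).convert('L')
        pred = pred.resize(gt.shape[::-1])  # PIL expects (width, height)
        pred = np.asarray(pred, dtype=np.float32) / 255.0
        errors.append(np.abs(pred - gt).mean())
    return float(np.mean(errors))

print(mae('./results/RDVS', './dataset/RDVS/test/GT'))  # hypothetical paths
```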
Saliency maps:
- The saliency maps of our SMFNet can be downloaded from Baidu Drive (PSW: u8rz, RGB-D VSOD benchmarks) and Baidu Drive (PSW: 8mgu, VSOD benchmarks).
RGB-D VSOD benchmark:
- We have constructed the first RGB-D VSOD benchmark, which contains the results of 19 state-of-the-art (SOTA) methods evaluated on RDVS and DVisal.
- We evaluate the originally trained models on the test sets of RDVS and DVisal. The saliency maps can be downloaded from Baidu Drive (PSW: bjyk).
- We first fine-tune the originally trained models on the training sets of RDVS and DVisal, and then evaluate the fine-tuned models on the test sets of RDVS and DVisal. The saliency maps can be downloaded from Baidu Drive (PSW: hjwy).