This repository contains the PyTorch implementation of "Visually guided sound source separation using cascaded opponent filter network" by Lingyu Zhu and Esa Rahtu (Tampere University, Finland).
Operating System: Ubuntu 18.04.4 LTS, CUDA 10.1, Python 3.7, PyTorch 1.3.0
- The original MUSIC dataset can be downloaded from https://github.com/roudimit/MUSIC_dataset.
- The train/val/test splits of the A-NATURAL and A-MUSIC datasets can be downloaded from link. We suggest downloading the videos or audio from the original AudioSet using the YouTube IDs provided in the split files.
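For reference, a minimal sketch of fetching AudioSet audio by YouTube ID. The use of `yt-dlp` and the output directory layout are assumptions for illustration, not part of this repo:

```python
# Hypothetical helper: builds a yt-dlp command that downloads the audio
# track of an AudioSet clip as WAV, given its YouTube ID.
# (yt-dlp and the output path are assumptions, not provided by this repo.)

def youtube_url(video_id: str) -> str:
    # AudioSet split files list bare YouTube IDs; reconstruct the full URL.
    return f"https://www.youtube.com/watch?v={video_id}"

def download_command(video_id: str, out_dir: str = "data/audio") -> list:
    # -x extracts audio only; --audio-format wav converts it to WAV.
    return [
        "yt-dlp", "-x", "--audio-format", "wav",
        "-o", f"{out_dir}/%(id)s.%(ext)s",
        youtube_url(video_id),
    ]

if __name__ == "__main__":
    # Run the command with e.g. subprocess.run(download_command(vid))
    print(" ".join(download_command("dQw4w9WgXcQ")))
```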
- Please set the train/test split paths in scripts/train*.sh and scripts/eval.sh.
```shell
# Training
./scripts/train_sSep01_C2D_DYN.sh
# Evaluation
./scripts/eval.sh
```
[1] Zhao, Hang, et al. "The sound of pixels." Proceedings of the European conference on computer vision (ECCV). 2018.
[2] Zhao, Hang, et al. "The sound of motions." Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV). 2019.
[3] Xu, Xudong, Bo Dai, and Dahua Lin. "Recursive visual sound separation using minus-plus net." Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV). 2019.
[4] Gemmeke, Jort F., et al. "Audio set: An ontology and human-labeled dataset for audio events." 2017 IEEE international conference on acoustics, speech and signal processing (ICASSP). IEEE, 2017.
If you find this work useful in your research, please cite:
```bibtex
@inproceedings{zhu2020visually,
  title={Visually guided sound source separation using cascaded opponent filter network},
  author={Zhu, Lingyu and Rahtu, Esa},
  booktitle={Proceedings of the Asian Conference on Computer Vision},
  year={2020}
}
```
This repo is built on top of Sound-of-Pixels.