
omshrivastava26/IPR_diff-bgm-210685_om_shrivastava

 
 


Diff-BGM: A Diffusion Model for Video Background Music Generation

Implementation

1. Installation

pip install -r requirements.txt
pip install -e diffbgm
pip install -e diffbgm/mir_eval

Note: the mir_eval package is not included in this repository; it has to be downloaded from here.

2. Training

Preparations

Please download the following files; they are too large to be uploaded to this repository.

  1. The extracted features of the POP909 dataset can be accessed here. Please put them under /data/ after extraction.

  2. The extracted features of the BGM909 dataset can be accessed here. Please put them under /data/bgm909/ after extraction. We use VideoCLIP to extract the video features, BLIP to generate the video captions, Bert-base-uncased as the language encoder, and TransNetV2 to detect the shots.
    We also provide the original captions here.

  3. The pre-trained models needed for training can be accessed here. Please put them under /pretrained/ after extraction. The dataset split can be found here.
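Before starting training, it can help to confirm that the downloads ended up where the steps above expect them. The following is a minimal sketch, not part of the original code: the helper name `missing_dirs` is hypothetical, and it assumes the paths /data/, /data/bgm909/ and /pretrained/ are relative to the repository root.

```python
from pathlib import Path

# Directories named in the preparation steps.
# Assumption: they are relative to the repository root.
EXPECTED_DIRS = ("data", "data/bgm909", "pretrained")


def missing_dirs(repo_root="."):
    """Return the expected directories that are absent or empty under repo_root."""
    root = Path(repo_root)
    missing = []
    for rel in EXPECTED_DIRS:
        path = root / rel
        # A directory counts as missing if it does not exist or has no entries.
        if not path.is_dir() or not any(path.iterdir()):
            missing.append(rel)
    return missing


if __name__ == "__main__":
    for rel in missing_dirs():
        print(f"missing or empty: {rel}")
```

Run from the repository root, the script prints nothing when all three directories exist and are non-empty. It does not check that the individual feature files are complete.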

Commands

python diffbgm/main.py --model ldm_chd8bar --output_dir [output_dir]

This command does not work in the original repository as published. I resolved several errors in my local environment and got training to start, but during the first epoch it fails with a "Ran out of input" error while loading a pickle file. That pickle file is also missing from the original repository; I have added it here.
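In Python, "Ran out of input" is the message of the `EOFError` that `pickle.load` raises when the file is empty, so a zero-byte or partially written pickle file is a likely cause. The sketch below shows one way to make that failure easier to diagnose. The helper name `load_pickle_checked` is hypothetical and does not appear in the Diff-BGM code.

```python
import pickle
from pathlib import Path


def load_pickle_checked(path):
    """Load a pickle file, raising a descriptive error if it is missing, empty, or truncated."""
    p = Path(path)
    if not p.is_file():
        raise FileNotFoundError(f"pickle file not found: {p}")
    if p.stat().st_size == 0:
        # pickle.load on an empty file raises EOFError("Ran out of input").
        raise ValueError(f"pickle file is empty: {p}")
    with p.open("rb") as f:
        try:
            return pickle.load(f)
        except (EOFError, pickle.UnpicklingError) as exc:
            # Reached when the file ends before the pickled object is complete.
            raise ValueError(f"pickle file is truncated or corrupt: {p}") from exc
```

If the check reports an empty or truncated file, downloading or regenerating the file is the usual remedy. Whether that resolves the training error here has not been verified.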

3. Inference

Please use the following command to generate music for videos in BGM909.

python diffbgm/inference_sdf.py --model_dir=[model_dir] --uncond_scale=5.

This script works, but it iterates over multiple folders, and only some of them are present in this repository. The code is also set up for other folders, but those are not available in the repository or on any drive shared by the author. For the folders that are present, the inference results are written to the diffbgm/exp folder.
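To see in advance which input folders will be processed and which will be skipped, a small check such as the one below can be run. It is an illustrative sketch only: the function `existing_folders` is hypothetical, and the folder list has to be taken from whatever inference_sdf.py is configured to read.

```python
from pathlib import Path


def existing_folders(candidates):
    """Split candidate folders into those that exist on disk and those that do not."""
    present, absent = [], []
    for folder in candidates:
        # Sort each candidate into the matching list, preserving input order.
        (present if Path(folder).is_dir() else absent).append(str(folder))
    return present, absent
```

Calling it with the list of configured input folders returns two lists, so the absent folders can be reported before a long inference run starts.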

4. Test

To reproduce the metrics in our original paper, please refer to /diffbgm/test.ipynb.

Backbone                 PCHE   GPS    SI     P@20   Weights
Diff-BGM (original)      2.840  0.601  0.521  44.10  weights
Diff-BGM (only visual)   2.835  0.514  0.396  43.20  weights
Diff-BGM (w/o SAC-Att)   2.721  0.789  0.523  38.47  weights

We provide our generation results here.

See our demo!

About

official code for CVPR'24 paper Diff-BGM
