Implementation
pip install -r requirements.txt
pip install -e diffbgm
pip install -e diffbgm/mir_eval
Note: the mir_eval package is not included in this repo; it has to be downloaded from here.
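A quick way to confirm that the editable install succeeded is to check whether the module can be found. This is only a minimal sketch: `is_importable` is a hypothetical helper, not part of this repo, and it assumes the package installs under the import name `mir_eval`, as the upstream mir_eval library does.

```python
import importlib.util


def is_importable(module_name):
    """Return True if the named top-level module can be found on the current path."""
    # find_spec returns None when no top-level module with this name is found.
    return importlib.util.find_spec(module_name) is not None
```

Calling `is_importable("mir_eval")` after the `pip install -e diffbgm/mir_eval` step should return `True`; if it returns `False`, the download or install step was likely skipped.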
Please download the following files separately, as they are too large to be uploaded to the repository.
- The extracted features of the POP909 dataset can be accessed here. Please put them under /data/ after extraction.
- The extracted features of the BGM909 dataset can be accessed here. Please put them under /data/bgm909/ after extraction. We use VideoCLIP to extract the video features, BLIP to generate the video captions, Bert-base-uncased as the language encoder, and TransNetV2 to detect shot boundaries. We also provide the original captions here.
- The pre-trained models needed for training can be accessed here. Please put them under /pretrained/ after extraction.

The split of the dataset can be found here.
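Before training, it can help to verify that the downloads ended up where the code expects them. The sketch below is an illustration only: `missing_dirs` and `EXPECTED_DIRS` are hypothetical names, and it assumes the paths listed above are relative to the repository root.

```python
from pathlib import Path

# Assumption: the paths from the download instructions are relative to the repo root.
EXPECTED_DIRS = ("data", "data/bgm909", "pretrained")


def missing_dirs(repo_root=".", expected=EXPECTED_DIRS):
    """Return the expected directories that are absent or contain no entries."""
    root = Path(repo_root)
    missing = []
    for rel in expected:
        path = root / rel
        # Treat a directory as missing if it does not exist or is empty.
        if not path.is_dir() or not any(path.iterdir()):
            missing.append(rel)
    return missing
```

Running `missing_dirs()` from the repository root returns an empty list once all three archives have been extracted into place.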
python diffbgm/main.py --model ldm_chd8bar --output_dir [output_dir]
Note: this training command does not work in the original repo as published. I resolved multiple errors in my local environment to get the code running, but during the first epoch it still throws a "Ran out of input" error while loading a pickle file. That pickle file is also missing from the original repo; I have added it here.
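In Python, `pickle.load` raises `EOFError: Ran out of input` when the file it reads is empty, so one thing worth checking is whether the pickle file on disk is zero bytes or cut short. The sketch below is a diagnostic aid under that assumption, not a fix for the training script; `check_pickle` is a hypothetical helper, and it should only be run on files you trust, since unpickling can execute arbitrary code.

```python
import pickle
from pathlib import Path


def check_pickle(path):
    """Classify a pickle file as 'missing', 'empty', 'corrupt', or 'ok'."""
    p = Path(path)
    if not p.is_file():
        return "missing"
    if p.stat().st_size == 0:
        # A zero-byte file is what triggers EOFError("Ran out of input").
        return "empty"
    try:
        with p.open("rb") as f:
            pickle.load(f)
    except (EOFError, pickle.UnpicklingError):
        # The file has content but cannot be fully decoded.
        return "corrupt"
    return "ok"
```

If the result is `"empty"` or `"corrupt"`, the file probably needs to be downloaded or regenerated. Note that a pickle referring to project classes may raise other errors (for example `ModuleNotFoundError`) when loaded outside the project environment; this sketch does not catch those.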
Please use the following command to generate music for videos in BGM909.
python diffbgm/inference_sdf.py --model_dir=[model_dir] --uncond_scale=5.
Note: this command works, but the script iterates over multiple folders and only some of them are present in this repo. The code is also configured for other folders, but those are not available in the repo or in any drive shared by the author. For the folders that are present, the inference results are written to the diffbgm/exp folder.
To reproduce the metrics in our original paper, please refer to /diffbgm/test.ipynb.
Backbone | PCHE | GPS | SI | P@20 | Weights |
---|---|---|---|---|---|
Diff-BGM (original) | 2.840 | 0.601 | 0.521 | 44.10 | weights |
Diff-BGM (only visual) | 2.835 | 0.514 | 0.396 | 43.20 | weights |
Diff-BGM (w/o SAC-Att) | 2.721 | 0.789 | 0.523 | 38.47 | weights |
We provide our generation results here.
See our demo!