Commit
Update README.md
Better documentation of eval_testset parameters, particularly adding new marks parameter.
alanngnet committed May 21, 2024
1 parent f61ff92 commit 272af19
Showing 1 changed file with 14 additions and 2 deletions.
16 changes: 14 additions & 2 deletions README.md
@@ -109,19 +109,28 @@ The important output from that is `full.txt` and the `cqt_feat` subfolder's cont
3. Run the evaluation script. This example assumes you are using the trained model you created in `training/covers80` and you want to use all the optional features I added in this fork:<br>
`python3 -m tools.eval_testset training/covers80 data/covers80_testset/full.txt data/covers80_testset/full.txt -plot_name="training/covers80/tSNE.png" -dist_name='distmatrix' -test_only_labels='data/covers80/test-only-work-ids.txt'`

See the "Training checkpoint output" section below for a description of the embeddings saved by the `eval_for_map_with_feat()` function called in this script. They are saved in a new subfolder of the `pretrained_model` folder named `embed_NN_tmp` where NN is the highest-numbered epoch subfolder in the `pretrained_model` folder.

### Arguments

#### query_in_ref_path
CoverHunter only shared an evaluation example for the case when query and reference data are identical, presumably to do a self-similarity evaluation of the model. However, there is an optional fourth parameter, `query_in_ref_path`, that is relevant when query and reference are not identical. See the "query_in_ref" heading below under "Input and Output Files."

#### plot_name
The optional `plot_name` argument is a path or just a filename where you want to save the t-SNE plot output. If you provide just a filename, `model_dir` will be used as the path. See example plot below. Note that your query and reference files must be identical to generate a t-SNE plot (to do a self-similarity evaluation).
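
The filename-versus-path rule described above can be sketched as follows. This is only an illustration of the stated behavior, not the code in `tools.eval_testset`; the helper name `resolve_plot_path` is hypothetical.

```python
import os


def resolve_plot_path(plot_name, model_dir):
    """Hypothetical helper: return where the t-SNE plot would be saved."""
    # A bare filename has no directory component, so fall back to model_dir.
    if os.path.dirname(plot_name) == "":
        return os.path.join(model_dir, plot_name)
    # Anything that already includes a directory is used as given.
    return plot_name


# Bare filename: saved inside the model folder.
print(resolve_plot_path("tSNE.png", "training/covers80"))
# Path with a directory: used unchanged.
print(resolve_plot_path("plots/tSNE.png", "training/covers80"))
```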

#### test_only_labels
The optional `test_only_labels` argument is a path to the text file generated by `extract_csi_features.py` if its hyperparameters asked for some work_ids to be reserved exclusively for the test dataset. The t-SNE plot will then mark those for you to see how well your model can cluster classes (work_ids) it has never seen before.

This figure shows the results of training from scratch on the covers80 dataset with a train/val/test split of 8:1:1 and 3 classes (work_ids) reserved exclusively for the test dataset.
![t-SNE plot for Covers80](tSNE-example.png)

#### dist_name
The optional `dist_name` argument is a path where you want to save the distance matrix and ref labels so that you can study the results separately, for example to make custom t-SNE plots.

See the "Training checkpoint output" section below for a description of the embeddings saved by the `eval_for_map_with_feat()` function called in this script. They are saved in a new subfolder of the `pretrained_model` folder named `embed_NN_tmp` where NN is the highest-numbered epoch subfolder in the `pretrained_model` folder.

#### marks
The default value for the optional `marks` argument is 'markers', which makes the output for `plot_name` differentiate works by using standard matplotlib markers in various colors and shapes. The alternative value is 'ids', which uses the `work_id` numbers defined by `extract_csi_features` instead of matplotlib markers.

## Inference (work identification)

After you have trained a model and run the evaluation script, you can use the model to identify any music you give it. Provide the music input to the tools.identify.py script by creating a one-line text file that has the metadata about the music, following the format of the text files generated by tools.extract_csi_features.py. For example, you could select any of the entries in the data/covers80/full.txt file, like a speed-augmented version of one of the 80 works
@@ -187,7 +196,10 @@ The hparams.yaml file located in the "config" subfolder of the path you provide
| data_type | "cqt" (default) or "raw" or "mel". It remains unknown whether the CoverHunter team actually implemented or tested anything but CQT-based training. |
| device | 'mps' or 'cuda', corresponding to your GPU hardware and PyTorch library support. Theoretically 'cpu' could work, but it is untested and probably of no value. |
| early_stopping_patience | How many epochs to wait for validation loss to improve before early stopping. |
| learning_rate | The initial learning rate, which controls how much the model's weights may change at each learning step. See `lr_decay`. Default = 0.001. |
| lr_decay | Learning-rate decay factor. See `learning_rate`. Default = 0.9975, but for small datasets, such as during testing and tuning work, we found lower values like 0.99 more appropriate. |
| mean_size | See `chunk_s` above. An integer used in combination with `chunk_frame` to define the length of the chunks. |
| min_lr | Minimum learning rate, below which `lr_decay` is ignored. Default = 0.0001. |
| mode | "random" (default) or "defined". Changes behavior of AudioFeatDataset related to how it cuts each audio sample into chunks. "random" is described in CoverHunter code as "cut chunk from feat from random start". "defined" is described as "cut feat with 'start/chunk_len' info from line." We observed better training results using "defined" when working with datasets that are very consistently trimmed so that CSI-relevant audio always starts right at the beginning of the recording. "random" would be better when CSI-irrelevant audio may be present at the start of many of your audio data samples. |
| m_per_class | From CoverHunter code comments: "m_per_class must divide batch_size without any remainder" and: "At every iteration, this will return m samples per class. For example, if dataloader's batch-size is 100, and m = 5, then 20 classes with 5 samples iter will be returned." |
| spec_augmentation | spectral(?) augmentation settings, used to generate temporary data augmentation on the fly during training. CoverHunter settings were:<br>`random_erase`:<br> &nbsp; `prob`: 0.5<br> &nbsp; `erase_num`: 4<br>`roll_pitch`:<br> &nbsp; `prob`: 0.5<br> &nbsp; `shift_num`: 12 |
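
How `learning_rate`, `lr_decay`, and `min_lr` interact can be illustrated with a minimal sketch. It assumes plain exponential decay applied once per epoch with `min_lr` as a floor; the function name is hypothetical, and the actual scheduler in the training code may apply decay at a different interval.

```python
def decayed_learning_rate(epoch, learning_rate=0.001, lr_decay=0.9975, min_lr=0.0001):
    """Hypothetical sketch: exponential decay with a floor at min_lr.

    Assumes one decay step per epoch, which may not match the real scheduler.
    """
    lr = learning_rate * (lr_decay ** epoch)
    # Once the decayed value drops below min_lr, further decay has no effect.
    return max(lr, min_lr)


# Compare the default decay with the faster decay suggested for small datasets.
for epoch in (0, 100, 500, 1000):
    print(epoch, decayed_learning_rate(epoch), decayed_learning_rate(epoch, lr_decay=0.99))
```

Under these assumptions, a lower `lr_decay` such as 0.99 reaches the `min_lr` floor in far fewer epochs than the default, which fits the note above about small datasets.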