t-SNE example plot
alanngnet committed Mar 26, 2024
1 parent 9e9149f commit 640619e
Showing 2 changed files with 2 additions and 2 deletions.
4 changes: 2 additions & 2 deletions README.md
@@ -77,15 +77,15 @@ This script evaluates your trained model by providing mAP and MR1 metrics and an
2. Run your query data through extract_csi_features. In the hparams.yaml file for the feature extraction, turn off all augmentation. See data/covers80_testset/hparams.yaml for an example configuration to treat covers80 as the query data:<br> `python3 -m tools.extract_csi_features data/covers80_testset`<br>
The important output from that is full.txt and the cqt_feat subfolder's contents.
3. Run the evaluation script:<br>
`python3 -m tools.eval_testset egs/covers80 data/covers80_testset/full.txt data/covers80_testset/full.txt -plot_name="tSNE.png" -dist_name='distmatrix' -test_only_labels='data/covers80/dev-only-song-ids.txt'`
`python3 -m tools.eval_testset egs/covers80 data/covers80_testset/full.txt data/covers80_testset/full.txt -plot_name="egs/covers80/tSNE.png" -dist_name='distmatrix' -test_only_labels='data/covers80/dev-only-song-ids.txt'`

CoverHunter only shared an evaluation example for the case when query and reference data are identical, presumably to do a self-similarity evaluation of the model. But there is an optional 4th parameter for `query_in_ref_path` that would be relevant if query and reference are not identical. See the "query_in_ref" heading below under "Input and Output Files."

The optional `plot_name` argument is a path where you want to save the t-SNE plot output. Note that your query and reference files must be identical in this case (to do a self-similarity evaluation).

The optional `test_only_labels` argument is a path to the text file generated by `extract_csi_features.py` if its hyperparameters asked for some song_ids to be reserved exclusively for the test (aka "dev") dataset. The t-SNE plot will then mark those so you can see how well your model clusters classes (song_ids) it has never seen before.

![t-SNE plot for Covers80 training with 3 song_ids in test held out from training data](https://drive.google.com/file/d/1cBewle3Wj58OtImNuBLwyVmMiVTzXuWg/view?usp=sharing)
![t-SNE plot for Covers80 training with 3 song_ids in test held out from training data](tSNE-example.png)

The optional `dist_name` argument is a path where you want to save the distance matrix and ref labels so that you can study the results separately, such as perhaps doing custom t-SNE plots, etc.
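
As one illustration of studying a distance matrix separately, the sketch below computes mAP and MR1 for a self-similarity evaluation in plain Python. It is not CoverHunter's implementation and does not read the files written by `dist_name`, whose format is not described here. The function name `rank_metrics`, the list-of-lists input, and the reading of MR1 as the mean rank of the first correct match are all assumptions made for this example.

```python
def rank_metrics(dist, labels):
    """Return (mAP, MR1) for a square self-similarity distance matrix.

    Hypothetical helper: dist[q][j] is the distance from item q to item j,
    and labels[j] is the song_id of item j.
    """
    ap_values = []
    first_ranks = []
    for q, row in enumerate(dist):
        # Rank every other item by ascending distance, skipping the query itself.
        order = sorted((j for j in range(len(row)) if j != q), key=lambda j: row[j])
        hits = 0
        precisions = []
        first_rank = None
        for rank, j in enumerate(order, start=1):
            if labels[j] == labels[q]:
                hits += 1
                precisions.append(hits / rank)
                if first_rank is None:
                    first_rank = rank
        if first_rank is None:
            continue  # no other version of this song, so the query is skipped
        ap_values.append(sum(precisions) / len(precisions))
        first_ranks.append(first_rank)
    if not ap_values:
        raise ValueError("no query has a matching item in the reference set")
    return sum(ap_values) / len(ap_values), sum(first_ranks) / len(first_ranks)


if __name__ == "__main__":
    # Toy data: two songs with two versions each.
    toy_labels = ["a", "a", "b", "b"]
    toy_dist = [
        [0.0, 0.1, 0.9, 0.8],
        [0.1, 0.0, 0.7, 0.9],
        [0.9, 0.7, 0.0, 0.2],
        [0.8, 0.9, 0.2, 0.0],
    ]
    print(rank_metrics(toy_dist, toy_labels))
```

Queries with no other version in the set are skipped here rather than counted as failures; a different convention would change both numbers.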

Binary file added tSNE-example.png