documentation
alanngnet committed Mar 20, 2024
1 parent f93a898 commit edfa3e9
Showing 2 changed files with 17 additions and 4 deletions.
13 changes: 12 additions & 1 deletion README.md
@@ -78,7 +78,10 @@ The important output from that is full.txt and the cqt_feat subfolder's contents
3. Run the evaluation script:<br>
`python3 -m tools.eval_testset pretrained_model data/covers80/dataset.txt data/covers80/dataset.txt`

CoverHunter only implemented evaluation for the case when query and reference data are identical. But there is an optional 4th parameter for `query_in_ref_path` that would be relevant if query and reference are not identical. See the "query_in_ref" heading below under "Input and Output Files."
CoverHunter only shared an evaluation example for the case when query and reference data are identical(!). But there is an optional 4th parameter for `query_in_ref_path` that would be relevant if query and reference are not identical. See the "query_in_ref" heading below under "Input and Output Files."
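As a sketch, an invocation passing that 4th parameter might look like this (the query, ref, and index file names here are hypothetical):<br>
`python3 -m tools.eval_testset pretrained_model data/covers80/query.txt data/covers80/ref.txt data/covers80/query_in_ref.txt`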

This script saves embeddings generated by the `eval_for_map_with_feat` function as .npy arrays in a new subfolder of the `pretrained_model` folder named `embed_{epoch#}_tmp` where epoch# is the highest-numbered epoch subfolder in the `pretrained_model` folder.
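For example, a minimal sketch of loading one of those saved embeddings with NumPy (the epoch number and utt filename here are hypothetical; match them to your own run):

```python
import numpy as np

# Hypothetical epoch number and utt name; adjust to your own run.
emb = np.load("pretrained_model/embed_25_tmp/query_embed/cover80_00000000.npy")
print(emb.shape)
```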


## Coarse-to-Fine Alignment Training

@@ -175,6 +178,14 @@ Listed in the order that the script creates them:
| song_name_num.map | Text file, not used by train.py, maybe not by anything else? |
| full.txt | See above detailed description.|

## Training checkpoint output

Using the default configuration, training saves checkpoints after each epoch in the egs/covers80 folder.

The `pt_model` folder gets two files per epoch: `do_000000NN` and `g_000000NN`, where NN is the epoch number. The do_ files contain the AdamW optimizer state. The g_ files contain the model's state dictionary. "g" is perhaps an abbreviation for "generator," given that a transformer architecture is involved.
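To peek inside a checkpoint, here is a minimal sketch using standard PyTorch loading (the epoch number is hypothetical, and the top-level key names are whatever the repo's save code chose):

```python
import torch

# Load on CPU so this works regardless of the training device.
ckpt = torch.load("egs/covers80/pt_model/g_00000042", map_location="cpu")

# A torch.save checkpoint is typically a dict; list its top-level keys.
print(list(ckpt.keys()) if isinstance(ckpt, dict) else type(ckpt))
```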

The `eval_for_map_with_feat()` function, called at the end of each epoch, also saves the embeddings for every sample in the training data, plus the embeddings for time-chunked sections of those samples, named with a suffix of `...__start-N.npy`, where N is the time in seconds at which the chunk starts. These arrays are placed in a `query_embed` subfolder. An accompanying `query.txt` (with an identical copy as `ref.txt`) is also saved: a text file listing the attributes of every utt represented in the `query_embed` subfolder, in the same format described above for `full.txt`.
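As a sketch, one way to gather the full-length and chunked embeddings saved for a single utt (the embed folder and utt name here are hypothetical):

```python
import glob
import numpy as np

# Chunk files share the utt's name plus a "__start-N" suffix.
pattern = "egs/covers80/embed_25_tmp/query_embed/cover80_00000000*.npy"
for path in sorted(glob.glob(pattern)):
    print(path, np.load(path).shape)
```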

## query_in_ref

The file you can prepare to pass to the tools/eval_testset.py script as the 4th parameter, `query_in_ref_path` (CoverHunter did not provide an example file or documentation), assumes:
8 changes: 5 additions & 3 deletions src/eval_testset.py
@@ -255,8 +255,8 @@ def eval_for_map_with_feat(hp, model, embed_dir, query_path, ref_path,
hp: dict contains hparams
model: nnet model, should have method 'infer'
embed_dir: dir for saving embedding, None for not saving anything
query_path: contains query info
ref_path: contains ref info
query_path: text file with query utt info
ref_path: text file with ref utt info
query_in_ref_path: path to prepared query in ref index. None means that
query index equals ref index
batch_size: for nnet infer
@@ -318,12 +318,13 @@ def eval_for_map_with_feat(hp, model, embed_dir, query_path, ref_path,
query_embed_dir = os.path.join(embed_dir, "query_embed")
query_chunk_lines = _cut_lines_with_dur(query_lines, chunk_s, query_embed_dir)
write_lines(os.path.join(embed_dir, "query.txt"), query_chunk_lines, False)
# select query utts for which there is not yet a saved embedding
to_cal_lines = [l for l in query_chunk_lines
if not os.path.exists(line_to_dict(l)["embed"])]
if logger:
logger.info("query chunk lines: {}, to compute lines: {}".format(
len(query_chunk_lines), len(to_cal_lines)))

# generate any missing embeddings
if len(to_cal_lines) > 0:
data_loader = DataLoader(AudioFeatDataset(hp, data_lines=to_cal_lines,
mode="defined",
@@ -341,6 +342,7 @@ def eval_for_map_with_feat(hp, model, embed_dir, query_path, ref_path,
ref_chunk_lines = _cut_lines_with_dur(ref_lines, chunk_s, ref_embed_dir)
write_lines(os.path.join(embed_dir, "ref.txt"), ref_chunk_lines, False)
if ref_path != query_path:
# select ref utts for which there is not yet a saved embedding
to_cal_lines = [l for l in ref_chunk_lines
if not os.path.exists(line_to_dict(l)["embed"])]
if logger: