From 487b374852b5ec271e4ef222cb77d6d97c2f38ad Mon Sep 17 00:00:00 2001
From: Alan Ng <15185920+alanngnet@users.noreply.github.com>
Date: Tue, 14 May 2024 14:39:23 -0500
Subject: [PATCH] explanation of "mode" hyperparameter
---
README.md | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/README.md b/README.md
index 9012520..be11f94 100644
--- a/README.md
+++ b/README.md
@@ -190,7 +190,7 @@ The hparams.yaml file located in the "config" subfolder of the path you provide
| device | 'mps' or 'cuda', corresponding to your GPU hardware and PyTorch library support. Theoretically 'cpu' could work but untested and probably of no value. |
| early_stopping_patience | how many epochs to wait for validation loss to improve before early stopping |
| mean_size | See `chunk_s` above. An integer used in combination with `chunk_frame` to define the length of the chunks. |
-| mode | "random" (default) or "defined". Changes behavior of AudioFeatDataset related to how it cuts each audio sample into chunks. "random" is described in CoverHunter code as "cut chunk from feat from random start". "defined" is described as "cut feat with 'start/chunk_len' info from line"|
+| mode | "random" (default) or "defined". Controls how AudioFeatDataset cuts each audio sample into chunks. "random" is described in CoverHunter code as "cut chunk from feat from random start". "defined" is described as "cut feat with 'start/chunk_len' info from line". We observed better training results with "defined" on datasets that are consistently trimmed so that CSI-relevant audio starts right at the beginning of each recording. "random" is likely the better choice when many of your audio samples may begin with CSI-irrelevant audio. |
| m_per_class | From CoverHunter code comments: "m_per_class must divide batch_size without any remainder" and: "At every iteration, this will return m samples per class. For example, if dataloader's batch-size is 100, and m = 5, then 20 classes with 5 samples iter will be returned." |
| spec_augmentation | spectral(?) augmentation settings, used to generate temporary data augmentation on the fly during training. CoverHunter settings were:
`random_erase`:
`prob`: 0.5
`erase_num`: 4
`roll_pitch`:
`prob`: 0.5
`shift_num`: 12 |