Custom Sample Rate of 44.1 kHz #135
-
We have a dataset of WAV files with a sample rate of 44,100 Hz that we would like to use to train from scratch. We noticed there is a configuration option for this in the training.

It looks like the ASR module was trained at 24 kHz as well: StyleTTS2/Utils/ASR/config.yml, line 14 in d05a463.

Is it fine to just have the training dataset at 44.1 kHz, or would the ASR component also need to be trained at 44.1 kHz for the overall architecture to work? Thanks.
Replies: 2 comments
-
From README:
-
You don't even need to retrain the ASR module. You can just downsample the audio to 24 kHz and compute the mel spectrograms for the text aligner from that. Of course, the best way is to retrain it, because then you don't have to align the 24 kHz mel spectrograms to the 44.1 kHz waveform.
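To make the alignment point concrete, here is a minimal arithmetic sketch of how the two rates relate. The hop length of 300 samples at 24 kHz is an assumption for illustration (check your own config), and the helper names are hypothetical, not part of StyleTTS2.

```python
from fractions import Fraction

SRC_SR = 44_100  # sample rate of the dataset (from the question)
DST_SR = 24_000  # sample rate the text aligner works at (from the reply)
HOP_24K = 300    # ASSUMPTION: mel hop length in samples at 24 kHz


def resample_ratio(src_sr: int, dst_sr: int) -> Fraction:
    """Exact up/down ratio a polyphase resampler would use."""
    return Fraction(dst_sr, src_sr)


def resampled_length(n_samples: int, src_sr: int, dst_sr: int) -> int:
    """Number of output samples after resampling, rounded up."""
    r = resample_ratio(src_sr, dst_sr)
    # Integer ceiling division: ceil(n * up / down)
    return -(-n_samples * r.numerator // r.denominator)


def hop_in_source_samples(hop: int, src_sr: int, dst_sr: int) -> Fraction:
    """How many source-rate samples one mel frame hop covers."""
    return Fraction(hop) / resample_ratio(src_sr, dst_sr)


ratio = resample_ratio(SRC_SR, DST_SR)
one_second = resampled_length(SRC_SR, SRC_SR, DST_SR)
hop_src = hop_in_source_samples(HOP_24K, SRC_SR, DST_SR)

print(ratio)           # 80/147
print(one_second)      # 24000
print(float(hop_src))  # 551.25
```

Under that assumed hop length, one 24 kHz mel frame spans 551.25 samples of the 44.1 kHz waveform, which is not an integer. That is the alignment awkwardness the reply refers to. For the actual downsampling, a polyphase resampler such as `scipy.signal.resample_poly(x, 80, 147)` would be one option.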