Based on an issue from the openWakeWord repo, I'm exploring why the initial pronunciation of certain single words is incorrect for most combinations of speaker IDs and noise scale values (though some seem fine).
It seems to be due to this line, where the phonemes for the target text have the "^" phoneme (id 1) prepended. I experimented by instead prepending the "^_" phoneme sequence (ids [1, 0]) when the target text is only a single word, and the produced speech then sounds correct.
This is fairly odd behavior, as the pronunciation of these same words is correct when they are part of a multi-word text sequence. I could theorize that since most TTS datasets are trained on sentences and not single words this is an example of unexpected out-of-domain behavior, but ultimately I'm not sure.
Does using the [1, 0] phoneme sequence seem like a viable workaround? Might it cause any detrimental side effects that I'm not considering?
I think you are right: this is a bug. All phoneme symbols need to be accompanied by the padding symbol (id 0). The symbol above is the so-called BOS (beginning-of-sentence) symbol. Looking at the code, the EOS (end-of-sentence) symbol ($) should also be padded with a 0, but isn't.
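A minimal sketch of the interleaving described above, assuming the common VITS-style convention where the padding symbol follows every phoneme id. The id values for "_" (0), "^" (1), and "$" (2) and the helper name are assumptions for illustration, not taken from the actual repo:

```python
PAD_ID = 0  # "_" padding symbol
BOS_ID = 1  # "^" begin-of-sentence symbol
EOS_ID = 2  # "$" end-of-sentence symbol (id assumed for illustration)

def intersperse_with_pad(phoneme_ids):
    """Interleave the padding id after BOS, each phoneme, and EOS.

    For input [p1, p2] this yields [1, 0, p1, 0, p2, 0, 2, 0], so that
    every symbol, including BOS and EOS, is followed by the pad id.
    """
    out = [BOS_ID, PAD_ID]
    for pid in phoneme_ids:
        out.extend([pid, PAD_ID])
    out.extend([EOS_ID, PAD_ID])
    return out

print(intersperse_with_pad([15, 27, 42]))
# [1, 0, 15, 0, 27, 0, 42, 0, 2, 0]
```

With this shape, a single-word input still has its BOS symbol followed by the pad id, matching the [1, 0] workaround described above.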