Hello everyone,
I am currently working on a scientific paper on the subject of STT. It would be important for me to know how the "confidence" value is calculated inside STT. I have looked in the DeepSpeech paper and in the playbook, but unfortunately I can't find anything precise about it, so I currently don't know how to interpret it (what is a good result, what is a bad one, what the value is based on...).
I initially assumed that the value describes the distance between the acoustic model result and the language model result. To my surprise, though, there is also a value when the scorer is disabled.
If I just overlooked it, I apologize(!) and thank you anyway for a helpful hint =)
best regards,
Robin