>>> krishna91
[February 20, 2018, 11:24pm]
I am trying to build an adaptable speech recognition system based on
Mozilla DeepSpeech (a TensorFlow implementation of the DeepSpeech
paper).
The idea is:
1. We pretrain a model on a certain voice, then save the model and
   create a checkpoint.
2. The saved model is used for transcribing speech to text.
3. If the user notices something was transcribed incorrectly, they can
   provide feedback on what the correct text should be for the audio
   they just recorded.
4. The correction forms a new training sample. The model is restored
   from the previous checkpoint and then trained on the new sample.
   (We would also use some data augmentation techniques to increase
   the number of samples; see the sketch after this list.)
5. The resulting model should now be better adapted to the user's
   voice and pronunciation.
6. If another transcription is incorrect, repeat from step 3.
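For concreteness, here is a minimal sketch of steps 1 and 4 in plain TensorFlow 1.x (the framework DeepSpeech is built on). This is not DeepSpeech's actual training code: `build_model`, `encode`, the noise-based augmentation, and the checkpoint paths are all placeholders assumed for illustration.

```python
import numpy as np
import tensorflow as tf

CHECKPOINT_DIR = "ckpt/"  # assumed location of the pretrained checkpoint

def augment(audio, n_copies=4, noise_std=0.005):
    """Naive augmentation: jitter the waveform with Gaussian noise so one
    corrected sample yields several training examples."""
    return [audio + np.random.normal(0.0, noise_std, audio.shape)
            for _ in range(n_copies)]

def fine_tune_on_sample(audio, transcript, build_model, encode):
    graph = tf.Graph()
    with graph.as_default():
        # build_model is assumed to recreate the exact pretrained graph
        # and return (input placeholder, label placeholder, train op).
        inputs, labels, train_op = build_model()
        saver = tf.train.Saver()
        with tf.Session() as sess:
            # Restore the last good checkpoint before every adaptation run.
            saver.restore(sess, tf.train.latest_checkpoint(CHECKPOINT_DIR))
            for noisy in augment(audio):
                sess.run(train_op, feed_dict={inputs: noisy,
                                              labels: encode(transcript)})
            # Save a new checkpoint; the next correction restores from here.
            saver.save(sess, CHECKPOINT_DIR + "adapted")
```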
Is this the proper way of using checkpoints? I mean, every time I train
on a new sample, I restore from the last checkpoint and replace the
complete training data with just the new sample.
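For reference, DeepSpeech's own training script can be pointed at an existing checkpoint directory along these lines. Treat this as a sketch rather than a verified invocation: flag names and semantics changed between releases (for example, `--epoch` vs. `--epochs`), so check `./DeepSpeech.py --help` for your version.

```bash
# Continue training from the pretrained checkpoint, using a CSV that
# lists only the user's corrected/augmented samples.
python DeepSpeech.py \
    --train_files user_corrections.csv \
    --dev_files user_corrections.csv \
    --test_files user_corrections.csv \
    --checkpoint_dir ckpt/ \
    --epoch -3 \
    --learning_rate 0.0001
# In some releases a negative --epoch value meant "train this many
# additional epochs" on top of what the checkpoint already recorded.
```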
Any suggestions would be appreciated!
Thanks in advance!
[This is an archived DeepSpeech discussion thread from discourse.mozilla.org/t/checkpoints-for-online-learning]