About ASR #6
Comments
For discrete tokenizers in Automatic Speech Recognition (ASR) tasks, the most relevant results are the WavTokenizer experiments on the ARCH benchmark, as presented in the paper.
Thank you for your detailed answer. I feel the same way. I hope that one day tokenizers will be expressive both acoustically and semantically.
Regarding the two new questions raised, our perspectives are as follows:
Thanks for your answer, I learned a lot, and I very much agree with you. Regarding the second point, I would like to try end-to-end research using encoders such as WavTokenizer. I also look forward to your follow-up work!
Thanks for your excellent work!
I would like to ask how discrete tokenizers perform on ASR. Could you share your understanding? Thanks!