feat: use whisper as ML model #26
base: main
Conversation
faster_whisper is much faster than vosk (less than 10 minutes, as opposed to hours). It also handles downloading the required models, allowing us to drop the download option. Whisper also correctly capitalizes words, so the chapter markers needed to be updated to account for that, and it also transcribes "Prologue" incorrectly.
fix: captilize words, add prolog
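For illustration only, the marker matching could be made case-insensitive and tolerant of both spellings along these lines (the pattern and helper below are hypothetical, not the project's actual code):

```python
import re

# Hypothetical sketch: match chapter markers regardless of capitalization and
# accept both the "prolog" spelling vosk produced and Whisper's "Prologue".
MARKER_RE = re.compile(r"\b(chapter\s+\w+|prolog(?:ue)?|epilogue)\b", re.IGNORECASE)

def is_chapter_marker(text: str) -> bool:
    return MARKER_RE.search(text) is not None
```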
It does appear faster, though a lot of work would be needed to make this change.
This has a lot of potential. Whisper is pretty awesome.
The default device is set to
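For reference, faster_whisper takes the device in the WhisperModel constructor (its default is "auto"). A minimal sketch, with the model size and compute type as assumptions rather than values from this branch:

```python
from faster_whisper import WhisperModel

def load_model(device: str = "auto") -> WhisperModel:
    # "auto" lets faster_whisper pick CUDA when available and fall back to CPU;
    # pass "cpu" or "cuda" explicitly to override. The model size and compute
    # type here are placeholders, not values taken from this PR.
    return WhisperModel("small", device=device, compute_type="int8")
```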
I did some digging and found this: https://github.com/m-bain/whisperX. It potentially solves every issue.
Thoughts? If it seems like it might work, I can go ahead and work on a PR for it.
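If it helps evaluate the idea, a rough whisperX sketch might look like the following; the model name, batch size, and file path are placeholders, and this has not been run against this project:

```python
import whisperx

device = "cpu"  # or "cuda" if a GPU is available

# Batched transcription with whisperX (built on faster_whisper under the hood).
model = whisperx.load_model("small", device, compute_type="int8")
audio = whisperx.load_audio("audiobook.mp3")
result = model.transcribe(audio, batch_size=8)

# Optional word-level alignment, which would give tighter timestamps
# for locating chapter markers.
align_model, metadata = whisperx.load_align_model(
    language_code=result["language"], device=device
)
aligned = whisperx.align(result["segments"], align_model, metadata, audio, device)

for segment in aligned["segments"]:
    print(f'{segment["start"]:.2f} -> {segment["end"]:.2f} {segment["text"]}')
```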
Can't hurt to try. It looks relatively similar to
Replaces `vosk` with `faster_whisper`, resulting in much faster audio transcriptions. On my system the same audio file took multiple hours with `vosk` and less than 10 minutes with the patch.
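For context, the faster_whisper flow this relies on is roughly the following sketch; the model size, compute type, and file name are assumptions, not taken from the patch:

```python
from faster_whisper import WhisperModel

# faster_whisper downloads the model on first use, which is why the separate
# download option can be dropped.
model = WhisperModel("small", device="auto", compute_type="int8")

segments, info = model.transcribe("audiobook.mp3")
for segment in segments:
    # Each segment carries start/end timestamps usable for chapter markers,
    # and the text comes back capitalized (e.g. "Prologue" rather than "prolog").
    print(f"{segment.start:.2f} -> {segment.end:.2f} {segment.text}")
```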