Transcribe Vimeo videos #6
Here's my initial research:

Important questions
How many hours of video do we have?

General requirements
From what I've seen while investigating various speech-to-text APIs, submitting audio rather than video is always either cheaper or required. Based on that, it looks like our solution will need to download the videos locally for processing. If the chosen API requires audio only, we'll have to extract the audio from the videos, then push the audio to the API.

Currently, downloads are not enabled for videos.restfest.org. Here's what I found about that: https://help.vimeo.com/hc/en-us/articles/229678128-Downloading-videos
Google Cloud Speech to Text
This is an option, but it wouldn't be free. https://cloud.google.com/speech-to-text/

Cloudinary
Is this the service Benjamin was talking about? If so, hook me up with some contact deets. If not, what was the service you were talking about? Hook me up with some contact deets for that one instead. :)

Azure Cognitive Services
https://azure.microsoft.com/en-us/pricing/details/cognitive-services/speech-services/
This looks to be the winner for now. The free tier allows 5 hours of transcription per month. It might take a couple of months, but we can work with that.

Kaldi?
May be a thing. I mean, it's a thing. Just not sure what kind of thing it is. http://kaldi-asr.org/doc/index.html

IBM's thing
IBM has a thing. Hard cap at 100 minutes of transcription.

The AWS thing
Ultimately, it's got a hard cap at 12 hours. https://aws.amazon.com/transcribe/pricing/

The non-profit advantage
We could speed this up if restfest was a non-profit. There's always a non-profit tier with these services that gives you additional monthly credits. Should we create an issue to track the non-profit effort?
@bradgrace calculated 75.8 hours of video by summing all the duration properties in the Vimeo JSON data. He's made a PowerShell script that collects and parses all of the data. From that, we've got a good starting point for scripting the transcription. Given Azure's limit of 5 hours per month, it would take 16 months to transcribe all of the videos... UNLESS we split the job across multiple computers. :)
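
For anyone who wants to re-check the math, here's a rough Python sketch of the same arithmetic. It assumes each video record has a `duration` field in seconds (that unit is my assumption about the Vimeo JSON), and that each "worker" gets its own 5-hour monthly free quota, which may not hold if the quota is per account rather than per computer. The function names are made up for illustration.

```python
import math


def total_hours(videos):
    """Sum the `duration` field (assumed to be in seconds) and convert to hours."""
    return sum(v["duration"] for v in videos) / 3600


def months_needed(hours, free_hours_per_month=5, workers=1):
    """Months required, assuming every worker has its own monthly free quota."""
    return math.ceil(hours / (free_hours_per_month * workers))


if __name__ == "__main__":
    print(months_needed(75.8))             # 16 months on a single free quota
    print(months_needed(75.8, workers=4))  # 4 months if four quotas really are available
```

So four independent quotas would bring the 16 months down to roughly 4, if the free tier actually works that way.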
Two samples to look at:
- "Getting started with C#": the next sample is more important, but this one at least shows how to get the NuGet package.
- "Continuous speech recognition from a file"

Our goal is to attempt to translate those C# samples to PowerShell. PowerShell runs on most OSs now. If PowerShell fails us, we'll switch to .NET Core.
We'll use https://www.ffmpeg.org/ to extract the wav from the mp4. I confirmed I can use ffmpeg to extract the wav from one of the restfest mp4 files that I downloaded manually.
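
Here's a minimal sketch of what that extraction step could look like when scripted, written in Python for illustration. The exact command used in the manual test isn't recorded here, so the 16 kHz, mono, 16-bit PCM settings are an assumption (a common input format for speech APIs), and the helper names are hypothetical. It also assumes `ffmpeg` is on the PATH.

```python
import subprocess
from pathlib import Path


def build_ffmpeg_command(mp4_path, wav_path=None, sample_rate=16000):
    """Build an ffmpeg command that drops the video stream and writes a WAV file."""
    mp4_path = Path(mp4_path)
    # Default to the same file name with a .wav extension.
    wav_path = Path(wav_path) if wav_path else mp4_path.with_suffix(".wav")
    return [
        "ffmpeg",
        "-y",                    # overwrite the output file if it already exists
        "-i", str(mp4_path),     # input video
        "-vn",                   # no video in the output
        "-acodec", "pcm_s16le",  # 16-bit PCM audio (assumed target format)
        "-ar", str(sample_rate), # sample rate in Hz (assumed 16 kHz)
        "-ac", "1",              # mono
        str(wav_path),
    ]


def extract_wav(mp4_path, wav_path=None):
    """Run ffmpeg and return the path of the WAV file it wrote."""
    cmd = build_ffmpeg_command(mp4_path, wav_path)
    subprocess.run(cmd, check=True)  # raises CalledProcessError if ffmpeg fails
    return cmd[-1]
```

Calling `extract_wav("talk.mp4")` would write `talk.wav` next to the source file.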
Plan is to loop through all of the video data at
https://github.com/RESTFest/videos.restfest.org/tree/master/_data/videos
and pass each video to a transcription API. We'll turn the result into JSON and drop it somewhere here in the repo.
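
A rough sketch of that loop, in Python for illustration. It assumes the video data files are JSON (one file per video) and takes the transcription call as a plain function argument, since no API has been chosen yet. `transcribe_all`, the `transcribe` callable, and the output layout are all hypothetical.

```python
import json
from pathlib import Path


def transcribe_all(data_dir, out_dir, transcribe):
    """Transcribe every video record in data_dir and write one JSON file per video.

    `transcribe` is any callable that takes the parsed video record and
    returns the transcript text; it stands in for whichever API is chosen.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for data_file in sorted(Path(data_dir).glob("*.json")):
        out_file = out_dir / data_file.name
        if out_file.exists():
            continue  # already transcribed on an earlier run
        video = json.loads(data_file.read_text(encoding="utf-8"))
        result = {"source": data_file.name, "transcript": transcribe(video)}
        out_file.write_text(json.dumps(result, indent=2), encoding="utf-8")
        written.append(out_file)
    return written
```

Skipping files that already have output means the script can be re-run each month as the free quota resets, picking up where it left off.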