The video-to-text tool is a Python-based GUI application that transcribes the audio from video files.
It uses the Azure Cognitive Services Speech SDK for transcription and provides a simple interface for selecting a video file, entering Azure Speech Service credentials, and viewing the transcription results.
- Video File Selection: Easily select the video file you want to transcribe.
- Azure Speech Service Integration: Use your Azure Speech Service subscription key and region to transcribe audio.
- Transcription with Timestamps: Get transcription results with timestamps.
- Save Settings: Save your Azure Speech Service settings for future use.
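As an illustration of the timestamp feature, the sketch below formats one transcript line. The Azure Speech SDK reports the offset of each recognized phrase in ticks of 100 nanoseconds; the function names `ticks_to_timestamp` and `format_line` and the `[HH:MM:SS.mmm]` layout are assumptions made for this example and may differ from what the tool actually prints.

```python
TICKS_PER_MILLISECOND = 10_000  # one tick is 100 nanoseconds


def ticks_to_timestamp(ticks):
    """Convert an offset in 100 ns ticks to an HH:MM:SS.mmm string."""
    total_ms = ticks // TICKS_PER_MILLISECOND
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, millis = divmod(rest, 1_000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}.{millis:03d}"


def format_line(offset_ticks, text):
    """Prefix a recognized phrase with its start time (hypothetical layout)."""
    return f"[{ticks_to_timestamp(offset_ticks)}] {text}"


print(format_line(615_000_000, "Hello and welcome."))
```

Integer arithmetic is used throughout so that long recordings do not pick up floating-point rounding in the millisecond field.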
- Python 3.6 or higher
- Azure Cognitive Services Speech subscription
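If you are unsure whether your environment meets these requirements, a quick check along the following lines may help. It is only a sketch: it assumes the Speech SDK is provided by the `azure-cognitiveservices-speech` package (imported as `azure.cognitiveservices.speech`), and the helper names are made up for this example. It cannot verify that your Azure subscription itself is valid.

```python
import importlib.util
import sys

MIN_PYTHON = (3, 6)  # minimum version listed in the requirements


def python_ok(version_info=sys.version_info):
    """Return True if the given interpreter version is at least MIN_PYTHON."""
    return tuple(version_info[:2]) >= MIN_PYTHON


def speech_sdk_installed():
    """Return True if the Azure Speech SDK module can be located."""
    try:
        spec = importlib.util.find_spec("azure.cognitiveservices.speech")
    except ModuleNotFoundError:
        # A parent package such as "azure" is not installed at all.
        return False
    return spec is not None


print("Python version OK:", python_ok())
print("Speech SDK found:", speech_sdk_installed())
```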
- Clone the repository to your local machine:
  git clone https://github.com/Utesgui/video-to-text.git
- Navigate to the cloned directory:
  cd video-to-text
- Install the required Python packages:
  pip install -r requirements.txt
- Run the video-to-text.py script to start the application:
  python video-to-text.py
- Use the "Select Video File" button to choose the video file you want to transcribe.
- Enter your Azure Speech Service subscription key and region in the respective fields.
- Click "Start" to begin the transcription process. The transcription results will appear in the log section at the bottom of the window.
- If needed, you can save your Azure Speech Service settings by clicking "Save Settings".
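For readers curious about what happens after "Start" is clicked, the following is a minimal sketch of continuous recognition with the Azure Speech SDK. It is not taken from video-to-text.py: the function name `transcribe_wav`, the `on_phrase` callback, and the assumption that the audio has already been extracted from the video into a WAV file are all illustrative. The SDK is imported inside the function so the module can be loaded even when the SDK is not installed.

```python
import threading


def transcribe_wav(wav_path, key, region, on_phrase=print):
    """Transcribe a WAV file and call on_phrase(offset_ticks, text) per phrase.

    Hypothetical helper; requires the azure-cognitiveservices-speech package.
    """
    import azure.cognitiveservices.speech as speechsdk

    speech_config = speechsdk.SpeechConfig(subscription=key, region=region)
    audio_config = speechsdk.audio.AudioConfig(filename=wav_path)
    recognizer = speechsdk.SpeechRecognizer(
        speech_config=speech_config, audio_config=audio_config
    )

    done = threading.Event()

    def handle_recognized(evt):
        # Only final results that contain recognized speech are reported.
        if evt.result.reason == speechsdk.ResultReason.RecognizedSpeech:
            # evt.result.offset is the phrase start in 100 ns ticks.
            on_phrase(evt.result.offset, evt.result.text)

    recognizer.recognized.connect(handle_recognized)
    # Stop waiting when the file ends or the service reports an error.
    recognizer.session_stopped.connect(lambda evt: done.set())
    recognizer.canceled.connect(lambda evt: done.set())

    recognizer.start_continuous_recognition()
    done.wait()
    recognizer.stop_continuous_recognition()
```

A call such as `transcribe_wav("talk.wav", "<your key>", "<your region>")` would print each phrase together with its offset. Continuous recognition is used here rather than single-shot recognition because single-shot recognition stops after the first utterance.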
The application uses a video-to-text.ini file to store Azure Speech Service settings. This file is automatically generated and updated when you save settings through the GUI.
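The layout of the settings file is not documented here, so the sketch below only shows how such a file could be written and read with Python's standard `configparser` module. The section name `AzureSpeech` and the keys `subscription_key` and `region` are assumptions for this example and may not match what the application writes.

```python
import configparser

SECTION = "AzureSpeech"  # assumed section name, not confirmed by the tool


def save_settings(path, key, region):
    """Write the subscription key and region to an INI file."""
    # Interpolation is disabled so a "%" in a value is stored literally.
    config = configparser.ConfigParser(interpolation=None)
    config[SECTION] = {"subscription_key": key, "region": region}
    with open(path, "w", encoding="utf-8") as handle:
        config.write(handle)


def load_settings(path):
    """Return (key, region), or None if the file or section is missing."""
    config = configparser.ConfigParser(interpolation=None)
    config.read(path, encoding="utf-8")  # a missing file is silently skipped
    if SECTION not in config:
        return None
    section = config[SECTION]
    return section.get("subscription_key", ""), section.get("region", "")
```

Because the subscription key would be stored in plain text in a file like this, it is worth keeping video-to-text.ini out of version control.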