# Real-Time Audio Transcription Using Google Speech-to-Text API
This project demonstrates how to implement real-time audio transcription using the Google Speech-to-Text API. The application captures audio from the microphone, streams it to the Google Speech API, and prints the transcription live.
Demo video (Loom): https://www.loom.com/share/0d879c20b8d14ff9bb1e0cfdd26995e6?sid=0071336e-7dff-4325-b854-9de10a0d0d14
## Prerequisites
Before you can run this project, you need the following:
- A Google Cloud account.
- A Google Cloud project with the Speech-to-Text API enabled.
- Billing enabled on your Google Cloud account.
- Python 3 installed on your machine.
## Setup
### 1. Google Cloud Setup
- Follow Google's official guide to [set up a Google Cloud project](https://cloud.google.com/resource-manager/docs/creating-managing-projects) and enable the Speech-to-Text API.
- Create and download a service account key from the Google Cloud Console. This key will authenticate your API requests.
### 2. Local Environment Setup
1. **Clone the Repository**

   ```bash
   git clone https://github.com/ranmalmendis/Speech-to-Text-Streaming.git
   cd Speech-to-Text-Streaming
   ```

2. **Install Dependencies**

   Ensure you have Python installed, then run:

   ```bash
   pip install -r requirements.txt
   ```

   **For macOS users:** If installing PyAudio fails (e.g., with a missing `portaudio.h` error), install PortAudio first:

   ```bash
   brew install portaudio
   ```

   Then try installing PyAudio again:

   ```bash
   pip install pyaudio
   ```

3. **Set Environment Variable**

   Set the `GOOGLE_APPLICATION_CREDENTIALS` environment variable to the path of the JSON file that contains your service account key. Do not put spaces around the `=` sign.

   - On Linux/Mac:

     ```bash
     export GOOGLE_APPLICATION_CREDENTIALS="../../configs/creds.json"
     ```

   - On Windows:

     ```bat
     set GOOGLE_APPLICATION_CREDENTIALS=path\to\your\service-account-file.json
     ```
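
Before running the script, it can help to confirm that the setup steps above took effect. The following is a minimal sketch of such a check, not part of the repository: the function names `missing_modules` and `credential_problems` are hypothetical, and the module names `google.cloud.speech` and `pyaudio` are assumptions about what `requirements.txt` installs.

```python
import importlib.util
import json
import os

# Assumed import names for the dependencies (requirements.txt is not shown here).
REQUIRED_MODULES = ["google.cloud.speech", "pyaudio"]
# Fields present in a Google service account key file.
KEY_FIELDS = ("type", "project_id", "private_key", "client_email")


def missing_modules(names):
    """Return the module names from `names` that cannot be located."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # Raised when a parent package (e.g. `google`) is not installed.
            found = False
        if not found:
            missing.append(name)
    return missing


def credential_problems(env=None):
    """Return a list of human-readable problems with the credentials setup."""
    env = os.environ if env is None else env
    path = env.get("GOOGLE_APPLICATION_CREDENTIALS", "").strip()
    if not path:
        return ["GOOGLE_APPLICATION_CREDENTIALS is not set"]
    if not os.path.isfile(path):
        return [f"credentials file not found: {path}"]
    try:
        with open(path, encoding="utf-8") as handle:
            key = json.load(handle)
    except (OSError, json.JSONDecodeError) as exc:
        return [f"cannot read credentials file: {exc}"]
    if not isinstance(key, dict):
        return ["credentials file does not contain a JSON object"]
    problems = [f"missing field: {field}" for field in KEY_FIELDS if field not in key]
    if "type" in key and key["type"] != "service_account":
        problems.append('"type" is not "service_account"')
    return problems
```

Calling `missing_modules(REQUIRED_MODULES)` and `credential_problems()` in a Python shell should both return empty lists when the environment is ready. An empty or unset variable is reported first, which is the typical symptom of a stray space after `=` in the `export` command.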
## Running the Application
To run the application, execute the following command in the terminal:
```bash
python src/script.py
```
Speak into your microphone. The script should print what you say as it receives transcription results from Google's Speech-to-Text API.
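
The contents of `src/script.py` are not reproduced in this README. As a rough illustration of the capture, stream, and print flow described above, here is a minimal sketch that assumes the `google-cloud-speech` and `PyAudio` packages, 16 kHz mono 16-bit audio, and US English. The names `chunk_generator`, `transcribe`, and `main` are hypothetical, and the actual script may be organized differently.

```python
import queue

RATE = 16000        # assumed sample rate in Hz
CHUNK = RATE // 10  # frames per buffer, i.e. 100 ms of audio


def chunk_generator(buffer):
    """Yield raw audio chunks from a queue until a None sentinel arrives."""
    while True:
        chunk = buffer.get()
        if chunk is None:
            return
        yield chunk


def transcribe(chunks):
    """Stream audio chunks to Speech-to-Text and print transcripts as they arrive."""
    from google.cloud import speech  # lazy import; needs google-cloud-speech

    client = speech.SpeechClient()
    config = speech.RecognitionConfig(
        encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
        sample_rate_hertz=RATE,
        language_code="en-US",  # assumption: adjust to the spoken language
    )
    streaming_config = speech.StreamingRecognitionConfig(
        config=config, interim_results=True
    )
    requests = (
        speech.StreamingRecognizeRequest(audio_content=chunk) for chunk in chunks
    )
    responses = client.streaming_recognize(config=streaming_config, requests=requests)
    for response in responses:
        for result in response.results:
            if result.alternatives:
                label = "final" if result.is_final else "interim"
                print(f"[{label}] {result.alternatives[0].transcript}")


def main():
    """Capture microphone audio in a callback and transcribe it until interrupted."""
    import pyaudio  # lazy import; needs PyAudio and PortAudio

    buffer = queue.Queue()

    def on_audio(in_data, frame_count, time_info, status):
        # PyAudio calls this on its own thread for every captured buffer.
        buffer.put(in_data)
        return None, pyaudio.paContinue

    audio = pyaudio.PyAudio()
    stream = audio.open(
        format=pyaudio.paInt16,
        channels=1,
        rate=RATE,
        input=True,
        frames_per_buffer=CHUNK,
        stream_callback=on_audio,
    )
    try:
        transcribe(chunk_generator(buffer))
    finally:
        stream.stop_stream()
        stream.close()
        audio.terminate()
        buffer.put(None)  # unblock the request generator so the process can exit
```

Calling `main()` starts the capture; pressing Ctrl+C stops it. The queue decouples the audio callback from the network stream, so a slow response from the API does not block microphone capture.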
## Troubleshooting
- **Audio Device Issues:** If PyAudio does not recognize your microphone, ensure your audio devices are correctly configured and accessible by PyAudio.
- **API Errors:** If you see errors related to the Google Speech-to-Text API, ensure the API is enabled and your billing information is correct in the Google Cloud Console.
- **Microphone Permissions:** Make sure Python and your terminal have permission to access your microphone. This might require adjusting system or security settings on your operating system.
- **Audio Quality:** Use a good-quality microphone and make sure it is correctly configured in your system settings to optimize transcription accuracy.
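
For the audio device issue above, one way to see what PyAudio can access is to list the devices that report at least one input channel. This is a small sketch rather than code from the repository; the helper names `input_devices` and `list_input_devices` are hypothetical.

```python
def input_devices(device_infos):
    """Keep only devices that can record (at least one input channel)."""
    return [
        (info["index"], info["name"])
        for info in device_infos
        if info.get("maxInputChannels", 0) > 0
    ]


def list_input_devices():
    """Ask PyAudio for every device and return those usable as a microphone."""
    import pyaudio  # lazy import; needs PyAudio and PortAudio

    audio = pyaudio.PyAudio()
    try:
        infos = [
            audio.get_device_info_by_index(i)
            for i in range(audio.get_device_count())
        ]
    finally:
        audio.terminate()
    return input_devices(infos)
```

If `list_input_devices()` returns an empty list, PyAudio cannot see any microphone, which points to a driver, configuration, or permission problem rather than an API problem. If your microphone is listed but is not the default device, its index can be passed as `input_device_index` to `PyAudio.open`.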
## Contributing
Contributions are welcome! Feel free to submit pull requests or open issues to improve the functionality or documentation of this project.
## License
The project is licensed under the MIT License.