GitHub - AmoKorankye/Speech-Applications: Speech Applications is a user-friendly toolkit for text-to-speech and speech-to-text tasks. It integrates with Google Cloud services to convert text to speech and to transcribe spoken words.

Speech Applications

The Speech Applications repository hosts two applications, Text to Speech and Speech to Text. Both use Google Cloud services to convert between text and speech, and both are presented through a Streamlit interface. The repository also contains the JSON key files used to authenticate with Google Cloud and a utility script for audio preprocessing.

Speech Applications Overview

Text to Speech Application

The Text to Speech application takes text from the user and converts it into speech. It is implemented with Streamlit and works in four steps: it sets up Google Cloud credentials, reads the user's input, synthesizes speech from that text, and presents the synthesized audio to the user.
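The four steps above can be sketched roughly as follows. This is a minimal illustration, not the code in text_to_speech_app.py: the helper names (clean_text, synthesize, main), the en-US voice, the MP3 output, and the choice of sa_text_demo.json as the key file for this application are all assumptions. It requires the streamlit and google-cloud-texttospeech packages, which are imported lazily so that the input helper can be used on its own.

```python
def clean_text(raw):
    """Return the stripped input text, or None when nothing is left to synthesize."""
    text = (raw or "").strip()
    return text or None


def synthesize(text, key_path="sa_text_demo.json"):
    """Return MP3 bytes for `text` (key file name is an assumption)."""
    # Imported here so clean_text() works without the client library installed.
    from google.cloud import texttospeech

    # Step 1: set up Google Cloud credentials from a key file.
    client = texttospeech.TextToSpeechClient.from_service_account_file(key_path)

    # Step 3: synthesize speech from the input text.
    response = client.synthesize_speech(
        input=texttospeech.SynthesisInput(text=text),
        voice=texttospeech.VoiceSelectionParams(
            language_code="en-US",  # assumed language
            ssml_gender=texttospeech.SsmlVoiceGender.NEUTRAL,
        ),
        audio_config=texttospeech.AudioConfig(
            audio_encoding=texttospeech.AudioEncoding.MP3
        ),
    )
    return response.audio_content


def main():
    """Hypothetical Streamlit page wiring the steps together."""
    import streamlit as st

    st.title("Text to Speech")
    # Step 2: take user input.
    text = clean_text(st.text_area("Text to convert"))
    if st.button("Synthesize"):
        if text is None:
            st.warning("Enter some text first.")
        else:
            # Step 4: present the synthesized audio.
            st.audio(synthesize(text), format="audio/mp3")
```

To try it, one would add a final main() call to the script and launch it with streamlit run.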

Speech to Text Application

The Speech to Text application transcribes uploaded audio files into text. Also built with Streamlit, it sets up a Google Cloud Speech client, defines a function that transcribes an audio file, lets users upload audio files, and presents the transcription result to the user.
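A rough sketch of that flow is shown below. It is an illustration under stated assumptions, not the contents of speech_to_text_app.py: the function names, the en-US language code, the restriction to WAV uploads, and the use of sa_speech_demo.json as the key file are all hypothetical. It requires the streamlit and google-cloud-speech packages, imported lazily so the small text helper stays usable without them.

```python
def join_segments(segments):
    """Join transcript segments into one string, skipping empty pieces."""
    return " ".join(s.strip() for s in segments if s and s.strip())


def transcribe(audio_bytes, key_path="sa_speech_demo.json"):
    """Return the transcript of mono WAV audio (key file name is an assumption)."""
    # Imported here so join_segments() works without the client library installed.
    from google.cloud import speech

    # Set up the Google Cloud Speech client from a key file.
    client = speech.SpeechClient.from_service_account_file(key_path)

    audio = speech.RecognitionAudio(content=audio_bytes)
    # For WAV input the encoding and sample rate are read from the file header.
    config = speech.RecognitionConfig(language_code="en-US")  # assumed language

    response = client.recognize(config=config, audio=audio)
    # Each result covers a consecutive stretch of audio; take its top alternative.
    return join_segments(
        r.alternatives[0].transcript for r in response.results if r.alternatives
    )


def main():
    """Hypothetical Streamlit page: upload a file, show the transcription."""
    import streamlit as st

    st.title("Speech to Text")
    uploaded = st.file_uploader("Upload a mono WAV file", type=["wav"])
    if uploaded is not None:
        st.write(transcribe(uploaded.getvalue()))
```

As with any Streamlit script, a final main() call and streamlit run would be needed to serve the page.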

Audio Preprocessing

The project includes a Python script, convert_to_mono.py (the repository's file listing shows it as mul_to_mono.py), that converts audio files to a single (mono) channel. This preprocessing step is needed before audio is passed to the Speech to Text application. The script reads an audio file, converts it to mono, and saves the modified file.
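The read, convert, and save steps could look something like the sketch below. It uses only the Python standard library (wave and array) and handles 16-bit PCM WAV files by averaging the channels of each frame; the repository's own script may use a different library or method, and the function name and signature here are assumptions.

```python
import array
import sys
import wave


def convert_to_mono(src_path, dst_path):
    """Average the channels of a 16-bit PCM WAV file and save a mono copy."""
    # Read the source file and its parameters.
    with wave.open(src_path, "rb") as src:
        n_channels = src.getnchannels()
        sample_width = src.getsampwidth()
        frame_rate = src.getframerate()
        frames = src.readframes(src.getnframes())

    if sample_width != 2:
        raise ValueError("this sketch only handles 16-bit PCM audio")

    # WAV samples are little-endian; array uses the machine's byte order.
    samples = array.array("h")
    samples.frombytes(frames)
    if sys.byteorder == "big":
        samples.byteswap()

    # Samples are interleaved per frame: average each group of n_channels.
    mono = array.array(
        "h",
        (
            sum(samples[i:i + n_channels]) // n_channels
            for i in range(0, len(samples), n_channels)
        ),
    )
    if sys.byteorder == "big":
        mono.byteswap()

    # Save the single-channel result with the original sample rate.
    with wave.open(dst_path, "wb") as dst:
        dst.setnchannels(1)
        dst.setsampwidth(sample_width)
        dst.setframerate(frame_rate)
        dst.writeframes(mono.tobytes())
    return dst_path
```

Averaging keeps content from every channel; simply keeping the first channel would be an equally plausible choice if the channels carry the same signal.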

API Keys

The repository contains two JSON files, sa_speech_demo.json and sa_text_demo.json, that store the API keys needed to authenticate with Google Cloud services. The Text to Speech and Speech to Text applications use these keys to access their respective Google Cloud APIs.
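As an illustration of how such a key file might be wired in, the sketch below assumes the JSON files are Google Cloud service account keys (the sa_ prefix suggests this, but the source does not say so). It checks for a few fields that service account key files contain and then points the GOOGLE_APPLICATION_CREDENTIALS environment variable at the file, which Google's client libraries consult when no explicit credentials are passed. The function names are hypothetical.

```python
import json
import os

# A few fields found in a Google Cloud service account key file.
REQUIRED_FIELDS = ("type", "project_id", "private_key", "client_email")


def missing_key_fields(path):
    """Return the required fields that are absent from the JSON key file."""
    with open(path, encoding="utf-8") as fh:
        data = json.load(fh)
    return [field for field in REQUIRED_FIELDS if field not in data]


def use_key_file(path):
    """Validate the key file, then expose it to Google Cloud client libraries."""
    missing = missing_key_fields(path)
    if missing:
        raise ValueError(f"{path} is missing fields: {', '.join(missing)}")
    # Clients created after this point pick the file up automatically.
    os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = os.path.abspath(path)
```

A call such as use_key_file("sa_text_demo.json") would then precede the creation of the corresponding client.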

Folders and files (6 commits)

README.md
Speech Applications.png
demo1.py
demo2.py
mul_to_mono.py
sa_speech_demo.json
sa_text_demo.json
speech_to_text_app.py
text_to_speech_app.py
