This is an AI web application that offers text-to-speech and speech-to-text transcription using Google's pretrained models. The goal is to extract insights from recorded speech in the form of text.
Written text is the most common input to Natural Language Processing: it is widely available and comes in forms such as documents and data scraped from websites. Many firms and organizations rely on processing this data to derive insights and better serve their customers. Speech, on the other hand, is another basic form of human language, but it is harder to process and to reach state-of-the-art performance on, because it depends on several factors. Many organizations, for instance those in the telecommunications industry, collect audio files from their customers in the form of complaints or feedback about a particular product or service. The main goal of this project is to use the Google API to convert audio speech to text, then apply the same processing steps used for any other text document to extract insights such as specific keywords and the sentiment of the speech. Another part of this project explored using Google Translate to recognize the three major Nigerian native languages. Google does not support this feature yet, but it does recognize the Nigerian accent, and that option is included in the app.
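The pipeline described above (audio to text, then keywords and sentiment) can be sketched roughly as follows. This is a minimal illustration and not the app's actual code: the `transcribe` helper assumes the third-party `SpeechRecognition` package and network access, the `en-NG` language code is an assumed choice for Nigerian-accented English, and the stopword and sentiment word lists are tiny hypothetical stand-ins for a real model or lexicon.

```python
import re
from collections import Counter

# Hypothetical word lists for illustration only; a real app would rely on a
# proper stopword list and a trained sentiment model or full lexicon.
STOPWORDS = {"the", "a", "an", "is", "are", "was", "and", "or", "to", "of",
             "in", "on", "for", "my", "i", "it", "this", "that", "has",
             "have", "been", "very"}
POSITIVE = {"good", "great", "happy", "excellent", "fast", "love", "thanks"}
NEGATIVE = {"bad", "poor", "slow", "terrible", "angry", "hate", "failed"}


def transcribe(path, language="en-NG"):
    """Transcribe a WAV/AIFF/FLAC file using Google's web speech API.

    Assumption: the SpeechRecognition package is installed. It is imported
    lazily so the text-processing functions below run without it.
    """
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    with sr.AudioFile(path) as source:
        audio = recognizer.record(source)  # read the whole file
    return recognizer.recognize_google(audio, language=language)


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def top_keywords(text, n=5):
    """Return the n most frequent non-stopword tokens."""
    words = [w for w in tokenize(text) if w not in STOPWORDS]
    return [word for word, _ in Counter(words).most_common(n)]


def sentiment(text):
    """Label text by counting positive versus negative lexicon hits."""
    words = tokenize(text)
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    # A hard-coded transcript stands in for transcribe("complaint.wav").
    transcript = ("The network is slow and the service has been terrible. "
                  "The network failed again.")
    print(top_keywords(transcript, 3))
    print(sentiment(transcript))
```

Keeping transcription separate from the text analysis means the keyword and sentiment steps can be tested on plain strings, without audio files or API calls.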
- Developing a hate speech detection algorithm to classify hate speech
- Training a neural network model to classify raw audio files as Sad, Happy, Disgust, or Fearful
- Possibly building a model that can classify Nigerian languages (preferably Hausa, Igbo, and Yoruba) from audio or text
After deployment, the app's performance in recognizing speech and transcribing it to text was observed to drop, although it currently works perfectly when run locally.
Want to contribute, or know where we can source data for training? Feel free to reach out to me or send an email with any questions.