Enhance your user experience by integrating speech-to-text capabilities into your application.
Explore the documentation · See demo · Report a Bug · Request a feature
Whisper is an advanced automatic speech recognition system developed by OpenAI and released as open source. Trained on 680,000 hours of diverse, multilingual, multitask supervised data collected from the web, it delivers robust accuracy across varied accents, background noise, and technical vocabulary. It transcribes speech in numerous languages and can also translate those transcriptions into English. To make this technology easy to use from web applications, this project wraps Whisper's features in a Flask API that your applications can call over HTTP.
This project was built with several key technologies from the fields of artificial intelligence and web development: Python, Flask, OpenAI's Whisper model, and Docker.
To set up the project locally, follow these simple instructions.
- Install Docker (the command below uses Homebrew on macOS)
brew install docker
- Clone the repo
git clone https://github.com/MP242/WHISPER-FLASK-API.git
- Build the Docker image
docker build -t whisper-api .
- Run the server (the API listens on port 5000)
docker run -p 5000:5000 whisper-api
This Flask API offers two primary routes for easy interaction:
- GET Request to the Root Path
- Route: GET "/"
- Action: Returns a simple Hello World message.
fetch('http://localhost:5000/')
.then(response => response.text())
.then(data => console.log(data));
Response:
"Hello World"
- POST Request for Speech-to-Text Conversion
- Route: POST "/whisper"
- Input: Form data with an audio file included under the key "file".
- Action: Processes the provided audio file through the Whisper model to perform speech-to-text conversion.
// Assuming you have a File object or Blob representing the audio file
const audioFile = document.querySelector('input[type="file"]').files[0];
const formData = new FormData();
formData.append("file", audioFile, "audio.wav");
fetch('http://localhost:5000/whisper', {
method: "POST",
body: formData
})
.then(response => response.json())
.then(data => console.log(data.results[0].transcript));
Expected Response:
{"results": [{"filename": "audio.wav", "transcript": "The transcribed text from your audio file."}]}
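Reading that response from a non-JavaScript client is a plain JSON parse. A minimal Python sketch, assuming the response follows the shape shown above (a top-level "results" list with one entry per uploaded file):

```python
import json

# Sample body following the documented /whisper response shape
# (assumption: a top-level "results" list, one entry per uploaded file).
raw = ('{"results": [{"filename": "audio.wav", '
       '"transcript": "The transcribed text from your audio file."}]}')

data = json.loads(raw)
transcript = data["results"][0]["transcript"]
print(transcript)  # prints: The transcribed text from your audio file.
```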
These routes enable straightforward interaction with the speech-to-text capabilities provided by the Whisper model through your Flask API. The examples demonstrate how to make requests using JavaScript, facilitating integration into web applications.
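Browsers are not the only possible clients. The sketch below shows how a Python script could call the same POST route using only the standard library, by hand-building the multipart/form-data body the route expects (the audio bytes under the form key "file"). The `build_multipart` and `transcribe` helpers are hypothetical, written for this example; in practice a library such as `requests` would do the same job with less code.

```python
import io
import urllib.request
import uuid

def build_multipart(field_name, filename, payload, content_type="audio/wav"):
    # Hand-rolled multipart/form-data body, mirroring what the browser's
    # FormData sends: a single part carrying `payload` under `field_name`.
    boundary = uuid.uuid4().hex
    body = io.BytesIO()
    body.write(f"--{boundary}\r\n".encode())
    body.write(
        f'Content-Disposition: form-data; name="{field_name}"; '
        f'filename="{filename}"\r\n'.encode()
    )
    body.write(f"Content-Type: {content_type}\r\n\r\n".encode())
    body.write(payload)
    body.write(f"\r\n--{boundary}--\r\n".encode())
    return body.getvalue(), f"multipart/form-data; boundary={boundary}"

def transcribe(path, url="http://localhost:5000/whisper"):
    # Hypothetical helper: POST a local audio file to the /whisper route.
    # Requires the Docker container from the setup section to be running.
    with open(path, "rb") as f:
        body, ctype = build_multipart("file", "audio.wav", f.read())
    req = urllib.request.Request(url, data=body,
                                 headers={"Content-Type": ctype})
    with urllib.request.urlopen(req) as resp:
        return resp.read().decode()

# Build (but do not send) a request body from some placeholder bytes:
body, ctype = build_multipart("file", "audio.wav", b"fake-wav-bytes")
```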
- to be defined
Marc POLLET - @Marc_linkedin - [email protected]
Project Link: https://github.com/MP242/vocal-chat