

Run your local Whisper model with a Flask API

Enhance your user experience by integrating speech-to-text capabilities into your application.
Explore the documentation

See demo · Report a bug · Request a feature

Table of Contents
  1. About this project
  2. Getting started
  3. Usage
  4. Roadmap
  5. Contact

About this project


Whisper is an open-source automatic speech recognition system developed by OpenAI. Trained on 680,000 hours of diverse, multilingual, multitask supervised data collected from the web, it delivers strong accuracy across accents, background noise, and technical vocabulary. Beyond robust transcription in numerous languages, it can also translate those transcriptions into English. To make this technology easy to use from web applications, this project wraps Whisper's features in a Flask API that your applications can call over HTTP.
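For context, here is a minimal sketch of what such a Flask wrapper might look like. This is a hypothetical illustration (the route names come from the Usage section below; the actual implementation in this repo may differ), with the Whisper model call stubbed out:

```python
# Hypothetical sketch of the Flask wrapper; the actual code in this
# repo may differ. The Whisper model call is stubbed out below.
from flask import Flask, jsonify, request

app = Flask(__name__)

@app.route("/")
def index():
    # Simple health-check route.
    return "Hello World"

@app.route("/whisper", methods=["POST"])
def transcribe():
    if "file" not in request.files:
        return jsonify({"error": "form field 'file' is required"}), 400
    audio = request.files["file"]
    # A real implementation would run the model here, e.g. (assumption):
    #   result = model.transcribe(saved_path)  # model loaded once at startup
    #   transcript = result["text"]
    transcript = "stub transcript"
    return jsonify({"results": [{"filename": audio.filename,
                                 "transcript": transcript}]})

# In the container this would be served with app.run(host="0.0.0.0", port=5000).
```

Returning the transcript as JSON keyed by filename lets a client match each result back to the file it uploaded.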

Built with

This project was built with the following key technologies:

  • Python
  • Flask
  • OpenAI Whisper
  • Docker

(back to top)

Getting started

To set up the project locally, follow these instructions.

Prerequisites

  • Install Docker (the command below assumes Homebrew on macOS; on other platforms, install Docker Desktop or Docker Engine)
    brew install docker

Installation

  1. Clone the repo
    git clone https://github.com/MP242/WHISPER-FLASK-API.git
  2. Build the Docker image
    docker build -t whisper-api .
  3. Run the container (exposes the API on port 5000)
    docker run -p 5000:5000 whisper-api

(back to top)

Usage

This Flask API offers two primary routes for easy interaction:

  1. GET Request to the Root Path
  • Route: GET "/"
  • Action: Returns a simple Hello World message.
fetch('http://localhost:5000/')
    .then(response => response.text())
    .then(data => console.log(data));

Response:

"Hello World"
  2. POST Request for Speech-to-Text Conversion
  • Route: POST "/whisper"
  • Input: Form data with an audio file included under the key "file".
  • Action: Processes the provided audio file through the Whisper model to perform speech-to-text conversion.
    // Assuming you have a File object or Blob representing the audio file
    const audioFile = document.querySelector('input[type="file"]').files[0];
    const formData = new FormData();
    formData.append("file", audioFile, "audio.wav");

    fetch('http://localhost:5000/whisper', {
      method: "POST",
      body: formData
    })
      .then(response => response.json())
      .then(data => console.log(data.results[0].transcript));

Expected Response:

  {"results": [{"filename": "audio.wav", "transcript": "The transcribed text from your audio file."}]}
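Outside the browser, the same response shape can be handled with Python's standard-library json module. The values below are the illustrative ones from the example response above, not real API output:

```python
import json

# Illustrative response body, matching the shape documented above.
raw = '{"results": [{"filename": "audio.wav", "transcript": "The transcribed text from your audio file."}]}'

payload = json.loads(raw)
first = payload["results"][0]
print(first["filename"])    # audio.wav
print(first["transcript"])  # The transcribed text from your audio file.
```

Because "results" is a list, a client can iterate over it unchanged if the API is later extended to accept several files per request.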

These routes enable straightforward interaction with the speech-to-text capabilities provided by the Whisper model through your Flask API. The examples demonstrate how to make requests using JavaScript, facilitating integration into web applications.

(back to top)

Roadmap

  • to be defined

(back to top)

Contact

Marc POLLET - @Marc_linkedin - [email protected]

Project Link: https://github.com/MP242/WHISPER-FLASK-API

(back to top)
