
An end-to-end DeepSpeech API built with FastAPI, providing speech-to-text and text-to-speech services.


MBAZA-NLP/RW-DEEPSPEECH-API

 
 



RW DEEPSPEECH API

A Kinyarwanda-based, end-to-end DeepSpeech API with speech-to-text and text-to-speech services!
Explore the docs »

View Demo · Report Bug · Request Feature

Table of Contents
  1. About The Project
  2. Getting Started
  3. Usage
  4. Roadmap
  5. Contributing
  6. License
  7. Contact
  8. Acknowledgments

About The Project

There are many great README templates available on GitHub; however, I didn't find one that really suited my needs so I created this enhanced one. I want to create a README template so amazing that it'll be the last one you ever need -- I think this is it.

Here's why:

  • Your time should be focused on creating something amazing. A project that solves a problem and helps others
  • You shouldn't be doing the same tasks over and over like creating a README from scratch
  • You should implement DRY principles to the rest of your life 😄

Of course, no one template will serve all projects since your needs may be different. So I'll be adding more in the near future. You may also suggest changes by forking this repo and creating a pull request or opening an issue. Thanks to all the people who have contributed to expanding this template!

(back to top)

Built With

  • Python
  • FastAPI
  • WebSockets
  • Transformers
  • TTS
  • Uvicorn
  • Nemo

(back to top)

Getting Started

This is an example of how you may give instructions on setting up your project locally. To get a local copy up and running follow these simple example steps.

Prerequisites

It is highly recommended to run the application in a Docker container to avoid dependency errors, but it is also possible to run it without Docker. The required specifications are:

  • With Docker:
    • DISK SPACE >= 10GB
    • RAM >= 2GB
  • Without Docker:
    • RAM >= 2GB free/spare
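The disk requirement above can be checked before building the image. The following is a minimal sketch using only the Python standard library; the helper names `has_enough` and `free_disk_gb` are hypothetical and not part of this project, and it assumes "GB" is read as 1024³ bytes.

```python
import shutil

GIB = 1024 ** 3  # bytes per GiB (assumption: "GB" in the list above is read as GiB)


def has_enough(available_bytes: int, required_gb: float) -> bool:
    """Return True if available_bytes covers required_gb GiB."""
    return available_bytes >= required_gb * GIB


def free_disk_gb(path: str = ".") -> float:
    """Free disk space, in GiB, on the filesystem containing path."""
    return shutil.disk_usage(path).free / GIB


if __name__ == "__main__":
    free_bytes = shutil.disk_usage(".").free
    # 10 is the Docker disk requirement listed above.
    print(f"Free disk: {free_bytes / GIB:.1f} GiB, enough: {has_enough(free_bytes, 10)}")
```

Run it from the directory where the repository will be cloned, so that the check applies to the right filesystem. A RAM check is left out because the standard library has no portable call for it.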

Installation with docker

Follow the steps below to set up the project on a server or machine running Docker.

  1. Clone the repo
    git clone https://github.com/agent87/RW-DEEPSPEECH-API.git
  2. Create an environment file named ".env" and paste in the following variables
    MONGO_USERNAME=myuser
    MONGO_PASSWORD=mypassword
    MONGO_HOST=localhost
    MONGO_PORT=27017
    MONGO_DATABASE=feedback
    MONGO_COLLECTION=logs
    NOTE: For security purposes, make sure to change the variables above!
  3. Build the Docker image
    docker compose build
    Note: with an older Docker version, use "docker-compose build" instead
  4. Start the docker containers
    docker compose up
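To illustrate how the ".env" variables from step 2 fit together, here is a minimal sketch that parses them and assembles a MongoDB connection URI. The README does not show how the application actually consumes these variables, so the helper names `parse_env` and `build_mongo_uri` are hypothetical and the URI layout is an assumption based on the standard `mongodb://` scheme.

```python
from urllib.parse import quote_plus


def parse_env(text: str) -> dict:
    """Parse KEY=VALUE lines, skipping blank lines and '#' comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def build_mongo_uri(env: dict) -> str:
    """Assemble a mongodb:// URI (assumed layout, not taken from the project)."""
    # Percent-encode credentials so characters such as '@' or ':' stay valid.
    user = quote_plus(env["MONGO_USERNAME"])
    password = quote_plus(env["MONGO_PASSWORD"])
    return f"mongodb://{user}:{password}@{env['MONGO_HOST']}:{env['MONGO_PORT']}/"


# Same placeholder values as in step 2; change them in a real deployment.
EXAMPLE_ENV = """\
MONGO_USERNAME=myuser
MONGO_PASSWORD=mypassword
MONGO_HOST=localhost
MONGO_PORT=27017
MONGO_DATABASE=feedback
MONGO_COLLECTION=logs
"""

if __name__ == "__main__":
    env = parse_env(EXAMPLE_ENV)
    print(build_mongo_uri(env))  # mongodb://myuser:mypassword@localhost:27017/
    print(env["MONGO_DATABASE"], env["MONGO_COLLECTION"])  # feedback logs
```

The credentials are percent-encoded because a password containing reserved characters would otherwise produce an invalid URI.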

(back to top)

Usage

(back to top)

Roadmap

  • Add database
  • Add Authentication
  • Additional testing

See the open issues for a full list of proposed features (and known issues).

(back to top)

Contributing

Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.

If you have a suggestion that would make this better, please fork the repo and create a pull request. You can also simply open an issue with the tag "enhancement". Don't forget to give the project a star! Thanks again!

  1. Fork the Project
  2. Create your Feature Branch (git checkout -b feature/AmazingFeature)
  3. Commit your Changes (git commit -m 'Add some AmazingFeature')
  4. Push to the Branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

(back to top)

License

Distributed under the MIT License. See LICENSE.txt for more information.

(back to top)

Contact

Arnaud Kayonga - @kayarn_ - [email protected]

Project Link: https://github.com/agent87/RW-DEEPSPEECH-API

(back to top)

Acknowledgments

Use this space to list resources you find helpful and would like to give credit to. I've included a few of my favorites to kick things off!

(back to top)
