This repository contains code for a Flask server that is containerized and deployed to Vertex AI on GCP.
The Flask server provides access to the following Sunbird AI models:
- ASR (speech to text) for Luganda.
- Translation (local languages to English and English to local languages).
- TTS (coming soon to the API)
The process of deployment is as follows:
- The models are pulled from HuggingFace. See `asr_inference` and `translate_inference`.
- The Flask app exposes 2 endpoints, `isalive` and `predict`, as required by Vertex AI. The `predict` endpoint receives a list of inference requests, passes them to the model, and returns the results.
- A docker container is built from this Flask app and is pushed to the Google Container Registry (GCR).
- On Vertex AI, a "model" is created from this container and then deployed to a Vertex endpoint.
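The two required routes can be sketched roughly like this (the route names follow the Vertex AI custom-container convention described above; `run_model` is a hypothetical stand-in for the actual ASR/translation inference):

```python
from flask import Flask, request, jsonify

app = Flask(__name__)


def run_model(instance):
    # Placeholder for the real model call (e.g. ASR or translation
    # inference loaded from HuggingFace); here it just echoes the input.
    return {"echo": instance}


@app.route("/isalive")
def isalive():
    # Vertex AI polls this health route to decide the container is ready.
    return "", 200


@app.route("/predict", methods=["POST"])
def predict():
    # Vertex AI sends {"instances": [...]} and expects {"predictions": [...]}.
    instances = request.get_json(force=True).get("instances", [])
    predictions = [run_model(instance) for instance in instances]
    return jsonify({"predictions": predictions})
```

Vertex AI batches requests into the `instances` list, so the handler loops over it and returns one prediction per instance.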
NOTE: Check out this article for a detailed tutorial on this process.
The resulting endpoint is then used in the main Sunbird AI API.
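A rough sketch of the build-and-deploy steps with the `gcloud` CLI (the image name, region, display names, and the `PROJECT_ID`/`ENDPOINT_ID`/`MODEL_ID` placeholders are all assumptions, not values from this repo):

```shell
# Build the container and push it to the Google Container Registry (GCR).
gcloud builds submit --tag gcr.io/PROJECT_ID/sunbird-inference

# Create a Vertex AI "model" from the container, pointing Vertex AI
# at the health and predict routes exposed by the Flask app.
gcloud ai models upload \
  --region=us-central1 \
  --display-name=sunbird-inference \
  --container-image-uri=gcr.io/PROJECT_ID/sunbird-inference \
  --container-health-route=/isalive \
  --container-predict-route=/predict

# Create an endpoint and deploy the model to it.
gcloud ai endpoints create --region=us-central1 --display-name=sunbird-endpoint
gcloud ai endpoints deploy-model ENDPOINT_ID \
  --region=us-central1 \
  --model=MODEL_ID \
  --display-name=sunbird-deployment
```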
- Add TTS
- Handle long audio files.
- Use a smaller base container: the current one (`huggingface/transformers-pytorch-gpu`) is pretty heavy and maybe unnecessary. This would enable us to end up with a smaller artifact which takes up less memory.
- Automate the deployment process for both the API and this inference service (using Github Actions or Terraform... or both?)
- Come up with an end-to-end workflow from data ingestion to deployment (what tools are required for this?).