This project demonstrates how to build a real-time conversational AI by streaming responses from an LLM. It uses FastAPI to create a web server that accepts user input and streams generated responses back to the user, and Streamlit for a simple chatbot interface.
- Python 3.11
- Poetry (`pip install poetry`)
- Docker (optional, for containerized deployment) [The containers have been tested in a Windows environment; on Mac some adjustments may be needed]
- Make (to use Makefile commands)
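The Makefile targets used below are not shown in this README; a plausible sketch, assuming a Poetry-managed environment (the module paths `app.main:app` and `frontend/app.py` are assumptions), might look like:

```makefile
install_env:
	poetry install

run_backend:
	poetry run uvicorn app.main:app --reload

run_frontend:
	poetry run streamlit run frontend/app.py

docker_build_app:
	docker compose build

docker_app_up:
	docker compose up -d
```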
---
Install dependencies:
make install_env
---
Run Backend Application:
make run_backend
---
Run Frontend Application:
make run_frontend
---
Build images:
make docker_build_app
---
Run containers:
make docker_app_up
Note: In the logs, it may appear that the app is deployed on 0.0.0.0:, but it is actually accessible at localhost:.
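For reference, the two containers might be orchestrated with a compose file along these lines; the service names, ports, commands, and build contexts are all assumptions, not the project's actual configuration:

```yaml
services:
  backend:
    build: .
    command: uvicorn app.main:app --host 0.0.0.0 --port 8000
    ports:
      - "8000:8000"
  frontend:
    build: .
    command: streamlit run frontend/app.py --server.port 8501
    ports:
      - "8501:8501"
    depends_on:
      - backend
```

Binding to `0.0.0.0` inside the container is what makes the port mapping work, which is also why the logs show `0.0.0.0` while the app is reached via `localhost` on the host.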