This repository contains an example of building a simple Retrieval-Augmented Generation (RAG) application using FastAPI and LangChain. The project was developed as part of a university lecture at BSUIR and demonstrates the step-by-step creation of a RAG system.
The project is based on an example described in this article and has been adapted to work with Llama3.2 and to demonstrate the incremental creation of the application.
I would like to express my sincere gratitude to the author of the article!
- Technologies: FastAPI, LangChain, Llama3.2, OpenAI API, ChromaDB, SQLite, Streamlit
- Architecture:
- Key Features:
- 💬 Interactive chat interface
- 📚 Document upload and processing
- 🔍 Context-aware responses using RAG
- 🗄️ Use of SQLite and Chroma databases for persistent data storage
- 📝 Chat history tracking
- 🔒 Session management
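To make the "context-aware responses using RAG" feature concrete, here is a toy, dependency-free sketch of the retrieve-then-generate idea. It uses a bag-of-words similarity purely for illustration; the actual project uses real embeddings stored in ChromaDB and passes the retrieved context to Llama3.2 via LangChain.

```python
import math
import re
from collections import Counter

def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding"; the real app uses a vector model + ChromaDB
    return Counter(re.findall(r"\w+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(question: str, docs: list[str], k: int = 1) -> list[str]:
    # Return the k documents most similar to the question
    q = embed(question)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

docs = [
    "FastAPI is a modern Python web framework.",
    "ChromaDB stores vector embeddings for retrieval.",
    "Streamlit builds interactive data apps.",
]
context = retrieve("Which database stores embeddings?", docs)
# The retrieved text is then prepended to the LLM prompt:
prompt = f"Answer using this context:\n{context[0]}\n\nQuestion: Which database stores embeddings?"
```

The key design point is the same as in the full application: the model never sees the whole document store, only the few chunks most relevant to the question.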
```bash
git clone https://github.com/KovalM/rag-fastapi-project.git
cd rag-fastapi-project
python -m venv venv
source venv/bin/activate    # For Linux/Mac
venv\Scripts\activate       # For Windows
# Installing dependencies may take up to 8 GB of disk space
pip install -r requirements.txt
```
```bash
curl -fsSL https://ollama.com/install.sh | sh                 # Install Ollama (Linux/Mac)
powershell -Command "irm https://ollama.com/install.ps1 | iex"  # Install Ollama (Windows)
ollama pull llama3.2                                          # Download the model
```
You can read more about installing Ollama here: Ollama Download
```bash
cd api
uvicorn main:app --reload
```
The API will be available at http://127.0.0.1:8000.
API documentation: http://127.0.0.1:8000/docs
```bash
cd app
streamlit run streamlit_app.py
```
The Streamlit interface will be available in your browser at http://localhost:8501.
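The feature list mentions SQLite-backed chat history and session management. A minimal sketch of how that persistence layer could look is below; the table and column names are assumptions for illustration, not the project's actual schema.

```python
import sqlite3

# In-memory DB for the sketch; the real app persists to a file
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE chat_history (
           id INTEGER PRIMARY KEY AUTOINCREMENT,
           session_id TEXT NOT NULL,
           role TEXT NOT NULL,      -- 'user' or 'assistant'
           content TEXT NOT NULL
       )"""
)

def log_message(session_id: str, role: str, content: str) -> None:
    conn.execute(
        "INSERT INTO chat_history (session_id, role, content) VALUES (?, ?, ?)",
        (session_id, role, content),
    )
    conn.commit()

def get_history(session_id: str) -> list[tuple[str, str]]:
    rows = conn.execute(
        "SELECT role, content FROM chat_history WHERE session_id = ? ORDER BY id",
        (session_id,),
    )
    return rows.fetchall()

log_message("s1", "user", "What is RAG?")
log_message("s1", "assistant", "Retrieval-Augmented Generation.")
history = get_history("s1")
```

Keying every message by `session_id` is what lets the app replay a conversation to the model, so responses stay consistent within a session.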
You can explore the process of developing the RAG application by reviewing the commit history. Each commit represents a logically complete stage of work.
A detailed explanation of the project is available in the presentation.
If you have any questions or suggestions, please open an Issue or submit a Pull Request!
- Telegram: @KovalM_tg
- Email: [email protected]