- Project Description
- The Dataset
- Structure of the Repository
- Solution Components
- How to Start the Solution
The goal of this solution is to explore the concept of a system that can efficiently respond to customer emails, providing personalized and human-like replies. The system is designed to ensure that customers feel they are interacting with real people.
The dataset is generated using publicly available content in German from my employer's website as of September 2024. The following pages were used to create the dataset:
- https://www.ev-digitalinvest.de/anleger/faq
- https://www.ev-digitalinvest.de/analyseprozess
- https://www.ev-digitalinvest.de/anleger
- https://www.ev-digitalinvest.de/agb
The content from these pages was converted into a set of FAQ-style questions and answers, which are stored in the FAQs folder.
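The exact file layout inside the FAQs folder is not documented here; as a purely hypothetical illustration (the file name and column names are assumptions), reading one such Q&A file could look like this:

```python
import pandas as pd

# Hypothetical file from the FAQs folder: one row per question/answer pair.
faq = pd.read_csv("FAQs/faq.csv")  # assumed columns: "question", "answer"

for _, row in faq.iterrows():
    print(row["question"], "->", row["answer"][:80])
```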
The llm-zoomcamp-smart-mail project is organized as follows:
llm-zoomcamp-smart-mail/
├─ .devcontainer/
│ └─ (development container configurations)
├─ images/
│ └─ (documentation images)
├─ mage/
│ ├─ data/
│ │ └─ (pipeline dataset)
│ └─ zoomcamp-smart-mail/
│ └─ (pipeline files)
├─ notebook/
│ └─ (Jupyter notebooks for evaluation)
├─ smart_mail/
│ ├─ src/
│ │ ├─ streamlit_runner.py
│ │ ├─ email_client.py
│ │ └─ customer_support_client.py
│ └─ tests/
│ └─ (tests for the solution)
├─ .gitignore
├─ requirements.txt
└─ README.md
- .devcontainer/
  - Purpose: Provides development container configuration.
  - Usage: Optional. Ensures a consistent development environment across different machines.
- images/
  - Purpose: Stores documentation images.
  - Contents: Image files used in project documentation.
- mage/
  - Purpose: Contains Mage.AI pipeline files for data processing.
  - Subfolders:
    - data/: Stores the dataset used by the Mage.AI pipeline.
    - zoomcamp-smart-mail/: Contains the pipeline files for smart mail processing.
- notebook/
  - Purpose: Houses Jupyter notebooks for evaluation and analysis.
  - Contents: .ipynb files used for data exploration, model evaluation, and result visualization.
- smart_mail/
  - Purpose: Core application directory containing source code and tests.
  - Subfolders:
    - src/: Source code for the main application:
      - streamlit_runner.py: Launches the Streamlit UI applications. See the Interface section for details.
      - email_client.py: Implements the Email Client application (composition root).
      - customer_support_client.py: Implements the Customer Support Client application (composition root).
    - tests/: Contains unit tests for the service reciprocal_rank_fusion_service.py.
Two applications have been developed to verify the concept:
- Email Client: Simulates sending emails to the system.
- Customer Support Client: Allows the support team to review generated responses.
The Retrieval-Augmented Generation (RAG) flow consists of two components: retrieval and generation, as shown in the diagram below.
sequenceDiagram
actor User as User
User-->>App: 📧 User question (using Email Client App)
rect rgb(240, 240, 240)
note right of App: Retrieval phase
App->>EmbeddingModel: User query
EmbeddingModel->>App: Encoded query (vector)
App->>KnowledgeDatabase: Encoded query + metadata
KnowledgeDatabase->>App: Collection of answers
end
rect rgb(230, 230, 230)
note right of App: Generation phase
App->>LLM: Generate answer based on retrieved answers
LLM->>App: Response to user query
end
App->>MonitoringDatabase: Logs question, answer, token usage
actor CustomerSupportTeam as Customer Support Team
CustomerSupportTeam-->>App: Human review of auto-generated answer (dislikes 👎)
CustomerSupportTeam-->>User: 📧 Responds to user (copy-paste from the system to Outlook)
- Components:
  - The retrieval service is implemented in retrieval_service.py.
  - The generation service is represented by two services:
    - For local (offline) execution, the service ollama_generation_service.py is used.
    - For online execution, the service aws_generation_service.py is used.
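As an illustration of the local generation path, the sketch below shows one way a call to a local Ollama server could look. The model name, prompt wording, and endpoint defaults are assumptions and are not taken from ollama_generation_service.py:

```python
import requests

OLLAMA_URL = "http://localhost:11434/api/generate"  # default Ollama REST endpoint

def generate_answer(question: str, retrieved_answers: list[str], model: str = "llama3") -> str:
    """Build a prompt from the retrieved FAQ answers and ask the local LLM to reply."""
    context = "\n\n".join(retrieved_answers)
    prompt = (
        "Answer the customer question using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )
    response = requests.post(
        OLLAMA_URL,
        json={"model": model, "prompt": prompt, "stream": False},
        timeout=120,
    )
    response.raise_for_status()
    return response.json()["response"]
```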
- A ground truth dataset was generated using the notebook 02_create_ground_truth.ipynb, containing five questions per Q&A pair from the original dataset.
- The generated ground truth dataset is stored in the file ground_truth.csv.
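The notebook is the source of truth for how the questions are produced; as a rough, hypothetical sketch of the idea, each Q&A pair is turned into a prompt asking an LLM for five customer-style questions (the prompt wording below is an assumption):

```python
# Hypothetical prompt; the actual wording lives in 02_create_ground_truth.ipynb.
PROMPT_TEMPLATE = """You emulate a customer of the investment platform.
Based on the FAQ record below, formulate 5 questions this customer might send by email.
Each question should be answerable from the record.

question: {question}
answer: {answer}
"""

def build_prompt(record: dict) -> str:
    # One prompt per Q&A pair; the five generated questions are later written,
    # together with the record id, to ground_truth.csv.
    return PROMPT_TEMPLATE.format(question=record["question"], answer=record["answer"])
```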
- Four retrieval methods were tested with three different models (a hedged sketch of the vector retrieval method follows this list):
  - text retrieval
  - vector retrieval over the question/answer field pair
  - re-ranking that fuses the question/answer vector retrieval results with the text retrieval results
  - vector retrieval over the answer field
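The sketch below shows what the second method (vector retrieval over the combined question/answer fields) could look like, using the embedding model named in the conclusion further down. The record structure and function names are assumptions, not the project's actual retrieval_service.py code:

```python
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("distiluse-base-multilingual-cased-v1")

def build_index(faq_records: list[dict]) -> np.ndarray:
    # Embed the concatenated question + answer text of every FAQ record.
    texts = [f"{r['question']} {r['answer']}" for r in faq_records]
    return model.encode(texts, normalize_embeddings=True)

def search(query: str, index: np.ndarray, top_k: int = 5) -> list[int]:
    # With normalized vectors, cosine similarity reduces to a dot product.
    query_vec = model.encode([query], normalize_embeddings=True)[0]
    scores = index @ query_vec
    return np.argsort(scores)[::-1][:top_k].tolist()
```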
- Retrieval evaluation was performed in the corresponding notebooks, one per retrieval method.
- The evaluation metrics used are listed below; a minimal computation sketch follows the list:
- Mean Reciprocal Rank (MRR): Measures how well the system ranks the correct answer. A higher MRR indicates better performance.
- Recall@k: Measures how many relevant documents are retrieved in the top k results. Higher Recall@k means better performance.
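A minimal sketch of how both metrics can be computed over the ground-truth queries, assuming each query has exactly one relevant document id (the data layout is an assumption, not the notebooks' actual code):

```python
def mrr(relevant_ids: list[str], ranked_results: list[list[str]]) -> float:
    # Mean Reciprocal Rank: average of 1 / rank of the first relevant hit per query.
    total = 0.0
    for relevant, results in zip(relevant_ids, ranked_results):
        for rank, doc_id in enumerate(results, start=1):
            if doc_id == relevant:
                total += 1.0 / rank
                break
    return total / len(relevant_ids)

def recall_at_k(relevant_ids: list[str], ranked_results: list[list[str]], k: int = 5) -> float:
    # Fraction of queries whose relevant document appears in the top-k results.
    hits = sum(
        1 for relevant, results in zip(relevant_ids, ranked_results) if relevant in results[:k]
    )
    return hits / len(relevant_ids)
```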
- Data visualization was performed in the notebook 20_analytics.ipynb.
- The evaluation results are shown in the picture below:
- Conclusion:
  - The embedding model distiluse-base-multilingual-cased-v1 produced the best results for the given dataset in German.
  - Re-ranking with the top 5 retrieval results should be used.
The RAG evaluation has not been conducted yet.
The system logs key metrics into a PostgreSQL database, including:
- Number of input and output tokens: Tracks token usage for both the input prompt and the generated response.
- LLM Processing Time: Measures the time taken by the LLM to generate a response.
- Total Processing Time: Includes the LLM processing time and any additional processing overhead.
- Processing Status: Tracks if a request is pending, processed, or encountered an error.
These metrics help monitor the system's performance and detect any processing issues.
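As an assumption-laden sketch of how such logging could be done (the table name, column names, and connection handling are hypothetical and do not reflect the solution's actual schema), one processed request might be written to PostgreSQL like this:

```python
import psycopg2

def log_request(conn_params: dict, question: str, answer: str,
                input_tokens: int, output_tokens: int,
                llm_time_s: float, total_time_s: float, status: str) -> None:
    """Insert one monitoring record; conn_params holds host, dbname, user, password."""
    with psycopg2.connect(**conn_params) as conn:  # commits the transaction on success
        with conn.cursor() as cur:
            cur.execute(
                """
                INSERT INTO monitoring_log
                    (question, answer, input_tokens, output_tokens,
                     llm_processing_time_s, total_processing_time_s, status)
                VALUES (%s, %s, %s, %s, %s, %s, %s)
                """,
                (question, answer, input_tokens, output_tokens,
                 llm_time_s, total_time_s, status),
            )
```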
The entire solution is containerized. Refer to Run Components as Docker Containers for instructions on how to start the system locally.
The solution implements document re-ranking, combining rankings from multiple retrieval systems into a final ranking. The implementation can be found in reciprocal_rank_fusion_service.py.
Tests for the re-ranking service are located in the reciprocal_rank_fusion_service_test.py file.
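The service above is the authoritative implementation; as a standalone illustration of the general Reciprocal Rank Fusion formula (each document scores the sum of 1 / (k + rank) over all rankings that contain it), a minimal sketch could look like this:

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60, top_n: int = 5) -> list[str]:
    """Fuse several ranked lists of document ids into one ranking.

    k = 60 is the constant commonly used for RRF; top_n = 5 matches the
    "re-ranking with top 5 retrieval results" conclusion above.
    """
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

# Example: fuse a text-search ranking with a vector-search ranking.
fused = reciprocal_rank_fusion([["d1", "d2", "d3"], ["d3", "d1", "d4"]])
```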
- The ingestion pipeline is powered by Mage.AI. Below is a high-level overview of the process:

sequenceDiagram
actor CustomerSupportTeam as Customer Support Team
note right of CustomerSupportTeam: Building a knowledge base
CustomerSupportTeam-->>FileStorage: Prepare Q&A list
participant IngestionPipeline as Ingestion Pipeline (Mage.ai)
IngestionPipeline->>FileStorage: Retrieve Q&A list
FileStorage->>IngestionPipeline: CSV (or PDF) files
IngestionPipeline->>EmbeddingModel: Generate embeddings
EmbeddingModel->>IngestionPipeline: Embeddings (vectors)
IngestionPipeline->>KnowledgeDatabase: Index embeddings
- The pipeline files are located in the mage/zoomcamp-smart-mail/smart-mail folder.
- An overview of the pipeline steps is shown in the picture below; a rough sketch of what the steps do follows.
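The sketch below outlines the pipeline steps (load the Q&A list, embed it, prepare documents for indexing) outside of Mage.AI. The column names and the way vectors are attached to records are assumptions, not the actual pipeline blocks:

```python
import pandas as pd
from sentence_transformers import SentenceTransformer

def run_ingestion(csv_path: str) -> list[dict]:
    """Load the Q&A list, embed each record, and return documents ready for indexing."""
    records = pd.read_csv(csv_path).to_dict(orient="records")  # assumed columns: question, answer
    model = SentenceTransformer("distiluse-base-multilingual-cased-v1")
    embeddings = model.encode([f"{r['question']} {r['answer']}" for r in records])
    # In the real pipeline the vectors are written to the knowledge database;
    # here they are simply attached to the records.
    for record, vector in zip(records, embeddings):
        record["embedding"] = vector.tolist()
    return records
```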
The solution can be launched in two ways:
- Execution with local LLM: The simplest way to test the solution components. It uses a local Ollama model and does not require access to external services or API keys. Please note that the local LLM shows average or even poor quality when generating German responses. To achieve better results, use the next option.
- Execution with online LLM: The recommended way to test the solution. It uses the AWS LLM service and requires an AWS account and API keys.
This approach uses a local LLM model and does not require access to external services or API keys. No extra setup is required; everything is included in the repository.
To start the solution, run the following command. It may take 30-40 minutes or more to download the required Docker images and initialize the system:
docker compose -f docker-compose.yml -f docker-compose.test.yml -p smart-mail up --build
Wait for the solution to initialize. Logs will be displayed in the terminal.
To run the pre-configured ingestion pipeline, open the browser and navigate to:
http://localhost:6789/pipelines/ingestion_evdi/triggers
Click the Run@once button. The pipeline will take 5-10 minutes to complete.
To open the Email Client, visit http://localhost:8501/ and select the email_client.py option from the sidebar.
Please note that the first start can take some time while the sentence-transformers model is downloaded. This will be improved in the future.
Test the system by entering questions from the Question Examples section below.
- Wie kann ich das Risiko einer Investition in Immobilien einschätzen?
- Wer ist für die finale Projekteinschätzung verantwortlich?
- Wie lange dauert es, bis ich mein Geld zurückbekommen kann, wenn ich es brauche?
- Welche Schritte muss ich unternehmen, um mein Geld zügig zurückzuerhalten, falls notwendig?
To review answers, open http://localhost:8501/ and select customer_support_client.py from the sidebar. Input the Answer ID and click Read Answer.
After testing, clean up the Docker containers by running:
docker compose -p smart-mail down
This is the recommended way to test the solution. It requires an AWS account and API keys.
- Create an AWS account if you don't have one: https://aws.amazon.com/
- Configure the AWS CLI: https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-quickstart.html. Name your AWS CLI profile private or change the environment variable AWS_CONFIGURATION_PROFILE_NAME in the file .env.dev.
- Follow the instructions in the article Getting started with Amazon Bedrock to configure access to AWS LLM models.
- Run the following command to start the required services from the root directory of the repository:
docker compose -f docker-compose.yml -p smart-mail-online-llm up --build
- Wait for the solution to initialize. Logs will be displayed in the terminal.
- Execute the ingestion pipeline as described in the Start the Pipeline section.
- Go to the directory smart_mail.
- Install the required Python packages:
pip install -U --user -r requirements.txt
- Read the .env.dev file and export all the environment variables defined in it to the current shell session:
export $(grep -v '^#' .env.dev | xargs)
- Run the following command to start the UI clients:
export POSTGRES_HOST=localhost && streamlit run ./src/streamlit_runner.py ./src
- Open the Email Client and Customer Support Client as described in the Execution with local LLM section.
After testing, clean up the Docker containers by running:
docker compose -p smart-mail-online-llm down