ArkadiyShuvaev/llm-zoomcamp-smart-mail

Final project of the LLM Zoomcamp course: https://github.com/DataTalksClub/llm-zoomcamp. LLM Zoomcamp is a free online course about real-life applications of LLMs. Over 10 weeks, I learned how to build an AI system that answers questions about a knowledge base.


Project Description

The goal of this solution is to explore the concept of a system that can efficiently respond to customer emails, providing personalized and human-like replies. The system is designed to ensure that customers feel they are interacting with real people.

The Dataset

The dataset was generated from publicly available content in German on my employer's website as of September 2024.

The content from the source pages was converted into a set of FAQ-style questions and answers, which are stored in the FAQs folder.

Structure of the Repository

Overview of the Repository Structure

The llm-zoomcamp-smart-mail project is organized as follows:

llm-zoomcamp-smart-mail/
├─ .devcontainer/
│  └─ (development container configurations)
├─ images/
│  └─ (documentation images)
├─ mage/
│  ├─ data/
│  │  └─ (pipeline dataset)
│  └─ zoomcamp-smart-mail/
│     └─ (pipeline files)
├─ notebook/
│  └─ (Jupyter notebooks for evaluation)
├─ smart_mail/
│  ├─ src/
│  │  ├─ streamlit_runner.py
│  │  ├─ email_client.py
│  │  └─ customer_support_client.py
│  └─ tests/
│     └─ (tests for the solution)
├─ .gitignore
├─ requirements.txt
└─ README.md

Detailed Description of the Repository Structure

  1. .devcontainer/

    • Purpose: Provides development container configuration.
    • Usage: Optional. Ensures a consistent development environment across different machines.
  2. images/

    • Purpose: Stores documentation images.
    • Contents: Image files used in project documentation.
  3. mage/

    • Purpose: Contains Mage.AI pipeline files for data processing.
    • Subfolders:
      • data/: Stores the dataset used by the Mage.AI pipeline.
      • zoomcamp-smart-mail/: Contains specific pipeline files for smart mail processing.
  4. notebook/

    • Purpose: Houses Jupyter notebooks for evaluation and analysis.
    • Contents: .ipynb files used for data exploration, model evaluation, and result visualization.
  5. smart_mail/

    • Purpose: Core application directory containing source code and tests.
    • Subfolders:
      • src/: Source code for the main application:
        • streamlit_runner.py: Launches the Streamlit UI applications. See the Interface section for details.
        • email_client.py: Implements the Email Client application (composition root).
        • customer_support_client.py: Implements the Customer Support Client application (composition root).
      • tests/: Contains unit tests for the service reciprocal_rank_fusion_service.py.

Solution Components

Interface

Two applications have been developed to verify the concept:

  • Email Client: Simulates sending emails to the system.
  • Customer Support Client: Allows the support team to review generated responses.

Sample interface screenshots: Email Client and Customer Support Client.
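
As an illustration, a minimal Streamlit page in the spirit of the Email Client could look like the sketch below (hypothetical code, not the actual smart_mail/src/email_client.py):

import streamlit as st

st.title("Email Client")
question = st.text_area("Customer email")
if st.button("Send"):
    # In the real application this submission enters the RAG flow described below.
    st.success(f"Email submitted: {question}")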

RAG Flow

The Retrieval-Augmented Generation (RAG) flow consists of two components: retrieval and generation, as shown in the diagram below.

sequenceDiagram
    actor User as User
    User-->>App: 📧 User question (using Email Client App)
    rect rgb(240, 240, 240)
    note right of App: Retrieval phase
    App->>EmbeddingModel: User query
    EmbeddingModel->>App: Encoded query (vector)
    App->>KnowledgeDatabase: Encoded query + metadata
    KnowledgeDatabase->>App: Collection of answers
    end
    rect rgb(230, 230, 230)
    note right of App: Generation phase
    App->>LLM: Generate answer based on retrieved answers
    LLM->>App: Response to user query
    end
    App->>MonitoringDatabase: Logs question, answer, token usage
    actor CustomerSupportTeam as Customer Support Team

    CustomerSupportTeam-->>App: Human review of auto-generated answer (dislikes 👎)
    CustomerSupportTeam-->>User: 📧 Responds to user (copy-paste from the system to Outlook)
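
For illustration, here is a minimal Python sketch of the two phases; knowledge_db and llm are stand-ins for the project's actual services, not the real interfaces:

from sentence_transformers import SentenceTransformer

# The embedding model named in the Retrieval Evaluation section below.
model = SentenceTransformer("distiluse-base-multilingual-cased-v1")

def answer_email(question: str, knowledge_db, llm) -> str:
    # Retrieval phase: encode the user question and fetch candidate answers.
    query_vector = model.encode(question)
    candidates = knowledge_db.search(query_vector, top_k=5)

    # Generation phase: ask the LLM to compose a reply grounded in the candidates.
    context = "\n\n".join(c["answer"] for c in candidates)
    prompt = (
        "Answer the customer email using only the context below.\n\n"
        f"Context:\n{context}\n\nEmail:\n{question}"
    )
    return llm.generate(prompt)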

Retrieval Evaluation

  • A ground truth dataset was generated using the notebook 02_create_ground_truth.ipynb, containing five questions per Q&A pair from the original dataset.

  • The generated ground truth dataset is stored in the file ground_truth.csv.

  • Four retrieval methods were tested with three different models:

    • text retrieval
    • vector retrieval over the combined question/answer fields
    • re-ranking that fuses the question/answer vector retrieval with text retrieval
    • vector retrieval over the answer field only
  • Retrieval evaluation was performed in a dedicated notebook for each method.

  • The evaluation metrics used are as follows (a minimal sketch of both appears after this list):

    • Mean Reciprocal Rank (MRR): Measures how well the system ranks the correct answer. A higher MRR indicates better performance.
    • Recall@k: Measures how many relevant documents are retrieved in the top k results. Higher Recall@k means better performance.
  • Data visualization was performed in the notebook 20_analytics.ipynb.

  • The evaluation results are shown in the figure Evaluation Metrics by Method and Model.

  • Conclusion:

    • The embedding model distiluse-base-multilingual-cased-v1 delivered the best results for the given German dataset.
    • Re-ranking over the top 5 retrieval results should be used.
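
A minimal sketch of both metrics, assuming relevance holds one boolean list per ground-truth question that marks which retrieved documents are correct:

def mrr(relevance: list[list[bool]]) -> float:
    # Average of 1/rank of the first correct hit; a miss contributes 0.
    total = 0.0
    for ranks in relevance:
        for position, is_hit in enumerate(ranks, start=1):
            if is_hit:
                total += 1.0 / position
                break
    return total / len(relevance)

def recall_at_k(relevance: list[list[bool]], k: int) -> float:
    # With one relevant document per question this equals the hit rate at k.
    return sum(any(ranks[:k]) for ranks in relevance) / len(relevance)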

RAG Evaluation

The RAG evaluation has not been conducted yet.

Monitoring

The system logs key metrics into a PostgreSQL database, including:

  • Number of input and output tokens: Tracks token usage for both the input prompt and the generated response.
  • LLM Processing Time: Measures the time taken by the LLM to generate a response.
  • Total Processing Time: Includes the LLM processing time and any additional processing overhead.
  • Processing Status: Tracks if a request is pending, processed, or encountered an error.

These metrics help monitor the system's performance and detect any processing issues.
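
As an illustration, metrics like these could be written to PostgreSQL as sketched below; the table and column names are assumptions, not the project's actual schema:

import psycopg2

def log_answer_metrics(question: str, answer: str, input_tokens: int,
                       output_tokens: int, llm_time_ms: int,
                       total_time_ms: int, status: str) -> None:
    # Connection settings come from the environment (.env.dev) in the real solution.
    with psycopg2.connect(host="localhost", dbname="monitoring",
                          user="postgres", password="postgres") as connection:
        with connection.cursor() as cursor:
            cursor.execute(
                "INSERT INTO answers (question, answer, input_tokens, output_tokens,"
                " llm_time_ms, total_time_ms, status)"
                " VALUES (%s, %s, %s, %s, %s, %s, %s)",
                (question, answer, input_tokens, output_tokens,
                 llm_time_ms, total_time_ms, status))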

Database Overview

Containerization

The entire solution is containerized. Refer to Run Components as Docker Containers for instructions on how to start the system locally.

Document Re-ranking

The solution implements document re-ranking, combining rankings from multiple retrieval systems into a final ranking. The implementation can be found in reciprocal_rank_fusion_service.py.

Tests for the re-ranking service are located in the reciprocal_rank_fusion_service_test.py file.
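
For reference, here is a minimal reciprocal rank fusion sketch (not the project's exact implementation):

def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    # Each input is a ranked list of document ids from one retrieval system.
    # A document's fused score is the sum of 1 / (k + rank) across systems.
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Example: fuse a text retrieval ranking with a vector retrieval ranking.
fused = reciprocal_rank_fusion([["d1", "d2", "d3"], ["d3", "d1", "d4"]])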

Ingestion Pipeline

  • The ingestion pipeline is powered by Mage.AI. Below is a high-level overview of the process:

    sequenceDiagram
        actor CustomerSupportTeam as Customer Support Team
        note right of CustomerSupportTeam: Building a knowledge base
        CustomerSupportTeam-->>FileStorage: Prepare Q&A list
        participant IngestionPipeline as Ingestion Pipeline (Mage.ai)
        IngestionPipeline->>FileStorage: Retrieve Q&A list
        FileStorage->>IngestionPipeline: CSV (or PDF) files
        IngestionPipeline->>EmbeddingModel: Generate embeddings
        EmbeddingModel->>IngestionPipeline: Embeddings (vectors)
        IngestionPipeline->>KnowledgeDatabase: Index embeddings
    
  • The pipeline files are located in the mage/zoomcamp-smart-mail/smart-mail folder.

  • An overview of the pipeline steps is represented in the picture below:
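
A minimal sketch of the pipeline's embedding step, assuming the Q&A list is a CSV with question and answer columns (the file path is hypothetical):

import pandas as pd
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("distiluse-base-multilingual-cased-v1")
faqs = pd.read_csv("mage/data/faq.csv")  # hypothetical path
texts = (faqs["question"] + " " + faqs["answer"]).tolist()
faqs["embedding"] = list(model.encode(texts))
# The resulting vectors are then indexed into the knowledge database.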

How to Start the Solution

The solution can be launched in two ways:

  1. Execution with local LLM: The simplest way to test the solution components. It uses a local Ollama model and does not require access to external services or API keys. Note that the local LLM shows only average, or even poor, quality when generating German responses. To achieve better results, use the second option.

  2. Execution with online LLM: The recommended way to test the solution. It uses an AWS-hosted LLM (Amazon Bedrock) and requires an AWS account and API keys.

Execution with local LLM

This approach uses a local LLM and does not require access to external services or API keys. No extra setup is required; everything is included in the repository.

Run Components as Docker Containers

To start the solution, run the following command. It may take 30-40 minutes or more to download the required Docker images and initialize the system:

docker compose -f docker-compose.yml -f docker-compose.test.yml -p smart-mail up --build

Wait for the solution to initialize. Logs will be displayed in the terminal.

Start the Pipeline

To run the pre-configured ingestion pipeline, open the browser and navigate to:

http://localhost:6789/pipelines/ingestion_evdi/triggers

Click the Run@once button. The pipeline will take 5-10 minutes to complete.

[Screenshot: Start Pipeline]

Test the UI

Open UI Applications

To open the Email Client, visit http://localhost:8501/ and select the email_client.py option from the sidebar.

[Screenshot: Email Client Selection]

Please note that the first start can take some time while the sentence-transformers model is downloaded. This will be improved in the future.

Input Questions

Test the system by entering questions from the Question Examples section below.

[Screenshot: Input Question]

Question Examples
  1. Wie kann ich das Risiko einer Investition in Immobilien einschätzen? (How can I assess the risk of a real-estate investment?)
  2. Wer ist für die finale Projekteinschätzung verantwortlich? (Who is responsible for the final project assessment?)
  3. Wie lange dauert es, bis ich mein Geld zurückbekommen kann, wenn ich es brauche? (How long does it take until I can get my money back if I need it?)
  4. Welche Schritte muss ich unternehmen, um mein Geld zügig zurückzuerhalten, falls notwendig? (What steps must I take to get my money back quickly, if necessary?)

Review the Answers

To review answers, open http://localhost:8501/ and select customer_support_client.py from the sidebar. Input the Answer ID and click Read Answer.

[Screenshot: Read Answer]

Clean Up

After testing, clean up the Docker containers by running:

docker compose -p smart-mail down

Execution with online LLM

This is the recommended way to test the solution. It requires an AWS account and API keys.

Setup AWS Environment

  1. Create an AWS account if you don't have one: https://aws.amazon.com/
  2. Configure the AWS CLI: https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-quickstart.html. Name your AWS CLI profile private or change the environment variable AWS_CONFIGURATION_PROFILE_NAME in the file .env.dev.
  3. Follow the instructions of the article Getting started with Amazon Bedrock to configure access to AWS LLM models.

Start Dependencies

  1. Run the following command to start the required services from the root directory of the repository:
    docker compose -f docker-compose.yml -p smart-mail-online-llm up --build
  2. Wait for the solution to initialize. Logs will be displayed in the terminal.
  3. Execute the ingestion pipeline as described in the Start the Pipeline section.

Install Python packages

  1. Go to the directory smart_mail.
  2. Install required Python packages:
    pip install -U --user -r requirements.txt

Start the Solution

  1. Read the .env.dev file and export all the environment variables defined in it to the current shell session:
    export $(grep -v '^#' .env.dev | xargs)
  2. Run the following command to start the UI clients:
    export POSTGRES_HOST=localhost && streamlit run ./src/streamlit_runner.py ./src
  3. Open the Email Client and Customer Support Client as described in the Execution with local LLM section.

Clean Up

After testing, clean up the Docker containers by running:

docker compose -p smart-mail-online-llm down
