Simple, High Quality, RAG application using TiDB vector store

gurveervirk/TiChat

TiChat

This project is a simple, high-quality, ChatGPT-esque, extensible RAG application that uses AI models and indices to query documents and retrieve better-informed responses from the models. You can upload documents, which are then used to answer related queries. Chats are stored automatically for future use.

A sample online implementation of this project is available here.

Prerequisites

This app has a single dependency that needs to be installed separately:

  • Ollama, for quick and easy model download and serving, as well as automatic, smart device loading

This app uses the TiDB vector store. For a quick setup, head over to TiDB Cloud and sign up for a free account.

After generating a password, get the connection string for your database (default: 'test'). Also download the CA certificate to your system.

Getting Started

  1. Go to the Releases page, and download the latest TiChatInstaller.exe.
  2. Run it and follow the steps to complete the installation.
  3. Go to the installation directory (default: "C:\Program Files (x86)\TiChat") and make the following change to settings.json:
    • Set connectionString to the connection string for your TiDB Cloud account, with ssl_ca set to the full path of your downloaded CA certificate.

Done! You can now run the application.
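As a rough illustration of the settings.json change in step 3, the sketch below assembles a connection string carrying an ssl_ca parameter and places it under the connectionString key. The mysql+pymysql scheme, port 4000, the user, host and path values, and both helper names are assumptions made for illustration; the string shown in your TiDB Cloud console is the authoritative one.

```python
import json
from urllib.parse import quote, urlencode


def build_connection_string(user, password, host, ca_path, database="test", port=4000):
    """Assemble a SQLAlchemy-style URL (assumed format) with ssl_ca set."""
    # Percent-encode credentials so characters like '@' or ':' cannot break the URL.
    credentials = f"{quote(user, safe='')}:{quote(password, safe='')}"
    # ssl_ca carries the full path to the downloaded CA certificate.
    query = urlencode({"ssl_ca": ca_path})
    return f"mysql+pymysql://{credentials}@{host}:{port}/{database}?{query}"


def with_connection_string(settings, connection_string):
    """Return a copy of the settings dict with connectionString replaced."""
    updated = dict(settings)
    updated["connectionString"] = connection_string
    return updated


if __name__ == "__main__":
    conn = build_connection_string(
        user="demo.root",                      # hypothetical user
        password="p@ss:word",                  # hypothetical password
        host="gateway.example.tidbcloud.com",  # hypothetical host
        ca_path="C:/certs/ca.pem",             # hypothetical certificate path
    )
    print(json.dumps(with_connection_string({}, conn), indent=2))
```

Whether the app expects the ssl_ca path percent-encoded like this is not stated in this README, so treat the encoding as part of the assumption.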

Usage

Start the app from the desktop or Start menu after completing the above tasks.

With the 'Use Index Querying' checkbox left unchecked, you simply chat with the bot. With it checked, the app uses the index created from your uploaded documents for better-informed responses.

Upload documents using the top-right button.

This is what it should look like:

[Screenshot of the app]

Application Explained

Components:

  1. Ollama (Local LLM Runtime):
    • Hosts the local LLM model (Mistral-7B-Instruct-v0.3).
    • Runs inference locally for generating responses based on prompts.
  2. FastEmbedEmbedding (ONNX Model):
    • Runs locally on the CPU.
    • Generates vector embeddings from text data (e.g., document uploads).
    • Model: mixedbread-ai/mxbai-embed-large-v1.
  3. TiDB (Vector Store):
    • A distributed, scalable vector database that stores the embeddings.
    • Provides vector search capabilities.
  4. Llama-Index (Vector Index):
    • Interface layer between the application and TiDB.
    • Manages vector indexes and performs efficient retrieval for relevant documents.
  5. RAG Chatbot Application:
    • The main user interface where users interact with the chatbot.
    • Orchestrates the flow of data between different components.
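To make the vector store's role concrete, here is a toy, in-memory stand-in for what a vector search does: rank stored embeddings by similarity to a query embedding and return the closest texts. This is not TiDB's or Llama-Index's API. The class and function names are hypothetical, the three-dimensional vectors in the usage note are made up, and cosine similarity is an assumed metric.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


class ToyVectorStore:
    """Hypothetical in-memory store; a real deployment delegates this to TiDB."""

    def __init__(self):
        self._rows = []  # list of (text, embedding) pairs

    def add(self, text, embedding):
        self._rows.append((text, list(embedding)))

    def search(self, query_embedding, top_k=2):
        # Score every stored row against the query, then keep the best top_k texts.
        scored = [
            (cosine_similarity(query_embedding, embedding), text)
            for text, embedding in self._rows
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [text for _, text in scored[:top_k]]
```

For example, after adding "cats" as [1, 0, 0], "dogs" as [0.9, 0.1, 0] and "taxes" as [0, 0, 1], searching with [1, 0, 0] and top_k=2 ranks "cats" first and "dogs" second. A real vector database avoids this full scan by using an index.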

The frontend is built with React and Bootstrap 5.

Data Flow:

[Data flow diagram]

There are two main flows:

  1. If the user does not use the index, the app calls the LLM directly, like a normal chatbot app.
  2. If the user checks 'Use Index Querying', the following occurs:
    • User Input: The user inputs a query via the chatbot interface.
    • Embedding Generation: The query is passed to the FastEmbedEmbedding ONNX model to generate its vector embedding. The question, along with the previous messages, is condensed into a single question for a better response.
    • Vector Search: The generated embedding is sent to Llama-Index, which queries TiDB for relevant document embeddings. TiDB returns the most relevant document vectors.
    • Contextual Response Generation: The retrieved documents (in their original text form) are provided as context to the LLM in Ollama. The LLM generates a response based on the query and the retrieved documents.
    • Response Delivery: The generated response is displayed to the user through the chatbot interface.
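The two flows above can be sketched as a single dispatch function. Everything here is a hypothetical outline, not the project's actual code: the llm, embed and search callables stand in for Ollama, FastEmbedEmbedding and the Llama-Index/TiDB lookup, and the prompt wording is invented. The sketch condenses the question before embedding it, which is one plausible ordering of the steps described above.

```python
def answer(query, history, use_index, llm, embed=None, search=None, top_k=3):
    """Route a query through the plain-chat flow or the index-backed flow."""
    if not use_index:
        # Flow 1: call the LLM directly, like a normal chatbot.
        return llm(query)

    # Flow 2, step 1: condense prior messages and the new question
    # into one standalone question (skipped when there is no history).
    if history:
        transcript = "\n".join(history)
        question = llm(
            "Rewrite the final question as a standalone question.\n"
            f"Chat history:\n{transcript}\nFinal question: {query}"
        )
    else:
        question = query

    # Flow 2, step 2: embed the question and fetch the most relevant document texts.
    documents = search(embed(question), top_k)

    # Flow 2, step 3: pass the retrieved text to the LLM as context for the final answer.
    context = "\n---\n".join(documents)
    return llm(f"Context:\n{context}\n\nQuestion: {question}\nAnswer:")
```

Passing the three components in as plain callables keeps the control flow visible and lets each one be replaced with a stub when trying the function out.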