Merge pull request #17 from prrao87/qdrant
Qdrant: A vector database built on Rust
Showing 19 changed files with 942 additions and 2 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -134,4 +134,5 @@ dmypy.json | |
# data
data/*.json
data/*.jsonl
*/*/meili_data
dbs/qdrant/scripts/onnx_models
@@ -0,0 +1,12 @@ | ||
QDRANT_VERSION = "v1.1.1"
QDRANT_PORT = 6333
QDRANT_HOST = "localhost"
QDRANT_SERVICE = "qdrant"
API_PORT = 8005
EMBEDDING_MODEL_CHECKPOINT = "sentence-transformers/multi-qa-MiniLM-L6-cos-v1"

# Container image tag
TAG = "0.1.0"

# Docker project namespace (defaults to the current folder name if not set)
COMPOSE_PROJECT_NAME = qdrant_wine
@@ -0,0 +1,14 @@ | ||
FROM python:3.10-slim-bullseye

WORKDIR /wine

COPY ./requirements-docker.txt /wine/requirements-docker.txt

RUN pip install --no-cache-dir -U pip wheel setuptools
RUN pip install --no-cache-dir -r /wine/requirements-docker.txt

COPY ./api /wine/api
COPY ./schemas /wine/schemas
COPY ./scripts/onnx_models /wine/scripts/onnx_models

EXPOSE 8000
@@ -0,0 +1,158 @@ | ||
# Qdrant | ||
|
||
[Qdrant](https://qdrant.tech/) is a vector database and vector similarity search engine written in Rust. The primary use case for a vector database is to answer questions by semantic similarity: a query is converted to a vector and compared against the stored embeddings, optionally combined with filters on the stored payload.

* Which wines best match the search terms "tuscany red" and cost at most 50?
* Which wines have a title, variety and description closest in meaning to a free-text query?
|
||
Code is provided for ingesting the wine reviews dataset into Qdrant in an async fashion. In addition, a query API written in FastAPI is also provided that allows a user to query available endpoints. As always in FastAPI, documentation is available via OpenAPI (http://localhost:8000/docs). | ||
|
||
* All code (wherever possible) is async | ||
* [Pydantic](https://docs.pydantic.dev) is used for schema validation, both prior to data ingestion and during API request handling | ||
* The same schema is used for data ingestion and for the API, so there is only one source of truth regarding how the data is handled | ||
* For ease of reproducibility, the whole setup is orchestrated and deployed via docker | ||
|
||
## Setup | ||
|
||
Note that this code base has been tested in Python 3.10, and requires a minimum of Python 3.10 to work. Install dependencies via `requirements.txt`. | ||
|
||
```sh | ||
# Set up the environment for the first time
python -m venv qdrant_venv # python -> python 3.10 | ||
|
||
# Activate the environment (for subsequent runs) | ||
source qdrant_venv/bin/activate | ||
|
||
python -m pip install -r requirements.txt | ||
``` | ||
|
||
--- | ||
|
||
## Step 1: Set up containers | ||
|
||
Use the provided `docker-compose.yml` to initiate separate containers, one that runs Qdrant, and another one that serves as an API on top of the database. | ||
|
||
``` | ||
docker compose up -d | ||
``` | ||
|
||
This compose file starts a Qdrant database backed by a persistent volume, with its version, host and port specified in `.env`. The `QDRANT_SERVICE` variable in the environment file names the database service so that the FastAPI server (running as a separate service, in a separate container) can reach it downstream. Both containers communicate with one another over the network they share, on the exact port numbers specified.
|
||
The services can be stopped at any time for maintenance and updates. | ||
|
||
``` | ||
docker compose down | ||
``` | ||
|
||
**Note:** The setup shown here is not suitable for production as-is, since it does not address security and scalability concerns that go beyond a simple Docker deployment. It is, however, a good starting point for experimenting!
|
||
|
||
## Step 2: Ingest the data
|
||
Because Qdrant is a vector database, we ingest not only the wine reviews JSON blobs for each item, but also vectors (i.e., sentence embeddings) for the fields on which we want to perform a semantic similarity search. For this dataset, it's reasonable to expect that a simple concatenation of fields like `title`, `variety` and `description` would result in a useful sentence embedding that can be compared against a search query (which is also converted to a vector during query time). | ||
|
||
As an example, consider the following data snippet from the `data/` directory in this repo:
|
||
```json | ||
"title": "Castello San Donato in Perano 2009 Riserva (Chianti Classico)", | ||
"description": "Made from a blend of 85% Sangiovese and 15% Merlot, this ripe wine delivers soft plum, black currants, clove and cracked pepper sensations accented with coffee and espresso notes. A backbone of firm tannins give structure. Drink now through 2019.", | ||
"variety": "Red Blend" | ||
``` | ||
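The concatenation described above can be sketched roughly as follows. This is a minimal illustration, not the repo's actual loader code: the helper name `build_embedding_text`, the field order, and the single-space separator are assumptions, and the description is shortened for brevity.

```python
def build_embedding_text(record: dict) -> str:
    """Join the fields used for the sentence embedding into one string.

    Hypothetical helper: the field order and separator are assumptions.
    """
    fields = ("title", "variety", "description")
    # Skip missing or empty fields so no stray separators are emitted
    parts = [str(record[f]).strip() for f in fields if record.get(f)]
    return " ".join(parts)


wine = {
    "title": "Castello San Donato in Perano 2009 Riserva (Chianti Classico)",
    "description": "Made from a blend of 85% Sangiovese and 15% Merlot.",  # shortened
    "variety": "Red Blend",
}
text_to_embed = build_embedding_text(wine)
print(text_to_embed)
```

The resulting string is what would be passed to the embedding model, both when indexing a wine and (in the same spirit) when embedding a user's search terms.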
|
||
### Choice of embedding model | ||
|
||
[SentenceTransformers](https://www.sbert.net/) is a Python framework for a range of sentence and text embeddings. It results from extensive work on fine-tuning BERT to work well on semantic similarity tasks using Siamese BERT networks, where the model is trained to predict the similarity between sentence pairs. The original work is [described here](https://arxiv.org/abs/1908.10084). | ||
|
||
#### Why use sentence transformers? | ||
|
||
Although larger and more powerful text embedding models exist (such as [OpenAI embeddings](https://platform.openai.com/docs/guides/embeddings)), they charge per token of text they generate vectors for, so costs can grow quickly on large datasets. SentenceTransformers models are free and open-source, and have been optimized for years for performance (to utilize all CPU cores) as well as accuracy. A full list of sentence transformer models [is on their project page](https://www.sbert.net/docs/pretrained_models.html).
|
||
For this work, it makes sense to use one of the fastest models in this list, the `multi-qa-MiniLM-L6-cos-v1` **uncased** model. As the name suggests, it was tuned for semantic search and question answering, and generates sentence embeddings for single sentences or paragraphs up to a maximum sequence length of 512 tokens. It was trained on 215M question-answer pairs from various sources. Compared to the more general-purpose `all-MiniLM-L6-v2` model, it shows slightly improved performance on semantic search tasks while offering a similar level of speed. [See the sbert docs](https://www.sbert.net/docs/pretrained_models.html) for more details on performance comparisons between the various pretrained models.
|
||
|
||
### Run data loader | ||
|
||
Data is ingested into the Qdrant database through the scripts in the `scripts` directory.
|
||
```sh | ||
cd scripts | ||
python bulk_index_sbert.py | ||
``` | ||
|
||
This script validates the input JSON data via [Pydantic](https://docs.pydantic.dev), and then indexes them to Qdrant using the [Qdrant Python client](https://github.com/qdrant/qdrant-client). | ||
|
||
We simply concatenate the key fields that contain useful information about each wine, and vectorize them prior to indexing them to the database. | ||
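When indexing a large dataset, records are typically embedded and sent to the database in batches rather than one at a time. The sketch below shows one way to split validated records into fixed-size batches. The helper name `chunked` and the batch size are assumptions for illustration; the actual `bulk_index_sbert.py` script may organize this differently.

```python
from collections.abc import Iterable, Iterator
from itertools import islice


def chunked(records: Iterable[dict], size: int) -> Iterator[list[dict]]:
    """Yield successive batches of at most `size` records (hypothetical helper)."""
    if size < 1:
        raise ValueError("size must be a positive integer")
    iterator = iter(records)
    # islice pulls up to `size` items at a time; an empty batch ends the loop
    while batch := list(islice(iterator, size)):
        yield batch


records = [{"id": i} for i in range(7)]
batches = list(chunked(records, size=3))
print([len(b) for b in batches])  # [3, 3, 1]
```

Each batch could then be vectorized in one model call and indexed in one request, which usually keeps memory usage predictable.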
|
||
|
||
## Step 3: Test API | ||
|
||
Once the data has been successfully loaded into Qdrant and the containers are up and running, we can test out a search query via an HTTP request as follows. | ||
|
||
```sh | ||
curl -X 'GET' \ | ||
'http://localhost:8000/wine/search?terms=tuscany%20red&max_price=50' | ||
``` | ||
|
||
This cURL request passes the search terms "**tuscany red**" and a maximum price of 50 to the `/wine/search` endpoint. The FastAPI backend converts the search terms into a sentence embedding, and Qdrant returns the wines whose stored vectors are closest to it, restricted to wines priced at or below `max_price`. The matches are then sorted by their `points` rating. If the setup was done correctly, we should see a response similar to the following (the exact wines and their order may differ):
|
||
```json | ||
[ | ||
{ | ||
"wineID": 66393, | ||
"country": "Italy", | ||
"title": "Capezzana 1999 Ghiaie Della Furba Red (Tuscany)", | ||
"description": "Very much a baby, this is one big, bold, burly Cab-Merlot-Syrah blend that's filled to the brim with extracted plum fruit, bitter chocolate and earth. It takes a long time in the glass for it to lose its youthful, funky aromatics, and on the palate things are still a bit scattered. But in due time things will settle and integrate", | ||
"points": 90, | ||
"price": 49, | ||
"variety": "Red Blend", | ||
"winery": "Capezzana" | ||
}, | ||
{ | ||
"wineID": 40960, | ||
"country": "Italy", | ||
"title": "Fattoria di Grignano 2011 Pietramaggio Red (Toscana)", | ||
"description": "Here's a simple but well made red from Tuscany that has floral aromas of violet and rose with berry notes. The palate offers bright cherry, red currant and a touch of spice. Pair this with pasta dishes or grilled vegetables.", | ||
"points": 86, | ||
"price": 11, | ||
"variety": "Red Blend", | ||
"winery": "Fattoria di Grignano" | ||
}, | ||
{ | ||
"wineID": 73595, | ||
"country": "Italy", | ||
"title": "I Giusti e Zanza 2011 Belcore Red (Toscana)", | ||
"description": "With aromas of violet, tilled soil and red berries, this blend of Sangiovese and Merlot recalls sunny Tuscany. It's loaded with wild cherry flavors accented by white pepper, cinnamon and vanilla. The palate is uplifted by vibrant acidity and fine tannins.", | ||
"points": 89, | ||
"price": 27, | ||
"variety": "Red Blend", | ||
"winery": "I Giusti e Zanza" | ||
} | ||
] | ||
``` | ||
|
||
Not bad! This example correctly returns some highly rated Tuscan red wines along with their price and country of origin (obviously, Italy in this case). | ||
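To make the query flow more concrete, here is a toy, pure-Python sketch of the three steps involved: apply the price filter, rank by vector similarity, and sort the closest hits by points. The vectors, payloads and function names are made up for illustration, and the brute-force scan is only a stand-in: Qdrant itself uses an approximate nearest-neighbour index rather than comparing against every vector.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(query_vec: list[float], items: list[dict], max_price: float, limit: int = 5) -> list[dict]:
    # 1. Keep only items within the price limit (the range filter)
    candidates = [it for it in items if it["payload"]["price"] <= max_price]
    # 2. Rank by similarity to the query and keep the closest `limit` items
    candidates.sort(key=lambda it: cosine(query_vec, it["vector"]), reverse=True)
    top = candidates[:limit]
    # 3. Return payloads only, sorted by points (highest first)
    return sorted((it["payload"] for it in top), key=lambda p: p["points"], reverse=True)


# Toy 2-dimensional "embeddings"; real sentence embeddings have hundreds of dimensions
items = [
    {"vector": [1.0, 0.0], "payload": {"title": "A", "price": 49, "points": 88}},
    {"vector": [0.9, 0.1], "payload": {"title": "B", "price": 11, "points": 91}},
    {"vector": [0.0, 1.0], "payload": {"title": "C", "price": 27, "points": 95}},
    {"vector": [1.0, 0.0], "payload": {"title": "D", "price": 80, "points": 99}},
]
results = search([1.0, 0.0], items, max_price=50, limit=2)
print([r["title"] for r in results])  # ['B', 'A']
```

Note that "D" is excluded by the price filter even though it is a perfect match, and "C" is dropped because it is not among the two closest vectors, regardless of its high rating.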
|
||
## Step 4: Extend the API
|
||
The API can be easily extended with the provided structure. | ||
|
||
- The `schemas` directory houses the Pydantic schemas, both for the data input as well as for the endpoint outputs
  - As the data model gets more complex, we can add more files and separate the ingestion logic from the API logic here
- The `api/routers` directory contains the endpoint routes so that we can provide additional endpoints that answer more business questions
  - For example: "What are the top rated wines from Argentina?"
  - In general, it makes sense to organize specific business use cases into their own router files
- The `api/main.py` file collects all the routes and schemas to run the API
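As a rough illustration of the kind of logic a new router file might wrap, the sketch below answers the "top rated wines from Argentina" question over a list of payloads. Everything here is hypothetical: the function name, the sample records, and the in-memory filtering are assumptions. In a real endpoint the country condition would more likely be expressed as a Qdrant payload filter, with the route registered in its own file under `api/routers`.

```python
def top_rated_by_country(payloads: list[dict], country: str, limit: int = 5) -> list[dict]:
    """Return the highest-rated wines for a country (hypothetical helper)."""
    wanted = country.lower()
    # Case-insensitive match; records with a missing country are skipped
    matches = [p for p in payloads if (p.get("country") or "").lower() == wanted]
    return sorted(matches, key=lambda p: p.get("points", 0), reverse=True)[:limit]


# Made-up sample payloads for illustration only
wines = [
    {"title": "W1", "country": "Argentina", "points": 92},
    {"title": "W2", "country": "Italy", "points": 95},
    {"title": "W3", "country": "Argentina", "points": 88},
    {"title": "W4", "country": "argentina", "points": 94},
]
best = top_rated_by_country(wines, "Argentina", limit=2)
print([w["title"] for w in best])  # ['W4', 'W1']
```

Keeping such logic in a small helper, separate from the route function, mirrors the route/helper split already used in the existing search router.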
|
||
|
||
#### Existing endpoints | ||
|
||
So far, the following endpoints that help answer interesting questions have been implemented. | ||
|
||
``` | ||
GET | ||
/wine/search | ||
Semantic similarity search | ||
``` | ||
|
||
More to come soon! | ||
|
Empty file.
@@ -0,0 +1,14 @@ | ||
from pydantic import BaseSettings


class Settings(BaseSettings):
    # Values are read from environment variables or the `.env` file
    qdrant_service: str
    qdrant_port: str
    qdrant_host: str
    api_port: str
    embedding_model_checkpoint: str
    tag: str

    class Config:
        env_file = ".env"
@@ -0,0 +1,52 @@ | ||
from collections.abc import AsyncGenerator
from contextlib import asynccontextmanager
from functools import lru_cache

from fastapi import FastAPI
from qdrant_client import QdrantClient

from api.config import Settings
from api.routers.wine import wine_router

from scripts.onnx_optimizer import get_embedding_pipeline


@lru_cache()
def get_settings():
    # Use lru_cache to avoid loading .env file for every request
    return Settings()


@asynccontextmanager
async def lifespan(app: FastAPI) -> AsyncGenerator[None, None]:
    """Async context manager for Qdrant database connection."""
    settings = get_settings()
    model_checkpoint = settings.embedding_model_checkpoint
    app.model = get_embedding_pipeline(
        "scripts/onnx_models", model_filename="model_optimized_quantized.onnx"
    )
    app.client = QdrantClient(host=settings.qdrant_service, port=settings.qdrant_port)
    print("Successfully connected to Qdrant")
    yield
    print("Successfully closed Qdrant connection and released resources")


app = FastAPI(
    title="REST API for wine reviews on Qdrant",
    description=(
        "Query from a Qdrant database of 130k wine reviews from the Wine Enthusiast magazine"
    ),
    version=get_settings().tag,
    lifespan=lifespan,
)


@app.get("/", include_in_schema=False)
async def root():
    return {
        "message": "REST API for querying Qdrant database of 130k wine reviews from the Wine Enthusiast magazine"
    }


# Attach routes
app.include_router(wine_router, prefix="/wine", tags=["wine"])
@@ -0,0 +1,72 @@ | ||
from qdrant_client import QdrantClient
from qdrant_client.http import models
from fastapi import APIRouter, HTTPException, Query, Request
from optimum.pipelines import pipeline

from schemas.retriever import SimilaritySearch

wine_router = APIRouter()


# --- Routes ---


@wine_router.get(
    "/search",
    response_model=list[SimilaritySearch],
    response_description="Search wines by title, description and variety",
)
def search_by_keywords(
    request: Request,
    terms: str = Query(description="Search wine by keywords in title, description and variety"),
    max_price: float = Query(
        default=10000.0, description="Specify the maximum price for the wine (e.g., 30)"
    ),
) -> list[SimilaritySearch] | None:
    model = request.app.model
    client = request.app.client
    collection = "wines"
    result = _search_by_keywords(client, model, collection, terms, max_price)
    if not result:
        raise HTTPException(
            status_code=404,
            detail=f"No wine with the provided terms '{terms}' found in database - please try again",
        )
    return result


# --- Helper functions ---


def _search_by_keywords(
    client: QdrantClient, model: pipeline, collection: str, terms: str, max_price: float
) -> list[SimilaritySearch] | None:
    """Convert input text query into a vector for lookup in the db"""
    vector = model(terms)[0][0]

    # Define a range filter for wine price (named to avoid shadowing the builtin `filter`)
    price_filter = models.Filter(
        must=[models.FieldCondition(key="price", range=models.Range(lte=max_price))]
    )

    # Use `vector` for similarity search on the closest vectors in the collection
    search_result = client.search(
        collection_name=collection, query_vector=vector, query_filter=price_filter, limit=5
    )
    # `search_result` contains found vector ids with similarity scores along with the stored payload
    # For now we are interested in payload only
    payloads = [hit.payload for hit in search_result]
    # Qdrant doesn't appear to have a sort option for fields other than similarity score, so we sort by points ourselves
    payloads = sorted(payloads, key=lambda x: x["points"], reverse=True)
    if not payloads:
        return None
    return payloads
@@ -0,0 +1,37 @@ | ||
version: "3"

services:
  qdrant:
    image: qdrant/qdrant:${QDRANT_VERSION}
    restart: unless-stopped
    environment:
      - QDRANT_HOST=${QDRANT_HOST}
    ports:
      - ${QDRANT_PORT}:6333
    volumes:
      - qdrant_storage:/qdrant/storage
    # networks:
    #   - wine

  # fastapi:
  #   image: qdrant_wine_fastapi:${TAG}
  #   build: .
  #   restart: unless-stopped
  #   env_file:
  #     - .env
  #   ports:
  #     - ${API_PORT}:8000
  #   depends_on:
  #     - qdrant
  #   volumes:
  #     - ./:/wine
  #   networks:
  #     - wine
  #   command: uvicorn api.main:app --host 0.0.0.0 --port 8000 --reload

volumes:
  qdrant_storage:

# networks:
#   wine:
#     driver: bridge