Releases: prrao87/db-hub-fastapi
0.5.0
What's Changed
Added code for Qdrant, a vector database built in Rust
Includes:
Key features
- Bulk-index both the data and associated vectors (sentence embeddings) using `sentence-transformers` into Qdrant so that we can perform similarity search on phrases (a sketch follows this list)
- Unlike keyword-based search, similarity search requires vectors that come from an NLP (typically transformer) model
- We use a pretrained model from `sentence-transformers`: `multi-qa-distilbert-cos-v1`. As per the docs, "This model was tuned for semantic search: Given a query/question, it can find relevant passages. It was trained on a large and diverse set of (question, answer) pairs."
- Unlike other cases, generating sentence embeddings on a large batch of text is quite slow on a CPU, so some code is provided to generate ONNX-optimized and quantized models (sketched after the performance notes below), so that we can both generate and index the vectors into the DB more rapidly without a GPU
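As a rough illustration of the bulk-indexing flow, here is a minimal sketch using the Qdrant Python client and `sentence-transformers`. The collection name, payload fields and sample document are illustrative assumptions, not the repo's exact code.

```python
# Hedged sketch: collection name, payload fields and the sample doc are
# illustrative assumptions, not the repo's actual schema.
from qdrant_client import QdrantClient
from qdrant_client.http import models
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("sentence-transformers/multi-qa-MiniLM-L6-cos-v1")
client = QdrantClient(host="localhost", port=6333)

docs = [{"id": 1, "description": "Aromas of ripe cherry, cedar and baking spice."}]
# Encode the text fields into sentence embeddings
vectors = model.encode([d["description"] for d in docs]).tolist()

client.recreate_collection(
    collection_name="wines",
    vectors_config=models.VectorParams(
        size=model.get_sentence_embedding_dimension(),
        distance=models.Distance.COSINE,
    ),
)
# Upsert both the payload (data) and its vector in one batch
client.upsert(
    collection_name="wines",
    points=models.Batch(
        ids=[d["id"] for d in docs],
        vectors=vectors,
        payloads=docs,
    ),
)
```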
Notes on ONNX performance
It looks like ONNX does utilize all available CPU cores when processing the text and generating the embeddings (measured on an AWS EC2 T2 Ubuntu instance with a single 4-core CPU).
On average, the entire wine reviews dataset of 129,971 reviews is vectorized and ingested into Qdrant in 34 minutes via the quantized ONNX model, as opposed to more than 1 hour for the regular `sbert` model downloaded from the `sentence-transformers` repo. The quantized ONNX model is also ~33% smaller in size than the original model.
- `sbert` model: Processes roughly 51 items/sec
- Quantized `onnxruntime` model: Processes roughly 92 items/sec
This amounts to a roughly 1.8x reduction in indexing time, with a ~26% smaller (quantized) model that loads and processes results faster. To verify that the embeddings from the quantized models are of similar quality, some example cosine similarities are shown below.
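For reference, a quantized model with the filename seen in the warning further below can be produced via Hugging Face Optimum. The following is a hedged sketch assuming a recent Optimum version; the model ID and output directory are illustrative, and the repo's actual script may differ.

```python
# Hedged sketch of ONNX export, graph optimization and dynamic quantization
# via Hugging Face Optimum; the model ID and paths are illustrative.
from optimum.onnxruntime import (
    ORTModelForFeatureExtraction,
    ORTOptimizer,
    ORTQuantizer,
)
from optimum.onnxruntime.configuration import AutoQuantizationConfig, OptimizationConfig

model_id = "sentence-transformers/multi-qa-MiniLM-L6-cos-v1"
save_dir = "onnx_model"

# Export the PyTorch model to ONNX
model = ORTModelForFeatureExtraction.from_pretrained(model_id, export=True)

# Apply graph optimizations; this writes model_optimized.onnx
optimizer = ORTOptimizer.from_pretrained(model)
optimizer.optimize(
    save_dir=save_dir,
    optimization_config=OptimizationConfig(optimization_level=99),
)

# Dynamically quantize the optimized graph (int8 weights) for faster CPU
# inference; this writes model_optimized_quantized.onnx
quantizer = ORTQuantizer.from_pretrained(save_dir, file_name="model_optimized.onnx")
quantizer.quantize(
    save_dir=save_dir,
    quantization_config=AutoQuantizationConfig.avx512_vnni(is_static=False, per_channel=False),
)
```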
Example results:
The following results are for the `sentence-transformers/multi-qa-MiniLM-L6-cos-v1` model that was built for semantic similarity tasks.
Vanilla model
```
---
Loading vanilla sentence transformer model
---
Similarity between 'I'm very happy' and 'I am so glad': [0.74601071]
Similarity between 'I'm very happy' and 'I'm so sad': [0.6456476]
Similarity between 'I'm very happy' and 'My dog is missing': [0.09541589]
Similarity between 'I'm very happy' and 'The universe is so vast!': [0.27607652]
```
Quantized ONNX model
```
---
Loading quantized ONNX model
---
The ONNX file model_optimized_quantized.onnx is not a regular name used in optimum.onnxruntime, the ORTModel might not behave as expected.
Similarity between 'I'm very happy' and 'I am so glad': [0.74153285]
Similarity between 'I'm very happy' and 'I'm so sad': [0.65299551]
Similarity between 'I'm very happy' and 'My dog is missing': [0.09312761]
Similarity between 'I'm very happy' and 'The universe is so vast!': [0.26112114]
```
As can be seen, the similarity scores are very close to those of the vanilla model, but the model is ~26% smaller and we are able to process the sentences much faster on the same CPU.
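A comparison like the one above can be reproduced with a short script along these lines (a minimal sketch using `sentence-transformers`; the repo's actual benchmarking script may differ):

```python
# Hedged sketch: reproduces the style of comparison above; not necessarily
# the repo's exact script.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("sentence-transformers/multi-qa-MiniLM-L6-cos-v1")

query = "I'm very happy"
candidates = ["I am so glad", "I'm so sad", "My dog is missing", "The universe is so vast!"]

# Encode both sides and compare via cosine similarity
query_emb = model.encode(query, convert_to_tensor=True)
cand_embs = model.encode(candidates, convert_to_tensor=True)
scores = util.cos_sim(query_emb, cand_embs)[0].tolist()

for sent, score in zip(candidates, scores):
    print(f"Similarity between '{query}' and '{sent}': [{score:.8f}]")
```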
0.4.3
In this release
- `srsly` is a fast and lightweight JSON serialization library from Explosion
- It eliminates a lot of boilerplate for util functions that read/write compressed JSONL files (in gzip format); a usage sketch follows this list
- Using this library, each bulk indexing script is very simple, doesn't add much overhead to the `pip install` time, and reduces the number of lines of code quite significantly
- The code bases for Elasticsearch, Meilisearch and Neo4j have all been updated to use `srsly` to read gzipped JSONL
- For future DBs, the same approach will be used to keep things clean and readable
- For Meilisearch, the settings specification is moved over to a `settings.json` file to keep things clean and easy to find, all in one place
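A minimal sketch of the `srsly` usage described above; the file path and records are illustrative, and this assumes a `srsly` version that ships the gzipped-JSONL helpers:

```python
# Hedged sketch: path and records are illustrative; assumes srsly provides
# read_gzip_jsonl / write_gzip_jsonl (available in recent versions).
import srsly

records = [
    {"title": "A dry Riesling", "points": 90},
    {"title": "A bold Malbec", "points": 92},
]

# Write and read gzip-compressed JSONL without hand-rolled gzip/json boilerplate
srsly.write_gzip_jsonl("wines.jsonl.gz", records)
data = list(srsly.read_gzip_jsonl("wines.jsonl.gz"))
```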
0.4.2
Enhancements
This release contains updates and enhancements from #15 and #16.
#15 results in a ~4x reduction in indexing time for Meilisearch. The key changes are as follows:
- It's possible to process files concurrently (using the process pool executor from `concurrent.futures`), avoiding sequential execution
- The process pool is then attached to the running event loop, allowing non-blocking execution of each executor as it performs tasks like reading JSON data and validating it in Pydantic (see the sketch after this list)
- `aiofiles` was also tried to process files in an async fashion, but the bottleneck seems to be the validation in Pydantic, not file I/O
- It will be interesting to see how Pydantic 2 compares with this approach in the future!
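A minimal sketch of the pattern described above; the `Wine` schema and file layout are hypothetical, not the repo's exact code:

```python
# Hedged sketch: the Wine schema and file layout are hypothetical.
import asyncio
import concurrent.futures
from pathlib import Path

import srsly
from pydantic import BaseModel


class Wine(BaseModel):
    # Hypothetical minimal schema for illustration
    title: str
    points: int


def load_and_validate(path: Path) -> list[dict]:
    """Read a gzipped JSONL file and validate each record in Pydantic.

    This is CPU-bound, so it runs in a worker process rather than
    blocking the event loop.
    """
    return [Wine(**record).dict() for record in srsly.read_gzip_jsonl(path)]


async def main(files: list[Path]) -> None:
    loop = asyncio.get_running_loop()
    with concurrent.futures.ProcessPoolExecutor() as pool:
        # Attach the process pool to the running event loop and gather results,
        # so files are processed concurrently rather than sequentially
        tasks = [loop.run_in_executor(pool, load_and_validate, f) for f in files]
        chunks = await asyncio.gather(*tasks)
    # ... bulk-index `chunks` into Meilisearch here


if __name__ == "__main__":
    asyncio.run(main(sorted(Path("data").glob("*.jsonl.gz"))))
```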
0.4.1
Improvements to Meilisearch section
- #11 resolves an issue where files not being found caused the script to fail
- #12 improves indexing performance by gathering async tasks first (rather than processing them in a blocking manner)
- #13 cleans up the comments and docs and fixes a problem with the Docker container not firing up when the minor version is missing
0.4.0
What's in this release
#8 adds Meilisearch, a fast and responsive search engine database written in Rust. As with the other databases in this repo, the async Python client is used to bulk-index the dataset into the DB, and async queries are used in FastAPI. The following tasks are implemented:
- Set up Meilisearch DB instance via Docker Compose and include `.env.example`
- Add async bulk indexing script
- Include schema checks
- Add methods to set searchable, sortable and filterable fields (see the sketch after this list)
- Add API code for querying db
- Add docs describing Meilisearch and some of its limitations compared to other dbs
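A hedged sketch of what setting those fields can look like with an async Meilisearch Python client; the client package, URL, key and field names are assumptions, not necessarily the repo's exact code:

```python
# Hedged sketch: package, URL, key and field names are assumptions.
import asyncio

from meilisearch_python_async import Client


async def main() -> None:
    async with Client("http://localhost:7700", "masterKey") as client:
        index = client.index("wines")
        # Restrict which fields are searchable, sortable and filterable
        await index.update_searchable_attributes(["title", "description", "variety"])
        await index.update_sortable_attributes(["points", "price"])
        await index.update_filterable_attributes(["country", "province", "variety"])


asyncio.run(main())
```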
0.3.0
What's in this update
Includes updates from #5 and #6.
Elasticsearch
This release introduces Elasticsearch indexing and API code to the repo.
- Include Docker files to set up a basic-license (free) Elasticsearch database
- Create a `wines` alias and its associated index in Elastic
- Bulk-index the wines dataset into Elastic (a sketch of both steps follows this list)
- Test queries in Kibana
- Build FastAPI application to query results from Elastic via JSON queries sent to the backend
- Test out sample queries via OpenAPI browser
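A minimal sketch of the alias-plus-bulk-index pattern with the async Elasticsearch client; the URL, versioned index name and document shape are illustrative assumptions:

```python
# Hedged sketch: URL, versioned index name and document shape are assumptions.
import asyncio

from elasticsearch import AsyncElasticsearch
from elasticsearch.helpers import async_bulk


async def main(docs: list[dict]) -> None:
    client = AsyncElasticsearch("http://localhost:9200")
    # Create a concrete index and point the `wines` alias at it
    if not await client.indices.exists(index="wines_v1"):
        await client.indices.create(index="wines_v1")
    await client.indices.put_alias(index="wines_v1", name="wines")
    # Bulk-index documents through the alias
    await async_bulk(client, ({"_index": "wines", "_source": doc} for doc in docs))
    await client.close()


asyncio.run(main([{"title": "A dry Riesling", "points": 90}]))
```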
Neo4j
- Minor fixes to docs: typos and clarity
- Fix type hints in schema and API
- Set the Docker container tag as the API version for simplicity (every time the FastAPI container tag changes, the API version in the docs follows suit with the same number)
- Fix issues with type hints in API routers
- Neo4j queries return vanilla dicts, and for some reason, FastAPI + Pydantic don't parse these prior to sending them as a response (this isn't an issue in Elastic)
- Update README example cURL request and docs
- Fix linting issues
0.2.1
Refactor Neo4j data loader schema and queries
This release is for #4.
- There's no need to complicate things by converting the existing data to a nested dict -- keeping the original dict from the raw data makes sense from a query-building perspective
- Running each portion separately (nodes first, then edges) is also unnecessary -- a single build query with WITH and MERGE statements does the job (sketched below)
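A hedged sketch of what such a single build query can look like with the async Neo4j driver; the labels, property names and relationship are illustrative, not the repo's exact schema:

```python
# Hedged sketch: labels, property names and the relationship are assumptions.
import asyncio

from neo4j import AsyncGraphDatabase

BUILD_QUERY = """
UNWIND $data AS record
MERGE (wine:Wine {id: record.id})
  SET wine.title = record.title
WITH wine, record
MERGE (country:Country {name: record.country})
MERGE (wine)-[:IS_FROM_COUNTRY]->(country)
"""


async def main(data: list[dict]) -> None:
    driver = AsyncGraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))
    async with driver.session() as session:
        # Nodes and edges are merged in one query rather than in separate passes
        await session.run(BUILD_QUERY, data=data)
    await driver.close()


asyncio.run(main([{"id": 1, "title": "A dry Riesling", "country": "Germany"}]))
```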