feat: Qdrant external retriever (neo4j#154)
* feat: Qdrant external retriever

* test: ci updates

* chore: add qdrant-client to dev deps

* chore: poetry.lock

* Update docs/source/api.rst

Co-authored-by: willtai <[email protected]>

* chore: fix mypy nit

* Update docs/source/user_guide_rag.rst

Co-authored-by: willtai <[email protected]>

---------

Co-authored-by: willtai <[email protected]>
Co-authored-by: willtai <[email protected]>
3 people authored Oct 10, 2024
1 parent 6a8cea6 commit eeedf49
Showing 21 changed files with 996 additions and 39 deletions.
4 changes: 4 additions & 0 deletions .github/workflows/pr-e2e-tests.yaml
@@ -45,6 +45,10 @@ jobs:
        ports:
          - 7687:7687
          - 7474:7474
      qdrant:
        image: qdrant/qdrant
        ports:
          - 6333:6333

    steps:
      - name: Install graphviz package
4 changes: 4 additions & 0 deletions .github/workflows/scheduled-e2e-tests.yaml
@@ -52,6 +52,10 @@ jobs:
        credentials:
          username: ${{ secrets.DOCKERHUB_USERNAME }}
          password: ${{ secrets.DOCKERHUB_TOKEN }}
      qdrant:
        image: qdrant/qdrant
        ports:
          - 6333:6333

    steps:
      - name: Install graphviz package
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -35,6 +35,7 @@
- Added support for Cohere LLM and embeddings - added optional dependency to `cohere`.
- Added support for Anthropic LLM - added optional dependency to `anthropic`.
- Added support for MistralAI LLM - added optional dependency to `mistralai`.
- Added support for Qdrant - added optional dependency to `qdrant-client`.

### Fixed
- Resolved import issue with the Vertex AI Embeddings class.
6 changes: 6 additions & 0 deletions docs/source/api.rst
@@ -163,6 +163,12 @@ PineconeNeo4jRetriever
.. autoclass:: neo4j_graphrag.retrievers.external.pinecone.pinecone.PineconeNeo4jRetriever
    :members: search

QdrantNeo4jRetriever
====================

.. autoclass:: neo4j_graphrag.retrievers.external.qdrant.qdrant.QdrantNeo4jRetriever
    :members: search


********
Embedder
31 changes: 31 additions & 0 deletions docs/source/user_guide_rag.rst
@@ -327,6 +327,8 @@ We provide implementations for the following retrievers:
     - Use this retriever when vectors are saved in a Weaviate vector database
   * - :ref:`PineconeNeo4jRetriever <pinecone-neo4j-retriever-user-guide>`
     - Use this retriever when vectors are saved in a Pinecone vector database
   * - :ref:`QdrantNeo4jRetriever <qdrant-neo4j-retriever-user-guide>`
     - Use this retriever when vectors are saved in a Qdrant vector database

All retrievers expose a `search` method, which is discussed in the following sections.

@@ -672,6 +674,35 @@ Pinecone Retrievers
Also see :ref:`pineconeneo4jretriever`.

.. _qdrant-neo4j-retriever-user-guide:

Qdrant Retrievers
-----------------

.. note::

    To import this retriever, the Qdrant Python client must be installed:
    ``pip install qdrant-client``


.. code:: python

    from qdrant_client import QdrantClient
    from neo4j_graphrag.retrievers import QdrantNeo4jRetriever

    client = QdrantClient(...)  # construct the Qdrant client instance

    retriever = QdrantNeo4jRetriever(
        driver=driver,
        client=client,
        collection_name="my-collection",
        id_property_external="neo4j_id",  # payload field that holds the identifier of the corresponding Neo4j node
        id_property_neo4j="id",  # Neo4j node property that this identifier refers to
        embedder=embedder,
    )

See :ref:`qdrantneo4jretriever`.


Other Retrievers
===================
31 changes: 31 additions & 0 deletions examples/qdrant/README.md
@@ -0,0 +1,31 @@
### Start services locally

Run the following command to spin up Neo4j and Qdrant containers.

```bash
docker compose -f tests/e2e/docker-compose.yml up
```
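
Before writing data, it can help to confirm that both containers accept connections. The following is a minimal sketch using only the Python standard library. `port_is_open` is a hypothetical helper that is not part of this repository, and the ports (7687 for Neo4j, 6333 for Qdrant) are taken from the workflow configuration and example scripts in this commit. It only checks that a TCP connection can be opened, not that the service has finished starting.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to (host, port) can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    # Ports assumed from this commit: 7687 (Neo4j) and 6333 (Qdrant).
    for name, port in (("neo4j", 7687), ("qdrant", 6333)):
        status = "reachable" if port_is_open("localhost", port) else "not reachable"
        print(f"{name} on port {port}: {status}")
```

If either service is reported as not reachable, wait for the containers to finish starting before running the populate script.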

### Write data (once)

Run this from the project root to write data to both Neo4j and Qdrant.

```bash
poetry run python tests/e2e/qdrant_e2e/populate_dbs.py
```

### Install Qdrant client

```bash
pip install qdrant-client
```

### Search

```bash
# search by vector
poetry run python -m examples.qdrant.vector_search

# search by text, with embeddings generated locally
poetry run python -m examples.qdrant.text_search
```
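
Both example scripts pass `id_property_external="neo4j_id"` and `id_property_neo4j="id"`. Per the user guide, the first names the Qdrant payload field that holds the identifier of the corresponding Neo4j node. The sketch below illustrates that matching idea with plain in-memory dictionaries. It is not the retriever's actual implementation: `match_hits_to_nodes`, the shape of the `hits` and `nodes` dictionaries, and the sample data are all hypothetical.

```python
from typing import Any


def match_hits_to_nodes(
    hits: list[dict[str, Any]],
    nodes: list[dict[str, Any]],
    id_property_external: str,
    id_property_neo4j: str,
) -> list[dict[str, Any]]:
    """Pair each vector-store hit with the graph node that shares its identifier."""
    # Index the nodes by the property that the payload identifier refers to.
    nodes_by_id = {node[id_property_neo4j]: node for node in nodes}
    matched = []
    for hit in hits:
        external_id = hit["payload"].get(id_property_external)
        node = nodes_by_id.get(external_id)
        if node is not None:  # hits without a matching node are skipped
            matched.append({"node": node, "score": hit["score"]})
    return matched


if __name__ == "__main__":
    # Hypothetical stand-ins for Qdrant search hits and Neo4j nodes.
    sample_hits = [
        {"payload": {"neo4j_id": "q1"}, "score": 0.92},
        {"payload": {"neo4j_id": "missing"}, "score": 0.40},
    ]
    sample_nodes = [{"id": "q1", "question": "What is a cell?"}]
    print(match_hits_to_nodes(sample_hits, sample_nodes, "neo4j_id", "id"))
```

In this sketch a hit whose identifier has no matching node is dropped. How the real retriever treats that case is not shown in this commit.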
Empty file added examples/qdrant/__init__.py
Empty file.
27 changes: 27 additions & 0 deletions examples/qdrant/text_search.py
@@ -0,0 +1,27 @@
from langchain_huggingface.embeddings import HuggingFaceEmbeddings
from neo4j import GraphDatabase
from neo4j_graphrag.retrievers import QdrantNeo4jRetriever
from qdrant_client import QdrantClient

NEO4J_URL = "neo4j://localhost:7687"
NEO4J_AUTH = ("neo4j", "password")


def main() -> None:
    with GraphDatabase.driver(NEO4J_URL, auth=NEO4J_AUTH) as neo4j_driver:
        embedder = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")
        retriever = QdrantNeo4jRetriever(
            driver=neo4j_driver,
            client=QdrantClient(url="http://localhost:6333"),
            collection_name="Jeopardy",
            id_property_external="neo4j_id",
            id_property_neo4j="id",
            embedder=embedder,  # type: ignore
        )

        res = retriever.search(query_text="biology", top_k=2)
        print(res)


if __name__ == "__main__":
    main()
25 changes: 25 additions & 0 deletions examples/qdrant/vector_search.py
@@ -0,0 +1,25 @@
from neo4j import GraphDatabase
from neo4j_graphrag.retrievers import QdrantNeo4jRetriever
from qdrant_client import QdrantClient

from examples.embedding_biology import EMBEDDING_BIOLOGY

NEO4J_URL = "neo4j://localhost:7687"
NEO4J_AUTH = ("neo4j", "password")


def main() -> None:
    with GraphDatabase.driver(NEO4J_URL, auth=NEO4J_AUTH) as neo4j_driver:
        retriever = QdrantNeo4jRetriever(
            driver=neo4j_driver,
            client=QdrantClient(url="http://localhost:6333"),
            collection_name="Jeopardy",
            id_property_external="neo4j_id",
            id_property_neo4j="id",
        )
        res = retriever.search(query_vector=EMBEDDING_BIOLOGY, top_k=2)
        print(res)


if __name__ == "__main__":
    main()
144 changes: 110 additions & 34 deletions poetry.lock
