add illustrative doc for reranking (not expected to execute)
mattf committed Apr 12, 2024
1 parent f50e1f2 commit c5abecb
libs/ai-endpoints/docs/retrievers/nvidia_rerank.ipynb
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# NVIDIA NeMo Retriever Reranking\n",
"\n",
"Reranking is a critical component of high-accuracy, efficient retrieval pipelines.\n",
"\n",
"Two important use cases:\n",
"- Combining results from multiple data sources\n",
"- Enhancing accuracy for single data sources"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Combining results from multiple sources\n",
"\n",
"Consider a pipeline with data from a semantic store, such as FAISS, as well as a BM25 store.\n",
"\n",
"Each store is queried independently and returns results that it considers highly relevant. Determining the overall relevance of the combined results is where reranking comes into play."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We will search for information about the query `What is the meaning of life?` across a BM25 store and semantic store."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"vscode": {
"languageId": "plaintext"
}
},
"outputs": [],
"source": [
"query = \"What is the meaning of life?\""
]
},
{
"cell_type": "markdown",
"metadata": {
"vscode": {
"languageId": "plaintext"
}
},
"source": [
"### BM25 relevant documents\n",
"\n",
"Below we assume you have Elasticsearch running with documents stored in an index named `langchain-index`."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"vscode": {
"languageId": "plaintext"
}
},
"outputs": [],
"source": [
"%pip install --upgrade --quiet langchain-community elasticsearch"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"vscode": {
"languageId": "plaintext"
}
},
"outputs": [],
"source": [
"import elasticsearch\n",
"from langchain_community.retrievers import ElasticSearchBM25Retriever\n",
"\n",
"bm25_retriever = ElasticSearchBM25Retriever(\n",
" elasticsearch.Elasticsearch(\"http://localhost:9200\"),\n",
" \"langchain-index\"\n",
")"
]
},
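{
"cell_type": "markdown",
"metadata": {},
"source": [
"If your `langchain-index` index does not exist yet, you can index a few sample texts first. This is an illustrative sketch; the texts are placeholders."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"vscode": {
"languageId": "plaintext"
}
},
"outputs": [],
"source": [
"# index a few placeholder texts so the BM25 query below returns results\n",
"bm25_retriever.add_texts(\n",
"    [\n",
"        \"The meaning of life is a long-standing philosophical question.\",\n",
"        \"Answers to the meaning of life vary across cultures and traditions.\",\n",
"    ]\n",
")"
]
},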
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"vscode": {
"languageId": "plaintext"
}
},
"outputs": [],
"source": [
"bm25_docs = bm25_retriever.get_relevant_documents(query)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Semantic documents\n",
"\n",
"Below we assume you have a saved FAISS index."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"vscode": {
"languageId": "plaintext"
}
},
"outputs": [],
"source": [
"%pip install --upgrade --quiet langchain-community langchain-nvidia-ai-endpoints faiss-gpu"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"vscode": {
"languageId": "plaintext"
}
},
"outputs": [],
"source": [
"from langchain_community.vectorstores import FAISS\n",
"from langchain_nvidia_ai_endpoints import NVIDIAEmbeddings\n",
"\n",
"embeddings = NVIDIAEmbeddings()\n",
"\n",
"sem_retriever = FAISS.load_local(\"langchain_index\", embeddings).as_retriever()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"vscode": {
"languageId": "plaintext"
}
},
"outputs": [],
"source": [
"sem_docs = sem_retriever.get_relevant_documents(query)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Combine and rank documents\n",
"\n",
"The resulting `docs` will be ordered by their relevance to the query."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"vscode": {
"languageId": "plaintext"
}
},
"outputs": [],
"source": [
"from langchain_nvidia_ai_endpoints import NVIDIARerank\n",
"\n",
"ranker = NVIDIARerank()\n",
"\n",
"all_docs = bm25_docs + sem_docs\n",
"\n",
"docs = ranker.compress_documents(query=query, documents=all_docs)"
]
},
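{
"cell_type": "markdown",
"metadata": {},
"source": [
"To see how the reranker ordered the combined results, you can inspect each document's metadata. This sketch assumes the reranker records its score under a `relevance_score` metadata key; adjust the key if your version differs."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"vscode": {
"languageId": "plaintext"
}
},
"outputs": [],
"source": [
"for doc in docs:\n",
"    # relevance_score is assumed to be written by the reranker\n",
"    print(doc.metadata.get(\"relevance_score\"), doc.page_content[:80])"
]
},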
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Enhancing accuracy for single data sources\n",
"\n",
"Semantic search with vector embeddings is an efficient way to narrow a large corpus of documents to a smaller set of candidates, trading some accuracy for efficiency. Reranking adds accuracy back by post-processing that smaller candidate set, since reranking the full corpus is typically too slow for interactive applications."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"vscode": {
"languageId": "plaintext"
}
},
"outputs": [],
"source": [
"%pip install --upgrade --quiet langchain langchain-nvidia-ai-endpoints pgvector psycopg2-binary"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Below we assume you have PostgreSQL running with documents stored in a collection named `langchain-index`.\n",
"\n",
"We will first retrieve the 1,000 most similar documents with a vector search and then narrow them to the 10 most relevant with the reranker."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"vscode": {
"languageId": "plaintext"
}
},
"outputs": [],
"source": [
"from langchain_nvidia_ai_endpoints import NVIDIAEmbeddings, NVIDIARerank\n",
"from langchain.vectorstores.pgvector import PGVector\n",
"\n",
"ranker = NVIDIARerank(top_n=10)\n",
"embeddings = NVIDIAEmbeddings()\n",
"\n",
"store = PGVector(embedding_function=embeddings,\n",
"                 collection_name=\"langchain-index\",\n",
"                 connection_string=\"postgresql+psycopg2://postgres@localhost:5432/vector_db\")\n",
"\n",
"subset_docs = store.similarity_search(query, k=1_000)\n",
"\n",
"docs = ranker.compress_documents(query=query, documents=subset_docs)"
]
}
],
"metadata": {
"language_info": {
"name": "python"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
