add illustrative doc for reranking (not expected to execute)
Showing 1 changed file with 250 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,250 @@ | ||
{ | ||
"cells": [ | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"# NVIDIA NeMo Retriever Reranking\n", | ||
"\n", | ||
"Reranking is a critical piece of high accuracy, efficient retrieval pipelines.\n", | ||
"\n", | ||
"Two important use cases:\n", | ||
"- Combining results from multiple data sources\n", | ||
"- Enhancing accuracy for single data sources" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"## Combining results from multiple sources\n", | ||
"\n", | ||
"Consider a pipeline with data from a semantic store, such as FAISS, as well as a BM25 store.\n", | ||
"\n", | ||
"Each store is queried independently and returns results that the individual store considers to be highly relevant. Figuring out the overall relevance of the results is where reranking comes into play." | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"We will search for information about the query `What is the meaning of life?` across a BM25 store and semantic store." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": { | ||
"vscode": { | ||
"languageId": "plaintext" | ||
} | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"query = \"What is the meaning of life?\"" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": { | ||
"vscode": { | ||
"languageId": "plaintext" | ||
} | ||
}, | ||
"source": [ | ||
"### BM25 relevant documents\n", | ||
"\n", | ||
"Below we assume you have ElasticSearch running with documents stored in a `langchain-index` store." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": { | ||
"vscode": { | ||
"languageId": "plaintext" | ||
} | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"%pip install --upgrade --quiet langchain-community elasticsearch" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": { | ||
"vscode": { | ||
"languageId": "plaintext" | ||
} | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"import elasticsearch\n", | ||
"from langchain_community.retrievers import ElasticSearchBM25Retriever\n", | ||
"\n", | ||
"bm25_retriever = ElasticSearchBM25Retriever(\n", | ||
" elasticsearch.Elasticsearch(\"http://localhost:9200\"),\n", | ||
" \"langchain-index\"\n", | ||
")" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": { | ||
"vscode": { | ||
"languageId": "plaintext" | ||
} | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"bm25_docs = bm25_retriever.get_relevant_documents(query)" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"### Semantic documents\n", | ||
"\n", | ||
"Below we assume you have a saved FAISS index." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": { | ||
"vscode": { | ||
"languageId": "plaintext" | ||
} | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"%pip install --upgrade --quiet langchain-community langchain-nvidia-ai-endpoints faiss-gpu" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": { | ||
"vscode": { | ||
"languageId": "plaintext" | ||
} | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"from langchain_community.vectorstores import FAISS\n", | ||
"from langchain_nvidia_ai_endpoints import NVIDIAEmbeddings\n", | ||
"\n", | ||
"embeddings = NVIDIAEmbeddings()\n", | ||
"\n", | ||
"sem_retriever = FAISS.load_local(\"langchain_index\", embeddings).as_retriever()" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": { | ||
"vscode": { | ||
"languageId": "plaintext" | ||
} | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"sem_docs = sem_retriever.get_relevant_documents(query)" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"### Combine and rank documents\n", | ||
"\n", | ||
"The resulting `docs` will be ordered by their relevance to the query." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": { | ||
"vscode": { | ||
"languageId": "plaintext" | ||
} | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"from langchain_nvidia_ai_endpoints import NVIDIARerank\n", | ||
"\n", | ||
"ranker = NVIDIARerank()\n", | ||
"\n", | ||
"all_docs = bm25_docs + sem_docs\n", | ||
"\n", | ||
"docs = ranker.compress_documents(query=query, documents=all_docs)" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"## Enhancing accuracy for single data sources\n", | ||
"\n", | ||
"Semantic search with vector embeddings is an efficient way to turn a large corpus of documents into a smaller corpus of relevant documents. This is done by trading accuracy for efficiency. Reranking as a tool adds accuracy back into the search by post-processing the smaller corpus of documents. Typically, ranking on the full corpus is too slow for applications." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": { | ||
"vscode": { | ||
"languageId": "plaintext" | ||
} | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"%pip install --upgrade --quiet langchain langchain-nvidia-ai-endpoints pgvector psycopg2-binary" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"Below we assume you have Postgresql running with documents stored in a collection named `langchain-index`.\n", | ||
"\n", | ||
"We will narrow the collection to 1,000 results and further narrow it to 10 with the reranker." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": { | ||
"vscode": { | ||
"languageId": "plaintext" | ||
} | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"from langchain_nvidia_ai_endpoints import NVIDIAEmbeddings\n", | ||
"from langchain.vectorstores.pgvector import PGVector\n", | ||
"\n", | ||
"ranker = NVIDIARerank(top_n=10)\n", | ||
"embeddings = NVIDIAEmbeddings()\n", | ||
"\n", | ||
"store = PGVector(embeddings=embeddings,\n", | ||
" collection_name=\"langchain-index\",\n", | ||
" connection_string=\"postgresql+psycopg2://postgres@localhost:5432/vector_db\")\n", | ||
"\n", | ||
"subset_docs = store.similarity_search(query, k=1_000)\n", | ||
"\n", | ||
"docs = ranker.compress_documents(query=query, documents=subset_docs)" | ||
] | ||
} | ||
], | ||
"metadata": { | ||
"language_info": { | ||
"name": "python" | ||
} | ||
}, | ||
"nbformat": 4, | ||
"nbformat_minor": 2 | ||
} |