add illustrative doc for reranking (not expected to execute)

langchain-ai · Apr 12, 2024 · c5abecb
1 parent f50e1f2
commit c5abecb
Showing 1 changed file with 250 additions and 0 deletions.
diff --git a/libs/ai-endpoints/docs/retrievers/nvidia_rerank.ipynb b/libs/ai-endpoints/docs/retrievers/nvidia_rerank.ipynb
@@ -0,0 +1,250 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# NVIDIA NeMo Retriever Reranking\n",
+    "\n",
+    "Reranking is a critical piece of high-accuracy, efficient retrieval pipelines.\n",
+    "\n",
+    "Two important use cases:\n",
+    "- Combining results from multiple data sources\n",
+    "- Enhancing accuracy for single data sources"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Combining results from multiple sources\n",
+    "\n",
+    "Consider a pipeline with data from a semantic store, such as FAISS, as well as a BM25 store.\n",
+    "\n",
+    "Each store is queried independently and returns the results that it considers most relevant, but relevance scores from different stores are not directly comparable. Reranking scores the combined results against the query to establish their overall relevance."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "We will search for information about the query `What is the meaning of life?` across a BM25 store and a semantic store."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "vscode": {
+     "languageId": "plaintext"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "query = \"What is the meaning of life?\""
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "vscode": {
+     "languageId": "plaintext"
+    }
+   },
+   "source": [
+    "### BM25 relevant documents\n",
+    "\n",
+    "Below we assume you have Elasticsearch running with documents stored in an index named `langchain-index`."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "vscode": {
+     "languageId": "plaintext"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "%pip install --upgrade --quiet langchain-community elasticsearch"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "vscode": {
+     "languageId": "plaintext"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "import elasticsearch\n",
+    "from langchain_community.retrievers import ElasticSearchBM25Retriever\n",
+    "\n",
+    "bm25_retriever = ElasticSearchBM25Retriever(\n",
+    "    client=elasticsearch.Elasticsearch(\"http://localhost:9200\"),\n",
+    "    index_name=\"langchain-index\",\n",
+    ")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "vscode": {
+     "languageId": "plaintext"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "bm25_docs = bm25_retriever.invoke(query)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Semantic documents\n",
+    "\n",
+    "Below we assume you have a saved FAISS index."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "vscode": {
+     "languageId": "plaintext"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "%pip install --upgrade --quiet langchain-community langchain-nvidia-ai-endpoints faiss-gpu"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "vscode": {
+     "languageId": "plaintext"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "from langchain_community.vectorstores import FAISS\n",
+    "from langchain_nvidia_ai_endpoints import NVIDIAEmbeddings\n",
+    "\n",
+    "embeddings = NVIDIAEmbeddings()\n",
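+    "# load_local unpickles the index, so opt in only for index files you created yourself\n",
+    "sem_retriever = FAISS.load_local(\"langchain_index\", embeddings, allow_dangerous_deserialization=True).as_retriever()"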
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "vscode": {
+     "languageId": "plaintext"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "sem_docs = sem_retriever.invoke(query)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Combine and rank documents\n",
+    "\n",
+    "The resulting `docs` will be ordered by their relevance to the query."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "vscode": {
+     "languageId": "plaintext"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "from langchain_nvidia_ai_endpoints import NVIDIARerank\n",
+    "\n",
+    "ranker = NVIDIARerank()\n",
+    "\n",
+    "all_docs = bm25_docs + sem_docs\n",
+    "\n",
+    "docs = ranker.compress_documents(query=query, documents=all_docs)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Enhancing accuracy for single data sources\n",
+    "\n",
+    "Semantic search with vector embeddings is an efficient way to narrow a large corpus down to a smaller set of candidate documents, but it trades accuracy for efficiency. Reranking restores some of that accuracy by post-processing the smaller candidate set. Running the reranker on the full corpus is typically too slow for applications, which is why it is applied only after the initial retrieval."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "vscode": {
+     "languageId": "plaintext"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "%pip install --upgrade --quiet langchain langchain-nvidia-ai-endpoints pgvector psycopg2-binary"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Below we assume you have PostgreSQL running with documents stored in a collection named `langchain-index`.\n",
+    "\n",
+    "We will narrow the collection to 1,000 results with a similarity search and then narrow those to 10 with the reranker."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "vscode": {
+     "languageId": "plaintext"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "from langchain_nvidia_ai_endpoints import NVIDIAEmbeddings, NVIDIARerank\n",
+    "from langchain.vectorstores.pgvector import PGVector\n",
+    "\n",
+    "ranker = NVIDIARerank(top_n=10)\n",
+    "embeddings = NVIDIAEmbeddings()\n",
+    "\n",
+    "store = PGVector(embeddings=embeddings,\n",
+    "                 collection_name=\"langchain-index\",\n",
+    "                 connection_string=\"postgresql+psycopg2://postgres@localhost:5432/vector_db\")\n",
+    "\n",
+    "subset_docs = store.similarity_search(query, k=1_000)\n",
+    "\n",
+    "docs = ranker.compress_documents(query=query, documents=subset_docs)"
+   ]
+  }
+ ],
+ "metadata": {
+  "language_info": {
+   "name": "python"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 2
+}
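
A note on the "Combine and rank documents" cell: `all_docs = bm25_docs + sem_docs` is plain list concatenation, so a document returned by both stores would appear twice in the input to the reranker. Below is a minimal sketch of merging the two result lists while dropping exact duplicates first. It is not part of this commit. The `Doc` class is a hypothetical stand-in for a LangChain document (only `page_content` and `metadata` are modeled), `merge_unique` is an illustrative helper name, and matching on identical `page_content` is an assumption that may be too strict or too loose for a real corpus.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Hypothetical stand-in for a LangChain document (illustration only)."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def merge_unique(*result_lists):
    """Concatenate result lists, keeping the first occurrence of each page_content."""
    seen = set()
    merged = []
    for results in result_lists:
        for doc in results:
            if doc.page_content not in seen:
                seen.add(doc.page_content)
                merged.append(doc)
    return merged


# Made-up results standing in for the BM25 and semantic retriever outputs
bm25_docs = [Doc("42 is the answer."), Doc("Life is what you make it.")]
sem_docs = [Doc("Life is what you make it."), Doc("Meaning comes from purpose.")]

all_docs = merge_unique(bm25_docs, sem_docs)
print([d.page_content for d in all_docs])
```

The merged list keeps retrieval order from the first store, which does not matter for the final result because the reranker reorders the documents by relevance anyway.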