Commit
Merge pull request #41 from deepset-ai/chroma-rename-retriever
Chroma: rename retriever
bilgeyucel authored Feb 14, 2024
2 parents 7ac9de8 + a000e05 commit cbc081d
Showing 2 changed files with 32 additions and 32 deletions.
60 changes: 30 additions & 30 deletions notebooks/amazon_sagemaker_and_chroma_for_qa.ipynb
@@ -31,16 +31,16 @@
"cell_type": "code",
"execution_count": null,
"metadata": {
- "id": "EX5oCws-etEH",
"colab": {
"base_uri": "https://localhost:8080/"
},
+ "id": "EX5oCws-etEH",
"outputId": "4d46055f-4d58-4d67-b895-ad701c2eb306"
},
"outputs": [
{
- "output_type": "stream",
"name": "stdout",
+ "output_type": "stream",
"text": [
"Requirement already satisfied: chroma-haystack in /usr/local/lib/python3.10/dist-packages (0.11.0)\n",
"Requirement already satisfied: amazon-sagemaker-haystack in /usr/local/lib/python3.10/dist-packages (0.1.0)\n",
@@ -153,6 +153,9 @@
},
{
"cell_type": "markdown",
+ "metadata": {
+ "id": "Eg9lSuAJM6MJ"
+ },
"source": [
"## Deploy a model on Sagemaker\n",
"\n",
@@ -162,10 +165,7 @@
"- Amazon Sagemaker Jumpstart [documentation](https://docs.aws.amazon.com/sagemaker/latest/dg/jumpstart-foundation-models-use.html).\n",
"- [This notebook](https://github.com/aws/amazon-sagemaker-examples/blob/main/introduction_to_amazon_algorithms/jumpstart-foundation-models/text-generation-falcon.ipynb) on how to deploy Falcon models programmatically with a notebook\n",
"- [This blogpost](https://aws.amazon.com/blogs/machine-learning/build-production-ready-generative-ai-applications-for-enterprise-search-using-haystack-pipelines-and-amazon-sagemaker-jumpstart-with-llms/) about deploying models on Sagemaker for Haystack 1.x\n"
- ],
- "metadata": {
- "id": "Eg9lSuAJM6MJ"
- }
+ ]
},
{
"cell_type": "markdown",
@@ -182,10 +182,10 @@
"cell_type": "code",
"execution_count": null,
"metadata": {
- "id": "tZTz7cHwhZ-9",
"colab": {
"base_uri": "https://localhost:8080/"
},
+ "id": "tZTz7cHwhZ-9",
"outputId": "72a5b7af-5d81-4c2f-e922-f35ee1dda94e"
},
"outputs": [
@@ -210,19 +210,24 @@
},
{
"cell_type": "markdown",
+ "metadata": {
+ "id": "k-CF7LUSy2T7"
+ },
"source": [
"## Load data from Wikipedia\n",
"\n",
"We are going to download the Wikipedia pages related to NASA's martian rovers using the python library `wikipedia`.\n",
"\n",
"These pages are converted into Haystack Documents."
- ],
- "metadata": {
- "id": "k-CF7LUSy2T7"
- }
+ ]
},
{
"cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "-Nz9MRVgxcfW"
+ },
+ "outputs": [],
"source": [
"import wikipedia\n",
"from haystack.dataclasses import Document\n",
@@ -241,12 +246,7 @@
" page = wikipedia.page(title=title, auto_suggest=False)\n",
" doc = Document(content=page.content, meta={\"title\": page.title, \"url\":page.url})\n",
" raw_docs.append(doc)"
- ],
- "metadata": {
- "id": "-Nz9MRVgxcfW"
- },
- "execution_count": null,
- "outputs": []
+ ]
},
{
"cell_type": "markdown",
@@ -309,29 +309,29 @@
"cell_type": "code",
"execution_count": null,
"metadata": {
- "id": "X7HrON1PFHos",
"colab": {
"base_uri": "https://localhost:8080/"
},
+ "id": "X7HrON1PFHos",
"outputId": "22a57096-f22a-4333-cd85-9ea9ea4b52e0"
},
"outputs": [
{
- "output_type": "stream",
"name": "stderr",
+ "output_type": "stream",
"text": [
"/root/.cache/chroma/onnx_models/all-MiniLM-L6-v2/onnx.tar.gz: 100%|██████████| 79.3M/79.3M [00:06<00:00, 12.6MiB/s]\n"
]
},
{
- "output_type": "execute_result",
"data": {
"text/plain": [
"{'writer': {'documents_written': None}}"
]
},
+ "execution_count": 5,
"metadata": {},
- "execution_count": 5
+ "output_type": "execute_result"
}
],
"source": [
@@ -346,7 +346,7 @@
"source": [
"## Building the Query Pipeline\n",
"\n",
- "Let’s create another pipeline to query our application. In this pipeline, we’ll use [ChromaQueryRetriever](https://docs.haystack.deepset.ai/v2.0/docs/chromaqueryretriever) to retrieve relevant information from the ChromaDocumentStore and a Falcon 7B Instruct BF16 model to generate answers with [SagemakerGenerator](https://docs.haystack.deepset.ai/v2.0/docs/sagemakergenerator).\n",
+ "Let’s create another pipeline to query our application. In this pipeline, we’ll use [ChromaQueryTextRetriever](https://docs.haystack.deepset.ai/v2.0/docs/chromaqueryretriever) to retrieve relevant information from the ChromaDocumentStore and a Falcon 7B Instruct BF16 model to generate answers with [SagemakerGenerator](https://docs.haystack.deepset.ai/v2.0/docs/sagemakergenerator).\n",
"\n",
"Next, we'll create a prompt for our task using the Retrieval-Augmented Generation (RAG) approach with [PromptBuilder](https://docs.haystack.deepset.ai/v2.0/docs/promptbuilder). This prompt will help generate answers by considering the provided context. Finally, we'll connect these three components to complete the pipeline."
]
@@ -362,10 +362,10 @@
"from haystack.pipeline import Pipeline\n",
"from haystack.components.builders import PromptBuilder\n",
"from haystack_integrations.components.generators.amazon_sagemaker import SagemakerGenerator\n",
- "from haystack_integrations.components.retrievers.chroma import ChromaQueryRetriever\n",
+ "from haystack_integrations.components.retrievers.chroma import ChromaQueryTextRetriever\n",
"\n",
"# Create pipeline components\n",
- "retriever = ChromaQueryRetriever(document_store=document_store, top_k=3)\n",
+ "retriever = ChromaQueryTextRetriever(document_store=document_store, top_k=3)\n",
"\n",
"# Initialize the AmazonSagemakerGenerator with an Amazon Sagemaker model\n",
"# You may need to change the model name if it differs from your endpoint name.\n",
@@ -404,16 +404,16 @@
"cell_type": "code",
"execution_count": null,
"metadata": {
- "id": "mDYCSRRtiAy5",
"colab": {
"base_uri": "https://localhost:8080/"
},
+ "id": "mDYCSRRtiAy5",
"outputId": "b644aeb8-c9eb-4dbf-ed28-ad3080826410"
},
"outputs": [
{
- "output_type": "stream",
"name": "stdout",
+ "output_type": "stream",
"text": [
"\n",
"Opportunity landed on Mars on January 24, 2004.\n"
@@ -431,16 +431,16 @@
"cell_type": "code",
"execution_count": null,
"metadata": {
- "id": "giSWajzyAcNp",
"colab": {
"base_uri": "https://localhost:8080/"
},
+ "id": "giSWajzyAcNp",
"outputId": "ac240c1a-657d-447a-8f08-8558616d71e9"
},
"outputs": [
{
- "output_type": "stream",
"name": "stdout",
+ "output_type": "stream",
"text": [
"\n",
"Yes, the Ingenuity mission is over. The helicopter made a total of 72 flights over a period of about 3 years until rotor damage sustained in January 2024 forced an end to the mission.\n"
@@ -458,16 +458,16 @@
"cell_type": "code",
"execution_count": null,
"metadata": {
- "id": "ROhJ8VL_JdHc",
"colab": {
"base_uri": "https://localhost:8080/"
},
+ "id": "ROhJ8VL_JdHc",
"outputId": "4c77f458-d472-4644-b073-304986bf7a6c"
},
"outputs": [
{
- "output_type": "stream",
"name": "stdout",
+ "output_type": "stream",
"text": [
"\n",
"The first NASA rover to land on Mars was called Sojourner.\n"
4 changes: 2 additions & 2 deletions notebooks/chroma-indexing-and-rag-examples.ipynb
@@ -145,7 +145,7 @@
},
"outputs": [],
"source": [
- "from haystack_integrations.components.retrievers.chroma import ChromaQueryRetriever\n",
+ "from haystack_integrations.components.retrievers.chroma import ChromaQueryTextRetriever\n",
"from haystack.components.generators import HuggingFaceTGIGenerator\n",
"from haystack.components.builders import PromptBuilder\n",
"\n",
@@ -163,7 +163,7 @@
"\n",
"llm = HuggingFaceTGIGenerator(model=\"mistralai/Mixtral-8x7B-Instruct-v0.1\")\n",
"llm.warm_up()\n",
- "retriever = ChromaQueryRetriever(document_store)\n",
+ "retriever = ChromaQueryTextRetriever(document_store)\n",
"\n",
"querying = Pipeline()\n",
"querying.add_component(\"retriever\", retriever)\n",
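The substantive change in this commit is the rename of `ChromaQueryRetriever` to `ChromaQueryTextRetriever` in the import lines, the constructor calls, and one markdown cell across the two notebooks. As a rough illustration of how the same rename could be applied to other notebooks, here is a minimal sketch that uses only the Python standard library. The helper name `rename_identifier` and the sample notebook dictionary are hypothetical and are not part of Haystack or of this repository. The sketch assumes the usual notebook JSON layout, where each cell keeps its text under `"source"` as either a string or a list of strings.

```python
import json
import re

# Names taken from the diff above.
OLD_NAME = "ChromaQueryRetriever"
NEW_NAME = "ChromaQueryTextRetriever"


def rename_identifier(notebook: dict, old: str = OLD_NAME, new: str = NEW_NAME) -> int:
    """Hypothetical helper: rename whole-word, case-sensitive occurrences of
    `old` in every cell's source, in place. Returns the replacement count."""
    pattern = re.compile(rf"\b{re.escape(old)}\b")
    total = 0
    for cell in notebook.get("cells", []):
        source = cell.get("source", [])
        if isinstance(source, str):
            # Some notebooks store the source as one string.
            new_source, count = pattern.subn(lambda _m: new, source)
            cell["source"] = new_source
            total += count
        else:
            # The more common layout: a list of lines.
            new_lines = []
            for line in source:
                new_line, count = pattern.subn(lambda _m: new, line)
                new_lines.append(new_line)
                total += count
            cell["source"] = new_lines
    return total


# Illustrative in-memory notebook; a real one would be read with json.load.
sample = {
    "cells": [
        {
            "cell_type": "code",
            "source": [
                "from haystack_integrations.components.retrievers.chroma import ChromaQueryRetriever\n",
                "retriever = ChromaQueryRetriever(document_store)\n",
            ],
        }
    ]
}
replacements = rename_identifier(sample)
print(replacements)
print(json.dumps(sample, indent=1))
```

Because the match is case-sensitive and whole-word, lowercase URL slugs such as `chromaqueryretriever` are left alone, which matches the documentation link that stays unchanged in the diff. Running the helper a second time makes no further changes, since the new name does not contain the old one as a whole word.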
