fmt, add env vars
rlancemartin committed Dec 13, 2023
1 parent 3685140 commit e2e437d
Showing 4 changed files with 19 additions and 362 deletions.
18 changes: 10 additions & 8 deletions templates/rag-chroma-multi-modal-multi-vector/README.md
@@ -1,30 +1,30 @@

-# rag-chroma-multi-modal
+# rag-chroma-multi-modal-multi-vector

Presentations (slide decks, etc) contain visual content that challenges conventional RAG.

Multi-modal LLMs unlock new ways to build apps over visual content like presentations.

This template performs multi-modal RAG using Chroma with the multi-vector retriever (see [blog](https://blog.langchain.dev/multi-modal-rag-template/)):

-* Extract the slides as images
-* Use GPT-4V to summarize each image
-* Embed the image summaries with a link to the original images
-* Retrieve relevant image based on similarity between the image summary and the user input
+* Extracts the slides as images
+* Uses GPT-4V to summarize each image
+* Embeds the image summaries with a link to the original images
+* Retrieves relevant image based on similarity between the image summary and the user input
* Finally pass those images to GPT-4V for answer synthesis

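For illustration, here is a minimal sketch of the GPT-4V summarization step described above, in the style of the template's ingest script; the function name, prompt text, and model settings are illustrative rather than the template's exact code:

```python
from langchain.chat_models import ChatOpenAI
from langchain.schema.messages import HumanMessage


def image_summarize(img_base64: str, prompt: str) -> str:
    """Summarize a single base64-encoded slide image with GPT-4V."""
    chat = ChatOpenAI(model="gpt-4-vision-preview", max_tokens=1024)
    msg = chat.invoke(
        [
            HumanMessage(
                content=[
                    {"type": "text", "text": prompt},
                    {
                        "type": "image_url",
                        "image_url": {"url": f"data:image/jpeg;base64,{img_base64}"},
                    },
                ]
            )
        ]
    )
    return msg.content


# Example (hypothetical): summary = image_summarize(slide_b64, "Summarize this slide for retrieval.")
```
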
## Storage

-We will use Upstash to store the images.
+We will use Upstash to store the images, which offers Redis with a REST API.

Simply login [here](https://upstash.com/) and create a database.

-This will give you:
+This will give you a REST API with:

* UPSTASH_URL
* UPSTASH_TOKEN

-Set these in chain.py (***TODO: Update this? Env var?***)
+Set `UPSTASH_URL` and `UPSTASH_TOKEN` as environment variables to access your database.

We will use Chroma to store and index the image summaries, which will be created locally in the template directory.

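A minimal sketch of wiring up these two stores, assuming `UPSTASH_URL` and `UPSTASH_TOKEN` are set and the `upstash-redis` and `chromadb` packages are installed; the collection name, persist directory, and embedding choice below are illustrative, and import paths may vary with the installed `langchain` version:

```python
import os

from langchain.embeddings import OpenAIEmbeddings
from langchain.storage import UpstashRedisByteStore
from langchain.vectorstores import Chroma

# Upstash (Redis over REST) holds the raw base64-encoded slide images
store = UpstashRedisByteStore(
    url=os.environ["UPSTASH_URL"],
    token=os.environ["UPSTASH_TOKEN"],
)

# Chroma indexes the embedded image summaries locally in the template directory
# (collection name and directory are illustrative)
vectorstore = Chroma(
    collection_name="multi-modal-rag",
    persist_directory="./chroma_db",
    embedding_function=OpenAIEmbeddings(),
)
```
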
@@ -47,6 +47,8 @@ The app will retrieve images using multi-modal embeddings, and pass them to GPT-

Set the `OPENAI_API_KEY` environment variable to access the OpenAI GPT-4V.

+Set `UPSTASH_URL` and `UPSTASH_TOKEN` as environment variables to access your database.

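As a sketch (not part of the template), the ingest script could fail fast when any of these variables is missing:

```python
import os

# Fail fast if a required credential is missing (illustrative check)
for var in ("OPENAI_API_KEY", "UPSTASH_URL", "UPSTASH_TOKEN"):
    if not os.getenv(var):
        raise EnvironmentError(f"Missing required environment variable: {var}")
```
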
## Usage

To use this package, you should first have the LangChain CLI installed:
9 changes: 5 additions & 4 deletions templates/rag-chroma-multi-modal-multi-vector/ingest.py
@@ -1,5 +1,6 @@
import base64
import io
+import os
import uuid
from io import BytesIO
from pathlib import Path
@@ -64,8 +65,8 @@ def generate_img_summaries(img_base64_list):
        try:
            image_summaries.append(image_summarize(base64_image, prompt))
            processed_images.append(base64_image)
-        except:
-            print(f"BadRequestError with image {i+1}")
+        except Exception as e:
+            print(f"Error with image {i+1}: {e}")

    return image_summaries, processed_images

@@ -136,8 +137,8 @@ def create_multi_vector_retriever(vectorstore, image_summaries, images):
"""

# Initialize the storage layer for images
UPSTASH_URL = "https://usw1-bright-beagle-34178.upstash.io"
UPSTASH_TOKEN = "AYWCACQgNzk3OTJjZTItMGIxNy00MTEzLWIyZTAtZWI0ZmI1ZGY0NjFhNGRhMGZjNDE4YjgxNGE4MTkzOWYxMzllM2MzZThlOGY="
UPSTASH_URL = os.getenv("UPSTASH_URL")
UPSTASH_TOKEN = os.getenv("UPSTASH_TOKEN")
store = UpstashRedisByteStore(url=UPSTASH_URL, token=UPSTASH_TOKEN)
id_key = "doc_id"

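The remainder of the retriever setup is collapsed in this diff. As a hedged sketch of how summaries and images are typically indexed with LangChain's multi-vector retriever, assuming the `vectorstore`, `image_summaries`, `images`, `store`, and `id_key` values from the surrounding function and a `langchain` version that supports the `byte_store` argument:

```python
import uuid

from langchain.retrievers.multi_vector import MultiVectorRetriever
from langchain.schema.document import Document

# Link each image summary to its raw image through a shared doc_id
retriever = MultiVectorRetriever(
    vectorstore=vectorstore,  # Chroma index of image summaries
    byte_store=store,         # Upstash store of raw base64 images
    id_key=id_key,
)

doc_ids = [str(uuid.uuid4()) for _ in images]
summary_docs = [
    Document(page_content=summary, metadata={id_key: doc_ids[i]})
    for i, summary in enumerate(image_summaries)
]

# Summaries are embedded for similarity search ...
retriever.vectorstore.add_documents(summary_docs)
# ... while the raw images are stored under the same ids for answer synthesis
store.mset(list(zip(doc_ids, [img.encode("utf-8") for img in images])))
```
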
