Presentation: Introduction to RAG
During this workshop you will learn what Retrieval Augmented Generation (RAG) is, how it can increase the trustworthiness of LLM outputs, and how to set up a RAG pipeline using Elastic.
For this workshop, you will need:
- Python 3.6 or later
- An Elastic deployment
  - We'll be using Elastic Cloud for this example (available with a free trial)
- An OpenAI account and an OpenAI API key (see the OpenAI help article "Where do I find my OpenAI API Key?")

To get started:

- Clone the repository
- Start your favorite IDE and navigate to the solutions folder
- Install the dependencies:

  ```shell
  pip install -r requirements.txt
  ```

- Launch the Jupyter app:

  ```shell
  jupyter notebook
  ```
In the `question-answering.ipynb` notebook you'll learn how to:
- Retrieve sample workplace documents from a given URL.
- Set up an Elasticsearch client.
- Chunk documents into 800-character passages with an overlap of 400 characters using the `CharacterTextSplitter` from `langchain`.
- Use `OpenAIEmbeddings` from `langchain` to create embeddings for the content.
- Retrieve embeddings for the chunked passages using OpenAI.
- Persist the passage documents along with their embeddings into Elasticsearch.
- Set up a question-answering system using `OpenAI` and `ElasticKnnSearch` from `langchain` to retrieve answers along with their source documents.
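The chunking step above is done in the notebook with LangChain's `CharacterTextSplitter`; as a dependency-free illustration of what those settings mean, the 800-character / 400-overlap split can be approximated in plain Python (the helper name and sample text are our own, not from the notebook):

```python
def split_with_overlap(text, chunk_size=800, overlap=400):
    """Sketch of fixed-size character chunking: each window is at most
    `chunk_size` characters and the window start advances by
    (chunk_size - overlap), so consecutive chunks share `overlap` chars."""
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break
    return chunks

# Illustrative document: 2000 distinguishable characters.
doc = "".join(str(i % 10) for i in range(2000))
passages = split_with_overlap(doc)
# Consecutive passages share their last/first 400 characters.
```

In practice the real splitter also respects separators (e.g. newlines), so chunk boundaries will not be exactly fixed-width; this sketch only shows the size/overlap arithmetic.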
In the `chatbot.ipynb` notebook you'll learn how to:
- Retrieve sample workplace documents from a given URL.
- Set up an Elasticsearch client.
- Chunk documents into 800-character passages with an overlap of 400 characters using the `CharacterTextSplitter` from `langchain`.
- Use `OpenAIEmbeddings` from `langchain` to create embeddings for the content.
- Retrieve embeddings for the chunked passages using OpenAI.
- Run hybrid search in Elasticsearch to find the documents that answer the asked questions.
- Maintain conversational memory for follow-up questions.
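Hybrid search combines a lexical ranking (e.g. BM25) with a vector (kNN) ranking. Elasticsearch can blend these server-side; one common blending scheme, reciprocal rank fusion (RRF), can be sketched in plain Python (the document IDs and the two rankings below are made up for illustration):

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of doc IDs into one list.
    Each doc scores sum(1 / (k + rank)) over the lists it appears in,
    so docs ranked well by BOTH retrievers rise to the top."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

bm25_hits = ["doc3", "doc1", "doc7"]  # hypothetical lexical ranking
knn_hits = ["doc1", "doc7", "doc2"]   # hypothetical vector ranking
fused = reciprocal_rank_fusion([bm25_hits, knn_hits])
# "doc1" (ranked in both lists) ends up first.
```

The constant `k` damps the influence of top ranks; 60 is a conventional default, not a value taken from the notebook.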
You can re-watch the workshop via the recorded YouTube stream.
This workshop was set up by @pyladiesams and @ahavrius