EPIC: RAG Part 2 #271

gphorvath · 2024-03-20T15:38:34Z

Problem Statement:

One limitation of Large Language Models (LLMs) is that they can only respond based on what is in their training data, and training on new data is expensive given the size of the model. Retrieval Augmented Generation (RAG) is a technique that supplements the LLM with new data at query time so that it can provide more up-to-date responses.
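For readers new to the idea, the retrieve-then-prompt flow can be sketched roughly as below. This is only an illustration, not the project's implementation: `embed`, `cosine`, `retrieve`, and `build_prompt` are hypothetical names, and the bag-of-words "embedding" is a toy stand-in for a real embeddings model and vector database.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: a bag-of-words count vector (stand-in for a real embeddings model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, documents, k=1):
    """Return the k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]


def build_prompt(query, documents, k=1):
    """Prepend retrieved context to the question before it is sent to the LLM."""
    context = "\n".join(retrieve(query, documents, k))
    return f"Context:\n{context}\n\nQuestion: {query}"
```

The LLM then answers from the supplied context rather than from its training data alone, which is why the quality of the embeddings model and the retrieval step matters so much for the criteria below.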

Acceptance Criteria:

IDAM for managing access to RAG Data
API is compliant with OpenAI endpoints (Chat, Embeddings, Files, Assistants at a minimum).
Handle GPT2 (bottleneck) in RAG Backend
Better embeddings model (currently Instructor-XL)
Smarter / Better RAG

Definition of Done:

Tasks


ADR: RAG Refinement Approaches #267 (label: documentation)
ADR VectorDB #272 (label: documentation)
ADR for Text Embeddings #273 (label: documentation)

gphorvath added the "enhancement" (New feature or request) label on Mar 20, 2024
