Commit 8f2ae56

ADR 001: Vector store for RAG
Signed-off-by: Anastas Stoyanovsky <[email protected]>
1 parent db834ca commit 8f2ae56

4 files changed: +33 -7 lines changed

.spellcheck-en-custom.txt

+1
@@ -116,6 +116,7 @@ Kumar
 Langchain
 Langgraph
 leaderboard
+lifecycle
 lignment
 LLM
 LLMs

docs/rag/adrs/README.md

+4 -5
@@ -11,22 +11,21 @@ This simple format, which is described below, has a surprising number of functio
 * **Knowledge sharing**: The peer review phase allows sharing of expertise between team members.
 * **Fewer meetings**: As decision making becomes asynchronous and as the team forms its social norms around the process, there should be less time required in meetings.
 
-# When to write an ADR
+## When to write an ADR
 
 * A decision is being made that required discussion between two or more people.
 * A decision is being made that required significant investigation.
 * A decision is being proposed for feedback / discussion.
 * A decision is being proposed that affects multiple teams.
 
-# Template
+## Template
 
 [Here](template.md).
 
-# Related Reading
+## Related Reading
 
 * [Suggestions for writing good ADRs](https://github.com/joelparkerhenderson/architecture-decision-record?tab=readme-ov-file#suggestions-for-writing-good-adrs)
 * [ADRs at RedHat](https://www.redhat.com/architect/architecture-decision-records)
 * [ADRs at Amazon](https://docs.aws.amazon.com/prescriptive-guidance/latest/architectural-decision-records/adr-process.html)
 * [ADRs at GitHub](https://adr.github.io/)
-* [ADRs at Google](https://cloud.google.com/architecture/architecture-decision-records)
-
+* [ADRs at Google](https://cloud.google.com/architecture/architecture-decision-records)

docs/rag/adrs/adr-vectordb.md

+26
@@ -0,0 +1,26 @@
+# Initial InstructLab Vector Store
+
+## Context
+
+One of the first choices to make in implementing RAG is an initial vector store to develop against. Though frameworks like LangChain or Haystack make it easy to swap vector databases, we need a working end-to-end implementation for RAG that is tested against and available to install with InstructLab. There are many options (see [here](https://docs.haystack.deepset.ai/docs/choosing-a-document-store)).
+
+Our main long-term requirements are that our chosen store have fully developed document update support (and thus some notion of a primary key), that it be scalable to cluster size, and that it have a permissive license (Apache, MIT, or similar). Among the available choices, [Milvus](https://milvus.io/) provides a strategic advantage due to its [investment from watsonx](https://www.ibm.com/new/announcements/ibm-watsonx-data-vector-database-ai-ready-data-management).
+
+Milvus can be used in-process ([Milvus Lite](https://milvus.io/docs/milvus_lite.md)), single-node ([Milvus](https://milvus.io/docs/prerequisite-docker.md)), or cluster-scale ([Milvus Distributed](https://milvus.io/docs/prerequisite-helm.md)).
+
+## Decision
+
+InstructLab will initially integrate with and use Milvus Lite for vector storage and retrieval-augmented generation.
+
+## Status
+
+Accepted
+
+## Consequences
+
+* Users will have a clear [upgrade path](https://milvus.io/docs/upgrade_milvus_cluster-operator.md) from the laptop use case to cluster scale.
+* We should have access to Milvus expertise via IBM.
+* The laptop use case of InstructLab will have a minimally resource-intensive option for prototyping.
+* Since Milvus is used in watsonx, we can have confidence that it can meet expected scaling requirements.
+* Document updates can be accommodated using well-established [primary key functionality](https://milvus.io/docs/primary-field.md) and [partition key](https://milvus.io/docs/use-partition-key.md).
+* There is a risk that developing against a mature vector store leads us to rely on functionality that is unavailable in another vector store a potential customer requires.
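
Reviewer note: two of the consequences in the ADR above (document updates via a primary key, and the risk of depending on store-specific features) can be illustrated with a minimal sketch. This is a toy in-memory stand-in, not Milvus or pymilvus code. The names `VectorStore`, `InMemoryStore`, `upsert`, and `search` are hypothetical and appear nowhere in this commit. The assumption is that RAG code written against a narrow interface like this one stays portable across stores, and that keying rows by a document id makes an update a replacement rather than a duplicate insert.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from math import sqrt
from typing import Protocol


class VectorStore(Protocol):
    """Hypothetical narrow interface a RAG pipeline could code against."""

    def upsert(self, doc_id: str, vector: list[float], text: str) -> None: ...

    def search(self, query: list[float], top_k: int = 1) -> list[str]: ...


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


@dataclass
class InMemoryStore:
    """Toy store (assumption: illustration only, no persistence or indexing)."""

    # Rows are keyed by primary key, so a second upsert replaces the first.
    rows: dict[str, tuple[list[float], str]] = field(default_factory=dict)

    def upsert(self, doc_id: str, vector: list[float], text: str) -> None:
        self.rows[doc_id] = (vector, text)

    def search(self, query: list[float], top_k: int = 1) -> list[str]:
        # Brute-force ranking by cosine similarity, highest first.
        ranked = sorted(
            self.rows.items(),
            key=lambda item: cosine(query, item[1][0]),
            reverse=True,
        )
        return [doc_id for doc_id, _ in ranked[:top_k]]


store: VectorStore = InMemoryStore()
store.upsert("doc-1", [1.0, 0.0], "first revision")
store.upsert("doc-2", [0.0, 1.0], "another document")
store.upsert("doc-1", [0.6, 0.8], "second revision")  # replaces, does not append
```

A store without a primary key would need a delete-then-insert step for the last call, which is one reason the ADR lists document update support as a long-term requirement.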

docs/rag/adrs/template.md

+2 -2
@@ -1,4 +1,4 @@
-# ADR 00x: Succinct title
+# Succinct title
 
 ## Context
 
@@ -14,4 +14,4 @@ _a single decision statement, written in active voice, stated in a single senten
 
 ## Consequences
 
-_A bulleted list and might be the most important section. What are the consequences of this decision? Does it introduce design constraints into a codebase? Does it require further decisions or investigations to be made? Will it require training/onboarding for team members? Does it impact performance? What about cost? Does it impact development processes? What else? As a rule of thumb, there should usually be 4-6 identified consequences_
+_A bulleted list and might be the most important section. What are the consequences of this decision? Does it introduce design constraints into a codebase? Does it require further decisions or investigations to be made? Will it require training/onboarding for team members? Does it impact performance? What about cost? Does it impact development processes? What else? As a rule of thumb, there should usually be 4-6 identified consequences_
