Skip to content

Commit 0d8102b

Browse files
committed
ADR: Chat Module Architecutre
Signed-off-by: Anastas Stoyanovsky <[email protected]>
1 parent e12aeea commit 0d8102b

File tree

2 files changed

+35
-0
lines changed

2 files changed

+35
-0
lines changed

.spellcheck-en-custom.txt

+4
Original file line numberDiff line numberDiff line change
@@ -6,6 +6,7 @@ agentic
66
Akash
77
AMDGPU
88
Anil
9+
architected
910
arge
1011
args
1112
arXiv
@@ -138,6 +139,7 @@ Miniforge
138139
MiniLM
139140
Mixtral
140141
mixtral
142+
modifiability
141143
MLX
142144
MMLU
143145
modularize
@@ -242,6 +244,7 @@ tatsu
242244
TBD
243245
templating
244246
Tesla
247+
testability
245248
th
246249
tl
247250
TODO
@@ -253,6 +256,7 @@ Triagers
253256
triagers
254257
UI
255258
ui
259+
understandability
256260
unquantized
257261
unstaged
258262
URI

docs/adr-chat-module-architecture.md

+31
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,31 @@
1+
# InstructLab Chat Module Architecture
2+
3+
## Context
4+
5+
InstructLab recently gained the ability of [retrieval augmented generation (RAG) for chat](https://github.com/instructlab/instructlab/pull/2886). The initial implementation is the simplest possible, inserting retrieved document chunks in the chat session history before submitting the user query to the target model. This is done by adding a retriever instance into the chat class which is called by submitting the user query as-is to the vector store as a search query to retrieve those chunks.
6+
7+
Tuning retrieval systems requires significant expertise and testing, and there are many techniques that may be needed specifically and only for that use case. For example, query reformulation or agentic approaches may be necessary to achieve reliable retrieval quality while being redundant and expensive in a non-retrieval scenario.
8+
9+
Having a single code path that conditionally executes various strategies based on a mixture of configuration, application state, and user input can quickly become difficult to maintain, debug, and even understand. That chat module in InstructLab would benefit from a principled approach; that is, to be purposefully architected before becoming a [big ball of mud](http://www.laputan.org/mud/?ref=blog.codinghorror.com).
10+
11+
Prompt template management, model interaction pattern, chat session history, and any interactions with third party systems can vary by use case and so should be logically grouped for purposes of at least maintainability, understandability, modifiability, and testability.
12+
13+
## Decision
14+
15+
The InstructLab chat module will adopt a [strategy pattern](https://en.wikipedia.org/wiki/Strategy_pattern).
16+
17+
## Status
18+
19+
Accepted
20+
21+
## Consequences
22+
23+
* A refactor to the chat module will be necessary before additional feature work.
24+
* Encapsulation of different sets of logic as distinct pattern should decrease the risk of regression in one area when implementing another.
25+
* Division of responsibility will be clear from the code and project structure.
26+
* Testing different chat strategies will become simpler, with a smaller total testing surface.
27+
* Development velocity of RAG-specific improvements should be higher.
28+
* Code review will become simpler.
29+
* Risk of unsustainable maintenance overhead should be decreased.
30+
* As blocks of logic common to multiple strategies emerge over time, there will be a natural path to a pipeline approach in the future.
31+
* Configuration might become more complex; care should be taken in designing it to be flexible and to avoid one-way doors.

0 commit comments

Comments
 (0)