Fix from code review + ADR
InAnYan committed Jul 29, 2024
1 parent f669a95 commit efed0cc
Showing 11 changed files with 122 additions and 12 deletions.
2 changes: 1 addition & 1 deletion CHANGELOG.md
@@ -11,7 +11,7 @@ Note that this project **does not** adhere to [Semantic Versioning](https://semv

### Added

- We added an AI chat for linked files. [#11430](https://github.com/JabRef/jabref/pull/11430)
- We added an AI-based chat for entries with linked PDF files. [#11430](https://github.com/JabRef/jabref/pull/11430)
- We added support for selecting and using CSL Styles in JabRef's OpenOffice/LibreOffice integration for inserting bibliographic and in-text citations into a document. [#2146](https://github.com/JabRef/jabref/issues/2146), [#8893](https://github.com/JabRef/jabref/issues/8893)
- We added Tools > New library based on references in PDF file... to create a new library based on the references section in a PDF file. [#11522](https://github.com/JabRef/jabref/pull/11522)
- When converting the references section of a paper (PDF file), more than the last page is treated. [#11522](https://github.com/JabRef/jabref/pull/11522)
1 change: 1 addition & 0 deletions docs/decisions/0033-store-chats-in-mvstore.md
@@ -42,6 +42,7 @@ Chosen option: "MVStore", because it is simple and memory-efficient.

* Good, because automatic loading and saving to disk
* Good, because memory-efficient
* Bad, because does not support mutable values in maps.
* Bad, because the order of messages needs to be "hand-crafted" (e.g., by mapping from an Integer to the concrete message)
* Bad, because it stores data as key-values, but not as a custom data type (like tables in RDBMS)
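
The two "Bad" points about ordering and mutable values can be illustrated with a minimal sketch. It is not JabRef's code: a plain `java.util.TreeMap` stands in for the persistent map, and the names `OrderedChatHistorySketch` and `ChatMessage` are hypothetical. Messages are keyed by an integer index to preserve their order, and an "edit" replaces the stored value instead of mutating it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch: ordered chat history on top of a key-value map. */
class OrderedChatHistorySketch {
    /** Immutable message value (assumption: author and text are enough here). */
    record ChatMessage(String author, String text) {}

    // Stand-in for a persistent key-value map; the integer key encodes the order.
    private final Map<Integer, ChatMessage> messages = new TreeMap<>();

    /** Appends a message; the next index is the current size (this sketch never deletes). */
    void add(ChatMessage message) {
        messages.put(messages.size(), message);
    }

    /** Values are treated as immutable, so an edit stores a new value under the same key. */
    void replaceText(int index, String newText) {
        ChatMessage old = messages.get(index);
        if (old == null) {
            throw new IndexOutOfBoundsException("no message at index " + index);
        }
        messages.put(index, new ChatMessage(old.author(), newText));
    }

    /** Returns the messages in index order. */
    List<ChatMessage> inOrder() {
        return new ArrayList<>(messages.values());
    }
}
```

Replacing the value under the same key keeps the stored data consistent even when the underlying map does not track in-place mutations.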

107 changes: 107 additions & 0 deletions docs/decisions/0037-rag-architecture-implementation.md
@@ -0,0 +1,107 @@
---
nav_order: 0037
parent: Decision Records
---

# RAG architecture implementation

## Context and Problem Statement

The current trend in question answering (Q&A) using large language models (LLMs) or other
AI-related technology is retrieval-augmented generation (RAG).

RAG is related to [Open Generative QA](https://huggingface.co/tasks/question-answering):
the LLM (which generates text) is supplied with context (chunks of information extracted
from various sources) and then generates an answer.

RAG architecture consists of [these steps](https://www.linkedin.com/pulse/rag-architecture-deep-dive-frank-denneman-4lple) (simplified):

How the source data is processed:
1. **Indexing**: the application is supplied with information sources (PDFs, text files, web pages, etc.).
2. **Conversion**: files are converted to strings (because an LLM works on text data).
3. **Splitting**: the string from the previous step is split into parts (because an LLM has a fixed context window, meaning
it cannot handle big documents).
4. **Embedding generation**: a vector of float values is generated from each chunk. This vector represents the meaning
of the text, and the main property of such vectors is that chunks with similar meaning have vectors that are close to each other.
Such a vector is generated by a separate model called an *embedding model*.
5. **Store**: chunks, together with relevant metadata (for example, from which document they were generated) and their embedding vectors, are stored in a vector database.
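
The splitting step above can be sketched as follows. This is a minimal illustration, not the splitter used by JabRef or `langchain4j`: it assumes fixed-size character chunks with a small overlap, and the class name `ChunkSplitterSketch` and its parameters are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: fixed-size chunking with overlap. */
class ChunkSplitterSketch {

    /**
     * Splits text into chunks of at most chunkSize characters.
     * Consecutive chunks share `overlap` characters so that a sentence
     * cut at a boundary still appears whole in one of the chunks.
     */
    static List<String> split(String text, int chunkSize, int overlap) {
        if (chunkSize <= 0 || overlap < 0 || overlap >= chunkSize) {
            throw new IllegalArgumentException("require 0 <= overlap < chunkSize");
        }
        List<String> chunks = new ArrayList<>();
        int step = chunkSize - overlap;
        for (int start = 0; start < text.length(); start += step) {
            int end = Math.min(start + chunkSize, text.length());
            chunks.add(text.substring(start, end));
            if (end == text.length()) {
                break; // last chunk reached; avoid a tail that is pure overlap
            }
        }
        return chunks;
    }

    public static void main(String[] args) {
        // Splits a ten-character string into chunks of four with an overlap of one.
        for (String chunk : split("abcdefghij", 4, 1)) {
            System.out.println(chunk);
        }
    }
}
```

Real splitters usually cut at paragraph or sentence boundaries and measure size in tokens rather than characters; the character-based version only shows the idea.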

How an answer is generated:
1. **Ask**: the user asks the AI a question.
2. **Question embedding**: an embedding model generates an embedding vector for the query.
3. **Data finding**: the vector database searches for the most relevant pieces of information (a finite number of pieces).
The search uses vector similarity, that is, how close a chunk vector is to the question vector.
4. **Prompt generation**: using a prompt template, the user's question is *augmented* with the found information. The found information
is generally not shown to the user as part of the question, as it may seem strange that the question they asked already comes with
found information. These pieces of text can either be hidden entirely or shown separately in a UI tab "Sources".
5. **LLM generation**: the LLM generates output.
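
The "Data finding" and "Prompt generation" steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the `langchain4j` API or JabRef's implementation: the store is an in-memory list, similarity is cosine similarity, the prompt template is made up, and all names (`VectorSearchSketch`, `StoredChunk`, `Match`, `findRelevant`, `augment`) are hypothetical. The `maxResults` and `minScore` parameters loosely mirror the "max results count" and "min score" settings visible in the `AiTab` fields of this commit.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: brute-force similarity search and prompt augmentation. */
class VectorSearchSketch {
    record StoredChunk(String text, float[] embedding) {}

    record Match(String text, double score) {}

    /** Cosine similarity; returns 0 if either vector has zero length. */
    static double cosine(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vectors must have the same dimension");
        }
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0 || normB == 0) {
            return 0.0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    /** Returns at most maxResults chunks scoring at least minScore, best first. */
    static List<Match> findRelevant(List<StoredChunk> store, float[] query, int maxResults, double minScore) {
        List<Match> matches = new ArrayList<>();
        for (StoredChunk chunk : store) {
            double score = cosine(chunk.embedding(), query);
            if (score >= minScore) {
                matches.add(new Match(chunk.text(), score));
            }
        }
        matches.sort((x, y) -> Double.compare(y.score(), x.score())); // descending by score
        return new ArrayList<>(matches.subList(0, Math.min(maxResults, matches.size())));
    }

    /** Augments the user question with the retrieved chunks (made-up template). */
    static String augment(String question, List<Match> matches) {
        StringBuilder prompt = new StringBuilder("Answer using only the following excerpts:\n");
        for (Match match : matches) {
            prompt.append("- ").append(match.text()).append('\n');
        }
        return prompt.append("\nQuestion: ").append(question).toString();
    }
}
```

A linear scan like this is only practical for small stores; dedicated vector databases use approximate nearest-neighbour indexes instead.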

This ADR concerns the implementation of this architecture.

## Decision Drivers

* Prefer good, maintained libraries over self-made solutions for better quality.
* The framework should be easy to use. It would seem strange if a user who wants to download a BIB editor were
required to install separate software (or even a Python runtime).
* RAG shouldn't incur any additional monetary costs. Users should pay only for LLM generation.

## Considered Options

* Use a hand-crafted RAG
* Use a third-party Java library
* Use a standalone application
* Use an online service

## Decision Outcome

Chosen option: a mix of "Use a hand-crafted RAG" and "Use a third-party Java library".

Third-party libraries provide excellent resources for connecting to an LLM or extracting text from PDF files. For RAG,
we mostly used the machinery provided by `langchain4j`, but some parts had to be hand-crafted:
- **LLM connection**: due to https://github.com/langchain4j/langchain4j/issues/1454 (https://github.com/InAnYan/jabref/issues/77)
this was delegated to another library `jvm-openai`.
- **Embedding generation**: due to https://github.com/langchain4j/langchain4j/issues/1492 (https://github.com/InAnYan/jabref/issues/79),
this was delegated to another library `djl`.
- **Indexing**: `langchain4j` is just a collection of useful tools, so we still have to orchestrate when indexing should
happen and which files should be processed.
- **Vector database**: there seems to be no embedded vector database (except SQLite with the `sqlite-vss` extension). We
implemented the vector database using `MVStore` because that was easy.

## Pros and Cons of the Options

### Use a hand-crafted RAG

* Good, because we have full control over generation
* Good, because it is extensible
* Bad, because LLM connection, embedding models, vector storage, and file conversion must be implemented manually
* Bad, because it's hard to make a complex RAG architecture

### Use a third-party Java library

* Good, because provides well-tested and maintained tools
* Good, because libraries have many LLM integrations, as well as embedding models, vector storage, and file conversion tools
* Good, because they provide complex RAG pipelines and extensions
* Neutral, because they provide many tools and functions, but they should be orchestrated in a real application
* Bad, because some of them are raw and undocumented
* Bad, because they are all similar to `langchain`
* Bad, because they may have bugs

### Use a standalone application

* Good, because they provide complex RAG pipelines and extensions
* Good, because no additional code is required (except connecting to API)
* Neutral, because they do not provide that many LLM integrations, embedding models, and vector storages
* Bad, because a running standalone app is required. Users may have to set it up properly
* Bad, because the internal workings of the app are hidden. Additional agreement to a Privacy Policy or Terms of Service is needed
* Bad, because hard to extend

### Use an online service

* Good, because all data is processed and stored off the user's machine: it is faster and uses no local memory.
* Good, because they provide complex RAG pipelines and extensions
* Good, because no additional code is required (except connecting to API)
* Neutral, because they do not provide that many LLM integrations, embedding models, and vector storages
* Bad, because it requires an Internet connection
* Bad, because data is processed by a third-party company
* Bad, because most of them require additional payment (in fact, it would be impossible to develop a free service like
that)
1 change: 1 addition & 0 deletions src/main/java/module-info.java
@@ -154,4 +154,5 @@
// Provides number input fields for parameters in AI expert settings
requires com.dlsc.unitfx;
requires de.saxsys.mvvmfx.validation;
requires dd.plist;
}
1 change: 0 additions & 1 deletion src/main/java/org/jabref/gui/Dark.css
@@ -163,4 +163,3 @@
.file-row-text {
-fx-text-fill: -fx-light-text-color;
}

@@ -46,7 +46,7 @@ private void initialize() {
sourceLabel.setText(Localization.lang("AI"));
contentTextArea.setText(aiMessage.text());
} else {
LOGGER.warn("ChatMessageComponent supports only user or AI messages, but other type was passed: " + chatMessage.type().name());
LOGGER.warn("ChatMessageComponent supports only user or AI messages, but other type was passed: {}", chatMessage.type().name());
}
}

@@ -6,7 +6,7 @@

<fx:root prefHeight="200.0" prefWidth="500.0" type="BorderPane" xmlns="http://javafx.com/javafx/17.0.2-ea" xmlns:fx="http://javafx.com/fxml/1" fx:controller="org.jabref.gui.ai.components.errorstate.ErrorStateComponent">
<center>
<VBox alignment="CENTER" spacing="10.0">
<VBox fx:id="contentsVBox" alignment="CENTER" spacing="10.0">
<children>
<Text fx:id="titleText" strokeType="OUTSIDE" strokeWidth="0.0" text="Title">
<font>
@@ -12,6 +12,7 @@
public class ErrorStateComponent extends BorderPane {
@FXML private Text titleText;
@FXML private Text contentText;
@FXML private VBox contentsVBox;

public ErrorStateComponent(String title, String content) {
ViewLoader.view(this)
@@ -25,7 +26,7 @@ public ErrorStateComponent(String title, String content) {
public static ErrorStateComponent withSpinner(String title, String content) {
ErrorStateComponent errorStateComponent = new ErrorStateComponent(title, content);

((VBox) errorStateComponent.getCenter()).getChildren().add(new ProgressIndicator());
errorStateComponent.contentsVBox.getChildren().add(new ProgressIndicator());

return errorStateComponent;
}
@@ -36,7 +37,7 @@ public static ErrorStateComponent withTextArea(String title, String content, Str
TextArea textArea = new TextArea(additional);
textArea.setEditable(false);

((VBox) errorStateComponent.getCenter()).getChildren().add(textArea);
errorStateComponent.contentsVBox.getChildren().add(textArea);

return errorStateComponent;
}
4 changes: 2 additions & 2 deletions src/main/java/org/jabref/gui/preferences/ai/AiTab.java
@@ -40,8 +40,6 @@ public class AiTab extends AbstractPreferenceTabView<AiTabViewModel> implements
@FXML private IntegerInputField ragMaxResultsCountTextField;
@FXML private DoubleInputField ragMinScoreTextField;

private final ControlsFxVisualizer visualizer = new ControlsFxVisualizer();

@FXML private Button chatModelHelp;
@FXML private Button embeddingModelHelp;
@FXML private Button apiBaseUrlHelp;
@@ -54,6 +52,8 @@ public class AiTab extends AbstractPreferenceTabView<AiTabViewModel> implements

@FXML private Button resetExpertSettingsButton;

private final ControlsFxVisualizer visualizer = new ControlsFxVisualizer();

public AiTab() {
ViewLoader.view(this)
.root(this)
6 changes: 4 additions & 2 deletions src/main/java/org/jabref/logic/ai/models/EmbeddingModel.java
@@ -23,11 +23,13 @@
import dev.langchain4j.model.output.Response;

/**
* Wrapper around langchain4j embedding model.
* Wrapper around langchain4j {@link dev.langchain4j.model.embedding.EmbeddingModel}.
* <p>
* This class listens to preferences changes.
*/
public class EmbeddingModel implements dev.langchain4j.model.embedding.EmbeddingModel, AutoCloseable {
private static final String DJL_AI_DJL_HUGGINGFACE_PYTORCH_SENTENCE_TRANSFORMERS = "djl://ai.djl.huggingface.pytorch/sentence-transformers/";

private final AiPreferences aiPreferences;

private final ExecutorService executorService = Executors.newCachedThreadPool(
@@ -48,7 +50,7 @@ private void rebuild() {
return;
}

String modelUrl = "djl://ai.djl.huggingface.pytorch/sentence-transformers/" + aiPreferences.getEmbeddingModel().getLabel();
String modelUrl = DJL_AI_DJL_HUGGINGFACE_PYTORCH_SENTENCE_TRANSFORMERS + aiPreferences.getEmbeddingModel().getLabel();

Criteria<String, float[]> criteria =
Criteria.builder()
3 changes: 1 addition & 2 deletions src/main/resources/tinylog.properties
@@ -12,8 +12,7 @@ exception = strip: jdk.internal

[email protected] = debug

# FIXME: Remove before merging the branch

# AI debugging
#[email protected] = trace
#[email protected] = trace
#[email protected] = trace