A Quarkus service that routes requests to different RAG sources and LLMs.
- Assistants - An assistant is the top-level component that describes how all of the components below are connected.
- Content Retrievers - The RAG (Retrieval-Augmented Generation) connection info used to retrieve data that will be included in the message to the LLM.
- Embedding Models - The models used to convert data that is stored in/retrieved from a vector database, a common pattern for RAG data sources.
- LLMs - The connection information for the runtime serving environment hosting the Large Language Model.
- AI Services - The component orchestrating the calls to the Content Retrievers and LLMs.
Note
We are currently looking for a less restrictive way than AI Services to orchestrate these calls.
The `/assistant/chat/streaming` endpoint is the primary entrypoint into the application. It is used to chat with a specified assistant.
Example message:

```json
{
  "message": "User Message",
  "assistantName": "assistant_name"
}
```
The following assistants are loaded into the application by default using Liquibase and its changelog file:
| Assistant Name | Description |
|---|---|
| default_ocp | Default assistant for OpenShift Container Platform (OCP) |
| default_rhel | Default assistant for Red Hat Enterprise Linux (RHEL) |
| default_rho_2025_faq | Default assistant for RHO 2025 FAQ |
| default_ansible | Default assistant for Ansible automation |
| default_rhoai | Default assistant for RHO AI |
| default_assistant | General default assistant |
The `/chatbot/chat/stream` endpoint allows connections to be specified directly and can be used for initial testing of connections.
```json
{
  "message": "User Message",
  "context": "Message history",
  "retriverRequest": {
    "index": "weveIndex",
    "scheme": "weveScheme",
    "host": "weavHost.com",
    "apiKey": "xxx"
  },
  "modelRequest": {
    "modelType": "servingRuntime",
    "apiKey": "xxxxx",
    "modelName": "mistral-instruct"
  }
}
```
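For quick experimentation, the request body above can also be built and posted from a short script. A minimal Python sketch, assuming the service is running locally on port 8080; the helper name and placeholder values are illustrative, not part of the API (note the field really is spelled `retriverRequest` in the example payload):

```python
import json
import urllib.request


def build_direct_request(message, context, retriever, model):
    # Mirrors the /chatbot/chat/stream request body shown above.
    return {
        "message": message,
        "context": context,
        "retriverRequest": retriever,
        "modelRequest": model,
    }


body = build_direct_request(
    "User Message",
    "Message history",
    {"index": "weveIndex", "scheme": "weveScheme", "host": "weavHost.com", "apiKey": "xxx"},
    {"modelType": "servingRuntime", "apiKey": "xxxxx", "modelName": "mistral-instruct"},
)
payload = json.dumps(body).encode("utf-8")

# Uncomment to post against a locally running instance:
# req = urllib.request.Request(
#     "http://localhost:8080/chatbot/chat/stream",
#     data=payload,
#     headers={"Content-Type": "application/json"},
# )
# with urllib.request.urlopen(req) as resp:
#     print(resp.read().decode())
```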
Use the following commands to run locally:

```shell
mvn clean install

# Set the profile to use `application-local.properties` as explained below
mvn quarkus:dev -Dquarkus.profile=local
```
Tip
It is recommended that the properties below be set in the `application-local.properties` file, which is gitignored. This prevents accidental check-ins of secret information.
The following properties should be set in order to properly connect to your LLM running on an OpenAI-compatible instance:

```properties
openai.default.url=<RUNTIME_URL>/v1
openai.default.apiKey=<API_KEY> # If required
openai.default.modelName=<MODEL_NAME>
```
Tip
The `default_assistant` can be used without having to configure a RAG data source.
The default assistants all assume a connection to a Weaviate DB for RAG purposes. A locally hosted Weaviate can be deployed and used; more information can be found here (TBD).

If a remote instance of Weaviate exists on an OpenShift cluster and has the correct indexes, that instance can be used with the following port-forward commands:
```shell
oc project $PROJECT
oc port-forward service/weaviate-vector-db 8086:8080 50051:50051
```
Once forwarded, the following values can be changed:

```properties
weaviate.default.scheme=http
weaviate.default.host=localhost:8086
weaviate.default.apiKey=<API KEY>
```
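With the port forward active, the connection can be sanity-checked before starting the app. Weaviate exposes a readiness endpoint at `/v1/.well-known/ready`; a small sketch that builds the URL from the scheme and host values above (the helper name is illustrative):

```python
import urllib.request


def weaviate_ready_url(scheme="http", host="localhost:8086"):
    # Mirrors weaviate.default.scheme / weaviate.default.host above.
    return f"{scheme}://{host}/v1/.well-known/ready"


url = weaviate_ready_url()

# Uncomment with the port-forward active; a 2xx status means Weaviate is ready.
# with urllib.request.urlopen(url) as resp:
#     print(resp.status)
```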
If you are using the App of Apps repo, the API key is retrieved from the autogenerated secret `weaviate-api-key-secret`.
Currently, the supported models are added to the resources folder and loaded directly. We would like to move this logic to pull these models using Maven, as seen here.
Important
The embedding model is too large to check into our repo. Download it from Hugging Face, or here if internal to RH. Then add it to `resources/embedding/nomic` with the name `model.onnx`; it should be gitignored if done correctly. The download can be performed by running the `download-nomic-embeddings-model.sh` script.
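Before starting dev mode, it can help to confirm the model file actually landed where the application expects it. A minimal sketch, assuming the path and filename from the note above (the helper name and the non-empty-file check are assumptions, not part of the project):

```python
from pathlib import Path


def embedding_model_present(base="resources/embedding/nomic", name="model.onnx"):
    """Return True if the embedding model file exists and is non-empty."""
    model = Path(base) / name
    return model.is_file() and model.stat().st_size > 0


if __name__ == "__main__":
    if embedding_model_present():
        print("embedding model found")
    else:
        print("embedding model missing - run download-nomic-embeddings-model.sh")
```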
If the LLM connection has been set up correctly, the following curl command should stream a response from your LLM:
```shell
curl -X 'POST' 'http://localhost:8080/assistant/chat/streaming' -H 'Content-Type: application/json' -d '{
  "message": "What is this product?",
  "assistantName": "default_assistant"
}' -N
```
The `assistantName` can be swapped out for other assistants in the table above, but the other assistants will require a connection to a Weaviate DB with the correct indexes. The App of Apps repository contains a validation script that can be used to show which indexes currently exist.
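The `-N` flag keeps curl from buffering, so chunks print as the LLM generates them. A client consuming the same stream typically appends chunks as they arrive; a minimal sketch of that accumulation step (the helper is illustrative, and the chunks are assumed to be plain UTF-8 text):

```python
def assemble_stream(chunks):
    """Concatenate streamed response chunks (bytes) into the full reply text."""
    parts = []
    for chunk in chunks:
        parts.append(chunk.decode("utf-8"))
    return "".join(parts)


# Fake chunks standing in for a streamed LLM reply:
reply = assemble_stream([b"This is ", b"a streamed ", b"response."])
```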
To send a local curl request with an uploaded file, the following command may be used:
```shell
curl 'http://localhost:8080/assistant/chat/streaming' -F 'jsonRequest={
  "message": "Please summarize the document that I uploaded",
  "assistantName": "default_assistant"
};type=application/json' -F 'document=@/path/to/my/file.txt'
```
Information about the creation/updating of Assistants, ContentRetrievers, and LLMs can be found in the admin flow docs.
Authentication is disabled by default. It can be enabled by setting the environment variable `DISABLE_AUTHORIZATION=false`. If enabled in dev mode, a Keycloak instance will be spun up locally and populated with a default realm. The authentication can be tested with the following steps:
```shell
# Keycloak runs on a random port; retrieve the port and update the curl command accordingly.
docker ps

export access_token=$(\
curl --insecure -X POST http://localhost:32886/realms/quarkus/protocol/openid-connect/token \
--user backend-service:secret \
-H 'content-type: application/x-www-form-urlencoded' \
-d 'username=alice&password=alice&grant_type=password' | jq --raw-output '.access_token' \
)
```
```shell
curl -X 'POST' 'http://localhost:8080/assistant/chat/streaming' -H 'Authorization: Bearer '$access_token -H 'Content-Type: application/json' -d '{
  "message": "What is this product?",
  "assistantName": "default_assistant"
}' -N -v
```
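If `jq` is not available, the token can be extracted from the Keycloak response with a few lines of Python instead. The `access_token` field name matches the jq filter above; the sample response below is illustrative (a real Keycloak response carries many more fields):

```python
import json


def extract_token(response_body):
    """Pull the access_token field out of a Keycloak token response."""
    return json.loads(response_body)["access_token"]


# Illustrative sample response body:
sample = '{"access_token": "dummy-token", "token_type": "Bearer"}'
token = extract_token(sample)
```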
See https://quarkus.io/guides/security-keycloak-authorization if an external Keycloak instance is required.
We welcome contributions from the community! Here's how you can get involved:
1. Find an Issue:
- Check the issue tracker or Jira board for open issues that interest you.
- If you find a bug or have a feature request, please open a new issue with a clear description.
Note
Currently this project is primarily tracked using Red Hat's internal Jira.
2. Fork the Repository:
- Fork this repository to your own GitHub account.
3. Create a Branch:
- Create a new branch for your changes. Use a descriptive name that reflects the issue or feature you're working on (e.g., `fix-issue-123` or `add-new-feature`).
4. Make Changes:
- Make your desired changes to the codebase.
- Follow the existing code style and conventions.
- Write clear commit messages that explain the purpose of your changes.
5. Test Your Changes:
- Thoroughly test your changes to ensure they work as expected.
- If there are existing tests, make sure they all pass. Consider adding new tests to cover your changes.
6. Code Style and Formatting:
- Ensure your code adheres to the project's established code style guidelines.
- This project uses Checkstyle's automated code formatting tools. Your code must pass these checks before it can be merged.
Tip
Checkstyle can be run locally using `mvn site`, which creates a report page under `target/site`.
It is also recommended that you use a Checkstyle tool in your IDE, such as this VS Code plugin, in order to adhere to the guidelines as you code.
7. Open a Pull Request:
- Push your branch to your forked repository.
- Open a pull request to the main repository.
- In the pull request description, clearly explain the changes you've made and reference the related issue (if applicable).
- Validate that all automated checks are passing.
8. Review Process:
- Your pull request will be reviewed by the project maintainers.
- Feel free to ping our general Slack channel to ask for approval/assistance.
- Be prepared to address any feedback or questions.
- Once your code has passed all automated checks and received at least one approval from a maintainer, it will be merged.
Important Notes:
- All code contributions must pass automated code scanning checks before they can be merged.
- At least one approval from a maintainer is required for all pull requests.
Thank you for your contributions!
The OWASP Dependency-Check plugin is not required to pass, but it is included; we ask that the scanner be run if any changes are made to the dependencies.

```shell
mvn validate -P security-scanner
```