Merge branch 'main' of github.com:mhbuehler/GenAIExamples into mmqna-image-query

Signed-off-by: dmsuehir <[email protected]>
dmsuehir committed Jan 16, 2025
2 parents 888cb71 + 71e3c57 commit 9402fcd
Showing 208 changed files with 5,126 additions and 571 deletions.
1 change: 1 addition & 0 deletions .github/workflows/_example-workflow.yml
@@ -79,6 +79,7 @@ jobs:
fi
if [[ $(grep -c "vllm-gaudi:" ${docker_compose_path}) != 0 ]]; then
git clone https://github.com/HabanaAI/vllm-fork.git
cd vllm-fork && git checkout v0.6.4.post2+Gaudi-1.19.0 && cd ../
fi
git clone https://github.com/opea-project/GenAIComps.git
cd GenAIComps && git checkout ${{ inputs.opea_branch }} && git rev-parse HEAD && cd ../
90 changes: 69 additions & 21 deletions AgentQnA/README.md
@@ -2,8 +2,8 @@

## Overview

This example showcases a hierarchical multi-agent system for question-answering applications. The architecture diagram is shown below. The supervisor agent interfaces with the user and dispatches tasks to two worker agents to gather information and come up with answers. The worker RAG agent uses the retrieval tool to retrieve relevant documents from the knowledge base (a vector database). The worker SQL agent retrieves relevant data from the SQL database. Although not included in this example, other tools such as a web search tool or a knowledge graph query tool could be used by the supervisor agent to gather information from additional sources.
![Architecture Overview](assets/img/agent_qna_arch.png)

The AgentQnA example is implemented using the component-level microservices defined in [GenAIComps](https://github.com/opea-project/GenAIComps). The flow chart below shows the information flow between different microservices for this example.

@@ -38,6 +38,7 @@ flowchart LR
end
AG_REACT([Agent MicroService - react]):::blue
AG_RAG([Agent MicroService - rag]):::blue
AG_SQL([Agent MicroService - sql]):::blue
LLM_gen{{LLM Service <br>}}
DP([Data Preparation MicroService]):::blue
TEI_RER{{Reranking service<br>}}
@@ -51,6 +52,7 @@ flowchart LR
direction LR
a[User Input Query] --> AG_REACT
AG_REACT --> AG_RAG
AG_REACT --> AG_SQL
AG_RAG --> DocIndexRetriever-MegaService
EM ==> RET
RET ==> RER
@@ -59,6 +61,7 @@ flowchart LR
%% Embedding service flow
direction LR
AG_RAG <-.-> LLM_gen
AG_SQL <-.-> LLM_gen
AG_REACT <-.-> LLM_gen
EM <-.-> TEI_EM
RET <-.-> R_RET
@@ -75,11 +78,11 @@ flowchart LR
### Why Agent for question answering?

1. Improve relevancy of retrieved context.
   The RAG agent can rephrase user queries, decompose user queries, and iterate to get the most relevant context for answering the user's questions. Compared to conventional RAG, a RAG agent can significantly improve the correctness and relevancy of the answer.
2. Expand the scope of the agent.
   The supervisor agent can interact with multiple worker agents that specialize in different domains with different skills (e.g., retrieving documents, writing SQL queries, etc.), and thus can answer questions in multiple domains.
3. Hierarchical multi-agents can improve performance.
   Expert worker agents, such as the RAG agent and the SQL agent, can provide high-quality output for different aspects of a complex query, and the supervisor agent can aggregate the information to provide a comprehensive answer. If we only use one agent and give all the tools to that single agent, it may get overwhelmed and fail to provide accurate answers.

## Deployment with docker

@@ -148,28 +151,55 @@ docker build -t opea/agent:latest --build-arg https_proxy=$https_proxy --build-a
bash run_ingest_data.sh
```

4. Prepare SQL database. </br>
In this example, we will use the Chinook SQLite database. Run the commands below.

```
# Download data
cd $WORKDIR
git clone https://github.com/lerocha/chinook-database.git
cp chinook-database/ChinookDatabase/DataSources/Chinook_Sqlite.sqlite $WORKDIR/GenAIExamples/AgentQnA/tests/
```
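Optionally, sanity-check the copied database with the `sqlite3` CLI (assuming it is installed on your host); this is a convenience check, not a required step:

```bash
# Optional: list a few tables from the Chinook database to confirm the copy worked
sqlite3 $WORKDIR/GenAIExamples/AgentQnA/tests/Chinook_Sqlite.sqlite \
  "SELECT name FROM sqlite_master WHERE type='table' LIMIT 5;"
```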

5. Launch other tools. </br>
In this example, we will use some of the mock APIs provided in the Meta CRAG KDD Challenge to demonstrate the benefits of gaining additional context from mock knowledge graphs.

```
docker run -d -p=8080:8000 docker.io/aicrowd/kdd-cup-24-crag-mock-api:v0
```
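To confirm the mock API container is up before moving on, you can filter `docker ps` by image (the `docker run` above does not assign a container name):

```bash
# Check that the CRAG mock API container is running
docker ps --filter "ancestor=docker.io/aicrowd/kdd-cup-24-crag-mock-api:v0" --format "{{.Names}}: {{.Status}}"
```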

6. Launch the multi-agent system. </br>
We provide two options for the `llm_engine` of the agents: (1) open-source LLMs served on Intel Gaudi2, or (2) OpenAI models via API calls.

::::{tab-set}
:::{tab-item} Gaudi
:sync: Gaudi

On Gaudi2, we will serve `meta-llama/Meta-Llama-3.1-70B-Instruct` using vLLM.

First, build the vllm-gaudi docker image.

```bash
cd $WORKDIR
git clone https://github.com/vllm-project/vllm.git
cd ./vllm
git checkout v0.6.6
docker build --no-cache -f Dockerfile.hpu -t opea/vllm-gaudi:latest --shm-size=128g . --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy
```

Then, launch vLLM on Gaudi2 with the command below.

```bash
vllm_port=8086
model="meta-llama/Meta-Llama-3.1-70B-Instruct"
# NOTE: vllm_volume should point to a host directory used as the model cache (HF_HOME is mapped to /data)
docker run -d --runtime=habana --rm --name "vllm-gaudi-server" \
  -e HABANA_VISIBLE_DEVICES=0,1,2,3 -p $vllm_port:8000 -v $vllm_volume:/data \
  -e HF_TOKEN=$HF_TOKEN -e HUGGING_FACE_HUB_TOKEN=$HF_TOKEN -e HF_HOME=/data \
  -e OMPI_MCA_btl_vader_single_copy_mechanism=none \
  -e PT_HPU_ENABLE_LAZY_COLLECTIVES=true \
  -e http_proxy=$http_proxy -e https_proxy=$https_proxy -e no_proxy=$no_proxy \
  -e VLLM_SKIP_WARMUP=true --cap-add=sys_nice --ipc=host \
  opea/vllm-gaudi:latest --model ${model} --max-seq-len-to-capture 16384 --tensor-parallel-size 4
```
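Loading a 70B model can take several minutes. One way to wait for readiness is to poll vLLM's OpenAI-compatible `/v1/models` endpoint; a minimal sketch, assuming the server is reachable at `localhost:$vllm_port`:

```bash
# Poll until the served model appears in the OpenAI-compatible model list
until curl -s http://localhost:${vllm_port}/v1/models | grep -q "Meta-Llama-3.1-70B-Instruct"; do
  echo "Waiting for vllm-gaudi-server to load the model..."
  sleep 10
done
echo "vLLM is ready."
```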

Then, launch the agent microservices.

```bash
cd $WORKDIR/GenAIExamples/AgentQnA/docker_compose/intel/hpu/gaudi/
bash launch_agent_service_gaudi.sh
```

:::
@@ -179,6 +209,7 @@ docker build -t opea/agent:latest --build-arg https_proxy=$https_proxy --build-a
To use OpenAI models, run the commands below.

```
export OPENAI_API_KEY=<your-openai-key>
cd $WORKDIR/GenAIExamples/AgentQnA/docker_compose/intel/cpu/xeon
bash launch_agent_service_openai.sh
```
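If the launch fails, first confirm the key is actually set in your shell; a minimal check:

```bash
# Warn early if the OpenAI API key was not exported in this shell
if [ -z "$OPENAI_API_KEY" ]; then
  echo "OPENAI_API_KEY is not set" >&2
fi
```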
@@ -195,8 +226,11 @@ Refer to the [AgentQnA helm chart](./kubernetes/helm/README.md) for instructions
First, look at the logs of the agent docker containers:

```
# worker RAG agent
docker logs rag-agent-endpoint
# worker SQL agent
docker logs sql-agent-endpoint
```

@@ -206,22 +240,36 @@

```
# supervisor agent
docker logs react-agent-endpoint
```

You should see something like "HTTP server setup successful" if the docker containers started successfully.
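A scripted version of the same check, using the container names above:

```bash
# Grep each agent container's logs for the startup message
for c in rag-agent-endpoint sql-agent-endpoint react-agent-endpoint; do
  if docker logs "$c" 2>&1 | grep -q "HTTP server setup successful"; then
    echo "$c is up"
  else
    echo "$c is not ready yet"
  fi
done
```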

Second, validate the worker RAG agent:

```
curl http://${host_ip}:9095/v1/chat/completions -X POST -H "Content-Type: application/json" -d '{
"query": "Most recent album by Taylor Swift"
"messages": "Michael Jackson song Thriller"
}'
```

Third, validate the worker SQL agent:

```
curl http://${host_ip}:9096/v1/chat/completions -X POST -H "Content-Type: application/json" -d '{
"messages": "How many employees are in the company"
}'
```

Finally, validate the supervisor agent:

```
curl http://${host_ip}:9090/v1/chat/completions -X POST -H "Content-Type: application/json" -d '{
"query": "Most recent album by Taylor Swift"
"messages": "How many albums does Iron Maiden have?"
}'
```
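To smoke-test all three agents in one pass, you can loop over the endpoints used above (RAG on 9095, SQL on 9096, supervisor on 9090):

```bash
# Send one test question to each agent endpoint
for port in 9095 9096 9090; do
  echo "--- agent on port $port ---"
  curl -s http://${host_ip}:${port}/v1/chat/completions -X POST \
    -H "Content-Type: application/json" \
    -d '{"messages": "How many albums does Iron Maiden have?"}'
  echo
done
```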

## Deploy AgentQnA UI

The AgentQnA UI can be deployed locally or using Docker.

For detailed instructions on deploying AgentQnA UI, refer to the [AgentQnA UI Guide](./ui/svelte/README.md).

## How to register your own tools with the agent

Take a look at the tools YAML and Python files in this example. For more details, refer to the "Provide your own tools" section of the instructions [here](https://github.com/opea-project/GenAIComps/tree/main/comps/agent/src/README.md).
Binary file removed AgentQnA/assets/agent_qna_arch.png
Binary file added AgentQnA/assets/img/agent_qna_arch.png
Binary file added AgentQnA/assets/img/agent_ui.png
Binary file added AgentQnA/assets/img/agent_ui_result.png
41 changes: 32 additions & 9 deletions AgentQnA/docker_compose/intel/cpu/xeon/README.md
@@ -41,21 +41,33 @@ This example showcases a hierarchical multi-agent system for question-answering
bash run_ingest_data.sh
```

4. Prepare SQL database
In this example, we will use the SQLite database provided in [TAG-Bench](https://github.com/TAG-Research/TAG-Bench/tree/main). Run the commands below.

```
# Download data
cd $WORKDIR
git clone https://github.com/TAG-Research/TAG-Bench.git
cd TAG-Bench/setup
chmod +x get_dbs.sh
./get_dbs.sh
```
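Optionally, confirm that the database used later in this guide (`california_schools`, per the launch script) is in place, assuming the `sqlite3` CLI is installed:

```bash
# Optional: list a few tables from the California Schools database
sqlite3 $WORKDIR/TAG-Bench/dev_folder/dev_databases/california_schools/california_schools.sqlite \
  "SELECT name FROM sqlite_master WHERE type='table' LIMIT 5;"
```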

5. Launch Tool service
In this example, we will use some of the mock APIs provided in the Meta CRAG KDD Challenge to demonstrate the benefits of gaining additional context from mock knowledge graphs.
```
docker run -d -p=8080:8000 docker.io/aicrowd/kdd-cup-24-crag-mock-api:v0
```
6. Launch multi-agent system

The configurations of the supervisor agent and the worker agents are defined in the docker-compose yaml file. We currently use OpenAI GPT-4o-mini as the LLM.

```
cd $WORKDIR/GenAIExamples/AgentQnA/docker_compose/intel/cpu/xeon
bash launch_agent_service_openai.sh
```

7. [Optional] Build `Agent` docker image if pulling images failed.

```
git clone https://github.com/opea-project/GenAIComps.git
```

@@ -68,8 +80,11 @@ This example showcases a hierarchical multi-agent system for question-answering
First, look at the logs of the agent docker containers:

```
# worker RAG agent
docker logs rag-agent-endpoint
# worker SQL agent
docker logs sql-agent-endpoint
```

@@ -79,19 +94,27 @@

```
# supervisor agent
docker logs react-agent-endpoint
```

You should see something like "HTTP server setup successful" if the docker containers started successfully.

Second, validate the worker RAG agent:

```
curl http://${host_ip}:9095/v1/chat/completions -X POST -H "Content-Type: application/json" -d '{
"messages": "Michael Jackson song Thriller"
}'
```

Third, validate the worker SQL agent:

```
curl http://${host_ip}:9096/v1/chat/completions -X POST -H "Content-Type: application/json" -d '{
"messages": "How many employees are in the company?"
}'
```

Finally, validate the supervisor agent:

```
curl http://${host_ip}:9090/v1/chat/completions -X POST -H "Content-Type: application/json" -d '{
"query": "Most recent album by Taylor Swift"
"messages": "How many albums does Iron Maiden have?"
}'
```

27 changes: 27 additions & 0 deletions AgentQnA/docker_compose/intel/cpu/xeon/compose_openai.yaml
@@ -31,6 +31,33 @@ services:
      LANGCHAIN_PROJECT: "opea-worker-agent-service"
      port: 9095

  worker-sql-agent:
    image: opea/agent:latest
    container_name: sql-agent-endpoint
    volumes:
      - ${WORKDIR}/TAG-Bench/:/home/user/TAG-Bench # SQL database
    ports:
      - "9096:9096"
    ipc: host
    environment:
      ip_address: ${ip_address}
      strategy: sql_agent
      db_name: ${db_name}
      db_path: ${db_path}
      use_hints: false
      hints_file: /home/user/TAG-Bench/${db_name}_hints.csv
      recursion_limit: ${recursion_limit_worker}
      llm_engine: openai
      OPENAI_API_KEY: ${OPENAI_API_KEY}
      model: ${model}
      temperature: 0
      max_new_tokens: ${max_new_tokens}
      stream: false
      require_human_feedback: false
      no_proxy: ${no_proxy}
      http_proxy: ${http_proxy}
      https_proxy: ${https_proxy}
      port: 9096

  supervisor-react-agent:
    image: opea/agent:latest
AgentQnA/docker_compose/intel/cpu/xeon/launch_agent_service_openai.sh
@@ -13,7 +13,10 @@ export temperature=0
export max_new_tokens=4096
export OPENAI_API_KEY=${OPENAI_API_KEY}
export WORKER_AGENT_URL="http://${ip_address}:9095/v1/chat/completions"
export SQL_AGENT_URL="http://${ip_address}:9096/v1/chat/completions"
export RETRIEVAL_TOOL_URL="http://${ip_address}:8889/v1/retrievaltool"
export CRAG_SERVER=http://${ip_address}:8080
export db_name=california_schools
export db_path="sqlite:////home/user/TAG-Bench/dev_folder/dev_databases/${db_name}/${db_name}.sqlite"

docker compose -f compose_openai.yaml up -d
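To point the SQL agent at a different TAG-Bench database, override `db_name` before launching. A hypothetical example (any database present under `dev_folder/dev_databases` should work the same way):

```bash
# Hypothetical: serve a different TAG-Bench dev database with the SQL agent
export db_name=codebase_community  # example name; substitute any database under dev_databases
export db_path="sqlite:////home/user/TAG-Bench/dev_folder/dev_databases/${db_name}/${db_name}.sqlite"
docker compose -f compose_openai.yaml up -d
```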