AgentQnA helm chart deploy update #837

Open
wants to merge 1 commit into
base: main

yongfengdu

Sync latest changes with GenAIExamples.
Added CPU deployment with a smaller model.
Updated README with detailed instructions.
Support using PVC for passing tools configuration.
Fix minor issues.

Description

Update agentqna helm charts

Issues

Closed #827
Closed #798
Closed #783
Example 1524
Example 1523

Type of change

List the type of change like below. Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds new functionality)
  • Breaking change (fix or feature that would break existing design and interface)

Dependencies

List the newly introduced 3rd party dependency if exists.

Tests

Manually tested deployment and `helm test` with gaudi-values.yaml and cpu-values.yaml.

Signed-off-by: Dolpher Du <[email protected]>

@eero-t eero-t left a comment

Noticed a few trivial things.


Note that this is an example to demonstrate how agent works and tested with prepared data and questions. Using different datasets, models and questions may get different results.

Agent usually requires larger models to performance better, we used Llama-3.3-70B-Instruct for test, which requires 4x Gaudi devices for local deployment.

Typo:

Suggested change
Agent usually requires larger models to performance better, we used Llama-3.3-70B-Instruct for test, which requires 4x Gaudi devices for local deployment.
Agent usually requires larger models to perform better, we used Llama-3.3-70B-Instruct for test, which requires 4x Gaudi devices for local deployment.

I guess it could also be run (slowly) on CPU with enough memory?


Agent usually requires larger models to performance better, we used Llama-3.3-70B-Instruct for test, which requires 4x Gaudi devices for local deployment.

With helm chart, we also provided option with smaller model(Meta-Llama-3-8B-Instruct) with compromised performance on Xeon CPU only environment for you to try.

Suggested change
With helm chart, we also provided option with smaller model(Meta-Llama-3-8B-Instruct) with compromised performance on Xeon CPU only environment for you to try.
With helm chart, we also provided option with smaller model (Meta-Llama-3-8B-Instruct) with compromised performance on Xeon CPU only environment for you to try.


## Deploy

helm install agentqna oci://ghcr.io/opea-project/charts/agentqna --set global.HUGGINGFACEHUB_API_TOKEN=${HUGGINGFACEHUB_API_TOKEN} --set tgi.enabled=True
The Deployment includes preparing tools and sql data.

Suggested change
The Deployment includes preparing tools and sql data.
The Deployment includes preparing tools and SQL data.


A volume is required to put tools configuration used by agent, and the database data used by sqlagent.

We'll use hostPath in this readme, which is convenient for single worker node deployment. PVC is recommended in a bigger cluster. If you want to use a PVC, comment out the `toolHostPath` and replace with `toolPVC` in the values.yaml.

Suggested change
We'll use hostPath in this readme, which is convenient for single worker node deployment. PVC is recommended in a bigger cluster. If you want to use a PVC, comment out the `toolHostPath` and replace with `toolPVC` in the values.yaml.
We'll use hostPath in this readme, which is convenient for single worker node deployment. PVC is recommended in a bigger cluster. If you want to use a PVC, comment out the `toolHostPath` and replace with `toolPVC` in the `values.yaml`.
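
For the PVC route, a claim has to exist before the chart can reference it. The following is a minimal sketch, not the chart's documented procedure: it only writes a standard Kubernetes PersistentVolumeClaim manifest to a local file. The claim name `agent-tools`, the 1Gi size, and the ReadWriteMany access mode are assumptions for illustration, and how `toolPVC` expects the claim to be referenced is not shown in this PR excerpt.

```shell
# Sketch only: the claim name, size and access mode are assumptions,
# not values taken from the chart.
PVC_NAME="${PVC_NAME:-agent-tools}"   # hypothetical claim name

# Write a plain PersistentVolumeClaim manifest to the current directory.
cat > tools-pvc.yaml <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ${PVC_NAME}
spec:
  accessModes:
    - ReadWriteMany   # assumed, so agent pods on different nodes can share it
  resources:
    requests:
      storage: 1Gi    # assumed size
EOF

# Apply it to the cluster afterwards, for example:
# kubectl apply -f tools-pvc.yaml
```

ReadWriteMany needs a storage class that supports it; on a cluster without one, the access mode would have to change.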


We'll use hostPath in this readme, which is convenient for single worker node deployment. PVC is recommended in a bigger cluster. If you want to use a PVC, comment out the `toolHostPath` and replace with `toolPVC` in the values.yaml.

Create the directory /mnt/tools in the worker node, which is the default in values.yaml. We use the same directory for all 3 agents for easy configuration.

Suggested change
Create the directory /mnt/tools in the worker node, which is the default in values.yaml. We use the same directory for all 3 agents for easy configuration.
Create the directory `/mnt/tools` in the worker node, which is the default in `values.yaml`. We use the same directory for all 3 agents for easy configuration.
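
The directory step on a single worker node could be sketched as below. This is an illustration under assumptions: `prepare_tools_dir` is a hypothetical helper name, and only the `/mnt/tools` default comes from the README text above.

```shell
# Hypothetical helper: create the hostPath directory used for the tools
# configuration. Falls back to /mnt/tools, the default named in the README.
prepare_tools_dir() {
  dir="${1:-/mnt/tools}"
  mkdir -p "$dir"   # no error if the directory already exists
}

# On the worker node, creating a directory under /mnt usually needs root:
# sudo mkdir -p /mnt/tools
```

Passing a different path as the first argument only makes sense if the host path configured in `values.yaml` is changed to match.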

OPENAI_API_KEY: EMPTY
model: "YourModel"
# Use OpenAI KEY
# llm_engine: openai

Why are these commented out?

Comment on lines +18 to +22
OPENAI_API_KEY: EMPTY
model: "YourModel"
# Use OpenAI KEY
# llm_engine: openai
# OPENAI_API_KEY: YourOpenAIKey

@eero-t eero-t Feb 27, 2025

These extra key comments are redundant for all of these 3 subcharts.

@@ -6,6 +6,24 @@

Just in case:

Suggested change
vllm:
  enabled: false

@@ -6,6 +6,15 @@

Just in case:

Suggested change
tgi:
  enabled: false

Comment on lines +30 to +31
# Uncomment this if you have an tool configuration file
tools: /home/user/comps/agent/src/tools/custom_tools.yaml

It's already uncommented?

2 participants