From 3c94fab766d5e1e2729cba4177de885ee7f5d068 Mon Sep 17 00:00:00 2001 From: alexsin368 Date: Tue, 12 Nov 2024 16:26:08 -0800 Subject: [PATCH 1/9] adding codegen sample guide for gaudi deployment Signed-off-by: alexsin368 --- examples/CodeGen/deploy/gaudi.md | 372 +++++++++++++++++++++++++++++++ 1 file changed, 372 insertions(+) create mode 100644 examples/CodeGen/deploy/gaudi.md diff --git a/examples/CodeGen/deploy/gaudi.md b/examples/CodeGen/deploy/gaudi.md new file mode 100644 index 00000000..9ad3971d --- /dev/null +++ b/examples/CodeGen/deploy/gaudi.md @@ -0,0 +1,372 @@ +# Single node on-prem deployment with vLLM or TGI on Gaudi AI Accelerator + +This deployment section covers single-node on-prem deployment of the CodeGen +example with OPEA comps to deploy using the TGI service. We will be showcasing how +to build an e2e CodeGen solution with the CodeLlama-7b-hf model, +deployed on Intel® Tiber™ AI Cloud (ITAC). To quickly learn about OPEA in just 5 minutes and set up the required hardware and software, please follow the instructions in the +[Getting Started](https://opea-project.github.io/latest/getting-started/README.html) section. If you do +not have an ITAC instance or the hardware is not supported in the ITAC yet, you can still run this on-prem. + +## Overview + +The CodeGen use case uses a single microservice called LLM. In this tutorial, we +will walk through the steps on how on enable it from OPEA GenAIComps to deploy on +a single node TGI megaservice solution. + +The solution is aimed to show how to use the CodeLlama-7b-hf model on the Intel® +Gaudi® AI Accelerator. We will go through how to setup docker containers to start +the microservice and megaservice. The solution will then take text input as the +prompt and generate code accordingly. It is deployed with a UI with 2 modes to +choose from: + +1. Svelte-Based UI +2. React-Based UI + +The React-based UI is optional, but this feature is supported in this example if you +are interested in using it. + +Below is the list of content we will be covering in this tutorial: + +1. Prerequisites +2. Prepare (Building / Pulling) Docker images +3. Use case setup +4. Deploy the use case +5. Interacting with CodeGen deployment + +## Prerequisites + +The first step is to clone the GenAIExamples and GenAIComps. GenAIComps are +fundamental necessary components used to build examples you find in +GenAIExamples and deploy them as microservices. + +```bash +git clone https://github.com/opea-project/GenAIComps.git +git clone https://github.com/opea-project/GenAIExamples.git +``` + +The examples utilize model weights from HuggingFace and langchain. + +Setup your [HuggingFace](https://huggingface.co/) account and generate +[user access token](https://huggingface.co/docs/transformers.js/en/guides/private#step-1-generating-a-user-access-token). + +Setup the HuggingFace token +``` +export HUGGINGFACEHUB_API_TOKEN="Your_Huggingface_API_Token" +``` + +Additionally, if you plan to use the default model CodeLlama-7b-hf, you will +need to [request access](https://huggingface.co/meta-llama/CodeLlama-7b-hf) from HuggingFace. + +The example requires you to set the `host_ip` to deploy the microservices on +endpoint enabled with ports. 
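Before going further, note that TGI downloads the model weights at startup, so the deployment will fail later if your access request for the gated model has not been approved yet. A quick, optional way to check that your token can see the gated repository (this assumes `curl` is available; the exact response format may vary):

```bash
# Should return model metadata (JSON) if access has been granted;
# a 401/403-style error usually means the request is still pending or the token is wrong.
curl -s -H "Authorization: Bearer ${HUGGINGFACEHUB_API_TOKEN}" \
  https://huggingface.co/api/models/meta-llama/CodeLlama-7b-hf
```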
Set the host_ip env variable +``` +export host_ip=$(hostname -I | awk '{print $1}') +``` + +Make sure to setup Proxies if you are behind a firewall +``` +export no_proxy=${your_no_proxy},$host_ip +export http_proxy=${your_http_proxy} +export https_proxy=${your_http_proxy} +``` + +## Prepare (Building / Pulling) Docker images + +This step will involve building/pulling relevant docker +images with step-by-step process along with sanity check in the end. For +CodeGen, the following docker images will be needed: LLM with TGI. +Additionally, you will need to build docker images for the +CodeGen megaservice, and UI (React UI is optional). In total, +there are **3 required docker images** and an optional docker image. + +### Build/Pull Microservice image + +::::::{tab-set} + +:::::{tab-item} Pull +:sync: Pull + +If you decide to pull the docker containers and not build them locally, +you can proceed to the next step where all the necessary containers will +be pulled in from dockerhub. + +::::: +:::::{tab-item} Build +:sync: Build + +From within the `GenAIComps` folder, checkout the release tag. +``` +cd GenAIComps +git checkout tags/v1.1 +``` + +#### Build LLM Image + +```bash +docker build --no-cache -t opea/llm-tgi:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/llms/text-generation/tgi/Dockerfile . +``` + +### Build Mega Service images + +The Megaservice is a pipeline that channels data through different +microservices, each performing varied tasks. The LLM microservice and +flow of data are defined in the `codegen.py` file. You can also add or +remove microservices and customize the megaservice to suit your needs. + +Build the megaservice image for this use case + +```bash +cd .. +cd GenAIExamples +git checkout tags/v1.1 +cd CodeGen +``` + +```bash +docker build --no-cache -t opea/codegen:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f Dockerfile . +cd ../.. +``` + +### Build the UI Image + +You can build 2 modes of UI + +*Svelte UI* + +```bash +cd GenAIExamples/CodeGen/ui/ +docker build --no-cache -t opea/codegen-ui:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f ./docker/Dockerfile . +cd ../../.. +``` + +*React UI (Optional)* +If you want a React-based frontend. + +```bash +cd GenAIExamples/CodeGen/ui/ +docker build --no-cache -t opea/codegen-react-ui:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f ./docker/Dockerfile.react . +cd ../../.. +``` + +### Sanity Check +Check if you have the following set of docker images by running the command `docker images` before moving on to the next step: + +* `opea/llm-tgi:latest` +* `opea/codegen:latest` +* `opea/codegen-ui:latest` +* `opea/codegen-react-ui:latest` (optional) + +::::: +:::::: + +## Use Case Setup + +The use case will use the following combination of GenAIComps and tools + +|Use Case Components | Tools | Model | Service Type | +|---------------- |--------------|-----------------------------|-------| +|LLM | TGI | meta-llama/CodeLlama-7b-hf | OPEA Microservice | +|UI | | NA | Gateway Service | + +Tools and models mentioned in the table are configurable either through the +environment variables or `compose.yaml` file. + +Set the necessary environment variables to setup the use case case by running the `set_env.sh` script. +Here is where the environment variable `LLM_MODEL_ID` is set, and you can change it to another model +by specifying the HuggingFace model card ID. 
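For example, to experiment with a different code model you could override the variable after sourcing the script shown below (the model ID here is only an illustration; whether a given model runs well depends on available Gaudi memory and on the model architectures supported by the tgi-gaudi image):

```bash
# Illustrative override: run this AFTER `source ./set_env.sh` and BEFORE `docker compose up`.
export LLM_MODEL_ID="deepseek-ai/deepseek-coder-6.7b-instruct"
```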
+ +```bash +cd GenAIExamples/CodeGen/docker_compose/ +source ./set_env.sh +cd ../../.. +``` + +## Deploy the Use Case + +In this tutorial, we will be deploying via docker compose with the provided +YAML file. The docker compose instructions should be starting all the +above mentioned services as containers. + +```bash +cd GenAIExamples/CodeGen/docker_compose/intel/hpu/gaudi +docker compose up -d +``` + + +### Checks to Ensure the Services are Running +#### Check Startup and Env Variables +Check the start up log by running `docker compose logs` to ensure there are no errors. +The warning messages print out the variables if they are **NOT** set. + +Here are some sample messages if proxy environment variables are not set: + + WARN[0000] The "no_proxy" variable is not set. Defaulting to a blank string. + WARN[0000] The "https_proxy" variable is not set. Defaulting to a blank string. + WARN[0000] The "http_proxy" variable is not set. Defaulting to a blank string. + WARN[0000] The "no_proxy" variable is not set. Defaulting to a blank string. + WARN[0000] The "https_proxy" variable is not set. Defaulting to a blank string. + WARN[0000] The "http_proxy" variable is not set. Defaulting to a blank string. + WARN[0000] The "no_proxy" variable is not set. Defaulting to a blank string. + WARN[0000] The "http_proxy" variable is not set. Defaulting to a blank string. + WARN[0000] The "https_proxy" variable is not set. Defaulting to a blank string. + WARN[0000] The "no_proxy" variable is not set. Defaulting to a blank string. + WARN[0000] The "http_proxy" variable is not set. Defaulting to a blank string. + WARN[0000] The "https_proxy" variable is not set. Defaulting to a blank string. + +#### Check the Container Status + +Check if all the containers launched via docker compose has started. + +The CodeGen example starts 4 docker containers. Check that these docker +containers are all running, i.e, all the containers `STATUS` are `Up`. +You can do this with the `docker ps -a` command. + +``` +CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES +bbd235074c3d opea/codegen-ui:latest "docker-entrypoint.s…" About a minute ago Up About a minute 0.0.0.0:5173->5173/tcp, :::5173->5173/tcp codegen-gaudi-ui-server +8d3872ca66fa opea/codegen:latest "python codegen.py" About a minute ago Up About a minute 0.0.0.0:7778->7778/tcp, :::7778->7778/tcp codegen-gaudi-backend-server +b9fc39f51cdb opea/llm-tgi:latest "bash entrypoint.sh" About a minute ago Up About a minute 0.0.0.0:9000->9000/tcp, :::9000->9000/tcp llm-tgi-gaudi-server +39994e007f15 ghcr.io/huggingface/tgi-gaudi:2.0.1 "text-generation-lau…" About a minute ago Up About a minute 0.0.0.0:8028->80/tcp, :::8028->80/tcp tgi-gaudi-server +``` + +## Interacting with CodeGen for Deployment + +This section will walk you through the different ways to interact with +the microservices deployed. After a couple minutes, rerun `docker ps -a` +to ensure all the docker containers are still up and running. Then proceed +to validate each microservice and megaservice. + +### TGI Service + +```bash +curl http://${host_ip}:8028/generate \ + -X POST \ + -d '{"inputs":"Implement a high-level API for a TODO list application. The API takes as input an operation request and updates the TODO list in place. 
If the request is invalid, raise an exception.","parameters":{"max_new_tokens":256, "do_sample": true}}' \ + -H 'Content-Type: application/json' +``` + +Here is the output: + +``` +{"generated_text":"\n\nIO iflow diagram:\n\n![IO flow diagram(s)](TodoList.iflow.svg)\n\n### TDD Kata walkthrough\n\n1. Start with a user story. We will add story tests later. In this case, we'll choose a story about adding a TODO:\n ```ruby\n as a user,\n i want to add a todo,\n so that i can get a todo list.\n\n conformance:\n - a new todo is added to the list\n - if the todo text is empty, raise an exception\n ```\n\n1. Write the first test:\n ```ruby\n feature Testing the addition of a todo to the list\n\n given a todo list empty list\n when a user adds a todo\n the todo should be added to the list\n\n inputs:\n when_values: [[\"A\"]]\n\n output validations:\n - todo_list contains { text:\"A\" }\n ```\n\n1. Write the first step implementation in any programming language you like. In this case, we will choose Ruby:\n ```ruby\n def add_"} +``` + +### LLM Microservice + +```bash +curl http://${host_ip}:9000/v1/chat/completions\ + -X POST \ + -d '{"query":"Implement a high-level API for a TODO list application. The API takes as input an operation request and updates the TODO list in place. If the request is invalid, raise an exception.","max_tokens":256,"top_k":10,"top_p":0.95,"typical_p":0.95,"temperature":0.01,"repetition_penalty":1.03,"streaming":true}' \ + -H 'Content-Type: application/json' +``` + +The output is given one character at a time. It is too long to show +here but the last item will be +``` +data: [DONE] +``` + +### MegaService + +```bash +curl http://${host_ip}:7778/v1/codegen -H "Content-Type: application/json" -d '{ + "messages": "Implement a high-level API for a TODO list application. The API takes as input an operation request and updates the TODO list in place. If the request is invalid, raise an exception." + }' +``` + +The output is given one character at a time. It is too long to show +here but the last item will be +``` +data: [DONE] +``` + +## Launch UI +### Svelte UI +To access the frontend, open the following URL in your browser: http://{host_ip}:5173. By default, the UI runs on port 5173 internally. If you prefer to use a different host port to access the frontend, you can modify the port mapping in the `compose.yaml` file as shown below: +```bash + codegen-gaudi-ui-server: + image: ${REGISTRY:-opea}/codegen-ui:${TAG:-latest} + ... + ports: + - "5173:5173" +``` + +### React-Based UI (Optional) +To access the React-based frontend, modify the UI service in the `compose.yaml` file. Replace `codegen-gaudi-ui-server` service with the codegen-gaudi-react-ui-server service as per the config below: +```bash +codegen-gaudi-react-ui-server: + image: ${REGISTRY:-opea}/codegen-react-ui:${TAG:-latest} + container_name: codegen-gaudi-react-ui-server + environment: + - no_proxy=${no_proxy} + - https_proxy=${https_proxy} + - http_proxy=${http_proxy} + - APP_CODE_GEN_URL=${BACKEND_SERVICE_ENDPOINT} + depends_on: + - codegen-gaudi-backend-server + ports: + - "5174:80" + ipc: host + restart: always +``` +Once the services are up, open the following URL in your browser: http://{host_ip}:5174. By default, the UI runs on port 80 internally. If you prefer to use a different host port to access the frontend, you can modify the port mapping in the `compose.yaml` file as shown below: +```bash + codegen-gaudi-react-ui-server: + image: ${REGISTRY:-opea}/codegen-react-ui:${TAG:-latest} + ... 
+ ports: + - "80:80" +``` + +## Check Docker Container Logs + +You can check the log of a container by running this command: + +```bash +docker logs -t +``` + +You can also check the overall logs with the following command, where the +`compose.yaml` is the megaservice docker-compose configuration file. + +Assumming you are still in this directory `GenAIExamples/CodeGen/docker_compose/intel/hpu/gaudi`, +run the following command to check the logs: +```bash +docker compose -f compose.yaml logs +``` + +View the docker input parameters in `./CodeGen/docker_compose/intel/hpu/gaudi/compose.yaml` + +```yaml + tgi-service: + image: ghcr.io/huggingface/tgi-gaudi:2.0.1 + container_name: tgi-gaudi-server + ports: + - "8028:80" + volumes: + - "./data:/data" + environment: + no_proxy: ${no_proxy} + http_proxy: ${http_proxy} + https_proxy: ${https_proxy} + HABANA_VISIBLE_DEVICES: all + OMPI_MCA_btl_vader_single_copy_mechanism: none + HF_TOKEN: ${HUGGINGFACEHUB_API_TOKEN} + runtime: habana + cap_add: + - SYS_NICE + ipc: host + command: --model-id ${LLM_MODEL_ID} --max-input-length 1024 --max-total-tokens 2048 +``` + +The input `--model-id` is `${LLM_MODEL_ID}`. Ensure the environment variable `LLM_MODEL_ID` +is set correctly. Check spelling. Whenever this is changed, restart the containers to use +the newly selected model. + + +## Stop the services + +Once you are done with the entire pipeline and wish to stop and remove all the containers, use the command below: +``` +docker compose down +``` From 6f235c1bb211075aa43f95719376b5e4ef9dba3e Mon Sep 17 00:00:00 2001 From: alexsin368 Date: Fri, 15 Nov 2024 14:40:37 -0800 Subject: [PATCH 2/9] update with TAG instead of version number Signed-off-by: alexsin368 --- examples/CodeGen/deploy/gaudi.md | 33 +++++++++++++++++--------------- 1 file changed, 18 insertions(+), 15 deletions(-) diff --git a/examples/CodeGen/deploy/gaudi.md b/examples/CodeGen/deploy/gaudi.md index 9ad3971d..8877a166 100644 --- a/examples/CodeGen/deploy/gaudi.md +++ b/examples/CodeGen/deploy/gaudi.md @@ -37,11 +37,13 @@ Below is the list of content we will be covering in this tutorial: The first step is to clone the GenAIExamples and GenAIComps. GenAIComps are fundamental necessary components used to build examples you find in -GenAIExamples and deploy them as microservices. +GenAIExamples and deploy them as microservices. Also set the `TAG` +environment variable with the version. ```bash git clone https://github.com/opea-project/GenAIComps.git git clone https://github.com/opea-project/GenAIExamples.git +export TAG=1.1 ``` The examples utilize model weights from HuggingFace and langchain. @@ -97,13 +99,13 @@ be pulled in from dockerhub. From within the `GenAIComps` folder, checkout the release tag. ``` cd GenAIComps -git checkout tags/v1.1 +git checkout tags/v${TAG} ``` #### Build LLM Image ```bash -docker build --no-cache -t opea/llm-tgi:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/llms/text-generation/tgi/Dockerfile . +docker build --no-cache -t opea/llm-tgi:${TAG} --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/llms/text-generation/tgi/Dockerfile . ``` ### Build Mega Service images @@ -118,12 +120,12 @@ Build the megaservice image for this use case ```bash cd .. cd GenAIExamples -git checkout tags/v1.1 +git checkout tags/v${TAG} cd CodeGen ``` ```bash -docker build --no-cache -t opea/codegen:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f Dockerfile . 
+docker build --no-cache -t opea/codegen:${TAG} --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f Dockerfile . cd ../.. ``` @@ -135,7 +137,7 @@ You can build 2 modes of UI ```bash cd GenAIExamples/CodeGen/ui/ -docker build --no-cache -t opea/codegen-ui:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f ./docker/Dockerfile . +docker build --no-cache -t opea/codegen-ui:${TAG} --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f ./docker/Dockerfile . cd ../../.. ``` @@ -144,17 +146,18 @@ If you want a React-based frontend. ```bash cd GenAIExamples/CodeGen/ui/ -docker build --no-cache -t opea/codegen-react-ui:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f ./docker/Dockerfile.react . +docker build --no-cache -t opea/codegen-react-ui:${TAG} --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f ./docker/Dockerfile.react . cd ../../.. ``` ### Sanity Check -Check if you have the following set of docker images by running the command `docker images` before moving on to the next step: +Check if you have the following set of docker images by running the command `docker images` before moving on to the next step. +The tags are based on what you set the environment variable `TAG` to. -* `opea/llm-tgi:latest` -* `opea/codegen:latest` -* `opea/codegen-ui:latest` -* `opea/codegen-react-ui:latest` (optional) +* `opea/llm-tgi:${TAG}` +* `opea/codegen:${TAG}` +* `opea/codegen-ui:${TAG}` +* `opea/codegen-react-ui:${TAG}` (optional) ::::: :::::: @@ -223,9 +226,9 @@ You can do this with the `docker ps -a` command. ``` CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES -bbd235074c3d opea/codegen-ui:latest "docker-entrypoint.s…" About a minute ago Up About a minute 0.0.0.0:5173->5173/tcp, :::5173->5173/tcp codegen-gaudi-ui-server -8d3872ca66fa opea/codegen:latest "python codegen.py" About a minute ago Up About a minute 0.0.0.0:7778->7778/tcp, :::7778->7778/tcp codegen-gaudi-backend-server -b9fc39f51cdb opea/llm-tgi:latest "bash entrypoint.sh" About a minute ago Up About a minute 0.0.0.0:9000->9000/tcp, :::9000->9000/tcp llm-tgi-gaudi-server +bbd235074c3d opea/codegen-ui:${TAG} "docker-entrypoint.s…" About a minute ago Up About a minute 0.0.0.0:5173->5173/tcp, :::5173->5173/tcp codegen-gaudi-ui-server +8d3872ca66fa opea/codegen:${TAG} "python codegen.py" About a minute ago Up About a minute 0.0.0.0:7778->7778/tcp, :::7778->7778/tcp codegen-gaudi-backend-server +b9fc39f51cdb opea/llm-tgi:${TAG} "bash entrypoint.sh" About a minute ago Up About a minute 0.0.0.0:9000->9000/tcp, :::9000->9000/tcp llm-tgi-gaudi-server 39994e007f15 ghcr.io/huggingface/tgi-gaudi:2.0.1 "text-generation-lau…" About a minute ago Up About a minute 0.0.0.0:8028->80/tcp, :::8028->80/tcp tgi-gaudi-server ``` From b5d8d84c4d5b0656f64a3e37fa44cb24d17ccee0 Mon Sep 17 00:00:00 2001 From: alexsin368 Date: Fri, 15 Nov 2024 15:02:37 -0800 Subject: [PATCH 3/9] remove mention of vllm Signed-off-by: alexsin368 --- examples/CodeGen/deploy/gaudi.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/examples/CodeGen/deploy/gaudi.md b/examples/CodeGen/deploy/gaudi.md index 8877a166..b5a9a78c 100644 --- a/examples/CodeGen/deploy/gaudi.md +++ b/examples/CodeGen/deploy/gaudi.md @@ -1,4 +1,4 @@ -# Single node on-prem deployment with vLLM or TGI on Gaudi AI Accelerator +# Single node on-prem deployment with TGI on Gaudi AI Accelerator This deployment section covers single-node on-prem deployment of the CodeGen example 
with OPEA comps to deploy using the TGI service. We will be showcasing how From 6798948bcc1dc9562d409daf0213a68159a8b0d9 Mon Sep 17 00:00:00 2001 From: alexsin368 Date: Fri, 15 Nov 2024 15:35:23 -0800 Subject: [PATCH 7/9] fix typos, add link to ITAC, attempt to fix IO flow diagram link issue Signed-off-by: alexsin368 --- examples/CodeGen/deploy/gaudi.md | 23 ++++++++++++----------- 1 file changed, 12 insertions(+), 11 deletions(-) diff --git a/examples/CodeGen/deploy/gaudi.md b/examples/CodeGen/deploy/gaudi.md index b5a9a78c..b4a4153f 100644 --- a/examples/CodeGen/deploy/gaudi.md +++ b/examples/CodeGen/deploy/gaudi.md @@ -3,14 +3,15 @@ This deployment section covers single-node on-prem deployment of the CodeGen example with OPEA comps to deploy using the TGI service. We will be showcasing how to build an e2e CodeGen solution with the CodeLlama-7b-hf model, -deployed on Intel® Tiber™ AI Cloud (ITAC). To quickly learn about OPEA in just 5 minutes and set up the required hardware and software, please follow the instructions in the -[Getting Started](https://opea-project.github.io/latest/getting-started/README.html) section. If you do -not have an ITAC instance or the hardware is not supported in the ITAC yet, you can still run this on-prem. +deployed on Intel® Tiber™ AI Cloud ([ITAC](https://www.intel.com/content/www/us/en/developer/tools/tiber/ai-cloud.html)). +To quickly learn about OPEA in just 5 minutes and set up the required hardware and software, +please follow the instructions in the [Getting Started](https://opea-project.github.io/latest/getting-started/README.html) +section. If you do not have an ITAC instance or the hardware is not supported in the ITAC yet, you can still run this on-prem. ## Overview The CodeGen use case uses a single microservice called LLM. In this tutorial, we -will walk through the steps on how on enable it from OPEA GenAIComps to deploy on +will walk through the steps on how to enable it from OPEA GenAIComps to deploy on a single node TGI megaservice solution. The solution is aimed to show how to use the CodeLlama-7b-hf model on the Intel® @@ -174,7 +175,7 @@ The use case will use the following combination of GenAIComps and tools Tools and models mentioned in the table are configurable either through the environment variables or `compose.yaml` file. -Set the necessary environment variables to setup the use case case by running the `set_env.sh` script. +Set the necessary environment variables to setup the use case by running the `set_env.sh` script. Here is where the environment variable `LLM_MODEL_ID` is set, and you can change it to another model by specifying the HuggingFace model card ID. @@ -198,7 +199,7 @@ docker compose up -d ### Checks to Ensure the Services are Running #### Check Startup and Env Variables -Check the start up log by running `docker compose logs` to ensure there are no errors. +Check the startup log by running `docker compose logs` to ensure there are no errors. The warning messages print out the variables if they are **NOT** set. Here are some sample messages if proxy environment variables are not set: @@ -218,7 +219,7 @@ Here are some sample messages if proxy environment variables are not set: #### Check the Container Status -Check if all the containers launched via docker compose has started. +Check if all the containers launched via docker compose have started. The CodeGen example starts 4 docker containers. Check that these docker containers are all running, i.e, all the containers `STATUS` are `Up`. 
@@ -250,7 +251,7 @@ curl http://${host_ip}:8028/generate \ Here is the output: -``` +```bash {"generated_text":"\n\nIO iflow diagram:\n\n![IO flow diagram(s)](TodoList.iflow.svg)\n\n### TDD Kata walkthrough\n\n1. Start with a user story. We will add story tests later. In this case, we'll choose a story about adding a TODO:\n ```ruby\n as a user,\n i want to add a todo,\n so that i can get a todo list.\n\n conformance:\n - a new todo is added to the list\n - if the todo text is empty, raise an exception\n ```\n\n1. Write the first test:\n ```ruby\n feature Testing the addition of a todo to the list\n\n given a todo list empty list\n when a user adds a todo\n the todo should be added to the list\n\n inputs:\n when_values: [[\"A\"]]\n\n output validations:\n - todo_list contains { text:\"A\" }\n ```\n\n1. Write the first step implementation in any programming language you like. In this case, we will choose Ruby:\n ```ruby\n def add_"} ``` @@ -265,7 +266,7 @@ curl http://${host_ip}:9000/v1/chat/completions\ The output is given one character at a time. It is too long to show here but the last item will be -``` +```bash data: [DONE] ``` @@ -279,7 +280,7 @@ curl http://${host_ip}:7778/v1/codegen -H "Content-Type: application/json" -d '{ The output is given one character at a time. It is too long to show here but the last item will be -``` +```bash data: [DONE] ``` @@ -363,7 +364,7 @@ View the docker input parameters in `./CodeGen/docker_compose/intel/hpu/gaudi/c ``` The input `--model-id` is `${LLM_MODEL_ID}`. Ensure the environment variable `LLM_MODEL_ID` -is set correctly. Check spelling. Whenever this is changed, restart the containers to use +is set and spelled correctly. Check spelling. Whenever this is changed, restart the containers to use the newly selected model. From a9ac58c6e3dfbe18d3381e447f0cfeb4af5e3147 Mon Sep 17 00:00:00 2001 From: alexsin368 Date: Fri, 15 Nov 2024 17:28:54 -0800 Subject: [PATCH 8/9] modify and add index.rst files Signed-off-by: alexsin368 --- examples/CodeGen/deploy/index.rst | 14 ++++++++++++++ examples/index.rst | 2 ++ 2 files changed, 16 insertions(+) create mode 100644 examples/CodeGen/deploy/index.rst diff --git a/examples/CodeGen/deploy/index.rst b/examples/CodeGen/deploy/index.rst new file mode 100644 index 00000000..ac0a37d0 --- /dev/null +++ b/examples/CodeGen/deploy/index.rst @@ -0,0 +1,14 @@ +.. _codegen-example-deployment: + +CodeGen Example Deployment Options +################################### + +Here are some deployment options, depending on your hardware and environment: + +Single Node +*********** + +.. 
toctree:: + :maxdepth: 1 + + Gaudi AI Accelerator \ No newline at end of file diff --git a/examples/index.rst b/examples/index.rst index 6524dc1d..283693f3 100644 --- a/examples/index.rst +++ b/examples/index.rst @@ -12,6 +12,8 @@ GenAIExamples are designed to give developers an easy entry into generative AI, ChatQnA/deploy/index AgentQnA/AgentQnA_Guide AgentQnA/deploy/index + CodeGen/deploy/gaudi.md + CodeGen/deploy/index ---- From 8f1517a59f699369575f6ca99249f2770944be7d Mon Sep 17 00:00:00 2001 From: alexsin368 Date: Fri, 15 Nov 2024 17:36:36 -0800 Subject: [PATCH 9/9] make text output to not think it's a hyperlink Signed-off-by: alexsin368 --- examples/CodeGen/deploy/gaudi.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/examples/CodeGen/deploy/gaudi.md b/examples/CodeGen/deploy/gaudi.md index 2931a237..d4dfafa9 100644 --- a/examples/CodeGen/deploy/gaudi.md +++ b/examples/CodeGen/deploy/gaudi.md @@ -250,8 +250,8 @@ curl http://${host_ip}:8028/generate \ Here is the output: -```bash -{"generated_text":"\n\nIO iflow diagram:\n\n![IO flow diagram(s)](TodoList.iflow.svg)\n\n### TDD Kata walkthrough\n\n1. Start with a user story. We will add story tests later. In this case, we'll choose a story about adding a TODO:\n ```ruby\n as a user,\n i want to add a todo,\n so that i can get a todo list.\n\n conformance:\n - a new todo is added to the list\n - if the todo text is empty, raise an exception\n ```\n\n1. Write the first test:\n ```ruby\n feature Testing the addition of a todo to the list\n\n given a todo list empty list\n when a user adds a todo\n the todo should be added to the list\n\n inputs:\n when_values: [[\"A\"]]\n\n output validations:\n - todo_list contains { text:\"A\" }\n ```\n\n1. Write the first step implementation in any programming language you like. In this case, we will choose Ruby:\n ```ruby\n def add_"} +``` +{"generated_text":"\n\nIO iflow diagram:\n\n!\[IO flow diagram(s)\]\(TodoList.iflow.svg\)\n\n### TDD Kata walkthrough\n\n1. Start with a user story. We will add story tests later. In this case, we'll choose a story about adding a TODO:\n ```ruby\n as a user,\n i want to add a todo,\n so that i can get a todo list.\n\n conformance:\n - a new todo is added to the list\n - if the todo text is empty, raise an exception\n ```\n\n1. Write the first test:\n ```ruby\n feature Testing the addition of a todo to the list\n\n given a todo list empty list\n when a user adds a todo\n the todo should be added to the list\n\n inputs:\n when_values: [[\"A\"]]\n\n output validations:\n - todo_list contains { text:\"A\" }\n ```\n\n1. Write the first step implementation in any programming language you like. In this case, we will choose Ruby:\n ```ruby\n def add_"} ``` ### LLM Microservice @@ -265,7 +265,7 @@ curl http://${host_ip}:9000/v1/chat/completions\ The output is given one character at a time. It is too long to show here but the last item will be -```bash +``` data: [DONE] ``` @@ -279,7 +279,7 @@ curl http://${host_ip}:7778/v1/codegen -H "Content-Type: application/json" -d '{ The output is given one character at a time. It is too long to show here but the last item will be -```bash +``` data: [DONE] ```
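For reviewers who want to preview the rendered guide locally: the series above is in standard `git format-patch` (mbox) format, so it can be applied to a checkout of the target documentation repository with `git am`. The repository URL and patch paths below are illustrative and should be adjusted to your setup:

```bash
# Illustrative: clone the docs repository this series targets, then apply the patches in order.
git clone https://github.com/opea-project/docs.git
cd docs
git am /path/to/codegen-gaudi-patches/*.patch
```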