From 0fd105c7f23d229fcbfcb2d265d2b881efb842a2 Mon Sep 17 00:00:00 2001
From: philschmid
Date: Mon, 21 Oct 2024 14:22:41 +0200
Subject: [PATCH 1/6] wip

---
 docs/source/how-to/cloud/gcp.mdx | 123 ++++++++++++++++++++++++++++++-
 1 file changed, 121 insertions(+), 2 deletions(-)

diff --git a/docs/source/how-to/cloud/gcp.mdx b/docs/source/how-to/cloud/gcp.mdx
index 209f126..addc68c 100644
--- a/docs/source/how-to/cloud/gcp.mdx
+++ b/docs/source/how-to/cloud/gcp.mdx
@@ -1,3 +1,122 @@
-# HUGS on Google Cloud
+# HUGS on Google Cloud
 
-TODO
+The Hugging Face Generative AI Services, also known as HUGS, can be deployed in Google Cloud via the Google Cloud Marketplace offering.
+
+This collaboration brings Hugging Face's extensive library of pre-trained models and their Text Generation Inference (TGI) solution to Google Cloud customers, enabling seamless integration of state-of-the-art Large Language Models (LLMs) within the Google Cloud infrastructure.
+
+HUGS provides access to a hand-picked and manually benchmarked collection of the latest and most performant open LLMs hosted on the Hugging Face Hub, packaged as TGI-optimized container applications that users can deploy as third-party Kubernetes applications on Google Cloud or in on-premises environments.
+
+With HUGS, developers can easily find, subscribe to, and deploy Hugging Face models using Google Cloud infrastructure, leveraging the power of NVIDIA GPUs on optimized, zero-configuration TGI containers.
+
+## Subscribe to HUGS on Google Cloud Marketplace
+
+1. Go to [HUGS Google Cloud Marketplace listing](https://console.cloud.google.com/marketplace/product/huggingface-public/hugs__draft?authuser=1&project=huggingface-public)
+
+    ![HUGS on Google Cloud Marketplace](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/hugs/gcp/hugs-marketplace-listing.png)
+
+2. Subscribe to the product in Google Cloud by following the instructions on the page. At the time of writing (September 2024), the steps are to:
+
+    1. Click `Purchase`, then go to the next page.
+    2. Configure the order by selecting the right plan, billing account, and confirming the terms. Then click `Subscribe`.
+
+    ![HUGS Configuration on Google Cloud Marketplace](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/hugs/gcp/hugs-configuration.png)
+
+3. You should see a "Your order request has been sent to Hugging Face" message with a "Go to Product Page" button. Click it.
+
+    ![HUGS Confirmation on Google Cloud Marketplace](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/hugs/gcp/hugs-confirmation.png)
+
+
+
+To check whether you are already subscribed, look at the product page: if the "Configure" button is enabled instead of the "Purchase" button, either you or someone else from your organization has already requested access for your account.
+
+
+
+
+## Deploy HUGS on Google Cloud GKE
+
+This example showcases how to deploy a HUGS container and model on Google Cloud GKE.
+
+
+
+This example assumes that you have a Google Cloud account, that you have [installed and set up the Google Cloud CLI](https://cloud.google.com/sdk/docs/install), and that you are logged in to your account with the necessary permissions to subscribe to offerings in the Google Cloud Marketplace and to create and manage IAM permissions and resources such as Google Kubernetes Engine (GKE).
+
+
+
+When deploying HUGS on Google Cloud through the UI, you can either select an existing GKE cluster or create a new one. If you want to create a new one, you can follow the instructions [here](https://cloud.google.com/kubernetes-engine/docs/how-to/creating-a-cluster). Additionally, you need to define:
+
+* Namespace: The namespace to deploy the HUGS container and model.
+* App Instance Name: The name of the HUGS container.
+* Hugs Model Id: Select the model you want to deploy from the Hugging Face Hub. You can find all supported models [here](../models).
+* Reporting Service Account: The service account to use for reporting.
+
+    ![HUGS Deployment Configuration](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/hugs/gcp/hugs-deploy.png)
+
+Next you click on `Deploy` and wait for the deployment to finish. This takes around 10-15 minutes.
+
+
+
+If you want to better understand the different deployment options you have, e.g. 1x NVIDIA L4 GPU for Meta Llama 3.1 8B Instruct, you can check out the [supported model matrix](../models).
+
+
+
+
+## Create a GPU GKE Cluster for HUGS
+
+To deploy HUGS on Google Cloud, you'll need a GKE cluster with GPU support. Here's a step-by-step guide to create one:
+
+1. Ensure you have the [Google Cloud CLI installed and configured](https://cloud.google.com/sdk/docs/install-sdk).
+
+2. Set up environment variables for your cluster configuration:
+
+```bash
+export PROJECT_ID="your-project-id" # Your Google Cloud Project ID which is subscribed to HUGS
+export CLUSTER_NAME="hugs-cluster" # The name of the GKE cluster
+export LOCATION="us-central1" # The location of the GKE cluster
+export MACHINE_TYPE="g2-standard-12" # The machine type of the GKE cluster
+export GPU_TYPE="nvidia-l4" # The type of GPU to use
+export GPU_COUNT=1 # The number of GPUs to use
+```
+
+3. Create the GKE cluster:
+
+```bash
+gcloud container clusters create $CLUSTER_NAME \
+  --project=$PROJECT_ID \
+  --zone=$LOCATION \
+  --release-channel=stable \
+  --cluster-version=1.29 \
+  --machine-type=$MACHINE_TYPE \
+  --num-nodes=1 \
+  --no-enable-autoprovisioning
+```
+
+4. Add a GPU node pool to the cluster:
+
+```bash
+gcloud container node-pools create gpu-pool \
+  --cluster=$CLUSTER_NAME \
+  --zone=$LOCATION \
+  --machine-type=$MACHINE_TYPE \
+  --accelerator type=$GPU_TYPE,count=$GPU_COUNT \
+  --num-nodes=1 \
+  --enable-autoscaling \
+  --min-nodes=1 \
+  --max-nodes=1 \
+  --spot \
+  --disk-type=pd-ssd \
+  --disk-size=100GB
+```
+
+5. Configure kubectl to use the new cluster:
+
+```bash
+gcloud container clusters get-credentials $CLUSTER_NAME --zone=$LOCATION
+```
+
+Your GKE cluster with GPU support is now ready for HUGS deployment. You can proceed to deploy HUGS using the Google Cloud Marketplace as described in the previous section.
+
+
+
+For more detailed information on creating and managing GKE clusters, refer to the [official Google Kubernetes Engine documentation](https://cloud.google.com/kubernetes-engine/docs) or [run GPUs in GKE Standard node pools](https://cloud.google.com/kubernetes-engine/docs/how-to/gpus)
+
+

From e3a45a026b67231d03a0deb4ceaf15fb4d9efe08 Mon Sep 17 00:00:00 2001
From: Philipp Schmid <32632186+philschmid@users.noreply.github.com>
Date: Mon, 21 Oct 2024 15:54:29 +0200
Subject: [PATCH 2/6] Update docs/source/how-to/cloud/gcp.mdx

Co-authored-by: Alvaro Bartolome <36760800+alvarobartt@users.noreply.github.com>
---
 docs/source/how-to/cloud/gcp.mdx | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/source/how-to/cloud/gcp.mdx b/docs/source/how-to/cloud/gcp.mdx
index addc68c..54a593d 100644
--- a/docs/source/how-to/cloud/gcp.mdx
+++ b/docs/source/how-to/cloud/gcp.mdx
@@ -10,7 +10,7 @@ With HUGS, developers can easily find, subscribe to, and deploy Hugging Face mod
 
 ## Subscribe to HUGS on Google Cloud Marketplace
 
-1. Go to [HUGS Google Cloud Marketplace listing](https://console.cloud.google.com/marketplace/product/huggingface-public/hugs__draft?authuser=1&project=huggingface-public)
+1. Go to [HUGS Google Cloud Marketplace listing](https://console.cloud.google.com/marketplace/product/huggingface-public/hugs__draft)
 
     ![HUGS on Google Cloud Marketplace](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/hugs/gcp/hugs-marketplace-listing.png)
 

From 16a4e575fd938c354e88872249f24ee9384b1170 Mon Sep 17 00:00:00 2001
From: Philipp Schmid <32632186+philschmid@users.noreply.github.com>
Date: Mon, 21 Oct 2024 15:54:34 +0200
Subject: [PATCH 3/6] Update docs/source/how-to/cloud/gcp.mdx

Co-authored-by: Alvaro Bartolome <36760800+alvarobartt@users.noreply.github.com>
---
 docs/source/how-to/cloud/gcp.mdx | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/source/how-to/cloud/gcp.mdx b/docs/source/how-to/cloud/gcp.mdx
index 54a593d..b040610 100644
--- a/docs/source/how-to/cloud/gcp.mdx
+++ b/docs/source/how-to/cloud/gcp.mdx
@@ -14,7 +14,7 @@ With HUGS, developers can easily find, subscribe to, and deploy Hugging Face mod
 
     ![HUGS on Google Cloud Marketplace](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/hugs/gcp/hugs-marketplace-listing.png)
 
-2. Subscribe to the product in Google Cloud by following the instructions on the page. At the time of writing (September 2024), the steps are to:
+2. Subscribe to the product in Google Cloud by following the instructions on the page. At the time of writing (October 2024), the steps are to:
 
     1. Click `Purchase`, then go to the next page.
     2. Configure the order by selecting the right plan, billing account, and confirming the terms. Then click `Subscribe`.

From b72cfee87fc70d242ad0eb1e6e1b92cbf87c5e2a Mon Sep 17 00:00:00 2001
From: Philipp Schmid <32632186+philschmid@users.noreply.github.com>
Date: Mon, 21 Oct 2024 15:54:51 +0200
Subject: [PATCH 4/6] Update docs/source/how-to/cloud/gcp.mdx

Co-authored-by: Alvaro Bartolome <36760800+alvarobartt@users.noreply.github.com>
---
 docs/source/how-to/cloud/gcp.mdx | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/source/how-to/cloud/gcp.mdx b/docs/source/how-to/cloud/gcp.mdx
index b040610..58f081c 100644
--- a/docs/source/how-to/cloud/gcp.mdx
+++ b/docs/source/how-to/cloud/gcp.mdx
@@ -55,7 +55,7 @@ Next you click on `Deploy` and wait for the deployment to finish. This takes aro
 
-If you want to better understand the different deployment options you have, e.g. 1x NVIDIA L4 GPU for Meta Llama 3.1 8B Instruct, you can check out the [supported model matrix](../models).
+If you want to better understand the different deployment options you have, e.g. 1x NVIDIA L4 GPU for Meta Llama 3.1 8B Instruct, you can check out the [supported model matrix](../../models.mdx).
 

From 141bada981e79f59554b95197dcf1545282d752c Mon Sep 17 00:00:00 2001
From: Philipp Schmid <32632186+philschmid@users.noreply.github.com>
Date: Mon, 21 Oct 2024 15:54:57 +0200
Subject: [PATCH 5/6] Update docs/source/how-to/cloud/gcp.mdx

Co-authored-by: Alvaro Bartolome <36760800+alvarobartt@users.noreply.github.com>
---
 docs/source/how-to/cloud/gcp.mdx | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/source/how-to/cloud/gcp.mdx b/docs/source/how-to/cloud/gcp.mdx
index 58f081c..4904da9 100644
--- a/docs/source/how-to/cloud/gcp.mdx
+++ b/docs/source/how-to/cloud/gcp.mdx
@@ -117,6 +117,6 @@ Your GKE cluster with GPU support is now ready for HUGS deployment. You can proc
 
-For more detailed information on creating and managing GKE clusters, refer to the [official Google Kubernetes Engine documentation](https://cloud.google.com/kubernetes-engine/docs) or [run GPUs in GKE Standard node pools](https://cloud.google.com/kubernetes-engine/docs/how-to/gpus)
+For more detailed information on creating and managing GKE clusters, refer to the [official Google Kubernetes Engine documentation](https://cloud.google.com/kubernetes-engine/docs) or [run GPUs in GKE Standard node pools](https://cloud.google.com/kubernetes-engine/docs/how-to/gpus).
 

From 6ec63f9ddc2ff9980e3f5773dde43737e9513286 Mon Sep 17 00:00:00 2001
From: Philipp Schmid <32632186+philschmid@users.noreply.github.com>
Date: Mon, 21 Oct 2024 15:55:47 +0200
Subject: [PATCH 6/6] Update docs/source/how-to/cloud/gcp.mdx

Co-authored-by: Alvaro Bartolome <36760800+alvarobartt@users.noreply.github.com>
---
 docs/source/how-to/cloud/gcp.mdx | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/source/how-to/cloud/gcp.mdx b/docs/source/how-to/cloud/gcp.mdx
index 4904da9..377956c 100644
--- a/docs/source/how-to/cloud/gcp.mdx
+++ b/docs/source/how-to/cloud/gcp.mdx
@@ -1,4 +1,4 @@
-# HUGS on Google Cloud
+# HUGS on Google Cloud
 
 The Hugging Face Generative AI Services, also known as HUGS, can be deployed in Google Cloud via the Google Cloud Marketplace offering.
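
The cluster-creation commands added in patch 1 rely on six environment variables being exported first. A minimal sketch of a preflight check, assuming a POSIX-compatible shell: the function name `check_hugs_env` is hypothetical and is not part of HUGS or the Google Cloud CLI. It only confirms that each variable is set and non-empty, so a missing value is reported before any `gcloud` command runs.

```bash
# Hypothetical helper (not part of HUGS or gcloud): returns 0 when every
# variable used by the cluster-creation commands is set and non-empty;
# otherwise prints each missing name to stderr and returns 1.
check_hugs_env() {
  status=0
  for name in PROJECT_ID CLUSTER_NAME LOCATION MACHINE_TYPE GPU_TYPE GPU_COUNT; do
    # Indirect lookup of the variable called $name; works in sh and bash.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing environment variable: $name" >&2
      status=1
    fi
  done
  return "$status"
}
```

For example, `check_hugs_env && gcloud container clusters get-credentials $CLUSTER_NAME --zone=$LOCATION` would run the credentials step only when all six variables are set. The helper does not validate the values themselves, such as whether the location offers the chosen GPU type.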