This guide shows how to deploy OPEA applications on Amazon Web Services (AWS) Elastic Kubernetes Service (EKS) using Terraform.
The setup uses Terraform to create an EKS cluster with the following properties:

- 1-node EKS cluster with a 50 GB disk and `m7i.x8large` SPOT instance (16 vCPU and 32 GB memory)
- Cluster autoscaling up to 10 nodes
- Storage Class (SC) `efs-sc` and Persistent Volume Claim (PVC) `model-volume` for storing the model data
- `LoadBalancer` address type for the service for external consumption
- Updates the kubeconfig file for `kubectl` access
Initialize the Terraform environment:

```shell
terraform init
```
By default, a 1-node cluster is created, which is suitable for running the OPEA application. See `variables.tf` and `opea-<application-name>.tfvars` if you want to tune the cluster properties, e.g., the number of nodes, instance types, or disk size.
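For example, a tfvars override might look like the sketch below. The variable names here are assumptions for illustration; check `variables.tf` for the names actually defined in your setup.

```hcl
# Hypothetical overrides -- the actual variable names are defined in variables.tf
node_count     = 1                  # initial node count
max_node_count = 10                 # autoscaling upper bound
disk_size      = 50                 # node disk size in GB
instance_types = ["m7i.x8large"]    # SPOT instance type
```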
OPEA needs a volume in which to store the model. For that, we create a Kubernetes Persistent Volume Claim (PVC). OPEA requires the `ReadWriteMany` access mode, since multiple pods need access to the storage and they can be on different nodes. On EKS, only EFS supports `ReadWriteMany`. Thus, each OPEA application below uses the file `eks-efs-csi-pvc.yaml` to create a PVC in its namespace.
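A minimal sketch of such a PVC manifest is shown below. The claim name `model-volume` and storage class `efs-sc` come from the setup above; the requested size is an assumption, and the actual `eks-efs-csi-pvc.yaml` in the repository may differ.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: model-volume
spec:
  accessModes:
    - ReadWriteMany      # required: multiple pods on different nodes mount the volume
  storageClassName: efs-sc
  resources:
    requests:
      storage: 100Gi     # assumed size; adjust for the models you plan to store
```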
Use the commands below to create the EKS cluster:

```shell
terraform plan --var-file opea-chatqna.tfvars -out opea-chatqna.plan
terraform apply "opea-chatqna.plan"
```
Once the cluster is ready, the kubeconfig file used to access the new cluster is updated automatically. By default, the file is `~/.kube/config`.

You should now have access to the cluster via the `kubectl` command.
## Deploy ChatQnA Application with Helm

```shell
helm install -n chatqna --create-namespace chatqna oci://ghcr.io/opea-project/charts/chatqna \
  --set service.type=LoadBalancer \
  --set global.modelUsePVC=model-volume \
  --set global.HUGGINGFACEHUB_API_TOKEN=${HFTOKEN}
```
Create the PVC as mentioned above:

```shell
kubectl apply -f eks-efs-csi-pvc.yaml -n chatqna
```
After a while, the OPEA application should be running. You can check the status via `kubectl`:

```shell
kubectl get pod -n chatqna
```
You can now start using the OPEA application:

```shell
OPEA_SERVICE=$(kubectl get svc -n chatqna chatqna -o jsonpath='{.status.loadBalancer.ingress[0].hostname}')
curl http://${OPEA_SERVICE}:8888/v1/chatqna \
  -H "Content-Type: application/json" \
  -d '{"messages": "What is the revenue of Nike in 2023?"}'
```
## Cleanup

Uninstall the Helm release and delete the cluster with the following commands:

```shell
helm uninstall -n chatqna chatqna
terraform destroy -var-file opea-chatqna.tfvars
```