Inference model deployment

Using Helm

The repository includes two Helm charts:

  1. custom-model-kserve-helm-chart: a KServe inference service that waits for input and can optionally be scaled through Knative and Istio
    • easy to scale: Knative monitors the number of requests and scales the number of pods up or down
    • easy to troubleshoot, monitor, and deploy (uses MLflow and MinIO for experiment tracking)
    • easy to extend: predictions can be forwarded to the central InfluxDB through KServe's pre/postprocessing Transformers
  2. simple-helm-chart: a model deployed as a Kubernetes Deployment that listens to a Redis channel, makes predictions, and forwards them to the central InfluxDB
    • simple to deploy and understand
    • no scaling, troubleshooting, or monitoring capabilities
    • because the deployment subscribes to a Redis channel rather than receiving requests from a client, the only way to scale it is to replace the Redis publish/subscribe channel with a Redis stream and consumer groups, so that multiple consumers share the processing load
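
The stream-and-consumer-group alternative mentioned for simple-helm-chart could look roughly like the sketch below. It is not code from this repository: the stream name, group name, and both helper functions are hypothetical, and `client` is assumed to be a redis-py `redis.Redis` instance using the default (RESP2) reply format. The calls `xgroup_create`, `xreadgroup`, and `xack` are the redis-py wrappers for the corresponding Redis commands.

```python
from typing import Any, Callable

STREAM = "sensor-data"    # hypothetical stream name, not taken from the charts
GROUP = "model-workers"   # hypothetical consumer group name


def ensure_group(client: Any, stream: str = STREAM, group: str = GROUP) -> None:
    """Create the consumer group (and the stream) unless it already exists."""
    try:
        # id="0" makes the group start from the oldest entry in the stream.
        client.xgroup_create(stream, group, id="0", mkstream=True)
    except Exception as exc:
        # Redis replies with BUSYGROUP when the group exists; ignore only that.
        if "BUSYGROUP" not in str(exc):
            raise


def process_batch(
    client: Any,
    consumer: str,
    predict: Callable[[dict], Any],
    stream: str = STREAM,
    group: str = GROUP,
    count: int = 10,
    block_ms: int = 1000,
) -> int:
    """Read entries for one consumer, run `predict`, and acknowledge them."""
    # ">" requests entries that were never delivered to another consumer
    # of the same group, which is what spreads the load across replicas.
    replies = client.xreadgroup(
        group, consumer, {stream: ">"}, count=count, block=block_ms
    )
    handled = 0
    for _stream_name, entries in replies or []:
        for entry_id, fields in entries:
            predict(fields)
            # Acknowledge only after the prediction succeeded.
            client.xack(stream, group, entry_id)
            handled += 1
    return handled
```

Each replica would call `process_batch` in a loop with its own consumer name (for example, the pod hostname), and the producer would have to write with XADD instead of PUBLISH. Entries that a crashed replica read but never acknowledged stay pending in the group and need separate handling, which this sketch leaves out.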

Configuration parameters

custom-model-kserve-helm-chart

| Name | Description | Default value |
| --- | --- | --- |
| name | Service/pod name | "custom-kserve-model" |
| storageUri | S3 storage URI where the model is stored | "s3://mlflow/5/988f6db2906641b8bcc1494c36619f9d/artifacts/model" |
| serviceAccountName | Service account that provides the hosts and credentials KServe needs to reach services such as S3; see the KServe documentation for details | "success6g" |
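
The default storageUri appears to follow MLflow's usual artifact layout: bucket, experiment ID, run ID, the literal `artifacts` directory, then the artifact path. That reading is an assumption based on the default value, and the helper below (`split_storage_uri`, a hypothetical name) is only a sketch for checking a URI before passing it to the chart.

```python
from urllib.parse import urlparse


def split_storage_uri(uri: str) -> dict:
    """Split an S3 URI of the assumed MLflow form into its components."""
    parsed = urlparse(uri)
    parts = parsed.path.strip("/").split("/")
    # Assumed layout: <experiment_id>/<run_id>/artifacts/<artifact_path...>
    if parsed.scheme != "s3" or len(parts) < 4 or parts[2] != "artifacts":
        raise ValueError(f"unexpected storage URI layout: {uri}")
    return {
        "bucket": parsed.netloc,
        "experiment_id": parts[0],
        "run_id": parts[1],
        "artifact_path": "/".join(parts[3:]),
    }


info = split_storage_uri(
    "s3://mlflow/5/988f6db2906641b8bcc1494c36619f9d/artifacts/model"
)
print(info)
```

To serve a model from a different MLflow run, only the experiment ID and run ID segments of the URI would change under this assumption.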

simple-helm-chart

| Name | Description | Default value |
| --- | --- | --- |
| image.repository | Deployment Docker image repository | 5uperpalo/success6g_custom_kserve |
| image.pullPolicy | Deployment Docker image pull policy | IfNotPresent |
| image.tag | Deployment Docker image tag | latest |
| influxdb.host | Central InfluxDB host | "10.152.183.219" |
| influxdb.port | Central InfluxDB port | "80" |
| influxdb.user | Central InfluxDB username | "admin" |
| influxdb.pass | Central InfluxDB password | "admin_pass" |
| redis.host | Redis database host | "10.152.183.250" |
| redis.port | Redis database port | "6379" |
| redis.pass | Redis database password | "redis" |
| resources.requests.cpu | Kubernetes CPU request | "2" |
| resources.requests.memory | Kubernetes memory request | "4Gi" |
| resources.limits.cpu | Kubernetes CPU limit | "2" |
| resources.limits.memory | Kubernetes memory limit | "4Gi" |

Installation

From a cloned repo:

  • custom-model-kserve-helm-chart
helm install custom-model-kserve ./custom-model-kserve-helm-chart --namespace custom-model-kserve --create-namespace
  • simple-helm-chart
helm install simple ./simple-helm-chart --namespace simple --create-namespace

From an added Helm repo:

  • custom-model-kserve-helm-chart
helm repo add success6g-edge https://5uperpalo.github.io/success6g-edge/charts/
helm install custom-model-kserve success6g-edge/custom-model-kserve --namespace custom-model-kserve --create-namespace
  • simple-helm-chart
helm repo add success6g-edge https://5uperpalo.github.io/success6g-edge/charts/
helm install simple success6g-edge/simple --namespace simple --create-namespace
# Example with overridden Redis and InfluxDB settings:
# helm install simple success6g-edge/simple --set redis.host="10.43.128.90" --set influxdb.host="10.17.252.101" --set influxdb.port="30567" --namespace simple --create-namespace