Inference model deployment

Using Helm

The repository includes two Helm charts:

custom-model-kserve-helm-chart: a KServe inference service that waits for input and can be extended with scaling through Knative and Istio
- easy to scale: Knative monitors the number of requests and scales the number of pods up or down
- easy to troubleshoot, monitor, and deploy (uses MLflow and MinIO object storage for experiment tracking)
- easy to extend to forward predictions to the central InfluxDB using pre/postprocessing Transformers in KServe
simple-helm-chart: a model deployed as a Kubernetes Deployment that listens to a Redis channel, makes predictions, and forwards them to the central InfluxDB
- simple to deploy and understand
- no scaling, troubleshooting, or monitoring capabilities
- because the deployment subscribes to a Redis channel (rather than receiving input directly from a client), every replica would receive every message; the only way to scale the solution is to replace the Redis publish/subscribe channel with a Redis stream and consumer groups, so that multiple consumers share the processing load
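
The stream and consumer-group approach mentioned in the last point can be sketched with plain `redis-cli` commands. This is only an illustration of the Redis commands involved, not something the chart ships: the stream name, group name, and payload field are hypothetical, and the model container would have to be changed to read from the stream instead of subscribing to a channel. The connection settings are the chart defaults.

```shell
# Connection settings: chart defaults, overridable from the environment.
REDIS_HOST="${REDIS_HOST:-10.152.183.250}"
REDIS_PORT="${REDIS_PORT:-6379}"
REDIS_PASS="${REDIS_PASS:-redis}"

# Hypothetical names, not defined by the chart.
STREAM="model-input"
GROUP="inference-workers"

# Wrapper so that every command shares the same connection settings.
rcli() {
  redis-cli -h "$REDIS_HOST" -p "$REDIS_PORT" -a "$REDIS_PASS" "$@"
}

# Run once: create the consumer group; MKSTREAM creates the stream if it is missing.
create_group() {
  rcli XGROUP CREATE "$STREAM" "$GROUP" '$' MKSTREAM
}

# Producer: append an entry to the stream instead of publishing to a channel.
# $1 is the payload; the field name "payload" is illustrative.
add_entry() {
  rcli XADD "$STREAM" '*' payload "$1"
}

# Consumer: $1 is a consumer name unique to each replica (for example the pod name).
# '>' requests entries that have not yet been delivered to any consumer in the group.
read_entries() {
  rcli XREADGROUP GROUP "$GROUP" "$1" COUNT 10 BLOCK 5000 STREAMS "$STREAM" '>'
}

# Acknowledge an entry ($1 is its ID) once the prediction has been forwarded.
ack_entry() {
  rcli XACK "$STREAM" "$GROUP" "$1"
}
```

A typical sequence would be `create_group` once, `add_entry '{"feature_1": 0.5}'` on the producer side, and `read_entries "$HOSTNAME"` followed by `ack_entry` with the returned entry ID in each replica. Redis delivers each entry to only one consumer in the group, which is what lets several replicas split the work.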

Configuration parameters

custom-model-kserve-helm-chart

Name	Description	Default value
name	service/pod name	"custom-kserve-model"
storageUri	S3 storage URI where the model is stored	"s3://mlflow/5/988f6db2906641b8bcc1494c36619f9d/artifacts/model"
serviceAccountName	Service account that gives KServe the hosts and credentials for the services it needs to reach, e.g. S3 (see the KServe documentation for details)	"success6g"
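
As a rough sketch, the parameters above can be overridden at install time with `--set`. The model name and the storage URI below are placeholders, not values from this repository; the URI layout simply mirrors the default shown in the table, and the command assumes it is run from the root of the cloned repo.

```shell
# Placeholder values: replace them with your own model name and MLflow artifact path.
MODEL_NAME="my-model"
STORAGE_URI="s3://mlflow/<experiment-id>/<run-id>/artifacts/model"

# Install the chart from the cloned repo, overriding two of the three parameters.
# serviceAccountName is left at its default ("success6g").
helm install custom-model-kserve ./custom-model-kserve-helm-chart \
  --namespace custom-model-kserve --create-namespace \
  --set name="$MODEL_NAME" \
  --set storageUri="$STORAGE_URI"
```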

simple-helm-chart

Name	Description	Default value
image.repository	Deployment Docker image repository	5uperpalo/success6g_custom_kserve
image.pullPolicy	Deployment Docker image pull policy	IfNotPresent
image.tag	Deployment Docker image tag	latest
influxdb.host	Central InfluxDB host	"10.152.183.219"
influxdb.port	Central InfluxDB port	"80"
influxdb.user	Central InfluxDB username	"admin"
influxdb.pass	Central InfluxDB password	"admin_pass"
redis.host	Redis database host	"10.152.183.250"
redis.port	Redis database port	"6379"
redis.pass	Redis database password	"redis"
resources.requests.cpu	Kubernetes requested CPU	"2"
resources.requests.memory	Kubernetes requested memory	"4Gi"
resources.limits.cpu	Kubernetes CPU limit	"2"
resources.limits.memory	Kubernetes memory limit	"4Gi"
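
When several parameters change at once, a values file can be easier to maintain than a long list of `--set` flags. The sketch below assumes the dotted names in the table map to nested YAML keys, which is how Helm interprets them. The file name `simple-values.yaml` and the resource figures are illustrative only, and the command assumes it is run from the root of the cloned repo.

```shell
# Write the overrides to a local values file (hypothetical file name).
cat > simple-values.yaml <<'EOF'
redis:
  host: "10.43.128.90"
influxdb:
  host: "10.17.252.101"
  port: "30567"
resources:
  requests:
    cpu: "1"
    memory: "2Gi"
  limits:
    cpu: "1"
    memory: "2Gi"
EOF

# Install the chart with the overrides; parameters not listed keep their defaults.
helm install simple ./simple-helm-chart -f simple-values.yaml \
  --namespace simple --create-namespace
```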

Installation

From cloned repo:

custom-model-kserve-helm-chart

helm install custom-model-kserve ./custom-model-kserve-helm-chart --namespace custom-model-kserve --create-namespace

simple-helm-chart

helm install simple ./simple-helm-chart --namespace simple --create-namespace

From added helm repo:

custom-model-kserve-helm-chart

helm repo add success6g-edge https://5uperpalo.github.io/success6g-edge/charts/
helm install custom-model-kserve success6g-edge/custom-model-kserve --namespace custom-model-kserve --create-namespace

simple-helm-chart

helm repo add success6g-edge https://5uperpalo.github.io/success6g-edge/charts/
helm install simple success6g-edge/simple --namespace simple --create-namespace
# helm install simple success6g-edge/simple --set redis.host="10.43.128.90" --set influxdb.host="10.17.252.101" --set influxdb.port="30567" --namespace simple --create-namespace
