Kountertop is an ML toolkit that integrates the best-of-breed open-source ML systems to address the Build-Train-Deploy workflow in Machine Learning.
- Features a centralized model building and experimentation platform
- High-performance, highly available ML model serving platform with model versioning support
- Abstracts complexities away from Data Scientists
- Auto-serving capability
We will be using MicroK8s, but you can configure Kountertop to work with any other Kubernetes provider as well!
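If you do not have MicroK8s yet, it can be installed via snap (a sketch, assuming a snap-enabled Linux host; pick the channel you need):

```bash
# Install MicroK8s and wait until the node reports ready
sudo snap install microk8s --classic
microk8s status --wait-ready
```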
Clone this repository to your `$HOME` directory

```bash
git clone https://github.com/sachua/kountertop.git
```
If you are installing on-prem, run the scripts to push the images to your local registry

```bash
cd on-prem
sh pull_images.sh
sh push_images.sh
```

- You should then change all the image paths of the different components to point to your private registry (see the sketch below)
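  For example (illustrative only: the registry host is a placeholder and the exact `image:` line format in the manifests is assumed):

  ```bash
  # Prefix every image reference in the deployment manifests with your private registry;
  # adjust REGISTRY and the manifest paths to match your checkout
  REGISTRY=registry.example.com:5000
  sed -i "s|image: |image: ${REGISTRY}/|" \
    mlflow-mysql-deployment.yaml \
    mlflow-ui-deployment.yaml \
    tfserving-deployment.yaml
  ```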
- Enable MicroK8s add-ons

  ```bash
  microk8s enable dns storage metrics-server metallb prometheus helm3
  ```
- Deploy MinIO

  ```bash
  helm install minio ./minio.tgz
  ```
- Create the buckets `mlflow` and `config`
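  One way to create them is with the MinIO client (`mc` is not part of this repository; the credentials below are the defaults used later in this guide, and the endpoint should be whichever MinIO address is reachable from where you run `mc`):

  ```bash
  # Register the MinIO endpoint under an alias, then create the two buckets
  mc alias set kountertop http://minio.default.svc.cluster.local:9000 minio minio123
  mc mb kountertop/mlflow
  mc mb kountertop/config
  ```

  Alternatively, create the buckets through the MinIO web console.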
- Copy `models.config` and `prometheus.config` in `tensorflow-serving` to the `config` bucket
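  Using the same `mc` alias as above (assuming the two files sit in the repository's `tensorflow-serving` directory):

  ```bash
  # Upload the Tensorflow Serving config files into the config bucket
  mc cp tensorflow-serving/models.config kountertop/config/
  mc cp tensorflow-serving/prometheus.config kountertop/config/
  ```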
- Deploy MLflow

  ```bash
  kubectl apply -f mlflow-mysql-deployment.yaml
  kubectl apply -f mlflow-ui-deployment.yaml
  ```
- Deploy OpenLDAP

  ```bash
  kubectl apply -f LDAP-server.yaml
  ```
  - Create user accounts based on the examples from LDAP users & groups
  - To seed LDAP with entries:

    ```bash
    kubectl exec -it -n kubeflow ldap-0 -- bash
    ldapadd -x -D "cn=admin,dc=example,dc=com" -W
    # Enter password "admin".
    # Press Ctrl+D to complete after pasting the snippets.
    ```
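    A minimal sketch of the kind of entry you could paste (every attribute value here is a made-up placeholder and the `ou=people` parent entry must already exist; base real accounts on the LDAP users & groups examples):

    ```bash
    # Hypothetical single-user entry, piped into ldapadd instead of pasting interactively;
    # run this inside the ldap-0 shell opened above
    ldapadd -x -D "cn=admin,dc=example,dc=com" -W <<'EOF'
    dn: uid=jdoe,ou=people,dc=example,dc=com
    objectClass: inetOrgPerson
    objectClass: posixAccount
    cn: Jane Doe
    sn: Doe
    uid: jdoe
    uidNumber: 10001
    gidNumber: 10001
    homeDirectory: /home/jdoe
    userPassword: changeme
    EOF
    ```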
- Deploy Jupyterhub

  ```bash
  helm install jhub ./jupyterhub.tgz \
    --version=0.9.0 \
    --values config.yaml
  ```

  - A template for using AD instead of OpenLDAP is provided in `AD-config`
- Deploy Tensorflow Serving

  ```bash
  kubectl apply -f tfserving-deployment.yaml
  ```
- Deploy Velero

  ```bash
  tar -xzvf velero.tgz
  export PATH=$PATH:"$HOME/kountertop"
  velero install \
    --provider aws \
    --plugins velero/velero-plugin-for-aws:v1.1.0 \
    --bucket velero \
    --secret-file ./credentials-velero \
    --use-volume-snapshots=false \
    --backup-location-config region=minio,s3ForcePathStyle="true",s3Url=http://minio.default.svc.cluster.local:9000
  ```

  - If `minio` is not recognized, use the plain endpoint address instead (e.g. `http://host:port`)
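  The install command above reads `./credentials-velero`. A minimal sketch of that file, assuming the default MinIO credentials used elsewhere in this guide:

  ```bash
  # Write an AWS-style credentials file for Velero pointing at MinIO
  cat <<'EOF' > credentials-velero
  [default]
  aws_access_key_id = minio
  aws_secret_access_key = minio123
  EOF
  ```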
- Configure Prometheus to take its scrape config from Tensorflow Serving
  - Change `targets` in `prometheus-additional.yaml` to point at the Tensorflow Serving REST endpoint (a sketch is shown below)
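    For illustration, the file can look like this (the job name, service address and metrics path are assumptions based on a typical Tensorflow Serving Prometheus setup; keep the repository's file and only adjust `targets`):

    ```bash
    # Write an additional scrape config pointing at the Tensorflow Serving REST port
    cat <<'EOF' > prometheus-additional.yaml
    - job_name: tensorflow-serving
      metrics_path: /monitoring/prometheus/metrics
      static_configs:
        - targets: ['tensorflow-serving.default.svc.cluster.local:8501']
    EOF
    ```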
  - Create a secret out of the configuration

    ```bash
    kubectl create secret generic additional-scrape-configs --from-file=prometheus-additional.yaml -n monitoring -oyaml > additional-scrape-configs.yaml
    ```
  - Reference this additional configuration in your Prometheus configuration

    ```bash
    kubectl edit prometheus k8s -n monitoring
    ```

    ```yaml
    apiVersion: monitoring.coreos.com/v1
    kind: Prometheus
    metadata:
      name: prometheus
      labels:
        prometheus: prometheus
    spec:
      replicas: 1
      serviceAccountName: prometheus
      serviceMonitorSelector:
    + additionalScrapeConfigs:
    +   name: additional-scrape-configs
    +   key: prometheus-additional.yaml
    ...
    ```
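  - Optionally, verify that the new job shows up under Status → Targets in the Prometheus UI (the service name below is the kube-prometheus default installed by the MicroK8s add-on; adjust it if yours differs)

    ```bash
    # Forward the Prometheus web UI to localhost:9090
    kubectl port-forward -n monitoring svc/prometheus-k8s 9090:9090
    ```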
- The default minimal notebook image is already integrated with MLflow
- To use the MLflow logging feature for custom notebooks, you can build your own Jupyter image from `sachua/jupyter-mlflow:latest` and install your own packages (see the sketch below)
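  A minimal sketch of such an image (the package list and tag are placeholders; only the `sachua/jupyter-mlflow:latest` base comes from this guide, and depending on that image's user you may need `USER root` or `pip install --user`):

  ```bash
  # Build a custom notebook image on top of the MLflow-enabled base,
  # then tag and push it to your registry as needed
  cat <<'EOF' > Dockerfile
  FROM sachua/jupyter-mlflow:latest
  RUN pip install --no-cache-dir scikit-learn pandas
  EOF
  docker build -t jupyter-mlflow-custom:latest .
  ```

  Point the `singleuser` image in your Jupyterhub `config.yaml` at the resulting tag so new notebook servers use it.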
- Alternatively, install MLflow in your custom notebook:
  - Add the code to a cell in your Jupyter Notebook

    ```python
    %%capture
    !pip install --upgrade pip --user
    !pip install mlflow[extras] --user
    %env MLFLOW_TRACKING_URI=http://host:port
    %env MLFLOW_S3_ENDPOINT_URL=http://host:port
    %env AWS_ACCESS_KEY_ID=minio
    %env AWS_SECRET_ACCESS_KEY=minio123
    ```
  - Replace `http://host:port` with your MLflow endpoint and MinIO endpoint
  - Check the endpoints with `kubectl get svc -A`