This is an unofficial fork of Dataiku's dataiku-tools repository.
- Provide up-to-date automation tools (Docker and Kubernetes; Ansible is not supported for now)
- Define service management policies
- Next steps: training by industry, with links to every doc
- Next steps: training policy
Support levels are defined by Dataiku for each feature and service:
- Supported (default)
- Experimental
- Tier 2 support
- Not supported
- Public Preview
- Deprecated
This is the mode recommended by Dataiku and offers a
Create a Kubernetes secret from your kubeconfig file (a file named config in the current directory) with the following command, so that the Dataiku design instance can use it:
kubectl create secret generic kubeconfig-secret --from-file=config -n dataiku
You must also create a secret to access the container registry:
kubectl -n dataiku create secret docker-registry container-registry-secret --docker-server ${DOCKER_REGISTRY} --docker-username=${DOCKER_USER} --docker-password=${DOCKER_PASSWORD}
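The command above expands three environment variables, and an unset one can produce a malformed command or an unusable secret. A minimal sketch of a guard follows, assuming a hypothetical helper named require_vars (it is not part of kubectl or of Dataiku's tooling):

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if every named variable is set and non-empty.
# Reports the first missing name on stderr.
require_vars() {
    for name in "$@"; do
        # Indirect lookup of the variable called $name (POSIX-compatible).
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "missing required variable: $name" >&2
            return 1
        fi
    done
    return 0
}

# Usage (commented out so that sourcing this sketch has no side effects):
# require_vars DOCKER_REGISTRY DOCKER_USER DOCKER_PASSWORD && \
#   kubectl -n dataiku create secret docker-registry container-registry-secret \
#     --docker-server="${DOCKER_REGISTRY}" \
#     --docker-username="${DOCKER_USER}" \
#     --docker-password="${DOCKER_PASSWORD}"
```

The names passed to require_vars are expected to be fixed variable names written in the script, not user input, since the lookup relies on eval.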
Set DSS_VERSION to the DSS version to build (e.g. 13.4.0):
cd dss-docker
docker build --build-arg dssVersion=${DSS_VERSION} -t dataiku:${DSS_VERSION} .
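Because DSS_VERSION is used both as a build argument and as the image tag, a mistyped or empty value only shows up once the build is under way. The sketch below checks the value first. The three-number pattern is an assumption based on the 13.4.0 example, and is_dss_version is a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: accepts versions shaped like MAJOR.MINOR.PATCH (e.g. 13.4.0).
# The exact format is an assumption, not something defined by Dataiku.
is_dss_version() {
    printf '%s\n' "${1:-}" | grep -Eq '^[0-9]+\.[0-9]+\.[0-9]+$'
}

# Usage (commented out so that sourcing this sketch has no side effects):
# if is_dss_version "${DSS_VERSION:-}"; then
#     docker build --build-arg dssVersion="${DSS_VERSION}" -t "dataiku:${DSS_VERSION}" .
# else
#     echo "DSS_VERSION is not set to a version like 13.4.0" >&2
# fi
```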
Docker image release notes:
- the design node runs on AlmaLinux 9, but DSS builds the other containers on AlmaLinux 8 (end-to-end consistency still to be tested)
- supports R
- includes Python 3.11 support (3.10+ is required for markitdown; note that Dataiku does not support Python 3.12)
- supports graphics exports
- includes the Docker and Kubernetes binaries needed to build images for Kubernetes container execution
To run, with data stored in DSS_DATADIR (e.g. ~/dss):
mkdir -p ${DSS_DATADIR}
docker run -d -p 10000:10000 -v ${DSS_DATADIR}:/home/dataiku/dss dataiku:${DSS_VERSION}
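Since the container is started in detached mode, DSS may not answer on port 10000 right away. A generic retry loop is one way to wait for it. In the sketch below, retry is a hypothetical helper, and the URL in the usage comment assumes the port mapping shown above:

```shell
#!/bin/sh
# Hypothetical helper: retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or ATTEMPTS is exhausted.
retry() {
    attempts=$1
    delay=$2
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        i=$((i + 1))
        # Sleep only if another attempt will follow.
        if [ "$i" -le "$attempts" ]; then
            sleep "$delay"
        fi
    done
    return 1
}

# Usage (commented out; assumes curl is installed and DSS is published on localhost:10000):
# retry 30 2 curl -fsS -o /dev/null http://localhost:10000/ \
#     && echo "DSS is answering" \
#     || echo "DSS did not answer in time" >&2
```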
cd kubernetes
kubectl apply -f dataiku.yaml
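After applying the manifest, it can help to find the name of the pod that is actually running before opening a shell in it. The sketch below filters the output of kubectl get pods --no-headers on its STATUS column (the third column in the default output). first_running_pod is a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: reads `kubectl get pods --no-headers` output on stdin
# and prints the name of the first pod whose STATUS column is exactly "Running".
first_running_pod() {
    awk '$3 == "Running" { print $1; exit }'
}

# Usage (commented out so that sourcing this sketch has no side effects):
# pod=$(kubectl get pods -n dataiku --no-headers | first_running_pod)
# kubectl exec -it -n dataiku "$pod" -- bash
```

Matching on the column rather than on the whole line avoids picking up other resources or a pod whose name happens to contain the word Running.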
Kubernetes configuration release notes:
- supports docker-in-docker (dind) with a sidecar container (support), as it is mandatory for the dssadmin CLI to build containers for Kubernetes execution
Follow the official Dataiku documentation for managed Kubernetes clusters (Google GKE, Amazon EKS or Microsoft Azure AKS), custom Kubernetes clusters or OpenShift clusters.
If administration from the UI doesn't work, or if you want to use a custom registry and publish the Dockerfiles, here are some commands to use:
kubectl exec -it -n dataiku $(kubectl get -n dataiku all | grep Running | awk '{print $1}') -- bash
cd dss
# build base image
./bin/dssadmin build-base-image --type container-exec --with-py311 --with-py39
# [... build log ...]
# #43 naming to 0.0s done
# #43 DONE 52.7s
# 2025-01-25 00:11:23,029 INFO Done, cleaning up
# Saved to /home/dataiku/dss/tmp/
# Dockerfile should be committed to be audited by SAST tools
docker login ${DOCKER_REGISTRY}
docker tag dku-exec-base-ru4oxgmkpuoy4djmkkuvxfng:dss-13.4.0 ${DOCKER_REGISTRY}/dku-exec-base-ru4oxgmkpuoy4djmkkuvxfng:dss-13.4.0
docker push ${DOCKER_REGISTRY}/dku-exec-base-ru4oxgmkpuoy4djmkkuvxfng:dss-13.4.0
Do the same for the CDE image:
# build cde image
./bin/dssadmin build-base-image --type cde --with-py311 --with-py39
# [... build logs ...]
docker login ${DOCKER_REGISTRY}
docker tag dku-cde-base-ru4oxgmkpuoy4djmkkuvxfng:dss-13.4.0 ${DOCKER_REGISTRY}/dku-cde-base-ru4oxgmkpuoy4djmkkuvxfng:dss-13.4.0
docker push ${DOCKER_REGISTRY}/dku-cde-base-ru4oxgmkpuoy4djmkkuvxfng:dss-13.4.0
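The tag and push steps are identical for both images, so they can be wrapped in one function. This is a sketch only: remote_ref, publish_image and the DRY_RUN switch are hypothetical names introduced here, and the image name in the usage comment is the one from the example above.

```shell
#!/bin/sh
# Hypothetical helper: builds the registry-qualified reference for a local image.
remote_ref() {
    registry=${1%/}   # drop one trailing slash, if any
    printf '%s/%s\n' "$registry" "$2"
}

# Hypothetical helper: tags a local image for the registry and pushes it.
# When DRY_RUN is set to a non-empty value, the docker commands are printed
# instead of being executed.
publish_image() {
    target=$(remote_ref "$1" "$2")
    ${DRY_RUN:+echo} docker tag "$2" "$target"
    ${DRY_RUN:+echo} docker push "$target"
}

# Usage (commented out so that sourcing this sketch has no side effects):
# docker login "${DOCKER_REGISTRY}"
# publish_image "${DOCKER_REGISTRY}" dku-exec-base-ru4oxgmkpuoy4djmkkuvxfng:dss-13.4.0
# publish_image "${DOCKER_REGISTRY}" dku-cde-base-ru4oxgmkpuoy4djmkkuvxfng:dss-13.4.0
```

Running it once with DRY_RUN=1 shows the exact commands before anything is pushed.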
With those two images built, you will be able to enable the following features in Kubernetes:
- VSCode execution within Dataiku
- Jupyter Notebooks
- Recipe code execution
- Webapp / API / ML Scoring API