Arkouda on Docker and Kubernetes
The Kubernetes deployment of Arkouda involves the following:
- Packaging the Arkouda application stack within a tar.gz file
- Building the Arkouda Docker images
- Installing the arkouda-locale Helm chart to deploy the locale containers
- Installing the arkouda-server Helm chart to deploy the arkouda_server
Since Arkouda is deployed on Kubernetes and therefore packaged within Docker images, dockerized Arkouda can also be deployed as Docker containers via the docker run command.
The k8s-enterprise branch of my fork contains all files associated with dockerized Arkouda, which can be run directly as Docker containers or deployed on Kubernetes.
A system with at least 4 CPU cores and 16GB of RAM is required to build the Docker images.
The Kubernetes deployment of containerized Arkouda is within the k8s-enterprise branch. The first steps are to clone hokiegeek2/arkouda and checkout the k8s-enterprise branch:
git clone https://github.com/hokiegeek2/arkouda.git
cd arkouda
git checkout k8s-enterprise
To ensure the Arkouda version is correct, the Docker build process utilizes a tar file of the client as well as server code. Within the Arkouda project root directory, execute the following command, which builds a tar.gz Arkouda distribution:
python3 -m build
With the tar.gz file built, set the ARKOUDA_DIST_FOLDER env variable:
export ARKOUDA_DIST_FOLDER=$(tar --exclude="*/*" -tf dist/arkouda*.gz)
Finally, move the .tar.gz file to the Arkouda project root folder as follows (tar.gz filename will likely be different):
mv dist/*.tar.gz arkouda.tar.gz
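The ARKOUDA_DIST_FOLDER value is simply the single top-level folder inside the tar.gz distribution. As a cross-check on the tar command above, the same name can be read with Python's standard tarfile module. This is a minimal sketch, not part of the Arkouda build; the function names top_level_entries and dist_folder are hypothetical.

```python
import tarfile
from pathlib import PurePosixPath


def top_level_entries(tar_path):
    """Return the sorted top-level names found in a tar archive."""
    names = set()
    with tarfile.open(tar_path, "r:*") as archive:
        for member_name in archive.getnames():
            parts = PurePosixPath(member_name).parts
            if parts:  # skip a bare "." entry
                names.add(parts[0])
    return sorted(names)


def dist_folder(tar_path):
    """Return the single top-level folder of a distribution, or raise if ambiguous."""
    entries = top_level_entries(tar_path)
    if len(entries) != 1:
        raise ValueError(f"expected one top-level folder, found {entries}")
    return entries[0]
```

Calling dist_folder("arkouda.tar.gz") from the project root should return the folder name to export as ARKOUDA_DIST_FOLDER, assuming the archive has exactly one top-level folder.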
For a single-locale Arkouda instance operating within one Docker container, the Arkouda server image is built off of chapel/chapel-gasnet-smp. The build command is as follows:
export ARKOUDA_DIST_FOLDER=<result of tar -tf arkouda.tar.gz>
export VERSION=<desired version number>
docker build --build-arg ARKOUDA_DIST_FOLDER=$ARKOUDA_DIST_FOLDER -f arkouda-server-docker -t hokiegeek2/arkouda-server:$VERSION .
The arkouda-server Docker image is run locally as follows:
export VERSION=<desired version number>
docker run -it --rm -p 5555:5555 hokiegeek2/arkouda-server:$VERSION
Single-locale arkouda-server images are available here
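Once the container is running with port 5555 published, it can help to confirm that the port accepts TCP connections before calling ak.connect() from a client. The sketch below uses only the Python standard library; the helper name port_is_open is hypothetical, and the defaults (localhost, 5555) are taken from the docker run command above.

```python
import socket


def port_is_open(host="localhost", port=5555, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

This is only a coarse check: a published Docker port may accept connections before arkouda_server itself is ready, so a successful result does not guarantee that ak.connect() will succeed.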
The arkouda-client Docker image contains solely the Arkouda Python client, installed via pip, and presents an IPython shell on startup. The arkouda-client Docker image is built as follows:
export ARKOUDA_DIST_FOLDER=<result of tar -tf arkouda.tar.gz>
export VERSION=<desired version number>
docker build --build-arg ARKOUDA_DIST_FOLDER=$ARKOUDA_DIST_FOLDER -f arkouda-client-docker -t hokiegeek2/arkouda-client:$VERSION .
The arkouda-client is run locally as follows:
export VERSION=<desired version number>
docker run -it --rm --network=host hokiegeek2/arkouda-client:$VERSION
Python 3.7.7 (default, Jun 9 2020, 18:08:39)
Type 'copyright', 'credits' or 'license' for more information
IPython 7.29.0 -- An enhanced Interactive Python. Type '?' for help.
In [1]: import arkouda as ak
In [2]: ak.connect()
connected to arkouda server tcp://*:5555
arkouda-client docker images are available here
The arkouda-full-stack Docker image contains both the Arkouda Python client and the Chapel server. Upon startup, a one-locale Arkouda server instance is launched and an IPython interface is presented.
The full stack Arkouda docker image is built as follows:
export ARKOUDA_DIST_FOLDER=<result of tar -tf arkouda.tar.gz>
export VERSION=<desired version number>
docker build --build-arg ARKOUDA_DIST_FOLDER=$ARKOUDA_DIST_FOLDER -f arkouda-full-stack -t hokiegeek2/arkouda-full-stack:$VERSION .
The full stack Arkouda image is run locally as follows:
docker run -it hokiegeek2/arkouda-full-stack
Python 3.9.2 (default, Feb 28 2021, 17:03:44)
Type 'copyright', 'credits' or 'license' for more information
IPython 7.29.0 -- An enhanced Interactive Python. Type '?' for help.
In [1]: import arkouda as ak
In [2]: ak.connect()
connected to arkouda server tcp://*:5555
The full-stack docker images are available here
For multilocale Arkouda distributed across 1..n hosts, there are two Arkouda Docker images: udp-arkouda-server-base and the udp-arkouda-server. The udp-arkouda-server-base image contains the desired Chapel version and corresponding dependencies while the udp-arkouda-server image adds the Arkouda build suitable for deployment on Kubernetes.
export VERSION=0.4.5
export ARKOUDA_DIST_FOLDER=arkouda-2021.8.20+149.gc064829.dirty
-f udp-dynamic-arkouda-server -t hokiegeek2/udp-arkouda-server:$VERSION .
The udp-arkouda-server images are available here
All Kubernetes deployments are executed as Helm charts, and all charts are within folders accessible directly from the $ARKOUDA_HOME directory.
helm install -n arkouda arkouda arkouda-server-chart/
The multilocale Arkouda server deployment involves two Helm charts, arkouda-locale and arkouda-server, and features a mechanism by which the cluster is registered with Kubernetes as a service.
Since the multilocale Arkouda deployment (1) registers with Kubernetes and (2) launches locales over the GASNet udp substrate with the SSH spawner, preparing a Kubernetes cluster to host Arkouda-on-k8s takes two steps.
Arkouda registers with k8s upon startup, which requires a Kubernetes user with read/write access to the Kubernetes client API. Setting this up involves (1) creating an SSL crt/key pair and loading it into Kubernetes as a secret, (2) creating an arkouda user with that crt/key pair as its credentials, and (3) assigning an RBAC role with Kubernetes client API read/write privileges to the arkouda user. Note: the crt/key pair secret must be created in the namespace Arkouda is deployed to, since secrets are namespace-scoped. Also, when creating the arkouda.csr file, specify arkouda as the common name (CN).
# Create cert
openssl genrsa -out arkouda.key 2048
openssl req -new -key arkouda.key -out arkouda.csr
# Sign with Kubernetes CA
sudo openssl x509 -req -in arkouda.csr -CA /etc/kubernetes/ssl/kube-ca.pem \
-CAkey /etc/kubernetes/ssl/kube-ca-key.pem -CAcreateserial -out arkouda.crt -days 730
# Create the arkouda user with the generated credentials
kubectl config set-credentials arkouda --client-certificate=arkouda.crt --client-key=arkouda.key
kubectl apply -f $ARKOUDA_HOME/credentials/arkouda-rbac.yaml
# Create the Kubernetes API secret
kubectl create secret tls arkouda-server --cert=arkouda.crt --key=arkouda.key -n arkouda
An SSH key pair deployed within Kubernetes as a secret is required for all Arkouda locales to start up via the GASNet udp substrate with the S (SSH) spawner. Note: the SSH key secret must be created in the namespace Arkouda is deployed to, since secrets are namespace-scoped.
# Create an SSH key pair for the ubuntu user
$ sudo su ubuntu
$ ssh-keygen
Generating public/private rsa key pair.
Enter file in which to save the key (/home/ubuntu/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/ubuntu/.ssh/id_rsa.
Your public key has been saved in /home/ubuntu/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:WlMCThVDnCzqz5n/dCDVHW0h4grCt/nJbBwHpOSEQ78 ubuntu@ace
The key's randomart image is:
+---[RSA 2048]----+
| ..+B=+ . ..o.|
| .=*.*.......o|
| +oB..o.. .. |
| . o =+o |
| . ESo.. |
| . o=o+. |
| o.o B. . |
| = .. . |
| .... |
# Create the arkouda-ssh secret from the SSH key pair
kubectl create secret generic arkouda-ssh --from-file=$HOME/.ssh/id_rsa --from-file=$HOME/.ssh/id_rsa.pub -n arkouda
As mentioned above, a multi-locale k8s Arkouda deployment is composed of two Helm deployments: arkouda-locale and arkouda-server. IMPORTANT: the arkouda-locale Helm install must be executed and completed prior to the arkouda-server deployment, because arkouda-server must know the IP addresses of all locales before bringing up the cluster via the GASNet SSH spawner.
Several values need to be specified in both the arkouda-locale and arkouda-server values.yaml files:
- releaseVersion--the udp-dynamic-server docker image version
- resources--the cpu core and memory resources for the arkouda-locale container
- persistence--if true, the path and hostPath parameters must be set
- path--path within container arkouda files are written to
- hostPath--path on host node or distributed file system that maps to container path
- certFile--path to the ssl cert file used to connect to the Kubernetes API for Arkouda register/deregister
- keyFile--path to the ssl key file used to connect to the Kubernetes API for Arkouda register/deregister
- k8sHost--Kubernetes API url used to register/deregister Arkouda with k8s
- namespace--the namespace Arkouda is deployed to
- externalServiceName--the k8s service name corresponding to the k8s Arkouda deployment
The k8sHost value is retrieved as follows:
$ kubectl cluster-info
Kubernetes control plane is running at
CoreDNS is running at
To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
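When scripting the values.yaml setup, the control plane URL can be pulled out of the kubectl cluster-info text rather than copied by hand. The sketch below is an illustration only: the function name control_plane_url is hypothetical, it assumes the "is running at" wording shown above (plus the older "Kubernetes master" wording), and it strips the ANSI color codes that kubectl may emit.

```python
import re

# kubectl may colorize its output; remove ANSI color sequences before matching.
ANSI_ESCAPE = re.compile(r"\x1b\[[0-9;]*m")
CONTROL_PLANE = re.compile(
    r"Kubernetes (?:control plane|master) is running at (\S+)"
)


def control_plane_url(cluster_info_output):
    """Extract the API server URL from `kubectl cluster-info` text, or None."""
    plain = ANSI_ESCAPE.sub("", cluster_info_output)
    match = CONTROL_PLANE.search(plain)
    return match.group(1) if match else None
```

The returned string is what would be placed in the k8sHost value; a result of None means the expected line was not found and the output should be inspected manually.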
The arkouda-locale values.yaml file (shown below) contains one element in addition to those listed above:
- numLocales--the number of arkouda-locale containers to deploy. Note: the size of the Arkouda cluster is numLocales + 1, where the extra 1 is the single arkouda-server container instance
######################## Pod Settings ########################
releaseVersion: 0.2.9
imagePullPolicy: IfNotPresent
################ Arkouda Server Configuration ################
port: 5555
numLocales: 2
authenticate: false
verbose: false
memTrack: true
cpu: 2000m
memory: 2048Mi
cpu: 2000m
memory: 2048Mi
threadsPerLocale: 4
enabled: false
certFile: /etc/ssl/arkouda/tls.crt
keyFile: /etc/ssl/arkouda/tls.key
namespace: arkouda
externalServiceName: arkouda
externalServicePort: 5555
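Because the cluster size is numLocales + 1, the total CPU and memory the cluster will ask Kubernetes for can be estimated before installing. The sketch below works through that arithmetic under one loud assumption: the arkouda-server container requests the same cpu and memory as each arkouda-locale container. The function names are hypothetical, and only the "m" CPU suffix and the Ki, Mi, and Gi memory suffixes are handled.

```python
# Conversion factors from a memory suffix to MiB.
MEMORY_FACTORS_MIB = {"Ki": 1 / 1024, "Mi": 1, "Gi": 1024}


def cpu_millicores(quantity):
    """Convert a Kubernetes CPU quantity such as '2000m' or '2' to millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return int(float(quantity) * 1000)


def memory_mib(quantity):
    """Convert a memory quantity with a Ki, Mi, or Gi suffix to MiB."""
    for suffix, factor in MEMORY_FACTORS_MIB.items():
        if quantity.endswith(suffix):
            return float(quantity[: -len(suffix)]) * factor
    raise ValueError(f"unsupported memory quantity: {quantity}")


def cluster_footprint(num_locales, cpu, memory):
    """Total resources for numLocales locale containers plus one server container."""
    containers = num_locales + 1  # the extra 1 is the arkouda-server container
    return {
        "containers": containers,
        "cpu_millicores": containers * cpu_millicores(cpu),
        "memory_mib": containers * memory_mib(memory),
    }
```

With the values shown above (numLocales: 2, cpu: 2000m, memory: 2048Mi), cluster_footprint(2, "2000m", "2048Mi") gives 3 containers, 6000 millicores, and 6144 MiB in total.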
As stated above, deploying multi-locale Arkouda on Kubernetes involves a sequence of two Helm installs: arkouda-locale followed by arkouda-server.
The arkouda-locale Helm install deploys the Arkouda containers that host the Chapel locales when the arkouda_server process is started. The arkouda-locale Helm chart is located within the $ARKOUDA_HOME/multilocale-dynamic-arkouda-locale-chart directory.
The Helm install command is as follows:
helm install -n arkouda arkouda-locale multilocale-dynamic-arkouda-locale-chart/
The arkouda-server Helm deployment starts the arkouda_server process, which in turn starts up Arkouda Chapel locales within the previously-deployed arkouda-locale containers. The arkouda-server Helm chart is located within the $ARKOUDA_HOME/multilocale-dynamic-arkouda-server-chart directory.
The Helm install command is as follows:
helm install -n arkouda arkouda-server multilocale-dynamic-arkouda-server-chart/
Upon deployment of arkouda-server, the Arkouda-on-Kubernetes instance registers with Kubernetes to enable service discovery by Python clients deployed within Kubernetes.
Just as the order of deploying arkouda-locale and arkouda-server is critically important, so is the undeployment order. Specifically, the arkouda-server deployment must be deleted first; the reason for this is the arkouda-server container deregisters Arkouda from Kubernetes, which has to happen prior to the Arkouda cluster going down.
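The ordering rule (locales up first, server down first) is easy to get wrong in automation, so it can be worth encoding once. The sketch below only builds the command lines and does not run them. The release names and chart directories come from the Helm commands above; the use of helm uninstall for deletion is an assumption based on Helm 3, and the function names are hypothetical.

```python
LOCALE_RELEASE = "arkouda-locale"
SERVER_RELEASE = "arkouda-server"

# Chart directories relative to $ARKOUDA_HOME, as used in the install commands.
CHARTS = {
    LOCALE_RELEASE: "multilocale-dynamic-arkouda-locale-chart/",
    SERVER_RELEASE: "multilocale-dynamic-arkouda-server-chart/",
}

# Locales must exist before the server starts; teardown is the reverse.
DEPLOY_ORDER = [LOCALE_RELEASE, SERVER_RELEASE]


def install_commands(namespace="arkouda"):
    """Return helm install argv lists in the required deployment order."""
    return [
        ["helm", "install", "-n", namespace, release, CHARTS[release]]
        for release in DEPLOY_ORDER
    ]


def uninstall_commands(namespace="arkouda"):
    """Return helm uninstall argv lists, server first (assumes Helm 3)."""
    return [
        ["helm", "uninstall", "-n", namespace, release]
        for release in reversed(DEPLOY_ORDER)
    ]
```

Each list could be passed to subprocess.run(cmd, check=True) from the $ARKOUDA_HOME directory. Any real script would also need to wait for the arkouda-locale pods to be running before issuing the arkouda-server install.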
Persistence must be enabled in both the arkouda-locale and arkouda-server Helm deployments to enable saving of array data across all locales. The configuration from each values.yaml file is as follows:
An example write-read sequence is as follows:
The udp-dynamic-arkouda-server Docker image can be tested outside of Kubernetes by running it locally. Running the Docker container and starting the arkouda_server involves the following steps:
# Start docker container with port 5555 exposed:
docker run -it --rm -p 5555:5555 --entrypoint=bash hokiegeek2/udp-arkouda-server:0.2.0
# Start ssh server (needed to spawn the Arkouda Chapel locale)
sudo service ssh start
# Create an SSH key pair for the ubuntu user and add the public key to the authorized_keys file
ssh-keygen -t rsa -N '' -f ~/.ssh/id_rsa <<< y
cat ~/.ssh/id_rsa.pub > ~/.ssh/authorized_keys
# Startup one-node Arkouda instance
./arkouda_server -nl 1
Warning: Permanently added '' (ECDSA) to the list of known hosts.
* *
* server listening on tcp://e80d237d726a:5555 *
* arkouda server version = *
* memory limit = 29975452876 *
* bytes of memory used = 42641 *
* *
# Test from local machine
>>> import arkouda as ak
_ _ _
/ \ _ __| | _____ _ _ __| | __ _
/ _ \ | '__| |/ / _ \| | | |/ _` |/ _` |
/ ___ \| | | < (_) | |_| | (_| | (_| |
/_/ \_\_| |_|\_\___/ \__,_|\__,_|\__,_|
Client Version: v2021.08.20+151.g9a55747.dirty
>>> ak.connect()
connected to arkouda server tcp://*:5555
>>> ak.get_config()
{'arkoudaVersion': 'v2021.08.20+151.g9a55747.dirty', 'ZMQVersion': '4.3.2', 'HDF5Version': '1.10.5',
'serverHostname': 'ec0ee910eb63', 'ServerPort': 5555, 'MetricsServerPort': 5556, 'numLocales': 1, 'numPUs': 4,
'maxTaskPar': 4, 'physicalMemory': 33306058752, 'distributionType': 'BlockDom(1,int(64),false,unmanaged DefaultDist)',
'LocaleConfigs': [{'id': 0, 'name': 'ec0ee910eb63', 'numPUs': 4, 'maxTaskPar': 4, 'physicalMemory': 33306058752}],
'authenticate': False, 'logLevel': 'INFO', 'byteorder': 'little', 'connectHostname': 'ec0ee910eb63',
'connectHostIp': ''}
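The dictionary returned by ak.get_config() can also be checked programmatically, which is handy when validating a multi-locale deployment. The sketch below relies only on the keys visible in the output above (numLocales, LocaleConfigs, numPUs, physicalMemory); the function name summarize_config is hypothetical.

```python
def summarize_config(config):
    """Cross-check an ak.get_config() dict and total up per-locale resources."""
    locales = config["LocaleConfigs"]
    if len(locales) != config["numLocales"]:
        raise ValueError(
            "numLocales does not match the number of LocaleConfigs entries"
        )
    return {
        "numLocales": config["numLocales"],
        "totalPUs": sum(locale["numPUs"] for locale in locales),
        "totalPhysicalMemory": sum(
            locale["physicalMemory"] for locale in locales
        ),
    }
```

After ak.connect(), calling summarize_config(ak.get_config()) gives a quick view of how many locales came up and how many processing units and bytes of physical memory they report in total.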