This repo demonstrates Spark on Kubernetes with a custom Kubernetes REST API built from scratch. The demo project uses Apache Spark 2.4.4.
- Spark Docker
- Docker
The spark-docker directory contains the Apache Spark Scala and Python base images, which are
used to submit Spark applications to Kubernetes. The demo uses the Spark
Kubernetes operator.
The Scala base image can be built using the following command:
docker build -t <IMAGE_NAME>:<IMAGE_VERSION> ./spark-docker/scala/
Once the image is built, it can be tagged and pushed to the respective Docker repository.
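As a rough sketch of the tag-and-push step, the script below composes a fully qualified image reference and then tags and pushes the locally built image. The registry path, image name, and version defaults are hypothetical placeholders, and the `image_ref` and `run` helpers are illustrative names that are not part of this repo. By default the script only prints the commands; set `DRY_RUN=0` to execute them.

```shell
#!/bin/sh
# Hypothetical values; replace them with your own registry and image names.
REGISTRY="${REGISTRY:-registry.example.com/myteam}"
IMAGE_NAME="${IMAGE_NAME:-spark-scala}"
IMAGE_VERSION="${IMAGE_VERSION:-2.4.4}"

# Compose the fully qualified reference <registry>/<name>:<version>.
image_ref() {
  printf '%s/%s:%s\n' "$1" "$2" "$3"
}

# Print the command by default; set DRY_RUN=0 to actually run it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

TARGET="$(image_ref "$REGISTRY" "$IMAGE_NAME" "$IMAGE_VERSION")"

# Tag the local image with the registry-qualified name, then push it.
run docker tag "$IMAGE_NAME:$IMAGE_VERSION" "$TARGET"
run docker push "$TARGET"
```

Pushing usually requires a prior `docker login` against the target registry.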
The Python base image can be built using the following command:
docker build -t <IMAGE_NAME>:<IMAGE_VERSION> --build-arg base_image=<SCALA_SPARK_BASE_IMAGE> ./spark-docker/python/
The Python Spark base image is built on top of the Scala Spark base image, so the Scala image must be built first. Once the Python image is built, it can be tagged and pushed to the respective Docker repository.
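Because the Python image depends on the Scala image, the two builds have to run in a fixed order. The sketch below prints both build commands in that order so they can be reviewed before running. The image names and tags are hypothetical defaults, and `build_commands` is an illustrative helper rather than something shipped in this repo; the directory paths and the `base_image` build argument are taken from the commands above.

```shell
#!/bin/sh
# Hypothetical image names and tags; adjust to your own naming scheme.
SCALA_IMAGE="${SCALA_IMAGE:-spark-scala:2.4.4}"
PYTHON_IMAGE="${PYTHON_IMAGE:-spark-python:2.4.4}"

# Emit the two build commands in dependency order: Scala first, then Python.
# docker build takes one --build-arg flag per variable.
build_commands() {
  echo "docker build -t $1 ./spark-docker/scala/"
  echo "docker build -t $2 --build-arg base_image=$1 ./spark-docker/python/"
}

# Review the printed commands, then pipe them to a shell to execute them:
#   build_commands "$SCALA_IMAGE" "$PYTHON_IMAGE" | sh -e
build_commands "$SCALA_IMAGE" "$PYTHON_IMAGE"
```

Piping to `sh -e` stops at the first failing command, so the Python build is not attempted if the Scala build fails.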
WORK IN PROGRESS