Merge pull request #3 from guimou/ubi8
release 0.1.0
guimou authored Jul 11, 2022
2 parents 8025512 + fbe36c3 commit 890bf94
Showing 103 changed files with 1,306 additions and 529 deletions.
24 changes: 24 additions & 0 deletions CHANGELOG.md
@@ -0,0 +1,24 @@
# Changelog

All notable changes to this project will be documented in this file.

The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [Unreleased]

Nothing there!

## [0.1.0] - 2022-07-11

### Added

- First official release
- Image for Spark 3.3.0 + Hadoop 3.3.3

### Changed

- All images now based on ubi8/openjdk-8
- Updated the Google Spark Operator version (v1beta2-1.3.3-3.1.1) in the instructions (see the install sketch below)
- Renamed images
- Removed unneeded resources
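For context, a minimal sketch of installing the Google Spark Operator at the version referenced above. The Helm repo URL, chart name, and release name are assumptions based on the upstream spark-on-k8s-operator project, not taken from this repo's README (which is not rendered in this diff):

    # Assumed chart coordinates; pin the operator image to the version in the changelog
    helm repo add spark-operator https://googlecloudplatform.github.io/spark-on-k8s-operator
    helm install spark-operator spark-operator/spark-operator \
      --namespace spark-operator --create-namespace \
      --set image.tag=v1beta2-1.3.3-3.1.1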
674 changes: 674 additions & 0 deletions LICENSE

Large diffs are not rendered by default.

8 changes: 8 additions & 0 deletions OWNERS
@@ -0,0 +1,8 @@
# Each list is sorted alphabetically; additions should maintain that order
approvers:
- guimou
- wseaton

reviewers:
- guimou
- wseaton
180 changes: 120 additions & 60 deletions README.adoc

Large diffs are not rendered by default.

File renamed without changes
Binary file added doc/img/history_server.png
File renamed without changes
2 changes: 1 addition & 1 deletion spark-history-server/spark-hs-deployment.yaml
@@ -1,4 +1,4 @@
-### Adpated from Helm Chart: https://artifacthub.io/packages/helm/spot/spark-history-server
+### Adapted from Helm Chart: https://artifacthub.io/packages/helm/spot/spark-history-server
 # Source: spark-history-server/templates/serviceaccount.yaml
 apiVersion: v1
 kind: ServiceAccount
2 changes: 1 addition & 1 deletion spark-history-server/spark-hs-obc.yaml
@@ -4,4 +4,4 @@ metadata:
   name: obc-spark-history-server
 spec:
   generateBucketName: obc-spark-history-server
-  storageClassName: ocs-storagecluster-ceph-rgw
+  storageClassName: openshift-storage.noobaa.io
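For reference: once this ObjectBucketClaim is bound, the OBC provisioner creates a ConfigMap and a Secret named after the claim, carrying the bucket coordinates and S3 credentials. A sketch of inspecting them, assuming the standard OBC-generated keys (this wiring is not shown in the diff):

    # Assumed standard OBC output: ConfigMap holds bucket metadata, Secret holds credentials
    oc get configmap obc-spark-history-server -o jsonpath='{.data.BUCKET_NAME}'
    oc get secret obc-spark-history-server -o jsonpath='{.data.AWS_ACCESS_KEY_ID}' | base64 -d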
37 changes: 37 additions & 0 deletions spark-images/pyspark-2.4.4_hadoop-2.8.5.Dockerfile
@@ -0,0 +1,37 @@
# Note: Spark 2.4.4 supports Python up to 3.7 only
# As 3.7 is not available in the ubi8 images, we will install Python 3.6

ARG base_img

FROM $base_img

EXPOSE 8080

ENV PYTHON_VERSION=3.6 \
    PATH=$HOME/.local/bin/:$PATH \
    PYTHONUNBUFFERED=1 \
    PYTHONIOENCODING=UTF-8 \
    LC_ALL=en_US.UTF-8 \
    LANG=en_US.UTF-8 \
    CNB_STACK_ID=com.redhat.stacks.ubi8-python-36 \
    CNB_USER_ID=1001 \
    CNB_GROUP_ID=0 \
    PIP_NO_CACHE_DIR=off

USER 0

RUN INSTALL_PKGS="python36 python36-devel python3-virtualenv python3-setuptools python3-pip \
nss_wrapper httpd httpd-devel mod_ssl mod_auth_gssapi \
mod_ldap mod_session atlas-devel gcc-gfortran libffi-devel \
libtool-ltdl enchant" && \
microdnf -y module enable python36:3.6 httpd:2.4 && \
microdnf -y --setopt=tsflags=nodocs install $INSTALL_PKGS && \
microdnf -y clean all --enablerepo='*' && \
ln -s /usr/bin/python3 /usr/bin/python

ENV PYTHONPATH ${SPARK_HOME}/python/lib/pyspark.zip:${SPARK_HOME}/python/lib/py4j-*.zip

WORKDIR /opt/spark/work-dir
ENTRYPOINT [ "/opt/entrypoint.sh" ]

USER 185
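All four PySpark Dockerfiles in this changeset layer on top of a Spark base image passed in through the base_img build argument. A sketch of how this image might be built; the image tags are hypothetical, not taken from this repo's Makefile or docs:

    # Assumed local tags; base_img must point at a previously built Spark base image
    podman build \
      --build-arg base_img=spark:2.4.4_hadoop-2.8.5 \
      -t pyspark:2.4.4_hadoop-2.8.5 \
      -f spark-images/pyspark-2.4.4_hadoop-2.8.5.Dockerfile \
      spark-images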
37 changes: 37 additions & 0 deletions spark-images/pyspark-2.4.6_hadoop-3.3.0.Dockerfile
@@ -0,0 +1,37 @@
# Note: Spark 2.4.6 supports Python up to 3.7 only
# As 3.7 is not available in the ubi8 images, we will install Python 3.6

ARG base_img

FROM $base_img

EXPOSE 8080

ENV PYTHON_VERSION=3.6 \
    PATH=$HOME/.local/bin/:$PATH \
    PYTHONUNBUFFERED=1 \
    PYTHONIOENCODING=UTF-8 \
    LC_ALL=en_US.UTF-8 \
    LANG=en_US.UTF-8 \
    CNB_STACK_ID=com.redhat.stacks.ubi8-python-36 \
    CNB_USER_ID=1001 \
    CNB_GROUP_ID=0 \
    PIP_NO_CACHE_DIR=off

USER 0

RUN INSTALL_PKGS="python36 python36-devel python3-virtualenv python3-setuptools python3-pip \
nss_wrapper httpd httpd-devel mod_ssl mod_auth_gssapi \
mod_ldap mod_session atlas-devel gcc-gfortran libffi-devel \
libtool-ltdl enchant" && \
microdnf -y module enable python36:3.6 httpd:2.4 && \
microdnf -y --setopt=tsflags=nodocs install $INSTALL_PKGS && \
microdnf -y clean all --enablerepo='*' && \
ln -s /usr/bin/python3 /usr/bin/python

ENV PYTHONPATH ${SPARK_HOME}/python/lib/pyspark.zip:${SPARK_HOME}/python/lib/py4j-*.zip

WORKDIR /opt/spark/work-dir
ENTRYPOINT [ "/opt/entrypoint.sh" ]

USER 185
35 changes: 35 additions & 0 deletions spark-images/pyspark-3.0.1_hadoop-3.3.0.Dockerfile
@@ -0,0 +1,35 @@
# Note: Spark 3.0.1 supports Python up to 3.8, which is the version installed here

ARG base_img

FROM $base_img

EXPOSE 8080

ENV PYTHON_VERSION=3.8 \
    PATH=$HOME/.local/bin/:$PATH \
    PYTHONUNBUFFERED=1 \
    PYTHONIOENCODING=UTF-8 \
    LC_ALL=en_US.UTF-8 \
    LANG=en_US.UTF-8 \
    CNB_STACK_ID=com.redhat.stacks.ubi8-python-38 \
    CNB_USER_ID=1001 \
    CNB_GROUP_ID=0 \
    PIP_NO_CACHE_DIR=off

USER 0

RUN INSTALL_PKGS="python38 python38-devel python38-setuptools python38-pip nss_wrapper \
httpd httpd-devel mod_ssl mod_auth_gssapi mod_ldap \
mod_session atlas-devel gcc-gfortran libffi-devel libtool-ltdl enchant" && \
microdnf -y module enable python38:3.8 httpd:2.4 && \
microdnf -y --setopt=tsflags=nodocs install $INSTALL_PKGS && \
microdnf -y clean all --enablerepo='*' && \
ln -s /usr/bin/python3 /usr/bin/python

ENV PYTHONPATH ${SPARK_HOME}/python/lib/pyspark.zip:${SPARK_HOME}/python/lib/py4j-*.zip

WORKDIR /opt/spark/work-dir
ENTRYPOINT [ "/opt/entrypoint.sh" ]

USER 185
35 changes: 35 additions & 0 deletions spark-images/pyspark-3.3.0_hadoop-3.3.3.Dockerfile
@@ -0,0 +1,35 @@
# Note: Spark 3.3.0 supports Python up to 3.10, but we will install 3.9

ARG base_img

FROM $base_img

EXPOSE 8080

ENV PYTHON_VERSION=3.9 \
    PATH=$HOME/.local/bin/:$PATH \
    PYTHONUNBUFFERED=1 \
    PYTHONIOENCODING=UTF-8 \
    LC_ALL=en_US.UTF-8 \
    LANG=en_US.UTF-8 \
    CNB_STACK_ID=com.redhat.stacks.ubi8-python-39 \
    CNB_USER_ID=1001 \
    CNB_GROUP_ID=0 \
    PIP_NO_CACHE_DIR=off

USER 0

RUN INSTALL_PKGS="python39 python39-devel python39-setuptools python39-pip nss_wrapper \
httpd httpd-devel mod_ssl mod_auth_gssapi mod_ldap \
mod_session atlas-devel gcc-gfortran libffi-devel libtool-ltdl enchant" && \
microdnf -y module enable python39:3.9 httpd:2.4 && \
microdnf -y --setopt=tsflags=nodocs install $INSTALL_PKGS && \
microdnf -y clean all --enablerepo='*' && \
ln -s /usr/bin/python3 /usr/bin/python

ENV PYTHONPATH ${SPARK_HOME}/python/lib/pyspark.zip:${SPARK_HOME}/python/lib/py4j-*.zip

WORKDIR /opt/spark/work-dir
ENTRYPOINT [ "/opt/entrypoint.sh" ]

USER 185
32 changes: 0 additions & 32 deletions spark-images/pyspark.Dockerfile

This file was deleted.

@@ -1,13 +1,21 @@
-FROM openjdk:8-jdk-alpine AS builder
+FROM registry.access.redhat.com/ubi8/openjdk-8:1.13 AS builder
 
 # set desired spark, hadoop and kubernetes client versions
 ARG spark_version=2.4.4
 ARG hadoop_version=2.8.5
 ARG kubernetes_client_version=4.6.4
 ARG jmx_prometheus_javaagent_version=0.15.0
-ARG aws_java_sdk_version=1.11.682
+ARG aws_java_sdk_version=1.12.255
 ARG spark_uid=185
 
+USER 0
+
+WORKDIR /
+
+# Install gzip to extract archives
+RUN microdnf install -y gzip && \
+    microdnf clean all
+
 # Download Spark
 ADD https://archive.apache.org/dist/spark/spark-${spark_version}/spark-${spark_version}-bin-without-hadoop.tgz .
 # Unzip Spark
@@ -35,12 +43,19 @@ ADD https://repo1.maven.org/maven2/io/fabric8/kubernetes-model/${kubernetes_clie

 RUN chmod 0644 jars/kubernetes-*.jar
 
-# Install aws-java-sdk
+# Delete old aws-java-sdk and replace with newer version
 WORKDIR /hadoop/share/hadoop/tools/lib
+RUN rm -f ./aws-java-sdk-*.jar
 ADD https://repo1.maven.org/maven2/com/amazonaws/aws-java-sdk-bundle/${aws_java_sdk_version}/aws-java-sdk-bundle-${aws_java_sdk_version}.jar .
 RUN chmod 0644 aws-java-sdk-bundle*.jar
 
-FROM openjdk:8-jdk-alpine as final
+FROM registry.access.redhat.com/ubi8/openjdk-8:1.13 as final
 
+# Fix for https://issues.redhat.com/browse/OPENJDK-335
+ENV NSS_WRAPPER_PASSWD=
+ENV NSS_WRAPPER_GROUP=
+
+USER 0
+
 WORKDIR /opt/spark

@@ -54,18 +69,33 @@ COPY --from=builder /hadoop /opt/hadoop
 # Copy Prometheus jars from builder stage
 COPY --from=builder /prometheus /prometheus
 
+# Add an init process, check the checksum to make sure it's a match
+RUN set -e ; \
+    TINI_BIN=""; \
+    TINI_SHA256=""; \
+    TINI_VERSION="v0.19.0"; \
+    case "$(arch)" in \
+    x86_64) \
+        TINI_BIN="tini-amd64"; \
+        TINI_SHA256="93dcc18adc78c65a028a84799ecf8ad40c936fdfc5f2a57b1acda5a8117fa82c"; \
+        ;; \
+    aarch64) \
+        TINI_BIN="tini-arm64"; \
+        TINI_SHA256="07952557df20bfd2a95f9bef198b445e006171969499a1d361bd9e6f8e5e0e81"; \
+        ;; \
+    *) \
+        echo >&2 ; echo >&2 "Unsupported architecture \$(arch)" ; echo >&2 ; exit 1 ; \
+        ;; \
+    esac ; \
+    curl --retry 8 -S -L -O "https://github.com/krallin/tini/releases/download/${TINI_VERSION}/${TINI_BIN}" ; \
+    echo "${TINI_SHA256} ${TINI_BIN}" | sha256sum -c - ; \
+    mv "${TINI_BIN}" /usr/sbin/tini ; \
+    chmod +x /usr/sbin/tini
+
 RUN set -ex && \
-    apk upgrade --no-cache && \
-    ln -s /lib /lib64 && \
-    apk add --no-cache bash tini libc6-compat linux-pam nss && \
     mkdir -p /opt/spark && \
     mkdir -p /opt/spark/work-dir && \
-    touch /opt/spark/RELEASE && \
-    rm /bin/sh && \
-    ln -sv /bin/bash /bin/sh && \
-    echo "auth required pam_wheel.so use_uid" >> /etc/pam.d/su && \
-    chgrp root /etc/passwd && chmod ug+rw /etc/passwd && \
-    rm -rf /var/cache/apt/*
+    touch /opt/spark/RELEASE
 
 # Configure environment variables for spark
 ENV SPARK_HOME /opt/spark
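The tini binary fetched above replaces the one previously installed via apk; /opt/entrypoint.sh (not changed in this diff) is expected to exec it as PID 1 so the Spark JVM gets proper signal forwarding and zombie reaping. A sketch of the typical hand-off, assuming the stock Spark entrypoint layout rather than anything shown here:

    # Assumed shape of the entrypoint's final line: tini becomes PID 1 and
    # launches the assembled Spark driver/executor command
    exec /usr/sbin/tini -s -- "${CMD[@]}"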
