Merge pull request #58 from e0ne/precompiled-driver
Added precompiled container build instructions for DOCA drivers
rollandf authored Jun 3, 2024
2 parents d1e245d + bc54862 commit 55b18d0
Showing 5 changed files with 1,524 additions and 0 deletions.
120 changes: 120 additions & 0 deletions docs/advanced-configurations.rst
Expand Up @@ -304,3 +304,123 @@ If self-signed certificates are used for an HTTPS based internal repository, a C
deploy: true
certConfg:
name: cert-config
=========================================================
Precompiled Container Build Instructions for DOCA Drivers
=========================================================

-------------
Prerequisites
-------------

Before you begin, ensure that you have the following prerequisites:

~~~~~~
Common
~~~~~~

- Docker (Ubuntu) / Podman (RH) installed on your build system.
- Web access to the NVIDIA NIC driver sources. The latest NIC drivers are published at the `NIC drivers download center <https://network.nvidia.com/products/infiniband-drivers/linux/mlnx_ofed/>`_, for example: `https://www.mellanox.com/downloads/ofed/MLNX_OFED-24.04-0.6.6.0/MLNX_OFED_SRC-debian-24.04-0.6.6.0-0.tgz <https://www.mellanox.com/downloads/ofed/MLNX_OFED-24.04-0.6.6.0/MLNX_OFED_SRC-debian-24.04-0.6.6.0-0.tgz>`_


~~~~
RHEL
~~~~

- Active subscription and login credentials for `registry.redhat.io <https://registry.redhat.io>`_. To build a RHEL-based container from the official repository, log in to `registry.redhat.io <https://registry.redhat.io>`_ by running the following command:

.. code-block:: bash

   podman login registry.redhat.io --username=${RH_USERNAME} --password=${RH_PASSWORD}

Replace `RH_USERNAME` and `RH_PASSWORD` with your Red Hat account username and password.

-------------------
Dockerfile Overview
-------------------

The Dockerfile used to build the precompiled container is a multistage build.
This approach reduces the size of the resulting container image and the number of dependencies included in the final image.

The Dockerfile consists of the following stages:

1. **Base Image Update**: The base image is updated and common requirements are installed. This stage sets up the basic environment for the subsequent stages.

2. **Download Driver Sources**: This stage downloads the Mellanox OFED driver sources to the specified path. It prepares the necessary files for the driver build process.

3. **Build Driver**: The driver is built using the downloaded sources and installed on the container. This stage ensures that the driver is compiled and configured correctly for the target system.

4. **Install precompiled driver**: Finally, the precompiled driver is installed on a clean container image. This stage sets up the environment to run the NVIDIA NIC drivers on the target system.


---------------------------------
Common mandatory build parameters
---------------------------------

Before building the container, provide the following parameters as `build-arg` values for the container build:

1. `D_OS`: The Linux distribution (e.g., ubuntu22.04 / rhel9.2)
2. `D_ARCH`: The target CPU architecture (e.g., x86_64)
3. `D_BASE_IMAGE`: Base container image
4. `D_KERNEL_VER`: The target kernel version (e.g., 5.15.0-25-generic / 5.14.0-284.32.1.el9_2.x86_64)
5. `D_OFED_VERSION`: NVIDIA NIC drivers version (e.g., 24.01-0.3.3.1)

**NOTE:** Check that the desired NVIDIA NIC driver sources are available for the designated container OS. Only versions available on the download page can be used.
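
One way to check availability before starting a build is to probe the source archive URL. The sketch below assumes the URL layout used by the `D_OFED_BASE_URL` and `D_OFED_SRC_ARCHIVE` arguments in the provided Dockerfiles. The function names `ofed_src_url` and `ofed_src_available` are hypothetical helpers and are not part of the provided scripts.

.. code-block:: bash

   #!/usr/bin/env bash
   # Compose the source archive URL the same way the provided Dockerfiles do.
   # $1: driver version (D_OFED_VERSION)
   # $2: source type prefix, as in D_OFED_SRC_TYPE
   #     ("debian-" for Ubuntu, empty for RHEL)
   ofed_src_url() {
       local version="$1" src_type="${2:-}"
       printf 'https://www.mellanox.com/downloads/ofed/MLNX_OFED-%s/MLNX_OFED_SRC-%s%s.tgz\n' \
           "$version" "$src_type" "$version"
   }

   # Succeed only if the server answers a HEAD request without an HTTP error.
   ofed_src_available() {
       curl --silent --fail --head --location --output /dev/null "$(ofed_src_url "$@")"
   }

   # Example (needs network access):
   #   ofed_src_available 24.01-0.3.3.1 debian- || echo "sources not published" >&2

If the probe fails, the build would most likely fail later at the download stage, so checking first saves a full base-image update.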

------------------------------
RHEL-specific build parameters
------------------------------

1. `D_BASE_IMAGE`: DriverToolKit container image

**NOTE:** DTK (DriverToolKit) is tightly coupled with specific kernel versions. Verify that the DTK image matches the kernel version you are compiling the drivers for.

2. `D_FINAL_BASE_IMAGE`: Final container image, into which the compiled driver is installed

For more details regarding DTK, see the `official documentation <https://docs.openshift.com/container-platform/4.15/hardware_enablement/psap-driver-toolkit.html#pulling-the-driver-toolkit-from-payload>`_.
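
The match between a DTK image and the target kernel can be checked before building. The provided RHEL Dockerfile passes `/lib/modules/${D_KERNEL_VER}/build` as the kernel sources, so a minimal sketch is to list `/lib/modules` in the DTK image and look for the target version. The helper name `kernel_in_list` is hypothetical, and the commented `podman run` invocation assumes the DTK image allows running `ls` as the container command.

.. code-block:: bash

   #!/usr/bin/env bash
   # Return 0 if the target kernel version is one of the entries in a
   # newline-separated list (for example, the output of `ls /lib/modules`).
   kernel_in_list() {
       local target="$1" available="$2"
       # -F: fixed string, -x: match the whole line, -q: no output
       printf '%s\n' "$available" | grep -Fxq -- "$target"
   }

   # Example (needs podman and a registry.redhat.io login):
   #   available="$(podman run --rm "${D_BASE_IMAGE}" ls /lib/modules)"
   #   kernel_in_list "${D_KERNEL_VER}" "${available}" || echo "DTK/kernel mismatch" >&2

Matching whole lines avoids accepting a version prefix such as `5.14.0-284` when only a longer version string is present.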

**NOTE:** For proper Network Operator functionality, the container tag name must follow the pattern **driver_ver-container_ver-kernel_ver-os-arch**. For example: 24.01-0.3.3.1-0-5.15.0-25-generic-ubuntu22.04-amd64
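
Composing the tag from the same values passed as build arguments helps keep the two in sync. The following is a minimal sketch of the pattern above. The function name `driver_image_tag` is a hypothetical helper, not something the Network Operator or the provided scripts define.

.. code-block:: bash

   #!/usr/bin/env bash
   # Compose an image tag as driver_ver-container_ver-kernel_ver-os-arch.
   driver_image_tag() {
       local driver_ver="$1" container_ver="$2" kernel_ver="$3" os="$4" arch="$5"
       printf '%s-%s-%s-%s-%s\n' \
           "$driver_ver" "$container_ver" "$kernel_ver" "$os" "$arch"
   }

   driver_image_tag 24.01-0.3.3.1 0 5.15.0-25-generic ubuntu22.04 amd64
   # prints: 24.01-0.3.3.1-0-5.15.0-25-generic-ubuntu22.04-amd64

Note that the examples in this document pass `x86_64` as `D_ARCH` but use `amd64` as the architecture component of the tag, so the two values are supplied separately here.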

~~~~~~~~~~~~
RHEL example
~~~~~~~~~~~~

To build a RHEL-based image, use the provided :download:`Dockerfile <files/RHEL_Dockerfile>`:

.. code-block:: bash

   podman build \
       --build-arg D_OS=rhel9.2 \
       --build-arg D_ARCH=x86_64 \
       --build-arg D_KERNEL_VER=5.14.0-284.32.1.el9_2.x86_64 \
       --build-arg D_OFED_VERSION=24.01-0.3.3.1 \
       --build-arg D_BASE_IMAGE="registry.redhat.io/openshift4/driver-toolkit-rhel9:v4.13.0-202309112001.p0.gd719bdc.assembly.stream" \
       --build-arg D_FINAL_BASE_IMAGE=registry.access.redhat.com/ubi9/ubi:latest \
       --tag 24.01-0.3.3.1-0-5.14.0-284.32.1.el9_2-rhel9.2-amd64 \
       -f RHEL_Dockerfile \
       --target precompiled .

~~~~~~~~~~~~~~
Ubuntu example
~~~~~~~~~~~~~~

To build an Ubuntu-based image, use the provided :download:`Dockerfile <files/Ubuntu_Dockerfile>`:

.. code-block:: bash

   docker build \
       --build-arg D_OS=ubuntu22.04 \
       --build-arg D_ARCH=x86_64 \
       --build-arg D_BASE_IMAGE=ubuntu:22.04 \
       --build-arg D_KERNEL_VER=5.15.0-25-generic \
       --build-arg D_OFED_VERSION=24.01-0.3.3.1 \
       --tag 24.01-0.3.3.1-0-5.15.0-25-generic-ubuntu22.04-amd64 \
       -f Ubuntu_Dockerfile \
       --target precompiled .

**NOTE:** The Dockerfiles contain default build parameters, which may cause the build process to fail on your system if they are not overridden.

**NOTE:** Entrypoint script :download:`download <files/entrypoint.sh>`

**NOTE:** Driver build script :download:`download <files/dtk_nic_driver_build.sh>`

.. warning:: Any modification of `D_OFED_SRC_DOWNLOAD_PATH` must be tightly coupled with a corresponding update to the entrypoint.sh script.
141 changes: 141 additions & 0 deletions docs/files/RHEL_Dockerfile
@@ -0,0 +1,141 @@
# Common (multistage) args
ARG D_OS="rhel9.2"
ARG D_ARCH="x86_64"
ARG D_CONTAINER_VER="0"
ARG D_OFED_VERSION="24.04-0.6.6.0"
ARG D_KERNEL_VER="5.14.0-284.32.1.el9_2.x86_64"
ARG D_OFED_SRC_DOWNLOAD_PATH="/run/mellanox/src"
ARG OFED_SRC_LOCAL_DIR=${D_OFED_SRC_DOWNLOAD_PATH}/MLNX_OFED_SRC-${D_OFED_VERSION}

# Final clean image of precompiled driver container
ARG D_FINAL_BASE_IMAGE=registry.access.redhat.com/ubi9/ubi:latest

##################################################################
# Stage: Minimal base image update and install common requirements

# DTK base image (below example for specific kernel headers version)
ARG D_BASE_IMAGE="registry.redhat.io/openshift4/driver-toolkit-rhel9:v4.13.0-202309112001.p0.gd719bdc.assembly.stream"
# Standard: registry.access.redhat.com/ubi9:latest

ARG D_PYTHON_VERSION="36"
ARG D_PYTHON="python${D_PYTHON_VERSION}"

FROM $D_BASE_IMAGE AS base

# Inherited global args
ARG D_OS

RUN if [[ "${D_OS}" == *"rhel9"* ]] ; then \
sed -i 's#/etc/pki/entitlement#/etc/pki/entitlement-host#g' /etc/rhsm/rhsm.conf ;\
fi

RUN set -x && \
# Driver build / install script requirements
dnf -y install perl \
# Container functional requirements
jq iproute kmod procps-ng udev

##############################################################################################
# Stage: Download NVIDIA driver sources and install src driver container packages requirements

FROM base AS driver-src

# Inherited global args
ARG D_OFED_VERSION
ARG D_CONTAINER_VER
ARG D_OFED_SRC_DOWNLOAD_PATH

# Stage args
ARG D_OFED_BASE_URL="https://www.mellanox.com/downloads/ofed/MLNX_OFED-${D_OFED_VERSION}"
ARG D_OFED_SRC_TYPE=""

ARG D_OFED_SRC_ARCHIVE="MLNX_OFED_SRC-${D_OFED_SRC_TYPE}${D_OFED_VERSION}.tgz"
ARG D_OFED_URL_PATH="${D_OFED_BASE_URL}/${D_OFED_SRC_ARCHIVE}"

ENV NVIDIA_NIC_DRIVER_VER=${D_OFED_VERSION}
ENV NVIDIA_NIC_CONTAINER_VER=${D_CONTAINER_VER}
ENV NVIDIA_NIC_DRIVER_PATH="${D_OFED_SRC_DOWNLOAD_PATH}/MLNX_OFED_SRC-${D_OFED_VERSION}"

WORKDIR /root
RUN set -x && \
# Install prerequisites
dnf install -y curl --allowerasing \
# Driver build requirements
autoconf python3-devel ethtool automake pciutils libtool hostname

RUN set -x && \
# Download NVIDIA NIC driver sources
mkdir -p ${D_OFED_SRC_DOWNLOAD_PATH} && \
cd ${D_OFED_SRC_DOWNLOAD_PATH} && (curl -sL ${D_OFED_URL_PATH} | tar -xzf -)

WORKDIR /
ADD ./entrypoint.sh /root/entrypoint.sh
ADD ./dtk_nic_driver_build.sh /root/dtk_nic_driver_build.sh

ENTRYPOINT ["/root/entrypoint.sh"]
CMD ["sources"]

#####################
# Stage: Build driver

FROM driver-src AS driver-builder

# Inherited global args
ARG D_OS
ARG D_KERNEL_VER
ARG OFED_SRC_LOCAL_DIR

RUN set -x && \
# MOFED installation requirements
dnf install -y autoconf gcc make rpm-build

# Build driver
RUN set -x && \
${OFED_SRC_LOCAL_DIR}/install.pl --without-depcheck --distro ${D_OS} --kernel ${D_KERNEL_VER} --kernel-sources /lib/modules/${D_KERNEL_VER}/build --kernel-only --build-only --without-iser --without-srp --without-isert --without-knem --without-xpmem --with-mlnx-tools --with-ofed-scripts --copy-ifnames-udev

###################################
# Stage: Install precompiled driver

ARG D_FINAL_BASE_IMAGE

FROM $D_FINAL_BASE_IMAGE AS precompiled

# Inherited global args
ARG D_ARCH
ARG D_KERNEL_VER
ARG D_OFED_VERSION
ARG D_CONTAINER_VER
ARG OFED_SRC_LOCAL_DIR

ENV NVIDIA_NIC_DRIVER_VER=${D_OFED_VERSION}
ENV NVIDIA_NIC_DRIVER_PATH=""
ENV NVIDIA_NIC_CONTAINER_VER=${D_CONTAINER_VER}

COPY --from=driver-builder ${OFED_SRC_LOCAL_DIR}/RPMS/redhat-release-*/${D_ARCH}/*.rpm /root/

WORKDIR /root/
RUN set -x && \
rpm -ivh --nodeps \
./kmod-mlnx-nfsrdma-*.rpm \
./kmod-mlnx-nvme-*.rpm \
./kmod-mlnx-ofa_kernel-*.rpm \
./mlnx-ofa_kernel-*.rpm \
./mlnx-tools-*.rpm

RUN set -x && \
# MOFED functional requirements
dnf install -y pciutils hostname udev ethtool \
# Container functional requirements
jq iproute kmod procps-ng udev

# Prevent modprobe from giving a WARNING about missing files
RUN touch /lib/modules/${D_KERNEL_VER}/modules.order /lib/modules/${D_KERNEL_VER}/modules.builtin && \
# Introduce installed kernel modules
depmod ${D_KERNEL_VER}

WORKDIR /
ADD ./entrypoint.sh /root/entrypoint.sh
ADD ./dtk_nic_driver_build.sh /root/dtk_nic_driver_build.sh

ENTRYPOINT ["/root/entrypoint.sh"]
CMD ["precompiled"]
123 changes: 123 additions & 0 deletions docs/files/Ubuntu_Dockerfile
@@ -0,0 +1,123 @@
# Common (multistage) args
ARG D_OS="ubuntu22.04"
ARG D_ARCH="x86_64"
ARG D_CONTAINER_VER="0"
ARG D_OFED_VERSION="24.04-0.6.6.0"
ARG D_KERNEL_VER="5.15.0-25-generic"
ARG D_OFED_SRC_DOWNLOAD_PATH="/run/mellanox/src"
ARG OFED_SRC_LOCAL_DIR=${D_OFED_SRC_DOWNLOAD_PATH}/MLNX_OFED_SRC-${D_OFED_VERSION}

# Common for build and final clean image of precompiled driver container
ARG D_BASE_IMAGE="ubuntu:22.04"

##################################################################
# Stage: Minimal base image update and install common requirements
FROM $D_BASE_IMAGE AS base

ARG D_APT_REMOVE=""
ARG D_OFED_VERSION
ARG D_CONTAINER_VER
ARG D_OFED_SRC_DOWNLOAD_PATH

ENV NVIDIA_NIC_DRIVER_VER=${D_OFED_VERSION}
ENV NVIDIA_NIC_CONTAINER_VER=${D_CONTAINER_VER}

WORKDIR /root
RUN set -x && \
for source in ${D_APT_REMOVE}; do rm -f /etc/apt/sources.list.d/${source}.list; done && \
# Perform distro update and install prerequisites
apt-get -yq update && \
DEBIAN_FRONTEND=noninteractive apt-get -yq upgrade && \
DEBIAN_FRONTEND=noninteractive apt-get -yq install apt-utils \
# Driver build / install script requirements
perl pciutils kmod lsof python3 dh-python \
# Container functional requirements
jq iproute2 udev ethtool

WORKDIR /
ADD ./entrypoint.sh /root/entrypoint.sh

ENTRYPOINT ["/root/entrypoint.sh"]

##############################################################################################
# Stage: Download NVIDIA driver sources and install src driver container packages requirements

FROM base AS driver-src

# Inherited global args
ARG D_OFED_VERSION
ARG D_OFED_SRC_DOWNLOAD_PATH

# Stage args
ARG D_OFED_BASE_URL="https://www.mellanox.com/downloads/ofed/MLNX_OFED-${D_OFED_VERSION}"
ARG D_OFED_SRC_TYPE="debian-"

ARG D_OFED_SRC_ARCHIVE="MLNX_OFED_SRC-${D_OFED_SRC_TYPE}${D_OFED_VERSION}.tgz"
ARG D_OFED_URL_PATH="${D_OFED_BASE_URL}/${D_OFED_SRC_ARCHIVE}"

ENV NVIDIA_NIC_DRIVER_PATH="${D_OFED_SRC_DOWNLOAD_PATH}/MLNX_OFED_SRC-${D_OFED_VERSION}"

WORKDIR /root
RUN set -x && \
# Install prerequisites
DEBIAN_FRONTEND=noninteractive apt-get -yq install curl \
dkms make autoconf autotools-dev chrpath automake hostname debhelper gcc quilt libc6-dev build-essential pkg-config && \
# Cleanup
apt-get clean autoclean && \
rm -rf /var/lib/apt/lists/*

RUN set -x && \
# Download NVIDIA NIC driver sources
mkdir -p ${D_OFED_SRC_DOWNLOAD_PATH} && \
cd ${D_OFED_SRC_DOWNLOAD_PATH} && (curl -sL ${D_OFED_URL_PATH} | tar -xzf -)

CMD ["sources"]

#####################
# Stage: Build driver

FROM driver-src AS driver-builder

# Inherited global args
ARG D_OS
ARG D_KERNEL_VER
ARG OFED_SRC_LOCAL_DIR

# Driver build mandatory packages
RUN set -x && \
apt-get update && \
DEBIAN_FRONTEND=noninteractive apt-get -yq install linux-image-${D_KERNEL_VER} linux-headers-${D_KERNEL_VER}

# Build driver
RUN set -x && \
${OFED_SRC_LOCAL_DIR}/install.pl --without-depcheck --distro ${D_OS} --without-dkms --kernel ${D_KERNEL_VER} --kernel-only --build-only --copy-ifnames-udev --with-mlnx-tools --without-knem-modules --without-srp-modules --without-kernel-mft-modules --without-iser-modules --without-isert-modules

###################################
# Stage: Install precompiled driver

FROM base AS precompiled

# Inherited global args
ARG D_OS
ARG D_ARCH
ARG D_KERNEL_VER
ARG OFED_SRC_LOCAL_DIR

ENV NVIDIA_NIC_DRIVER_PATH=""

RUN set -x && \
apt-get install -y lsb-release && \
# Cleanup
apt-get clean autoclean && \
rm -rf /var/lib/apt/lists/*

# Install driver
COPY --from=driver-builder ${OFED_SRC_LOCAL_DIR}/DEBS/${D_OS}/${D_ARCH}/*.deb /root/
RUN dpkg -i /root/*.deb

# Prevent modprobe from giving a WARNING about missing files
RUN touch /lib/modules/${D_KERNEL_VER}/modules.order /lib/modules/${D_KERNEL_VER}/modules.builtin && \
# Introduce installed kernel modules
depmod ${D_KERNEL_VER}

CMD ["precompiled"]