Update CI #5

Merged
merged 6 commits into from
Apr 23, 2024
21 changes: 8 additions & 13 deletions .github/workflows/docker-build-test-upload.yml
@@ -24,6 +24,11 @@ on:
 required: false
 type: string
 default: default
+context:
+description: Path to Dockerfile location
+required: false
+type: string
+default: default
 registry:
 description: Registry
 required: false
@@ -106,22 +111,12 @@ jobs:
 # generate dockerfile
 cd base-gpu-notebook && bash generate_dockerfile.sh && cd ..

-- name: Build base image 🛠
-if: contains(inputs.image, 'base-gpu-notebook')
-id: build_base_image
-uses: docker/build-push-action@v5
-with:
-context: ${{ inputs.image }}/.build/${{ inputs.variant }}/
-push: ${{ inputs.push }}
-tags: ${{ inputs.registry }}/${{ env.OWNER }}/${{ inputs.image }}:${{ inputs.variant }}
-
 - name: Build image 🛠
-if: |
-inputs.parent-image != '' || !contains(inputs.image, 'base-gpu-notebook')
+if: inputs.parent-image != ''
 id: build_image
 uses: docker/build-push-action@v5
 with:
-context: ${{ inputs.image }}/${{ inputs.variant }}/
+context: ${{ inputs.context }}
 push: ${{ inputs.push }}
 tags: ${{ inputs.registry }}/${{ env.OWNER }}/${{ inputs.image }}:${{ inputs.variant }}

@@ -136,7 +131,7 @@ jobs:
 if: inputs.parent-image != ''
 run: |
 mkdir -p /tmp/a2s/images/
-docker save ${{ env.registry }}/${{ env.OWNER }}/${{ inputs.image }}:${{ inputs.variant }} | zstd > /tmp/a2s/images/${{ inputs.image }}--${{ inputs.variant }}.tar.zst
+docker save ${{ inputs.registry }}/${{ env.OWNER }}/${{ inputs.image }}:${{ inputs.variant }} | zstd > /tmp/a2s/images/${{ inputs.image }}--${{ inputs.variant }}.tar.zst
 shell: bash

 - name: Upload image as artifact 💾
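
As a rough illustration of the save step in this hunk, the sketch below builds the same artifact path (`/tmp/a2s/images/<image>--<variant>.tar.zst`) and shows how a later job might load the image again. The function names `artifact_path`, `save_image`, and `load_image` are hypothetical and not part of the workflow; it assumes `docker` and `zstd` are available on the runner.

```bash
#!/usr/bin/env bash
set -eu

# Hypothetical helper: build the artifact path used by the save step.
artifact_path() {
  local image="$1" variant="$2"
  printf '/tmp/a2s/images/%s--%s.tar.zst\n' "$image" "$variant"
}

# Hypothetical helper: save a tagged image as a zstd-compressed tarball.
save_image() {
  local ref="$1" out="$2"
  mkdir -p "$(dirname "$out")"
  docker save "$ref" | zstd > "$out"
}

# Hypothetical helper: decompress the tarball and load it back into Docker.
load_image() {
  local archive="$1"
  zstd -d -c "$archive" | docker load
}

# Only the path helper runs here; the other two need a Docker daemon.
artifact_path ml-notebook cuda12-pytorch-2.2.2
```

Keeping the path in one helper means the save and load sides cannot drift apart on the `--` separator or the `.tar.zst` suffix.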
9 changes: 8 additions & 1 deletion .github/workflows/docker.yml
@@ -20,6 +20,7 @@ on:
 branches:
 - main
 - master
+- ci-test
 paths:
 - ".github/workflows/docker.yml"
 - ".github/workflows/docker-build-test-upload.yml"
@@ -56,6 +57,7 @@ jobs:
 parent-variant: cuda11-pytorch-2.2.2
 image: base-gpu-notebook
 variant: cuda11-pytorch-2.2.2
+context: base-gpu-notebook/.build/cuda11-pytorch-2.2.2/
 push: ${{ github.event_name == 'push' }}
 runs-on: ubuntu-latest

@@ -68,6 +70,7 @@ jobs:
 parent-variant: cuda11-pytorch-2.2.2
 image: ml-notebook
 variant: cuda11-pytorch-2.2.2
+context: ml-notebook/cuda11-pytorch-2.2.2/
 push: ${{ github.event_name == 'push' }}
 runs-on: ubuntu-latest

@@ -80,6 +83,7 @@ jobs:
 parent-variant: cuda11-pytorch-2.2.2
 image: nlp-notebook
 variant: cuda11-pytorch-2.2.2
+context: nlp-notebook/cuda11-pytorch-2.2.2/
 push: ${{ github.event_name == 'push' }}
 runs-on: ubuntu-latest

@@ -103,6 +107,7 @@ jobs:
 parent-variant: cuda12-pytorch-2.2.2
 image: base-gpu-notebook
 variant: cuda12-pytorch-2.2.2
+context: base-gpu-notebook/.build/cuda12-pytorch-2.2.2/
 push: ${{ github.event_name == 'push' }}
 runs-on: ubuntu-latest

@@ -115,6 +120,7 @@ jobs:
 parent-variant: cuda12-pytorch-2.2.2
 image: ml-notebook
 variant: cuda12-pytorch-2.2.2
+context: ml-notebook/cuda12-pytorch-2.2.2/
 push: ${{ github.event_name == 'push' }}
 runs-on: ubuntu-latest

@@ -124,9 +130,10 @@ jobs:
 uses: ./.github/workflows/docker-build-test-upload.yml
 with:
 parent-image: ml-notebook
-parent-variant: cuda11-pytorch-2.2.2
+parent-variant: cuda12-pytorch-2.2.2
 image: nlp-notebook
 variant: cuda12-pytorch-2.2.2
+context: nlp-notebook/cuda12-pytorch-2.2.2/
 push: ${{ github.event_name == 'push' }}
 runs-on: ubuntu-latest
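
The six `context:` values added in this file follow one pattern: `base-gpu-notebook` builds from its generated `.build/<variant>/` directory, while the other images build from `<image>/<variant>/`. The sketch below reproduces that mapping; `build_context` is a hypothetical helper that does not exist in the repository, and it is meant only as a way to sanity-check a new job entry.

```bash
#!/usr/bin/env bash
set -eu

# Hypothetical helper: map an image and variant to the build context path
# that docker.yml passes to the reusable workflow.
build_context() {
  local image="$1" variant="$2"
  case "$image" in
    # The base image is built from a generated Dockerfile under .build/
    base-gpu-notebook) printf '%s/.build/%s/\n' "$image" "$variant" ;;
    # All other images keep their Dockerfile directly in <image>/<variant>/
    *)                 printf '%s/%s/\n' "$image" "$variant" ;;
  esac
}

for image in base-gpu-notebook ml-notebook nlp-notebook; do
  build_context "$image" cuda12-pytorch-2.2.2
done
```

Passing the path explicitly, as the PR does, keeps this special case out of the reusable workflow at the cost of repeating the variant name in each job.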

65 changes: 44 additions & 21 deletions README.md
@@ -1,44 +1,67 @@
[<!--lint ignore no-dead-urls-->![Release cuda11.3.1-ubuntu20.04](https://github.com/a2s-institute/docker-stacks/actions/workflows/cuda11.3.1-ubuntu20.04.yml/badge.svg)](https://github.com/a2s-institute/docker-stacks/actions?workflow=cuda11.3.1-ubuntu20.04)
[<!--lint ignore no-dead-urls-->![Release cuda11.8.0-ubuntu22.04](https://github.com/a2s-institute/docker-stacks/actions/workflows/cuda11.8.0-ubuntu22.04.yml/badge.svg)](https://github.com/a2s-institute/docker-stacks/actions?workflow=cuda11.8.0-ubuntu22.04)
[![Docker Repository on Quay](https://quay.io/repository/a2s-institute/docker-stacks/gpu-notebook/status "Docker Repository on Quay")](https://quay.io/repository/a2s-institute/docker-stacks/gpu-notebook)
# A2S Institute Docker Images

# a2s-institute docker images
Our stacks provide GPU-enabled Jupyter Notebook in Docker containers, which can also run on Kubernetes. The images are based on [Jupyter docker-stacks jupyter/pytorch-notebook](https://github.com/jupyter/docker-stacks/tree/main/images/pytorch-notebook). All images are published on our [ghcr.io](https://github.com/orgs/a2s-institute/packages) and [quay.io](https://quay.io/user/a2s-institute/).

Our stacks provide GPU-enabled Jupyter Notebook in Docker containers, which can also be run on Kubernetes. The image is based on [released cuda version](https://hub.docker.com/r/nvidia/cuda/tags?page=1&name=12.) on docker hub and the Jupyter stacks are based on [jupyter/docker-stacks](https://github.com/jupyter/docker-stacks/). All images are published on our [github registry](https://github.com/orgs/a2s-institute/packages).
The stacks contain several machine learning packages such as TensorFlow, PyTorch, scikit-learn, and other machine learning tools. All images also include VSCode and xfce4 desktop environment.

The stacks contain several machine learning packages such as TensorFlow, PyTorch, scikit-learn, and other machine learning tools.
## Docker stack structure
* [gpu-base-notebook](https://github.com/a2s-institute/docker-stacks/tree/master/base-gpu-notebook): contains Jupyter-related libraries and comes in variants for different CUDA and PyTorch versions. It also includes VSCode and the xfce4 desktop environment.
* [ml-notebook](https://github.com/a2s-institute/docker-stacks/tree/master/ml-notebook): depends on `gpu-base-notebook` and includes several machine learning libraries such as TensorFlow, Keras, SciPy, OpenCV, etc.
* [nlp-notebook](https://github.com/a2s-institute/docker-stacks/tree/master/nlp-notebook): depends on `ml-notebook` and includes NLP libraries such as spaCy, NLTK, llama-cpp-python, and wikipedia-api.

## Building and running gpu-notebook in a local Docker container
## Available versions
* `gpu-base-notebook:cuda11-pytorch-2.2.2`
* `gpu-base-notebook:cuda12-pytorch-2.2.2`
* `ml-notebook:cuda11-pytorch-2.2.2`
* `ml-notebook:cuda12-pytorch-2.2.2`
* `nlp-notebook:cuda11-pytorch-2.2.2`
* `nlp-notebook:cuda12-pytorch-2.2.2`

<details>
<summary><font color=blue> Older images</font></summary>

- `ghcr.io/a2s-institute/docker-stacks/gpu-notebook:cuda11.3.1-ubuntu22.04` (no vscode and xfce desktop)
- `ghcr.io/a2s-institute/docker-stacks/gpu-notebook:cuda11.8.0-ubuntu22.04` (no vscode and xfce desktop)
- `ghcr.io/a2s-institute/docker-stacks/gpu-notebook:cuda12.1.0-ubuntu22.04` (no vscode and xfce desktop)
- `ghcr.io/a2s-institute/docker-stacks/gpu-notebook:cuda12.1.0-ubuntu22.04` (no vscode and xfce desktop)

</details>

## Building and running A2S images locally

The base image contains several packages for deep learning projects with NVidia GPU support.

* Build notebook image with gpu support
```
bash build_and_publish.sh --registry ghcr.io --publish "" --cuda-version cuda11.8.0-ubuntu22.04
```
# cuda11 and pytorch 2.2.2
bash build_and_publish.sh --registry ghcr.io --publish "" \
--image gpu-base-notebook --tag cuda11-pytorch-2.2.2

You can build this image using different cuda versions available [here](https://hub.docker.com/r/nvidia/cuda/tags).
# cuda12 and pytorch 2.2.2
bash build_and_publish.sh --registry ghcr.io --publish "" \
--image gpu-base-notebook --tag cuda12-pytorch-2.2.2
```

* Run the image locally
```
docker run --gpus all --name gpu-notebook -it --rm -d -p 8880:8888 ghcr.io/b-it-bots/docker/gpu-notebook:cuda11.8.0-ubuntu22.04
# with GPU
docker run --gpus all --name ml-notebook -it --rm -d -p 8888:8888 \
quay.io/a2s-institute/ml-notebook:cuda12-pytorch-2.2.2

# without GPU
docker run --name ml-notebook -it --rm -d -p 8888:8888 \
quay.io/a2s-institute/ml-notebook:cuda12-pytorch-2.2.2
```

* Login to the container
* Check Jupyter Notebook token via log and open the link
```
docker exec -ti gpu-notebook bash
docker logs --follow ml-notebook

# check nvidia
nvidia-smi
```

## Available images

* `cuda11.3.1-ubuntu20.04` (python=3.10, pytorch=1.12.1)
* `cuda11.8.9-ubuntu22.04` (python=3.11, pytorch=2.0.0)

## Monitoring

You can monitor the GPU usage using nvtop

![nvtop gpu monitoring](figures/nvtop.png)
<img src="figures/nvtop.png" alt="nvtop gpu monitoring" width="640">

5 changes: 2 additions & 3 deletions build_and_publish.sh
@@ -53,11 +53,10 @@ parse_args() {
 if [ -z "$CONTAINER_REGISTRY" ]
 then
 echo "Container registry is not set!. Using docker hub registry"
-CONTAINER_REG_OWNER=ghcr.io/a2s-institute/docker-stacks
+CONTAINER_REG_OWNER=quay.io/a2s-institute
 else
 echo "Using $CONTAINER_REGISTRY registry"
-OWNER=a2s-institute/docker-stacks
-CONTAINER_REG_OWNER=$CONTAINER_REGISTRY/$OWNER
+CONTAINER_REG_OWNER=$CONTAINER_REGISTRY/a2s-institute
 fi

 echo "Container registry/owner = $CONTAINER_REG_OWNER"
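
After this change both branches append the same owner, so the logic reduces to "use the given registry, or fall back to `quay.io`". A minimal sketch of that behaviour, assuming nothing else in `parse_args` depends on the removed `OWNER` variable; `registry_owner` is a hypothetical function name, not something defined in `build_and_publish.sh`.

```bash
#!/usr/bin/env bash
set -eu

# Hypothetical helper: return "<registry>/a2s-institute", falling back to
# quay.io when the registry argument is empty or missing.
registry_owner() {
  local registry="${1:-}"
  printf '%s/a2s-institute\n' "${registry:-quay.io}"
}

registry_owner ""        # prints quay.io/a2s-institute
registry_owner ghcr.io   # prints ghcr.io/a2s-institute
```

The `${var:-default}` expansion treats an unset and an empty value the same way, which matches the `[ -z "$CONTAINER_REGISTRY" ]` test in the script.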
6 changes: 3 additions & 3 deletions ml-notebook/cuda11-pytorch-2.2.2/Dockerfile
@@ -7,10 +7,10 @@ LABEL maintainer="Mohammad Wasil <[email protected]>"
 USER root

 # Install apt packages
-
-RUN curl https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key add - && \
-echo "deb https://packages.cloud.google.com/apt coral-edgetpu-stable main" | sudo tee /etc/apt/sources.list.d/coral-edgetpu.list
+
 RUN apt update -y && \
+curl https://packages.cloud.google.com/apt/doc/apt-key.gpg | apt-key add - && \
+echo "deb https://packages.cloud.google.com/apt coral-edgetpu-stable main" | tee /etc/apt/sources.list.d/coral-edgetpu.list && \
 apt install -y edgetpu-compiler && \
 apt install -y libxkbcommon0 libxkbcommon-x11-0 && \
 apt install -y build-essential && \
Expand Down
5 changes: 3 additions & 2 deletions ml-notebook/cuda12-pytorch-2.2.2/Dockerfile
@@ -7,9 +7,10 @@ LABEL maintainer="Mohammad Wasil <[email protected]>"
 USER root

 # Install apt packages
-RUN curl https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key add - && \
-echo "deb https://packages.cloud.google.com/apt coral-edgetpu-stable main" | sudo tee /etc/apt/sources.list.d/coral-edgetpu.list
+
 RUN apt update -y && \
+curl https://packages.cloud.google.com/apt/doc/apt-key.gpg | apt-key add - && \
+echo "deb https://packages.cloud.google.com/apt coral-edgetpu-stable main" | tee /etc/apt/sources.list.d/coral-edgetpu.list && \
 apt install -y edgetpu-compiler && \
 apt install -y libxkbcommon0 libxkbcommon-x11-0 && \
 apt install -y build-essential && \
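
Both Dockerfiles register the Coral repository with `apt-key`, which is deprecated on current Debian and Ubuntu releases in favour of a per-repository keyring referenced through `signed-by`. The sketch below shows one possible alternative; it is not what this PR does. The keyring path and the function names `edgetpu_source_line` and `install_edgetpu_repo` are assumptions, and `install_edgetpu_repo` assumes it runs as root with `curl` and `gpg` installed.

```bash
#!/usr/bin/env bash
set -eu

# Assumed keyring location; any root-owned path readable by apt would do.
KEYRING=/usr/share/keyrings/coral-edgetpu.gpg

# Hypothetical helper: build the sources.list entry that pins the
# repository to the given keyring.
edgetpu_source_line() {
  printf 'deb [signed-by=%s] https://packages.cloud.google.com/apt coral-edgetpu-stable main\n' "$1"
}

# Hypothetical helper: fetch the key, store it as a binary keyring, and
# write the repository entry. Not called here because it needs root and
# network access.
install_edgetpu_repo() {
  curl -fsSL https://packages.cloud.google.com/apt/doc/apt-key.gpg | gpg --dearmor -o "$KEYRING"
  edgetpu_source_line "$KEYRING" > /etc/apt/sources.list.d/coral-edgetpu.list
}

edgetpu_source_line "$KEYRING"
```

In a Dockerfile the body of `install_edgetpu_repo` would sit in the same `RUN` chain, before `apt update`, so that the package index already includes the new repository.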