
Adding Github actions for building the docker images #531

Merged
erew123 merged 1 commit into erew123:alltalkbeta on Feb 25, 2025

Conversation

Paladinium (Collaborator) commented Feb 23, 2025

So here is the docker build with the following notes:

  • If you want to try an image that is currently pushed to my Docker Hub repo, use docker run --rm -it -p 7851:7851 -p 7852:7852 --gpus=all --name alltalk paladinium/alltalk_tts:latest-xtts
  • The workflow runs on pushes to the alltalkbeta branch, on tags, and on releases. If a tag is present, the image is tagged not only with 'latest' but also with the corresponding version, e.g. 'v2.0.1'.
    • Note that semantic versioning is used here, requiring the tag to start with 'v'.
  • Also, a matrix strategy is used to build variations of the image, one for each TTS engine. This is also reflected in the tag, e.g. 'latest-xtts' (a minimal sketch follows this list).
  • DeepSpeed is currently pulled from my Google Drive -> this needs to be adjusted to a location you control (see this comment).
  • You have to create another Docker Hub repository called erew123/alltalk_tts_environment. This repo stores the environment (a.k.a. 'base') image providing all important dependencies, which also speeds up the build process.
  • The docker build also creates a cache on Docker Hub. See the last approach here.
  • When building for the first time, or after major changes to the images, the entire build takes about 30 minutes. Subsequent builds are done in about 10 minutes.
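
To make the matrix/tagging idea above concrete, here is a minimal sketch; the engine names beyond xtts and the exact step layout are illustrative, not the actual workflow contents:

```yaml
jobs:
  build:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        tts_engine: [xtts, piper, vits]   # illustrative engine list
    steps:
      - uses: actions/checkout@v4
      - id: meta
        uses: docker/metadata-action@v5
        with:
          images: erew123/alltalk_tts
          flavor: |
            suffix=-${{ matrix.tts_engine }}
          tags: |
            type=raw,value=latest
            type=semver,pattern=v{{version}}
```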

What is missing:

  • Ignored ARM for now (see this comment)
  • Once the images are published on Docker Hub, one could change the documentation as well as docker-start.sh to directly pull images that were already built. Normally, there would then be no need for a local build.
  • The current images are designed to work with GPUs. I have no idea what happens if the image is run on a host without a GPU.

required: true

runs:
using: "composite"
Paladinium (Collaborator, Author):

The composite comes in handy as multiple images have to be built: common workflow steps can be shared using this approach.
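
As a rough illustration of why the composite action helps here (the path, input, and step names below are made up, not the PR's actual action.yml):

```yaml
# .github/actions/build-image/action.yml (hypothetical path)
name: "Build AllTalk Docker image"
inputs:
  tts_engine:
    description: "TTS engine variant to build"
    required: true
runs:
  using: "composite"
  steps:
    - name: Build the image for the selected engine
      shell: bash              # composite run steps must declare a shell
      run: docker build --build-arg TTS_MODEL=${{ inputs.tts_engine }} .
```

Each matrix entry in the workflow can then call this one action instead of repeating the same steps per image.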

runs:
using: "composite"
steps:
- name: Free Disk Space (Ubuntu)
Paladinium (Collaborator, Author) commented Feb 23, 2025:

There is not enough disk space on a standard runner, so we have to free disk space first.
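
The step name suggests the jlumbroso/free-disk-space community action; a hedged sketch of such a step (the actual action and inputs used in the PR may differ):

```yaml
- name: Free Disk Space (Ubuntu)
  uses: jlumbroso/free-disk-space@main
  with:
    tool-cache: true       # drop the hosted tool cache
    large-packages: true   # remove large preinstalled packages
    docker-images: false   # keep Docker images needed for the build
```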

- '.github/workflows/publish-docker*.yml'

- name: Build and push base Docker image
if: |
Paladinium (Collaborator, Author):

The base image is really only built if some specific files change - this saves a lot of time.
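
One way this kind of conditional rebuild is commonly wired up; this is a hedged sketch, and the file path and filter mechanism here are assumptions rather than the workflow's actual contents:

```yaml
- uses: dorny/paths-filter@v3
  id: changes
  with:
    filters: |
      base:
        - 'docker/base/Dockerfile'                  # hypothetical path
        - '.github/workflows/publish-docker*.yml'

- name: Build and push base Docker image
  if: steps.changes.outputs.base == 'true'
  uses: docker/build-push-action@v6
  with:
    context: .
    push: true
```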

RUN mkdir -p /tmp/deepseped
COPY docker/deepspeed/build/*.whl /tmp/deepspeed/
RUN mkdir -p /tmp/deepspeed
COPY docker/deepspeed/build*/*.whl /tmp/deepspeed/
Paladinium (Collaborator, Author):

This little trick avoids the build failing if the directory does not exist, which is the case when not building locally.
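
For completeness, a hedged sketch of how the optionally-copied wheel can then be installed only when it is actually present; the install step is illustrative, not the PR's exact Dockerfile:

```dockerfile
RUN mkdir -p /tmp/deepspeed
COPY docker/deepspeed/build*/*.whl /tmp/deepspeed/
# Install the wheel only if one was copied in; otherwise skip silently.
RUN if ls /tmp/deepspeed/*.whl >/dev/null 2>&1; then \
      pip install /tmp/deepspeed/*.whl; \
    fi
```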

Dockerfile Outdated
if [ -z "${DEEPSPEED_WHEEL}" ] || [ ! -f $DEEPSPEED_WHEEL ] ; then
echo "Downloading pre-built DeepSpeed wheel"
# TODO: use a Github artifact URL instead of Google drive:
curl "https://drive.usercontent.google.com/download?id=1HluPmdoSaqSRnFfn1CeZE0sfGtkAWosf&confirm=xxx" -o /tmp/deepspeed/deepspeed-0.16.2+b344c04d-cp311-cp311-linux_x86_64.whl
Paladinium (Collaborator, Author):

So here is the URL pointing to the DeepSpeed wheel. Of course, we have to change this to a URL that you control.

If you want to keep it simple, you could just create another Github repo storing the DeepSpeed binaries.

erew123 (Owner):

I've downloaded and put it here in the Releases https://github.com/erew123/alltalk_tts/releases/tag/DeepSpeed-for-docker

I assume that will work? (It's where I've been storing DeepSpeed anyway.)
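
If that release works, the Dockerfile's download could simply point there. A hedged sketch, assuming the asset keeps the original wheel filename (the exact asset name should be checked on the release page, since GitHub may rename uploaded assets):

```bash
# Hypothetical asset name: replace with the actual filename shown on the release page.
DEEPSPEED_WHEEL_NAME="deepspeed-0.16.2+b344c04d-cp311-cp311-linux_x86_64.whl"
curl -L \
  "https://github.com/erew123/alltalk_tts/releases/download/DeepSpeed-for-docker/${DEEPSPEED_WHEEL_NAME}" \
  -o "/tmp/deepspeed/${DEEPSPEED_WHEEL_NAME}"
```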

@@ -8,6 +8,7 @@ cd $SCRIPT_DIR
TTS_MODEL=xtts
DOCKER_TAG=latest
CLEAN=false
LOCAL_DEEPSPEED_BUILD=false
Paladinium (Collaborator, Author):

By default, DeepSpeed is not built locally because it's an intensive operation that even brought your PC to its knees.
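
Purely as an illustration of how such a flag is typically consumed (the option name below is hypothetical; only the variable name comes from the diff):

```bash
LOCAL_DEEPSPEED_BUILD=false

# Hypothetical flag parsing; the real docker-start.sh may differ.
for arg in "$@"; do
  case "$arg" in
    --local-deepspeed-build) LOCAL_DEEPSPEED_BUILD=true ;;
  esac
done

if [ "$LOCAL_DEEPSPEED_BUILD" = true ]; then
  echo "Building DeepSpeed wheel locally (slow and resource hungry)"
  # ... run the local DeepSpeed build here ...
fi
```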


# Export the entire associative array (needed by Github action)
export ALLTALK_VARS
Paladinium (Collaborator, Author):

This change was all about making the variables also accessible to the GitHub action.
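
A hedged sketch of one common way such shell variables end up visible to later workflow steps (this is a general pattern, not necessarily exactly what the script does): append them to $GITHUB_ENV.

```bash
# Illustrative variable set; the real ALLTALK_VARS contents come from the script.
declare -A ALLTALK_VARS=( [TTS_MODEL]="xtts" [DOCKER_TAG]="latest" )
export ALLTALK_VARS

# When running inside GitHub Actions, hand the values to subsequent steps.
if [ -n "${GITHUB_ENV:-}" ]; then
  for key in "${!ALLTALK_VARS[@]}"; do
    echo "${key}=${ALLTALK_VARS[$key]}" >> "$GITHUB_ENV"
  done
fi
```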

uses: docker/metadata-action@v5
with:
images: |
${{ env.DOCKERHUB_USERNAME }}/${{ env.DOCKERHUB_REPO_NAME }}_environment
Paladinium (Collaborator, Author):

This is why you should create a new Docker Hub repo, erew123/alltalk_tts_environment.
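
To tie this to the Docker Hub cache mentioned in the PR description, a hedged sketch of how the environment image might be pushed with a registry build cache (the cache tag name is illustrative):

```yaml
- name: Build and push base Docker image
  uses: docker/build-push-action@v6
  with:
    context: .
    push: true
    tags: ${{ steps.meta.outputs.tags }}
    cache-from: type=registry,ref=erew123/alltalk_tts_environment:buildcache
    cache-to: type=registry,ref=erew123/alltalk_tts_environment:buildcache,mode=max
```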

erew123 (Owner):

Have created and set access (hopefully) for your Docker account:

(screenshot omitted)

You should also have access to the other Docker Hub repo "alltalk_tts" there.

Paladinium requested a review from erew123 on February 23, 2025 at 15:20
@@ -0,0 +1,123 @@
name: Publish Docker v2
Paladinium (Collaborator, Author):

Of course, if you want the workflow to be operational, both new files of this PR under .github have to be committed to main.
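
For reference, a hedged sketch of the trigger described in the PR text (the committed workflow file may differ in detail):

```yaml
name: Publish Docker v2
on:
  push:
    branches: [alltalkbeta]
    tags: ['v*']
  release:
    types: [published]
```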

erew123 (Owner):

Sounds perfectly reasonable to me. I believe we (well I) had to do that anyway to get the actions working in the first place.

with:
context: .
file: ./Dockerfile
platforms: "linux/amd64"
Paladinium (Collaborator, Author):

One could also add linux/arm64 as shown here
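
For the record, a hedged sketch of what an ARM-capable build would roughly add (not part of this PR, as discussed below):

```yaml
- uses: docker/setup-qemu-action@v3      # emulate arm64 on the amd64 runner
- uses: docker/setup-buildx-action@v3
- uses: docker/build-push-action@v6
  with:
    context: .
    file: ./Dockerfile
    platforms: "linux/amd64,linux/arm64"
```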

erew123 (Owner):

I have no idea on ARM support for now and no way to test. It would be a big challenge to get that all tested out (is my suspicion), so unless you really feel we need to, let's save this for a future idea.

Paladinium force-pushed the alltalkbeta branch 3 times, most recently from 6d706c8 to eb4accf on February 24, 2025 at 20:33
@@ -75,7 +87,7 @@ EOF
source ~/.bashrc
export TRAINER_TELEMETRY=0
conda activate alltalk
python finetune.py
python -m trainer.distribute --script finetune.py

erew123 merged commit 91bf7a2 into erew123:alltalkbeta on Feb 25, 2025
1 check passed