
Adding Github actions for building the docker images #531

Merged
erew123 merged 1 commit into erew123:alltalkbeta on Feb 25, 2025

Conversation

Paladinium (Collaborator) commented Feb 23, 2025

So here is the docker build with the following notes:

  • If you want to try an image that is currently pushed to my Docker Hub repo, use docker run --rm -it -p 7851:7851 -p 7852:7852 --gpus=all --name alltalk paladinium/alltalk_tts:latest-xtts
  • The workflow runs on pushes to the alltalkbeta branch, on tags, and on releases. If a tag is present, the image is tagged not only with 'latest' but also with the corresponding version, e.g. 'v2.0.1'.
    • Note that semantic versioning is used here, requiring the tag to start with 'v'.
  • Also, a matrix strategy is used to build variations of the image, one for each TTS engine. This is also reflected in the tag, e.g. 'latest-xtts' (a minimal sketch follows this list).
  • DeepSpeed is currently pulled from my Google Drive -> this needs to be adjusted to a location you control (see this comment).
  • You have to create another Docker Hub repository called erew123/alltalk_tts_environment. This repo stores the environment (a.k.a. 'base') image providing all important dependencies, which also speeds up the build process.
  • The docker build also creates a cache on Docker Hub. See the last approach here.
  • When building for the first time, or after major changes to the images, the entire build takes about 30 minutes. Subsequent builds are done in about 10 minutes.
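
To make the matrix/tagging idea above concrete, here is a minimal sketch; the engine names beyond xtts and the exact step layout are illustrative, not the actual workflow contents:

```yaml
jobs:
  build:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        tts_engine: [xtts, piper, vits]   # illustrative engine list
    steps:
      - uses: actions/checkout@v4
      - id: meta
        uses: docker/metadata-action@v5
        with:
          images: erew123/alltalk_tts
          flavor: |
            suffix=-${{ matrix.tts_engine }}
          tags: |
            type=raw,value=latest
            type=semver,pattern=v{{version}}
```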

What is missing:

  • Ignored ARM for now (see this comment)
  • Once the images are published on Docker Hub, one could change the documentation as well as docker-start.sh to directly pull images that were already built. Normally, there would then be no need for a local build.
  • The current images are designed to work with GPUs. I have no idea what happens if the image is run on a host without a GPU.

required: true

runs:
using: "composite"
Paladinium (Collaborator, Author):

The composite comes in handy as multiple images have to be built: common workflow steps can be shared using this approach.
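
As a rough illustration of why the composite action helps here (the path, input, and step names below are made up, not the PR's actual action.yml):

```yaml
# .github/actions/build-image/action.yml (hypothetical path)
name: "Build AllTalk Docker image"
inputs:
  tts_engine:
    description: "TTS engine variant to build"
    required: true
runs:
  using: "composite"
  steps:
    - name: Build the image for the selected engine
      shell: bash              # composite run steps must declare a shell
      run: docker build --build-arg TTS_MODEL=${{ inputs.tts_engine }} .
```

Each matrix entry in the workflow can then call this one action instead of repeating the same steps per image.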

runs:
using: "composite"
steps:
- name: Free Disk Space (Ubuntu)
Paladinium (Collaborator, Author) commented Feb 23, 2025:

There is not enough disk space on a standard runner, so we have to free disk space first.
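
The step name suggests the jlumbroso/free-disk-space community action; a hedged sketch of such a step (the actual action and inputs used in the PR may differ):

```yaml
- name: Free Disk Space (Ubuntu)
  uses: jlumbroso/free-disk-space@main
  with:
    tool-cache: true       # drop the hosted tool cache
    large-packages: true   # remove large preinstalled packages
    docker-images: false   # keep Docker images needed for the build
```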

- '.github/workflows/publish-docker*.yml'

- name: Build and push base Docker image
if: |
Paladinium (Collaborator, Author):

The base image is really only built if some specific files change - this saves a lot of time.
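
One way this kind of conditional rebuild is commonly wired up; this is a hedged sketch, and the file path and filter mechanism here are assumptions rather than the workflow's actual contents:

```yaml
- uses: dorny/paths-filter@v3
  id: changes
  with:
    filters: |
      base:
        - 'docker/base/Dockerfile'                  # hypothetical path
        - '.github/workflows/publish-docker*.yml'

- name: Build and push base Docker image
  if: steps.changes.outputs.base == 'true'
  uses: docker/build-push-action@v6
  with:
    context: .
    push: true
```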

RUN mkdir -p /tmp/deepseped
COPY docker/deepspeed/build/*.whl /tmp/deepspeed/
RUN mkdir -p /tmp/deepspeed
COPY docker/deepspeed/build*/*.whl /tmp/deepspeed/
Paladinium (Collaborator, Author):

This little trick avoids the build failing if the directory does not exist, which is the case when not building locally.
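
For completeness, a hedged sketch of how the optionally-copied wheel can then be installed only when it is actually present; the install step is illustrative, not the PR's exact Dockerfile:

```dockerfile
RUN mkdir -p /tmp/deepspeed
COPY docker/deepspeed/build*/*.whl /tmp/deepspeed/
# Install the wheel only if one was copied in; otherwise skip silently.
RUN if ls /tmp/deepspeed/*.whl >/dev/null 2>&1; then \
      pip install /tmp/deepspeed/*.whl; \
    fi
```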

Dockerfile Outdated
if [ -z "${DEEPSPEED_WHEEL}" ] || [ ! -f $DEEPSPEED_WHEEL ] ; then
echo "Downloading pre-built DeepSpeed wheel"
# TODO: use a Github artifact URL instead of Google drive:
curl "https://drive.usercontent.google.com/download?id=1HluPmdoSaqSRnFfn1CeZE0sfGtkAWosf&confirm=xxx" -o /tmp/deepspeed/deepspeed-0.16.2+b344c04d-cp311-cp311-linux_x86_64.whl
Paladinium (Collaborator, Author):

So here is the URL pointing to the DeepSpeed wheel. Of course, we have to change this to a URL that you control.

If you want to keep it simple, you could just create another Github repo storing the DeepSpeed binaries.

erew123 (Owner):

I've downloaded and put it here in the Releases https://github.com/erew123/alltalk_tts/releases/tag/DeepSpeed-for-docker

I assume that will work? (It's where I've been storing DeepSpeed anyway.)
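
If that release works, the Dockerfile's download could simply point there. A hedged sketch, assuming the asset keeps the original wheel filename (the exact asset name should be checked on the release page, since GitHub may rename uploaded assets):

```bash
# Hypothetical asset name: replace with the actual filename shown on the release page.
DEEPSPEED_WHEEL_NAME="deepspeed-0.16.2+b344c04d-cp311-cp311-linux_x86_64.whl"
curl -L \
  "https://github.com/erew123/alltalk_tts/releases/download/DeepSpeed-for-docker/${DEEPSPEED_WHEEL_NAME}" \
  -o "/tmp/deepspeed/${DEEPSPEED_WHEEL_NAME}"
```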

@@ -8,6 +8,7 @@ cd $SCRIPT_DIR
TTS_MODEL=xtts
DOCKER_TAG=latest
CLEAN=false
LOCAL_DEEPSPEED_BUILD=false
Paladinium (Collaborator, Author):

By default, DeepSpeed is not built locally because it's an intensive operation that even brought your PC to its knees.
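
Purely as an illustration of how such a flag is typically consumed (the option name below is hypothetical; only the variable name comes from the diff):

```bash
LOCAL_DEEPSPEED_BUILD=false

# Hypothetical flag parsing; the real docker-start.sh may differ.
for arg in "$@"; do
  case "$arg" in
    --local-deepspeed-build) LOCAL_DEEPSPEED_BUILD=true ;;
  esac
done

if [ "$LOCAL_DEEPSPEED_BUILD" = true ]; then
  echo "Building DeepSpeed wheel locally (slow and resource hungry)"
  # ... run the local DeepSpeed build here ...
fi
```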


# Export the entire associative array (needed by Github action)
export ALLTALK_VARS
Paladinium (Collaborator, Author):

This change was all about making the variables also accessible to the GitHub action.
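
A hedged sketch of one common way such shell variables end up visible to later workflow steps (this is a general pattern, not necessarily exactly what the script does): append them to $GITHUB_ENV.

```bash
# Illustrative variable set; the real ALLTALK_VARS contents come from the script.
declare -A ALLTALK_VARS=( [TTS_MODEL]="xtts" [DOCKER_TAG]="latest" )
export ALLTALK_VARS

# When running inside GitHub Actions, hand the values to subsequent steps.
if [ -n "${GITHUB_ENV:-}" ]; then
  for key in "${!ALLTALK_VARS[@]}"; do
    echo "${key}=${ALLTALK_VARS[$key]}" >> "$GITHUB_ENV"
  done
fi
```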

uses: docker/metadata-action@v5
with:
images: |
${{ env.DOCKERHUB_USERNAME }}/${{ env.DOCKERHUB_REPO_NAME }}_environment
Paladinium (Collaborator, Author):

This is why you should create a new Docker Hub repo, erew123/alltalk_tts_environment.
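
To tie this to the Docker Hub cache mentioned in the PR description, a hedged sketch of how the environment image might be pushed with a registry build cache (the cache tag name is illustrative):

```yaml
- name: Build and push base Docker image
  uses: docker/build-push-action@v6
  with:
    context: .
    push: true
    tags: ${{ steps.meta.outputs.tags }}
    cache-from: type=registry,ref=erew123/alltalk_tts_environment:buildcache
    cache-to: type=registry,ref=erew123/alltalk_tts_environment:buildcache,mode=max
```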

erew123 (Owner):

Have created and set access (hopefully) for your Docker account:

(screenshot omitted)

You should also have access to the other Docker Hub repo "alltalk_tts" there.

Paladinium requested a review from erew123 on February 23, 2025 at 15:20
@@ -0,0 +1,123 @@
name: Publish Docker v2
Paladinium (Collaborator, Author):

Of course, if you want the workflow to be operational, both new files of this PR under .github have to be committed to main.
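
For reference, a hedged sketch of the trigger described in the PR text (the committed workflow file may differ in detail):

```yaml
name: Publish Docker v2
on:
  push:
    branches: [alltalkbeta]
    tags: ['v*']
  release:
    types: [published]
```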

erew123 (Owner):

Sounds perfectly reasonable to me. I believe we (well I) had to do that anyway to get the actions working in the first place.

with:
context: .
file: ./Dockerfile
platforms: "linux/amd64"
Paladinium (Collaborator, Author):

One could also add linux/arm64 as shown here
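
For the record, a hedged sketch of what an ARM-capable build would roughly add (not part of this PR, as discussed below):

```yaml
- uses: docker/setup-qemu-action@v3      # emulate arm64 on the amd64 runner
- uses: docker/setup-buildx-action@v3
- uses: docker/build-push-action@v6
  with:
    context: .
    file: ./Dockerfile
    platforms: "linux/amd64,linux/arm64"
```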

erew123 (Owner):

I have no idea on ARM support for now and no way to test. It would be a big challenge to get that all tested out (is my suspicion), so unless you really feel we need to, let's save this for a future idea.

Paladinium force-pushed the alltalkbeta branch 3 times, most recently from 6d706c8 to eb4accf on February 24, 2025 at 20:33
@@ -75,7 +87,7 @@ EOF
source ~/.bashrc
export TRAINER_TELEMETRY=0
conda activate alltalk
python finetune.py
python -m trainer.distribute --script finetune.py

erew123 merged commit 91bf7a2 into erew123:alltalkbeta on Feb 25, 2025
1 check passed