[docs] Dedicated docs on how to skip building an image on pipeline run #3079

Merged
merged 26 commits into from
Oct 18, 2024
ae03fd1
add some info on docker skip build
wjayesh Oct 8, 2024
73dd4e0
add docs on not building a docker image
wjayesh Oct 14, 2024
7687aae
update toc and title
wjayesh Oct 14, 2024
ddf105c
added text to stress that this doesnt always happen
wjayesh Oct 14, 2024
057aa73
Apply suggestions from code review
wjayesh Oct 14, 2024
1df5ba3
restructure headings
wjayesh Oct 14, 2024
b69e138
Merge branch 'docs/docker-skip-build' of https://github.com/zenml-io/…
wjayesh Oct 14, 2024
bbe9e95
more english
wjayesh Oct 14, 2024
516a214
Apply suggestions from code review
wjayesh Oct 15, 2024
0b966c1
Merge branch 'docs/docker-skip-build' of https://github.com/zenml-io/…
wjayesh Oct 14, 2024
d373e44
apply review changes
wjayesh Oct 16, 2024
563cb04
add how to reuse builds page
wjayesh Oct 16, 2024
75d947c
aoply hamza comments
wjayesh Oct 16, 2024
44dc550
add redirect for new page name
wjayesh Oct 16, 2024
e5cd75e
apply review changes
wjayesh Oct 16, 2024
a369a8c
move the artifact store block to the top
wjayesh Oct 16, 2024
b626e21
update redirect
wjayesh Oct 16, 2024
d2acb0a
add scarf
wjayesh Oct 16, 2024
a3d8da2
Update .gitbook.yaml
wjayesh Oct 17, 2024
d9daabc
link to code repository
wjayesh Oct 16, 2024
1a0d4dc
Merge branch 'develop' into docs/docker-skip-build
wjayesh Oct 17, 2024
b382722
Merge branch 'develop' into docs/docker-skip-build
wjayesh Oct 17, 2024
a958607
fix relative link
wjayesh Oct 16, 2024
dcbfd8d
Apply suggestions from code review
wjayesh Oct 17, 2024
a407181
Merge branch 'develop' into docs/docker-skip-build
wjayesh Oct 17, 2024
740150d
add where the code should be added
wjayesh Oct 16, 2024
@@ -133,7 +133,7 @@ def my_pipeline(...):
```

{% hint style="warning" %}
- This is an advanced feature and may cause unintended behavior when running your pipelines. If you use this, ensure your code files are correctly included in the image you specified.
+ This is an advanced feature and may cause unintended behavior when running your pipelines. If you use this, ensure your code files are correctly included in the image you specified. Read in detail about this feature [here](./use-a-prebuilt-image.md) before proceeding.
{% endhint %}

<figure><img src="https://static.scarf.sh/a.png?x-pxid=f0b4f458-0a54-4fcd-aa95-d5ee424815bc" alt="ZenML Scarf"><figcaption></figcaption></figure>
113 changes: 113 additions & 0 deletions docs/book/how-to/customize-docker-builds/use-a-prebuilt-image.md
@@ -0,0 +1,113 @@
---
description: "Skip building an image for your ZenML pipeline altogether."
---

# Use a prebuilt image for pipeline execution

When you run a pipeline on a remote Stack, ZenML builds a Docker image on top of a base ZenML image and adds all of your project dependencies and your pipeline code to it. Depending on how large your dependencies are, how powerful your local system is, and how fast your internet connection is, this can take significant time, because Docker must pull the base layers and push the final image to your container registry. Although the build only happens once and is skipped when ZenML detects no change in your environment, it can still be a bottleneck that slows down your pipeline execution.

To save time and costs, you can choose not to build a Docker image every time your pipeline runs. This guide shows how to do that with a prebuilt image, what the image must include for the pipeline to run successfully, and a few other tips.

{% hint style="info" %}
Note that with this feature, a pipeline run only sees the code and dependencies already contained in your image. Any updates you make to your code or dependencies afterwards are not picked up.
{% endhint %}

## Where you can use this feature

- When you are running in an environment that either doesn't have Docker installed or doesn't have enough memory to pull your base image and build a new image on top of it (think Codespaces or other CI/CD environments).
- When ZenML has already built an image for your code in a previous pipeline run and you want to reuse it in a new run. This saves build time at the cost of not picking up any updates you have made to your code (or your dependencies) since then.

## How do you use this feature

The [DockerSettings](../../../../docs/book/how-to/customize-docker-builds/docker-settings-on-a-pipeline.md#specify-docker-settings-for-a-pipeline) class in ZenML lets you set a parent image for your pipeline runs and skip building an image on top of it.

Just set the `parent_image` attribute of the `DockerSettings` class to the image you want to use and set `skip_build` to `True`.

```python
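from zenml import pipeline
from zenml.config import DockerSettings

# Use the prebuilt image as-is; ZenML will not build anything on top of it.
docker_settings = DockerSettings(
    parent_image="my_registry.io/image_name:tag",
    skip_build=True,
)


@pipeline(settings={"docker": docker_settings})
def my_pipeline():
    ...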
```

## What the parent image should contain

When you run a pipeline with a prebuilt image and skip the build, ZenML does not build any image on top of it. The image you pass to the `parent_image` attribute of the `DockerSettings` class therefore has to contain all the code and dependencies needed to run your pipeline.

{% hint style="info" %}
Note that this is different from the case where you [only specify a parent image](../../../../docs/book/how-to/customize-docker-builds/docker-settings-on-a-pipeline.md#using-a-pre-built-parent-image) and don't set `skip_build`. In that case, ZenML still builds an image, but on top of your parent image instead of the base ZenML image.
{% endhint %}

{% hint style="info" %}
If you're using an image that was already built by ZenML in a previous pipeline run, you don't need to worry about what goes in it as long as it was built for the same stack as your current pipeline run. You can use it directly.
{% endhint %}

The following points are derived from how ZenML builds an image internally and will help you make your own images.

### Your stack requirements

A ZenML Stack can contain different tools, and each comes with its own requirements. Your image needs to contain all of them. You can get the list of stack requirements as follows:

```python
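from zenml.client import Client

stack_name = "<YOUR_STACK>"  # replace with the name of your stack
# set your stack as active if it isn't already
# (depending on your ZenML version, this method may be named
# `activate_stack`; check the Client API of the version you use)
Client().set_active_stack(stack_name)

# get the requirements for the active stack
active_stack = Client().active_stack
stack_requirements = active_stack.requirements()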
```

### Integration requirements

For all integrations that you use in your pipeline, you need to have their dependencies installed too. You can get a list of them in the following way:

```python
import itertools

from zenml.enums import OperatingSystemType
from zenml.integrations.constants import HUGGINGFACE, PYTORCH
from zenml.integrations.registry import integration_registry

# define a list of all required integrations
required_integrations = [PYTORCH, HUGGINGFACE]

# Generate requirements for all required integrations
integration_requirements = set(
    itertools.chain.from_iterable(
        integration_registry.select_integration_requirements(
            integration_name=integration,
            target_os=OperatingSystemType.LINUX,
        )
        for integration in required_integrations
    )
)
```

### Any project-specific requirements

Your project may rely on other dependencies as well. Once you have collected these together with the stack and integration requirements in a single file, you can install all of them with a line like the following in your `Dockerfile`:

```Dockerfile
RUN pip install <ANY_ARGS> -r FILE
```
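
One way to collect everything in a single file is to merge the requirement sets gathered in the sections above and write them out. The following is a minimal sketch, not ZenML's own build logic: `write_requirements` is a hypothetical helper, and the example values stand in for `stack_requirements`, `integration_requirements`, and your project's own dependencies.

```python
import tempfile
from pathlib import Path
from typing import Iterable, List


def write_requirements(path: Path, *requirement_sets: Iterable[str]) -> List[str]:
    """Merge requirement collections into one sorted, de-duplicated file."""
    merged = sorted(
        {req.strip() for reqs in requirement_sets for req in reqs if req.strip()}
    )
    Path(path).write_text("\n".join(merged) + "\n")
    return merged


# Placeholder values (assumptions); in practice, pass the sets computed above.
example_stack_requirements = {"kubernetes", "boto3"}
example_integration_requirements = {"torch", "boto3"}
example_project_requirements = ["pandas"]

# Written to a temporary directory here; use your build context in practice.
output_path = Path(tempfile.mkdtemp()) / "requirements.txt"
merged = write_requirements(
    output_path,
    example_stack_requirements,
    example_integration_requirements,
    example_project_requirements,
)
print(merged)
```

This only removes exact duplicates. If two sources pin different versions of the same package, both lines end up in the file and `pip` reports the conflict at install time.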

### Any system packages

If you have any `apt` packages that are needed for your application to function, be sure to include them too. This can be achieved in a `Dockerfile` as follows:

```Dockerfile
RUN apt-get update && apt-get install -y --no-install-recommends YOUR_APT_PACKAGES
```

### Your project code files

The files containing your pipeline and step code, and all other functions they rely on, should also be available inside the image. Take a look at the [which files are built into the image](../../../../docs/book/how-to/customize-docker-builds/which-files-are-built-into-the-image.md) page to learn more about what to include.


{% hint style="info" %}
Note that you also need Python, `pip` and `zenml` installed in your image.
{% endhint %}
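
To see how the pieces from this page could fit together, the sketch below renders a `Dockerfile` from Python. It is an illustration under stated assumptions rather than the way ZenML builds images: `render_dockerfile` is a hypothetical helper, and the `python:3.11-slim` base image, the `/app` working directory, the `git` package, and the `requirements.txt` file name are all placeholder choices.

```python
from typing import Sequence


def render_dockerfile(
    base_image: str,
    apt_packages: Sequence[str],
    requirements_file: str,
    workdir: str = "/app",  # assumed location for the project code
) -> str:
    """Render a Dockerfile with system packages, Python requirements and code."""
    lines = [f"FROM {base_image}"]
    if apt_packages:
        lines.append(
            "RUN apt-get update && apt-get install -y --no-install-recommends "
            + " ".join(apt_packages)
        )
    lines += [
        f"WORKDIR {workdir}",
        f"COPY {requirements_file} .",
        # zenml itself must be present; consider pinning your local version.
        f"RUN pip install --no-cache-dir zenml -r {requirements_file}",
        # Copy the project code last so dependency layers stay cached.
        "COPY . .",
    ]
    return "\n".join(lines) + "\n"


dockerfile = render_dockerfile(
    base_image="python:3.11-slim",  # placeholder base image with Python and pip
    apt_packages=["git"],  # placeholder system package
    requirements_file="requirements.txt",
)
print(dockerfile)
```

After building and pushing the resulting image yourself, you would reference it through `parent_image` together with `skip_build=True`.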
1 change: 1 addition & 0 deletions docs/book/toc.md
@@ -102,6 +102,7 @@
* [🐳 Customize Docker builds](how-to/customize-docker-builds/README.md)
  * [Docker settings on a pipeline](how-to/customize-docker-builds/docker-settings-on-a-pipeline.md)
  * [Docker settings on a step](how-to/customize-docker-builds/docker-settings-on-a-step.md)
  * [Use a prebuilt image for pipeline execution](how-to/customize-docker-builds/use-a-prebuilt-image.md)
  * [Specify pip dependencies and apt packages](how-to/customize-docker-builds/specify-pip-dependencies-and-apt-packages.md)
  * [Use your own Dockerfiles](how-to/customize-docker-builds/use-your-own-docker-files.md)
  * [Which files are built into the image](how-to/customize-docker-builds/which-files-are-built-into-the-image.md)