Docs on ZenML setup (#3100)
* add some info on docker skip build

* add docs on not building a docker image

* update toc and title

* added text to stress that this doesnt always happen

* Apply suggestions from code review

Co-authored-by: Hamza Tahir <[email protected]>

* restructure headings

* more english

* Apply suggestions from code review

Co-authored-by: Alex Strick van Linschoten <[email protected]>

* apply review changes

* add how to reuse builds page

* aoply hamza comments

* add redirect for new page name

* apply review changes

* move the artifact store block to the top

* update redirect

* add shared components section

* scarf

* update toc

* add stacks, pipelines, and models page

* add overview page

* add access management guide

* update setup repository

* update text

* Update docs/book/how-to/setting-up-a-project-repository/stacks-pipelines-models.md

Co-authored-by: Hamza Tahir <[email protected]>

* update

* a tags become markdown links

* add artifacts and real life example

* fix typos

* add scarf

* update toc

* remove first md

* split pypi section out

* restructure

* add scarf

* add diagram

* Optimised images with calibre/image-actions

* fix spelling

* add descriptions

* rename some sections

* minor changes

* Update docs/book/how-to/setting-up-a-project-repository/README.md

Co-authored-by: Hamza Tahir <[email protected]>

* Update docs/book/how-to/setting-up-a-project-repository/README.md

Co-authored-by: Hamza Tahir <[email protected]>

* Apply suggestions from code review

Co-authored-by: Hamza Tahir <[email protected]>

* add info about roles zenml pro

* add link

* address review feedback

* extra spaces

* fix naming

* add links and update pypi registry example

* add example of connector roles

* Optimised images with calibre/image-actions

* small fix

* update section name

* apply reviews

* update second location for pypi docs

* Optimised images with calibre/image-actions

* replace image

* Optimised images with calibre/image-actions

---------

Co-authored-by: Jayesh Sharma <[email protected]>
Co-authored-by: Hamza Tahir <[email protected]>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
(cherry picked from commit 128b32f)
strickvl authored and htahir1 committed Oct 26, 2024
1 parent c8c3b12 commit 5948af2
Showing 19 changed files with 566 additions and 71 deletions.
20 changes: 10 additions & 10 deletions .gitbook.yaml
@@ -6,14 +6,14 @@ structure:

redirects:
how-to/customize-docker-builds/use-code-repositories-to-speed-up-docker-build-times: how-to/customize-docker-builds/how-to-reuse-builds.md
-reference/migration-guide/README.md: how-to/manage-the-zenml-server/migration-guide/migration-guide.md
-reference/migration-guide/migration-zero-twenty.md: how-to/manage-the-zenml-server/migration-guide/migration-zero-twenty.md
-reference/migration-guide/migration-zero-thirty.md: how-to/manage-the-zenml-server/migration-guide/migration-zero-thirty.md
-reference/migration-guide/migration-zero-forty.md: how-to/manage-the-zenml-server/migration-guide/migration-zero-forty.md
-reference/migration-guide/migration-zero-sixty.md: how-to/manage-the-zenml-server/migration-guide/migration-zero-sixty.md
+reference/migration-guide: how-to/manage-the-zenml-server/migration-guide/migration-guide.md
+reference/migration-guide/migration-zero-twenty: how-to/manage-the-zenml-server/migration-guide/migration-zero-twenty.md
+reference/migration-guide/migration-zero-thirty: how-to/manage-the-zenml-server/migration-guide/migration-zero-thirty.md
+reference/migration-guide/migration-zero-forty: how-to/manage-the-zenml-server/migration-guide/migration-zero-forty.md
+reference/migration-guide/migration-zero-sixty: how-to/manage-the-zenml-server/migration-guide/migration-zero-sixty.md

-getting-started/deploying-zenml/manage-the-deployed-services/upgrade-the-version-of-the-zenml-server.md: how-to/manage-the-zenml-server/upgrade-zenml-server.md
-getting-started/deploying-zenml/manage-the-deployed-services/troubleshoot-your-deployed-server.md: how-to/manage-the-zenml-server/troubleshoot-your-deployed-server.md
-how-to/stack-deployment/implement-a-custom-integration.md: how-to/contribute-to-zenml/implement-a-custom-integration.md

-getting-started/zenml-pro/system-architectures: getting-started/system-architectures.md
+getting-started/deploying-zenml/manage-the-deployed-services/upgrade-the-version-of-the-zenml-server: how-to/manage-the-zenml-server/upgrade-zenml-server.md
+getting-started/deploying-zenml/manage-the-deployed-services/troubleshoot-your-deployed-server: how-to/manage-the-zenml-server/troubleshoot-your-deployed-server.md
+how-to/stack-deployment/implement-a-custom-integration: how-to/contribute-to-zenml/implement-a-custom-integration.md
+how-to/setting-up-a-project-repository/best-practices: how-to/setting-up-a-project-repository/set-up-repository.md
+getting-started/zenml-pro/system-architectures: getting-started/system-architectures.md
Binary file modified docs/book/.gitbook/assets/argilla_annotator.png
@@ -33,7 +33,7 @@ While reusing Docker builds is useful, it can be limited. This is because specif

## Use the artifact store to upload your code

-You can also let ZenML use the artifact store to upload your code. This is the default behaviour if no code repository is detected and the `allow_download_from_artifact_store` flag is not set to `False` in your `DockerSettings`.
+You can also let ZenML use the artifact store to upload your code. This is the default behavior if no code repository is detected and the `allow_download_from_artifact_store` flag is not set to `False` in your `DockerSettings`.

## Use code repositories to speed up Docker build times

@@ -0,0 +1,44 @@
---
description: How to use a private PyPI repository.
---

# How to use a private PyPI repository

For packages that require authentication, you may need to take additional steps:

1. Use environment variables to store credentials securely.
2. Configure pip or poetry to use these credentials when installing packages.
3. Consider using custom Docker images that have the necessary authentication setup.

Here's an example of how you might set up authentication using environment variables:

```python
import os

from my_simple_package import important_function
from zenml.config import DockerSettings
from zenml import step, pipeline

docker_settings = DockerSettings(
    requirements=["my-simple-package==0.1.0"],
    environment={'PIP_EXTRA_INDEX_URL': f"https://{os.environ.get('PYPI_TOKEN', '')}@my-private-pypi-server.com/{os.environ.get('PYPI_USERNAME', '')}/"}
)

@step
def my_step():
    return important_function()

@pipeline(settings={"docker": docker_settings})
def my_pipeline():
    my_step()

if __name__ == "__main__":
    my_pipeline()
```

Note: Be cautious with handling credentials. Always use secure methods to manage
and distribute authentication information within your team.
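
The example above falls back to an empty string when `PYPI_TOKEN` or `PYPI_USERNAME` is unset, so a missing credential would likely only surface later as a failed package install. Below is a minimal sketch of a stricter helper, assuming the same URL layout as the example. The function name `build_extra_index_url` and the host are illustrative and not part of ZenML.

```python
import os
from typing import Mapping
from urllib.parse import quote


def build_extra_index_url(host: str, env: Mapping[str, str] = os.environ) -> str:
    """Build a pip extra-index URL, failing early if a credential is missing."""
    required = ("PYPI_TOKEN", "PYPI_USERNAME")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    # Percent-encode both values so characters such as "/" or "@" cannot break the URL.
    token = quote(env["PYPI_TOKEN"], safe="")
    username = quote(env["PYPI_USERNAME"], safe="")
    # Same layout as the example above: token as userinfo, username as a path segment.
    return f"https://{token}@{host}/{username}/"
```

The returned string could then be passed as the `PIP_EXTRA_INDEX_URL` value in the `environment` dictionary of `DockerSettings`.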
<!-- For scarf -->
<figure><img alt="ZenML Scarf" referrerpolicy="no-referrer-when-downgrade" src="https://static.scarf.sh/a.png?x-pxid=f0b4f458-0a54-4fcd-aa95-d5ee424815bc" /></figure>


@@ -42,7 +42,7 @@ ZenML Pro comes with multi-tenancy which makes it easy for you to have multiple

## Upgrading your code

-Sometimes, you might have to upgrade your code to work with a new version of ZenML. This is true especially when you are moving from a really old version to a new major version. The following tips might help, in addition to everything you've learnt in this document so far.
+Sometimes, you might have to upgrade your code to work with a new version of ZenML. This is true especially when you are moving from a really old version to a new major version. The following tips might help, in addition to everything you've learned in this document so far.

### Testing and Compatibility

92 changes: 86 additions & 6 deletions docs/book/how-to/setting-up-a-project-repository/README.md
@@ -1,13 +1,93 @@
---
-description: Setting your team up for success with a project repository.
+description: Setting your team up for success with a well-architected ZenML project.
---

-# 😸 Setting up a project repository
+# 😸 Setting up a Well-Architected ZenML Project

-ZenML code typically lives in a `git` repository. Setting this repository up correctly can make a huge impact on collaboration and
-getting the maximum out of your ZenML deployment. This section walks users through some of the options available to create a project
-repository with ZenML.
+Welcome to the guide on setting up a well-architected ZenML project. This section will provide you with a comprehensive overview of best practices, strategies, and considerations for structuring your ZenML projects to ensure scalability, maintainability, and collaboration within your team.

-<figure><img src="../../.gitbook/assets/Remote_with_code_repository.png" alt=""><figcaption><p>A visual representation of how the code repository fits into the general ZenML architecture.</p></figcaption></figure>
## The Importance of a Well-Architected Project

A well-architected ZenML project is crucial for the success of your machine learning operations (MLOps). It provides a solid foundation for your team to develop, deploy, and maintain ML models efficiently. By following best practices and leveraging ZenML's features, you can create a robust and flexible MLOps pipeline that scales with your needs.

## Key Components of a Well-Architected ZenML Project

### Repository Structure

A clean and organized repository structure is essential for any ZenML project. This includes:

- Proper folder organization for pipelines, steps, and configurations
- Clear separation of concerns between different components
- Consistent naming conventions

Learn more about setting up your repository in the [Set up repository guide](./best-practices.md).
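
As a rough illustration of the folder organization listed above, the sketch below scaffolds one possible layout. The folder names (`pipelines`, `steps`, `configs`) and the `scaffold` helper are assumptions made for this example, not a structure that ZenML requires.

```python
from pathlib import Path

# Hypothetical layout: one folder each for pipelines, steps, and configurations.
LAYOUT = ("pipelines", "steps", "configs")


def scaffold(root: Path) -> list[Path]:
    """Create the top-level project folders under root and return their paths."""
    created = []
    for name in LAYOUT:
        folder = root / name
        folder.mkdir(parents=True, exist_ok=True)
        # An __init__.py makes the code folders importable as Python packages.
        if name != "configs":
            (folder / "__init__.py").touch()
        created.append(folder)
    return created
```

Keeping configuration files in their own folder, separate from the Python packages, is one way to get the separation of concerns mentioned above.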

### Version Control and Collaboration

Integrating your ZenML project with version control systems like Git is crucial for team collaboration and code management. This allows for:

- Faster pipeline builds, as you can reuse the same image and [have ZenML download code from your repository](../../how-to/customize-docker-builds/how-to-reuse-builds.md#use-code-repositories-to-speed-up-docker-build-times)
- Easy tracking of changes
- Collaboration among team members

Discover how to connect your Git repository in the [Set up a repository guide](./best-practices.md).

### Stacks, Pipelines, Models, and Artifacts

Understanding the relationship between stacks, pipelines, models, and artifacts is key to designing an efficient ZenML project:

- Stacks: Define your infrastructure and tool configurations
- Models: Represent your machine learning models and their metadata
- Pipelines: Encapsulate your ML workflows
- Artifacts: Track your data and model outputs

Learn about organizing these components in the [Organizing Stacks, Pipelines, Models, and Artifacts guide](./stacks-pipelines-models.md).

### Access Management and Roles

Proper access management ensures that team members have the right permissions and responsibilities:

- Define roles such as data scientists, MLOps engineers, and infrastructure managers
- Set up [service connectors](../auth-management/README.md) and manage authorizations
- Establish processes for pipeline maintenance and server upgrades
- Leverage [Teams in ZenML Pro](../../getting-started/zenml-pro/teams.md) to assign roles and permissions to a group of users, to mimic your real-world team roles.

Explore access management strategies in the [Access Management and Roles guide](./access-management-and-roles.md).

### Shared Components and Libraries

Leverage shared components and libraries to promote code reuse and standardization across your team:

- Custom flavors, steps, and materializers
- Shared private wheels for internal distribution
- Handling authentication for specific libraries

Find out more about sharing code in the [Shared Libraries and Logic for Teams guide](./shared_components_for_teams.md).

### Project Templates

Utilize project templates to kickstart your ZenML projects and ensure consistency:

- Use pre-made templates for common use cases
- Create custom templates tailored to your team's needs

Learn about using and creating project templates in the [Project Templates guide](./project-templates.md).

### Migration and Maintenance

As your project evolves, you may need to migrate existing codebases or upgrade your ZenML server:

- Strategies for migrating legacy code to newer ZenML versions
- Best practices for upgrading ZenML servers

Discover migration strategies and maintenance best practices in the [Migration and Maintenance guide](../../how-to/manage-the-zenml-server/best-practices-upgrading-zenml.md#upgrading-your-code).

## Getting Started

To begin building your well-architected ZenML project, start by exploring the guides in this section. Each guide provides in-depth information on specific aspects of project setup and management.

Remember, a well-architected project is an ongoing process. Regularly review and refine your project structure, processes, and practices to ensure they continue to meet your team's evolving needs.

By following these guidelines and leveraging ZenML's powerful features, you'll be well on your way to creating a robust, scalable, and collaborative MLOps environment.

<figure><img src="https://static.scarf.sh/a.png?x-pxid=f0b4f458-0a54-4fcd-aa95-d5ee424815bc" alt="ZenML Scarf"><figcaption></figcaption></figure>
@@ -0,0 +1,93 @@
---
description: A guide on managing user roles and responsibilities in ZenML.
---

# Access Management and Roles in ZenML

Effective access management is crucial for maintaining security and efficiency in your ZenML projects. This guide will help you understand the different roles within a ZenML server and how to manage access for your team members.

## Typical Roles in an ML Project

In an ML project, you will typically have the following roles:

- Data Scientists: Primarily work on developing and running pipelines.
- MLOps Platform Engineers: Manage the infrastructure and stack components.
- Project Owners: Oversee the entire ZenML deployment and manage user access.

The roles above are an approximation of what you might have in your team. The names may differ, or there may be more roles, but you can loosely map the responsibilities discussed in this document onto your own project.

{% hint style="info" %}
You can create [Roles in ZenML Pro](../../getting-started/zenml-pro/roles.md) with a given set of permissions and assign them to either Users or Teams that represent your real-world team structure. Sign up for a free trial to try it yourself: https://cloud.zenml.io/
{% endhint %}

## Service Connectors: Gateways to External Services

Service connectors are how different cloud services are integrated with ZenML. They are used to abstract away the credentials and other configurations needed to access these services.

Ideally, only the MLOps Platform Engineers should have access to create and manage connectors. They are closest to your infrastructure and can make informed decisions about which authentication mechanisms to use.

Other team members can use connectors to create stack components that talk to the external services, but they should not have to worry about setting the connectors up and shouldn't have access to the credentials used to configure them.

Let's look at an example of how this works in practice.
Imagine you have a `DataScientist` role in your ZenML server. This role should only be able to use the connectors to create stack components and run pipelines. They shouldn't have access to the credentials used to configure these connectors. Therefore, the permissions for this role could look like the following:

![Data Scientist Permissions](../../.gitbook/assets/data_scientist_connector_role.png)

Notice that the role doesn't grant the data scientist permission to create, update, or delete connectors, or to read their secret values.

On the other hand, the `MLOpsPlatformEngineer` role has the permissions to create, update, and delete connectors, as well as read their secret values. The permissions for this role could look like the following:

![MLOps Platform Engineer Permissions](../../.gitbook/assets/platform_engineer_connector_role.png)
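
The split between the two roles can be sketched as a small permission model. This is an illustration only: the `ConnectorPermission` flags, the `ROLES` table, and `is_allowed` are hypothetical names and do not reflect how ZenML Pro stores or checks permissions. Granting `USE` to the platform engineer is also an assumption.

```python
from enum import Flag, auto


class ConnectorPermission(Flag):
    NONE = 0
    USE = auto()           # reference a connector from a stack component
    CREATE = auto()
    UPDATE = auto()
    DELETE = auto()
    READ_SECRETS = auto()  # read the credentials stored in a connector


# Hypothetical role table mirroring the two roles described above.
ROLES = {
    "DataScientist": ConnectorPermission.USE,
    "MLOpsPlatformEngineer": (
        ConnectorPermission.USE  # assumption: engineers can also use connectors
        | ConnectorPermission.CREATE
        | ConnectorPermission.UPDATE
        | ConnectorPermission.DELETE
        | ConnectorPermission.READ_SECRETS
    ),
}


def is_allowed(role: str, needed: ConnectorPermission) -> bool:
    """Return True only if the role holds every permission in `needed`."""
    granted = ROLES.get(role, ConnectorPermission.NONE)
    return (granted & needed) == needed
```

Unknown roles fall back to `NONE`, so access is denied by default.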

{% hint style="info" %}
Note that you can only use the RBAC features in ZenML Pro. Learn more about roles in ZenML Pro [here](../../getting-started/zenml-pro/roles.md).
{% endhint %}

Learn more about the best practices in managing credentials and recommended roles in our [Managing Stacks and Components guide](../stack-deployment/README.md).


## Who is responsible for upgrading the ZenML server?

The decision to upgrade your ZenML server is usually taken by your Project Owners after consulting with all the teams using the server. This is because there might be teams with conflicting requirements and moving to a new version of ZenML (that might come with upgrades to certain libraries) can break code for some users.

{% hint style="info" %}
You can choose to have different servers for different teams and that can alleviate some of the pressure to upgrade if you have multiple teams using the same server. ZenML Pro offers [multi-tenancy](../../getting-started/zenml-pro/tenants.md) out of the box, for situations like these. Sign up for a free trial to try it yourself: https://cloud.zenml.io/
{% endhint %}

Performing the upgrade itself is a task that typically falls on the MLOps Platform Engineers. Among other things, they should:

- ensure that all data is backed up before performing the upgrade
- ensure that no service disruption or downtime happens during the upgrade

Read in detail about the best practices for upgrading your ZenML server in the [Best Practices for Upgrading ZenML Servers](../manage-the-zenml-server/best-practices-upgrading-zenml.md) guide.


## Who is responsible for migrating and maintaining pipelines?

When you upgrade to a new version of ZenML, you might have to test if your code works as expected and if the syntax is up to date with what ZenML expects. Although we do our best to make new releases compatible with older versions, there might be some breaking changes that you might have to address.

The pipeline code itself is typically owned by the Data Scientist, but the Platform Engineer is responsible for making sure that new changes can be tested in a safe environment without impacting existing workflows. This can involve setting up a new server, doing a staged upgrade, and other strategies.

The Data Scientist should also check out the release notes, and the migration guide where applicable when upgrading the code. Read more about the best practices for upgrading your ZenML server and your code in the [Best Practices for Upgrading ZenML Servers](../manage-the-zenml-server/best-practices-upgrading-zenml.md) guide.


## Best Practices for Access Management

Apart from the role-specific tasks discussed so far, there are some general best practices you should follow:

- Regular Audits: Conduct periodic reviews of user access and permissions.
- Role-Based Access Control (RBAC): Implement RBAC to streamline permission management.
- Least Privilege: Grant minimal necessary permissions to each role.
- Documentation: Maintain clear documentation of roles, responsibilities, and access policies.
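
A periodic audit against the least-privilege principle can be as simple as comparing what each user holds with what their role is meant to grant. The sketch below assumes permissions are exported as plain strings. The permission names, `ROLE_BASELINE`, and `find_excess_permissions` are illustrative and not part of ZenML.

```python
from typing import Dict, Set, Tuple

# Hypothetical role baselines; the permission names are illustrative strings.
ROLE_BASELINE: Dict[str, Set[str]] = {
    "DataScientist": {"connector:use"},
    "MLOpsPlatformEngineer": {
        "connector:use",
        "connector:create",
        "connector:update",
        "connector:delete",
        "connector:read_secrets",
    },
}


def find_excess_permissions(
    assignments: Dict[str, Tuple[str, Set[str]]],
) -> Dict[str, Set[str]]:
    """Map each user to the permissions they hold beyond their role's baseline."""
    report = {}
    for user, (role, granted) in assignments.items():
        # Users with an unknown role have an empty baseline, so everything is flagged.
        extra = granted - ROLE_BASELINE.get(role, set())
        if extra:
            report[user] = extra
    return report
```

Users who stay within their baseline are left out of the report, so an empty result means the audit found nothing to review.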

{% hint style="info" %}
Role-Based Access Control (RBAC) and permission assignment are only available for ZenML Pro users.
{% endhint %}

By following these guidelines, you can ensure a secure and well-managed ZenML environment that supports collaboration while maintaining proper access controls.


<!-- For scarf -->
<figure><img alt="ZenML Scarf" referrerpolicy="no-referrer-when-downgrade" src="https://static.scarf.sh/a.png?x-pxid=f0b4f458-0a54-4fcd-aa95-d5ee424815bc" /></figure>


@@ -8,6 +8,8 @@ description: >-

A code repository in ZenML refers to a remote storage location for your code. Some commonly known code repository platforms include [GitHub](https://github.com/) and [GitLab](https://gitlab.com/).

<figure><img src="../../.gitbook/assets/Remote_with_code_repository.png" alt=""><figcaption><p>A visual representation of how the code repository fits into the general ZenML architecture.</p></figcaption></figure>

Code repositories enable ZenML to keep track of the code version that you use for your pipeline runs. Additionally, running a pipeline that is tracked in a registered code repository can [speed up the Docker image building for containerized stack components](../customize-docker-builds/use-code-repositories-to-speed-up-docker-build-times.md) by eliminating the need to rebuild Docker images each time you change one of your source code files.

Learn more about how code repositories benefit development [here](../customize-docker-builds/use-code-repositories-to-speed-up-docker-build-times.md).