VertexAI Model-Registry & Model-Deployer #3161

Status: Open

safoinme wants to merge 32 commits into develop from feature/vertex-ai-deployer-model-registry

Commits (32)
2fb8a7d
initial commit on vertex ai deployer and model registry
safoinme Jun 3, 2024
c03f2a0
vertex model
safoinme Jun 6, 2024
3c6bbe9
Merge branch 'develop' into feature/vertex-ai-deployer-model-registry
safoinme Jul 14, 2024
4eeeb27
vertex deployer
safoinme Jul 15, 2024
7881b69
Merge branch 'develop' into feature/vertex-ai-deployer-model-registry
safoinme Sep 13, 2024
7c0ca3f
vertex registry code
safoinme Sep 18, 2024
6769b6c
format
safoinme Sep 18, 2024
9a03f34
Refactor model registration and add URI parameter
safoinme Sep 20, 2024
afc5c2b
Refactor model registration and add URI parameter
safoinme Sep 21, 2024
2dc0d2d
Refactor model registration and remove unnecessary code
safoinme Sep 21, 2024
5c5bb84
Merge branch 'develop' into feature/vertex-ai-deployer-model-registry
safoinme Oct 21, 2024
54b6748
Refactor GCP service and flavor classes for Vertex AI deployment
safoinme Oct 25, 2024
a80f71a
Merge branch 'develop' into feature/vertex-ai-deployer-model-registry
safoinme Oct 25, 2024
a980449
Refactor Vertex AI model registry and deployer configurations
safoinme Oct 31, 2024
6e2b660
Merge branch 'develop' into feature/vertex-ai-deployer-model-registry
safoinme Oct 31, 2024
0a13214
Refactor model deployer configurations and add VertexAI model deployer
safoinme Oct 31, 2024
53da68d
Refactor model deployer configurations and add VertexAI model deployer
safoinme Oct 31, 2024
ff015e1
Merge branch 'develop' into feature/vertex-ai-deployer-model-registry
safoinme Nov 4, 2024
ce2019d
Rename VertexAI model registry classes and update documentation for c…
safoinme Nov 7, 2024
14f2998
Auto-update of LLM Finetuning template
actions-user Nov 7, 2024
0b30a61
Auto-update of Starter template
actions-user Nov 7, 2024
83dfe31
Auto-update of E2E template
actions-user Nov 7, 2024
7888717
Auto-update of NLP template
actions-user Nov 7, 2024
72cc93c
Merge branch 'develop' into feature/vertex-ai-deployer-model-registry
safoinme Nov 7, 2024
70cc4a9
Auto-update of LLM Finetuning template
actions-user Nov 7, 2024
fcdec6e
Auto-update of Starter template
actions-user Nov 7, 2024
0108c0f
Auto-update of E2E template
actions-user Nov 7, 2024
0c33f82
Auto-update of NLP template
actions-user Nov 7, 2024
4f18ba5
Merge branch 'develop' into feature/vertex-ai-deployer-model-registry
safoinme Nov 8, 2024
012cd6e
Update default filenames and improve backward compatibility for sklea…
safoinme Nov 12, 2024
ac2e69a
Auto-update of LLM Finetuning template
actions-user Nov 12, 2024
3d558ae
Merge branch 'develop' into feature/vertex-ai-deployer-model-registry
safoinme Nov 13, 2024
179 changes: 179 additions & 0 deletions docs/book/component-guide/model-deployers/vertex.md
@@ -0,0 +1,179 @@
# Vertex AI Model Deployer

[Vertex AI](https://cloud.google.com/vertex-ai) provides managed infrastructure for deploying machine learning models at scale. The Vertex AI Model Deployer in ZenML allows you to deploy models to Vertex AI endpoints, providing a scalable and managed solution for model serving.

## When to use it?

You should use the Vertex AI Model Deployer when:

* You're already using Google Cloud Platform (GCP) and want to leverage its native ML infrastructure
* You need enterprise-grade model serving capabilities with autoscaling
* You want a fully managed solution for hosting ML models
* You need to handle high-throughput prediction requests
* You want to deploy models with GPU acceleration
* You need to monitor and track your model deployments

This is particularly useful in the following scenarios:
* Deploying models to production with high availability requirements
* Serving models that need GPU acceleration
* Handling varying prediction workloads with autoscaling
* Integrating model serving with other GCP services

{% hint style="warning" %}
The Vertex AI Model Deployer requires a Vertex AI Model Registry to be present in your stack. Make sure you have configured both components properly.
{% endhint %}

## How to deploy it?

The Vertex AI Model Deployer is provided by the GCP ZenML integration. First, install the integration:

```shell
zenml integration install gcp -y
```

### Authentication and Service Connector Configuration

The Vertex AI Model Deployer requires proper GCP authentication. The recommended way to configure this is using the ZenML Service Connector functionality:

```shell
# Register the service connector with a service account key
zenml service-connector register vertex_deployer_connector \
    --type gcp \
    --auth-method=service-account \
    --project_id=<PROJECT_ID> \
    --service_account_json=@<PATH_TO_SERVICE_ACCOUNT_KEY>.json \
    --resource-type gcp-generic

# Register the model deployer
zenml model-deployer register vertex_deployer \
    --flavor=vertex \
    --location=us-central1

# Connect the model deployer to the service connector
zenml model-deployer connect vertex_deployer --connector vertex_deployer_connector
```

{% hint style="info" %}
The service account needs the following permissions:
- `Vertex AI User` role for deploying models
- `Vertex AI Service Agent` role for managing model endpoints
{% endhint %}
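If you manage IAM yourself, granting these roles might look like the following sketch (`<PROJECT_ID>` and `<SA_EMAIL>` are placeholder assumptions for your project and service account email):

```shell
# Grant the Vertex AI User role (placeholder values)
gcloud projects add-iam-policy-binding <PROJECT_ID> \
    --member="serviceAccount:<SA_EMAIL>" \
    --role="roles/aiplatform.user"

# Grant the Vertex AI Service Agent role
gcloud projects add-iam-policy-binding <PROJECT_ID> \
    --member="serviceAccount:<SA_EMAIL>" \
    --role="roles/aiplatform.serviceAgent"
```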

## How to use it

### Deploy a model in a pipeline

Here's an example of how to use the Vertex AI Model Deployer in a ZenML pipeline:

```python
from typing_extensions import Annotated
from zenml import ArtifactConfig, get_step_context, step
from zenml.client import Client
from zenml.integrations.gcp.services.vertex_deployment import (
    VertexDeploymentConfig,
    VertexDeploymentService,
)

@step(enable_cache=False)
def model_deployer() -> Annotated[
    VertexDeploymentService,
    ArtifactConfig(name="vertex_deployment", is_deployment_artifact=True),
]:
    """Model deployer step."""
    zenml_client = Client()
    current_model = get_step_context().model
    model_registry_uri = current_model.get_model_artifact(
        "THE_MODEL_ARTIFACT_NAME_GIVEN_IN_TRAINING_STEP"
    ).uri
    model_deployer = zenml_client.active_stack.model_deployer

    # Configure the deployment
    vertex_deployment_config = VertexDeploymentConfig(
        location="europe-west1",
        name="zenml-vertex-quickstart",
        model_name=current_model.name,
        description="Vertex AI model deployment example",
        model_id=model_registry_uri,
        machine_type="n1-standard-4",  # Optional: specify machine type
        min_replica_count=1,  # Optional: minimum number of replicas
        max_replica_count=3,  # Optional: maximum number of replicas
    )

    # Deploy the model
    service = model_deployer.deploy_model(
        config=vertex_deployment_config,
        service_type=VertexDeploymentService.SERVICE_TYPE,
    )

    return service
```
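To use this step, invoke it from a pipeline after the model artifact has been produced. A minimal sketch, where `train_model` is a hypothetical training step that saves the model artifact referenced above:

```python
from zenml import pipeline, step

@step
def train_model() -> None:
    """Hypothetical training step; train and save your model artifact here."""
    ...

@pipeline
def deployment_pipeline():
    # The deployer step reads the model artifact produced during training,
    # so it must run after `train_model`.
    train_model()
    model_deployer(after=["train_model"])

if __name__ == "__main__":
    deployment_pipeline()
```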

### Configuration Options

The Vertex AI Model Deployer accepts a rich set of configuration options through `VertexDeploymentConfig`:

* Basic Configuration:
  * `location`: GCP region for deployment (e.g., "us-central1")
  * `name`: Name for the deployment endpoint
  * `model_name`: Name of the model being deployed
  * `model_id`: Model ID from the Vertex AI Model Registry

* Infrastructure Configuration:
  * `machine_type`: Type of machine to use (e.g., "n1-standard-4")
  * `accelerator_type`: GPU accelerator type if needed
  * `accelerator_count`: Number of GPUs per replica
  * `min_replica_count`: Minimum number of serving replicas
  * `max_replica_count`: Maximum number of serving replicas

* Advanced Configuration:
  * `service_account`: Custom service account for the deployment
  * `network`: VPC network configuration
  * `encryption_spec_key_name`: Customer-managed encryption key
  * `enable_access_logging`: Enable detailed access logging
  * `explanation_metadata`: Model explanation configuration
  * `autoscaling_target_cpu_utilization`: Target CPU utilization for autoscaling
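As an illustration, a GPU-backed deployment with autoscaling might combine several of these options as follows (a sketch: the values and the `NVIDIA_TESLA_T4` accelerator name are illustrative assumptions, not recommendations):

```python
from zenml.integrations.gcp.services.vertex_deployment import (
    VertexDeploymentConfig,
)

# Illustrative values only; field names are taken from the list above
gpu_config = VertexDeploymentConfig(
    location="us-central1",
    name="zenml-vertex-gpu-endpoint",
    model_name="my_model",
    model_id="<MODEL_ID_FROM_REGISTRY>",
    machine_type="n1-standard-8",
    accelerator_type="NVIDIA_TESLA_T4",  # assumed GPU type
    accelerator_count=1,
    min_replica_count=1,
    max_replica_count=5,
    autoscaling_target_cpu_utilization=70,
    enable_access_logging=True,
)
```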

### Running Predictions

Once a model is deployed, you can run predictions using the service:

```python
from zenml.integrations.gcp.model_deployers import VertexModelDeployer

# Get the deployed service
model_deployer = VertexModelDeployer.get_active_model_deployer()
services = model_deployer.find_model_server(
    pipeline_name="deployment_pipeline",
    pipeline_step_name="model_deployer",
    model_name="my_model",
)

if services:
    service = services[0]
    if service.is_running:
        # Run prediction
        prediction = service.predict(
            instances=[{"feature1": 1.0, "feature2": 2.0}]
        )
        print(f"Prediction: {prediction}")
```

### Limitations and Considerations

1. **Stack Requirements**:
   - Requires a Vertex AI Model Registry in the stack
   - All stack components must be non-local

2. **Authentication**:
   - Requires proper GCP credentials with Vertex AI permissions
   - Best practice is to use service connectors for authentication

3. **Costs**:
   - Vertex AI endpoints incur costs based on machine type and uptime
   - Consider using autoscaling to optimize costs

4. **Region Availability**:
   - Service availability depends on Vertex AI regional availability
   - Model and endpoint must be in the same region

Check out the [SDK docs](https://sdkdocs.zenml.io) for more detailed information about the implementation.
156 changes: 156 additions & 0 deletions docs/book/component-guide/model-registries/vertex.md
@@ -0,0 +1,156 @@
# Vertex AI Model Registry

[Vertex AI](https://cloud.google.com/vertex-ai) is Google Cloud's unified ML platform that helps you build, deploy, and scale ML models. The Vertex AI Model Registry is a centralized repository for managing your ML models throughout their lifecycle. ZenML's Vertex AI Model Registry integration allows you to register, version, and manage your models using Vertex AI's infrastructure.

## When would you want to use it?

You should consider using the Vertex AI Model Registry when:

* You're already using Google Cloud Platform (GCP) and want to leverage its native ML infrastructure
* You need enterprise-grade model management capabilities with fine-grained access control
* You want to track model lineage and metadata in a centralized location
* You're building ML pipelines that need to integrate with other Vertex AI services
* You need to manage model deployment across different GCP environments

This is particularly useful in the following scenarios:

* Building production ML pipelines that need to integrate with GCP services
* Managing multiple versions of models across development and production environments
* Tracking model artifacts and metadata in a centralized location
* Deploying models to Vertex AI endpoints for serving

{% hint style="warning" %}
Important: The Vertex AI Model Registry implementation only supports the model version interface, not the model interface. This means you cannot register, delete, or update models directly - you can only work with model versions. Operations like `register_model()`, `delete_model()`, and `update_model()` are not supported.

Unlike platforms like MLflow where you first create a model container and then add versions to it, Vertex AI combines model registration and versioning into a single operation:

- When you upload a model, it automatically creates both the model and its first version
- Each subsequent upload with the same display name creates a new version
- You cannot create an empty model container without a version
{% endhint %}
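In code, this means only the version-level methods of the registry are usable. A minimal sketch illustrating the constraint (the GCS path is a hypothetical placeholder):

```python
from zenml.client import Client

model_registry = Client().active_stack.model_registry

# Supported: registering a model version (creates the model on first upload)
version = model_registry.register_model_version(
    name="my_model",
    model_source_uri="gs://<BUCKET>/model/",  # hypothetical artifact path
)

# Not supported: model-level operations raise NotImplementedError
try:
    model_registry.register_model(name="my_model")
except NotImplementedError:
    print("Vertex AI only supports model version operations")
```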

## How do you deploy it?

The Vertex AI Model Registry flavor is provided by the GCP ZenML integration. First, install the integration:

```shell
zenml integration install gcp -y
```

### Authentication and Service Connector Configuration

The Vertex AI Model Registry requires proper GCP authentication. The recommended way to configure this is using the ZenML Service Connector functionality. You have several options for authentication:

1. Using a GCP Service Connector with a dedicated service account (Recommended):
```shell
# Register the service connector with a service account key
zenml service-connector register vertex_registry_connector \
    --type gcp \
    --auth-method=service-account \
    --project_id=<PROJECT_ID> \
    --service_account_json=@<PATH_TO_SERVICE_ACCOUNT_KEY>.json \
    --resource-type gcp-generic

# Register the model registry
zenml model-registry register vertex_registry \
    --flavor=vertex \
    --location=us-central1

# Connect the model registry to the service connector
zenml model-registry connect vertex_registry --connector vertex_registry_connector
```

2. Using local gcloud credentials:
```shell
# Register the model registry using local gcloud auth
zenml model-registry register vertex_registry \
    --flavor=vertex \
    --location=us-central1
```

{% hint style="info" %}
The service account used needs the following permissions:
- `Vertex AI User` role for creating and managing model versions
- `Storage Object Viewer` role if accessing models stored in Google Cloud Storage
{% endhint %}

## How do you use it?

### Register models inside a pipeline

Here's an example of how to use the Vertex AI Model Registry in your ZenML pipeline using the provided model registration step:

```python
from typing_extensions import Annotated
from zenml import ArtifactConfig, get_step_context, step
from zenml.client import Client
from zenml.logger import get_logger

logger = get_logger(__name__)

@step(enable_cache=False)
def model_register() -> Annotated[str, ArtifactConfig(name="model_registry_uri")]:
    """Model registration step."""
    # Get the current model from the context
    current_model = get_step_context().model

    client = Client()
    model_registry = client.active_stack.model_registry
    model_version = model_registry.register_model_version(
        name=current_model.name,
        version=str(current_model.version),
        model_source_uri=current_model.get_model_artifact("sklearn_classifier").uri,
        description="ZenML model registered after promotion",
    )
    logger.info(
        f"Model version {model_version.version} registered in Model Registry"
    )

    return model_version.model_source_uri
```

### Configuration Options

The Vertex AI Model Registry accepts the following configuration options:

* `location`: The GCP region where the model registry will be created (e.g., "us-central1")
* `project_id`: (Optional) The GCP project ID. If not specified, will use the default project
* `credentials`: (Optional) GCP credentials configuration
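For example, pinning the registry to a specific project rather than the default might look like this (a sketch; `<PROJECT_ID>` is a placeholder):

```shell
zenml model-registry register vertex_registry \
    --flavor=vertex \
    --location=us-central1 \
    --project_id=<PROJECT_ID>
```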

### Working with Model Versions

Since the Vertex AI Model Registry only supports version-level operations, here's how to work with model versions:

```shell
# List all model versions
zenml model-registry models list-versions <model-name>

# Get details of a specific model version
zenml model-registry models get-version <model-name> -v <version>

# Delete a model version
zenml model-registry models delete-version <model-name> -v <version>
```

### Key Differences from MLflow Model Registry

Unlike the MLflow Model Registry, the Vertex AI implementation has some important differences:

1. **Version-Only Interface**: Vertex AI only supports model version operations. You cannot register, delete, or update models directly - only their versions.
2. **Authentication**: Uses GCP service connectors for authentication, similar to other Vertex AI services in ZenML.
3. **Staging Levels**: Vertex AI doesn't have built-in staging levels (like Production, Staging, etc.) - these are handled through metadata.
4. **Default Container Images**: Vertex AI requires a serving container image URI, which defaults to the scikit-learn prediction container if not specified.
5. **Managed Service**: As a fully managed service, you don't need to worry about infrastructure management, but you need valid GCP credentials.
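For instance, because staging levels are metadata-driven (point 3 above), you can record a stage yourself when registering a version. A sketch, assuming `ModelRegistryModelMetadata` accepts extra keys such as a hypothetical `stage`:

```python
from zenml.client import Client
from zenml.model_registries.base_model_registry import (
    ModelRegistryModelMetadata,
)

model_registry = Client().active_stack.model_registry

# Record the stage as metadata; Vertex AI has no built-in staging levels
model_registry.register_model_version(
    name="my_model",
    model_source_uri="gs://<BUCKET>/model/",  # hypothetical artifact path
    metadata=ModelRegistryModelMetadata(stage="production"),  # assumed extra key
)
```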

### Limitations

Based on the implementation, there are some limitations to be aware of:

1. The `register_model()`, `update_model()`, and `delete_model()` methods are not implemented, as Vertex AI only supports registering model versions
2. It's preferable to give models an explicit serving container image URI; otherwise the default scikit-learn prediction container is used, which may not be compatible with your model when deploying to Vertex AI endpoints
3. All models registered by the integration are automatically labeled with `managed_by="zenml"` for tracking purposes

Check out the [SDK docs](https://sdkdocs.zenml.io/latest/integration_code_docs/integrations-gcp/#zenml.integrations.gcp.model_registry) to see more about the interface and implementation.

2 changes: 2 additions & 0 deletions docs/book/toc.md
@@ -270,6 +270,7 @@
* [Develop a custom experiment tracker](component-guide/experiment-trackers/custom.md)
* [Model Deployers](component-guide/model-deployers/model-deployers.md)
* [MLflow](component-guide/model-deployers/mlflow.md)
+* [VertexAI](component-guide/model-deployers/vertex.md)
* [Seldon](component-guide/model-deployers/seldon.md)
* [BentoML](component-guide/model-deployers/bentoml.md)
* [Hugging Face](component-guide/model-deployers/huggingface.md)
@@ -300,6 +301,7 @@
* [Develop a Custom Annotator](component-guide/annotators/custom.md)
* [Model Registries](component-guide/model-registries/model-registries.md)
* [MLflow Model Registry](component-guide/model-registries/mlflow.md)
+* [VertexAI](component-guide/model-registries/vertex.md)
* [Develop a Custom Model Registry](component-guide/model-registries/custom.md)
* [Feature Stores](component-guide/feature-stores/feature-stores.md)
* [Feast](component-guide/feature-stores/feast.md)
3 changes: 2 additions & 1 deletion src/zenml/cli/model_registry.py
@@ -18,6 +18,7 @@

import click

+from zenml import __version__
from zenml.cli import utils as cli_utils
from zenml.cli.cli import TagGroup, cli
from zenml.enums import StackComponentType
@@ -643,7 +644,7 @@ def register_model_version(
    # Parse metadata
    metadata = dict(metadata) if metadata else {}
    registered_metadata = ModelRegistryModelMetadata(**dict(metadata))
-   registered_metadata.zenml_version = zenml_version
+   registered_metadata.zenml_version = zenml_version or __version__
    registered_metadata.zenml_run_name = zenml_run_name
    registered_metadata.zenml_pipeline_name = zenml_pipeline_name
    registered_metadata.zenml_step_name = zenml_step_name
registered_metadata.zenml_step_name = zenml_step_name