Merge pull request #109 from tsailiming/dsp-example
Add data science pipeline example
guimou authored Dec 2, 2024
2 parents 5951a95 + 1ae03fa commit db04166
Showing 3 changed files with 50 additions and 0 deletions.
@@ -0,0 +1,49 @@
# Data Science Pipeline

## What is it?

OpenShift AI lets you build machine learning workflows with data science pipelines. Starting with OpenShift AI version 2.9, data science pipelines are based on Kubeflow Pipelines (KFP) version 2.0.

## What is Kubeflow Pipelines?

Kubeflow Pipelines (KFP) is a platform for building and deploying portable and scalable machine learning (ML) workflows using Docker containers.

With KFP you can author components and pipelines using the KFP Python SDK, compile pipelines to an intermediate representation YAML, and submit the pipeline to run on a KFP-conformant backend.

The KFP 2.0 implementation currently shipped with OpenShift AI uses Argo Workflows as the backend.

## Why do I see OpenShift Pipelines in this example?

This example uses OpenShift Pipelines (Tekton) to compile the pipeline into an intermediate representation (IR) YAML file and submit it to the Kubeflow Pipelines server. The alternatives would be to do this from your Jupyter environment using Elyra, or to import the pipeline directly through the Dashboard.

The Tekton pipeline has two main tasks:
* `git-clone`
* `execute-kubeflow-pipeline` to compile and submit the pipeline
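
The submit half of the second task could look roughly like the following sketch. It is not the actual task implementation from the sample repository. It assumes the IR YAML has already been compiled, and that the connection details arrive through environment variables whose names (`KFP_HOST`, `KFP_TOKEN`, `PIPELINE_PACKAGE`, `RUN_NAME`) are hypothetical. The `kfp.Client` and `create_run_from_pipeline_package` calls are part of the KFP SDK; the helper functions around them are illustrative.

```python
import os


def load_settings(env=os.environ) -> dict:
    """Read connection settings from the environment (variable names are assumptions)."""
    return {
        "host": env["KFP_HOST"],  # route of the pipeline server
        "token": env.get("KFP_TOKEN"),  # bearer token, if the server requires one
        "package": env.get("PIPELINE_PACKAGE", "pipeline.yaml"),
        "run_name": env.get("RUN_NAME", "tekton-triggered-run"),
    }


def submit(settings: dict):
    """Submit a compiled IR YAML file as a new pipeline run."""
    # Imported here so that load_settings() can be used without the SDK installed.
    import kfp

    client = kfp.Client(host=settings["host"], existing_token=settings["token"])
    return client.create_run_from_pipeline_package(
        pipeline_file=settings["package"],
        arguments={},  # pipeline parameters; empty means use the defaults
        run_name=settings["run_name"],
    )
```

Inside a Tekton step, a script would call `submit(load_settings())` after the compile step has produced the YAML file. Reading configuration from the environment keeps credentials out of the pipeline source, which is one reasonable design choice among several.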

## Example

## Architectural Diagram

![dsp-arch](img/rhoai-dsp.jpg)

The demo uses the following components:

| Component | Description |
|---|---|
| Gitea | Stores the pipeline source code |
| Model Registry | Stores model metadata |
| OpenShift Pipelines | Uses Tekton to build the pipeline |
| Data Science Pipeline | Runs the pipeline using KFP |
| MinIO | Provides the S3 bucket that stores the model |
| KServe | Serves the model |

## Prerequisite

You will need OpenShift AI 2.15 installed, with the ModelRegistry component set to `Managed`. In 2.15, the model registry feature is in Technology Preview.

### Running the Example

The sample code is available in the [openshift-ai-dsp repository](https://github.com/tsailiming/openshift-ai-dsp).




1 change: 1 addition & 0 deletions mkdocs.yml
@@ -137,6 +137,7 @@ nav:
- Workflows:
- Apache Airflow: tools-and-applications/airflow/airflow.md
- MLflow: tools-and-applications/mlflow/mlflow.md
- Data Science Pipeline: tools-and-applications/datasciencepipeline/datasciencepipeline.md
- Storage:
- Minio: tools-and-applications/minio/minio.md
- Rclone: tools-and-applications/rclone/rclone.md