Merge pull request #109 from tsailiming/dsp-example

Add data science pipeline example
opendatahub-io-contrib · Dec 2, 2024 · db04166
2 parents 5951a95 + 1ae03fa
Showing 3 changed files with 50 additions and 0 deletions.
diff --git a/docs/tools-and-applications/datasciencepipeline/datasciencepipeline.md b/docs/tools-and-applications/datasciencepipeline/datasciencepipeline.md
@@ -0,0 +1,49 @@
+# Data Science Pipeline
+
+## What is it?
+
+OpenShift AI lets you build machine learning workflows with data science pipelines. Starting with OpenShift AI version 2.9, data science pipelines are based on Kubeflow Pipelines (KFP) version 2.0.
+
+## What is Kubeflow Pipelines?
+Kubeflow Pipelines (KFP) is a platform for building and deploying portable and scalable machine learning (ML) workflows using Docker containers.
+
+With KFP you can author components and pipelines using the KFP Python SDK, compile pipelines to an intermediate representation YAML, and submit the pipeline to run on a KFP-conformant backend.
+
+The KFP 2.0 implementation currently included in OpenShift AI uses Argo Workflows as the backend.
+
+## Why do I see OpenShift Pipelines in this example?
+
+The example uses OpenShift Pipelines (Tekton) to compile the pipeline into an intermediate representation (IR) YAML and submit it to the Kubeflow Pipelines server. This replaces doing the same from your Jupyter environment using Elyra, or importing the pipeline directly through the Dashboard.
+
+The Tekton pipeline has two main tasks:
+* `git-clone` to fetch the pipeline source code
+* `execute-kubeflow-pipeline` to compile and submit the pipeline
+
+## Example
+
+### Architectural Diagram
+
+![dsp-arch](img/rhoai-dsp.jpg)
+
+The demo uses the following components:
+
+| Component | Description |
+|---|---|
+| Gitea | To store pipeline source code |
+| Model Registry | To store model metadata |
+| OpenShift Pipelines | Using Tekton to build the pipeline |
+| Data Science Pipeline | To run the pipeline using KFP |
+| Minio | S3 bucket to store the model |
+| KServe | To serve the model |
+
+## Prerequisite
+
+You will need OpenShift AI 2.15 installed with the ModelRegistry component set to `Managed`. In 2.15, the model registry feature is in Tech Preview.
+
+### Running the Example
+
+The sample code is available in the [openshift-ai-dsp repository](https://github.com/tsailiming/openshift-ai-dsp).
+
+
+
+
diff --git a/docs/tools-and-applications/datasciencepipeline/img/rhoai-dsp.jpg b/docs/tools-and-applications/datasciencepipeline/img/rhoai-dsp.jpg
diff --git a/mkdocs.yml b/mkdocs.yml
@@ -137,6 +137,7 @@ nav:
     - Workflows:
       - Apache Airflow: tools-and-applications/airflow/airflow.md
       - MLflow: tools-and-applications/mlflow/mlflow.md
+      - Data Science Pipeline: tools-and-applications/datasciencepipeline/datasciencepipeline.md
     - Storage:
       - Minio: tools-and-applications/minio/minio.md
       - Rclone: tools-and-applications/rclone/rclone.md
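
The added `datasciencepipeline.md` says KFP pipelines are authored with the Python SDK, compiled to IR YAML, and submitted to a KFP-conformant backend, and that the `execute-kubeflow-pipeline` task performs the compile and submit steps. The following is a minimal sketch of that flow, not the code from the linked repository. It assumes the `kfp` 2.x package is installed. The component, the pipeline name, the base image, and the `host` and `token` parameters are hypothetical placeholders.

```python
"""Sketch: compile a KFP v2 pipeline to IR YAML and submit it to a pipeline server."""


def build_pipeline():
    """Return a one-step demo pipeline (requires `pip install kfp`)."""
    # Imported lazily so this module still loads where kfp is not installed.
    from kfp import dsl

    @dsl.component(base_image="python:3.11")  # base image is an assumption
    def add(a: int, b: int) -> int:
        return a + b

    @dsl.pipeline(name="demo-pipeline")  # hypothetical pipeline name
    def demo_pipeline(a: int = 1, b: int = 2):
        add(a=a, b=b)

    return demo_pipeline


def compile_pipeline(package_path: str = "pipeline.yaml") -> str:
    """Compile the demo pipeline to IR YAML and return the output path."""
    from kfp import compiler

    compiler.Compiler().compile(
        pipeline_func=build_pipeline(),
        package_path=package_path,
    )
    return package_path


def submit_pipeline(package_path: str, host: str, token: str) -> None:
    """Start a run from a compiled package on a KFP server.

    `host` and `token` are placeholders for the pipeline server route and a
    bearer token; how they are obtained depends on your cluster.
    """
    from kfp import Client

    client = Client(host=host, existing_token=token)
    client.create_run_from_pipeline_package(
        package_path,
        arguments={"a": 3, "b": 4},
    )
```

Calling `compile_pipeline()` writes `pipeline.yaml` to the working directory. Passing that path to `submit_pipeline()` together with a server URL and token starts a run. In the Tekton setup described above, both calls would run inside the task after `git-clone` has fetched the source.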