Adds wandb example (#1673)

Signed-off-by: Thomas J. Fan <[email protected]>
flyteorg · Jun 13, 2024 · 1a4773d
1 parent 85c2a40

Showing 6 changed files with 142 additions and 0 deletions.
diff --git a/docs/integrations.md b/docs/integrations.md
@@ -40,6 +40,8 @@ Flytekit functionality. These plugins can be anything and for comparison can be
   - Convert ML models to ONNX models seamlessly.
 * - {doc}`DuckDB <auto_examples/duckdb_plugin/index>`
   - Run analytical queries using DuckDB.
+* - {doc}`Weights and Biases <auto_examples/wandb_plugin/index>`
+  - `wandb`: Machine learning platform to build better models faster.
 ```
 
 :::{dropdown} {fa}`info-circle` Using flytekit plugins

diff --git a/examples/wandb_plugin/Dockerfile b/examples/wandb_plugin/Dockerfile
@@ -0,0 +1,27 @@
+FROM python:3.11-slim-bookworm
+LABEL org.opencontainers.image.source https://github.com/flyteorg/flytesnacks
+
+WORKDIR /root
+ENV VENV /opt/venv
+ENV LANG C.UTF-8
+ENV LC_ALL C.UTF-8
+ENV PYTHONPATH /root
+
+WORKDIR /root
+
+ENV VENV /opt/venv
+# Virtual environment
+RUN python3 -m venv ${VENV}
+ENV PATH="${VENV}/bin:$PATH"
+
+# Install Python dependencies
+COPY requirements.in /root
+RUN pip install -r /root/requirements.in
+
+# Copy the actual code
+COPY . /root
+
+# This tag is supplied by the build script and will be used to determine the version
+# when registering tasks, workflows, and launch plans
+ARG tag
+ENV FLYTE_INTERNAL_IMAGE $tag
diff --git a/examples/wandb_plugin/README.md b/examples/wandb_plugin/README.md
@@ -0,0 +1,20 @@
+(wandb)=
+
+# Weights and Biases
+
+```{eval-rst}
+.. tags:: Integration, Data, Metrics, Intermediate
+```
+
+The Weights and Biases MLOps platform helps AI developers streamline their ML workflows from end to end. This plugin
+enables seamless use of Weights & Biases within Flyte by configuring links between the two platforms.
+
+First, install the Flyte Weights & Biases plugin:
+
+```bash
+pip install flytekitplugins-wandb
+```
+
+```{auto-examples-toc}
+wandb_example
+```
diff --git a/examples/wandb_plugin/requirements.in b/examples/wandb_plugin/requirements.in
@@ -0,0 +1,4 @@
+flytekitplugins-wandb
+xgboost
+scikit-learn
+wandb
diff --git a/examples/wandb_plugin/wandb_plugin/__init__.py b/examples/wandb_plugin/wandb_plugin/__init__.py
diff --git a/examples/wandb_plugin/wandb_plugin/wandb_example.py b/examples/wandb_plugin/wandb_plugin/wandb_example.py
@@ -0,0 +1,89 @@
+# %% [markdown]
+# (wandb_example)=
+#
+# # Weights and Biases Example
+# The Weights & Biases MLOps platform helps AI developers streamline their ML
+# workflows from end to end. This plugin enables seamless use of Weights & Biases
+# within Flyte by configuring links between the two platforms.
+# %%
+from flytekit import ImageSpec, Secret, task, workflow
+from flytekitplugins.wandb import wandb_init
+
+# %% [markdown]
+# First, we specify the project and entity that we will use with Weights and Biases.
+# Please update `WANDB_ENTITY` to the value associated with your account.
+# %%
+WANDB_PROJECT = "flytekit-wandb-plugin"
+WANDB_ENTITY = "github-username"
+
+# %% [markdown]
+# W&B requires an API key to authenticate with its service. In the example below,
+# the key is read from a secret created with
+# [Flyte's Secrets manager](https://docs.flyte.org/en/latest/user_guide/productionizing/secrets.html).
+# %%
+SECRET_KEY = "wandb-api-key"
+SECRET_GROUP = "wandb-api-group"
+wandb_secret = Secret(key=SECRET_KEY, group=SECRET_GROUP)
+
+# %% [markdown]
+# Next, we use `ImageSpec` to construct a container that contains the dependencies for this
+# task:
+# %%
+REGISTRY = "localhost:30000"
+
+image = ImageSpec(
+    name="wandb_example",
+    python_version="3.11",
+    packages=["flytekitplugins-wandb", "xgboost", "scikit-learn"],
+    registry=REGISTRY,
+)
+
+
+# %%
+# The `wandb_init` decorator calls `wandb.init` and configures it to use Flyte's
+# execution id as the Weights and Biases run id. The body of the task is XGBoost training
+# code, where we pass `WandbCallback` into `XGBClassifier`'s `callbacks`.
+@task(
+    container_image=image,
+    secret_requests=[wandb_secret],
+)
+@wandb_init(project=WANDB_PROJECT, entity=WANDB_ENTITY, secret=wandb_secret)
+def train() -> float:
+    import wandb
+    from sklearn.datasets import load_iris
+    from sklearn.model_selection import train_test_split
+    from wandb.integration.xgboost import WandbCallback
+    from xgboost import XGBClassifier
+
+    X, y = load_iris(return_X_y=True)
+    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
+    bst = XGBClassifier(
+        n_estimators=100,
+        objective="multi:softprob",  # iris has three classes, so use a multiclass objective
+        callbacks=[WandbCallback(log_model=True)],
+    )
+    bst.fit(X_train, y_train)
+
+    test_score = bst.score(X_test, y_test)
+
+    # Log custom metrics
+    wandb.run.log({"test_score": test_score})
+    return test_score
+
+
+@workflow
+def wf() -> float:
+    return train()
+
+
+# %% [markdown]
+# To enable dynamic log links, add the plugin's entries to Flyte's configuration file:
+# ```yaml
+# dynamic-log-links:
+#    - wandb-execution-id:
+#        displayName: Weights & Biases
+#        templateUris: '{{ .taskConfig.host }}/{{ .taskConfig.entity }}/{{ .taskConfig.project }}/runs/{{ .executionName }}-{{ .nodeId }}-{{ .taskRetryAttempt }}'
+#    - wandb-custom-id:
+#        displayName: Weights & Biases
+#        templateUris: '{{ .taskConfig.host }}/{{ .taskConfig.entity }}/{{ .taskConfig.project }}/runs/{{ .taskConfig.id }}'
+# ```
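
The `templateUris` entries above are templates whose `{{ .name }}` placeholders are filled in per task. The sketch below is not Flyte's implementation (Flyte renders these links itself); it is a small standalone stand-in that substitutes placeholders with a regular expression, only to show what a rendered `wandb-execution-id` link might look like. The helper `render_link`, the host `https://wandb.ai`, and the execution name, node id, and retry attempt are all hypothetical values chosen for illustration.

```python
import re

# Template copied from the `wandb-execution-id` entry in the configuration above.
TEMPLATE = (
    "{{ .taskConfig.host }}/{{ .taskConfig.entity }}/{{ .taskConfig.project }}"
    "/runs/{{ .executionName }}-{{ .nodeId }}-{{ .taskRetryAttempt }}"
)

# Matches placeholders of the form `{{ .some.dotted.name }}`.
PLACEHOLDER = re.compile(r"\{\{\s*\.([A-Za-z0-9_.]+)\s*\}\}")


def render_link(template: str, values: dict[str, str]) -> str:
    """Hypothetical helper: replace each `{{ .name }}` with values[name]."""

    def lookup(match: re.Match) -> str:
        # A missing key raises KeyError, which surfaces typos in the template.
        return values[match.group(1)]

    return PLACEHOLDER.sub(lookup, template)


# Assumed values for illustration only; real values come from the task
# configuration and from the Flyte execution at run time.
example_values = {
    "taskConfig.host": "https://wandb.ai",
    "taskConfig.entity": "github-username",
    "taskConfig.project": "flytekit-wandb-plugin",
    "executionName": "f1234abcd",
    "nodeId": "n0",
    "taskRetryAttempt": "0",
}

link = render_link(TEMPLATE, example_values)
print(link)
```

Under these assumed values, the last path segment has the form `<execution name>-<node id>-<retry attempt>`, which matches the description of `wandb_init` using Flyte's execution id as the run id.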