This document is the deployment guide for MLOps.
For the orchestrator job, either an existing cluster can be used or a new cluster can be created. In either case, the following properties must be set on the cluster:
- Cluster Mode: High Concurrency
- Databricks Runtime Version: 8.1 LTS ML (includes Apache Spark 3.0.1, Scala 2.12)
- Enable Autoscaling: True
- Worker Type: Standard_F4s
- Driver Type: Standard_F4s
- Spark Settings under “Spark Config” (Edit > Advanced Options > Spark):
  spark.databricks.cluster.profile serverless
  spark.databricks.repl.allowedLanguages sql,python,r
  spark.databricks.conda.condaMagic.enabled true
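The Spark Config box takes one setting per line, with the key and the value separated by whitespace. If the same settings also need to be supplied as a key/value mapping (for example in a cluster definition written as JSON), a minimal sketch of the conversion could look like this. The helper name `parse_spark_config` is illustrative and not part of any Databricks tooling.

```python
# The three settings listed above, in the form they are typed into "Spark Config".
SPARK_CONFIG_TEXT = """
spark.databricks.cluster.profile serverless
spark.databricks.repl.allowedLanguages sql,python,r
spark.databricks.conda.condaMagic.enabled true
"""


def parse_spark_config(text):
    """Convert 'key value' lines into a dict (hypothetical helper)."""
    conf = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        parts = line.split(None, 1)  # split on the first run of whitespace
        if len(parts) != 2:
            raise ValueError("expected 'key value', got: " + repr(line))
        key, value = parts
        conf[key] = value.strip()
    return conf


spark_conf = parse_spark_config(SPARK_CONFIG_TEXT)
for key, value in spark_conf.items():
    print(key, "=", value)
```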
The orchestrator Databricks job can be created from a Databricks job create template using the following example CLI command:
databricks jobs create --json-file <job-template-file>.json
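This guide does not show the contents of the job template file. As a rough sketch only, assuming the template follows the Databricks Jobs API 2.0 job settings format and that the orchestrator runs as a notebook task, a template could be generated as below. The job name, the autoscale bounds, the cron expression, the time zone, and the helper `build_job_template` are all hypothetical, and the angle-bracket placeholders must be replaced with real values.

```python
import json


def build_job_template(name, notebook_path, wheel_dbfs_path, new_cluster, cron=None):
    """Assemble job settings as a dict (hypothetical helper, Jobs API 2.0 field names)."""
    template = {
        "name": name,
        "new_cluster": new_cluster,
        "libraries": [{"whl": wheel_dbfs_path}],
        "notebook_task": {"notebook_path": notebook_path},
    }
    if cron is not None:
        # Databricks job schedules use Quartz cron syntax.
        template["schedule"] = {
            "quartz_cron_expression": cron,
            "timezone_id": "UTC",  # assumption: adjust to the required time zone
        }
    return template


new_cluster = {
    "spark_version": "<runtime-version-key>",  # placeholder for the 8.1 ML runtime key
    "node_type_id": "Standard_F4s",
    "driver_node_type_id": "Standard_F4s",
    "autoscale": {"min_workers": 1, "max_workers": 2},  # hypothetical bounds
    "spark_conf": {
        "spark.databricks.cluster.profile": "serverless",
        "spark.databricks.repl.allowedLanguages": "sql,python,r",
        "spark.databricks.conda.condaMagic.enabled": "true",
    },
}

template = build_job_template(
    name="orchestrator",  # hypothetical job name
    notebook_path="<databricks-workspace-path>",
    wheel_dbfs_path="<dbfs-path>",
    new_cluster=new_cluster,
    cron="0 0 2 * * ?",  # hypothetical: every day at 02:00
)
print(json.dumps(template, indent=2))
```

The printed JSON can be saved as `<job-template-file>.json` and passed to the CLI command above. To reuse an existing cluster instead, the Jobs API accepts an `existing_cluster_id` field in place of `new_cluster`.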
An existing orchestrator Databricks job can be updated from a Databricks job reset template using the following example CLI command:
databricks jobs reset --job-id <job-id of existing job> --json-file <job-template-file>.json
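A deployment script usually has to pick between the two commands: create when the job does not exist yet, reset when it does. The sketch below shows one way to do that, assuming the `databricks` CLI is installed and configured. The helpers `jobs_command` and `deploy_job` are illustrative names, and the script defaults to a dry run that only prints the command.

```python
import subprocess


def jobs_command(template_file, job_id=None):
    """Return the CLI argument list: 'create' without a job id, 'reset' with one."""
    if job_id is None:
        return ["databricks", "jobs", "create", "--json-file", template_file]
    return [
        "databricks", "jobs", "reset",
        "--job-id", str(job_id),
        "--json-file", template_file,
    ]


def deploy_job(template_file, job_id=None, dry_run=True):
    """Print the command, and execute it only when dry_run is False."""
    argv = jobs_command(template_file, job_id)
    print(" ".join(argv))
    if not dry_run:
        subprocess.run(argv, check=True)  # raises CalledProcessError on failure
    return argv


# Dry run with a hypothetical template file name and job id.
deploy_job("job-template.json")
deploy_job("job-template.json", job_id=123)
```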
An MLflow experiment can be created through the Databricks workspace portal or with the following CLI commands:
export MLFLOW_TRACKING_URI=databricks
export DATABRICKS_HOST=<databricks host>
export DATABRICKS_TOKEN=<databricks token>
mlflow experiments create --experiment-name /<path in databricks workspace>/<experiment name>
For how to obtain DATABRICKS_HOST and DATABRICKS_TOKEN, see the Databricks CLI Reference.
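The same step can also be scripted with the MLflow Python client, which makes it easy to skip creation when the experiment already exists. This is a minimal sketch, assuming the `mlflow` package is installed and `DATABRICKS_HOST` and `DATABRICKS_TOKEN` are exported as above. `ensure_experiment` and `main` are hypothetical helper names.

```python
import os


def ensure_experiment(client, name):
    """Return the id of the experiment called `name`, creating it if missing.

    `client` is expected to behave like mlflow.tracking.MlflowClient.
    """
    existing = client.get_experiment_by_name(name)  # None when not found
    if existing is not None:
        return existing.experiment_id
    return client.create_experiment(name)


def main(experiment_name):
    """Entry point for real use; requires mlflow and Databricks credentials."""
    from mlflow.tracking import MlflowClient

    # Equivalent to `export MLFLOW_TRACKING_URI=databricks`.
    os.environ.setdefault("MLFLOW_TRACKING_URI", "databricks")
    experiment_id = ensure_experiment(MlflowClient(), experiment_name)
    print("experiment id:", experiment_id)
    return experiment_id
```

Calling `main("/<path in databricks workspace>/<experiment name>")` with the placeholders filled in creates the experiment on the first run and returns the existing id on later runs.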
The following CLI command uploads the wheel package to Databricks DBFS:
databricks fs cp --overwrite python-package.whl <dbfs-path>
The following CLI command imports the orchestrator Python file into the Databricks workspace as a Databricks notebook:
databricks workspace import -l PYTHON -f SOURCE -o <orchestrator-notebook-python-file>.py <databricks-workspace-path>
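The upload and import steps can be chained in one script so that a failure in the first stops the second. The sketch below assumes the `databricks` CLI is installed and configured. The function names and the file names in the usage lines are hypothetical, and the angle-bracket placeholders stand for the real DBFS and workspace paths.

```python
import subprocess


def upload_wheel_command(wheel_file, dbfs_path):
    """CLI arguments for copying a wheel into DBFS, overwriting any existing file."""
    return ["databricks", "fs", "cp", "--overwrite", wheel_file, dbfs_path]


def import_notebook_command(python_file, workspace_path):
    """CLI arguments for importing a Python source file as a notebook."""
    return [
        "databricks", "workspace", "import",
        "-l", "PYTHON", "-f", "SOURCE", "-o",
        python_file, workspace_path,
    ]


def run_steps(commands, dry_run=True):
    """Run commands in order; with check=True the first failure aborts the rest."""
    for argv in commands:
        print(" ".join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)


steps = [
    upload_wheel_command("python-package.whl", "<dbfs-path>"),
    import_notebook_command("orchestrator.py", "<databricks-workspace-path>"),
]
run_steps(steps)  # dry run: prints the two commands without executing them
```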
The orchestrator Databricks job can be triggered in the following ways:
- Scheduled:
  - Cron-based scheduling.
- Manual:
  - Via the Databricks workspace portal, by clicking Run Now With Different Parameters.
  - Via the Databricks CLI.
  - Via the Databricks API.
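For the API route, a manual trigger amounts to a POST to the Jobs API `run-now` endpoint. The sketch below builds such a request with the standard library only. It assumes Jobs API 2.0, a notebook-based job (hence `notebook_params`), and a personal access token. The helper names and the example values are hypothetical.

```python
import json
import urllib.request


def build_run_now_request(host, token, job_id, notebook_params=None):
    """Build (but do not send) a Jobs API 2.0 run-now request."""
    body = {"job_id": job_id}
    if notebook_params:
        # Overrides for notebook parameters, as with "Run Now With Different Parameters".
        body["notebook_params"] = notebook_params
    return urllib.request.Request(
        url=host.rstrip("/") + "/api/2.0/jobs/run-now",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": "Bearer " + token,
            "Content-Type": "application/json",
        },
        method="POST",
    )


def trigger_job(host, token, job_id, notebook_params=None):
    """Send the request and return the decoded JSON response."""
    request = build_run_now_request(host, token, job_id, notebook_params)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


# Inspect the request without sending it (hypothetical host, token, and job id).
request = build_run_now_request(
    "https://example.invalid", "token-value", 123, {"run_date": "2021-01-01"}
)
print(request.get_method(), request.full_url)
```

Calling `trigger_job` with the values of `DATABRICKS_HOST` and `DATABRICKS_TOKEN` sends the request. The CLI equivalent is `databricks jobs run-now --job-id <job-id>`.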