- | 1 | PyTorch | Image Classification | Training a model to predict clothing categories in FashionMNIST, including accelerated inference with Torch-TensorRT. | [Link](https://pytorch.org/tutorials/beginner/basics/quickstart_tutorial.html)
- | 2 | PyTorch | Housing Regression | Training a model to predict housing prices in the California Housing Dataset, including accelerated inference with Torch-TensorRT. | [Link](https://github.com/christianversloot/machine-learning-articles/blob/main/how-to-create-a-neural-network-for-regression-with-pytorch.md)
- | 3 | Tensorflow | Image Classification | Training a model to predict hand-written digits in MNIST. | [Link](https://github.com/tensorflow/docs/blob/master/site/en/tutorials/keras/save_and_load.ipynb)
- | 4 | Tensorflow | Keras Preprocessing | Training a model with preprocessing layers to predict likelihood of pet adoption in the PetFinder mini dataset. | [Link](https://github.com/tensorflow/docs/blob/master/site/en/tutorials/structured_data/preprocessing_layers.ipynb)
- | 5 | Tensorflow | Keras Resnet50 | Training ResNet-50 to perform flower recognition from flower images. | [Link](https://docs.databricks.com/en/_extras/notebooks/source/deep-learning/keras-metadata.html)
- | 6 | Tensorflow | Text Classification | Training a model to perform sentiment analysis on the IMDB dataset. | [Link](https://github.com/tensorflow/docs/blob/master/site/en/tutorials/keras/text_classification.ipynb)
- | 7+8 | HuggingFace | Conditional Generation | Sentence translation using the T5 text-to-text transformer for both Torch and Tensorflow. | [Link](https://huggingface.co/docs/transformers/model_doc/t5#t5)
- | 9+10 | HuggingFace | Pipelines | Sentiment analysis using Huggingface pipelines for both Torch and Tensorflow. | [Link](https://huggingface.co/docs/transformers/quicktour#pipeline-usage)
- | 11 | HuggingFace | Sentence Transformers | Sentence embeddings using SentenceTransformers in Torch. | [Link](https://huggingface.co/sentence-transformers)
+ | 1 | HuggingFace | DeepSeek-R1 | LLM batch inference using the DeepSeek-R1-Distill-Llama reasoning model. | [Link](https://huggingface.co/deepseek-ai/DeepSeek-R1)
+ | 2 | HuggingFace | Gemma-7b | LLM batch inference using the lightweight Google Gemma-7b model. | [Link](https://huggingface.co/google/gemma-7b-it)
+ | 3 | HuggingFace | Sentence Transformers | Sentence embeddings using SentenceTransformers in Torch. | [Link](https://huggingface.co/sentence-transformers)
+ | 4+5 | HuggingFace | Conditional Generation | Sentence translation using the T5 text-to-text transformer for both Torch and Tensorflow. | [Link](https://huggingface.co/docs/transformers/model_doc/t5#t5)
+ | 6+7 | HuggingFace | Pipelines | Sentiment analysis using Huggingface pipelines for both Torch and Tensorflow. | [Link](https://huggingface.co/docs/transformers/quicktour#pipeline-usage)
+ | 8 | PyTorch | Image Classification | Training a model to predict clothing categories in FashionMNIST, and deploying with Torch-TensorRT accelerated inference. | [Link](https://pytorch.org/tutorials/beginner/basics/quickstart_tutorial.html)
+ | 9 | PyTorch | Housing Regression | Training and deploying a model to predict housing prices in the California Housing Dataset, and deploying with Torch-TensorRT accelerated inference. | [Link](https://github.com/christianversloot/machine-learning-articles/blob/main/how-to-create-a-neural-network-for-regression-with-pytorch.md)
+ | 10 | Tensorflow | Image Classification | Training and deploying a model to predict hand-written digits in MNIST. | [Link](https://github.com/tensorflow/docs/blob/master/site/en/tutorials/keras/save_and_load.ipynb)
+ | 11 | Tensorflow | Keras Preprocessing | Training and deploying a model with preprocessing layers to predict likelihood of pet adoption in the PetFinder mini dataset. | [Link](https://github.com/tensorflow/docs/blob/master/site/en/tutorials/structured_data/preprocessing_layers.ipynb)
+ | 12 | Tensorflow | Keras Resnet50 | Deploying ResNet-50 to perform flower recognition from flower images. | [Link](https://docs.databricks.com/en/_extras/notebooks/source/deep-learning/keras-metadata.html)
+ | 13 | Tensorflow | Text Classification | Training and deploying a model to perform sentiment analysis on the IMDB dataset. | [Link](https://github.com/tensorflow/docs/blob/master/site/en/tutorials/keras/text_classification.ipynb)
+

  ## Running Locally

@@ -130,9 +133,8 @@ The notebooks use [PyTriton](https://github.com/triton-inference-server/pytriton
  The diagram above shows how Spark distributes inference tasks to run on the Triton Inference Server, with PyTriton handling request/response communication with the server.

  The process looks like this:
- - Distribute a PyTriton task across the Spark cluster, instructing each worker to launch a Triton server process.
- - Use stage-level scheduling to ensure there is a 1:1 mapping between worker nodes and servers.
- - Define a Triton inference function, which contains a client that binds to the local server on a given worker and sends inference requests.
+ - Prior to inference, launch a Triton server process on each node.
+ - Define a Triton predict function, which creates a client that binds to the local server and sends/receives inference requests.
  - Wrap the Triton inference function in a predict_batch_udf to launch parallel inference requests using Spark.
  - Finally, distribute a shutdown signal to terminate the Triton server processes on each worker.
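
The predict-function and `predict_batch_udf` steps in the updated list can be sketched roughly as follows. This is a minimal illustration, not the notebooks' actual code: the helper names (`make_triton_predict_fn`, `build_triton_udf`), the server URL and port, the `"outputs"` tensor name, and the return type are all assumptions made for the example. It relies on PyTriton's `ModelClient` and on `pyspark.ml.functions.predict_batch_udf` (Spark 3.4+), and assumes a Triton server is already running on each node.

```python
from functools import partial
from typing import Any, Callable, Optional


def make_triton_predict_fn(
    model_name: str,
    url: str = "grpc://localhost:8001",  # assumed address of the node-local server
    client_factory: Optional[Callable[[str, str], Any]] = None,
) -> Callable[[Any], Any]:
    """Build a predict function that forwards each batch to the local Triton server."""
    if client_factory is None:
        # Imported lazily so this module loads on machines without PyTriton.
        from pytriton.client import ModelClient

        client_factory = ModelClient

    # One client per Python worker, bound to the server on the same node.
    client = client_factory(url, model_name)

    def predict(inputs: Any) -> Any:
        # infer_batch sends the whole batch and returns a dict of named outputs.
        result = client.infer_batch(inputs)
        return result["outputs"]  # hypothetical output tensor name

    return predict


def build_triton_udf(model_name: str, batch_size: int = 64):
    """Wrap the predict function in a Spark UDF (requires pyspark >= 3.4)."""
    from pyspark.ml.functions import predict_batch_udf
    from pyspark.sql.types import ArrayType, FloatType

    return predict_batch_udf(
        # predict_batch_udf expects a zero-argument factory for the predict function.
        partial(make_triton_predict_fn, model_name),
        return_type=ArrayType(FloatType()),  # assumed output shape for illustration
        batch_size=batch_size,
    )
```

With a DataFrame `df` holding a `features` column, the UDF could then be applied as `df.withColumn("preds", build_triton_udf("my_model")("features"))`, where `my_model` is a placeholder for whatever name the server was started with. The `client_factory` argument exists only so the predict function can be exercised without a running server.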
examples/ML+DL-Examples/Spark-DL/dl_inference/databricks/README.md
@@ -34,22 +34,25 @@
  databricks workspace import $INIT_DEST --format AUTO --file $INIT_SRC
  ```

- 6. Launch the cluster with the provided script (note that the script specifies **Azure instances** by default; change as needed):
+ 6. Launch the cluster with the provided script. By default the script will create a cluster with 4 A10 worker nodes and 1 A10 driver node. (Note that the script uses **Azure instances** by default; change as needed.)
  ```shell
  cd setup
  chmod +x start_cluster.sh
  ./start_cluster.sh
  ```
-
  OR, start the cluster from the Databricks UI:

  - Go to `Compute > Create compute` and set the desired cluster settings.
  - Integration with Triton inference server uses stage-level scheduling (Spark>=3.4.0). Make sure to:
-   - use a cluster with GPU resources
+   - use a cluster with GPU resources (for LLM examples, make sure the selected GPUs have sufficient RAM)
    - set a value for `spark.executor.cores`
    - ensure that `spark.executor.resource.gpu.amount` = 1
  - Under `Advanced Options > Init Scripts`, upload the init script from your workspace.
- - Under environment variables, set `FRAMEWORK=torch` or `FRAMEWORK=tf` based on the notebook used.
- - For Tensorflow notebooks, we recommend setting the environment variable `TF_GPU_ALLOCATOR=cuda_malloc_async` (especially for Huggingface LLM models), which enables the CUDA driver to implicitly release unused memory from the pool.
+ - Under environment variables, set:
+   - `FRAMEWORK=torch` or `FRAMEWORK=tf` based on the notebook used.
+   - `HF_HOME=/dbfs/FileStore/hf_home` to cache Huggingface models in DBFS.
+   - `TF_GPU_ALLOCATOR=cuda_malloc_async` to implicitly release unused GPU memory in Tensorflow notebooks.
+
+

  7. Navigate to the notebook in your workspace and attach it to the cluster. The default cluster name is `spark-dl-inference-$FRAMEWORK`.
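
When the cluster is created from the UI rather than the script, the settings listed in step 6 are easy to miss. The check below is a small sketch of those requirements in Python; `check_cluster_settings` is a hypothetical helper that is not part of the repository, and it only covers the settings the README names as required.

```python
from typing import Dict, List


def check_cluster_settings(spark_conf: Dict[str, str], env: Dict[str, str]) -> List[str]:
    """Return a list of problems; an empty list means the settings look consistent."""
    problems = []
    # The README asks for an explicit value for spark.executor.cores.
    if not spark_conf.get("spark.executor.cores"):
        problems.append("spark.executor.cores is not set")
    # Each executor should be assigned exactly one GPU.
    if str(spark_conf.get("spark.executor.resource.gpu.amount")) != "1":
        problems.append("spark.executor.resource.gpu.amount must be 1")
    # FRAMEWORK selects the torch or tf environment for the notebook.
    if env.get("FRAMEWORK") not in ("torch", "tf"):
        problems.append("FRAMEWORK must be 'torch' or 'tf'")
    return problems
```

For example, `check_cluster_settings({"spark.executor.cores": "8", "spark.executor.resource.gpu.amount": "1"}, {"FRAMEWORK": "torch"})` returns an empty list. `HF_HOME` and `TF_GPU_ALLOCATOR` are left out because the README presents them as caching and memory conveniences rather than requirements.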
examples/ML+DL-Examples/Spark-DL/dl_inference/dataproc/README.md
@@ -50,13 +50,12 @@
  ```shell
  export FRAMEWORK=torch
  ```
- Run the cluster startup script. The script will also retrieve and use the [spark-rapids initialization script](https://github.com/GoogleCloudDataproc/initialization-actions/blob/master/spark-rapids/spark-rapids.sh) to set up GPU resources.
+ Run the cluster startup script. The script will also retrieve and use the [spark-rapids initialization script](https://github.com/GoogleCloudDataproc/initialization-actions/blob/master/spark-rapids/spark-rapids.sh) to set up GPU resources. By default, the script creates a cluster named `${USER}-spark-dl-inference-${FRAMEWORK}` with 4 L4 worker nodes and 1 L4 driver node.
  ```shell
  cd setup
  chmod +x start_cluster.sh
  ./start_cluster.sh
  ```
- By default, the script creates a 4-node GPU cluster named `${USER}-spark-dl-inference-${FRAMEWORK}`.

  7. Browse to the Jupyter web UI:
  - Go to `Dataproc` > `Clusters` > `(Cluster Name)` > `Web Interfaces` > `Jupyter/Lab`
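
To find the cluster in the Dataproc console, it can help to compute the default name from the same environment variables the startup script uses. The snippet below is a sketch of the naming pattern quoted in the README; `default_cluster_name` is a hypothetical helper, and it assumes both `USER` and `FRAMEWORK` are set.

```python
import os
from typing import Mapping, Optional


def default_cluster_name(env: Optional[Mapping[str, str]] = None) -> str:
    """Mirror the README's ${USER}-spark-dl-inference-${FRAMEWORK} pattern."""
    env = os.environ if env is None else env
    # A KeyError here means USER or FRAMEWORK has not been exported yet.
    return f"{env['USER']}-spark-dl-inference-{env['FRAMEWORK']}"
```

Calling `default_cluster_name()` after `export FRAMEWORK=torch` yields the name to look for under `Dataproc` > `Clusters`.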