
Commit 3cff617

Merge pull request #289 from nvliyuan/main-v2304
merge branch-23.04 to main branch

2 parents 0cc1d0c + 234fdb7, commit 3cff617

40 files changed: +266 −104 lines changed
.github/workflows/auto-merge.yml

+4 −4

@@ -18,7 +18,7 @@ name: auto-merge HEAD to BASE
 on:
   pull_request_target:
     branches:
-      - branch-23.02
+      - branch-23.04
     types: [closed]

 jobs:
@@ -29,14 +29,14 @@ jobs:
     steps:
      - uses: actions/checkout@v3
        with:
-         ref: branch-23.02 # force to fetch from latest upstream instead of PR ref
+         ref: branch-23.04 # force to fetch from latest upstream instead of PR ref

      - name: auto-merge job
        uses: ./.github/workflows/auto-merge
        env:
          OWNER: NVIDIA
          REPO_NAME: spark-rapids-examples
-         HEAD: branch-23.02
-         BASE: branch-23.04
+         HEAD: branch-23.04
+         BASE: branch-23.06
          AUTOMERGE_TOKEN: ${{ secrets.AUTOMERGE_TOKEN }} # use to merge PR

docs/get-started/xgboost-examples/csp/databricks/databricks.md

+36 −40
@@ -6,27 +6,26 @@ This is a getting started guide to XGBoost4J-Spark on Databricks. At the end of
 Prerequisites
 -------------

-* Apache Spark 3.1+ running in Databricks Runtime 9.1 ML or 10.4 ML with GPU
-  * AWS: 9.1 LTS ML (GPU, Scala 2.12, Spark 3.1.2) or 10.4 LTS ML (GPU, Scala 2.12, Spark 3.2.1)
-  * Azure: 9.1 LTS ML (GPU, Scala 2.12, Spark 3.1.2) or 10.4 LTS ML (GPU, Scala 2.12, Spark 3.2.1)
+* Apache Spark 3.x running in Databricks Runtime 10.4 ML or 11.3 ML with GPU
+  * AWS: 10.4 LTS ML (GPU, Scala 2.12, Spark 3.2.1) or 11.3 LTS ML (GPU, Scala 2.12, Spark 3.3.0)
+  * Azure: 10.4 LTS ML (GPU, Scala 2.12, Spark 3.2.1) or 11.3 LTS ML (GPU, Scala 2.12, Spark 3.3.0)

 The number of GPUs per node dictates the number of Spark executors that can run in that node. Each executor should only be allowed to run 1 task at any given time.

 Start A Databricks Cluster
 --------------------------

-Create a Databricks cluster by clicking "+ Create -> Cluster" on the left panel. Ensure the
+Create a Databricks cluster by going to "Compute", then clicking `+ Create compute`. Ensure the
 cluster meets the prerequisites above by configuring it as follows:
 1. Select the Databricks Runtime Version from one of the supported runtimes specified in the
    Prerequisites section.
-2. Under Autopilot Options, disable autoscaling.
-3. Choose the number of workers you want to use.
-4. Select a worker type. On AWS, use nodes with 1 GPU each such as `p3.2xlarge` or `g4dn.xlarge`.
+2. Choose the number of workers that matches the number of GPUs you want to use.
+3. Select a worker type. On AWS, use nodes with 1 GPU each such as `p3.2xlarge` or `g4dn.xlarge`.
    p2 nodes do not meet the architecture requirements (Pascal or higher) for the Spark worker
    (although they can be used for the driver node). For Azure, choose GPU nodes such as
-   Standard_NC6s_v3.
-5. Select the driver type. Generally this can be set to be the same as the worker.
-6. Start the cluster.
+   Standard_NC6s_v3. For GCP, choose N1 or A2 instance types with GPUs.
+4. Select the driver type. Generally this can be set to be the same as the worker.
+5. Start the cluster.

 Advanced Cluster Configuration
 --------------------------
@@ -38,20 +37,18 @@ cluster.
    your workspace. See [Managing
    Notebooks](https://docs.databricks.com/notebooks/notebooks-manage.html#id2) for instructions on
    how to import a notebook.
-   Select the initialization script based on the Databricks runtime
+   Select the version of the RAPIDS Accelerator for Apache Spark based on the Databricks runtime
    version:
-
-   - [Databricks 9.1 LTS
-     ML](https://docs.databricks.com/release-notes/runtime/9.1ml.html#system-environment) has CUDA 11
-     installed. Users will need to use 21.12.0 or later on Databricks 9.1 LTS ML. In this case use
-     [generate-init-script.ipynb](generate-init-script.ipynb) which will install
-     the RAPIDS Spark plugin.
-
-   - [Databricks 10.4 LTS
-     ML](https://docs.databricks.com/release-notes/runtime/9.1ml.html#system-environment) has CUDA 11
-     installed. Users will need to use 22.04.0 or later on Databricks 10.4 LTS ML. In this case use
-     [generate-init-script-10.4.ipynb](generate-init-script-10.4.ipynb) which will install
-     the RAPIDS Spark plugin.
+   - [Databricks 10.4 LTS
+     ML](https://docs.databricks.com/release-notes/runtime/10.4ml.html#system-environment) has CUDA 11
+     installed. Users will need to use 22.04.0 or later on Databricks 10.4 LTS ML.
+   - [Databricks 11.3 LTS
+     ML](https://docs.databricks.com/release-notes/runtime/11.3ml.html#system-environment) has CUDA 11
+     installed. Users will need to use 23.04.0 or later on Databricks 11.3 LTS ML.
+
+   In both cases use
+   [generate-init-script.ipynb](./generate-init-script.ipynb) which will install
+   the RAPIDS Spark plugin.

 2. Once you are in the notebook, click the “Run All” button.
 3. Ensure that the newly created init.sh script is present in the output from cell 2 and that the
@@ -72,23 +69,17 @@ cluster.
 The
 [`spark.task.resource.gpu.amount`](https://spark.apache.org/docs/latest/configuration.html#scheduling)
 configuration is defaulted to 1 by Databricks. That means that only 1 task can run on an
-executor with 1 GPU, which is limiting, especially on the reads and writes from Parquet. Set
+executor with 1 GPU, which is limiting, especially on the reads and writes from Parquet. Set
 this to 1/(number of cores per executor) which will allow multiple tasks to run in parallel just
-like the CPU side. Having the value smaller is fine as well.
-
-There is an incompatibility between the Databricks specific implementation of adaptive query
-execution (AQE) and the spark-rapids plugin. In order to mitigate this,
-`spark.sql.adaptive.enabled` should be set to false. In addition, the plugin does not work with
-the Databricks `spark.databricks.delta.optimizeWrite` option.
+like the CPU side. Having the value smaller is fine as well.
+Note: Please remove the `spark.task.resource.gpu.amount` config for a single-node Databricks
+cluster because Spark local mode does not support GPU scheduling.

 ```bash
-spark.plugins com.nvidia.spark.SQLPlugin
-spark.task.resource.gpu.amount 0.1
-spark.rapids.memory.pinnedPool.size 2G
-spark.locality.wait 0s
-spark.databricks.delta.optimizeWrite.enabled false
-spark.sql.adaptive.enabled false
-spark.rapids.sql.concurrentGpuTasks 2
+spark.plugins com.nvidia.spark.SQLPlugin
+spark.task.resource.gpu.amount 0.1
+spark.rapids.memory.pinnedPool.size 2G
+spark.rapids.sql.concurrentGpuTasks 2
 ```

 ![Spark Config](../../../../img/databricks/sparkconfig.png)
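
As a worked illustration of the 1/(number of cores per executor) guidance in the doc text above, here is a minimal sketch that assumes a hypothetical worker type with 8 cores per executor; the 0.125 value is derived from that assumption, not taken from this commit:

```bash
# Hypothetical worker: 8 CPU cores and 1 GPU per executor.
# 1/8 = 0.125 lets up to 8 tasks share the single GPU, matching CPU-side parallelism.
spark.plugins com.nvidia.spark.SQLPlugin
spark.task.resource.gpu.amount 0.125
spark.rapids.memory.pinnedPool.size 2G
spark.rapids.sql.concurrentGpuTasks 2
```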
@@ -186,6 +177,11 @@ Limitations
 4. Databricks makes changes to the runtime without notification.

    Databricks makes changes to existing runtimes, applying patches, without notification.
-   [Issue-3098](https://github.com/NVIDIA/spark-rapids/issues/3098) is one example of this. We run
-   regular integration tests on the Databricks environment to catch these issues and fix them once
-   detected.
+   [Issue-3098](https://github.com/NVIDIA/spark-rapids/issues/3098) is one example of this. We run
+   regular integration tests on the Databricks environment to catch these issues and fix them once
+   detected.
+
+5. In Databricks 11.3, an incorrect result is returned for window frames defined by a range in case
+   of DecimalTypes with precision greater than 38. There is a bug filed in Apache Spark for it
+   [here](https://issues.apache.org/jira/browse/SPARK-41793), whereas when using the plugin the
+   correct result will be returned.

docs/get-started/xgboost-examples/csp/databricks/generate-init-script-10.4.ipynb

+3 −3

@@ -24,7 +24,7 @@
 "source": [
 "%sh\n",
 "cd ../../dbfs/FileStore/jars/\n",
-"sudo wget -O rapids-4-spark_2.12-22.12.0.jar https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark_2.12/22.12.0/rapids-4-spark_2.12-22.12.0.jar\n",
+"sudo wget -O rapids-4-spark_2.12-23.04.0.jar https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark_2.12/23.04.0/rapids-4-spark_2.12-23.04.0.jar\n",
 "sudo wget -O xgboost4j-gpu_2.12-1.7.1.jar https://repo1.maven.org/maven2/ml/dmlc/xgboost4j-gpu_2.12/1.7.1/xgboost4j-gpu_2.12-1.7.1.jar\n",
 "sudo wget -O xgboost4j-spark-gpu_2.12-1.7.1.jar https://repo1.maven.org/maven2/ml/dmlc/xgboost4j-spark-gpu_2.12/1.7.1/xgboost4j-spark-gpu_2.12-1.7.1.jar\n",
 "ls -ltr\n",
@@ -60,7 +60,7 @@
 "sudo rm -f /databricks/jars/spark--maven-trees--ml--10.x--xgboost-gpu--ml.dmlc--xgboost4j-spark-gpu_2.12--ml.dmlc__xgboost4j-spark-gpu_2.12__1.5.2.jar\n",
 "\n",
 "sudo cp /dbfs/FileStore/jars/xgboost4j-gpu_2.12-1.7.1.jar /databricks/jars/\n",
-"sudo cp /dbfs/FileStore/jars/rapids-4-spark_2.12-22.12.0.jar /databricks/jars/\n",
+"sudo cp /dbfs/FileStore/jars/rapids-4-spark_2.12-23.04.0.jar /databricks/jars/\n",
 "sudo cp /dbfs/FileStore/jars/xgboost4j-spark-gpu_2.12-1.7.1.jar /databricks/jars/\"\"\", True)"
 ]
 },
@@ -133,7 +133,7 @@
 "1. Edit your cluster, adding an initialization script from `dbfs:/databricks/init_scripts/init.sh` in the \"Advanced Options\" under \"Init Scripts\" tab\n",
 "2. Reboot the cluster\n",
 "3. Go to \"Libraries\" tab under your cluster and install `dbfs:/FileStore/jars/xgboost4j-spark-gpu_2.12-1.7.1.jar` in your cluster by selecting the \"DBFS\" option for installing jars\n",
-"4. Import the mortgage example notebook from `https://github.com/NVIDIA/spark-rapids-examples/blob/branch-23.02/examples/XGBoost-Examples/mortgage/notebooks/python/mortgage-gpu.ipynb`\n",
+"4. Import the mortgage example notebook from `https://github.com/NVIDIA/spark-rapids-examples/blob/branch-23.04/examples/XGBoost-Examples/mortgage/notebooks/python/mortgage-gpu.ipynb`\n",
 "5. Inside the mortgage example notebook, update the data paths\n",
 " `train_data = reader.schema(schema).option('header', True).csv('/data/mortgage/csv/small-train.csv')`\n",
 " `trans_data = reader.schema(schema).option('header', True).csv('/data/mortgage/csv/small-trans.csv')`"

@@ -0,0 +1,166 @@
+{
+"cells": [
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"## Download latest Jars"
+]
+},
+{
+"cell_type": "code",
+"execution_count": 2,
+"metadata": {},
+"outputs": [],
+"source": [
+"dbutils.fs.mkdirs(\"dbfs:/FileStore/jars/\")"
+]
+},
+{
+"cell_type": "code",
+"execution_count": 3,
+"metadata": {},
+"outputs": [],
+"source": [
+"%sh\n",
+"cd ../../dbfs/FileStore/jars/\n",
+"sudo wget -O rapids-4-spark_2.12-23.04.0.jar https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark_2.12/23.04.0/rapids-4-spark_2.12-23.04.0.jar\n",
+"sudo wget -O xgboost4j-gpu_2.12-1.7.3.jar https://repo1.maven.org/maven2/ml/dmlc/xgboost4j-gpu_2.12/1.7.3/xgboost4j-gpu_2.12-1.7.3.jar\n",
+"sudo wget -O xgboost4j-spark-gpu_2.12-1.7.3.jar https://repo1.maven.org/maven2/ml/dmlc/xgboost4j-spark-gpu_2.12/1.7.3/xgboost4j-spark-gpu_2.12-1.7.3.jar\n",
+"ls -ltr\n",
+"\n",
+"# Your Jars are downloaded in dbfs:/FileStore/jars directory"
+]
+},
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"### Create a Directory for your init script"
+]
+},
+{
+"cell_type": "code",
+"execution_count": 5,
+"metadata": {},
+"outputs": [],
+"source": [
+"dbutils.fs.mkdirs(\"dbfs:/databricks/init_scripts/\")"
+]
+},
+{
+"cell_type": "code",
+"execution_count": 6,
+"metadata": {},
+"outputs": [],
+"source": [
+"dbutils.fs.put(\"/databricks/init_scripts/init.sh\",\"\"\"\n",
+"#!/bin/bash\n",
+"sudo rm -f /databricks/jars/spark--maven-trees--ml--10.x--xgboost-gpu--ml.dmlc--xgboost4j-gpu_2.12--ml.dmlc__xgboost4j-gpu_2.12__1.5.2.jar\n",
+"sudo rm -f /databricks/jars/spark--maven-trees--ml--10.x--xgboost-gpu--ml.dmlc--xgboost4j-spark-gpu_2.12--ml.dmlc__xgboost4j-spark-gpu_2.12__1.5.2.jar\n",
+"\n",
+"sudo cp /dbfs/FileStore/jars/xgboost4j-gpu_2.12-1.7.3.jar /databricks/jars/\n",
+"sudo cp /dbfs/FileStore/jars/rapids-4-spark_2.12-23.04.0.jar /databricks/jars/\n",
+"sudo cp /dbfs/FileStore/jars/xgboost4j-spark-gpu_2.12-1.7.3.jar /databricks/jars/\"\"\", True)"
+]
+},
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"### Confirm your init script is in the new directory"
+]
+},
+{
+"cell_type": "code",
+"execution_count": 8,
+"metadata": {},
+"outputs": [],
+"source": [
+"%sh\n",
+"cd ../../dbfs/databricks/init_scripts\n",
+"pwd\n",
+"ls -ltr"
+]
+},
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"### Download the Mortgage Dataset into your local machine and upload Data using import Data"
+]
+},
+{
+"cell_type": "code",
+"execution_count": 10,
+"metadata": {},
+"outputs": [],
+"source": [
+"dbutils.fs.mkdirs(\"dbfs:/FileStore/tables/\")"
+]
+},
+{
+"cell_type": "code",
+"execution_count": 11,
+"metadata": {},
+"outputs": [],
+"source": [
+"%sh\n",
+"cd /dbfs/FileStore/tables/\n",
+"wget -O mortgage.zip https://rapidsai-data.s3.us-east-2.amazonaws.com/spark/mortgage.zip\n",
+"ls\n",
+"unzip mortgage.zip"
+]
+},
+{
+"cell_type": "code",
+"execution_count": 12,
+"metadata": {},
+"outputs": [],
+"source": [
+"%sh\n",
+"pwd\n",
+"cd ../../dbfs/FileStore/tables\n",
+"ls -ltr mortgage/csv/*"
+]
+},
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"### Next steps\n",
+"\n",
+"1. Edit your cluster, adding an initialization script from `dbfs:/databricks/init_scripts/init.sh` in the \"Advanced Options\" under \"Init Scripts\" tab\n",
+"2. Reboot the cluster\n",
+"3. Go to \"Libraries\" tab under your cluster and install `dbfs:/FileStore/jars/xgboost4j-spark-gpu_2.12-1.7.3.jar` in your cluster by selecting the \"DBFS\" option for installing jars\n",
+"4. Import the mortgage example notebook from `https://github.com/NVIDIA/spark-rapids-examples/blob/branch-23.04/examples/XGBoost-Examples/mortgage/notebooks/python/mortgage-gpu.ipynb`\n",
+"5. Inside the mortgage example notebook, update the data paths\n",
+" `train_data = reader.schema(schema).option('header', True).csv('/data/mortgage/csv/small-train.csv')`\n",
+" `trans_data = reader.schema(schema).option('header', True).csv('/data/mortgage/csv/small-trans.csv')`"
+]
+}
+],
+"metadata": {
+"kernelspec": {
+"display_name": "Python 3",
+"language": "python",
+"name": "python3"
+},
+"language_info": {
+"codemirror_mode": {
+"name": "ipython",
+"version": 3
+},
+"file_extension": ".py",
+"mimetype": "text/x-python",
+"name": "python",
+"nbconvert_exporter": "python",
+"pygments_lexer": "ipython3",
+"version": "3.8.2"
+},
+"name": "Init Scripts_demo",
+"notebookId": 2585487876834616
+},
+"nbformat": 4,
+"nbformat_minor": 1
+}

docs/get-started/xgboost-examples/csp/databricks/generate-init-script.ipynb

+3 −3

@@ -24,7 +24,7 @@
 "source": [
 "%sh\n",
 "cd ../../dbfs/FileStore/jars/\n",
-"sudo wget -O rapids-4-spark_2.12-22.12.0.jar https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark_2.12/22.12.0/rapids-4-spark_2.12-22.12.0.jar\n",
+"sudo wget -O rapids-4-spark_2.12-23.04.0.jar https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark_2.12/23.04.0/rapids-4-spark_2.12-23.04.0.jar\n",
 "sudo wget -O xgboost4j-gpu_2.12-1.7.1.jar https://repo1.maven.org/maven2/ml/dmlc/xgboost4j-gpu_2.12/1.7.1/xgboost4j-gpu_2.12-1.7.1.jar\n",
 "sudo wget -O xgboost4j-spark-gpu_2.12-1.7.1.jar https://repo1.maven.org/maven2/ml/dmlc/xgboost4j-spark-gpu_2.12/1.7.1/xgboost4j-spark-gpu_2.12-1.7.1.jar\n",
 "ls -ltr\n",
@@ -60,7 +60,7 @@
 "sudo rm -f /databricks/jars/spark--maven-trees--ml--9.x--xgboost-gpu--ml.dmlc--xgboost4j-spark-gpu_2.12--ml.dmlc__xgboost4j-spark-gpu_2.12__1.4.1.jar\n",
 "\n",
 "sudo cp /dbfs/FileStore/jars/xgboost4j-gpu_2.12-1.7.1.jar /databricks/jars/\n",
-"sudo cp /dbfs/FileStore/jars/rapids-4-spark_2.12-22.12.0.jar /databricks/jars/\n",
+"sudo cp /dbfs/FileStore/jars/rapids-4-spark_2.12-23.04.0.jar /databricks/jars/\n",
 "sudo cp /dbfs/FileStore/jars/xgboost4j-spark-gpu_2.12-1.7.1.jar /databricks/jars/\"\"\", True)"
 ]
 },
@@ -133,7 +133,7 @@
 "1. Edit your cluster, adding an initialization script from `dbfs:/databricks/init_scripts/init.sh` in the \"Advanced Options\" under \"Init Scripts\" tab\n",
 "2. Reboot the cluster\n",
 "3. Go to \"Libraries\" tab under your cluster and install `dbfs:/FileStore/jars/xgboost4j-spark-gpu_2.12-1.7.1.jar` in your cluster by selecting the \"DBFS\" option for installing jars\n",
-"4. Import the mortgage example notebook from `https://github.com/NVIDIA/spark-rapids-examples/blob/branch-23.02/examples/XGBoost-Examples/mortgage/notebooks/python/mortgage-gpu.ipynb`\n",
+"4. Import the mortgage example notebook from `https://github.com/NVIDIA/spark-rapids-examples/blob/branch-23.04/examples/XGBoost-Examples/mortgage/notebooks/python/mortgage-gpu.ipynb`\n",
 "5. Inside the mortgage example notebook, update the data paths\n",
 " `train_data = reader.schema(schema).option('header', True).csv('/data/mortgage/csv/small-train.csv')`\n",
 " `trans_data = reader.schema(schema).option('header', True).csv('/data/mortgage/csv/small-trans.csv')`"
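
After the cluster reboots with the init script attached, a quick sanity check, sketched here under the assumption that you run it from a `%sh` notebook cell on the restarted cluster, is to confirm the updated jars were copied into `/databricks/jars/`:

```bash
# Assumes the init script above already ran at cluster startup.
# List the RAPIDS and XGBoost jars that should now be pinned in /databricks/jars/.
ls -l /databricks/jars/ | grep -E 'rapids-4-spark|xgboost4j'
```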

docs/get-started/xgboost-examples/csp/dataproc/gcp.md

+1 −1

@@ -17,7 +17,7 @@
 gcloud dataproc clusters create $CLUSTER_NAME \
     --region=$REGION \
     --image-version=2.0-ubuntu18 \
-    --master-machine-type=n1-standard-16 \
+    --master-machine-type=n2-standard-16 \
     --num-workers=$NUM_WORKERS \
     --worker-accelerator=type=nvidia-tesla-t4,count=$NUM_GPUS \
     --worker-machine-type=n1-highmem-32\
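
For context, the `gcloud dataproc clusters create` command above assumes these shell variables are already exported. The values below are placeholders for illustration only, not settings from this commit:

```bash
# Hypothetical placeholder values for the variables referenced by the command above.
export CLUSTER_NAME=my-gpu-cluster
export REGION=us-central1
export NUM_WORKERS=2
export NUM_GPUS=1
```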

docs/get-started/xgboost-examples/on-prem-cluster/kubernetes-scala.md

+3 −3

@@ -11,11 +11,11 @@ Prerequisites
 * NVIDIA Pascal™ GPU architecture or better
 * Multi-node clusters with homogenous GPU configuration
 * Software Requirements
-  * Ubuntu 18.04, 20.04/CentOS7, CentOS8
+  * Ubuntu 18.04, 20.04/CentOS7, Rocky Linux 8
   * CUDA 11.0+
   * NVIDIA driver compatible with your CUDA
   * NCCL 2.7.8+
-  * [Kubernetes 1.6+ cluster with NVIDIA GPUs](https://docs.nvidia.com/datacenter/kubernetes/index.html)
+  * [Kubernetes cluster with NVIDIA GPUs](https://docs.nvidia.com/datacenter/cloud-native/kubernetes/install-k8s.html)
 * See official [Spark on Kubernetes](https://spark.apache.org/docs/latest/running-on-kubernetes.html#prerequisites)
   instructions for detailed spark-specific cluster requirements
 * kubectl installed and configured in the job submission environment
@@ -40,7 +40,7 @@ export SPARK_DOCKER_IMAGE=<gpu spark docker image repo and name>
 export SPARK_DOCKER_TAG=<spark docker image tag>

 pushd ${SPARK_HOME}
-wget https://github.com/NVIDIA/spark-rapids-examples/raw/branch-23.02/dockerfile/Dockerfile
+wget https://github.com/NVIDIA/spark-rapids-examples/raw/branch-23.04/dockerfile/Dockerfile

 # Optionally install additional jars into ${SPARK_HOME}/jars/
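
A possible continuation of the snippet above, shown only as a sketch using the `SPARK_DOCKER_IMAGE` and `SPARK_DOCKER_TAG` variables already defined there, is to build and publish the GPU Spark image from the downloaded Dockerfile:

```bash
# Build the image from the Dockerfile fetched into ${SPARK_HOME} and push it to your registry.
docker build -t ${SPARK_DOCKER_IMAGE}:${SPARK_DOCKER_TAG} -f Dockerfile .
docker push ${SPARK_DOCKER_IMAGE}:${SPARK_DOCKER_TAG}
popd
```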
