Merge branch-24.12 into main [skip ci] #2675

Merged
merged 175 commits into from
Dec 13, 2024
Changes from 173 commits
175 commits
8c0ea23
Init version 24.12.0-SNAPSHOT
nvauto Sep 24, 2024
59dc583
Auto-merge use branch-24.12 versions
nvauto Sep 24, 2024
2acf86a
Merge pull request #2424 from NVIDIA/bot-auto-merge-branch-24.10
nvauto Sep 24, 2024
9392c4f
Auto-merge use branch-24.12 versions
nvauto Sep 24, 2024
51ef478
Merge pull request #2425 from NVIDIA/bot-auto-merge-branch-24.10
nvauto Sep 24, 2024
dd12265
Auto-merge use branch-24.12 versions
nvauto Sep 25, 2024
b0a8ea9
Merge pull request #2427 from NVIDIA/bot-auto-merge-branch-24.10
nvauto Sep 25, 2024
a66ce16
Auto-merge use branch-24.12 versions
nvauto Sep 25, 2024
9d1903f
Merge pull request #2429 from NVIDIA/bot-auto-merge-branch-24.10
nvauto Sep 25, 2024
c1cade8
Auto-merge use branch-24.12 versions
nvauto Sep 25, 2024
344e1ad
Merge pull request #2431 from NVIDIA/bot-auto-merge-branch-24.10
nvauto Sep 25, 2024
db536d7
Auto-merge use branch-24.12 versions
nvauto Sep 25, 2024
28b9362
Merge pull request #2435 from NVIDIA/bot-auto-merge-branch-24.10
nvauto Sep 25, 2024
cc04d59
Auto-merge use branch-24.12 versions
nvauto Sep 25, 2024
650b845
Merge pull request #2437 from NVIDIA/bot-auto-merge-branch-24.10
nvauto Sep 25, 2024
8c95535
Update submodule cudf to d1b411a273486c0e4205384589d33372b6e32a59 (#2…
nvauto Sep 26, 2024
eaa331e
Auto-merge use branch-24.12 versions
nvauto Sep 26, 2024
0a349d0
Merge pull request #2440 from NVIDIA/bot-auto-merge-branch-24.10
nvauto Sep 26, 2024
0d54865
Auto-merge use branch-24.12 versions
nvauto Sep 26, 2024
f0ea91b
Merge pull request #2442 from NVIDIA/bot-auto-merge-branch-24.10
nvauto Sep 26, 2024
206a1a3
Update submodule cudf to 6b3d57d33f9de725da86eedb67f3debb6f2d41b8 (#2…
nvauto Sep 26, 2024
ab68b8d
Update submodule cudf to 5f1396ae59e13831d11d822833b2ecf36a471328 (#2…
nvauto Sep 26, 2024
c493564
Auto-merge use branch-24.12 versions
nvauto Sep 26, 2024
8da4c1c
Merge pull request #2446 from NVIDIA/bot-auto-merge-branch-24.10
nvauto Sep 26, 2024
2f80bb1
Update submodule cudf to 9125d2f19ecd6a82f29cdb41928737ec73eb491b (#2…
nvauto Sep 27, 2024
e2a42df
Update submodule cudf to 0632538a69f55f6d489d306edf2910a111430425 (#2…
nvauto Sep 27, 2024
59e5085
Update submodule cudf to 22d481a4e3a34d517ad9a9ac46b8b1b456d365c6 (#2…
nvauto Sep 27, 2024
fdfb626
Update submodule cudf to 6973ef806bc9d3cbda37a4c7caa763da12b84b7f (#2…
nvauto Sep 28, 2024
c262911
Update submodule cudf to e2bcbb880f540987eb3fbd0fede9fed826ea2fdf (#2…
nvauto Sep 28, 2024
16307ab
Update submodule cudf to 9b2f892c5ec59605bfdc3a2abe4885176950589a (#2…
nvauto Sep 30, 2024
3a3220d
Auto-merge use branch-24.12 versions
nvauto Sep 30, 2024
349ff0d
Merge pull request #2456 from NVIDIA/bot-auto-merge-branch-24.10
nvauto Sep 30, 2024
2347518
Update submodule cudf to 04baa225ca78de5717c50127bd5f77736f912930 (#2…
nvauto Oct 1, 2024
b2a411a
Update submodule cudf to 69dc356a5dd72232a6f4c8dac89432bfcdc0326b (#2…
nvauto Oct 1, 2024
17effee
Update submodule cudf to dae9d6899dd722c52bd42dd0fee51f4a6b336c93 (#2…
nvauto Oct 2, 2024
26e4e30
Update submodule cudf to bac81cb8f4c61c9a81e30e79d03c323406bf657a (#2…
nvauto Oct 2, 2024
dce3065
Auto-merge use branch-24.12 versions
nvauto Oct 2, 2024
785080f
Merge pull request #2464 from NVIDIA/bot-auto-merge-branch-24.10
nvauto Oct 2, 2024
ae24b4d
Add HostTable interface to allow wielding of host tables in native co…
jlowe Oct 3, 2024
7623334
Update submodule cudf to bd3b3327a6326ffea4658d682b8b9087e32da98a (#2…
nvauto Oct 3, 2024
bf75ec6
Update submodule cudf to 010839172ecb5a99609044a98031ff5b7578cd64 (#2…
nvauto Oct 4, 2024
ec8b4d5
Update submodule cudf to a8da1ff2b393abbafa27dddcf4c19481ec853c28 (#2…
nvauto Oct 4, 2024
49e5f50
Update submodule cudf to 33b8dfa42ff9a600adfa6d10c7740169a0340338 (#2…
nvauto Oct 5, 2024
1ac7105
Update submodule cudf to fcff2b6ef7d6db62fc064ad10ffc6c873fc85b58 (#2…
nvauto Oct 5, 2024
cd24840
Update submodule cudf to f926a61c7d31b7b33c3a3482507e9efb44b2cc36 (#2…
nvauto Oct 7, 2024
7357481
Update submodule cudf to 09ed2105b841fe29be75af8b0d5a41fc09e7b6ac (#2…
nvauto Oct 8, 2024
c81d392
Disable kvikio remote I/O to avoid openssl dependencies (#2476)
jlowe Oct 8, 2024
efa3ba5
Update submodule cudf to 553d8ec197c45f7d10ae4571f625e97d7b88be82 (#2…
nvauto Oct 8, 2024
c5706b0
Update submodule cudf to ded4dd2acbf2c5933765853eab56f4d37599c909 (#2…
nvauto Oct 9, 2024
f9ce439
Update submodule cudf to bfac5e5d9b2c10718d2f0f925b4f2c9f62d8fea1 (#2…
nvauto Oct 9, 2024
1a4558a
Auto-merge use branch-24.12 versions
nvauto Oct 9, 2024
e373162
Merge pull request #2482 from NVIDIA/bot-auto-merge-branch-24.10
nvauto Oct 9, 2024
4a7055d
Update submodule cudf to dfdae599622841bf3f4d523c01eee3ae1fe933f0 (#2…
nvauto Oct 9, 2024
2d67496
Update submodule cudf to 31423d056c45bd6352f0c611ed5e63423b09b954 (#2…
nvauto Oct 10, 2024
6e9548e
Update submodule cudf to 7173b52fce25937bb69e22a083a5de4655078fa1 (#2…
nvauto Oct 10, 2024
811103d
Update submodule cudf to 69b0f661ff2fc4c12bb0fe696e556f6b3224b381 (#2…
nvauto Oct 10, 2024
10fbcff
Implement `concat_json` to join JSON strings given by strings column …
ttnghia Oct 10, 2024
42dd898
Update submodule cudf to 1436cac9de8b450a32e71d5b779503e9a29edaa6 (#2…
nvauto Oct 11, 2024
6314455
Auto-merge use branch-24.12 versions
nvauto Oct 11, 2024
c1760f0
Merge pull request #2490 from NVIDIA/bot-auto-merge-branch-24.10
nvauto Oct 11, 2024
a3132f5
Auto-merge use branch-24.12 versions
nvauto Oct 11, 2024
988a63b
Merge pull request #2493 from NVIDIA/bot-auto-merge-branch-24.10
nvauto Oct 11, 2024
2c85286
Auto-merge use branch-24.12 versions
nvauto Oct 11, 2024
71736bd
Merge pull request #2495 from NVIDIA/bot-auto-merge-branch-24.10
nvauto Oct 11, 2024
024b9c6
Avoid parsing field name twice when matching named instruction in `ge…
ttnghia Oct 11, 2024
9356867
Nvcomp revert followup (#2497)
revans2 Oct 11, 2024
693c47a
Update submodule cudf to be1dd3267ed3cf7045c573ccc622f34fd159675f (#2…
nvauto Oct 12, 2024
282e0d0
Update submodule cudf to 4dbb8a354a9d4f0b4d82a5bf9747409c6304358f (#2…
nvauto Oct 12, 2024
2c3b60c
Make it so applying and removing patches are repeatable without error…
revans2 Oct 14, 2024
4aef6d9
Update to latest cudf 24.12 and add cudftestutil_impl dependency to t…
jlowe Oct 15, 2024
41945c6
Update submodule cudf to 7bcfc87935b7a202002d54e17e140789b02f16e9 (#2…
nvauto Oct 15, 2024
e118e6e
Make submodule-sync always try update cudf-pins (#2504)
pxLi Oct 16, 2024
33a92f7
Update submodule cudf to 3420c71cb72f63db8d63164446cca042f354a08e (#2…
nvauto Oct 16, 2024
fd67ca0
Update pinned versions for cudf 3420c71cb72f63db8d63164446cca042f354a…
nvauto Oct 16, 2024
e53547b
Bump org.apache.hadoop:hadoop-common from 3.2.4 to 3.4.0 (#2432)
dependabot[bot] Oct 16, 2024
252edb8
[submodule-sync] bot-submodule-sync-branch-24.12 to branch-24.12 [ski…
nvauto Oct 17, 2024
24fafdd
Update submodule cudf to 3683e4685ff0f0bc8122fe654742f708bf9fdbcc (#2…
nvauto Oct 17, 2024
6d2c092
Update submodule cudf to 14209c1962f1615f82f2c5be1cdbf58a6ed05789 (#2…
nvauto Oct 17, 2024
1beb0c8
Update submodule cudf to 00feb82cbda10bf65343e08d54ed9e893ff4aa71 (#2…
nvauto Oct 17, 2024
8a672b6
Update submodule cudf to ce93c366c451e27a49583cbb809bf5579a4bcf15 (#2…
nvauto Oct 18, 2024
797101f
Update submodule cudf to b8917229f8a2446c7e5f697475f76743a05e6856 (#2…
nvauto Oct 18, 2024
4765d5c
Use `cudf::make_strings_column_batch` in `get_json_object` (#2499)
ttnghia Oct 18, 2024
340e271
Update submodule cudf to 6ad90742f5a1efa5eecbbad25dddc46c1ed5c801 (#2…
nvauto Oct 18, 2024
8913882
Update submodule cudf to 98eef67d12670bd592022201b3c9dcc12374a34a (#2…
nvauto Oct 19, 2024
3aa3421
Update submodule cudf to fdd2b262aa76400d3d57018461eba37892445a4b (#2…
nvauto Oct 19, 2024
5a7c5ce
Update submodule cudf to 1ce2526bde7f77d2da7d0927a052fd9ccf69b9f2 (#2…
nvauto Oct 19, 2024
d0a55aa
Update submodule cudf to 074ab749531aa136c546afc7837fec0b404fe022 (#2…
nvauto Oct 20, 2024
ae6b48c
Update pinned versions for cudf 074ab749531aa136c546afc7837fec0b404fe…
nvauto Oct 22, 2024
5fe13b1
[submodule-sync] bot-submodule-sync-branch-24.12 to branch-24.12 [ski…
nvauto Oct 22, 2024
a11322d
Update submodule cudf to 637e3206a4656bd38636f3fadf3c4573c7bc906a (#2…
nvauto Oct 22, 2024
75155c4
[submodule-sync] bot-submodule-sync-branch-24.12 to branch-24.12 [ski…
nvauto Oct 22, 2024
dffb829
Update pinned versions for cudf 4fe338c0efe0fee2ee69c8207f9f4cbe9aa4d…
nvauto Oct 22, 2024
9c4061a
[submodule-sync] bot-submodule-sync-branch-24.12 to branch-24.12 [ski…
nvauto Oct 23, 2024
156ad0c
Update submodule cudf to 3126f775c527a8df65df2e2cbc8c2b73da2219bf (#2…
nvauto Oct 23, 2024
7b65899
[submodule-sync] bot-submodule-sync-branch-24.12 to branch-24.12 [ski…
nvauto Oct 23, 2024
f08cedf
Update submodule cudf to d7cdf44da2ba921c6fa63feff8749d141643f76e (#2…
nvauto Oct 24, 2024
d7b5035
Update pinned versions for cudf d7cdf44da2ba921c6fa63feff8749d141643f…
nvauto Oct 24, 2024
d7e66ec
Update submodule cudf to 3a623149827ec347e721dd1a18072f18b0b4bcc1 (#2…
nvauto Oct 24, 2024
1ba9349
Update submodule cudf to 7115f20e91a314f07333cbd5c01adc62bf2fbb0c (#2…
nvauto Oct 24, 2024
64635ec
Update pinned versions for cudf 7115f20e91a314f07333cbd5c01adc62bf2fb…
nvauto Oct 25, 2024
6f2b12c
[submodule-sync] bot-submodule-sync-branch-24.12 to branch-24.12 [ski…
nvauto Oct 25, 2024
c56716b
[submodule-sync] bot-submodule-sync-branch-24.12 to branch-24.12 [ski…
nvauto Oct 25, 2024
2a04c9f
Update submodule cudf to 8c4d1f201043a6802598bea3dcb58fa1e061d9e5 (#2…
nvauto Oct 28, 2024
ed440b9
Update submodule cudf to 1ad9fc1feef0ea0ee38adaa8f05cde6bb05aff0f (#2…
nvauto Oct 29, 2024
ee07164
Update submodule cudf to bf5b778c265b3bfa712f509be0ba268216bcf3d0 (#2…
nvauto Oct 29, 2024
0f32660
Update submodule cudf to 3775f7b9f6509bd0f2f75c46edb60abf2522de86 (#2…
nvauto Oct 29, 2024
02a5b34
fix max bytes dealloc bug (#2541)
zpuller Oct 29, 2024
47e1738
[submodule-sync] bot-submodule-sync-branch-24.12 to branch-24.12 [ski…
nvauto Oct 30, 2024
6ccc96f
Add utility methods for kudo (#2542)
liurenjie1024 Oct 30, 2024
d029559
Update submodule cudf to eeb4d2780163794f4b705062e49dbdc3283ebce0 (#2…
nvauto Oct 30, 2024
b0ed734
Update submodule cudf to 6328ad679947eb5cbc352c345a28f079aa6b8005 (#2…
nvauto Oct 30, 2024
c8ff5d6
Upmerge to a new version of CUDF with a new version of nvcomp (#2550)
revans2 Oct 30, 2024
34a3f47
[submodule-sync] bot-submodule-sync-branch-24.12 to branch-24.12 [ski…
nvauto Oct 30, 2024
d71b16b
Update submodule cudf to 893d0fde7c17a7f8126baddd2f1cf34600f9420e (#2…
nvauto Oct 31, 2024
7ee7b1c
Add schema visitor. (#2548)
liurenjie1024 Oct 31, 2024
660d630
Update submodule cudf to a0711d0f8492762877ea7c84e78166413f44f178 (#2…
nvauto Oct 31, 2024
11e91d4
[submodule-sync] bot-submodule-sync-branch-24.12 to branch-24.12 [ski…
nvauto Oct 31, 2024
86a9e16
Update submodule cudf to 9657c9a5dc4c4a1bf9fd7b55cfeb53c60dda3c66 (#2…
nvauto Oct 31, 2024
11ac475
Update to latest cudf 24.12 and fix include path (#2560)
jlowe Nov 1, 2024
e2d41a7
[submodule-sync] bot-submodule-sync-branch-24.12 to branch-24.12 [ski…
nvauto Nov 1, 2024
f5401bb
Update submodule cudf to 3d07509deb9f589e1f986dc7f822392467ffcdde (#2…
nvauto Nov 2, 2024
8ff8490
[submodule-sync] bot-submodule-sync-branch-24.12 to branch-24.12 [ski…
nvauto Nov 4, 2024
e6a7128
Simplify zone db locking to avoid a race (#2561)
revans2 Nov 4, 2024
1f0b239
Update submodule cudf to e6f5c0e52d3784db4054daf23b7a37daea5c062d (#2…
nvauto Nov 4, 2024
58b5dbf
Make the zone DB shutdown/load cycle redoable (#2567)
revans2 Nov 4, 2024
ee379c3
Update submodule cudf to 45563b363d62b0f27f3d371e880142748a62eec5 (#2…
nvauto Nov 4, 2024
4a3661c
Update pinned versions for cudf 45563b363d62b0f27f3d371e880142748a62e…
nvauto Nov 5, 2024
ee4a6c4
Improve `concat_json` (#2557)
ttnghia Nov 5, 2024
3a4bdd0
[submodule-sync] bot-submodule-sync-branch-24.12 to branch-24.12 [ski…
nvauto Nov 5, 2024
c216fa0
[submodule-sync] bot-submodule-sync-branch-24.12 to branch-24.12 [ski…
nvauto Nov 6, 2024
4f904c0
[submodule-sync] bot-submodule-sync-branch-24.12 to branch-24.12 [ski…
nvauto Nov 6, 2024
2aa3348
Introduce kudo writer. (#2559)
liurenjie1024 Nov 6, 2024
42a29cc
Prepare the arg to disable kvikIO remote IO (#2583)
pxLi Nov 11, 2024
2239da9
[submodule-sync] bot-submodule-sync-branch-24.12 to branch-24.12 [ski…
nvauto Nov 12, 2024
f7b28ac
Remove the kvikIO workaround (#2584)
pxLi Nov 12, 2024
1eedb40
[submodule-sync] bot-submodule-sync-branch-24.12 to branch-24.12 [ski…
nvauto Nov 12, 2024
ed4ad22
[submodule-sync] bot-submodule-sync-branch-24.12 to branch-24.12 [ski…
nvauto Nov 13, 2024
ea47ecb
Update submodule cudf to 487f97c036ae7919e98ddc8bf5412a8002a493c5 (#2…
nvauto Nov 13, 2024
189da8a
Update submodule cudf to f5c0e5c12f6e1810e6fa71eeb8a3f1b5ee079800 (#2…
nvauto Nov 13, 2024
cd56242
Improve `from_json_to_raw_map` (#2562)
ttnghia Nov 13, 2024
84e08e2
Workaournd to apply cudf's thirdparty patches for cudf-pins build (#2…
pxLi Nov 14, 2024
e5e1603
Introduce kudo reader. (#2578)
liurenjie1024 Nov 14, 2024
ab1c8f1
Some minor fix for kudo. (#2596)
liurenjie1024 Nov 15, 2024
bf3323e
Update cmake to 3.28.6 (#2597)
jlowe Nov 15, 2024
b96aa67
Update method for locating cufile and remove unused libcudf cmake opt…
jlowe Nov 15, 2024
35808ef
[submodule-sync] bot-submodule-sync-branch-24.12 to branch-24.12 [ski…
nvauto Nov 16, 2024
2a13022
Update submodule cudf to 9cc907122077d18e5128e7da36685fdeb82fef41 (#2…
nvauto Nov 16, 2024
96ff348
Make kudo table constructor public. (#2601)
liurenjie1024 Nov 18, 2024
cb732db
Update submodule cudf to d514517001598158a31eba1590b01bf2b14d61f6 (#2…
nvauto Nov 18, 2024
e3fd02e
Update submodule cudf to 302e625bf87dce4059eb7c383dced848ad9d8f4c (#2…
nvauto Nov 19, 2024
5c7b6ef
Update submodule cudf to 384abae3b9954fa227b1df62195b33691e17623a (#2…
nvauto Nov 19, 2024
7dd69e9
Fix non-empty nulls handling in `string_to_decimal` and add more test…
ttnghia Nov 19, 2024
89db5c7
[submodule-sync] bot-submodule-sync-branch-24.12 to branch-24.12 [ski…
nvauto Nov 19, 2024
e9a73f1
Copyright header check [skip ci] (#2605)
YanxuanLiu Nov 20, 2024
577b5cf
[submodule-sync] bot-submodule-sync-branch-24.12 to branch-24.12 [ski…
nvauto Nov 20, 2024
5649305
Update submodule cudf to 3111aa4723150cb09b88c6968a51afb681b1ab6a (#2…
nvauto Nov 20, 2024
1f32461
[submodule-sync] bot-submodule-sync-branch-24.12 to branch-24.12 [ski…
nvauto Nov 20, 2024
bf94d21
Update submodule cudf to d9279929554a40b0417dd4f11e74e8f149477f73 (#2…
nvauto Nov 21, 2024
63f417a
Fix bug license header check [skip ci] (#2616)
YanxuanLiu Nov 21, 2024
a02c564
Update submodule cudf to 68c4285717bd1150c234e5a6e7f8bad7fa5550e2 (#2…
nvauto Nov 21, 2024
ac117b7
[submodule-sync] bot-submodule-sync-branch-24.12 to branch-24.12 [ski…
nvauto Nov 21, 2024
fa3f445
Update submodule cudf to f54c1a5ad34133605d3b5b447d9717ce7eb6dba0 (#2…
nvauto Nov 21, 2024
6f8c24a
make device memory tracker global (#2620)
zpuller Nov 22, 2024
0708bce
Update submodule cudf to 305182e58c19add98a5abd6a5b00d9b266f41732 (#2…
nvauto Nov 22, 2024
4080f49
Implement `from_json_to_structs` (#2510)
ttnghia Nov 23, 2024
c170ea5
Add `HiveHash` support for nested types (#2534)
ustcfy Nov 25, 2024
6bc2627
Update submodule cudf to 439321edb43082fb75f195b6be2049c925279089 (#2…
nvauto Dec 4, 2024
e0c3157
Update pinned versions for cudf 439321edb43082fb75f195b6be2049c925279…
nvauto Dec 5, 2024
7351748
Update pinned versions for cudf 439321edb43082fb75f195b6be2049c925279…
nvauto Dec 5, 2024
7842da0
Update pinned versions for cudf 439321edb43082fb75f195b6be2049c925279…
nvauto Dec 6, 2024
cdef13e
update copyright (#2673)
YanxuanLiu Dec 10, 2024
e1a279d
Merge branch-24.12 into main
nvauto Dec 10, 2024
c115887
Change version to 24.12.0
nvauto Dec 10, 2024
95f27d9
[submodule-sync] bot-submodule-sync-branch-24.12 to branch-24.12 [ski…
nvauto Dec 13, 2024
64626e6
Merge remote-tracking branch 'upstream/branch-24.12' into merge-branc…
YanxuanLiu Dec 13, 2024
55 changes: 55 additions & 0 deletions .github/workflows/license-header-check.yml
@@ -0,0 +1,55 @@
# Copyright (c) 2024, NVIDIA CORPORATION.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# A workflow to check copyright/license header
name: license header check

on:
  pull_request:
    types: [opened, synchronize, reopened]

jobs:
  license-header-check:
    runs-on: ubuntu-latest
    if: "!contains(github.event.pull_request.title, '[bot]')"
    steps:
      - name: Get checkout depth
        run: |
          echo "PR_FETCH_DEPTH=$(( ${{ github.event.pull_request.commits }} + 10 ))" >> $GITHUB_ENV

      - name: Checkout code
        uses: actions/checkout@v4
        with:
          fetch-depth: ${{ env.PR_FETCH_DEPTH }}

      - name: license-header-check
        uses: NVIDIA/spark-rapids-common/license-header-check@main
        with:
          included_file_patterns: |
            *.cpp,
            *.hpp,
            *.cu,
            *.cuh,
            *.java,
            *.sh,
            *Dockerfile*,
            *Jenkinsfile*,
            *.yml,
            *.yaml,
            *.txt,
            *.xml,
            *.fbs,
            build/*
          excluded_file_patterns: |
            thirdparty/*
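The "Get checkout depth" step can be tried outside of GitHub Actions. The following is a minimal sketch, assuming a hypothetical commit count (`pr_commits`) in place of `github.event.pull_request.commits` and a temporary file in place of the runner-provided `GITHUB_ENV`. The reason for the extra 10 commits is not stated in the workflow; presumably it leaves enough history for the check to reach the base of the pull request.

```shell
set -e

# Hypothetical stand-ins: on a real runner the commit count comes from
# github.event.pull_request.commits and GITHUB_ENV is set by the runner.
pr_commits=175
GITHUB_ENV=$(mktemp)

# Same arithmetic as the workflow step: commit count plus a margin of 10.
echo "PR_FETCH_DEPTH=$(( pr_commits + 10 ))" >> "$GITHUB_ENV"

# The runner loads this file into the environment of later steps;
# sourcing it here imitates that behaviour.
. "$GITHUB_ENV"
echo "fetch depth: $PR_FETCH_DEPTH"
```

With 175 commits the checkout step would receive a fetch depth of 185.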
2 changes: 1 addition & 1 deletion .gitmodules
@@ -1,4 +1,4 @@
[submodule "thirdparty/cudf"]
path = thirdparty/cudf
url = https://github.com/rapidsai/cudf.git
branch = branch-24.10
branch = branch-24.12
2 changes: 1 addition & 1 deletion CONTRIBUTING.md
@@ -165,7 +165,7 @@ $ ./build/build-in-docker install ...
```

Now cd to ~/repos/NVIDIA/spark-rapids and build with one of the options from
[spark-rapids instructions](https://github.com/NVIDIA/spark-rapids/blob/branch-24.10/CONTRIBUTING.md#building-from-source).
[spark-rapids instructions](https://github.com/NVIDIA/spark-rapids/blob/branch-24.12/CONTRIBUTING.md#building-from-source).

```bash
$ ./build/buildall
49 changes: 42 additions & 7 deletions build/apply-patches
@@ -16,8 +16,6 @@
# limitations under the License.
#

# Run a command in a Docker container with devtoolset

set -e

BASE_DIR=$( git rev-parse --show-toplevel )
@@ -26,14 +24,51 @@ PATCH_DIR=${PATCH_DIR:-$(realpath "$BASE_DIR/patches/")}

CUDF_DIR=${CUDF_DIR:-$(realpath "$BASE_DIR/thirdparty/cudf/")}

# Applying patches to CUDF is problematic in a number of ways. But ultimately it comes down to
# making sure that a user can do development work in spark-rapids-jni without the patches
# getting in the way
# The operations I really want to support no matter what state CUDF is in are
# 1) Build the repo from scratch
# 2) Rebuild the repo without having to clean and start over
# 3) upmerge to a new version of the plugin including updating the cudf submodule
#
# Building from scratch is simple. We want clean to unapply any patches and
# build to apply them. But if we want to rebuild without a clean we need to know what
# state the CUDF repo is in. Did we apply patches to it or not. The fastest way to do this
# is to save some state files about what happened. But a user could mess with CUDF directly
# so we want to have ways to double check that they are indeed correct.

FULLY_PATCHED_FILE="$CUDF_DIR/spark-rapids-jni.patch"

pushd "$CUDF_DIR"
if [ -n "$(git status --porcelain --untracked-files=no)" ] ; then
echo "Error: CUDF repository has uncommitted changes. No patches will be applied..."
exit 1

PATCH_FILES=$(find "$PATCH_DIR" -type f -not -empty)

if [ -z "$PATCH_FILES" ] ; then
echo "No patches to apply"
exit 0
fi

CHANGED_FILES=$(git status --porcelain --untracked-files=no)

if [ \( -s "$FULLY_PATCHED_FILE" \) -a \( -n "$CHANGED_FILES" \) ] ; then
if git apply -R --check "$FULLY_PATCHED_FILE" ; then
echo "Patches appear to have been applied already"
exit 0
fi
fi

if [ -n "$CHANGED_FILES" ] ; then
echo "Error: CUDF repository has uncommitted changes. No patches will be applied. Please clean the repository so we can try and add the needed patches"
echo "$CHANGED_FILES"
exit 1
fi

find "$PATCH_DIR" -maxdepth 1 -type f -print0 | sort -zV | while IFS= read -r -d '' file; do
echo "patching with: $file"
patch --no-backup-if-mismatch -f -t --reject-file=- -p1 -i "$file"
echo "patching with: $file"
git apply -v "$file"
done

git diff > "$FULLY_PATCHED_FILE"

popd
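The state-file approach in `build/apply-patches` can be exercised in isolation. The following is a minimal sketch, not the script itself: it assumes a throwaway repository standing in for `thirdparty/cudf`, a single made-up patch named `0001-sample.patch`, and a hypothetical `apply_patches` function that mirrors the checks in the diff. It relies on GNU `sort -zV`, as the original does.

```shell
set -e

# Throwaway repository standing in for thirdparty/cudf (hypothetical layout).
work=$(mktemp -d)
repo="$work/cudf"
patch_dir="$work/patches"
mkdir -p "$repo" "$patch_dir"
cd "$repo"
git init -q .
git config user.email "dev@example.com"
git config user.name "Example Dev"
echo "original" > file.txt
git add file.txt
git commit -q -m "initial"

# Build one sample patch, then restore the clean tree.
echo "patched" > file.txt
git diff > "$patch_dir/0001-sample.patch"
git checkout -q -- file.txt

# Untracked, so "git status --untracked-files=no" does not report it.
state_file="$repo/spark-rapids-jni.patch"

apply_patches() {
  local changed
  changed=$(git status --porcelain --untracked-files=no)
  # Dirty tree plus a state file that reverses cleanly: already patched.
  if [ -s "$state_file" ] && [ -n "$changed" ]; then
    if git apply -R --check "$state_file"; then
      echo "already applied"
      return 0
    fi
  fi
  # Dirty tree that the state file cannot explain: refuse to touch it.
  if [ -n "$changed" ]; then
    echo "dirty"
    return 1
  fi
  # Apply patches in version-sorted order, then record the combined diff.
  find "$patch_dir" -maxdepth 1 -type f -print0 | sort -zV |
    while IFS= read -r -d '' file; do
      git apply "$file"
    done
  git diff > "$state_file"
  echo "applied"
}

first=$(apply_patches)
second=$(apply_patches)
echo "$first / $second"
```

The second call finds a modified tree, confirms that the recorded diff reverses cleanly, and returns without applying anything, which is the property that makes a rebuild without a clean repeatable.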
66 changes: 52 additions & 14 deletions build/unapply-patches
@@ -16,29 +16,67 @@
# limitations under the License.
#

# Run a command in a Docker container with devtoolset

set -e

SCRIPT_DIR=$( cd -- "$( dirname -- "${BASH_SOURCE[0]}" )" &> /dev/null && pwd )
BASE_DIR=$( git rev-parse --show-toplevel )

PATCH_DIR=${PATCH_DIR:-$(realpath "$BASE_DIR/patches/")}

PATCH_DIR=${PATCH_DIR:-$(realpath "$SCRIPT_DIR/../patches/")}
CUDF_DIR=${CUDF_DIR:-$(realpath "$BASE_DIR/thirdparty/cudf/")}

CUDF_DIR=${CUDF_DIR:-$(realpath "$SCRIPT_DIR/../thirdparty/cudf/")}
# Applying patches to CUDF is problematic in a number of ways. But ultimately it comes down to
# making sure that a user can do development work in spark-rapids-jni without the patches
# getting in the way
# The operations I really want to support no matter what state CUDF is in are
# 1) Build the repo from scratch
# 2) Rebuild the repo without having to clean and start over
# 3) upmerge to a new version of the plugin including updating the cudf submodule
#
# Building from scratch is simple. We want clean to unapply any patches and
# build to apply them. But if we want to rebuild without a clean we need to know what
# state the CUDF repo is in. Did we apply patches to it or not. The fastest way to do this
# is to save some state files about what happened. But a user could mess with CUDF directly
# so we want to have ways to double check that they are indeed correct.

FULLY_PATCHED_FILE="$CUDF_DIR/spark-rapids-jni.patch"

pushd "$CUDF_DIR"
if [ -n "$(git status --porcelain --untracked-files=no)" ] ; then
#only try to remove patches if it looks like something was changed
find "$PATCH_DIR" -maxdepth 1 -type f -print0 | sort -zV -r | while IFS= read -r -d '' file; do
echo "patching with: $file"
patch -R --no-backup-if-mismatch --reject-file=- -f -t -p1 -i "$file"
done

PATCH_FILES=$(find "$PATCH_DIR" -type f -not -empty)

if [ -z "$PATCH_FILES" ] ; then
echo "No patches to remove"
exit 0
fi

# Check for modifications
if [ -n "$(git status --porcelain --untracked-files=no)" ] ; then
echo "Error: CUDF repository has uncommitted changes. You might want to clean it manually if you know that is expected"
CHANGED_FILES=$(git status --porcelain --untracked-files=no)

if [ \( -s "$FULLY_PATCHED_FILE" \) -a \( -n "$CHANGED_FILES" \) ] ; then
if git apply --check -R "$FULLY_PATCHED_FILE"; then
echo "Patches appear to have been applied, so going to remove them"
git apply -R -v "$FULLY_PATCHED_FILE"
rm -f "$FULLY_PATCHED_FILE"

# Check for modifications, again
if [ -n "$(git status --porcelain --untracked-files=no)" ] ; then
echo "Error: CUDF repository has uncommitted changes. You might want to clean it manually if you know that is expected"
git status --porcelain --untracked-files=no
exit 1
fi

exit 0
else
echo "Files are changed, but in a way where the full patch file does not apply to remove them $FULLY_PATCHED_FILE"
exit 1
fi
fi

if [ -n "$CHANGED_FILES" ] ; then
echo "Error: CUDF repository has uncommitted changes, but does not appear to have been patched. Please clean it and try again."
echo "$CHANGED_FILES"
exit 1
else
echo "No changes in CUDF repository to remove"
fi

popd
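The removal path in `build/unapply-patches` follows the same idea in reverse. The following is a minimal sketch under the same assumptions as before: a throwaway repository, a tree that was "patched" by hand, and a hypothetical `unapply_patches` function that mirrors the branches in the diff. The messages are shortened for illustration.

```shell
set -e

# Throwaway repository standing in for thirdparty/cudf (hypothetical layout).
work=$(mktemp -d)
cd "$work"
git init -q .
git config user.email "dev@example.com"
git config user.name "Example Dev"
echo "original" > file.txt
git add file.txt
git commit -q -m "initial"

# Simulate a previously patched tree: modify a tracked file and record the diff.
echo "patched" > file.txt
state_file="$work/spark-rapids-jni.patch"
git diff > "$state_file"

unapply_patches() {
  local changed
  changed=$(git status --porcelain --untracked-files=no)
  if [ -s "$state_file" ] && [ -n "$changed" ]; then
    if git apply -R --check "$state_file"; then
      # The recorded diff reverses cleanly, so undo it and drop the state file.
      git apply -R "$state_file"
      rm -f "$state_file"
      # Anything still modified was not part of the recorded patches.
      if [ -n "$(git status --porcelain --untracked-files=no)" ]; then
        echo "leftover changes"
        return 1
      fi
      echo "removed"
      return 0
    fi
    echo "state file does not reverse cleanly"
    return 1
  fi
  if [ -n "$changed" ]; then
    echo "dirty but not patched"
    return 1
  fi
  echo "nothing to remove"
}

first=$(unapply_patches)
second=$(unapply_patches)
content=$(cat file.txt)
echo "$first / $second / $content"
```

Running the function twice shows that removal is also repeatable: the second call sees a clean tree with no state file and reports that there is nothing to remove.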
2 changes: 1 addition & 1 deletion ci/Dockerfile
@@ -36,7 +36,7 @@ RUN dnf --enablerepo=powertools install -y scl-utils gcc-toolset-${TOOLSET_VERSI
RUN mkdir -m 777 /usr/local/rapids /rapids

# 3.22.3: CUDA architecture 'native' support + flexible CMAKE_<LANG>_*_LAUNCHER for ccache
ARG CMAKE_VERSION=3.26.4
ARG CMAKE_VERSION=3.28.6
# default x86_64 from x86 build, aarch64 cmake for arm build
ARG CMAKE_ARCH=x86_64
RUN cd /usr/local && wget --quiet https://github.com/Kitware/CMake/releases/download/v${CMAKE_VERSION}/cmake-${CMAKE_VERSION}-linux-${CMAKE_ARCH}.tar.gz && \
43 changes: 25 additions & 18 deletions ci/submodule-sync.sh
@@ -57,26 +57,29 @@ if [ -n "$CUDF_TAG" ]; then
else
git submodule update --remote --merge
fi

cudf_pins_only=false
cudf_sha=$(git -C thirdparty/cudf rev-parse HEAD)
if [[ "${cudf_sha}" == "${cudf_prev_sha}" ]]; then
echo "Submodule is up to date."
exit 0
echo "cuDF submodule is up to date. Try update cudf-pins..."
cudf_pins_only=true
else
echo "Try update cudf submodule to ${cudf_sha}..."
git add .
git commit -s -m "Update submodule cudf to ${cudf_sha}"
fi

echo "Try update cudf submodule to ${cudf_sha}..."
git add .

echo "Test against ${cudf_sha}..."

echo "Build libcudf only to update pinned versions..."
MVN="mvn -Dmaven.wagon.http.retryHandler.count=3 -B"
set +e
# Don't do a full build. Just try to update/build CUDF with no patches on top of it.
${MVN} validate ${MVN_MIRROR} \
# calling the antrun directly skips applying patches and also only builds
# libcudf
${MVN} antrun:run@build-libcudf ${MVN_MIRROR} \
-DCPP_PARALLEL_LEVEL=${PARALLEL_LEVEL} \
-Dlibcudf.build.configure=true \
-Dlibcudf.dependency.mode=latest \
-Dsubmodule.patch.skip \
-DUSE_GDS=ON -Dtest=*,!CuFileTest,!CudaFatalTest,!ColumnViewNonEmptyNullsTest \
-DUSE_GDS=ON \
-DBUILD_TESTS=ON \
-DUSE_SANITIZER=ON
validate_status=$?
@@ -88,21 +91,25 @@ rapids_cmake_sha=$(git -C ${LIBCUDF_BUILD_PATH}/_deps/rapids-cmake-src/ rev-pars
echo "Update rapids-cmake pinned SHA1 to ${rapids_cmake_sha}"
echo "${rapids_cmake_sha}" > thirdparty/cudf-pins/rapids-cmake.sha

# Bash the wrong nvcomp version to the correct version until
# nvcomp version mismatch is fixed. https://github.com/rapidsai/cudf/issues/16772.
echo "Revert nvcomp to 3.0.6"
sed -i -e 's/4\.0\.1\.0/3.0.6/' \
-e 's|https://developer.download.nvidia.com/compute/nvcomp/${version}/local_installers/nvcomp-linux-sbsa-${version}-cuda${cuda-toolkit-version-mapping}.tar.gz|https://developer.download.nvidia.com/compute/nvcomp/${version}/local_installers/nvcomp_${version}_SBSA_${cuda-toolkit-version-mapping}.tgz|' \
-e 's|https://developer.download.nvidia.com/compute/nvcomp/${version}/local_installers/nvcomp-linux-x86_64-${version}-cuda${cuda-toolkit-version-mapping}.tar.gz|https://developer.download.nvidia.com/compute/nvcomp/${version}/local_installers/nvcomp_${version}_x86_64_${cuda-toolkit-version-mapping}.tgz|' \
thirdparty/cudf-pins/versions.json
echo "Workaround for https://github.com/NVIDIA/spark-rapids-jni/issues/2582"
cudf_patch_path="cudf/cpp/cmake/thirdparty/patches"
sed -i "s|\${current_json_dir}|\${current_json_dir}/../${cudf_patch_path}|g" thirdparty/cudf-pins/versions.json

# Do the git add after the build so that we get
# the updated versions.json generated by the build
echo "Update cudf submodule to ${cudf_sha} with updated pinned versions"
git add .
git diff-index --quiet HEAD || git commit -s -m "Update submodule cudf to ${cudf_sha}"
if ! git diff-index --quiet HEAD; then
# We perform a squash merge for submodule-sync commits
git commit -s -m "Update pinned versions for cudf ${cudf_sha}"
elif ${cudf_pins_only}; then
echo "No changes to commit. Exit early..."
exit 0
fi

sha=$(git rev-parse HEAD)

echo "Test against ${cudf_sha}..."
set +e
# now build and test everything with the patches in place
${MVN} clean verify ${MVN_MIRROR} \
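The commit decision that `ci/submodule-sync.sh` now makes after regenerating the pinned versions can be sketched on its own. The example below is an illustration under assumptions, not the CI script: it uses a throwaway repository, a made-up `versions.json`, and a hypothetical `commit_pins` function whose argument plays the role of `cudf_pins_only`.

```shell
set -e

# Throwaway repository with a made-up pins file (hypothetical content).
work=$(mktemp -d)
cd "$work"
git init -q .
git config user.email "dev@example.com"
git config user.name "Example Dev"
echo '{"pins": 1}' > versions.json
git add versions.json
git commit -q -m "initial"

commit_pins() {
  local cudf_pins_only=$1
  git add .
  if ! git diff-index --quiet HEAD; then
    # Something changed relative to HEAD: record the new pins.
    git commit -q -m "Update pinned versions"
    echo "committed"
  elif ${cudf_pins_only}; then
    # Neither the submodule nor the pins moved: nothing left to do.
    echo "exit early"
  else
    # The submodule commit already exists; carry on to the build and test.
    echo "continue to tests"
  fi
}

unchanged_pins_only=$(commit_pins true)
unchanged_new_sha=$(commit_pins false)
echo '{"pins": 2}' > versions.json
changed=$(commit_pins true)
echo "$unchanged_pins_only / $unchanged_new_sha / $changed"
```

`git diff-index --quiet HEAD` exits non-zero only when the tree differs from `HEAD`, so the early exit is reached only when the submodule SHA was already current and the build produced no new pins.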
Empty file added patches/noop.patch
Empty file.