Organize parquet reader mukernel non-nullable code, introduce manual block scans #16830

pmattione-nvidia · 2024-09-18T15:58:27Z

This is a collection of small optimizations and tweaks for the parquet reader fixed-width mukernels (flat and nested; lists are not implemented yet). The benchmark changes are negligible; this is mainly cleanup and preparation for the upcoming list mukernel.

When not reading the whole page (chunked reads), exit sooner.
Each thread now keeps track of the current valid_count itself, and only saves to or reads from the nesting_info at the end. The block threads therefore don't need to synchronize as frequently, so the extra syncs are removed.
For (non-list) nested columns that aren't nullable, we don't need to loop over the whole nesting depth; only the last level of nesting is used. After removing this loop, the non-nullable code for nested and flat hierarchies is identical, so it is extracted and consolidated into a new function.
When doing block scans in the parquet reader we also need to know the per-warp results of the scan. Because cub doesn't return those, we currently do an additional warp-wide ballot that would otherwise be unnecessary. This PR introduces code that does the block scan manually and saves the intermediate results. However, using this code in the flat & nested kernels costs 8 more registers, so it isn't used there yet.
By doing an exclusive scan instead of an inclusive scan, we no longer need the extra "- 1"s that were everywhere.
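To illustrate the last two points, here is a minimal host-side C++ sketch of a manual block scan that keeps the per-warp intermediate results. It is an emulation for explanation only, not the code in this PR: `BlockScanResult`, `manual_block_exclusive_scan`, and `popcount32` are hypothetical names, and the loops stand in for what a CUDA kernel would presumably do with `__ballot_sync`, `__popc`, and a small shared-memory array of per-warp counts.

```cpp
#include <bitset>
#include <cassert>
#include <cstdint>
#include <vector>

constexpr int kWarpSize = 32;

// Hypothetical result type: everything one block scan can hand back.
struct BlockScanResult {
  std::vector<int> thread_offset;         // exclusive prefix count per thread
  std::vector<std::uint32_t> warp_bits;   // per-warp ballot mask (intermediate result)
  std::vector<int> warp_offset;           // exclusive prefix count per warp
  int block_count = 0;                    // total number of set predicates
};

inline int popcount32(std::uint32_t v)
{
  return static_cast<int>(std::bitset<32>(v).count());
}

// Emulates a block-wide exclusive scan over one predicate per thread.
// Assumption: the block size is a multiple of the warp size.
BlockScanResult manual_block_exclusive_scan(std::vector<bool> const& is_valid)
{
  assert(is_valid.size() % kWarpSize == 0);
  int const num_threads = static_cast<int>(is_valid.size());
  int const num_warps   = num_threads / kWarpSize;

  BlockScanResult r;
  r.thread_offset.resize(num_threads);
  r.warp_bits.assign(num_warps, 0u);
  r.warp_offset.resize(num_warps);

  // Step 1: one ballot per warp; bit `lane` is set if that lane's predicate is true.
  for (int t = 0; t < num_threads; ++t) {
    if (is_valid[t]) { r.warp_bits[t / kWarpSize] |= (1u << (t % kWarpSize)); }
  }

  // Step 2: exclusive scan over the per-warp counts (only num_warps elements).
  for (int w = 0; w < num_warps; ++w) {
    r.warp_offset[w] = r.block_count;
    r.block_count += popcount32(r.warp_bits[w]);
  }

  // Step 3: each thread counts the set bits in lanes below its own and adds
  // its warp's offset. No second ballot is needed because warp_bits is kept.
  for (int t = 0; t < num_threads; ++t) {
    int const warp                = t / kWarpSize;
    int const lane                = t % kWarpSize;
    std::uint32_t const lower_mask = (1u << lane) - 1u;  // lanes [0, lane)
    r.thread_offset[t] = r.warp_offset[warp] + popcount32(r.warp_bits[warp] & lower_mask);
  }
  return r;
}
```

Because the scan is exclusive, `thread_offset[t]` is already the zero-based output position for thread `t`'s value when its predicate is set. With an inclusive scan the same position would be the scanned count minus one, which is where the "- 1"s came from.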

Checklist

I am familiar with the Contributing Guidelines.
New or existing tests cover these changes.
The documentation is up to date with these changes.

nvdbaranec

First pass. Will absorb and come back for another.

cpp/src/io/parquet/decode_fixed.cu

vuule

Nice set of changes!

cpp/src/io/parquet/decode_fixed.cu

pmattione-nvidia · 2024-10-11T16:18:12Z

/merge

Optimize parquet reader block scans, simplify and consolidate non-nullable column code

5390661

pmattione-nvidia added the libcudf, Performance, improvement, and non-breaking labels Sep 18, 2024

pmattione-nvidia self-assigned this Sep 18, 2024

pmattione-nvidia added 2 commits September 18, 2024 12:07

tweak syncing

3ef7b0d

tweak scan interface for linked lists

254f3e9

pmattione-nvidia changed the base branch from branch-24.10 to branch-24.12 September 24, 2024 15:37

pmattione-nvidia marked this pull request as ready for review September 24, 2024 18:44

pmattione-nvidia requested a review from a team as a code owner September 24, 2024 18:44

pmattione-nvidia requested review from karthikeyann, kingcrimsontianyu and nvdbaranec September 24, 2024 18:44

pmattione-nvidia and others added 3 commits September 25, 2024 12:22

Merge branch 'branch-24.12' into mukernels_fixedwidth_optimize

18d989c

style fixes

8ea1e0e

Merge branch 'mukernels_fixedwidth_optimize' of https://github.com/pmattione-nvidia/cudf into mukernels_fixedwidth_optimize

326b386

nvdbaranec requested changes Sep 25, 2024

pmattione-nvidia and others added 7 commits September 26, 2024 10:16

Update cpp/src/io/parquet/decode_fixed.cu

41cb982

Co-authored-by: nvdbaranec <[email protected]>

Update cpp/src/io/parquet/decode_fixed.cu

6e70554

Co-authored-by: nvdbaranec <[email protected]>

Update cpp/src/io/parquet/decode_fixed.cu

9ad4415

Co-authored-by: nvdbaranec <[email protected]>

Unroll block-count loop

3a1fc95

Merge branch 'mukernels_fixedwidth_optimize' of https://github.com/pmattione-nvidia/cudf into mukernels_fixedwidth_optimize

0babf46

more style fixes

5ab9829

Merge branch 'branch-24.12' into mukernels_fixedwidth_optimize

310d50c

pmattione-nvidia requested a review from vuule September 30, 2024 15:33

vuule requested a review from nvdbaranec October 1, 2024 18:16

vuule approved these changes Oct 1, 2024

pmattione-nvidia and others added 2 commits October 2, 2024 15:43

Disable manual block scan for non-lists

4471022

Update cpp/src/io/parquet/decode_fixed.cu

c0ed2cb

Co-authored-by: Vukasin Milovanovic <[email protected]>

pmattione-nvidia added 2 commits October 4, 2024 12:30

Merge branch 'mukernels_fixedwidth_optimize' of https://github.com/pmattione-nvidia/cudf into mukernels_fixedwidth_optimize

c2139ef

Style fixes

b898cba

pmattione-nvidia changed the title ~~Optimize parquet reader mukernel block scans, non-nullable code~~ Organize parquet reader mukernel non-nullable code, introduce manual block scans Oct 4, 2024

pmattione-nvidia and others added 3 commits October 7, 2024 11:58

undo loop unroll, increased reg count

e0b3d40

Merge branch 'branch-24.12' into mukernels_fixedwidth_optimize

f7378fa

Merge branch 'branch-24.12' into mukernels_fixedwidth_optimize

2b7fefa

nvdbaranec approved these changes Oct 10, 2024

vuule added the 5 - Ready to Merge label Oct 10, 2024

rapids-bot bot merged commit 891e5aa into rapidsai:branch-24.12 Oct 11, 2024
100 checks passed
