Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add append_array for some array builder implementations #7309

Open
wants to merge 18 commits into
base: main
Choose a base branch
from

Conversation

rluvaton
Copy link
Contributor

@rluvaton rluvaton commented Mar 18, 2025

This is on top of


Which issue does this PR close?

N/A

Rationale for this change

This is a building block for implementing specialized concat

What changes are included in this PR?

added append_array function for:

  1. PrimitiveBuilder, GenericByteBuilder and BooleanBuilder with tests
  2. Change concat to use this specialized implementation

Are there any user-facing changes?

Yes, new method.


Local benchmark results:

$ cargo bench --features test_utils -p arrow --bench concatenate_kernel -- --baseline main-concatenate_kernel

concat i32 1024         time:   [234.82 ns 236.58 ns 238.50 ns]
                        change: [-52.116% -51.701% -51.234%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 4 outliers among 100 measurements (4.00%)
  4 (4.00%) high mild

concat i32 nulls 1024   time:   [357.82 ns 360.23 ns 363.06 ns]
                        change: [-46.113% -45.728% -45.370%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 4 outliers among 100 measurements (4.00%)
  3 (3.00%) high mild
  1 (1.00%) high severe

concat i32 8192 over 100 arrays
                        time:   [130.41 µs 131.34 µs 132.28 µs]
                        change: [-5.5753% -4.6316% -3.5808%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 6 outliers among 100 measurements (6.00%)
  6 (6.00%) high mild

concat i32 nulls 8192 over 100 arrays
                        time:   [151.43 µs 153.18 µs 155.05 µs]
                        change: [-4.1616% -2.6759% -1.1288%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 5 outliers among 100 measurements (5.00%)
  4 (4.00%) high mild
  1 (1.00%) high severe

concat 1024 arrays i32 4
                        time:   [8.6263 µs 8.6465 µs 8.6699 µs]
                        change: [-89.718% -89.678% -89.638%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 5 outliers among 100 measurements (5.00%)
  1 (1.00%) low mild
  3 (3.00%) high mild
  1 (1.00%) high severe

concat str 1024         time:   [6.0463 µs 6.0539 µs 6.0626 µs]
                        change: [-10.066% -9.9362% -9.8074%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 5 outliers among 100 measurements (5.00%)
  3 (3.00%) high mild
  2 (2.00%) high severe

concat str nulls 1024   time:   [3.7298 µs 3.7344 µs 3.7393 µs]
                        change: [-16.925% -16.796% -16.675%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 5 outliers among 100 measurements (5.00%)
  5 (5.00%) high mild

concat str_dict 1024    time:   [2.0799 µs 2.0838 µs 2.0877 µs]
                        change: [-1.7071% -1.3975% -1.1049%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild

concat str_dict_sparse 1024
                        time:   [5.8639 µs 5.8710 µs 5.8775 µs]
                        change: [-0.7492% -0.3510% +0.0217%] (p = 0.08 > 0.05)
                        No change in performance detected.
Found 13 outliers among 100 measurements (13.00%)
  2 (2.00%) low severe
  8 (8.00%) low mild
  2 (2.00%) high mild
  1 (1.00%) high severe

concat str nulls 1024 #2
                        time:   [3.7188 µs 3.7646 µs 3.8541 µs]
                        change: [-17.070% -16.553% -15.777%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 7 outliers among 100 measurements (7.00%)
  5 (5.00%) high mild
  2 (2.00%) high severe

concat fixed size lists time:   [338.68 µs 341.41 µs 344.52 µs]
                        change: [-0.5865% +1.4834% +3.2434%] (p = 0.14 > 0.05)
                        No change in performance detected.
Found 10 outliers among 100 measurements (10.00%)
  8 (8.00%) high mild
  2 (2.00%) high severe

@github-actions github-actions bot added the arrow Changes to the arrow crate label Mar 18, 2025
@rluvaton rluvaton changed the title feat: add append_array for PrimitiveBuilder feat: add append_array for some array builder implementations Mar 18, 2025
@tustvold
Copy link
Contributor

What is the motivation for adding these here, as opposed to just using the array constructors to implement concat? IMO if you aren't constructing by value, there's no point using the builders?

@rluvaton
Copy link
Contributor Author

rluvaton commented Mar 18, 2025

The motivation is:

  1. when you don't have all the arrays at the moment
  2. Easier for users, as they would need to manually implement concat for different implementation
  3. Add specialize implementation for concat making it faster, in my local testing this can improve up to 50% faster for concat primitive for example

@rluvaton
Copy link
Contributor Author

I've pushed the concat updated implementation so you can run the benchmarks locally, for me:

concat i32 1024         time:   [241.43 ns 244.65 ns 247.98 ns]
                        change: [-47.617% -46.762% -45.826%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 3 outliers among 100 measurements (3.00%)
  2 (2.00%) high mild
  1 (1.00%) high severe

concat i32 nulls 1024   time:   [350.99 ns 352.72 ns 354.66 ns]
                        change: [-45.490% -45.239% -44.996%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 6 outliers among 100 measurements (6.00%)
  3 (3.00%) high mild
  3 (3.00%) high severe

concat 1024 arrays i32 4
                        time:   [7.5176 µs 7.5277 µs 7.5396 µs]
                        change: [-89.988% -89.963% -89.935%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 12 outliers among 100 measurements (12.00%)
  2 (2.00%) high mild
  10 (10.00%) high severe

@tustvold
Copy link
Contributor

This seems like quite a fair bit of code and codegen to shave off literal nanoseconds, no? Do we see similar returns for the byte array types? Also as this is really just reducing the dispatch overheads, I would expect as the array sizes increase the return would largely disappear.

I dunno I am always pretty wary of adding new unsafe APIs...

@rluvaton rluvaton marked this pull request as ready for review March 23, 2025 11:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
arrow Changes to the arrow crate
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants