Replace usage of lexical-core crate with libcore for parsing strings as floats #4785

Closed
wants to merge 5 commits

Conversation

@Weijun-H (Member) commented Sep 7, 2023

Which issue does this PR close?

Closes #4774

Rationale for this change

What changes are included in this PR?

Are there any user-facing changes?

@github-actions bot added the `arrow` (Changes to the arrow crate) label Sep 7, 2023
    lexical_core::parse::<f32>(s).ok().map(f16::from_f32)
    s.iter()
        .map(|&b| b as char)
        .collect::<String>()
Contributor

I suspect the performance hit of this allocation will be a non-starter. This could be reduced by using str::from_utf8, but it will still likely regress performance.
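As a minimal sketch of the zero-copy alternative hinted at here (the helper name is hypothetical, and `f16` is assumed to be the half crate's type, as used by arrow):

```rust
use half::f16;

// Hypothetical helper: parse a byte slice as f16 without allocating.
fn parse_f16(s: &[u8]) -> Option<f16> {
    // str::from_utf8 borrows the bytes in place, avoiding the per-byte
    // `as char` collection into a fresh String shown in the diff above.
    let text = std::str::from_utf8(s).ok()?;
    text.parse::<f32>().ok().map(f16::from_f32)
}
```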

Contributor

Since this file is JSON-specific, it might also be possible to use serde_json::from_slice(s).ok(), but it seems this has slightly stricter parsing and does not accept numbers such as 2e0 as an integer (though it does as a float).
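A sketch of that approach and of the stricter behaviour described (`parse_json_number` is a hypothetical helper for illustration, not code from this PR):

```rust
use serde::de::DeserializeOwned;

// Hypothetical wrapper around the suggested serde_json call.
fn parse_json_number<T: DeserializeOwned>(s: &[u8]) -> Option<T> {
    serde_json::from_slice(s).ok()
}

fn main() {
    // serde_json parses "2e0" as a float, so integer deserialization fails...
    assert_eq!(parse_json_number::<i64>(b"2e0"), None);
    // ...while float deserialization succeeds.
    assert_eq!(parse_json_number::<f64>(b"2e0"), Some(2.0));
}
```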

@tustvold (Contributor) commented Sep 7, 2023

The great irony there is that serde_json is just using a vended version of lexical_core 😅, but it might be a good way to proceed nonetheless.

@tustvold (Contributor)

Have you run the benchmarks for this change?

@Weijun-H (Member, Author) commented Sep 11, 2023

Hi @tustvold, I ran `cargo +nightly bench --bench cast_kernels`:

cast int32 to int32 512 time:   [204.00 ns 206.14 ns 208.33 ns]
                        change: [-5.5029% -3.9430% -2.4182%] (p = 0.00 < 0.05)
                        Performance has improved.

cast int32 to uint32 512
                        time:   [2.0384 µs 2.0466 µs 2.0567 µs]
                        change: [-0.1984% +0.2765% +0.8386%] (p = 0.33 > 0.05)
                        No change in performance detected.
Found 7 outliers among 100 measurements (7.00%)
  3 (3.00%) high mild
  4 (4.00%) high severe

cast int32 to float32 512
                        time:   [2.3875 µs 2.3941 µs 2.4022 µs]
                        change: [-1.8787% -0.0544% +1.0504%] (p = 0.95 > 0.05)
                        No change in performance detected.
Found 8 outliers among 100 measurements (8.00%)
  5 (5.00%) high mild
  3 (3.00%) high severe

cast int32 to float64 512
                        time:   [2.3832 µs 2.3848 µs 2.3867 µs]
                        change: [-2.7788% -1.7142% -0.9877%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild

cast int32 to int64 512 time:   [2.3105 µs 2.3152 µs 2.3200 µs]
                        change: [-2.6993% -1.9542% -1.3761%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild

cast float32 to int32 512
                        time:   [2.3297 µs 2.3345 µs 2.3397 µs]
                        change: [+0.9515% +1.1934% +1.4487%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 3 outliers among 100 measurements (3.00%)
  2 (2.00%) high mild
  1 (1.00%) high severe

cast float64 to float32 512
                        time:   [2.3568 µs 2.3623 µs 2.3675 µs]
                        change: [-0.8033% -0.4485% -0.1454%] (p = 0.01 < 0.05)
                        Change within noise threshold.

cast float64 to uint64 512
                        time:   [2.3810 µs 2.3882 µs 2.3973 µs]
                        change: [+0.7436% +0.9361% +1.1451%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high severe

cast int64 to int32 512 time:   [2.4393 µs 2.4450 µs 2.4507 µs]
                        change: [-0.3976% -0.1747% +0.0455%] (p = 0.12 > 0.05)
                        No change in performance detected.
Found 11 outliers among 100 measurements (11.00%)
  7 (7.00%) low mild
  4 (4.00%) high mild

cast date64 to date32 512
                        time:   [411.16 ns 411.63 ns 412.23 ns]
                        change: [-1.4922% -1.1806% -0.8669%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 5 outliers among 100 measurements (5.00%)
  2 (2.00%) high mild
  3 (3.00%) high severe

cast date32 to date64 512
                        time:   [391.89 ns 392.68 ns 393.55 ns]
                        change: [-3.2120% -2.7597% -2.3170%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high severe

cast time32s to time32ms 512
                        time:   [182.21 ns 182.55 ns 182.92 ns]
                        change: [-8.9820% -8.4613% -7.9270%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 13 outliers among 100 measurements (13.00%)
  2 (2.00%) high mild
  11 (11.00%) high severe

cast time32s to time64us 512
                        time:   [398.80 ns 400.65 ns 402.86 ns]
                        change: [-3.3939% -2.9998% -2.5725%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 5 outliers among 100 measurements (5.00%)
  3 (3.00%) high mild
  2 (2.00%) high severe

cast time64ns to time32s 512
                        time:   [411.41 ns 411.98 ns 412.82 ns]
                        change: [-5.1090% -4.2686% -3.4874%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 5 outliers among 100 measurements (5.00%)
  2 (2.00%) high mild
  3 (3.00%) high severe

cast timestamp_ns to timestamp_s 512
                        time:   [204.24 ns 205.38 ns 206.51 ns]
                        change: [-1.9684% -1.4627% -0.9635%] (p = 0.00 < 0.05)
                        Change within noise threshold.

cast timestamp_ms to timestamp_ns 512
                        time:   [3.0508 µs 3.0544 µs 3.0581 µs]
                        change: [-1.0488% -0.7396% -0.4990%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild

cast utf8 to f32        time:   [8.8845 µs 8.8974 µs 8.9138 µs]
                        change: [-21.302% -20.204% -19.257%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 5 outliers among 100 measurements (5.00%)
  3 (3.00%) high mild
  2 (2.00%) high severe

cast i64 to string 512  time:   [15.687 µs 15.720 µs 15.767 µs]
                        change: [-2.4930% -1.7614% -1.2380%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 7 outliers among 100 measurements (7.00%)
  5 (5.00%) high mild
  2 (2.00%) high severe

cast f32 to string 512  time:   [21.984 µs 22.017 µs 22.051 µs]
                        change: [+0.0279% +0.1690% +0.3156%] (p = 0.02 < 0.05)
                        Change within noise threshold.
Found 2 outliers among 100 measurements (2.00%)
  2 (2.00%) low mild

cast timestamp_ms to i64 512
                        time:   [219.23 ns 220.14 ns 221.24 ns]
                        change: [-10.034% -9.0068% -8.0400%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild

cast utf8 to date32 512 time:   [9.3471 µs 9.3546 µs 9.3651 µs]
                        change: [+0.1587% +0.4162% +0.6211%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 8 outliers among 100 measurements (8.00%)
  2 (2.00%) high mild
  6 (6.00%) high severe

cast utf8 to date64 512 time:   [63.588 µs 63.866 µs 64.134 µs]
                        change: [+6.2335% +6.7701% +7.3132%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 3 outliers among 100 measurements (3.00%)
  3 (3.00%) low mild

cast decimal128 to decimal128 512
                        time:   [4.0842 µs 4.0933 µs 4.1036 µs]
                        change: [-0.7165% -0.5197% -0.3283%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild

cast decimal128 to decimal256 512
                        time:   [6.1770 µs 6.1875 µs 6.2019 µs]
                        change: [-0.4030% -0.1438% +0.1200%] (p = 0.30 > 0.05)
                        No change in performance detected.
Found 7 outliers among 100 measurements (7.00%)
  3 (3.00%) high mild
  4 (4.00%) high severe

cast decimal256 to decimal128 512
                        time:   [37.884 µs 37.939 µs 37.998 µs]
                        change: [+0.8098% +1.1873% +1.4663%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 5 outliers among 100 measurements (5.00%)
  4 (4.00%) high mild
  1 (1.00%) high severe

cast decimal256 to decimal256 512
                        time:   [6.0214 µs 6.0282 µs 6.0352 µs]
                        change: [-0.3859% -0.0314% +0.2086%] (p = 0.88 > 0.05)
                        No change in performance detected.
Found 3 outliers among 100 measurements (3.00%)
  3 (3.00%) high mild

cast decimal128 to decimal128 512 with same scale
                        time:   [75.928 ns 75.985 ns 76.050 ns]
                        change: [-0.7296% -0.5711% -0.3980%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 6 outliers among 100 measurements (6.00%)
  2 (2.00%) high mild
  4 (4.00%) high severe

cast decimal256 to decimal256 512 with same scale
                        time:   [75.878 ns 75.969 ns 76.121 ns]
                        change: [-2.4126% -1.8729% -1.3256%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
  1 (1.00%) high mild
  1 (1.00%) high severe

@tustvold (Contributor)

The JSON benchmarks are the ones I'm potentially concerned about

@Weijun-H (Member, Author)

> The JSON benchmarks are the ones I'm potentially concerned about

The JSON performance has decreased.

small_bench_primitive   time:   [9.3441 µs 9.3692 µs 9.4011 µs]
                        change: [+13.467% +13.828% +14.207%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 6 outliers among 100 measurements (6.00%)
  5 (5.00%) high mild
  1 (1.00%) high severe

large_bench_primitive   time:   [3.0556 ms 3.0708 ms 3.0883 ms]
                        change: [+27.529% +28.217% +28.843%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 17 outliers among 100 measurements (17.00%)
  3 (3.00%) high mild
  14 (14.00%) high severe

small_bench_list        time:   [17.681 µs 17.707 µs 17.732 µs]
                        change: [+11.293% +11.625% +11.948%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild

@tustvold (Contributor)

I think a 30% performance regression is a pretty tough sell, especially when it isn't as if lexical_core has security vulnerabilities that impact us... Have you tried the trick of using serde_json to provide the vended lexical_core?

@Weijun-H (Member, Author)

> I think a 30% performance regression is a pretty tough sell, especially when it isn't as if lexical_core has security vulnerabilities that impact us... Have you tried the trick of using serde_json to provide the vended lexical_core?

The performance of serde_json also decreased.

small_bench_primitive   time:   [7.6051 µs 7.6808 µs 7.7639 µs]
                        change: [+20.123% +20.691% +21.386%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 10 outliers among 100 measurements (10.00%)
  4 (4.00%) high mild
  6 (6.00%) high severe

Benchmarking large_bench_primitive: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 9.7s, enable flat sampling, or reduce sample count to 50.
large_bench_primitive   time:   [1.9096 ms 1.9123 ms 1.9155 ms]
                        change: [+4.5542% +4.8926% +5.1944%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 11 outliers among 100 measurements (11.00%)
  4 (4.00%) high mild
  7 (7.00%) high severe

small_bench_list        time:   [13.844 µs 13.870 µs 13.896 µs]
                        change: [+14.486% +15.117% +15.654%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 7 outliers among 100 measurements (7.00%)
  4 (4.00%) high mild
  3 (3.00%) high severe

@tustvold (Contributor)

Is that using `serde_json::from_slice` or `serde_json::from_str`?

@Weijun-H (Member, Author)

> Is that using `serde_json::from_slice` or `serde_json::from_str`?

I used `serde_json::from_slice`.
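For context, the two entry points take different input types but accept the same JSON; from_str starts from an already-validated &str, which can let serde_json skip UTF-8 validation of string contents. A minimal illustration (not code from the PR):

```rust
fn main() {
    // from_slice parses raw bytes; from_str parses a &str that is
    // already known to be valid UTF-8.
    let a: f64 = serde_json::from_slice(b"2.5").unwrap();
    let b: f64 = serde_json::from_str("2.5").unwrap();
    assert_eq!(a, b);
}
```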

@Weijun-H (Member, Author)

Hi @tustvold, since JSON performance is poor, I will keep the change for arrow-cast, but not for arrow-json. What is your opinion?

    Err(_) => lexical_core::parse::<f64>(s).ok().and_then(NumCast::from),
    match std::str::from_utf8(s) {
        Ok(s) => {
            if let Ok(res) = s.to_string().parse::<$t>() {
Contributor

`to_string` shouldn't be needed?
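A sketch of the simplification being suggested (a hypothetical standalone helper; in the PR the code lives inside a macro where `$t` is the target numeric type):

```rust
use std::str::FromStr;

// Hypothetical helper: parse raw bytes as a number without the
// intermediate String allocation flagged above.
fn parse_bytes<T: FromStr>(s: &[u8]) -> Option<T> {
    // str::parse works directly on the borrowed &str, so the
    // s.to_string() call is unnecessary.
    std::str::from_utf8(s).ok()?.parse::<T>().ok()
}
```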

@Weijun-H (Member, Author)

Thanks to the suggestion from @Dandandan, the performance is now almost on par with lexical-core:

small_bench_primitive   time:   [8.3978 µs 8.4857 µs 8.6162 µs]
                        change: [-5.9124% -2.8511% -0.5427%] (p = 0.03 < 0.05)
                        Change within noise threshold.
Found 10 outliers among 100 measurements (10.00%)
  7 (7.00%) high mild
  3 (3.00%) high severe

large_bench_primitive   time:   [2.6429 ms 2.6958 ms 2.7769 ms]
                        change: [+8.5944% +10.894% +14.284%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 10 outliers among 100 measurements (10.00%)
  3 (3.00%) high mild
  7 (7.00%) high severe

small_bench_list        time:   [16.202 µs 16.654 µs 17.206 µs]
                        change: [-3.7706% +0.2623% +3.7338%] (p = 0.91 > 0.05)
                        No change in performance detected.
Found 6 outliers among 100 measurements (6.00%)
  6 (6.00%) high severe

@tustvold (Contributor)

I'm honestly torn by this: it doesn't eliminate the lexical-core dependency and still represents a non-trivial performance regression...

I can add it to my list to investigate if there is some way to get the same performance here

@tustvold (Contributor)

In fact I'm not sure we even need to do this; the advisory is not for lexical-core, see #4774 (comment).

@tustvold (Contributor)

Given this does still regress performance, I am going to close this for now; feel free to reopen if I have misunderstood.

@tustvold tustvold closed this Sep 25, 2023
@tustvold tustvold mentioned this pull request Oct 18, 2023