Faster version of the data generator #25

Merged: 2 commits merged into gunnarmorling:main from the faster-datagenerator branch on Jan 3, 2024

rschwietzke
Contributor

@rschwietzke rschwietzke commented Jan 2, 2024

Writing 1 billion rows to disk took too long, so I updated the implementation (still single-threaded) to be much faster. It uses fewer data conversions and, instead of a Gaussian distribution, a simple uniform distribution. Good enough for our purpose.

We do 200 million lines in 15 sec now instead of 85 sec (T14s AMD Gen 1, Samsung SSD 970 Evo).

Main changes: fewer branches (loop unrolled), no modulo and compare for the progress indicator, a char buffer to avoid string conversions, and a simpler, even more "pseudo" random generator that avoids atomics and side effects, so the CPU pipeline is kept better loaded.
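
As an illustration of the progress-indicator point, here is a minimal sketch of one way to drop the per-row modulo and compare: split the work into chunks and report once per chunk. This is not the code from this PR; `ChunkedProgress`, `run`, and the chunk size are hypothetical names and values chosen for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not the PR's implementation.
final class ChunkedProgress {
    /** Runs `total` iterations and returns the row counts at which progress was reported. */
    static List<Long> run(long total, long chunk) {
        if (chunk <= 0) {
            throw new IllegalArgumentException("chunk must be positive");
        }
        List<Long> reports = new ArrayList<>();
        long written = 0;
        while (written < total) {
            long end = Math.min(written + chunk, total);
            // Hot inner loop: no modulo and no progress check per row.
            for (long i = written; i < end; i++) {
                // a row would be generated and written here
            }
            written = end;
            reports.add(written); // progress is reported once per chunk
        }
        return reports;
    }

    public static void main(String[] args) {
        for (long n : run(200_000_000L, 50_000_000L)) {
            System.out.printf("Wrote %,d measurements%n", n);
        }
    }
}
```

The per-row branch `i % 50_000_000 == 0` disappears from the hot loop; only the loop bound check remains.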

Example for 200 million lines

Old

perf stat java -cp target/classes/ dev.morling.onebrc.CreateMeasurements 200000000
Wrote 50,000,000 measurements in 12524 ms
Wrote 100,000,000 measurements in 34074 ms
Wrote 150,000,000 measurements in 57635 ms
Created file with 200,000,000 measurements in 81035 ms

 Performance counter stats for 'java -cp target/classes/ dev.morling.onebrc.CreateMeasurements 200000000':

         83.799,57 msec task-clock                #    0,935 CPUs utilized          
            33.324      context-switches          #  397,663 /sec                   
             1.942      cpu-migrations            #   23,174 /sec                   
            92.793      page-faults               #    1,107 K/sec                  
   300.991.615.221      cycles                    #    3,592 GHz                      (66,65%)
    12.428.601.049      stalled-cycles-frontend   #    4,13% frontend cycles idle     (66,68%)
    58.763.632.301      stalled-cycles-backend    #   19,52% backend cycles idle      (66,69%)
   598.272.978.436      instructions              #    1,99  insn per cycle         
                                                  #    0,10  stalled cycles per insn  (66,75%)
   113.366.390.250      branches                  #    1,353 G/sec                    (66,72%)
     2.035.685.539      branch-misses             #    1,80% of all branches          (66,62%)

      89,609508930 seconds time elapsed

      79,983998000 seconds user
       3,889775000 seconds sys

New

perf stat java -cp target/classes/ dev.morling.onebrc.CreateMeasurements2 200000000
Wrote 50,000,000 measurements in 4245 ms
Wrote 100,000,000 measurements in 7393 ms
Wrote 150,000,000 measurements in 11117 ms
Wrote 200,000,000 measurements in 14969 ms
Created file with 200,000,000 measurements in 14969 ms

 Performance counter stats for 'java -cp target/classes/ dev.morling.onebrc.CreateMeasurements2 200000000':

         17.576,87 msec task-clock                #    0,653 CPUs utilized          
            20.379      context-switches          #    1,159 K/sec                  
             1.144      cpu-migrations            #   65,086 /sec                   
            97.295      page-faults               #    5,535 K/sec                  
    62.565.060.215      cycles                    #    3,560 GHz                      (66,66%)
     2.170.358.297      stalled-cycles-frontend   #    3,47% frontend cycles idle     (66,92%)
    16.632.698.221      stalled-cycles-backend    #   26,58% backend cycles idle      (67,09%)
   128.526.091.050      instructions              #    2,05  insn per cycle         
                                                  #    0,13  stalled cycles per insn  (66,66%)
    26.800.878.086      branches                  #    1,525 G/sec                    (66,59%)
       471.532.694      branch-misses             #    1,76% of all branches          (66,61%)

      26,915135151 seconds time elapsed

      14,451946000 seconds user
       3,182558000 seconds sys
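
To put the two runs side by side, the ratios can be computed directly from the numbers in the logs above. A small sketch; the class and method names are made up for this example, and the inputs are copied from the two perf stat outputs.

```java
// Hypothetical helper for comparing the old and new perf stat runs.
final class SpeedupFromPerfStat {
    /** Returns how many times larger the old value is than the new one. */
    static double ratio(double oldValue, double newValue) {
        return oldValue / newValue;
    }

    public static void main(String[] args) {
        // Values taken from the "Old" and "New" logs for 200 million lines.
        System.out.printf("generator-reported time (ms): %.2fx%n", ratio(81_035, 14_969));
        System.out.printf("task-clock (msec):            %.2fx%n", ratio(83_799.57, 17_576.87));
        System.out.printf("cycles:                       %.2fx%n", ratio(300_991_615_221.0, 62_565_060_215.0));
        System.out.printf("instructions:                 %.2fx%n", ratio(598_272_978_436.0, 128_526_091_050.0));
        System.out.printf("branches:                     %.2fx%n", ratio(113_366_390_250.0, 26_800_878_086.0));
    }
}
```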

@gunnarmorling
Owner

Nice one! Has that code in the org.schwietzke package been written by you? Or is it taken from somewhere else?

@rschwietzke
Contributor Author

rschwietzke commented Jan 3, 2024

> Nice one! Has that code in the org.schwietzke package been written by you? Or is it taken from somewhere else?

All my code, which has already been contributed to XLT and HtmlUnit Neko.

FastRandom is inspired by a post somewhere and some demo code; I don't recall the source anymore. Poor randomness, but good enough for our demo case and repeatable if needed. There is also a more elaborate version here: https://github.com/Xceptance/XLT/blob/develop/src/main/java/it/unimi/dsi/util/FastRandom.java which is based on http://prng.di.unimi.it/.
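
For readers who want a feel for this kind of generator, here is a minimal sketch of a plain, non-atomic, seedable PRNG. It is not the FastRandom from this PR: it swaps in Marsaglia's well-known xorshift64 step, and `XorShiftRandom` and its methods are hypothetical names. For comparison, `java.util.Random` keeps its seed in an `AtomicLong`.

```java
// Hypothetical sketch of a cheap, repeatable generator (xorshift64), not the PR's FastRandom.
final class XorShiftRandom {
    private long state;

    XorShiftRandom(long seed) {
        // xorshift must never start from an all-zero state, so substitute a fixed constant.
        this.state = (seed == 0) ? 0x9E3779B97F4A7C15L : seed;
    }

    /** Advances the state with three shift/xor steps; plain field access, no atomics. */
    long nextLong() {
        long x = state;
        x ^= x << 13;
        x ^= x >>> 7;
        x ^= x << 17;
        state = x;
        return x;
    }

    /** Returns a value in [0, bound); has a slight modulo bias, acceptable for demo data. */
    int nextInt(int bound) {
        if (bound <= 0) {
            throw new IllegalArgumentException("bound must be positive");
        }
        return (int) ((nextLong() >>> 1) % bound);
    }

    public static void main(String[] args) {
        XorShiftRandom r = new XorShiftRandom(42L);
        for (int i = 0; i < 5; i++) {
            System.out.println(r.nextInt(1000));
        }
    }
}
```

The same seed always yields the same sequence, which is what makes generated data repeatable.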

The CheapCharBuffer does more than we need here, but it is taken from my last PR to HtmlUnit Neko: https://github.com/HtmlUnit/htmlunit-neko/blob/master/src/main/java/org/htmlunit/cyberneko/xerces/xni/XMLString.java It has a different name there for compatibility reasons.
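
To show the idea of skipping string conversions, here is a much smaller sketch than CheapCharBuffer. It is not that class: `SimpleCharBuffer`, `appendTenths`, and the assumption that values are held as integer tenths with a magnitude below 1000 are all made up for this example.

```java
import java.io.IOException;
import java.io.Writer;
import java.util.Arrays;

// Hypothetical sketch: writes digits straight into a char[] instead of building Strings.
final class SimpleCharBuffer {
    private char[] data = new char[32];
    private int length;

    void append(char c) {
        if (length == data.length) {
            data = Arrays.copyOf(data, data.length * 2); // grow when full
        }
        data[length++] = c;
    }

    /** Appends a value given in tenths, e.g. -123 becomes "-12.3". Assumes |tenths| < 1000. */
    void appendTenths(int tenths) {
        if (tenths < 0) {
            append('-');
            tenths = -tenths;
        }
        int whole = tenths / 10;
        if (whole >= 10) {
            append((char) ('0' + whole / 10));
        }
        append((char) ('0' + whole % 10));
        append('.');
        append((char) ('0' + tenths % 10));
    }

    /** Resets the buffer for reuse without allocating. */
    void clear() {
        length = 0;
    }

    /** Hands the raw chars to a writer; no intermediate String is created. */
    void writeTo(Writer out) throws IOException {
        out.write(data, 0, length);
    }

    @Override
    public String toString() {
        return new String(data, 0, length);
    }
}
```

One buffer can be cleared and reused for every line, so the hot loop allocates nothing per row.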

@gunnarmorling
Owner

Ok, cool. Merging. Thanks a lot!

@gunnarmorling gunnarmorling merged commit 70fcbf9 into gunnarmorling:main Jan 3, 2024
1 check passed
@rschwietzke rschwietzke deleted the faster-datagenerator branch January 3, 2024 12:13