feat: store benchmarks in s3 #1650

Merged
danking merged 52 commits into develop from dk/store-bench-in-s3
Dec 13, 2024

Conversation

danking
Member

@danking danking commented Dec 11, 2024

  • PR benchmarks should still work, though we now fully control the diff and table creation.
  • Develop benchmarks
    • New format: JSONL with (at least) five fields: name, unit, range, value, commit_id. We'll switch to Vortex (of course) when it's stable.
    • New location: s3://vortex-benchmark-results-database, a world-readable bucket.
  • Benchmarks website
    • Should be visually identical (apart from the download button, which is in a new spot with a new look).
    • Loads out of S3 via the HTTP API. (We could move to R2 if egress is bad, but 100GiB is free per month and the benchmarks db is currently 20MiB.)
    • Now stored in the develop branch under benchmarks-website (necessary for it to be deployed together with the docs).
  • New scripts folder (I am sorry for my sins)
    • cat-s3.sh: compare-and-swap new benchmarks into S3.
    • coerce-criterion-json.sh: converts criterion's (weirdo) JSON format to the new one above.
    • commit-json.sh: generates a commit JSON (which is stored in s3://.../commits.json, a database of commit metadata).
    • compare-benchmark-jsons.py: loads two JSONL files (containing data for just one commit each) and creates a Markdown table of comparisons.
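
For readers unfamiliar with the format, here is a minimal sketch of the comparison step, not the actual compare-benchmark-jsons.py. It assumes each JSONL line is an object with the five fields listed above, that `value` and `range` are numbers, and that records are joined on `name`. The function names `parse_jsonl` and `compare` and the sample records are hypothetical.

```python
import json


def parse_jsonl(text):
    """Parse JSONL text: one JSON object per non-empty line."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def compare(pr_records, base_records):
    """Join PR and base records on `name` and render a Markdown table."""
    base_by_name = {r["name"]: r for r in base_records}
    lines = [
        "| name | PR | base | ratio (PR/base) | unit |",
        "| --- | --- | --- | --- | --- |",
    ]
    for pr in pr_records:
        base = base_by_name.get(pr["name"])
        if base is None or base["value"] == 0:
            continue  # no usable baseline for this benchmark
        ratio = pr["value"] / base["value"]
        lines.append(
            f"| {pr['name']} | {pr['value']:g} | {base['value']:g} "
            f"| {ratio:g} | {pr['unit']} |"
        )
    return "\n".join(lines)


# Made-up records shaped like the five-field format (values are illustrative).
pr_jsonl = (
    '{"name": "compress time/example", "unit": "ns", "range": 5.0, '
    '"value": 90.0, "commit_id": "aaaaaaa"}\n'
)
base_jsonl = (
    '{"name": "compress time/example", "unit": "ns", "range": 5.0, '
    '"value": 100.0, "commit_id": "bbbbbbb"}\n'
)

table = compare(parse_jsonl(pr_jsonl), parse_jsonl(base_jsonl))
print(table)
```

Benchmarks that appear only in the PR file are skipped here; the real script may handle that case differently.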

@danking danking added the benchmark Run benchmarks on this branch label Dec 11, 2024
@github-actions github-actions bot removed the benchmark Run benchmarks on this branch label Dec 11, 2024
@a10y
Contributor

a10y commented Dec 11, 2024

It may be easier to write this in xtask using ObjectStore if the shell scripts start to get unwieldy.

@danking danking added benchmark Run benchmarks on this branch and removed benchmark Run benchmarks on this branch labels Dec 11, 2024
@danking danking force-pushed the dk/store-bench-in-s3 branch from e53d4f2 to a2ef0ca Compare December 11, 2024 21:07
@danking danking marked this pull request as ready for review December 12, 2024 22:05
@danking danking added the benchmark Run benchmarks on this branch label Dec 12, 2024
@github-actions github-actions bot removed the benchmark Run benchmarks on this branch label Dec 12, 2024
@danking danking added the benchmark Run benchmarks on this branch label Dec 13, 2024
@github-actions github-actions bot removed the benchmark Run benchmarks on this branch label Dec 13, 2024
Contributor

github-actions bot commented Dec 13, 2024

Benchmarks: compress

| name | PR dk/store | base b4c5846 | ratio (PR/base) | unit |
| --- | --- | --- | --- | --- |
| compress time/taxi | 1.3583e+09 | 1.38509e+09 | 0.980664 | ns |
| compress time/taxi throughput | 0.346617 | 0.339914 | 1.01972 | bytes/ns |
| parquet_rs-zstd compress time/taxi | 1.74998e+09 | 1.72063e+09 | 1.01706 | ns |
| parquet_rs-zstd compress time/taxi throughput | 0.269038 | 0.273627 | 0.983228 | bytes/ns |
| decompress time/taxi | 3.0178e+08 | 2.93745e+08 | 1.02735 | ns |
| decompress time/taxi throughput | 1.56011 | 1.60279 | 0.973374 | bytes/ns |
| parquet_rs-zstd decompress time/taxi | 3.2868e+08 | 3.09556e+08 | 1.06178 | ns |
| parquet_rs-zstd decompress time/taxi throughput | 1.43243 | 1.52092 | 0.941819 | bytes/ns |
| compress time/AirlineSentiment | 648357 | 708956 | 0.914524 | ns |
| compress time/AirlineSentiment throughput | 0.00318652 | 0.00291415 | 1.09347 | bytes/ns |
| parquet_rs-zstd compress time/AirlineSentiment | 56911.6 | 56323.4 | 1.01044 | ns |
| parquet_rs-zstd compress time/AirlineSentiment throughput | 0.0363019 | 0.036681 | 0.989665 | bytes/ns |
| decompress time/AirlineSentiment | 103583 | 104790 | 0.98849 | ns |
| decompress time/AirlineSentiment throughput | 0.0199453 | 0.0197157 | 1.01164 | bytes/ns |
| parquet_rs-zstd decompress time/AirlineSentiment | 32800.2 | 32393.4 | 1.01256 | ns |
| parquet_rs-zstd decompress time/AirlineSentiment throughput | 0.0629874 | 0.0637784 | 0.987597 | bytes/ns |
| compress time/Arade | 2.72842e+09 | 2.7278e+09 | 1.00022 | ns |
| compress time/Arade throughput | 0.288459 | 0.288524 | 0.999775 | bytes/ns |
| parquet_rs-zstd compress time/Arade | 3.00207e+09 | 2.969e+09 | 1.01114 | ns |
| parquet_rs-zstd compress time/Arade throughput | 0.262165 | 0.265084 | 0.988987 | bytes/ns |
| decompress time/Arade | 4.03413e+08 | 4.88734e+08 | 0.825425 | ns |
| decompress time/Arade throughput | 1.95094 | 1.61036 | 1.2115 | bytes/ns |
| parquet_rs-zstd decompress time/Arade | 7.15373e+08 | 6.85223e+08 | 1.044 | ns |
| parquet_rs-zstd decompress time/Arade throughput | 1.10018 | 1.14858 | 0.957854 | bytes/ns |
| compress time/Bimbo | 1.06815e+10 | 1.06075e+10 | 1.00698 | ns |
| compress time/Bimbo throughput | 0.666698 | 0.671349 | 0.993071 | bytes/ns |
| parquet_rs-zstd compress time/Bimbo | 2.1531e+10 | 1.97587e+10 | 1.0897 | ns |
| parquet_rs-zstd compress time/Bimbo throughput | 0.330748 | 0.360415 | 0.917686 | bytes/ns |
| decompress time/Bimbo | 3.81974e+09 | 3.15665e+09 | 1.21006 | ns |
| decompress time/Bimbo throughput | 1.86436 | 2.25598 | 0.826406 | bytes/ns |
| parquet_rs-zstd decompress time/Bimbo | 4.82973e+09 | 2.71694e+09 | 1.77763 | ns |
| parquet_rs-zstd decompress time/Bimbo throughput | 1.47448 | 2.62109 | 0.562546 | bytes/ns |
| compress time/CMSprovider | 1.34825e+10 | 1.32493e+10 | 1.0176 | ns |
| compress time/CMSprovider throughput | 0.381917 | 0.388639 | 0.982704 | bytes/ns |
| parquet_rs-zstd compress time/CMSprovider | 1.93389e+10 | 1.88876e+10 | 1.02389 | ns |
| parquet_rs-zstd compress time/CMSprovider throughput | 0.266262 | 0.272623 | 0.976666 | bytes/ns |
| decompress time/CMSprovider | 2.91141e+09 | 2.51855e+09 | 1.15599 | ns |
| decompress time/CMSprovider throughput | 1.76863 | 2.04451 | 0.865062 | bytes/ns |
| parquet_rs-zstd decompress time/CMSprovider | 5.96964e+09 | 5.67552e+09 | 1.05182 | ns |
| parquet_rs-zstd decompress time/CMSprovider throughput | 0.862565 | 0.907264 | 0.950732 | bytes/ns |
| compress time/Euro2016 | 2.16641e+09 | 2.1979e+09 | 0.985676 | ns |
| compress time/Euro2016 throughput | 0.181524 | 0.178924 | 1.01453 | bytes/ns |
| parquet_rs-zstd compress time/Euro2016 | 1.56816e+09 | 1.55382e+09 | 1.00923 | ns |
| parquet_rs-zstd compress time/Euro2016 throughput | 0.250775 | 0.253089 | 0.990857 | bytes/ns |
| decompress time/Euro2016 | 2.51949e+08 | 2.42769e+08 | 1.03781 | ns |
| decompress time/Euro2016 throughput | 1.56086 | 1.61988 | 0.963564 | bytes/ns |
| parquet_rs-zstd decompress time/Euro2016 | 5.10042e+08 | 5.04646e+08 | 1.01069 | ns |
| parquet_rs-zstd decompress time/Euro2016 throughput | 0.771026 | 0.779271 | 0.98942 | bytes/ns |
| compress time/Food | 1.07615e+09 | 1.08729e+09 | 0.989752 | ns |
| compress time/Food throughput | 0.309177 | 0.306009 | 1.01035 | bytes/ns |
| parquet_rs-zstd compress time/Food | 1.08107e+09 | 1.06793e+09 | 1.0123 | ns |
| parquet_rs-zstd compress time/Food throughput | 0.307772 | 0.311558 | 0.987846 | bytes/ns |
| decompress time/Food | 1.11469e+08 | 1.06677e+08 | 1.04492 | ns |
| decompress time/Food throughput | 2.98487 | 3.11895 | 0.95701 | bytes/ns |
| parquet_rs-zstd decompress time/Food | 2.30959e+08 | 2.27633e+08 | 1.01461 | ns |
| parquet_rs-zstd decompress time/Food throughput | 1.44061 | 1.46166 | 0.985598 | bytes/ns |
| compress time/HashTags | 2.65857e+09 | 2.68372e+09 | 0.99063 | ns |
| compress time/HashTags throughput | 0.302607 | 0.299772 | 1.00946 | bytes/ns |
| parquet_rs-zstd compress time/HashTags | 2.50866e+09 | 2.47434e+09 | 1.01387 | ns |
| parquet_rs-zstd compress time/HashTags throughput | 0.320691 | 0.325139 | 0.986318 | bytes/ns |
| decompress time/HashTags | 4.49638e+08 | 4.53548e+08 | 0.99138 | ns |
| decompress time/HashTags throughput | 1.78922 | 1.7738 | 1.00869 | bytes/ns |
| parquet_rs-zstd decompress time/HashTags | 8.49622e+08 | 8.21668e+08 | 1.03402 | ns |
| parquet_rs-zstd decompress time/HashTags throughput | 0.946896 | 0.97911 | 0.967098 | bytes/ns |
| compress time/TPC-H l_comment chunked without fsst | 3.48698e+09 | 3.59026e+09 | 0.971232 | ns |
| compress time/TPC-H l_comment chunked without fsst throughput | 0.0714731 | 0.0694169 | 1.02962 | bytes/ns |
| parquet_rs-zstd compress time/TPC-H l_comment chunked without fsst | 9.36585e+08 | 9.1216e+08 | 1.02678 | ns |
| parquet_rs-zstd compress time/TPC-H l_comment chunked without fsst throughput | 0.2661 | 0.273225 | 0.973921 | bytes/ns |
| decompress time/TPC-H l_comment chunked without fsst | 6.91424e+07 | 7.00262e+07 | 0.987378 | ns |
| decompress time/TPC-H l_comment chunked without fsst throughput | 3.60452 | 3.55903 | 1.01278 | bytes/ns |
| parquet_rs-zstd decompress time/TPC-H l_comment chunked without fsst | 2.54156e+08 | 2.52239e+08 | 1.0076 | ns |
| parquet_rs-zstd decompress time/TPC-H l_comment chunked without fsst throughput | 0.980599 | 0.988052 | 0.992457 | bytes/ns |
| compress time/TPC-H l_comment chunked | 1.01313e+09 | 1.0154e+09 | 0.997765 | ns |
| compress time/TPC-H l_comment chunked throughput | 0.245994 | 0.245444 | 1.00224 | bytes/ns |
| parquet_rs-zstd compress time/TPC-H l_comment chunked | 9.29861e+08 | 9.16292e+08 | 1.01481 | ns |
| parquet_rs-zstd compress time/TPC-H l_comment chunked throughput | 0.268024 | 0.271993 | 0.985407 | bytes/ns |
| decompress time/TPC-H l_comment chunked | 9.11482e+07 | 9.01527e+07 | 1.01104 | ns |
| decompress time/TPC-H l_comment chunked throughput | 2.73429 | 2.76448 | 0.989079 | bytes/ns |
| parquet_rs-zstd decompress time/TPC-H l_comment chunked | 2.53873e+08 | 2.51555e+08 | 1.00922 | ns |
| parquet_rs-zstd decompress time/TPC-H l_comment chunked throughput | 0.981692 | 0.99074 | 0.990867 | bytes/ns |
| compress time/TPC-H l_comment canonical | 1.00829e+09 | 1.0187e+09 | 0.98978 | ns |
| compress time/TPC-H l_comment canonical throughput | 0.247174 | 0.244648 | 1.01033 | bytes/ns |
| parquet_rs-zstd compress time/TPC-H l_comment canonical | 9.28803e+08 | 9.228e+08 | 1.0065 | ns |
| parquet_rs-zstd compress time/TPC-H l_comment canonical throughput | 0.268329 | 0.270074 | 0.993537 | bytes/ns |
| decompress time/TPC-H l_comment canonical | 9.05994e+07 | 9.02174e+07 | 1.00423 | ns |
| decompress time/TPC-H l_comment canonical throughput | 2.75084 | 2.76249 | 0.995784 | bytes/ns |
| parquet_rs-zstd decompress time/TPC-H l_comment canonical | 2.54095e+08 | 2.52004e+08 | 1.0083 | ns |
| parquet_rs-zstd decompress time/TPC-H l_comment canonical throughput | 0.980831 | 0.988969 | 0.991771 | bytes/ns |
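
A note on reading the ratio column: for rows in ns, a ratio below 1 means the PR was faster; for throughput rows in bytes/ns, a ratio above 1 means the PR was faster. Assuming both runs processed the same number of bytes, the two ratios for a benchmark should be roughly reciprocal. The sketch below checks this on the first two rows of the table; the helper names are hypothetical, and the inputs are the rounded values printed above, so results only match the table to a few significant digits.

```python
import math


def ratio(pr, base):
    """PR/base ratio, as in the `ratio (PR/base)` column."""
    return pr / base


def pr_is_faster(r, unit):
    """Lower is better for times (ns); higher is better for throughput."""
    return r < 1.0 if unit == "ns" else r > 1.0


# "compress time/taxi" and its throughput row, copied from the table.
time_ratio = ratio(1.3583e09, 1.38509e09)
throughput_ratio = ratio(0.346617, 0.339914)

print(f"time ratio:       {time_ratio:.4f}")
print(f"throughput ratio: {throughput_ratio:.4f}")
print(f"product:          {time_ratio * throughput_ratio:.4f}")
print("PR faster by time:      ", pr_is_faster(time_ratio, "ns"))
print("PR faster by throughput:", pr_is_faster(throughput_ratio, "bytes/ns"))
```

Both views agree for this row: the PR run was about 2% faster, which is small enough that it may simply be run-to-run noise.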

@danking danking force-pushed the dk/store-bench-in-s3 branch from 6223ac9 to 0ff191c Compare December 13, 2024 17:19
@danking danking added the benchmark Run benchmarks on this branch label Dec 13, 2024
@github-actions github-actions bot removed the benchmark Run benchmarks on this branch label Dec 13, 2024
@danking danking added do not merge Pull requests that are not intended to merge and removed do not merge Pull requests that are not intended to merge labels Dec 13, 2024
@danking danking enabled auto-merge (squash) December 13, 2024 17:24
@danking danking merged commit 6c63cf9 into develop Dec 13, 2024
19 checks passed
@danking danking deleted the dk/store-bench-in-s3 branch December 13, 2024 17:39