Commit
feat: track compressed size & compare to parquet(zstd)? & canonical (#…
…882)

We now track these six values:

1. Compression time (s).
2. Compression throughput (bytes/s).
3. Compressed size (bytes).
4. Compressed size as fraction of a Vortex Canonical array.
5. Compressed Layout size as fraction of Parquet without block compression.
6. Compressed Layout size as fraction of Parquet with Zstd.

It's a bit janky: I just unconditionally compute these values for several datasets. I couldn't figure out how to ask criterion which benchmark regex is currently in use, so, for example, `cargo bench taxi` will still run all the size benchmarks for every other dataset. I also had to do some janky jq parsing to convert from Criterion's JSON output to the style expected by the benchmark-action GitHub action that we use.

Nevertheless, for each commit to `develop`, we should now get all six numbers for the Taxi, Airline Sentiment, Arade, Bimbo, CMSprovider, Euro2016, Food, HashTags, and TPC-H l_comment datasets. They'll be displayed under [Vortex Compression](https://spiraldb.github.io/vortex/dev/bench/#Vortex_Compression) at the benchmarks site.

I might need to delete some old data from the gh-pages-bench branch since I changed some benchmark names, but after a few commits, those plots should become useful measures of our compression performance in space and time.
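As a rough illustration of how the six tracked values relate to each other, here is a minimal sketch in Rust. It is not the code from this commit: the `Measurements` and `Report` types, their field names, and the `report` method are all hypothetical. It also assumes that throughput is measured as uncompressed input bytes per second of compression time, which the commit message does not state.

```rust
use std::time::Duration;

/// Raw measurements for one dataset.
/// Hypothetical type: the names are illustrative, not taken from the Vortex code base.
#[derive(Debug, Clone, Copy)]
pub struct Measurements {
    /// Size of the input that was compressed (assumed numerator for throughput).
    pub input_bytes: u64,
    /// Wall-clock time spent compressing.
    pub compression_time: Duration,
    /// Size of the compressed array.
    pub compressed_bytes: u64,
    /// Size of the same data as a canonical (uncompressed) array.
    pub canonical_bytes: u64,
    /// Size of the compressed layout as written out.
    pub compressed_layout_bytes: u64,
    /// Size of the Parquet file without block compression.
    pub parquet_uncompressed_bytes: u64,
    /// Size of the Parquet file with Zstd block compression.
    pub parquet_zstd_bytes: u64,
}

/// The six derived values, in the order listed in the commit message.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Report {
    pub compression_time_s: f64,
    pub throughput_bytes_per_s: f64,
    pub compressed_bytes: u64,
    pub fraction_of_canonical: f64,
    pub fraction_of_parquet_uncompressed: f64,
    pub fraction_of_parquet_zstd: f64,
}

/// Returns `numerator / denominator`, or `None` when the denominator is zero.
fn fraction(numerator: u64, denominator: u64) -> Option<f64> {
    if denominator == 0 {
        None
    } else {
        Some(numerator as f64 / denominator as f64)
    }
}

impl Measurements {
    /// Derives the six values; returns `None` if the elapsed time or any
    /// baseline size is zero, since the ratios would be undefined.
    pub fn report(&self) -> Option<Report> {
        let secs = self.compression_time.as_secs_f64();
        if secs <= 0.0 {
            return None;
        }
        Some(Report {
            compression_time_s: secs,
            throughput_bytes_per_s: self.input_bytes as f64 / secs,
            compressed_bytes: self.compressed_bytes,
            fraction_of_canonical: fraction(self.compressed_bytes, self.canonical_bytes)?,
            fraction_of_parquet_uncompressed: fraction(
                self.compressed_layout_bytes,
                self.parquet_uncompressed_bytes,
            )?,
            fraction_of_parquet_zstd: fraction(
                self.compressed_layout_bytes,
                self.parquet_zstd_bytes,
            )?,
        })
    }
}
```

For every fraction, values below 1.0 mean the compressed form is smaller than the baseline it is compared against. Returning `Option` instead of dividing unconditionally keeps a zero-sized baseline from producing `inf` or `NaN` in the published numbers.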
Showing 11 changed files with 364 additions and 109 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1 +1 @@ | ||
data | ||
data |