Small additional speedup for deflate_quick
Before and after benchmarks on Intel x86_64, compiled with `RUSTFLAGS="-Ctarget-cpu=native -Cllvm-args=-enable-dfa-jump-thread" cargo build --release`:

```
Benchmark 1 (60 runs): ./compress-baseline 1 rs silesia-small.tar
  measurement        mean ± σ          min … max          outliers   delta
  wall_time          84.1ms ± 1.93ms   81.4ms … 93.9ms    6 (10%)    0%
  peak_rss           26.7MB ± 73.6KB   26.5MB … 26.7MB    0 ( 0%)    0%
  cpu_cycles         303M ± 1.12M      302M … 309M        3 ( 5%)    0%
  instructions       655M ± 265        655M … 655M        1 ( 2%)    0%
  cache_references   404K ± 12.6K      396K … 468K        7 (12%)    0%
  cache_misses       302K ± 6.45K      284K … 321K        6 (10%)    0%
  branch_misses      3.15M ± 6.83K     3.14M … 3.17M      0 ( 0%)    0%
Benchmark 2 (62 runs): ./target/release/examples/compress 1 rs silesia-small.tar
  measurement        mean ± σ          min … max          outliers   delta
  wall_time          81.9ms ± 930us    80.2ms … 84.3ms    0 ( 0%)    ⚡- 2.6% ± 0.6%
  peak_rss           26.7MB ± 65.8KB   26.6MB … 26.7MB    0 ( 0%)    - 0.0% ± 0.1%
  cpu_cycles         298M ± 656K       297M … 300M        0 ( 0%)    ⚡- 1.7% ± 0.1%
  instructions       645M ± 255        645M … 645M        0 ( 0%)    ⚡- 1.5% ± 0.0%
  cache_references   400K ± 3.70K      397K … 417K        4 ( 6%)    - 0.9% ± 0.8%
  cache_misses       300K ± 6.50K      282K … 309K        4 ( 6%)    - 0.8% ± 0.8%
  branch_misses      3.06M ± 8.81K     3.05M … 3.08M      0 ( 0%)    ⚡- 2.9% ± 0.1%
```

No change in performance appeared when running the benchmark at higher compression levels.
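As a sanity check on the `delta` column, the relative change can be recomputed from the reported means. The sketch below is illustrative only: `relative_delta_percent` is a hypothetical helper, not part of the benchmark tool or of this crate, and it works from the rounded means shown above, so its results may differ in the last digit from the tool's figures (which are presumably computed from unrounded samples).

```rust
/// Relative change of `after` versus `before`, in percent.
/// Negative values mean the measurement went down (an improvement here).
fn relative_delta_percent(before: f64, after: f64) -> f64 {
    (after - before) / before * 100.0
}

fn main() {
    // (name, baseline mean, new mean), copied from the rounded means
    // in the benchmark output; units are noted in the name.
    let rows = [
        ("wall_time (ms)", 84.1, 81.9),
        ("cpu_cycles (M)", 303.0, 298.0),
        ("instructions (M)", 655.0, 645.0),
        ("branch_misses (M)", 3.15, 3.06),
    ];

    for (name, before, after) in rows {
        // Print the signed percentage with one decimal place.
        println!("{name:<18} {:+.1}%", relative_delta_percent(before, after));
    }
}
```

For the four rows marked as significant, the recomputed values agree with the reported deltas to within rounding.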