Commit
feat: make Compressor::train 2x faster with bitmap index (#16)
The slowest part of `Compressor::train` is the doubly nested loop over codes. Now, when `compress_count` records code pairs, it also populates a bitmap index, where `pairs_index[code1].set(code2)` indicates that code2 followed code1 in the compressed output. In the `optimize` loop, we can skip the tight-loop iterations for pairs that never occurred by iterating `pairs_index[code1].second_codes()`, which yields only the code2 values that were set.

This results in a speedup from ~1ms to ~500µs for the training benchmark. We're sub-millisecond!

This also makes Miri somewhat palatable to run for all but `test_large`, so I've re-enabled it in CI. It currently runs in 2.5 minutes, a far cry from the < 30s build+test step, but I guess it's for a good cause.
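The bitmap index described in the commit message can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: the `CodesBitmap` and `PairCounter` types, the `record_pair` and `observed_pairs` helpers, the flat `counts` table, and the `MAX_CODES` value of 512 are all assumptions made for the example. Only the names `pairs_index`, `set`, and `second_codes` come from the message.

```rust
/// Hypothetical number of distinct codes (an assumption for this sketch).
const MAX_CODES: usize = 512;
/// Number of 64-bit words needed to hold one bit per code.
const WORDS: usize = MAX_CODES / 64;

/// One row of the index: bit `code2` is set if `code2` was seen
/// immediately after the code that owns this row.
#[derive(Clone, Copy)]
struct CodesBitmap {
    words: [u64; WORDS],
}

impl CodesBitmap {
    const fn new() -> Self {
        Self { words: [0; WORDS] }
    }

    /// Mark `code2` as having followed this row's code.
    fn set(&mut self, code2: usize) {
        self.words[code2 / 64] |= 1u64 << (code2 % 64);
    }

    /// Yield every set bit in ascending order, skipping empty words cheaply.
    fn second_codes(&self) -> impl Iterator<Item = usize> + '_ {
        self.words.iter().enumerate().flat_map(|(i, &word)| {
            let mut w = word;
            std::iter::from_fn(move || {
                if w == 0 {
                    return None;
                }
                let bit = w.trailing_zeros() as usize;
                w &= w - 1; // clear the lowest set bit
                Some(i * 64 + bit)
            })
        })
    }
}

/// Hypothetical counter that keeps pair counts plus the bitmap index.
struct PairCounter {
    counts: Vec<u32>,              // flat MAX_CODES x MAX_CODES table
    pairs_index: Vec<CodesBitmap>, // one bitmap per first code
}

impl PairCounter {
    fn new() -> Self {
        Self {
            counts: vec![0; MAX_CODES * MAX_CODES],
            pairs_index: vec![CodesBitmap::new(); MAX_CODES],
        }
    }

    /// Record that `code2` followed `code1`, updating both count and index.
    fn record_pair(&mut self, code1: usize, code2: usize) {
        self.counts[code1 * MAX_CODES + code2] += 1;
        self.pairs_index[code1].set(code2);
    }

    /// Visit only pairs that actually occurred, instead of all
    /// MAX_CODES * MAX_CODES combinations.
    fn observed_pairs(&self) -> Vec<(usize, usize, u32)> {
        let mut out = Vec::new();
        for code1 in 0..MAX_CODES {
            for code2 in self.pairs_index[code1].second_codes() {
                out.push((code1, code2, self.counts[code1 * MAX_CODES + code2]));
            }
        }
        out
    }
}
```

In this sketch the inner loop runs once per observed pair rather than once per possible code2, which is the kind of saving the message attributes to the index. The actual speedup depends on how sparse the pair table is in practice.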
Showing 3 changed files with 196 additions and 23 deletions.
@@ -0,0 +1,52 @@
name: Miri

on:
  push:
    branches: ["develop"]
  pull_request: {}
  workflow_dispatch: {}

permissions:
  actions: read
  contents: read

jobs:
  miri:
    name: "miri"
    runs-on: ubuntu-latest
    env:
      RUST_BACKTRACE: 1
      MIRIFLAGS: -Zmiri-strict-provenance -Zmiri-symbolic-alignment-check -Zmiri-backtrace=full
    steps:
      - uses: actions/checkout@v4

      - name: Rust Version
        id: rust-version
        shell: bash
        run: echo "version=$(cat rust-toolchain.toml | grep channel | awk -F'\"' '{print $2}')" >> $GITHUB_OUTPUT

      - name: Rust Toolchain
        id: rust-toolchain
        uses: dtolnay/rust-toolchain@master
        if: steps.rustup-cache.outputs.cache-hit != 'true'
        with:
          toolchain: "${{ steps.rust-version.outputs.version }}"
          components: miri

      - name: Rust Dependency Cache
        uses: Swatinem/rust-cache@v2
        with:
          save-if: ${{ github.ref == 'refs/heads/develop' }}
          shared-key: "shared" # To allow reuse across jobs

      - name: Rust Compile Cache
        uses: mozilla-actions/[email protected]
      - name: Rust Compile Cache Config
        shell: bash
        run: |
          echo "SCCACHE_GHA_ENABLED=true" >> $GITHUB_ENV
          echo "RUSTC_WRAPPER=sccache" >> $GITHUB_ENV
          echo "CARGO_INCREMENTAL=0" >> $GITHUB_ENV
      - name: Run tests with Miri
        run: cargo miri test
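
The commit message says Miri is run for all but `test_large`. One common way to exclude a slow test from `cargo miri test` while keeping it in a normal `cargo test` run is the `#[cfg_attr(miri, ignore)]` attribute. Whether this repository uses that mechanism is an assumption. The `checksum` function and both test bodies below are hypothetical stand-ins, and only the test name `test_large` comes from the message.

```rust
/// Hypothetical stand-in for an expensive operation under test.
fn checksum(data: &[u8]) -> u64 {
    data.iter()
        .fold(0u64, |acc, &b| acc.wrapping_mul(31).wrapping_add(b as u64))
}

#[cfg(test)]
mod tests {
    use super::checksum;

    // Small input: cheap enough to run under Miri's interpreter.
    #[test]
    fn test_small() {
        assert_eq!(checksum(&[1, 2, 3]), 1026);
    }

    // Large input: ignored when the `miri` cfg is set (as it is under
    // `cargo miri test`), still executed by a regular `cargo test`.
    #[test]
    #[cfg_attr(miri, ignore)]
    fn test_large() {
        let data = vec![0u8; 8 * 1024 * 1024];
        assert_eq!(checksum(&data), 0);
    }
}
```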