feat: make Compressor::train 2x faster with bitmap index #16
The slowest part of `Compressor::train` is the doubly nested loop over codes. Now, when `compress_count` records code pairs, it also populates a bitmap index, where `pairs_index[code1].set(code2)` indicates that code2 followed code1 in the compressed output. In the `optimize` loop, we can eliminate tight loop iterations by accessing `pairs_index[code1].second_codes()`, which yields only the code2 values that were recorded. This results in a speedup from ~1ms to ~500µs.
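A minimal sketch of the idea, not the PR's actual implementation: the names `pairs_index`, `set` and `second_codes` come from the description above, while `CodesBitmap`, `observed_pairs`, the 512-code limit and the `u64` word layout are assumptions made for illustration.

```rust
/// Number of distinct codes tracked; 512 is an assumption for this sketch.
const MAX_CODE: usize = 512;
const WORDS: usize = MAX_CODE / 64;

/// One row of the index: bit `c` is set if code `c` was seen as a successor.
/// (Hypothetical type, not taken from the crate.)
#[derive(Clone, Copy, Default)]
struct CodesBitmap {
    words: [u64; WORDS],
}

impl CodesBitmap {
    /// Record that `code` followed the code owning this row.
    fn set(&mut self, code: usize) {
        self.words[code / 64] |= 1u64 << (code % 64);
    }

    /// Yield the recorded codes in ascending order; empty words cost one check.
    fn second_codes(&self) -> impl Iterator<Item = usize> + '_ {
        self.words.iter().enumerate().flat_map(|(w, &word)| {
            let mut bits = word;
            std::iter::from_fn(move || {
                if bits == 0 {
                    return None;
                }
                let bit = bits.trailing_zeros() as usize;
                bits &= bits - 1; // clear the lowest set bit
                Some(w * 64 + bit)
            })
        })
    }
}

/// Build the index from observed pairs, then visit only pairs that occurred
/// instead of scanning all MAX_CODE * MAX_CODE combinations.
fn observed_pairs(pairs: &[(usize, usize)]) -> Vec<(usize, usize)> {
    let mut pairs_index = vec![CodesBitmap::default(); MAX_CODE];
    for &(code1, code2) in pairs {
        pairs_index[code1].set(code2);
    }
    let mut out = Vec::new();
    for code1 in 0..MAX_CODE {
        for code2 in pairs_index[code1].second_codes() {
            out.push((code1, code2));
        }
    }
    out
}
```

The inner loop now runs once per pair that was actually observed, plus a handful of word checks per row, which is where the saving over a full nested scan would come from when most pairs never occur.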
pub fn reset(&mut self) {
    for idx in 0..COUNTS1_SIZE {
        self.counts1[idx] = 0;
    }
    for idx in 0..COUNTS2_SIZE {
        self.counts2[idx] = 0;
this was slower than just building a new `Counter` b/c of the `vec![0]` change made in the previous PR
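A rough sketch of the two options being compared, under stated assumptions: the field names and constants mirror the hunk above, but the constant values, the `usize` element type and the use of `Vec` are hypothetical. Allocating with `vec![0; N]` typically lets the allocator hand back already-zeroed memory, which is one plausible reason a fresh `Counter` could beat an element-by-element reset.

```rust
// Hypothetical sizes; the real values are not shown in the PR excerpt.
const COUNTS1_SIZE: usize = 512;
const COUNTS2_SIZE: usize = 512 * 512;

struct Counter {
    counts1: Vec<usize>,
    counts2: Vec<usize>,
}

impl Counter {
    /// Option A: build a fresh counter from zero-initialized vectors.
    fn new() -> Self {
        Self {
            counts1: vec![0; COUNTS1_SIZE],
            counts2: vec![0; COUNTS2_SIZE],
        }
    }

    /// Option B: zero the existing buffers in place.
    fn reset(&mut self) {
        self.counts1.fill(0);
        self.counts2.fill(0);
    }
}
```

Which option is faster depends on the allocator and the buffer sizes, so it is worth benchmarking both rather than assuming.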
i don't want to lose my 30s CI checks
Force-pushed from 64beee7 to f74d185.
## 🤖 New release
* `fsst-rs`: 0.2.0 -> 0.2.1

<details><summary><i><b>Changelog</b></i></summary><p>

<blockquote>

## [0.2.1](v0.2.0...v0.2.1) - 2024-08-20

### Added
- make Compressor::train 2x faster with bitmap index ([#16](#16))
</blockquote>

</p></details>

---
This PR was generated with [release-plz](https://github.com/MarcoIeni/release-plz/).

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
if self.block == 0 {
    return None;
}
Shouldn't it be possible to skip this check?
Good catch! #18
The speedup from ~1ms to ~500µs is for the training benchmark. We're sub-millisecond!
This also makes Miri somewhat palatable to run for all but `test_large`, so I've re-enabled it for CI (currently it runs in 2.5 minutes, a far cry from the < 30s build+test step, but I guess it's for a good cause).