Improve performance of hybrid encrypt CLI #1483
Conversation
There were quite a few bottlenecks here:

* Writes were done serially, writing one file at a time.
* Shares were encrypted on a single CPU core.

I almost used `rayon` to parallelize encryption, but the problem is that we need to keep the output sorted to maintain total order across files. Rayon can do that, but it requires collecting the `ParallelIterator`, which would be bad for generating 100M+ reports. Our goal is to be able to share and encrypt 1B, so streaming and manual fiddling with thread pools is justified imo.

The way this CLI works right now: it keeps a compute pool for encryption (thread-per-core) and a separate pool of 3 threads that write data for each helper in parallel.

I also made a few tweaks to improve code reusability in this module.

## Benchmarks

Time it takes to encrypt 1M reports, measured locally on an M1 Mac Pro (10 cores).

Before this change:

Encryption process is completed. 442.15834075s

After this change:

Encryption process is completed. 55.63269625s
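The actual pool wiring lives in the CLI's encryption module touched by this PR; the snippet below is only a minimal sketch of the shape described above, assuming an index-tagged work queue feeding a thread-per-core compute pool plus one writer thread per helper that restores input order before writing. Every name here (`encrypt_report`, the channel layout, collecting output in memory instead of writing files) is an illustrative assumption, not code from this PR.

```rust
use std::collections::BTreeMap;
use std::sync::{mpsc, Arc, Mutex};
use std::thread;

// Illustrative stand-in for the real per-report encryption: one ciphertext per helper.
fn encrypt_report(report: u64) -> [String; 3] {
    [0, 1, 2].map(|h| format!("helper{h}:ct({report})"))
}

fn main() {
    let reports: Vec<u64> = (0..1_000).collect();
    let cores = thread::available_parallelism().map(|n| n.get()).unwrap_or(4);

    // Work queue shared by the compute pool (thread-per-core).
    let (work_tx, work_rx) = mpsc::channel::<(usize, u64)>();
    let work_rx = Arc::new(Mutex::new(work_rx));

    // One channel and one writer thread per helper (3 helpers).
    let (out_txs, writers): (Vec<_>, Vec<_>) = (0..3)
        .map(|helper| {
            let (tx, rx) = mpsc::channel::<(usize, String)>();
            let handle = thread::spawn(move || {
                // Buffer out-of-order results so output keeps the input order.
                let mut pending = BTreeMap::new();
                let mut next: usize = 0;
                let mut lines = Vec::new();
                for (idx, ct) in rx {
                    pending.insert(idx, ct);
                    while let Some(ready) = pending.remove(&next) {
                        lines.push(ready); // stand-in for the per-helper file write
                        next += 1;
                    }
                }
                (helper, lines.len())
            });
            (tx, handle)
        })
        .unzip();

    // Compute pool: pull (index, report), encrypt, fan the three shares out to the writers.
    let workers: Vec<_> = (0..cores)
        .map(|_| {
            let rx = Arc::clone(&work_rx);
            let out_txs = out_txs.clone();
            thread::spawn(move || loop {
                // Hold the lock only while receiving, not while encrypting.
                let job = { rx.lock().unwrap().recv() };
                let Ok((idx, report)) = job else { break };
                for (helper, ct) in encrypt_report(report).into_iter().enumerate() {
                    out_txs[helper].send((idx, ct)).unwrap();
                }
            })
        })
        .collect();

    for (idx, r) in reports.into_iter().enumerate() {
        work_tx.send((idx, r)).unwrap();
    }
    drop(work_tx); // no more work: compute threads exit once the queue drains
    for w in workers {
        w.join().unwrap();
    }
    drop(out_txs); // close the writer channels so the writer loops terminate
    for w in writers {
        let (helper, n) = w.join().unwrap();
        println!("helper {helper}: {n} ciphertexts written in input order");
    }
}
```

Buffering out-of-order results in a `BTreeMap` keyed by the original index is one way to keep total order while streaming, which is the property a plain rayon `par_iter().map(...).collect()` would only give us by materializing the whole result set.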
Codecov Report
Attention: Patch coverage is
Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1483      +/-   ##
==========================================
- Coverage   93.37%   93.27%   -0.10%
==========================================
  Files         239      239
  Lines       43476    43675     +199
==========================================
+ Hits        40594    40739     +145
- Misses       2882     2936      +54
            )
        })
        .collect(),
    next_worker: 0,
why is `next_worker` always 0? what's the point of it if it's constant?
mutation occurs inside `encrypt_share`
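For context on the question above: `next_worker` starts at 0 when the pool is built and is mutated inside `encrypt_share`, presumably as a round-robin cursor over the worker threads (the round-robin detail is an assumption, not stated in the thread). A minimal, self-contained sketch of that pattern, with every name other than `next_worker` and `encrypt_share` invented for the example:

```rust
use std::sync::mpsc;
use std::thread;

// `next_worker` mirrors the field being discussed; everything else here
// (types, worker payloads) is made up for the example.
struct EncryptorPool {
    workers: Vec<mpsc::Sender<u64>>,
    next_worker: usize, // starts at 0, advanced on every dispatch
}

impl EncryptorPool {
    fn encrypt_share(&mut self, share: u64) {
        // Round-robin: hand the share to the current worker, then advance the cursor.
        self.workers[self.next_worker]
            .send(share)
            .expect("worker thread hung up");
        self.next_worker = (self.next_worker + 1) % self.workers.len();
    }
}

fn main() {
    let mut senders = Vec::new();
    let mut handles = Vec::new();
    for id in 0..4 {
        let (tx, rx) = mpsc::channel();
        senders.push(tx);
        handles.push(thread::spawn(move || {
            for share in rx {
                // Placeholder for the actual encryption work.
                println!("worker {id} got share {share}");
            }
        }));
    }

    let mut pool = EncryptorPool { workers: senders, next_worker: 0 };
    for share in 0..8u64 {
        pool.encrypt_share(share);
    }
    drop(pool); // drops the senders so the worker loops terminate
    for h in handles {
        h.join().unwrap();
    }
}
```

Because the cursor is advanced on every call rather than at construction time, the field legitimately reads as a constant `0` at the struct-literal site shown in the diff.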