Improve performance of hybrid encrypt CLI #1483
Conversation
There were quite a few bottlenecks here:

* Writes were done serially, writing one file at a time.
* Shares were encrypted on a single CPU core.

I almost used `rayon` to parallelize encryption, but the problem is that we need to keep the output sorted to maintain total order across files. Rayon can do that, but it requires collecting the `ParallelIterator`, which would be bad for generating 100M+ reports. Our goal is to be able to share and encrypt 1B, so streaming and manual fiddling with thread pools is justified imo.

The way this CLI works right now: it keeps a compute pool for encryption (thread-per-core) and a separate pool of 3 threads that write data for each helper in parallel.

I also made a few tweaks to improve code reusability in this module.

## Benchmarks

Time it takes to encrypt 1M reports, measured locally on an M1 Mac Pro (10 cores).

Before this change:

Encryption process is completed. 442.15834075s

After this change:

Encryption process is completed. 55.63269625s
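The actual pool wiring lives in the CLI's encryption module touched by this PR; the snippet below is only a minimal sketch of the shape described above, assuming an index-tagged work queue feeding a thread-per-core compute pool plus one writer thread per helper that restores input order before writing. Every name here (`encrypt_report`, the channel layout, collecting output in memory instead of writing files) is an illustrative assumption, not code from this PR.

```rust
use std::collections::BTreeMap;
use std::sync::{mpsc, Arc, Mutex};
use std::thread;

// Illustrative stand-in for the real per-report encryption: one ciphertext per helper.
fn encrypt_report(report: u64) -> [String; 3] {
    [0, 1, 2].map(|h| format!("helper{h}:ct({report})"))
}

fn main() {
    let reports: Vec<u64> = (0..1_000).collect();
    let cores = thread::available_parallelism().map(|n| n.get()).unwrap_or(4);

    // Work queue shared by the compute pool (thread-per-core).
    let (work_tx, work_rx) = mpsc::channel::<(usize, u64)>();
    let work_rx = Arc::new(Mutex::new(work_rx));

    // One channel and one writer thread per helper (3 helpers).
    let (out_txs, writers): (Vec<_>, Vec<_>) = (0..3)
        .map(|helper| {
            let (tx, rx) = mpsc::channel::<(usize, String)>();
            let handle = thread::spawn(move || {
                // Buffer out-of-order results so output keeps the input order.
                let mut pending = BTreeMap::new();
                let mut next: usize = 0;
                let mut lines = Vec::new();
                for (idx, ct) in rx {
                    pending.insert(idx, ct);
                    while let Some(ready) = pending.remove(&next) {
                        lines.push(ready); // stand-in for the per-helper file write
                        next += 1;
                    }
                }
                (helper, lines.len())
            });
            (tx, handle)
        })
        .unzip();

    // Compute pool: pull (index, report), encrypt, fan the three shares out to the writers.
    let workers: Vec<_> = (0..cores)
        .map(|_| {
            let rx = Arc::clone(&work_rx);
            let out_txs = out_txs.clone();
            thread::spawn(move || loop {
                // Hold the lock only while receiving, not while encrypting.
                let job = { rx.lock().unwrap().recv() };
                let Ok((idx, report)) = job else { break };
                for (helper, ct) in encrypt_report(report).into_iter().enumerate() {
                    out_txs[helper].send((idx, ct)).unwrap();
                }
            })
        })
        .collect();

    for (idx, r) in reports.into_iter().enumerate() {
        work_tx.send((idx, r)).unwrap();
    }
    drop(work_tx); // no more work: compute threads exit once the queue drains
    for w in workers {
        w.join().unwrap();
    }
    drop(out_txs); // close the writer channels so the writer loops terminate
    for w in writers {
        let (helper, n) = w.join().unwrap();
        println!("helper {helper}: {n} ciphertexts written in input order");
    }
}
```

Buffering out-of-order results in a `BTreeMap` keyed by the original index is one way to keep total order while streaming, which is the property a plain rayon `par_iter().map(...).collect()` would only give us by materializing the whole result set.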
Codecov Report
Attention: Patch coverage is
Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1483      +/-   ##
==========================================
- Coverage   93.37%   93.27%   -0.10%
==========================================
  Files         239      239
  Lines       43476    43675     +199
==========================================
+ Hits        40594    40739     +145
- Misses       2882     2936      +54
            )
        })
        .collect(),
    next_worker: 0,
why is `next_worker` always 0? what's the point of it if it's constant?
mutation occurs inside `encrypt_share`
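For context on the question above: `next_worker` starts at 0 when the pool is built and is mutated inside `encrypt_share`, presumably as a round-robin cursor over the worker threads (the round-robin detail is an assumption, not stated in the thread). A minimal, self-contained sketch of that pattern, with every name other than `next_worker` and `encrypt_share` invented for the example:

```rust
use std::sync::mpsc;
use std::thread;

// `next_worker` mirrors the field being discussed; everything else here
// (types, worker payloads) is made up for the example.
struct EncryptorPool {
    workers: Vec<mpsc::Sender<u64>>,
    next_worker: usize, // starts at 0, advanced on every dispatch
}

impl EncryptorPool {
    fn encrypt_share(&mut self, share: u64) {
        // Round-robin: hand the share to the current worker, then advance the cursor.
        self.workers[self.next_worker]
            .send(share)
            .expect("worker thread hung up");
        self.next_worker = (self.next_worker + 1) % self.workers.len();
    }
}

fn main() {
    let mut senders = Vec::new();
    let mut handles = Vec::new();
    for id in 0..4 {
        let (tx, rx) = mpsc::channel();
        senders.push(tx);
        handles.push(thread::spawn(move || {
            for share in rx {
                // Placeholder for the actual encryption work.
                println!("worker {id} got share {share}");
            }
        }));
    }

    let mut pool = EncryptorPool { workers: senders, next_worker: 0 };
    for share in 0..8u64 {
        pool.encrypt_share(share);
    }
    drop(pool); // drops the senders so the worker loops terminate
    for h in handles {
        h.join().unwrap();
    }
}
```

Because the cursor is advanced on every call rather than at construction time, the field legitimately reads as a constant `0` at the struct-literal site shown in the diff.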