Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RFC][oprf][shuffle] OPRF Shuffle using a 2-round 4-message shuffle protocol #809

Closed
wants to merge 17 commits into from

Conversation

cryo28
Copy link
Contributor

@cryo28 cryo28 commented Oct 18, 2023

  1. New query type
  2. Implementation of the protocol (not sharded)

This is not a final code I'd like to land. But more a preview to gather feedback and comments. In a few places I don't really know what I am doing so I resorted to copy-pasta.

If there is not major problems with the code I'll proceed to writing a bunch of unit tests, docs to package the commit for merging.

image

1. New query type
2. Implementation of the protocol (not sharded)
};

#[derive(Clone)]
pub struct Context<'a> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the purpose of this code?

Ok(res)
}

async fn run_h3<C, L, R>(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a lot of code duplication between these functions.

impl Default for QueryConfig {
fn default() -> Self {
Self {
bk_size: 40,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could probably be smaller.

@cryo28 cryo28 force-pushed the oprf-shuffle branch 4 times, most recently from d24e086 to 572bb25 Compare October 18, 2023 23:39
Copy link
Collaborator

@akoshelev akoshelev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Haven't finished the whole thing, some initial feedback in place

}
}

/// Helps to convince the compiler that things are `Send`. Like `seq_join::assert_send`, but for
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should reuse the existing function

let iter = std::iter::from_fn(move || Some(OPRFShuffleSingleShare::sample(&mut rng)))
.take(batch_size as usize);

// NOTE: I'd like to return an Iterator from here as there is really no need to allocate batch_size of items.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

type Output = OPRFShuffleSingleShare;

fn add(self, rhs: Self) -> Self::Output {
*self + *rhs // Relies on Copy
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you don't need this comment, rustc will complain if Self is not Copy

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is more generic than the implementation above, so you could switch the bodies of add

GeneratePi12,
GeneratePi23,
GeneratePi31,
GenerateZ12,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for $Z_{ij}$ tables, all helpers need to generate 2 of them using shared randomess from left and right. That should require only 2 steps.

Likely it is the same for permutations

}
}

pub fn sample<R: Rng>(rng: &mut R) -> Self {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you may want to implement Standard: Distribution<MyStruct> here

) -> Result<Vec<OPRFShuffleSingleShare>, Error> {
let (step_send, step_recv, dir) = match ctx.role() {
Role::H2 => (
OPRFShuffleStep::TransferCHat1,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you should be able to use the same step for send and receive. Depending on the role, one helper will be only sending data, while recipient consuming it at the other end

Ok(output)
}

async fn exchange_c_hat<C: Context, I: IntoIterator<Item = OPRFShuffleSingleShare>>(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is probably easier not to have this function

ctx,
&OPRFShuffleStep::TransferX2,
Direction::Right,
x_2.clone(),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clone here feels wrong

permutation
}

fn permute(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this relies on the IntoInterator specialization to avoid extra allocations, but I don't see why it can't be a one liner apply(permutation, &mut data) on the caller's side

///
/// ![Apply steps][apply]
fn apply<T>(permutation: &[u32], values: &mut [T]) {
// NOTE: This is copypasta from crate::protocol::sort
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we lift it out of sort and reuse?

@@ -8,6 +9,7 @@ use std::{num::NonZeroUsize, sync::Arc};

use async_trait::async_trait;
pub use malicious::{Context as MaliciousContext, Upgraded as UpgradedMaliciousContext};
pub use oprf::Context as OPRFContext;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

leftover?

Artem Ignatyev added 4 commits October 19, 2023 12:16
1. Unified GenerateZ_ij and GeneratePi_ij steps
2. Moved assert_stream_send into a new module create:one_off_fns (and updated a few callsites to use it)
3. Replaced try_join! macro with future:try_join fn calls
4. Unified ExchangeCHat1 and ExchangeCHat2 steps into ExchangeCHat
5. Replaced calls to exchange_c_hat function with inline send/receive
6. Dropped OPRFContext. And used Base context instead. I had to make Base::new public
7. Removed unnecessary x_2.clone() un run_h1
8. impl Distrubution<OPRFShuffleSingleShare> for Standard and usage of
   rng.sample_iter(Standard) instead of manual OPRFShuffle::sample()
9. Got rid of permute fn. Call apply (shuffle) inline
1. Rewrote implt Add for &OPRFShuffleSingleShare in a way that does not spook clippy
2. Hoisted variables for narrow contexts generating random tables into
   the callers to  avoid allocation of them. Instead the addition of
   random tables are done by combining iterators if possible
3. Extract the pieces of work common for all 3 helprs (generate pis,
   generate zs) to the calling function
1. Renamed OPRFShuffleSingleShare -> OPRFShare
2. Extract OPRFShare and all its impls into oprf_share module
3. Used apply (permutation) from protocol/sort/apply.rs
Copy link
Collaborator

@danielmasny danielmasny left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shuffle looks great to me! I have a couple of suggestions and left them as comments. I haven't checked the new query type aspects of the pr.

Thanks for all your work!

@@ -6,6 +6,7 @@ pub mod context;
pub mod dp;
pub mod ipa;
pub mod modulus_conversion;
pub mod oprf;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
pub mod oprf;
#[cfg(feature = "ipa-prf")]
pub mod oprf;

I used the ipa-prf feature which requires descriptive gate, but it hasn't landed yet but probably will land soon.

@@ -6,6 +6,7 @@ pub mod context;
pub mod dp;
pub mod ipa;
pub mod modulus_conversion;
pub mod oprf;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should probably not call this oprf since it is just a shuffle and a basic protocol that could be used in different context. It is not related to an oprf other than that we want to use it together with a prf/orpf in our new IPA version.


use self::oprf_share::{OPRFShare, OprfBK, OprfF, OprfMK};
use super::{
context::Context, ipa::IPAInputRow, sort::apply::apply as apply_permutation, RecordId,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like that we refer here to the sort crate, we should probably refactor apply permutation as its own basic protocol. When we do the sharding, we anyway need to add a new, sharded version of apply which will be separated from sort.

use ipa_macros::Step;
use rand::{distributions::Standard, seq::SliceRandom, Rng};

use self::oprf_share::{OPRFShare, OprfBK, OprfF, OprfMK};
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer to not use oprf in the names here because it is confusing, rather something with shuffle

let ctx_b_hat = ctx.narrow(&OPRFShuffleStep::GenerateBHat);
let b_hat: Vec<_> =
generate_random_table_solo(batch_size, &ctx_b_hat, Direction::Right).collect();

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there a reason why you generate a hat, b hat here and not together with z and pi? I think it would be cleaner to generate them within the same function.


fn generate_pseudorandom_permutation<R: Rng>(batch_size: u32, rng: &mut R) -> Vec<u32> {
let mut permutation = (0..batch_size).collect::<Vec<_>>();
permutation.shuffle(rng);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we use the same function: .shuffle in our sorting protocol for generating random permutations from a seed? It seems to me that this could be generated on the fly rather than storing this huge decompressed permutation in memory.

pub breakdown_key: OprfBK,
pub trigger_value: OprfF,
}

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we implement (Weak)SharedValue for OPRFShare? Then we could operate on SharedValue during the shuffle. (WeakSharedValue is part of #795 (comment) and only requires to implement additions).

Comment on lines 147 to 148

impl Message for OPRFShare {}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We dont need that when implementing SharedValue

@@ -1,9 +1,9 @@
pub mod apply;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see above, we should consider remove this from sort and add it e.g. as a basic protocol

let mut permutation = (0..batch_size).collect::<Vec<_>>();
permutation.shuffle(rng);
permutation
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is not a single test. We should test whether this is actually correct, by applying it e.g. to a sorted list, and checking the output for inequality with the input list and then sorting the output again and checking the list with the input list for equality. It should be easy to do by using TestWorld

Artem Ignatyev added 8 commits October 20, 2023 14:51
1. Renamed oprf_shuffle module into oprf::shuffle
2. Renamed OPRFShare to ShuffleShare
3. Renamed OPRFShuffle* type aliases (for fields) into ShufleShare*
…individual basic protocol

Moved protocols::sort::apply to protocols::basic:apply_permutation
…module

This is as prep for making oprf/shuffle protocol generic
…ying some trait bounds

the shuffle, run_hN functions do not depend on a particular format of
input rows. Instead they can receiv IntoIterators consuming anything
satisfying a few bounds

I have also moved the logic of dealing with specific inputs
(deserialize, serialize, split into individual shares, combine back)
into a submodule of queiry::runner::oprf, as I expect all this code to
be thrown away/reworked, when the actual formats of inputs/outputs are
more clear.

But for now this change should let me focus on writing unit tests for
the protocol, that should remain relevant even if
the data input/output formats change
…erator of ReplicatedSecretShares

Instead of accepting a tuple of 2 iterators returning SharedValues
accept a single iterator of AdditiveShares: ReplicatedSecretShares

This should make interaction with TestWorld easier
Artem Ignatyev added 3 commits October 25, 2023 09:41
Instead, shuffle in-place based on shared randomness
If a vec produced by permutation is used only to compute some other
"table" by adding some other table to it, we can do that in-place
@cryo28 cryo28 closed this Oct 25, 2023
@cryo28
Copy link
Contributor Author

cryo28 commented Oct 25, 2023

Closing in favor of #816

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants