Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

About A2B in Rep4 #1509

Open
sycamoreeeee opened this issue Sep 29, 2024 · 5 comments
Open

About A2B in Rep4 #1509

sycamoreeeee opened this issue Sep 29, 2024 · 5 comments

Comments

@sycamoreeeee
Copy link

Hi Keller, i'm studying the protocol and implementation of Fantastic Four.
In the paper, share splitting operation is used to convert arithmetic shares to boolean ones with Binary Adder, however, i can't connect the described protocol with the code of Rep4::split, especially the parameter "regs" and the unit processing.
Can you help me to understand the implementation? Thanks!

@mkskeller
Copy link
Member

Rep4<T>::split implements a functionality that didn't make it into the paper. The functionality in the paper is found in Rep4Share2<K>::split. The difference between the two is akin to the two variants for bit decomposition in Section 5.3 of https://eprint.iacr.org/2018/403. In terms of parameters and techniques the two implementations are quite similar, however. regs comes from the virtual machine design and it's simply the arguments to the split instruction (apart from the length argument): https://mp-spdz.readthedocs.io/en/latest/instructions.html#Compiler.GC.instructions.split. The unit processing is a tool for optimization in the vectorized setting (i.e. processing many elements at once). Extracting single bits naively is expensive, but modern CPUs have instructions that allow optimizing transposing matrices of bits. The split code translates the problem of extracting vectors of bits from vectors of elements to this transposition. The maximum unit of this transposition is 64x64 bits, which is a natural choice for 64-bit processors.

@sycamoreeeee
Copy link
Author

Hi Keller, in the paper, after locally splitting the [x]_2k (x = x1 + x2 + x3 + x4) into { [x1_j ]_2, [x2_j ]_2, [x3_j ]_2, [x4_j ]_2 } for all j in [k], the final step of computing [x_j]_2 is to use a binary adder. I want to know how to sum these splits up concretely? Is it to first compute y1 = (x1 + x2) and y2 = (x2 + x3) then compute x = (y1 + y2) ?
Also, in the paper, the formal detailed descriptions of mixed circuit computation such as B2A conversions, MSB extraction, bit injection and so on are omitted, while the Rep4 2k offers the basic mul, dotprod and split. Can you help me to find out the full-fledged implementation details? I would appreciate that, thank you!

@mkskeller
Copy link
Member

Hi Keller, in the paper, after locally splitting the [x]_2k (x = x1 + x2 + x3 + x4) into { [x1_j ]_2, [x2_j ]_2, [x3_j ]_2, [x4_j ]_2 } for all j in [k], the final step of computing [x_j]_2 is to use a binary adder. I want to know how to sum these splits up concretely? Is it to first compute y1 = (x1 + x2) and y2 = (x2 + x3) then compute x = (y1 + y2) ?

We use Wallace trees, a technique from binary multiplication: https://en.wikipedia.org/wiki/Wallace_tree

Also, in the paper, the formal detailed descriptions of mixed circuit computation such as B2A conversions, MSB extraction, bit injection and so on are omitted, while the Rep4 2k offers the basic mul, dotprod and split. Can you help me to find out the full-fledged implementation details? I would appreciate that, thank you!

Bit injection is very simple and explained in the text of Section 3. MSB extraction is done by simply computing the MSB in binary as above followed by converting to arithmetic using bit injection. Lastly, B2A with more than one bit can be either be done using edaBits (https://eprint.iacr.org/2020/338) or with a few tricks that aren't in the paper because it's not really relevant in applications.

@mkskeller mkskeller reopened this Dec 13, 2024
@sycamoreeeee
Copy link
Author

Thank you. I'm still studying since I want to re-implement the Rep4.
In the paper you mentioned that “The most common design would use k AND gates in a first step to compute k generate-propagate tuples, followed by tree-wise reduction where every step involves 2 AND gates, resulting in 3k AND gates overall.”

Does the tree-wise reduction correspond to the Wallace tree?
If so, the way I understand it is to use the Full Adder to first reduce x1+x2+x3 to c and s (like ABY3) and repeat for (c, s, x4) to obtain c', s', on which we can use binary adder like PPA to compute the final result?
Is that correct?
If so, since 2 reductions require about 2k FullAdders and each FullAdder requires 2 AND Gates, the overall number of AND Gates and communication rounds could be (5k, k + 2) or (4k + klogk, 2 + log k) with PPA?

@mkskeller
Copy link
Member

You can implement a full adder with only one AND gate when you use a MUX gate (which requires one AND): https://www.researchgate.net/figure/Full-adder-using-XOR-gates-and-a-MUX_fig6_234773872
Furthermore, MP-SPDZ achieves a trade-off between AND gates and rounds using a carry-select adder rather than PPA: https://en.wikipedia.org/wiki/Carry-select_adder

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants