What do the JSON filenames in the parameters directory mean? #75

Open
qxzhou1010 opened this issue May 7, 2024 · 3 comments

Comments

@qxzhou1010

The parameters directory contains several JSON files, and each filename seems to encode specific information. For instance, I read 1M-1-32.json as: the Sender's set holds one million items, the Receiver's query holds 1 item, and the label length is 32 bytes. Is my understanding correct?

However, some filenames are harder to interpret, such as 1M-512-cmp.json and 1M-512-com.json. What is the difference between these two parameter files? What do the suffixes cmp and com signify?

@kimlaine
Contributor

kimlaine commented Jun 5, 2024

You are correct regarding the sizes. These parameters are generally "good" for these sizes. However, note that you can always use them for a larger or smaller sender set as well, or for a smaller receiver set, but not for a larger receiver set.

The cmp and com suffixes indicate that the parameters are particularly optimized for lower computation (cmp) or lower communication (com). APSI allows for flexible and complicated trade-offs between computation and communication, and these parameters do not by any means represent extremes in either direction. They are just generally good parameters that are particularly good in one of the two aspects.
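
The naming convention described above can be sketched in a few lines of Python. This is only an illustration based on the three filenames mentioned in this thread, not part of APSI: the names `parse_param_name`, `ParamName`, and `receiver_fits` are hypothetical, the `K` size suffix is an assumed extension, and filenames that follow some other pattern are simply rejected.

```python
import re
from typing import NamedTuple, Optional

# Size multipliers. Only "M" appears in this thread; "K" is an assumption.
_SUFFIX = {"": 1, "K": 1_000, "M": 1_000_000}

# <sender size>-<receiver size>[-<label bytes> | -cmp | -com].json
_PATTERN = re.compile(r"^(\d+)([KM]?)-(\d+)(?:-(\d+|cmp|com))?\.json$")


class ParamName(NamedTuple):
    sender_size: int              # sender set size the parameters target
    receiver_size: int            # largest receiver query they target
    label_bytes: Optional[int]    # label length, if the filename gives one
    optimized_for: Optional[str]  # "computation", "communication", or None


def parse_param_name(filename: str) -> ParamName:
    """Split a parameter filename into the fields discussed in this thread."""
    m = _PATTERN.match(filename)
    if m is None:
        raise ValueError(f"unrecognized parameter filename: {filename!r}")
    sender = int(m.group(1)) * _SUFFIX[m.group(2)]
    receiver = int(m.group(3))
    tail = m.group(4)
    label = int(tail) if tail is not None and tail.isdigit() else None
    goal = {"cmp": "computation", "com": "communication"}.get(tail)
    return ParamName(sender, receiver, label, goal)


def receiver_fits(params: ParamName, receiver_items: int) -> bool:
    """The sender set may be larger or smaller than the nominal size,
    but the receiver set must not exceed the size in the filename."""
    return receiver_items <= params.receiver_size
```

For example, `parse_param_name("1M-512-cmp.json")` yields a sender size of 1,000,000, a receiver size of 512, no label length, and `"computation"` as the optimization target; `receiver_fits` then reports whether a given query size stays within that receiver size.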

@MLikeWater

@kimlaine
I have a question: if the data provider (Sender) holds anywhere from tens of millions to billions of items, and the querying party (Receiver) has millions to tens of millions of items to query, how should I choose among these JSON parameter files to achieve high concurrency and fast query performance? Thanks.

@kimlaine
Contributor

kimlaine commented Dec 2, 2024

APSI is designed for the asymmetric case, where the receiver set is much smaller than the sender set. In a more symmetric case you'd generally want to use a different kind of PSI protocol that performs better in that setting. If you still want to use APSI, you can always run many instances of the APSI protocol with parameters for a smaller receiver size: add as many entries to the receiver's query as possible, then keep running queries until you have checked everything. In a streaming-data situation this might be pretty reasonable: send a query whenever you have collected enough data to fill the receiver's query (that is, when adding more items fails).
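
The batching approach described above might look roughly like the following Python sketch. It does not use the APSI API: `run_query` is a hypothetical callback standing in for one complete APSI query, and `capacity` is assumed to be the receiver size the chosen parameters support. Because the input is consumed lazily, the same loop also covers the streaming case, where a query is sent as soon as a batch fills up.

```python
from typing import Callable, Iterable, Iterator, List, Set


def batched(items: Iterable[str], capacity: int) -> Iterator[List[str]]:
    """Yield lists of at most `capacity` items, in arrival order."""
    if capacity < 1:
        raise ValueError("capacity must be at least 1")
    batch: List[str] = []
    for item in items:
        batch.append(item)
        if len(batch) == capacity:
            # The query is full: hand it off and start a new one.
            yield batch
            batch = []
    if batch:
        # Flush the final, partially filled query.
        yield batch


def run_in_batches(
    items: Iterable[str],
    capacity: int,
    run_query: Callable[[List[str]], Set[str]],
) -> Set[str]:
    """Run one query per batch and collect every match.

    `run_query` is a hypothetical stand-in for a single APSI query; it
    takes the receiver's items and returns those found in the sender set.
    """
    matches: Set[str] = set()
    for batch in batched(items, capacity):
        matches |= run_query(batch)
    return matches
```

With a receiver set of 10 million items and parameters sized for 512 receiver items, this would issue roughly 20,000 queries, which is the cost that makes a symmetric PSI protocol the better fit when the two sets are of comparable size.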
