-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Avoid using floating point arithmetic for privacy budget #77
Comments
Good question. I've had similar thoughts. Though I'd managed to suppress them. You have to shave things pretty finely to end up with problems here. Sure, a three-way split is going to accumulate an error, but it is so small as to be a non-event. In many cases, you won't even have an error. Is dividing by 1k going to give people the resolution they require? Or would If we accept some amount of floating point error, then that might push more toward less absolute actions at the enforcement end. Reducing contributions proportionally, rather than dropping them, for example. Let's say that you had someone spend 1/3 then 1/2 of their budget. The 1/3 will have rounding errors, but I've observed that spending the remaining 1/6 doesn't result in problems. You have to get pretty exotic with your budget computations before you are even off by the (floating point) epsilon of 2^-304 or whatever that is. |
Actually I realized another issue that will need some thought, which is the budget calculation is computed as
(3) is obviously not practical. Unfortunately for (2) we don't have fixed precision primitives but it can be simulated by rounding to integral values and using something like
Can you expand on this? What do you mean by problems? Is it related to #16? |
This is just a general observation that - in a lot of cases anyway - the errors involved in multiple floating point additions amount to zero. Not close to zero, but zero. That's likely because the rounding either cancels out or it never ends up being large enough to tickle that 52nd bit of the mantissa. That is, maybe it isn't worth the stress involved with rationals, or even It might be worth quantifying risks.
This is all testable though. I seriously doubt that any budget would result in a significant privacy loss if split multiple ways, as long as there was some minimum spend on each query. (That's a fun exercise, work out what sequence of floating point values ensure the greatest difference between an actual and realized sum. In practice, as long as you don't allow spending in increments of less than Similarly, the consequences of rounding errors leading to budget overflow and loss of data is a. Manageable (it's possible to compute how different spending patterns lead to bad rounding and avoid that) There is also a risk that contributions to histogram need to be split in ways that result in non-integer contributions. There, I think our best recourse is Of course, after writing all this, I'm still of the "don't overthink it" persuasion, despite having thoroughly overthought it. |
This popped up just today: https://github.com/tc39/proposal-math-sum (To be clear, I'm not proposing that we use that. I'm suggesting that we don't.) |
This is a fairly minor issue, but it has been bugging me since reading the spec. We capture the value of
epsilon
inPrivateAttributionConversionOptions
as adouble
, which implies (though it is not specified) that we will be doing accounting indouble
space.Since we will be doing repeated addition / subtraction here, there is a chance for (small) errors to accumulate, meaning that our guarantee of
epsilon
may be (very) slightly off. My preference instead would be to have the interface be inmilliEpsilon
s and do accounting only with integers.1Note: an alternative would be to round
epsilon
to a fixed precision first, and then do error-freeepsilon
computations using rounded values. This has the benefit of enabling a possibly "cleaner" interface for developers, but is worse in my opinion because the rounding is not very transparent.Footnotes
I actually wish we had a rational type, but unfortunately it doesn't exist. ↩
The text was updated successfully, but these errors were encountered: