Question: Correctness/Precision of complex composable functions #1134
Comments
Can you create a private repo with the code, and invite me there to look at it?
I sent you an invitation. In the README I added my Discord so that you can contact me anytime.
Hello @lvlanson, the FHE computation comes with errors under a certain probability on PBS application. The compiler offers two different options to control this error probability: `p_error` and `global_p_error`. While `p_error` bounds the error probability of each individual PBS, `global_p_error` bounds the error probability of the whole circuit. I hope this helps.
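For reference, a minimal sketch of how these two options can be passed when compiling with concrete-python; the function, input set, and values below are illustrative only, and the exact configuration surface may differ between Concrete versions:

```python
from concrete import fhe

def double_then_square(x):
    return (2 * x) ** 2

compiler = fhe.Compiler(double_then_square, {"x": "encrypted"})

# p_error bounds the failure probability of each individual PBS,
# global_p_error bounds the failure probability of the whole circuit.
circuit = compiler.compile(
    range(16),
    p_error=2**-40,           # tighter -> fewer PBS errors, slower execution
    # global_p_error=2**-10,  # alternative: bound the whole-circuit error
)
```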
Thank you very much for this information! We will include it in our paper when evaluating our results. We have a rather lightweight classifier which can be implemented in your framework. From the theoretical side, homomorphic encryption guarantees the homomorphic properties if the noise stays within half of the lattice width around the encrypted point, as far as I understood, but it was not clear why we see these deviations. Currently, by accident, I removed the "composable" tag from the function and still had a spot where I fed the output back into the function. The difference from the previous version is that I did this for a decreased number of steps, then decrypted and encrypted again. By doing this I could increase the size of the domain the values can attain. The results are way better and more consistent with what we would expect, but the deviations have become marginally greater than before. I assume this is not meant to be used in this way. Shouldn't this normally output an error if I use it incorrectly?

```python
# Excerpt: `circuit`, the data arrays, and the loop variables are defined elsewhere.
for x_index in tqdm(epoch_set, disable=not show_tqdm):
    evalx = x_enc_train_data[x_index]
    evalxc = x_enc_train_targ[x_index]
    # Run one training step on the encrypted inputs.
    evalp = circuit.run(evalx, evalp, evalxc, evalpl, lr)
    # Decrypt the intermediate result and re-encrypt it for the next step.
    evalp = circuit.decrypt(evalp)
    _, evalp, _, _, _ = circuit.encrypt(
        x_train_scaled[0], evalp, x_targets[0], proto_labels, input_set_lr[0]
    )
```
Hello @lvlanson! First, an important point to understand is that TFHE is probabilistic by construction. You cannot bound the noise; all you can do is change the parameters of its distribution. This means that whatever parameters you pick, there is always a non-zero probability of getting an error (in your words, the probability of the value being more than half of the lattice width away from the encrypted point is always non-zero). Given big enough parameters (sometimes prohibitive ones), you can get the probability of error arbitrarily low, but in any case you are always trading performance for probability of error. So I really recommend that you play with these parameters.

Now regarding the use of composition. First, if you composed your function without an intermediate decryption/encryption and your model worked, it is just by accident. Indeed, there is no guarantee that inputs and outputs use the same parameters, so any change in your circuit could break this condition and make the whole model break.

About how Concrete is meant to be used or not, this is really up to you to say. Depending on your use case, it may be OK to send the intermediate data back to the client for decryption/re-encryption. In this case, you can iterate on your circuit this way. If it is not OK for your use case, you should use composition.

If you have to use composition, I strongly advise you to use modules (see https://docs.zama.ai/concrete/compilation/combining/composing_functions_with_modules), and in particular the automatic tracing capability. Even if you have a single function, it will allow a better optimization of the parameters. This way, you will be able to lower your p_error while keeping decent performance.

Hope this helps, Alex.
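For readers landing here later, a minimal sketch of the modules approach mentioned above, loosely following the linked documentation; the `composition` policy, the toy function body, and the input set are illustrative, and names may need adjusting to your Concrete version:

```python
from concrete import fhe

@fhe.module()
class Model:
    # Allow outputs to be fed back as inputs; see the linked documentation
    # for the automatic tracing variant and finer-grained wiring policies.
    composition = fhe.AllComposable()

    @fhe.function({"x": "encrypted"})
    def step(x):
        return (x + 1) % 16

module = Model.compile({"step": list(range(16))})

x_enc = module.step.encrypt(3)
for _ in range(5):
    x_enc = module.step.run(x_enc)  # no intermediate decryption needed
print(module.step.decrypt(x_enc))   # 8, up to the error probability discussed above
```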
Hey @aPere3, thank you for those clarifications. This helped a lot in getting a better understanding of what Concrete actually does. Is there a resource where the details on the probabilistic side are explained in more depth? I expected the decryption/encryption procedure to be non-probabilistic (in theory). Is this an implementation detail, or is my understanding of TFHE just wrong?
Hey @lvlanson,

You may find the TFHE Deep-Dive blog post useful in understanding how all of that works, but if you need a more in-depth take, you may need to read the papers on TFHE. That being said, the blog post should be more than sufficient for any Concrete user. Our goal is that users should not have to worry about all that.

The one thing I would add that may help you understand the error better concerns the illustration of the error in the blog post. That representation is a bit misleading: what is shown there is a single sample of the error distribution. In this particular case, yes, the error does not overlap with the message, and decryption is just fine. But in practice, when using TFHE you have to think in terms of distributions. A second illustration shows the probability of a bit being activated for an error drawn from a given Gaussian distribution; the darker, the higher the probability of the bit being activated. On that illustration, the red frontier represents the location on the bit scale above which the bit has less than a 0.1% probability of being activated. You can see that it is pretty high up the scale, for something as big as a 0.1% chance.

Also, one thing to understand is that the security of the scheme relies on the fact that the error should not be too far from the message itself. This makes the situation quite complex for parameter tuning, which is why we provide automatic parameter optimization in the compiler.

Hope that helps.
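To make the "think in terms of distributions" point tangible, here is a small back-of-the-envelope computation; the noise standard deviation and slot half-width below are toy values, not actual TFHE parameters:

```python
from math import erfc, sqrt

# Toy numbers: a Gaussian noise term e ~ N(0, sigma^2) and the half-width
# of the region in which decryption still recovers the right message.
sigma = 1.0
half_width = 3.29 * sigma

# P(|e| > half_width): the chance that a single noise sample pushes the
# ciphertext past the decision boundary. It is small, but never zero.
p_fail = erfc(half_width / (sigma * sqrt(2)))
print(f"{p_fail:.3%}")  # about 0.1%, matching the frontier mentioned above
```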
I am currently running a rather complex algorithm as a composable function inside a loop, where I feed the output of the function back into the function along with various other parameters. When running the algorithm over a circuit (on a CPU for 48 hours) and running it without encryption, the results differ slightly.
What could be the reasons for this behavior? Are there situations where this is expected by design?
On request I am happy to share the code via email or other means if that would be suitable. I'd be more than happy to have a conversation over any suitable channel on this matter. The algorithm is currently used inside a research project where we want to publish results.
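For context, a minimal sketch of the kind of loop described above, assuming concrete-python's `composable` configuration option; the toy function, modulus, and input set are made up, and the real model is much more involved:

```python
from concrete import fhe

@fhe.compiler({"state": "encrypted", "delta": "encrypted"})
def step(state, delta):
    return (state + delta) % 32

circuit = step.compile(
    [(s, d) for s in range(32) for d in range(4)],
    composable=True,  # outputs can be fed back as inputs
)

state_enc, delta_enc = circuit.encrypt(0, 1)
for _ in range(100):
    # Feed the encrypted output back into the circuit without decrypting.
    state_enc = circuit.run(state_enc, delta_enc)

# Expected 100 % 32 == 4, up to the small PBS error probability discussed above.
print(circuit.decrypt(state_enc))
```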