Question: Correctness/Precission of complex composable functions #1134

lvlanson · 2024-11-08T16:13:03Z

I am currently running a rather complex algorithm as a composable function inside a loop. There I feed the output of the function back into the function with various other parameters. When running the algorithm over a circuit (on a cpu for 48hours) and running it without encryption, the results differ slightly.

What could be reasons for this behavior? Are there situations where by design this is expected?

On request I am happy to share the code via email or other means if this would be suitable. I'd be even more than happy to have a conversation over any suitable channel on this matter. This algorithm is currently used inside a research project, where we want to publish results.

youben11 · 2024-11-11T07:13:36Z

Can you create a private repo with the code, and invite me there to look at it?

lvlanson · 2024-11-11T09:46:31Z

I sent you an invitation, in the readme I added my discord such that you can contact me anytime

BourgerieQuentin · 2024-11-12T08:18:04Z

Hello @lvlanson the FHE computation comes with errors under a certain probability on PBS application. The compiler offer two different options to control this error probability p_error and global_p_error. While the global_p_error can be ensured by the crypto parameter optimizer for a single function, it cannot be ensured for modules (or single composable function) as we miss the information on the clear control flow (i.e. branching, number of loop iteration, ...). Definitly is something we could improve, and I'm curious about your use case to see what could make sense to implement on our side, I'll sync with @youben11.

I hope this help.

lvlanson · 2024-11-12T09:09:35Z

Thank you very much for this information! We will provide this information in our paper to evaluate the results we have.

We have a rather lightweight classifier which can be implemented in your framework. From the theoretical side, homomorphic encryption guarantees homomorphic properties if the noise bounds are within half of the lattice width around the encrypted point as far as I understood. But it was not clear why we see these deviations.

Currently, by accident, I removed the "composable" tag from the function and still had a spot where I fed the output back into the function. The difference to the version before is, that I did this for a decreased number of steps, then decrypted and encrypted again. By doing this I could increase the size of the domain the values can attain. The results are way better and more consistent to what we would expect, but the deviations have become marginally greater than before.

I assume this is not meant to be used in this way. Shouldn't this normally output an error if I use this incorrectly?

for x_index in tqdm(epoch_set, disable=not show_tqdm):
    evalx  = x_enc_train_data[x_index]
    evalxc = x_enc_train_targ[x_index]
    evalp  = circuit.run(evalx, evalp, evalxc, evalpl, lr)
                
evalp              = circuit.decrypt(evalp)
_, evalp, _, _, _  = circuit.encrypt(x_train_scaled[0], evalp, x_targets[0], proto_labels, input_set_lr[0])

aPere3 · 2024-11-12T15:21:50Z

Hello @lvlanson !

First, an important point to understand is that TFHE is probabilistic by construct. You can not bound the noise, all you can do is change the parameters of the distribution. This means that whatever the parameters you pick, there is always a non-zero probability of getting an error (in your words, the probability of the value being more than half of the lattice width around the encrypted point is always non-zero). Given big enough parameters (sometimes prohibitive ones), you can get the probability of error arbitrary low, but in any case, you are always trading performances for probability of error. So I really recommend that you play with the p_error parameter mentioned earlier.

Now regarding the use of composition. First, if you composed your function without an intermediate decryption/encryption and your model worked, it is just by accident. Indeed, there is no guarantee that inputs and outputs use the same parameters, so any change in your circuit could break this condition, and make the whole model break.

About how concrete is meant to be used or not, this is really up to you to say. Depending on your use case, it may be OK to send back the intermediate data to the client for decryption/re-encryption. In this case, you can iterate on your circuit this way. If it is not OK regarding your use-case, you should use composition.

If you have to use composition, I strongly advise you to use modules (see https://docs.zama.ai/concrete/compilation/combining/composing_functions_with_modules), and in particular, the automatic tracing capability. Even if you have a single function, it will allow a better optimization of the parameters. This way, you will be able to lower your p_error, while keeping decent performances.

Hope this helps,

Alex.

lvlanson · 2024-11-13T07:36:23Z

Hey @aPere3,

thank you for those clarifications. This helped a lot in getting a better understanding what concrete actually does. Is there a resource where the details on the probabilistic end are explained in more detail? Because I expected that the decryption/encryption procedure should be non-probabilistic (by theory). Is this an implementation detail, or is my understanding of TFHE here just wrong?

aPere3 · 2024-11-14T08:56:03Z

Hey @lvlanson ,

You may find the TFHE Deep-Dive blog post useful in understanding how all that work. But if you need a more in-depth take, you may need to read the papers on TFHE. That being said, the blog post should be more than sufficient for any concrete user. Our goal is that users should not have to worry about all that.

The only thing I would add that may help you understand error better, is that in the following illustration from the blog:

The representation of the error is a bit misleading. Indeed, what is represented here, is a single sample of the error distribution. In this particular case, yes, the error is not overlapping with the message, and decryption is just fine.

But in practice, when using TFHE you have to think in terms of distributions. The following illustration shows the probability of a bit being activated for an error drawn from a given gaussian distribution. The darker, the higher the probability of the bit being activated:

On this illustration, the red frontier represents the location on the bit scale above which the bit has less than 0.1% probability of being activated. You can see that it is pretty high up the scale, for something as big as 0.1% chance. Also one thing to understand is that the security of the scheme relies on the fact that the error should not be too far from the message itself.

This makes the situation quite complex for parameter tuning, which is why we provide this p_error parameter, which takes care of all these subtelties for you.

Hope that helps.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Question: Correctness/Precission of complex composable functions #1134

Question: Correctness/Precission of complex composable functions #1134

lvlanson commented Nov 8, 2024

youben11 commented Nov 11, 2024

lvlanson commented Nov 11, 2024

BourgerieQuentin commented Nov 12, 2024

lvlanson commented Nov 12, 2024 •

edited

Loading

aPere3 commented Nov 12, 2024

lvlanson commented Nov 13, 2024

aPere3 commented Nov 14, 2024

Question: Correctness/Precission of complex composable functions #1134

Question: Correctness/Precission of complex composable functions #1134

Comments

lvlanson commented Nov 8, 2024

youben11 commented Nov 11, 2024

lvlanson commented Nov 11, 2024

BourgerieQuentin commented Nov 12, 2024

lvlanson commented Nov 12, 2024 • edited Loading

aPere3 commented Nov 12, 2024

lvlanson commented Nov 13, 2024

aPere3 commented Nov 14, 2024

lvlanson commented Nov 12, 2024 •

edited

Loading