feat: add FHE training deployment #665
Conversation
Force-pushed from dd6fc4c to 3fa5d75
The notebook seems a bit loaded. I think we could:
- move the `plot_decision_boundary` functions to some file in `utils`
- why do we need two `plot_decision_boundary` functions?
- only show the decision boundary graph at iteration 0 and the last iteration

Next, we should showcase deployment first (a rough sketch of this flow is given below):
- "Training on encrypted data in production": export the trainer with `FheModelDev` etc., load the client/server, and so on
- "Develop a logistic regression trainer for encrypted data": here we show the simulation part
Thanks! I have several comments and observations:
- I agree.
- One uses the weights/bias and the other uses the Concrete ML `.predict()`. I will make a single one (a rough sketch is given below).
- I will do this if I can't find anything better. I agree 10 plots is too much, but I want to show that the model is learning.
- It feels weird to me to show production first and then how you develop the model that will eventually go to production. I agree with most of your remarks. Maybe we should create a new notebook for production only and refer to the development one.
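A minimal sketch of what a single merged helper could look like, assuming it only relies on the model's `.predict()` (names and signature are illustrative, not the notebook's actual code):

```python
import matplotlib.pyplot as plt
import numpy as np


def plot_decision_boundary(model, X, y, grid_steps=100, title=None):
    """Plot the 2D decision boundary of any model exposing a .predict() method."""
    x_min, x_max = X[:, 0].min() - 0.5, X[:, 0].max() + 0.5
    y_min, y_max = X[:, 1].min() - 0.5, X[:, 1].max() + 0.5
    xx, yy = np.meshgrid(
        np.linspace(x_min, x_max, grid_steps),
        np.linspace(y_min, y_max, grid_steps),
    )

    # Predict a class for every point of the grid and draw the resulting regions
    grid = np.c_[xx.ravel(), yy.ravel()]
    predictions = model.predict(grid).reshape(xx.shape)

    plt.contourf(xx, yy, predictions, alpha=0.3)
    plt.scatter(X[:, 0], X[:, 1], c=y, edgecolors="k")
    if title is not None:
        plt.title(title)
    plt.show()
```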
Force-pushed from 3fa7006 to 9bd428e
Force-pushed from 8c290c2 to c6d2d5c
thanks a lot for this !
@@ -687,13 +680,7 @@ def fit(  # type: ignore[override]

        # If the model should be trained using FHE training
        if self.fit_encrypted:
            if fhe is None:
                fhe = "disable"
                warnings.warn(
No, this warning was made on purpose, I think we should keep it.

The idea is that users should set `fhe` to something if they activate `fit_encrypted`, to be sure they know what they are doing. The reason why `fhe` is not defaulted to `"disable"` is to avoid the ambiguous situation of having `fit_encrypted` set to False and `fhe` set to `"disable"`, which wouldn't make much sense since training would be done in floating point with sklearn.

cc @fd0r
Maybe I miss something, but this didn't really make sense to me. If we have a warning here, then why not for all our models in `predict`?

> avoid the ambiguous situation of having fit_encrypted to false and fhe to disable, which wouldn't make much sense since training would be done in floating points with sklearn

If `fit_encrypted=False` then we have this (concrete-ml/src/concrete/ml/sklearn/linear_model.py, lines 705 to 710 in 4f75979):

    if fhe is not None:
        raise ValueError(
            "Parameter 'fhe' should not be set when FHE training is disabled. Either set it to "
            "None for floating point training in the clear or set 'fit_encrypted' to True when "
            f"initializing the model. Got {fhe}."
        )

Or what am I missing?
Ah yes, you're right about the second case, good thing then!

But the reason we don't have this in the `predict` methods is simply that there is no ambiguity there: we only have one parameter related to FHE execution, `fhe` (and it can't be `None`). Whereas with encrypted training, we have the additional `fit_encrypted` in the init.

So, more or less, I believe the idea here was to better define the `fhe` parameter's role, with something like:
- `None`: float clear
- `disable` | `simulate` | `fhe`: the usual

so when a user sets `fit_encrypted`, it's better to make sure they are aware of what mode they are using.
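To make those intended semantics explicit, here is an illustrative sketch of the dispatch described above; it is a simplified stand-in, not the actual Concrete ML implementation, and the `"execute"` mode name is an assumption:

```python
import warnings
from typing import Optional


def resolve_fhe_mode(fit_encrypted: bool, fhe: Optional[str]) -> Optional[str]:
    """Illustrative resolution of the `fhe` argument discussed above (not the real API)."""
    if not fit_encrypted:
        # Training in the clear: `fhe` has no meaning here and must stay None.
        if fhe is not None:
            raise ValueError("Parameter 'fhe' should not be set when FHE training is disabled.")
        return None

    # Encrypted training: the user is expected to choose a mode explicitly.
    if fhe is None:
        warnings.warn(
            "Parameter 'fhe' isn't set while FHE training is enabled. Defaulting to 'disable'.",
            stacklevel=2,
        )
        fhe = "disable"

    assert fhe in ("disable", "simulate", "execute")
    return fhe
```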
There is no notion of `fhe=` if it's a training in the clear, and we have an assert on this. To me, `fhe="disable"` should be the default if users use `fit_encrypted=True`.

I don't think the warning "Parameter 'fhe' isn't set while FHE training is enabled.\n Defaulting to '{fhe=}'" is of any help to the user.

If the API is really confusing, adding a warning to try to explain it isn't the right direction. Instead we should rethink the API.
Well, IIRC there was some discussion on this in particular when the API was made, and this is what we decided to go for. Maybe @fd0r can tell us more about it!
Yes, @RomanBredehoft got it right.

We don't want `disable` by default since it wouldn't make sense if `fit_encrypted=False`. But not providing a value for `fhe` if `fit_encrypted=True` is also "dangerous". The idea here was to make sure that the user has to provide some value for `fhe` in that case.

Now that I'm re-reading it, I would have put an exception instead of a warning here (a sketch of that alternative is given below). I don't think we should remove the warning.
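For illustration only, a sketch of that stricter alternative; this is not what the PR implements, and the mode names are assumptions:

```python
def check_fhe_argument(fit_encrypted: bool, fhe) -> None:
    """Hypothetical strict check: fail fast instead of warning and defaulting."""
    if fit_encrypted and fhe is None:
        raise ValueError(
            "Parameter 'fhe' must be set explicitly ('disable', 'simulate' or 'execute') "
            "when 'fit_encrypted' is True."
        )
```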
> We don't want disable by default since it wouldn't make sense if fit_encrypted=False.

But if `fit_encrypted=False` and `fhe != None`, we raise an error, so this is not possible.

> But not providing a value for fhe if fit_encrypted=True is also "dangerous"

I really don't understand why this is "dangerous". For the inference, `disable` is the default. Why would it be dangerous to do it for training?

Also, what should the user learn from "Parameter 'fhe' isn't set while FHE training is enabled.\n Defaulting to '{fhe=}'"?
> I really don't understand why this is "dangerous". For the inference, disable is the default. Why would it be dangerous to do it for training?

Because we would be changing the default from the signature, mostly.
> Also, what should the user learn from "Parameter 'fhe' isn't set while FHE training is enabled.\n Defaulting to '{fhe=}'"?

Mainly that the default value is changing, which shouldn't really happen, and that the preferred way is to specify this argument.
Alright 🏳️ well I will add back the warnings and tests for it.
@@ -763,13 +750,7 @@ def partial_fit(

        # A partial fit is similar to a fit with a single iteration. The slight differences between
        # both are handled in the encrypted method when setting `is_partial_fit` to True.
        if self.fit_encrypted:
            if fhe is None:
same as above
thanks for the feature !! lots of cleaning, which is great. I have several questions and observations
Force-pushed from a49f686 to d39c6aa
Can the deployment be done without any call to the `fit` method?

We should be able to deploy with only the batch size, the number of features and the number of targets as inputs.
Also we should probably take composition into account such that n-iter is included in the query.
> Can the deployment be done without any call to the fit method? We should be able to deploy with only the batch size, the number of features and the number of targets as inputs.

I just use the existing way of creating a training FHE circuit. Do we have something else than the `_fit_encrypted` method to create such a circuit? Or are you proposing to wrap this into another method within the FHEModel API? That being said, I definitely agree that having to instantiate the FHE circuit for deployment through a call to `fit` is super weird...

> Also we should probably take composition into account such that n-iter is included in the query.

I am not sure where the n-iter would belong, but I believe it's a parameter that needs to be sent to the server, so I would say it would need to be sent at the same time as the serialized encrypted values (a rough sketch of such a query is given below). I don't think it has anything to do with the FHEModel API unless we start implementing the communication protocol.
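Purely as an illustration of that last point, a hypothetical client-side query that ships the number of iterations alongside the serialized encrypted values; the endpoint, field names and transport are made up and not part of this PR:

```python
import base64
import json

# Placeholders standing in for the client's real outputs
# (e.g. FHEModelClient.get_serialized_evaluation_keys() and the encrypted training batch).
serialized_evaluation_keys = b"..."
serialized_encrypted_batch = b"..."

# Hypothetical payload: the number of training iterations travels with the encrypted data,
# as part of the communication protocol rather than the FHEModel API.
payload = json.dumps(
    {
        "n_iter": 10,
        "evaluation_keys": base64.b64encode(serialized_evaluation_keys).decode(),
        "encrypted_inputs": base64.b64encode(serialized_encrypted_batch).decode(),
    }
)

# The payload would then be posted to the training server, e.g.:
# requests.post("https://training-server.example/train", data=payload)
```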
I created this for the first point https://github.com/zama-ai/concrete-ml-internal/issues/4466
Thanks a lot for these changes, let's address the rest in a follow-up PR.
Sorry for reviewing this so late. I have a few comments on the API design.

IMHO we should consider composition for this, or merge as is and improve upon that, but it would be easier if it was already taken into account.

Also, we shouldn't have to call the `fit` function before deploying, IMO. It's true that we would have to take the input ranges into account, as well as the batch size and the number of features and targets (a hypothetical sketch of such an API is given below).
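To make that concrete, a hypothetical sketch (no such helper exists in Concrete ML) of what a fit-free deployment API could look like, driven only by shapes and input ranges:

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class TrainingCircuitSpec:
    """Hypothetical description of a training circuit, independent of any training data."""

    batch_size: int
    n_features: int
    n_targets: int
    input_range: Tuple[float, float]  # expected (min, max) of the training inputs


def build_training_deployment(spec: TrainingCircuitSpec, output_dir: str) -> None:
    """Hypothetical helper: compile and save the FHE training circuit from the spec alone.

    In the current PR, this step still happens implicitly through a call to `fit`
    (a dedicated mechanism is tracked in concrete-ml-internal issue 4466).
    """
    raise NotImplementedError("Sketch only")


# How it could be used:
spec = TrainingCircuitSpec(batch_size=8, n_features=2, n_targets=1, input_range=(-1.0, 1.0))
# build_training_deployment(spec, "./fhe_training_deployment")
```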
Let's iterate on that! Thanks a lot for taking care of this!
f"Defaulting to '{fhe=}'", | ||
stacklevel=2, | ||
) | ||
fhe = "disable" if fhe is None else fhe |
This wasn't changed back
Should be good
The warning is still removed.
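For readers of this thread, putting the quoted fragments together, the warning path under discussion looks roughly like this (a reconstruction from the hunks above; indentation, ordering and surrounding code are assumptions):

```python
# Inside fit(), once fit_encrypted is enabled (reconstructed, not copied from the PR):
if self.fit_encrypted:
    if fhe is None:
        warnings.warn(
            "Parameter 'fhe' isn't set while FHE training is enabled.\n"
            f"Defaulting to '{fhe=}'",
            stacklevel=2,
        )
    fhe = "disable" if fhe is None else fhe
```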
Coverage passed ✅
Looks good, thanks a lot! One quick question: how is it going to work once we integrate composition with training (#660)?
closes https://github.com/zama-ai/concrete-ml-internal/issues/4373
closes https://github.com/zama-ai/concrete-ml-internal/issues/4454
Proposition:
Check the notebook to see how to use it. Basically, it's the same as before for the user. We only add a single parameter in the `fhe_dev.save` call (a hypothetical sketch is given below).
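The parameter is not named in this thread, so the following is only a hypothetical sketch of what the updated save call could look like; the `mode="training"` argument is an assumption, not the confirmed API (check the PR's notebook for the real usage):

```python
import numpy as np

from concrete.ml.deployment import FHEModelDev
from concrete.ml.sklearn import SGDClassifier

X = np.random.uniform(-1.0, 1.0, size=(16, 2))
y = np.random.randint(0, 2, size=(16,))

# As discussed above, the training circuit is currently built during fit.
model = SGDClassifier(fit_encrypted=True, parameters_range=(-1.0, 1.0))
model.fit(X, y, fhe="disable")

fhe_dev = FHEModelDev("./fhe_training_deployment", model)
fhe_dev.save(mode="training")  # hypothetical parameter name for exporting training artifacts
```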