Knn classifier in cml #217

kcelia · 2023-09-01T14:17:24Z

As of now:

- Only the predict function is supported; predict_proba is not.
- What's done on the client side? The majority_vote.
- What's done on the server side? The pairwise_euclidean_distance and the selection of the k nearest labels.
- sqrt is not performed in FHE: since sqrt is a monotonic function, it doesn't change the ordering of the distances, so removing it reduces the computation.
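
To illustrate the sqrt point, here is a minimal clear-text sketch in plain NumPy (not the FHE circuit, and not the code of this PR). The helper name `nearest_indices` and the random data are assumptions made for the example. It checks that selecting the k nearest points from squared distances gives the same indices as selecting them from the true Euclidean distances.

```python
import numpy as np


def nearest_indices(distances, k):
    # Indices of the k smallest values; a stable sort keeps ties in index order.
    return np.argsort(distances, kind="stable")[:k]


rng = np.random.default_rng(0)
X_fit = rng.integers(0, 8, size=(20, 3))  # hypothetical quantized training points
x = rng.integers(0, 8, size=(3,))         # hypothetical quantized query

sq_dist = np.sum((X_fit - x) ** 2, axis=1)  # squared Euclidean distances (integers)
dist = np.sqrt(sq_dist)                     # true Euclidean distances (floats)

k = 3
topk_from_squared = nearest_indices(sq_dist, k)
topk_from_sqrt = nearest_indices(dist, k)

# sqrt is monotonic, so both orderings select the same neighbors.
assert np.array_equal(topk_from_squared, topk_from_sqrt)
```

Working on the squared distances also keeps every value an integer, which is what a quantized circuit manipulates anyway.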

src/concrete/ml/sklearn/neighbors.py

jfrery

Great first draft, well done! Could you explain the choices about what is done in FHE and what is done in the clear on the client side?

It looks like we only partly compute the distances in FHE. Why isn't the sqrt done in FHE? Is it just prohibitively expensive to compute the topk and majority vote in FHE? Currently it seems that we return all distances to the client, which is probably going to leak quite a bit of information about the training data.

jfrery · 2023-09-04T10:58:08Z

src/concrete/ml/sklearn/base.py

+        distance_matrix = (
+            numpy.sum(q_X**2, axis=1, keepdims=True)
+            - 2 * q_X @ self._q_X_fit.T
+            + numpy.expand_dims(numpy.sum(self._q_X_fit**2, axis=1), 0)


numpy.expand_dims(numpy.sum(self._q_X_fit**2, axis=1), 0) can be done at training time, I suppose.

It's a constant, no? It will be precomputed by CP.
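
For reference, a small clear-text sketch of the expansion used in the hunk above, ||x - y||^2 = ||x||^2 - 2 x·y + ||y||^2, with the training-side term computed once up front. The function names `fit_sq_norms` and `squared_distance_matrix` and the random data are assumptions for the example, not names from the PR. Whether CP folds that constant on its own is as stated in the thread; the sketch only shows that hoisting it does not change the result.

```python
import numpy as np


def fit_sq_norms(X_fit):
    # Can be computed once at training time: squared norm of every
    # training point, shaped (1, n_fit) so it broadcasts over the queries.
    return np.expand_dims(np.sum(X_fit**2, axis=1), 0)


def squared_distance_matrix(X, X_fit, sq_norms_fit):
    # ||x - y||^2 = ||x||^2 - 2 x.y + ||y||^2 for every (query, training) pair.
    return np.sum(X**2, axis=1, keepdims=True) - 2 * X @ X_fit.T + sq_norms_fit


rng = np.random.default_rng(1)
X_fit = rng.integers(-4, 4, size=(6, 3))  # hypothetical quantized training set
X = rng.integers(-4, 4, size=(2, 3))      # hypothetical quantized queries

norms = fit_sq_norms(X_fit)  # precomputed, reused for every predict call
distance_matrix = squared_distance_matrix(X, X_fit, norms)

# Direct definition, for comparison.
expected = ((X[:, None, :] - X_fit[None, :, :]) ** 2).sum(axis=2)
assert np.array_equal(distance_matrix, expected)
```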

fd0r · 2023-09-04T15:47:38Z

Seems like this PR adds a lot of time to the CI.

RomanBredehoft · 2023-09-20T15:38:31Z

src/concrete/ml/sklearn/base.py

+        # a training phase
+        self._q_fit_X: numpy.ndarray
+        # _y: Labels of `_q_fit_X`
+        self._y: numpy.ndarray


I haven't followed everything, so we keep this _y attribute then?

Yes, we can keep it, but we need to make sure it doesn't exist in the model exported to the client (in client / server).

Ah yes, OK. So basically it's for the predict but not for the post_processing, right? Got it.
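
A clear-text sketch of the split discussed in this thread, assuming the server returns the k nearest labels and the client only takes the majority vote. The function names are hypothetical and are not the Concrete ML API; the plain `sorted` call stands in for whatever top-k selection the circuit performs.

```python
from collections import Counter


def server_topk_labels(sq_distances, y_fit, k):
    # Server side: needs the training labels (the _y attribute in the thread)
    # to map the k smallest distances to their labels.
    order = sorted(range(len(sq_distances)), key=lambda i: sq_distances[i])
    return [y_fit[i] for i in order[:k]]


def client_majority_vote(topk_labels):
    # Client side: post-processing only sees the k returned labels,
    # so the training labels never need to be exported to the client.
    return Counter(topk_labels).most_common(1)[0][0]


topk = server_topk_labels([9, 1, 4, 0, 16], [0, 1, 1, 0, 0], k=3)
prediction = client_majority_vote(topk)
```

With these inputs the three nearest points have labels 0, 1 and 1, so the vote returns 1.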

andrei-stoian-zama

Looks good!

tests/deployment/test_client_server.py

jfrery

Looks good, just a few comments.

src/concrete/ml/sklearn/base.py

jfrery · 2023-09-21T07:34:13Z

src/concrete/ml/sklearn/base.py

+                    x = scatter1d(x, max_x, range_i + d)
+
+                    # Max index selection
+                    sign = diff <= 0


This is not the sign but a boolean I guess. Why isn't this done right after computing diff?

Have you tried the CP comparison optimization by replacing

    diff = a - b
    sign = diff <= 0

with

    sign = a <= b

?
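
A quick plain-Python check of the two forms, outside of FHE. The function names and the 4-bit input range are assumptions for the example. It confirms that both forms return the same boolean and shows that the explicit subtraction introduces an intermediate value that needs one more (signed) bit than the inputs. It says nothing about which form compiles to the faster circuit.

```python
def leq_via_diff(a, b):
    # Form currently in the hunk: subtract, then compare with zero.
    diff = a - b
    return diff <= 0


def leq_direct(a, b):
    # Suggested form: let the comparison operate on the operands directly.
    return a <= b


values = range(16)  # every unsigned 4-bit value

# Both forms agree on all pairs of inputs.
forms_agree = all(
    leq_via_diff(a, b) == leq_direct(a, b) for a in values for b in values
)

# The intermediate difference spans [-15, 15], i.e. a signed 5-bit range,
# even though both inputs fit in 4 bits.
diffs = [a - b for a in values for b in values]
diff_range = (min(diffs), max(diffs))
```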

At the beginning of this conversation, I thought we were skeptical about letting the compiler choose the best strategy.

But I missed Ruby's last messages, which are:

Yes, it makes the bit-width compatible with the strategies you asked for, and once bit-width inference is done, it picks the best strategy based on a heuristic (I think it tries to minimize the number of TLUs without increasing the maximum precision). So if an 8-bit TLU already exists in the circuit, it will accept using that precision; otherwise it will try to stick to lower precisions.
It's not optimal in the sense of the cost model, as that would require solving the crypto-parameters.

So, I think CP's comparison is worth using.

It's indeed a boolean that tells us if a is greater than b.
I'll change the naming.

Maybe you can just check on a simple example if you have any time improvement. Otherwise you can leave it.

If you don't have the time right now, then I think it's worth an issue, because this might be a good thing to try anyway.

I'll do it in a separate PR.
Some testing is needed.

Great, can you create an issue for this?

github-actions · 2023-09-21T12:55:13Z

Coverage passed ✅

Coverage details

---------- coverage: platform linux, python 3.8.18-final-0 -----------
Name    Stmts   Miss  Cover   Missing
-------------------------------------
TOTAL    6085      0   100%

51 files skipped due to complete coverage.

jfrery

Looks good to me! Thanks. Please just create an issue to check the CP comparison (see comment)

kcelia requested a review from a team as a code owner September 1, 2023 14:17

cla-bot bot added the cla-signed label Sep 1, 2023

kcelia changed the title ~~Knn classifier in cml v2 3818~~ Knn classifier in cml Sep 1, 2023

kcelia force-pushed the knn_classifier_in_cml_v2_3818 branch 2 times, most recently from 8045f47 to a50ad1d Compare September 1, 2023 14:52

andrei-stoian-zama requested changes Sep 3, 2023

kcelia force-pushed the knn_classifier_in_cml_v2_3818 branch from 6afa513 to 2f087c1 Compare September 4, 2023 08:05

jfrery reviewed Sep 4, 2023

jfrery reviewed Sep 5, 2023

kcelia marked this pull request as draft September 5, 2023 14:39

jfrery reviewed Sep 5, 2023

bcm-at-zama reviewed Sep 6, 2023

kcelia force-pushed the knn_classifier_in_cml_v2_3818 branch 3 times, most recently from 45bc34f to 0dc23cf Compare September 11, 2023 13:08

andrei-stoian-zama self-requested a review September 13, 2023 15:41

RomanBredehoft reviewed Sep 13, 2023

andrei-stoian-zama requested changes Sep 13, 2023

kcelia added 10 commits September 14, 2023 11:05

chore: update base.py with concrete ml v

0fc4ad8

chore: v2

8dc0199

chore: keep one class

771648f

quantization not working properly add similarity point encrypted argsort and topk in clear

chore: remove other classes

4fc02ea

chore: update

481950d

chore: version 1 w

b1ffecc

only pairwise.euclidean_distances is encrypted topk and majority vote are done on the client side

chore: previous version

af2550a

chore: start testing

96cd821

chore: first testing version

98de388

chore: add _NEIGHBORS_MODELS and get_sklearn_neighbors_models to …

795842e

…__init__

kcelia dismissed RomanBredehoft’s stale review via 9d0a4dd September 19, 2023 16:20

kcelia force-pushed the knn_classifier_in_cml_v2_3818 branch from 7178816 to 9d0a4dd Compare September 19, 2023 16:20

andrei-stoian-zama self-requested a review September 20, 2023 07:41

kcelia added 2 commits September 20, 2023 11:37

chore: predict returns the topk labels

a59aa96

chore: update check_for_divergent_predictions test for KNN

d5b6e46

kcelia force-pushed the knn_classifier_in_cml_v2_3818 branch 6 times, most recently from 69f11f8 to dac4e8c Compare September 20, 2023 14:48

kcelia requested a review from RomanBredehoft September 20, 2023 15:30

RomanBredehoft reviewed Sep 20, 2023

RomanBredehoft reviewed Sep 20, 2023

andrei-stoian-zama previously approved these changes Sep 21, 2023

jfrery reviewed Sep 21, 2023

kcelia dismissed andrei-stoian-zama’s stale review via 6229b6d September 21, 2023 07:53

kcelia force-pushed the knn_classifier_in_cml_v2_3818 branch 3 times, most recently from 4b03fb9 to 3cf5e7d Compare September 21, 2023 09:52

chore: add post_processing

fd2c1c7

kcelia force-pushed the knn_classifier_in_cml_v2_3818 branch from 3cf5e7d to fd2c1c7 Compare September 21, 2023 11:15

kcelia requested review from jfrery, RomanBredehoft and andrei-stoian-zama September 21, 2023 12:57

jfrery approved these changes Sep 21, 2023

andrei-stoian-zama approved these changes Sep 21, 2023

kcelia merged commit 1c33ec8 into main Sep 21, 2023
8 of 9 checks passed

kcelia deleted the knn_classifier_in_cml_v2_3818 branch September 21, 2023 13:09
