chore: allow users to pass schema in encrypted data-frames #676

RomanBredehoft · 2024-05-07T15:13:59Z

refs https://github.com/zama-ai/concrete-ml-internal/issues/4376

RomanBredehoft · 2024-05-07T15:15:33Z

src/concrete/ml/pandas/_processing.py

+        if column_name not in column_names:
+            # TODO: Is this check actually relevant ? Can't the schema provide more columns than the
+            # one found in the data-frame ?
+            raise ValueError(


Should we allow a schema with column names that do not match the ones found in the given data-frame?

Imo raising the error as you are doing here is the correct behavior.
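
For reference, a minimal sketch of the check being discussed, assuming the schema is a dict keyed by column name. The helper name `check_schema_columns`, the error message, and the schema values are hypothetical and only illustrate the behavior; this is not the code from `_processing.py`.

```python
import pandas


def check_schema_columns(pandas_dataframe: pandas.DataFrame, schema: dict) -> None:
    """Raise a ValueError if the schema names a column the data-frame lacks.

    Hypothetical helper for illustration only, not the function from the PR.
    """
    column_names = list(pandas_dataframe.columns)

    for column_name in schema:
        if column_name not in column_names:
            # Illustrative message, the wording used in the PR may differ
            raise ValueError(
                f"Column '{column_name}' from the schema is not in the data-frame. "
                f"Expected one of {column_names}."
            )


df = pandas.DataFrame({"age": [31, 45], "city": ["Paris", "Lyon"]})

# The schema values below are placeholders, the expected format is an assumption
check_schema_columns(df, {"city": {"Paris": 1, "Lyon": 2}})  # passes silently
```

With this behavior, a typo in a schema key fails immediately instead of being silently ignored.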

RomanBredehoft · 2024-05-07T15:16:51Z

src/concrete/ml/pandas/_processing.py

@@ -189,10 +297,15 @@ def pre_process_dtypes(pandas_dataframe: pandas.DataFrame) -> Tuple[pandas.DataF
                "supported."
            )

+    # TODO: Should all non-integers columns be considered by the schema if not None ? Currently,


Should we raise an error or a warning if some non-integer columns of the data-frame are not covered by the given schema?

What would happen if they are missing from the given schema?

They are computed automatically.

I'm just wondering because it could be an easy mistake to forget some columns, and no error would be raised.
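
One possible way to surface that mistake without rejecting partial schemas is a warning. The sketch below is only an assumption about how it could look: `find_uncovered_columns` is a hypothetical helper, and the schema entry is a placeholder since the expected format is not shown in this thread.

```python
import warnings
from typing import Dict, List

import pandas
from pandas.api.types import is_integer_dtype


def find_uncovered_columns(pandas_dataframe: pandas.DataFrame, schema: Dict) -> List[str]:
    """Return the non-integer columns that the schema does not mention (hypothetical helper)."""
    return [
        column_name
        for column_name in pandas_dataframe.columns
        if not is_integer_dtype(pandas_dataframe[column_name]) and column_name not in schema
    ]


df = pandas.DataFrame({"age": [31, 45], "height": [1.72, 1.80], "city": ["Paris", "Lyon"]})
schema = {"height": {}}  # placeholder entry, the real schema format is an assumption

uncovered = find_uncovered_columns(df, schema)
if uncovered:
    # Warn instead of raising, so that partial schemas remain allowed
    warnings.warn(
        f"Columns {uncovered} are not covered by the schema and will be processed automatically."
    )
```

Here `age` is skipped because it is already an integer column, `height` is covered by the schema, and `city` is reported.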

RomanBredehoft · 2024-05-07T15:18:39Z

src/concrete/ml/pandas/client_engine.py

@@ -37,7 +41,9 @@ def keygen(self, keys_path: Optional[Union[Path, str]] = None):
        else:
            self.client.keygen(True)

-    def encrypt_from_pandas(self, pandas_dataframe: pandas.DataFrame) -> EncryptedDataFrame:
+    def encrypt_from_pandas(
+        self, pandas_dataframe: pandas.DataFrame, schema: Optional[Dict] = None


A schema is optional. If set, it should follow a specific format.

If needed, we could also accept the output of get_schema (pandas data-frames) as an input here.

github-actions · 2024-05-27T17:41:09Z

Coverage passed ✅

Coverage details

---------- coverage: platform linux, python 3.8.18-final-0 -----------
Name    Stmts   Miss  Cover   Missing
-------------------------------------
TOTAL    7633      0   100%

59 files skipped due to complete coverage.

fd0r · 2024-06-03T10:36:38Z

tests/pandas/test_pandas.py

+        elif column.dtype == "object":
+            unique_values = column.unique()
+
+            # Only take strings into account and thus avoid NaN values


You could also use the old x != x to detect NaNs

Ah yeah, right, I keep forgetting this trick, thanks.
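
A small sketch of the trick mentioned above, on a made-up object column (the data is illustrative and not taken from the test file). It relies on the fact that NaN is the only common value that compares unequal to itself, so `x == x` keeps everything except NaN.

```python
import numpy
import pandas

# Made-up object column containing strings and a NaN
column = pandas.Series(["apple", numpy.nan, "pear", "apple"], dtype="object")
unique_values = column.unique()

# NaN != NaN, so keeping the values that equal themselves drops the NaN
without_nan = [value for value in unique_values if value == value]

# Equivalent here: only keep string values
strings_only = [value for value in unique_values if isinstance(value, str)]
```

Both lists contain the same values for this column. They would differ if the column held non-string, non-NaN objects, which the `isinstance` filter also drops.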

fd0r

lgtm, responded to the todos

cla-bot bot added the cla-signed label May 7, 2024

RomanBredehoft commented May 7, 2024

chore: allow users to pass schema in encrypted data-frames

3648432

RomanBredehoft force-pushed the feat/allow_users_pass_schema_encrypted_dataframe_4376 branch from e81b54d to 3648432 Compare May 23, 2024 16:10

chore: fix pcc

d0a3f96

RomanBredehoft force-pushed the feat/allow_users_pass_schema_encrypted_dataframe_4376 branch from 4473764 to 760712f Compare May 27, 2024 15:15

chore: add checks for value error raises

6421bc5

RomanBredehoft force-pushed the feat/allow_users_pass_schema_encrypted_dataframe_4376 branch from 760712f to 6421bc5 Compare May 27, 2024 16:35

RomanBredehoft marked this pull request as ready for review May 28, 2024 15:02

RomanBredehoft requested a review from a team as a code owner May 28, 2024 15:02

fd0r reviewed Jun 3, 2024

fd0r approved these changes Jun 3, 2024

andrei-stoian-zama self-requested a review June 4, 2024 08:17

RomanBredehoft merged commit ccd6641 into main Jun 4, 2024
12 checks passed

RomanBredehoft deleted the feat/allow_users_pass_schema_encrypted_dataframe_4376 branch June 4, 2024 08:27
