This document explains the deployment workflow and the model serving pattern for deploying machine learning models with Fully Homomorphic Encryption (FHE) in a client/server setting using Concrete ML.
The steps to prepare a model for encrypted inference in a client/server setting are illustrated as follows:
The training of the model and its compilation to FHE are performed on a development machine.
The following files are created when saving the model:

- `client.zip` contains the following files:
  - `client.specs.json` lists the secure cryptographic parameters needed for the client to generate private and evaluation keys.
  - `serialized_processing.json` describes the pre-processing and post-processing required by the machine learning model, such as quantization parameters to quantize the input and de-quantize the output.
- `server.zip` contains the compiled model. This file is sufficient to run the model on a server. The compiled model is machine-architecture specific: for example, a model compiled on x86 cannot run on ARM.
The compiled model (`server.zip`) is deployed to a server. The cryptographic parameters (`client.zip`) are shared with the clients. In some settings, such as a phone application, `client.zip` can be deployed directly on the client device, so the server does not need to host it.
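To make the `client.zip` layout above concrete, the following sketch builds an in-memory stand-in archive with the same file names; the JSON contents here are placeholders, not real cryptographic or quantization parameters:

```python
import io
import json
import zipfile

# Build a stand-in client.zip with the two files described above.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as z:
    z.writestr("client.specs.json", json.dumps({"note": "placeholder crypto parameters"}))
    z.writestr("serialized_processing.json", json.dumps({"note": "placeholder quantization parameters"}))

# A client can inspect the bundled files the same way for the real archive.
with zipfile.ZipFile(buf) as z:
    names = z.namelist()

print(names)
```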
{% hint style="info" %} Important: In a client-server production using FHE, the server's output format depends on the model type:

- For regressors, the output matches the `predict()` method from scikit-learn, providing direct predictions.
- For classifiers, the output uses the `predict_proba()` method format, offering probability scores for each class, which allows clients to determine class membership by applying a threshold (commonly 0.5). {% endhint %}
The `FHEModelDev`, `FHEModelClient`, and `FHEModelServer` classes in the `concrete.ml.deployment` module simplify the deployment and interaction between the client and server:
- `FHEModelDev`:
  - This class handles the serialization of the underlying FHE circuit as well as the crypto-parameters used for generating the keys.
  - Use the `save` method of this class during the development phase to prepare and save the model artifacts (`client.zip` and `server.zip`). With the `save` method, you can deploy a trained model or a training FHE program.
- `FHEModelClient` is used on the client side for the following actions:
  - Generate and serialize the cryptographic keys.
  - Encrypt the data before sending it to the server.
  - Decrypt the results received from the server.
  - Load quantization parameters and pre/post-processing from `serialized_processing.json`.
- `FHEModelServer` is used on the server side for the following actions:
  - Load the FHE circuit from `server.zip`.
  - Execute the model on encrypted data received from the client.

The following example shows how these classes are used together:
```python
from concrete.ml.sklearn import DecisionTreeClassifier
from concrete.ml.deployment import FHEModelDev, FHEModelClient, FHEModelServer
import numpy as np

# Define the directory for FHE client/server files
fhe_directory = '/tmp/fhe_client_server_files/'

# Initialize the Decision Tree model
model = DecisionTreeClassifier()

# Generate some random data for training
X = np.random.rand(100, 20)
y = np.random.randint(0, 2, size=100)

# Train and compile the model
model.fit(X, y)
model.compile(X)

# Set up the development environment
dev = FHEModelDev(path_dir=fhe_directory, model=model)
dev.save()

# Set up the client
client = FHEModelClient(path_dir=fhe_directory, key_dir="/tmp/keys_client")
serialized_evaluation_keys = client.get_serialized_evaluation_keys()

# Client pre-processes new data
X_new = np.random.rand(1, 20)
encrypted_data = client.quantize_encrypt_serialize(X_new)

# Set up the server
server = FHEModelServer(path_dir=fhe_directory)
server.load()

# Server processes the encrypted data
encrypted_result = server.run(encrypted_data, serialized_evaluation_keys)

# Client decrypts the result
result = client.deserialize_decrypt_dequantize(encrypted_result)
```
- From Client to Server: `serialized_evaluation_keys` (once), `encrypted_data`.
- From Server to Client: `encrypted_result`.
These objects are serialized into bytes to streamline the data transfer between the client and server.
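Because the transferred objects are raw bytes, they can be wrapped in any transport format. A minimal sketch, assuming a JSON-over-HTTP style exchange (the payload contents below are placeholders, not real serialized ciphertexts):

```python
import base64
import json

# Placeholder for the bytes produced by client.quantize_encrypt_serialize().
encrypted_data = b"\x00\x01ciphertext bytes"

# Client side: wrap the bytes as base64 inside a JSON payload.
payload = json.dumps({"encrypted_data": base64.b64encode(encrypted_data).decode()})

# Server side: recover the exact original bytes before running the model.
received = base64.b64decode(json.loads(payload)["encrypted_data"])
print(received == encrypted_data)  # True
```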
The client-side deployment of a machine learning model for secure inference is illustrated as follows:
The workflow contains the following steps:
- Key generation: The client obtains the cryptographic parameters stored in `client.zip` and generates a private encryption/decryption key as well as a set of public evaluation keys.
- Sending public keys: The public evaluation keys are sent to the server, while the secret key remains on the client.
- Data encryption: The private data is encrypted by the client as described in the `serialized_processing.json` file in `client.zip`.
- Data transmission: The encrypted data is sent to the server.
- Encrypted inference: Server-side, the FHE model inference is run on encrypted inputs using the public evaluation keys.
- Data transmission: The encrypted result is returned by the server to the client.
- Data decryption: The client decrypts the result using its private key.
- Post-processing: The client performs any necessary post-processing of the decrypted result as specified in `serialized_processing.json` (part of `client.zip`).
The server-side implementation of a Concrete ML model is illustrated as follows:
The workflow contains the following steps:
- Storing the public key: The public evaluation keys sent by clients are stored.
- Model evaluation: The public evaluation keys are retrieved for the client that is querying the service and used to evaluate the machine learning model stored in `server.zip`.
- Sending back the result: The server sends the encrypted result of the computation back to the client.
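A minimal sketch of the per-client key storage pattern described above, with stand-in bytes instead of real Concrete ML objects (the `KeyStore` class is illustrative, not part of the Concrete ML API):

```python
class KeyStore:
    """Illustrative server-side store for per-client evaluation keys."""

    def __init__(self):
        self._keys = {}

    def store(self, client_id: str, evaluation_keys: bytes) -> None:
        # Storing the public key: keep the evaluation keys sent by a client.
        self._keys[client_id] = evaluation_keys

    def retrieve(self, client_id: str) -> bytes:
        # Model evaluation: fetch the keys for the querying client.
        return self._keys[client_id]

store = KeyStore()
store.store("client-42", b"serialized evaluation keys")  # placeholder bytes
print(store.retrieve("client-42"))
```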
For a complete example, see the client-server notebook or the use-case examples.