Production deployment
This document explains the deployment workflow and the model serving pattern for deploying Fully Homomorphic Encryption machine learning models in a client/server setting using Concrete ML.
Deployment
The steps to prepare a model for encrypted inference in a client/server setting are illustrated as follows:

Model training and compilation
The training of the model and its compilation to FHE are performed on a development machine.
Three different files are created when saving the model:
- `client.zip` contains the following files:
  - `client.specs.json` lists the secure cryptographic parameters needed for the client to generate private and evaluation keys.
  - `serialized_processing.json` describes the pre-processing and post-processing required by the machine learning model, such as quantization parameters to quantize the input and de-quantize the output.
- `server.zip` contains the compiled model. This file is sufficient to run the model on a server. The compiled model is machine-architecture specific; for example, a model compiled on x86 cannot run on ARM.
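The archive layout described above can be sketched with the standard library. The JSON contents below are placeholders only; the real files are produced by Concrete ML when the model is saved:

```python
import io
import json
import zipfile

# Build a stand-in client.zip with the two files described above.
# Real contents are generated by Concrete ML; these are placeholders.
client_buf = io.BytesIO()
with zipfile.ZipFile(client_buf, "w") as zf:
    zf.writestr("client.specs.json", json.dumps({"note": "cryptographic parameters"}))
    zf.writestr("serialized_processing.json", json.dumps({"note": "pre/post-processing"}))

with zipfile.ZipFile(client_buf) as zf:
    names = zf.namelist()
print(names)  # ['client.specs.json', 'serialized_processing.json']
```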
Model deployment
The compiled model (server.zip) is deployed to a server. The cryptographic parameters (client.zip) are shared with the clients. In some settings, such as a phone application, the client.zip can be directly deployed on the client device and the server does not need to host it.
Important: In a client-server production setting using FHE, the server's output format depends on the model type:

- For regressors, the output matches the `predict()` method from scikit-learn, providing direct predictions.
- For classifiers, the output uses the `predict_proba()` method format, offering probability scores for each class, which allows clients to determine class membership by applying a threshold (commonly 0.5).
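For example, a client receiving classifier output can apply the threshold itself. The probability values below are hypothetical:

```python
# Hypothetical decrypted server outputs in predict_proba() format:
# one row per sample, one probability per class (binary classification).
proba = [[0.9, 0.1], [0.35, 0.65], [0.5, 0.5]]

THRESHOLD = 0.5
# Compare the positive-class probability to the threshold.
predictions = [int(row[1] >= THRESHOLD) for row in proba]
print(predictions)  # [0, 1, 1]
```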
Using the API Classes
The `FHEModelDev`, `FHEModelClient`, and `FHEModelServer` classes in the `concrete.ml.deployment` module simplify the deployment and interaction between the client and server:
- `FHEModelDev`: This class handles the serialization of the underlying FHE circuit as well as the crypto-parameters used for generating the keys. Use the `save` method of this class during the development phase to prepare and save the model artifacts (`client.zip` and `server.zip`). With the `save` method, you can deploy a trained model or a training FHE program.
- `FHEModelClient` is used on the client side for the following actions:
  - Generate and serialize the cryptographic keys.
  - Encrypt the data before sending it to the server.
  - Decrypt the results received from the server.
  - Load quantization parameters and pre/post-processing from `serialized_processing.json`.
- `FHEModelServer` is used on the server side for the following actions:
  - Load the FHE circuit from `server.zip`.
  - Execute the model on encrypted data received from the client.
Example Usage
Data transfer overview:
- From client to server: `serialized_evaluation_keys` (once), `encrypted_data`.
- From server to client: `encrypted_result`.
These objects are serialized into bytes to streamline the data transfer between the client and server.
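This byte-level exchange can be mocked end to end with the standard library. Note that the XOR mask below is not FHE, and the mock server merely echoes the ciphertext instead of computing on it; the sketch only shows which side holds which data:

```python
import secrets

class MockClient:
    """Stands in for FHEModelClient; the XOR mask is NOT real encryption."""

    def __init__(self):
        self.private_key = secrets.token_bytes(32)  # never leaves the client

    def serialized_evaluation_keys(self) -> bytes:
        # Sent to the server once, before any inference requests.
        return b"serialized-evaluation-keys"

    def encrypt(self, data: bytes) -> bytes:
        mask = (self.private_key * (len(data) // 32 + 1))[: len(data)]
        return bytes(a ^ b for a, b in zip(data, mask))

    def decrypt(self, data: bytes) -> bytes:
        return self.encrypt(data)  # XOR masking is its own inverse

class MockServer:
    """Stands in for FHEModelServer; a real server computes on ciphertexts."""

    def run(self, encrypted_data: bytes, evaluation_keys: bytes) -> bytes:
        return encrypted_data  # placeholder for the encrypted model output

client, server = MockClient(), MockServer()
evaluation_keys = client.serialized_evaluation_keys()           # client -> server, once
encrypted_data = client.encrypt(b"input features")              # client -> server
encrypted_result = server.run(encrypted_data, evaluation_keys)  # server -> client
result = client.decrypt(encrypted_result)
print(result)  # b'input features'
```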
Serving
The client-side deployment of a machine learning model for secure inference is illustrated as follows:

The workflow contains the following steps:
1. Key generation: The client obtains the cryptographic parameters stored in `client.zip` and generates a private encryption/decryption key as well as a set of public evaluation keys.
2. Sending public keys: The public evaluation keys are sent to the server, while the secret key remains on the client.
3. Data encryption: The private data is encrypted by the client as described in the `serialized_processing.json` file in `client.zip`.
4. Data transmission: The encrypted data is sent to the server.
5. Encrypted inference: Server-side, the FHE model inference is run on encrypted inputs using the public evaluation keys.
6. Data transmission: The encrypted result is returned by the server to the client.
7. Data decryption: The client decrypts the result using its private key.
8. Post-processing: The client performs any necessary post-processing of the decrypted result as specified in `serialized_processing.json` (part of `client.zip`).
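The encrypted payloads exchanged in these steps are opaque byte strings. One possible transport encoding (an assumption here, not mandated by Concrete ML) is base64 inside a JSON request body:

```python
import base64
import json

# Hypothetical ciphertext bytes produced by the client.
encrypted_data = b"\x00\x01\x02-opaque-ciphertext"

# Client side: wrap the bytes in a JSON request body for an HTTP API.
request_body = json.dumps(
    {"encrypted_data": base64.b64encode(encrypted_data).decode("ascii")}
)

# Server side: recover the exact bytes before running inference.
received = base64.b64decode(json.loads(request_body)["encrypted_data"])
print(received == encrypted_data)  # True
```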
The server-side implementation of a Concrete ML model is illustrated as follows:

The workflow contains the following steps:
1. Storing the public keys: The public evaluation keys sent by clients are stored.
2. Model evaluation: The public evaluation keys are retrieved for the client that is querying the service and used to evaluate the machine learning model stored in `server.zip`.
3. Sending back the result: The server sends the encrypted result of the computation back to the client.
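A minimal sketch of the per-client key storage and lookup, assuming an in-memory dict and a hypothetical `client_id` assigned at registration; a real server would run the compiled model from `server.zip` with the retrieved keys:

```python
# In-memory store mapping a hypothetical client_id to its evaluation keys.
key_store: dict[str, bytes] = {}

def store_keys(client_id: str, serialized_evaluation_keys: bytes) -> None:
    """Step 1: keep each client's public evaluation keys."""
    key_store[client_id] = serialized_evaluation_keys

def evaluate(client_id: str, encrypted_data: bytes) -> bytes:
    """Step 2: retrieve the querying client's keys and evaluate the model."""
    evaluation_keys = key_store[client_id]
    # A real deployment would evaluate the compiled model here, e.g.
    # FHEModelServer.run(encrypted_data, evaluation_keys).
    return encrypted_data  # placeholder encrypted result

store_keys("client-42", b"serialized-evaluation-keys")
encrypted_result = evaluate("client-42", b"opaque-ciphertext")
print(encrypted_result)  # b'opaque-ciphertext'
```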
Example notebook
For a complete example, see the client-server notebook or the use-case examples.