Linear models

This page explains Concrete ML linear models for both classification and regression. These models are based on scikit-learn linear models.

Supported models for encrypted inference

The following models are supported for training on clear data and predicting on encrypted data. Their API is similar to that of scikit-learn. These models are also compatible with some of scikit-learn's main workflows, such as Pipeline() and GridSearchCV().

Supported models for encrypted training

In addition to predicting on encrypted data, the following models support training on encrypted data.

| SGDClassifier | SGDClassifier |

Ciphertext format compatibility

These models only support Concrete ciphertexts. See the ciphertexts format documentation for more details.

Quantization parameters

The n_bits parameter controls the bit-width of the inputs and weights of the linear models. Linear models do not use table lookups, which allows their weights and inputs to be high-precision integers.

For models with input dimensions up to 300, the parameter n_bits can be set to 8 or more. When the input dimensions are larger, n_bits must be reduced to 6 or 7. In many cases, quantized models preserve all performance metrics of the non-quantized float models from scikit-learn with n_bits as low as 6. You should validate accuracy on held-out test sets and adjust n_bits accordingly.
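
The trade-off between n_bits and input dimension can be sketched in plain Python, independently of Concrete ML. The sketch below assumes a simple uniform quantization scheme, and it assumes that the limit on n_bits for wide inputs comes from the bit-width of the accumulated dot product. The helpers quantize, dequantize, and worst_case_accumulator_bits are hypothetical names used for illustration, not part of the library, and Concrete ML's actual quantizer and limits may differ.

```python
def quantize(values, n_bits):
    """Map floats to unsigned integers in [0, 2**n_bits - 1] (illustrative uniform scheme)."""
    low, high = min(values), max(values)
    levels = 2 ** n_bits - 1
    # Guard against a constant input, where the range would be zero.
    scale = (high - low) / levels if high > low else 1.0
    q_values = [round((v - low) / scale) for v in values]
    return q_values, scale, low


def dequantize(q_values, scale, zero):
    """Map quantized integers back to approximate floats."""
    return [q * scale + zero for q in q_values]


def worst_case_accumulator_bits(n_features, n_bits):
    """Bits needed for a sum of n_features products of two unsigned n_bits integers."""
    max_product = (2 ** n_bits - 1) ** 2
    return (n_features * max_product).bit_length()


# Hypothetical weights, used only to show how the rounding error grows as n_bits shrinks.
weights = [-1.0, -0.25, 0.0, 0.5, 1.0]
for n_bits in (8, 6, 3):
    q, scale, zero = quantize(weights, n_bits)
    restored = dequantize(q, scale, zero)
    max_error = max(abs(a - b) for a, b in zip(weights, restored))
    print(f"n_bits={n_bits}: max round-trip error {max_error:.4f}")

# Wider inputs need more accumulator bits at the same n_bits.
for n_features, n_bits in ((300, 8), (1000, 6)):
    bits = worst_case_accumulator_bits(n_features, n_bits)
    print(f"{n_features} features at {n_bits} bits: up to {bits} accumulator bits")
```

Under these assumptions, lowering n_bits trades a larger rounding error for a smaller accumulator, which is why accuracy should be checked on held-out data after changing it.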

Pre-trained models

You can convert an already trained scikit-learn linear model to a Concrete ML one by using the from_sklearn_model method. See the following example.

Example

The following example shows how to train a LogisticRegression model on a simple dataset and then use FHE to perform inference on encrypted data. You can find a more complete example in the LogisticRegression notebook.

Model accuracy

The figure below compares the decision boundary of the FHE classifier and a scikit-learn model executed in clear. You can find the complete code in the LogisticRegression notebook.

The overall accuracy scores are identical (93%) between the scikit-learn model (executed in the clear) and the Concrete ML one (executed in FHE). In fact, quantization has little impact on the decision boundaries, as linear models can use large precision numbers when quantizing inputs and weights in Concrete ML. Additionally, as the linear models do not use Programmable Bootstrapping (PBS), the FHE computations are always exact, irrespective of the PBS error tolerance setting. This ensures that the FHE predictions are always identical to the quantized clear ones.
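
The reasoning above can be illustrated with a small plain-Python sketch. It does not use FHE or Concrete ML. It only reproduces the kind of integer computation that a quantized linear model performs, under the assumption of a simple symmetric quantization scheme with scales fixed in advance. The helpers scale_for, quantize, and quantized_score, along with the sample weights and input, are hypothetical and chosen for illustration.

```python
def scale_for(max_abs, n_bits):
    """Step size of a symmetric signed quantizer covering [-max_abs, max_abs]."""
    return max_abs / (2 ** (n_bits - 1) - 1)


def quantize(values, scale):
    """Round each float to the nearest integer multiple of scale."""
    return [round(v / scale) for v in values]


def quantized_score(x, w, b, x_scale, w_scale):
    """Return the integer accumulator and the rescaled linear score."""
    q_x = quantize(x, x_scale)
    q_w = quantize(w, w_scale)
    # Integer multiply-accumulate: exact, with no rounding between operations.
    accumulator = sum(a * c for a, c in zip(q_x, q_w))
    return accumulator, accumulator * x_scale * w_scale + b


# Hypothetical model and input, used only for illustration.
w, b = [0.8, -0.5, 0.3], 0.1
x = [1.2, 0.4, -2.0]
float_score = sum(xi * wi for xi, wi in zip(x, w)) + b

for n_bits in (8, 2):
    # Scales are assumed to be calibrated ahead of time from known value ranges.
    x_scale = scale_for(2.0, n_bits)
    w_scale = scale_for(0.8, n_bits)
    acc, score = quantized_score(x, w, b, x_scale, w_scale)
    print(f"n_bits={n_bits}: accumulator={acc}, score={score:.4f}, float={float_score:.4f}")
```

In this sketch the only source of error is the initial rounding of inputs and weights. The accumulation itself is integer arithmetic, so repeating it always gives the same result, which mirrors why exact FHE evaluation matches the quantized clear prediction.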

Figures: scikit-learn model decision boundaries; FHE model decision boundaries.

Loading a pre-trained model

An alternative to the example above is to train a scikit-learn model in a separate step and then to convert it to Concrete ML.
