
0.6 What is Concrete ML?

⭐️ Star the repo on GitHub | 🗣 Community support forum | 📁 Contribute to the project

Concrete-ML is an open-source, privacy-preserving, machine learning inference framework based on fully homomorphic encryption (FHE). It enables data scientists without any prior knowledge of cryptography to automatically turn machine learning models into their FHE equivalent, using familiar APIs from Scikit-learn and PyTorch (see how it looks for linear models, tree-based models, and neural networks).

Fully Homomorphic Encryption (FHE) is an encryption technique that allows computing directly on encrypted data, without needing to decrypt it. With FHE, you can build private-by-design applications without compromising on features. You can learn more about FHE in this introduction or by joining the FHE.org community.

Example usage

Here is a simple example of classification on encrypted data using logistic regression. More examples can be found .

This example shows the typical flow of a Concrete-ML model:

The model is trained on unencrypted (plaintext) data using scikit-learn. As FHE operates over integers, Concrete-ML quantizes the model to use only integers during inference.
The quantized model is compiled to a FHE equivalent. Under the hood, the model is first converted to a Concrete-Numpy program, then compiled.
Inference can then be done on encrypted data. The example shows encrypted inference in the model-development phase. Alternatively, in a client/server setting, the data is encrypted by the client, processed securely by the server, and then decrypted by the client.

Current limitations

To make a model work with FHE, the only constraint is to make it run within the supported precision limitations of Concrete-ML (currently 16-bit integers). Thus, machine learning models are required to be quantized, which sometimes leads to a loss of accuracy versus the original model, which operates on plaintext.
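The accuracy cost of quantization can be illustrated with a minimal, library-independent sketch. It assumes uniform quantization over the observed range of the values; the helper names `quantize` and `dequantize` are hypothetical and are not part of the Concrete-ML API, which applies its own quantizers internally.

```python
def quantize(values, n_bits):
    """Map floats to unsigned n_bits integers on a uniform grid."""
    lo, hi = min(values), max(values)
    levels = 2**n_bits - 1
    scale = (hi - lo) / levels if hi > lo else 1.0
    q = [round((v - lo) / scale) for v in values]
    return q, scale, lo


def dequantize(q_values, scale, zero):
    """Map the integers back to approximate float values."""
    return [q * scale + zero for q in q_values]


# Hypothetical float weights of a trained model
weights = [-1.0, -0.35, 0.0, 0.2, 0.77, 1.0]

for n_bits in (2, 4, 8):
    q, scale, zero = quantize(weights, n_bits)
    restored = dequantize(q, scale, zero)
    max_err = max(abs(a - b) for a, b in zip(weights, restored))
    print(f"{n_bits} bits -> integers {q}, max error {max_err:.4f}")
```

The rounding error is bounded by half a quantization step, so fewer bits mean a coarser grid and a larger possible deviation from the original model.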

Additionally, Concrete-ML currently only supports FHE inference. Training has to be done on unencrypted data, producing a model which is then converted to a FHE equivalent that can perform encrypted inference, i.e. prediction over encrypted data.

Finally, in Concrete-ML there is currently no support for pre-processing model inputs and post-processing model outputs. These processing stages may involve text-to-numerical feature transformation, dimensionality reduction, KNN or clustering, featurization, normalization, and the mixing of results of ensemble models.

All of these issues are currently being addressed and significant improvements are expected to be released in the coming months.

Concrete stack

Concrete-ML is built on top of Zama's Concrete framework. It uses , which itself uses the and the . To use these libraries directly, refer to the and documentations.

Online demos and tutorials

Various tutorials are available for the and for . In addition, several standalone demos for use-cases can be found in the section.

If you have built awesome projects using Concrete-ML, feel free to let us know and we'll link to your work!

Additional resources

Looking for support? Ask our team!

Support forum: (we answer in less than 24 hours).
Live discussion on the FHE.org Discord server: (inside the #concrete channel).
Do you have a question about Zama? You can write us on or send us an email at: [email protected]

Getting Started

Installation

Please note that not all hardware/OS combinations are supported. Determine your platform, OS version, and Python version before referencing the table below.

Depending on your OS, Concrete-ML may be installed with Docker or with pip:

OS / HW | Available on Docker | Available on pip

Inference in the Cloud

Concrete-ML models can be easily deployed in a client/server setting, enabling the creation of privacy-preserving services in the cloud.

As seen in the , a Concrete-ML model, once compiled to FHE, generates machine code that performs the inference on private data. Furthermore, secret encryption keys are needed so that the user can securely encrypt their data and decrypt the inference result. An evaluation key is also needed for the server to securely process the user's encrypted data.

Keys are generated by the user once for each service they use, based on the model the service provides and its cryptographic parameters.

The overall communications protocol to enable cloud deployment of machine learning services can be summarized in the following diagram:

Demos and Tutorials

This section lists several demos that apply Concrete-ML to some popular machine learning problems. They show how to build ML models that perform well under FHE constraints, and then how to perform the conversion to FHE.

Simpler tutorials that discuss only model usage and compilation are also available for the and for .

Built-in Models

Pandas

Concrete-ML provides partial support for Pandas, with most available models (linear and tree-based models) usable on Pandas dataframes just as they would be used with NumPy arrays.

The table below summarizes current compatibility:

Methods | Support Pandas dataframe
fit | ✓
compile | ✗

Example

The following example considers a LogisticRegression model on a simple classification problem. A more advanced example can be found in the , which considers a XGBClassifier.

Deep Learning

Deep Learning Examples

These examples illustrate the basic usage of Concrete-ML to build various types of neural networks. They use simple data-sets, focusing on the syntax and usage of Concrete-ML. For examples showing how to train high-accuracy models on more complex data-sets, see the section.

FHE constraints considerations

The examples listed here make use of simulation (using the ) to perform evaluation over large test sets. Since FHE execution can be slow, only a few FHE executions can be performed. The of Concrete-ML ensure that accuracy measured with simulation is the same that will be obtained during FHE execution.

Advanced topics

Production Deployment

Concrete-ML provides functionality to deploy FHE machine learning models in a client/server setting. The deployment workflow and model serving pattern is as follows:

Deployment

The diagram above shows the steps that a developer goes through to prepare a model for encrypted inference in a client/server setting. The training of the model and its compilation to FHE are performed on a development machine. Three different files are created when saving the model:

Developer Guide

Workflow

Set Up Docker

Before you start this section, you must install Docker by following the official guide.

Building the image

Once you have access to this repository and the dev environment is installed on your host OS (via make setup_env once ), you should be able to launch the commands to build the dev Docker image with make docker_build.

Documentation

Using GitBook

Documentation with GitBook is done mainly by pushing content to GitHub. GitBook then pulls the docs from the repository and publishes them. In most cases, GitBook is just a mirror of what is available on GitHub.

There are, however, some use-cases where documentation can be modified directly in GitBook (with the modifications then pushed to GitHub), for example when the documentation is modified by a person outside of Zama. In this case, a GitHub branch is created, and a GitBook space is associated with it: modifications are done in this space and automatically pushed to the branch. Once the modifications have been completed, one can simply create a pull request to merge them into the main branch.

Using Sphinx

Documentation can alternatively be built using Sphinx:

The documentation contains both files written by hand by developers (the .md files) and files automatically created by parsing the source files.

Then, to open it, go to docs/_build/html/index.html or use the following command:

To build and open the docs at the same time, use:

Support and Issues

Concrete-ML is a constant work-in-progress, and thus may contain bugs or suboptimal APIs.

Before opening an issue or asking for support, please read this documentation to understand common issues and limitations of Concrete-ML. You can also check the outstanding issues on GitHub.

Furthermore, undefined behavior may occur if the input-set, which is internally used by the compilation core to set bit-widths of some intermediate data, is not sufficiently representative of the future user inputs. For example, suppose that, for every input in the input-set, some intermediate data can be represented as an n-bit integer, but that for a particular later computation this same intermediate data needs additional bits. The FHE execution of that computation will produce an incorrect output, as typically occurs with integer overflows in classical programs.
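The effect of an unrepresentative input-set can be pictured with a small plain-Python sketch. It is only an analogy: the bit-width is inferred from a hypothetical calibration set, and the overflow is modelled as a wrap-around modulo 2^n, which is an assumption made for illustration and not a description of what the FHE runtime actually returns.

```python
def required_bits(value):
    """Number of bits needed to store a non-negative integer."""
    return max(1, value.bit_length())


def accumulate(inputs, weights):
    """Integer dot product, standing in for an intermediate value."""
    return sum(x * w for x, w in zip(inputs, weights))


weights = [3, 1, 2]

# Calibration input-set: every feature stays in [0, 7]
input_set = [[1, 2, 3], [7, 0, 4], [5, 5, 5]]
n_bits = max(required_bits(accumulate(x, weights)) for x in input_set)
print("bit-width inferred from the input-set:", n_bits)

# A later input outside the calibrated range
new_input = [15, 15, 15]
true_value = accumulate(new_input, weights)
wrapped = true_value % (2**n_bits)  # what an n_bits register would hold
print("true value:", true_value, "| value kept on", n_bits, "bits:", wrapped)
```

Here the input-set suggests that 5 bits are enough, while the later input produces a value that needs 7 bits, so the stored result no longer matches the true one.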

If you didn't find an answer, you can ask a question on the Zama forum or in the FHE.org Discord.

Submitting an issue

When submitting an issue, include as much information as possible. In addition to the Python script, the following information is useful:

the reproducibility rate you see on your side
any insight you might have on the bug
any workaround you have been able to find

If you would like to contribute to a project and send pull requests, take a look at the guide.

Inner Workings

concrete.ml.common.check_inputs.md

module `concrete.ml.common.check_inputs`

Check and conversion tools.

Utils that are used to check (including convert) some data types which are compatible with scikit-learn to numpy types.

concrete.ml.common.debugging.custom_assert.md

module `concrete.ml.common.debugging.custom_assert`

Provide some variants of assert.

concrete.ml.common.debugging.md

module `concrete.ml.common.debugging`

Module for debugging.

concrete.ml.common.md

module `concrete.ml.common`

Module for shared data structures and code.

Global Variables

debugging
check_inputs
utils

concrete.ml.deployment.md

module `concrete.ml.deployment`

Module for deployment of the FHE model.

concrete.ml.onnx.convert.md

module `concrete.ml.onnx.convert`

ONNX conversion related code.

Global Variables

IMPLEMENTED_ONNX_OPS
OPSET_VERSION_FOR_ONNX_EXPORT

function `get_equivalent_numpy_forward_and_onnx_model`

Get the numpy equivalent forward of the provided torch Module.

Args:

torch_module (torch.nn.Module): the torch Module for which to get the equivalent numpy forward.
dummy_input (Union[torch.Tensor, Tuple[torch.Tensor, ...]]): dummy inputs for ONNX export.
output_onnx_file

Returns:

Tuple[Callable[..., Tuple[numpy.ndarray, ...]], onnx.GraphProto]: The function that will execute the equivalent numpy code to the passed torch_module and the generated ONNX model.

function `get_equivalent_numpy_forward`

Get the numpy equivalent forward of the provided ONNX model.

Args:

onnx_model (onnx.ModelProto): the ONNX model for which to get the equivalent numpy forward.
check_model (bool): set to True to run the onnx checker on the model. Defaults to True.

Raises:

ValueError: Raised if there is an unsupported ONNX operator required to convert the torch model to numpy.

Returns:

Callable[..., Tuple[numpy.ndarray, ...]]: The function that will execute the equivalent numpy function.

concrete.ml.onnx.md

module `concrete.ml.onnx`

ONNX module.

Global Variables

onnx_impl_utils
ops_impl
onnx_utils

concrete.ml.onnx.onnx_utils.md

module `concrete.ml.onnx.onnx_utils`

Utils to interpret an ONNX model with numpy.

concrete.ml.pytest.md

module `concrete.ml.pytest`

Module which is used to contain common functions for pytest.

Global Variables

torch_models
utils

concrete.ml.quantization.md

module `concrete.ml.quantization`

Modules for quantization.

Global Variables

quantizers
base_quantized_op
quantized_module

concrete.ml.sklearn.md

module `concrete.ml.sklearn`

Import sklearn models.

concrete.ml.sklearn.svm.md

module `concrete.ml.sklearn.svm`

Implement Support Vector Machine.

concrete.ml.sklearn.torch_modules.md

module `concrete.ml.sklearn.torch_modules`

Implement torch module.

concrete.ml.sklearn.tree_to_numpy.md

module `concrete.ml.sklearn.tree_to_numpy`

Implements the conversion of a tree model to a numpy function.

concrete.ml.torch.md

module `concrete.ml.torch`

Modules for torch to numpy conversion.

Global Variables

numpy_module

concrete.ml.torch.numpy_module.md

module `concrete.ml.torch.numpy_module`

A torch to numpy module.

concrete.ml.version.md

module `concrete.ml.version`

File to manage the version of the package.

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from concrete.ml.sklearn import LogisticRegression

# Let's create a synthetic data-set
x, y = make_classification(n_samples=100, class_sep=2, n_features=30, random_state=42)

# Split the data-set into a train and test set
X_train, X_test, y_train, y_test = train_test_split(
    x, y, test_size=0.2, random_state=42
)

# Now we train in the clear and quantize the weights
model = LogisticRegression(n_bits=8)
model.fit(X_train, y_train)

# We can simulate the predictions in the clear
y_pred_clear = model.predict(X_test)

# We then compile on a representative set
model.compile(X_train)

# Finally we run the inference on encrypted inputs
y_pred_fhe = model.predict(X_test, execute_in_fhe=True)

print("In clear  :", y_pred_clear)
print("In FHE    :", y_pred_fhe)
print(f"Similarity: {int((y_pred_fhe == y_pred_clear).mean()*100)}%")

# Output:
# In clear  : [0 0 0 0 1 0 1 0 1 1 0 0 1 0 0 1 1 1 0 0]
# In FHE    : [0 0 0 0 1 0 1 0 1 1 0 0 1 0 0 1 1 1 0 0]
# Similarity: 100%

# Without local volume:
docker run --rm -it -p 8888:8888 zamafhe/concrete-ml

# With local volume to save notebooks on host:
docker run --rm -it -p 8888:8888 -v /host/path:/data zamafhe/concrete-ml

import numpy as np
import pandas as pd
from concrete.ml.sklearn import LogisticRegression
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Create the data set as a Pandas dataframe
X, y = make_classification(
    n_samples=250,
    n_features=30,
    n_redundant=0,
    random_state=2,
)
X, y = pd.DataFrame(X), pd.DataFrame(y)

# Retrieve train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.4, random_state=42)

# Instantiate the model
model = LogisticRegression(n_bits=8)

# Fit the model
model.fit(X_train, y_train)

# Evaluate the model on the test set in clear
y_pred_clear = model.predict(X_test)

# Compile the model
model.compile(X_train.to_numpy())

# Perform the inference in FHE
y_pred_fhe = model.predict(X_test, execute_in_fhe=True)

# Compare the FHE predictions with the clear predictions
print(
    f"{(y_pred_fhe == y_pred_clear).sum()} "
    f"examples over {len(y_pred_fhe)} have a FHE inference equal to the clear inference."
)

# Output:
# 100 examples over 100 have a FHE inference equal to the clear inference.

Key Concepts

Concrete-ML is built on top of Concrete-Numpy, which enables Numpy programs to be converted into FHE circuits.

Lifecycle of a Concrete-ML model

I. Model development

training: A model is trained using plaintext, non-encrypted, training data.
quantization: The model is converted into an integer equivalent using quantization. Concrete-ML performs this step either during training (Quantization Aware Training) or after training (Post-training Quantization), depending on model type. Quantization converts inputs, model weights, and all intermediate values of the inference computation to integers. More information is available .
simulation

You can see some examples of the model development workflow .

II. Model deployment

client/server deployment: In a client/server setting, the model can be exported in a way that:
- allows the client to generate keys, encrypt, and decrypt.
- provides a compiled model that can run on the server to perform inference on encrypted data.

You can see an example of the model deployment workflow .

Cryptography concepts

Concrete-ML and Concrete-Numpy are tools that hide away the details of the underlying cryptography scheme, called TFHE. However, some cryptography concepts are still useful when using these two toolkits:

encryption/decryption: These operations transform plaintext, i.e. human-readable information, into ciphertext, i.e. data that contains a form of the original plaintext that is unreadable by a human or computer without the proper key to decrypt it. Encryption takes plaintext and an encryption key and produces ciphertext, while decryption is the inverse operation.
encrypted inference: FHE allows a third party to execute (i.e. run inference or predict) a machine learning model on encrypted data (a ciphertext). The result of the inference is also encrypted and can only be read by the person who receives the decryption key.

While Concrete-ML users only need to understand the cryptography concepts above, for a deeper understanding of the cryptography behind the Concrete stack, please see the or .

Model accuracy considerations under FHE constraints

To respect FHE constraints, all numerical programs that include non-linear operations over encrypted data must have all inputs, constants, and intermediate values represented with integers of a maximum of 16 bits.

Thus, Concrete-ML quantizes the input data and model outputs in the same way as weights and activations. The main levers to control accumulator bit-width are the number of bits used for the inputs, weights, and activations of the model. These parameters are crucial to comply with the constraint on accumulator bit-widths. Please refer to for more details about how to develop models with quantization in Concrete-ML.
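How these levers act on the accumulator can be estimated with a worst-case bound. The sketch below is a simplification that assumes unsigned inputs and weights which all take their maximal value at once; the function name `max_accumulator_bits` and the choice of 30 features are hypothetical, and real models usually stay well below this bound.

```python
def max_accumulator_bits(n_features, n_bits_inputs, n_bits_weights):
    """Worst-case bit-width of sum(x_i * w_i) for unsigned integers."""
    max_input = 2**n_bits_inputs - 1
    max_weight = 2**n_bits_weights - 1
    worst_sum = n_features * max_input * max_weight
    return worst_sum.bit_length()


for n_in, n_w in [(8, 8), (4, 4), (3, 2)]:
    bits = max_accumulator_bits(
        n_features=30, n_bits_inputs=n_in, n_bits_weights=n_w
    )
    print(f"inputs={n_in} bits, weights={n_w} bits -> accumulator needs {bits} bits")
```

Under these assumptions, 8-bit inputs and weights over 30 features could need up to 21 bits, whereas 4-bit inputs and weights stay within 13 bits.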

However, these methods may cause a reduction in the accuracy of the model since its representative power is diminished. Most importantly, carefully choosing a quantization approach can alleviate accuracy loss, all the while allowing compilation to FHE. Concrete-ML offers built-in models that already include quantization algorithms, and users only need to configure some of their parameters, such as the number of bits, discussed above. See for information about configuring these parameters for various models.

Additional specific methods can help to make models compatible with FHE constraints. For instance, dimensionality reduction can reduce the number of input features and, thus, the maximum accumulator bit-width reached within a circuit. Similarly, sparsity-inducing training methods, such as pruning, deactivate some features during inference, which also helps. For now, dimensionality reduction is considered as a pre-processing step, while pruning is used in the .

The configuration of model quantization parameters is illustrated in the advanced examples for and dimensionality reduction is shown in the .

Built-in Model Examples

These examples illustrate the basic usage of built-in Concrete-ML models. For more examples showing how to train high-accuracy models on more complex data-sets, see the Demos and Tutorials section.

FHE constraints considerations

In Concrete-ML, built-in linear models are exact equivalents to their scikit-learn counterparts. Indeed, since they do not apply any non-linearity during inference, these models are very fast (~1ms FHE inference time) and can use high-precision integers (between 20 and 25 bits).

Tree-based models apply non-linear functions that enable comparisons of inputs and trained thresholds. Thus, they are limited with respect to the number of bits used to represent the inputs. But as these examples show, in practice 5-6 bits are sufficient to exactly reproduce the behavior of their scikit-learn counterpart models.
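Why a handful of bits can be enough for threshold comparisons can be seen on a toy case. The sketch below quantizes feature values and a node threshold on the same grid and counts how many decisions match the floating point ones. The data, the threshold, and the helper `to_grid` are made up for illustration; this is not the quantizer used by Concrete-ML.

```python
def to_grid(value, lo, hi, n_bits):
    """Quantize a float in [lo, hi] to an unsigned n_bits integer."""
    levels = 2**n_bits - 1
    return round((value - lo) / (hi - lo) * levels)


samples = [0.05, 0.31, 0.48, 0.52, 0.90]  # feature values in [0, 1]
threshold = 0.5                            # threshold of a tree node

agreement = {}
for n_bits in (2, 6):
    q_threshold = to_grid(threshold, 0.0, 1.0, n_bits)
    agreement[n_bits] = sum(
        (x <= threshold) == (to_grid(x, 0.0, 1.0, n_bits) <= q_threshold)
        for x in samples
    )
    print(f"{n_bits} bits: {agreement[n_bits]}/{len(samples)} decisions match")
```

A decision can only flip when a value and the threshold fall into the same quantization bin, which becomes rarer as the grid gets finer.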

As shown in the examples below, built-in neural networks can be configured to work with user-specified accumulator sizes, which allow the user to adjust the speed/accuracy tradeoff.

It is recommended to use to configure the speed/accuracy trade-off for tree-based models and neural networks, using grid-search or your own heuristics.

List of examples

1. Linear and logistic regression

These examples show how to use the built-in linear models on synthetic data, which allows for easy visualization of the decision boundaries or trend lines. Executing these 1D and 2D models in FHE takes around 1 millisecond.

2. Generalized linear models

These two examples show generalized linear models (GLM) on the real-world data-set. As the non-linear, inverse-link functions are computed, these models do not use , and are, thus, very fast (~1ms execution time).

3. Decision tree

Using the data-set, this example shows how to train a classifier that detects spam, based on features extracted from email messages. A grid-search is performed over decision-tree hyper-parameters to find the best ones.

4. XGBoost and Random Forest classifier

This example shows how to train tree-ensemble models (either XGBoost or Random Forest), first on a synthetic data-set, and then on the data-set. Grid-search is used to find the best number of trees in the ensemble.

5. XGBoost regression

Privacy-preserving prediction of house prices is shown in this example, using the data-set. Using 50 trees in the ensemble, with 5 bits of precision for the input features, the FHE regressor obtains an $R^2$ score of 0.90 and an execution time of 7-8 seconds.

6. Fully connected neural network

Two different configurations of the built-in, fully-connected neural networks are shown. First, a small bit-width accumulator network is trained on and compared to a PyTorch floating point network. Second, a larger accumulator (>8 bits) is demonstrated on .

7. Comparison of classifiers

Based on three different synthetic data-sets, all the built-in classifiers are demonstrated in this notebook, showing accuracies, inference times, accumulator bit-widths, and decision boundaries.

Pruning

Pruning is a method to reduce neural network complexity, usually applied in order to reduce the computation cost or memory size. Pruning is used in Concrete-ML to control the size of accumulators in neural networks, thus making them FHE-compatible. See here for an explanation of accumulator bit-width constraints.

Overview of pruning in Concrete ML

Pruning is used in Concrete-ML for two types of neural networks:

Built-in neural networks include a pruning mechanism that can be parameterized by the user. The pruning type is based on L1-norm. To comply with FHE constraints, Concrete-ML uses unstructured pruning, as the aim is not to eliminate neurons or convolutional filters completely, but to decrease their accumulator bit-width.
Custom neural networks, to work well under FHE constraints, should include pruning. When implemented with PyTorch, you can use the framework's pruning mechanism (e.g. L1-Unstructured) to good effect.

Basics of pruning

In neural networks, a neuron computes a linear combination of inputs and learned weights, then applies an activation function.

The neuron computes:
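Written out, with notation chosen here for illustration ($x_i$ for the inputs, $w_i$ for the learned weights, $\phi$ for the activation function, and $y$ for the neuron output; these symbols are assumptions and a bias term is omitted), the computation described above is:

```latex
y = \phi\left( \sum_{i} w_i x_i \right)
```

The sum inside the parentheses is the accumulator whose bit-width matters for FHE compatibility.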

When building a full neural network, each layer will contain multiple neurons, which are connected to the inputs or to the neuron outputs of a previous layer.

For every neuron shown in each layer of the figure above, the linear combinations of inputs and learned weights are computed. Depending on the values of the inputs and weights, the sum - which for Concrete-ML neural networks is computed with integers - can take a range of different values.

To respect the FHE bit-width constraint, the values of the accumulator must remain small enough to be represented using a maximum of 16 bits. In other words, the values must be between 0 and $2^{16}-1$.

Pruning a neural network entails fixing some of the weights to be zero during training. This is advantageous to meet FHE constraints, as irrespective of the distribution of the inputs, multiplying these input values by 0 does not increase the accumulator value.
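The effect on the accumulator can be sketched in plain Python. The example below applies a one-shot magnitude-based (L1-style, unstructured) pruning to the integer weights of a single neuron and compares the worst-case accumulator value before and after. It is a simplification: the helper names and values are hypothetical, and pruning in Concrete-ML is applied during training rather than as a single post-hoc step.

```python
def prune_smallest(weights, n_keep):
    """Keep the n_keep largest-magnitude weights, set the others to zero."""
    ranked = sorted(
        range(len(weights)), key=lambda i: abs(weights[i]), reverse=True
    )
    kept = set(ranked[:n_keep])
    return [w if i in kept else 0 for i, w in enumerate(weights)]


def worst_case_sum(weights, max_input):
    """Largest accumulator magnitude when every input is maximal."""
    return sum(abs(w) * max_input for w in weights)


weights = [3, -1, 2, 0, -3, 1, 2, -2]  # integer (quantized) weights
max_input = 7                           # 3-bit unsigned inputs

pruned = prune_smallest(weights, n_keep=4)
print("pruned weights:", pruned)
print("worst case before:", worst_case_sum(weights, max_input))
print("worst case after: ", worst_case_sum(pruned, max_input))
```

Zeroed weights contribute nothing to the sum regardless of the inputs, so the worst case drops from 98 to 70 in this toy neuron.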

Fixing some of the weights to 0 makes the network graph look more similar to the following:

While pruning weights can reduce the prediction performance of the neural network, studies show that a high level of pruning (above 50%) can often be applied. See here how Concrete-ML uses pruning in .

Pruning in practice

In the formula above, in the worst case, the maximum number of the input and weights that can make the result exceed $n$ bits is given by:

Here, is the maximum precision allowed.

For example, if and with , the worst case is where all inputs and weights are equal to their maximal value . In this case, there can be at most elements in the multi-sums.
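The worst-case count can also be computed directly. The sketch below assumes unsigned weights and inputs that may all take their maximal value at the same time; the function name and the example bit-widths (2-bit weights and 2-bit inputs) are illustrative assumptions, while the 16-bit maximum comes from the constraint stated earlier.

```python
def max_worst_case_terms(n_max, n_bits_weights, n_bits_inputs):
    """Largest number of products w_i * x_i whose sum always fits in n_max
    bits, assuming unsigned values that may all be maximal."""
    largest_sum = 2**n_max - 1
    largest_product = (2**n_bits_weights - 1) * (2**n_bits_inputs - 1)
    return largest_sum // largest_product


# 16-bit accumulator, 2-bit weights, 2-bit inputs: each product is at most 9
print(max_worst_case_terms(n_max=16, n_bits_weights=2, n_bits_inputs=2))

# A much tighter 8-bit accumulator with the same weights and inputs
print(max_worst_case_terms(n_max=8, n_bits_weights=2, n_bits_inputs=2))
```

With these assumed bit-widths, at most 7281 terms fit in a 16-bit accumulator in the worst case, and only 28 in an 8-bit one.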

In practice, the distribution of the weights of a neural network is Gaussian, with many weights either 0 or having a small value. This enables exceeding the worst-case number of active neurons without having to risk overflowing the bit-width. In built-in neural networks, the parameter n_hidden_neurons_multiplier is multiplied with to determine the total number of non-zero weights that should be kept in a neuron.

get_equivalent_numpy_forward_and_onnx_model(
    torch_module: Module,
    dummy_input: Union[Tensor, Tuple[Tensor, ...]],
    output_onnx_file: Optional[Union[Path, str]] = None
) → Tuple[Callable[..., Tuple[ndarray, ...]], GraphProto]

__init__(
    n_bits=8,
    epsilon=0.0,
    tol=0.0001,
    C=1.0,
    loss='epsilon_insensitive',
    fit_intercept=True,
    intercept_scaling=1.0,
    dual=True,
    verbose=0,
    random_state=None,
    max_iter=1000
)

__init__(
    n_bits=8,
    penalty='l2',
    loss='squared_hinge',
    dual=True,
    tol=0.0001,
    C=1.0,
    multi_class='ovr',
    fit_intercept=True,
    intercept_scaling=1,
    class_weight=None,
    verbose=0,
    random_state=None,
    max_iter=1000
)

import onnx
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

from concrete.ml.sklearn import LogisticRegression

# Create the data for classification
x, y = make_classification(n_samples=250, class_sep=2, n_features=30, random_state=42)

# Retrieve train and test sets
X_train, X_test, y_train, y_test = train_test_split(
    x, y, test_size=0.4, random_state=42
)

# Fix the number of bits to use for quantization
model = LogisticRegression(n_bits=8)

# Fit the model
model.fit(X_train, y_train)

# Access to the model
onnx_model = model.onnx_model

# Print the model
print(onnx.helper.printable_graph(onnx_model.graph))

# Save the model
onnx.save(onnx_model, "tmp.onnx")

# And then visualize it with Netron

import numpy as np

# Assume quantized_module : QuantizedModule
#        data: numpy.ndarray of float

# Quantization is done in the clear
x_test_q = quantized_module.quantize_input(data)

for i in range(x_test_q.shape[0]):
    # Inputs must have size (1 x N) or (1 x C x H x W), we add the batch dimension with N=1
    x_q = np.expand_dims(x_test_q[i, :], 0)

    # Execute the model in FHE
    out_fhe = quantized_module.forward_fhe.encrypt_run_decrypt(x_q)

    # Dequantization is done in the clear
    output = quantized_module.dequantize_output(out_fhe)

    # For classifiers with multi-class outputs, the arg max is done in the clear
    y_pred = np.argmax(output, 1)


Compilation

Compilation of a model produces machine code that executes the model on encrypted data. In some cases, notably in the client/server setting, the compilation can be done by the server when loading the model for serving.

As FHE execution is much slower than execution on non-encrypted data, Concrete-ML has a simulation mode, using an execution mode named the Virtual Library. Since, by default, the cryptographic parameters are chosen such that the results obtained in FHE are the same as those on clear data, the Virtual Library allows you to benchmark models quickly during development.

Compilation

Concrete-ML implements machine learning model inference using Concrete-Numpy as a backend. In order to execute in FHE, a numerical program written in Concrete-Numpy needs to be compiled. This functionality is , and Concrete-ML hides away most of the complexity of this step, completing the entire compilation process itself.

From the perspective of the Concrete-ML user, the compilation process performed by Concrete-Numpy can be broken up into 3 steps:

tracing the Numpy program and creating a Concrete-Numpy op-graph
checking the op-graph for FHE compatibility
producing machine code for the op-graph (this step automatically determines cryptographic parameters)

Additionally, the packages the result of the last step in a way that allows the deployment of the encrypted circuit to a server, as well as key generation, encryption, and decryption on the client side.

Simulation with the Virtual Library

The first step in the list above takes a Python function implemented using the Concrete-Numpy and transforms it into an executable operation graph.

The result of this single step of the compilation pipeline allows the:

execution of the op-graph, which includes TLUs, on clear non-encrypted data. This is, of course, not secure, but it is much faster than executing in FHE. This mode is useful for debugging, e.g. to find the appropriate hyper-parameters. This mode is called the Virtual Library (which is referred to as in Concrete-Numpy).
verification of the maximum bit-width of the op-graph, to determine FHE compatibility, without actually compiling the circuit to machine code.
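The bit-width verification can be pictured with a small range-propagation sketch. This is a simplified stand-in written in plain Python, not how Concrete-Numpy measures bit-widths: it propagates integer ranges through a toy graph with interval arithmetic and reports the bits needed for the result. All function names are hypothetical.

```python
def bits_for_range(lo, hi):
    """Bits needed for every integer in [lo, hi] (sign bit added if lo < 0)."""
    if lo >= 0:
        return max(1, hi.bit_length())
    magnitude = max(hi, -lo - 1)
    return magnitude.bit_length() + 1


def mul_const(rng, c):
    """Range of c * x when x lies in rng."""
    a, b = rng[0] * c, rng[1] * c
    return (min(a, b), max(a, b))


def add(rng_a, rng_b):
    """Range of x + y when x lies in rng_a and y lies in rng_b."""
    return (rng_a[0] + rng_b[0], rng_a[1] + rng_b[1])


# Toy graph: y = 3 * x0 - 2 * x1 + 5, with 4-bit unsigned inputs
x0 = x1 = (0, 15)
y = add(add(mul_const(x0, 3), mul_const(x1, -2)), (5, 5))

print("output range:", y)
print("max bit-width:", bits_for_range(*y))
```

Comparing the reported bit-width against the supported maximum indicates whether such a graph could be compiled, without producing any machine code.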

Enabling Virtual Library execution requires the definition of a compilation Configuration. As simulation does not execute in FHE, this can be considered unsafe:

Next, the following code uses the simulation mode for built-in models:

And finally, for custom models, it is possible to enable simulation using the following syntax:

Obtaining the simulated predictions of the models using the Virtual Library has the same syntax as execution in FHE:

Moreover, the maximum accumulator bit-width is determined as follows:

A simple Concrete-Numpy example

While Concrete-ML hides away all the Concrete-Numpy code that performs model inference, it can be useful to understand how Concrete-Numpy code works. Here is a toy example for a simple linear regression model on integers. Note that this is just an example to illustrate compilation concepts. Generally, it is recommended to use the , which provide linear regression out of the box.

Contributing

There are three ways to contribute to Concrete-ML:

You can open issues to report bugs and typos and to suggest ideas.
You can ask to become an official contributor by emailing [email protected]. Only approved contributors can send pull requests (PR), so please make sure to get in touch before you do.
You can also provide new tutorials or use-cases, showing what can be done with the library. The more examples we have, the better and clearer it is for the other users.

1. Creating a new branch

To create your branch, you have to use the issue ID somewhere in the branch name:

e.g.

2. Before committing

2.1 Conformance

Each commit to Concrete-ML should conform to the standards of the project. You can let the development tools fix some issues automatically with the following command:

Conformance can be checked using the following command:

2.2 Testing

Your code must be well documented, must include tests, and must not break existing tests:

You need to make sure you get 100% code coverage. The make pytest command checks that by default and will fail with a coverage report at the end should some lines of your code not be executed during testing.

If your coverage is below 100%, you should write more tests and then create the pull request. If you ignore this warning and create the PR, GitHub actions will fail and your PR will not be merged.

There may be cases where covering your code is not possible (an exception that cannot be triggered in normal execution circumstances). In those cases, you may be allowed to disable coverage for some specific lines. This should be the exception rather than the rule, and reviewers will ask why some lines are not covered. If it appears they can be covered, then the PR won't be accepted in that state.

3. Committing

Concrete-ML uses a consistent commit naming scheme, and you are expected to follow it as well (the CI will make sure you do). The accepted format can be printed to your terminal by running:

e.g.

Just a reminder that commit messages are checked in the conformance step and are rejected if they don't follow the rules. To learn more about conventional commits, check the Conventional Commits page.
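
To get a feel for the expected shape before the CI checks it, here is a minimal sketch of a local check. The pattern `<type>(<optional scope>): <subject>` is an assumption inferred from the example commit messages in this guide; the project's actual rules are the ones printed by its tooling, and the names `COMMIT_PATTERN` and `looks_conventional` are hypothetical.

```python
import re

# Assumed pattern: "<type>(<optional scope>): <subject>"
COMMIT_PATTERN = re.compile(
    r"^(?P<type>[a-z]+)(\((?P<scope>[a-z0-9_-]+)\))?: (?P<subject>\S.*)$"
)


def looks_conventional(message):
    """Return True if the first line of `message` matches the assumed pattern."""
    lines = message.splitlines()
    return bool(lines) and COMMIT_PATTERN.match(lines[0]) is not None


print(looks_conventional("feat: implement bounds checking"))  # True
print(looks_conventional("fix(tracing): fix a bug that crashed PyTorch tracer"))  # True
print(looks_conventional("Fixed a bug"))  # False
```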

4. Rebasing

You should rebase on top of the main branch before you create your pull request. Merge commits are not allowed, so rebasing on main before pushing gives you the best chance of avoiding having to rewrite parts of your PR later if conflicts arise with other PRs being merged. After you commit changes to your new branch, you can use the following commands to rebase:

You can learn more about rebasing .

from concrete.ml.sklearn import NeuralNetClassifier
import torch.nn as nn

n_inputs = 10
n_outputs = 2
params = {
    "module__n_layers": 2,
    "module__n_w_bits": 2,
    "module__n_a_bits": 2,
    "module__n_accum_bits": 8,
    "module__n_hidden_neurons_multiplier": 1,
    "module__n_outputs": n_outputs,
    "module__input_dim": n_inputs,
    "module__activation_function": nn.ReLU,
    "max_epochs": 10,
}

concrete_classifier = NeuralNetClassifier(**params)

from hummingbird.ml import convert  # provides the convert() call used below
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Instantiate the logistic regression from sklearn
model = LogisticRegression()

# Create synthetic data
X, y = make_classification(
    n_samples=100, n_features=20, n_classes=2
)

# Fit the model
model.fit(X, y)

# Convert the model to ONNX
onnx_model = convert(model, backend="onnx", test_input=X).model

import brevitas.nn as qnn  # provides QuantIdentity
import torch
import torch.nn as nn

class QATnetwork(nn.Module):
    def __init__(self):
        super(QATnetwork, self).__init__()
        self.quant_inp = qnn.QuantIdentity(
            bit_width=4, return_quant_tensor=True)
        # ...

    def forward(self, x):
        out = self.quant_inp(x)
        return torch.sigmoid(out)
        # ...

__init__(
    criterion='gini',
    splitter='best',
    max_depth=None,
    min_samples_split=2,
    min_samples_leaf=1,
    min_weight_fraction_leaf=0.0,
    max_features=None,
    random_state=None,
    max_leaf_nodes=None,
    min_impurity_decrease=0.0,
    class_weight=None,
    ccp_alpha: float = 0.0,
    n_bits: int = 6
)

__init__(
    criterion='squared_error',
    splitter='best',
    max_depth=None,
    min_samples_split=2,
    min_samples_leaf=1,
    min_weight_fraction_leaf=0.0,
    max_features=None,
    random_state=None,
    max_leaf_nodes=None,
    min_impurity_decrease=0.0,
    ccp_alpha=0.0,
    n_bits: int = 6
)

__init__(
    n_bits: int = 6,
    n_estimators=20,
    criterion='gini',
    max_depth=4,
    min_samples_split=2,
    min_samples_leaf=1,
    min_weight_fraction_leaf=0.0,
    max_features='sqrt',
    max_leaf_nodes=None,
    min_impurity_decrease=0.0,
    bootstrap=True,
    oob_score=False,
    n_jobs=None,
    random_state=None,
    verbose=0,
    warm_start=False,
    class_weight=None,
    ccp_alpha=0.0,
    max_samples=None
)

__init__(
    n_bits: int = 6,
    n_estimators=20,
    criterion='squared_error',
    max_depth=4,
    min_samples_split=2,
    min_samples_leaf=1,
    min_weight_fraction_leaf=0.0,
    max_features='sqrt',
    max_leaf_nodes=None,
    min_impurity_decrease=0.0,
    bootstrap=True,
    oob_score=False,
    n_jobs=None,
    random_state=None,
    verbose=0,
    warm_start=False,
    ccp_alpha=0.0,
    max_samples=None
)

    quantized_numpy_module = compile_torch_model(
        torch_model,  # our model
        X_train,  # a representative input-set to be used for both quantization and compilation
        n_bits={"net_inputs": 5, "op_inputs": 3, "op_weights": 3, "net_outputs": 5},
        import_qat=is_qat,  # signal to the conversion function whether the network is QAT
        use_virtual_lib=True,
        configuration=COMPIL_CONFIG_VL,
    )

import numpy
from concrete.numpy import compiler

# Let's assume Quantization has been applied and we are left with integers only.
# This is essentially the work of Concrete-ML

# Some parameters (weight and bias) for our model taking a single feature
w = [2]
b = 2

# The function that implements our model
@compiler({"x": "encrypted"})
def linear_model(x):
    return w @ x + b

# A representative input-set is needed to compile the function
# (used for tracing)
n_bits_input = 2
inputset = numpy.arange(0, 2**n_bits_input).reshape(-1, 1)
circuit = linear_model.compile(inputset)

# Use the API to get the maximum bit-width in the circuit
max_bit_width = circuit.graph.maximum_integer_bit_width()
print("Max bit_width = ", max_bit_width)
# Max bit_width =  4

# Test our FHE inference
circuit.encrypt_run_decrypt(numpy.array([3]))
# 8

# Print the graph of the circuit
print(circuit)
# %0 = 2                     # ClearScalar<uint2>
# %1 = [2]                   # ClearTensor<uint2, shape=(1,)>
# %2 = x                     # EncryptedTensor<uint2, shape=(1,)>
# %3 = matmul(%1, %2)        # EncryptedScalar<uint3>
# %4 = add(%3, %0)           # EncryptedScalar<uint4>
# return %4

git commit -m "feat: implement bounds checking"
git commit -m "feat(debugging): add a helper function to draw intermediate representation"
git commit -m "fix(tracing): fix a bug that crashed PyTorch tracer"

# fetch the list of active remote branches
git fetch --all --prune

# checkout to main
git checkout main

# pull the latest changes to main (--ff-only is there to prevent accidental commits to main)
git pull --ff-only

# checkout back to your branch
git checkout $YOUR_BRANCH

# rebase on top of main branch
git rebase main

# If there are conflicts during the rebase, resolve them
# and continue the rebase with the following command
git rebase --continue

# push the latest version of the local branch to remote
git push --force

import numpy
import onnx
import tensorflow
import tf2onnx

from concrete.ml.torch.compile import compile_onnx_model
from concrete.numpy.compilation import Configuration


class FC(tensorflow.keras.Model):
    """A fully-connected model."""

    def __init__(self):
        super().__init__()
        hidden_layer_size = 10
        output_size = 5

        self.dense1 = tensorflow.keras.layers.Dense(
            hidden_layer_size,
            activation=tensorflow.nn.relu,
        )
        self.dense2 = tensorflow.keras.layers.Dense(output_size, activation=tensorflow.nn.relu6)
        self.flatten = tensorflow.keras.layers.Flatten()

    def call(self, inputs):
        """Forward function."""
        x = self.flatten(inputs)
        x = self.dense1(x)
        x = self.dense2(x)
        return self.flatten(x)


n_bits = 6
input_output_feature = 2
input_shape = (input_output_feature,)
num_inputs = 1
n_examples = 5000

# Define the Keras model
keras_model = FC()
keras_model.build((None,) + input_shape)
keras_model.compute_output_shape(input_shape=(None, input_output_feature))

# Create random input
input_set = numpy.random.uniform(-100, 100, size=(n_examples, *input_shape))

# Convert to ONNX
tf2onnx.convert.from_keras(keras_model, opset=14, output_path="tmp.model.onnx")

onnx_model = onnx.load("tmp.model.onnx")
onnx.checker.check_model(onnx_model)

# Compile
quantized_numpy_module = compile_onnx_model(
    onnx_model, input_set, n_bits=2
)

# Create test data from the same distribution and quantize using
# learned quantization parameters during compilation
x_test = tuple(numpy.random.uniform(-100, 100, size=(1, *input_shape)) for _ in range(num_inputs))
qtest = quantized_numpy_module.quantize_input(x_test)

y_clear = quantized_numpy_module(*qtest)
y_fhe = quantized_numpy_module.forward_fhe.encrypt_run_decrypt(*qtest)

print("Execution in clear: ", y_clear)
print("Execution in FHE:   ", y_fhe)
print("Equality:           ", numpy.sum(y_clear == y_fhe), "over", numpy.size(y_fhe), "values")

, then run git lfs pull.

concrete.ml.sklearn.linear_model.md

module `concrete.ml.sklearn.linear_model`

Implement sklearn linear model.

class `LinearRegression`

A linear regression model with FHE.

Parameters:

n_bits (int, Dict[str, int]): Number of bits to quantize the model. If an int is passed for n_bits, the value will be used for quantizing inputs and weights. If a dict is passed, then it should contain "op_inputs" and "op_weights" as keys with corresponding number of quantization bits so that:
- op_inputs: number of bits to quantize the input values
- op_weights: number of bits to quantize the learned parameters
Default to 8.

For more details on LinearRegression please refer to the scikit-learn documentation: https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LinearRegression.html
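
The two accepted forms of n_bits can be illustrated with a small sketch. The helper `normalize_n_bits` is hypothetical and not part of Concrete-ML; it only mirrors the behavior described above, where an int applies to both inputs and weights and a dict must provide both keys.

```python
def normalize_n_bits(n_bits=8):
    """Expand an int or dict `n_bits` into explicit per-role bit counts."""
    if isinstance(n_bits, int):
        # A single int is used for both inputs and weights
        return {"op_inputs": n_bits, "op_weights": n_bits}
    missing = {"op_inputs", "op_weights"} - set(n_bits)
    if missing:
        raise ValueError(f"n_bits dict is missing keys: {sorted(missing)}")
    return {"op_inputs": n_bits["op_inputs"], "op_weights": n_bits["op_weights"]}


print(normalize_n_bits(8))
print(normalize_n_bits({"op_inputs": 5, "op_weights": 3}))
```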

method `__init__`

class `ElasticNet`

An ElasticNet regression model with FHE.

Parameters:

n_bits (int, Dict[str, int]): Number of bits to quantize the model. If an int is passed for n_bits, the value will be used for quantizing inputs and weights. If a dict is passed, then it should contain "op_inputs" and "op_weights" as keys with corresponding number of quantization bits so that:
- op_inputs: number of bits to quantize the input values
- op_weights: number of bits to quantize the learned parameters
Default to 8.

For more details on ElasticNet please refer to the scikit-learn documentation: https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.ElasticNet.html

method `__init__`

class `Lasso`

A Lasso regression model with FHE.

Parameters:

n_bits (int, Dict[str, int]): Number of bits to quantize the model. If an int is passed for n_bits, the value will be used for quantizing inputs and weights. If a dict is passed, then it should contain "op_inputs" and "op_weights" as keys with corresponding number of quantization bits so that:
- op_inputs: number of bits to quantize the input values
- op_weights: number of bits to quantize the learned parameters
Default to 8.

For more details on Lasso please refer to the scikit-learn documentation: https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.Lasso.html

method `__init__`

class `Ridge`

A Ridge regression model with FHE.

Parameters:

n_bits (int, Dict[str, int]): Number of bits to quantize the model. If an int is passed for n_bits, the value will be used for quantizing inputs and weights. If a dict is passed, then it should contain "op_inputs" and "op_weights" as keys with corresponding number of quantization bits so that:
- op_inputs: number of bits to quantize the input values
- op_weights: number of bits to quantize the learned parameters
Default to 8.

For more details on Ridge please refer to the scikit-learn documentation: https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.Ridge.html

method `__init__`

class `LogisticRegression`

A logistic regression model with FHE.

Parameters:

n_bits (int, Dict[str, int]): Number of bits to quantize the model. If an int is passed for n_bits, the value will be used for quantizing inputs and weights. If a dict is passed, then it should contain "op_inputs" and "op_weights" as keys with corresponding number of quantization bits so that:
- op_inputs: number of bits to quantize the input values
- op_weights: number of bits to quantize the learned parameters
Default to 8.

For more details on LogisticRegression please refer to the scikit-learn documentation: https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html

method `__init__`

# Try to install the env normally
make setup_env

# If you are still having issues, sync the environment
make sync_env

# If you are still having issues on your OS, delete the venv:
rm -rf .venv

# And re-run the env setup
make setup_env

# Try to install the env normally
make setup_env

# If you are still having issues, sync the environment
make sync_env

# If you are still having issues in Docker, delete the venv:
rm -rf ~/dev_venv/*

# Disconnect from Docker
exit

# And relaunch, the venv will be reinstalled
make docker_start

# If you are still out of luck, force a rebuild which will also delete the volumes
make docker_rebuild

# And start Docker, which will reinstall the venv
make docker_start

import numpy
from tqdm import tqdm
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

from concrete.ml.sklearn import LogisticRegression

# Create the data for classification:
X, y = make_classification(
    n_features=30,
    n_redundant=0,
    n_informative=2,
    random_state=2,
    n_clusters_per_class=1,
    n_samples=250,
)

# Retrieve train and test sets:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.4, random_state=42)

# Instantiate the model:
model = LogisticRegression(n_bits=8)

# Fit the model:
model.fit(X_train, y_train)

# Evaluate the model on the test set in clear:
y_pred_clear = model.predict(X_test)

# Compile the model:
model.compile(X_train)

# Perform the inference in FHE:
y_pred_fhe = model.predict(X_test, execute_in_fhe=True)

# Assert that FHE predictions are the same as the clear predictions:
print(
    f"{(y_pred_fhe == y_pred_clear).sum()} examples over {len(y_pred_fhe)} "
    "have a FHE inference equal to the clear inference."
)

# Output:
#  100 examples over 100 have a FHE inference equal to the clear inference

from sklearn.datasets import load_breast_cancer
from sklearn.decomposition import PCA
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

from concrete.ml.sklearn.xgb import XGBClassifier


# Get data-set and split into train and test
X, y = load_breast_cancer(return_X_y=True)

# Split the train and test set
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# Define our model
model = XGBClassifier(n_jobs=1, n_bits=3)

# Define the pipeline
# We will normalize the data and apply a PCA before fitting the model
pipeline = Pipeline(
    [("standard_scaler", StandardScaler()), ("pca", PCA(random_state=0)), ("model", model)]
)

# Define the parameters to tune
param_grid = {
    "pca__n_components": [2, 5, 10, 15],
    "model__max_depth": [2, 3, 5],
    "model__n_estimators": [5, 10, 20],
}

# Instantiate the grid search with 5-fold cross validation on all available cores:
grid = GridSearchCV(pipeline, param_grid, cv=5, n_jobs=-1, scoring="accuracy")

# Launch the grid search
grid.fit(X_train, y_train)

# Print the best parameters found
print(f"Best parameters found: {grid.best_params_}")

# Output:
#  Best parameters found: {'model__max_depth': 5, 'model__n_estimators': 10, 'pca__n_components': 5}

# Currently we only focus on model inference in FHE
# The data transformation will be done in clear (client machine)
# while the model inference will be done in FHE on a server.
# The pipeline can be split into 2 parts:
#   1. data transformation
#   2. estimator
best_pipeline = grid.best_estimator_
data_transformation_pipeline = best_pipeline[:-1]
model = best_pipeline[-1]

# Transform the train and test sets
X_train_transformed = data_transformation_pipeline.transform(X_train)
X_test_transformed = data_transformation_pipeline.transform(X_test)

# Evaluate the model on the test set in clear
y_pred_clear = model.predict(X_test_transformed)
print(f"Test accuracy in clear: {(y_pred_clear == y_test).mean():0.2f}")

# In the output, the Test accuracy in clear should be > 0.9

# Compile the model to FHE
model.compile(X_train_transformed)

# Perform the inference in FHE
# Warning: this will take a while. It is recommended to run this with a very small batch of
# examples first (e.g. N_TEST_FHE = 1)
# Note that here the encryption and decryption are done behind the scenes.
N_TEST_FHE = 1
y_pred_fhe = model.predict(X_test_transformed[:N_TEST_FHE], execute_in_fhe=True)

# Assert that FHE predictions are the same as the clear predictions
print(f"{(y_pred_fhe == y_pred_clear[:N_TEST_FHE]).sum()} "
      f"examples over {N_TEST_FHE} have a FHE inference equal to the clear inference.")

# Output:
#  1 examples over 1 have a FHE inference equal to the clear inference

compile(
    q_inputs: Union[Tuple[ndarray, ], ndarray],
    configuration: Optional[Configuration] = None,
    compilation_artifacts: Optional[DebugArtifacts] = None,
    show_mlir: bool = False,
    use_virtual_lib: bool = False,
    p_error: Optional[float] = None,
    global_p_error: Optional[float] = None,
    verbose_compilation: bool = False
) → Circuit

from sklearn.datasets import fetch_openml, make_circles
from concrete.ml.sklearn import RandomForestClassifier
from concrete.numpy import Configuration

debug_config = Configuration(
    enable_unsafe_features=True,
    use_insecure_key_cache=True,
    insecure_key_cache_location="~/.cml_keycache",
    p_error=None,
    global_p_error=None,
)

n_bits = 2
X, y = make_circles(n_samples=1000, noise=0.1, factor=0.6, random_state=0)
concrete_clf = RandomForestClassifier(
    n_bits=n_bits, n_estimators=10, max_depth=5
)
concrete_clf.fit(X, y)

concrete_clf.compile(X, debug_config, use_virtual_lib=True)

y_preds_clear = concrete_clf.predict(X)

import numpy
import torch

from torch import nn
from concrete.ml.torch.compile import compile_torch_model

N_FEAT = 2
class SimpleNet(nn.Module):
    """Simple MLP with PyTorch"""

    def __init__(self, n_hidden=30):
        super().__init__()
        self.fc1 = nn.Linear(in_features=N_FEAT, out_features=n_hidden)
        self.fc2 = nn.Linear(in_features=n_hidden, out_features=n_hidden)
        self.fc3 = nn.Linear(in_features=n_hidden, out_features=2)


    def forward(self, x):
        """Forward pass."""
        x = torch.relu(self.fc1(x))
        x = torch.relu(self.fc2(x))
        x = self.fc3(x)
        return x


torch_input = torch.randn(100, N_FEAT)
torch_model = SimpleNet(120)
try:
    quantized_numpy_module = compile_torch_model(
        torch_model,
        torch_input,
        n_bits=7,
    )
except RuntimeError as err:
    print(err)

Function you are trying to compile cannot be converted to MLIR:

%0 = _onnx__Gemm_0                    # EncryptedTensor<int7, shape=(1, 2)>        ∈ [-64, 63]
%1 = [[ 33 -27  ...   22 -29]]        # ClearTensor<int7, shape=(2, 120)>          ∈ [-63, 62]
%2 = matmul(%0, %1)                   # EncryptedTensor<int14, shape=(1, 120)>     ∈ [-4973, 4828]
%3 = subgraph(%2)                     # EncryptedTensor<uint7, shape=(1, 120)>     ∈ [0, 126]
%4 = [[ 16   6  ...   10  54]]        # ClearTensor<int7, shape=(120, 120)>        ∈ [-63, 63]
%5 = matmul(%3, %4)                   # EncryptedTensor<int17, shape=(1, 120)>     ∈ [-45632, 43208]
%6 = subgraph(%5)                     # EncryptedTensor<uint7, shape=(1, 120)>     ∈ [0, 126]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ table lookups are only supported on circuits with up to 16-bit integers
%7 = [[ -7 -52] ... [-12  62]]        # ClearTensor<int7, shape=(120, 2)>          ∈ [-63, 62]
%8 = matmul(%6, %7)                   # EncryptedTensor<int16, shape=(1, 2)>       ∈ [-26971, 29843]
return %8

%1 = [[ 33 -27  ...   22 -29]]        # ClearTensor<int7, shape=(2, 120)>         
%4 = [[ 16   6  ...   10  54]]        # ClearTensor<int7, shape=(120, 120)>   
%7 = [[ -7 -52] ... [-12  62]]        # ClearTensor<int7, shape=(120, 2)>

%5 = matmul(%3, %4)                   # EncryptedTensor<int17, shape=(1, 120)>    
%6 = subgraph(%5)                     # EncryptedTensor<uint7, shape=(1, 120)>  
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ table lookups are only supported on circuits with up to 16-bit integers
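
The bit-widths in the report follow directly from the value ranges shown on each line. A minimal sketch, assuming two's complement for signed values (the helper name `bits_for_range` is hypothetical, not a Concrete-Numpy function), reproduces them and shows why the second matmul exceeds the 16-bit limit:

```python
def bits_for_range(lo, hi):
    """Return (bit_width, is_signed) for integers in the closed range [lo, hi]."""
    if lo >= 0:
        return max(hi.bit_length(), 1), False
    # Two's complement: one sign bit plus enough magnitude bits for both ends
    magnitude_bits = max((-lo - 1).bit_length(), max(hi, 0).bit_length())
    return magnitude_bits + 1, True


# Ranges copied from the compilation report above
print(bits_for_range(-4973, 4828))    # (14, True)  -> int14
print(bits_for_range(-45632, 43208))  # (17, True)  -> int17, above the 16-bit limit
print(bits_for_range(0, 126))         # (7, False)  -> uint7
```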

MLIR
--------------------------------------------------------------------------------
module {
  func.func @main(%arg0: tensor<1x2x!FHE.eint<15>>) -> tensor<1x2x!FHE.eint<15>> {
    %cst = arith.constant dense<16384> : tensor<1xi16>
    %0 = "FHELinalg.sub_eint_int"(%arg0, %cst) : (tensor<1x2x!FHE.eint<15>>, tensor<1xi16>) -> tensor<1x2x!FHE.eint<15>>
    %cst_0 = arith.constant dense<[[-13, 43], [-31, 63], [1, -44], [-61, 20], [31, 2]]> : tensor<5x2xi16>
    %cst_1 = arith.constant dense<[[-45, 57, 19, 50, -63], [32, 37, 2, 52, -60], [-41, 25, -1, 31, -26], [-51, -40, -53, 0, 4], [20, -25, 56, 54, -23]]> : tensor<5x5xi16>
    %cst_2 = arith.constant dense<[[-56, -50, 57, 37, -22], [14, -1, 57, -63, 3]]> : tensor<2x5xi16>
    %c16384_i16 = arith.constant 16384 : i16
    %1 = "FHELinalg.matmul_eint_int"(%0, %cst_2) : (tensor<1x2x!FHE.eint<15>>, tensor<2x5xi16>) -> tensor<1x5x!FHE.eint<15>>
    %cst_3 = tensor.from_elements %c16384_i16 : tensor<1xi16>
    %cst_4 = tensor.from_elements %c16384_i16 : tensor<1xi16>
    %2 = "FHELinalg.add_eint_int"(%1, %cst_4) : (tensor<1x5x!FHE.eint<15>>, tensor<1xi16>) -> tensor<1x5x!FHE.eint<15>>
    %cst_5 = arith.constant

: tensor<5x32768xi64>
    %cst_6 = arith.constant dense<[[0, 1, 2, 3, 4]]> : tensor<1x5xindex>
    %3 = "FHELinalg.apply_mapped_lookup_table"(%2, %cst_5, %cst_6) : (tensor<1x5x!FHE.eint<15>>, tensor<5x32768xi64>, tensor<1x5xindex>) -> tensor<1x5x!FHE.eint<15>>
    %4 = "FHELinalg.matmul_eint_int"(%3, %cst_1) : (tensor<1x5x!FHE.eint<15>>, tensor<5x5xi16>) -> tensor<1x5x!FHE.eint<15>>
    %5 = "FHELinalg.add_eint_int"(%4, %cst_3) : (tensor<1x5x!FHE.eint<15>>, tensor<1xi16>) -> tensor<1x5x!FHE.eint<15>>
    %cst_7 = arith.constant

: tensor<5x32768xi64>
    %6 = "FHELinalg.apply_mapped_lookup_table"(%5, %cst_7, %cst_6) : (tensor<1x5x!FHE.eint<15>>, tensor<5x32768xi64>, tensor<1x5xindex>) -> tensor<1x5x!FHE.eint<15>>
    %7 = "FHELinalg.matmul_eint_int"(%6, %cst_0) : (tensor<1x5x!FHE.eint<15>>, tensor<5x2xi16>) -> tensor<1x2x!FHE.eint<15>>
    return %7 : tensor<1x2x!FHE.eint<15>>

  }
}
--------------------------------------------------------------------------------

concrete.ml.sklearn.glm.md

module `concrete.ml.sklearn.glm`

Implement sklearn's Generalized Linear Models (GLM).

class `PoissonRegressor`

A Poisson regression model with FHE.

Parameters:

n_bits (int, Dict[str, int]): Number of bits to quantize the model. If an int is passed for n_bits, the value will be used for quantizing inputs and weights. If a dict is passed, then it should contain "op_inputs" and "op_weights" as keys with corresponding number of quantization bits so that:
- op_inputs: number of bits to quantize the input values
- op_weights: number of bits to quantize the learned parameters
Default to 8.

For more details on PoissonRegressor please refer to the scikit-learn documentation: https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.PoissonRegressor.html

method `__init__`

method `post_processing`

Post-processing the predictions.

Args:

y_preds (numpy.ndarray): The predictions to post-process.
already_dequantized (bool): Whether the inputs were already dequantized or not. Default to False.

Returns:

numpy.ndarray: The post-processed predictions.
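
As an illustration only, the sketch below shows one plausible reading of this post-processing step for a model with a log link, such as scikit-learn's PoissonRegressor: dequantize the raw outputs unless that has already been done, then apply the inverse link `exp`. That Concrete-ML proceeds exactly this way is an assumption, and the `scale` and `zero_point` parameters and the function name `post_process` are hypothetical.

```python
import math


def post_process(y_preds, scale, zero_point, already_dequantized=False):
    """Dequantize (unless already done), then apply the inverse log link exp(.)."""
    if not already_dequantized:
        # Assumed affine dequantization: real = (quantized - zero_point) * scale
        y_preds = [(q - zero_point) * scale for q in y_preds]
    return [math.exp(v) for v in y_preds]


print(post_process([4], scale=0.5, zero_point=2))
print(post_process([0.0], scale=1.0, zero_point=0, already_dequantized=True))
```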

method `predict`

Predict on user data.

Predict on user data using either the quantized clear model, implemented with tensors, or, if execute_in_fhe is set, using the compiled FHE circuit.

Args:

X (numpy.ndarray): The input data.
execute_in_fhe (bool): Whether to execute the inference in FHE. Default to False.

Returns:

numpy.ndarray: The model's predictions.

class `GammaRegressor`

A Gamma regression model with FHE.

Parameters:

n_bits (int, Dict[str, int]): Number of bits to quantize the model. If an int is passed for n_bits, the value will be used for quantizing inputs and weights. If a dict is passed, then it should contain "op_inputs" and "op_weights" as keys with corresponding number of quantization bits so that:
- op_inputs: number of bits to quantize the input values
- op_weights: number of bits to quantize the learned parameters
Default to 8.

For more details on GammaRegressor please refer to the scikit-learn documentation: https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.GammaRegressor.html

method `__init__`

method `post_processing`

Post-processing the predictions.

Args:

y_preds (numpy.ndarray): The predictions to post-process.
already_dequantized (bool): Whether the inputs were already dequantized or not. Default to False.

Returns:

numpy.ndarray: The post-processed predictions.

method `predict`

Predict on user data.

Predict on user data using either the quantized clear model, implemented with tensors, or, if execute_in_fhe is set, using the compiled FHE circuit.

Args:

X (numpy.ndarray): The input data.
execute_in_fhe (bool): Whether to execute the inference in FHE. Default to False.

Returns:

numpy.ndarray: The model's predictions.

class `TweedieRegressor`

A Tweedie regression model with FHE.

Parameters:

n_bits (int, Dict[str, int]): Number of bits to quantize the model. If an int is passed for n_bits, the value will be used for quantizing inputs and weights. If a dict is passed, then it should contain "op_inputs" and "op_weights" as keys with corresponding number of quantization bits so that:
- op_inputs: number of bits to quantize the input values
- op_weights: number of bits to quantize the learned parameters
Default to 8.

For more details on TweedieRegressor please refer to the scikit-learn documentation: https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.TweedieRegressor.html

method `__init__`

method `post_processing`

Post-processing the predictions.

Args:

y_preds (numpy.ndarray): The predictions to post-process.
already_dequantized (bool): Whether the inputs were already dequantized or not. Default to False.

Returns:

numpy.ndarray: The post-processed predictions.

method `predict`

Predict on user data.

Predict on user data using either the quantized clear model, implemented with tensors, or, if execute_in_fhe is set, using the compiled FHE circuit.

Args:

X (numpy.ndarray): The input data.
execute_in_fhe (bool): Whether to execute the inference in FHE. Default to False.

Returns:

numpy.ndarray: The model's predictions.

concrete.ml.deployment.fhe_client_server.md

module `concrete.ml.deployment.fhe_client_server`

APIs for FHE deployment.

Global Variables

CML_VERSION
AVAILABLE_MODEL

class `FHEModelServer`

Server API to load and run the FHE circuit.

method `__init__`

Initialize the FHE API.

Args:

path_dir (str): the path to the directory where the circuit is saved

method `load`

Load the circuit.

method `run`

Run the model on the server over encrypted data.

Args:

serialized_encrypted_quantized_data (cnp.PublicArguments): the encrypted, quantized and serialized data
serialized_evaluation_keys (cnp.EvaluationKeys): the serialized evaluation keys

Returns:

cnp.PublicResult: the result of the model

class `FHEModelDev`

Dev API to save the model and then load and run the FHE circuit.

method `__init__`

Initialize the FHE API.

Args:

path_dir (str): the path to the directory where the circuit is saved
model (Any): the model to use for the FHE API

method `save`

Export all needed artifacts for the client and server.

Raises:

Exception: path_dir is not empty

class `FHEModelClient`

Client API to encrypt and decrypt FHE data.

method `__init__`

Initialize the FHE API.

Args:

path_dir (str): the path to the directory where the circuit is saved
key_dir (str): the path to the directory where the keys are stored

method `deserialize_decrypt`

Deserialize and decrypt the values.

Args:

serialized_encrypted_quantized_result (cnp.PublicArguments): the serialized, encrypted and quantized result

Returns:

numpy.ndarray: the decrypted and deserialized values

method `deserialize_decrypt_dequantize`

Deserialize, decrypt and dequantize the values.

Args:

serialized_encrypted_quantized_result (cnp.PublicArguments): the serialized, encrypted and quantized result

Returns:

numpy.ndarray: the decrypted (dequantized) values

method `generate_private_and_evaluation_keys`

Generate the private and evaluation keys.

Args:

force (bool): if True, regenerate the keys even if they already exist

method `get_serialized_evaluation_keys`

Get the serialized evaluation keys.

Returns:

cnp.EvaluationKeys: the evaluation keys

method `load`

Load the quantizers along with the FHE specs.

method `quantize_encrypt_serialize`

Quantize, encrypt and serialize the values.

Args:

x (numpy.ndarray): the values to quantize, encrypt and serialize

Returns:

cnp.PublicArguments: the quantized, encrypted and serialized values

compile_torch_model(
    torch_model: Module,
    torch_inputset: Union[Tensor, ndarray, Tuple[Union[Tensor, ndarray], ]],
    import_qat: bool = False,
    configuration: Optional[Configuration] = None,
    compilation_artifacts: Optional[DebugArtifacts] = None,
    show_mlir: bool = False,
    n_bits=8,
    use_virtual_lib: bool = False,
    p_error: Optional[float] = None,
    global_p_error: Optional[float] = None,
    verbose_compilation: bool = False
) → QuantizedModule

compile_onnx_model(
    onnx_model: ModelProto,
    torch_inputset: Union[Tensor, ndarray, Tuple[Union[Tensor, ndarray], ]],
    import_qat: bool = False,
    configuration: Optional[Configuration] = None,
    compilation_artifacts: Optional[DebugArtifacts] = None,
    show_mlir: bool = False,
    n_bits=8,
    use_virtual_lib: bool = False,
    p_error: Optional[float] = None,
    global_p_error: Optional[float] = None,
    verbose_compilation: bool = False
) → QuantizedModule

compile_brevitas_qat_model(
    torch_model: Module,
    torch_inputset: Union[Tensor, ndarray, Tuple[Union[Tensor, ndarray], ]],
    n_bits: Union[int, dict],
    configuration: Optional[Configuration] = None,
    compilation_artifacts: Optional[DebugArtifacts] = None,
    show_mlir: bool = False,
    use_virtual_lib: bool = False,
    p_error: Optional[float] = None,
    global_p_error: Optional[float] = None,
    output_onnx_file: Union[Path, str] = None,
    verbose_compilation: bool = False
) → QuantizedModule

__init__(
    n_bits: int = 6,
    max_depth: Optional[int] = 3,
    learning_rate: Optional[float] = 0.1,
    n_estimators: Optional[int] = 20,
    objective: Optional[str] = 'binary:logistic',
    booster: Optional[str] = None,
    tree_method: Optional[str] = None,
    n_jobs: Optional[int] = None,
    gamma: Optional[float] = None,
    min_child_weight: Optional[float] = None,
    max_delta_step: Optional[float] = None,
    subsample: Optional[float] = None,
    colsample_bytree: Optional[float] = None,
    colsample_bylevel: Optional[float] = None,
    colsample_bynode: Optional[float] = None,
    reg_alpha: Optional[float] = None,
    reg_lambda: Optional[float] = None,
    scale_pos_weight: Optional[float] = None,
    base_score: Optional[float] = None,
    missing: float = nan,
    num_parallel_tree: Optional[int] = None,
    monotone_constraints: Optional[Dict[str, int], str] = None,
    interaction_constraints: Optional[str, List[Tuple[str]]] = None,
    importance_type: Optional[str] = None,
    gpu_id: Optional[int] = None,
    validate_parameters: Optional[bool] = None,
    predictor: Optional[str] = None,
    enable_categorical: bool = False,
    use_label_encoder: bool = False,
    random_state: Optional[RandomState, int] = None,
    verbosity: Optional[int] = None
)

__init__(
    n_bits: int = 6,
    max_depth: Optional[int] = 3,
    learning_rate: Optional[float] = 0.1,
    n_estimators: Optional[int] = 20,
    objective: Optional[str] = 'reg:squarederror',
    booster: Optional[str] = None,
    tree_method: Optional[str] = None,
    n_jobs: Optional[int] = None,
    gamma: Optional[float] = None,
    min_child_weight: Optional[float] = None,
    max_delta_step: Optional[float] = None,
    subsample: Optional[float] = None,
    colsample_bytree: Optional[float] = None,
    colsample_bylevel: Optional[float] = None,
    colsample_bynode: Optional[float] = None,
    reg_alpha: Optional[float] = None,
    reg_lambda: Optional[float] = None,
    scale_pos_weight: Optional[float] = None,
    base_score: Optional[float] = None,
    missing: float = nan,
    num_parallel_tree: Optional[int] = None,
    monotone_constraints: Optional[Union[Dict[str, int], str]] = None,
    interaction_constraints: Optional[Union[str, List[Tuple[str]]]] = None,
    importance_type: Optional[str] = None,
    gpu_id: Optional[int] = None,
    validate_parameters: Optional[bool] = None,
    predictor: Optional[str] = None,
    enable_categorical: bool = False,
    use_label_encoder: bool = False,
    random_state: Optional[Union[RandomState, int]] = None,
    verbosity: Optional[int] = None
)

__init__(
    n_bits=8,
    alpha=1.0,
    l1_ratio=0.5,
    fit_intercept=True,
    normalize='deprecated',
    precompute=False,
    max_iter=1000,
    copy_X=True,
    tol=0.0001,
    warm_start=False,
    positive=False,
    random_state=None,
    selection='cyclic'
)

__init__(
    n_bits=8,
    alpha: float = 1.0,
    fit_intercept=True,
    normalize='deprecated',
    precompute=False,
    copy_X=True,
    max_iter=1000,
    tol=0.0001,
    warm_start=False,
    positive=False,
    random_state=None,
    selection='cyclic'
)

__init__(
    n_bits=8,
    alpha: float = 1.0,
    fit_intercept=True,
    normalize='deprecated',
    copy_X=True,
    max_iter=None,
    tol=0.001,
    solver='auto',
    positive=False,
    random_state=None
)

__init__(
    n_bits=8,
    penalty='l2',
    dual=False,
    tol=0.0001,
    C=1.0,
    fit_intercept=True,
    intercept_scaling=1,
    class_weight=None,
    random_state=None,
    solver='lbfgs',
    max_iter=100,
    multi_class='auto',
    verbose=0,
    warm_start=False,
    n_jobs=None,
    l1_ratio=None
)

Using Torch

In addition to the built-in models, Concrete-ML supports generic machine learning models implemented with Torch, or exported as ONNX graphs.

As Quantization Aware Training (QAT) is the most appropriate method of training neural networks that are compatible with FHE constraints, Concrete-ML works with Brevitas, a library providing QAT support for PyTorch.

The following example uses a simple QAT PyTorch model that implements a fully connected neural network with two hidden layers. Due to its small size, making this model respect FHE constraints is relatively easy.

Once the model is trained, calling the compile_brevitas_qat_model from Concrete-ML will automatically perform conversion and compilation of a QAT network. Here, 3-bit quantization is used for both the weights and activations.

The model can now be used to perform encrypted inference. Next, the test data is quantized:

and the encrypted inference can be run using either:

quantized_numpy_module.forward_and_dequant() to compute predictions in the clear on quantized data, and then de-quantize the result. The return value of this function contains the dequantized (float) output of running the model in the clear. Calling the forward function on the clear data is useful when debugging. The results in FHE will be the same as those on clear quantized data.
quantized_numpy_module.forward_fhe.encrypt_run_decrypt() to perform the FHE inference. In this case, de-quantization is done in a second stage using quantized_numpy_module.dequantize_output().

Generic Quantization Aware Training import

While the example above shows how to import a Brevitas/PyTorch model, Concrete-ML also provides an option to import generic QAT models implemented either in PyTorch or through ONNX. Deep learning models built with TensorFlow or Keras should also be usable, provided they are first converted to ONNX.

QAT models contain quantizers in the PyTorch graph. These quantizers ensure that the inputs to the Linear/Dense and Conv layers are quantized.

Suppose that n_bits_qat is the bit-width of activations and weights during the QAT process. To import a PyTorch QAT network, you can use the compile_torch_model library function, passing import_qat=True:

Alternatively, if you want to import an ONNX model directly, please see the ONNX import documentation. The ONNX import function also supports the import_qat parameter.

When importing QAT models using this generic pipeline, a representative calibration set should be given, as the quantization parameters in the model need to be inferred from the statistics of the values encountered during inference.

Supported operators and activations

Concrete-ML supports a variety of PyTorch operators that can be used to build fully connected or convolutional neural networks, with normalization and activation layers. Moreover, many element-wise operators are supported.

Operators

univariate operators

shape modifying operators

operators that take an encrypted input and unencrypted constants

Please note that Concrete-ML supports these operators as well as their QAT equivalents from Brevitas.

brevitas.nn.QuantLinear
brevitas.nn.QuantConv2d

operators that can take both encrypted+unencrypted and encrypted+encrypted inputs

Quantizers

brevitas.nn.QuantIdentity

Activations

Note that the equivalent versions from torch.nn.functional are also supported.

from concrete.ml.quantization import QuantizedArray

def q_impl(
    self,
    *q_inputs: QuantizedArray,
    **attrs,
) -> QuantizedArray:

    # Retrieve the quantized inputs
    prepared_inputs = self._prepare_inputs_with_constants(
        *q_inputs, calibrate=False, quantize_actual_values=True
    )

    result = (
        sum_result.astype(numpy.float32) - q_input.quantizer.zero_point
    ) * q_input.quantizer.scale

    return QuantizedArray(
        self.n_bits,
        result,
        value_is_float=True,
        options=self.input_quant_opts,
        stats=self.output_quant_stats,
        params=self.output_quant_params,
    )


def q_impl(
    self,
    *q_inputs: QuantizedArray,
    **attrs,
) -> QuantizedArray:

    execute_in_float = len(self.constant_inputs) > 0 or self.can_fuse()

    # a floating point implementation that can fuse
    if execute_in_float:
        prepared_inputs = self._prepare_inputs_with_constants(
            *q_inputs, calibrate=False, quantize_actual_values=False
        )

        result = prepared_inputs[0] + self.b_sign * prepared_inputs[1]
        return QuantizedArray(
            self.n_bits,
            result,
            # ......
        )
    else:
        prepared_inputs = self._prepare_inputs_with_constants(
            *q_inputs, calibrate=False, quantize_actual_values=True
        )
        # an integer implementation follows, see Case 2
        # ....

__init__(
    n_bits: 'Union[int, dict]' = 8,
    alpha: 'float' = 1.0,
    fit_intercept: 'bool' = True,
    max_iter: 'int' = 100,
    tol: 'float' = 0.0001,
    warm_start: 'bool' = False,
    verbose: 'int' = 0
)

__init__(
    n_bits: 'Union[int, dict]' = 8,
    alpha: 'float' = 1.0,
    fit_intercept: 'bool' = True,
    max_iter: 'int' = 100,
    tol: 'float' = 0.0001,
    warm_start: 'bool' = False,
    verbose: 'int' = 0
)

__init__(
    n_bits: 'Union[int, dict]' = 8,
    power: 'float' = 0.0,
    alpha: 'float' = 1.0,
    fit_intercept: 'bool' = True,
    link: 'str' = 'auto',
    max_iter: 'int' = 100,
    tol: 'float' = 0.0001,
    warm_start: 'bool' = False,
    verbose: 'int' = 0
)

import brevitas.nn as qnn
import torch.nn as nn
import torch

N_FEAT = 12
n_bits = 3

class QATSimpleNet(nn.Module):
    def __init__(self, n_hidden):
        super().__init__()

        self.quant_inp = qnn.QuantIdentity(bit_width=n_bits, return_quant_tensor=True)
        self.fc1 = qnn.QuantLinear(N_FEAT, n_hidden, True, weight_bit_width=n_bits, bias_quant=None)
        self.quant2 = qnn.QuantIdentity(bit_width=n_bits, return_quant_tensor=True)
        self.fc2 = qnn.QuantLinear(n_hidden, n_hidden, True, weight_bit_width=n_bits, bias_quant=None)
        self.quant3 = qnn.QuantIdentity(bit_width=n_bits, return_quant_tensor=True)
        self.fc3 = qnn.QuantLinear(n_hidden, 2, True, weight_bit_width=n_bits, bias_quant=None)

    def forward(self, x):
        x = self.quant_inp(x)
        x = self.quant2(torch.relu(self.fc1(x)))
        x = self.quant3(torch.relu(self.fc2(x)))
        x = self.fc3(x)
        return x

from concrete.ml.torch.compile import compile_brevitas_qat_model
import numpy

torch_input = torch.randn(100, N_FEAT)
torch_model = QATSimpleNet(30)
quantized_numpy_module = compile_brevitas_qat_model(
    torch_model, # our model
    torch_input, # a representative input-set to be used for both quantization and compilation
    n_bits = n_bits,
)

from concrete.ml.torch.compile import compile_torch_model
n_bits_qat = 3

quantized_numpy_module = compile_torch_model(
    torch_model,
    torch_input,
    import_qat=True,
    n_bits=n_bits_qat,
)

from concrete.ml.quantization import QuantizedArray
import numpy
numpy.random.seed(0)
A = numpy.random.uniform(-2, 2, 10)
print("A = ", A)
# array([ 0.19525402,  0.86075747,  0.4110535,  0.17953273, -0.3053808,
#         0.58357645, -0.24965115,  1.567092 ,  1.85465104, -0.46623392])
q_A = QuantizedArray(7, A)
print("q_A.qvalues = ", q_A.qvalues)
# array([ 37,          73,          48,         36,          9,
#         58,          12,          112,        127,         0])
# the quantized integers values from A.
print("q_A.quantizer.scale = ", q_A.quantizer.scale)
# 0.018274684777173276, the scale S.
print("q_A.quantizer.zero_point = ", q_A.quantizer.zero_point)
# 26, the zero point Z.
print("q_A.dequant() = ", q_A.dequant())
# array([ 0.20102153,  0.85891018,  0.40204307,  0.18274685, -0.31066964,
#         0.58478991, -0.25584559,  1.57162289,  1.84574316, -0.4751418 ])
# Dequantized values.

q_A = QuantizedArray(3, A)
print("Unsigned: q_A.qvalues = ", q_A.qvalues)
print("q_A.quantizer.zero_point = ", q_A.quantizer.zero_point)
# Unsigned: q_A.qvalues =  [2 4 2 2 0 3 0 6 7 0]
# q_A.quantizer.zero_point =  1

q_A = QuantizedArray(3, A, is_signed=True, is_symmetric=True)
print("Signed Symmetric: q_A.qvalues = ", q_A.qvalues)
print("q_A.quantizer.zero_point = ", q_A.quantizer.zero_point)
# Signed Symmetric: q_A.qvalues =  [ 0  1  1  0  0  1  0  3  3 -1]
# q_A.quantizer.zero_point =  0

import numpy
from concrete.ml.quantization.quantizers import QuantizationOptions

q_values = [0, 0, 1, 2, 3, -1]
QuantizedArray(
        q_A.quantizer.n_bits,
        q_values,
        value_is_float=False,
        options=q_A.quantizer.quant_options,
        stats=q_A.quantizer.quant_stats,
        params=q_A.quantizer.quant_params,
).dequant()
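The scale, zero point, and integer values printed in the unsigned 7-bit and 3-bit examples above are consistent with a standard affine (min/max) quantization scheme. The sketch below reproduces them in pure Python. The helpers `affine_params`, `quantize`, and `dequantize` are hypothetical names introduced for illustration; they are not part of Concrete-ML, and the actual `QuantizedArray` implementation may differ in details such as rounding or clipping.

```python
# Illustrative only: a plain-Python affine quantizer, not Concrete-ML's code.
# A holds the values printed in the example above.
A = [0.19525402, 0.86075747, 0.4110535, 0.17953273, -0.3053808,
     0.58357645, -0.24965115, 1.567092, 1.85465104, -0.46623392]


def affine_params(values, n_bits):
    """Derive the scale S and zero point Z from the min/max of the values."""
    v_min, v_max = min(values), max(values)
    scale = (v_max - v_min) / (2**n_bits - 1)
    zero_point = round(-v_min / scale)
    return scale, zero_point


def quantize(values, scale, zero_point, n_bits):
    """Map floats to unsigned integers in [0, 2**n_bits - 1]."""
    q_max = 2**n_bits - 1
    return [min(max(round(v / scale) + zero_point, 0), q_max) for v in values]


def dequantize(q_values, scale, zero_point):
    """Map integers back to floats: x is approximately (q - Z) * S."""
    return [(q - zero_point) * scale for q in q_values]


scale_7, zp_7 = affine_params(A, 7)
q_7 = quantize(A, scale_7, zp_7, 7)
deq_7 = dequantize(q_7, scale_7, zp_7)

scale_3, zp_3 = affine_params(A, 3)
q_3 = quantize(A, scale_3, zp_3, 3)

print("7 bits:", q_7, "zero point", zp_7)
print("3 bits:", q_3, "zero point", zp_3)
```

With fewer bits the scale grows and several distinct floats collapse onto the same integer, which is the accuracy loss that quantization introduces.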

Step-by-step Guide

This guide provides a complete example of converting a PyTorch neural network into its FHE-friendly, quantized counterpart. It focuses on training a simple network with Quantization Aware Training on a synthetic data-set.

In general, quantization can be carried out in two different ways: either during training with Quantization Aware Training (QAT) or after the training phase with Post-Training Quantization (PTQ).

Regarding FHE-friendly neural networks, QAT is the best way to reach optimal accuracy under FHE constraints. This technique allows weights and activations to be reduced to very low bit-widths (e.g. 2-3 bits), which, combined with pruning, can keep accumulator bit-widths low.

Concrete-ML uses the third-party library Brevitas to perform QAT for PyTorch NNs, but options exist for other frameworks such as Keras/TensorFlow.

Several demos and tutorials that use Brevitas are available in the Concrete-ML library, such as the CIFAR classification tutorial.

This guide is based on a tutorial, from which some code blocks are documented here.

For a more formal description of the usage of Brevitas to build FHE-compatible neural networks, please see the reference documentation.

Baseline PyTorch model

In PyTorch, using standard layers, a fully connected neural network would look as follows:

The example shows how to train a fully-connected neural network, similar to the one above, on a synthetic 2D data-set with a checkerboard grid pattern of 100 x 100 points. The data is split into 9500 training and 500 test samples.

Once trained, this PyTorch network can be imported using the compile_torch_model function. This function uses simple Post-Training Quantization.

The network was trained using different numbers of neurons in the hidden layers, and quantized using 3-bit weights and activations. The mean accumulator size shown below is measured as the mean over 10 runs of the experiment. An accumulator of 6.6 means that 4 times out of 10 the accumulator measured was 6 bits while 6 times it was 7 bits.
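As a quick check of how a fractional accumulator size arises, the following sketch averages ten per-run bit-widths. The list of measurements is hypothetical and simply mirrors the 4-versus-6 split described above.

```python
# Hypothetical measurements: accumulator bit-width observed in each of 10 runs.
accumulator_bits_per_run = [6, 6, 6, 6, 7, 7, 7, 7, 7, 7]

# The reported figure is the arithmetic mean over the runs.
mean_accumulator = sum(accumulator_bits_per_run) / len(accumulator_bits_per_run)
print(mean_accumulator)
```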

neurons

100

This shows that the fp32 accuracy and accumulator size increase with the number of hidden neurons, while the 3-bit accuracy remains low irrespective of the number of neurons. While all the configurations tried here were FHE-compatible (accumulator < 16 bits), it is often preferable to have a lower accumulator size in order to reduce inference time.

The accumulator size is determined by Concrete-Numpy as being the maximum bit-width encountered anywhere in the encrypted circuit.

Quantization Aware Training:

Quantization Aware Training using Brevitas is the best way to guarantee good accuracy for Concrete-ML compatible neural networks.

Brevitas provides a quantized version of almost all PyTorch layers (Linear layer becomes QuantLinear, ReLU layer becomes QuantReLU and so on), plus some extra quantization parameters, such as:

bit_width: precision quantization bits for activations
act_quant: quantization protocol for the activations
weight_bit_width: precision quantization bits for weights

In order to use FHE, the network must be quantized from end to end. Thanks to Brevitas's QuantIdentity layer, it is possible to quantize the input by placing it at the entry point of the network. Moreover, it is also possible to combine PyTorch and Brevitas layers, provided that a QuantIdentity is placed after each PyTorch layer. The following table gives the replacements to be made to convert a PyTorch NN for Concrete-ML compatibility.

Pytorch fp32 layer

Concrete-ML model with Pytorch/Brevitas

Furthermore, some PyTorch operators (from the PyTorch functional API) require a brevitas.nn.QuantIdentity to be applied on their inputs.

PyTorch ops that require QuantIdentity

The QAT import tool in Concrete-ML is a work in progress. While it has been tested with some networks built with Brevitas, it is possible to use other tools to obtain QAT networks.

For instance, with Brevitas, the network above becomes:

Note that in the network above, biases are used for linear layers but are not quantized ("bias": True, "bias_quant": None). The addition of the bias is a univariate operation and is fused into the activation function.

Training this network with pruning (see below) with 30 out of 100 total non-zero neurons gives good accuracy while keeping the accumulator size low.

Non-zero neurons

The PyTorch QAT training loop is the same as the standard floating point training loop, but hyper-parameters such as learning rate might need to be adjusted.

Quantization Aware Training is somewhat slower than normal training. QAT introduces quantization during both the forward and backward passes. The quantization process is inefficient on GPUs as its computational intensity is low with respect to data transfer time.

Pruning using torch

Considering that FHE only works with limited integer precision, there is a risk of overflowing in the accumulator, which will make Concrete-ML raise an error.

To understand how to overcome this limitation, consider a scenario where 2 bits are used for weights and layer inputs/outputs. The Linear layer computes a dot product between weights and inputs, $y = \sum_i w_i x_i$. With 2 bits, no overflow can occur during the computation of the Linear layer as long as the number of neurons does not exceed 14, i.e. the sum of 14 products of 2-bit numbers does not exceed 7 bits.
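The bound of 14 neurons can be checked with a few lines of arithmetic. The sketch below assumes unsigned values, so each 2-bit operand is at most 3 and each product at most 9. The function name `worst_case_accumulator_bits` is a hypothetical helper, not a Concrete-ML function.

```python
def worst_case_accumulator_bits(n_products, n_bits):
    """Bits needed to hold the largest possible sum of n_products products
    of two unsigned n_bits-wide integers."""
    max_value = 2**n_bits - 1               # 3 for 2-bit operands
    max_sum = n_products * max_value * max_value
    return max_sum.bit_length()             # bits needed to represent max_sum


for n in (14, 15):
    print(n, "products ->", worst_case_accumulator_bits(n, 2), "bits")
```

With 14 products the worst-case sum is 126, which fits in 7 bits; a 15th product pushes the worst case to 135, which needs 8 bits.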

By default, Concrete-ML uses symmetric quantization for model weights, with values in the interval $[-2^{n_{bits}-1}, 2^{n_{bits}-1}-1]$. For example, for $n_{bits}=2$ the possible values are $[-2, -1, 0, 1]$, for $n_{bits}=3$ the values can be $[-4, -3, -2, -1, 0, 1, 2, 3]$.

However, in a typical setting, the weights will not all have the maximum or minimum values (e.g. $-2^{n_{bits}-1}$). Instead, weights typically have a normal distribution around 0, which is one of the motivating factors for their symmetric quantization. A symmetric distribution and many zero-valued weights are desirable because opposite sign weights can cancel each other out and zero weights do not increase the accumulator size.

This fact can be leveraged to train a network with more neurons, while not overflowing the accumulator, using a technique called pruning, where the developer can impose a number of zero-valued weights. Torch supports pruning out of the box.
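To make the idea of imposing zero-valued weights concrete, here is a minimal pure-Python sketch of magnitude-based pruning on one neuron's weights. It is an illustration under simplifying assumptions, not Torch's implementation: `prune_smallest` is a hypothetical helper, and the weight values are made up.

```python
def prune_smallest(weights, num_zero):
    """Return a copy of weights with the num_zero smallest-magnitude entries set to 0."""
    # Indices ordered by increasing absolute value (ties keep their original order).
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    to_zero = set(order[:num_zero])
    return [0 if i in to_zero else w for i, w in enumerate(weights)]


# Made-up quantized weights feeding a single neuron.
weights = [3, -1, 0, 2, -3, 1, -2, 1]

# Keep at most 3 non-zero inputs, so 5 of the 8 weights are zeroed.
max_non_zero = 3
pruned = prune_smallest(weights, len(weights) - max_non_zero)
print(pruned)
```

Only the remaining non-zero weights contribute terms to the dot product, so the worst-case accumulator size is governed by `max_non_zero` rather than by the full layer width.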

The following code shows how to use pruning in the previous example:

Results with PrunedQuantNet, a pruned version of the QuantSimpleNet with 100 neurons on the hidden layers, are given below, showing a mean accumulator size measured over 10 runs of the experiment:

Non-zero neurons

This shows that the fp32 accuracy has been improved while maintaining constant mean accumulator size.

When pruning a larger neural network during training, it is easier to obtain a low bit-width accumulator while maintaining better final accuracy. Thus, pruning is more robust than training a similar, smaller network.

from concrete.ml.sklearn import XGBClassifier
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

x, y = make_classification(n_samples=100, class_sep=2, n_features=4, random_state=42)

# Retrieve train and test sets
X_train, _, y_train, _ = train_test_split(x, y, test_size=10, random_state=42)

clf = XGBClassifier()
clf.fit(X_train, y_train)

# Here we set the p_error parameter
clf.compile(X_train, p_error = 0.1)
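The optimizer report shown further below for the decision tree example lists the configured p_error as an "error per pbs call" and also reports a figure for the full circuit. A rough way to relate the two, assuming that the PBS calls fail independently, is sketched here. The function `circuit_error` and the count of 3 PBS calls are assumptions made for illustration (the count is a reading of the lookup-table shapes in that example's MLIR); the optimizer's own model may differ slightly.

```python
def circuit_error(p_error, n_pbs):
    """Probability that at least one of n_pbs independent PBS calls fails."""
    return 1.0 - (1.0 - p_error) ** n_pbs


# A single PBS call: the circuit error equals the per-PBS error.
print(circuit_error(0.1, 1))

# Per-PBS error taken from the optimizer report, with an assumed 3 PBS calls.
print(circuit_error(3.234529e-02, 3))
```

Under this assumption the second value comes out close to the "1/10 errors" reported for the full circuit, which illustrates why the per-PBS error must shrink as circuits contain more table lookups.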

from concrete.ml.sklearn import DecisionTreeClassifier
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

x, y = make_classification(n_samples=100, class_sep=2, n_features=4, random_state=42)

# Retrieve train and test sets
X_train, _, y_train, _ = train_test_split(x, y, test_size=10, random_state=42)

clf = DecisionTreeClassifier(random_state=42)
clf.fit(X_train, y_train)

clf.compile(X_train, verbose_compilation=True, show_mlir=True, p_error=0.033)

Computation Graph
-------------------------------------------------------------------------------------------------------------------------------
 %0 = _inputs                                  # EncryptedTensor<uint6, shape=(1, 4)>           ∈ [0, 63]
 %1 = transpose(%0)                            # EncryptedTensor<uint6, shape=(4, 1)>           ∈ [0, 63]
 %2 = [[0 0 0 1]]                              # ClearTensor<uint1, shape=(1, 4)>               ∈ [0, 1]
 %3 = matmul(%2, %1)                           # EncryptedTensor<uint6, shape=(1, 1)>           ∈ [0, 63]
 %4 = [[32]]                                   # ClearTensor<uint6, shape=(1, 1)>               ∈ [32, 32]
 %5 = less_equal(%3, %4)                       # EncryptedTensor<uint1, shape=(1, 1)>           ∈ [False, True]
 %6 = reshape(%5, newshape=[ 1  1 -1])         # EncryptedTensor<uint1, shape=(1, 1, 1)>        ∈ [False, True]
 %7 = [[[ 1]  [-1]]]                           # ClearTensor<int2, shape=(1, 2, 1)>             ∈ [-1, 1]
 %8 = matmul(%7, %6)                           # EncryptedTensor<int2, shape=(1, 2, 1)>         ∈ [-1, 1]
 %9 = reshape(%8, newshape=[ 2 -1])            # EncryptedTensor<int2, shape=(2, 1)>            ∈ [-1, 1]
%10 = [[1] [0]]                                # ClearTensor<uint1, shape=(2, 1)>               ∈ [0, 1]
%11 = equal(%10, %9)                           # EncryptedTensor<uint1, shape=(2, 1)>           ∈ [False, True]
%12 = reshape(%11, newshape=[ 1  2 -1])        # EncryptedTensor<uint1, shape=(1, 2, 1)>        ∈ [False, True]
%13 = [[[63  0]  [ 0 63]]]                     # ClearTensor<uint6, shape=(1, 2, 2)>            ∈ [0, 63]
%14 = matmul(%13, %12)                         # EncryptedTensor<uint6, shape=(1, 2, 1)>        ∈ [0, 63]
%15 = reshape(%14, newshape=[ 1  2 -1])        # EncryptedTensor<uint6, shape=(1, 2, 1)>        ∈ [0, 63]
return %15

MLIR
-------------------------------------------------------------------------------------------------------------------------------
module {
  func.func @main(%arg0: tensor<1x4x!FHE.eint<6>>) -> tensor<1x2x1x!FHE.eint<6>> {
    %cst = arith.constant dense<[[[63, 0], [0, 63]]]> : tensor<1x2x2xi7>
    %cst_0 = arith.constant dense<[[1], [0]]> : tensor<2x1xi7>
    %cst_1 = arith.constant dense<[[[1], [-1]]]> : tensor<1x2x1xi7>
    %cst_2 = arith.constant dense<32> : tensor<1x1xi7>
    %cst_3 = arith.constant dense<[[0, 0, 0, 1]]> : tensor<1x4xi7>
    %c32_i7 = arith.constant 32 : i7
    %0 = "FHELinalg.transpose"(%arg0) {axes = []} : (tensor<1x4x!FHE.eint<6>>) -> tensor<4x1x!FHE.eint<6>>
    %cst_4 = tensor.from_elements %c32_i7 : tensor<1xi7>
    %1 = "FHELinalg.matmul_int_eint"(%cst_3, %0) : (tensor<1x4xi7>, tensor<4x1x!FHE.eint<6>>) -> tensor<1x1x!FHE.eint<6>>
    %cst_5 = arith.constant dense<[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]> : tensor<64xi64>
    %2 = "FHELinalg.apply_lookup_table"(%1, %cst_5) : (tensor<1x1x!FHE.eint<6>>, tensor<64xi64>) -> tensor<1x1x!FHE.eint<6>>
    %3 = tensor.expand_shape %2 [[0], [1, 2]] : tensor<1x1x!FHE.eint<6>> into tensor<1x1x1x!FHE.eint<6>>
    %4 = "FHELinalg.matmul_int_eint"(%cst_1, %3) : (tensor<1x2x1xi7>, tensor<1x1x1x!FHE.eint<6>>) -> tensor<1x2x1x!FHE.eint<6>>
    %5 = tensor.collapse_shape %4 [[0, 1], [2]] : tensor<1x2x1x!FHE.eint<6>> into tensor<2x1x!FHE.eint<6>>
    %6 = "FHELinalg.add_eint_int"(%5, %cst_4) : (tensor<2x1x!FHE.eint<6>>, tensor<1xi7>) -> tensor<2x1x!FHE.eint<6>>
    %cst_6 = arith.constant dense<"0x0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000100000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000100000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
0000000000000000000000000000000000000000000000000000000000000000000000000000000000000"> : tensor<2x64xi64>
    %cst_7 = arith.constant dense<[[0], [1]]> : tensor<2x1xindex>
    %7 = "FHELinalg.apply_mapped_lookup_table"(%6, %cst_6, %cst_7) : (tensor<2x1x!FHE.eint<6>>, tensor<2x64xi64>, tensor<2x1xindex>) -> tensor<2x1x!FHE.eint<6>>
    %8 = tensor.expand_shape %7 [[0, 1], [2]] : tensor<2x1x!FHE.eint<6>> into tensor<1x2x1x!FHE.eint<6>>
    %9 = "FHELinalg.matmul_int_eint"(%cst, %8) : (tensor<1x2x2xi7>, tensor<1x2x1x!FHE.eint<6>>) -> tensor<1x2x1x!FHE.eint<6>>
    return %9 : tensor<1x2x1x!FHE.eint<6>>
  }
}

Optimizer
-------------------------------------------------------------------------------------------------------------------------------
--- Circuit
  6 bits integers
  7 manp (maxi log2 norm2)
  388ms to solve
--- User config
  3.300000e-02 error per pbs call
  1.000000e+00 error per circuit call
--- Complexity for the full circuit
  4.214000e+02 Millions Operations
--- Correctness for each Pbs call
  1/30 errors (3.234529e-02)
--- Correctness for the full circuit
  1/10 errors (9.390887e-02)
--- Parameters resolution
  1x glwe_dimension
  2**11 polynomial (2048)
  762 lwe dimension
  keyswitch l,b=5,3
  blindrota l,b=2,15
  wopPbs : false
---
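The Computation Graph section of the output above can be followed by hand. The sketch below replays nodes %0 to %15 in plain Python for one sample of four 6-bit features. It is only an interpretation of the printed graph: the function name `simulate_tree_graph` and the variable names are hypothetical, and the reshape nodes are omitted since they do not change values.

```python
def simulate_tree_graph(x):
    """Replay the printed computation graph on one sample of four 6-bit features."""
    # %3: the clear matrix [[0 0 0 1]] selects the last feature.
    selected = 0 * x[0] + 0 * x[1] + 0 * x[2] + 1 * x[3]
    # %5: compare against the clear constant 32.
    below_threshold = int(selected <= 32)
    # %8: multiply by [[1], [-1]] to get one value per output row.
    per_row = [1 * below_threshold, -1 * below_threshold]
    # %11: element-wise equality with [[1], [0]] yields a one-hot vector.
    one_hot = [int(per_row[0] == 1), int(per_row[1] == 0)]
    # %14: multiply by [[63, 0], [0, 63]] to produce the 6-bit outputs.
    return [63 * one_hot[0] + 0 * one_hot[1], 0 * one_hot[0] + 63 * one_hot[1]]


print(simulate_tree_graph([10, 20, 30, 32]))
print(simulate_tree_graph([10, 20, 30, 33]))
```

Read this way, the circuit encodes a single split on the last feature: inputs at or below 32 activate the first output, all others the second.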

import torch
from torch import nn

IN_FEAT = 2
OUT_FEAT = 2

class SimpleNet(nn.Module):
    """Simple MLP with PyTorch"""

    def __init__(self, n_hidden = 30):
        super().__init__()
        self.fc1 = nn.Linear(in_features=IN_FEAT, out_features=n_hidden)
        self.fc2 = nn.Linear(in_features=n_hidden, out_features=n_hidden)
        self.fc3 = nn.Linear(in_features=n_hidden, out_features=OUT_FEAT)


    def forward(self, x):
        """Forward pass."""
        x = torch.relu(self.fc1(x))
        x = torch.relu(self.fc2(x))
        x = self.fc3(x)
        return x

import torch
from torch import nn
from brevitas import nn as qnn
from brevitas.core.quant import QuantType
from brevitas.quant import Int8ActPerTensorFloat, Int8WeightPerTensorFloat

N_BITS = 3
IN_FEAT = 2
OUT_FEAT = 2

class QuantSimpleNet(nn.Module):
    def __init__(
        self,
        n_hidden,
        qlinear_args={
            "weight_bit_width": N_BITS,
            "weight_quant": Int8WeightPerTensorFloat,
            "bias": True,
            "bias_quant": None,
            "narrow_range": True
        },
        qidentity_args={"bit_width": N_BITS, "act_quant": Int8ActPerTensorFloat},
    ):
        super().__init__()

        self.quant_inp = qnn.QuantIdentity(**qidentity_args)
        self.fc1 = qnn.QuantLinear(IN_FEAT, n_hidden, **qlinear_args)
        self.relu1 = qnn.QuantReLU(bit_width=qidentity_args["bit_width"])
        self.fc2 = qnn.QuantLinear(n_hidden, n_hidden, **qlinear_args)
        self.relu2 = qnn.QuantReLU(bit_width=qidentity_args["bit_width"])
        self.fc3 = qnn.QuantLinear(n_hidden, OUT_FEAT, **qlinear_args)

        for m in self.modules():
            if isinstance(m, qnn.QuantLinear):
                torch.nn.init.uniform_(m.weight.data, -1, 1)

    def forward(self, x):
        x = self.quant_inp(x)
        x = self.relu1(self.fc1(x))
        x = self.relu2(self.fc2(x))
        x = self.fc3(x)
        return x

import torch.nn.utils.prune as prune

class PrunedQuantNet(SimpleNet):
    """Simple MLP with PyTorch"""

    pruned_layers = set()

    def prune(self, max_non_zero):
        # Linear layer weight has dimensions NumOutputs x NumInputs
        for name, layer in self.named_modules():
            if isinstance(layer, nn.Linear):
                print(name, layer)
                num_zero_weights = (layer.weight.shape[1] - max_non_zero) * layer.weight.shape[0]
                if num_zero_weights <= 0:
                    continue
                print(f"Pruning layer {name} factor {num_zero_weights}")
                prune.l1_unstructured(layer, "weight", amount=num_zero_weights)
                self.pruned_layers.add(name)

    def unprune(self):
        for name, layer in self.named_modules():
            if name in self.pruned_layers:
                prune.remove(layer, "weight")
                self.pruned_layers.remove(name)

concrete.ml.sklearn.protocols.md

module `concrete.ml.sklearn.protocols`

Protocols.

Protocols are used to mix type hinting with duck-typing. Indeed we don't always want to have an abstract parent class between all objects. We are more interested in the behavior of such objects. Implementing a Protocol is a way to specify the behavior of objects.

To read more about Protocol please read: https://peps.python.org/pep-0544
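As a minimal illustration of the duck-typing idea behind PEP 544, the sketch below defines a protocol with the same two method names as the Quantizer protocol documented next. It is a simplified stand-in: `QuantizerLike`, `ScaleOnlyQuantizer`, and `round_trip` are hypothetical names, and plain lists are used in place of numpy arrays.

```python
from typing import List, Protocol, runtime_checkable


@runtime_checkable
class QuantizerLike(Protocol):
    """Hypothetical stand-in for a quantizer protocol, using lists instead of arrays."""

    def quant(self, values: List[float]) -> List[int]: ...

    def dequant(self, X: List[int]) -> List[float]: ...


class ScaleOnlyQuantizer:
    """Does not inherit from QuantizerLike, yet satisfies it structurally."""

    def __init__(self, scale: float) -> None:
        self.scale = scale

    def quant(self, values: List[float]) -> List[int]:
        return [round(v / self.scale) for v in values]

    def dequant(self, X: List[int]) -> List[float]:
        return [q * self.scale for q in X]


def round_trip(quantizer: QuantizerLike, values: List[float]) -> List[float]:
    """Accept any object exposing quant/dequant, whatever its base class."""
    return quantizer.dequant(quantizer.quant(values))


quantizer = ScaleOnlyQuantizer(scale=0.5)
print(isinstance(quantizer, QuantizerLike))
print(round_trip(quantizer, [0.2, 0.6, 1.1]))
```

A static type checker accepts `ScaleOnlyQuantizer` wherever `QuantizerLike` is expected because the method signatures match, with no abstract parent class required.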

class `Quantizer`

Quantizer Protocol.

Used to type hint a quantizer.

method `dequant`

Dequantize some values.

Args:

X (numpy.ndarray): Values to dequantize

Returns:

numpy.ndarray: Dequantized values

method `quant`

Quantize some values.

Args:

values (numpy.ndarray): Values to quantize

Returns:

numpy.ndarray: The quantized values

class `ConcreteBaseEstimatorProtocol`

A Concrete Estimator Protocol.

property onnx_model

onnx_model.

Returns: onnx.ModelProto

property quantize_input

Quantize input function.

method `compile`

Compiles a model to a FHE Circuit.

Args:

X (numpy.ndarray): the dequantized dataset
configuration (Optional[Configuration]): the options for compilation
compilation_artifacts

Returns:

Circuit: the compiled Circuit.

method `fit`

Initialize and fit the module.

Args:

X: training data. By default, you should be able to pass numpy arrays, torch tensors, or pandas DataFrame or Series
y (numpy.ndarray): labels associated with training data
**fit_params

Returns:

ConcreteBaseEstimatorProtocol: the trained estimator

method `fit_benchmark`

Fit the quantized estimator and return reference estimator.

This function returns both the quantized estimator (itself) and a wrapper around the non-quantized trained NN. This is useful in order to compare performance between the quantized and fp32 versions of the classifier.

Args:

X: training data. By default, you should be able to pass numpy arrays, torch tensors, or pandas DataFrame or Series
y (numpy.ndarray): labels associated with training data
*args

Returns:

self: self fitted
model: underlying estimator

method `post_processing`

Post-process model predictions.

Args:

y_preds (numpy.ndarray): predicted values by model (clear-quantized)

Returns:

numpy.ndarray: the post-processed predictions

class `ConcreteBaseClassifierProtocol`

Concrete classifier protocol.

property onnx_model

onnx_model.

Returns: onnx.ModelProto

property quantize_input

Quantize input function.

method `compile`

Compiles a model to a FHE Circuit.

Args:

X (numpy.ndarray): the dequantized dataset
configuration (Optional[Configuration]): the options for compilation
compilation_artifacts

Returns:

Circuit: the compiled Circuit.

method `fit`

Initialize and fit the module.

Args:

X: training data. By default, you should be able to pass numpy arrays, torch tensors, or pandas DataFrame or Series
y (numpy.ndarray): labels associated with training data
**fit_params

Returns:

ConcreteBaseEstimatorProtocol: the trained estimator

method `fit_benchmark`

Fit the quantized estimator and return reference estimator.

Args:

X: training data. By default, you should be able to pass numpy arrays, torch tensors, or pandas DataFrame or Series
y (numpy.ndarray): labels associated with training data
*args

Returns:

self: self fitted
model: underlying estimator

method `post_processing`

Post-process model predictions.

Args:

y_preds (numpy.ndarray): predicted values by model (clear-quantized)

Returns:

numpy.ndarray: the post-processed predictions

method `predict`

Predicts for each sample the class with highest probability.

Args:

X (numpy.ndarray): Features
execute_in_fhe (bool): Whether the inference should be done in fhe or not.

Returns: numpy.ndarray

method `predict_proba`

Predicts for each sample the probability of each class.

Args:

X (numpy.ndarray): Features
execute_in_fhe (bool): Whether the inference should be done in fhe or not.

Returns: numpy.ndarray

class `ConcreteBaseRegressorProtocol`

Concrete regressor protocol.

property onnx_model

onnx_model.

Returns: onnx.ModelProto

property quantize_input

Quantize input function.

method `compile`

Compiles a model to a FHE Circuit.

Args:

X (numpy.ndarray): the dequantized dataset
configuration (Optional[Configuration]): the options for compilation
compilation_artifacts


Returns:

Circuit: the compiled Circuit.

method `fit`

Initialize and fit the module.

Args:

X: training data. By default, you should be able to pass numpy arrays, torch tensors, or pandas DataFrame or Series.
y (numpy.ndarray): labels associated with training data
**fit_params


Returns:

ConcreteBaseEstimatorProtocol: the trained estimator

method `fit_benchmark`

Fit the quantized estimator and return reference estimator.

Args:

X: training data. By default, you should be able to pass numpy arrays, torch tensors, or pandas DataFrame or Series.
y (numpy.ndarray): labels associated with training data
*args


Returns:

self: self fitted
model: underlying estimator

method `post_processing`

Post-process the model's predictions.

Args:

y_preds (numpy.ndarray): predicted values by model (clear-quantized)


Returns:

numpy.ndarray: the post-processed predictions

method `predict`

Predicts for each sample the expected value.

Args:

X (numpy.ndarray): Features
execute_in_fhe (bool): Whether the inference should be done in FHE or in the clear.


Returns: numpy.ndarray

concrete.ml.quantization.quantizers.md

module `concrete.ml.quantization.quantizers`

Quantization utilities for a numpy array/tensor.

Global Variables

STABILITY_CONST

function `fill_from_kwargs`

Fill a parameter set structure from kwargs parameters.

Args:

obj: an object of type klass, if None the object is created if any of the type's members appear in the kwargs
klass: the type of object to fill
kwargs

Returns:

obj: an object of type klass
kwargs: remaining parameter names and values that were not filled into obj

Raises:

TypeError: if the types of the parameters in kwargs could not be converted to the corresponding types of members of klass
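
The behaviour described for `fill_from_kwargs` can be sketched as follows. This is an illustration under stated assumptions, not the library's code: `DemoOptions` and `fill_from_kwargs_sketch` are hypothetical names, the parameter set is assumed to be a dataclass whose fields all have defaults, and each value is converted to the type of the field's current value.

```python
from dataclasses import dataclass, fields
from typing import Any, Dict, Optional, Tuple, Type


@dataclass
class DemoOptions:
    """Hypothetical parameter set, used only for this illustration."""

    n_bits: int = 8
    is_signed: bool = False


def fill_from_kwargs_sketch(
    obj: Optional[Any], klass: Type, **kwargs: Any
) -> Tuple[Optional[Any], Dict[str, Any]]:
    """Move the kwargs that match members of klass into obj and return the rest."""
    remaining = dict(kwargs)
    for field in fields(klass):
        if field.name not in remaining:
            continue
        if obj is None:
            # Created lazily, only when one of the members appears in kwargs.
            obj = klass()
        raw_value = remaining.pop(field.name)
        target_type = type(getattr(obj, field.name))
        try:
            setattr(obj, field.name, target_type(raw_value))
        except (TypeError, ValueError) as error:
            raise TypeError(f"Cannot convert {field.name}={raw_value!r}") from error
    return obj, remaining


options, leftover = fill_from_kwargs_sketch(None, DemoOptions, n_bits="4", verbose=True)
```

In this usage, `n_bits` is consumed and converted to an integer, while `verbose` is returned untouched in `leftover` because `DemoOptions` has no such member.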

class `QuantizationOptions`

Options for quantization.

Determines the number of bits for quantization and the method of quantization of the values. Signed quantization allows negative quantized values. Symmetric quantization assumes the float values are distributed symmetrically around x=0 and assigns signed values around 0 to the float values. QAT (quantization aware training) quantization assumes the values are already quantized, taking a discrete set of values, and assigns these values to integers, computing only the scale.

method `init`

property quant_options

Get a copy of the quantization options.

Returns:

QuantizationOptions: a copy of the current quantization options

method `copy_opts`

Copy the options from a different structure.

Args:

opts (QuantizationOptions): structure to copy parameters from.

method `is_equal`

Compare two quantization options sets.

Args:

opts (QuantizationOptions): options to compare this instance to
ignore_sign_qat (bool): ignore sign comparison for QAT options

Returns:

bool: whether the two quantization options compared are equivalent

class `MinMaxQuantizationStats`

Calibration set statistics.

This class stores the statistics for the calibration set or for a calibration data batch. Currently we only store min/max to determine the quantization range. The min/max are computed from the calibration set.

property quant_stats

Get a copy of the calibration set statistics.

Returns:

MinMaxQuantizationStats: a copy of the current quantization stats

method `check_is_uniform_quantized`

Check if these statistics correspond to uniformly quantized values.

Determines whether the values represented by this QuantizedArray show a quantized structure from which the quantization scale can be inferred.

Args:

options (QuantizationOptions): used to quantize the values in the QuantizedArray

Returns:

bool: check result.
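
One way to picture this check is the sketch below. It assumes that "uniformly quantized" means every gap between the sorted unique values is an integer multiple of the smallest gap, and that the resulting grid fits in `2**n_bits` levels. The function name, the tolerance and the exact criteria are assumptions, not the library's implementation.

```python
from typing import Optional, Sequence


def infer_uniform_scale(
    values: Sequence[float], n_bits: int, tol: float = 1e-6
) -> Optional[float]:
    """Return the grid step if values lie on a uniform grid, otherwise None."""
    unique_values = sorted(set(values))
    if len(unique_values) < 2 or len(unique_values) > 2**n_bits:
        return None
    gaps = [b - a for a, b in zip(unique_values, unique_values[1:])]
    step = min(gaps)
    # Every gap must be a whole number of steps.
    for gap in gaps:
        ratio = gap / step
        if abs(ratio - round(ratio)) > tol:
            return None
    # The whole range must be representable with n_bits.
    span_in_steps = (unique_values[-1] - unique_values[0]) / step
    if span_in_steps > 2**n_bits - 1 + tol:
        return None
    return step


scale_found = infer_uniform_scale([0.0, 0.5, 1.5, 0.5], n_bits=2)
scale_missing = infer_uniform_scale([0.0, 0.3, 1.0], n_bits=2)
```

The first call finds a step of 0.5, while the second returns `None` because the gaps 0.3 and 0.7 do not share a common step.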

method `compute_quantization_stats`

Compute the calibration set quantization statistics.

Args:

values (numpy.ndarray): Calibration set on which to compute statistics.

method `copy_stats`

Copy the statistics from a different structure.

Args:

stats (MinMaxQuantizationStats): structure to copy statistics from.

class `UniformQuantizationParameters`

Quantization parameters for uniform quantization.

This class stores the parameters used for quantizing real values to discrete integer values. The parameters are computed from quantization options and quantization statistics.
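
As an illustration of how such parameters can be derived from options and min/max statistics, here is a minimal sketch of the standard affine scheme. `DemoQuantParams` and `compute_params_sketch` are hypothetical names, and the formulas are the textbook ones, assumed here rather than taken from the library.

```python
from dataclasses import dataclass


@dataclass
class DemoQuantParams:
    """Hypothetical result container (not the library's class)."""

    scale: float
    zero_point: int


def compute_params_sketch(
    rmin: float, rmax: float, n_bits: int, is_signed: bool = False
) -> DemoQuantParams:
    """Derive affine quantization parameters from a calibrated min/max range."""
    if rmax <= rmin:
        raise ValueError("the calibration range must satisfy rmax > rmin")
    # Number of integer steps available with n_bits.
    n_steps = 2**n_bits - 1
    scale = (rmax - rmin) / n_steps
    # Signed quantization shifts the integer range to start at -2**(n_bits - 1).
    offset = 2 ** (n_bits - 1) if is_signed else 0
    # Chosen so that rmin maps to the smallest representable integer
    # under q = round(x / scale) + zero_point.
    zero_point = round(-rmin / scale) - offset
    return DemoQuantParams(scale=scale, zero_point=zero_point)


params = compute_params_sketch(rmin=-1.0, rmax=3.0, n_bits=2)
```

With a range of [-1, 3] and 2 bits, the four integer levels are spaced 4/3 apart and the real value 0 falls closest to integer 1.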

property quant_params

Get a copy of the quantization parameters.

Returns:

UniformQuantizationParameters: a copy of the current quantization parameters

method `compute_quantization_parameters`

Compute the quantization parameters.

Args:

options (QuantizationOptions): quantization options set
stats (MinMaxQuantizationStats): calibrated statistics for quantization

method `copy_params`

Copy the parameters from a different structure.

Args:

params (UniformQuantizationParameters): parameter structure to copy

class `UniformQuantizer`

Uniform quantizer.

Contains all information necessary for uniform quantization and provides quantization/dequantization functionality on numpy arrays.

Args:

options (QuantizationOptions): Quantization options set
stats (Optional[MinMaxQuantizationStats]): Quantization batch statistics set
params

method `init`

property quant_options

Get a copy of the quantization options.

Returns:

QuantizationOptions: a copy of the current quantization options

property quant_params

Get a copy of the quantization parameters.

Returns:

UniformQuantizationParameters: a copy of the current quantization parameters

property quant_stats

Get a copy of the calibration set statistics.

Returns:

MinMaxQuantizationStats: a copy of the current quantization stats

method `check_is_uniform_quantized`

Check if these statistics correspond to uniformly quantized values.

Determines whether the values represented by this QuantizedArray show a quantized structure from which the quantization scale can be inferred.

Args:

options (QuantizationOptions): used to quantize the values in the QuantizedArray

Returns:

bool: check result.

method `compute_quantization_parameters`

Compute the quantization parameters.

Args:

options (QuantizationOptions): quantization options set
stats (MinMaxQuantizationStats): calibrated statistics for quantization

method `compute_quantization_stats`

Compute the calibration set quantization statistics.

Args:

values (numpy.ndarray): Calibration set on which to compute statistics.

method `copy_opts`

Copy the options from a different structure.

Args:

opts (QuantizationOptions): structure to copy parameters from.

method `copy_params`

Copy the parameters from a different structure.

Args:

params (UniformQuantizationParameters): parameter structure to copy

method `copy_stats`

Copy the statistics from a different structure.

Args:

stats (MinMaxQuantizationStats): structure to copy statistics from.

method `dequant`

Dequantize values.

Args:

qvalues (numpy.ndarray): integer values to dequantize

Returns:

numpy.ndarray: Dequantized float values.

method `is_equal`

Compare two quantization options sets.

Args:

opts (QuantizationOptions): options to compare this instance to
ignore_sign_qat (bool): ignore sign comparison for QAT options

Returns:

bool: whether the two quantization options compared are equivalent

method `quant`

Quantize values.

Args:

values (numpy.ndarray): float values to quantize

Returns:

numpy.ndarray: Integer quantized values.
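
The `quant` and `dequant` pair can be pictured with the following sketch of unsigned affine quantization. The function names, the clamping to `[0, 2**n_bits - 1]` and the use of plain lists instead of numpy arrays are assumptions made for the example.

```python
from typing import List, Sequence


def quant_sketch(
    values: Sequence[float], scale: float, zero_point: int, n_bits: int
) -> List[int]:
    """Map float values to integers in [0, 2**n_bits - 1]."""
    q_max = 2**n_bits - 1
    return [min(max(round(v / scale) + zero_point, 0), q_max) for v in values]


def dequant_sketch(qvalues: Sequence[int], scale: float, zero_point: int) -> List[float]:
    """Map integers back to the float values they represent."""
    return [scale * (q - zero_point) for q in qvalues]


q = quant_sketch([-1.0, 0.0, 0.74, 2.5], scale=0.5, zero_point=2, n_bits=3)
restored = dequant_sketch(q, scale=0.5, zero_point=2)
```

The round trip is lossy: 0.74 is restored as 0.5, the nearest point on the quantization grid.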

class `QuantizedArray`

Abstraction of quantized array.

Contains float values and their quantized integer counter-parts. Quantization is performed by the quantizer member object. Float and int values are kept in sync. Having both types of values is useful since quantized operators in Concrete ML graphs might need one or the other depending on how the operator works (in float or in int). Moreover, when the encrypted function needs to return a value, it must return integer values.

See https://arxiv.org/abs/1712.05877.

Args:

values (numpy.ndarray): Values to be quantized.
n_bits (int): The number of bits to use for quantization.
value_is_float

method `init`

method `dequant`

Dequantize self.qvalues.

Returns:

numpy.ndarray: Dequantized values.

method `quant`

Quantize self.values.

Returns:

numpy.ndarray: Quantized values.

method `update_quantized_values`

Replace self.qvalues and recompute the corresponding float values using the quantization parameters.

Args:

qvalues (numpy.ndarray): Values to replace self.qvalues

Returns:

values (numpy.ndarray): Corresponding values

method `update_values`

Replace self.values and recompute the corresponding qvalues using the quantization parameters.

Args:

values (numpy.ndarray): Values to replace self.values

Returns:

qvalues (numpy.ndarray): Corresponding qvalues
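
The idea of keeping float and integer values in sync can be sketched with a small hypothetical class. `SyncedArraySketch` is not the library's `QuantizedArray`: it assumes fixed, precomputed `scale` and `zero_point`, uses plain lists, and does no clamping.

```python
from typing import List, Sequence


class SyncedArraySketch:
    """Hypothetical holder that keeps float values and integer qvalues in sync."""

    def __init__(self, values: Sequence[float], scale: float, zero_point: int) -> None:
        self.scale = scale
        self.zero_point = zero_point
        self.values: List[float] = []
        self.qvalues: List[int] = []
        self.update_values(values)

    def update_values(self, values: Sequence[float]) -> List[int]:
        """Replace the float values and recompute the matching integers."""
        self.values = list(values)
        self.qvalues = [round(v / self.scale) + self.zero_point for v in self.values]
        return self.qvalues

    def update_quantized_values(self, qvalues: Sequence[int]) -> List[float]:
        """Replace the integers and recompute the matching float values."""
        self.qvalues = list(qvalues)
        self.values = [self.scale * (q - self.zero_point) for q in self.qvalues]
        return self.values


array = SyncedArraySketch([0.0, 1.0], scale=0.5, zero_point=1)
initial_qvalues = list(array.qvalues)
new_values = array.update_quantized_values([2, 5])
```

Whichever side is updated, the other is recomputed immediately, so an operator can read either representation.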

concrete.ml.quantization.base_quantized_op.md

module `concrete.ml.quantization.base_quantized_op`

Base Quantized Op class that implements quantization for a float numpy op.

Global Variables

ONNX_OPS_TO_NUMPY_IMPL
ALL_QUANTIZED_OPS
ONNX_OPS_TO_QUANTIZED_IMPL

class `QuantizedOp`

Base class for quantized ONNX ops implemented in numpy.

Args:

n_bits_output (int): The number of bits to use for the quantization of the output
int_input_names (Set[str]): The set of names of integer tensors that are inputs to this op
constant_inputs

method `init`

property int_input_names

Get the names of encrypted integer tensors that are used by this op.

Returns:

List[str]: the names of the tensors

method `calibrate`

Create corresponding QuantizedArray for the output of the activation function.

Args:

*inputs (numpy.ndarray): Calibration sample inputs.

Returns:

numpy.ndarray: the output values for the provided calibration samples.

method `call_impl`

Call self.impl to centralize mypy bug workaround.

Args:

*inputs (numpy.ndarray): real valued inputs.
**attrs: the QuantizedOp attributes.

Returns:

numpy.ndarray: return value of self.impl

method `can_fuse`

Determine if the operator impedes graph fusion.

This function should be overridden by inheriting classes, which test self._int_input_names to determine whether the operation can be fused to a TLU. For example, an operation whose inputs are all produced by a single integer tensor can be fused to a TLU: f(x) = x * (x + 1) can be fused, whereas f(x) = x * (x @ w + 1) cannot.

Returns:

bool: whether this instance of the QuantizedOp produces Concrete Numpy code that can be fused to TLUs
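
To see why a function of a single integer input such as f(x) = x * (x + 1) lends itself to a table lookup (TLU), consider this sketch. It assumes an unsigned input on `n_bits` bits; `build_table` and `apply_table` are hypothetical helpers, not part of the library.

```python
from typing import Callable, List


def build_table(func: Callable[[int], int], n_bits: int) -> List[int]:
    """Precompute func for every possible unsigned n_bits input."""
    return [func(x) for x in range(2**n_bits)]


def apply_table(table: List[int], x: int) -> int:
    """Evaluate the function with a single lookup."""
    return table[x]


table = build_table(lambda x: x * (x + 1), n_bits=3)
result = apply_table(table, 5)
```

A function such as f(x) = x * (x @ w + 1) offers no such shortcut, since each output element depends on several input elements at once and a table indexed by one value cannot capture that.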

classmethod `must_quantize_input`

Determine if an input must be quantized.

Quantized ops and numpy onnx ops take inputs and attributes. Inputs can be either constant or variable (encrypted). Note that this does not handle attributes, which are handled by QuantizedOp classes separately in their constructor.

Args:

input_name_or_idx (int): Index of the input to check.

Returns:

result (bool): Whether the input must be quantized (must be a QuantizedArray) or if it stays as a raw numpy.array read from ONNX.

classmethod `op_type`

Get the type of this operation.

Returns:

op_type (str): The type of this operation, in the ONNX referential

method `prepare_output`

Quantize the output of the activation function.

The calibrate method needs to be called with sample data before using this function.

Args:

qoutput_activation (numpy.ndarray): Output of the activation function.

Returns:

QuantizedArray: Quantized output.

method `q_impl`

Execute the quantized forward.

Args:

*q_inputs (QuantizedArray): Quantized inputs.
**attrs: the QuantizedOp attributes.

Returns:

QuantizedArray: The returned quantized value.

class `QuantizedOpUnivariateOfEncrypted`

A univariate operator of an encrypted value.

This operator does not really behave as a quantized operation. It is useful when the computations get fused into a TLU, e.g. Act(x) = x || (x + 42).

method `init`

property int_input_names

Get the names of encrypted integer tensors that are used by this op.

Returns:

List[str]: the names of the tensors

method `calibrate`

Create corresponding QuantizedArray for the output of the activation function.

Args:

*inputs (numpy.ndarray): Calibration sample inputs.

Returns:

numpy.ndarray: the output values for the provided calibration samples.

method `call_impl`

Call self.impl to centralize mypy bug workaround.

Args:

*inputs (numpy.ndarray): real valued inputs.
**attrs: the QuantizedOp attributes.

Returns:

numpy.ndarray: return value of self.impl

method `can_fuse`

Determine if this op can be fused.

This operation can be fused and computed in float when a single integer tensor generates both the operands. For example in the formula: f(x) = x || (x + 1) where x is an integer tensor.

Returns:

bool: Can fuse

classmethod `must_quantize_input`

Determine if an input must be quantized.

Args:

input_name_or_idx (int): Index of the input to check.

Returns:

result (bool): Whether the input must be quantized (must be a QuantizedArray) or if it stays as a raw numpy.array read from ONNX.

classmethod `op_type`

Get the type of this operation.

Returns:

op_type (str): The type of this operation, in the ONNX referential

method `prepare_output`

Quantize the output of the activation function.

The calibrate method needs to be called with sample data before using this function.

Args:

qoutput_activation (numpy.ndarray): Output of the activation function.

Returns:

QuantizedArray: Quantized output.

method `q_impl`

Execute the quantized forward.

Args:

*q_inputs (QuantizedArray): Quantized inputs.
**attrs: the QuantizedOp attributes.

Returns:

QuantizedArray: The returned quantized value.

class `QuantizedMixingOp`

An operator that mixes (adds or multiplies) together encrypted inputs.

Mixing operators cannot be fused to TLUs.

method `init`

property int_input_names

Get the names of encrypted integer tensors that are used by this op.

Returns:

List[str]: the names of the tensors

method `calibrate`

Create corresponding QuantizedArray for the output of the activation function.

Args:

*inputs (numpy.ndarray): Calibration sample inputs.

Returns:

numpy.ndarray: the output values for the provided calibration samples.

method `call_impl`

Call self.impl to centralize mypy bug workaround.

Args:

*inputs (numpy.ndarray): real valued inputs.
**attrs: the QuantizedOp attributes.

Returns:

numpy.ndarray: return value of self.impl

method `can_fuse`

Determine if this op can be fused.

Mixing operations cannot be fused since they must be performed over integer tensors and they combine different encrypted elements of the input tensors. Examples of mixing operations are Conv and MatMul.

Returns:

bool: False, this operation cannot be fused as it adds different encrypted integers
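
The sketch below illustrates a mixing operation as an integer dot product that sums products of different input elements. It assumes affine quantization with a scale and zero point per operand; `quantized_dot_sketch` is a hypothetical name and the arithmetic is the textbook formulation, not the library's code.

```python
from typing import Sequence, Tuple


def quantized_dot_sketch(
    q_x: Sequence[int],
    q_w: Sequence[int],
    scale_x: float,
    scale_w: float,
    zero_point_x: int,
    zero_point_w: int,
) -> Tuple[int, float]:
    """Return the integer accumulator and the scale that maps it back to a float."""
    # Integer-only accumulation over *different* elements of the inputs.
    accumulator = sum(
        (qx - zero_point_x) * (qw - zero_point_w) for qx, qw in zip(q_x, q_w)
    )
    return accumulator, scale_x * scale_w


acc, out_scale = quantized_dot_sketch([3, 5], [5, 1], 0.5, 0.25, 1, 2)
real_result = acc * out_scale
```

Here the quantized inputs represent x = [1.0, 2.0] and w = [0.75, -0.25], and `real_result` recovers their dot product from the integer accumulator.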

method `make_output_quant_parameters`

Build a quantized array from quantized integer results of the op and quantization params.

Args:

q_values (Union[numpy.ndarray, Any]): the quantized integer values to wrap in the QuantizedArray
scale (float): the pre-computed scale of the quantized values
zero_point

Returns:

QuantizedArray: the quantized array that will be passed to the QuantizedModule output.

classmethod `must_quantize_input`

Determine if an input must be quantized.

Args:

input_name_or_idx (int): Index of the input to check.

Returns:

result (bool): Whether the input must be quantized (must be a QuantizedArray) or if it stays as a raw numpy.array read from ONNX.

classmethod `op_type`

Get the type of this operation.

Returns:

op_type (str): The type of this operation, in the ONNX referential

method `prepare_output`

Quantize the output of the activation function.

The calibrate method needs to be called with sample data before using this function.

Args:

qoutput_activation (numpy.ndarray): Output of the activation function.

Returns:

QuantizedArray: Quantized output.

method `q_impl`

Execute the quantized forward.

Args:

*q_inputs (QuantizedArray): Quantized inputs.
**attrs: the QuantizedOp attributes.

Returns:

QuantizedArray: The returned quantized value.

compile(
    X: 'ndarray',
    configuration: 'Optional[Configuration]',
    compilation_artifacts: 'Optional[DebugArtifacts]',
    show_mlir: 'bool',
    use_virtual_lib: 'bool',
    p_error: 'float',
    global_p_error: 'float',
    verbose_compilation: 'bool'
) → Circuit


__init__(
    n_bits,
    values: Optional[ndarray],
    value_is_float: bool = True,
    options: QuantizationOptions = None,
    stats: Optional[MinMaxQuantizationStats] = None,
    params: Optional[UniformQuantizationParameters] = None,
    **kwargs
)

__init__(
    n_bits_output: int,
    int_input_names: Optional[Set[str]] = None,
    constant_inputs: Optional[Union[Dict[str, Any], Dict[int, Any]]] = None,
    input_quant_opts: Optional[QuantizationOptions] = None,
    **attrs
) → None


concrete.ml.sklearn.qnn.md

module `concrete.ml.sklearn.qnn`

Scikit-learn interface for concrete quantized neural networks.

Global Variables

MAX_BITWIDTH_BACKWARD_COMPATIBLE

class `SparseQuantNeuralNetImpl`

Sparse Quantized Neural Network classifier.

This class implements an MLP that is compatible with FHE constraints. The weights and activations are quantized to a low bit-width, and pruning is used to ensure accumulators do not exceed a user-provided accumulator bit-width. The number of classes, the number of layers, and the breadth of the network are specified by the user.

method `init`

Sparse Quantized Neural Network constructor.

Args:

input_dim: Number of dimensions of the input data
n_layers: Number of linear layers for this network
n_outputs

Raises:

ValueError: if the parameters have invalid values or the computed accumulator bitwidth is zero

method `enable_pruning`

Enable pruning in the network. Pruning must be made permanent to recover pruned weights.

Raises:

ValueError: if the quantization parameters are invalid

method `forward`

Forward pass.

Args:

x (torch.Tensor): network input

Returns:

x (torch.Tensor): network prediction

method `make_pruning_permanent`

Make the learned pruning permanent in the network.

method `max_active_neurons`

Compute the maximum number of active (non-zero weight) neurons.

The computation is done using the quantization parameters passed to the constructor. Warning: With the current quantization algorithm (asymmetric) the value returned by this function is not guaranteed to ensure FHE compatibility. For some weight distributions, weights that are 0 (which are pruned weights) will not be quantized to 0. Therefore the total number of active quantized neurons will not be equal to max_active_neurons.

Returns:

n (int): maximum number of active neurons
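
A worst-case bound of this kind can be sketched as follows. It assumes unsigned weights and activations at their maximum values and an unsigned accumulator; the function name and the formula are assumptions for illustration, and the library's own computation may differ.

```python
def max_active_neurons_sketch(n_accum_bits: int, n_w_bits: int, n_a_bits: int) -> int:
    """Largest number of weight*activation products that fits in the accumulator."""
    # Largest possible single product of a weight and an activation.
    max_product = (2**n_w_bits - 1) * (2**n_a_bits - 1)
    # Largest value the accumulator can hold.
    max_accumulator = 2**n_accum_bits - 1
    return max_accumulator // max_product


n_active = max_active_neurons_sketch(n_accum_bits=8, n_w_bits=3, n_a_bits=3)
```

With 3-bit weights and activations and an 8-bit accumulator, each product can reach 49, so only a handful of non-zero weights per neuron fit within the bound.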

method `on_train_end`

Callback invoked when training is finished; it can be used to remove training hooks.

class `QuantizedSkorchEstimatorMixin`

Mixin class that adds quantization features to Skorch NN estimators.

property base_estimator_type

Get the sklearn estimator that should be trained by the child class.

property base_module_to_compile

Get the module that should be compiled to FHE. In our case this is a torch nn.Module.

Returns:

module (nn.Module): the instantiated torch module

property fhe_circuit

Get the FHE circuit.

Returns:

Circuit: the FHE circuit

property input_quantizers

Get the input quantizers.

Returns:

List[Quantizer]: the input quantizers

property n_bits_quant

Return the number of quantization bits.

This is stored by the torch.nn.module instance and thus cannot be retrieved until this instance is created.

Returns:

n_bits (int): the number of bits to quantize the network

Raises:

ValueError: if the estimator has not been fitted. With skorch estimators, module_ is not instantiated until .fit() is called, so the number of quantization bits cannot be retrieved before then.
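
The pattern of a property that is only available after fitting can be sketched as below. `LazyEstimatorSketch` is a hypothetical stand-in: it mimics the fact that `module_` does not exist before `.fit()`, but it is not a skorch estimator and trains nothing.

```python
from typing import Dict, Optional


class LazyEstimatorSketch:
    """Hypothetical estimator whose module is created only when fit() is called."""

    def __init__(self, n_bits: int) -> None:
        self._requested_n_bits = n_bits
        self.module_: Optional[Dict[str, int]] = None

    def fit(self) -> "LazyEstimatorSketch":
        # Stand-in for module instantiation and training.
        self.module_ = {"n_bits": self._requested_n_bits}
        return self

    @property
    def n_bits_quant(self) -> int:
        if self.module_ is None:
            raise ValueError("Call fit() before reading n_bits_quant.")
        return self.module_["n_bits"]


fitted_n_bits = LazyEstimatorSketch(n_bits=3).fit().n_bits_quant
```

Reading `n_bits_quant` on an instance that has not been fitted raises `ValueError` in this sketch.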

property onnx_model

Get the ONNX model.


Returns:

_onnx_model_ (onnx.ModelProto): the ONNX model

property output_quantizers

Get the output quantizers.

Returns:

List[QuantizedArray]: the output quantizers

property quantize_input

Get the input quantization function.

Returns:

Callable : function that quantizes the input

method `get_params_for_benchmark`

Get parameters for benchmark when cloning a skorch wrapped NN.

We must remove all parameters related to the module. Skorch takes either a class or a class instance for the module parameter. We want to pass our trained model, a class instance, but for this to work we need to remove all module-related constructor params. Otherwise, skorch will instantiate a new class instance of the same type as the passed module (see NeuralNet::initialize_instance in skorch net.py).

Returns:

params (dict): parameters to create an equivalent fp32 sklearn estimator for benchmark

method `infer`

Perform a single inference step on a batch of data.

This method is specific to Skorch estimators.

Args:

x (torch.Tensor): A batch of the input data, produced by a Dataset
**fit_params (dict) : Additional parameters passed to the forward method of the module and to the self.train_split call.

Returns: A torch tensor with the inference results for each item in the input

method `on_train_end`

Callback invoked by the skorch wrapper when training is finished.

Check if the underlying neural net has a callback for this event and, if so, call it.

Args:

net: estimator for which training has ended (equal to self)
X: data
y

class `FixedTypeSkorchNeuralNet`

A mixin with a helpful modification to a skorch estimator that fixes the module type.

method `get_params`

Get parameters for this estimator.

Args:

deep (bool): If True, will return the parameters for this estimator and contained subobjects that are estimators.
**kwargs: any additional parameters to pass to the sklearn BaseEstimator class

Returns:

params : dict, Parameter names mapped to their values.

class `NeuralNetClassifier`

Scikit-learn interface for quantized FHE compatible neural networks.

This class wraps a quantized NN implemented using our Torch tools as a scikit-learn Estimator. It uses the skorch package to handle training and scikit-learn compatibility, and adds quantization and compilation functionality. The neural network implemented by this class is a multi layer fully connected network trained with Quantization Aware Training (QAT).

The datatypes allowed for prediction by this wrapper are more restricted than those of standard scikit-learn estimators, because this class needs to predict in FHE and the network inference executor is the NumpyModule.

method `init`

property base_estimator_type

property base_module_to_compile

Get the module that should be compiled to FHE. In our case this is a torch nn.Module.

Returns:

module (nn.Module): the instantiated torch module

property classes_

property fhe_circuit

Get the FHE circuit.

Returns:

Circuit: the FHE circuit

property history

property input_quantizers

Get the input quantizers.

Returns:

List[Quantizer]: the input quantizers

property n_bits_quant

Return the number of quantization bits.

This is stored by the torch.nn.module instance and thus cannot be retrieved until this instance is created.

Returns:

n_bits (int): the number of bits to quantize the network

Raises:

ValueError: if the estimator has not been fitted. With skorch estimators, module_ is not instantiated until .fit() is called, so the number of quantization bits cannot be retrieved before then.

property onnx_model

Get the ONNX model.


Returns:

_onnx_model_ (onnx.ModelProto): the ONNX model

property output_quantizers

Get the output quantizers.

Returns:

List[QuantizedArray]: the output quantizers

property quantize_input

Get the input quantization function.

Returns:

Callable : function that quantizes the input

method `fit`

method `get_params`

Get parameters for this estimator.

Args:

deep (bool): If True, will return the parameters for this estimator and contained subobjects that are estimators.
**kwargs: any additional parameters to pass to the sklearn BaseEstimator class

Returns:

params : dict, Parameter names mapped to their values.

method `get_params_for_benchmark`

Get parameters for benchmark when cloning a skorch wrapped NN.

Returns:

params (dict): parameters to create an equivalent fp32 sklearn estimator for benchmark

method `infer`

Perform a single inference step on a batch of data.

This method is specific to Skorch estimators.

Args:

x (torch.Tensor): A batch of the input data, produced by a Dataset
**fit_params (dict) : Additional parameters passed to the forward method of the module and to the self.train_split call.

Returns: A torch tensor with the inference results for each item in the input

method `on_train_end`

Callback invoked by the skorch wrapper when training is finished.

Check if the underlying neural net has a callback for this event and, if so, call it.

Args:

net: estimator for which training has ended (equal to self)
X: data
y

method `predict`

Predict on user provided data.

Predicts using either the quantized model in the clear or the FHE classifier.

Args:

X : input data, a numpy array of raw values (non quantized)
execute_in_fhe : whether to execute the inference in FHE or in the clear

Returns:

y_pred : numpy ndarray with predictions

class `NeuralNetRegressor`

Scikit-learn interface for quantized FHE compatible neural networks.

method `init`

property base_estimator_type

property base_module_to_compile

Get the module that should be compiled to FHE. In our case this is a torch nn.Module.

Returns:

module (nn.Module): the instantiated torch module

property fhe_circuit

Get the FHE circuit.

Returns:

Circuit: the FHE circuit

property history

property input_quantizers

Get the input quantizers.

Returns:

List[Quantizer]: the input quantizers

property n_bits_quant

Return the number of quantization bits.

This is stored by the torch.nn.module instance and thus cannot be retrieved until this instance is created.

Returns:

n_bits (int): the number of bits to quantize the network

Raises:

ValueError: if the estimator has not been fitted. With skorch estimators, module_ is not instantiated until .fit() is called, so the number of quantization bits cannot be retrieved before then.

property onnx_model

Get the ONNX model.


Returns:

_onnx_model_ (onnx.ModelProto): the ONNX model

property output_quantizers

Get the output quantizers.

Returns:

List[QuantizedArray]: the output quantizers

property quantize_input

Get the input quantization function.

Returns:

Callable : function that quantizes the input

method `fit`

method `get_params`

Get parameters for this estimator.

Args:

deep (bool): If True, will return the parameters for this estimator and contained subobjects that are estimators.
**kwargs: any additional parameters to pass to the sklearn BaseEstimator class

Returns:

params : dict, Parameter names mapped to their values.

method `get_params_for_benchmark`

Get parameters for benchmark when cloning a skorch wrapped NN.

Returns:

params (dict): parameters to create an equivalent fp32 sklearn estimator for benchmark

method `infer`

Perform a single inference step on a batch of data.

This method is specific to Skorch estimators.

Args:

x (torch.Tensor): A batch of the input data, produced by a Dataset
**fit_params (dict) : Additional parameters passed to the forward method of the module and to the self.train_split call.

Returns: A torch tensor with the inference results for each item in the input

method `on_train_end`

Callback invoked by the skorch wrapper when training is finished.

Check if the underlying neural net has a callback for this event and, if so, call it.

Args:

net: estimator for which training has ended (equal to self)
X: data
y

__init__(
    input_dim,
    n_layers,
    n_outputs,
    n_hidden_neurons_multiplier=4,
    n_w_bits=3,
    n_a_bits=3,
    n_accum_bits=8,
    activation_function=torch.nn.ReLU,
    quant_narrow=False,
    quant_signed=True
)

concrete.ml.pytest.torch_models.md

x: the input of the NN

Returns: the output of the NN

class `NetWithLoops`

Torch model, where we reuse some elements in a loop.

Torch model, where we reuse some elements in a loop in the forward and don't expect the user to define these elements in a particular order.

method `init`

method `forward`

Forward pass.

Args:

x: the input of the NN

Returns: the output of the NN

class `MultiInputNN`

Torch model to test multiple inputs forward.

method `init`

method `forward`

Forward pass.

Args:

x: the first input of the NN
y: the second input of the NN

Returns: the output of the NN

class `BranchingModule`

Torch model with some branching and skip connections.

method `init`

method `forward`

Forward pass.

Args:

x: the input of the NN

Returns: the output of the NN

class `BranchingGemmModule`

Torch model with some branching and skip connections.

method `init`

method `forward`

Forward pass.

Args:

x: the input of the NN

Returns: the output of the NN

class `UnivariateModule`

Torch model that calls univariate and shape functions of torch.

method `init`

method `forward`

Forward pass.

Args:

x: the input of the NN

Returns: the output of the NN

class `StepActivationModule`

Torch model that implements a step function that needs Greater, Cast and Where.

method `init`

method `forward`

Forward pass with a quantizer built into the computation graph.

Args:

x: the input of the NN

Returns: the output of the NN

class `NetWithConcatUnsqueeze`

Torch model to test the concat and unsqueeze operators.

method `init`

method `forward`

Forward pass.

Args:

x: the input of the NN

Returns: the output of the NN

class `MultiOpOnSingleInputConvNN`

Network that applies two quantized operations on a single input.

method `init`

method `forward`

Forward pass.

Args:

x: the input of the NN

Returns: the output of the NN

class `FCSeq`

Torch model that should generate MatMul->Add ONNX patterns.

This network generates additions with a constant scalar.

method `init`

method `forward`

Forward pass.

Args:

x: the input of the NN

Returns: the output of the NN

class `FCSeqAddBiasVec`

Torch model that should generate MatMul->Add ONNX patterns.

This network tests the addition with a constant vector.

method `init`

method `forward`

Forward pass.

Args:

x: the input of the NN

Returns: the output of the NN

class `TinyCNN`

A very small CNN.

method `init`

Create the tiny CNN with two conv layers.

Args:

n_classes: number of classes
act: the activation

method `forward`

Forward the two layers with the chosen activation function.

Args:

x: the input of the NN

Returns: the output of the NN

class `TinyQATCNN`

A very small QAT CNN to classify the sklearn digits dataset.

This class also allows pruning to a maximum of 10 active neurons, which should help keep the accumulator bit-width low.

method `init`

Construct the CNN with a configurable number of classes.

Args:

n_classes (int): number of outputs of the neural net
n_bits (int): number of weight and activation bits for quantization
n_active

method `forward`

Run inference on the tiny CNN, apply the decision layer on the reshaped conv output.

Args:

x: the input to the NN

Returns: the output of the NN

method `test_torch`

Test the network: measure accuracy on the test set.

Args:

test_loader: the test loader

Returns:

res: the number of correctly classified test examples

method `toggle_pruning`

Enable or remove pruning.

Args:

enable: if we enable the pruning or not

class `SimpleQAT`

Torch model that implements a step function, which needs the Greater, Cast and Where operators.

method `init`

method `forward`

Forward pass with a quantizer built into the computation graph.

Args:

x: the input of the NN

Returns: the output of the NN

class `QATTestModule`

Torch model that implements a simple non-uniform quantizer.

method `init`

method `forward`

Forward pass with a quantizer built into the computation graph.

Args:

x: the input of the NN

Returns: the output of the NN

class `SingleMixNet`

Torch model with a single conv layer that produces the output, e.g. a blur filter.

method `init`

method `forward`

Execute the single convolution.

Args:

x: the input of the NN

Returns: the output of the NN

class `TorchSum`

Torch model to test the ReduceSum ONNX operator in a leveled circuit.

method `init`

Initialize the module.

Args:

dim (Tuple[int]): The axis along which the sum should be executed
keepdim (bool): If the output should keep the same dimension as the input or not

method `forward`

Forward pass.

Args:

x (torch.tensor): The input of the model

Returns:

torch_sum (torch.tensor): The sum of the input's tensor elements along the given axis
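The effect of `dim` and `keepdim` can be illustrated with the NumPy analogue `np.sum`, whose `axis` and `keepdims` arguments play the same roles. This is a sketch of the reduction semantics only, not of the Torch module itself.

```python
import numpy as np

x = np.arange(6, dtype=np.float64).reshape(2, 3)  # [[0, 1, 2], [3, 4, 5]]

# dim=(1,), keepdim=False: the summed axis is dropped
reduced = np.sum(x, axis=(1,), keepdims=False)    # shape (2,)

# dim=(1,), keepdim=True: the summed axis is kept with size 1
kept = np.sum(x, axis=(1,), keepdims=True)        # shape (2, 1)
```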

class `TorchSumMod`

Torch model to test the ReduceSum ONNX operator in a circuit containing a PBS.

method `init`

Initialize the module.

Args:

dim (Tuple[int]): The axis along which the sum should be executed
keepdim (bool): If the output should keep the same dimension as the input or not

method `forward`

Forward pass.

Args:

x (torch.tensor): The input of the model

Returns:

torch_sum (torch.tensor): The sum of the input's tensor elements along the given axis

concrete.ml.sklearn.base.md

module `concrete.ml.sklearn.base`

Module that contains base classes for the library's estimators.

Global Variables

OPSET_VERSION_FOR_ONNX_EXPORT

function `get_sklearn_models`

Return the list of available models in Concrete-ML.

Returns: the lists of models in Concrete-ML

function `get_sklearn_linear_models`

Return the list of available linear models in Concrete-ML.

Args:

classifier (bool): whether you want classifiers or not
regressor (bool): whether you want regressors or not
str_in_class_name

Returns: the lists of linear models in Concrete-ML

function `get_sklearn_tree_models`

Return the list of available tree models in Concrete-ML.

Args:

classifier (bool): whether you want classifiers or not
regressor (bool): whether you want regressors or not
str_in_class_name

Returns: the lists of tree models in Concrete-ML

function `get_sklearn_neural_net_models`

Return the list of available neural net models in Concrete-ML.

Args:

classifier (bool): whether you want classifiers or not
regressor (bool): whether you want regressors or not
str_in_class_name

Returns: the lists of neural net models in Concrete-ML

class `QuantizedTorchEstimatorMixin`

Mixin that provides quantization for a torch module and follows the Estimator API.

This class should be mixed in with another class that provides the full Estimator API. It only provides modifiers for .fit() (with quantization) and .predict() (optionally in FHE).

method `init`

property base_estimator_type

Get the sklearn estimator that should be trained by the child class.

property base_module_to_compile

Get the Torch module that should be compiled to FHE.

property fhe_circuit

Get the FHE circuit.

Returns:

Circuit: the FHE circuit

property input_quantizers

Get the input quantizers.

Returns:

List[Quantizer]: the input quantizers

property n_bits_quant

Get the number of quantization bits.

property onnx_model

Get the ONNX model.

Returns:

_onnx_model_ (onnx.ModelProto): the ONNX model

property output_quantizers

Get the output quantizers.

Returns:

List[QuantizedArray]: the output quantizers

property quantize_input

Get the input quantization function.

Returns:

Callable : function that quantizes the input

method `compile`

Compile the model.

Args:

X (numpy.ndarray): the dequantized dataset
configuration (Optional[Configuration]): the options for compilation
compilation_artifacts

Returns:

Circuit: the compiled Circuit.

Raises:

ValueError: if called before the model is trained

method `fit`

Initialize and fit the module.

If the module was already initialized, calling fit re-initializes it (unless warm_start is True). In addition to the torch training step, this method quantizes the trained torch model.

Args:

X: training data. By default, you should be able to pass numpy arrays, torch tensors, or a pandas DataFrame or Series
y (numpy.ndarray): labels associated with training data
**fit_params

Returns:

self: the trained quantized estimator

method `fit_benchmark`

Fit the quantized estimator as well as its equivalent float estimator.

This function returns both the quantized estimator (itself) as well as its non-quantized (float) equivalent, which are both trained separately. This is useful in order to compare performances between quantized and fp32 versions.

Args:

X: The training data. By default, you should be able to pass numpy arrays, torch tensors, or a pandas DataFrame or Series
y (numpy.ndarray): The labels associated with the training data
*args

Returns:

self: The trained quantized estimator
fp32_model: The trained float equivalent estimator

method `get_params_for_benchmark`

Get the parameters to instantiate the sklearn estimator trained by the child class.

Returns:

params (dict): dictionary with parameters that will initialize a new Estimator

method `post_processing`

Post-processing the output.

Args:

y_preds (numpy.ndarray): the output to post-process

Raises:

ValueError: if unknown post-processing function

Returns:

numpy.ndarray: the post-processed output

method `predict`

Predict on user provided data.

Predicts using the quantized clear or FHE classifier

Args:

X : input data, a numpy array of raw values (non quantized)
execute_in_fhe : whether to execute the inference in FHE or in the clear

Returns:

y_pred : numpy ndarray with predictions

method `predict_proba`

Predict on user provided data, returning probabilities.

Predicts using the quantized clear or FHE classifier

Args:

X : input data, a numpy array of raw values (non quantized)
execute_in_fhe : whether to execute the inference in FHE or in the clear

Returns:

y_pred : numpy ndarray with probabilities (if applicable)

Raises:

ValueError: if the estimator was not yet trained or compiled

class `BaseTreeEstimatorMixin`

Mixin class for tree-based estimators.

A place to share methods that are used on all tree-based estimators.

method `init`

Initialize the BaseTreeEstimatorMixin.

Args:

n_bits (int): number of bits used for quantization

property onnx_model

Get the ONNX model.

Returns:

onnx.ModelProto: the ONNX model

method `compile`

Compile the model.

Args:

X (numpy.ndarray): the dequantized dataset
configuration (Optional[Configuration]): the options for compilation
compilation_artifacts

Returns:

Circuit: the compiled Circuit.

method `dequantize_output`

Dequantize the integer predictions.

Args:

y_preds (numpy.ndarray): the predictions

Returns: the dequantized predictions

method `fit_benchmark`

Fit the sklearn tree-based model and the FHE tree-based model.

Args:

X (numpy.ndarray): The input data.
y (numpy.ndarray): The target data.
random_state (Optional[Union[int, numpy.random.RandomState, None]]): The random state. Defaults to None.
*args

Returns: Tuple[ConcreteEstimators, SklearnEstimators]: The FHE and sklearn tree-based models.

method `quantize_input`

Quantize the input.

Args:

X (numpy.ndarray): the input

Returns: the quantized input

class `BaseTreeRegressorMixin`

Mixin class for tree-based regressors.

A place to share methods that are used on all tree-based regressors.

method `init`

Initialize the TreeBasedEstimatorMixin.

Args:

n_bits (int): number of bits used for quantization

property onnx_model

Get the ONNX model.

Returns:

onnx.ModelProto: the ONNX model

method `compile`

Compile the model.

Args:

X (numpy.ndarray): the dequantized dataset
configuration (Optional[Configuration]): the options for compilation
compilation_artifacts

Returns:

Circuit: the compiled Circuit.

method `dequantize_output`

Dequantize the integer predictions.

Args:

y_preds (numpy.ndarray): the predictions

Returns: the dequantized predictions

method `fit`

Fit the tree-based estimator.

Args:

X: training data. By default, you should be able to pass numpy arrays, torch tensors, or a pandas DataFrame or Series
y (numpy.ndarray): The target data.
**kwargs

Returns:

Any: The fitted model.

method `fit_benchmark`

Fit the sklearn tree-based model and the FHE tree-based model.

Args:

X (numpy.ndarray): The input data.
y (numpy.ndarray): The target data.
random_state (Optional[Union[int, numpy.random.RandomState, None]]): The random state. Defaults to None.
*args

Returns: Tuple[ConcreteEstimators, SklearnEstimators]: The FHE and sklearn tree-based models.

method `post_processing`

Apply post-processing to the predictions.

Args:

y_preds (numpy.ndarray): The predictions.

Returns:

numpy.ndarray: The post-processed predictions.

method `predict`

Predict the target values.

Args:

X (numpy.ndarray): The input data.
execute_in_fhe (bool): Whether to execute in FHE. Defaults to False.

Returns:

numpy.ndarray: The predicted target values.

method `quantize_input`

Quantize the input.

Args:

X (numpy.ndarray): the input

Returns: the quantized input

class `BaseTreeClassifierMixin`

Mixin class for tree-based classifiers.

A place to share methods that are used on all tree-based classifiers.

method `init`

Initialize the TreeBasedEstimatorMixin.

Args:

n_bits (int): number of bits used for quantization

property onnx_model

Get the ONNX model.

Returns:

onnx.ModelProto: the ONNX model

method `compile`

Compile the model.

Args:

X (numpy.ndarray): the dequantized dataset
configuration (Optional[Configuration]): the options for compilation
compilation_artifacts

Returns:

Circuit: the compiled Circuit.

method `dequantize_output`

Dequantize the integer predictions.

Args:

y_preds (numpy.ndarray): the predictions

Returns: the dequantized predictions

method `fit`

Fit the tree-based estimator.

Args:

X: training data. By default, you should be able to pass numpy arrays, torch tensors, or a pandas DataFrame or Series
y (numpy.ndarray): The target data.
**kwargs

Returns:

Any: The fitted model.

method `fit_benchmark`

Fit the sklearn tree-based model and the FHE tree-based model.

Args:

X (numpy.ndarray): The input data.
y (numpy.ndarray): The target data.
random_state (Optional[Union[int, numpy.random.RandomState, None]]): The random state. Defaults to None.
*args

Returns: Tuple[ConcreteEstimators, SklearnEstimators]: The FHE and sklearn tree-based models.

method `post_processing`

Apply post-processing to the predictions.

Args:

y_preds (numpy.ndarray): The predictions.

Returns:

numpy.ndarray: The post-processed predictions.

method `predict`

Predict the class with highest probability.

Args:

X (numpy.ndarray): The input data.
execute_in_fhe (bool): Whether to execute in FHE. Defaults to False.

Returns:

numpy.ndarray: The predicted target values.

method `predict_proba`

Predict the probability.

Args:

X (numpy.ndarray): The input data.
execute_in_fhe (bool): Whether to execute in FHE. Defaults to False.

Returns:

numpy.ndarray: The predicted probabilities.

method `quantize_input`

Quantize the input.

Args:

X (numpy.ndarray): the input

Returns: the quantized input

class `SklearnLinearModelMixin`

A Mixin class for sklearn linear models with FHE.

method `init`

Initialize the FHE linear model.

Args:

n_bits (int, Dict[str, int]): Number of bits to quantize the model. If an int is passed, the value is used for quantizing both inputs and weights. If a dict is passed, it should contain "op_inputs" (number of bits to quantize the input values) and "op_weights" (number of bits to quantize the learned parameters) as keys. Defaults to 8.
*args: The arguments to pass to the sklearn linear model.
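The two accepted forms of `n_bits` can be reduced to a single dictionary before quantization. The helper below is a hypothetical sketch of that normalization (the name `normalize_n_bits` and the error message are assumptions, not part of the library API).

```python
from typing import Dict, Union

def normalize_n_bits(n_bits: Union[int, Dict[str, int]] = 8) -> Dict[str, int]:
    """Hypothetical helper: expand n_bits into per-role bit-widths."""
    if isinstance(n_bits, int):
        # A single int applies to both inputs and weights
        return {"op_inputs": n_bits, "op_weights": n_bits}
    missing = {"op_inputs", "op_weights"} - set(n_bits)
    if missing:
        raise ValueError(f"n_bits is missing keys: {sorted(missing)}")
    return {"op_inputs": n_bits["op_inputs"], "op_weights": n_bits["op_weights"]}

same = normalize_n_bits(8)
mixed = normalize_n_bits({"op_inputs": 6, "op_weights": 3})
```

Using different bit-widths for inputs and weights lets you trade accuracy against accumulator size independently for each.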

method `clean_graph`

Clean the graph of the onnx model.

This removes the Cast nodes from the model's onnx.graph, since they have no use in quantized or FHE models.

method `compile`

Compile the FHE linear model.

Args:

X (numpy.ndarray): The input data.
configuration (Optional[Configuration]): Configuration object to use during compilation
compilation_artifacts

Returns:

Circuit: The compiled Circuit.

method `dequantize_output`

Dequantize the output.

Args:

q_y_preds (numpy.ndarray): The quantized output to dequantize

Returns:

numpy.ndarray: The dequantized output

method `fit`

Fit the FHE linear model.

Args:

X: Training data. By default, you should be able to pass numpy arrays, torch tensors, or a pandas DataFrame or Series
y (numpy.ndarray): The target data.
*args

Returns: Any

method `fit_benchmark`

Fit the sklearn linear model and the FHE linear model.

Args:

X (numpy.ndarray): The input data.
y (numpy.ndarray): The target data.
random_state (Optional[Union[int, numpy.random.RandomState, None]]): The random state. Defaults to None.
*args

Returns: Tuple[SklearnLinearModelMixin, sklearn.linear_model.LinearRegression]: The FHE and sklearn LinearRegression.

method `post_processing`

Post-processing the quantized output.

For linear models, post-processing only considers a dequantization step.

Args:

y_preds (numpy.ndarray): The quantized outputs to post-process

Returns:

numpy.ndarray: The post-processed output

method `predict`

Predict on user data.

Predict on user data using either the quantized clear model, implemented with tensors, or, if execute_in_fhe is set, using the compiled FHE circuit

Args:

X (numpy.ndarray): The input data
execute_in_fhe (bool): Whether to execute the inference in FHE

Returns:

numpy.ndarray: The prediction as ordinals

method `quantize_input`

Quantize the input.

Args:

X (numpy.ndarray): The input to quantize

Returns:

numpy.ndarray: The quantized input
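As an illustration of what `quantize_input` and `dequantize_output` do conceptually, here is a generic uniform (affine) quantization sketch in NumPy. It is an assumption-laden example: the function names, the min/max calibration and the unsigned integer range are illustrative and may differ from the quantizer Concrete-ML actually uses.

```python
import numpy as np

def fit_quantizer(x, n_bits=8):
    """Hypothetical calibration: derive scale and zero point from the data range."""
    x_min, x_max = float(x.min()), float(x.max())
    scale = (x_max - x_min) / (2**n_bits - 1) or 1.0  # guard against constant input
    zero_point = int(round(-x_min / scale))
    return scale, zero_point

def quantize(x, scale, zero_point, n_bits=8):
    # Map floats to integers in [0, 2**n_bits - 1]
    q = np.rint(x / scale) + zero_point
    return np.clip(q, 0, 2**n_bits - 1).astype(np.int64)

def dequantize(q, scale, zero_point):
    # Map integers back to approximate float values
    return (q.astype(np.float64) - zero_point) * scale

X = np.array([[-1.0, 0.0], [0.5, 1.0]])
scale, zp = fit_quantizer(X, n_bits=8)
q_X = quantize(X, scale, zp, n_bits=8)
X_hat = dequantize(q_X, scale, zp)
```

The round trip is lossy: `X_hat` matches `X` only up to an error on the order of `scale`, which is the source of the accuracy loss mentioned for quantized models.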

class `SklearnLinearClassifierMixin`

A Mixin class for sklearn linear classifiers with FHE.

method `init`

Initialize the FHE linear model.

Args:

n_bits (int, Dict[str, int]): Number of bits to quantize the model. If an int is passed, the value is used for quantizing both inputs and weights. If a dict is passed, it should contain "op_inputs" (number of bits to quantize the input values) and "op_weights" (number of bits to quantize the learned parameters) as keys. Defaults to 8.
*args: The arguments to pass to the sklearn linear model.

method `clean_graph`

Clean the graph of the onnx model.

Any operators following gemm, including the sigmoid, softmax and argmax operators, are removed from the graph. They will be executed in clear in the post-processing method.
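For a binary classifier, the operators removed from the graph could be applied in the clear along the following lines. This is a hypothetical sketch (the function name `binary_post_processing` and the two-column layout are assumptions), operating on already dequantized linear outputs.

```python
import numpy as np

def binary_post_processing(scores):
    """Hypothetical clear-text step: sigmoid, then argmax over two classes."""
    p1 = 1.0 / (1.0 + np.exp(-scores))       # sigmoid -> P(class 1)
    proba = np.column_stack([1.0 - p1, p1])  # shape (n_samples, 2)
    return proba, np.argmax(proba, axis=1)

scores = np.array([-2.0, 0.5, 3.0])          # dequantized linear outputs
proba, labels = binary_post_processing(scores)
```

Only the linear part (gemm) then needs to run on encrypted data; these non-linear steps run on the decrypted result.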

method `compile`

Compile the FHE linear model.

Args:

X (numpy.ndarray): The input data.
configuration (Optional[Configuration]): Configuration object to use during compilation
compilation_artifacts

Returns:

Circuit: The compiled Circuit.

method `decision_function`

Predict confidence scores for samples.

Args:

X (numpy.ndarray): Samples to predict.
execute_in_fhe (bool): If True, the inference will be executed in FHE. Default to False.

Returns:

numpy.ndarray: Confidence scores for samples.

method `dequantize_output`

Dequantize the output.

Args:

q_y_preds (numpy.ndarray): The quantized output to dequantize

Returns:

numpy.ndarray: The dequantized output

method `fit`

Fit the FHE linear model.

Args:

X: Training data. By default, you should be able to pass numpy arrays, torch tensors, or a pandas DataFrame or Series
y (numpy.ndarray): The target data.
*args

Returns: Any

method `fit_benchmark`

Fit the sklearn linear model and the FHE linear model.

Args:

X (numpy.ndarray): The input data.
y (numpy.ndarray): The target data.
random_state (Optional[Union[int, numpy.random.RandomState, None]]): The random state. Defaults to None.
*args

Returns: Tuple[SklearnLinearModelMixin, sklearn.linear_model.LinearRegression]: The FHE and sklearn LinearRegression.

method `post_processing`

Post-processing the predictions.

This step may include a dequantization of the inputs if not done previously, in particular within the client-server workflow.

Args:

y_preds (numpy.ndarray): The predictions to post-process.
already_dequantized (bool): Whether the inputs were already dequantized or not. Default to False.

Returns:

numpy.ndarray: The post-processed predictions.

method `predict`

Predict on user data.

Predict on user data using either the quantized clear model, implemented with tensors, or, if execute_in_fhe is set, using the compiled FHE circuit.

Args:

X (numpy.ndarray): Samples to predict.
execute_in_fhe (bool): If True, the inference will be executed in FHE. Default to False.

Returns:

numpy.ndarray: The prediction as ordinals.

method `predict_proba`

Predict class probabilities for samples.

Args:

X (numpy.ndarray): Samples to predict.
execute_in_fhe (bool): If True, the inference will be executed in FHE. Default to False.

Returns:

numpy.ndarray: Class probabilities for samples.

method `quantize_input`

Quantize the input.

Args:

X (numpy.ndarray): The input to quantize

Returns:

numpy.ndarray: The quantized input

**kwargs: The keyword arguments to pass to the sklearn linear model.

concrete.ml.onnx.ops_impl.md

module `concrete.ml.onnx.ops_impl`

ONNX ops implementation in python + numpy.

function `cast_to_float`

Cast values to floating points.

Args:

inputs (Tuple[numpy.ndarray]): The values to consider.

Returns:

Tuple[numpy.ndarray]: The float values.

function `onnx_func_raw_args`

Decorate a numpy onnx function to flag the raw/non quantized inputs.

Args:

*args (tuple[Any]): function argument names

Returns:

result (ONNXMixedFunction): wrapped numpy function with a list of mixed arguments
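The general pattern of a decorator that flags some arguments as raw (non-quantized) can be sketched as follows. The class `MixedFunction`, the decorator `raw_args` and the attribute `non_quant_params` are hypothetical stand-ins chosen for this example; they are not the library's `ONNXMixedFunction` implementation.

```python
from typing import Any, Callable, Set

class MixedFunction:
    """Hypothetical wrapper: a callable plus the names of its raw inputs."""

    def __init__(self, function: Callable, non_quant_params: Set[str]):
        self.function = function
        self.non_quant_params = non_quant_params

    def __call__(self, *args: Any, **kwargs: Any) -> Any:
        return self.function(*args, **kwargs)

def raw_args(*names: str) -> Callable[[Callable], MixedFunction]:
    """Decorator factory that records which arguments must stay unquantized."""
    def decorator(function: Callable) -> MixedFunction:
        return MixedFunction(function, set(names))
    return decorator

@raw_args("shape")
def reshape_like(x, shape):
    # `shape` is flagged as raw: it is structural, not data to quantize
    return (x, shape)
```

A caller can then inspect `reshape_like.non_quant_params` to decide which inputs to pass through unchanged.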

function `numpy_where_body`

Compute the equivalent of numpy.where.

This function is not mapped to any ONNX operator (as opposed to numpy_where). It is usable by functions which are mapped to ONNX operators, e.g. numpy_div or numpy_where.

Args:

c (numpy.ndarray): Condition operand.
t (numpy.ndarray): True operand.
f

Returns:

numpy.ndarray: numpy.where(c, t, f)

function `numpy_where`

Compute the equivalent of numpy.where.

Args:

c (numpy.ndarray): Condition operand.
t (numpy.ndarray): True operand.
f

Returns:

numpy.ndarray: numpy.where(c, t, f)

function `numpy_add`

Compute add in numpy according to ONNX spec.

See https://github.com/onnx/onnx/blob/main/docs/Changelog.md#Add-13

Args:

a (numpy.ndarray): First operand.
b (numpy.ndarray): Second operand.

Returns:

Tuple[numpy.ndarray]: Result, has same element type as two inputs

function `numpy_constant`

Return the constant passed as a kwarg.

See https://github.com/onnx/onnx/blob/main/docs/Changelog.md#Constant-13

Args:

**kwargs: keyword arguments

Returns:

Any: The stored constant.

function `numpy_matmul`

Compute matmul in numpy according to ONNX spec.

See https://github.com/onnx/onnx/blob/main/docs/Changelog.md#MatMul-13

Args:

a (numpy.ndarray): N-dimensional matrix A
b (numpy.ndarray): N-dimensional matrix B

Returns:

Tuple[numpy.ndarray]: Matrix multiply results from A * B

function `numpy_relu`

Compute relu in numpy according to ONNX spec.

See https://github.com/onnx/onnx/blob/main/docs/Changelog.md#Relu-14

Args:

x (numpy.ndarray): Input tensor

Returns:

Tuple[numpy.ndarray]: Output tensor

function `numpy_sigmoid`

Compute sigmoid in numpy according to ONNX spec.

See https://github.com/onnx/onnx/blob/main/docs/Changelog.md#Sigmoid-13

Args:

x (numpy.ndarray): Input tensor

Returns:

Tuple[numpy.ndarray]: Output tensor

function `numpy_softmax`

Compute softmax in numpy according to ONNX spec.

Softmax is currently not supported in FHE.

See https://github.com/onnx/onnx/blob/main/docs/Changelog.md#softmax-13

Args:

x (numpy.ndarray): Input tensor
axis (None, int, tuple of int): Axis or axes along which a softmax's sum is performed. If None, it will sum all of the elements of the input array. If axis is negative it counts from the last to the first axis. Default to 1.

Returns:

Tuple[numpy.ndarray]: Output tensor
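A softmax along a chosen axis can be sketched in NumPy as below. Subtracting the per-axis maximum before exponentiating is an assumption added here for numerical stability; the 1-tuple return mirrors the `Tuple[numpy.ndarray]` convention documented above.

```python
import numpy as np

def softmax(x, axis=1):
    # Shifting by the max does not change the result but avoids overflow
    shifted = x - np.max(x, axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return (exp / np.sum(exp, axis=axis, keepdims=True),)

x = np.array([[1.0, 2.0, 3.0], [0.0, 0.0, 0.0]])
(probs,) = softmax(x)
```

Each row of `probs` sums to 1, and a row of equal inputs yields a uniform distribution.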

function `numpy_cos`

Compute cos in numpy according to ONNX spec.

See https://github.com/onnx/onnx/blob/main/docs/Changelog.md#Cos-7

Args:

x (numpy.ndarray): Input tensor

Returns:

Tuple[numpy.ndarray]: Output tensor

function `numpy_cosh`

Compute cosh in numpy according to ONNX spec.

See https://github.com/onnx/onnx/blob/main/docs/Changelog.md#Cosh-9

Args:

x (numpy.ndarray): Input tensor

Returns:

Tuple[numpy.ndarray]: Output tensor

function `numpy_sin`

Compute sin in numpy according to ONNX spec.

See https://github.com/onnx/onnx/blob/main/docs/Changelog.md#Sin-7

Args:

x (numpy.ndarray): Input tensor

Returns:

Tuple[numpy.ndarray]: Output tensor

function `numpy_sinh`

Compute sinh in numpy according to ONNX spec.

See https://github.com/onnx/onnx/blob/main/docs/Changelog.md#Sinh-9

Args:

x (numpy.ndarray): Input tensor

Returns:

Tuple[numpy.ndarray]: Output tensor

function `numpy_tan`

Compute tan in numpy according to ONNX spec.

See https://github.com/onnx/onnx/blob/main/docs/Changelog.md#Tan-7

Args:

x (numpy.ndarray): Input tensor

Returns:

Tuple[numpy.ndarray]: Output tensor

function `numpy_tanh`

Compute tanh in numpy according to ONNX spec.

See https://github.com/onnx/onnx/blob/main/docs/Changelog.md#Tanh-13

Args:

x (numpy.ndarray): Input tensor

Returns:

Tuple[numpy.ndarray]: Output tensor

function `numpy_acos`

Compute acos in numpy according to ONNX spec.

See https://github.com/onnx/onnx/blob/main/docs/Changelog.md#Acos-7

Args:

x (numpy.ndarray): Input tensor

Returns:

Tuple[numpy.ndarray]: Output tensor

function `numpy_acosh`

Compute acosh in numpy according to ONNX spec.

See https://github.com/onnx/onnx/blob/main/docs/Changelog.md#Acosh-9

Args:

x (numpy.ndarray): Input tensor

Returns:

Tuple[numpy.ndarray]: Output tensor

function `numpy_asin`

Compute asin in numpy according to ONNX spec.

See https://github.com/onnx/onnx/blob/main/docs/Changelog.md#Asin-7

Args:

x (numpy.ndarray): Input tensor

Returns:

Tuple[numpy.ndarray]: Output tensor

function `numpy_asinh`

Compute asinh in numpy according to ONNX spec.

See https://github.com/onnx/onnx/blob/main/docs/Changelog.md#Asinh-9

Args:

x (numpy.ndarray): Input tensor

Returns:

Tuple[numpy.ndarray]: Output tensor

function `numpy_atan`

Compute atan in numpy according to ONNX spec.

See https://github.com/onnx/onnx/blob/main/docs/Changelog.md#Atan-7

Args:

x (numpy.ndarray): Input tensor

Returns:

Tuple[numpy.ndarray]: Output tensor

function `numpy_atanh`

Compute atanh in numpy according to ONNX spec.

See https://github.com/onnx/onnx/blob/main/docs/Changelog.md#Atanh-9

Args:

x (numpy.ndarray): Input tensor

Returns:

Tuple[numpy.ndarray]: Output tensor

function `numpy_elu`

Compute elu in numpy according to ONNX spec.

See https://github.com/onnx/onnx/blob/main/docs/Changelog.md#Elu-6

Args:

x (numpy.ndarray): Input tensor
alpha (float): Coefficient

Returns:

Tuple[numpy.ndarray]: Output tensor
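The ONNX Elu definition is `x` for `x >= 0` and `alpha * (exp(x) - 1)` otherwise. A minimal NumPy sketch, written independently of the library's implementation, might look like this:

```python
import numpy as np

def elu(x, alpha=1.0):
    # expm1 on the clipped input avoids overflow for large positive x
    negative_branch = alpha * np.expm1(np.minimum(x, 0.0))
    return (np.where(x > 0, x, negative_branch),)

x = np.array([-1.0, 0.0, 2.0])
(y,) = elu(x)
```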

function `numpy_selu`

Compute selu in numpy according to ONNX spec.

See https://github.com/onnx/onnx/blob/main/docs/Changelog.md#Selu-6

Args:

x (numpy.ndarray): Input tensor
alpha (float): Coefficient
gamma

Returns:

Tuple[numpy.ndarray]: Output tensor

function `numpy_celu`

Compute celu in numpy according to ONNX spec.

See https://github.com/onnx/onnx/blob/main/docs/Changelog.md#Celu-12

Args:

x (numpy.ndarray): Input tensor
alpha (float): Coefficient

Returns:

Tuple[numpy.ndarray]: Output tensor

function `numpy_leakyrelu`

Compute leakyrelu in numpy according to ONNX spec.

See https://github.com/onnx/onnx/blob/main/docs/Changelog.md#LeakyRelu-6

Args:

x (numpy.ndarray): Input tensor
alpha (float): Coefficient

Returns:

Tuple[numpy.ndarray]: Output tensor

function `numpy_thresholdedrelu`

Compute thresholdedrelu in numpy according to ONNX spec.

See https://github.com/onnx/onnx/blob/main/docs/Changelog.md#ThresholdedRelu-10

Args:

x (numpy.ndarray): Input tensor
alpha (float): Coefficient

Returns:

Tuple[numpy.ndarray]: Output tensor

function `numpy_hardsigmoid`

Compute hardsigmoid in numpy according to ONNX spec.

See https://github.com/onnx/onnx/blob/main/docs/Changelog.md#HardSigmoid-6

Args:

x (numpy.ndarray): Input tensor
alpha (float): Coefficient
beta

Returns:

Tuple[numpy.ndarray]: Output tensor
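The ONNX HardSigmoid operator computes `max(0, min(1, alpha * x + beta))`, with spec defaults `alpha=0.2` and `beta=0.5`. The sketch below shows that formula in NumPy; it is an illustration rather than the library's code.

```python
import numpy as np

def hardsigmoid(x, alpha=0.2, beta=0.5):
    # Linear ramp clipped to the [0, 1] interval
    return (np.clip(alpha * x + beta, 0.0, 1.0),)

x = np.array([-5.0, 0.0, 1.0, 5.0])
(y,) = hardsigmoid(x)
```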

function `numpy_softplus`

Compute softplus in numpy according to ONNX spec.

See https://github.com/onnx/onnx/blob/main/docs/Changelog.md#Softplus-1

Args:

x (numpy.ndarray): Input tensor

Returns:

Tuple[numpy.ndarray]: Output tensor

function `numpy_abs`

Compute abs in numpy according to ONNX spec.

See https://github.com/onnx/onnx/blob/main/docs/Changelog.md#Abs-13

Args:

x (numpy.ndarray): Input tensor

Returns:

Tuple[numpy.ndarray]: Output tensor

function `numpy_div`

Compute div in numpy according to ONNX spec.

See https://github.com/onnx/onnx/blob/main/docs/Changelog.md#Div-14

Args:

a (numpy.ndarray): Input tensor
b (numpy.ndarray): Input tensor

Returns:

Tuple[numpy.ndarray]: Output tensor

function `numpy_mul`

Compute mul in numpy according to ONNX spec.

See https://github.com/onnx/onnx/blob/main/docs/Changelog.md#Mul-14

Args:

a (numpy.ndarray): Input tensor
b (numpy.ndarray): Input tensor

Returns:

Tuple[numpy.ndarray]: Output tensor

function `numpy_sub`

Compute sub in numpy according to ONNX spec.

See https://github.com/onnx/onnx/blob/main/docs/Changelog.md#Sub-14

Args:

a (numpy.ndarray): Input tensor
b (numpy.ndarray): Input tensor

Returns:

Tuple[numpy.ndarray]: Output tensor

function `numpy_log`

Compute log in numpy according to ONNX spec.

See https://github.com/onnx/onnx/blob/main/docs/Changelog.md#Log-13

Args:

x (numpy.ndarray): Input tensor

Returns:

Tuple[numpy.ndarray]: Output tensor

function `numpy_erf`

Compute erf in numpy according to ONNX spec.

See https://github.com/onnx/onnx/blob/main/docs/Changelog.md#Erf-13

Args:

x (numpy.ndarray): Input tensor

Returns:

Tuple[numpy.ndarray]: Output tensor

function `numpy_hardswish`

Compute hardswish in numpy according to ONNX spec.

See https://github.com/onnx/onnx/blob/main/docs/Changelog.md#hardswish-14

Args:

x (numpy.ndarray): Input tensor

Returns:

Tuple[numpy.ndarray]: Output tensor

function `numpy_exp`

Compute exponential in numpy according to ONNX spec.

See https://github.com/onnx/onnx/blob/main/docs/Changelog.md#Exp-13

Args:

x (numpy.ndarray): Input tensor

Returns:

Tuple[numpy.ndarray]: The exponential of the input tensor computed element-wise

function `numpy_equal`

Compute equal in numpy according to ONNX spec.

See https://github.com/onnx/onnx/blob/main/docs/Changelog.md#Equal-11

Args:

x (numpy.ndarray): Input tensor
y (numpy.ndarray): Input tensor

Returns:

Tuple[numpy.ndarray]: Output tensor

function `numpy_not`

Compute not in numpy according to ONNX spec.

See https://github.com/onnx/onnx/blob/main/docs/Changelog.md#Not-1

Args:

x (numpy.ndarray): Input tensor

Returns:

Tuple[numpy.ndarray]: Output tensor

function `numpy_not_float`

Compute not in numpy according to ONNX spec and cast outputs to floats.

See https://github.com/onnx/onnx/blob/main/docs/Changelog.md#Not-1

Args:

x (numpy.ndarray): Input tensor

Returns:

Tuple[numpy.ndarray]: Output tensor

function `numpy_greater`

Compute greater in numpy according to ONNX spec.

See https://github.com/onnx/onnx/blob/main/docs/Changelog.md#Greater-13

Args:

x (numpy.ndarray): Input tensor
y (numpy.ndarray): Input tensor

Returns:

Tuple[numpy.ndarray]: Output tensor

function `numpy_greater_float`

Compute greater in numpy according to ONNX spec and cast outputs to floats.

See https://github.com/onnx/onnx/blob/main/docs/Changelog.md#Greater-13

Args:

x (numpy.ndarray): Input tensor
y (numpy.ndarray): Input tensor

Returns:

Tuple[numpy.ndarray]: Output tensor
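The difference between a comparison operator and its `_float` variant can be sketched as follows. The function names and the choice of `float64` are assumptions made for this example.

```python
import numpy as np

def greater(x, y):
    # Plain comparison: boolean output
    return (np.greater(x, y),)

def greater_float(x, y):
    # Same comparison, with the boolean mask cast to floats
    return (np.greater(x, y).astype(np.float64),)

x = np.array([1, 5, 3])
y = np.array([2, 4, 3])
(mask,) = greater(x, y)
(as_float,) = greater_float(x, y)
```

A float result can be fed directly into arithmetic operators, for example to multiply another tensor by the mask.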

function `numpy_greater_or_equal`

Compute greater or equal in numpy according to ONNX spec.

See https://github.com/onnx/onnx/blob/main/docs/Changelog.md#GreaterOrEqual-12

Args:

x (numpy.ndarray): Input tensor
y (numpy.ndarray): Input tensor

Returns:

Tuple[numpy.ndarray]: Output tensor

function `numpy_greater_or_equal_float`

Compute greater or equal in numpy according to ONNX spec and cast outputs to floats.

See https://github.com/onnx/onnx/blob/main/docs/Changelog.md#GreaterOrEqual-12

Args:

x (numpy.ndarray): Input tensor
y (numpy.ndarray): Input tensor

Returns:

Tuple[numpy.ndarray]: Output tensor

function `numpy_less`

Compute less in numpy according to ONNX spec.

See https://github.com/onnx/onnx/blob/main/docs/Changelog.md#Less-13

Args:

x (numpy.ndarray): Input tensor
y (numpy.ndarray): Input tensor

Returns:

Tuple[numpy.ndarray]: Output tensor

function `numpy_less_float`

Compute less in numpy according to ONNX spec and cast outputs to floats.

See https://github.com/onnx/onnx/blob/main/docs/Changelog.md#Less-13

Args:

x (numpy.ndarray): Input tensor
y (numpy.ndarray): Input tensor

Returns:

Tuple[numpy.ndarray]: Output tensor

function `numpy_less_or_equal`

Compute less or equal in numpy according to ONNX spec.

See https://github.com/onnx/onnx/blob/main/docs/Changelog.md#LessOrEqual-12

Args:

x (numpy.ndarray): Input tensor
y (numpy.ndarray): Input tensor

Returns:

Tuple[numpy.ndarray]: Output tensor

function `numpy_less_or_equal_float`

Compute less or equal in numpy according to ONNX spec and cast outputs to floats.

See https://github.com/onnx/onnx/blob/main/docs/Changelog.md#LessOrEqual-12

Args:

x (numpy.ndarray): Input tensor
y (numpy.ndarray): Input tensor

Returns:

Tuple[numpy.ndarray]: Output tensor

function `numpy_identity`

Compute identity in numpy according to ONNX spec.

See https://github.com/onnx/onnx/blob/main/docs/Changelog.md#Identity-14

Args:

x (numpy.ndarray): Input tensor

Returns:

Tuple[numpy.ndarray]: Output tensor

function `numpy_transpose`

Transpose in numpy according to ONNX spec.

See https://github.com/onnx/onnx/blob/main/docs/Changelog.md#Transpose-13

Args:

x (numpy.ndarray): Input tensor
perm (numpy.ndarray): Permutation of the axes

Returns:

Tuple[numpy.ndarray]: Output tensor

function `numpy_avgpool`

Compute Average Pooling using Torch.

Currently supports 2d average pooling with torch semantics. This function is ONNX compatible.

See: https://github.com/onnx/onnx/blob/main/docs/Operators.md#AveragePool

Args:

x (numpy.ndarray): input data (many dtypes are supported). Shape is N x C x H x W for 2d
ceil_mode (int): ONNX rounding parameter, expected 0 (torch style dimension computation)
kernel_shape

Returns:

res (numpy.ndarray): a tensor of size (N x InChannels x OutHeight x OutWidth).
See https://pytorch.org/docs/stable/generated/torch.nn.AvgPool2d.html

Raises:

AssertionError: if the pooling arguments are wrong
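The following NumPy sketch shows 2d average pooling on an N x C x H x W input with floor-based output sizes (the `ceil_mode=0` case). It assumes no padding and uses a hypothetical function name and signature; the library's own function takes additional arguments not shown here.

```python
import numpy as np

def avgpool2d(x, kernel_shape, strides):
    """Hypothetical 2d average pooling without padding (ceil_mode=0)."""
    n, c, h, w = x.shape
    kh, kw = kernel_shape
    sh, sw = strides
    out_h = (h - kh) // sh + 1   # floor division: torch-style output size
    out_w = (w - kw) // sw + 1
    res = np.zeros((n, c, out_h, out_w), dtype=np.float64)
    for i in range(out_h):
        for j in range(out_w):
            window = x[:, :, i * sh:i * sh + kh, j * sw:j * sw + kw]
            res[:, :, i, j] = window.mean(axis=(2, 3))
    return res

x = np.arange(16, dtype=np.float64).reshape(1, 1, 4, 4)
pooled = avgpool2d(x, kernel_shape=(2, 2), strides=(2, 2))
```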

function `numpy_maxpool`

Compute Max Pooling using Torch.

Currently supports 2d max pooling with torch semantics. This function is ONNX compatible.

See: https://github.com/onnx/onnx/blob/main/docs/Operators.md#MaxPool

Args:

x (numpy.ndarray): the input
kernel_shape (Union[Tuple[int, ...], List[int]]): shape of the kernel
strides

Returns:

res (numpy.ndarray): a tensor of size (N x InChannels x OutHeight x OutWidth).
See https://pytorch.org/docs/stable/generated/torch.nn.MaxPool2d.html
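
A compact way to picture 2d max pooling is to build all sliding windows and take the maximum of each. The sketch below uses `numpy.lib.stride_tricks.sliding_window_view` and ignores padding, dilations and `ceil_mode`. It is an illustration only, not the library's Torch-based implementation, and `maxpool2d_sketch` is a hypothetical name.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def maxpool2d_sketch(x, kernel_shape, strides):
    """Max pooling over an N x C x H x W tensor, no padding, no dilation."""
    # Shape: N x C x (H - kh + 1) x (W - kw + 1) x kh x kw
    windows = sliding_window_view(x, kernel_shape, axis=(2, 3))
    sh, sw = strides
    # Keep only the windows that start on a stride boundary, then reduce each window.
    return windows[:, :, ::sh, ::sw].max(axis=(-2, -1))


x = np.arange(16, dtype=np.float64).reshape(1, 1, 4, 4)
pooled_max = maxpool2d_sketch(x, kernel_shape=(2, 2), strides=(2, 2))
```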

function `numpy_cast`

Execute ONNX cast in Numpy.

Supports only booleans for now, which are converted to integers.

See: https://github.com/onnx/onnx/blob/main/docs/Operators.md#Cast

Args:

data (numpy.ndarray): Input encrypted tensor
to (int): integer value of the onnx.TensorProto DataType enum

Returns:

result (numpy.ndarray): a tensor with the required data type

function `numpy_batchnorm`

Compute the batch normalization of the input tensor.

This can be expressed as:

Y = (X - input_mean) / sqrt(input_var + epsilon) * scale + B

See https://github.com/onnx/onnx/blob/main/docs/Changelog.md#BatchNormalization-14

Args:

x (numpy.ndarray): tensor to normalize, dimensions are in the form of (N,C,D1,D2,...,Dn), where N is the batch size, C is the number of channels.
scale (numpy.ndarray): scale tensor of shape (C,)
bias

Returns:

numpy.ndarray: Normalized tensor
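
The formula above can be checked with a small NumPy sketch. It assumes inference mode, that `B` in the formula is the `bias` argument, and that the per-channel parameters of shape (C,) are broadcast along axis 1. The name `batchnorm_sketch` is hypothetical.

```python
import numpy as np


def batchnorm_sketch(x, scale, bias, input_mean, input_var, epsilon=1e-5):
    """Y = (X - input_mean) / sqrt(input_var + epsilon) * scale + B, per channel."""
    # Reshape (C,) parameters to (1, C, 1, ..., 1) so they broadcast over N and D1..Dn.
    shape = (1, -1) + (1,) * (x.ndim - 2)
    scale, bias, mean, var = (
        np.asarray(p).reshape(shape) for p in (scale, bias, input_mean, input_var)
    )
    return (x - mean) / np.sqrt(var + epsilon) * scale + bias


# One sample, two channels, two values per channel: shape (N=1, C=2, D1=2).
x = np.array([[[1.0, 2.0], [3.0, 4.0]]])
normalized = batchnorm_sketch(
    x,
    scale=np.array([2.0, 1.0]),
    bias=np.array([0.0, 10.0]),
    input_mean=np.array([1.5, 3.5]),
    input_var=np.array([0.25, 0.25]),
    epsilon=0.0,  # zero here only to keep the arithmetic exact
)
```

With these values, each channel is centered, divided by 0.5, then scaled and shifted by its own `scale` and `bias`.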

function `numpy_flatten`

Flatten a tensor into a 2d array.

See https://github.com/onnx/onnx/blob/main/docs/Changelog.md#Flatten-13.

Args:

x (numpy.ndarray): tensor to flatten
axis (int): axis after which all dimensions will be flattened (axis=0 gives a 1D output)

Returns:

result: flattened tensor
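
The role of `axis` can be sketched as follows. This illustration follows the argument description above literally (leading dimensions before `axis` are kept, the remaining ones are merged), which is an assumption about the behavior. `flatten_sketch` is a hypothetical name.

```python
import numpy as np


def flatten_sketch(x, axis=1):
    """Keep the dimensions before `axis` and merge all dimensions from `axis` onward."""
    output_shape = (*x.shape[:axis], int(np.prod(x.shape[axis:])))
    return (np.reshape(x, output_shape),)


x = np.zeros((2, 3, 4))

(flat_default,) = flatten_sketch(x)        # batch dimension kept, the rest merged
(flat_all,) = flatten_sketch(x, axis=0)    # everything merged into one dimension
```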

function `numpy_or`

Compute or in numpy according to ONNX spec.

See https://github.com/onnx/onnx/blob/main/docs/Changelog.md#Or-7

Args:

a (numpy.ndarray): Input tensor
b (numpy.ndarray): Input tensor

Returns:

Tuple[numpy.ndarray]: Output tensor

function `numpy_or_float`

Compute or in numpy according to ONNX spec and cast outputs to floats.

See https://github.com/onnx/onnx/blob/main/docs/Changelog.md#Or-7

Args:

a (numpy.ndarray): Input tensor
b (numpy.ndarray): Input tensor

Returns:

Tuple[numpy.ndarray]: Output tensor

function `numpy_round`

Compute round in numpy according to ONNX spec.

See https://github.com/onnx/onnx/blob/main/docs/Changelog.md#Round-11

Note that the ONNX Round operator is actually a rint, since the number of decimals is forced to be 0.

Args:

a (numpy.ndarray): Input tensor whose elements are to be rounded.

Returns:

Tuple[numpy.ndarray]: Output tensor with rounded input elements.
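
The rint behavior mentioned above means that halfway values are rounded to the nearest even integer. A short sketch with `numpy.rint` shows the effect (`round_sketch` is a hypothetical name used only for this example).

```python
import numpy as np


def round_sketch(a):
    """Round to the nearest integer, with halves going to the nearest even value."""
    return (np.rint(a),)


(rounded,) = round_sketch(np.array([0.5, 1.5, 2.5, -1.5, 1.2]))
```

Both 1.5 and 2.5 round to 2.0, which differs from the "round half up" rule many readers expect.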

function `numpy_pow`

Compute pow in numpy according to ONNX spec.

See https://github.com/onnx/onnx/blob/main/docs/Changelog.md#Pow-13

Args:

a (numpy.ndarray): Input tensor whose elements are to be raised.
b (numpy.ndarray): The power to which the elements are raised.

Returns:

Tuple[numpy.ndarray]: Output tensor.

function `numpy_floor`

Compute Floor in numpy according to ONNX spec.

See https://github.com/onnx/onnx/blob/main/docs/Changelog.md#Floor-1

Args:

x (numpy.ndarray): Input tensor

Returns:

Tuple[numpy.ndarray]: Output tensor

function `numpy_max`

Compute Max in numpy according to ONNX spec.

Computes the max between the first input and a float constant.

See https://github.com/onnx/onnx/blob/main/docs/Changelog.md#Max-1

Args:

a (numpy.ndarray): Input tensor
b (numpy.ndarray): Constant tensor to compare to the first input

Returns:

Tuple[numpy.ndarray]: Output tensor

function `numpy_min`

Compute Min in numpy according to ONNX spec.

Computes the minimum between the first input and a float constant.

See https://github.com/onnx/onnx/blob/main/docs/Changelog.md#Min-1

Args:

a (numpy.ndarray): Input tensor
b (numpy.ndarray): Constant tensor to compare to the first input

Returns:

Tuple[numpy.ndarray]: Output tensor
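
Taking the max or min against a constant is how clipping-style operations are typically expressed. The sketch below uses `numpy.maximum` and `numpy.minimum` as stand-ins. The `_sketch` names are hypothetical and the constants are chosen only for illustration.

```python
import numpy as np


def max_sketch(a, b):
    """Element-wise maximum between the input and a constant."""
    return (np.maximum(a, b),)


def min_sketch(a, b):
    """Element-wise minimum between the input and a constant."""
    return (np.minimum(a, b),)


a = np.array([-1.0, 0.5, 3.0])

(lower_bounded,) = max_sketch(a, np.array(0.0))   # ReLU-like: negatives become 0
(upper_bounded,) = min_sketch(a, np.array(1.0))   # values above 1 become 1
```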

function `numpy_sign`

Compute Sign in numpy according to ONNX spec.

See https://github.com/onnx/onnx/blob/main/docs/Changelog.md#Sign-9

Args:

x (numpy.ndarray): Input tensor

Returns:

Tuple[numpy.ndarray]: Output tensor

function `numpy_neg`

Compute Negative in numpy according to ONNX spec.

See https://github.com/onnx/onnx/blob/main/docs/Changelog.md#Neg-13

Args:

x (numpy.ndarray): Input tensor

Returns:

Tuple[numpy.ndarray]: Output tensor

function `numpy_concatenate`

Apply concatenate in numpy according to ONNX spec.

See https://github.com/onnx/onnx/blob/main/docs/Changelog.md#concat-13

Args:

*x (numpy.ndarray): Input tensors to be concatenated.
axis (int): Which axis to concat on.

Returns:

Tuple[numpy.ndarray]: Output tensor.
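
A variadic signature like the one above can be sketched with `numpy.concatenate`. The keyword-only `axis` and the name `concatenate_sketch` are assumptions made for this example.

```python
import numpy as np


def concatenate_sketch(*x, axis):
    """Join any number of tensors along an existing axis."""
    return (np.concatenate(x, axis=axis),)


left = np.ones((2, 3))
right = np.zeros((2, 1))

(joined,) = concatenate_sketch(left, right, axis=1)
```

All inputs must agree on every dimension except the one being concatenated.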

class `ONNXMixedFunction`

A mixed quantized-raw valued onnx function.

ONNX functions will take inputs which can be either quantized or float. Some functions only take quantized inputs, but some functions take both types. For mixed functions we need to tag the parameters that do not need quantization. Thus quantized ops can know which inputs are not QuantizedArray and we avoid unnecessary wrapping of float values as QuantizedArrays.

method `__init__`

Create the mixed function and raw parameter list.

Args:

function (Any): function to be decorated
non_quant_params (Set[str]): set of parameters that will not be quantized (stored as numpy.ndarray)
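
The tagging idea described above can be pictured as a small wrapper class plus a decorator. Everything in this sketch is hypothetical (`MixedFunctionSketch`, `raw_args_sketch`, `reshape_sketch`). It shows only the pattern of attaching a set of non-quantized parameter names to a function, and is not the library's implementation.

```python
from typing import Any, Callable, Set

import numpy as np


class MixedFunctionSketch:
    """Wrap a function and remember which parameters must stay raw (not quantized)."""

    def __init__(self, function: Callable[..., Any], non_quant_params: Set[str]):
        self.function = function
        self.non_quant_params = non_quant_params

    def __call__(self, *args: Any, **kwargs: Any) -> Any:
        # Calls are forwarded unchanged: the tag is metadata for the caller.
        return self.function(*args, **kwargs)


def raw_args_sketch(*names: str) -> Callable[[Callable[..., Any]], MixedFunctionSketch]:
    """Decorator factory that tags the given parameter names as non-quantized."""

    def decorator(function: Callable[..., Any]) -> MixedFunctionSketch:
        return MixedFunctionSketch(function, set(names))

    return decorator


@raw_args_sketch("shape")
def reshape_sketch(x, shape):
    """The target shape is a plain value and never needs quantization."""
    return (np.reshape(x, shape),)


(reshaped,) = reshape_sketch(np.arange(6), (2, 3))
```

A quantized operator could then inspect `reshape_sketch.non_quant_params` to decide which inputs to leave as plain arrays.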

compile(
    X: ndarray,
    configuration: Optional[Configuration] = None,
    compilation_artifacts: Optional[DebugArtifacts] = None,
    show_mlir: bool = False,
    use_virtual_lib: bool = False,
    p_error: Optional[float] = None,
    global_p_error: Optional[float] = None,
    verbose_compilation: bool = False
) → Circuit

numpy_maxpool(
    x: ndarray,
    kernel_shape: Tuple[int, ...],
    strides: Tuple[int, ...] = None,
    auto_pad: str = 'NOTSET',
    pads: Tuple[int, ...] = None,
    dilations: Optional[Union[Tuple[int, ...], List[int]]] = None,
    ceil_mode: int = 0,
    storage_order: int = 0
) → Tuple[ndarray]

numpy_batchnorm(
    x: ndarray,
    scale: ndarray,
    bias: ndarray,
    input_mean: ndarray,
    input_var: ndarray,
    epsilon=1e-05,
    momentum=0.9,
    training_mode=0
) → Tuple[ndarray]

__init__(
    n_bits_output: int,
    int_input_names: Set[str] = None,
    constant_inputs: Optional[Union[Dict[str, Any], Dict[int, Any]]] = None,
    input_quant_opts: QuantizationOptions = None,
    **attrs
) → None

__init__(
    n_bits_output: int,
    int_input_names: Set[str] = None,
    constant_inputs: Optional[Union[Dict[str, Any], Dict[int, Any]]] = None,
    input_quant_opts: Optional[QuantizationOptions] = None,
    **attrs
) → None
