# concrete.ml.quantization.post\_training.md

[![](https://img.shields.io/badge/-source-cccccc?style=flat-square)](https://github.com/zama-ai/concrete-ml/blob/release/1.1.x/src/concrete/ml/quantization/post_training.py#L0)

## module `concrete.ml.quantization.post_training`

Post Training Quantization methods.

### **Global Variables**

* **ONNX\_OPS\_TO\_NUMPY\_IMPL**
* **DEFAULT\_MODEL\_BITS**
* **ONNX\_OPS\_TO\_QUANTIZED\_IMPL**

***

[![](https://img.shields.io/badge/-source-cccccc?style=flat-square)](https://github.com/zama-ai/concrete-ml/blob/release/1.1.x/src/concrete/ml/quantization/post_training.py#L26)

### function `get_n_bits_dict`

```python
get_n_bits_dict(n_bits: Union[int, Dict[str, int]]) → Dict[str, int]
```

Convert the n\_bits parameter into a proper dictionary.

**Args:**

* `n_bits` (int, Dict\[str, int]): Number of bits for quantization. Can be a single integer or a dictionary with the following keys:
  * "op\_inputs" and "op\_weights" (mandatory): control the quantization of the inputs and weights of all layers.
  * "model\_inputs" and "model\_outputs" (optional, default to 5 bits): "model\_inputs" gives the precision of the network's inputs, while "model\_outputs" gives the precision of the final network's outputs.

  When a single integer is passed, its value is assigned to "op\_inputs" and "op\_weights". The maximum of this value and a default value (5) is then assigned to "model\_inputs" and "model\_outputs". This default value is a compromise between model accuracy and runtime performance in FHE.

**Returns:**

* `n_bits_dict` (Dict\[str, int]): A dictionary properly representing the number of bits to use for quantization.
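
The expansion rule described above can be illustrated with a minimal sketch. This is not the library's code: `sketch_n_bits_dict` and `DEFAULT_BITS` are hypothetical names, and the behavior is reconstructed only from the argument description (default of 5 bits, mandatory "op\_inputs" and "op\_weights" keys). The real function may validate or fill in values differently.

```python
from typing import Dict, Union

# Assumed default, taken from the "(5)" mentioned in the argument description.
DEFAULT_BITS = 5


def sketch_n_bits_dict(n_bits: Union[int, Dict[str, int]]) -> Dict[str, int]:
    """Illustrative re-implementation of the documented rule (hypothetical helper)."""
    if isinstance(n_bits, int):
        # A single integer sets the op-level bits directly; the model-level
        # bits never go below the default.
        model_bits = max(n_bits, DEFAULT_BITS)
        return {
            "model_inputs": model_bits,
            "op_inputs": n_bits,
            "op_weights": n_bits,
            "model_outputs": model_bits,
        }

    # Dictionary form: the two op-level keys are mandatory.
    missing = {"op_inputs", "op_weights"} - n_bits.keys()
    if missing:
        raise ValueError(f"missing mandatory keys: {sorted(missing)}")

    # Optional keys fall back to the default, then user values override them.
    result = {"model_inputs": DEFAULT_BITS, "model_outputs": DEFAULT_BITS}
    result.update(n_bits)
    return result


small = sketch_n_bits_dict(3)
large = sketch_n_bits_dict(8)
partial = sketch_n_bits_dict({"op_inputs": 3, "op_weights": 2})
```

Under these assumptions, `small` keeps 3 bits for the operators but uses 5 bits for the model inputs and outputs, while `large` uses 8 bits everywhere.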

***

[![](https://img.shields.io/badge/-source-cccccc?style=flat-square)](https://github.com/zama-ai/concrete-ml/blob/release/1.1.x/src/concrete/ml/quantization/post_training.py#L92)

### class `ONNXConverter`

Base ONNX to Concrete ML computation graph conversion class.

This class provides a method to parse an ONNX graph and apply several transformations. First, it creates QuantizedOps for each ONNX graph op. These quantized ops have calibrated quantizers that are useful when the operators work on integer data or when the output of the ops is the output of the encrypted program. For operators that compute in float and will be merged to TLUs, these quantizers are not used. Second, this converter creates quantized tensors for the initializers and weights stored in the graph.

This class should be sub-classed to provide specific calibration and quantization options depending on the usage (Post-training quantization vs Quantization Aware training).

**Arguments:**

* `n_bits` (int, Dict\[str, int]): Number of bits for quantization. Can be a single integer or a dictionary with the following keys:
  * "op\_inputs" and "op\_weights" (mandatory): control the quantization of the inputs and weights of all layers.
  * "model\_inputs" and "model\_outputs" (optional, default to 5 bits): "model\_inputs" gives the precision of the network's inputs, while "model\_outputs" gives the precision of the final network's outputs.

  When a single integer is passed, its value is assigned to "op\_inputs" and "op\_weights". The maximum of this value and a default value (5) is then assigned to "model\_inputs" and "model\_outputs". This default value is a compromise between model accuracy and runtime performance in FHE.
* `numpy_model` (NumpyModule): Model in numpy.
* `rounding_threshold_bits` (int): if not None, every accumulator in the model is rounded down to the given number of bits of precision.

[![](https://img.shields.io/badge/-source-cccccc?style=flat-square)](https://github.com/zama-ai/concrete-ml/blob/release/1.1.x/src/concrete/ml/quantization/post_training.py#L128)

#### method `__init__`

```python
__init__(
    n_bits: Union[int, Dict],
    numpy_model: NumpyModule,
    rounding_threshold_bits: Optional[int] = None
)
```

***

**property n\_bits\_model\_inputs**

Get the number of bits to use for the quantization of the model's inputs.

**Returns:**

* `n_bits` (int): number of bits for input quantization

***

**property n\_bits\_model\_outputs**

Get the number of bits to use for the quantization of the last layer's output.

**Returns:**

* `n_bits` (int): number of bits for output quantization

***

**property n\_bits\_op\_inputs**

Get the number of bits to use for the quantization of any operators' inputs.

**Returns:**

* `n_bits` (int): number of bits for the quantization of the operators' inputs

***

**property n\_bits\_op\_weights**

Get the number of bits to use for the quantization of any constants (usually weights).

**Returns:**

* `n_bits` (int): number of bits for quantizing constants used by operators

***

[![](https://img.shields.io/badge/-source-cccccc?style=flat-square)](https://github.com/zama-ai/concrete-ml/blob/release/1.1.x/src/concrete/ml/quantization/post_training.py#L577)

#### method `quantize_module`

```python
quantize_module(*calibration_data: ndarray) → QuantizedModule
```

Quantize numpy module.

Following <https://arxiv.org/abs/1712.05877> guidelines.

**Args:**

* `*calibration_data (numpy.ndarray)`: Data that will be used to compute the bounds, scales and zero point values for every quantized object.

**Returns:**

* `QuantizedModule`: Quantized numpy module
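
To make the role of the calibration data concrete, here is a minimal sketch of how bounds, a scale and a zero point can be derived from sample values in a generic unsigned affine quantization scheme. The helper names (`affine_params`, `quantize`, `dequantize`) are hypothetical and the code is pure Python for illustration; it is not how Concrete ML implements its quantizers, which may differ in signedness, rounding and range handling.

```python
from typing import List, Tuple


def affine_params(values: List[float], n_bits: int) -> Tuple[float, int]:
    """Derive (scale, zero_point) from calibration values (hypothetical helper)."""
    v_min, v_max = min(values), max(values)
    levels = 2**n_bits - 1  # largest unsigned integer representable on n_bits
    # Guard against a constant calibration set, where the range is zero.
    scale = (v_max - v_min) / levels if v_max > v_min else 1.0
    # The zero point is the integer that the real value 0.0 maps to.
    zero_point = round(-v_min / scale)
    return scale, zero_point


def quantize(x: float, scale: float, zero_point: int, n_bits: int) -> int:
    """Map a real value to an integer, clipped to the n_bits unsigned range."""
    q = round(x / scale) + zero_point
    return max(0, min(2**n_bits - 1, q))


def dequantize(q: int, scale: float, zero_point: int) -> float:
    """Map a quantized integer back to an approximate real value."""
    return scale * (q - zero_point)


calibration = [-2.0, 0.0, 1.0, 5.0]
scale, zero_point = affine_params(calibration, n_bits=3)
quantized = [quantize(x, scale, zero_point, n_bits=3) for x in calibration]
restored = [dequantize(q, scale, zero_point) for q in quantized]
```

In this sketch the calibration bounds are -2.0 and 5.0, so with 3 bits the seven available steps give a scale of 1.0 and a zero point of 2. Values outside the calibrated range would be clipped, which is why representative calibration data matters.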

***

[![](https://img.shields.io/badge/-source-cccccc?style=flat-square)](https://github.com/zama-ai/concrete-ml/blob/release/1.1.x/src/concrete/ml/quantization/post_training.py#L719)

### class `PostTrainingAffineQuantization`

Post-training Affine Quantization.

Create the quantized version of the passed numpy module.

**Args:**

* `n_bits` (int, Dict): Number of bits to quantize the model. If an int is passed for n\_bits, the value will be used for activations, inputs and weights. If a dict is passed, it should contain the "model\_inputs", "op\_inputs", "op\_weights" and "model\_outputs" keys, each with the corresponding number of quantization bits:
  * model\_inputs: number of bits for the model input
  * op\_inputs: number of bits to quantize layer input values
  * op\_weights: number of bits to quantize learned parameters or constants in the network
  * model\_outputs: number of bits for the final model output
* `numpy_model` (NumpyModule): Model in numpy.
* `rounding_threshold_bits` (int): if not None, every accumulator in the model is rounded down to the given number of bits of precision.
* `is_signed`: Whether the weights of the layers can be signed. Currently, only the weights can be signed.

**Returns:**

* `QuantizedModule`: A quantized version of the numpy model.

[![](https://img.shields.io/badge/-source-cccccc?style=flat-square)](https://github.com/zama-ai/concrete-ml/blob/release/1.1.x/src/concrete/ml/quantization/post_training.py#L128)

#### method `__init__`

```python
__init__(
    n_bits: Union[int, Dict],
    numpy_model: NumpyModule,
    rounding_threshold_bits: Optional[int] = None
)
```

***

**property n\_bits\_model\_inputs**

Get the number of bits to use for the quantization of the model's inputs.

**Returns:**

* `n_bits` (int): number of bits for input quantization

***

**property n\_bits\_model\_outputs**

Get the number of bits to use for the quantization of the last layer's output.

**Returns:**

* `n_bits` (int): number of bits for output quantization

***

**property n\_bits\_op\_inputs**

Get the number of bits to use for the quantization of any operators' inputs.

**Returns:**

* `n_bits` (int): number of bits for the quantization of the operators' inputs

***

**property n\_bits\_op\_weights**

Get the number of bits to use for the quantization of any constants (usually weights).

**Returns:**

* `n_bits` (int): number of bits for quantizing constants used by operators

***

[![](https://img.shields.io/badge/-source-cccccc?style=flat-square)](https://github.com/zama-ai/concrete-ml/blob/release/1.1.x/src/concrete/ml/quantization/post_training.py#L577)

#### method `quantize_module`

```python
quantize_module(*calibration_data: ndarray) → QuantizedModule
```

Quantize numpy module.

Following <https://arxiv.org/abs/1712.05877> guidelines.

**Args:**

* `*calibration_data (numpy.ndarray)`: Data that will be used to compute the bounds, scales and zero point values for every quantized object.

**Returns:**

* `QuantizedModule`: Quantized numpy module

***

[![](https://img.shields.io/badge/-source-cccccc?style=flat-square)](https://github.com/zama-ai/concrete-ml/blob/release/1.1.x/src/concrete/ml/quantization/post_training.py#L863)

### class `PostTrainingQATImporter`

Converter of Quantization Aware Training networks.

This class provides specific configuration for QAT networks during ONNX network conversion to Concrete ML computation graphs.

[![](https://img.shields.io/badge/-source-cccccc?style=flat-square)](https://github.com/zama-ai/concrete-ml/blob/release/1.1.x/src/concrete/ml/quantization/post_training.py#L128)

#### method `__init__`

```python
__init__(
    n_bits: Union[int, Dict],
    numpy_model: NumpyModule,
    rounding_threshold_bits: Optional[int] = None
)
```

***

**property n\_bits\_model\_inputs**

Get the number of bits to use for the quantization of the model's inputs.

**Returns:**

* `n_bits` (int): number of bits for input quantization

***

**property n\_bits\_model\_outputs**

Get the number of bits to use for the quantization of the last layer's output.

**Returns:**

* `n_bits` (int): number of bits for output quantization

***

**property n\_bits\_op\_inputs**

Get the number of bits to use for the quantization of any operators' inputs.

**Returns:**

* `n_bits` (int): number of bits for the quantization of the operators' inputs

***

**property n\_bits\_op\_weights**

Get the number of bits to use for the quantization of any constants (usually weights).

**Returns:**

* `n_bits` (int): number of bits for quantizing constants used by operators

***

[![](https://img.shields.io/badge/-source-cccccc?style=flat-square)](https://github.com/zama-ai/concrete-ml/blob/release/1.1.x/src/concrete/ml/quantization/post_training.py#L577)

#### method `quantize_module`

```python
quantize_module(*calibration_data: ndarray) → QuantizedModule
```

Quantize numpy module.

Following <https://arxiv.org/abs/1712.05877> guidelines.

**Args:**

* `*calibration_data (numpy.ndarray)`: Data that will be used to compute the bounds, scales and zero point values for every quantized object.

**Returns:**

* `QuantizedModule`: Quantized numpy module


