# Nearest neighbors

This document introduces the non-parametric nearest neighbors classification model that Concrete ML provides through the `KNeighborsClassifier` class, which follows the scikit-learn interface.

|                                                                            Concrete ML                                                                            | scikit-learn                                                                                                          |
| :---------------------------------------------------------------------------------------------------------------------------------------------------------------: | --------------------------------------------------------------------------------------------------------------------- |
| [KNeighborsClassifier](https://github.com/zama-ai/concrete-ml/blob/release/1.9.x/docs/references/api/concrete.ml.sklearn.neighbors.md#class-kneighborsclassifier) | [KNeighborsClassifier](https://scikit-learn.org/stable/modules/generated/sklearn.neighbors.KNeighborsClassifier.html) |

## Ciphertext format compatibility

These models only support *Concrete* ciphertexts. See the [ciphertext formats](/concrete-ml/get-started/concepts.md#ciphertext-formats) documentation for more details.

## Example

```python
from concrete.ml.sklearn import KNeighborsClassifier

concrete_classifier = KNeighborsClassifier(n_bits=2, n_neighbors=3)
```

## Quantization parameters

The `KNeighborsClassifier` class quantizes the training data-set provided to `.fit` using the specified number of bits (`n_bits`). To comply with [accumulator size constraints](/concrete-ml/get-started/concepts.md#model-accuracy-considerations-under-fhe-constraints), you must keep this value low. The model's accuracy will depend significantly on a well-chosen `n_bits` value and the dimensionality of the data.
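To make the effect of `n_bits` concrete, here is a minimal sketch of uniform quantization in plain Python. It is an illustration only: the `quantize` helper is hypothetical and is not Concrete ML's quantizer, which may compute its scale and offset differently. It shows how a low bit-width collapses nearby feature values onto the same integer.

```python
def quantize(values, n_bits):
    """Map floats to integers in [0, 2**n_bits - 1] (illustrative helper, not the Concrete ML API)."""
    lo, hi = min(values), max(values)
    levels = 2 ** n_bits - 1
    if hi == lo:
        # Degenerate case: a constant feature maps to a single level
        return [0 for _ in values]
    scale = (hi - lo) / levels
    return [round((v - lo) / scale) for v in values]


feature = [0.0, 0.1, 0.4, 0.6, 0.9, 1.0]

q2 = quantize(feature, n_bits=2)  # 4 levels available
q3 = quantize(feature, n_bits=3)  # 8 levels available

print("n_bits=2:", q2, "distinct values:", len(set(q2)))
print("n_bits=3:", q3, "distinct values:", len(set(q3)))
```

With fewer bits, more training points become indistinguishable after quantization, which is one reason accuracy is sensitive to this parameter.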

The `predict` method of the `KNeighborsClassifier` performs the following steps:

1. Quantize the test vectors, performed on clear data
2. Compute the class indices of the top-k closest training set vectors, performed on encrypted data
3. Vote among the top-k class labels to find the class for each test vector, performed on clear data
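The three steps above can be sketched in plain Python without any encryption. This is a simplified illustration under stated assumptions: the helper names are hypothetical, the distance is assumed to be the squared Euclidean distance, and step 2 uses an ordinary sort where Concrete ML runs an FHE circuit.

```python
from collections import Counter


def quantize_vector(vec, lo, scale, n_bits):
    """Step 1 (clear): map a test vector onto the integer grid, clipping to the valid range."""
    levels = 2 ** n_bits - 1
    return [min(levels, max(0, round((v - lo) / scale))) for v in vec]


def top_k_labels(q_train, labels, q_test, k):
    """Step 2 (encrypted in Concrete ML, plaintext here): labels of the k closest training vectors."""
    dists = [sum((a - b) ** 2 for a, b in zip(row, q_test)) for row in q_train]
    order = sorted(range(len(dists)), key=lambda i: dists[i])
    return [labels[i] for i in order[:k]]


def majority_vote(top_labels):
    """Step 3 (clear): pick the most frequent label among the k neighbors."""
    return Counter(top_labels).most_common(1)[0][0]


def predict_one(q_train, labels, x, lo, scale, n_bits, k):
    q_test = quantize_vector(x, lo, scale, n_bits)
    return majority_vote(top_k_labels(q_train, labels, q_test, k))


# Toy training set, already quantized to 2 bits (values in 0..3)
q_train = [[0, 0], [0, 1], [1, 0], [3, 3], [3, 2], [2, 3]]
labels = [0, 0, 0, 1, 1, 1]

# Quantization parameters assumed to come from the training data (range [0, 1], 2 bits)
lo, scale, n_bits = 0.0, 1.0 / 3, 2

print(predict_one(q_train, labels, [0.9, 0.8], lo, scale, n_bits, k=3))
print(predict_one(q_train, labels, [0.1, 0.2], lo, scale, n_bits, k=3))
```

Only the middle step needs to run on encrypted data, since it is the only one that combines the client's input with the training set.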

## Inference time considerations

The FHE inference latency of this model is heavily influenced by `n_bits` and the dimensionality of the data. Additionally, the data-set size has a linear impact on the computational complexity. The number of nearest neighbors (`n_neighbors`) also affects performance.

The KNN computation executes in FHE in $$O(N \log^2 k)$$ steps, where $$N$$ is the training data-set size and $$k$$ is `n_neighbors`. Each step requires several [PBS operations](/concrete-ml/get-started/concepts.md#cryptography-concepts), with their runtime affected by the factors listed above. These factors determine the precision needed to represent the distances between test vectors and training data-set vectors. The PBS input precision required by the circuit is related to the precision of the distance values.
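As a rough back-of-the-envelope aid, the sketch below estimates how these factors interact. It rests on assumptions not stated on this page: distances are taken to be squared Euclidean distances between unsigned `n_bits` vectors, and the step count ignores all constant factors. Both helper functions are hypothetical and are not part of Concrete ML.

```python
import math


def distance_bit_width(n_bits, n_features):
    """Bits needed to hold the largest squared Euclidean distance (assumed metric)."""
    max_dist = n_features * (2 ** n_bits - 1) ** 2
    return max(1, max_dist.bit_length())


def relative_step_count(n_train, k):
    """N * log2(k)^2, constants ignored; only meaningful for comparing settings with k >= 2."""
    return n_train * math.log2(k) ** 2


# Precision grows with both n_bits and the number of features
for n_bits, n_features in [(2, 2), (2, 10), (3, 10)]:
    print(f"n_bits={n_bits}, features={n_features}: "
          f"{distance_bit_width(n_bits, n_features)} bits for distances")

# Step count is linear in the training set size and grows with n_neighbors
for n_train, k in [(1000, 4), (2000, 4), (1000, 16)]:
    print(f"N={n_train}, k={k}: relative steps = {relative_step_count(n_train, k):.0f}")
```

Under these assumptions, doubling the training set doubles the number of steps, while raising `n_bits` or the number of features increases the precision each step must handle.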
