Post-training Quantization with NNCF (new)

@sphinxdirective

.. toctree::
   :maxdepth: 1
   :hidden:

   basic_quantization_flow
   quantization_w_accuracy_control

Neural Network Compression Framework (NNCF) provides a new post-training quantization API in Python. It is designed to reuse the training or validation code that usually ships with a model in its source framework, for example, PyTorch or TensorFlow. The API is cross-framework and currently supports models represented in the following frameworks: PyTorch, TensorFlow 2.x, ONNX, and OpenVINO.

The API offers two ways to apply 8-bit post-training quantization:

* :doc:`Basic quantization <basic_quantization_flow>` - the simplest quantization flow, which applies 8-bit integer quantization to the model.
* :doc:`Quantization with accuracy control <quantization_w_accuracy_control>` - the most advanced quantization flow, which applies 8-bit quantization to the model while keeping the accuracy drop under control.
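
The two flows can be sketched roughly as follows. This is a minimal illustration, not the reference implementation: it assumes the ``nncf`` package is installed and that the data source yields ``(input, label)`` pairs. The helper names ``transform_fn``, ``quantize_basic``, and ``quantize_with_control`` are hypothetical and exist only for this example. The validation function is supplied by the caller because its body depends on the framework and the metric.

```python
from typing import Any, Callable, Iterable, Tuple


def transform_fn(data_item: Tuple[Any, Any]) -> Any:
    """Keep only the model input from an (input, label) pair.

    Assumption: the existing training/validation data source yields pairs.
    """
    inputs, _label = data_item
    return inputs


def quantize_basic(model: Any, data_source: Iterable) -> Any:
    """Basic flow: 8-bit quantization driven by calibration data only."""
    import nncf  # imported lazily; requires the `nncf` package

    # Wrap the existing data source so NNCF can draw calibration samples.
    calibration_dataset = nncf.Dataset(data_source, transform_fn)
    return nncf.quantize(model, calibration_dataset)


def quantize_with_control(
    model: Any,
    calibration_source: Iterable,
    validation_source: Iterable,
    validation_fn: Callable[..., Any],
    max_drop: float = 0.01,
) -> Any:
    """Accuracy-control flow: quantize while bounding the metric drop.

    `validation_fn` is user-provided: it receives a model and the
    validation data and returns the metric value to be preserved.
    """
    import nncf  # imported lazily; requires the `nncf` package

    calibration_dataset = nncf.Dataset(calibration_source, transform_fn)
    validation_dataset = nncf.Dataset(validation_source, transform_fn)
    return nncf.quantize_with_accuracy_control(
        model,
        calibration_dataset=calibration_dataset,
        validation_dataset=validation_dataset,
        validation_fn=validation_fn,
        max_drop=max_drop,  # largest acceptable drop of the metric
    )
```

In both cases the data loader already used for training or validation can usually be passed as the data source unchanged, which is the code reuse the API is aiming for. Backend support for accuracy control may be narrower than for basic quantization, so check the linked pages for the frameworks covered.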

Additional Resources
####################

* `NNCF GitHub <https://github.com/openvinotoolkit/nncf>`__
* :doc:`Optimizing Models at Training Time <tmo_introduction>`

@endsphinxdirective
