DOCS shift to rst - Post-training Quantization with NNCF (#16631)
Commit 712d1b99d1 (parent b0e6b1e83c)
# Basic Quantization Flow {#basic_quantization_flow}

@sphinxdirective

.. toctree::
   :maxdepth: 1
   :hidden:

   basic_quantization_flow
   quantization_w_accuracy_control

Neural Network Compression Framework (NNCF) provides a new post-training quantization API, available in Python, that is aimed at reusing the model training or validation code usually available with the model in the source framework, for example, PyTorch or TensorFlow. The API is cross-framework and currently supports models represented in the following frameworks: PyTorch, TensorFlow 2.x, ONNX, and OpenVINO.

This API has two main capabilities for applying 8-bit post-training quantization:

* :doc:`Basic quantization <basic_quantization_flow>` - the simplest quantization flow, which applies 8-bit integer quantization to the model.
* :doc:`Quantization with accuracy control <quantization_w_accuracy_control>` - the most advanced quantization flow, which applies 8-bit quantization to the model with accuracy control.

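For orientation, a minimal sketch of the basic flow on an OpenVINO IR model is shown below. It is only an illustration: the model path and the randomly generated calibration data are placeholders (real representative data should be used), and argument names can vary between NNCF releases.

.. code-block:: python

   import numpy as np
   import nncf
   import openvino.runtime as ov

   core = ov.Core()
   model = core.read_model("model.xml")  # placeholder path to an FP32 IR model

   # Toy calibration data: random samples shaped like the model input
   # (1x3x224x224 is an assumption; use real representative data in practice).
   data_source = [np.random.rand(1, 3, 224, 224).astype(np.float32) for _ in range(300)]

   def transform_fn(data_item):
       # Convert one item from the data source into the model input format.
       return data_item

   calibration_dataset = nncf.Dataset(data_source, transform_fn)
   quantized_model = nncf.quantize(model, calibration_dataset)

The resulting ``quantized_model`` can then be saved and compiled like any other OpenVINO model.
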
Additional Resources
####################

* `NNCF GitHub <https://github.com/openvinotoolkit/nncf>`__
* :doc:`Optimizing Models at Training Time <tmo_introduction>`

@endsphinxdirective

Introduction
####################

This is the advanced quantization flow that applies 8-bit quantization to the model with control of the accuracy metric. This is achieved by keeping the most impactful operations within the model in the original precision. The flow is based on the :doc:`Basic 8-bit quantization <basic_quantization_flow>` flow and has the following differences:

* Besides the calibration dataset, a **validation dataset** is required to compute the accuracy metric. In the simplest case, they can refer to the same data.
* A **validation function**, used to compute the accuracy metric, is required. It can be a function that is already available in the source framework or a custom function.
* Since accuracy validation is run several times during the quantization process, quantization with accuracy control can take more time than the :doc:`Basic 8-bit quantization <basic_quantization_flow>` flow.
* The resulting model can provide a smaller performance improvement than the :doc:`Basic 8-bit quantization <basic_quantization_flow>` flow because some of the operations are kept in the original precision.

.. note:: Currently, this flow is available only for models in OpenVINO representation.

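To make these differences concrete, a rough sketch of the accuracy-control call for an OpenVINO model is shown below. The ``validate`` function is a user-written stub here, the two datasets are prepared as described in the next step, and the ``max_drop`` value is only an example; exact parameter names may differ between NNCF versions.

.. code-block:: python

   import nncf

   def validate(model, validation_data) -> float:
       # Run inference over validation_data and return the accuracy metric.
       # This can wrap a validation loop already available in the source framework.
       ...

   quantized_model = nncf.quantize_with_accuracy_control(
       model,                                   # original OpenVINO model
       calibration_dataset=calibration_dataset,
       validation_dataset=validation_dataset,
       validation_fn=validate,
       max_drop=0.01,                           # tolerated accuracy drop (example value)
   )
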
The steps for the quantization with accuracy control are described below.

Prepare datasets
####################

This step is similar to the :doc:`Basic 8-bit quantization <basic_quantization_flow>` flow. The only difference is that two datasets, calibration and validation, are required.

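As an illustration (not the framework-specific snippet from the tabs below), both datasets could be wrapped with ``nncf.Dataset`` from existing data loaders. Here ``calibration_loader`` and ``validation_loader`` are placeholder names for user-provided iterables, and the ``(images, labels)`` item layout is an assumption.

.. code-block:: python

   import nncf

   def transform_fn(data_item):
       # Assumes each item is an (images, labels) pair; only the inputs
       # are needed to feed the model during calibration.
       images, _ = data_item
       return images

   # calibration_loader and validation_loader are user-provided iterables (placeholders).
   calibration_dataset = nncf.Dataset(calibration_loader, transform_fn)
   validation_dataset = nncf.Dataset(validation_loader, transform_fn)

In the simplest case, the same data loader can back both datasets, as noted above.
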
.. tab:: OpenVINO