[POT] Update AccuracyAware doc (#8261)

* Update AA doc with GNA

* Apply comments

* Update doc with note
Nikita Malinin 2021-11-02 13:31:51 +03:00 committed by GitHub
parent b69a371b8f
commit 2dadf50a68


@ -4,6 +4,11 @@
The AccuracyAware algorithm is designed to perform accurate 8-bit quantization while keeping the model within a
pre-defined range of accuracy drop, for example 1%, specified by the user in the configuration file. This may cause
performance degradation compared to the [DefaultQuantization](../default/README.md) algorithm because some layers can be reverted to their original precision.
> **NOTE**: In the case of the GNA `target_device`, POT moves INT8 weights to INT16 to stay within the pre-defined range of accuracy drop. Thus, the algorithm works with the `performance` (INT8) preset only.
> For the `accuracy` preset, this algorithm is not performed, but hyperparameters tuning is available (if the `tune_hyperparams` option is enabled).
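As an illustration, below is a minimal sketch of the `"compression"` section of a POT configuration file that enables this algorithm. Since JSON does not allow comments, the assumptions are stated here: the algorithm name `AccuracyAwareQuantization`, the `maximal_drop` parameter (the allowed accuracy drop expressed as a fraction), and the `stat_subset_size` value reflect common POT usage and may differ between POT versions, so check them against your installation.

```json
{
    "compression": {
        "target_device": "GNA",
        "algorithms": [
            {
                "name": "AccuracyAwareQuantization",
                "params": {
                    "preset": "performance",
                    "stat_subset_size": 300,
                    "maximal_drop": 0.01,
                    "tune_hyperparams": true
                }
            }
        ]
    }
}
```

Here `"maximal_drop": 0.01` corresponds to the 1% accuracy-drop budget mentioned above, and `"tune_hyperparams": true` enables the parameters tuning referred to in the note.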
Generally, the algorithm consists of the following steps:
1. The model gets fully quantized using the DefaultQuantization algorithm.
2. The quantized and full-precision models are compared on a subset of the validation set to find mismatches in the target accuracy metric. A ranking subset is extracted based on these mismatches.