[POT] Update AccuracyAware doc (#8261)
* Update AA doc with GNA
* Apply comments
* Update doc with note
parent b69a371b8f
commit 2dadf50a68
@@ -4,6 +4,11 @@
The AccuracyAware algorithm is designed to perform accurate 8-bit quantization and allows the model to stay in the
pre-defined range of accuracy drop, for example 1%, defined by the user in the configuration file. This may cause a
degradation in performance in comparison to the [DefaultQuantization](../default/README.md) algorithm because some layers can be reverted to the original precision.

> **NOTE**:
When `target_device` is GNA, POT moves INT8 weights to INT16 to stay within the pre-defined range of the accuracy drop. The algorithm therefore works with the `performance` (INT8) preset only.
For the `accuracy` preset, this algorithm is not run, but parameter tuning is still available if the `tune_hyperparams` option is enabled.
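
To make the note concrete, here is a minimal sketch of how such settings might look when written as a Python dict. Only `target_device`, the `performance` and `accuracy` preset names, and `tune_hyperparams` come from the text above; the algorithm name `AccuracyAwareQuantization`, the `maximal_drop` key, and the helper `runs_on_gna` are assumptions made for illustration and should be checked against the POT documentation.

```python
# Illustrative POT-style settings as a Python dict (a sketch, not a verified config).
algorithms = [
    {
        "name": "AccuracyAwareQuantization",  # assumed algorithm identifier
        "params": {
            "target_device": "GNA",
            "preset": "performance",    # INT8 preset, the only one the algorithm runs for on GNA
            "maximal_drop": 0.01,       # assumed key name: allowed accuracy drop of 1%
            "tune_hyperparams": False,  # parameter tuning is a separate, optional switch
        },
    }
]


def runs_on_gna(preset):
    """Hypothetical helper restating the note: on GNA, the accuracy-aware
    step applies only to the `performance` preset."""
    return preset == "performance"


params = algorithms[0]["params"]
applies = runs_on_gna(params["preset"])
```

With the `accuracy` preset, `runs_on_gna` returns `False`, which mirrors the statement that only parameter tuning remains available in that case.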

Generally, the algorithm consists of the following steps:
1. The model gets fully quantized using the DefaultQuantization algorithm.
2. The quantized and full-precision models are compared on a subset of the validation set in order to find mismatches in the target accuracy metric. A ranking subset is extracted based on the mismatches.
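
The comparison in step 2 can be sketched with toy stand-ins. This is an illustration only, not POT's implementation: the two "models" are plain functions, quantization is imitated by coarse rounding, and a mismatch is assumed to mean that the per-sample metric differs between the two models. All names here (`ranking_subset`, `fp_model`, `q_model`, `hit`) are hypothetical.

```python
def ranking_subset(fp_model, q_model, samples, labels, metric):
    """Return indices of samples whose per-sample metric differs
    between the full-precision and the quantized model."""
    mismatches = []
    for i, (x, y) in enumerate(zip(samples, labels)):
        if metric(fp_model(x), y) != metric(q_model(x), y):
            mismatches.append(i)
    return mismatches


def fp_model(x):
    # Toy full-precision classifier: thresholds the raw input.
    return int(x > 0.5)


def q_model(x):
    # Toy "quantized" classifier: the input is rounded to one decimal first.
    return int(round(x, 1) > 0.5)


def hit(prediction, label):
    # Per-sample accuracy: 1 for a correct prediction, 0 otherwise.
    return int(prediction == label)


samples = [0.1, 0.52, 0.9, 0.54]
labels = [0, 1, 1, 1]
subset = ranking_subset(fp_model, q_model, samples, labels, hit)
```

Here the rounding pushes the second and fourth samples across the decision threshold, so they are the ones collected into the ranking subset.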