[POT] Update AccuracyAware doc (#8261)
* Update AA doc with GNA
* Apply comments
* Update doc with note
This commit is contained in: parent b69a371b8f, commit 2dadf50a68
@@ -4,6 +4,11 @@
The AccuracyAware algorithm is designed to perform accurate 8-bit quantization while keeping the model within a pre-defined range of accuracy drop, for example 1%, specified by the user in the configuration file. This may cause performance degradation compared to the [DefaultQuantization](../default/README.md) algorithm because some layers can be reverted back to the original precision.
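For illustration, here is a minimal sketch of an algorithm configuration that requests at most a 1% accuracy drop, written as a Python dict as used with the POT API. The parameter names follow the POT configuration schema; the values are assumptions for this example, not recommendations:

```python
# Sketch of an AccuracyAwareQuantization configuration
# (values are illustrative assumptions).
algorithms = [
    {
        "name": "AccuracyAwareQuantization",
        "params": {
            "target_device": "CPU",    # device the model is optimized for
            "preset": "performance",   # symmetric INT8 quantization
            "stat_subset_size": 300,   # samples used to collect statistics
            "maximal_drop": 0.01,      # allowed accuracy drop: 1%
        },
    }
]
```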
> **NOTE**: In the case of the GNA `target_device`, POT moves INT8 weights to INT16 to stay within the pre-defined range of accuracy drop. Thus, the algorithm works only for the `performance` (INT8) preset. For the `accuracy` preset, this algorithm is not performed, but parameter tuning is still available (if the `tune_hyperparams` option is enabled).
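As a sketch of the GNA case described in the note, the configuration below keeps the `accuracy` preset and enables parameter tuning via `tune_hyperparams`; treat the exact combination as an assumption and check it against your POT version:

```python
# Sketch of a GNA configuration: the accuracy-aware search itself is not
# performed for the "accuracy" preset, but hyperparameter tuning is enabled.
algorithms = [
    {
        "name": "AccuracyAwareQuantization",
        "params": {
            "target_device": "GNA",
            "preset": "accuracy",       # INT16 weights on GNA
            "maximal_drop": 0.01,
            "tune_hyperparams": True,   # enables parameter tuning
        },
    }
]
```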
Generally, the algorithm consists of the following steps:
1. The model gets fully quantized using the DefaultQuantization algorithm.
2. The quantized and full-precision models are compared on a subset of the validation set to find mismatches in the target accuracy metric. A ranking subset is extracted based on these mismatches (see the sketch after this list).
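To make step 2 concrete, here is a minimal Python sketch of the mismatch-ranking idea. This is not POT's actual implementation: a per-sample metric is computed for both models, samples are ranked by metric degradation, and the worst ones form the ranking subset. All names (`metric_fn`, `ranking_subset_size`, and the prediction callables) are hypothetical:

```python
from typing import Callable, List, Tuple

def extract_ranking_subset(
    samples: List,                 # validation subset: (inputs, label) pairs
    fp_predict: Callable,          # full-precision model inference
    int8_predict: Callable,        # quantized model inference
    metric_fn: Callable,           # per-sample accuracy metric
    ranking_subset_size: int = 300,
) -> List:
    """Rank samples by the per-sample drop of the target metric and return
    the worst mismatches as the ranking subset (a sketch of the idea behind
    step 2, not POT's actual code)."""
    drops: List[Tuple[float, int]] = []
    for i, (inputs, label) in enumerate(samples):
        fp_score = metric_fn(fp_predict(inputs), label)
        int8_score = metric_fn(int8_predict(inputs), label)
        drops.append((fp_score - int8_score, i))
    # Largest metric drop first: these samples expose the mismatches.
    drops.sort(reverse=True)
    return [samples[i] for _, i in drops[:ranking_subset_size]]
```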