POT documentation updates (#10578)

* POT changes

* change install

* change img size

* remove cli option
This commit is contained in:
Tatiana Savina
2022-03-06 09:14:39 +03:00
committed by GitHub
parent 41818a377f
commit de47a3b4a4
20 changed files with 76 additions and 126 deletions


@@ -13,7 +13,7 @@
Command-line Interface <pot_compression_cli_README>
pot_compression_api_README
pot_configs_README
Deep neural network protection <pot_ranger_README>
Deep Neural Network Protection <pot_ranger_README>
pot_docs_FrequentlyAskedQuestions
@endsphinxdirective
@@ -37,15 +37,13 @@ Figure below shows the optimization workflow:
* Multiple domains: Computer Vision, Natural Language Processing, Recommendation Systems, Speech Recognition.
* [Command-line tool](docs/CLI.md) that provides a simple interface for basic use cases.
* [API](openvino/tools/pot/api/README.md) that helps to apply optimization methods within a custom inference script written with OpenVINO Python* API.
* (Experimental) [Ranger algorithm](@ref pot_ranger_README) for model prodection in safity-critical cases.
* (Experimental) [Ranger algorithm](@ref pot_ranger_README) for model protection in safety-critical cases.
For benchmarking results collected for the models optimized with POT tool, see [INT8 vs FP32 Comparison on Select Networks and Platforms](@ref openvino_docs_performance_int8_vs_fp32).
For benchmarking results collected for the models optimized with the POT tool, see [INT8 vs FP32 Comparison on Select Networks and Platforms](@ref openvino_docs_performance_int8_vs_fp32).
POT is opensourced on GitHub as a part of OpenVINO and available at https://github.com/openvinotoolkit/openvino/tools/pot.
POT is open-sourced on GitHub as a part of OpenVINO and available at https://github.com/openvinotoolkit/openvino/tools/pot.
Further documentation presumes that you are familiar with basic Deep Learning concepts, such as model inference,
dataset preparation, model optimization, as well as with the OpenVINO&trade; toolkit and its components, such as [Model Optimizer](@ref openvino_docs_MO_DG_Deep_Learning_Model_Optimizer_DevGuide)
and [Accuracy Checker Tool](@ref omz_tools_accuracy_checker).
Further documentation presumes that you are familiar with basic Deep Learning concepts, such as model inference, dataset preparation, model optimization, as well as with the OpenVINO&trade; toolkit and its components, such as [Model Optimizer](@ref openvino_docs_MO_DG_Deep_Learning_Model_Optimizer_DevGuide) and [Accuracy Checker Tool](@ref omz_tools_accuracy_checker).
## Get Started
@@ -58,16 +56,15 @@ To install POT, follow the [Installation Guide](docs/InstallationGuide.md).
The POT provides three basic usage options:
* **Command-line interface (CLI)**:
* [**Simplified mode**](@ref pot_docs_simplified_mode): use this option if the model belongs to the Computer Vision domain and you do have an unannotated dataset for optimization. Note that this optimization method can cause a deviation of model accuracy.
* [**Model Zoo flow**](@ref pot_compression_cli_README): this option is recommended if the model is imported from OpenVINO&trade;
[Model Zoo](https://github.com/openvinotoolkit/open_model_zoo) or there is a valid [Accuracy Checker Tool](@ref omz_tools_accuracy_checker)
configuration file for the model that allows validating model accuracy using [Accuracy Checker Tool](@ref omz_tools_accuracy_checker).
* [**Simplified mode**](@ref pot_docs_simplified_mode): use this option if the model belongs to the **Computer Vision** domain and you have an **unannotated dataset** for optimization. This optimization method does not allow measuring model accuracy and might cause its deviation.
* [**Model Zoo flow**](@ref pot_compression_cli_README): this option is recommended if the model is similar to the model from OpenVINO&trade; [Model Zoo](https://github.com/openvinotoolkit/open_model_zoo) or there is a valid [Accuracy Checker Tool](@ref omz_tools_accuracy_checker_README)
configuration file for the model that allows validating model accuracy using [Accuracy Checker Tool](@ref omz_tools_accuracy_checker_README).
* [**Python\* API**](@ref pot_compression_api_README): this option allows integrating the optimization methods implemented in POT into
a Python* inference script that uses [OpenVINO Python* API](https://docs.openvino.ai/latest/openvino_inference_engine_ie_bridges_python_docs_api_overview.html).
POT is also integrated into [Deep Learning Workbench](@ref workbench_docs_Workbench_DG_Introduction) (DL Workbench), a web-based graphical environment
that enables you to optimize, tune, analyze, visualize, and compare performance of deep learning models.
POT is also integrated into [Deep Learning Workbench](@ref workbench_docs_Workbench_DG_Introduction) (DL Workbench), a web-based graphical environment
that enables you to import, optimize, benchmark, visualize, and compare performance of deep learning models.
### Examples


@@ -30,13 +30,13 @@ This section contains only three parameters:
}
```
The main parameter is `"type"`, which can take one of two values: `"accuracy_checker"` (default) or `"simplified"`. It specifies the engine used for model inference and validation (if supported):
- **Simplified mode** engines. These engines can be used only with `DefaultQuantization` algorithm to get a fully quantized model. They do not use the Accuracy Checker tool and annotation. In the case, of this mode the following parameters are applicable:
- `"data_source"` Specifies the path to the directory where to calibration data is stored.
- **Simplified mode** engines. These engines can be used only with `DefaultQuantization` algorithm to get a fully quantized model. They do not use the Accuracy Checker tool and annotation. In this case, the following parameters are applicable:
- `"data_source"` Specifies the path to the directory where the calibration data is stored.
- `"layout"` - (Optional) Layout of input data. Supported values: [`"NCHW"`, `"NHWC"`, `"CHW"`, `"CWH"`].
- **Accuracy Checker** engine. It relies on the [Deep Learning Accuracy Validation Framework](@ref omz_tools_accuracy_checker) (Accuracy Checker) when inferencing DL models and working with datasets.
The benefit of this mode is you can compute accuracy in case you have annotations. When this mode is selected, you can use the accuracy aware algorithms family.
- **Accuracy Checker** engine. It relies on the [Deep Learning Accuracy Validation Framework](@ref omz_tools_accuracy_checker_README) (Accuracy Checker) when inferencing DL models and working with datasets.
If you have annotations, you can benefit from this mode by measuring accuracy. When this mode is selected, you can use the accuracy-aware algorithms family.
There are two options to define engine parameters in this mode:
- Refer to the existing Accuracy Checker configuration file which is represented by the YAML file. It can be a file used for full-precision model validation. In this case, you should define only the `"config"` parameter containing a path to the AccuracyChecker configuration file.
- Refer to the existing Accuracy Checker configuration file which is represented by the YAML file. It can be a file used for full-precision model validation. In this case, you should define only the `"config"` parameter containing the path to the AccuracyChecker configuration file.
- Define all the [required Accuracy Checker parameters](@ref omz_tools_accuracy_checker_dlsdk_launcher)
directly in the JSON file. In this case, POT just passes the corresponding dictionary of parameters to the Accuracy Checker when instantiating it.
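As an illustration of the first option, a minimal `"engine"` section might look as follows (the YAML file name here is hypothetical):
```json
"engine": {
    "type": "accuracy_checker",
    "config": "./model_accuracy_checker.yml"
}
```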
For more details, refer to the corresponding Accuracy Checker information and examples of configuration files provided with the tool:
@@ -49,9 +49,10 @@ This section defines optimization algorithms and their parameters. For more deta
## Examples of the Configuration File
For a quick start, many examples of configuration files are provided [here](https://github.com/openvinotoolkit/openvino/blob/master/tools/pot/configs/examples). There you can find ready-to-use configurations for the models from various domains: Computer Vision (Image
Classification, Object Detection, Segmentation), Natural Language Processing, Recommendation Systems. We basically
put configuration files for the models which require non-default configuration settings in order to get accurate results.
For a quick start, many examples of configuration files are provided [here](https://github.com/openvinotoolkit/openvino/blob/master/tools/pot/configs/examples). There you can find ready-to-use configurations for the models from various domains: Computer Vision (Image
Classification, Object Detection, Segmentation), Natural Language Processing, Recommendation Systems. We put configuration files for the models which require non-default configuration settings to get accurate results.
For details on how to run the Post-Training Optimization Tool with a sample configuration file, see the [example](@ref pot_configs_examples_README).
## See Also


@@ -17,7 +17,7 @@ its models then you can employ POT CLI to optimize your model.
In other cases, you should consider using POT [API](@ref pot_compression_api_README). To start with POT CLI please refer to the
following [example](@ref pot_configs_examples_README).
Note: There is also the so-called [**Simplified mode**](@ref pot_docs_simplified_mode) that is basically aimed at INT8 quantization if the model is from the Computer Vision domain and has a simple dataset preprocessing, like image resize and crop. In this case, you can also use POT CLI for
Note: There is also the so-called [**Simplified mode**](@ref pot_docs_simplified_mode) aimed at INT8 quantization if the model is from the Computer Vision domain and has a simple dataset preprocessing, like image resize and crop. In this case, you can also use POT CLI for
optimization. However, the accuracy results are not guaranteed in this case. Moreover, you are also limited in the choice of optimization methods, since accuracy measurement is not available.


@@ -27,7 +27,7 @@ In case of issues while running the example, refer to [POT Frequently Asked Ques
In the instructions below, the Post-Training Optimization Tool directory `<POT_DIR>` refers to:
- `<ENV>/lib/python<version>/site-packages/` in the case of PyPI installation, where `<ENV>` is a Python*
environment where OpenVINO is installed and `<version>` is a Python* version, e.g. `3.6`.
- `<INSTALL_DIR>/deployment_tools/tools/post_training_optimization_toolkit` in the case of OpenVINO distribution package.
`<INSTALL_DIR>` is the directory where Intel&reg; Distribution of OpenVINO&trade; toolkit is installed.
## Model Preparation


@@ -7,9 +7,6 @@
The minimum and the recommended requirements to run the Post-training Optimization Tool (POT) are the same as in [OpenVINO&trade;](https://docs.openvino.ai/latest/index.html).
There are two ways to install the POT on your system:
- Installation from PyPI repository
- Installation from Intel&reg; Distribution of OpenVINO&trade; toolkit package
## Install POT from PyPI
The simplest way to get the Post-training Optimization Tool and OpenVINO&trade; installed is to use PyPI. Follow the steps below to do that:
@@ -18,50 +15,3 @@ The simplest way to get the Post-training Optimization Tool and OpenVINO&trade;
3. To install POT and other OpenVINO&trade; developer tools, run `pip install openvino-dev`.
Now the Post-training Optimization Tool is available in the command line by the `pot` alias. To verify it, run `pot -h`.
## Install and Set Up POT from Intel&reg; Distribution of OpenVINO&trade; toolkit package
In the instructions below, `<INSTALL_DIR>` is the directory where the Intel&reg; distribution of OpenVINO&trade; toolkit
is installed. The Post-training Optimization Tool is distributed as a part of the OpenVINO&trade; release package, and to use the POT as a command-line tool,
you need to install OpenVINO&trade; as well as POT dependencies, namely [Model Optimizer](@ref openvino_docs_MO_DG_Deep_Learning_Model_Optimizer_DevGuide)
and [Accuracy Checker](@ref omz_tools_accuracy_checker). It is recommended to create a separate [Python* environment](https://docs.python.org/3/tutorial/venv.html) before installing the OpenVINO&trade; and its components.
POT source files are available in `<INSTALL_DIR>/deployment_tools/tools/post_training_optimization_toolkit` directory after the OpenVINO&trade; installation.
To set up the Post-training Optimization Tool in your environment, follow the steps below.
### Set up the Model Optimizer and Accuracy Checker components
- To set up the [Model Optimizer](@ref openvino_docs_MO_DG_Deep_Learning_Model_Optimizer_DevGuide):
1. Go to `<INSTALL_DIR>/deployment_tools/model_optimizer/install_prerequisites`.
2. Run the following script to configure the Model Optimizer:
* Linux:
```sh
sudo ./install_prerequisites.sh
```
* Windows:
```bat
install_prerequisites.bat
```
3. To verify that the Model Optimizer is installed, run `<INSTALL_DIR>/deployment_tools/model_optimizer/mo.py -h`.
- To set up the [Accuracy Checker](@ref omz_tools_accuracy_checker):
1. Go to `<INSTALL_DIR>/deployment_tools/open_model_zoo/tools/accuracy_checker`.
2. Run the following script to configure the Accuracy Checker:
```sh
python setup.py install
```
3. Now the Accuracy Checker is available in the command line by the `accuracy_check` alias. To verify it, run `accuracy_check -h`.
### Set up the POT
1. Go to `<INSTALL_DIR>/deployment_tools/tools/post_training_optimization_toolkit`.
2. Run the following script to configure the POT:
```sh
python setup.py install
```
In order to enable advanced algorithms such as the Tree-Structured Parzen Estimator (TPE) based optimization, add the following flag to the installation command:
```sh
python setup.py install --install-extras
```
3. Now the POT is available in the command line by the `pot` alias. To verify it, run `pot -h`.


@@ -7,23 +7,25 @@ Currently, there are two groups of optimization methods that can change the IR a
- **Quantization**. The rest of this document is dedicated to the representation of quantized models.
## Representation of quantized models
The OpenVINO Toolkit represents all the quantized models using the so-called [FakeQuantize](../../../docs/ops/quantization/FakeQuantize_1.md) operation. This operation is very expressive and allows mapping values from arbitrary input and output ranges. The whole idea behind that is quite simple: we project (discretize) the input values to the low-precision data type using affine transformation (with clamp and rounding) and then reproject discrete values back to the original range and data type. It can be considered as an emulation of the quantization/dequantization process which happens at runtime. The figure below shows a part of the DL model, namely the Convolutional layer, that undergoes various transformations on way from being a floating-point model to an integer model executed in the OpenVINO runtime. Column 2 of this figure below shows a model quantized with [Neural Network Compression Framework (NNCF)](https://github.com/openvinotoolkit/nncf).
![](images/model_flow.png)
The OpenVINO Toolkit represents all the quantized models using the so-called [FakeQuantize](https://docs.openvino.ai/latest/openvino_docs_MO_DG_prepare_model_convert_model_Legacy_IR_Layers_Catalog_Spec.html#fakequantize-layer) operation. This operation is very expressive and allows mapping values from arbitrary input and output ranges. We project (discretize) the input values to the low-precision data type using affine transformation (with clamp and rounding) and then re-project discrete values back to the original range and data type. It can be considered as an emulation of the quantization/dequantization process which happens at runtime. The figure below shows a part of the DL model, namely the Convolutional layer, that undergoes various transformations, from being a floating-point model to an integer model executed in the OpenVINO runtime. Column 2 of this figure below shows a model quantized with [Neural Network Compression Framework (NNCF)](https://github.com/openvinotoolkit/nncf).
![](images/model_flow.png)
To reduce the memory footprint, weights of quantized models are transformed to a target data type, e.g. in the case of 8-bit quantization, this is int8. During this transformation, the floating-point weights tensor and one of the FakeQuantize operations that correspond to it are replaced with an 8-bit weight tensor and the sequence of Convert, Subtract, Multiply operations that represent the typecast and dequantization parameters (scale and zero-point), as shown in column 3 of the figure.
## Interpreting FakeQuantize at runtime
At inference time, the quantized model undergoes the second set of transformations that allows interpreting floating-point operations with quantization rules as integer operations. OpenVINO Deep Learning Deployment Toolkit has a special component which is called Low-Precision Transformations (LPT) for that purpose.
At runtime each FakeQuantize can be split into two independent operations: **Quantize** and **Dequantize** (column 4). The former is aimed to transform the input data into the target precision while the latter transforms the resulting values back to the original range. *Dequantize* operations can be propagated forward through the linear layers, such as *Convolution* or *Fully-Connected*, and in some cases fused with the following *Quantize* operation for the next layer into the so-called *Requantize* operation (column 5).
At inference time, the quantized model undergoes the second set of transformations that allows interpreting floating-point operations with quantization rules as integer operations. OpenVINO Toolkit has a Low-Precision Transformations (LPT) component for that purpose.
At runtime each FakeQuantize can be split into two independent operations: **Quantize** and **Dequantize** (column 4). **Quantize** transforms the input data into the target precision while **Dequantize** transforms the resulting values back to the original range. *Dequantize* operations can be propagated forward through the linear layers, such as *Convolution* or *Fully-Connected*, and, in some cases, fused with the following *Quantize* operation for the next layer into the so-called *Requantize* operation (column 5).
From the computation standpoint, the FakeQuantize formula is split into two parts:
`output = round((x - input_low) / (input_high - input_low) * (levels-1)) / (levels-1) * (output_high - output_low) + output_low`
The first part of this formula represents the *Quantize* operation:
`q = round((x - input_low) / (input_high - input_low) * (levels-1))`
The second is responsible for the dequantization:
`r = q / (levels-1) * (output_high - output_low) + output_low`
From the scale/zero-point notation standpoint the latter formula can be written as follows:
`r = (output_high - output_low) / (levels-1) * (q + output_low / (output_high - output_low) * (levels-1))`
From the computation standpoint, the FakeQuantize formula is also split into two parts accordingly:
`output = round((x - input_low) / (input_high - input_low) * (levels-1)) / (levels-1) * (output_high - output_low) + output_low`
The first part of this formula represents the *Quantize* operation:
`q = round((x - input_low) / (input_high - input_low) * (levels-1))`
The second is responsible for the dequantization:
`r = q / (levels-1) * (output_high - output_low) + output_low`
From the scale/zero-point notation standpoint the latter formula can be written as follows:
`r = (output_high - output_low) / (levels-1) * (q + output_low / (output_high - output_low) * (levels-1))`
Thus we can define:
- **Scale** as `(output_high - output_low) / (levels-1)`
- **Zero-point** as `-output_low / (output_high - output_low) * (levels-1)`
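As a sanity check, the formulas above can be sketched in a few lines of plain Python (a scalar, pure-Python illustration; names follow the formulas in this document):

```python
# Minimal sketch of the FakeQuantize math above (scalar, pure Python).
def quantize(x, input_low, input_high, levels=256):
    # Quantize: project x onto the integer grid [0, levels-1]
    return round((x - input_low) / (input_high - input_low) * (levels - 1))

def dequantize(q, output_low, output_high, levels=256):
    # Dequantize: map the integer grid back to the original range
    return q / (levels - 1) * (output_high - output_low) + output_low

def scale_zero_point(output_low, output_high, levels=256):
    # Scale and zero-point as defined above
    scale = (output_high - output_low) / (levels - 1)
    zero_point = -output_low / (output_high - output_low) * (levels - 1)
    return scale, zero_point

# Dequantization in scale/zero-point form gives the same result:
# r = scale * (q - zero_point)
s, zp = scale_zero_point(-1.0, 1.0)
q = quantize(0.3, -1.0, 1.0)
assert abs(dequantize(q, -1.0, 1.0) - s * (q - zp)) < 1e-12
```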


@@ -1,21 +1,21 @@
# Saturation (overflow) issue workaround {#pot_saturation_issue}
## Introduction
8-bit instructions of previous generations of Intel&reg; CPUs, namely that based on SSE, AVX-2, AVX-512 instruction sets, admit so-called saturation (overflow) of the intermediate buffer when calculating the dot product which is an essential part of Convolutional or MatMul operations. This saturation can lead to an accuracy drop on the aforementioned architectures during the inference of 8-bit quantized models. However, it is not possible to predict such degradation since most of the computations are executed in parallel during DL model inference which makes this process non-deterministic. This problem is typical for models with non-ReLU activation functions and a low level of redundancy, e.g. optimized or efficient models. It can prevent deploying the model on legacy HW or creating cross-platform applications. The problem does not occur on the CPUs with Intel Deep Learning Boost (VNNI) technology and further generations as well as GPUs.
8-bit instructions of previous generations of Intel&reg; CPUs, namely those based on SSE, AVX-2, AVX-512 instruction sets, admit so-called saturation (overflow) of the intermediate buffer when calculating the dot product which is an essential part of Convolutional or MatMul operations. This saturation can lead to an accuracy drop on the mentioned architectures during the inference of 8-bit quantized models. However, it is not possible to predict such degradation since most of the computations are executed in parallel during DL model inference which makes this process non-deterministic. This problem is typical for models with non-ReLU activation functions and low level of redundancy, for example, optimized or efficient models. It can prevent deploying the model on legacy hardware or creating cross-platform applications. The problem does not occur on the CPUs with Intel Deep Learning Boost (VNNI) technology and further generations, as well as on GPUs.
## How to detect
The only way to detect saturation issue is to run inference on the CPU that admits it and on the HW that does not have such a problem (e.g. VNNI-based CPU). If the accuracy difference is significant (e.g. more than 1%) this is the main indicator of the saturation issue impact.
## Saturation Problem Detection
The only way to detect saturation issue is to run inference on the CPU that admits it and on the hardware that does not have such problem (for example, VNNI-based CPU). If the accuracy difference is significant (more than 1%), this is the main indicator of the saturation issue impact.
## Workaround
There is a workaround that helps fully address the saturation issue during the inference. The idea is to use only 7 bits to represent weights (of Convolutional or Fully-Connected layers) while quantizing activations using the full range of 8-bit data types. However, such a trick can lead to accuracy degradation itself due to the reduced representation of weights. On the other hand, using this trick for the first layer can help to mitigate the saturation issue for many models.
There is a workaround that helps fully address the saturation issue during the inference. The algorithm uses only 7 bits to represent weights (of Convolutional or Fully-Connected layers) while quantizing activations using the full range of 8-bit data types. However, this can lead to an accuracy degradation due to the reduced representation of weights. On the other hand, using this workaround for the first layer can help mitigate the saturation issue for many models.
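The effect can be illustrated with a small pure-Python model of the saturating accumulation (an assumption-level sketch of the pre-VNNI behavior, where pairs of unsigned 8-bit activations times signed 8-bit weights are summed into a signed 16-bit intermediate; the numbers are worst-case illustrations, not real layer data):

```python
INT16_MAX, INT16_MIN = 32767, -32768

def pair_dot_saturating(a, w):
    # Sum one pair of u8*s8 products with int16 saturation (clamping)
    s = a[0] * w[0] + a[1] * w[1]
    return max(INT16_MIN, min(INT16_MAX, s))

acts = (255, 255)       # worst-case unsigned 8-bit activations
w_8bit = (127, 127)     # full-range signed 8-bit weights
w_7bit = (63, 63)       # weights restricted to 7 bits

print(pair_dot_saturating(acts, w_8bit))  # 32767 (saturated; exact value is 64770)
print(pair_dot_saturating(acts, w_7bit))  # 32130 (fits in int16, no saturation)
```

With full-range 8-bit weights the exact sum overflows the 16-bit intermediate and is clamped, while 7-bit weights keep even the worst-case pair within range.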
POT tool provides three options to deal with the saturation issue which can be enabled in POT configuration file using the "saturation_fix" parameter:
POT tool provides three options to deal with the saturation issue. The options can be enabled in the POT configuration file using the "saturation_fix" parameter:
* (Default) Fix saturation issue for the first layer: "first_layer" option
* Apply for all layers in the model: "all" option
* Not apply saturation fix at all: "no" option
* Do not apply saturation fix at all: "no" option
Below is an example of the section in POT configuration file with the `saturation_fix` option:
Below is an example of the section in the POT configuration file with the `saturation_fix` option:
```json
"algorithms": [
{
@@ -29,10 +29,11 @@ Below is an example of the section in POT configuration file with the `saturatio
]
```
## Recommendations
If you observe the saturation issue we recommend trying the option "all" during the model quantization. If it does not help to improve the accuracy we recommend using [Quantization-aware training from NNCF](https://github.com/openvinotoolkit/nncf) and fine-tune the model.
If you are not planning to use legacy CPU HW you can use the option "no" which can also lead to slightly better accuracy.
If you observe the saturation issue, we recommend trying the option "all" during the model quantization. If it does not help improve the accuracy, we recommend using [Quantization-aware training from NNCF](https://github.com/openvinotoolkit/nncf) and fine-tuning the model.
If you are not planning to use legacy CPU HW, you can use the option "no", which might also lead to slightly better accuracy.
## See Also
* [Lower Numerical Precision Deep Learning Inference and Training blogpost](https://www.intel.com/content/www/us/en/developer/articles/technical/lower-numerical-precision-deep-learning-inference-and-training.html)
* [Configuration file desciption](@ref pot_configs_README)
* [Configuration file description](@ref pot_configs_README)


@@ -1,31 +1,35 @@
# Optimization with Simplified mode {#pot_docs_simplified_mode}
## Introduction
Simplified mode is designed to simplify data preparation for the model optimization process. The mode is represented by an implementation of Engine interface from the POT API that allows reading data from an arbitrary folder specified by the user. For more details about POT API please refer to the corresponding [description](pot_compression_api_README). Currently, Simplified mode is available only for image data stored in a single folder in PNG or JPEG formats.
Note: This mode cannot be used with accuracy-aware methods, i.e. there is no way to control accuracy after optimization. Nevertheless, this mode can be helpful to estimate performance benefits when using model optimizations.
Simplified mode is designed to make data preparation for the model optimization process easier. The mode is represented by an implementation of Engine interface from the POT API. It allows reading the data from an arbitrary folder specified by the user. For more details about POT API, refer to the corresponding [description](pot_compression_api_README). Currently, Simplified mode is available only for image data in PNG or JPEG formats, stored in a single folder.
Note: This mode cannot be used with accuracy-aware methods. There is no way to control accuracy after optimization. Nevertheless, this mode can be helpful to estimate performance benefits when using model optimizations.
## Usage
To use Simplified mode you should prepare data and place them in a separate folder. No other files should be presented in this folder. There are two options to run POT in the Simplified mode:
To use the Simplified mode, prepare the data and place it in a separate folder. No other files should be present in this folder. There are two options to run POT in the Simplified mode:
* Using command-line options only. Here is an example for 8-bit quantization:
`pot -q default -m <path_to_xml> -w <path_to_bin> --engine simplified --data-source <path_to_data>`
* To provide more options you can use the corresponding `"engine"` section in the POT configuration file as follows:
* To provide more options, use the corresponding `"engine"` section in the POT configuration file as follows:
```json
"engine": {
"type": "simplified",
"layout": "NCHW", // Layout of input data. Supported ["NCHW",
// "NHWC", "CHW", "CWH"] layout
"data_source": "PATH_TO_SOURCE" // You can specify path to directory with images
"data_source": "PATH_TO_SOURCE" // You can specify path to the directory with images
// Also you can specify template for file names to filter images to load.
// Templates are unix style (This option valid only in simplified mode)
// Templates are unix style (this option is valid only in Simplified mode)
}
```
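For reference, the unix-style templates mentioned in the comment above behave like shell-style matching, which can be illustrated with the standard-library `fnmatch` module (the file names below are hypothetical):

```python
import fnmatch

# Shell-style (unix) template matching, as used for filtering image files
files = ["img_001.png", "img_002.jpg", "labels.txt", "calib_01.jpeg"]
selected = [f for f in files if fnmatch.fnmatch(f, "img_*")]
print(selected)  # ['img_001.png', 'img_002.jpg']
```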
A template of configuration file for 8-bit quantization using Simplified mode can be found [here](https://github.com/openvinotoolkit/openvino/blob/master/tools/pot/configs/simplified_mode_template.json).
A template of configuration file for 8-bit quantization using Simplified mode can be found [at the following link](https://github.com/openvinotoolkit/openvino/blob/master/tools/pot/configs/simplified_mode_template.json).
For more details about how to use POT via CLI please refer to this [document](@ref pot_compression_cli_README).
For more details about POT usage via CLI, refer to this [CLI document](@ref pot_compression_cli_README).
## See Also
* [Configuration File Description](@ref pot_configs_README)


@@ -1,3 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:d5650775fe986b294278186c12b91fadbb758e06783f500b9fd399e474eafe2c
size 34217
oid sha256:79ef392200a6d9ecad6be9cab7b1ecd4af7b88b4fd55f8f8884a02b16b435f68
size 36036


@@ -1,3 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:791f253493350d04c62e53f40b086fb73ceb1b96d346c9772e82de9892fee7a4
size 33789
oid sha256:1f23982591a6acc707c2a8494ed32e8de98fd73521143c1b286b2faee3c3b516
size 40722


@@ -8,9 +8,9 @@
DefaultQuantization Algorithm <pot_compression_algorithms_quantization_default_README>
AccuracyAwareQuantization Algorithm <pot_compression_algorithms_quantization_accuracy_aware_README>
TunableQuantization algorithm <pot_compression_algorithms_quantization_tunable_quantization_README>
Saturation issue workaround <pot_saturation_issue>
Low-precision model representation <pot_docs_model_representation>
TunableQuantization Algorithm <pot_compression_algorithms_quantization_tunable_quantization_README>
Saturation Issue Workaround <pot_saturation_issue>
Low-precision Model Representation <pot_docs_model_representation>
@endsphinxdirective
@@ -26,7 +26,7 @@ During the quantization process, the POT tool runs inference of the optimizing m
Currently, the POT provides two algorithms for 8-bit quantization, which are verified and guarantee stable results on a
wide range of DNN models:
* [**DefaultQuantization**](@ref pot_compression_algorithms_quantization_default_README) is a default method that provides fast and in most cases accurate results for 8-bit
quantization. It requires only a non-annotated dataset for quantization. For details, see the [DefaultQuantization Algorithm](@ref pot_compression_algorithms_quantization_default_README) documentation.
quantization. It requires only an unannotated dataset for quantization. For details, see the [DefaultQuantization Algorithm](@ref pot_compression_algorithms_quantization_default_README) documentation.
* [**AccuracyAwareQuantization**](@ref pot_compression_algorithms_quantization_accuracy_aware_README) keeps the accuracy drop within a predefined range after quantization, at the cost of a smaller performance improvement. The method requires an annotated representative dataset and may require more time for quantization. For details, see the


@@ -57,7 +57,7 @@ quantization time. Default value is `False`.
## Examples
A template and full specification for AccuracyAwareQuantization algorithm can be found:
A template and full specification for the AccuracyAwareQuantization algorithm:
* [Template](https://github.com/openvinotoolkit/openvino/blob/master/tools/pot/configs/accuracy_aware_quantization_template.json)
* [Full specification](https://github.com/openvinotoolkit/openvino/blob/master/tools/pot/configs/accuracy_aware_quantization_spec.json)


@@ -115,7 +115,7 @@ Enabling this option may increase compressed model accuracy, but will result in
## Examples
A template and full specification for DefaultQuantization algorithm can be found:
A template and full specification for the DefaultQuantization algorithm:
* [Template](https://github.com/openvinotoolkit/openvino/blob/master/tools/pot/configs/default_quantization_template.json)
* [Full specification](https://github.com/openvinotoolkit/openvino/blob/master/tools/pot/configs/default_quantization_spec.json)


@@ -15,8 +15,8 @@ To run this sample, you will need to download the Brain Tumors 2017 part of the
## How to Run the Sample
In the instructions below, the Post-Training Optimization Tool directory `<POT_DIR>` refers to:
- `<ENV>/lib/python<version>/site-packages/` in the case of PyPI installation, where `<ENV>` is a Python*
environment where OpenVINO is installed and `<version>` is a Python* version, e.g. `3.6`.
- `<INSTALL_DIR>/deployment_tools/tools/post_training_optimization_toolkit` in the case of OpenVINO distribution package.
environment where OpenVINO is installed and `<version>` is a Python* version, for example `3.6`.
`<INSTALL_DIR>` is the directory where the Intel&reg; Distribution of OpenVINO&trade; toolkit is installed.
1. To get started, follow the [Installation Guide](@ref pot_InstallationGuide).
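The PyPI path above can also be derived programmatically; a minimal sketch, assuming a standard pip-based environment layout:

```python
# Sketch: locate the site-packages base of <POT_DIR> for a PyPI installation.
# The exact subfolder layout under site-packages is an assumption of this example.
import sysconfig

site_packages = sysconfig.get_paths()["purelib"]
pot_dir = site_packages  # e.g. <ENV>/lib/python<version>/site-packages/
print(pot_dir)
```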


@@ -21,7 +21,7 @@ are not supported through the `AccuracyChecker` or `Simplified` engines (see [Bes
All available samples can be found in the `<POT_DIR>/api/samples` folder, where `<POT_DIR>` is a directory where the Post-Training Optimization Tool is installed.
> **NOTE**: `<POT_DIR>` refers to `<ENV>/lib/python<version>/site-packages/` in the case of PyPI installation, where `<ENV>` is a Python*
> environment where OpenVINO is installed and `<version>` is a Python* version, e.g. `3.6` or to `<INSTALL_DIR>/deployment_tools/tools/post_training_optimization_toolkit` in the case of OpenVINO distribution package.
> environment where OpenVINO is installed and `<version>` is a Python* version, for example `3.6`.
> `<INSTALL_DIR>` is the directory where the Intel&reg; Distribution of OpenVINO&trade; toolkit is installed.
The following samples demonstrate the implementation of the `Engine`, `Metric`, and `DataLoader` interfaces
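As a hedged illustration of the `DataLoader` contract these samples implement (in real code the class would subclass the POT `DataLoader` base class; the stand-alone version below only mirrors the shape of the interface, and the dataset is a stand-in):

```python
# Sketch of the DataLoader contract used by the POT API samples.
# A real implementation subclasses the POT DataLoader base class; this
# stand-alone version only shows the index-based access pattern POT expects.

class ListDataLoader:
    """Index-based loader: __len__ plus __getitem__ returning one sample."""

    def __init__(self, samples):
        self._samples = samples  # list of (annotation, data) pairs (stand-in data)

    def __len__(self):
        return len(self._samples)

    def __getitem__(self, index):
        # POT iterates indices 0..len-1 and feeds each item to the engine
        return self._samples[index]

loader = ListDataLoader([("label_0", [0.1, 0.2]), ("label_1", [0.3, 0.4])])
```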


@@ -13,8 +13,7 @@ which will later be referred to as `<IMAGES_DIR>`. Annotations to images should be
## How to Run the Sample
In the instructions below, the Post-Training Optimization Tool directory `<POT_DIR>` refers to:
- `<ENV>/lib/python<version>/site-packages/` in the case of PyPI installation, where `<ENV>` is a Python*
environment where OpenVINO is installed and `<version>` is a Python* version, e.g. `3.6`.
- `<INSTALL_DIR>/deployment_tools/tools/post_training_optimization_toolkit` in the case of OpenVINO distribution package.
environment where OpenVINO is installed and `<version>` is a Python* version, for example `3.6`.
`<INSTALL_DIR>` is the directory where the Intel&reg; Distribution of OpenVINO&trade; toolkit is installed.
1. To get started, follow the [Installation Guide](@ref pot_InstallationGuide).


@@ -17,8 +17,7 @@ can be downloaded separately and are located in the `wider_face_split/wider_face
## How to Run the Sample
In the instructions below, the Post-Training Optimization Tool directory `<POT_DIR>` refers to:
- `<ENV>/lib/python<version>/site-packages/` in the case of PyPI installation, where `<ENV>` is a Python*
environment where OpenVINO is installed and `<version>` is a Python* version, e.g. `3.6`.
- `<INSTALL_DIR>/deployment_tools/tools/post_training_optimization_toolkit` in the case of OpenVINO distribution package.
environment where OpenVINO is installed and `<version>` is a Python* version, for example `3.6`.
`<INSTALL_DIR>` is the directory where the Intel&reg; Distribution of OpenVINO&trade; toolkit is installed.
1. To get started, follow the [Installation Guide](@ref pot_InstallationGuide).


@@ -12,8 +12,7 @@ To run this sample, you will need to download the validation part of the [COCO](
## How to Run the Sample
In the instructions below, the Post-Training Optimization Tool directory `<POT_DIR>` refers to:
- `<ENV>/lib/python<version>/site-packages/` in the case of PyPI installation, where `<ENV>` is a Python*
environment where OpenVINO is installed and `<version>` is a Python* version, e.g. `3.6`.
- `<INSTALL_DIR>/deployment_tools/tools/post_training_optimization_toolkit` in the case of OpenVINO distribution package.
environment where OpenVINO is installed and `<version>` is a Python* version, for example `3.6`.
`<INSTALL_DIR>` is the directory where the Intel&reg; Distribution of OpenVINO&trade; toolkit is installed.
1. To get started, follow the [Installation Guide](@ref pot_InstallationGuide).


@@ -15,8 +15,7 @@ and segmentation masks are kept in the `SegmentationClass` directory.
## How to Run the Sample
In the instructions below, the Post-Training Optimization Tool directory `<POT_DIR>` refers to:
- `<ENV>/lib/python<version>/site-packages/` in the case of PyPI installation, where `<ENV>` is a Python*
environment where OpenVINO is installed and `<version>` is a Python* version, e.g. `3.6`.
- `<INSTALL_DIR>/deployment_tools/tools/post_training_optimization_toolkit` in the case of OpenVINO distribution package.
environment where OpenVINO is installed and `<version>` is a Python* version, for example `3.6`.
`<INSTALL_DIR>` is the directory where the Intel&reg; Distribution of OpenVINO&trade; toolkit is installed.
1. To get started, follow the [Installation Guide](@ref pot_InstallationGuide).


@@ -14,8 +14,7 @@ For generating data from original formats to .ark, please follow the [Kaldi dat
## How to Run the Sample
In the instructions below, the Post-Training Optimization Tool directory `<POT_DIR>` refers to:
- `<ENV>/lib/python<version>/site-packages/` in the case of PyPI installation, where `<ENV>` is a Python*
environment where OpenVINO is installed and `<version>` is a Python* version, e.g. `3.6`.
- `<INSTALL_DIR>/deployment_tools/tools/post_training_optimization_toolkit` in the case of OpenVINO distribution package.
environment where OpenVINO is installed and `<version>` is a Python* version, for example `3.6`.
`<INSTALL_DIR>` is the directory where the Intel&reg; Distribution of OpenVINO&trade; toolkit is installed.
1. To get started, follow the [Installation Guide](@ref pot_InstallationGuide).