From 2f5be5e81ce22170189d8cf59ca8a2df45834ebb Mon Sep 17 00:00:00 2001
From: Sebastian Golebiewski
Date: Mon, 3 Apr 2023 08:28:24 +0200
Subject: [PATCH] DOCS shift to rst - Post-Training Optimization (#16621)

---
 tools/pot/configs/README.md                |  91 ++--
 tools/pot/docs/CLI.md                      | 140 ++++--
 tools/pot/docs/SaturationIssue.md          |  57 ++-
 tools/pot/docs/SimplifiedMode.md           |  84 ++--
 tools/pot/openvino/tools/pot/api/README.md | 534 ++++++++++++---------
 5 files changed, 543 insertions(+), 363 deletions(-)

diff --git a/tools/pot/configs/README.md b/tools/pot/configs/README.md
index 4a075015e0c..c900db6ea38 100644
--- a/tools/pot/configs/README.md
+++ b/tools/pot/configs/README.md
@@ -1,58 +1,75 @@
 # Configuration File Description {#pot_configs_README}
+
+@sphinxdirective
+
 The tool is designed to work with the configuration file where all the parameters required for the optimization
 are specified. These parameters are organized as a dictionary and stored in
-a JSON file. JSON file allows using comments that are supported by the `jstyleson` Python* package.
+a JSON file. The JSON file allows using comments, which are supported by the ``jstyleson`` Python package.
 Logically all parameters are divided into three groups:
+
 - **Model parameters** that are related to the model definition (e.g. model name, model path, etc.)
 - **Engine parameters** that define parameters of the engine which is responsible for the model inference and data preparation used for optimization and evaluation (e.g. preprocessing parameters, dataset path, etc.)
 - **Compression parameters** that are related to the optimization algorithm (e.g. algorithm name and specific parameters)

-## Model Parameters
+Model Parameters
+####################
+
+.. code-block:: json
+
+   "model": {
+      "model_name": "model_name",
+      "model": "",
+      "weights": ""
+   }

-```json
-"model": {
-    "model_name": "model_name",
-    "model": "",
-    "weights": ""
-    }
-```
 This section contains only three parameters:
-- `"model_name"` - string parameter that defines a model name, e.g. `"MobileNetV2"`
-- `"model"` - string parameter that defines the path to an input model topology (.xml)
-- `"weights"` - string parameter that defines the path to an input model weights (.bin)

-## Engine Parameters
+- ``"model_name"`` - string parameter that defines a model name, e.g. ``"MobileNetV2"``
+- ``"model"`` - string parameter that defines the path to an input model topology (.xml)
+- ``"weights"`` - string parameter that defines the path to an input model weights (.bin)
+
+Engine Parameters
+####################
+
+.. code-block:: json
+
+   "engine": {
+      "type": "accuracy_checker",
+      "config": "./configs/examples/accuracy_checker/mobilenet_v2.yaml"
+   }
+
+
+The main parameter is ``"type"``, which can take two possible options: ``"accuracy_checker"`` (default) or ``"simplified"``. It specifies the engine used for model inference and validation (if supported):
+
+- **Simplified mode** engines. These engines can be used only with the ``DefaultQuantization`` algorithm to get a fully quantized model. They do not use the Accuracy Checker tool and annotation. In this case, the following parameters are applicable:
+
+  - ``"data_source"`` specifies the path to the directory where the calibration data is stored.
+  - ``"layout"`` - (Optional) Layout of input data. Supported values: [``"NCHW"``, ``"NHWC"``, ``"CHW"``, ``"CWH"``].
+
+- **Accuracy Checker** engine. It relies on the :doc:`Deep Learning Accuracy Validation Framework <omz_tools_accuracy_checker>` (Accuracy Checker) when inferencing DL models and working with datasets. 
-```json -"engine": { - "type": "accuracy_checker", - "config": "./configs/examples/accuracy_checker/mobilenet_v2.yaml" - } -``` -The main parameter is `"type"` which can take two possible options: `"accuracy_checher"` (default) or `"simplified"`. It specifies the engine used for model inference and validation (if supported): -- **Simplified mode** engines. These engines can be used only with `DefaultQuantization` algorithm to get a fully quantized model. They do not use the Accuracy Checker tool and annotation. In this case, the following parameters are applicable: - - `"data_source"` Specifies the path to the directory​ where the calibration data is stored. - - `"layout"` - (Optional) Layout of input data. Supported values: [`"NCHW"`, `"NHWC"`, `"CHW"`, `"CWH"`]​. -- **Accuracy Checker** engine. It relies on the [Deep Learning Accuracy Validation Framework](@ref omz_tools_accuracy_checker) (Accuracy Checker) when inferencing DL models and working with datasets. If you have annotations, you can benefit from this mode by measuring accuracy. When this mode is selected, you can use the accuracy-aware algorithms family. There are two options to define engine parameters in this mode: - - Refer to the existing Accuracy Checker configuration file which is represented by the YAML file. It can be a file used for full-precision model validation. In this case, you should define only the `"config"` parameter containing the path to the AccuracyChecker configuration file. - - Define all the [required Accuracy Checker parameters](@ref omz_tools_accuracy_checker_openvino_launcher) - directly in the JSON file. In this case, POT just passes the corresponding dictionary of parameters to the Accuracy Checker when instantiating it. 
- For more details, refer to the corresponding Accuracy Checker information and examples of configuration files provided with the tool:
- - 8-bit quantization of [SSD-MobileNet model](https://github.com/openvinotoolkit/openvino/blob/master/tools/pot/configs/examples/quantization/object_detection/ssd_mobilenetv1_int8.json)

-## Compression Parameters
+- Refer to the existing Accuracy Checker configuration file which is represented by the YAML file. It can be a file used for full-precision model validation. In this case, you should define only the ``"config"`` parameter containing the path to the AccuracyChecker configuration file.
+- Define all the :doc:`required Accuracy Checker parameters <omz_tools_accuracy_checker_openvino_launcher>` directly in the JSON file. In this case, POT just passes the corresponding dictionary of parameters to the Accuracy Checker when instantiating it. For more details, refer to the corresponding Accuracy Checker information and examples of configuration files provided with the tool: 8-bit quantization of `SSD-MobileNet model <https://github.com/openvinotoolkit/openvino/blob/master/tools/pot/configs/examples/quantization/object_detection/ssd_mobilenetv1_int8.json>`__

-For more details about parameters of the concrete optimization algorithm, see descriptions of [Default Quantization](@ref pot_compression_algorithms_quantization_default_README) and [Accuracy-aware Quantizatoin](@ref accuracy_aware_README) methods.
+Compression Parameters
+######################

-## Examples of the Configuration File
+For more details about the parameters of a specific optimization algorithm, see the descriptions of the :doc:`Default Quantization <pot_compression_algorithms_quantization_default_README>` and :doc:`Accuracy-aware Quantization <accuracy_aware_README>` methods.

+Examples of the Configuration File
+##################################

-For a quick start, many examples of configuration files are provided [here](https://github.com/openvinotoolkit/openvino/blob/master/tools/pot/configs/examples). There you can find ready-to-use configurations for the models from various domains: Computer Vision (Image
-Classification, Object Detection, Segmentation), Natural Language Processing, Recommendation Systems. 
We put configuration files for the models which require non-default configuration settings to get accurate results. +For a quick start, many examples of configuration files are provided `here `__. +There, you can find ready-to-use configurations for the models from various domains: Computer Vision (Image Classification, Object Detection, Segmentation), Natural Language Processing, Recommendation Systems. We put configuration files for the models which require non-default configuration settings to get accurate results. -For details on how to run the Post-Training Optimization Tool with a sample configuration file, see the [example](@ref pot_configs_examples_README). +For details on how to run the Post-Training Optimization Tool with a sample configuration file, see the :doc:`example `. -## See Also -* [Optimization with Simplified mode](@ref pot_docs_simplified_mode) +Additional Resources +#################### + +* :doc:`Optimization with Simplified mode ` + +@endsphinxdirective diff --git a/tools/pot/docs/CLI.md b/tools/pot/docs/CLI.md index fe9b3ef5c04..f7d5c06b975 100644 --- a/tools/pot/docs/CLI.md +++ b/tools/pot/docs/CLI.md @@ -5,71 +5,109 @@ .. toctree:: :maxdepth: 1 :hidden: - + Simplified Mode pot_configs_README -@endsphinxdirective -## Introduction -POT command-line interface (CLI) is aimed at optimizing models that are similar to the models from OpenVINO™ [Model Zoo](https://github.com/openvinotoolkit/open_model_zoo) or if there is a valid [AccuracyChecker Tool](@ref omz_tools_accuracy_checker) configuration file for the model. Examples of AccuracyChecker configuration files can be found on [GitHub](https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/public). Each model folder contains YAML configuration file that can be used with POT as is. 
+
+Introduction
+####################

-> **NOTE**: There is also the so-called [Simplified mode](@ref pot_docs_simplified_mode) aimed at optimizatoin of models from the Computer Vision domain and has a simple dataset preprocessing, like image resize and crop. In this case, you can also use POT CLI for optimization. However, the accuracy results are not guaranteed in this case. Moreover, you are also limited in the optimization methods choice since the accuracy measurement is not available.
-
+POT command-line interface (CLI) is aimed at optimizing models that are similar to the models from the OpenVINO `Model Zoo <https://github.com/openvinotoolkit/open_model_zoo>`__, or models for which there is a valid :doc:`AccuracyChecker Tool <omz_tools_accuracy_checker>` configuration file. Examples of AccuracyChecker configuration files can be found on `GitHub <https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/public>`__. Each model folder contains a YAML configuration file that can be used with POT as is.
+
+.. note::
+
+   There is also the so-called :doc:`Simplified mode <pot_docs_simplified_mode>`, aimed at the optimization of models from the Computer Vision domain that have simple dataset preprocessing, like image resize and crop. In this case, you can also use POT CLI for optimization. However, the accuracy results are not guaranteed. Moreover, you are also limited in the choice of optimization methods, since accuracy measurement is not available.
+
+
+Run POT CLI
+####################
-## Run POT CLI
 There are two ways how to run POT via command line:
-- **Basic usage for DefaultQuantization**. In this case you can run POT with basic setting just specifying all the options via command line. `-q default` stands for [DefaultQuantization](../openvino/tools/pot/algorithms/quantization/default/README.md) algorithm:
-  ```sh
-  pot -q default -m -w --ac-config 
-  ```
-- **Basic usage for AccuracyAwareQauntization**. You can also run [AccuracyAwareQuantization](../openvino/tools/pot/algorithms/quantization/accuracy_aware/README.md) method with basic options. 
`--max-drop 0.01` option defines maximum accuracy deviation to 1 absolute percent from the original model: - ```sh - pot -q accuracy_aware -m -w --ac-config --max-drop 0.01 - ``` -- **Advanced usage**. In this case you should prepare a configuration file for the POT where you can specify advanced options for the optimization -methods available. See [POT configuration file description](@ref pot_configs_README) for more details. -To launch the command-line tool with the configuration file run: - ```sh - pot -c - ``` -For all available usage options, use the `-h`, `--help` arguments or refer to the Command-Line Arguments section below. +- **Basic usage for DefaultQuantization**. In this case you can run POT with basic setting just specifying all the options via command line. ``-q default`` stands for :doc:`DefaultQuantization ` algorithm: -By default, the results are dumped into the separate output subfolder inside the `./results` folder that is created -in the same directory where the tool is run from. Use the `-e` option to evaluate the accuracy directly from the tool. + .. code-block:: sh -See also the [End-to-end example](@ref pot_configs_examples_README) about how to run a particular example of 8-bit + pot -q default -m -w --ac-config + +- **Basic usage for AccuracyAwareQuantization**. You can also run :doc:`AccuracyAwareQuantization ` method with basic options. ``--max-drop 0.01`` option defines maximum accuracy deviation to 1 absolute percent from the original model: + + .. code-block:: sh + + pot -q accuracy_aware -m -w --ac-config --max-drop 0.01 + + +- **Advanced usage**. In this case you should prepare a configuration file for the POT where you can specify advanced options for the optimization methods available. See :doc:`POT configuration file description ` for more details. + + To launch the command-line tool with the configuration file run: + + .. 
code-block:: sh + + pot -c + + +For all available usage options, use the ``-h``, ``--help`` arguments or refer to the Command-Line Arguments section below. + +By default, the results are dumped into the separate output subfolder inside the ``./results`` folder that is created +in the same directory where the tool is run from. Use the ``-e`` option to evaluate the accuracy directly from the tool. + +See also the :doc:`End-to-end example ` about how to run a particular example of 8-bit quantization with the POT. -### Command-Line Arguments +Command-Line Arguments +++++++++++++++++++++++ -The following command-line options are available to run the tool: +The following command-line options are available to run the tool: -| Argument | Description | -| ------------------------------------------------- | ------------------------------------------------------- | -| `-h`, `--help` | Optional. Show help message and exit. | -| `-q`, `--quantize` | Quantize model to 8 bits with specified quantization method: `default` or `accuracy_aware`. | -| `--preset` | Use `performance` for fully symmetric quantization or `mixed` preset for symmetric quantization of weight and asymmetric quantization of activations. Applicable only when `-q` option is used.| -| `-m`, `--model` | Path to the optimizing model file (.xml). Applicable only when `-q` option is used. | -| `-w`, `--weights` | Path to the weights file of the optimizing model (.bin). Applicable only when `-q` option is used. | -| `-n`, `--name` | Optional. Model name. Applicable only when `-q` option is used. | -| `--engine {accuracy_checker, simplified}` | Engine type used to specify CLI mode. Default: `accuracy_checker`. | -| `--data-source DATA_DIR` | Optional. Valid and required for Simplified mode only. Specifies the path to calibration data. | -| `--ac-config` | Path to the Accuracy Checker configuration file. Applicable only when `-q` option is used. | -| `--max-drop` | Optional. Maximum accuracy drop. 
Valid only for accuracy-aware quantization. Applicable only when `-q` option is used and `accuracy_aware` method is selected. | -| `-c CONFIG`, `--config CONFIG` | Path to a config file with task- or model-specific parameters. | -| `-e`, `--evaluate` | Optional. Evaluate model on the whole dataset after optimization. | -| `--output-dir OUTPUT_DIR` | Optional. A directory where results are saved. Default: `./results`. | -| `-sm`, `--save-model` | Optional. Save the original full-precision model. | -| `-d`, `--direct-dump` | Optional. Save results to the "optimized" subfolder within the specified output directory with no additional subpaths added at the end. | -| `--log-level {CRITICAL,ERROR,WARNING,INFO,DEBUG}` | Optional. Log level to print. Default: INFO. | -| `--progress-bar` | Optional. Disable CL logging and enable progress bar. | -| `--stream-output` | Optional. Switch model quantization progress display to a multiline mode. Use with third-party components. | -| `--keep-uncompressed-weights` | Optional. Keep Convolution, Deconvolution and FullyConnected weights uncompressed. Use with third-party components.| ++-----------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ +| Argument | Description | ++=====================================================+=======================================================================================================================================================================================================+ +| ``-h``, ```--help``` | Optional. Show help message and exit. 
| ++-----------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ +| ```-q``, ```--quantize``` | Quantize model to 8 bits with specified quantization method: ``default`` or ``accuracy_aware``. | ++-----------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ +| ``--preset`` | Use ``performance`` for fully symmetric quantization or ``mixed`` preset for symmetric quantization of weight and asymmetric quantization of activations. Applicable only when ``-q`` option is used. | ++-----------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ +| ``-m``, ``--model`` | Path to the optimizing model file (.xml). Applicable only when ``-q`` option is used. | ++-----------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ +| ``-w``, ``--weights`` | Path to the weights file of the optimizing model (.bin). Applicable only when ``-q`` option is used. | ++-----------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ +| ``-n``, ``--name`` | Optional. Model name. Applicable only when ``-q`` option is used. 
| ++-----------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ +| ``--engine {accuracy_checker, simplified}`` | Engine type used to specify CLI mode. Default: ``accuracy_checker``. | ++-----------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ +| ``--data-source DATA_DIR`` | Optional. Valid and required for Simplified mode only. Specifies the path to calibration data. | ++-----------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ +| ``--ac-config`` | Path to the Accuracy Checker configuration file. Applicable only when ``-q`` option is used. | ++-----------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ +| ``--max-drop`` | Optional. Maximum accuracy drop. Valid only for accuracy-aware quantization. Applicable only when ``-q`` option is used and ``accuracy_aware`` method is selected. | ++-----------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ +| ``-c CONFIG``, ``--config CONFIG`` | Path to a config file with task- or model-specific parameters. 
| ++-----------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ +| ``-e``, ``--evaluate`` | Optional. Evaluate model on the whole dataset after optimization. | ++-----------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ +| ``--output-dir OUTPUT_DIR`` | Optional. A directory where results are saved. Default: ``./results``. | ++-----------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ +| ``-sm`, `--save-model`` | Optional. Save the original full-precision model. | ++-----------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ +| ``-d`, `--direct-dump`` | Optional. Save results to the "optimized" subfolder within the specified output directory with no additional subpaths added at the end. | ++-----------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ +| ``--log-level {CRITICAL,ERROR,WARNING,INFO,DEBUG}`` | Optional. Log level to print. Default: INFO. 
| ++-----------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ +| ``--progress-bar`` | Optional. Disable CL logging and enable progress bar. | ++-----------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ +| ``--stream-output`` | Optional. Switch model quantization progress display to a multiline mode. Use with third-party components. | ++-----------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ +| ``--keep-uncompressed-weights`` | Optional. Keep Convolution, Deconvolution and FullyConnected weights uncompressed. Use with third-party components. 
| ++-----------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ -## See Also -* [Optimization with Simplified mode](@ref pot_docs_simplified_mode) -* [Post-Training Optimization Best Practices](@ref pot_docs_BestPractices) +See Also +#################### + +* :doc:`Optimization with Simplified mode ` +* :doc:`Post-Training Optimization Best Practices ` + +@endsphinxdirective \ No newline at end of file diff --git a/tools/pot/docs/SaturationIssue.md b/tools/pot/docs/SaturationIssue.md index b98580e0436..e27debf1c51 100644 --- a/tools/pot/docs/SaturationIssue.md +++ b/tools/pot/docs/SaturationIssue.md @@ -1,39 +1,52 @@ # Saturation (overflow) Issue Workaround {#pot_saturation_issue} -## Introduction +@sphinxdirective + +Introduction +#################### + 8-bit instructions of older Intel CPU generations (based on SSE, AVX-2, and AVX-512 instruction sets) are prone to so-called saturation (overflow) of the intermediate buffer when calculating the dot product, which is an essential part of Convolutional or MatMul operations. This saturation can lead to a drop in accuracy when running inference of 8-bit quantized models on the mentioned architectures. Additionally, it is impossible to predict if the issue occurs in a given setup, since most computations are executed in parallel during DL model inference, which makes this process non-deterministic. This is a common problem for models with non-ReLU activation functions and low level of redundancy (for example, optimized or efficient models). It can prevent deploying the model on legacy hardware or creating cross-platform applications. The problem does not occur on GPUs or CPUs with Intel Deep Learning Boost (VNNI) technology and further generations. 
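The arithmetic behind the overflow can be sketched as follows. This is an illustrative model added for clarity, not code from the tool: the saturating pairwise 16-bit accumulation below mirrors the behavior of the AVX-2 ``vpmaddubsw`` step, whereas VNNI widens the accumulator to 32 bits; it also shows why restricting weights to 7 bits (the workaround described later) keeps the worst-case pair sum within the int16 range.

```python
import numpy as np

def pairwise_i16_dot(activations, weights):
    # Model the AVX-2 vpmaddubsw step: multiply unsigned 8-bit activations
    # by signed 8-bit weights and add each adjacent pair of products with
    # saturation to a signed 16-bit intermediate buffer.
    products = activations.astype(np.int32) * weights.astype(np.int32)
    pair_sums = products[0::2] + products[1::2]
    return np.clip(pair_sums, -32768, 32767)  # the saturation happens here

acts = np.array([255, 255], dtype=np.uint8)  # worst-case activations
w8 = np.array([127, 127], dtype=np.int8)     # full 8-bit weights
w7 = np.array([63, 63], dtype=np.int8)       # weights restricted to 7 bits

exact8 = int(np.dot(acts.astype(np.int32), w8.astype(np.int32)))  # 64770
sat8 = int(pairwise_i16_dot(acts, w8)[0])                         # 32767: overflow
exact7 = int(np.dot(acts.astype(np.int32), w7.astype(np.int32)))  # 32130
sat7 = int(pairwise_i16_dot(acts, w7)[0])                         # 32130: exact
```

With 8-bit weights the true pair sum (64770) exceeds the int16 maximum and saturates, silently corrupting the result; with 7-bit weights the same computation stays exact.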
-## Saturation Problem Detection +Saturation Problem Detection +############################ + The only way to detect the saturation issue is to run inference on a CPU that allows it and then on one that does not (for example, a VNNI-based CPU). A significant difference in accuracy (more than 1%) will be the main indicator of the saturation issue impact. -## Saturation Issue Workaround +Saturation Issue Workaround +########################### + While quantizing activations use the full range of 8-bit data types, there is a workaround using only 7 bits to represent weights (of Convolutional or Fully-Connected layers). Using this algorithm for the first layer can help mitigate the saturation issue for many models. However, this can lead to lower accuracy due to reduced representation of weights. -POT tool provides three options to deal with the saturation issue. The options can be enabled in the POT configuration file using the `saturation_fix` parameter: +POT tool provides three options to deal with the saturation issue. The options can be enabled in the POT configuration file using the ``saturation_fix`` parameter: -* "First_layer" option -- (default) fix saturation issue for the first layer. +* "First_layer" option -- (default) fix saturation issue for the first layer. * "All" option -- apply for all layers in the model. * "No" option -- do not apply saturation fix at all. -Below is an example of the section in the POT configuration file with the `saturation_fix` option: -```json -"algorithms": [ - { - "name": "DefaultQuantization", - "params": { - "preset": "performance", - "stat_subset_size": 300, - "saturation_fix": "all" // Apply the saturation fix to all the layers - } - } -] -``` +Below is an example of the section in the POT configuration file with the ``saturation_fix`` option: -If you observe the saturation issue, try the "all" option during model quantization. 
If the accuracy problem still occurs, try using [Quantization-aware training from NNCF](https://github.com/openvinotoolkit/nncf) and fine-tuning the model. +.. code-block:: json + + "algorithms": [ + { + "name": "DefaultQuantization", + "params": { + "preset": "performance", + "stat_subset_size": 300, + "saturation_fix": "all" // Apply the saturation fix to all the layers + } + } + ] + + +If you observe the saturation issue, try the "all" option during model quantization. If the accuracy problem still occurs, try using `Quantization-aware training from NNCF `__ and fine-tuning the model. Use the "no" option when leaving out legacy CPU HW. It might also lead to slightly better accuracy. -## Additional Resources +Additional Resources +#################### -* [Lower Numerical Precision Deep Learning Inference and Training blogpost](https://www.intel.com/content/www/us/en/developer/articles/technical/lower-numerical-precision-deep-learning-inference-and-training.html) -* [Configuration file description](@ref pot_configs_README) \ No newline at end of file +* `Lower Numerical Precision Deep Learning Inference and Training blogpost `__ +* :doc:`Configuration file description ` + +@endsphinxdirective diff --git a/tools/pot/docs/SimplifiedMode.md b/tools/pot/docs/SimplifiedMode.md index bd17d689af6..ea543368c7c 100644 --- a/tools/pot/docs/SimplifiedMode.md +++ b/tools/pot/docs/SimplifiedMode.md @@ -1,58 +1,76 @@ # Optimization with Simplified Mode {#pot_docs_simplified_mode} -## Introduction +@sphinxdirective -Simplified mode is designed to make data preparation for the model optimization process easier. The mode is represented by an implementation of Engine interface from the POT API. It allows reading the data from an arbitrary folder specified by the user. For more details about POT API, refer to the corresponding [description](@ref pot_compression_api_README). Currently, Simplified mode is available only for image data in PNG or JPEG formats, stored in a single folder. 
It supports Computer Vision models with a single input or two inputs where the second is "image_info" (Faster R-CNN, Mask R-CNN, etc.).

+Introduction
+####################

-> **NOTE**: This mode cannot be used with accuracy-aware methods. There is no way to control accuracy after optimization. Nevertheless, this mode can be helpful to estimate performance benefits when using model optimizations.
+Simplified mode is designed to make data preparation for the model optimization process easier. The mode is represented by an implementation of the Engine interface from the POT API. It allows reading the data from an arbitrary folder specified by the user. For more details about the POT API, refer to the corresponding :doc:`description <pot_compression_api_README>`. Currently, Simplified mode is available only for image data in PNG or JPEG formats, stored in a single folder. It supports Computer Vision models with a single input or two inputs where the second is "image_info" (Faster R-CNN, Mask R-CNN, etc.).

-## Usage
+.. note::
+
+   This mode cannot be used with accuracy-aware methods. There is no way to control accuracy after optimization. Nevertheless, this mode can be helpful to estimate performance benefits when using model optimizations.
+
+Usage
+####################

 To use the Simplified mode, prepare the data and place it in a separate folder. No other files should be present in this folder.

-To apply optimization when there is only a model and no data is available. It is possible to generate a synthetic dataset using Dataset Management Framework (Datumaro) available on [GitHub](https://github.com/openvinotoolkit/datumaro). Currently, data generation is available only for Computer Vision models, it can take time in some cases.
+To apply optimization when only a model is available and there is no data, it is possible to generate a synthetic dataset using the Dataset Management Framework (Datumaro), available on `GitHub <https://github.com/openvinotoolkit/datumaro>`__. 
Currently, data generation is available only for Computer Vision models, it can take time in some cases. Install Datumaro: -``` bash -pip install datumaro -``` +.. code-block:: bash + + pip install datumaro + + Create a synthetic dataset with elements of the specified type and shape, and save it to the provided directory. Usage: -``` bash -datum generate [-h] -o OUTPUT_DIR -k COUNT --shape SHAPE [SHAPE ...] - [-t {image}] [--overwrite] [--model-dir MODEL_PATH] -``` -Example of generating 300 images with height = 224 and width = 256 and saving them in the `./dataset` directory. -```bash -datum generate -o ./dataset -k 300 --shape 224 256 -``` -After that, `OUTPUT_DIR` can be provided to `--data-source` CLI option or to `data_source` config parameter. +.. code-block:: bash + + datum generate [-h] -o OUTPUT_DIR -k COUNT --shape SHAPE [SHAPE ...] + [-t {image}] [--overwrite] [--model-dir MODEL_PATH] + + +Example of generating 300 images with height = 224 and width = 256 and saving them in the ``./dataset`` directory. + +.. code-block:: bash + + datum generate -o ./dataset -k 300 --shape 224 256 + + +After that, ``OUTPUT_DIR`` can be provided to ``--data-source`` CLI option or to ``data_source`` config parameter. There are two options to run POT in the Simplified mode: * Using command-line options only. Here is an example for 8-bit quantization: - - `pot -q default -m -w --engine simplified --data-source ` + + ``pot -q default -m -w --engine simplified --data-source `` + * To provide more options, use the corresponding `"engine"` section in the POT configuration file as follows: - ```json - "engine": { - "type": "simplified", - "layout": "NCHW", // Layout of input data. Supported ["NCHW", - // "NHWC", "CHW", "CWH"] layout - "data_source": "PATH_TO_SOURCE" // You can specify path to the directory with images - // Also you can specify template for file names to filter images to load. 
- // Templates are unix style (this option is valid only in Simplified mode) - } - ``` + + .. code-block:: json + + "engine": { + "type": "simplified", + "layout": "NCHW", // Layout of input data. Supported ["NCHW", + // "NHWC", "CHW", "CWH"] layout + "data_source": "PATH_TO_SOURCE" // You can specify path to the directory with images + // Also you can specify template for file names to filter images to load. + // Templates are unix style (this option is valid only in Simplified mode) + } -A template of configuration file for 8-bit quantization using Simplified mode can be found [at the following link](https://github.com/openvinotoolkit/openvino/blob/master/tools/pot/configs/simplified_mode_template.json). +A template of configuration file for 8-bit quantization using Simplified mode can be found `at the following link `__. -For more details about POT usage via CLI, refer to this [CLI document](@ref pot_compression_cli_README). +For more details about POT usage via CLI, refer to this :doc:`CLI document `. -## Additional Resources +Additional Resources +#################### - * [Configuration File Description](@ref pot_configs_README) \ No newline at end of file +* :doc:`Configuration File Description ` + +@endsphinxdirective \ No newline at end of file diff --git a/tools/pot/openvino/tools/pot/api/README.md b/tools/pot/openvino/tools/pot/api/README.md index 652c542c161..e7ebdc98ce8 100644 --- a/tools/pot/openvino/tools/pot/api/README.md +++ b/tools/pot/openvino/tools/pot/api/README.md @@ -1,307 +1,401 @@ # API Reference {#pot_compression_api_README} +@sphinxdirective + Post-training Optimization Tool API provides a full set of interfaces and helpers that allow users to implement a custom optimization pipeline for various types of DL models including cascaded or compound models. Below is a full specification of this API: -### DataLoader +DataLoader +++++++++++++++++++++ + +.. 
code-block:: sh

   class openvino.tools.pot.DataLoader(config)


-```
-class openvino.tools.pot.DataLoader(config)
-```
The base class for all DataLoaders.

-`DataLoader` loads data from a dataset and applies pre-processing to them providing access to the pre-processed data
+``DataLoader`` loads data from a dataset, applies pre-processing, and provides access to the pre-processed data
by index.

-All subclasses should override `__len__()` function, which should return the size of the dataset, and `__getitem__()`,
-which supports integer indexing in the range of 0 to `len(self)`. `__getitem__()` method can return data in one of the possible formats:
-```
-(data, annotation)
-```
+All subclasses should override the ``__len__()`` function, which should return the size of the dataset, and ``__getitem__()``,
+which supports integer indexing in the range of 0 to ``len(self)``. The ``__getitem__()`` method can return data in one of the possible formats:
+
+.. code-block:: sh
+
+   (data, annotation)
+
+
or

-```
-(data, annotation, metadata)
-```
-`data` is the input that is passed to the model at inference so that it should be properly preprocessed. `data` can be either `numpy.array` object or dictionary where the key is the name of the model input and value is `numpy.array` which corresponds to this input. The format of `annotation` should correspond to the expectations of the `Metric` class. `metadata` is an optional field that can be used to store additional information required for post-processing.
-### Metric

+.. code-block:: sh
+
+   (data, annotation, metadata)
+
+
+``data`` is the input that is passed to the model at inference, so it should be properly preprocessed. ``data`` can be either a ``numpy.array`` object or a dictionary where the key is the name of the model input and the value is the ``numpy.array`` which corresponds to this input. The format of ``annotation`` should correspond to the expectations of the ``Metric`` class.
``metadata`` is an optional field that can be used to store additional information required for post-processing. + +Metric +++++++++++++++++++++ + +.. code-block:: sh + + class openvino.tools.pot.Metric() + -``` -class openvino.tools.pot.Metric() -``` An abstract class representing an accuracy metric. All instances should override the following properties: -- `value` - returns the accuracy metric value for the last model output in a format of `Dict[str, numpy.array]`. -- `avg_value` - returns the average accuracy metric over collected model results in a format of `Dict[str, numpy.array]`. -- `higher_better` should return `True` if a higher value of the metric corresponds to better performance, otherwise, returns `False`. Default implementation returns `True`. + +- ``value`` - returns the accuracy metric value for the last model output in a format of ``Dict[str, numpy.array]``. +- ``avg_value`` - returns the average accuracy metric over collected model results in a format of ``Dict[str, numpy.array]``. +- ``higher_better`` should return ``True`` if a higher value of the metric corresponds to better performance, otherwise, returns ``False``. Default implementation returns ``True``. and methods: -- `update(output, annotation)` - calculates and updates the accuracy metric value using the last model output and annotation. The model output and annotation should be passed in this method. It should also contain the model-specific post-processing in case the model returns the raw output. -- `reset()` - resets collected accuracy metric. -- `get_attributes()` - returns a dictionary of metric attributes: - ``` - {metric_name: {attribute_name: value}} - ``` - Required attributes: - - `direction` - (`higher-better` or `higher-worse`) a string parameter defining whether metric value - should be increased in accuracy-aware algorithms. - - `type` - a string representation of metric type. For example, 'accuracy' or 'mean_iou'. 
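
The ``Metric`` contract above can be illustrated with a short, self-contained sketch. To keep the snippet runnable without OpenVINO installed, the class below does not subclass ``openvino.tools.pot.Metric`` (a real implementation would); the name ``TopOneAccuracy`` and its internals are illustrative assumptions, not part of the POT API:

```python
import numpy as np

# Illustrative top-1 accuracy metric following the Metric contract described
# above (value, avg_value, higher_better, update, reset, get_attributes).
# In a real pipeline this would subclass openvino.tools.pot.Metric.
class TopOneAccuracy:
    def __init__(self):
        self._name = "accuracy"
        self._matches = []  # 1.0 for a correct prediction, 0.0 otherwise

    @property
    def value(self):
        # Metric value for the last model output only.
        return {self._name: [self._matches[-1]]}

    @property
    def avg_value(self):
        # Average metric over all collected model results.
        return {self._name: float(np.mean(self._matches))}

    @property
    def higher_better(self):
        # Higher accuracy corresponds to better performance.
        return True

    def update(self, output, annotation):
        # output is a list with one numpy.array of per-class scores per sample;
        # the model-specific post-processing (argmax) lives here, as required.
        predicted = int(np.argmax(output[0]))
        self._matches.append(float(predicted == annotation))

    def reset(self):
        self._matches = []

    def get_attributes(self):
        return {self._name: {"direction": "higher-better", "type": "accuracy"}}
```

With two collected samples, one predicted correctly, ``avg_value`` yields ``{"accuracy": 0.5}``, while ``value`` reflects only the last output.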
-### Engine +- ``update(output, annotation)`` - calculates and updates the accuracy metric value using the last model output and annotation. The model output and annotation should be passed in this method. It should also contain the model-specific post-processing in case the model returns the raw output. +- ``reset()`` - resets collected accuracy metric. +- ``get_attributes()`` - returns a dictionary of metric attributes: + + .. code-block:: sh + + {metric_name: {attribute_name: value}} + + + Required attributes: + + - ``direction`` - (``higher-better`` or ``higher-worse``) a string parameter defining whether metric value should be increased in accuracy-aware algorithms. + - ``type`` - a string representation of metric type. For example, 'accuracy' or 'mean_iou'. + +Engine +++++++++++++++++++++ + +.. code-block:: sh + + class openvino.tools.pot.Engine(config, data_loader=None, metric=None) -``` -class openvino.tools.pot.Engine(config, data_loader=None, metric=None) -``` Base class for all Engines. The engine provides model inference, statistics collection for activations and calculation of accuracy metrics for a dataset. -*Parameters* -- `config` - engine specific config. -- `data_loader` - `DataLoader` instance to iterate over dataset. -- `metric` - `Metric` instance to calculate the accuracy metric of the model. +*Parameters* + +- ``config`` - engine specific config. +- ``data_loader`` - ``DataLoader`` instance to iterate over dataset. +- ``metric`` - ``Metric`` instance to calculate the accuracy metric of the model. All subclasses should override the following methods: -- `set_model(model)` - sets/resets a model.

- *Parameters* - - `model` - `CompressedModel` instance for inference. -- `predict(stats_layout=None, sampler=None, metric_per_sample=False, print_progress=False)` - performs model inference -on the specified subset of data.

+- ``set_model(model)`` - sets/resets a model.
+
+  *Parameters*
+
+  - ``model`` - ``CompressedModel`` instance for inference.
+
+- ``predict(stats_layout=None, sampler=None, metric_per_sample=False, print_progress=False)`` - performs model inference on the specified subset of data.
+
+  *Parameters*
+
+  - ``stats_layout`` - dictionary of statistic collection functions. An optional parameter.
+
+    .. code-block:: sh
+
+       {
+          'node_name': {
+             'stat_name': fn
+          }
+       }
+
  - ``sampler`` - ``Sampler`` instance that provides a way to iterate over the dataset. (See details below).
  - ``metric_per_sample`` - if ``Metric`` is specified and this parameter is set to True, then the metric value should be calculated for each data sample, otherwise for the whole dataset.
  - ``print_progress`` - print inference progress.
-
+
  *Returns*

-  - a tuple of dictionaries of per-sample and overall metric values if `metric_per_sample` is True
-  ```
-  (
-    {
-      'sample_id': sample_index,
-      'metric_name': metric_name,
-      'result': metric_value
-    },
-    {
-      'metric_name': metric_value
-    }
-  )
-  ```
-  Otherwise, a dictionary of overall metrics.
-  ```
-  { 'metric_name': metric_value }
-  ```
-- a dictionary of collected statistics
-  ```
-  {
-    'node_name': {
-      'stat_name': [statistics]
-    }
-  }
-  ```
-### Pipeline
+  - a tuple of dictionaries of per-sample and overall metric values if ``metric_per_sample`` is True
+
+    .. code-block:: sh
+
+       (
+        {
+           'sample_id': sample_index,
+           'metric_name': metric_name,
+           'result': metric_value
+        },
+        {
+           'metric_name': metric_value
+        }
+       )
+
+
+    Otherwise, a dictionary of overall metrics.
+
+    .. code-block:: sh
+
+       { 'metric_name': metric_value }
+
+
+- a dictionary of collected statistics
+
+  .. code-block:: sh
+
+     {
+        'node_name': {
+           'stat_name': [statistics]
+        }
+     }
+
+
+Pipeline
+++++++++++++++++++++
+
+.. code-block:: sh
+
+   class openvino.tools.pot.Pipeline(engine)
+
-```
-class openvino.tools.pot.Pipeline(engine)
-```
The Pipeline class represents the optimization pipeline.

-*Parameters*
-- `engine` - instance of `Engine` class for model inference.
+*Parameters*
+
+- ``engine`` - instance of the ``Engine`` class for model inference.

-The pipeline can be applied to the DL model by calling `run(model)` method where `model` is the `NXModel` instance.
+The pipeline can be applied to the DL model by calling the ``run(model)`` method, where ``model`` is a ``CompressedModel`` instance.
+
-#### Create a pipeline
+Create a pipeline
+--------------------

The POT Python API provides the utility function to create and configure the pipeline:

-```
-openvino.tools.pot.create_pipeline(algo_config, engine)
-```
-*Parameters*
-- `algo_config` - a list defining optimization algorithms and their parameters included in the optimization pipeline.
-  The order in which they are applied to the model in the optimization pipeline is determined by the order in the list.
+
+.. code-block:: sh
+
+   openvino.tools.pot.create_pipeline(algo_config, engine)
+
+
+*Parameters*
+
+- ``algo_config`` - a list defining optimization algorithms and their parameters included in the optimization pipeline.
+ The order in which they are applied to the model in the optimization pipeline is determined by the order in the list. Example of the algorithm configuration of the pipeline: - ``` - algo_config = [ - { - 'name': 'DefaultQuantization', - 'params': { - 'preset': 'performance', - 'stat_subset_size': 500 - } - }, - ... - ] - ``` -- `engine` - instance of `Engine` class for model inference. + + .. code-block:: sh + + algo_config = [ + { + 'name': 'DefaultQuantization', + 'params': { + 'preset': 'performance', + 'stat_subset_size': 500 + } + }, + ... + ] + + +- ``engine`` - instance of ``Engine`` class for model inference. *Returns* -- instance of the `Pipeline` class. -## Helpers and Internal Model Representation +- instance of the ``Pipeline`` class. + +Helpers and Internal Model Representation +######################################### + In order to simplify implementation of optimization pipelines we provide a set of ready-to-use helpers. Here we also describe internal representation of the DL model and how to work with it. -### IEEngine +IEEngine +++++++++++++++++++++ -``` -class openvino.tools.pot.IEEngine(config, data_loader=None, metric=None) -``` -IEEngine is a helper which implements Engine class based on [OpenVINO™ Inference Engine Python* API](https://docs.openvino.ai/latest/api/ie_python_api/api.html). +.. code-block:: sh + + class openvino.tools.pot.IEEngine(config, data_loader=None, metric=None) + +IEEngine is a helper which implements Engine class based on :doc:`OpenVINO™ Inference Engine Python API `. This class support inference in synchronous and asynchronous modes and can be reused as-is in the custom pipeline or with some modifications, e.g. in case of custom post-processing of inference results. The following methods can be overridden in subclasses: -- `postprocess_output(outputs, metadata)` - Processes model output data using the image metadata obtained during data loading.

+
+- ``postprocess_output(outputs, metadata)`` - Processes model output data using the image metadata obtained during data loading.
+
  *Parameters*
-  - `outputs` - dictionary of output data per output name.
-  - `metadata` - information about the data used for inference.
-
+
+  - ``outputs`` - dictionary of output data per output name.
+  - ``metadata`` - information about the data used for inference.
+
  *Return*
+
  - list of the output data in an order expected by the accuracy metric if any is used
-
-`IEEngine` supports data returned by `DataLoader` in the format:
-```
-(data, annotation)
-```
+
+``IEEngine`` supports data returned by ``DataLoader`` in the format:
+
+.. code-block:: sh
+
+   (data, annotation)
+
+
or

-```
-(data, annotation, metadata)
-```
-Metric values returned by a `Metric` instance are expected to be in the format:
-- for `value()`:
-```
-{metric_name: [metric_values_per_image]}
-```
-- for `avg_value()`:
-```
-{metric_name: metric_value}
-```
+.. code-block:: sh
+
+   (data, annotation, metadata)
+
+
+Metric values returned by a ``Metric`` instance are expected to be in the format:
+
+- for ``value()``:
+
+  .. code-block:: sh
+
+     {metric_name: [metric_values_per_image]}
+
+- for ``avg_value()``:
+
+  .. code-block:: sh
+
+     {metric_name: metric_value}
+
+
-In order to implement a custom `Engine` class you may need to get familiar with the following interfaces:
-### CompressedModel
-The Python* POT API provides the `CompressedModel` class as one interface for working with single and cascaded DL model.
+In order to implement a custom ``Engine`` class, you may need to get familiar with the following interfaces:
+
+CompressedModel
+++++++++++++++++++++
+
+The Python POT API provides the ``CompressedModel`` class as one interface for working with single and cascaded DL models.
It is used to load and save the model and, in the case of a cascaded model, to access each model of the cascade.
-``` -class openvino.tools.pot.graph.nx_model.CompressedModel(**kwargs) -``` +.. code-block:: sh + + class openvino.tools.pot.graph.nx_model.CompressedModel(**kwargs) + The CompressedModel class provides a representation of the DL model. A single model and cascaded model can be represented as an instance of this class. The cascaded model is stored as a list of models. *Properties* -- `models` - list of models of the cascaded model. -- `is_cascade` - returns True if the loaded model is cascaded model. - -### Read model from OpenVINO IR -The Python* POT API provides the utility function to load model from the OpenVINO™ Intermediate Representation (IR): -``` -openvino.tools.pot.load_model(model_config) -``` +- ``models`` - list of models of the cascaded model. +- ``is_cascade`` - returns True if the loaded model is cascaded model. + +Read model from OpenVINO IR +++++++++++++++++++++++++++++++ + +The Python POT API provides the utility function to load model from the OpenVINO™ Intermediate Representation (IR): + +.. code-block:: sh + + openvino.tools.pot.load_model(model_config) + *Parameters* -- `model_config` - dictionary describing a model that includes the following attributes: - - `model_name` - model name. - - `model` - path to the network topology (.xml). - - `weights` - path to the model weights (.bin). - - Example of `model_config` for a single model: - ``` - model_config = { - 'model_name': 'mobilenet_v2', - 'model': '/mobilenet_v2.xml', - 'weights': '/mobilenet_v2.bin' - } - ``` - Example of `model_config` for a cascaded model: - ``` - model_config = { - 'model_name': 'mtcnn', - 'cascade': [ - { - 'name': 'pnet', - "model": '/pnet.xml', - 'weights': '/pnet.bin' - }, - { - 'name': 'rnet', - 'model': '/rnet.xml', - 'weights': '/rnet.bin' - }, - { - 'name': 'onet', - 'model': '/onet.xml', - 'weights': '/onet.bin' - } - ] - } - ``` + +- ``model_config`` - dictionary describing a model that includes the following attributes: + - ``model_name`` - model name. 
+ - ``model`` - path to the network topology (.xml). + - ``weights`` - path to the model weights (.bin). + + Example of ``model_config`` for a single model: + + .. code-block:: sh + + model_config = { + 'model_name': 'mobilenet_v2', + 'model': '/mobilenet_v2.xml', + 'weights': '/mobilenet_v2.bin' + } + + Example of ``model_config`` for a cascaded model: + + .. code-block:: sh + + model_config = { + 'model_name': 'mtcnn', + 'cascade': [ + { + 'name': 'pnet', + "model": '/pnet.xml', + 'weights': '/pnet.bin' + }, + { + 'name': 'rnet', + 'model': '/rnet.xml', + 'weights': '/rnet.bin' + }, + { + 'name': 'onet', + 'model': '/onet.xml', + 'weights': '/onet.bin' + } + ] + } + *Returns* -- `CompressedModel` instance -#### Save model to IR -The Python* POT API provides the utility function to save model in the OpenVINO™ Intermediate Representation (IR): -``` -openvino.tools.pot.save_model(model, save_path, model_name=None, for_stat_collection=False) -``` +- ``CompressedModel`` instance + +Save model to IR +---------------- + +The Python POT API provides the utility function to save model in the OpenVINO™ Intermediate Representation (IR): + +.. code-block:: sh + + openvino.tools.pot.save_model(model, save_path, model_name=None, for_stat_collection=False) + + *Parameters* -- `model` - `CompressedModel` instance. -- `save_path` - path to save the model. -- `model_name` - name under which the model will be saved. -- `for_stat_collection` - whether model is saved to be used for statistic collection or for normal inference - (affects only cascaded models). If set to False, removes model prefixes from node names. + +- ``model`` - ``CompressedModel`` instance. +- ``save_path`` - path to save the model. +- ``model_name`` - name under which the model will be saved. +- ``for_stat_collection`` - whether model is saved to be used for statistic collection or for normal inference (affects only cascaded models). If set to False, removes model prefixes from node names. 
*Returns* + - list of dictionaries with paths: - ``` - [ - { - 'name': model name, - 'model': path to .xml, - 'weights': path to .bin - }, - ... - ] - ``` -### Sampler + .. code-block:: sh + + [ + { + 'name': model name, + 'model': path to .xml, + 'weights': path to .bin + }, + ... + ] + + +Sampler +++++++++++++++++++++ + +.. code-block:: sh + + class openvino.tools.pot.samplers.Sampler(data_loader=None, batch_size=1, subset_indices=None) -``` -class openvino.tools.pot.samplers.Sampler(data_loader=None, batch_size=1, subset_indices=None) -``` Base class for all Samplers. Sampler provides a way to iterate over the dataset. -All subclasses overwrite `__iter__()` method, providing a way to iterate over the dataset, and a `__len__()` method +All subclasses overwrite ``__iter__()`` method, providing a way to iterate over the dataset, and a ``__len__()`` method that returns the length of the returned iterators. -*Parameters* -- `data_loader` - instance of `DataLoader` class to load data. -- `batch_size` - number of items in batch, default is 1. -- `subset_indices` - indices of samples to load. If `subset_indices` is set to None then the sampler will take elements - from the whole dataset. +*Parameters* -### BatchSampler +- ``data_loader`` - instance of ``DataLoader`` class to load data. +- ``batch_size`` - number of items in batch, default is 1. +- ``subset_indices`` - indices of samples to load. If ``subset_indices`` is set to None then the sampler will take elements from the whole dataset. -``` -class openvino.tools.pot.samplers.batch_sampler.BatchSampler(data_loader, batch_size=1, subset_indices=None): -``` -Sampler provides an iterable over the dataset subset if `subset_indices` is specified or over the whole dataset with -given `batch_size`. Returns a list of data items. +BatchSampler +++++++++++++ + +.. 
code-block:: sh

   class openvino.tools.pot.samplers.batch_sampler.BatchSampler(data_loader, batch_size=1, subset_indices=None)

+Sampler provides an iterable over the dataset subset if ``subset_indices`` is specified
+or over the whole dataset with the given ``batch_size``. Returns a list of data items.
+
+@endsphinxdirective
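
The ``DataLoader`` and ``Sampler``/``BatchSampler`` contracts described in this reference can be sketched in plain Python. The names ``ListDataLoader`` and ``SimpleBatchSampler`` and their internals are illustrative assumptions (the snippet deliberately avoids importing OpenVINO); real code would subclass ``openvino.tools.pot.DataLoader`` and use ``openvino.tools.pot.samplers.batch_sampler.BatchSampler``:

```python
# Minimal, self-contained sketch of the DataLoader/Sampler contracts above.
class ListDataLoader:
    """Serves (data, annotation) pairs by integer index, as DataLoader does."""

    def __init__(self, samples):
        self._samples = list(samples)

    def __len__(self):
        return len(self._samples)

    def __getitem__(self, index):
        if index >= len(self):
            raise IndexError("index out of range")
        return self._samples[index]


class SimpleBatchSampler:
    """Iterates over a dataset subset (or the whole dataset when
    subset_indices is None), yielding lists of batch_size data items."""

    def __init__(self, data_loader, batch_size=1, subset_indices=None):
        self._loader = data_loader
        self._batch_size = batch_size
        self._indices = (list(subset_indices) if subset_indices is not None
                         else list(range(len(data_loader))))

    def __len__(self):
        return len(self._indices)

    def __iter__(self):
        # Yield one list of data items per batch.
        for start in range(0, len(self._indices), self._batch_size):
            chunk = self._indices[start:start + self._batch_size]
            yield [self._loader[i] for i in chunk]
```

For example, a loader with five samples iterated with ``subset_indices=[0, 2, 3]`` and ``batch_size=2`` yields two batches: samples 0 and 2, then sample 3 alone.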