Revised Tuning For Performance and Model optimization docs (#10276)

* Revised Tuning for performance and Model optimization docs

* Fixed links

* Fixed link

* Applied comments

* Fixed one more comment
Alexander Kozlov
2022-03-03 18:58:58 +03:00
committed by GitHub
parent 554b50eb85
commit 1bbd92a8f8
5 changed files with 111 additions and 23 deletions


@@ -26,11 +26,11 @@
    :caption: Tuning for Performance
    :hidden:
 
-   openvino_docs_performance_benchmarks
    openvino_docs_optimization_guide_dldt_optimization_guide
    openvino_docs_MO_DG_Getting_Performance_Numbers
-   pot_README
+   openvino_docs_model_optimization_guide
    openvino_docs_tuning_utilities
+   openvino_docs_performance_benchmarks
 
 .. toctree::


@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:d7a58f31b2043fe9d92892b1f40ed8a7c596c36ef9d1cd1c71adb981009161bf
size 45665


@@ -9,32 +9,20 @@ Performance means how fast the model is in deployment. Two key metrics are used
 ![](../img/LATENCY_VS_THROUGHPUT.svg)
 
-Latency measures inference time (ms) required to process a single input. When it comes to batch input need to measure throughput (images per second or frames per second, FPS). To calculate throughput, divide number of frames that were processed by the processing time.
+Latency measures the inference time (ms) required to process a single input. For batch input, measure throughput instead (images per second or frames per second, FPS). To calculate throughput, divide the number of frames that were processed by the processing time.
 
-## How to measure performance
-To get performance numbers for OpenVINO, as well as tips how to measure it and compare with native framework, go to [Getting performance numbers](../MO_DG/prepare_model/Getting_performance_numbers.md) page.
+> **NOTE**: To get performance numbers for OpenVINO, as well as tips on how to measure them and compare with the native framework, check the [Getting performance numbers](../MO_DG/prepare_model/Getting_performance_numbers.md) page.
 
 ## How to Improve Performance
 
-> **NOTE**: Make sure that your model can be successfully inferred with OpenVINO Inference Engine before reffering to the optimization topic.
+> **NOTE**: Make sure that your model can be successfully inferred with OpenVINO Runtime before starting any optimization.
 
-Inside OpenVINO there are two ways how to get better performance number: during developing and deployment your model. **It is possible to combine both developing and deployment optimizations**.
+Inside OpenVINO there are two ways to get better performance numbers: optimize the model (**model optimization**) or tune the parameters of execution (**deployment optimization**). Note that it is possible to combine both types of optimization.
 
-- **Developing step** includes model modification. Inside developing optimization there are three ways to optimize your model:
-  - **Post-training Optimization tool** (POT) is designed to optimize the inference of deep learning models by applying special methods without model retraining or fine-tuning, like post-training quantization.
-  - **Neural Network Compression Framework (NNCF)** provides a suite of advanced algorithms for Neural Networks inference optimization with minimal accuracy drop, available quantization, pruning and sparsity optimization algorithms.
-  - **Model Optimizer** implement some optimization to a model, most of them added by default, but you can configure mean/scale values, batch size RGB vs BGR input channels and other parameters to speed-up preprocess of a model ([Additional Optimization Use Cases](../MO_DG/prepare_model/Additional_Optimizations.md))
-- **Deployment step** includes tuning inference parameters and optimizing model execution, to read more visit [Deployment Optimization Guide](../optimization_guide/dldt_deployment_optimization_guide.md).
-
-More detailed workflow:
-![](../img/DEVELOPMENT_FLOW_V3_crunch.svg)
-To understand when to use each development optimization tool, follow this diagram:
-POT is the easiest way to get optimized models and it is also really fast and usually takes several minutes depending on the model size and used HW. NNCF can be considered as an alternative or an addition when the first does not give accurate results.
-![](../img/WHAT_TO_USE.svg)
+- **Model optimization** includes model modification, such as quantization, pruning, optimization of preprocessing, etc. For more details, refer to the [Model Optimization Guide](./model_optimization_guide.md).
+- **Deployment optimization** includes tuning inference parameters and optimizing model execution. To read more, visit the [Deployment Optimization Guide](../optimization_guide/dldt_deployment_optimization_guide.md).
 
+## Performance benchmarks
+To estimate performance and compare performance numbers measured on various supported devices, see the wide range of public models in the [Performance benchmarks](../benchmarks/performance_benchmarks.md) section.
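As an illustration of the latency/throughput definitions in this file, the arithmetic can be sketched in plain Python. The function names and numbers below are illustrative only, not part of OpenVINO:

```python
def throughput_fps(frames_processed, processing_time_s):
    """Throughput = number of processed frames divided by the processing time."""
    return frames_processed / processing_time_s

def avg_latency_ms(processing_time_s, frames_processed):
    """Average per-frame latency in milliseconds (single-stream view)."""
    return processing_time_s / frames_processed * 1000.0

# Example: 480 frames processed in 6 seconds
print(throughput_fps(480, 6.0))    # -> 80.0 FPS
print(avg_latency_ms(6.0, 480))    # -> 12.5 ms per frame on average
```

Note that with batching or multiple parallel infer requests, per-request latency and aggregate throughput stop being simple inverses of each other, which is why the docs treat them as two separate metrics.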


@@ -0,0 +1,34 @@
# Model Optimization Guide {#openvino_docs_model_optimization_guide}

@sphinxdirective

.. toctree::
   :maxdepth: 1
   :hidden:

   pot_README
   docs_nncf_introduction

@endsphinxdirective
Model optimization means applying transformations, such as quantization, pruning, or preprocessing optimization, to the model and the relevant data flow to improve inference performance. These transformations are applied offline and may require training and validation data. OpenVINO provides several tools to optimize models at different steps of model development:

- **Post-training Optimization tool [(POT)](../../tools/pot/README.md)** optimizes the inference of deep learning models by applying post-training methods, such as post-training quantization, that do not require model retraining or fine-tuning.
- **Neural Network Compression Framework [(NNCF)](./nncf_introduction.md)** provides a suite of advanced algorithms for neural network inference optimization with minimal accuracy drop, for example, quantization and pruning algorithms.
- **Model Optimizer** applies most of its optimizations by default, but you can also configure mean/scale values, batch size, RGB vs. BGR input channels, and other parameters to speed up preprocessing of a model ([Additional Optimization Use Cases](../MO_DG/prepare_model/Additional_Optimizations.md)).
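For intuition only, what post-training quantization (the POT case above) does to weights can be sketched in plain Python. This is a generic symmetric 8-bit scheme for illustration, not the exact POT algorithm:

```python
def quantize_int8(weights):
    """Map float weights to int8 codes with a single per-tensor scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs else 1.0
    codes = [max(-128, min(127, round(w / scale))) for w in weights]
    return codes, scale

def dequantize(codes, scale):
    """Reconstruct approximate float weights from int8 codes."""
    return [c * scale for c in codes]

w = [0.5, -1.27, 0.03, 1.0]
codes, scale = quantize_int8(w)
print(codes)                     # -> [50, -127, 3, 100]
print(dequantize(codes, scale))  # close to the original weights
```

The int8 codes are what gets stored and computed with at inference time; the small reconstruction error is the accuracy cost that POT's calibration methods try to minimize.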
## Detailed workflow

![](../img/DEVELOPMENT_FLOW_V3_crunch.svg)

To understand which development optimization tool you need, refer to the diagram below. POT is the easiest way to get an optimized model and usually takes several minutes, depending on the model size and the hardware used. NNCF can be considered an alternative or an addition when POT does not give accurate results.
![](../img/WHAT_TO_USE.svg)
## See also
- [Deployment optimization](./dldt_deployment_optimization_guide.md)


@@ -0,0 +1,63 @@
# Neural Network Compression Framework {#docs_nncf_introduction}
This document describes the Neural Network Compression Framework (NNCF), which is developed as a separate project outside of OpenVINO™ but is highly aligned with OpenVINO™ in terms of supported optimization features and models. It is open source and available on [GitHub](https://github.com/openvinotoolkit/nncf).
## Introduction
The Neural Network Compression Framework (NNCF) aims to optimize Deep Neural Networks (DNNs) by applying optimization methods, such as quantization and pruning, to the original framework model. It mostly provides in-training optimization capabilities, which means that the optimization methods require model fine-tuning during and after optimization. The diagram below shows the model optimization workflow using NNCF.
![](../img/nncf_workflow.png)
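The "in-training" idea can be illustrated with a tiny fake-quantization sketch in plain Python: during fine-tuning, the forward pass sees weights snapped to a quantization grid, so the model learns to tolerate the rounding error. This is a conceptual sketch, not NNCF's actual mechanics:

```python
def fake_quantize(w, levels=256, w_min=-1.0, w_max=1.0):
    """Simulate 8-bit quantization in the forward pass:
    clip to [w_min, w_max], then round onto a uniform grid of `levels` points."""
    step = (w_max - w_min) / (levels - 1)
    clipped = min(max(w, w_min), w_max)
    return round((clipped - w_min) / step) * step + w_min

weights = [0.3337, -0.914, 0.05]
print([fake_quantize(w) for w in weights])  # values snapped to the 256-level grid
```

In an actual NNCF pipeline, such fake-quantization operations are inserted automatically into the model graph, and gradients flow through them during fine-tuning.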
### Features
- Support for optimization of PyTorch and TensorFlow 2.x models.
- Support for various optimization algorithms, applied during model fine-tuning to achieve a better performance-accuracy trade-off:
|Compression algorithm|PyTorch|TensorFlow 2.x|
| :--- | :---: | :---: |
|[8-bit quantization](https://github.com/openvinotoolkit/nncf/blob/develop/docs/compression_algorithms/Quantization.md) | Supported | Supported |
|[Filter pruning](https://github.com/openvinotoolkit/nncf/blob/develop/docs/compression_algorithms/Pruning.md) | Supported | Supported |
|[Sparsity](https://github.com/openvinotoolkit/nncf/blob/develop/docs/compression_algorithms/Sparsity.md) | Supported | Supported |
|[Mixed-precision quantization](https://github.com/openvinotoolkit/nncf/blob/develop/docs/compression_algorithms/Quantization.md#mixed_precision_quantization) | Supported | Not supported |
|[Binarization](https://github.com/openvinotoolkit/nncf/blob/develop/docs/compression_algorithms/Binarization.md) | Supported | Not supported |
- Stacking of optimization methods, for example: 8-bit quantization + filter pruning.
- Support for [Accuracy-Aware model training](https://github.com/openvinotoolkit/nncf/blob/develop/docs/Usage.md#accuracy-aware-model-training) pipelines via the [Adaptive Compression Level Training](https://github.com/openvinotoolkit/nncf/tree/develop/docs/accuracy_aware_model_training/AdaptiveCompressionLevelTraining.md) and [Early Exit Training](https://github.com/openvinotoolkit/nncf/tree/develop/docs/accuracy_aware_model_training/EarlyExitTrainig.md).
- Automatic, configurable model graph transformation to obtain the compressed model.
> **NOTE**: Support for TensorFlow models is limited. Only models created using the Sequential or Keras Functional API are supported.
- GPU-accelerated layers for faster fine-tuning of compressed models.
- Distributed training support.
- Configuration file examples for each supported compression algorithm.
- Exporting PyTorch compressed models to ONNX\* checkpoints and TensorFlow compressed models to SavedModel or Frozen Graph format, ready to use with [OpenVINO™ toolkit](https://github.com/openvinotoolkit/).
- Git patches for prominent third-party repositories ([huggingface-transformers](https://github.com/huggingface/transformers)) that demonstrate the process of integrating NNCF into custom training pipelines.
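For intuition only, the magnitude-based sparsity algorithm listed in the feature table above can be sketched in plain Python. This shows just the core thresholding step; NNCF's real implementation operates on framework graphs and is driven during fine-tuning:

```python
def apply_magnitude_sparsity(weights, sparsity_level):
    """Zero out the smallest-magnitude weights until the target sparsity is met."""
    n_zero = int(len(weights) * sparsity_level)
    # Threshold = magnitude of the n_zero-th smallest weight (no ties assumed here)
    threshold = sorted(abs(w) for w in weights)[n_zero - 1] if n_zero else -1.0
    return [0.0 if abs(w) <= threshold else w for w in weights]

w = [0.9, -0.05, 0.4, 0.01, -0.7, 0.2]
print(apply_magnitude_sparsity(w, 0.5))  # -> [0.9, 0.0, 0.4, 0.0, -0.7, 0.0]
```

Filter pruning follows the same "remove the least important parts" idea, but removes whole convolution filters rather than individual weights, which changes tensor shapes and therefore benefits from fine-tuning afterwards.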
## Get started
### Installation
NNCF is distributed as a package through the PyPI repository. To install the latest version with the pip package manager, run the following command:
```
pip install nncf
```
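Once installed, NNCF is typically driven by a JSON-style configuration. The sketch below shows a minimal quantization config as a plain Python dict; the field names follow the NNCF config schema as commonly documented, but treat them as an assumption and check the NNCF documentation for your version:

```python
import json

# Hypothetical minimal NNCF configuration: the model's input shape
# plus a single compression algorithm to apply during fine-tuning.
nncf_config = {
    "input_info": {"sample_size": [1, 3, 224, 224]},  # NCHW input the model expects
    "compression": {"algorithm": "quantization"},      # 8-bit quantization
}

print(json.dumps(nncf_config, indent=2))
```

In a real pipeline, this dict (or an equivalent `.json` file) would be passed to NNCF's model-wrapping entry point for the target framework before fine-tuning starts.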
### Usage examples
NNCF provides various examples and tutorials that demonstrate the usage of its optimization methods.
### Tutorials
- [Quantization-aware training of PyTorch model](https://github.com/openvinotoolkit/openvino_notebooks/tree/main/notebooks/302-pytorch-quantization-aware-training)
- [Quantization-aware training of TensorFlow model](https://github.com/openvinotoolkit/openvino_notebooks/tree/main/notebooks/305-tensorflow-quantization-aware-training)
- (Experimental) [Post-training quantization of PyTorch model](https://github.com/openvinotoolkit/openvino_notebooks/tree/main/notebooks/112-pytorch-post-training-quantization-nncf)
### Samples
- PyTorch samples:
- [Image Classification sample](https://github.com/openvinotoolkit/nncf/blob/develop/examples/torch/classification/README.md)
- [Object Detection sample](https://github.com/openvinotoolkit/nncf/blob/develop/examples/torch/object_detection/README.md)
  - [Semantic Segmentation sample](https://github.com/openvinotoolkit/nncf/blob/develop/examples/torch/semantic_segmentation/README.md)
- TensorFlow samples:
- [Image Classification sample](https://github.com/openvinotoolkit/nncf/blob/develop/examples/tensorflow/classification/README.md)
- [Object Detection sample](https://github.com/openvinotoolkit/nncf/blob/develop/examples/tensorflow/object_detection/README.md)
- [Instance Segmentation sample](https://github.com/openvinotoolkit/nncf/blob/develop/examples/tensorflow/segmentation/README.md)
## See also
- [Compressed Model Zoo](https://github.com/openvinotoolkit/nncf#nncf-compressed-model-zoo)
- [NNCF in HuggingFace Optimum](https://github.com/dkurt/optimum-openvino)
- [OpenVINO™ Post-training Optimization tool](../../tools/pot/README.md)