
# Model Optimization Guide

@sphinxdirective

.. toctree::
   :maxdepth: 1
   :hidden:

   pot_README
   docs_nncf_introduction
   openvino_docs_IE_DG_Int8Inference

@endsphinxdirective

Model optimization is the process of applying transformations to a model and the relevant data flow to improve inference performance. These transformations are generally performed offline and may require training and validation data. Model optimization includes methods such as quantization, pruning, and preprocessing optimization. OpenVINO provides several tools to optimize models at different stages of model development:

- **Post-training Optimization Tool (POT)** is designed to optimize the inference of deep learning models by applying post-training methods that do not require model retraining or fine-tuning, such as post-training quantization (see the Python sketch further below).

- **Neural Network Compression Framework (NNCF)** provides a suite of advanced algorithms for neural network inference optimization with minimal accuracy drop, for example, quantization and pruning algorithms.

- **Model Optimizer** applies optimizations to a model during conversion. Most of them are enabled by default, but you can also configure mean/scale values, batch size, RGB vs. BGR input channels, and other parameters to speed up the preprocessing of a model (Embedding Preprocessing Computation); see the command-line sketch after this list.
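
As an illustration, a Model Optimizer invocation that embeds mean/scale normalization and channel reversal into the converted model could look like the following. The model file name and the ImageNet-style normalization values are placeholders, not a recommendation:

```sh
mo --input_model model.onnx \
   --mean_values [123.675,116.28,103.53] \
   --scale_values [58.395,57.12,57.375] \
   --reverse_input_channels
```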

Detailed workflow:

To understand which optimization tool you need at each step of development, refer to the diagram below:

POT is the easiest way to obtain an optimized model, and the process usually takes several minutes, depending on the model size and the hardware used. NNCF can be considered as an alternative or an addition when POT does not deliver accurate enough results.
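
For illustration, below is a minimal sketch of post-training quantization with the POT Python API. The model paths, the `images` collection (preprocessed calibration inputs), and the `MyDataLoader` class are hypothetical placeholders, and the algorithm parameters shown are common defaults rather than a tuned configuration:

```python
from openvino.tools.pot import (DataLoader, IEEngine, create_pipeline,
                                load_model, save_model)

# Placeholder data loader: feeds calibration samples to the engine.
class MyDataLoader(DataLoader):
    def __init__(self, images):
        self.images = images

    def __len__(self):
        return len(self.images)

    def __getitem__(self, index):
        # (annotation, image) pair; labels are not needed for DefaultQuantization.
        return (index, None), self.images[index]

model = load_model({"model_name": "model",
                    "model": "model.xml",
                    "weights": "model.bin"})
engine = IEEngine(config={"device": "CPU"}, data_loader=MyDataLoader(images))
algorithms = [{
    "name": "DefaultQuantization",
    "params": {"target_device": "ANY", "preset": "performance",
               "stat_subset_size": 300},
}]
pipeline = create_pipeline(algorithms, engine)
compressed_model = pipeline.run(model)
save_model(compressed_model, save_path="./optimized")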
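
Similarly, here is a minimal sketch of quantization-aware training with NNCF for a PyTorch model. It assumes an existing `model` (`torch.nn.Module`) and `train_loader` (`torch.utils.data.DataLoader`); the fine-tuning loop itself is omitted:

```python
from nncf import NNCFConfig
from nncf.torch import create_compressed_model, register_default_init_args

nncf_config = NNCFConfig.from_dict({
    "input_info": {"sample_size": [1, 3, 224, 224]},  # shape of one input batch
    "compression": {"algorithm": "quantization"},
})
# Initialize quantizer ranges from a few batches of training data.
nncf_config = register_default_init_args(nncf_config, train_loader)

compression_ctrl, compressed_model = create_compressed_model(model, nncf_config)
# ... fine-tune compressed_model with the regular training loop ...
compression_ctrl.export_model("model_int8.onnx")  # ready for Model Optimizer
```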

## See also