Introduction to Intel® Deep Learning Deployment Toolkit
Deployment Challenges
Deploying deep learning networks from the training environment to embedded platforms for inference is a complex task that introduces a number of technical challenges:
- There are a number of deep learning frameworks widely used in the industry, such as Caffe*, TensorFlow*, MXNet*, Kaldi*, and others.
- Training of deep learning networks is typically performed in data centers or server farms, while inference might take place on embedded platforms optimized for performance and power consumption. Such platforms are typically limited both from the software perspective (programming languages, third-party dependencies, memory consumption, supported operating systems) and from the hardware perspective (different data types, limited power envelope), so it is usually not recommended, and sometimes impossible, to use the original training framework for inference. An alternative is to use dedicated inference APIs that are well optimized for specific hardware platforms.
- The deployment process is further complicated by the need to support various layer types in networks that are getting more and more complex. Ensuring the accuracy of the transformed networks is not trivial.
Deployment Workflow
The process assumes that you have a network model trained using one of the supported frameworks.
The scheme below illustrates the typical workflow for deploying a trained deep learning model:
[Figure: typical deployment workflow]
The steps are:
1. Configure Model Optimizer for the specific framework used to train your model.
2. Run Model Optimizer to produce an optimized Intermediate Representation (IR) of the model based on the trained network topology, weights and biases values, and other optional parameters.
3. Test the model in the IR format using the Inference Engine in the target environment with the provided Inference Engine sample applications.
4. Integrate the Inference Engine into your application to deploy the model in the target environment.
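Steps 3 and 4 can be sketched with the 2021.x Inference Engine Python API. This is a minimal sketch: the model paths and the input array are placeholders, and the import is guarded so the snippet can be read without an OpenVINO installation.

```python
# Sketch of steps 3-4: load an IR pair with the Inference Engine and run
# one inference. Uses the 2021.x Python API (IECore); paths are placeholders.
try:
    from openvino.inference_engine import IECore  # requires an OpenVINO install
    HAVE_IE = True
except ImportError:
    HAVE_IE = False  # the sketch remains readable without OpenVINO

def load_and_infer(xml_path, bin_path, input_blob, device="CPU"):
    """Read an IR pair, load it onto a device, and run a single inference."""
    ie = IECore()
    net = ie.read_network(model=xml_path, weights=bin_path)
    exec_net = ie.load_network(network=net, device_name=device)
    input_name = next(iter(net.input_info))  # first input of the network
    return exec_net.infer({input_name: input_blob})
```

The same function works unchanged for any supported device string (for example, "CPU", "GPU", or "MYRIAD"), which is the unified-API point made above.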
Model Optimizer
Model Optimizer is a cross-platform command-line tool that facilitates the transition between the training and deployment environments, performs static model analysis, and automatically adjusts deep learning models for optimal execution on end-point target devices.
Model Optimizer is designed to support multiple deep learning frameworks and formats.
When running Model Optimizer, you do not need to consider the target device: the same Model Optimizer output can be used on all targets.
Model Optimizer Workflow
The process assumes that you have a network model trained using one of the supported frameworks. The Model Optimizer workflow can be described as follows:
- Configure Model Optimizer for the supported deep learning framework that was used to train the model.
- Provide a trained network as input: its topology together with the adjusted weights and biases (and some optional parameters).
- Run Model Optimizer to perform specific model optimizations (for example, horizontal fusion of certain network layers). Exact optimizations are framework-specific; refer to the appropriate documentation pages: Converting a Caffe Model, Converting a TensorFlow Model, Converting an MXNet Model, Converting a Kaldi Model, Converting an ONNX Model.
- Model Optimizer produces as output an Intermediate Representation (IR) of the network, which is used as an input for the Inference Engine on all targets.
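The invocation itself is a single command line. The helper below only assembles the argument list; the wrapper function is hypothetical, while the mo.py entry point and the --input_model/--output_dir flags come from the 2021.x releases.

```python
# Hypothetical helper that assembles a Model Optimizer command line.
# mo.py and the --input_model / --output_dir flags exist in 2021.x releases;
# the wrapper function itself is illustrative.
def build_mo_command(model_path, output_dir):
    """Return the argument list for a generic Model Optimizer invocation."""
    return [
        "python3", "mo.py",
        "--input_model", model_path,  # trained model file, e.g. a frozen .pb
        "--output_dir", output_dir,   # where model.xml / model.bin are written
    ]

# Converting a frozen TensorFlow graph to IR:
print(" ".join(build_mo_command("frozen_inference_graph.pb", "./ir")))
```

Framework-specific converters (for example, mo_tf.py or mo_caffe.py) can also be called directly instead of the generic entry point.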
Supported Frameworks and Formats
- Caffe* (most public branches)
- TensorFlow*
- MXNet*
- Kaldi*
- ONNX*
Supported Models
For the list of supported models, refer to the framework- or format-specific page:
- Supported Caffe* models
- Supported TensorFlow* models
- Supported MXNet* models
- Supported ONNX* models
- Supported Kaldi* models
Intermediate Representation
The Intermediate Representation (IR) that describes a deep learning model plays an important role in connecting the OpenVINO™ toolkit components.
The IR is a pair of files:
- .xml: The topology file, an XML file that describes the network topology
- .bin: The trained data file, a binary file that contains the weights and biases
Intermediate Representation (IR) files can be read and loaded by the Inference Engine and used for inference. The Inference Engine API offers a unified API across a number of supported Intel® platforms. The IR is also consumed, modified, and written by the Post-Training Optimization Tool, which provides quantization capabilities.
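The two-file layout can be illustrated with a toy example. The XML below is a simplified stand-in for the real IR schema, and the weights blob is fabricated:

```python
import struct
import xml.etree.ElementTree as ET

# Toy stand-in for an IR pair: a tiny "topology" XML plus a binary weights
# blob. Element names are illustrative, not the real IR schema.
topology_xml = '<net name="toy"><layers><layer id="0" type="Parameter"/></layers></net>'
weights_bin = struct.pack("<4f", 0.1, 0.2, 0.3, 0.4)  # four float32 "weights"

net = ET.fromstring(topology_xml)
num_layers = len(net.find("layers"))  # layers described by the .xml part
num_weights = len(weights_bin) // 4   # float32 values in the .bin part
print(net.get("name"), num_layers, num_weights)  # → toy 1 4
```

The split matters in practice: the .xml part is small and human-readable, while the bulk of the model (the weights) stays in the compact binary .bin part.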
For further details, refer to the dedicated description of the Intermediate Representation and Operation Sets.
nGraph Integration
The OpenVINO™ toolkit is powered by nGraph capabilities: the graph construction API, the graph transformation engine, and reshape support. An nGraph Function is used as the intermediate representation for a model at run time, underneath the CNNNetwork API. The conventional CNNNetwork representation is still available for backward compatibility when the conventional API methods are used. Refer to the Overview of nGraph for the details of the nGraph representation.
Inference Engine
Inference Engine is a runtime that delivers a unified API to integrate the inference with application logic:
- Takes a model as an input. The model can be presented in the native ONNX format or in the specific form of Intermediate Representation (IR) produced by Model Optimizer.
- Optimizes inference execution for target hardware.
- Delivers an inference solution with a reduced footprint on embedded inference platforms.
The Inference Engine supports inference of multiple image classification networks (including the AlexNet, GoogLeNet, VGG, and ResNet families), fully convolutional networks such as FCN8 used for image segmentation, and object detection networks such as Faster R-CNN.
For the full list of supported hardware, refer to the Supported Devices section.
For Intel® Distribution of OpenVINO™ toolkit, the Inference Engine package contains headers, runtime libraries, and sample console applications demonstrating how you can use the Inference Engine in your applications.
The open source version is available in the OpenVINO™ toolkit GitHub repository and can be built for supported platforms using the Inference Engine Build Instructions.