# Operation Sets in OpenVINO {#openvino_docs_MO_DG_IR_and_opsets}

@sphinxdirective

.. toctree::
   :maxdepth: 1
   :hidden:

   openvino_docs_ops_opset
   openvino_docs_operations_specifications
   openvino_docs_ops_broadcast_rules


This article provides essential information on the format used for the representation of deep learning models in the OpenVINO toolkit and the supported operation sets.

Overview of Artificial Neural Networks Representation
#####################################################

A deep learning network is usually represented as a directed graph describing the flow of data from the network input data to the inference results.
Input data can be in the form of images, video, text, audio, or preprocessed information representing objects from the target area of interest.

Here is an illustration of a small graph representing a model that consists of a single Convolutional layer and activation function:

.. image:: _static/images/small_IR_graph_demonstration.png

Vertices in the graph represent layers or operation instances, such as convolution, pooling, and element-wise operations with tensors.
The terms "layer" and "operation" are used interchangeably within OpenVINO documentation and define how the input data is processed to produce output data for a node in a graph.
An operation node in a graph may consume data at one or multiple input ports.
For example, an element-wise addition operation has two input ports that accept the tensors to be summed.
Some operations do not have any input ports, for example the ``Const`` operation, which knows the data to be produced without any input.
An edge between operations represents a data flow or data dependency implied from one operation node to another.

Each operation produces data on one or multiple output ports. For example, convolution produces an output tensor with activations at a single output port. The ``Split`` operation usually has multiple output ports, each producing a part of the input tensor.

Depending on the deep learning framework, the graph can also contain extra nodes that explicitly represent tensors between operations.
In such representations, operation nodes are not connected to each other directly. Instead, they use data nodes as intermediate stops for the data flow.
If data nodes are not used, the produced data is associated with an output port of the corresponding operation node that produces the data.

The set of operations used in a network is usually fixed for each deep learning framework.
It determines the expressiveness and level of representation available in that framework.
Sometimes, a network that can be represented in one framework is hard or impossible to represent in another one, or requires a significantly different graph, because the operation sets used in those two frameworks do not match.

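The node-port-edge structure described above can be sketched with a toy data model. This is a hypothetical illustration, not OpenVINO code; all class and field names are made up:

```python
# Toy model of a computation graph: operations expose numbered input/output
# ports, and edges connect an output port of one node to an input port of
# another. Illustrative only; not an OpenVINO API.
from dataclasses import dataclass, field


@dataclass
class Operation:
    name: str
    op_type: str            # e.g. "Convolution", "ReLU", "Const"
    num_inputs: int         # Const has 0 input ports
    num_outputs: int = 1


@dataclass
class Edge:
    src: str                # name of the producing operation
    src_port: int
    dst: str                # name of the consuming operation
    dst_port: int


@dataclass
class Graph:
    ops: dict = field(default_factory=dict)
    edges: list = field(default_factory=list)

    def add_op(self, op: Operation) -> None:
        self.ops[op.name] = op

    def connect(self, src: str, src_port: int, dst: str, dst_port: int) -> None:
        self.edges.append(Edge(src, src_port, dst, dst_port))


# The small Convolution + ReLU model from the illustration above:
g = Graph()
g.add_op(Operation("input", "Parameter", num_inputs=0))
g.add_op(Operation("conv1/weights", "Const", num_inputs=0))
g.add_op(Operation("conv1", "Convolution", num_inputs=2))
g.add_op(Operation("conv1/activation", "ReLU", num_inputs=1))
g.connect("input", 0, "conv1", 0)            # data input port
g.connect("conv1/weights", 0, "conv1", 1)    # weights input port
g.connect("conv1", 0, "conv1/activation", 0)
print(len(g.ops), "operations,", len(g.edges), "edges")
```

Note how the two input ports of the element-wise-style ``Convolution`` node are addressed by number, while the ``Const`` and ``Parameter`` nodes have no input ports at all.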
Intermediate Representation Used in OpenVINO
############################################

The OpenVINO toolkit introduces its own format of graph representation and its own operation set.
A graph is represented with two files: an XML file and a binary file.
This representation is commonly referred to as the *Intermediate Representation* or *IR*.

The XML file describes a network topology using a ``<layer>`` tag for an operation node and an ``<edge>`` tag for a data-flow connection.
Each operation has a fixed number of attributes that define the operation flavor used for a node.
For example, the ``Convolution`` operation has such attributes as ``dilation``, ``stride``, ``pads_begin``, and ``pads_end``.

The XML file does not contain large constant values, such as convolution weights.
Instead, it refers to a part of the accompanying binary file that stores such values in a binary format.

Here is an example of a small IR XML file that corresponds to the graph from the previous section:

.. code-block:: xml

   <?xml version="1.0" ?>
   <net name="model_file_name" version="10">
       <layers>
           <layer id="0" name="input" type="Parameter" version="opset1">
               <data element_type="f32" shape="1,3,32,100"/> <!-- attributes of operation -->
               <output>
                   <!-- description of output ports with type of element and tensor dimensions -->
                   <port id="0" precision="FP32">
                       <dim>1</dim>
                       <dim>3</dim>
                       <dim>32</dim>
                       <dim>100</dim>
                   </port>
               </output>
           </layer>
           <layer id="1" name="conv1/weights" type="Const" version="opset1">
               <!-- Const is the only operation from opset1 that refers to the IR binary file, by specifying offset and size in bytes relative to the beginning of the file. -->
               <data element_type="f32" offset="0" shape="64,3,3,3" size="6912"/>
               <output>
                   <port id="1" precision="FP32">
                       <dim>64</dim>
                       <dim>3</dim>
                       <dim>3</dim>
                       <dim>3</dim>
                   </port>
               </output>
           </layer>
           <layer id="2" name="conv1" type="Convolution" version="opset1">
               <data auto_pad="same_upper" dilations="1,1" output_padding="0,0" pads_begin="1,1" pads_end="1,1" strides="1,1"/>
               <input>
                   <port id="0">
                       <dim>1</dim>
                       <dim>3</dim>
                       <dim>32</dim>
                       <dim>100</dim>
                   </port>
                   <port id="1">
                       <dim>64</dim>
                       <dim>3</dim>
                       <dim>3</dim>
                       <dim>3</dim>
                   </port>
               </input>
               <output>
                   <port id="2" precision="FP32">
                       <dim>1</dim>
                       <dim>64</dim>
                       <dim>32</dim>
                       <dim>100</dim>
                   </port>
               </output>
           </layer>
           <layer id="3" name="conv1/activation" type="ReLU" version="opset1">
               <input>
                   <port id="0">
                       <dim>1</dim>
                       <dim>64</dim>
                       <dim>32</dim>
                       <dim>100</dim>
                   </port>
               </input>
               <output>
                   <port id="1" precision="FP32">
                       <dim>1</dim>
                       <dim>64</dim>
                       <dim>32</dim>
                       <dim>100</dim>
                   </port>
               </output>
           </layer>
           <layer id="4" name="output" type="Result" version="opset1">
               <input>
                   <port id="0">
                       <dim>1</dim>
                       <dim>64</dim>
                       <dim>32</dim>
                       <dim>100</dim>
                   </port>
               </input>
           </layer>
       </layers>
       <edges>
           <!-- Connections between layer nodes: based on ids for layers and ports used in the descriptions above -->
           <edge from-layer="0" from-port="0" to-layer="2" to-port="0"/>
           <edge from-layer="1" from-port="1" to-layer="2" to-port="1"/>
           <edge from-layer="2" from-port="2" to-layer="3" to-port="0"/>
           <edge from-layer="3" from-port="1" to-layer="4" to-port="0"/>
       </edges>
       <meta_data>
           <!-- This section is not related to the topology; it contains auxiliary information for debugging purposes. -->
           <MO_version value="2019.1"/>
           <cli_parameters>
               <blobs_as_inputs value="True"/>
               <caffe_parser_path value="DIR"/>
               <data_type value="float"/>
               ...
               <!-- A long list of CLI options that Model Optimizer always puts here for debugging purposes is omitted. -->
           </cli_parameters>
       </meta_data>
   </net>

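Because the IR topology is plain XML (the weights live in the separate binary file), it can be inspected with standard XML tooling. The following is a minimal sketch, not an official OpenVINO utility, applied to a trimmed fragment of the example:

```python
# Sketch: list the layers and edges of an IR XML topology using only the
# Python standard library. The snippet is a trimmed fragment of an IR file.
import xml.etree.ElementTree as ET

ir_xml = """<?xml version="1.0" ?>
<net name="model_file_name" version="10">
    <layers>
        <layer id="0" name="input" type="Parameter" version="opset1"/>
        <layer id="2" name="conv1" type="Convolution" version="opset1"/>
        <layer id="3" name="conv1/activation" type="ReLU" version="opset1"/>
    </layers>
    <edges>
        <edge from-layer="0" from-port="0" to-layer="2" to-port="0"/>
        <edge from-layer="2" from-port="2" to-layer="3" to-port="0"/>
    </edges>
</net>"""

net = ET.fromstring(ir_xml)
print("IR version:", net.get("version"))
for layer in net.iter("layer"):
    # Every layer carries its operation type and the opset it comes from.
    print(f'  layer {layer.get("id")}: {layer.get("type")} ({layer.get("version")})')
for edge in net.iter("edge"):
    # Edges reference layer ids and port ids from the layer descriptions.
    print(f'  edge {edge.get("from-layer")}:{edge.get("from-port")} -> '
          f'{edge.get("to-layer")}:{edge.get("to-port")}')
```

For a real ``.xml`` file on disk, `ET.parse(path).getroot()` would replace `ET.fromstring` above.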
The IR does not use the explicit data nodes described in the previous section.
Instead, properties of data, such as tensor dimensions and their data types, are described as properties of the input and output ports of operations.

Operation Sets
##############

Operations in the OpenVINO operation sets are selected based on the capabilities of the supported deep learning frameworks and the hardware capabilities of the target inference device.
A set consists of several groups of operations:

* Conventional deep learning layers such as ``Convolution``, ``MaxPool``, and ``MatMul`` (also known as ``FullyConnected``).

* Various activation functions such as ``ReLU``, ``Tanh``, and ``PReLU``.

* Generic element-wise arithmetic tensor operations such as ``Add``, ``Subtract``, and ``Multiply``.

* Comparison operations that compare two numeric tensors and produce boolean tensors, for example, ``Less``, ``Equal``, ``Greater``.

* Logical operations that deal with boolean tensors, for example, ``And``, ``Xor``, ``Not``.

* Data movement operations that deal with parts of tensors, for example, ``Concat``, ``Split``, ``StridedSlice``, ``Select``.

* Specialized operations that implement complex algorithms dedicated to models of a specific type, for example, ``DetectionOutput``, ``RegionYolo``, ``PriorBox``.

For more information, refer to the complete description of the supported operation sets in the :doc:`Available Operation Sets <openvino_docs_ops_opset>` article.

How to Read Opset Specification
###############################

In the :doc:`Available Operation Sets <openvino_docs_ops_opset>` article, there are opsets and there are operations.
Each opset specification has a list of links to the descriptions of the operations included in that specific opset.
Two or more opsets may refer to the same operation.
That means the operation is kept unchanged from one operation set to another.

The description of each operation has a ``Versioned name`` field.
For example, the ``ReLU`` entry point in :doc:`opset1 <openvino_docs_ops_opset1>` refers to :doc:`ReLU-1 <openvino_docs_ops_activation_ReLU_1>` as the versioned name.
Meanwhile, ``ReLU`` in ``opset2`` refers to the same ``ReLU-1``, which has a single :doc:`description <openvino_docs_ops_activation_ReLU_1>`. This means that ``opset1`` and ``opset2`` share the same ``ReLU`` operation.

To differentiate versions of the same operation type, such as ``ReLU``, the ``-N`` suffix is used in the versioned name of the operation.
The ``N`` suffix usually refers to the first occurrence of ``opsetN`` in which this version of the operation was introduced.
There is no guarantee that new operations will be named according to that rule. The naming convention might change, but not for old operations, which are frozen completely.

IR Versions vs Operation Set Versions
#####################################

The expressiveness of operations in OpenVINO is highly dependent on the supported frameworks and target hardware capabilities.
As the frameworks and hardware capabilities grow over time, the operation set is constantly evolving to support new models.
The version of the IR specifies the rules that are used to read the XML and binary files that represent a model.

Historically, there are two major IR version epochs:

1. The older epoch includes IR versions from version 1 to version 7, without versioning of the operation set. During that epoch, the operation set was growing evolutionarily, accumulating more layer types and extending existing layer semantics. Every change of the operation set in those versions meant increasing the IR version.

2. OpenVINO 2020.1 is the starting point of the next epoch. With IR version 10, introduced in OpenVINO 2020.1, the versioning of the operation set is tracked separately from the IR versioning. Also, the operation set was significantly reworked as the result of the nGraph integration into OpenVINO.

The first supported operation set in the new epoch is ``opset1``.
The number after ``opset`` is increased each time new operations are added or old operations are deleted, at the release cadence.

The operations from the new epoch cover more TensorFlow and ONNX operations, in a form that better matches the original operation semantics from the frameworks, compared to the operation set used in the older IR versions (7 and lower).

The name of the opset is specified for each operation in the IR.
The IR version is specified once per whole IR file.
Here is an example from an IR snippet:

.. code-block:: xml

   <?xml version="1.0" ?>
   <net name="model_file_name" version="10"> <!-- Version of the whole IR file is here; it is 10 -->
       <layers>
           ...
           <dim>3</dim>
           ...

The ``type="Parameter"`` and ``version="opset1"`` attributes in the example above mean "use the version of the ``Parameter`` operation that is included in the ``opset1`` operation set".

When a new operation set is introduced, most of the operations remain unchanged and are just aliased from the previous operation set within the new one.
The goal of operation set version evolution is to add new operations, and to change small fractions of existing operations (fixing bugs and extending semantics).
However, such changes affect only new versions of operations from the new operation set, while old operations are used by specifying the appropriate ``version``.
When an old ``version`` is specified, the behavior is kept unchanged from that specified version, to provide backward compatibility with older IRs.

A single ``xml`` file with IR may contain operations from different opsets.
An operation that is included in several opsets may be referred to with a ``version`` that points to any opset that includes that operation.
For example, the same ``Convolution`` can be used with ``version="opset1"`` and ``version="opset2"``, because both opsets have the same ``Convolution`` operation.
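One way to see which opsets a given IR mixes is to collect the per-layer ``version`` attributes. A sketch using only the standard library (not an official tool):

```python
# Sketch: collect the set of opsets referenced by the layers of an IR XML file.
import xml.etree.ElementTree as ET

ir_xml = """<net name="model_file_name" version="10">
    <layers>
        <layer id="0" name="input" type="Parameter" version="opset1"/>
        <layer id="1" name="conv1" type="Convolution" version="opset2"/>
    </layers>
</net>"""

net = ET.fromstring(ir_xml)
# The IR version is declared once on <net>; the opset is declared per layer.
opsets = sorted({layer.get("version") for layer in net.iter("layer")})
print("IR version:", net.get("version"), "| opsets used:", opsets)
```

Here one file legitimately mixes ``opset1`` and ``opset2`` layers, as described above.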

@endsphinxdirective

# Intermediate Representation Suitable for INT8 Inference {#openvino_docs_MO_DG_prepare_model_convert_model_IR_suitable_for_INT8_inference}

@sphinxdirective

Introduction
############

OpenVINO Runtime CPU and GPU devices can infer models in low precision.
For more details, refer to the :doc:`Model Optimization Guide <openvino_docs_model_optimization_guide>`.

An Intermediate Representation should be specifically formed to be suitable for low precision inference.
Such a model is called a Low Precision IR and can be generated in two ways:

* By :doc:`quantizing a regular IR with the Post-Training Optimization tool <pot_introduction>`
* By using Model Optimizer for a model pre-trained for low precision inference: TensorFlow pre-TFLite models (``.pb`` model file with ``FakeQuantize`` operations) and ONNX quantized models.

Both TensorFlow and ONNX quantized models can be prepared with the `Neural Network Compression Framework <https://github.com/openvinotoolkit/nncf/blob/develop/README.md>`__.

For an operation to be executed in INT8, it must have ``FakeQuantize`` operations as inputs.
For more details, see the :doc:`specification of the FakeQuantize operation <openvino_docs_ops_quantization_FakeQuantize_1>`.

To execute the ``Convolution`` operation in INT8 on CPU, both the data and weight inputs should have ``FakeQuantize`` as an input operation:

.. image:: _static/images/expanded_int8_Convolution_weights.png

A Low Precision IR is also suitable for FP32 and FP16 inference if the chosen plugin supports all operations of the IR. The only difference between a Low Precision IR and an FP16 or FP32 IR is the presence of ``FakeQuantize`` in the Low Precision IR.
Plugins that support Low Precision Inference recognize these sub-graphs and quantize them during inference.
Plugins that do not support it execute all operations, including ``FakeQuantize``, as is, in the FP32 or FP16 precision.

Consequently, when ``FakeQuantize`` operations are present in an OpenVINO IR, they suggest to the inference device how to quantize particular operations in the model.
If the device is capable, it accepts the suggestion and performs Low Precision Inference. If not, it executes the model in the floating-point precision.

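As a rough numerical sketch of what such quantization amounts to, here is a simplified scalar version of the ``FakeQuantize`` semantics. The ranges and level count below are illustrative, not taken from a real model; consult the operation specification for the exact definition:

```python
# Simplified scalar FakeQuantize: clamp x to the input range, then snap it to
# one of `levels` evenly spaced values mapped into the output range.
# Ranges and level count here are made up for illustration.
def fake_quantize(x, in_low, in_high, out_low, out_high, levels):
    if x <= in_low:
        return out_low
    if x > in_high:
        return out_high
    # Index of the nearest of (levels - 1) quantization steps.
    q = round((x - in_low) / (in_high - in_low) * (levels - 1))
    # Map the index back into the output range.
    return q / (levels - 1) * (out_high - out_low) + out_low

# 256 levels is INT8-like; here the output range equals the input range.
for v in (-1.0, 0.07, 1.0, 3.0):
    print(v, "->", fake_quantize(v, 0.0, 2.55, 0.0, 2.55, 256))
```

Values outside the input range collapse to the range bounds, and in-range values lose precision to the nearest of the 256 representable levels, which is exactly the information an INT8-capable plugin needs.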
Compressed Low Precision Weights
################################

Weighted operations, such as ``Convolution`` and ``MatMul``, store weights as a floating-point ``Constant`` in the graph, followed by the ``FakeQuantize`` operation.
The ``Constant`` followed by the ``FakeQuantize`` operation can be optimized memory-wise, due to the ``FakeQuantize`` operation semantics.
The resulting weights sub-graph stores weights in a Low Precision ``Constant``, which gets unpacked back to floating point with the ``Convert`` operation.
Weights compression replaces ``FakeQuantize`` with optional ``Subtract`` and ``Multiply`` operations, leaving the output arithmetically the same, while storing the weights takes four times less memory.

See the visualization of ``Convolution`` with the compressed weights:

.. image:: _static/images/compressed_int8_Convolution_weights.png

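The effect of that rewrite can be sketched numerically. The weight values, zero point, and scale below are illustrative, not taken from a real model:

```python
# Sketch of the compressed weights sub-graph arithmetic: an INT8 Constant is
# unpacked with Convert, then an optional Subtract (zero point) and Multiply
# (scale) restore the floating-point values. All numbers are illustrative.
int8_weights = [-128, -5, 0, 17, 127]   # Low Precision Constant: 1 byte/value
zero_point = 0.0                        # Subtract operand (optional)
scale = 0.02                            # Multiply operand

converted = [float(w) for w in int8_weights]                  # Convert: int8 -> f32
dequantized = [(w - zero_point) * scale for w in converted]   # Subtract, Multiply
print(dequantized)
```

Storing one byte per weight instead of a four-byte float is where the fourfold memory saving mentioned above comes from, while the `Subtract`/`Multiply` pair keeps the result arithmetically equivalent.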
Both Model Optimizer and the Post-Training Optimization tool generate a compressed IR by default.

@endsphinxdirective

# Operation Specifications {#openvino_docs_operations_specifications}

@sphinxdirective

API Reference <api/api_reference>
Tool Ecosystem <openvino_ecosystem>
OpenVINO Extensibility <openvino_docs_Extensibility_UG_Intro>
OpenVINO IR format and Operation Sets <openvino_ir>
Media Processing and CV Libraries <media_processing_cv_libraries>
OpenVINO™ Security <openvino_docs_security_guide_introduction>

# Broadcast Rules For Elementwise Operations {#openvino_docs_ops_broadcast_rules}

@sphinxdirective

The purpose of this document is to provide a set of common rules which are applicable for operations using broadcasting.

Description
###########

Broadcasting allows performing element-wise operations on inputs of an arbitrary number of dimensions. There are two types of broadcast supported: NumPy and PDPD.

Rules
#####

**None broadcast**:
|
**None broadcast**:
|
||||||
1. Input tensors dimensions must match.
|
1. Input tensors dimensions must match.
|
||||||
@ -22,118 +26,126 @@ Broadcast allows to perform element-wise operation for inputs of arbitrary numbe
**PDPD broadcast**:

1. The first input tensor A is of any rank; the second input B has a rank smaller than or equal to that of the first input.
2. Input tensor B is a continuous subsequence of input A.
3. Broadcast B to match the shape of A, where the provided *axis* is the start dimension index for broadcasting B onto A.
4. If *axis* is set to its default (-1), calculate the new value: ``axis = rank(A) - rank(B)``.
5. The trailing dimensions of size 1 for input B are ignored for the consideration of subsequence, such as ``shape(B) = (3, 1) => (3)``.

Numpy examples
##############

* | ``A: Shape(,) -> scalar``
  | ``B: Shape(,) -> scalar``
  | ``Result: Shape(,) -> scalar``

* | ``A: Shape(2, 3)``
  | ``B: Shape(   1)``
  | ``Result: Shape(2, 3)``

* | ``A: Shape(   3)``
  | ``B: Shape(2, 3)``
  | ``Result: Shape(2, 3)``

* | ``A: Shape(2, 3, 5)``
  | ``B: Shape(,) -> scalar``
  | ``Result: Shape(2, 3, 5)``

* | ``A: Shape(2, 1, 5)``
  | ``B: Shape(1, 4, 5)``
  | ``Result: Shape(2, 4, 5)``

* | ``A: Shape(   6, 5)``
  | ``B: Shape(2, 1, 5)``
  | ``Result: Shape(2, 6, 5)``

* | ``A: Shape(2, 1, 5)``
  | ``B: Shape(   4, 1)``
  | ``Result: Shape(2, 4, 5)``

* | ``A: Shape(3, 2, 1, 4)``
  | ``B: Shape(      5, 4)``
  | ``Result: Shape(3, 2, 5, 4)``

* | ``A: Shape(   1, 5, 3)``
  | ``B: Shape(5, 2, 1, 3)``
  | ``Result: Shape(5, 2, 5, 3)``

* | ``A: Shape(3)``
  | ``B: Shape(2)``
  | ``Result: broadcast won't happen due to dimensions mismatch``

* | ``A: Shape(3, 1, 5)``
  | ``B: Shape(4, 4, 5)``
  | ``Result: broadcast won't happen due to dimensions mismatch on the leftmost axis``

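The examples above can be checked directly with NumPy, since the numpy broadcast type follows the same semantics (dimensions are right-aligned, and a dimension of 1 or a missing leading dimension is stretched to match the other input):

```python
import numpy as np

# Shape(2, 1, 5) combined with Shape(1, 4, 5) broadcasts to Shape(2, 4, 5).
a = np.zeros((2, 1, 5))
b = np.zeros((1, 4, 5))
print((a + b).shape)  # (2, 4, 5)

# Shape(3) vs Shape(2): dimensions mismatch, broadcast won't happen.
try:
    np.zeros((3,)) + np.zeros((2,))
except ValueError as e:
    print("broadcast won't happen:", e)
```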
PDPD examples
#############

* | ``A: Shape(2, 3, 4, 5)``
  | ``B: Shape(   3, 4   ) with axis = 1``
  | ``Result: Shape(2, 3, 4, 5)``

* | ``A: Shape(2, 3, 4, 5)``
  | ``B: Shape(   3, 1   ) with axis = 1``
  | ``Result: Shape(2, 3, 4, 5)``

* | ``A: Shape(2, 3, 4, 5)``
  | ``B: Shape(      4, 5) with axis = -1 (default) or axis = 2``
  | ``Result: Shape(2, 3, 4, 5)``

* | ``A: Shape(2, 3, 4, 5)``
  | ``B: Shape(1, 3      ) with axis = 0``
  | ``Result: Shape(2, 3, 4, 5)``

* | ``A: Shape(2, 3, 4, 5)``
  | ``B: Shape(,)``
  | ``Result: Shape(2, 3, 4, 5)``

* | ``A: Shape(2, 3, 4, 5)``
  | ``B: Shape(5,)``
  | ``Result: Shape(2, 3, 4, 5)``

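The PDPD rules can be sketched as a small shape-inference helper. This is a hypothetical illustration of the rules above, not an OpenVINO API:

```python
def pdpd_broadcast_shape(shape_a, shape_b, axis=-1):
    """Illustrative PDPD-style shape check; hypothetical helper, not an OpenVINO API."""
    b = list(shape_b)
    # Rule 5: ignore trailing dimensions of size 1 in B.
    while len(b) > 1 and b[-1] == 1:
        b.pop()
    # Rule 4: the default axis aligns B with the trailing dimensions of A.
    if axis == -1:
        axis = len(shape_a) - len(b)
    # Rules 2-3: B must match the continuous subsequence of A starting at `axis`.
    for i, dim in enumerate(b):
        if dim != 1 and shape_a[axis + i] != dim:
            raise ValueError(f"B is not broadcastable onto A at dimension {axis + i}")
    # B is broadcast to match A, so the result keeps the shape of A.
    return tuple(shape_a)

print(pdpd_broadcast_shape((2, 3, 4, 5), (3, 4), axis=1))  # (2, 3, 4, 5)
print(pdpd_broadcast_shape((2, 3, 4, 5), (4, 5)))          # (2, 3, 4, 5)
```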
Bidirectional Broadcast Rules
#############################

Description
+++++++++++

Bidirectional Broadcast is not intended for element-wise operations. Its purpose is to broadcast an array to a given shape.

Rules
+++++

**Bidirectional broadcast**:

1. Dimensions of the input tensors are right-aligned.
2. The following broadcast rule is applied: ``numpy.array(input) * numpy.ones(target_shape)``.
3. Two corresponding dimensions must have the same value, or one of them must be equal to 1.
4. The output shape may not be equal to ``target_shape`` if:

   * ``target_shape`` contains dimensions of size 1,
   * the ``target_shape`` rank is smaller than the rank of the input tensor.

Bidirectional examples
++++++++++++++++++++++

* | ``A: Shape(5)``
  | ``B: Shape(1)``
  | ``Result: Shape(5)``

* | ``A: Shape(2, 3)``
  | ``B: Shape(   3)``
  | ``Result: Shape(2, 3)``

* | ``A: Shape(3, 1)``
  | ``B: Shape(3, 4)``
  | ``Result: Shape(3, 4)``

* | ``A: Shape(3, 4)``
  | ``B: Shape(,) -> scalar``
  | ``Result: Shape(3, 4)``

* | ``A: Shape(   3, 1)``
  | ``B: Shape(2, 1, 6)``
  | ``Result: Shape(2, 3, 6)``

@endsphinxdirective
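Rule 2 can be reproduced verbatim in NumPy, which also demonstrates rule 4 (the output shape can differ from ``target_shape``):

```python
import numpy as np

# Bidirectional broadcast, exactly as stated in rule 2:
# numpy.array(input) * numpy.ones(target_shape).
result = np.zeros((3, 1)) * np.ones((2, 1, 6))
print(result.shape)  # (2, 3, 6)

# Rule 4: target_shape contains a dimension of size 1, so the
# output shape is not equal to target_shape.
result = np.zeros((2, 3)) * np.ones((1, 3))
print(result.shape)  # (2, 3), not (1, 3)
```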
@ -1,4 +1,4 @@
# Available Operation Sets {#openvino_docs_ops_opset}

@sphinxdirective

@ -18,26 +18,45 @@
   openvino_docs_ops_opset2
   openvino_docs_ops_opset1


According to the capabilities of the supported deep learning frameworks and the hardware capabilities of a target inference device, all operations are combined into operation sets, each fully supported in a specific version of the OpenVINO™ toolkit.

This topic provides a complete list of the operation sets supported in different versions of the OpenVINO™ toolkit. Use the relevant version of the operation set for a particular release. For a list of operations included in an operation set, click a link in the table.

.. list-table::
   :header-rows: 1

   * - OpenVINO™ Version
     - Actual Operation Set
   * - 2023.0
     - :doc:`opset11 <openvino_docs_ops_opset11>`
   * - 2022.3
     - :doc:`opset10 <openvino_docs_ops_opset10>`
   * - 2022.2
     - :doc:`opset9 <openvino_docs_ops_opset9>`
   * - 2022.1
     - :doc:`opset8 <openvino_docs_ops_opset8>`
   * - 2021.4
     - :doc:`opset7 <openvino_docs_ops_opset7>`
   * - 2021.3
     - :doc:`opset6 <openvino_docs_ops_opset6>`
   * - 2021.2
     - :doc:`opset5 <openvino_docs_ops_opset5>`
   * - 2021.1
     - :doc:`opset4 <openvino_docs_ops_opset4>`
   * - 2020.4
     - :doc:`opset3 <openvino_docs_ops_opset3>`
   * - 2020.3
     - :doc:`opset2 <openvino_docs_ops_opset2>`
   * - 2020.2
     - :doc:`opset2 <openvino_docs_ops_opset2>`
   * - 2020.1
     - :doc:`opset1 <openvino_docs_ops_opset1>`

See Also
########

:doc:`Operation Sets in OpenVINO <openvino_docs_MO_DG_IR_and_opsets>`
:doc:`OpenVINO IR format <openvino_ir>`

@endsphinxdirective
@ -1,6 +1,5 @@
# OpenVINO IR format {#openvino_ir}

@sphinxdirective

.. toctree::
@ -8,16 +7,143 @@
   :hidden:

   openvino_docs_MO_DG_IR_and_opsets
   openvino_docs_MO_DG_prepare_model_convert_model_IR_suitable_for_INT8_inference

Models built and trained in various frameworks can be large and architecture-dependent. To run inference successfully on any device and maximize the benefits of OpenVINO tools, you can convert the model to the OpenVINO Intermediate Representation (IR) format.

OpenVINO IR is the proprietary model format of OpenVINO. It is produced by converting a model with the Model Optimizer tool. Model Optimizer translates frequently used deep learning operations to their respective representations in OpenVINO and tunes them with the associated weights and biases from the trained model. The resulting IR contains two files:

* ``.xml`` - Describes the model topology.
* ``.bin`` - Contains the weights and binary data.

IR Structure
############

The OpenVINO toolkit introduces its own format of graph representation and its own operation set. A graph is represented with two files: an XML file and a binary file. This representation is commonly referred to as the Intermediate Representation, or IR.

The XML file describes a model topology using a ``<layer>`` tag for an operation node and an ``<edge>`` tag for a data-flow connection.
Each operation has a fixed number of attributes that define the operation flavor used for a node.
For example, the ``Convolution`` operation has such attributes as ``dilations``, ``strides``, ``pads_begin``, and ``pads_end``.

The XML file does not contain large constant values, such as convolution weights.
Instead, it refers to a part of the accompanying binary file that stores such values in binary format.

Here is an example of a small IR XML file that corresponds to a graph from the previous section:

.. scrollbox::

   .. code-block:: xml

      <?xml version="1.0" ?>
      <net name="model_file_name" version="10">
          <layers>
              <layer id="0" name="input" type="Parameter" version="opset1">
                  <data element_type="f32" shape="1,3,32,100"/> <!-- attributes of the operation -->
                  <output>
                      <!-- description of output ports with the element type and tensor dimensions -->
                      <port id="0" precision="FP32">
                          <dim>1</dim>
                          <dim>3</dim>
                          <dim>32</dim>
                          <dim>100</dim>
                      </port>
                  </output>
              </layer>
              <layer id="1" name="conv1/weights" type="Const" version="opset1">
                  <!-- Const is the only operation from opset1 that refers to the IR binary file, specifying the offset and size in bytes relative to the beginning of the file. -->
                  <data element_type="f32" offset="0" shape="64,3,3,3" size="6912"/>
                  <output>
                      <port id="1" precision="FP32">
                          <dim>64</dim>
                          <dim>3</dim>
                          <dim>3</dim>
                          <dim>3</dim>
                      </port>
                  </output>
              </layer>
              <layer id="2" name="conv1" type="Convolution" version="opset1">
                  <data auto_pad="same_upper" dilations="1,1" output_padding="0,0" pads_begin="1,1" pads_end="1,1" strides="1,1"/>
                  <input>
                      <port id="0">
                          <dim>1</dim>
                          <dim>3</dim>
                          <dim>32</dim>
                          <dim>100</dim>
                      </port>
                      <port id="1">
                          <dim>64</dim>
                          <dim>3</dim>
                          <dim>3</dim>
                          <dim>3</dim>
                      </port>
                  </input>
                  <output>
                      <port id="2" precision="FP32">
                          <dim>1</dim>
                          <dim>64</dim>
                          <dim>32</dim>
                          <dim>100</dim>
                      </port>
                  </output>
              </layer>
              <layer id="3" name="conv1/activation" type="ReLU" version="opset1">
                  <input>
                      <port id="0">
                          <dim>1</dim>
                          <dim>64</dim>
                          <dim>32</dim>
                          <dim>100</dim>
                      </port>
                  </input>
                  <output>
                      <port id="1" precision="FP32">
                          <dim>1</dim>
                          <dim>64</dim>
                          <dim>32</dim>
                          <dim>100</dim>
                      </port>
                  </output>
              </layer>
              <layer id="4" name="output" type="Result" version="opset1">
                  <input>
                      <port id="0">
                          <dim>1</dim>
                          <dim>64</dim>
                          <dim>32</dim>
                          <dim>100</dim>
                      </port>
                  </input>
              </layer>
          </layers>
          <edges>
              <!-- Connections between layer nodes, based on the layer and port ids used in the descriptions above -->
              <edge from-layer="0" from-port="0" to-layer="2" to-port="0"/>
              <edge from-layer="1" from-port="1" to-layer="2" to-port="1"/>
              <edge from-layer="2" from-port="2" to-layer="3" to-port="0"/>
              <edge from-layer="3" from-port="1" to-layer="4" to-port="0"/>
          </edges>
          <meta_data>
              <!-- This section is not related to the topology; it contains auxiliary information for debugging purposes. -->
              <MO_version value="2022.3"/>
              <cli_parameters>
                  <blobs_as_inputs value="True"/>
                  <caffe_parser_path value="DIR"/>
                  <data_type value="float"/>
                  ...
                  <!-- A long list of CLI options that MO always records here for debugging purposes is omitted. -->
              </cli_parameters>
          </meta_data>
      </net>

The IR does not use explicit data nodes. Instead, properties of data, such as tensor dimensions and their data types, are described as properties of the input and output ports of operations.

Additional Resources
####################

* :doc:`IR and Operation Sets <openvino_docs_MO_DG_IR_and_opsets>`
* :doc:`OpenVINO API 2.0 transition guide <openvino_2_0_transition_guide>`

@endsphinxdirective
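Because the topology file is plain XML, the layer/edge structure described above can be walked with any XML parser. A minimal sketch using only the Python standard library (the embedded three-layer model is a made-up example, not generated by Model Optimizer):

```python
import xml.etree.ElementTree as ET

# A tiny hand-written IR-style topology: <layers> holds operation nodes,
# <edges> holds data-flow connections keyed by layer and port ids.
ir = """<?xml version="1.0" ?>
<net name="demo" version="10">
    <layers>
        <layer id="0" name="input" type="Parameter" version="opset1"/>
        <layer id="1" name="relu" type="ReLU" version="opset1"/>
        <layer id="2" name="output" type="Result" version="opset1"/>
    </layers>
    <edges>
        <edge from-layer="0" from-port="0" to-layer="1" to-port="0"/>
        <edge from-layer="1" from-port="1" to-layer="2" to-port="0"/>
    </edges>
</net>"""

root = ET.fromstring(ir)
# Map layer id -> (name, operation type).
layers = {l.get("id"): (l.get("name"), l.get("type")) for l in root.find("layers")}
# Resolve each edge to the operations it connects.
for e in root.find("edges"):
    src = layers[e.get("from-layer")]
    dst = layers[e.get("to-layer")]
    print(f"{src[0]} ({src[1]}) -> {dst[0]} ({dst[1]})")
```

Running this prints the two data-flow connections, `input (Parameter) -> relu (ReLU)` and `relu (ReLU) -> output (Result)`.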
@ -8,7 +8,6 @@
   :hidden:

   openvino_docs_performance_benchmarks
   prerelease_information

.. toctree::
@ -34,8 +33,6 @@ and its proprietary model format, OpenVINO IR.

:doc:`Performance Benchmarks <openvino_docs_performance_benchmarks>` contains results from benchmarking models with OpenVINO on Intel hardware.

:doc:`Supported Devices <openvino_docs_OV_UG_supported_plugins_Supported_Devices>` provides compatibility information about supported hardware accelerators.

:doc:`Supported Models <openvino_supported_models>` is a table of models officially supported by OpenVINO.