[DOCS] Recreation of BDTI PRs - master (#16913)

Recreation of BDTI PRs for master.

Recreated PRs:

Docs: Update Dynamic Shapes documentation #15216
Docs: Edits to Performance Hints and Cumulative Throughput documentation #14793
Docs: Update Devices pages to state improved INT8 performance with 11th & 12th gen devices #12067
Maciej Smyk 2023-05-08 12:33:15 +02:00 committed by GitHub
parent af3d1d69dd
commit 9ea3553d5d
9 changed files with 603 additions and 530 deletions


@ -16,7 +16,7 @@ This article introduces how Automatic Device Selection works and how to use it f
How AUTO Works
##############
The Automatic Device Selection mode, or AUTO for short, uses a "virtual" or a "proxy" device,
which does not bind to a specific type of hardware, but rather selects the processing unit for inference automatically.
@ -33,19 +33,19 @@ The logic behind the choice is as follows:
4. If the model's precision is FP32 but there is no device capable of supporting it, offload the model to a device supporting FP16.
+----------+----------------------------------------+--------------------------+
| Device   | Supported                              | Supported                |
| Priority | Device                                 | model precision          |
+==========+========================================+==========================+
| 1        | dGPU                                   | FP32, FP16, INT8, BIN    |
|          | (e.g. Intel® Iris® Xe MAX)             |                          |
+----------+----------------------------------------+--------------------------+
| 2        | iGPU                                   | FP32, FP16, BIN          |
|          | (e.g. Intel® UHD Graphics 620 (iGPU))  |                          |
+----------+----------------------------------------+--------------------------+
| 3        | Intel® CPU                             | FP32, FP16, INT8, BIN    |
|          | (e.g. Intel® Core™ i7-1165G7)          |                          |
+----------+----------------------------------------+--------------------------+
To put it simply, if loading the model on the first device in the list fails, AUTO will try to load it to the next device in line, until one of them succeeds.
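As a minimal illustration, the whole selection logic is triggered simply by compiling a model on the "AUTO" device. The following Python sketch assumes a hypothetical model file named ``model.xml``:

.. code-block:: python

   import openvino.runtime as ov

   core = ov.Core()
   model = core.read_model("model.xml")  # hypothetical model path

   # AUTO selects the best available device and, if loading fails,
   # falls back down the priority list (dGPU -> iGPU -> CPU).
   compiled_model = core.compile_model(model, "AUTO")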
@ -63,12 +63,12 @@ Note that if you choose to exclude CPU from the priority list or disable the ini
This mechanism can be easily observed in the :ref:`Using AUTO with Benchmark app sample <using-auto-with-openvino-samples-and-benchmark-app>` section, showing how the first-inference latency (the time it takes to compile the model and perform the first inference) is reduced when using AUTO. For example:
.. code-block:: sh

   benchmark_app -m ../public/alexnet/FP32/alexnet.xml -d GPU -niter 128

.. code-block:: sh

   benchmark_app -m ../public/alexnet/FP32/alexnet.xml -d AUTO -niter 128
@ -80,70 +80,70 @@ This mechanism can be easily observed in the :ref:`Using AUTO with Benchmark app
Using AUTO
##########
Following the OpenVINO™ naming convention, the Automatic Device Selection mode is assigned the label of "AUTO". It may be defined with no additional parameters, resulting in defaults being used, or configured further with the following setup options:
+----------------------------------------------+--------------------------------------------------------------------+
| Property                                     | Values and Description                                             |
+==============================================+====================================================================+
| <device candidate list>                      | **Values**:                                                        |
|                                              | empty                                                              |
|                                              | ``AUTO``                                                           |
|                                              | ``AUTO: <device names>`` (comma-separated, no spaces)              |
|                                              |                                                                    |
|                                              | Lists the devices available for selection.                         |
|                                              | The device sequence will be taken as priority from high to low.    |
|                                              | If not specified, ``AUTO`` will be used as default,                |
|                                              | and all devices will be "viewed" as candidates.                    |
+----------------------------------------------+--------------------------------------------------------------------+
| ``ov::device::priorities``                   | **Values**:                                                        |
|                                              | ``<device names>`` (comma-separated, no spaces)                    |
|                                              |                                                                    |
|                                              | Specifies the devices for AUTO to select.                          |
|                                              | The device sequence will be taken as priority from high to low.    |
|                                              | This configuration is optional.                                    |
+----------------------------------------------+--------------------------------------------------------------------+
| ``ov::hint::performance_mode``               | **Values**:                                                        |
|                                              | ``ov::hint::PerformanceMode::LATENCY``                             |
|                                              | ``ov::hint::PerformanceMode::THROUGHPUT``                          |
|                                              | ``ov::hint::PerformanceMode::CUMULATIVE_THROUGHPUT``               |
|                                              |                                                                    |
|                                              | Specifies the performance option preferred by the application.     |
+----------------------------------------------+--------------------------------------------------------------------+
| ``ov::hint::model_priority``                 | **Values**:                                                        |
|                                              | ``ov::hint::Priority::HIGH``                                       |
|                                              | ``ov::hint::Priority::MEDIUM``                                     |
|                                              | ``ov::hint::Priority::LOW``                                        |
|                                              |                                                                    |
|                                              | Indicates the priority for a model.                                |
|                                              | IMPORTANT: This property is not fully supported yet.               |
+----------------------------------------------+--------------------------------------------------------------------+
| ``ov::execution_devices``                    | Lists the runtime target devices on which the inferences are being |
|                                              | executed.                                                          |
|                                              | Examples of returning results could be ``(CPU)`` (``CPU`` is a     |
|                                              | temporary device, indicating that CPU is used for acceleration at  |
|                                              | the model compilation stage), ``CPU``, ``GPU``, ``CPU GPU``,       |
|                                              | ``GPU.0``, etc.                                                    |
+----------------------------------------------+--------------------------------------------------------------------+
| ``ov::intel_auto::enable_startup_fallback``  | **Values**:                                                        |
|                                              | ``true``                                                           |
|                                              | ``false``                                                          |
|                                              |                                                                    |
|                                              | Enables/disables CPU as acceleration (or the helper device) in the |
|                                              | beginning. The default value is ``true``, indicating that CPU is   |
|                                              | used as acceleration by default.                                   |
+----------------------------------------------+--------------------------------------------------------------------+
| ``ov::intel_auto::enable_runtime_fallback``  | **Values**:                                                        |
|                                              | ``true``                                                           |
|                                              | ``false``                                                          |
|                                              |                                                                    |
|                                              | Enables/disables runtime fallback to other devices and performs    |
|                                              | the failed inference request again, if inference request fails on  |
|                                              | the currently selected device.                                     |
|                                              | The default value is ``true``.                                     |
+----------------------------------------------+--------------------------------------------------------------------+
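As an illustration of how several of these options can be combined, here is a hedged Python sketch (the model path is hypothetical; string property keys are used instead of the C++ ``ov::`` identifiers):

.. code-block:: python

   import openvino.runtime as ov

   core = ov.Core()
   model = core.read_model("model.xml")  # hypothetical model path

   # Restrict the candidate list to GPU and CPU (GPU has higher priority)
   # and ask AUTO to optimize for throughput.
   compiled_model = core.compile_model(
       model,
       "AUTO:GPU,CPU",
       {"PERFORMANCE_HINT": "THROUGHPUT"},
   )

   # Query the runtime target devices (the ov::execution_devices property).
   print(compiled_model.get_property("EXECUTION_DEVICES"))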
Inference with AUTO is configured similarly to when device plugins are used:
you compile the model on the plugin with configuration and execute inference.
@ -162,18 +162,22 @@ To specify the priority of devices, enter the device names in the priority order
See the following code for using AUTO and specifying devices:
.. tab-set::

   .. tab-item:: C++
      :sync: cpp

      .. doxygensnippet:: docs/snippets/AUTO0.cpp
         :language: cpp
         :fragment: [part0]

   .. tab-item:: Python
      :sync: py

      .. doxygensnippet:: docs/snippets/ov_auto.py
         :language: python
         :fragment: [part0]
Note that OpenVINO Runtime lets you use "GPU" as an alias for "GPU.0" in function calls. More details on enumerating devices can be found in :doc:`Working with devices <openvino_docs_OV_UG_Working_with_devices>`.
@ -183,22 +187,25 @@ Checking Available Devices
To check what devices are present in the system, you can use Device API, as listed below. For information on how to use it, see :doc:`Query device properties and configuration <openvino_docs_OV_UG_query_api>`.
.. tab-set::

   .. tab-item:: C++
      :sync: cpp

      .. code-block:: sh

         ov::runtime::Core::get_available_devices()

      See the Hello Query Device C++ Sample for reference.

   .. tab-item:: Python
      :sync: py

      .. code-block:: sh

         openvino.runtime.Core.available_devices

      See the Hello Query Device Python Sample for reference.
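For a quick check, a minimal Python sketch that prints the devices AUTO can choose from could look like this:

.. code-block:: python

   from openvino.runtime import Core

   # Prints something like ['CPU', 'GPU'], depending on the system.
   print(Core().available_devices)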
Excluding Devices from Device Candidate List
@ -206,18 +213,21 @@ Excluding Devices from Device Candidate List
You can also exclude hardware devices from AUTO, for example, to reserve CPU for other jobs. AUTO will not use the device for inference then. To do that, add a minus sign ``(-)`` before CPU in ``AUTO: <device names>``, as in the following example:
.. tab-set::

   .. tab-item:: C++
      :sync: cpp

      .. code-block:: sh

         ov::CompiledModel compiled_model = core.compile_model(model, "AUTO:-CPU");

   .. tab-item:: Python
      :sync: py

      .. code-block:: sh

         compiled_model = core.compile_model(model=model, device_name="AUTO:-CPU")
AUTO will then query all available devices and remove CPU from the candidate list.
@ -230,6 +240,11 @@ Performance Hints for AUTO
The ``ov::hint::performance_mode`` property enables you to specify a performance option for AUTO to be more efficient for particular use cases. The default hint for AUTO is ``LATENCY``.
The THROUGHPUT and CUMULATIVE_THROUGHPUT hints below only improve performance in an asynchronous inference pipeline. For information on asynchronous inference, see the `Async API documentation <https://docs.openvino.ai/latest/openvino_docs_OV_UG_Infer_request.html#doxid-openvino-docs-o-v-u-g-infer-request>`__ . The following notebooks provide examples of how to set up an asynchronous pipeline:
* :doc:`Image Classification Async Sample <openvino_inference_engine_samples_classification_sample_async_README>`
* `Notebook - Asynchronous Inference with OpenVINO™ <https://docs.openvino.ai/latest/notebooks/115-async-api-with-output.html>`__
* `Notebook - Automatic Device Selection with OpenVINO <https://docs.openvino.ai/latest/notebooks/106-auto-device-with-output.html>`__
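As a rough illustration of such a pipeline, the following Python sketch uses ``AsyncInferQueue`` (the model path, input shape, and number of iterations are hypothetical):

.. code-block:: python

   import numpy as np
   import openvino.runtime as ov

   core = ov.Core()
   model = core.read_model("model.xml")  # hypothetical model path
   compiled_model = core.compile_model(
       model, "AUTO", {"PERFORMANCE_HINT": "THROUGHPUT"}
   )

   # The queue creates an optimal number of parallel infer requests.
   infer_queue = ov.AsyncInferQueue(compiled_model)

   def on_done(request, frame_id):
       # Runs in a worker thread when one inference finishes.
       print(frame_id, request.get_output_tensor().shape)

   infer_queue.set_callback(on_done)

   for i in range(8):
       data = np.random.rand(1, 3, 224, 224).astype(np.float32)  # hypothetical input
       infer_queue.start_async({0: data}, userdata=i)

   infer_queue.wait_all()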
LATENCY
--------------------
@ -257,33 +272,65 @@ While ``LATENCY`` and ``THROUGHPUT`` can select one target device with your pref
CUMULATIVE_THROUGHPUT behaves similarly to :doc:`the Multi-Device execution mode (MULTI) <openvino_docs_OV_UG_Running_on_multiple_devices>`. The only difference is that CUMULATIVE_THROUGHPUT uses the devices specified by AUTO, which means that it's not mandatory to add devices manually, while with MULTI, you need to specify the devices before inference.
With the CUMULATIVE_THROUGHPUT option:
* If ``AUTO`` without any device names is specified, and the system has more than two GPU devices, AUTO will remove CPU from the device candidate list to keep GPU running at full capacity.
* If device priority is specified, AUTO will run inference requests on devices based on the priority. In the following example, AUTO will always try to use GPU first, and then use CPU if GPU is busy:
.. tab-set::

   .. tab-item:: C++
      :sync: cpp

      .. code-block:: sh

         ov::CompiledModel compiled_model = core.compile_model(model, "AUTO:GPU,CPU", ov::hint::performance_mode(ov::hint::PerformanceMode::CUMULATIVE_THROUGHPUT));

   .. tab-item:: Python
      :sync: py

      .. code-block:: sh

         compiled_model = core.compile_model(model, "AUTO:GPU,CPU", {"PERFORMANCE_HINT": "CUMULATIVE_THROUGHPUT"})
If AUTO is used without specifying any device names, and if there are multiple GPUs in the system, CUMULATIVE_THROUGHPUT mode will use all of the GPUs by default. If the system has more than two GPU devices, AUTO will remove CPU from the device candidate list to keep the GPUs running at full capacity. A full list of system devices and their unique identifiers can be queried using ``ov::Core::get_available_devices`` (for more information, see :doc:`Query Device Properties <openvino_docs_OV_UG_query_api>`). To explicitly specify which GPUs to use, set their priority when compiling with AUTO:
.. tab-set::

   .. tab-item:: C++
      :sync: cpp

      .. code-block:: sh

         ov::CompiledModel compiled_model = core.compile_model(model, "AUTO:GPU.1,GPU.0", ov::hint::performance_mode(ov::hint::PerformanceMode::CUMULATIVE_THROUGHPUT));

   .. tab-item:: Python
      :sync: py

      .. code-block:: sh

         compiled_model = core.compile_model(model, "AUTO:GPU.1,GPU.0", {"PERFORMANCE_HINT": "CUMULATIVE_THROUGHPUT"})
Code Examples
--------------------
To enable performance hints for your application, use the following code:
.. tab-set::

   .. tab-item:: C++
      :sync: cpp

      .. doxygensnippet:: docs/snippets/AUTO3.cpp
         :language: cpp
         :fragment: [part3]

   .. tab-item:: Python
      :sync: py

      .. doxygensnippet:: docs/snippets/ov_auto.py
         :language: python
         :fragment: [part3]
Disabling Auto-Batching for THROUGHPUT and CUMULATIVE_THROUGHPUT
@ -297,18 +344,21 @@ Configuring Model Priority
The ``ov::hint::model_priority`` property enables you to control the priorities of models in the Auto-Device plugin. A high-priority model will be loaded to a supported high-priority device. A lower-priority model will not be loaded to a device that is occupied by a higher-priority model.
.. tab-set::

   .. tab-item:: C++
      :sync: cpp

      .. doxygensnippet:: docs/snippets/AUTO4.cpp
         :language: cpp
         :fragment: [part4]

   .. tab-item:: Python
      :sync: py

      .. doxygensnippet:: docs/snippets/ov_auto.py
         :language: python
         :fragment: [part4]
Checking Target Runtime Devices
@ -317,17 +367,22 @@ Checking Target Runtime Devices
To query the runtime target devices on which the inferences are being executed using AUTO, you can use the ``ov::execution_devices`` property. It must be used with ``get_property``, for example:
.. tab-set::

   .. tab-item:: C++
      :sync: cpp

      .. doxygensnippet:: docs/snippets/AUTO7.cpp
         :language: cpp
         :fragment: [part7]

   .. tab-item:: Python
      :sync: py

      .. doxygensnippet:: docs/snippets/ov_auto.py
         :language: python
         :fragment: [part7]
Configuring Individual Devices and Creating the Auto-Device plugin on Top
@ -335,17 +390,21 @@ Configuring Individual Devices and Creating the Auto-Device plugin on Top
Although the methods described above are currently the preferred way to execute inference with AUTO, the following steps can also be used as an alternative. This approach is currently available as a legacy feature and is used if AUTO is unable to utilize the Performance Hints option.
.. tab-set::

   .. tab-item:: C++
      :sync: cpp

      .. doxygensnippet:: docs/snippets/AUTO5.cpp
         :language: cpp
         :fragment: [part5]

   .. tab-item:: Python
      :sync: py

      .. doxygensnippet:: docs/snippets/ov_auto.py
         :language: python
         :fragment: [part5]
.. _using-auto-with-openvino-samples-and-benchmark-app:
@ -357,13 +416,13 @@ To see how the Auto-Device plugin is used in practice and test its performance,
For unlimited device choice:

.. code-block:: sh

   benchmark_app -d AUTO -m <model> -i <input> -niter 1000

For limited device choice:

.. code-block:: sh

   benchmark_app -d AUTO:CPU,GPU,GNA -m <model> -i <input> -niter 1000


@ -39,23 +39,55 @@ The decision about using dynamic shapes should be based on proper benchmarking o
Unlike statically shaped models, dynamically shaped ones have inference times that depend on the input data shape or input tensor content.
Furthermore, using dynamic shapes can add memory and runtime overhead to each inference call, depending on the hardware plugin and the model used.
Handling Dynamic Shapes
#######################
This section describes how to handle dynamically shaped models with OpenVINO Runtime API version 2022.1 and higher. When using dynamic shapes, there are three main differences in the workflow compared with static shapes:
* Configuring the model
* Preparing and inferencing dynamic data
* Dynamic shapes in outputs
Configuring the Model
+++++++++++++++++++++
Model input dimensions can be specified as dynamic using the ``model.reshape`` method. To set a dynamic dimension, use ``-1``, ``ov::Dimension()`` (C++), or ``ov.Dimension()`` (Python) as the value for that dimension.
.. note::

   Some models may already have dynamic shapes out of the box and do not require additional configuration. This can either be because they were generated with dynamic shapes from the source framework, or because they were converted with Model Optimizer to use dynamic shapes. For more information, see the Undefined Dimensions "Out of the Box" section below.
The examples below show how to set dynamic dimensions with a model that has a static ``[1, 3, 224, 224]`` input shape (such as `mobilenet-v2 <https://docs.openvino.ai/latest/omz_models_model_mobilenet_v2.html>`__). The first example shows how to change the first dimension (batch size) to be dynamic. In the second example, the third and fourth dimensions (height and width) are set as dynamic.
.. tab-set::

   .. tab-item:: C++
      :sync: cpp

      .. doxygensnippet:: docs/snippets/ov_dynamic_shapes.cpp
         :language: cpp
         :fragment: ov_dynamic_shapes:reshape_undefined

   .. tab-item:: Python
      :sync: py

      .. doxygensnippet:: docs/snippets/ov_dynamic_shapes.py
         :language: python
         :fragment: reshape_undefined

      With Python, you may also pass all dimensions as a string and use ``?`` for the dynamic dimensions (e.g. ``model.reshape("1, 3, ?, ?")``).

   .. tab-item:: C
      :sync: c

      .. doxygensnippet:: docs/snippets/ov_dynamic_shapes.c
         :language: cpp
         :fragment: ov_dynamic_shapes:reshape_undefined
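For reference, a short Python sketch of the two variants described above (the model path is hypothetical):

.. code-block:: python

   import openvino.runtime as ov

   core = ov.Core()
   model = core.read_model("mobilenet-v2.xml")  # hypothetical model path

   # First example: make the batch dimension dynamic.
   model.reshape([-1, 3, 224, 224])

   # Second example: make height and width dynamic instead.
   model.reshape([1, 3, ov.Dimension(), ov.Dimension()])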
The examples above assume that the model has a single input layer. To change models with multiple input layers (such as NLP models), iterate over all the input layers, update the shape per layer, and apply the ``model.reshape`` method. For example, the following code sets the second dimension as dynamic in every input layer:
.. tab-set::
@ -64,48 +96,24 @@ Dynamic dimensions are specified as ``-1`` or the ``ov::Dimension()`` instead of
   .. tab-item:: C++
      :sync: cpp

      .. doxygensnippet:: docs/snippets/ov_dynamic_shapes.cpp
         :language: cpp
         :fragment: ov_dynamic_shapes:reshape_multiple_inputs

   .. tab-item:: Python
      :sync: py

      .. doxygensnippet:: docs/snippets/ov_dynamic_shapes.py
         :language: python
         :fragment: reshape_multiple_inputs

   .. tab-item:: C
      :sync: c

      .. doxygensnippet:: docs/snippets/ov_dynamic_shapes.c
         :language: cpp
         :fragment: ov_dynamic_shapes:reshape_multiple_inputs
For more examples of how to change multiple input layers, see :doc:`Changing Input Shapes <openvino_docs_OV_UG_ShapeInference>`.
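A hedged Python sketch of such a loop, assuming every input layer has at least two dimensions, might look as follows:

.. code-block:: python

   import openvino.runtime as ov

   core = ov.Core()
   model = core.read_model("model.xml")  # hypothetical model path

   # Collect a new partial shape per input, with the second dimension made dynamic.
   shapes = {}
   for input_layer in model.inputs:
       shapes[input_layer] = input_layer.partial_shape
       shapes[input_layer][1] = -1

   model.reshape(shapes)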
Undefined Dimensions "Out Of the Box"
-------------------------------------
Many DL frameworks support generating models with dynamic (or undefined) dimensions. If such a model is converted with Model Optimizer or read directly by ``Core::read_model``, its dynamic dimensions are preserved. These models do not need any additional configuration to use them with dynamic shapes.
To check if a model already has dynamic dimensions, first load it with the ``read_model()`` method, then check the ``partial_shape`` property of each layer. If the model has any dynamic dimensions, they will be reported as ``?``. For example, the following code will print the name and dimensions of each input layer:
.. tab-set::
@ -114,17 +122,55 @@ The bounds are coded as arguments for the ``ov::Dimension``:
   .. tab-item:: C++
      :sync: cpp

      .. doxygensnippet:: docs/snippets/ov_dynamic_shapes.cpp
         :language: cpp
         :fragment: ov_dynamic_shapes:check_inputs

   .. tab-item:: Python
      :sync: py

      .. doxygensnippet:: docs/snippets/ov_dynamic_shapes.py
         :language: python
         :fragment: check_inputs
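A minimal Python sketch of this check (the model path is hypothetical):

.. code-block:: python

   import openvino.runtime as ov

   core = ov.Core()
   model = core.read_model("model.xml")  # hypothetical model path

   # Dynamic dimensions are printed as '?' or as ranges such as '1..10'.
   for input_layer in model.inputs:
       print(input_layer.any_name, input_layer.partial_shape)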
If the input model has dynamic dimensions that will not change during inference, it is recommended to set them to static values using the ``reshape`` method, to save application memory and potentially improve inference speed. The OpenVINO API supports any combination of static and dynamic dimensions.
Static and dynamic dimensions can also be set when converting the model with Model Optimizer. It has identical capabilities to the ``reshape`` method, so you can save time by converting the model with dynamic shapes beforehand rather than in the application code. To get information about setting input shapes using Model Optimizer, refer to :doc:`Setting Input Shapes <openvino_docs_MO_DG_prepare_model_convert_model_Converting_Model>`.
Dimension Bounds
----------------
The lower and/or upper bounds of a dynamic dimension can also be specified. They define a range of allowed values for the dimension. Dimension bounds can be set by passing the lower and upper bounds into the ``reshape`` method using the options shown below.

The examples below show how to set dynamic dimension bounds for a mobilenet-v2 model with a default static shape of ``[1,3,224,224]``.
.. tab-set::
   .. tab-item:: C++
      :sync: cpp

      The dimension bounds can be coded as arguments for ``ov::Dimension``, as shown in these examples:

      .. doxygensnippet:: docs/snippets/ov_dynamic_shapes.cpp
         :language: cpp
         :fragment: ov_dynamic_shapes:reshape_bounds

   .. tab-item:: Python
      :sync: py

      Each of these options is equivalent:

      - Pass the lower and upper bounds directly into the ``reshape`` method, e.g. ``model.reshape([(1, 10), (8, 512)])``
      - Pass the lower and upper bounds using ``ov.Dimension``, e.g. ``model.reshape([ov.Dimension(1, 10), (8, 512)])``
      - Pass the dimension ranges as strings, e.g. ``model.reshape("1..10, 8..512")``

      .. doxygensnippet:: docs/snippets/ov_dynamic_shapes.py
         :language: python
         :fragment: reshape_bounds

   .. tab-item:: C
      :sync: c

      The dimension bounds can be coded as arguments for `ov_dimension <https://docs.openvino.ai/latest/structov_dimension.html#doxid-structov-dimension>`__, as shown in these examples:

      .. doxygensnippet:: docs/snippets/ov_dynamic_shapes.c
         :language: cpp
@ -143,13 +189,12 @@ Depending on the plugin, specifying the upper bounds can be required. For inform
If the lower and upper bounds for a dimension are known, it is recommended to specify them, even if a plugin can execute a model without the bounds.
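To give a concrete feel for the API, here is a hedged Python sketch that bounds the batch, height, and width dimensions of the mobilenet-v2 example above:

.. code-block:: python

   import openvino.runtime as ov

   core = ov.Core()
   model = core.read_model("mobilenet-v2.xml")  # hypothetical model path

   # Batch may range from 1 to 10; height and width from 224 to 512.
   model.reshape(
       [ov.Dimension(1, 10), 3, ov.Dimension(224, 512), ov.Dimension(224, 512)]
   )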
Preparing and Inferencing Dynamic Data
++++++++++++++++++++++++++++++++++++++
After configuring a model with the ``reshape`` method, the next steps are to create tensors with the appropriate data shape and pass them to the model as an inference request. This is similar to the regular steps described in :doc:`Integrate OpenVINO™ with Your Application <openvino_docs_OV_UG_Integrate_OV_with_your_application>`. However, tensors can now be passed into the model with different shapes.
The sample below shows how a model can accept different input shapes. In the first case, the model runs inference on a 1x128 input shape and returns a result. In the second case, a 1x200 input shape is used, which the model can still handle because it is dynamically shaped.
.. tab-set::
@ -166,7 +211,7 @@ This is similar to the :doc:`regular steps <openvino_docs_OV_UG_Integrate_OV_wit
.. doxygensnippet:: docs/snippets/ov_dynamic_shapes.py
:language: python
:fragment: set_input_tensor
.. tab-item:: C
:sync: c
@ -175,49 +220,15 @@ This is similar to the :doc:`regular steps <openvino_docs_OV_UG_Integrate_OV_wit
:fragment: ov_dynamic_shapes:set_input_tensor
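For orientation, the same two-shape sequence can be sketched in Python as follows (the model path and input shapes are hypothetical, assuming a ``[1, ?]`` input):

.. code-block:: python

   import numpy as np
   import openvino.runtime as ov

   core = ov.Core()
   model = core.read_model("model.xml")  # hypothetical model path
   model.reshape([1, -1])  # make the second dimension dynamic
   infer_request = core.compile_model(model, "CPU").create_infer_request()

   # First inference with a [1, 128] input ...
   infer_request.infer({0: np.ones([1, 128], dtype=np.int64)})

   # ... and a second one with a [1, 200] input on the same request.
   infer_request.infer({0: np.ones([1, 200], dtype=np.int64)})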
In the example above, the ``set_input_tensor`` method is used to specify input tensors. The real dimensions of a tensor are always static, because a tensor is a particular piece of data and, in contrast to model inputs, it has no dimension variations.

Similar to static shapes, ``get_input_tensor`` can be used instead of ``set_input_tensor``. In contrast to static input shapes, when using ``get_input_tensor`` for dynamic inputs, the ``set_shape`` method must be called on the returned tensor to define the shape and allocate memory. Without doing so, the tensor returned by ``get_input_tensor`` is an empty tensor: its shape is not initialized and memory is not allocated, because the infer request has no information about the real shape that will be provided. Setting a shape for an input tensor is required whenever the corresponding input has at least one dynamic dimension, regardless of the bounds.
In contrast to the previous example, the following one shows the same sequence of two infer requests, using ``get_input_tensor`` instead of ``set_input_tensor``:
.. tab-set::

   .. tab-item:: C++
      :sync: cpp

      .. doxygensnippet:: docs/snippets/ov_dynamic_shapes.cpp
         :language: cpp
         :fragment: ov_dynamic_shapes:get_input_tensor

   .. tab-item:: Python
      :sync: py

      .. doxygensnippet:: docs/snippets/ov_dynamic_shapes.py
         :language: python
         :fragment: get_input_tensor

   .. tab-item:: C
      :sync: c

      .. doxygensnippet:: docs/snippets/ov_dynamic_shapes.c
         :language: cpp
         :fragment: ov_dynamic_shapes:get_input_tensor
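In Python, the same idea can be sketched like this (model path and shapes are hypothetical):

.. code-block:: python

   import openvino.runtime as ov

   core = ov.Core()
   model = core.read_model("model.xml")  # hypothetical model path
   model.reshape([1, -1])  # the second dimension is dynamic
   infer_request = core.compile_model(model, "CPU").create_infer_request()

   # The returned tensor is empty until a shape is set.
   input_tensor = infer_request.get_input_tensor()
   input_tensor.shape = [1, 128]  # defines the shape and allocates memory
   input_tensor.data[:] = 1       # fill the tensor in place
   infer_request.infer()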
For more information on how to apply input data to a model and run inference, see :doc:`OpenVINO™ Inference Request <openvino_docs_OV_UG_Infer_request>`.
Dynamic Shapes in Outputs
+++++++++++++++++++++++++
When using dynamic dimensions in the input of a model, one or more output dimensions may also be dynamic depending on how the dynamic inputs are propagated through the model. For example, the batch dimension in an input shape is usually propagated through the whole model and appears in the output shape. It also applies to other dimensions, like sequence length for NLP models or spatial dimensions for segmentation models, that are propagated through the entire network.
To determine if the output has dynamic dimensions, the ``partial_shape`` property of the model's output layers can be queried after the model has been read or reshaped. The same property can be queried for model inputs. For example:
.. tab-set::
@ -244,9 +255,9 @@ The same applies to inputs. For example:
:fragment: ov_dynamic_shapes:print_dynamic
If the output has any dynamic dimensions, they will be reported as ``?`` or as a range (e.g. ``1..10``).
Output layers can also be checked for dynamic dimensions using the ``partial_shape.is_dynamic()`` property. This can be used on an entire output layer, or on an individual dimension, as shown in these examples:
.. tab-set::
@ -273,9 +284,9 @@ It can also be verified in a more programmatic way:
:fragment: ov_dynamic_shapes:detect_dynamic
If at least one dynamic dimension exists in the output layer of a model, the actual shape of the output tensor will be determined during inference. Before the first inference, the output tensor's memory is not allocated and has a shape of ``[0]``.
To pre-allocate space in memory for the output tensor, use the ``set_output_tensor`` method with the expected shape of the output. This will call the ``set_shape`` method internally, which will cause the initial shape to be replaced by the calculated shape.
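A short Python sketch of reading a dynamically shaped output after inference (model path and input are hypothetical):

.. code-block:: python

   import numpy as np
   import openvino.runtime as ov

   core = ov.Core()
   model = core.read_model("model.xml")  # hypothetical model path
   model.reshape([1, -1])
   infer_request = core.compile_model(model, "CPU").create_infer_request()

   infer_request.infer({0: np.ones([1, 128], dtype=np.int64)})

   # The output tensor's shape is set by the inference call itself.
   output_tensor = infer_request.get_output_tensor()
   print(output_tensor.shape)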
@endsphinxdirective


@ -13,14 +13,13 @@ The hints, in contrast, respect the actual model, so the parameters for optimal
Performance Hints: Latency and Throughput
#########################################
As discussed in the :doc:`Optimization Guide <openvino_docs_deployment_optimization_guide_dldt_optimization_guide>`, there are a few different metrics associated with inference speed. Throughput and latency are some of the most widely used metrics that measure the overall performance of an application.
Therefore, in order to ease the configuration of the device, OpenVINO offers two dedicated hints, namely ``ov::hint::PerformanceMode::THROUGHPUT`` and ``ov::hint::PerformanceMode::LATENCY``. A special ``ov::hint::PerformanceMode::UNDEFINED`` hint acts the same as specifying no hint.
For more information on conducting performance measurements with the ``benchmark_app``, refer to the last section in this document.
Keep in mind that a typical model may take significantly more time to load with the ``ov::hint::PerformanceMode::THROUGHPUT`` and consume much more memory, compared to the ``ov::hint::PerformanceMode::LATENCY``. Also, the ``THROUGHPUT`` and ``LATENCY`` hints only improve performance in an asynchronous inference pipeline. For information on asynchronous inference, see the :ref:`Prefer Async API <prefer-async-api>` section of this document.
Performance Hints: How It Works
###############################
@ -32,7 +31,7 @@ Additionally, the optimal batch size is selected for the GPU and the :doc:`autom
The resulting (device-specific) settings can be queried back from the instance of the ``ov::CompiledModel``.
Be aware that the ``benchmark_app`` outputs the actual settings for the ``THROUGHPUT`` hint. See the example of the output below:
.. code-block:: sh

   $benchmark_app -hint tput -d CPU -m 'path to your favorite model'
   ...
@ -108,11 +107,18 @@ While an application is free to create more requests if needed (for example to s
Keep in mind that ``ov::hint::PerformanceMode::LATENCY`` does not necessarily imply using a single inference request. For example, multi-socket CPUs can deliver as many requests at the same minimal latency as the number of NUMA nodes in the system.
To make your application fully scalable, make sure to query the ``ov::optimal_number_of_infer_requests`` directly.
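For example, a hedged Python sketch of that query (the model path is hypothetical):

.. code-block:: python

   import openvino.runtime as ov

   core = ov.Core()
   model = core.read_model("model.xml")  # hypothetical model path
   compiled_model = core.compile_model(model, "CPU", {"PERFORMANCE_HINT": "THROUGHPUT"})

   # Ask the device how many parallel requests it takes to saturate it.
   print(compiled_model.get_property("OPTIMAL_NUMBER_OF_INFER_REQUESTS"))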
.. _prefer-async-api:
Prefer Async API
################
The API of the inference requests offers Sync and Async execution. The ``ov::InferRequest::infer()`` is inherently synchronous and simple to operate (as it serializes the execution flow in the current application thread). The Async "splits" the ``infer()`` into ``ov::InferRequest::start_async()`` and ``ov::InferRequest::wait()`` (or callbacks). For more information on synchronous and asynchronous modes, refer to the :doc:`OpenVINO Inference Request documentation <openvino_docs_OV_UG_Infer_request>`.
Although the synchronous API can be easier to start with, it is recommended to use the asynchronous (callbacks-based) API in production code. It is the most general and scalable way to implement the flow control for any possible number of requests. The ``THROUGHPUT`` and ``LATENCY`` performance hints automatically configure the Asynchronous pipeline to use the optimal number of processing streams and inference requests.
.. note::

   **Important:** Performance Hints only work when asynchronous execution mode is used. They do not affect the performance of a synchronous pipeline.
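To make the Sync/Async distinction concrete, here is a hedged Python sketch of the asynchronous flow (model path and input shape are hypothetical):

.. code-block:: python

   import numpy as np
   import openvino.runtime as ov

   core = ov.Core()
   model = core.read_model("model.xml")  # hypothetical model path
   compiled_model = core.compile_model(model, "CPU", {"PERFORMANCE_HINT": "LATENCY"})
   infer_request = compiled_model.create_infer_request()

   data = np.random.rand(1, 3, 224, 224).astype(np.float32)  # hypothetical input

   infer_request.start_async({0: data})  # does not block the current thread
   # ... other application work can happen here ...
   infer_request.wait()                  # block until the result is ready
   print(infer_request.get_output_tensor().shape)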
Combining the Hints and Individual Low-Level Settings
#####################################################


@ -6,7 +6,7 @@
@sphinxdirective
The CPU plugin is a part of the Intel® Distribution of OpenVINO™ toolkit. It is developed to achieve high performance inference of neural networks on Intel® x86-64 CPUs. The newer 11th generation and later Intel® CPUs provide an even further performance boost, especially with INT8 models.
For an in-depth description of CPU plugin, see:
- `CPU plugin developers documentation <https://github.com/openvinotoolkit/openvino/blob/master/docs/dev/cmake_options_for_custom_comiplation.md>`__.


@ -125,8 +125,8 @@ Floating-point precision of a GPU primitive is selected based on operation preci
.. note::
The newer generation Intel Iris Xe and Xe MAX GPUs provide accelerated performance for ``i8``/``u8`` models. Hardware acceleration for ``i8``/``u8`` precision may be unavailable on older generation platforms. In such cases, a model is executed in the floating-point precision taken from IR.
Hardware support of ``u8``/``i8`` acceleration can be queried via the ``ov::device::capabilities`` property.
:doc:`Hello Query Device C++ Sample<openvino_inference_engine_samples_hello_query_device_README>` can be used to print out the supported data types for all detected devices.
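A minimal Python sketch of that query, in the style of the Hello Query Device sample:

.. code-block:: python

   from openvino.runtime import Core

   core = Core()
   for device in core.available_devices:
       # e.g. ['FP32', 'FP16', 'INT8', 'BIN'] on GPUs with u8/i8 support
       print(device, core.get_property(device, "OPTIMIZATION_CAPABILITIES"))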


@ -5,7 +5,7 @@
The OpenVINO Runtime can infer models with various input and output formats. Here, you can find configurations
supported by OpenVINO devices, which are CPU, GPU, or GNA (Gaussian neural accelerator coprocessor). 11th generation and later Intel® processors (currently up to 13th generation) provide a further performance boost, especially with INT8 models.
.. note::
@ -13,7 +13,6 @@ supported by OpenVINO devices, which are CPU, GPU, or GNA (Gaussian neural accel
The OpenVINO Runtime provides unique capabilities to infer deep learning models on the following devices:
+--------------------+--------------------------------------------------+
| OpenVINO Device    | Supported Hardware                               |
+====================+==================================================+
@ -66,17 +65,17 @@ This page shows supported and optimal configurations for each plugin.
Terminology
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+------------------+----------------------------------------------+
| Acronym/Term     | Description                                  |
+==================+==============================================+
| FP32 format      | Single-precision floating-point format       |
+------------------+----------------------------------------------+
| BF16 format      | Brain floating-point format                  |
+------------------+----------------------------------------------+
| FP16 format      | Half-precision floating-point format         |
+------------------+----------------------------------------------+
| I16 format       | 2-byte signed integer format                 |
+------------------+----------------------------------------------+
| I8 format        | 1-byte signed integer format                 |
+------------------+----------------------------------------------+
| U16 format       | 2-byte unsigned integer format               |
+------------------+----------------------------------------------+
| U8 format        | 1-byte unsigned integer format               |
+------------------+----------------------------------------------+
NHWC, NCHW, and NCDHW refer to the data ordering in batches of images:
@ -98,31 +97,31 @@ For example, the CHW value at index (c,h,w) is physically located at index (c\*H
Supported Model Formats
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+------------------+--------------------------+--------------------------+------------------------+
| Plugin           | FP32                     | FP16                     | I8                     |
+==================+==========================+==========================+========================+
| CPU plugin       | Supported and preferred  | Supported                | Supported              |
+------------------+--------------------------+--------------------------+------------------------+
| GPU plugin       | Supported                | Supported and preferred  | Supported              |
+------------------+--------------------------+--------------------------+------------------------+
| GNA plugin       | Supported                | Supported                | Not supported          |
+------------------+--------------------------+--------------------------+------------------------+
| Arm® CPU plugin  | Supported and preferred  | Supported                | Supported (partially)  |
+------------------+--------------------------+--------------------------+------------------------+
For :doc:`Multi-Device <openvino_docs_OV_UG_Running_on_multiple_devices>` and
:doc:`Heterogeneous <openvino_docs_OV_UG_Hetero_execution>` executions, the supported models formats depends
on the actual underlying devices.
Supported Input Precision
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+------------------+------------+----------------+--------------+----------------+----------------+----------------+
| Plugin           | FP32       | FP16           | U8           | U16            | I8             | I16            |
+==================+============+================+==============+================+================+================+
| CPU plugin       | Supported  | Supported      | Supported    | Supported      | Supported      | Supported      |
+------------------+------------+----------------+--------------+----------------+----------------+----------------+
| GPU plugin       | Supported  | Supported\*    | Supported\*  | Supported\*    | Not supported  | Supported\*    |
+------------------+------------+----------------+--------------+----------------+----------------+----------------+
| GNA plugin       | Supported  | Not supported  | Supported    | Not supported  | Supported      | Supported      |
+------------------+------------+----------------+--------------+----------------+----------------+----------------+
| Arm® CPU plugin  | Supported  | Supported      | Supported    | Supported      | Not supported  | Not supported  |
+------------------+------------+----------------+--------------+----------------+----------------+----------------+
\* - Supported via ``SetBlob`` only, ``GetBlob`` returns FP32
@ -133,14 +132,14 @@ depends on the actual underlying devices. *Generally, U8 is preferable as it is
Supported Output Precision
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+------------------+------------+----------------+
| Plugin           | FP32       | FP16           |
+==================+============+================+
| CPU plugin       | Supported  | Supported      |
+------------------+------------+----------------+
| GPU plugin       | Supported  | Supported      |
+------------------+------------+----------------+
| GNA plugin       | Supported  | Not supported  |
+------------------+------------+----------------+
| Arm® CPU plugin  | Supported  | Supported      |
+------------------+------------+----------------+
For :doc:`Multi-Device <openvino_docs_OV_UG_Running_on_multiple_devices>` and
:doc:`Heterogeneous <openvino_docs_OV_UG_Hetero_execution>` executions, the supported output precision
@ -149,23 +148,23 @@ depends on the actual underlying devices. *Generally, FP32 is preferable as it i
Supported Input Layout
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+------------------+----------------+------------+------------+------------+
| Plugin           | NCDHW          | NCHW       | NHWC       | NC         |
+==================+================+============+============+============+
| CPU plugin       | Supported      | Supported  | Supported  | Supported  |
+------------------+----------------+------------+------------+------------+
| GPU plugin       | Supported      | Supported  | Supported  | Supported  |
+------------------+----------------+------------+------------+------------+
| GNA plugin       | Not supported  | Supported  | Supported  | Supported  |
+------------------+----------------+------------+------------+------------+
| Arm® CPU plugin  | Not supported  | Supported  | Supported  | Supported  |
+------------------+----------------+------------+------------+------------+
Supported Output Layout
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+-----------------------+--------+-------+------+-----+----+
| Number of dimensions  | 5      | 4     | 3    | 2   | 1  |
+=======================+========+=======+======+=====+====+
| Layout                | NCDHW  | NCHW  | CHW  | NC  | C  |
+-----------------------+--------+-------+------+-----+----+
For setting relevant configuration, refer to the
@ -179,154 +178,154 @@ Supported Layers
The following layers are supported by the plugins:
============================== ============== =============== ============== ==================
Layers                         GPU            CPU             GNA            Arm® CPU
============================== ============== =============== ============== ==================
Abs                            Supported      Supported\*\*   Not Supported  Supported
Acos                           Supported      Supported\*\*   Not Supported  Supported\*\*\*\*
Acosh                          Supported      Supported\*\*   Not Supported  Supported\*\*\*\*
Activation-Clamp               Supported      Supported\*\*\* Supported      Supported
Activation-ELU                 Supported      Supported\*\*\* Not Supported  Supported
Activation-Exp                 Supported      Supported\*\*\* Supported      Supported
Activation-Leaky ReLU          Supported      Supported\*\*\* Supported      Not Supported
Activation-Not                 Supported      Supported\*\*\* Not Supported  Not Supported
Activation-PReLU               Supported      Supported\*\*\* Not Supported  Supported
Activation-ReLU                Supported      Supported\*\*\* Supported      Supported
Activation-ReLU6               Supported      Supported\*\*\* Not Supported  Not Supported
Activation-Sigmoid/Logistic    Supported      Supported\*\*\* Supported      Supported
Activation-TanH                Supported      Supported\*\*\* Supported      Supported
ArgMax                         Supported      Supported\*\*   Not Supported  Not Supported
Asin                           Supported      Supported\*\*   Not Supported  Supported\*\*\*\*
Asinh                          Supported      Supported\*\*   Not Supported  Supported\*\*\*\*
Atan                           Supported      Supported\*\*   Not Supported  Supported\*\*\*\*
Atanh                          Supported      Supported\*\*   Not Supported  Supported\*\*\*\*
BatchNormalization             Supported      Supported       Not Supported  Supported
BinaryConvolution              Supported      Supported       Not Supported  Not Supported
Broadcast                      Supported      Supported\*\*   Not Supported  Supported
Ceil                           Supported      Supported\*\*   Not Supported  Supported
Concat                         Supported      Supported\*\*\* Supported      Supported
Const                          Supported      Supported       Supported      Supported
Convolution-Dilated            Supported      Supported       Not Supported  Supported
Convolution-Dilated 3D         Supported      Supported       Not Supported  Not Supported
Convolution-Grouped            Supported      Supported       Not Supported  Supported
Convolution-Grouped 3D         Supported      Supported       Not Supported  Not Supported
Convolution-Ordinary           Supported      Supported       Supported\*    Supported
Convolution-Ordinary 3D        Supported      Supported       Not Supported  Not Supported
Cos                            Supported      Supported\*\*   Not Supported  Supported\*\*\*\*
Cosh                           Supported      Supported\*\*   Not Supported  Supported\*\*\*\*
Crop                           Supported      Supported       Supported      Not Supported
CTCGreedyDecoder               Supported\*\*  Supported\*\*   Not Supported  Supported\*\*\*\*
Deconvolution                  Supported      Supported       Not Supported  Not Supported
Deconvolution 3D               Supported      Supported       Not Supported  Not Supported
DeformableConvolution          Supported      Supported       Not Supported  Not Supported
DepthToSpace                   Supported      Supported\*\*   Not Supported  Supported\*
DetectionOutput                Supported      Supported\*\*   Not Supported  Supported\*\*\*\*
Eltwise-And                    Supported      Supported\*\*\* Not Supported  Supported
Eltwise-Add                    Supported      Supported\*\*\* Not Supported  Supported
Eltwise-Div                    Supported      Supported\*\*\* Not Supported  Supported
Eltwise-Equal                  Supported      Supported\*\*\* Not Supported  Supported\*
Eltwise-FloorMod               Supported      Supported\*\*\* Not Supported  Supported\*\*\*\*
Eltwise-Greater                Supported      Supported\*\*\* Not Supported  Supported
Eltwise-GreaterEqual           Supported      Supported\*\*\* Not Supported  Supported
Eltwise-Less                   Supported      Supported\*\*\* Not Supported  Supported\*
Eltwise-LessEqual              Supported      Supported\*\*\* Not Supported  Supported\*
Eltwise-LogicalAnd             Supported      Supported\*\*\* Not Supported  Supported
Eltwise-LogicalOr              Supported      Supported\*\*\* Not Supported  Supported
Eltwise-LogicalXor             Supported      Supported\*\*\* Not Supported  Supported
Eltwise-Max                    Supported      Supported\*\*\* Not Supported  Supported
Eltwise-Min                    Supported      Supported\*\*\* Not Supported  Supported
Eltwise-Mul                    Supported      Supported\*\*\* Supported      Supported
Eltwise-NotEqual               Supported      Supported\*\*\* Not Supported  Supported\*
Eltwise-Pow                    Supported      Supported\*\*\* Not Supported  Supported
Eltwise-Prod                   Supported      Supported\*\*\* Supported      Not Supported
Eltwise-SquaredDiff            Supported      Supported\*\*\* Not Supported  Supported
Eltwise-Sub                    Supported      Supported\*\*\* Supported      Supported
Eltwise-Sum                    Supported      Supported\*\*\* Supported      Supported\*\*\*\*
Erf                            Supported      Supported\*\*   Not Supported  Supported\*\*\*\*
Exp                            Supported      Supported       Supported      Supported
FakeQuantize                   Not Supported  Supported       Not Supported  Supported\*
Fill                           Not Supported  Supported\*\*   Not Supported  Not Supported
Flatten                        Supported      Supported       Not Supported  Not Supported
Floor                          Supported      Supported\*\*   Not Supported  Supported
FullyConnected (Inner Product) Supported      Supported\*\*\* Supported      Supported
Gather                         Supported      Supported\*\*   Not Supported  Supported\*
GatherTree                     Not Supported  Supported\*\*   Not Supported  Supported\*\*\*\*
Gemm                           Supported      Supported       Not Supported  Not Supported
GRN                            Supported\*\*  Supported\*\*   Not Supported  Supported
HardSigmoid                    Supported      Supported\*\*   Not Supported  Supported\*\*\*\*
Interp                         Supported\*\*  Supported\*\*   Not Supported  Supported\*
Log                            Supported      Supported\*\*   Supported      Supported
LRN (Norm)                     Supported      Supported       Not Supported  Supported\*
LSTMCell                       Supported      Supported       Supported      Supported
GRUCell                        Supported      Supported       Supported      Supported
RNNCell                        Supported      Supported       Not Supported  Supported
LSTMSequence                   Supported      Supported       Supported      Supported\*\*\*\*
GRUSequence                    Supported      Supported       Supported      Supported\*\*\*\*
RNNSequence                    Supported      Supported       Not Supported  Supported\*\*\*\*
LogSoftmax                     Supported      Supported\*\*   Not Supported  Supported
Memory                         Not Supported  Supported       Supported      Not Supported
MVN                            Supported      Supported\*\*   Not Supported  Supported\*
Neg                            Supported      Supported\*\*   Not Supported  Supported
NonMaxSuppression              Not Supported  Supported\*\*   Not Supported  Supported\*\*\*\*
Normalize                      Supported      Supported\*\*   Not Supported  Supported\*
OneHot                         Supported      Supported\*\*   Not Supported  Supported\*\*\*\*
Pad                            Supported      Supported\*\*   Not Supported  Supported\*
Permute                        Supported      Supported       Supported\*    Not Supported
Pooling(AVG,MAX)               Supported      Supported       Supported      Supported
Pooling(AVG,MAX) 3D            Supported      Supported       Not Supported  Supported\*
Power                          Supported      Supported\*\*   Supported\*    Supported
PowerFile                      Not Supported  Supported\*\*   Not Supported  Not Supported
PriorBox                       Supported      Supported\*\*   Not Supported  Supported
PriorBoxClustered              Supported\*\*  Supported\*\*   Not Supported  Supported
Proposal                       Supported      Supported\*\*   Not Supported  Supported\*\*\*\*
PSROIPooling                   Supported      Supported\*\*   Not Supported  Supported\*\*\*\*
Range                          Not Supported  Supported\*\*   Not Supported  Not Supported
Reciprocal                     Supported      Supported\*\*   Not Supported  Not Supported
ReduceAnd                      Supported      Supported\*\*   Not Supported  Supported\*\*\*\*
ReduceL1                       Supported      Supported\*\*   Not Supported  Supported
ReduceL2                       Supported      Supported\*\*   Not Supported  Supported
ReduceLogSum                   Supported      Supported\*\*   Not Supported  Supported
ReduceLogSumExp                Supported      Supported\*\*   Not Supported  Not Supported
ReduceMax                      Supported      Supported\*\*   Not Supported  Supported
ReduceMean                     Supported      Supported\*\*   Not Supported  Supported
ReduceMin                      Supported      Supported\*\*   Not Supported  Supported
ReduceOr                       Supported      Supported\*\*   Not Supported  Supported\*\*\*\*
ReduceProd                     Supported      Supported\*\*   Not Supported  Supported
ReduceSum                      Supported      Supported\*\*   Not Supported  Supported
ReduceSumSquare                Supported      Supported\*\*   Not Supported  Not Supported
RegionYolo                     Supported      Supported\*\*   Not Supported  Supported\*\*\*\*
ReorgYolo                      Supported      Supported\*\*   Not Supported  Supported
Resample                       Supported      Supported\*\*   Not Supported  Not Supported
Reshape                        Supported      Supported\*\*\* Supported      Supported
ReverseSequence                Supported      Supported\*\*   Not Supported  Supported\*\*\*\*
RNN                            Not Supported  Supported       Not Supported  Supported
ROIPooling                     Supported\*    Supported       Not Supported  Supported\*\*\*\*
ScaleShift                     Supported      Supported\*\*\* Supported      Not Supported
ScatterUpdate                  Not Supported  Supported\*\*   Not Supported  Not Supported
Select                         Supported      Supported       Not Supported  Supported
Selu                           Supported      Supported\*\*   Not Supported  Supported\*\*\*\*
ShuffleChannels                Supported      Supported\*\*   Not Supported  Supported
Sign                           Supported      Supported\*\*   Not Supported  Supported
Sin                            Supported      Supported\*\*   Not Supported  Supported
Sinh                           Supported      Supported\*\*   Not Supported  Supported\*\*\*\*
SimplerNMS                     Supported      Supported\*\*   Not Supported  Not Supported
Slice                          Supported      Supported\*\*\* Supported      Not Supported
SoftMax                        Supported      Supported\*\*\* Not Supported  Supported
Softplus                       Supported      Supported\*\*   Not Supported  Supported
Softsign                       Supported      Supported\*\*   Supported      Not Supported
SpaceToDepth                   Not Supported  Supported\*\*   Not Supported  Supported\*
SpatialTransformer             Not Supported  Supported\*\*   Not Supported  Not Supported
Split                          Supported      Supported\*\*\* Supported      Supported
Squeeze                        Supported      Supported\*\*   Supported      Supported
StridedSlice                   Supported      Supported\*\*   Not Supported  Supported\*
Tan                            Supported      Supported\*\*   Not Supported  Supported\*\*\*\*
TensorIterator Not Supported Supported Supported Supported
Tile Supported\*\* Supported\*\*\* Not Supported Supported
TopK Supported Supported\*\* Not Supported Supported\*\*\*\*
Unpooling Supported Not Supported Not Supported Not Supported
Unsqueeze Supported Supported\*\* Supported Supported
Upsampling Supported Not Supported Not Supported Not Supported
============================== ============== =============== ============== ==================
+--------------------------------+----------------+-----------------+----------------+--------------------+
| Layers | GPU | CPU | GNA | Arm® CPU |
+================================+================+=================+================+====================+
| Abs | Supported | Supported\*\* | Not Supported | Supported |
| Acos | Supported | Supported\*\* | Not Supported | Supported\*\*\*\* |
| Acosh | Supported | Supported\*\* | Not Supported | Supported\*\*\*\* |
| Activation-Clamp | Supported | Supported\*\*\* | Supported | Supported |
| Activation-ELU | Supported | Supported\*\*\* | Not Supported | Supported |
| Activation-Exp | Supported | Supported\*\*\* | Supported | Supported |
| Activation-Leaky ReLU | Supported | Supported\*\*\* | Supported | Not Supported |
| Activation-Not | Supported | Supported\*\*\* | Not Supported | Not Supported |
| Activation-PReLU | Supported | Supported\*\*\* | Not Supported | Supported |
| Activation-ReLU | Supported | Supported\*\*\* | Supported | Supported |
| Activation-ReLU6 | Supported | Supported\*\*\* | Not Supported | Not Supported |
| Activation-Sigmoid/Logistic | Supported | Supported\*\*\* | Supported | Supported |
| Activation-TanH | Supported | Supported\*\*\* | Supported | Supported |
| ArgMax | Supported | Supported\*\* | Not Supported | Not Supported |
| Asin | Supported | Supported\*\* | Not Supported | Supported\*\*\*\* |
| Asinh | Supported | Supported\*\* | Not Supported | Supported\*\*\*\* |
| Atan | Supported | Supported\*\* | Not Supported | Supported\*\*\*\* |
| Atanh | Supported | Supported\*\* | Not Supported | Supported\*\*\*\* |
| BatchNormalization | Supported | Supported | Not Supported | Supported |
| BinaryConvolution | Supported | Supported | Not Supported | Not Supported |
| Broadcast | Supported | Supported\*\* | Not Supported | Supported |
| Ceil | Supported | Supported\*\* | Not Supported | Supported |
| Concat | Supported | Supported\*\*\* | Supported | Supported |
| Const | Supported | Supported | Supported | Supported |
| Convolution-Dilated | Supported | Supported | Not Supported | Supported |
| Convolution-Dilated 3D | Supported | Supported | Not Supported | Not Supported |
| Convolution-Grouped | Supported | Supported | Not Supported | Supported |
| Convolution-Grouped 3D | Supported | Supported | Not Supported | Not Supported |
| Convolution-Ordinary | Supported | Supported | Supported\* | Supported |
| Convolution-Ordinary 3D | Supported | Supported | Not Supported | Not Supported |
| Cos | Supported | Supported\*\* | Not Supported | Supported\*\*\*\* |
| Cosh | Supported | Supported\*\* | Not Supported | Supported\*\*\*\* |
| Crop | Supported | Supported | Supported | Not Supported |
| CTCGreedyDecoder | Supported\*\* | Supported\*\* | Not Supported | Supported\*\*\*\* |
| Deconvolution | Supported | Supported | Not Supported | Not Supported |
| Deconvolution 3D | Supported | Supported | Not Supported | Not Supported |
| DeformableConvolution | Supported | Supported | Not Supported | Not Supported |
| DepthToSpace | Supported | Supported\*\* | Not Supported | Supported\* |
| DetectionOutput | Supported | Supported\*\* | Not Supported | Supported\*\*\*\* |
| Eltwise-And | Supported | Supported\*\*\* | Not Supported | Supported |
| Eltwise-Add | Supported | Supported\*\*\* | Not Supported | Supported |
| Eltwise-Div | Supported | Supported\*\*\* | Not Supported | Supported |
| Eltwise-Equal | Supported | Supported\*\*\* | Not Supported | Supported\* |
| Eltwise-FloorMod | Supported | Supported\*\*\* | Not Supported | Supported\*\*\*\* |
| Eltwise-Greater | Supported | Supported\*\*\* | Not Supported | Supported |
| Eltwise-GreaterEqual | Supported | Supported\*\*\* | Not Supported | Supported |
| Eltwise-Less | Supported | Supported\*\*\* | Not Supported | Supported\* |
| Eltwise-LessEqual | Supported | Supported\*\*\* | Not Supported | Supported\* |
| Eltwise-LogicalAnd | Supported | Supported\*\*\* | Not Supported | Supported |
| Eltwise-LogicalOr | Supported | Supported\*\*\* | Not Supported | Supported |
| Eltwise-LogicalXor | Supported | Supported\*\*\* | Not Supported | Supported |
| Eltwise-Max | Supported | Supported\*\*\* | Not Supported | Supported |
| Eltwise-Min | Supported | Supported\*\*\* | Not Supported | Supported |
| Eltwise-Mul | Supported | Supported\*\*\* | Supported | Supported |
| Eltwise-NotEqual | Supported | Supported\*\*\* | Not Supported | Supported\* |
| Eltwise-Pow | Supported | Supported\*\*\* | Not Supported | Supported |
| Eltwise-Prod | Supported | Supported\*\*\* | Supported | Not Supported |
| Eltwise-SquaredDiff | Supported | Supported\*\*\* | Not Supported | Supported |
| Eltwise-Sub | Supported | Supported\*\*\* | Supported | Supported |
| Eltwise-Sum | Supported | Supported\*\*\* | Supported | Supported\*\*\*\* |
| Erf | Supported | Supported\*\* | Not Supported | Supported\*\*\*\* |
| Exp | Supported | Supported | Supported | Supported |
| FakeQuantize | Not Supported | Supported | Not Supported | Supported\* |
| Fill | Not Supported | Supported\*\* | Not Supported | Not Supported |
| Flatten | Supported | Supported | Not Supported | Not Supported |
| Floor | Supported | Supported\*\* | Not Supported | Supported |
| FullyConnected (Inner Product) | Supported | Supported\*\*\* | Supported | Supported |
| Gather | Supported | Supported\*\* | Not Supported | Supported\* |
| GatherTree | Not Supported | Supported\*\* | Not Supported | Supported\*\*\*\* |
| Gemm | Supported | Supported | Not Supported | Not Supported |
| GRN | Supported\*\* | Supported\*\* | Not Supported | Supported |
| HardSigmoid | Supported | Supported\*\* | Not Supported | Supported\*\*\*\* |
| Interp | Supported\*\* | Supported\*\* | Not Supported | Supported\* |
| Log | Supported | Supported\*\* | Supported | Supported |
| LRN (Norm) | Supported | Supported | Not Supported | Supported\* |
| LSTMCell | Supported | Supported | Supported | Supported |
| GRUCell | Supported | Supported | Supported | Supported |
| RNNCell | Supported | Supported | Not Supported | Supported |
| LSTMSequence | Supported | Supported | Supported | Supported\*\*\*\* |
| GRUSequence | Supported | Supported | Supported | Supported\*\*\*\* |
| RNNSequence | Supported | Supported | Not Supported | Supported\*\*\*\* |
| LogSoftmax | Supported | Supported\*\* | Not Supported | Supported |
| Memory | Not Supported | Supported | Supported | Not Supported |
| MVN | Supported | Supported\*\* | Not Supported | Supported\* |
| Neg | Supported | Supported\*\* | Not Supported | Supported |
| NonMaxSuppression | Not Supported | Supported\*\* | Not Supported | Supported\*\*\*\* |
| Normalize | Supported | Supported\*\* | Not Supported | Supported\* |
| OneHot | Supported | Supported\*\* | Not Supported | Supported\*\*\*\* |
| Pad | Supported | Supported\*\* | Not Supported | Supported\* |
| Permute | Supported | Supported | Supported\* | Not Supported |
| Pooling(AVG,MAX) | Supported | Supported | Supported | Supported |
| Pooling(AVG,MAX) 3D | Supported | Supported | Not Supported | Supported\* |
| Power | Supported | Supported\*\* | Supported\* | Supported |
| PowerFile | Not Supported | Supported\*\* | Not Supported | Not Supported |
| PriorBox | Supported | Supported\*\* | Not Supported | Supported |
| PriorBoxClustered | Supported\*\* | Supported\*\* | Not Supported | Supported |
| Proposal | Supported | Supported\*\* | Not Supported | Supported\*\*\*\* |
| PSROIPooling | Supported | Supported\*\* | Not Supported | Supported\*\*\*\* |
| Range | Not Supported | Supported\*\* | Not Supported | Not Supported |
| Reciprocal | Supported | Supported\*\* | Not Supported | Not Supported |
| ReduceAnd | Supported | Supported\*\* | Not Supported | Supported\*\*\*\* |
| ReduceL1 | Supported | Supported\*\* | Not Supported | Supported |
| ReduceL2 | Supported | Supported\*\* | Not Supported | Supported |
| ReduceLogSum | Supported | Supported\*\* | Not Supported | Supported |
| ReduceLogSumExp | Supported | Supported\*\* | Not Supported | Not Supported |
| ReduceMax | Supported | Supported\*\* | Not Supported | Supported |
| ReduceMean | Supported | Supported\*\* | Not Supported | Supported |
| ReduceMin | Supported | Supported\*\* | Not Supported | Supported |
| ReduceOr | Supported | Supported\*\* | Not Supported | Supported\*\*\*\* |
| ReduceProd | Supported | Supported\*\* | Not Supported | Supported |
| ReduceSum | Supported | Supported\*\* | Not Supported | Supported |
| ReduceSumSquare | Supported | Supported\*\* | Not Supported | Not Supported |
| RegionYolo | Supported | Supported\*\* | Not Supported | Supported\*\*\*\* |
| ReorgYolo | Supported | Supported\*\* | Not Supported | Supported |
| Resample | Supported | Supported\*\* | Not Supported | Not Supported |
| Reshape | Supported | Supported\*\*\* | Supported | Supported |
| ReverseSequence | Supported | Supported\*\* | Not Supported | Supported\*\*\*\* |
| RNN | Not Supported | Supported | Not Supported | Supported |
| ROIPooling | Supported\* | Supported | Not Supported | Supported\*\*\*\* |
| ScaleShift | Supported | Supported\*\*\* | Supported | Not Supported |
| ScatterUpdate | Not Supported | Supported\*\* | Not Supported | Not Supported |
| Select | Supported | Supported | Not Supported | Supported |
| Selu | Supported | Supported\*\* | Not Supported | Supported\*\*\*\* |
| ShuffleChannels | Supported | Supported\*\* | Not Supported | Supported |
| Sign | Supported | Supported\*\* | Not Supported | Supported |
| Sin | Supported | Supported\*\* | Not Supported | Supported |
| Sinh | Supported | Supported\*\* | Not Supported | Supported\*\*\*\* |
| SimplerNMS | Supported | Supported\*\* | Not Supported | Not Supported |
| Slice | Supported | Supported\*\*\* | Supported | Not Supported |
| SoftMax | Supported | Supported\*\*\* | Not Supported | Supported |
| Softplus | Supported | Supported\*\* | Not Supported | Supported |
| Softsign | Supported | Supported\*\* | Supported | Not Supported |
| SpaceToDepth | Not Supported | Supported\*\* | Not Supported | Supported\* |
| SpatialTransformer | Not Supported | Supported\*\* | Not Supported | Not Supported |
| Split | Supported | Supported\*\*\* | Supported | Supported |
| Squeeze | Supported | Supported\*\* | Supported | Supported |
| StridedSlice | Supported | Supported\*\* | Not Supported | Supported\* |
| Tan | Supported | Supported\*\* | Not Supported | Supported\*\*\*\* |
| TensorIterator | Not Supported | Supported | Supported | Supported |
| Tile | Supported\*\* | Supported\*\*\* | Not Supported | Supported |
| TopK | Supported | Supported\*\* | Not Supported | Supported\*\*\*\* |
| Unpooling | Supported | Not Supported | Not Supported | Not Supported |
| Unsqueeze | Supported | Supported\*\* | Supported | Supported |
| Upsampling | Supported | Not Supported | Not Supported | Not Supported |
+--------------------------------+----------------+-----------------+----------------+--------------------+
\* - support is limited to specific parameters. Refer to the "Known Layer Limitations" section for the given device in the :doc:`list of supported devices <openvino_docs_OV_UG_supported_plugins_Supported_Devices>`.
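As a practical complement to the table, layer support for a concrete model can be checked programmatically. The following is a minimal sketch, assuming the standard ``query_model`` API; it reports which device each operation in the model can be assigned to (the ``CPU`` device name is just an example):

.. code-block:: py

   import openvino.runtime as ov

   core = ov.Core()
   model = core.read_model("model.xml")
   # Map every operation in the model to a device able to execute it
   supported_ops = core.query_model(model, "CPU")
   for op_name, device in supported_ops.items():
       print(op_name, "->", device)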

View File

@ -12,21 +12,21 @@ ov_core_create(&core);
ov_model_t* model = NULL;
ov_core_read_model(core, "model.xml", NULL, &model);
// Set first dimension as dynamic ({-1, -1}) and remaining dimensions as static
{
    ov_partial_shape_t partial_shape;
    ov_dimension_t dims[4] = {{-1, -1}, {3, 3}, {224, 224}, {224, 224}};
    ov_partial_shape_create(4, dims, &partial_shape);
    ov_model_reshape_single_input(model, partial_shape); // {?,3,224,224}
    ov_partial_shape_free(&partial_shape);
}
// Or, set third and fourth dimensions as dynamic
{
    ov_partial_shape_t partial_shape;
    ov_dimension_t dims[4] = {{1, 1}, {3, 3}, {-1, -1}, {-1, -1}};
    ov_partial_shape_create(4, dims, &partial_shape);
    ov_model_reshape_single_input(model, partial_shape); // {1,3,?,?}
    ov_partial_shape_free(&partial_shape);
}
//! [ov_dynamic_shapes:reshape_undefined]
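// A hypothetical extension of the snippet above (not part of the original
// sample): since ov_dimension_t holds a {min, max} pair, a bounded dynamic
// dimension can be set the same way, e.g. the first dimension within 1..10
// and the second within 8..512:
{
    ov_partial_shape_t partial_shape;
    ov_dimension_t dims[2] = {{1, 10}, {8, 512}};
    ov_partial_shape_create(2, dims, &partial_shape);
    ov_model_reshape_single_input(model, partial_shape); // {1..10,8..512}
    ov_partial_shape_free(&partial_shape);
}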

View File

@ -10,17 +10,11 @@ void reshape_with_dynamics() {
ov::Core core;
auto model = core.read_model("model.xml");
// Set first dimension as dynamic (ov::Dimension()) and remaining dimensions as static
model->reshape({{ov::Dimension(), 3, 224, 224}}); // {?,3,224,224}
// Or, set third and fourth dimensions as dynamic
model->reshape({{1, 3, ov::Dimension(), ov::Dimension()}}); // {1,3,?,?}
//! [ov_dynamic_shapes:reshape_undefined]
//! [ov_dynamic_shapes:reshape_bounds]
// Both dimensions are dynamic, first has a size within 1..10 and the second has a size within 8..512
@ -59,6 +53,34 @@ if (model->output(0).get_partial_shape()[1].is_dynamic()) {
}
//! [ov_dynamic_shapes:detect_dynamic]
}
{
    //! [ov_dynamic_shapes:check_inputs]
    ov::Core core;
    auto model = core.read_model("model.xml");
    // Print info of first input layer
    std::cout << model->input(0).get_partial_shape() << "\n";
    // Print info of second input layer
    std::cout << model->input(1).get_partial_shape() << "\n";
    // etc.
    //! [ov_dynamic_shapes:check_inputs]
}
{
    ov::Core core;
    auto model = core.read_model("model.xml");
    //! [ov_dynamic_shapes:reshape_multiple_inputs]
    // Assign dynamic shapes to the second dimension in every input layer
    std::map<ov::Output<ov::Node>, ov::PartialShape> port_to_shape;
    for (const ov::Output<ov::Node>& input : model->inputs()) {
        ov::PartialShape shape = input.get_partial_shape();
        shape[1] = -1;
        port_to_shape[input] = shape;
    }
    model->reshape(port_to_shape);
    //! [ov_dynamic_shapes:reshape_multiple_inputs]
}
}
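// A hedged usage sketch (not in the original file): after reshaping with
// dynamic dimensions, the model is compiled and used as usual. The function
// name and the "CPU" device below are illustrative assumptions.
void compile_after_reshape() {
    ov::Core core;
    auto model = core.read_model("model.xml");
    model->reshape({{ov::Dimension(), 3, 224, 224}}); // {?,3,224,224}
    auto compiled_model = core.compile_model(model, "CPU");
    auto infer_request = compiled_model.create_infer_request();
}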
void set_tensor() {

View File

@ -7,40 +7,22 @@ import openvino.runtime as ov
#! [import]
#! [reshape_undefined]
core = ov.Core()
model = core.read_model("model.xml")
# Set first dimension to be dynamic while keeping others static
model.reshape([-1, 3, 224, 224])
# Or, set third and fourth dimensions as dynamic
model.reshape([1, 3, -1, -1])
#! [reshape_undefined]
#! [reshape_bounds]
# Example 1 - set first dimension as dynamic (no bounds) and third and fourth dimensions to a range of 112..448
model.reshape([-1, 3, (112, 448), (112, 448)])
# Example 2 - set first dimension to a range of 1..8 and third and fourth dimensions to a range of 112..448
model.reshape([(1, 8), 3, (112, 448), (112, 448)])
#! [reshape_bounds]
model = core.read_model("model.xml")
@ -73,45 +55,21 @@ executable = core.compile_model(model)
infer_request = executable.create_infer_request()
#! [set_input_tensor]
# For first inference call, prepare an input tensor with 1x128 shape and run inference request
input_data1 = np.ones(shape=[1, 128])
infer_request.infer([input_data1])
# Get resulting outputs
output_tensor1 = infer_request.get_output_tensor()
output_data1 = output_tensor1.data[:]
# For second inference call, prepare a 1x200 input tensor and run inference request
input_data2 = np.ones(shape=[1, 200])
infer_request.infer([input_data2])
# Get resulting outputs
output_tensor2 = infer_request.get_output_tensor()
output_data2 = output_tensor2.data[:]
#! [set_input_tensor]
infer_request = executable.create_infer_request()
@ -137,3 +95,21 @@ input_tensor.shape = [1, 200]
infer_request.infer()
data2 = output_tensor.data[:]
#! [get_input_tensor]
#! [check_inputs]
core = ov.Core()
model = core.read_model("model.xml")
# Print model input layer info
for input_layer in model.inputs:
    print(input_layer.names, input_layer.partial_shape)
#! [check_inputs]
#! [reshape_multiple_inputs]
# Assign dynamic shapes to second dimension in every input layer
shapes = {}
for input_layer in model.inputs:
    shapes[input_layer] = input_layer.partial_shape
    shapes[input_layer][1] = -1
model.reshape(shapes)
#! [reshape_multiple_inputs]
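# A hedged follow-up sketch (not part of the original snippet): after the
# reshape above, compile the model and print the now-dynamic shapes.
# The "CPU" device name is an assumption for illustration.
compiled_model = core.compile_model(model, "CPU")
for input_layer in model.inputs:
    print(input_layer.names, input_layer.partial_shape)  # second dimension prints as '?'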