Arm® CPU device
Introducing the Arm® CPU Plugin
The Arm® CPU plugin enables inference of deep neural networks on Arm® CPUs, using the Compute Library as a backend.
Note: This is a community-level add-on to OpenVINO™. Intel® welcomes community participation in the OpenVINO™ ecosystem; technical questions on community forums as well as code contributions are welcome. However, this component has not undergone full release validation or qualification from Intel®, and no official support is offered.
The Arm® CPU plugin is not a part of the Intel® Distribution of OpenVINO™ toolkit and is not distributed in pre-built form. To use the plugin, build it from source code; the build procedure is described on the How to build Arm® CPU plugin page.
The set of supported layers is defined in the Operation set specification.
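Once the plugin is built and registered, models are compiled through the standard OpenVINO™ API. A minimal sketch, assuming the plugin is discoverable by ov::Core and using model.xml as a placeholder path to an IR model:

```cpp
#include <openvino/openvino.hpp>

int main() {
    ov::Core core;
    // "model.xml" is a placeholder path to an OpenVINO IR model
    auto model = core.read_model("model.xml");
    // "CPU" addresses the Arm® CPU plugin on Arm® platforms
    auto compiled_model = core.compile_model(model, "CPU");
    auto infer_request = compiled_model.create_infer_request();
    return 0;
}
```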
Supported inference data types
The Arm® CPU plugin supports the following data types as inference precision of internal primitives:
- Floating-point data types:
- f32
- f16
- Quantized data types:
- i8
Note: i8 support is experimental.
The Hello Query Device C++ sample can be used to print out the supported data types for all detected devices.
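The same information can also be queried programmatically. A short sketch, assuming the OpenVINO™ 2.0 API (2022.1 or later), that prints the optimization capabilities reported by each available device:

```cpp
#include <iostream>
#include <openvino/openvino.hpp>

int main() {
    ov::Core core;
    for (const auto& device : core.get_available_devices()) {
        // ov::device::capabilities reflects supported precisions such as FP32, FP16, INT8
        auto capabilities = core.get_property(device, ov::device::capabilities);
        std::cout << device << ":";
        for (const auto& capability : capabilities)
            std::cout << " " << capability;
        std::cout << std::endl;
    }
    return 0;
}
```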
Supported features
Preprocessing acceleration
The Arm® CPU plugin supports the following accelerated preprocessing operations:
- Precision conversion:
- u8 -> u16, s16, s32
- u16 -> u8, u32
- s16 -> u8, s32
- f16 -> f32
- Transposition of tensors with dims < 5
- Interpolation of 4D tensors with no padding (pads_begin and pads_end equal to 0).
The Arm® CPU plugin also supports the following preprocessing operations; however, they are not accelerated:
- Precision conversions that are not mentioned above
- Color conversion:
- NV12 to RGB
- NV12 to BGR
- i420 to RGB
- i420 to BGR
See the preprocessing API guide for more details.
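As an illustration, here is a hedged sketch that combines an accelerated precision conversion (u8 input converted to f32) with a non-accelerated NV12-to-BGR color conversion; the model variable is assumed to come from ov::Core::read_model():

```cpp
#include <openvino/openvino.hpp>
#include <openvino/core/preprocess/pre_post_process.hpp>

// "model" is assumed to be a std::shared_ptr<ov::Model> from core.read_model()
void configure_preprocessing(std::shared_ptr<ov::Model>& model) {
    ov::preprocess::PrePostProcessor ppp(model);
    // The input arrives as a single-plane NV12 u8 tensor
    ppp.input().tensor()
        .set_element_type(ov::element::u8)
        .set_color_format(ov::preprocess::ColorFormat::NV12_SINGLE_PLANE);
    // Convert color to BGR (not accelerated) and precision to f32 (accelerated)
    ppp.input().preprocess()
        .convert_color(ov::preprocess::ColorFormat::BGR)
        .convert_element_type(ov::element::f32);
    model = ppp.build();
}
```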
Supported properties
The plugin supports the properties listed below.
Read-write properties
All parameters must be set before calling ov::Core::compile_model() in order to take effect, or be passed as an additional argument to ov::Core::compile_model(), as in the sketch after this list.
- ov::enable_profiling
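A minimal sketch of passing a read-write property at compilation time; model.xml is a placeholder path to an OpenVINO™ IR model:

```cpp
#include <openvino/openvino.hpp>

int main() {
    ov::Core core;
    // "model.xml" is a placeholder path to an OpenVINO IR model
    auto model = core.read_model("model.xml");
    // Pass ov::enable_profiling at compilation time so that it takes effect
    auto compiled_model = core.compile_model(model, "CPU", ov::enable_profiling(true));
    return 0;
}
```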
Read-only properties
- ov::supported_properties
- ov::available_devices
- ov::range_for_async_infer_requests
- ov::range_for_streams
- ov::device::full_name
- ov::device::capabilities
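Read-only properties are queried via ov::Core::get_property(). A small sketch, assuming the plugin is registered as the CPU device:

```cpp
#include <iostream>
#include <openvino/openvino.hpp>

int main() {
    ov::Core core;
    // Read-only properties can be queried from the device, but not set
    auto full_name = core.get_property("CPU", ov::device::full_name);
    auto streams_range = core.get_property("CPU", ov::range_for_streams);
    std::cout << "Device: " << full_name << std::endl;
    std::cout << "Streams range: " << std::get<0>(streams_range)
              << " - " << std::get<1>(streams_range) << std::endl;
    return 0;
}
```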
Known Layer Limitations
- AvgPool layer is supported via the arm_compute library for 4D input tensors and via the reference implementation in other cases.
- BatchToSpace layer supports 4D tensors only, with constant nodes: block_shape with N = 1 and C = 1, and crops_begin and crops_end with zero values.
- ConvertLike layer is supported in the same configurations as Convert.
- DepthToSpace layer supports 4D tensors only and only the BLOCKS_FIRST value of the mode attribute.
- Equal does not support broadcast for inputs.
- Gather layer supports constant scalar or 1D indices axes only. The layer is supported via the arm_compute library for non-negative indices and via the reference implementation otherwise.
- Less does not support broadcast for inputs.
- LessEqual does not support broadcast for inputs.
- LRN layer supports axes = {1} or axes = {2, 3} only.
- MaxPool-1 layer is supported via the arm_compute library for 4D input tensors and via the reference implementation in other cases.
- Mod layer is supported for f32 only.
- MVN layer is supported via the arm_compute library for 2D inputs with normalize_variance and across_channels both set to false; in other cases the layer is implemented via the runtime reference.
- Normalize layer is supported via the arm_compute library with the MAX value of eps_mode and axes = {2 | 3}; for the ADD value of eps_mode the layer uses DecomposeNormalizeL2Add; in other cases the layer is implemented via the runtime reference.
- NotEqual does not support broadcast for inputs.
- Pad layer works with pad_mode = {REFLECT | CONSTANT | SYMMETRIC} parameters only.
- Round layer is supported via the arm_compute library with the RoundMode::HALF_AWAY_FROM_ZERO value of mode; in other cases the layer is implemented via the runtime reference.
- SpaceToBatch layer supports 4D tensors only, with constant nodes: shapes, pads_begin, or pads_end with zero paddings for batch or channels, and shapes values of one for batch and channels.
- SpaceToDepth layer supports 4D tensors only and only the BLOCKS_FIRST value of the mode attribute.
- StridedSlice layer is supported via the arm_compute library for tensors with dims < 5 and zero values of ellipsis_mask, or zero values of new_axis_mask and shrink_axis_mask; in other cases the layer is implemented via the runtime reference.
- FakeQuantize layer is supported via the arm_compute library in Low Precision evaluation mode for suitable models and via the runtime reference otherwise.