Arm® CPU device
Introducing the Arm® CPU Plugin
The Arm® CPU plugin enables inference of deep neural networks on Arm® CPUs, using the Compute Library as a backend.
Note: This is a community-level add-on to OpenVINO™. Intel® welcomes community participation in the OpenVINO™ ecosystem; technical questions on community forums as well as code contributions are welcome. However, this component has not undergone full release validation or qualification from Intel®, and no official support is offered.
The Arm® CPU plugin is not a part of the Intel® Distribution of OpenVINO™ toolkit and is not distributed in pre-built form. To use the plugin, build it from source code; the build procedure is described on the How to build Arm® CPU plugin page.
The set of supported layers is defined in the Operation set specification.
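Once the plugin is built and registered, models are compiled through the standard OpenVINO™ API. A minimal sketch, assuming the plugin is discoverable by ov::Core and using model.xml as a placeholder path to an IR model:

```cpp
#include <openvino/openvino.hpp>

int main() {
    ov::Core core;
    // "model.xml" is a placeholder path to an OpenVINO IR model
    auto model = core.read_model("model.xml");
    // "CPU" addresses the Arm® CPU plugin on Arm® platforms
    auto compiled_model = core.compile_model(model, "CPU");
    auto infer_request = compiled_model.create_infer_request();
    return 0;
}
```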
Supported inference data types
The Arm® CPU plugin supports the following data types as inference precision of internal primitives:
- Floating-point data types:
- f32
- f16
- Quantized data types:
- i8
Note: i8 support is experimental.
The Hello Query Device C++ sample can be used to print out the supported data types for all detected devices.
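The same information can also be queried programmatically. A short sketch, assuming the OpenVINO™ 2.0 API (2022.1 or later), that prints the optimization capabilities reported by each available device:

```cpp
#include <iostream>
#include <openvino/openvino.hpp>

int main() {
    ov::Core core;
    for (const auto& device : core.get_available_devices()) {
        // ov::device::capabilities reflects supported precisions such as FP32, FP16, INT8
        auto capabilities = core.get_property(device, ov::device::capabilities);
        std::cout << device << ":";
        for (const auto& capability : capabilities)
            std::cout << " " << capability;
        std::cout << std::endl;
    }
    return 0;
}
```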
Supported features
Preprocessing acceleration
The Arm® CPU plugin supports the following accelerated preprocessing operations:
- Precision conversion:
- u8 -> u16, s16, s32
- u16 -> u8, u32
- s16 -> u8, s32
- f16 -> f32
- Transposition of tensors with dims < 5
- Interpolation of 4D tensors with no padding (pads_begin and pads_end equal to 0).
The Arm® CPU plugin also supports the following preprocessing operations; however, they are not accelerated:
- Precision conversions that are not mentioned above
- Color conversion:
- NV12 to RGB
- NV12 to BGR
- i420 to RGB
- i420 to BGR
See the preprocessing API guide for more details.
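As an illustration, here is a hedged sketch that combines an accelerated precision conversion (u8 input converted to f32) with a non-accelerated NV12-to-BGR color conversion; the model variable is assumed to come from ov::Core::read_model():

```cpp
#include <openvino/openvino.hpp>
#include <openvino/core/preprocess/pre_post_process.hpp>

// "model" is assumed to be a std::shared_ptr<ov::Model> from core.read_model()
void configure_preprocessing(std::shared_ptr<ov::Model>& model) {
    ov::preprocess::PrePostProcessor ppp(model);
    // The input arrives as a single-plane NV12 u8 tensor
    ppp.input().tensor()
        .set_element_type(ov::element::u8)
        .set_color_format(ov::preprocess::ColorFormat::NV12_SINGLE_PLANE);
    // Convert color to BGR (not accelerated) and precision to f32 (accelerated)
    ppp.input().preprocess()
        .convert_color(ov::preprocess::ColorFormat::BGR)
        .convert_element_type(ov::element::f32);
    model = ppp.build();
}
```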
Supported properties
The plugin supports the properties listed below.
Read-write properties
All parameters must be set before calling ov::Core::compile_model() in order to take effect, or be passed as an additional argument to ov::Core::compile_model(), as in the sketch after this list.
- ov::enable_profiling
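A minimal sketch of passing a read-write property at compilation time; model.xml is a placeholder path to an OpenVINO™ IR model:

```cpp
#include <openvino/openvino.hpp>

int main() {
    ov::Core core;
    // "model.xml" is a placeholder path to an OpenVINO IR model
    auto model = core.read_model("model.xml");
    // Pass ov::enable_profiling at compilation time so that it takes effect
    auto compiled_model = core.compile_model(model, "CPU", ov::enable_profiling(true));
    return 0;
}
```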
Read-only properties
- ov::supported_properties
- ov::available_devices
- ov::range_for_async_infer_requests
- ov::range_for_streams
- ov::device::full_name
- ov::device::capabilities
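Read-only properties are queried via ov::Core::get_property(). A small sketch, assuming the plugin is registered as the CPU device:

```cpp
#include <iostream>
#include <openvino/openvino.hpp>

int main() {
    ov::Core core;
    // Read-only properties can be queried from the device, but not set
    auto full_name = core.get_property("CPU", ov::device::full_name);
    auto streams_range = core.get_property("CPU", ov::range_for_streams);
    std::cout << "Device: " << full_name << std::endl;
    std::cout << "Streams range: " << std::get<0>(streams_range)
              << " - " << std::get<1>(streams_range) << std::endl;
    return 0;
}
```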
Known Layer Limitations
- AvgPool layer is supported via the arm_compute library for 4D input tensors and via the reference implementation in other cases.
- BatchToSpace layer supports 4D tensors only, with constant nodes: block_shape with N = 1 and C = 1, and crops_begin and crops_end with zero values.
- ConvertLike layer is supported in the same configurations as Convert.
- DepthToSpace layer supports 4D tensors only and only the BLOCKS_FIRST value of the mode attribute.
- Equal does not support broadcast for inputs.
- Gather layer supports constant scalar or 1D indices axes only. The layer is supported via the arm_compute library for non-negative indices and via the reference implementation otherwise.
- Less does not support broadcast for inputs.
- LessEqual does not support broadcast for inputs.
- LRN layer supports axes = {1} or axes = {2, 3} only.
- MaxPool-1 layer is supported via the arm_compute library for 4D input tensors and via the reference implementation in other cases.
- Mod layer is supported for f32 only.
- MVN layer is supported via the arm_compute library for 2D inputs with normalize_variance and across_channels both set to false; in other cases the layer is implemented via the runtime reference.
- Normalize layer is supported via the arm_compute library with the MAX value of eps_mode and axes = {2 | 3}; for the ADD value of eps_mode the layer uses DecomposeNormalizeL2Add; in other cases the layer is implemented via the runtime reference.
- NotEqual does not support broadcast for inputs.
- Pad layer works with pad_mode = {REFLECT | CONSTANT | SYMMETRIC} parameters only.
- Round layer is supported via the arm_compute library with the RoundMode::HALF_AWAY_FROM_ZERO value of mode; in other cases the layer is implemented via the runtime reference.
- SpaceToBatch layer supports 4D tensors only, with constant nodes: shapes, pads_begin, or pads_end with zero paddings for batch or channels, and shapes values of one for batch and channels.
- SpaceToDepth layer supports 4D tensors only and only the BLOCKS_FIRST value of the mode attribute.
- StridedSlice layer is supported via the arm_compute library for tensors with dims < 5 and zero values of ellipsis_mask, or zero values of new_axis_mask and shrink_axis_mask; in other cases the layer is implemented via the runtime reference.
- FakeQuantize layer is supported via the arm_compute library in Low Precision evaluation mode for suitable models and via the runtime reference otherwise.