openvino/docs/IE_DG/supported_plugins/CPU.md
Andrey Zaytsev 940eb43095 Feature/azaytsev/merge to master (#2786)
2020-10-27 00:41:46 +03:00


CPU Plugin

Introducing CPU Plugin

The CPU plugin provides high-performance scoring of neural networks on CPU, using the Intel® Math Kernel Library for Deep Neural Networks (Intel® MKL-DNN).

Currently, the CPU plugin uses Intel® Threading Building Blocks (Intel® TBB) to parallelize calculations. Refer to the Optimization Guide for associated performance considerations.

The set of supported layers can be expanded with the Extensibility mechanism.

Supported Platforms

OpenVINO™ toolkit is officially supported and validated on the following platforms:

| Host        | OS (64-bit)                                |
| :---        | :---                                       |
| Development | Ubuntu* 18.04, CentOS* 7.5, MS Windows* 10 |
| Target      | Ubuntu* 18.04, CentOS* 7.5, MS Windows* 10 |

The CPU plugin supports inference on Intel® Xeon® processors with Intel® Advanced Vector Extensions 2 (Intel® AVX2), Intel® Advanced Vector Extensions 512 (Intel® AVX-512), and AVX512_BF16; Intel® Core™ processors with Intel® AVX2; and Intel Atom® processors with Intel® Streaming SIMD Extensions (Intel® SSE).

You can use the -pc flag with the samples to find out which configuration is used by a layer. This flag shows execution statistics: layer name, execution status, layer type, execution time, and the type of the execution primitive.
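The same statistics can be post-processed in code. The sketch below is a minimal illustration, assuming the counters arrive as a dictionary that maps each layer name to a record with `status`, `layer_type`, `exec_type`, and `real_time` fields (the shape returned by `get_perf_counts()` in the Inference Engine Python API, as far as this example assumes). The helper name `slowest_layers` is hypothetical.

```python
def slowest_layers(perf_counts, top=3):
    """Return the `top` executed layers with the largest execution time.

    Assumption: `perf_counts` maps layer name -> dict with the keys
    "status", "layer_type", "exec_type", and "real_time".
    """
    # Keep only layers that actually ran; fused or removed layers
    # are assumed to be reported with a different status.
    executed = [
        (name, info)
        for name, info in perf_counts.items()
        if info["status"] == "EXECUTED"
    ]
    # Sort by execution time, slowest first.
    executed.sort(key=lambda item: item[1]["real_time"], reverse=True)
    return [
        (name, info["layer_type"], info["exec_type"], info["real_time"])
        for name, info in executed[:top]
    ]
```

Layers that the plugin fused or removed would not appear in the result, which makes the output a quick way to see which primitives dominate inference time.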

Internal CPU Plugin Optimizations

The CPU plugin supports several graph optimization algorithms, such as fusing or removing layers. Refer to the sections below for details.

Note: For layer descriptions, see the IR Notation Reference.

Lowering Inference Precision

The CPU plugin follows the default optimization approach: inference runs in lower precision when the platform can reach better performance that way while staying within an acceptable range of accuracy.

Note: For details, see Using Bfloat16 Inference.
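To give a feel for what lower precision means numerically, the sketch below round-trips a value through bfloat16, which keeps the sign bit, the 8 exponent bits, and the top 7 mantissa bits of a 32-bit float. It assumes plain truncation for simplicity; real hardware typically rounds instead, so this is an illustration and not the plugin's conversion routine. The function name `to_bfloat16` is hypothetical.

```python
import struct


def to_bfloat16(value: float) -> float:
    """Round-trip a value through bfloat16 by truncation.

    Assumption: truncation is used for simplicity; hardware
    conversions usually round to nearest instead.
    """
    # Reinterpret the float32 bit pattern as a 32-bit unsigned integer.
    bits = struct.unpack("<I", struct.pack("<f", value))[0]
    # Keep the upper 16 bits: sign, 8 exponent bits, 7 mantissa bits.
    bits &= 0xFFFF0000
    # Reinterpret the truncated pattern as a float again.
    return struct.unpack("<f", struct.pack("<I", bits))[0]


print(to_bfloat16(3.14159))  # 3.140625
```

The dynamic range matches float32, but only two to three significant decimal digits survive, which is why accuracy should be checked when lower precision is in use.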

Fusing Convolution and Simple Layers

The CPU plugin merges a Convolution layer with any of the simple layers listed below:

  • Activation: ReLU, ELU, Sigmoid, Clamp
  • Depthwise: ScaleShift, PReLU
  • FakeQuantize

Note: You can have any number and order of simple layers.

A combination of a Convolution layer and simple layers results in a single fused layer called Convolution:
[Image: conv_simple_01]
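The effect of this fusion on a topology can be sketched on a simplified, linear chain of layer types. The code below is an illustration only, not the plugin's graph optimizer: it assumes layers are given as a flat list of type names, and the names `SIMPLE_LAYERS` and `fuse_conv_simple` are hypothetical.

```python
# Simple layer types taken from the list above.
SIMPLE_LAYERS = {
    "ReLU", "ELU", "Sigmoid", "Clamp",  # Activation
    "ScaleShift", "PReLU",              # Depthwise
    "FakeQuantize",
}


def fuse_conv_simple(layers):
    """Collapse each Convolution and the simple layers directly after it.

    Assumption: `layers` is a linear chain of layer type names.
    """
    fused = []
    absorbing = False  # True while the last kept layer is a Convolution
    for layer in layers:
        if absorbing and layer in SIMPLE_LAYERS:
            continue  # absorbed into the preceding Convolution
        fused.append(layer)
        absorbing = layer == "Convolution"
    return fused


print(fuse_conv_simple(["Convolution", "ReLU", "FakeQuantize", "Pooling", "ReLU"]))
# ['Convolution', 'Pooling', 'ReLU']
```

Any number of simple layers in any order is absorbed, while a simple layer that follows a different layer type (the last ReLU here) is left in place.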

Fusing Pooling and FakeQuantize Layers

A combination of Pooling and FakeQuantize layers results in a single fused layer called Pooling:
[Image: pooling_fakequant_01]

Fusing FullyConnected and Activation Layers

A combination of FullyConnected and Activation layers results in a single fused layer called FullyConnected:
[Image: fullyconnected_activation_01]

Fusing Convolution and Depthwise Convolution Layers Grouped with Simple Layers

Note: This pattern is possible only on CPUs that support the Streaming SIMD Extensions 4.2 (SSE 4.2) and Intel AVX2 Instruction Set Architecture (ISA).

A Convolution (or Binary Convolution) layer with its simple layers, followed by a Depthwise Convolution layer with its simple layers, is fused into a single layer called Convolution (or Binary Convolution):

Note: Depthwise convolution layers should have the same values for the group, input channels, and output channels parameters.

[Image: conv_depth_01]

Fusing Convolution and Sum Layers

A combination of Convolution, Simple, and Eltwise layers with the sum operation results in a single layer called Convolution:
[Image: conv_sum_relu_01]

Fusing a Group of Convolutions

If a topology contains the following pipeline, the CPU plugin merges the Split, Convolution, and Concatenation layers into a single Convolution layer with the group parameter:

Note: Parameters of the Convolution layers must coincide.

[Image: group_convolutions_01]
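Why this merge preserves the result can be checked on a toy case. The sketch below uses 1x1 convolutions on a single pixel (so each convolution is a matrix-vector product, biases omitted) and compares the Split, Convolution, Concatenation pipeline with one grouped convolution whose weights are the per-branch weights stacked together. All function names are hypothetical and the code is an illustration, not the plugin's implementation.

```python
def conv1x1(x, weights):
    """1x1 convolution on one pixel: x has C_in values, weights is C_out x C_in."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def split_conv_concat(x, branch_weights):
    """Split channels evenly, convolve each part, concatenate the outputs."""
    size = len(x) // len(branch_weights)
    out = []
    for g, weights in enumerate(branch_weights):
        out += conv1x1(x[g * size:(g + 1) * size], weights)
    return out


def grouped_conv1x1(x, weights, group):
    """Single 1x1 convolution with a group parameter.

    weights has C_out rows, each with C_in / group values.
    """
    c_in_per_group = len(x) // group
    c_out_per_group = len(weights) // group
    out = []
    for o, row in enumerate(weights):
        g = o // c_out_per_group  # group this output channel belongs to
        part = x[g * c_in_per_group:(g + 1) * c_in_per_group]
        out.append(sum(w * v for w, v in zip(row, part)))
    return out


x = [1, 2, 3, 4]
w_a = [[1, 0], [0, 1]]
w_b = [[2, 2], [1, -1]]
print(split_conv_concat(x, [w_a, w_b]))          # [1, 2, 14, -1]
print(grouped_conv1x1(x, w_a + w_b, group=2))    # [1, 2, 14, -1]
```

Both paths produce the same output, which is the property the fusion relies on; it only holds when the branch convolutions share the same parameters, as the note above requires.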

Removing a Power Layer

The CPU plugin removes a Power layer from a topology if it has the following parameters:

  • power = 1
  • scale = 1
  • offset = 0
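With these parameter values the layer is an identity, which is why it can be dropped. The sketch below assumes the usual definition of a Power layer, output = (scale * input + offset) ^ power; the function names are hypothetical and the check is a simplified illustration of the condition.

```python
def power_layer(x, power=1.0, scale=1.0, offset=0.0):
    """Power layer under the assumed definition (scale * x + offset) ** power."""
    return (scale * x + offset) ** power


def is_removable(power, scale, offset):
    """True when the layer leaves its input unchanged."""
    return power == 1 and scale == 1 and offset == 0


print(power_layer(-2.5))                    # -2.5
print(is_removable(1, 1, 0))                # True
print(is_removable(2, 1, 0))                # False
```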

Supported Configuration Parameters

The plugin supports the configuration parameters listed below. All parameters must be set with the InferenceEngine::Core::LoadNetwork() method. When specifying key values as raw strings (that is, when using Python API), omit the KEY_ prefix. Refer to the OpenVINO samples for usage examples: Benchmark App.

These are general options, also supported by other plugins:

| Parameter name | Parameter values | Default | Description |
| :--- | :--- | :--- | :--- |
| KEY_EXCLUSIVE_ASYNC_REQUESTS | YES/NO | NO | Forces async requests (also from different executable networks) to execute serially. This prevents potential oversubscription. |
| KEY_PERF_COUNT | YES/NO | NO | Enables gathering performance counters. |

CPU-specific settings:

| Parameter name | Parameter values | Default | Description |
| :--- | :--- | :--- | :--- |
| KEY_CPU_THREADS_NUM | positive integer values | 0 | Specifies the number of threads that the CPU plugin should use for inference. Zero (default) means using all (logical) cores. |
| KEY_CPU_BIND_THREAD | YES/NUMA/NO | YES | Binds inference threads to CPU cores. The 'YES' (default) binding option maps threads to cores; this works best for static/synthetic scenarios like benchmarks. The 'NUMA' binding is more relaxed: it binds inference threads only to NUMA nodes and leaves further scheduling to specific cores to the OS. This option might perform better in real-life/contended scenarios. Note that for latency-oriented cases (single execution stream, see below), both the YES and NUMA options limit the number of inference threads to the number of hardware cores (ignoring hyper-threading) on multi-socket machines. |
| KEY_CPU_THROUGHPUT_STREAMS | KEY_CPU_THROUGHPUT_NUMA, KEY_CPU_THROUGHPUT_AUTO, or positive integer values | 1 | Specifies the number of CPU "execution" streams for the throughput mode. This is the upper bound for the number of inference requests that can be executed simultaneously. All available CPU cores are evenly distributed between the streams. The default value is 1, which implies latency-oriented behavior with all available cores processing requests one by one.<br>KEY_CPU_THROUGHPUT_NUMA creates as many streams as needed to accommodate NUMA and avoid associated penalties.<br>KEY_CPU_THROUGHPUT_AUTO creates the bare minimum of streams to improve performance; this is the most portable option if you don't know how many cores your target machine has (and what the optimal number of streams would be). Note that your application should provide enough parallel slack (for example, run many inference requests) to leverage the throughput mode.<br>A non-negative integer value creates the requested number of streams. If the number of streams is 0, no internal streams are created and user threads are interpreted as stream master threads. |
| KEY_ENFORCE_BF16 | YES/NO | YES | Executes in bfloat16 precision whenever possible. This option lets the plugin lower the precision where it sees a performance benefit from bfloat16 execution. It does not guarantee the accuracy of the network; verify accuracy in this mode separately and decide, based on the performance and accuracy results, whether to use it. |

Note: To disable all internal threading, use the following set of configuration parameters: KEY_CPU_THROUGHPUT_STREAMS=0, KEY_CPU_THREADS_NUM=1, KEY_CPU_BIND_THREAD=NO.
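As a sketch of how that set of parameters might look when keys are given as raw strings (the Python API case, where the KEY_ prefix is omitted), the code below builds the configuration dictionary. The helper `to_raw_key` and the variable name are hypothetical; the commented-out `load_network` call assumes an OpenVINO installation and an already loaded network `net`, neither of which is part of this example.

```python
def to_raw_key(name: str) -> str:
    """Drop the KEY_ prefix, as required when keys are passed as raw strings."""
    prefix = "KEY_"
    return name[len(prefix):] if name.startswith(prefix) else name


# Configuration that disables all internal threading of the CPU plugin.
no_internal_threading = {
    to_raw_key("KEY_CPU_THROUGHPUT_STREAMS"): "0",
    to_raw_key("KEY_CPU_THREADS_NUM"): "1",
    to_raw_key("KEY_CPU_BIND_THREAD"): "NO",
}

print(no_internal_threading)

# Assumption: with OpenVINO installed and a network `net` already read,
# the dictionary would be passed when loading the network, for example:
#   from openvino.inference_engine import IECore
#   exec_net = IECore().load_network(network=net, device_name="CPU",
#                                    config=no_internal_threading)
```

Values are passed as strings, matching the way the parameter tables above list them.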

See Also