* rebasing the perf-modes-2021.3 to the 2021.4
Caveats:
the (explicit) setting of #streams is not disabled (as it was before for experiments with DLBenchmark), and the logic differs slightly (streamsSet)
(cherry picked from commit 1ae1edc0ed)
* overriding streams (to force the TPUT mode to the DLBenchmark)
(cherry picked from commit 7f506cda31)
* disabling reducing #streams to fully mimic baseline c4df94d42d of the 2021.3 (before experiments)
(cherry picked from commit 85073dd1dd)
* clang/indentation
(cherry picked from commit 050a4155a9)
* splitting the Transformation to general and CPU specific.
Now, hopefully, this fully mimics the baseline c4df94d42d of the 2021.3 (before experiments), as reducing the number of streams (as well as the early exit on GRU/LSTM/TensorIterator) is disabled
(cherry picked from commit e98b2c1a67)
* disabling GRU/LSTM/TI + reducing of streams + 5D considered compute-limited only for int8
(cherry picked from commit 32b8d80dee)
* refactored to avoid compute_limited_ratio, reverted the reducing #streams, removed LSTM from limitations
(cherry picked from commit f2b972171b)
* isa-based threshold logic
(cherry picked from commit b218457e1a)
* mode->hint
(cherry picked from commit ec20aa8eca)
* optional PERFORMANCE_HINT_NUM_REQUESTS
(cherry picked from commit 5a3883e3f3)
* moving the perfHints to the common OV config class + initial tests (CPU only, as the actual AUTO/MULTI should be accommodated on the master)
(cherry picked from commit (then fixed)45bafe7d527f466507dea0693aeed51be4ebf776)
* AUTO support for PerfHints
* MULTI support for PerfHints
* Enabling Perf hints for the GPU plugin
* brushing settings output a bit
* disabling "throughput" perf hint being default (until OV 2.0)
* uncommenting the logic which was disabled to force the DLBenchmark to use the throughput mode by default
* removing dead and experimental code, and debug printfs
* clang/code-style
* code-review remarks
* Moved the output of the actual params that the hint produced to the right place
* aligning MULTI's GetConfig behavior to HETERO's as captured in the presentation (CVS-59960) ratified with the ArchForum
* clang
* benchmark_app brushing
* Update inference-engine/samples/benchmark_app/README.md
* propagating the perf hints thru one more scenario in the merged AUTO-MULTI
* fixed misprint
* Python benchmark_app update for perf hints
* addressing reviewers' comments on the python benchmark_app
* simplifying/brushing logic a bit
* refactor the heuristic to the separate file (to be shared with iGPU soon)
* refactor conversion of modes to the specific GPU config per feedback from Vladimir
* add missed __init__.py files
* Update __init__.py
empty line
* Merge inference-engine/tools/benchmark_tool with tools/benchmark_tool
* Update MD links
* remove benchmark_tool from package_BOM.txt
* add tools folder to the list of Doxygen files
* fix relative paths
* Update index.md
remove extra line
* Add input image scale flag in benchmark app.
- the user sets the input image scale with -iscale.
The input is divided by the scale.
Signed-off-by: hyunback <hyunback.kim@intel.com>
* Apply image scale, mean parameter in benchmark APP
Mean and scale values per channel
Signed-off-by: hyunback <hyunback.kim@intel.com>
* Fix clang-format
Signed-off-by: hyunback <hyunback.kim@intel.com>
* fix clang-format issue2.
Signed-off-by: hyunback <hyunback.kim@intel.com>
* Update benchmark tool to align the format of mean and scale values with MO arguments.
Signed-off-by: hyunback <hyunback.kim@intel.com>
* Remove debug print.
Signed-off-by: hyunback <hyunback.kim@intel.com>
* Added pugixml as submodule
* CVS-34900: updated pugixml to v1.11.4
* Fixed link with pugixml
* Use pugixml::static
* Try to fix bug
* Removed GITHUB_PULL_REQUEST
* Replaced OpenVINO_MAIN_SOURCE_DIR -> OpenVINO_SOURCE_DIR
* Removed some usages of IE_MAIN_SOURCE_DIR
* Use ngraph target directly
* enable make clean to remove ie_wheel artifacts
* ./setup.py:132:1: E302 expected 2 blank lines
* fix CI build issue
* Removed not-needed components from ie_wheel
* Use explicit python3 version in ngraph python
* Use python3 everywhere
* Reuse python3 more
* Added function to build with Py_LIMITED_API
* Sync 2 cmake python modules
* Fix for tools
* Fixed typo
* Enable python by default
* Enable python build iff python-dev is found
* More migration to Python3_VERSION
* Install wheel requirements
* Fixed ngraph Python separate build
* Fixed cython compilation
* Revert to old packages
* Added suffix
* Specify python version explicitly
* Don't depend on python interp to build python itself
* More improvements
* Revert offline transformations back to ie_wheel
* Refactoring
* Trying to build wheel independently on C++ runtime
* Build wheel only with main OpenVINO
* Fixed typo in test_utils cmake lists
* Adding link stage
* small fix
* git diff
* Try to fix python tests
Co-authored-by: Sergey Lyubimtsev <sergey.lyubimtsev@intel.com>
* Python API for LoadNetwork by model file name
* BenchmarkApp: Add caching and LoadNetworkFromFile support
2 new options are introduced
- cache_dir <dir> - enables models caching
- load_from_file - perform the new "LoadNetwork" by model file name
Using both parameters will achieve maximum performance of read/load network on startup
Tests:
1) Run "benchmark_app -h". Help will display 2 new options. After the available devices, there will be a list of devices with cache support
2) ./benchmark_app -d CPU -i <model.xml> -load_from_file
Verify that some test steps are skipped (related to ReadNetwork, re-shaping etc)
3) Pre-requisite: support of caching shall be enabled for Template plugin
./benchmark_app -d TEMPLATE -i <model.onnx> -load_from_file -cache_dir someDir
Verify that "someDir" is created and generated blob is available
Run again, verify that loading works as well (should be faster as it will not load onnx model)
4) Run same test as (3), but without -load_from_file option. Verify that cache is properly created
For some devices loadNetwork time shall be improved when cache is available
* Removed additional timing prints
* Correction from old code
* Revert "Removed additional timing prints"
Additional change - when .blob is chosen instead of .xml, it takes priority over caching flags
* Removed new time printings
As discussed, these time measurements like 'total first inference time' will be available in 'timeTests' scripts
* Fix clang-format issues
* align pypi deps of benchmark, cross check tool, python API
* move cython from python API requirements to requirements-dev
* change requirements to >= for most packages
* update requirements
* set pinned numpy major version in wheel requirements
* set more strict pip requirements-dev in wheel
* change scikit-image version to 0.17
* Fix python_tools benchmark installation location
Before this fix, when running "make install", the benchmark python files
would be installed under <python_dest_dir>/openvino/tools, instead of
<python_dest_dir>/openvino/tools/benchmark. This commit fixes this.
* Alternative implementation
* change the deprecated method to the recent
* first ver of the hybrid cores aware CPU streams (+debug info)
* more debug and fixed sum threads
* disabled NUMA pinning to experiment with affinity via OS
* further brushing of stream to core type logic
* hybrid CPU-aware getNumberOfCPUCores
* adding check on the efficiency
* experimental TBB package (that cmake should pull from the internal server)
* iterating over core types in the reversed order (so the big cores are populated first in case user specified less than all #threads)
* adding back the NUMA affinity code-path for the full validation (incl 2 sockets Windows Server)
* cpplint fix and tabbing the #if clauses for readability
* pre-production TBB from internal server
* wrapping over #cores/types
* wrapping over #cores/types, ver 2
* wrapping over #streams instead
* disabling warnings as errors for a while (to unlock testing)
* accommodating new TBB layout for dependencies.bat
* next tbb ver (with debug binaries that probably can unlock the commodity builds, without playing product_configs)
* minor brushing for experiments (so that pinning can be disabled)
* minor brushing from code review
* Updating the SHA hash which appeared when rebasing to the master
* WIP refactoring
* Completed refactoring of the "config" phase of the cpu stream executor and on-the-fly streams to core types mapping
* making the benchmark_app aware about new pinning mode
* Brushing a bit (in preparation for the "soft" affinity)
* map to vector to simplify the things
* updated executors comparison
* more fine-grained pinning scheme for the HYBRID (required to allow all cores on 2+8 1+4, and other LITTLE-skewed scenarios)
TODO: separate little to big ratio for the fp32 and int8 (and pass the fp32Only flag to the MakeDefaultMultiThreaded)
* separating fp32 and int8 intensive cases for hybrid execution, also leveraging the HT if the #big_cores is small, refactored. also switched to the 2021.2 oneTBB RC package
* code style
* stripped tbb archives from unused folders and files, also had to rename the LICENSE.txt to the LICENSE to match existing OV packaging tools
* assigning nodeId regardless of pinning mode
* tests OpenCV builds with same 2021.2 oneTBB, ubuntu 18/20
* cmake install paths for oneTBB, also a ie_parallel.cmake warning on older ver of TBB
* Updated latency case desc to cover multi-socket machines
* adding centos8 OCV with oneTBB build
updating TBB drops with hwloc shared libs added.
* enabled internal OCV from THIRD_PARTY_SERVER to test thru CI..
Added Centos7 notbb OCV build (until g-api get ready for onetbb) to unlock the Centos7 CI build
* separate rpath log to respect one-tbb specific paths
* fixed SEQ code-path
* fixed doc misprint
* allowing all cores in 2+8 for int8 as well
* cleaned from debug printfs
* HYBRID_AWARE pinning option for the Python benchmark_app
* OpenVINO Hybrid CPUs support
* Remove custom::task_arena abstraction layout
* Get back to the custom::task_arena interface
* Add windows.h inclusion
* Fix typo in macro name
* Separate TBB and TBBbind packages
* Fix compile-time conditions
* Fix preprocessors conditions
* Fix typo
* Fix linking
* make linking private
* Fix typo
* Fix target_compile_definitions syntax
* Implement CMake install logic, update sha hash for the tbbbind_2_4 package
* Add tbbbind_2_4 required paths to setup_vars
* Update CI paths
* Include ie_parallel.hpp to ie_system_conf.cpp
* Try to update dependencies scripts
* Try to fix dependencies.bat
* Modify dependencies script
* Use static tbbbind_2_4 library
* Remove redundant paths from CI
* Revert "cleaned from debug printfs"
This reverts commit 82c9bd90c5.
# Conflicts:
# inference-engine/src/inference_engine/os/win/win_system_conf.cpp
# inference-engine/src/inference_engine/threading/ie_cpu_streams_executor.cpp
# inference-engine/src/mkldnn_plugin/config.cpp
* Update tbbbind package version
* fixed compilation
* removing the direct tbb::info calls from CPU plugin, to aggregate everything in the single module (that exposes the higher level APIs)
* Update tbbbind package version
(cherry picked from commit f66b8f6aa6)
* compilation fix
* brushing the headers a bit
* Make custom::task_arena inherited from tbb::task_arena
* change to the latest TBB API, and more debug printfs
* code-style
* ARM compilation
* aligned "failed system config" between OV and TBB (by using '-1')
* macos compilation fix
* default arena creation (to make sure all code-path have that fallback)
* Encapsulate all TBB versions related logic inside the custom namespace
* Move custom layer header to internal scope + minor improvements
* with all NUMA/Hybrid checks now consolidated in the custom_arena, cleaning the ugly ifdefs that we had
* Introduce new ThreadBindingType + fix compilation
* fixing OMP compilation
* Fix compilation
* Use public tbbbind_2_4 package
* fixed macos build, corrected comments/desc
* reverted to the default binding selection logic (to preserve the legacy behavior)
* Apply review comments
* Fix compilation without tbbbind_2_4
* Fix compilation with different TBB versions
* code review remarks
* fix for the NONE pinning code-path under HYBRID_AWARE
* whitespace and cleaning the debug printfs (per review)
* code-review comments
* fixed code-style
Co-authored-by: Kochin, Ivan <ivan.kochin@intel.com>
Co-authored-by: Kochin Ivan <kochin.ivan@intel.com>
* enable make install for openvino/tools folder
* fix component name
* use python_tools as component name
* update ie_cpack_add_component name
* enable CPack for python tools
* use find_package(PythonInterp)
* replace .format with f-string in cross_check_tool
* Replace f-string with .format in utils.py
* replace f-string in benchmark tool
* Replace .format with f-string in benchmark tool
* Add f-string after update
* Fix some lines
* Fix utils
* Update benchmark_app to pass precision via command line
* Update vpu_perfcheck
* Update python benchmark_app to support setting precision from cmd
* Review comments
* Address more review comments
* Fixes after rebase
* update license
* remove the first blank lines and undo changes in inference-engine/thirdparty/mkl-dnn
* test commit
* Update license in tools
* Undo changes in api_overview.md
* update ie_api.pyx and set interpreter in hello_query_device
* Adding MYRIAD_THROUGHPUT_STREAMS to the list of plugin's supported config vals (omitted incorrectly) and enabling streams for the myriad devices in the benchmark_app
* docs update and python benchmark_app