* Remove the fp16 Convert layer test from skip_tests_config.cpp as it works now
* update repo
* add op reference test for region_yolo
* add type_prop test and remove backend test of region_yolo
* apply type conversion for the file-loading test and add a bf16 test case to skip_test_config
* change location of compile definition under target and use path_join from file_util
* add dependency of test_model_zoo
* apply ov::Model
* remove unnecessary
* changed compile definition of TEST_FILES
* skip test cases of external test file
* remove test cases of importing data file
* add warning about the order when both mean and scale are set
* Update tools/mo/openvino/tools/mo/main.py
Co-authored-by: Anastasia Popova <anastasia.popova@intel.com>
* removed warning, added phrase in documentation
* fixed merge
* added phrase about the order of mean and scale in MO help
* duplicate MO help phrase in doc
* Update docs/MO_DG/prepare_model/convert_model/Converting_Model.md
Co-authored-by: Anastasiya Ageeva <anastasiya.ageeva@intel.com>
* Update docs/MO_DG/prepare_model/convert_model/Converting_Model.md
Co-authored-by: Anastasiya Ageeva <anastasiya.ageeva@intel.com>
* Update docs/MO_DG/prepare_model/convert_model/Converting_Model.md
Co-authored-by: Anastasiya Ageeva <anastasiya.ageeva@intel.com>
* Update tools/mo/openvino/tools/mo/utils/cli_parser.py
Co-authored-by: Anastasiya Ageeva <anastasiya.ageeva@intel.com>
* Update tools/mo/openvino/tools/mo/utils/cli_parser.py
Co-authored-by: Anastasiya Ageeva <anastasiya.ageeva@intel.com>
* remove tabs
* fix the order of reverse, mean, and scale (see the sketch below)
Co-authored-by: Anastasia Popova <anastasia.popova@intel.com>
Co-authored-by: Anastasiya Ageeva <anastasiya.ageeva@intel.com>
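For clarity, a hedged illustration (not actual Model Optimizer code; the helper name and values are illustrative) of the order described in the commits above: channels are reversed first, then the mean is subtracted, then the result is divided by the scale.

```cpp
#include <algorithm>
#include <array>

// Illustrative only: mirrors the documented order reverse -> mean -> scale.
std::array<float, 3> preprocess_pixel(std::array<float, 3> pixel,
                                      const std::array<float, 3>& mean,
                                      const std::array<float, 3>& scale) {
    std::reverse(pixel.begin(), pixel.end());        // reverse_input_channels (e.g. BGR -> RGB)
    for (int c = 0; c < 3; ++c)
        pixel[c] = (pixel[c] - mean[c]) / scale[c];  // mean is applied before scale
    return pixel;
}
```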
* Move 'NV12toRGB/BGR' reference evaluates to template plugin
CPU doesn't need this fallback, so the implementation can be moved to reduce the core binary size
* Moved evaluate_nv12 to 'runtime::reference'
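For context, a hedged sketch (ov 2.0 preprocessing API; not the relocated reference code itself) of the color-conversion path whose evaluates are moved above:

```cpp
#include <openvino/core/preprocess/pre_post_process.hpp>

// Request NV12 (two planes) -> RGB conversion as a preprocessing step;
// plugins without a native kernel fall back to the reference implementation.
void request_nv12_to_rgb(ov::preprocess::PrePostProcessor& ppp) {
    ppp.input().tensor().set_color_format(ov::preprocess::ColorFormat::NV12_TWO_PLANES);
    ppp.input().preprocess().convert_color(ov::preprocess::ColorFormat::RGB);
}
```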
* Fix arm build
* Calculate model layout based on 'tensor' layout and convert steps
Previously, the 'model layout' was set to '...' by default,
so no shape conversion happened when the tensor layout was set to 'NHWC' and there was an explicit convert_layout "NCHW".
Now the "model layout" is calculated from the tensor layout and the conversion steps:
Examples:
1) Tensor: NHWC, Convert: NCHW. Result: NCHW
2) Tensor: NHWC, Convert: 0312. Result: NCHW
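A hedged sketch (ov 2.0 preprocessing API; the function name is illustrative) of example 1 above, where the model layout is now deduced as NCHW rather than defaulting to '...':

```cpp
#include <memory>
#include <openvino/core/preprocess/pre_post_process.hpp>

std::shared_ptr<ov::Model> add_preprocessing(std::shared_ptr<ov::Model> model) {
    ov::preprocess::PrePostProcessor ppp(model);
    ppp.input().tensor().set_layout("NHWC");          // layout of the user-provided tensor
    ppp.input().preprocess().convert_layout("NCHW");  // conversion step; model layout becomes NCHW
    return ppp.build();
}
```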
* Fix for set_shape + resize case
* [LPT] Documentation
* 1) ToC was removed 2) SVG => PNG temporary conversion
* [LPT] Refactoring + developer guide
* [LPT] attribute doxygen documentation was added
* [LPT] Developer Guide to Reference API links were added
* [LPT] comments fixes
* [LPT] Reference API to Developer Guide links were added
* [LPT] titles were changed
* [LPT] comment fixes #2
* [LPT] root document was moved to Plugin DG
* [LPT] Documentation: image link quick fix
* [LPT] Documentation: PrecisionsAttribute description quick fix
* fix comments from Karol
* fixes
* movement
* directive was added
* movement #2
* LPT reference in Executable Network rollback
* snippets were updated in accordance with the new API
* cli_parser.py fix to accept scalar value for freezing
* update cli help
* fixed unit-tests, clarified help for specifying data type
* typos correction
* auto-batching POC squashed (all commits from auto-batch-2021.3 branch)
(cherry picked from commit d7742f2c747bc514a126cc9a4d5b99f0ff5cbbc7)
* applying/accommodating the API changes after rebase to the master
* replaying modified version of actual batch selection
* early experiments with model memory footprint
* changes from rebasing to the latest master
* experimenting with DG1 on the batch size selection, also collecting the memory footprint
* WIP: moving the auto-batching to the ICore to let MULTI/AUTO support that, ALLOW_AUTO_BATCHING as a conventional config key; hot device swap still fails
* quick-n-dirty batch footprint vs device total memory
* code style
* testing which models perform badly due to kernels and NOT (batched) footprint
* stub pipeline task to communicate the readiness rather than promise/future
* quick-n-dirty timeout impl
* explicit _completionTasks, reverting BA to use the timeout
* input/output copies; works with AUTO and the demo now
* accommodate the config per device-id, after rebase to the latest master
* allowing auto-batching only with the throughput (tput) hint to let more conventional tests pass
* fix the premature timeout restarting by waiting for batch-1 requests' completion
* moved the batched request starting (along with input copies) to a dedicated thread
* [IE CLDNN] Disable bs_fs_yx_bsv16_fsv16 format for int8 convolution
* code style
* increasing the timeout to test the ssd_* models perf (timeout?) issues
* reducing the amount of output in BA to avoid bloating the logs in experiments
* more aggressive batching for experiments: not limited to 32, and also 4 as a minimum
* more accurate timeout debugging info
* getting the reqs limitation from the plugin SetConfig as well
* refactor the reshape logic a bit to accommodate the CPU for batching; also added remote context
* let benchmark_app consume specific batch values for auto-batching, such as BATCH:GPU(4)
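A hedged usage sketch (legacy IE API; the model path is illustrative) of the explicit device name mentioned above; the same string can be passed to benchmark_app via -d:

```cpp
#include <ie_core.hpp>

// Load a network on the auto-batching meta-device with a fixed batch size of 4 over GPU.
InferenceEngine::ExecutableNetwork load_batched(InferenceEngine::Core& core) {
    auto network = core.ReadNetwork("model.xml");      // illustrative path
    return core.LoadNetwork(network, "BATCH:GPU(4)");  // explicit DEVICE(batch) syntax
}
```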
* auto-batching functional test (with results check vs ref) and GPU instance for that
* fixed arithmetic on blob pointers
* clang
* handling possible batched network failure
* BATCH as the constant device name in test
* ENABLE_BATCH
* func tests for CPU, also DetectionOutput hetero tests (CPU and GPU)
* DetectionOutput hetero test for the CPU
* re-enabling the Auto-Batching in AUTO
* auto-batching device enabled in the test
* fixed the DO test
* improve the loading loop logic
* brushed the config keys
* allow hetero code-path for explicit device name like BATCH:GPU(4), used in the hetero code-path tests
* fix the test after refactoring
* clang
* moving ThreadSafeQueue to ie_parallel, as it is now re-used in AUTO/MULTI and BATCH
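For reference, a minimal sketch of a mutex-guarded queue of the kind being shared here; the actual ThreadSafeQueue in ie_parallel may differ (the class and member names are illustrative):

```cpp
#include <condition_variable>
#include <mutex>
#include <queue>

template <typename T>
class ThreadSafeQueueSketch {
public:
    void push(T value) {
        {
            std::lock_guard<std::mutex> lock(_mutex);
            _queue.push(std::move(value));
        }
        _cv.notify_one();  // wake one waiting consumer
    }
    // Blocks until an element is available, then pops it.
    T pop() {
        std::unique_lock<std::mutex> lock(_mutex);
        _cv.wait(lock, [this] { return !_queue.empty(); });
        T value = std::move(_queue.front());
        _queue.pop();
        return value;
    }

private:
    std::mutex _mutex;
    std::condition_variable _cv;
    std::queue<T> _queue;
};
```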
* auto-batching hetero test (subgraph with DetectionOutput)
* fixed minor changes that were a result of experiments with the impl
* code-style
* brushing, disabling CPU's HETERO tests until planned activity for 22.2
* removing the home-baked MAX_BATCH_SIZE and switching to the official impl by the GPU team
* remote blobs tests for the auto-batching (old API)
* brushed names a bit
* CreateContext and LoadNetwork with context for the Auto-Batching, plus remote-blobs tests
* fixed the ieUnitTests by adding a CreateContext stub to the MockICore
* clang
* improved remote-blobs tests
* revert the BA back from experiments with AB + device_use_mem
* conformance tests for BATCH; also, batch size 1 is the default for BATCH:DEVICE
* remote blobs 2.0 tests, issue with context having the orig device name
* debugging the DG1 perf drop (presumably due to not fitting into the device memory)
* disabling the WA with batch/=2 for excessive memory footprint, leaving only streams=2
* remote blobs 2.0 tests for different tensor sharing types
* converting assert to throw to accommodate the legacy API, where lock() could be called
* reverted the timeout back to avoid mixing the studies; fixed the footprint calc
* reverting to estimating the max batch by extrapolating from the batch-1 size
* more conservative footprint estimation (with batch 1), graceful batch-1 handling without duplication
* even more graceful batch-1 handling without duplication
* WA for the MAX_BATCH_SIZE failure, removing batch 4 as a minimum for auto-batching
* AutoBatchPlugin -> ov_auto_batch_plugin
* WA for gcc 4.8
* clang
* fix misprint
* fixed errors resulting from OV's recent Variant-to-Any transition
* skip auto-batching for already-batched networks
* AUTO_BATCH_TIMEOUT and tests
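A hedged usage sketch (legacy IE API; the key name follows the commit text, and its millisecond semantics are an assumption) showing how the timeout could be passed when loading on the BATCH device:

```cpp
#include <map>
#include <string>
#include <ie_core.hpp>

// AUTO_BATCH_TIMEOUT bounds how long the BATCH device waits (assumed: in ms)
// to collect a full batch before running the requests it already has.
InferenceEngine::ExecutableNetwork load_with_timeout(InferenceEngine::Core& core,
                                                     const InferenceEngine::CNNNetwork& network) {
    return core.LoadNetwork(network, "BATCH:GPU",
                            {{"AUTO_BATCH_TIMEOUT", "100"}});  // assumed: milliseconds
}
```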
* GPU-specific L3
* switched to pure config, also improved ALLOW_AUTO_BATCHING config key handling logic
* debugging device info
* enabling the config tests for the GPU and fixing the Auto-batching tests to pass
* making the default cache size (used when the driver is not recognized) more aggressive, to accommodate recent HW with old drivers
* skip auto-batching for RNNs and the like (e.g. a single CHW input)
* fixed the fallback to batch 1 and moved the HETERO path under a condition to avoid bloating
* brushing
* Auto plugin GetMetric support gpu auto-batch
Signed-off-by: Hu, Yuan2 <yuan2.hu@intel.com>
* add test case
Signed-off-by: Hu, Yuan2 <yuan2.hu@intel.com>
* add comments on test
Signed-off-by: Hu, Yuan2 <yuan2.hu@intel.com>
* brushing the var names, also adding the exception handling
* disabling auto-batching for networks with non-batched outputs, and for faster-rcnn and the like (CVS-74085), to minimize the number of failures
* add try catch
Signed-off-by: Hu, Yuan2 <yuan2.hu@intel.com>
* brushing the code changed in the GPU plugin
* Auto-Batch requests tests
* brushed variables a bit (ref)
* cleaned debug output from the ie_core
* cleaned cmake for the Auto-Batch
* removed batchN estimation from batch1
* cleaned from debug printf
* comments, cleanup
* WA for the mock test errors introduced by merging https://github.com/myshevts/openvino/pull/13
* Adding back the removed batch-N estimation from batch 1 to debug degradations on DG1 (resulting from a too optimistic MAX_BATCH_SIZE?). This partially reverts commit e8f1738ac1.
* brushing ie_core.cpp
* fix 32bit compilation
* Code review: ENABLE_AUTO_BATCH
* consolidate the auto-batching logic in ie_core.cpp into a single ApplyAutoBatching
* renamed/brushed the OPTIMAL_BATCH (now with _SIZE) so it mimics the MAX_BATCH_SIZE wrt MODEL_PTR
* default value for the OPTIMAL_BATCH_SIZE
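A hedged sketch (legacy IE API; the metric and option names follow the commit text, and the exact overloads may differ) of querying OPTIMAL_BATCH_SIZE for a particular model via MODEL_PTR, mirroring the MAX_BATCH_SIZE query:

```cpp
#include <map>
#include <string>
#include <ie_core.hpp>

unsigned int query_optimal_batch(InferenceEngine::Core& core,
                                 InferenceEngine::CNNNetwork network) {
    std::map<std::string, InferenceEngine::Parameter> options{
        {"MODEL_PTR", network.getFunction()}};  // the model the plugin should analyze
    return core.GetMetric("GPU", "OPTIMAL_BATCH_SIZE", options).as<unsigned int>();
}
```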
* clang
* accommodate the new func tests location
* fix shuffle of headers after clang + copyrights
* fixed misprint made during code refactoring
* moving the common thread-safe containers (like ThreadSafeQueue) to a dedicated dev_api header
* switch from the device name to the presence of the OPTIMAL_BATCH_SIZE metric as the condition to consider Auto-Batching
* switching from the unsafe size() and minimizing time under lock
* code style
* brushed the ApplyAutoBatching
* brushed the metric/config names and descriptions
* completed the core integration tests for the auto-batching
* ExecGraphInfo and check for incorrect cfg
* removed explicit dependencies from cmake file of the plugin
* disabling Auto-Batching through the throughput hint (to preserve the current product default); only explicit device names like BATCH:GPU are used in the tests
Co-authored-by: Roman Lyamin <roman.lyamin@intel.com>
Co-authored-by: Hu, Yuan2 <yuan2.hu@intel.com>
* Remove some legacy targets
* Replace some targets
* Removed inference_engine_plugin_api dependency
* Minor comment for developer config
* Fixed include paths
* Small fixes for static build
* Try to fix build pyopenvino
* Fixed comments
* Try to fix build
* Include OpenVINODeveloperPackage inside InferenceEngineDeveloperPackageConfig
* Try to fix GAPI tests
* Fix incomprehensible error message during layout conversion when layout rank doesn't match with shape rank
* Stash
* stash
* Memcpy implementation
Added tests
* Revert "Fix incomprehensible error message during layout conversion when layout rank doesn't match with shape rank"
This reverts commit 37064741b2.
* Fix clang-format and remove redundant headers
* Covered "cached" case (+ tested on Myriad)
* Apply review comments
Introduced an 'applyBatchedBlob' function which allows overriding 'memcpy' at inference time
* clang-format fix
* Added dynamic shape case
* - Review comments
- Deep copy of parameters/results for caching from cnnNetwork. Deep copy logic is moved to Utils
- Caching Tests: return correct inputs/outputs map after ImportNetwork mock call
* Reworked according to discussion
Also introduced 'SetBlobsImpl', which throws a 'Not implemented' exception by default.
Template plugin updates internal '_batched_inputs' map
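A hedged usage sketch (ov 2.0 API) of the batched-tensors feature these commits implement; plugins without native support fall back to the memcpy-based gather described above:

```cpp
#include <vector>
#include <openvino/openvino.hpp>

void infer_batched(ov::CompiledModel& compiled, const ov::Tensor& t0, const ov::Tensor& t1) {
    ov::InferRequest request = compiled.create_infer_request();
    std::vector<ov::Tensor> slices{t0, t1};  // each tensor covers one batch slice
    request.set_input_tensors(0, slices);    // input #0 is fed as a batch of 2
    request.infer();
}
```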
* Updated according to moved tests
* don't support 'memcpy' for ROI tensors
* Fix caching tests
* Just to retrigger CI
* Correct offset padding (however there is no test update, as the current implementation will not hit this path due to other checks)
* Fix clang-format
* Applied review comments
* Added check that 'get_tensor' throws if set_tensors/set_input_tensors is used
* Fix review comments - part 1
* Fix caching tests - mock implementation becomes more complicated
Cached mock model shall identify its inputs/outputs, otherwise the core will assert at the SetExeNetworkInfo stage
* More comments fix
* More comments fixes
* More cleanup
* And more style comment
* typo fix
* Try fix caching windows tests
* Blind attempt to fix Ubuntu20 CI