* [IE SAMPLES] activated NCC tool for c++ samples
* exclude ov_ncc_naming_style for tests
* fixed NCC hit
* Added support for source files in samples
* changed style of methods for benchmark
* changed style for speech sample
* changed code style
* changed code style for shared_tensor
* benchmark changes
* changed remote_tensors_filling
* fixed notes
* rebase of branch
* Set THROUGHPUT as the default configration for all the plugin and display the config of the plugin.
Signed-off-by: Wang, Yang <yang4.wang@intel.com>
* updated format.
Signed-off-by: Wang, Yang <yang4.wang@intel.com>
* Update benchmark python API.
Signed-off-by: Wang, Yang <yang4.wang@intel.com>
* Replace str 'THROUGHPUT' with CONFIG_VALUE(THROUGHPUT).
Signed-off-by: Wang, Yang <yang4.wang@intel.com>
* Using CONFIG_VALUE(THROUGHPUT) replace 'THROUGHPUT' string.
Signed-off-by: Wang, Yang <yang4.wang@intel.com>
* update code style.
Signed-off-by: Wang, Yang <yang4.wang@intel.com>
* Move the setting output code into the try block.
Signed-off-by: Wang, Yang <yang4.wang@intel.com>
I'll merge this. We need to talk with someone who know openvino build system better than we to "use cmake function ov_ncc_style_check to perform such check automatically". As far as I can see, currently it used only in one place, so it is not common approach for openvino components
* auto-batching POC squashed (all commits from auto-batch-2021.3 branch)
(cherry picked from commit d7742f2c747bc514a126cc9a4d5b99f0ff5cbbc7)
* applying/accomodating the API changes after rebase to the master
* replaying modified version of actual batch selection
* eearly experiments with model mem footprint
* changes from rebasing to the latest master
* experimenting with DG1 on the batch size selection, also collecting the mem footprint
* WIP:moving the auto-batching to the icore to let the MULT/AUTO support that, ALLOW_AUTO_BATCHING as a conventional config key. still fials hot device swap
* quick-n-dirty batch footpint vs device total mem
* code style
* testing which models perform badly due to kernels and NOT (batched) footprint
* stub pipeline task to comunicate the readiness rather than promise/future
* quick-n-dirty timeout impl
* explicit _completionTasks,reverting BA to use the timeout
* inputs outputs copies, works with AUTO and demo now
* accomodate the config per device-id, after rebase to the latest master
* allowing the auto-batching only with tput hint to let more conventional tests pass
* fix the pre-mature timeout restaring via waiting for batch1 requests completion
* moved the bacthed request statring ( along with input copies) to the dedicated thread
* [IE CLDNN] Disable bs_fs_yx_bsv16_fsv16 format for int8 convolution
* code style
* increasing the timeout to test the ssd_* models perf (timeout?) issues
* reducing number of output stuff in BA to avoid bloating the logs in experiments
* more aggressive batching for experiments, not limited to 32 and also 4 as a min
* more accurate timeout debugging info
* getting the reqs limitation from the plugin SetConfig as well
* refactor the reshape logic a bit to accomodate CPU for bathcing, also added remeote context
* let the benchamrk_app to consume specific batch values for the auto-batching such as BATCH:GPU(4)
* auto-batching functional test (with results check vs ref) and GPU instance for that
* fixed arithemtic on blobs ptrs
* clang
* handling possible batched network failure
* BATCH as the constants device name in test
* ENABLE_BATCH
* func tests for CPU, also DetectionOutput hetero tests (CPU and GPU)
* DetectionOutput hetero test for the CPU
* reenabling the Auto-Batching in the AUTO
* auto-batching device enabled in the test
* fixed the DO test
* improve the loading loop logic
* brushed the config keys
* allow hetero code-path for explicit device name like BATCH:GPU(4), used in the hetero code-path tests
* fix the test after refactoring
* clang
* moving ThreadSafeQueue to the ie_parallel, as it is re-used in the AUTO/MULTI and BATCH now
* auto-batching hetero test (subgraph with DetectionOutput)
* fixed minor changes that were result of experiments with impl
* code-style
* brushing, disabling CPU's HETERO tests until planned activity for 22.2
* removing home-baked MAX_BATCH_SZIE and swicthing to the official impl by GPU team
* remote blobs tests for the auto-batching (old API)
* brushed names a bit
* CreateContext and LoadNEtwork with context for the Auto-Batching plus remote-blobs tests
* fixed the ieUnitTests with adding CreateContext stub to the MockICore
* clang
* improved remote-blobs tests
* revert the back BA from exeprimenents with AB + device_use_mem
* conformance tests for BATCH, alos batch size 1 is default for BATCH:DEVICE
* remote blobs 2.0 tests, issue with context having the orig device name
* debugging DG1 perf drop (presumably due to non-fitting the device-mem)
* disbaling WA with batch/=2 for excesive mem footptint, leaving only streams 2
* remote blobs 2.0 tests for different tensor sharing types
* converting assert to throw to accomodate legacy API where the lock() was possible to be called
* revert the timeout back to avoid mixing the studies, fixed the footprint calc
* reverting to estimating the max batch by extrapolating from bacth1 size
* more conservative footptint etimation (with bacth1), graceful bacth 1 handling without duplication
* even graceful batch 1 handling without duplication
* WA for MAX_BATCH_SIZE failure, removing batch4 as a min for the auto-batching
* AutoBatchPlugin -> ov_auto_batch_plugin
* WA for gcc 4.8
* clang
* fix misprint
* fixed errors resulted from recent OV's Variant to Any transition
* skip auto-batching for already-batched networks
* AUTO_BATCH_TIMEOUT and tests
* GPU-specific L3
* switched to pure config, also improved ALLOW_AUTO_BATCHING config key handling logic
* debugging device info
* enabling the config tests for the GPU and fixing the Auto-batching tests to pass
* making the default (when not recognized the driver) cache size more aggressive, to accomodate recent HW with old drivers
* skip auto-batching for RNNs and alikes (e.g. single CHW input)
* fixed fallback to the bacth1 and moved HETERO path under condition to avoid bloating
* brushing
* Auto plugin GetMetric support gpu auto-batch
Signed-off-by: Hu, Yuan2 <yuan2.hu@intel.com>
* add test case
Signed-off-by: Hu, Yuan2 <yuan2.hu@intel.com>
* add comments on test
Signed-off-by: Hu, Yuan2 <yuan2.hu@intel.com>
* brushing the vars names, alos adding the excpetion handling
* disabling the auto-batching for the networks with non-batched outputs and faster-rcnn and alikes (CVS-74085) to minimize the of #failures
* add try catch
Signed-off-by: Hu, Yuan2 <yuan2.hu@intel.com>
* brushing the code changed in the GPU plugin
* Auto-Batch requests tests
* brushed varibles a bit (ref)
* cleaned debug output from the ie_core
* cleaned cmake for the Auto-Batch
* removed batchN estimation from batch1
* cleaned from debug printf
* comments, cleanup
* WA the mock test errors introduced with merging the https://github.com/myshevts/openvino/pull/13
* Adding back removed batchN estimation from batch1 to debug degradations on DG1 (resulted from too optimistic MAX_BATCH_SIZE?). This partially reverts commit e8f1738ac1.
* brushing ie_core.cpp
* fix 32bit compilation
* Code review: ENABLE_AUTO_BATCH
* consolidate the auot-batching logic in ie_core.cpp into single ApplyAutoBAtching
* renamed brushed the OPTIMAL_BATCH (now with_SIZE) and mimicks the MAX_BATCH_SZIE wrt MODEL_PTR
* default value for the OPTIMAL_BATCH_SIZE
* clang
* accomodate new func tests location
* fix shuffle of headers after clang + copyrights
* fixed misprint made during code refactoring
* moving the common therad-safe containers (like ThreadSafeQueue) to the dedicated dev_api header
* switch from the device name to the OPTIMAL_BATCH_SIZE metric presence as a conditin to consider Auto-Batching
* switching from the unsafe size() and minimizing time under lock
* code style
* brushed the ApplyAutoBatching
* brushed the netric/config names and descriptions
* completed the core intergration tests for the auto-batching
* ExecGraphInfo and check for incorrect cfg
* removed explicit dependencies from cmake file of the plugin
* disabling Auto-Batching thru the tput hint (to preserve current product default), only excplicit like BATCH:GPU used in the tests
Co-authored-by: Roman Lyamin <roman.lyamin@intel.com>
Co-authored-by: Hu, Yuan2 <yuan2.hu@intel.com>
* Moved Speech sample to OpenVINO 2.0
* improved samples's score
* changed code style
* added GNA configs
* renamed function due to new API
* added dynamic batch
* reordered includes
* speech_sample has more than 1 input
* added oname processing
* added multi input
* fixed notes
* removed getFullDeviceName with old api
* getFullDeviceName for benchmark
* Merged and compiling
* Fix for dynamic shape type
* review fixes
* renamed blob shape to tensor shape, small improvements
* fix code style
* added parsing of multiple shapes
* store latency per group, add isIdleRequestAvailable() to Infer Queue
* added cached random inputs
* redesign pipeline, added new metrics(avg, max, min), added metrics per groups
* fixed code style
* small improvements
* modified tensor parameters parsing
* modified -i parameter parsing: added possibility to specify input names
* implemented image cashing
* added cashed blobs creating
* added -pcseq flag, modified batch filling, changes fps formula
* improvements
* code formatting
* code formatting2
* apply suggestions from review
* replaced Buffer class with InferenceEngine Blobs
* use batch size in blobs filling
* added shared blob allocator to handle blob's data
* fixed warnings & code style
* allocate blobs
* fix for networks with image info input
* added comments & fixed codestyle
* clear data in free() in SharedBlobAllocator
* remove unnecessary check
* Delimeter is changed to ::
* stylefix
* added layout from string function, small improvements
* modified parsing to enable : in input parameters
* small fixes
* small fixes
* added missed blob allocation, fixes
* [TEST]added support for remote blobs
* fix remote blobs
* new inputs/files output format
* removed vectors resize which caused bugs
* made cl::Buffer type under ifdef, fix inputs filling
* changed batch() function to not throwing exceptions
* removed unused var
* fix code style
* replace empty name in input files with name from net input
* restored old behaviour for static models
* fix code style
* fix warning - made const iterator
* fix warning - remove reference in loop variable
* added random and image_info input types to -i, fix problem with layout
* replaced batch() with getBatchSize() in main
* fix layout, shape, tensor shape parameters parsing
* upd help messages for input, tensor shape and pcseq command
* added buffer for cl output blobs, small fixes
Signed-off-by: ivikhrev <ivan.vikhrev@intel.com>
* added legacy mode
* restore setBlob
* code style formatting
* move collecting latency for groups under flag
* removed not applicable layouts
* added hint to error message when wrong input name in -tensor_shape was specified
* added new metrics to statistics report
* Apply suggestions from code review
* fix binary blobs filling when layout is CN
* apply suggestions
* moved file in the right place after rebase
* improved -pcseq output
* updated args and readme
* removed TEMPLATE plugin registration
* fix -shape arg decsription
* enable providing several -i args as input
* renamed legacy_mode to inference_only and made it default for static models, renamed tensor_shape to data_shape
* upd readme
* use getBlob() in inference only mode
* fix old input type for static case
* fix typo
* upd readme
* move log about benchmark mode to the measuring perfomance step
* added class for latency metrics
* upd readme, fix typos, renamed funcs
* fix warning and upd parsing to avoid error with : in file paths
* fix error on centos : error: use of deleted function ‘std::basic_stringstream<char>::basic_stringstream(const std::basic_stringstream<char>&)
* added check for key in inputs
* renamed input to inputs
* adjust batch size for binary blobs
* replaced warning with exception in bench mode defining
* align measurement cycle with master
Co-authored-by: ivikhrev <ivan.vikhrev@intel.com>
* ov2.0 IE samples modification
apply code style
turn off clang style check for headers order
unify samples a bit
add yuv nv12 reader to format_reader, helloe_nv112 sample
hello_reshape_ssd ov2.0
* sync with PR 8629 preprocessing api changes
* fix for slog << vector<int>
* add operator<< for ov::Version from PR-8687
* Update samples/cpp/hello_nv12_input_classification/main.cpp
Co-authored-by: Mikhail Nosov <mikhail.nosov@intel.com>
* apply code style
* change according to review comments
* add const qualifier
* apply code style
* std::ostream for old inference engine version to make VPU plugin tests happy
* apply code style
* revert changes in print version for old api samples
* keep inference_engine.hpp for not ov2.0 yet samples
* fix merge artifacts
* fix compilation
* apply code style
* Fixed classification sample test
* Revert changes in hello_reshape_ssd sample
* rebase to master, sync with PR-9054
* fix issues found by C++ tests
* rebased and sync with PR-9051
* fix test result parsers for classification tests (except unicode one)
* fix mismatches after merge
* rebase and sync with PR-9144
Co-authored-by: Mikhail Nosov <mikhail.nosov@intel.com>
Co-authored-by: antonrom23 <anton.romanov@intel.com>