* [LPT] INT16, INT32 Quantization support
* [LPT] Support build on platforms with size_t == unsigned int
* [LPT] Test and fix wrong constant
* Fix build for size_t = unsigned int
* rebasing the perf-modes-2021.3 branch onto 2021.4
Caveats:
the (explicit) setting of #streams is not disabled (as it was before, for experiments with the DLBenchmark), and the logic slightly differs (streamsSet)
(cherry picked from commit 1ae1edc0ed)
* overriding streams (to force the TPUT mode to the DLBenchmark)
(cherry picked from commit 7f506cda31)
* disabling reducing #streams to fully mimic baseline c4df94d42d of the 2021.3 (before experiments)
(cherry picked from commit 85073dd1dd)
* clang/indentation
(cherry picked from commit 050a4155a9)
* splitting the Transformation into general and CPU-specific parts.
Now, hopefully, this fully mimics the baseline c4df94d42d of the 2021.3 (before experiments), as the stream-number reduction (as well as the early exit on GRU/LSTM/TensorIterator) is disabled
(cherry picked from commit e98b2c1a67)
* disabling GRU/LSTM/TI + reducing of streams + 5D considered compute-limited only for int8
(cherry picked from commit 32b8d80dee)
* refactored to avoid compute_limited_ratio, reverted the reducing #streams, removed LSTM from limitations
(cherry picked from commit f2b972171b)
* isa-based threshold logic
(cherry picked from commit b218457e1a)
* mode->hint
(cherry picked from commit ec20aa8eca)
* optional PERFORMANCE_HINT_NUM_REQUESTS
(cherry picked from commit 5a3883e3f3)
* moving the perfHints to the common OV config class + initial tests (CPU only, as the actual AUTO/MULTI should be accommodated on the master)
(cherry picked from commit 45bafe7d527f466507dea0693aeed51be4ebf776, then fixed)
* AUTO support for PerfHints
* MULTI support for PerfHints
* Enabling Perf hints for the GPU plugin
* brushing settings output a bit
* disabling "throughput" perf hint being default (until OV 2.0)
* uncommenting the logic which was disabled to force the DLBenchmark to use the throughput mode by default
* removing dead and experimental code, and debug printfs
* clang/code-style
* code-review remarks
* Moved the output of the actual params that the hint produced to the right place
* aligning MULTI's GetConfig behavior to HETERO's, as captured in the presentation (CVS-59960) ratified with the ArchForum
* clang
* benchmark_app brushing
* Update inference-engine/samples/benchmark_app/README.md
* propagating the perf hints through one more scenario in the merged AUTO-MULTI
* fixed misprint
* Python benchmark_app update for perf hints
* addressing reviewers' comments on the python benchmark_app
* simplifying/brushing logic a bit
* refactor the heuristic to the separate file (to be shared with iGPU soon)
* refactor conversion of modes to the specific GPU config per feedback from Vladimir
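As a hypothetical usage sketch: the hints introduced in the commits above are plain config keys. The key names below come from this change; the commented `load_network` call is an assumption about how an application would pass them, not code from this PR.

```python
# Sketch: performance-hint config dicts an application might pass to a plugin.
# Key names follow the PERFORMANCE_HINT keys added in this change.
latency_cfg = {"PERFORMANCE_HINT": "LATENCY"}

throughput_cfg = {
    "PERFORMANCE_HINT": "THROUGHPUT",
    # Optional hint from this change: how many requests the app intends to run,
    # so the plugin can size #streams accordingly (value format is an assumption).
    "PERFORMANCE_HINT_NUM_REQUESTS": "8",
}

# Hypothetical usage with the Inference Engine Python API:
# exe_net = ie.load_network(net, "CPU", config=throughput_cfg)
```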
* gather-8 upgrade/downgrade transforms
* bump to opset8
* add Gather-8 to MO
* fix permutation for Gather
* added TF layer tests, ONNX layer tests for negative indices, and nG python api unit-tests for negative indices
* typo fix, disable downgrading transformation
* disable downgrade in clDNN, line width style fix
* all Gathers are converted to the 7th version; transformations will be enabled/disabled as the op is added into plugins
* disabled Gather8LayerTest on GPU
* added common function for Op replacement
* concretized meaning of negative indices, fixed some typos
* applied review comments: left only meaningful layer tests
* removed op replacing functions from common utils
* returned back transformations without subroutines
* corrected style, added comments to common_optimizations.cpp
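For reference, the negative-index semantics concretized above can be sketched in plain Python. The helper names are hypothetical, and this 1-D sketch omits the `axis` and `batch_dims` attributes a real Gather-8 takes; a negative index counts back from the end of the indexed dimension.

```python
def normalize_index(idx: int, dim: int) -> int:
    """Map a possibly-negative Gather index into [0, dim)."""
    if idx < 0:
        idx += dim  # e.g. -1 selects the last element along the axis
    if not (0 <= idx < dim):
        raise IndexError(f"index {idx} out of range for dimension {dim}")
    return idx

def gather_1d(data, indices):
    """1-D sketch of Gather with negative-index support."""
    return [data[normalize_index(i, len(data))] for i in indices]
```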
* initial version, needs revision: opset7
* add convert testcase
* multiclass_nms support spec
* init version
* matrixnms support spec
* init support for matrix_nms
* impl matrix_nms
* implemented multiclass_nms reference.
TODO: more test cases.
* support dynamic shape in test
* update to spec 0611
* fixes.
* fix: now sort by class_id and score work.
* fix clang check error
* more test cases verified.
* fixes in ref impl.
* attribute nms_eta works
* test cross_batch and output_type i32.
* enable multiclass-nms CPU plugin fallback to ngraph
* fix keep_top_k typo
* enable matrix-nms CPU plugin fallback to ngraph
* support sort_result_across_batch
* Add matrix_nms unit test
* Add cross batch test cases
* fix typo
* move multiclass to opset8
* move matrixnms to opset8
* Reference implementations for MulticlassNms and MatrixNms ops
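A minimal sketch of the score-decay idea behind Matrix NMS (following the Matrix NMS paper: instead of suppressing boxes, higher-overlap boxes get their scores decayed via an IoU matrix). The kernel and compensation details here are my reading of the technique, not necessarily the exact reference implementation in this PR.

```python
import math

def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0

def matrix_nms(boxes, scores, sigma=2.0):
    """Decay scores with a gaussian kernel; inputs pre-sorted by descending score."""
    n = len(boxes)
    ious = [[iou(boxes[i], boxes[j]) for j in range(n)] for i in range(n)]
    decayed = []
    for i in range(n):
        decay = 1.0
        for j in range(i):
            # compensation: max IoU of box j with any higher-scored box
            comp = max((ious[j][k] for k in range(j)), default=0.0)
            d = math.exp(-(ious[i][j] ** 2 - comp ** 2) / sigma)
            decay = min(decay, d)
        decayed.append(scores[i] * decay)
    return decayed
```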
* fix name conflict
* remove unused var
sort_result_across_batch default set to false
* avoid float overflow
* fix clang check error
* add info for the Mac CI failure
* change testcase due to unstable sort
* nms add 'normalized' attribute
* multiclass cpu test support 'normalized'
* nms add 'normalized' attribute
* fixes: 1. normalized support. 2. sort by score before keep_top_k inside a batch.
* fix sort order in matrix_nms
* fix review comments
* add matrix_nms MKLDNN extension layer
* parallel in matrix nms
* separate filtered_box
* separate class_nms result
* parallel in class
* parallel in batch
* partial new nms
* partial remove useless function
* debug & fix
* debug in indexing
* fix test cases
* remove logging
* fix code-style
* fix typo
* add matrix_nms extension
* nms python api
* remove unused testcases
* refactor transformation
* transform dynamic shape to static shape
* Update inference-engine/src/transformations/include/ngraph_ops/nms_static_shape_ie.hpp
Co-authored-by: Ilya Churaev <ilyachur@gmail.com>
* remove register_pass call
* [MKLDNN]migrate matrix_nms to MKLDNNNode
* bug fix in matrix_nms
* padding on matrix_nms
* remove logging
* test case refine
* merged transform_matrix_nms branch
* refine matrixnms testcase
* multiclass nms cpu plugin implement for static shape, rebased on Reference implementations PR
* rebase to the new multi-class transform provided by lc
* Name style aligned with matrix-nms
* static shape padding style to batch inside,new unit test method, real classnum shape
* fix format
* fix ci error
* multi-class NMS modifications based on PR reviewer comments: code format, copyright, removal of unused includes and functions
* explicit template instantiation due to mac ci fail
* Yi3/fix review (#16)
* fix coding style
* use parallel_for2d
* fix ci fail
* unify 'copyright 2021'
* mkldnn_multiclass_nms node update based on PR review (#17)
* [MKLDNN] apply suggestion for matrix_nms (#18)
* fix bug
* apply review comments
* skip only Nms test, not MatrixNms MulticlassNms test
Co-authored-by: Zhang Yi3 <yi3.zhang@intel.com>
Co-authored-by: jialipen <cecilia.peng@intel.com>
Co-authored-by: mangguo <mang.guo@intel.com>
Co-authored-by: Ilya Churaev <ilyachur@gmail.com>
Co-authored-by: liubo-intel <bo4.liu@intel.com>
* Fix NormalizeL2Fusion and allow LpNormalization to be fused to NormalizeL2
* apply code format
* use cast_vector<uint64_t>
* use MKLDNNNormalizeL2Node::isSupportedOperation in normalizeL2FusionCallback
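For context on the fusion above, a one-axis sketch of the L2 normalization that NormalizeL2 computes. The `eps_mode` handling follows my reading of the NormalizeL2 spec ("add" adds eps to the squared sum, "max" floors it) and is an assumption, not this PR's code.

```python
import math

def normalize_l2(values, eps=1e-9, eps_mode="add"):
    """L2-normalize a 1-D vector the way NormalizeL2 would along one axis."""
    sq_sum = sum(v * v for v in values)
    if eps_mode == "add":
        denom = math.sqrt(sq_sum + eps)   # eps keeps the zero vector safe
    else:  # "max"
        denom = math.sqrt(max(sq_sum, eps))
    return [v / denom for v in values]
```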
* [LPT] [CPU] Convert dequantization shift in low precision before FP32 conversion in CPU
* [LPT] Avoid not neccessary conversion to FP32
* [LPT] Split workaround: replace_node manual handling
* [nGraph] [LPT] Q/DQ representation on weights extension: update transformation for conversion to old format
* review notes fix
* [LPT] checkZeroPoint reuse
* Compile time enabling or disabling of first inference time counters
* First inference time counters
* Counters for validate_nodes_and_infer_types and check_all_parameters_registered removed from first inference time counters scope
* Code style fix
* Missing macro for CC and invalid domain names
* Code style fix
* Unused function warnings fixed
* do not convert Sequences to TensorIterator when plugin supports Sequence primitive
* fix reference implementations for Sequences, processing the seq_len == 0 case
* Adding new mode for LSTMSequence single layer tests
* update single layer tests
* fix failed unit tests, updated single layer tests for rnn/gru sequences
* fix failed unit tests
* fix single layer tests
* ignore failed single layer tests on gpu (known issue), fix review remarks
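The seq_len == 0 fix above amounts to not executing any recurrent step for that batch element, so the initial state passes through unchanged. A toy sketch (the cell, names, and return shape are hypothetical, not the actual reference implementation):

```python
def run_sequence(cell, x_seq, h0, seq_len):
    """Run `cell` for seq_len steps of x_seq starting from state h0.

    With seq_len == 0 no step executes: the initial state is returned
    untouched and no step outputs are produced.
    """
    h = h0
    outputs = []
    for t in range(seq_len):
        h = cell(x_seq[t], h)
        outputs.append(h)
    return outputs, h
```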
* Added support for Gelu-6 to the MO
* Adding Gelu-6 to ngraph and python API + some tests
* Fixed typo in the Gelu approximation mode
* Fixed Gelu-6 reference implementation for Tanh mode
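The two approximation modes referenced above follow the standard GELU formulas; a quick sketch of the exact (erf) and tanh variants, using the textbook definitions rather than the plugin's reference code:

```python
import math

def gelu_erf(x):
    """Exact GELU: x * Phi(x), with Phi the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))

def gelu_tanh(x):
    """Tanh approximation used by the 'tanh' approximation mode."""
    c = math.sqrt(2.0 / math.pi)
    return 0.5 * x * (1.0 + math.tanh(c * (x + 0.044715 * x ** 3)))
```

The two agree closely for moderate inputs, which is why the approximation mode is an attribute rather than a separate op.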
* Added transformation to downgrade v6::Gelu to v2::Gelu
* Added specification for the Gelu-6
* Code style fixes
* The Gelu-6 operation specification update
* Fixed compilation issue in reference implementation for Gelu
* Fix compilation issues for some OSs
* Code style fix
* One more cpplint issue fix
* Fixed Gelu6 reference implementation compilation on Windows.
* Code style fix
* Fixed various ngraph unit tests
* Code style check
* Reverted Gelu-2 to be fused op
* Fixed Gelu6 downgrade transformation
* Added unit test for Gelu6Downgrade transformation
* Update copyright year
* Updated copyright year
* Replaced tab characters with 4 spaces in IR reader tests
* Code style fixes
* Added default value for GeluApproximation mode for Gelu-6 op
* Fixed code style for Gelu-6
* Changed order of parameters for the Gelu evaluate to potentially avoid backward compatibility issues with ARM plugin
* Fixed code style
* Introduced opset7. Moved Gelu6 to opset7
* Fixed non-updated transformation
* Fixed opset version in ngraph Python API for Gelu operation
* Fixed typo in the opset number in the documentation
* Reverted some changes related to Gelu6
* Updated MO to produce Gelu7
* Updated unit tests for Gelu
* Updated Gelu7 specification
* Changed gelu reference implementation. Added opset7 to Python packages
* Updated Python API tests for Gelu operation
* Code style fix
* Marked get_approximation_mode function as const
* Added missing "const" qualifier
* Fixed code style issues in tests
* Added extractor for MxNet operation Gelu
* Spelling issues fix
* Updated MxNet supported symbols
* Added NGRAPH_OP_SCOPE for Gelu7 validate_and_infer_types
* Fixed a typo in the comment
* Removed legacy IE shape infer
* Removed GenericIE operation
* Removed legacy shape infer tests
* Removed legacy test with legacy IE reshape
* Fixed compilation issues related to removal of GenericIE
* Fixed one more compilation issue with clDNN
* Fixed test for reading experimental ops
* Updated tests and made the IR Reader load old experimental and extension ops as opset6
* Change the opset of some ops only if they are currently experimental/extension, to avoid situations like opset1::Proposal -> opset6::Proposal
* Removed more legacy code
* Returned back code removed by mistake
* Fixed issues related to incorrect merge with master
* Merge fixes
* Fixed unit tests which started to fail because loading a model with an unknown operation now fails earlier
* Removed incorrectly added code
Co-authored-by: Evgeny Lazarev <elazarev.nnov@gmail.com>