- add error reporting for failed kernel runs during auto-tune
- fix auto-tuning for asymmetric quantization
- add asymmetric quantization information to cache
- change auto-tuning metric from average to min
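
As a rough illustration of why the metric change matters, the sketch below picks the fastest configuration from repeated timing runs. The names `pick_best` and `timings`, and the timing values, are hypothetical and are not taken from the auto-tuner itself. A single slow outlier run can make an otherwise fast configuration lose under the average, while the minimum ignores it.

```python
from statistics import mean


def pick_best(timings_by_config, metric=min):
    """Return the config name with the lowest score under `metric`.

    Configs with no successful runs (empty lists) are skipped; this is an
    assumption standing in for a failed kernel run.
    """
    scored = {
        name: metric(runs)
        for name, runs in timings_by_config.items()
        if runs
    }
    return min(scored, key=scored.get)


# Hypothetical timings in milliseconds; config_a has one noisy outlier.
timings = {
    "config_a": [1.0, 1.1, 5.0],
    "config_b": [1.5, 1.6, 1.7],
    "config_c": [],  # every run failed, so it is ignored
}

best_by_average = pick_best(timings, metric=mean)
best_by_min = pick_best(timings, metric=min)
print(best_by_average, best_by_min)
```

With these made-up numbers the average favours `config_b`, while the minimum favours `config_a`.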
This change adds checks, macros and defines for two early/experimental
features:
- local memory block reads
- builtin optimization hints, e.g. __builtin_assume
* [IE VPU] Set name for outDSR in DTS transformations
* [IE VPU] Enable NonZero_Transpose tests
* [IE VPU] Set name for outDSR in Reduce DTS
* [IE VPU] Use move semantics in DTS
* Specification for the NMS-4 operation (updated shape infer function)
* Enabled NMS-4 in the Model Optimizer
* Changed opset version for NMS with dynamic outputs and namespace to be "dynamic"
* Added NMS-4
* Added opset4 to the nGraph
* Added unit tests for NMS-4 type infer
* Renamed UpgradeNMS3ToNMS4 to UpgradeNMS3ToNMSDynamic. Added stub for ConvertNMS4ToLegacy
* Make IE aware of opset4 ops
* Updated NMSIE to have different shape infer function based on the NMS it was converted from. Implemented NMS4->NMSIE conversion
* Apply code style
* Updated StaticShapeNonMaximumSuppression op in the VPU
* Introduced new version of NMSIE operation with shape infer function from v4::NMS
* Fixed dynamicToStaticNonMaxSuppression transformation
* Added new version of NMSIE op with updated shape infer function
* Fixed NMS4 to NMSIE2 transformation
* Fixed constructors for nGraph ops v4::NMS and dynamic::NMS
* Updated text in the opset4 specification document
* Code style fixes
* Fixed constructors for StaticShapeNMS + fixed test
* Minor change to the NMS op in the MO
* Fixed typo in the dynamic_to_static_shape_non_max_suppression transformation
* Removed redundant checks
* Refactored NMS infer and validate functions
* Added more checks to the validate_and_infer_types functions for NMS-3 and NMS-4
* Fixed compilation issue on Windows for op NMS
* Code style fixes
* Fixed typos in the NMSIE and NMSIE2 to CNNLayer op conversion
* Fixed typo in the ie_cnn_layer_builder_ngraph.cpp
* Fixed the NMSToLegacyNMS transformation. Added unit tests
* Apply code review comments
* Refactored NMSIE to use visitors
* Removed calling ConvertNMS4ToLegacy in the common optimizations
* Moved NMS4ToNMSLegacy to convert1_to_legacy group of transformations
* Removed useless include statement
* Removed copy-paste issue
Co-authored-by: Evgeny Lazarev <elazarev.nnov@gmail.com>
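
The updated shape infer function mentioned in the NMS-4 commits above can be sketched as follows. This is a minimal illustration, assuming the upper bound on the first output dimension is `min(num_boxes, max_output_boxes_per_class * num_classes)` for NMS-3 and `min(num_boxes, max_output_boxes_per_class) * num_batches * num_classes` for NMS-4; the function names are hypothetical and the formulas should be checked against the opset specification.

```python
def nms3_selected_rows(num_batches, num_classes, num_boxes, max_per_class):
    """Assumed NMS-3 bound on the number of selected-index rows."""
    # num_batches is unused here; it is kept so both helpers share a signature.
    return min(num_boxes, max_per_class * num_classes)


def nms4_selected_rows(num_batches, num_classes, num_boxes, max_per_class):
    """Assumed NMS-4 bound: per-class limit applied for every batch and class."""
    return min(num_boxes, max_per_class) * num_batches * num_classes


# Hypothetical input: boxes [2, 100, 4], scores [2, 3, 100], at most 10 per class.
rows_v3 = nms3_selected_rows(2, 3, 100, 10)
rows_v4 = nms4_selected_rows(2, 3, 100, 10)
print([rows_v3, 3], [rows_v4, 3])  # output shapes [rows, 3]
```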
* Fixed deletion of Transpose layers before and after Interpolate layers.
* Added run_after() for the transformation InterpolateTranspose.
* Some checks were moved from the replacement function to the pattern.
* Added a check of the 'axes' attribute to the pattern.
The ExtractImagePatches operation collects patches from the input
tensor, as if applying a convolution. All extracted patches are stacked
in the depth dimension of the output.
JIRA: 30055
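
The ExtractImagePatches description above can be sketched in plain Python. This is a simplified illustration, not the operation's implementation: it handles one image in `[C][H][W]` layout with no padding and a dilation rate of 1, and the ordering of patches along the depth dimension (kernel row, kernel column, then channel) is an assumption.

```python
def extract_image_patches(image, ksize, stride):
    """Collect sliding-window patches and stack them along depth.

    image:  nested list indexed as [C][H][W]
    ksize:  (kernel_height, kernel_width)
    stride: (stride_height, stride_width)
    Returns a nested list indexed as [C * kh * kw][out_h][out_w].
    """
    kh, kw = ksize
    sh, sw = stride
    channels = len(image)
    height, width = len(image[0]), len(image[0][0])
    # Same output size as a convolution without padding.
    out_h = (height - kh) // sh + 1
    out_w = (width - kw) // sw + 1

    out = []
    # Assumed depth ordering: kernel row, kernel column, input channel.
    for dy in range(kh):
        for dx in range(kw):
            for c in range(channels):
                plane = [
                    [image[c][y * sh + dy][x * sw + dx] for x in range(out_w)]
                    for y in range(out_h)
                ]
                out.append(plane)
    return out


image = [[
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]]
patches = extract_image_patches(image, ksize=(2, 2), stride=(2, 2))
top_left = [plane[0][0] for plane in patches]
bottom_right = [plane[1][1] for plane in patches]
print(top_left, bottom_right)
```

Each output position holds one flattened 2x2 patch in its depth dimension, so a 1x4x4 input becomes a 4x2x2 output.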
* LayerNorm (PyTorch/HuggingFace pattern) -> MVN + Mul + Add. Improves perf on BERT by 5%
* Deduce across_channels from the axes passed to the MVN op.
The axes are normalized first. If no axes are specified, fall back to the (previous) default across_channels value
Co-authored-by: myshevts <maim.y.shevtsov@intel.com>
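
The two LayerNorm/MVN items above can be sketched as follows. This is a minimal illustration under stated assumptions: the channel axis is taken to be 1, `across_channels` is treated as true when the normalized axes include it, epsilon is added inside the square root, and the fallback default is passed in as a parameter because the commit does not state its value. All function names are hypothetical.

```python
import math


def normalize_axes(axes, rank):
    """Map negative axes to their non-negative equivalents."""
    return sorted({a + rank if a < 0 else a for a in axes})


def deduce_across_channels(axes, rank, default, channel_axis=1):
    """Assumed rule: reduction over the channel axis means across_channels."""
    if not axes:
        return default  # no axes given, keep the previous default
    return channel_axis in normalize_axes(axes, rank)


def mvn(values, eps=1e-9):
    """Mean-variance normalization over a flat list of values."""
    m = sum(values) / len(values)
    var = sum((v - m) ** 2 for v in values) / len(values)
    return [(v - m) / math.sqrt(var + eps) for v in values]


def layer_norm(values, gamma, beta, eps=1e-9):
    """LayerNorm expressed as MVN, then Mul (gamma), then Add (beta)."""
    return [n * g + b for n, g, b in zip(mvn(values, eps), gamma, beta)]


across = deduce_across_channels([-3, -2, -1], rank=4, default=False)
per_channel = deduce_across_channels([-1], rank=3, default=True)
result = layer_norm([1.0, 3.0], gamma=[2.0, 2.0], beta=[1.0, 1.0])
print(across, per_channel, result)
```

In the first call the axes `[-3, -2, -1]` on a rank-4 tensor normalize to `[1, 2, 3]`, which includes the assumed channel axis; in the second, `[-1]` on a rank-3 tensor normalizes to `[2]`, which does not.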