The order of the supported primitive descriptors of an eltwise node affects model performance. Often only one of the port descriptors matches the layout of the parent descriptors, e.g. when the two parent ports have mixed layouts "nchw nhwc". In that case either the nchw or the nhwc layout is chosen for the eltwise node, and a reorder is inserted on one of the ports. The shapes of the ports can also differ (when one of the inputs is broadcast), so reorders on different ports have a different performance impact. The layout of the eltwise node's child affects performance as well, since it may or may not require a reorder on its input.
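For illustration only, a minimal sketch of the layout choice described above; `PortDesc`, `reorder_cost`, and `pick_eltwise_layout` are made-up names, not the plugin's actual API:

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical, simplified descriptor of one eltwise port (not the plugin's
// real data structures).
struct PortDesc {
    std::string layout;   // e.g. "nchw" or "nhwc"
    size_t elem_count;    // a broadcast input carries far fewer elements
};

// Rough cost of reordering a port: zero if it already has the target layout,
// otherwise proportional to the amount of data it carries.
static size_t reorder_cost(const PortDesc& port, const std::string& target) {
    return port.layout == target ? 0 : port.elem_count;
}

// Pick the layout that minimizes total reorder cost across both parent ports
// and the child port; when one parent is broadcast, reordering it is cheap,
// so the other parent's layout usually wins.
static std::string pick_eltwise_layout(const std::array<PortDesc, 2>& parents,
                                       const PortDesc& child) {
    std::string best;
    size_t best_cost = SIZE_MAX;
    for (const std::string& candidate : {std::string("nchw"), std::string("nhwc")}) {
        const size_t cost = reorder_cost(parents[0], candidate) +
                            reorder_cost(parents[1], candidate) +
                            reorder_cost(child, candidate);
        if (cost < best_cost) {
            best_cost = cost;
            best = candidate;
        }
    }
    return best;
}
```

The point is that which port ends up with the reorder depends on how much data each port carries and on what layout the child expects.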
* Optimization for gemm & fc in iGPU.
FC: fake alignment to 16 is better on iGPU.
Gemm: permute + gemm_tiled_opt is better than transposed_input + gemm_ref kernel for shapes not aligned to 16. Note that this is a temporary optimization and will be removed once the final solution (i.e., support for unaligned transposed input shapes in the gemm_tiled_opt kernel) is available.
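A rough sketch of what "fake alignment" means here (a made-up helper, not the actual kernel-selection code):

```cpp
#include <cstddef>

// Hypothetical helper: round the FC row count up to a multiple of 16 so the
// kernel always works on full 16-wide blocks; the padded rows are simply not
// written back to the real output.
static size_t fake_aligned_rows(size_t rows, size_t alignment = 16) {
    return (rows + alignment - 1) / alignment * alignment;
}
// fake_aligned_rows(30) == 32, fake_aligned_rows(48) == 48
```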
* Fix unittest
* Fix for model_cache
* Fix unittest
* Init rotated non-max suppression spec
* Add opset13 docs
* Apply minor refactor from review
* Update boxes definition
* Update example format from cpp to xml
* Add version in op list
* Add clockwise attr to the example
* Align indent
* Remove redundant input from example
* Add steps of iou_rotated
* Add default values for attributes
* Drop box encoding attribute
* Rephrase input description
* Apply grammatical suggestions
* use input memory buffer as output memory when input1 or input2 is empty (sketched after this group of changes)
* fix wrong rebase
* add func test
* implement it in on_execute()
* remove deleted function definition
* remove unused header files
* fix include error
* update condition of empty input check
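A minimal sketch of the idea behind these changes, using hypothetical types rather than the actual GPU plugin code: when exactly one input is an empty tensor, the other input's buffer is forwarded as the output memory.

```cpp
#include <memory>
#include <vector>

// Hypothetical stand-in for a device memory buffer.
struct Memory {
    std::vector<float> data;
    bool empty() const { return data.empty(); }
};

// If exactly one input is an empty tensor, the node's output equals the other
// input, so its buffer can be reused as the output memory with no copy and no
// kernel launch; otherwise fall back to normal execution.
std::shared_ptr<Memory> reuse_input_as_output(const std::shared_ptr<Memory>& in0,
                                              const std::shared_ptr<Memory>& in1) {
    if (in0->empty() && !in1->empty())
        return in1;
    if (in1->empty() && !in0->empty())
        return in0;
    return nullptr;  // needs the regular execution path
}
```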
* Symbolic shape inference and graph optimizations
- Prepares a place in the CommonOptimizations pipeline for symbolic optimizations
- Introduces symbolic propagation and symbolic optimizations for ChainedMaximum, NopBroadcast and shape sub-graph optimization (the NopBroadcast idea is sketched below)
- Introduces utility runtime info for TableOfEquivalence passing and disabling of value invalidation during shape inference
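For illustration, a self-contained sketch of the kind of reasoning symbolic propagation enables for NopBroadcast; `SymDim` and `broadcast_is_nop` are made up, while the real passes operate on dimension labels tracked in the model:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical symbolic dimension: a static value, or a label standing for an
// unknown size (equal labels are proven equal by symbolic propagation).
struct SymDim {
    int64_t value = -1;   // -1 means dynamic
    int64_t label = 0;    // 0 means unlabeled
    bool same_as(const SymDim& other) const {
        if (value >= 0 && other.value >= 0)
            return value == other.value;
        return label != 0 && label == other.label;
    }
};

// NopBroadcast idea: if every output dimension is provably equal to the
// corresponding input dimension, the Broadcast changes nothing and can be
// removed from the graph.
bool broadcast_is_nop(const std::vector<SymDim>& in, const std::vector<SymDim>& out) {
    if (in.size() != out.size())
        return false;
    for (size_t i = 0; i < in.size(); ++i)
        if (!in[i].same_as(out[i]))
            return false;
    return true;
}
```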
* Executes NgramFusion in a symbolic environment. Relaxes the NgramFusion pattern using symbolic knowledge
* Remove debug model visualization
* rt_info copying to new Add operation
* Fix visualization and place validation in nicer place in symbolic transformation
* Fix Slice operation not to propagate labels if input and output dimension is fully dynamic
* Covering Vladislav's comments
* Replace value invalidation followed by validation with revalidation, since it does the same thing
* Adding back invalidation of cached values to Symbolic Propagation pass
* Fix StridedSlice label propagation. Code style
* Update src/common/transformations/tests/symbolic_transformations/nop_broadcast.cpp
* [GPU] Fix canonicalization for fused dep's shape
* Update TC to be reproducible on the latest master
* Fix custom canonicalize shapes for Gather
---------
Signed-off-by: Andrew Park <andrew.park@intel.com>
* Avoid Constant casting / printing when OV_VISUALIZE_TREE_CONST_MAX_ELEMENTS==0
Cast only the requested number of elements in Constant::cast_vector<>
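A minimal sketch of the idea, assuming a made-up helper rather than the real Constant::cast_vector<> signature:

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical illustration (cast_prefix is not the real API): convert at most
// max_elements values from a constant's raw buffer instead of materializing
// the whole tensor, so a visualization limit of 0 casts nothing at all.
std::vector<float> cast_prefix(const int64_t* data, size_t total, size_t max_elements) {
    const size_t n = std::min(total, max_elements);
    std::vector<float> out(n);
    std::transform(data, data + n, out.begin(),
                   [](int64_t v) { return static_cast<float>(v); });
    return out;
}
```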
* Refactor
* Revert style back
* Fix signed/unsigned comparison
* test
* Style
* Style