If the gemm input dimensions are not multiples of 16 and either the
transpose_a or transpose_b attribute is set, cldnn picks the
'gemm_ref' kernel instead of the faster 'gemm_tiled_opt'.
By inserting an explicit permute operation on the gemm input that requires it,
we make cldnn pick 'gemm_tiled_opt', which improves
performance.
For some input shapes, transpose(s) + gemm_tiled_opt can be slower than
gemm_ref alone. Based on benchmarks, the cutoff was set at input shapes > (64, 64).
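The selection rule described above can be sketched as a small predicate. This is an illustrative sketch only, not the cldnn code: the function and constant names are hypothetical, and it assumes that the alignment check applies to the two innermost dimensions of each input and that "> (64, 64)" means both of those dimensions exceed 64.

```python
ALIGNMENT = 16  # from the commit message: dimensions must be multiples of 16
CUTOFF = 64     # from the commit message: benchmark cutoff at shapes > (64, 64)


def is_aligned(shape):
    """True if both innermost dimensions are multiples of ALIGNMENT (assumed rule)."""
    return all(d % ALIGNMENT == 0 for d in shape[-2:])


def above_cutoff(shape):
    """True if both innermost dimensions exceed CUTOFF (assumed reading of '> (64, 64)')."""
    return all(d > CUTOFF for d in shape[-2:])


def should_insert_permute(shape_a, shape_b, transpose_a, transpose_b):
    """Decide whether an explicit permute before gemm is expected to pay off."""
    if not (transpose_a or transpose_b):
        return False  # no transpose attribute set, nothing to replace
    if is_aligned(shape_a) and is_aligned(shape_b):
        return False  # aligned inputs do not hit the gemm_ref fallback
    # Small inputs: transpose + gemm_tiled_opt may be slower than gemm_ref.
    return above_cutoff(shape_a) and above_cutoff(shape_b)
```

For example, under these assumptions `should_insert_permute((100, 70), (70, 100), True, False)` is true, while the same call with shapes `(50, 30)` and `(30, 50)` is false because the inputs fall below the cutoff.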
Ticket: 67271
* Replacing image with a scalable svg file
Replacing image in Protecting Model Guide: changing the PNG image to scalable SVG format.
Removing image project files.
* added gather tree blocked layout support
* gather tree blocked layout support review cleanup
* build fixed
* usages of program and program_node are changed to use kernel_impl_params
* pointer to network is replaced with reference
* moved primitive specific parameters out of kernel_impl_params
* use impl_params instead of program_node in onednn_impl
* relocate params from impl_params to primitive_inst
* added conditional directives around fused_desc_onednn
* removal of unnecessary changes
* node_output is separated from impl_param->output
* Improvements in rpm / debian build
* Fixed several debian warnings
* Supported old gflags from CentOS 7
* Reverted back OpenCV version
* Fixed clang-format
* [TF FE] Refactor translators for Select and SelectV2 and test it
It fixes the case where the condition has a smaller rank than the operands in Select.
Separately added tests for Select and SelectV2.
Do not mix up Select with Where: tests for Where are moved to test_tf_Where.py
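For context, the smaller-rank case can be illustrated in plain Python. The sketch below follows TensorFlow's documented behavior for the legacy Select op with a rank-1 condition (the condition picks whole slices along the first axis); it is not the frontend translator code, and `select_rows` is a hypothetical helper name.

```python
def select_rows(condition, x, y):
    """Rank-1 condition over higher-rank operands: pick x[i] or y[i] per first-axis slice."""
    if not (len(condition) == len(x) == len(y)):
        raise ValueError("condition length must match the first dimension of x and y")
    return [xi if c else yi for c, xi, yi in zip(condition, x, y)]


cond = [True, False]
x = [[1, 2], [3, 4]]
y = [[5, 6], [7, 8]]
result = select_rows(cond, x, y)  # [[1, 2], [7, 8]]
```

SelectV2, by contrast, uses NumPy-style broadcasting of the condition, which is one reason the two ops are tested separately.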
Signed-off-by: Kazantsev, Roman <roman.kazantsev@intel.com>
* Revert extra changes
* Apply code-review feedback: support undefined rank
Signed-off-by: Kazantsev, Roman <roman.kazantsev@intel.com>
Co-authored-by: Andrei Kochin <andrei.kochin@intel.com>
* [TF FE] Refactor translators for Reverse, ReverseV2 and test it
Make these operations reshapeable. Add layer tests for them to the pre-commit
Signed-off-by: Kazantsev, Roman <roman.kazantsev@intel.com>
* Apply code-review feedback: simplify checks in Reverse
* Apply the rest of code-review feedback: simplify code for Reverse
* Remove redundant check for axes
* Apply code-review feedback: support dynamic rank
Signed-off-by: Kazantsev, Roman <roman.kazantsev@intel.com>
* [GPU] add blocked format to experimental detectron detection output
* [GPU] add this kernel to the whitelist in program.cpp
Co-authored-by: ozhydkov-lohika <ozhydkov@lohika.com>
* cldnn serialization
* read layout from _impl_param instead of node
* changed ref in kernel_impl_param to pointer
* removed serialization utils
* removed serialization related changes
* restored references in function arguments
* remove trailing spaces
* revert change in bs_x_bsv16
* fix to rebase