* [GPU] Reorder weights refactoring (#17787)
* [GPU] Fix DG2 with weights optimization
* [GPU] Fix inner order description for some formats
* [GPU] Fix expected number of primitives in test
---------
Co-authored-by: Roman Lyamin <Roman.Lyamin@intel.com>
Co-authored-by: Sergey Shlyapnikov <sergey.shlyapnikov@intel.com>
* [GPU] Permute f and y axes
Supported cases: the y and f axis sizes (and the x axis size, if it is not equal to 1) are divisible by 4, 8, or 16.
Added a kernel to switch the f and y axes in 4D models for blocked and planar formats.
Added tests.
* Added subgroup read/write to THREE_DIM_TRANSPOSE kernel case.
* Better checking of whether the SIMD size is supported.
* Added support for long type to subgroup read/write.
* Added subgroup read/write support to 2d permute.
* Fixed Windows build issue.
* Changed f and y indexes in iteration.
* Added vector read/write.
* Fixed j_times calculation.
* Better naming.
* Rollback test logic.
* Fixed fusion logic.
* Accept only supported blocked layouts and SIMD sizes.
---------
Co-authored-by: Mykhailo Hnap <mykhailo.hnap@capgemini.com>
Co-authored-by: Wilson Seok <wilson.seok@intel.com>
* [GPU] Add oneDNN primitives profiling support
* [GPU] Add stream.wait() method to prevent cache flushing and other possible side effects of the finish() call
* Add comment for wait() usage
* Provided visualization of partial values and labels. Adopted DimensionTracker for better equivalence tracking
* Addressed comments and fixed one test
* Removed copy of consts in translator, added test.
* Fixed memory loss for tf.Const.
* Added test, minor corrections.
* Update src/bindings/python/src/openvino/frontend/tensorflow/node_decoder.py
Co-authored-by: Roman Kazantsev <roman.kazantsev@intel.com>
* Test corrections.
* Added comment.
---------
Co-authored-by: Roman Kazantsev <roman.kazantsev@intel.com>
* Enable `LoadedTensor.*HETERO` test
* Fix use of `ICompiledModel::outputs()`
* Remove extra `loaded_from_cache` argument
* Misprint
* Small refactoring
* Remove extra `model` from `CompiledModelDesc`
Use `get_runtime_model()` instead
* ClangFormat
* [PyOV] Extend Tensor API
* one more ctor
* apply comments
* support `ConstOutput`
* add checks for shape
* checks for type and shape
* apply comments
* is_continuous
* codestyle