* Review interpolate shapes and label propagation
* Review shape_infer template implementation
* Update shape infer of interpolate in GPU plugin
- Add new tensor accessor for ov::Tensor map
* Correct casting in dim::scale function
* Remove validation of size of input 1 in v0
* Relax inputs check for interpolate v4
* Correct GPU shape inference
* Use ov::Tensors in interpolate's evaluate
- Remove some duplicated code
- Apply comments from review
* Set shape in interpolate's eval for output tensor
* primitive serialization
* updated primitive::desc() to use impl_param instead of program_node
* added hash caching unit tests
* added missing calls to save and load of parent
* updated copyright year
* [GPU] Added shape agnostic optimized Permute_tile_8x8_4x4 kernel
Signed-off-by: Andrew Park <andrew.park@intel.com>
* Add permute_gpu_tile_8x8_4x4 shape agnostic TCs for ov_gpu_unit_tests
Signed-off-by: Andrew Park <andrew.park@intel.com>
* Fix calculation for required local mem size
Signed-off-by: Andrew Park <andrew.park@intel.com>
* Update not to consider x and feature dimensions for tile size in the shape agnostic kernel case
Signed-off-by: Andrew Park <andrew.park@intel.com>
---------
Signed-off-by: Andrew Park <andrew.park@intel.com>
+ Invalid calculation in reducing un-aligned feature axis for b_fs_yx_fsv16
+ Some reduce modes are not invariant to using 0 for out-of-range values
+ Added jit ZERO_INVARIANT_REDUCTION
+ Enable blocked unit-tests on dGPU by PR#15873
Signed-off-by: Min, Byungil <byungil.min@intel.com>
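Why zero padding is only safe for some reduce modes can be sketched as follows. This is a minimal illustration, not the kernel code: `ReduceMode`, `neutral_value`, `is_zero_invariant` and `reduce_with_padding` are hypothetical names, and only four modes are modelled. The idea is that lanes beyond the real feature count read as a padding value, and the result stays correct only when that value is the neutral element of the reduction.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical subset of reduce modes, for illustration only.
enum class ReduceMode { Sum, Prod, Min, Max };

// Neutral element of each mode: combining it with x yields x.
float neutral_value(ReduceMode mode) {
    switch (mode) {
    case ReduceMode::Sum:  return 0.0f;
    case ReduceMode::Prod: return 1.0f;
    case ReduceMode::Min:  return std::numeric_limits<float>::max();
    case ReduceMode::Max:  return std::numeric_limits<float>::lowest();
    }
    return 0.0f;
}

// A mode is zero-invariant when padding with 0 cannot change the result.
bool is_zero_invariant(ReduceMode mode) {
    return neutral_value(mode) == 0.0f;
}

float combine(ReduceMode mode, float acc, float v) {
    switch (mode) {
    case ReduceMode::Sum:  return acc + v;
    case ReduceMode::Prod: return acc * v;
    case ReduceMode::Min:  return std::min(acc, v);
    case ReduceMode::Max:  return std::max(acc, v);
    }
    return acc;
}

// Reduces `aligned` lanes; lanes past valid.size() read as `pad_value`,
// mimicking an unaligned feature axis in a blocked layout.
float reduce_with_padding(const std::vector<float>& valid, std::size_t aligned,
                          ReduceMode mode, float pad_value) {
    assert(aligned >= valid.size());
    float acc = neutral_value(mode);
    for (std::size_t i = 0; i < aligned; ++i) {
        const float v = i < valid.size() ? valid[i] : pad_value;
        acc = combine(mode, acc, v);
    }
    return acc;
}
```

With three real values {2, 3, 4} padded to 16 lanes, a sum over zero padding is unaffected, while a product or minimum over zero padding collapses to 0; padding with `neutral_value(mode)` avoids that.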
* enable PaddlePaddle elementwise broadcast
* fix CI fail issue
* Apply suggestions from code review
* fix CI fail issue
* only B to A broadcast is supported for PDPD
* fix GPU plugin testcase fail issue
* keep PDPD broadcast_merge CPU plugin implementation aligned with ov core
* add type prop test case for pdpd broadcast dst shape smaller than src shape
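The one-directional rule ("only B to A") can be sketched roughly as below. This is an assumption-laden illustration, not the ov core implementation: `Shape` and `pdpd_broadcast_shape` are hypothetical names, shapes are static, and the handling of `axis == -1` (align B with A's trailing dimensions) is assumed rather than taken from the commits.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

using Shape = std::vector<std::int64_t>;

// Hypothetical helper: broadcast B onto A, never the other way round.
// `axis` is the dimension of A that B's first dimension is aligned with;
// axis == -1 is assumed to mean "align B with A's trailing dimensions".
Shape pdpd_broadcast_shape(const Shape& a, const Shape& b, std::int64_t axis = -1) {
    const auto a_rank = static_cast<std::int64_t>(a.size());
    const auto b_rank = static_cast<std::int64_t>(b.size());
    if (b_rank > a_rank) {
        throw std::invalid_argument("B cannot have a higher rank than A");
    }
    if (axis == -1) {
        axis = a_rank - b_rank;
    }
    if (axis < 0 || axis + b_rank > a_rank) {
        throw std::invalid_argument("axis out of range");
    }
    for (std::int64_t i = 0; i < b_rank; ++i) {
        const auto a_dim = a[static_cast<std::size_t>(axis + i)];
        const auto b_dim = b[static_cast<std::size_t>(i)];
        // Each dimension of B must either match A or be 1.
        if (b_dim != 1 && b_dim != a_dim) {
            throw std::invalid_argument("incompatible dimensions");
        }
    }
    return a;  // the result always takes A's shape
}
```

Under these assumptions, A = {2, 3, 4, 5} with B = {3, 4} and axis = 1 yields {2, 3, 4, 5}, while swapping the operands is rejected because the destination would be smaller than the source.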
* Build using conanfile.txt
* Update .ci/azure/linux_arm64.yml
* Several improvements
* Removed conanfile.py
* Try to use activate / deactivate
* Fixed clang-format code style
* Supported TBB version from Conan
* Added more NOMINMAX
* Fixed static build
* More improvements for static build
* Add usage of static snappy in case of static build
* More fixes
* Small fixes
* Final fixes
* deserialization of dynamic batch
* updated multi stream tests
* added unit tests
* updated cache dir name
* resolved type conversion warning
* removed teardown()
* added const
* [GPU] Fix permute whose input layout mismatches the output in batch 2
* Add unit test
* Fix unit test
* Don't use deprecated interface for layer test
* Tensor accessor for shape inference
- as functor for getting data from tensor vector or map.
- as lambda in GPU plugin on tile op
* Make tensor data adapter pure virtual
- function accessor to data returns pointer to interface
* Refactor tensor data accessor and adapter
* Extract memory adapter and make it GPU graph internal
- can't be part of GPU runtime memory, as the core dev API is not visible there
* Expand IStaticShapeInfer by port map
- update factory map for new infer interface with port map information
- add bit util to generate bit mask and use it in PortMask
* Pass tensor accessor as reference, not as function object
- Add cldnn data adapter and accessor
- Reduce dynamic allocations in data accessors
* Fix compilation issues
* Use ov::Tensor for data accessor
- remove data adapters as they are not required
* Update comments
* Fix build issues
* Fix tile shape infer test
* Add empty null tensor accessor as specialization
* Apply style formatting
* Move data accessor from dev API to shape inference
* Fix linking issues
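The accessor idea in the commits above (one interface, backed by a tensor vector, a tensor map, or nothing at all) can be sketched as follows. This is a simplified illustration only: `Tensor` is a hypothetical stand-in rather than ov::Tensor, and the class names and the pointer-returning signature are assumptions, not the actual shape inference API.

```cpp
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical stand-in for a tensor type; not ov::Tensor.
struct Tensor {
    std::vector<int> data;
};

// Interface used by shape inference to ask for constant data on a port.
struct ITensorAccessor {
    virtual ~ITensorAccessor() = default;
    // Returns the tensor bound to `port`, or nullptr when none is available.
    virtual const Tensor* operator()(std::size_t port) const = 0;
};

// Accessor over a vector: the port number is the index.
class VectorAccessor : public ITensorAccessor {
public:
    explicit VectorAccessor(const std::vector<Tensor>& tensors) : tensors_(tensors) {}
    const Tensor* operator()(std::size_t port) const override {
        return port < tensors_.size() ? &tensors_[port] : nullptr;
    }
private:
    const std::vector<Tensor>& tensors_;
};

// Accessor over a map: only some ports carry data.
class MapAccessor : public ITensorAccessor {
public:
    explicit MapAccessor(const std::map<std::size_t, Tensor>& tensors) : tensors_(tensors) {}
    const Tensor* operator()(std::size_t port) const override {
        const auto it = tensors_.find(port);
        return it != tensors_.end() ? &it->second : nullptr;
    }
private:
    const std::map<std::size_t, Tensor>& tensors_;
};

// Empty accessor: no port ever has data.
class NullAccessor : public ITensorAccessor {
public:
    const Tensor* operator()(std::size_t) const override { return nullptr; }
};

// Example consumer: takes the accessor by reference, as in the commit above.
std::size_t element_count(const ITensorAccessor& get_data, std::size_t port) {
    const Tensor* t = get_data(port);
    return t ? t->data.size() : 0;
}
```

Passing the accessor by reference keeps the consumer independent of how the tensors are stored, and the accessors hold only a reference to the container, so constructing one does not allocate.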
* Add reorder with usr's output data type for assign
Signed-off-by: Andrew Park <andrew.park@intel.com>
* Fix incorrect input index for handling leftovers
Signed-off-by: Andrew Park <andrew.park@intel.com>
* Add TCs for ov_gpu_unit_tests
Signed-off-by: Andrew Park <andrew.park@intel.com>
---------
Signed-off-by: Andrew Park <andrew.park@intel.com>
* [GPU] Improve dump naming rule in debug feature.
The following dump naming rules are now supported:
- Exec_graph name
- Wildcard letter for target names ('*', '?')
- Case-insensitive name searching
- Also applied when showing loop body primitives
Newly introduced OV_GPU_xxx options:
- OV_GPU_ListLayers = 1 (Show layer names and exit)
- OV_GPU_VerboseColor = 1 (Show verbose with color)
Add file, line, function in log prefix.
Signed-off-by: hyunback <hyunback.kim@intel.com>
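The wildcard and case-insensitive matching described above could look roughly like this. It is a minimal sketch under assumptions, not the plugin's debug-config code: `wildcard_match` is a hypothetical helper, '*' is taken to match any run of characters (including none) and '?' exactly one character.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: case-insensitive match of `name` against `pattern`,
// where '*' matches any run of characters and '?' matches one character.
bool wildcard_match(const std::string& pattern, const std::string& name) {
    auto lower = [](char c) {
        return static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
    };
    std::size_t p = 0;                     // position in pattern
    std::size_t n = 0;                     // position in name
    std::size_t star = std::string::npos;  // position of the last '*' seen
    std::size_t mark = 0;                  // name position tied to that '*'
    while (n < name.size()) {
        if (p < pattern.size() && pattern[p] == '*') {
            // Remember the star and first try matching it against nothing.
            star = p++;
            mark = n;
        } else if (p < pattern.size() &&
                   (pattern[p] == '?' || lower(pattern[p]) == lower(name[n]))) {
            ++p;
            ++n;
        } else if (star != std::string::npos) {
            // Mismatch: let the last '*' absorb one more character and retry.
            p = star + 1;
            n = ++mark;
        } else {
            return false;
        }
    }
    // Only trailing '*' may remain in the pattern.
    while (p < pattern.size() && pattern[p] == '*') {
        ++p;
    }
    return p == pattern.size();
}
```

For instance, the pattern "conv*" would select a layer named "Convolution_12", and "*relu?" would select "block1/ReLU3", while "conv?" would not match "convolution".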
* Fix bug
1) reshape w/ fused primitive should not be optimized out
2) Wrong usage of slice mem / concat mem in loop
3) LWS not set in lstm_elt
* Added unittest