* Tensor accessor for shape inference
- As a functor for getting data from a tensor vector or map.
- As a lambda in the GPU plugin for the Tile op.
* Make tensor data adapter pure virtual
- The data accessor function returns a pointer to the interface
* Refactor tensor data accessor and adapter
* Extract the memory adapter and make it internal to the GPU graph
- It cannot be part of the GPU runtime memory, as the core dev API is not visible there
* Extend IStaticShapeInfer with a port map
- update the factory map for the new infer interface with port map information
- add a bit utility to generate bit masks and use it in PortMask
* Pass the tensor accessor by reference, not as a function object
- Add cldnn data adapter and accessor
- Reduce dynamic allocations in data accessors
* Fix compilation issues
* Use ov::Tensor for data accessor
- remove data adapters, as they are not required
* Update comments
* Fix build issues
* Fix tile shape infer test
* Add an empty (null) tensor accessor as a specialization
* Apply style formatting
* Move data accessor from dev API to shape inference
* Fix linking issues
* Add a reorder with the user's output data type for Assign
Signed-off-by: Andrew Park <andrew.park@intel.com>
* Fix incorrect input index for handling leftovers
Signed-off-by: Andrew Park <andrew.park@intel.com>
* Add TCs for ov_gpu_unit_tests
Signed-off-by: Andrew Park <andrew.park@intel.com>
---------
Signed-off-by: Andrew Park <andrew.park@intel.com>
* [GPU] Improve dump naming rule in debug feature.
The following dump naming rules are now supported:
- Exec_graph name
- Wildcard characters for target names ('*', '?')
- Case-insensitive name searching
- Applied to showing loop body primitives.
Newly introduced OV_GPU_xxx options:
- OV_GPU_ListLayers = 1 (show layer names and exit)
- OV_GPU_VerboseColor = 1 (show verbose output with color)
Also add file, line, and function to the log prefix.
Signed-off-by: hyunback <hyunback.kim@intel.com>
* Fix bugs
1) A reshape with a fused primitive should not be optimized out
2) Wrong usage of slice memory / concat memory in Loop
3) LWS was not set in lstm_elt
* Added unit test
* Initial fix
* Add corresponding unit test
* Skip reorder fusing when the sibling node does not support fused padding
* fix data type of axis for win build
* Revert "fix data type of axis for win build"
This reverts commit 719ea75d7826aafc7bb94c1971586c33a9842f10.
* Add static casting for win build
* [GPU] Fix i8 representation error for clamp due to overflow
Signed-off-by: Andrew Park <andrew.park@intel.com>
* Fix to avoid inclusion in OCL code
Signed-off-by: Andrew Park <andrew.park@intel.com>
---------
Signed-off-by: Andrew Park <andrew.park@intel.com>
* [GPU] Fix levit-128s accuracy issue
Wrong batch dims for the fused eltwise of GEMM.
-> The issue is that an incorrect batch size was computed for the fused eltwise used by GEMM.
Its rank differs from the source tensor: the eltwise tensor rank was reduced by mistake.
It only reproduces with batch 1 and a full tensor.
"Batch size" here means all non-spatial dims, but the previous implementation used the default batch-dim role.
Signed-off-by: hyunback <hyunback.kim@intel.com>
* [GPU] Resolve failing unit tests on dGPU
+ Modified unit tests of asymmetric conv with per-channel (WA for a oneDNN issue)
+ Modified conv unit tests with padded input or output
+ For testing oneDNN conv, oneDNN needs to be queried about the format. Applied this to the conv tests.
+ Modified the accuracy-checking logic in unit tests that use a different format on dGPU.
+ A reorder from fsv16 to bfyx should not be optimized out if not aligned by 16
Signed-off-by: Min, Byungil <byungil.min@intel.com>
* [GPU] Fix dump_graph failure issue in levit-128s model.
1. to_string() in strided_slice always accesses the begin/end/stride param ids from dependencies,
regardless of the maximum number of dependencies.
2. Add exception handling in dump_full_node(). This helps to:
- Avoid a dump failure. Graph dumps are usually used during debugging,
so this reduces unnecessary debugging time caused by graph dump failures.
- Immediately see which node has failed, making it easy to find.
Signed-off-by: hyunback <hyunback.kim@intel.com>