+ Added is_padded_spatial to program_node
+ Added reorder to remove padded input in spatial axis for mvn
+ Applied only to blocked formats supported by the optimized MVN kernel
Signed-off-by: Min, Byungil <byungil.min@intel.com>
* updated to enqueue only fc for async build
* updated use_async_compilation(), make_task_executor_config() and disabled gemm_onednn.impl_replacement_with_cldnn
* added _num_async_build_threads
* added gemm to the async compilation targets
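As a rough illustration of the async-build idea above (not the actual OpenVINO task-executor API), kernel-compilation jobs for fc/gemm nodes can be handed to a small fixed-size pool whose size mirrors the added _num_async_build_threads field; all names here are hypothetical:

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// Hypothetical sketch of an async build executor: a fixed number of
// worker threads drain a queue of compilation jobs in the background.
class async_build_executor {
public:
    explicit async_build_executor(size_t num_threads) {
        for (size_t i = 0; i < num_threads; ++i) {
            workers_.emplace_back([this] {
                for (;;) {
                    std::function<void()> job;
                    {
                        std::unique_lock<std::mutex> lk(m_);
                        cv_.wait(lk, [this] { return stop_ || !jobs_.empty(); });
                        if (stop_ && jobs_.empty())
                            return;  // all pending jobs finished
                        job = std::move(jobs_.front());
                        jobs_.pop();
                    }
                    job();  // e.g. compile one fc/gemm kernel variant
                }
            });
        }
    }

    void enqueue(std::function<void()> job) {
        {
            std::lock_guard<std::mutex> lk(m_);
            jobs_.push(std::move(job));
        }
        cv_.notify_one();
    }

    ~async_build_executor() {
        {
            std::lock_guard<std::mutex> lk(m_);
            stop_ = true;
        }
        cv_.notify_all();
        for (auto& w : workers_)
            w.join();  // drain remaining jobs before shutdown
    }

private:
    std::vector<std::thread> workers_;
    std::queue<std::function<void()>> jobs_;
    std::mutex m_;
    std::condition_variable cv_;
    bool stop_ = false;
};
```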
* removed priorbox in mark_if_constant
* fix priorbox operation for dynamic shape
* restore shared test classes and disable the test cases
* add exception throw for PriorBoxClustered
* Allow StridedSlice as predecessor for in place concat
* Enable padding support for strided slice
Signed-off-by: Andrew Park <andrew.park@intel.com>
* Add prepare_buffer_fusing TC for ov_gpu_unit_tests
---------
Signed-off-by: Andrew Park <andrew.park@intel.com>
* Skip reorder at runtime if data type and format are not changed
* Update shape of the reorder user at the predecessor node so that the predecessor node's output can be allocated to host memory if needed
* Reinterpret reorder memory at runtime if needed
(e.g., input is fake-aligned fc and reorder uses that memory)
* Add debug config
* Fix CI test failure
* Do not skip after optimized reshape
* Do not skip the user reorder if it is an output, the current node is static, and the memory is allocated on the device
* Do not skip the user reorder if the current node has a fused node
* Update src/plugins/intel_gpu/src/graph/include/reorder_inst.h
Co-authored-by: Eddy Kim <eddy.kim@intel.com>
* Minor fix for compilation error
* Do not skip reorder if the reorder's user is optimizable concat
* Fix CI failures
* No need to wait for input_layout because the event is already resolved on dGPU
* Fixed corner case where only some of the multiple output layouts are static
---------
Co-authored-by: Eddy Kim <eddy.kim@intel.com>
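The runtime reorder-skip rules above can be condensed into one predicate. This is a simplified sketch with an illustrative layout type, not the real reorder_inst logic: a reorder is bypassable only when it changes neither data type nor format, and the listed exceptions (device-allocated output, fused nodes) force it to stay:

```cpp
#include <cassert>
#include <string>

// Hypothetical simplified layout descriptor; the real check operates on
// the plugin's layout objects in the reorder primitive's runtime path.
struct layout_desc {
    std::string data_type;  // e.g. "f16"
    std::string format;     // e.g. "bfyx"
};

// Sketch of the skip decision: a runtime reorder can be skipped when it
// changes neither data type nor format, unless it is a graph output with
// device-allocated memory or the node carries fused operations.
bool can_skip_reorder(const layout_desc& in, const layout_desc& out,
                      bool is_output_with_device_mem, bool has_fused_ops) {
    if (in.data_type != out.data_type || in.format != out.format)
        return false;  // reorder actually transforms the data
    if (is_output_with_device_mem || has_fused_ops)
        return false;  // exceptions listed in the commits above
    return true;
}
```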
* Do not reuse internal memory for dynamic shapes because of the current inefficiency in the pool
* Added a new debug config for dumping the runtime memory pool
* Apply DisableMemoryReuse for all usages
* Resolved perf issue of memory reuse from the pool: previously the original ibuf record was not released when new memory was allocated for that buffer.
After releasing the memory, the number of memory pool records no longer increases => memory pool retrieval is no longer inefficient.
* Added test
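A toy model of the pool fix described above (hypothetical names, not the actual memory_pool implementation): releasing the stale record for a buffer before registering its replacement keeps the record list from growing under repeated reallocation:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical toy memory pool. Records are kept in a flat list, since a
// real pool may track several blocks; the fix is that allocate() first
// erases the old record for the same buffer id, so reallocating a
// growing dynamic-shape buffer leaves exactly one record behind.
class toy_memory_pool {
public:
    void allocate(const std::string& id, std::size_t bytes) {
        // The fix: release the stale record for this id before adding
        // the new one, instead of accumulating both.
        records_.erase(std::remove_if(records_.begin(), records_.end(),
                           [&](const std::pair<std::string, std::size_t>& r) {
                               return r.first == id;
                           }),
                       records_.end());
        records_.emplace_back(id, bytes);
    }

    std::size_t record_count() const { return records_.size(); }

private:
    std::vector<std::pair<std::string, std::size_t>> records_;
};
```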
* Extract axes normalization and validation in separate functions in Interpolate op
* Update resample primitive declaration
* Update output layout calculation for Interpolate v11
* Update Interpolate op builder
* Add a shared test instance for Interpolate from 11th opset
* Add basic tests for Interpolate from opset 11
* Add new resample types and appropriate flags in ParamsKey
* Replace map which holds axes and scales with two separate vectors in resample_params
* Add resample kernel implementation
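The extracted axes-normalization helper mentioned above might look roughly like this (illustrative signature, not the actual ov::op::v11::Interpolate API): negative axes are wrapped by the input rank, and out-of-range axes are rejected:

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Sketch of an axes-normalization + validation helper for Interpolate:
// map possibly-negative axes onto [0, rank) and validate the result.
std::vector<int64_t> normalize_axes(const std::vector<int64_t>& axes,
                                    int64_t rank) {
    std::vector<int64_t> result;
    result.reserve(axes.size());
    for (int64_t axis : axes) {
        // Negative axes count from the end, as in numpy-style indexing.
        int64_t normalized = axis < 0 ? axis + rank : axis;
        if (normalized < 0 || normalized >= rank)
            throw std::out_of_range("Interpolate axis out of range");
        result.push_back(normalized);
    }
    return result;
}
```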
* [dGPU] Enable user scratchpad mode.
* Reuse intermediate buffer.
* Add own id to the memory dependencies at the c-tor of program_node
+ Allocate intermediate memory with memory_pool::get_memory() function.
+ Assign scratchpad memory desc in load() function for onednn primitive serialization
* Allocate device mem for onednn scratchpad mem
* Modify the condition making batch interpretation true/false
  - When the user is a Convert for a Constant node and the tensor is 1D, set needBatchInterpretation to true
* Narrow down the range of the condition
* Merge the condition
* Add additional condition not to check self node
* Fix incomplete condition
* Check if all inputs to the binary eltwise are 1D
* Change code style
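The merged condition described in the commits above can be sketched as a single predicate (names and shape type are illustrative, not the actual transformation code):

```cpp
#include <cassert>
#include <vector>

// Hypothetical shape type: one entry per dimension.
using shape = std::vector<int>;

static bool is_1d(const shape& s) {
    return s.size() == 1;
}

// Sketch of the merged condition: batch interpretation is enabled only
// when the user is a Convert for a Constant node and every input of the
// consuming binary eltwise is 1D (the narrowed-down condition above).
bool need_batch_interpretation(bool user_is_convert_of_constant,
                               const std::vector<shape>& eltwise_inputs) {
    if (!user_is_convert_of_constant)
        return false;
    for (const auto& s : eltwise_inputs)
        if (!is_1d(s))
            return false;
    return true;
}
```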
* [GPU] Improvement for buffer dump
+ added OV_GPU_DumpLayersInput to support dumping input layers
+ added OV_GPU_DumpLayersRawBinary to make binary dumps
+ added OV_GPU_LoadDumpRawBinary to use a binary dump as input
+ binary dump naming rule: layername_datatype_tensor_format.bin
Signed-off-by: Min, Byungil <byungil.min@intel.com>
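The stated naming rule for raw binary dumps can be sketched as a small helper (the function name is illustrative; the real code sits behind the OV_GPU_Dump* debug options):

```cpp
#include <cassert>
#include <string>

// Sketch of the dump-file naming rule from the commit above:
// layername_datatype_tensor_format.bin
std::string make_dump_name(const std::string& layer, const std::string& dtype,
                           const std::string& tensor, const std::string& format) {
    return layer + "_" + dtype + "_" + tensor + "_" + format + ".bin";
}
```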
* Previously reorder / permute did not allocate memory at build time even though the shape has an upper bound
* Update src/plugins/intel_gpu/src/graph/permute.cpp
Co-authored-by: Sergey Shlyapnikov <Sergeishlyapnikov@gmail.com>
* Fix per review comments
---------
Co-authored-by: Sergey Shlyapnikov <Sergeishlyapnikov@gmail.com>