728 Commits

Author SHA1 Message Date
hyunback kim
a9cbccd829 Broadcast for post ops enable onednngemm (#16074)
* [GPU] Add data broadcasting for OneDNN binary ops for Gemm primitive
* Based on https://github.com/openvinotoolkit/openvino/pull/15790; enable onednn gemm to support multiple users and non-constant input.

--------

Signed-off-by: hyunback <hyunback.kim@intel.com>
Co-authored-by: Sergey Shlyapnikov <sergey.shlyapnikov@intel.com>
2023-03-08 13:55:51 +09:00
Roman Lyamin
681faadce3 [GPU] Added shape agnostic kernels for GatherElements and Tile (#15798)
* [GPU] Added shape agnostic kernel for GatherElements

* [GPU] Added shape agnostic kernel for Tile
2023-03-08 08:34:24 +04:00
Vladimir Paramuzov
a1eb76ad06 [GPU] Move is_local_block_io_supported WA to kernel selector (#15235) 2023-03-07 15:12:08 +04:00
Min, Byungil
87b18a21c1 [GPU] Optimize eltwise kernel for blocked format (#15717)
* [GPU] Optimize eltwise kernel for blocked format

+ Optimize eltwise_blocked_opt
+ Replace deprecated kernels with eltwise_blocked_opt
+ Remove eltwise_b_fs_yx_fsv16, b_fs_yx_fsv4 kernels
+ Add test-cases in eltwise_gpu_test

Signed-off-by: byungilm <byungil.min@intel.com>
2023-03-07 14:21:09 +09:00
Vladimir Paramuzov
eff0bce7e3 [GPU] Move some op parameters from node to primitive class (#16070)
* [GPU] Move parameters of conv and quantize primitive from node to primitive

---------

Co-authored-by: Eddy Kim <eddy.kim@intel.com>
2023-03-07 08:56:00 +04:00
Andrew Kwangwoong Park
7123e8879e [GPU] Added shape agnostic optimized SoftMax kernel (#15834)
* [GPU] Added shape agnostic optimized SoftMax kernel

Signed-off-by: Andrew Park <andrew.park@intel.com>

* Update SoftmaxKernelBaseBF::Validate policy for shape agnostic kernel

Signed-off-by: Andrew Park <andrew.park@intel.com>

* Add softmax_gpu_bf shape agnostic TC for ov_gpu_unit_tests

Signed-off-by: Andrew Park <andrew.park@intel.com>

* Fix failed TCs for ie-tests-linux-ubuntu20-gpu

Signed-off-by: Andrew Park <andrew.park@intel.com>

* Update to use stack array instead of global buffer

Signed-off-by: Andrew Park <andrew.park@intel.com>

* Remove global buffer usage completely

Signed-off-by: Andrew Park <andrew.park@intel.com>

* Add #undef directive

Signed-off-by: Andrew Park <andrew.park@intel.com>

---------

Signed-off-by: Andrew Park <andrew.park@intel.com>
2023-03-06 09:10:29 -08:00
Andrew Kwangwoong Park
4ce35fd851 [GPU] Minor fixes for dynamic model (#16075)
Signed-off-by: Andrew Park <andrew.park@intel.com>
2023-03-06 15:50:38 +04:00
Xiping Yan
8b66b35bf7 [CPU] Remove C4250 warning suppression and fix the corresponding warning. (#15966) 2023-03-06 12:43:53 +04:00
Xuejun Zhai
9b97235902 Xuejun/remove api in ov any (#15667)
* [Remove APIs] remove ov::any api  &

Signed-off-by: xuejun <Xuejun.Zhai@intel.com>

* [Remove APIs] remove ov::any api

Signed-off-by: xuejun <Xuejun.Zhai@intel.com>

* [Remove APIs] remove interfaces in ov::any  Base* operator->() & const Base* operator->()

Signed-off-by: xuejun <Xuejun.Zhai@intel.com>

* [Remove APIs] remove ov::any interfaces Base* get() & const Base* get()

Signed-off-by: xuejun <Xuejun.Zhai@intel.com>

* [Remove APIs] remove ov::any interfaces call(const Any& any) & dynamic_pointer_cast(const ::ov::Any& any) & static_pointer_cast(const ::ov::Any& any)

Signed-off-by: xuejun <Xuejun.Zhai@intel.com>

* [Remove APIs] fix code format issues in ov::any

Signed-off-by: xuejun <Xuejun.Zhai@intel.com>

* [Remove APIs] fix review issue

Signed-off-by: xuejun <xuejun.zhai@intel.com>

* [Remove APIs] clear code

Signed-off-by: xuejun <xuejun.zhai@intel.com>

* [Remove APIs] fix review issue

Signed-off-by: xuejun <xuejun.zhai@intel.com>

* [Remove APIs] fix compiler issue

Signed-off-by: xuejun <xuejun.zhai@intel.com>

* [Remove APIs] fix compiler issue

Signed-off-by: xuejun <xuejun.zhai@intel.com>

* [Remove APIs] fix compiler issue

Signed-off-by: xuejun <xuejun.zhai@intel.com>

* Fix variant error

Signed-off-by: xuejun <xuejun.zhai@intel.com>

---------

Signed-off-by: xuejun <Xuejun.Zhai@intel.com>
Signed-off-by: xuejun <xuejun.zhai@intel.com>
2023-03-06 10:24:08 +04:00
Ilya Lavrenov
e1fbb7d768 Fixes for multi-config generators (#16097) 2023-03-05 10:46:53 +04:00
Ilya Lavrenov
9c4c559909 Fixed compilation on Debian 11 with gcc 12.2 (#16096) 2023-03-04 20:45:04 +04:00
Steve Yoo
a16f1923d7 Added recalculating processing order if it is not correct (#15987) 2023-03-02 14:40:15 -08:00
Kelvin Choi
6979c06ca1 [GPU] Support non constant input for Pad (#15697)
* [GPU] Support non constant input for Pad

* Refactor by comments
2023-03-02 10:38:43 -08:00
Ilya Lavrenov
4d925e0a3d Test GPU plugin arm64 build via Android precommit (#16055) 2023-03-02 21:06:36 +04:00
hyunback kim
cb7eeadd62 [GPU] Integration oneDNN3.1 (#15804)
* [GPU] Integration oneDNN3.1
* [GPU] Add os_iyx_osv8 format

Signed-off-by: hyunback <hyunback.kim@intel.com>
2023-03-03 00:18:42 +09:00
Ilya Lavrenov
0d798b7431 Building GPU plugin for Linux ARM64 (#16008)
* Building GPU plugin for ARM64

* changed order of headers

* Fixed clang-format
2023-03-02 12:43:33 +04:00
Roman Lyamin
24b0baa0d1 [GPU] Added support for mixed input formats for Select (#16009) 2023-03-02 09:19:02 +04:00
Vladimir Paramuzov
27ac7d9092 [GPU] backend independent code for fuse params in program_node (#16028) 2023-03-02 09:18:29 +04:00
Vladimir Paramuzov
c5c7e4ff65 [GPU] Cleanup tuning cache methods (#16000) 2023-03-01 16:30:47 +04:00
Vladimir Paramuzov
3de00347f3 [GPU] Code cleanup (#16014)
* [GPU] Improve exception message for program build

* [GPU] Code cleanup
2023-03-01 14:05:59 +04:00
Roman Lyamin
1070a3b6c1 [GPU] Added fp16 support for GatherTree (#15983) 2023-02-28 09:54:56 +04:00
Wilson Seok
93a1be3607 Skip set_selected_impl() of post_optimize_weight when target generic layer is already created (#15852) 2023-02-27 11:24:53 -08:00
Eddy Kim
d2a5be0ab8 enabled exec_graph and pc in deserialized model (#15975) 2023-02-27 10:14:04 -08:00
Andrew Kwangwoong Park
39e63ace67 [GPU] Minor fix for dynamic mobilebert (#15909)
Signed-off-by: Andrew Park <andrew.park@intel.com>
2023-02-25 20:22:44 -08:00
Taylor Yeonbok Lee
fabf67ee5e [GPU] Enable crop for shape agnostic kernel (#15866)
* Enable crop shape agnostic kernel

* Added unit test

* Added new scalar argument for crop (eltwise) to be used as runtime input offset in shape agnostic kernel

* Fix eltwise to have runtime offset only for crop

* Fix unittest error

* Applied review comment
2023-02-25 15:49:46 -08:00
Taylor Yeonbok Lee
9822568194 Fix build error in clang++ (#15948) 2023-02-25 06:48:12 +04:00
Andrew Kwangwoong Park
46e8aad4bb [GPU] Fix output format not changing at runtime (#15887)
* [GPU] Fix output format not changing at runtime

Signed-off-by: Andrew Park <andrew.park@intel.com>

* Add remove_redundant_reorders pass TC for ov_gpu_unit_tests

Signed-off-by: Andrew Park <andrew.park@intel.com>

---------

Signed-off-by: Andrew Park <andrew.park@intel.com>
2023-02-24 14:26:54 -08:00
Eddy Kim
30939f5021 updated to share constant data memories across multiple streams (#15915) 2023-02-24 14:26:10 -08:00
Ilya Lavrenov
6d7b94b8cd Improved API validator logic (#15942) 2023-02-25 01:11:50 +04:00
Paul Youngsoo Ahn
c1c8d6320e [GPU] Apply multi-threads for async compilation context (#15683)
* [GPU] Apply multi-threads for async compilation context (#15683)
- Use CPUStreamExecutor in compilation context
- Use single compilation context, impl_cache and kernels_cache for multiple streams
- Move compilation context to cldnn::program
- Move impl_cache to cldnn::program
- Create thread-safe impl_cache
- Create thread independent compilation function in kernels_cache
- Use kernels_cache in program and remove it from network

* [GPU] Fix segfault issue: ocl_engine and ocl_device are released while remaining compilation context tasks are still running (#15683)
- compilation context has own CPUStreamExecutor

* [GPU] Follow-up codereview (#15683)
- LruCacheThreadSafe inherit LruCache
- FuncRemoveItem has std::pair<Key,Value> as input
- Change prepare_tools to init_program

* [GPU] Create primitive_impl::build_kernels (#15683)

* [GPU] Fix unit test build error (#15683)

* [GPU] Remove redundant code (#15683)
- Remove try catch for debug
- Call compilation_context.cancel() in destructor of network

* [GPU] combine two atomic counter in kernels_cache (#15683)

* [GPU] Follow-up code review (#15683)

* [GPU] Fix nullptr exception in unit test (#15683)

* [GPU] Follow-up code review (#15683)
- Modify mutex lock in compilation context

* [GPU] Fix windows build issue (#15683)
2023-02-23 23:08:50 -08:00
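
Note on c1c8d6320e: the message mentions a thread-safe impl_cache where LruCacheThreadSafe inherits LruCache and the removal callback takes a key/value pair. The following is a minimal Python sketch of that general pattern only. It is not the cldnn C++ code; the method names (get, add), the on_remove parameter, and the choice to invoke the callback outside the lock are all assumptions made for illustration.

```python
import threading
from collections import OrderedDict


class LruCache:
    """Minimal least-recently-used cache (illustrative, not the cldnn class)."""

    def __init__(self, capacity):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self._capacity = capacity
        self._items = OrderedDict()

    def get(self, key, default=None):
        if key not in self._items:
            return default
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def add(self, key, value):
        """Insert or refresh an entry; return the evicted (key, value) pair or None."""
        if key in self._items:
            self._items.move_to_end(key)
        self._items[key] = value
        if len(self._items) > self._capacity:
            return self._items.popitem(last=False)  # drop least recently used
        return None

    def __len__(self):
        return len(self._items)


class LruCacheThreadSafe(LruCache):
    """Same cache, with every public operation guarded by one lock."""

    def __init__(self, capacity, on_remove=None):
        super().__init__(capacity)
        self._lock = threading.Lock()
        # Hypothetical callback; it receives the evicted (key, value) pair.
        self._on_remove = on_remove

    def get(self, key, default=None):
        with self._lock:
            return super().get(key, default)

    def add(self, key, value):
        with self._lock:
            evicted = super().add(key, value)
        # Assumption: run the callback outside the lock so it can safely
        # call back into the cache without deadlocking.
        if evicted is not None and self._on_remove is not None:
            self._on_remove(evicted)
        return evicted

    def __len__(self):
        with self._lock:
            return super().__len__()
```

With a capacity of 2, adding "a" and "b", reading "a", then adding "c" evicts "b", and the callback receives the pair ("b", 2). Sharing one such cache across streams is what makes the lock necessary.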
hyunback kim
be5f90199d [GPU] Add oneDNN FC preferred_format to bfyx (#15704)
Signed-off-by: hyunback <hyunback.kim@intel.com>
2023-02-24 15:19:54 +09:00
Eddy Kim
f562e96305 [GPU] Fallback to kernel caching in the case of dynamic models (#15842)
* use kernel caching for dynamic models

* replaced cl_cache with blob

* updated to serialize dims info of input and output

* updated to skip unicode tests in Windows
2023-02-23 22:05:16 -08:00
Dohyun Kim (Felix)
a4f0b340d0 [GPU] Resolve unit test not run as onednn (#15217) 2023-02-24 10:07:56 +09:00
Dohyun Kim (Felix)
f00fb325a6 [GPU][DG2] Disable remained failing tests (#15873) 2023-02-24 10:07:01 +09:00
Ilya Lavrenov
87bcbc1747 Supported OpenSUSE 15.3 (#15897) 2023-02-23 11:25:33 +04:00
Dohyun Kim (Felix)
1028c7b5d5 [GPU] Fix weight reorder bug (#15672) 2023-02-23 14:48:46 +09:00
Jade Cho
c749163f72 [GPU] Update unit tests for swap XY (#15833) 2023-02-23 14:38:10 +09:00
Dohyun Kim (Felix)
1f196bacd3 [GPU][DG2] Fix some testcases (#15774)
* C++ exception with description write lock_type thrown in the test body. 
   Use get_output_values_to_float()
   * fusings_gpu/gemm_2in_act_scale_quantize_eltwise_i8.basic/2
   * fusings_gpu/gemm_2in_act_scale_eltwise.basic/2
* Remove WA test code of [GPU][DG2] Fix fusings_gpu/gemm_2in_scale.basic/7 #15353
   * Now non-full-tensor post-ops are broadcast
2023-02-23 14:23:40 +09:00
Dohyun Kim (Felix)
ed65583957 [GPU] Fix OV_GPU_DumpGraphs option (#15800) 2023-02-23 14:10:21 +09:00
Taylor Yeonbok Lee
4fd38844a2 [GPU] Fix remote blob creation to use original shape (#15864)
* Fix remote blob creation to use original shape

* Revert "Fix remote blob creation to use original shape"

This reverts commit 35c674aa97.

* Fix cldnn tensor adjusted blob to be reinterpreted with actual input layout
2023-02-21 22:22:51 -08:00
Eddy Kim
a6ff809ad7 [GPU] Model caching unit tests (#15413)
* gpu model caching unit tests

* added serialization unit tests

* added save and load for quantize primitive_inst

* reduced the range of inputs for Gemm tests

* updated the copyright year
2023-02-22 05:53:43 +00:00
Konstantin Beluchenko
7f3f576151 [GPU] Permute 5d optimization (#14170) 2023-02-21 14:39:53 +09:00
Dohyun Kim (Felix)
b7bcef6864 [GPU] Improve OV_GPU_DumpLayers debug configuration (#15719)
Co-authored-by: Kim,SungEun <sungeun.kim@intel.com>
2023-02-19 14:57:19 +00:00
Ilya Lavrenov
1d5839fb92 Fixed compilation with clang (#15801) 2023-02-19 16:22:18 +04:00
Ilya Lavrenov
ed5fa69b41 Fixed compilation on CI (#15787) 2023-02-17 22:28:48 +04:00
Roman Lyamin
efb51b058c [GPU] Added operator== for cldnn primitives (#15736) 2023-02-17 19:09:12 +04:00
Xuejun Zhai
91df0a8aa9 [API remove] remove variantImpl & variantwrapper related class/interfaces (#15580)
* [API remove] remove variantImpl & variantwrapper related class/interfaces

Signed-off-by: xuejun <xuejun.zhai@intel.com>

* [Remove APIs] fix code format issue

Signed-off-by: xuejun <Xuejun.Zhai@intel.com>

* [Remove api] fix python compiler issue caused by deprecated variant

Signed-off-by: xuejun <Xuejun.Zhai@intel.com>

* [Remove APIs] fix code format issue

Signed-off-by: xuejun <xuejun.zhai@intel.com>

---------

Signed-off-by: xuejun <xuejun.zhai@intel.com>
Signed-off-by: xuejun <Xuejun.Zhai@intel.com>
2023-02-17 16:31:26 +04:00
Jade Cho
71cff0ae62 [GPU] Fix a bug of permute optimization (#15701)
* [GPU] Fix a bug of permute optimization

For int8 models, if there is FakeQuantize between permute and convolution, an operation like data type casting could be fused to permute. In this case, do not optimize permute.
2023-02-16 11:32:23 +00:00
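
Note on 71cff0ae62: the rule stated in the message (skip the permute optimization when an operation such as a data type cast has been fused into the permute) can be sketched as a simple guard. This is a hedged illustration only; the Node type, its fields, and the function name can_optimize_permute are hypothetical and do not correspond to the GPU plugin's actual classes.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical graph node: an op kind plus the ops fused into it."""
    kind: str
    fused_ops: list = field(default_factory=list)


def can_optimize_permute(permute, consumer):
    """Return True only when the permute feeds a convolution and carries no fused ops.

    Assumption: a fused op (for example a cast left behind by FakeQuantize in an
    int8 model) means the permute does real work and must not be optimized away.
    """
    if permute.kind != "permute" or consumer.kind != "convolution":
        return False
    return not permute.fused_ops
```

A plain permute in front of a convolution passes the guard, while a permute with a fused cast does not.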
Maciej Smyk
70cb829992 [DOCS] Move of developer documentation from wiki to md documents - master (#15372)
* CPU Plugin README creation

* debug capabilities

* Update debug_capabilities.md

* performance_analysis_ITT_counters

* cpu-emulation

* runtime_parameters_cache

* Update README.md

* internal_cpu_plugin_optimization

* See Also update for CPU Plugin

* See Also update for CPU Plugin 2

* intel_gpu

* Update README.md

* source code structure & See Also update for CPU plugin

* Update README.md

* See also update

* basic_data_structure

* memory_allocation_gpu_plugin

* Update memory_allocation_gpu_plugin.md

* simplified workflow

* graph optimization passes

* execution_of_inference

* GPU Plugin

* GPU Plugin fix

* Snippets

* Update README.md

* Update README.md

* fixes

* Snippets fix

* Update README.md

* component description

* Key Contacts

* Apply suggestions from code review

Co-authored-by: Ilya Churaev <ilyachur@gmail.com>

* Update src/plugins/intel_gpu/README.md

* Update src/plugins/intel_cpu/docs/internal_cpu_plugin_optimization.md

* Update src/plugins/intel_cpu/docs/internal_cpu_plugin_optimization.md

* Update src/plugins/intel_cpu/docs/internal_cpu_plugin_optimization.md

* Apply suggestions from code review

* Apply suggestions from code review

* Apply suggestions from code review

Text graphs to mermaid

* Update src/plugins/intel_gpu/docs/simplified_workflow.md

* Delete ov_intel_gpu_plugin_diagram.png

Removal of ov_intel_gpu_plugin_diagram.png file as the mermaid version is replacing it.

* Apply suggestions from code review

* Update src/common/snippets/README.md

---------

Co-authored-by: Sebastian Golebiewski <sebastianx.golebiewski@intel.com>
Co-authored-by: Ilya Churaev <ilyachur@gmail.com>
2023-02-16 11:03:11 +04:00
Taylor Yeonbok Lee
523b516835 [GPU] Support empty tensor (#15631)
* Support empty tensor in gpu plugin

* Common kernel setup for skipping

* Refactor

* Cleanup

* Fix for shape agnostic kernel

* Fix error due to memory allocation conflict for an empty input blob with other input blob

* Fix output blob parsing error

* Fixed quantize unittest error

* Fixed wrong TC

* Rename set_skip_kernels to update_kernels_list_to_skip

* Refactor output blob processing

* Applied review comments : more cleanup
2023-02-15 21:53:22 -08:00