647 Commits

Author SHA1 Message Date
yanlan song
72c3bf222b fix coredump when quit benchmark_app (#13026)
* fix coredump when quit benchmark_app

Signed-off-by: fishbell <bell.song@intel.com>

* enable tests

Signed-off-by: fishbell <bell.song@intel.com>

* add macro to handle CPU not built

Signed-off-by: fishbell <bell.song@intel.com>

Signed-off-by: fishbell <bell.song@intel.com>
2022-09-15 16:47:11 +08:00
Evgenya Stepyreva
af16ea1d79 Revert "Fix experimental detectron do ref impl (#10621)" (#12683) (#13009)
* Revert "Fix experimental detectron do ref impl (#10621)"

This reverts commit d87233863d.

* Disabled Experimental Detectron per agreement with GPU team. Ticket to fix it: 90209
2022-09-12 18:16:13 +04:00
Sergey Shlyapnikov
af29d221b4 [GPU] Add NV12 -> Grayscale mode support (#12988)
* [GPU] Add NV12 -> Grayscale mode support

* Fix uv plane shape
2022-09-09 19:00:37 +04:00
yanlan song
facf990dfd fix inconsistent tbb config due to executor used in multi (#12929)
* fix inconsistent tbb config due to executor used in multi

Signed-off-by: fishbell <bell.song@intel.com>

* refine comment

Signed-off-by: fishbell <bell.song@intel.com>

Signed-off-by: fishbell <bell.song@intel.com>
2022-09-08 13:34:22 +08:00
Mateusz Tabaka
41fa6f360b Explicitly link onednn with tbb for tbb version in [2018,2019.4] (#12789) (#12837)
Ticket: 89800
2022-08-31 17:14:54 +03:00
Gorokhov Dmitriy
a0b661a274 [CPU] Fixed MHA accuracy for mixed precision case (#12820) 2022-08-31 10:53:38 +04:00
Tomasz Dołbniak
72d7b518ca cltools update to 22.08 [2022/2] (#12690)
* cltools update to 22.08

* Hash update

* Hash update

* Adjustments for the new package
2022-08-26 15:28:40 +04:00
Sergey Shlyapnikov
41a404f290 [GPU] fix Transpose issue for ConvertColor with FakeQuantize. (#12645) (#12761)
Co-authored-by: Tang Wei <wei1.tang@intel.com>
Co-authored-by: Kurt Chen <kurt.chen@intel.com>
2022-08-26 12:29:21 +04:00
Sergey Shlyapnikov
429c7265df [GPU] Implement NMS-9 operation (#11890) (#12760)
* Fix GPU NonMaxSuppression implementation

* Introduce Nms9 single layer tests

* Adapt internal NMS and GPU implementation for NMS9 implementation

* Adapt CPU implementation in GPU for NMS9

* Add blocked layouts support to NMS

* Add unit tests for blocked formats for NMS

* Fix boxes groups size for the small shapes

* Use ocl implementation for blocked layout input

* Fix templates typedefs to pass win build

* Fix second output to set data in correct format

Co-authored-by: Tetiana Gubanova <tgubanova@lohika.com>
2022-08-26 00:37:20 +04:00
Sergey Shlyapnikov
a3f8cef198 [GPU] Shared memory optimization for network::execute_impl() call (#12748) 2022-08-25 15:49:56 +04:00
guozhong wang
f409e95768 do not remove cpu when bind buffer (#12556)
Co-authored-by: Shen, Wanglei <wanglei.shen@intel.com>
2022-08-25 09:05:42 +03:00
Chen Xu
1e5fec7e25 [CPU] Reduce node improve performance for nspc layout (#12671) 2022-08-24 15:39:55 +04:00
Luwei Zhou
aa1a607328 [CPU] Fix the strided slice issue when ellipsis_mask has redundant data. (#12705) 2022-08-24 09:43:08 +04:00
Andrei Kochin
f87e00398d updated to convert b_fs_yx_fsv16 to o_is_yx_isv16 (#12630) (#12675)
Co-authored-by: Eddy Kim <eddy.kim@intel.com>
2022-08-23 15:46:54 +03:00
Gorokhov Dmitriy
a6bfc0cf0e [CPU] Support MHA optimization (#12643)
* [CPU] Support MHA optimization

* [CPU] Extend pattern supported by MHA node

* [CPU] MHA: fixed int8 perf issue

Co-authored-by: Gu, Jianan <jianan.gu@intel.com>
2022-08-23 12:50:02 +04:00
yanlan song
4d9443eb0e do not call get_profiling in threads (#12635)
* do not call get_profiling in threads

Signed-off-by: fishbell <bell.song@intel.com>

* indent

Signed-off-by: fishbell <bell.song@intel.com>

Signed-off-by: fishbell <bell.song@intel.com>
Co-authored-by: Chen Peter <peter.chen@intel.com>
2022-08-23 13:50:52 +08:00
Luo Cheng
e03fbd5c15 [CPU] Default enable avx512 f32 brgconv (#12620) 2022-08-19 17:59:15 +04:00
Ilya Lavrenov
29628a89b7 Tbb port (#12541)
* Fixes for TBB 2018-2019.4

* Fixed CVS-89248
2022-08-15 06:26:47 +04:00
Mateusz Tabaka
c0212a361a [CPU] Add RDFT and IRDFT operators (#12290)
Tickets: 79178 and 79192

Co-authored-by: Mateusz Bencer <mateusz.bencer@intel.com>
2022-08-12 14:10:53 +02:00
Mateusz Bencer
e628fae196 [GPU] Decompose NormalizeL2 for not supported cases (#12404) 2022-08-11 11:32:03 +02:00
Min, Byungil
f0f6896fc0 [GPU] Fix network loading time related to onednn engine creation (#12492)
+ benchmark cache_dir option takes longer than cl_cache_dir env in loading network.
+ For clDNN execution, benchmark cache_dir created onednn_engine if just ONEDNN_ENABLE config is ON.
+ Creation of onednn_engine in ocl_engine is changed to on-demand.

Signed-off-by: Min, Byungil <byungil.min@intel.com>

Signed-off-by: Min, Byungil <byungil.min@intel.com>
2022-08-11 09:32:20 +04:00
River Li
d328b00e48 [CC]Fix CC issue for transformation (#12292) (#12489)
* Revert "Fixed 3 naming issue"

This reverts commit a92d3cfff5.

* Revert "Fix CC issues for transformation and snippets"

This reverts commit d08a3f5aac.

* Fix NGRAPH_PASS_CALLBACK issue so that it works

* Fix matcher name missing issue
2022-08-10 11:36:51 +04:00
Wilson Seok
1788c86943 change to node.weights() from weights_memory(0) (#12407) 2022-08-10 16:18:58 +09:00
Andrew Kwangwoong Park
ea302afb47 Update pre_replace_deconv to support output_shape for transposed conv (#12418)
Signed-off-by: Andrew Park <andrew.park@intel.com>
2022-08-10 10:37:51 +09:00
Ilya Churaev
c9f9795d29 Fixed newAPI for case if core was removed (#12208)
* Fixed newAPI for case if core was removed

* Fixed code style

* Fixed typo

* Use new API by default

* Create core with template plugin

* Added doxygen comment

Co-authored-by: Ilya Lavrenov <ilya.lavrenov@intel.com>
2022-07-23 11:53:26 +00:00
Kelvin Choi
3a72200f92 [GPU] Add reorder from i32 to f32 for max-pooling/conv/fc which doesn't support i32 (#12144) 2022-07-20 22:14:22 +09:00
Egor Duplenskii
fdae95a769 [CPU] Explicitly enable DNNL_VERBOSE only in case of CPU_DEBUG_CAPS (#12151)
and rely on oneDNN default behavior otherwise
2022-07-20 14:07:42 +04:00
Chenhu Wang
123f8e62bf [DOC][CPU] Denormals optimization document (#12132) 2022-07-18 16:37:44 +04:00
Taylor Yeonbok Lee
8c80f9ff58 [GPU] optimize permute_ref (#12160)
* change memory access pattern of fsv layout for permute

* Fix permute_ref to process F first only when (bf...) => (b...f)

* Refactor

Co-authored-by: si-eun-kim <sieun.kim@intel.com>
2022-07-18 18:26:00 +09:00
Eddy Kim
de5e9bb397 Revert "[GPU] Pass activation unit tests on DG2 (#11969)" (#12165)
This reverts commit 3334e8933c.
2022-07-18 18:25:45 +09:00
zihan wu
32f800c6a6 [CPU] polish onednn cc readme (#12114) (#12176) 2022-07-15 16:36:31 +00:00
Min, Byungil
b492f98d30 [GPU] modify fusing condition for reduce (#12147)
Signed-off-by: Min, Byungil <byungil.min@intel.com>
2022-07-15 16:07:43 +09:00
Andrew Kwangwoong Park
9c49b71c11 Enable tensor offset to GemmKernelRef for input padding support (#12140)
Signed-off-by: Andrew Park <andrew.park@intel.com>
2022-07-15 16:01:35 +09:00
Luo Cheng
4412e1ddfa [CPU] revert pr 11990 and enable brgconv avx512 on SPR by default (#12134) 2022-07-14 14:10:51 +04:00
Tingqian Li
b7b3f0ab4a move cpu_dump_check into CPU plugin's tools folder (#12123) 2022-07-13 13:38:17 +08:00
Paul Youngsoo Ahn
0621e8cf28 [GPU] Fix gather data type issue (#12089) (#12089) 2022-07-12 19:01:07 +09:00
Tomasz Dołbniak
9d6d84088f Virtual destructor for the base class (#12103) 2022-07-12 11:55:41 +02:00
Eddy Kim
a63dad6fdd updated to fuse activation in eltwise_vload8 (#12092) 2022-07-12 18:51:48 +09:00
Wang, Yang
bbc1c26750 setting tput as the default performance mode only for AUTO, excluding MULTI plugin. (#12090)
Signed-off-by: ywang2 <yang4.wang@intel.com>

Co-authored-by: Chen Peter <peter.chen@intel.com>
Co-authored-by: Shen, Wanglei <wanglei.shen@intel.com>
2022-07-10 15:16:59 +08:00
Eddy Kim
8d852b4aee fixed 'is_rotating_except_batch' to follow the IE order (#12050) 2022-07-08 15:36:17 +09:00
Tingqian Li
bc34fa0934 [CPU] Re-enable Selective build on oneDNN2.6 (#12074)
* update submodule onednn26 selective build

* onednn code review

* merge onednn selective build

* fix bug in cc onednn26

Co-authored-by: zihan wu <zihan.wu@intel.com>
2022-07-08 03:48:12 +00:00
guozhong wang
ab8c2f6fd8 change gpunum to 3 (#12073) 2022-07-07 18:15:27 +03:00
Andrew Kwangwoong Park
32937ab7ca Add Debug Config for maximum kernels per batch (#12068)
Signed-off-by: Andrew Park <andrew.park@intel.com>
2022-07-07 14:26:51 +03:00
guozhong wang
cd6c7da91c AUTO/MULTI supports ov::auto_batch_timeout (#12023)
* add auto_batch_timeout for MULTI and AUTO

* fix clang-format for ie_core.cpp

* fix coredump

* simplify insert key to deviceConfig logic and parseDeviceNameIntoConfig() check "AUTO" and "AUTO:" only

* check config auto_batch_timeout

* add CleanUpInIECore()

* fix clang-format for ie_core.cpp
2022-07-07 10:33:04 +00:00
Luwei Zhou
0224e6a067 Fix the deconv depwise post ops issue on AVX2 and AVX512 and enable deconv test (#11870)
* Fix the deconv fused issue on AVX2 and AVX512 and enable deconv test

* Keep GroupDeconv BF16 test cases still disabled.

* Update to also excluding nightly

* Update onednn submodule.

* Update onednn submodule

* Update onednn submodule.

* Update the ONEDNN submodule

* Update the ONEDNN commit.

* Update with merged onednn commit.
2022-07-07 13:26:44 +08:00
River Li
b80f724414 Fix rnn cache missing issue (#12053) 2022-07-06 11:20:27 +00:00
Kelvin Choi
63ab516c85 [GPU] Delete previous inputs by numbered new name for batching (#12045) 2022-07-06 16:32:14 +09:00
yanlan song
e718e51a85 Bell/fix lifecycle coredump (#11934)
* enable binder schedule

Signed-off-by: fishbell <bell.song@intel.com>

* add cases

Signed-off-by: fishbell <bell.song@intel.com>

* refine

Signed-off-by: fishbell <bell.song@intel.com>

* fix build failure

Signed-off-by: fishbell <bell.song@intel.com>

* fix coredump

Signed-off-by: fishbell <bell.song@intel.com>

* do not return hw requests directly, potential issues

Signed-off-by: fishbell <bell.song@intel.com>

* fix bug

Signed-off-by: fishbell <bell.song@intel.com>

typo

Signed-off-by: fishbell <bell.song@intel.com>

* optimize memory

Signed-off-by: fishbell <bell.song@intel.com>

* hold the hw plugin

Signed-off-by: fishbell <bell.song@intel.com>

* Revert "hold the hw plugin"

This reverts commit 5b537f5b6f.

* apply the fix

Signed-off-by: fishbell <bell.song@intel.com>

apply the fix

Signed-off-by: fishbell <bell.song@intel.com>

* hold the plugin library for destructing tensor

Signed-off-by: fishbell <bell.song@intel.com>

* solve the virtual plugin Getblob life cycle issue

Signed-off-by: fishbell <bell.song@intel.com>

* remove log

Signed-off-by: fishbell <bell.song@intel.com>

* refine interface

Signed-off-by: fishbell <bell.song@intel.com>

* fix build failure

Signed-off-by: fishbell <bell.song@intel.com>

* fix for hetero plugin

Signed-off-by: fishbell <bell.song@intel.com>

* replace with vector

* enable life time tests for virtual plugins

Signed-off-by: fishbell <bell.song@intel.com>

rework cases due to vpux build issue

Signed-off-by: fishbell <bell.song@intel.com>

disable context test for now

Signed-off-by: fishbell <bell.song@intel.com>

Co-authored-by: Chen Peter <peter.chen@intel.com>
2022-07-06 05:21:17 +00:00
opoluektov-lohika
7a50ce2491 Coverity: fix issue with uninitialized members (#11996) 2022-07-05 23:55:53 +00:00
Mateusz Bencer
43c0c964b8 Added FoldSubgraphEmptyInputs transformation (#11957) 2022-07-05 19:38:46 +02:00