yanlan song
72c3bf222b
fix coredump when quit benchmark_app ( #13026 )
...
* fix coredump when quit benchmark_app
Signed-off-by: fishbell <bell.song@intel.com >
* enable tests
Signed-off-by: fishbell <bell.song@intel.com >
* add macro to handle CPU not built
Signed-off-by: fishbell <bell.song@intel.com >
Signed-off-by: fishbell <bell.song@intel.com >
2022-09-15 16:47:11 +08:00
Evgenya Stepyreva
af16ea1d79
Revert "Fix experimental detectron do ref impl ( #10621 )" ( #12683 ) ( #13009 )
...
* Revert "Fix experimental detectron do ref impl (#10621)"
This reverts commit d87233863d.
* Disabled Experimental Detectron per agreement with GPU team. Ticket to fix it: 90209
2022-09-12 18:16:13 +04:00
Sergey Shlyapnikov
af29d221b4
[GPU] Add NV12 -> Grayscale mode support ( #12988 )
...
* [GPU] Add NV12 -> Grayscale mode support
* Fix uv plane shape
2022-09-09 19:00:37 +04:00
yanlan song
facf990dfd
fix inconsistent tbb config due to executor used in multi ( #12929 )
...
* fix inconsistent tbb config due to executor used in multi
Signed-off-by: fishbell <bell.song@intel.com >
* refine comment
Signed-off-by: fishbell <bell.song@intel.com >
Signed-off-by: fishbell <bell.song@intel.com >
2022-09-08 13:34:22 +08:00
Mateusz Tabaka
41fa6f360b
Explicitly link onednn with tbb for tbb version in [2018,2019.4] ( #12789 ) ( #12837 )
...
Ticket: 89800
2022-08-31 17:14:54 +03:00
Gorokhov Dmitriy
a0b661a274
[CPU] Fixed MHA accuracy for mixed precision case ( #12820 )
2022-08-31 10:53:38 +04:00
Tomasz Dołbniak
72d7b518ca
cltools update to 22.08 [2022/2] ( #12690 )
...
* cltools update to 22.08
* Hash update
* Hash update
* Adjustments for the new package
2022-08-26 15:28:40 +04:00
Sergey Shlyapnikov
41a404f290
[GPU] fix Transpose issue for ConvertColor with FakeQuantize. ( #12645 ) ( #12761 )
...
Co-authored-by: Tang Wei <wei1.tang@intel.com >
Co-authored-by: Kurt Chen <kurt.chen@intel.com >
2022-08-26 12:29:21 +04:00
Sergey Shlyapnikov
429c7265df
[GPU] Implement NMS-9 operation ( #11890 ) ( #12760 )
...
* Fix GPU NonMaxSuppression implementation
* Introduce Nms9 single layer tests
* Adapt internal NMS and GPU implementation for NMS9 implementation
* Adapt CPU implementation in GPU for NMS9
* Add blocked layouts support to NMS
* Add unit tests for blocked formats for NMS
* Fix boxes groups size for the small shapes
* Use ocl implementation for blocked layout input
* Fix templates typedefs to pass win build
* Fix second output to set data in correct format
Co-authored-by: Tetiana Gubanova <tgubanova@lohika.com >
2022-08-26 00:37:20 +04:00
Sergey Shlyapnikov
a3f8cef198
[GPU] Shared memory optimization for network::execute_impl() call ( #12748 )
2022-08-25 15:49:56 +04:00
guozhong wang
f409e95768
do not remove cpu when bind buffer ( #12556 )
...
Co-authored-by: Shen, Wanglei <wanglei.shen@intel.com >
2022-08-25 09:05:42 +03:00
Chen Xu
1e5fec7e25
[CPU] Reduce node improve performance for nspc layout ( #12671 )
2022-08-24 15:39:55 +04:00
Luwei Zhou
aa1a607328
[CPU] Fix the strided slice issue when ellipsis_mask has redundant data. ( #12705 )
2022-08-24 09:43:08 +04:00
Andrei Kochin
f87e00398d
updated to convert b_fs_yx_fsv16 to o_is_yx_isv16 ( #12630 ) ( #12675 )
...
Co-authored-by: Eddy Kim <eddy.kim@intel.com >
2022-08-23 15:46:54 +03:00
Gorokhov Dmitriy
a6bfc0cf0e
[CPU] Support MHA optimization ( #12643 )
...
* [CPU] Support MHA optimization
* [CPU] Extend pattern supported by MHA node
* [CPU] MHA: fixed int8 perf issue
Co-authored-by: Gu, Jianan <jianan.gu@intel.com >
2022-08-23 12:50:02 +04:00
yanlan song
4d9443eb0e
do not call get_profiling in threads ( #12635 )
...
* do not call get_profiling in threads
Signed-off-by: fishbell <bell.song@intel.com >
* indent
Signed-off-by: fishbell <bell.song@intel.com >
Signed-off-by: fishbell <bell.song@intel.com >
Co-authored-by: Chen Peter <peter.chen@intel.com >
2022-08-23 13:50:52 +08:00
Luo Cheng
e03fbd5c15
[CPU] Default enable avx512 f32 brgconv ( #12620 )
2022-08-19 17:59:15 +04:00
Ilya Lavrenov
29628a89b7
Tbb port ( #12541 )
...
* Fixes for TBB 2018-2019.4
* Fixed CVS-89248
2022-08-15 06:26:47 +04:00
Mateusz Tabaka
c0212a361a
[CPU] Add RDFT and IRDFT operators ( #12290 )
...
Tickets: 79178 and 79192
Co-authored-by: Mateusz Bencer <mateusz.bencer@intel.com >
2022-08-12 14:10:53 +02:00
Mateusz Bencer
e628fae196
[GPU] Decompose NormalizeL2 for not supported cases ( #12404 )
2022-08-11 11:32:03 +02:00
Min, Byungil
f0f6896fc0
[GPU] Fix network loading time related to onednn engine creation ( #12492 )
...
+ benchmark cache_dir option takes longer than cl_cache_dir env in loading network.
+ For clDNN execution, benchmark cache_dir created onednn_engine if just ONEDNN_ENABLE config is ON.
+ Creation of onednn_engine in ocl_engine is changed to on-demand.
Signed-off-by: Min, Byungil <byungil.min@intel.com >
Signed-off-by: Min, Byungil <byungil.min@intel.com >
2022-08-11 09:32:20 +04:00
River Li
d328b00e48
[CC]Fix CC issue for transformation ( #12292 ) ( #12489 )
...
* Revert "Fixed 3 naming issue"
This reverts commit a92d3cfff5.
* Revert "Fix CC issues for transformation and snippets"
This reverts commit d08a3f5aac.
* Fix NGRAPH_PASS_CALLBACK issue so that it works
* Fix matcher name missing issue
2022-08-10 11:36:51 +04:00
Wilson Seok
1788c86943
change to node.weights() from weights_memory(0) ( #12407 )
2022-08-10 16:18:58 +09:00
Andrew Kwangwoong Park
ea302afb47
Update pre_replace_deconv to support output_shape for transposed conv ( #12418 )
...
Signed-off-by: Andrew Park <andrew.park@intel.com >
2022-08-10 10:37:51 +09:00
Ilya Churaev
c9f9795d29
Fixed newAPI for case if core was removed ( #12208 )
...
* Fixed newAPI for case if core was removed
* Fixed code style
* Fixed typo
* Use new API by default
* Create core with template plugin
* Added doxygen comment
Co-authored-by: Ilya Lavrenov <ilya.lavrenov@intel.com >
2022-07-23 11:53:26 +00:00
Kelvin Choi
3a72200f92
[GPU] Add reorder from i32 to f32 for max-pooling/conv/fc which doesn't support i32 ( #12144 )
2022-07-20 22:14:22 +09:00
Egor Duplenskii
fdae95a769
[CPU] Explicitly enable DNNL_VERBOSE only in case of CPU_DEBUG_CAPS ( #12151 )
...
and rely on oneDNN default behavior otherwise
2022-07-20 14:07:42 +04:00
Chenhu Wang
123f8e62bf
[DOC][CPU] Denormals optimization document ( #12132 )
2022-07-18 16:37:44 +04:00
Taylor Yeonbok Lee
8c80f9ff58
[GPU] optimize permute_ref ( #12160 )
...
* change memory access pattern of fsv layout for permute
* Fix permute_ref to process F first only when (bf...) => (b...f)
* Refactor
Co-authored-by: si-eun-kim <sieun.kim@intel.com >
2022-07-18 18:26:00 +09:00
Eddy Kim
de5e9bb397
Revert "[GPU] Pass activation unit tests on DG2 ( #11969 )" ( #12165 )
...
This reverts commit 3334e8933c.
2022-07-18 18:25:45 +09:00
zihan wu
32f800c6a6
[CPU] polish onednn cc readme ( #12114 ) ( #12176 )
2022-07-15 16:36:31 +00:00
Min, Byungil
b492f98d30
[GPU] modify fusing condition for reduce ( #12147 )
...
Signed-off-by: Min, Byungil <byungil.min@intel.com >
2022-07-15 16:07:43 +09:00
Andrew Kwangwoong Park
9c49b71c11
Enable tensor offset to GemmKernelRef for input padding support ( #12140 )
...
Signed-off-by: Andrew Park <andrew.park@intel.com >
2022-07-15 16:01:35 +09:00
Luo Cheng
4412e1ddfa
[CPU] revert pr 11990 and enable brgconv avx512 on SPR by default ( #12134 )
2022-07-14 14:10:51 +04:00
Tingqian Li
b7b3f0ab4a
move cpu_dump_check into CPU plugin's tools folder ( #12123 )
2022-07-13 13:38:17 +08:00
Paul Youngsoo Ahn
0621e8cf28
[GPU] Fix gather data type issue ( #12089 ) ( #12089 )
2022-07-12 19:01:07 +09:00
Tomasz Dołbniak
9d6d84088f
Virtual destructor for the base class ( #12103 )
2022-07-12 11:55:41 +02:00
Eddy Kim
a63dad6fdd
updated to fuse activation in eltwise_vload8 ( #12092 )
2022-07-12 18:51:48 +09:00
Wang, Yang
bbc1c26750
setting tput as the default performance mode only for AUTO, excluding MULTI plugin. ( #12090 )
...
Signed-off-by: ywang2 <yang4.wang@intel.com >
Co-authored-by: Chen Peter <peter.chen@intel.com >
Co-authored-by: Shen, Wanglei <wanglei.shen@intel.com >
2022-07-10 15:16:59 +08:00
Eddy Kim
8d852b4aee
fixed 'is_rotating_except_batch' to follow the IE order ( #12050 )
2022-07-08 15:36:17 +09:00
Tingqian Li
bc34fa0934
[CPU] Re-enable Selective build on oneDNN2.6 ( #12074 )
...
* update submodule onednn26 selective build
* onednn code review
* merge onednn selective build
* fix bug in cc onednn26
Co-authored-by: zihan wu <zihan.wu@intel.com >
2022-07-08 03:48:12 +00:00
guozhong wang
ab8c2f6fd8
change gpunum to 3 ( #12073 )
2022-07-07 18:15:27 +03:00
Andrew Kwangwoong Park
32937ab7ca
Add Debug Config for maximum kernels per batch ( #12068 )
...
Signed-off-by: Andrew Park <andrew.park@intel.com >
2022-07-07 14:26:51 +03:00
guozhong wang
cd6c7da91c
AUTO/MULTI supports ov::auto_batch_timeout ( #12023 )
...
* add auto_batch_timeout for MULTI and AUTO
* fix clang-format for ie_core.cpp
* fix coredump
* simplify insert key to deviceConfig logic and parseDeviceNameIntoConfig() check "AUTO" and "AUTO:" only
* check config auto_batch_timeout
* add CleanUpInIECore()
* fix clang-format for ie_core.cpp
2022-07-07 10:33:04 +00:00
Luwei Zhou
0224e6a067
Fix the deconv depwise post ops issue on AVX2 and AVX512 and enable deconv test ( #11870 )
...
* Fix the deconv fused issue on AVX2 and AVX512 and enable deconv test
* Keep GroupDeconv BF16 test cases still disabled.
* Update to also exclude nightly
* Update onednn submodule.
* Update onednn submodule
* Update onednn submodule.
* Update the ONEDNN submodule
* Update the ONEDNN commit.
* Update with merged onednn commit.
2022-07-07 13:26:44 +08:00
River Li
b80f724414
Fix rnn cache missing issue ( #12053 )
2022-07-06 11:20:27 +00:00
Kelvin Choi
63ab516c85
[GPU] Delete previous inputs by numbered new name for batching ( #12045 )
2022-07-06 16:32:14 +09:00
yanlan song
e718e51a85
Bell/fix lifecycle coredump ( #11934 )
...
* enable binder schedule
Signed-off-by: fishbell <bell.song@intel.com >
* add cases
Signed-off-by: fishbell <bell.song@intel.com >
* refine
Signed-off-by: fishbell <bell.song@intel.com >
* fix build failure
Signed-off-by: fishbell <bell.song@intel.com >
* fix coredump
Signed-off-by: fishbell <bell.song@intel.com >
* do not return hw requests directly, potential issues
Signed-off-by: fishbell <bell.song@intel.com >
* fix bug
Signed-off-by: fishbell <bell.song@intel.com >
typo
Signed-off-by: fishbell <bell.song@intel.com >
* optimize memory
Signed-off-by: fishbell <bell.song@intel.com >
* hold the hw plugin
Signed-off-by: fishbell <bell.song@intel.com >
* Revert "hold the hw plugin"
This reverts commit 5b537f5b6f.
* apply the fix
Signed-off-by: fishbell <bell.song@intel.com >
apply the fix
Signed-off-by: fishbell <bell.song@intel.com >
* hold the plugin library for destructing tensor
Signed-off-by: fishbell <bell.song@intel.com >
* solve the virtual plugin Getblob life cycle issue
Signed-off-by: fishbell <bell.song@intel.com >
* remove log
Signed-off-by: fishbell <bell.song@intel.com >
* refine interface
Signed-off-by: fishbell <bell.song@intel.com >
* fix build failure
Signed-off-by: fishbell <bell.song@intel.com >
* fix for hetero plugin
Signed-off-by: fishbell <bell.song@intel.com >
* replace with vector
* enable life time tests for virtual plugins
Signed-off-by: fishbell <bell.song@intel.com >
rework cases due to vpux build issue
Signed-off-by: fishbell <bell.song@intel.com >
disable context test for now
Signed-off-by: fishbell <bell.song@intel.com >
Co-authored-by: Chen Peter <peter.chen@intel.com >
2022-07-06 05:21:17 +00:00
opoluektov-lohika
7a50ce2491
Coverity: fix issue with uninitialized members ( #11996 )
2022-07-05 23:55:53 +00:00
Mateusz Bencer
43c0c964b8
Added FoldSubgraphEmptyInputs transformation ( #11957 )
2022-07-05 19:38:46 +02:00