Commit Graph

226 Commits

Author SHA1 Message Date
Evgenya Stepyreva
af16ea1d79 Revert "Fix experimental detectron do ref impl (#10621)" (#12683) (#13009)
* Revert "Fix experimental detectron do ref impl (#10621)"

This reverts commit d87233863d.

* Disabled Experimental Detectron per agreement with GPU team. Ticket to fix it: 90209
2022-09-12 18:16:13 +04:00
Mateusz Tabaka
41fa6f360b Explicitly link onednn with tbb for tbb version in [2018,2019.4] (#12789) (#12837)
Ticket: 89800
2022-08-31 17:14:54 +03:00
Gorokhov Dmitriy
a0b661a274 [CPU] Fixed MHA accuracy for mixed precision case (#12820) 2022-08-31 10:53:38 +04:00
Chen Xu
1e5fec7e25 [CPU] Reduce node improve performance for nspc layout (#12671) 2022-08-24 15:39:55 +04:00
Luwei Zhou
aa1a607328 [CPU] Fix the strided slice issue when ellipsis_mask has redundant data. (#12705) 2022-08-24 09:43:08 +04:00
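The strided-slice fix above concerns how an ellipsis in `ellipsis_mask` expands into full-range slices for the unnamed dimensions. A minimal sketch of that expansion in numpy terms (`expand_ellipsis` is a hypothetical helper for illustration, not plugin code):

```python
import numpy as np

def expand_ellipsis(index, rank):
    """Expand a single Ellipsis in an index tuple into full slices so the
    tuple covers all `rank` dimensions (mirrors what ellipsis_mask means)."""
    if Ellipsis not in index:
        return index
    pos = index.index(Ellipsis)
    n_missing = rank - (len(index) - 1)
    return index[:pos] + (slice(None),) * n_missing + index[pos + 1:]

a = np.arange(24).reshape(2, 3, 4)
expanded = expand_ellipsis((0, Ellipsis, 1), a.ndim)
# The expanded tuple selects exactly what the ellipsis form selects.
assert np.array_equal(a[(0, Ellipsis, 1)], a[expanded])
```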
Gorokhov Dmitriy
a6bfc0cf0e [CPU] Support MHA optimization (#12643)
* [CPU] Support MHA optimization

* [CPU] Extend pattern supported by MHA node

* [CPU] MHA: fixed int8 perf issue

Co-authored-by: Gu, Jianan <jianan.gu@intel.com>
2022-08-23 12:50:02 +04:00
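The MHA node fuses the multi-head attention pattern (MatMul → scale → Softmax → MatMul). A minimal numpy sketch of the computation being fused, assuming a single batch and standard scaled dot-product attention (illustrative only, not the plugin kernel):

```python
import numpy as np

def mha(q, k, v, num_heads):
    """Scaled dot-product multi-head attention: the pattern the MHA node
    matches and fuses (reference semantics, not the optimized kernel)."""
    seq, dim = q.shape
    hd = dim // num_heads
    out = np.empty_like(q)
    for h in range(num_heads):
        s = slice(h * hd, (h + 1) * hd)
        scores = q[:, s] @ k[:, s].T / np.sqrt(hd)   # MatMul + scale
        scores -= scores.max(axis=-1, keepdims=True)  # numerically stable softmax
        w = np.exp(scores)
        w /= w.sum(axis=-1, keepdims=True)
        out[:, s] = w @ v[:, s]                       # second MatMul
    return out

rng = np.random.default_rng(0)
q, k, v = (rng.standard_normal((4, 8)) for _ in range(3))
y = mha(q, k, v, num_heads=2)
```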
Luo Cheng
e03fbd5c15 [CPU] Default enable avx512 f32 brgconv (#12620) 2022-08-19 17:59:15 +04:00
Ilya Lavrenov
29628a89b7 Tbb port (#12541)
* Fixes for TBB 2018-2019.4

* Fixed CVS-89248
2022-08-15 06:26:47 +04:00
Mateusz Tabaka
c0212a361a [CPU] Add RDFT and IRDFT operators (#12290)
Tickets: 79178 and 79192

Co-authored-by: Mateusz Bencer <mateusz.bencer@intel.com>
2022-08-12 14:10:53 +02:00
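RDFT/IRDFT compute the DFT of real-valued input and return only the non-redundant half of the spectrum. numpy's `rfft`/`irfft` provide the same reference semantics, so the round-trip the two new operators implement can be illustrated directly:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])
spec = np.fft.rfft(x)                 # real DFT: n//2 + 1 complex bins
assert spec.shape == (3,)             # conjugate-symmetric half is dropped
back = np.fft.irfft(spec, n=len(x))   # inverse real DFT
assert np.allclose(back, x)
```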
River Li
d328b00e48 [CC]Fix CC issue for transformation (#12292) (#12489)
* Revert "Fixed 3 naming issue"

This reverts commit a92d3cfff5.

* Revert "Fix CC issues for transformation and snippets"

This reverts commit d08a3f5aac.

* Fix NGRAPH_PASS_CALLBACK issue so that it works


* Fix matcher name missing issue
2022-08-10 11:36:51 +04:00
Ilya Churaev
c9f9795d29 Fixed new API for the case when core was removed (#12208)
* Fixed new API for the case when core was removed

* Fixed code style

* Fixed typo

* Use new API by default

* Create core with template plugin

* Added doxygen comment

Co-authored-by: Ilya Lavrenov <ilya.lavrenov@intel.com>
2022-07-23 11:53:26 +00:00
Egor Duplenskii
fdae95a769 [CPU] Explicitly enable DNNL_VERBOSE only in case of CPU_DEBUG_CAPS (#12151)
and rely on oneDNN default behavior otherwise
2022-07-20 14:07:42 +04:00
Chenhu Wang
123f8e62bf [DOC][CPU] Denormals optimization document (#12132) 2022-07-18 16:37:44 +04:00
zihan wu
32f800c6a6 [CPU] polish onednn cc readme (#12114) (#12176) 2022-07-15 16:36:31 +00:00
Luo Cheng
4412e1ddfa [CPU] revert pr 11990 and enable brgconv avx512 on SPR by default (#12134) 2022-07-14 14:10:51 +04:00
Tingqian Li
b7b3f0ab4a move cpu_dump_check into CPU plugin's tools folder (#12123) 2022-07-13 13:38:17 +08:00
Tingqian Li
bc34fa0934 [CPU] Re-enable Selective build on oneDNN2.6 (#12074)
* update submodule onednn26 selective build

* onednn code review

* merge onednn selective build

* fix bug in cc onednn26

Co-authored-by: zihan wu <zihan.wu@intel.com>
2022-07-08 03:48:12 +00:00
Luwei Zhou
0224e6a067 Fix the deconv depthwise post-ops issue on AVX2 and AVX512 and enable deconv test (#11870)
* Fix the deconv fused issue on AVX2 and AVX512 and enable deconv test

* Keep GroupDeconv BF16 test cases still disabled.

* Update to also excluding nightly

* Update onednn submodule.

* Update onednn submodule

* Update onednn submodule.

* Update the oneDNN submodule.

* Update the oneDNN commit.

* Update with merged onednn commit.
2022-07-07 13:26:44 +08:00
River Li
b80f724414 Fix rnn cache missing issue (#12053) 2022-07-06 11:20:27 +00:00
Mateusz Bencer
43c0c964b8 Added FoldSubgraphEmptyInputs transformation (#11957) 2022-07-05 19:38:46 +02:00
Pawel Raasz
e1bcfeca9d Add SoftSign to CPU plugin (#12034) 2022-07-05 13:34:42 +02:00
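The SoftSign activation added above is defined as x / (1 + |x|): a smooth, bounded squashing function. A one-line numpy reference of the operator's semantics:

```python
import numpy as np

def softsign(x):
    """SoftSign activation: x / (1 + |x|), bounded in (-1, 1)."""
    return x / (1.0 + np.abs(x))

x = np.array([-10.0, 0.0, 10.0])
y = softsign(x)  # [-10/11, 0, 10/11]
```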
Chenhu Wang
8c152405ad [CPU] General denormals optimization (#11883)
* FTZ_and_DAZ_set_for_cpu

* remove DAZ

* fix

* extract to utils

* ie core changes to add denormals optimization as a property and enable it in benchmark_app

* enable brgcov from Luocheng patch

* add debug info

* enable_brgemm_on_avx512

* add python binding

* dlb test

* revert test code

* revert test code
2022-07-05 15:50:16 +08:00
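The FTZ (flush-to-zero) and DAZ (denormals-are-zero) modes mentioned in the commit above trade the slow subnormal range for speed. Python cannot toggle the x86 MXCSR bits directly, but the subnormal numbers involved are easy to exhibit (a sketch of the numeric effect, not the plugin's mechanism):

```python
import sys

smallest_normal = sys.float_info.min   # ~2.225e-308, smallest normal double
sub = smallest_normal / 4              # a subnormal (denormal) double
assert 0.0 < sub < smallest_normal     # nonzero, yet below the normal range
# With FTZ/DAZ enabled in hardware, such values are treated as 0.0,
# avoiding the slow microcoded path that subnormal arithmetic takes.
```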
Tingqian Li
3f9c6b2f3f [BUG fix] Reshape node: WA in-place failure case by mem-copy (#10828)
* Handle in-place failure cases in reshape node

* Disable inplace when non-const reshape connected to constant

* Add comment to reshape_inplace test

* move copy WA into execute() to cover more general in-place failure cases
2022-07-05 04:46:27 +00:00
Mang Guo
a571539107 Optimize FullyConnected FakeQuantize post-ops (#11819)
* Optimize FullyConnected FakeQuantize post-ops

* matmul bias fuse

* Add simplifyToScale for FakeQuantize and use it in FC and Conv.

* Add fakequantize documentation

* Update doc and fix accuracy issue

* Update doc

* Fix accuracy regression

* Generalize the judgment criteria for fake quantization with scale

* Update document

Co-authored-by: Zhang Yi3 <yi3.zhang@intel.com>
Co-authored-by: xuchen-intel <chen.xu@intel.com>
2022-07-05 09:39:42 +08:00
Luo Cheng
35ee842446 [CPU] [WA] Use config to enable brgconv f32 kernel (#11990)
* enable brgconv f32

* use config to enable brgconv f32

* do not init binary post-ops when brgemm is disabled

* change prop name for extensive

* use more general field

* fix review comments.
2022-07-05 07:14:40 +08:00
avoskoboinyk-lohika
88784c2b6f [CPU] Optimize NonZero operation (#11549)
* [CPU] Optimize NonZero operation

# Conflicts:
#	src/plugins/intel_cpu/src/nodes/non_zero.cpp

* [CPU] Rewrite NonZero implementation, so it will use generic ie_parallel API

* [CPU] NonZero operation: apply an additional optimization

* NonZero operation: add fallback code for inRank >= 6

* NonZero operation: apply review modifications

# Conflicts:
#	src/plugins/intel_cpu/src/nodes/non_zero.cpp

* NonZero operation: inShape.getDims().size() -> inRank

* NonZero operation: eliminate input array index calculation by slight modification of ie_parallel API

* Adjust ie_parallel.hpp style for clang-format

* Try to unbreak the build

* Move to parallel_nt and add a cache for nd loops to optimize more

* Add minimal size threshold for threading and reduce warning count

* Try to workaround linter errors

* One more try to unbreak cpplint build

Co-authored-by: Michal Lukaszewski <michal.lukaszewski@intel.com>
2022-07-04 10:52:18 +08:00
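The NonZero operation returns the coordinates of nonzero elements as a [rank, count] tensor. A minimal numpy reference showing the output layout the optimized kernel must produce:

```python
import numpy as np

def non_zero(x):
    """Reference NonZero: per-axis indices of nonzero elements, stacked
    into a [rank, count] array (what the optimized CPU kernel computes)."""
    return np.stack(np.nonzero(x))

x = np.array([[0, 1],
              [2, 0]])
coords = non_zero(x)
# Nonzeros at (0,1) and (1,0): row 0 holds axis-0 indices, row 1 axis-1.
assert coords.tolist() == [[0, 1], [1, 0]]
```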
Mang Guo
d22c429d0e [CPU] Remove vmaxps in store_vector. (#12005)
* Remove vmaxps in store_vector.
This instruction is not needed for dst_prc int8,
and it may lead to wrong results when denormals optimization is on.

* Add vpmaxsd if dst_prc is u8 or u16.
2022-07-02 13:22:05 +00:00
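The `vpmaxsd` added above clamps the lower bound before narrowing to an unsigned type, so negative intermediates saturate to 0 instead of wrapping. A numpy sketch of the saturating-store behavior for u8 (illustrative; `np.clip` clamps both bounds, while `vpmaxsd` handles only the lower one):

```python
import numpy as np

def store_u8(vals):
    """Saturating store to u8: clamp to [0, 255] before narrowing,
    so negative values become 0 rather than wrapping around."""
    return np.clip(np.asarray(vals), 0, 255).astype(np.uint8)

result = store_u8([-5, 100, 300])  # [0, 100, 255]
```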
Bo Liu
7834dba545 fix CPU plugin deformable conv node incorrect output with uneven dilations (#11940) 2022-06-29 18:14:30 +08:00
Chenhu Wang
1288706589 large_batch_opt (#11951) 2022-06-28 10:33:16 +08:00
opoluektov-lohika
8a21e4e062 [GPU] Implement ExperimentalDetectronDetectionOutput operation (#11772)
* ExperimentalDetectronDetectionOutput: refine sorting criteria for NMS stage

This is to ensure the operation produces stable predictable results across
the possible sorting algorithm implementations.
This property is useful for the operation testing.

* [GPU] Implement ExperimentalDetectronDetectionOutput operation

* [GPU] ExperimentalDetectronDetectionOutput: use vector types and operations in kernel

* Reformat changed files to make clang format checker happy

* [GPU] ExperimentalDetectronDetectionOutput: add another test case to the unit test

* [GPU] ExperimentalDetectronDetectionOutput: Add f16 test

* ExperimentalDetectronDetectionOutput: single-layer test: use all three outputs

* [GPU] ExperimentalDetectronDetectionOutput: increase single layer test coverage

More attribute permutations were added.
2022-06-27 23:11:03 +09:00
Chenhu Wang
95a297ed68 onednn_update (#11930) 2022-06-27 11:22:50 +00:00
Luwei Zhou
4be0c59505 Fix the NonZero child edge check. (#11963) 2022-06-27 10:43:38 +08:00
Roman Baranchuk
5cba0ae871 [CPU] GRN: dynamic shapes support (#11678) 2022-06-22 10:45:06 +08:00
Roman Baranchuk
dab9da25fa [CPU] Roll: dynamic shapes support (#11707) 2022-06-22 10:33:18 +08:00
Roman Baranchuk
44cecc8579 [CPU] ReverseSequence: dynamic shapes support (#11644) 2022-06-22 10:27:06 +08:00
Xiping Yan
870f84f19b Xp/maxnick in place fix 43602 (#11664)
* Convolution concat sum inplace conflict fix

* Minor refactoring.

* Rebase to OV2.0, build pass.

Signed-off-by: Yan, Xiping <xiping.yan@intel.com>

* Remove old file.
Rebase introduce this file by mistake.

Signed-off-by: Yan, Xiping <xiping.yan@intel.com>

* Move functional test for subgraph.

Signed-off-by: Yan, Xiping <xiping.yan@intel.com>

* Disable some crash test for continue to test others.

* Rename ConcatConvSumInPlaceTest to ReLuConcatConvSumInPlaceTest
fix ci crash issue.

Signed-off-by: Yan, Xiping <xiping.yan@intel.com>

* Revert "Disable some crash test for continue to test others."

This reverts commit f7a8677c002747b45e84f74672f76e2fdfc7ab22.

* Add const for inPlace.

Signed-off-by: Yan, Xiping <xiping.yan@intel.com>

* fix build issue, missing braces;

Co-authored-by: Maksim Kutakov <maksim.kutakov@intel.com>
2022-06-17 16:35:58 +08:00
Tingqian Li
2fec03024d Add signal stack management for AMX in linux python API (#11894)
* Add signal stack management for AMX in linux python API

* fix wording

* fix empty line

* add AT_MINSIGSTKSZ definition

* Fix misspelling and conditional compiling on __linux__
2022-06-16 20:17:05 +08:00
mei, yang
39981bf2b8 relax the class number check in paddle multiclass_nms op (#11857)
* relax the class number check in paddle multiclass_nms op

* relax checks in paddle multiclass_nms op
2022-06-16 11:29:15 +08:00
Luo Cheng
151d77062f [CPU] remove unused primitive (#11811)
* remove unused primitive

* update onednn commit
2022-06-14 06:19:05 +08:00
Luo Cheng
922e32e2f1 disable avx512 brgconv (#11866) 2022-06-13 17:10:42 +08:00
Luo Cheng
9fe27be1cb [CPU] Fix smoke_Conv_3D_FP32_fusingScaleShiftAndFakeQuantizePerChannel sporadic failure (#11813)
* fix smoke_Conv_3D_FP32_fusingScaleShiftAndFakeQuantizePerChannel sporadic failure

* rebase onednn
2022-06-13 15:29:20 +08:00
Luo Cheng
91216fef5a [CPU] Revert enable ReduceSum -> AvgPool transformation due to perf issues (#11865)
* disable ConvertReduceMeanToPooling

* recover testcase
2022-06-13 14:11:53 +08:00
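The reverted transformation rewrote a spatial ReduceMean as an AvgPool whose kernel covers the whole spatial area; the two are numerically equivalent, which is easy to check in numpy (the revert was motivated by performance, not correctness):

```python
import numpy as np

# NCHW tensor; reduce over the spatial axes H and W.
x = np.arange(2 * 3 * 4 * 4, dtype=np.float64).reshape(2, 3, 4, 4)

reduce_mean = x.mean(axis=(2, 3), keepdims=True)        # ReduceMean over H, W
avg_pool = x.sum(axis=(2, 3), keepdims=True) / (4 * 4)  # AvgPool, kernel = H x W

assert np.allclose(reduce_mean, avg_pool)
```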
River Li
0571124fd3 Fix CC issues for transformation and snippets (#11798)
* Fix CC issues for transformation and snippets

Matcher should be enabled if it was hit during analyze stage.

* Fixed 3 naming issue
2022-06-13 13:36:35 +08:00
Luo Cheng
c848e138f8 [CPU] cherry-pick: Fix possible data race when accessing global reorder list (#11829)
* [CPU] cherry-pick: Fix possible data race when accessing global reorder list

* rebase onednn
2022-06-13 13:11:53 +08:00
Luwei Zhou
0066ddbd22 Update onednn submodule hash to fix 3D deconv post-ops issue. (#11836) 2022-06-13 09:21:29 +08:00
Bo Liu
79d3fbe3c1 remove limitation usage of brgemm for 'FullyConnected' Node (#11783) 2022-06-10 10:19:41 +08:00
Chenhu Wang
604dc4589c [CPU] Deconvolution caching support (#11835)
* Deconvolution caching support

* get rid of deprecated name

Co-authored-by: mandrono <maxim.andronov@intel.com>
2022-06-10 10:17:59 +08:00
Chenhu Wang
e2e7417c2a load_store_emitters_optimization_and_apply_to_interpolate (#11742)
* load_store_emitters_opt_and_apply_to_interpolate

* zmm_zero_is_always_needed_on_all_platform
2022-06-10 10:17:29 +08:00
opoluektov-lohika
d87233863d Fix experimental detectron do ref impl (#10621) 2022-06-10 03:10:13 +03:00
Roman Baranchuk
0231637441 [CPU] GatherTree: dynamic shapes support (#11544) 2022-06-09 10:18:51 +08:00