Commit Graph

226 Commits

Author SHA1 Message Date
Evgenya Stepyreva
af16ea1d79 Revert "Fix experimental detectron do ref impl (#10621)" (#12683) (#13009)
* Revert "Fix experimental detectron do ref impl (#10621)"

This reverts commit d87233863d.

* Disabled Experimental Detectron per agreement with GPU team. Ticket to fix it: 90209
2022-09-12 18:16:13 +04:00
Mateusz Tabaka
41fa6f360b Explicitly link onednn with tbb for tbb version in [2018,2019.4] (#12789) (#12837)
Ticket: 89800
2022-08-31 17:14:54 +03:00
Gorokhov Dmitriy
a0b661a274 [CPU] Fixed MHA accuracy for mixed precision case (#12820) 2022-08-31 10:53:38 +04:00
Chen Xu
1e5fec7e25 [CPU] Reduce node improve performance for nspc layout (#12671) 2022-08-24 15:39:55 +04:00
Luwei Zhou
aa1a607328 [CPU] Fix the strided slice issue when ellipsis_mask has redundant data. (#12705) 2022-08-24 09:43:08 +04:00
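The strided-slice fix above concerns how an ellipsis in `ellipsis_mask` expands into full-range slices for the unnamed dimensions. A minimal sketch of that expansion in numpy terms (`expand_ellipsis` is a hypothetical helper for illustration, not plugin code):

```python
import numpy as np

def expand_ellipsis(index, rank):
    """Expand a single Ellipsis in an index tuple into full slices so the
    tuple covers all `rank` dimensions (mirrors what ellipsis_mask means)."""
    if Ellipsis not in index:
        return index
    pos = index.index(Ellipsis)
    n_missing = rank - (len(index) - 1)
    return index[:pos] + (slice(None),) * n_missing + index[pos + 1:]

a = np.arange(24).reshape(2, 3, 4)
expanded = expand_ellipsis((0, Ellipsis, 1), a.ndim)
# The expanded tuple selects exactly what the ellipsis form selects.
assert np.array_equal(a[(0, Ellipsis, 1)], a[expanded])
```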
Gorokhov Dmitriy
a6bfc0cf0e [CPU] Support MHA optimization (#12643)
* [CPU] Support MHA optimization

* [CPU] Extend pattern supported by MHA node

* [CPU] MHA: fixed int8 perf issue

Co-authored-by: Gu, Jianan <jianan.gu@intel.com>
2022-08-23 12:50:02 +04:00
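The MHA node fuses the multi-head attention pattern (MatMul → scale → Softmax → MatMul). A minimal numpy sketch of the computation being fused, assuming a single batch and standard scaled dot-product attention (illustrative only, not the plugin kernel):

```python
import numpy as np

def mha(q, k, v, num_heads):
    """Scaled dot-product multi-head attention: the pattern the MHA node
    matches and fuses (reference semantics, not the optimized kernel)."""
    seq, dim = q.shape
    hd = dim // num_heads
    out = np.empty_like(q)
    for h in range(num_heads):
        s = slice(h * hd, (h + 1) * hd)
        scores = q[:, s] @ k[:, s].T / np.sqrt(hd)   # MatMul + scale
        scores -= scores.max(axis=-1, keepdims=True)  # numerically stable softmax
        w = np.exp(scores)
        w /= w.sum(axis=-1, keepdims=True)
        out[:, s] = w @ v[:, s]                       # second MatMul
    return out

rng = np.random.default_rng(0)
q, k, v = (rng.standard_normal((4, 8)) for _ in range(3))
y = mha(q, k, v, num_heads=2)
```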
Luo Cheng
e03fbd5c15 [CPU] Default enable avx512 f32 brgconv (#12620) 2022-08-19 17:59:15 +04:00
Ilya Lavrenov
29628a89b7 Tbb port (#12541)
* Fixes for TBB 2018-2019.4

* Fixed CVS-89248
2022-08-15 06:26:47 +04:00
Mateusz Tabaka
c0212a361a [CPU] Add RDFT and IRDFT operators (#12290)
Tickets: 79178 and 79192

Co-authored-by: Mateusz Bencer <mateusz.bencer@intel.com>
2022-08-12 14:10:53 +02:00
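RDFT/IRDFT compute the DFT of real-valued input and return only the non-redundant half of the spectrum. numpy's `rfft`/`irfft` provide the same reference semantics, so the round-trip the two new operators implement can be illustrated directly:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])
spec = np.fft.rfft(x)                 # real DFT: n//2 + 1 complex bins
assert spec.shape == (3,)             # conjugate-symmetric half is dropped
back = np.fft.irfft(spec, n=len(x))   # inverse real DFT
assert np.allclose(back, x)
```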
River Li
d328b00e48 [CC]Fix CC issue for transformation (#12292) (#12489)
* Revert "Fixed 3 naming issue"

This reverts commit a92d3cfff5.

* Revert "Fix CC issues for transformation and snippets"

This reverts commit d08a3f5aac.

* Fix NGRAPH_PASS_CALLBACK issue so that it works


* Fix matcher name missing issue
2022-08-10 11:36:51 +04:00
Ilya Churaev
c9f9795d29 Fixed new API for the case when core was removed (#12208)
* Fixed new API for the case when core was removed

* Fixed code style

* Fixed typo

* Use new API by default

* Create core with template plugin

* Added doxygen comment

Co-authored-by: Ilya Lavrenov <ilya.lavrenov@intel.com>
2022-07-23 11:53:26 +00:00
Egor Duplenskii
fdae95a769 [CPU] Explicitly enable DNNL_VERBOSE only in case of CPU_DEBUG_CAPS (#12151)
and rely on oneDNN default behavior otherwise
2022-07-20 14:07:42 +04:00
Chenhu Wang
123f8e62bf [DOC][CPU] Denormals optimization document (#12132) 2022-07-18 16:37:44 +04:00
zihan wu
32f800c6a6 [CPU] polish onednn cc readme (#12114) (#12176) 2022-07-15 16:36:31 +00:00
Luo Cheng
4412e1ddfa [CPU] revert pr 11990 and enable brgconv avx512 on SPR by default (#12134) 2022-07-14 14:10:51 +04:00
Tingqian Li
b7b3f0ab4a move cpu_dump_check into CPU plugin's tools folder (#12123) 2022-07-13 13:38:17 +08:00
Tingqian Li
bc34fa0934 [CPU] Re-enable Selective build on oneDNN2.6 (#12074)
* update submodule onednn26 selective build

* onednn code review

* merge onednn selective build

* fix bug in cc onednn26

Co-authored-by: zihan wu <zihan.wu@intel.com>
2022-07-08 03:48:12 +00:00
Luwei Zhou
0224e6a067 Fix the deconv depthwise post-ops issue on AVX2 and AVX512 and enable deconv test (#11870)
* Fix the deconv fused issue on AVX2 and AVX512 and enable deconv test

* Keep GroupDeconv BF16 test cases still disabled.

* Update to also excluding nightly

* Update onednn submodule.

* Update onednn submodule

* Update onednn submodule.

* Update the oneDNN submodule.

* Update the oneDNN commit.

* Update with merged onednn commit.
2022-07-07 13:26:44 +08:00
River Li
b80f724414 Fix rnn cache missing issue (#12053) 2022-07-06 11:20:27 +00:00
Mateusz Bencer
43c0c964b8 Added FoldSubgraphEmptyInputs transformation (#11957) 2022-07-05 19:38:46 +02:00
Pawel Raasz
e1bcfeca9d Add SoftSign to CPU plugin (#12034) 2022-07-05 13:34:42 +02:00
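The SoftSign activation added above is defined as x / (1 + |x|): a smooth, bounded squashing function. A one-line numpy reference of the operator's semantics:

```python
import numpy as np

def softsign(x):
    """SoftSign activation: x / (1 + |x|), bounded in (-1, 1)."""
    return x / (1.0 + np.abs(x))

x = np.array([-10.0, 0.0, 10.0])
y = softsign(x)  # [-10/11, 0, 10/11]
```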
Chenhu Wang
8c152405ad [CPU] General denormals optimization (#11883)
* FTZ_and_DAZ_set_for_cpu

* remove DAZ

* fix

* extract to utils

* ie core changes to add denormals optimization as a property and enable it in benchmark_app

* enable brgcov from Luocheng patch

* add debug info

* enable_brgemm_on_avx512

* add python binding

* dlb test

* revert test code

* revert test code
2022-07-05 15:50:16 +08:00
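The FTZ (flush-to-zero) and DAZ (denormals-are-zero) modes mentioned in the commit above trade the slow subnormal range for speed. Python cannot toggle the x86 MXCSR bits directly, but the subnormal numbers involved are easy to exhibit (a sketch of the numeric effect, not the plugin's mechanism):

```python
import sys

smallest_normal = sys.float_info.min   # ~2.225e-308, smallest normal double
sub = smallest_normal / 4              # a subnormal (denormal) double
assert 0.0 < sub < smallest_normal     # nonzero, yet below the normal range
# With FTZ/DAZ enabled in hardware, such values are treated as 0.0,
# avoiding the slow microcoded path that subnormal arithmetic takes.
```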
Tingqian Li
3f9c6b2f3f [BUG fix] Reshape node: WA in-place failure case by mem-copy (#10828)
* Handle in-place failure cases in reshape node

* Disable inplace when non-const reshape connected to constant

* Add comment to reshape_inplace test

* move copy WA into execute() to cover more general in-place failure cases
2022-07-05 04:46:27 +00:00
Mang Guo
a571539107 Optimize FullyConnected FakeQuantize post-ops (#11819)
* Optimize FullyConnected FakeQuantize post-ops

* matmul bias fuse

* Add simplifyToScale for FakeQuantize and use it in FC and Conv.

* Add fakequantize documentation

* Update doc and fix accuracy issue

* Update doc

* Fix accuracy regression

* Generalize the judgment criteria for fake quantization with scale

* Update document

Co-authored-by: Zhang Yi3 <yi3.zhang@intel.com>
Co-authored-by: xuchen-intel <chen.xu@intel.com>
2022-07-05 09:39:42 +08:00
Luo Cheng
35ee842446 [CPU] [WA] Use config to enable brgconv f32 kernel (#11990)
* enable brgconv f32

* use config to enable brgconv f32

* do not init binary post-ops when brgemm is disabled

* change prop name for extensive

* use more general field

* fix review comments.
2022-07-05 07:14:40 +08:00
avoskoboinyk-lohika
88784c2b6f [CPU] Optimize NonZero operation (#11549)
* [CPU] Optimize NonZero operation

# Conflicts:
#	src/plugins/intel_cpu/src/nodes/non_zero.cpp

* [CPU] Rewrite NonZero implementation, so it will use generic ie_parallel API

* [CPU] NonZero operation: apply an additional optimization

* NonZero operation: add fallback code for inRank >= 6

* NonZero operation: apply review modifications

# Conflicts:
#	src/plugins/intel_cpu/src/nodes/non_zero.cpp

* NonZero operation: inShape.getDims().size() -> inRank

* NonZero operation: eliminate input array index calculation by slight modification of ie_parallel API

* Adjust ie_parallel.hpp style for clang-format

* Try to unbreak the build

* Move to parallel_nt and add a cache for nd loops to optimize more

* Add minimal size threshold for threading and reduce warning count

* Try to workaround linter errors

* One more try to unbreak cpplint build

Co-authored-by: Michal Lukaszewski <michal.lukaszewski@intel.com>
2022-07-04 10:52:18 +08:00
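The NonZero operation returns the coordinates of nonzero elements as a [rank, count] tensor. A minimal numpy reference showing the output layout the optimized kernel must produce:

```python
import numpy as np

def non_zero(x):
    """Reference NonZero: per-axis indices of nonzero elements, stacked
    into a [rank, count] array (what the optimized CPU kernel computes)."""
    return np.stack(np.nonzero(x))

x = np.array([[0, 1],
              [2, 0]])
coords = non_zero(x)
# Nonzeros at (0,1) and (1,0): row 0 holds axis-0 indices, row 1 axis-1.
assert coords.tolist() == [[0, 1], [1, 0]]
```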
Mang Guo
d22c429d0e [CPU] Remove vmaxps in store_vector. (#12005)
* Remove vmaxps in store_vector.
This instruction is not needed for dst_prc int8,
and it may lead to wrong results when denormals optimization is on.

* Add vpmaxsd if dst_prc is u8 or u16.
2022-07-02 13:22:05 +00:00
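The `vpmaxsd` added above clamps the lower bound before narrowing to an unsigned type, so negative intermediates saturate to 0 instead of wrapping. A numpy sketch of the saturating-store behavior for u8 (illustrative; `np.clip` clamps both bounds, while `vpmaxsd` handles only the lower one):

```python
import numpy as np

def store_u8(vals):
    """Saturating store to u8: clamp to [0, 255] before narrowing,
    so negative values become 0 rather than wrapping around."""
    return np.clip(np.asarray(vals), 0, 255).astype(np.uint8)

result = store_u8([-5, 100, 300])  # [0, 100, 255]
```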
Bo Liu
7834dba545 fix CPU plugin deformable conv node incorrect output with uneven dilations (#11940) 2022-06-29 18:14:30 +08:00
Chenhu Wang
1288706589 large_batch_opt (#11951) 2022-06-28 10:33:16 +08:00
opoluektov-lohika
8a21e4e062 [GPU] Implement ExperimentalDetectronDetectionOutput operation (#11772)
* ExperimentalDetectronDetectionOutput: refine sorting criteria for NMS stage

This is to ensure the operation produces stable predictable results across
the possible sorting algorithm implementations.
This property is useful for the operation testing.

* [GPU] Implement ExperimentalDetectronDetectionOutput operation

* [GPU] ExperimentalDetectronDetectionOutput: use vector types and operations in kernel

* Reformat changed files to make clang format checker happy

* [GPU] ExperimentalDetectronDetectionOutput: add another test case to the unit test

* [GPU] ExperimentalDetectronDetectionOutput: Add f16 test

* ExperimentalDetectronDetectionOutput: single-layer test: use all three outputs

* [GPU] ExperimentalDetectronDetectionOutput: increase single layer test coverage

More attribute permutations were added.
2022-06-27 23:11:03 +09:00
Chenhu Wang
95a297ed68 onednn_update (#11930) 2022-06-27 11:22:50 +00:00
Luwei Zhou
4be0c59505 Fix the NonZero child edge check. (#11963) 2022-06-27 10:43:38 +08:00
Roman Baranchuk
5cba0ae871 [CPU] GRN: dynamic shapes support (#11678) 2022-06-22 10:45:06 +08:00
Roman Baranchuk
dab9da25fa [CPU] Roll: dynamic shapes support (#11707) 2022-06-22 10:33:18 +08:00
Roman Baranchuk
44cecc8579 [CPU] ReverseSequence: dynamic shapes support (#11644) 2022-06-22 10:27:06 +08:00
Xiping Yan
870f84f19b Xp/maxnick in place fix 43602 (#11664)
* Convolution concat sum inplace conflict fix

* Minor refactoring.

* Rebase to OV2.0, build pass.

Signed-off-by: Yan, Xiping <xiping.yan@intel.com>

* Remove old file.
Rebase introduce this file by mistake.

Signed-off-by: Yan, Xiping <xiping.yan@intel.com>

* Move functional test for subgraph.

Signed-off-by: Yan, Xiping <xiping.yan@intel.com>

* Disable some crash test for continue to test others.

* Rename ConcatConvSumInPlaceTest to ReLuConcatConvSumInPlaceTest
fix ci crash issue.

Signed-off-by: Yan, Xiping <xiping.yan@intel.com>

* Revert "Disable some crash test for continue to test others."

This reverts commit f7a8677c002747b45e84f74672f76e2fdfc7ab22.

* Add const for inPlace.

Signed-off-by: Yan, Xiping <xiping.yan@intel.com>

* fix build issue, missing braces;

Co-authored-by: Maksim Kutakov <maksim.kutakov@intel.com>
2022-06-17 16:35:58 +08:00
Tingqian Li
2fec03024d Add signal stack management for AMX in linux python API (#11894)
* Add signal stack management for AMX in linux python API

* fix wording

* fix empty line

* add AT_MINSIGSTKSZ definition

* Fix misspelling and conditional compiling on __linux__
2022-06-16 20:17:05 +08:00
mei, yang
39981bf2b8 relax the class number check in paddle multiclass_nms op (#11857)
* relax the class number check in paddle multiclass_nms op

* relax checks in paddle multiclass_nms op
2022-06-16 11:29:15 +08:00
Luo Cheng
151d77062f [CPU] remove unused primitive (#11811)
* remove unused primitive

* update onednn commit
2022-06-14 06:19:05 +08:00
Luo Cheng
922e32e2f1 disable avx512 brgconv (#11866) 2022-06-13 17:10:42 +08:00
Luo Cheng
9fe27be1cb [CPU] Fix smoke_Conv_3D_FP32_fusingScaleShiftAndFakeQuantizePerChannel sporadic failure (#11813)
* fix smoke_Conv_3D_FP32_fusingScaleShiftAndFakeQuantizePerChannel sporadic failure

* rebase onednn
2022-06-13 15:29:20 +08:00
Luo Cheng
91216fef5a [CPU] Revert enable ReduceSum -> AvgPool transformation due to perf issues (#11865)
* disable ConvertReduceMeanToPooling

* recover testcase
2022-06-13 14:11:53 +08:00
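The reverted transformation rewrote a spatial ReduceMean as an AvgPool whose kernel covers the whole spatial area; the two are numerically equivalent, which is easy to check in numpy (the revert was motivated by performance, not correctness):

```python
import numpy as np

# NCHW tensor; reduce over the spatial axes H and W.
x = np.arange(2 * 3 * 4 * 4, dtype=np.float64).reshape(2, 3, 4, 4)

reduce_mean = x.mean(axis=(2, 3), keepdims=True)        # ReduceMean over H, W
avg_pool = x.sum(axis=(2, 3), keepdims=True) / (4 * 4)  # AvgPool, kernel = H x W

assert np.allclose(reduce_mean, avg_pool)
```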
River Li
0571124fd3 Fix CC issues for transformation and snippets (#11798)
* Fix CC issues for transformation and snippets

Matcher should be enabled if it was hit during analyze stage.

* Fixed 3 naming issue
2022-06-13 13:36:35 +08:00
Luo Cheng
c848e138f8 [CPU] cherry-pick: Fix possible data race when accessing global reorder list (#11829)
* [CPU] cherry-pick: Fix possible data race when accessing global reorder list

* rebase onednn
2022-06-13 13:11:53 +08:00
Luwei Zhou
0066ddbd22 Update onednn submodule hash to fix 3D deconv post-ops issue. (#11836) 2022-06-13 09:21:29 +08:00
Bo Liu
79d3fbe3c1 remove limitation usage of brgemm for 'FullyConnected' Node (#11783) 2022-06-10 10:19:41 +08:00
Chenhu Wang
604dc4589c [CPU] Deconvolution caching support (#11835)
* Deconvolution caching support

* get rid of deprecated name

Co-authored-by: mandrono <maxim.andronov@intel.com>
2022-06-10 10:17:59 +08:00
Chenhu Wang
e2e7417c2a load_store_emitters_optimization_and_apply_to_interpolate (#11742)
* load_store_emitters_opt_and_apply_to_interpolate

* zmm_zero_is_always_needed_on_all_platform
2022-06-10 10:17:29 +08:00
opoluektov-lohika
d87233863d Fix experimental detectron do ref impl (#10621) 2022-06-10 03:10:13 +03:00
Roman Baranchuk
0231637441 [CPU] GatherTree: dynamic shapes support (#11544) 2022-06-09 10:18:51 +08:00