* Migrate to oneDNN 2.7
* [CPU] Enabled brgconv implementation
* Post ops optimizations
* [CPU] Enabled I8 precision on activations for Convolution node
* [CPU][WA] Disabled Deconvolution + post ops fusing optimization
* Fixed FQ post op optimization
* [CPU] Optimize post ops processing
* [WA] Add node name if tensor names are empty
* [WA] Remove layout compatibility check that leads to false-positive exceptions
* [CPU] Optimize processing for FQ + Sum + FQ post ops pattern
* [CPU][WA] Enabled ReduceSum -> AvgPool transformation due to perf issues
* Fix compiler error
* Rebase onto oneDNN master
* Cherry-pick from 2.7 to 2.6
* [WA] Make CPU case run to completion
* Fix xmm zero check
* Re-enable the 'FuseDeconvolutionAndSimpleOperation' transformation to fix the CPU 'ConvolutionBackpropDataLayerTest' failure
* [WR] Removed the failing ReduceMean tests caused by 21f3555
* Fix potential out-of-bounds memory access in group deconv
* [WA] Remove the MOC failing case caused by af4731a1
* Test case conv+maxpool checks brgconv instead of jit
* Added NHWC format check to subgraph test
* Fix gemm bf16 crash on Windows
* Fix AVX2 group conv accuracy problem
* [WA] Remove invalid FQ tests
* [WR] Disable the LPT multiplyToGroupConv test because the transformation was disabled in d5e16f
* Add gemm int8 binary post ops to fix GroupConvolutionQDqTransformation failure
* Fix gemm bf16 failure
* Fix ConcatConvSumInPlaceTest
* Add cpuDebugFuncTests target
* [WA] Fix bf16 crash due to MemoryInput/Output
* Fix OVClassBasicTest case typo
* Subgraph test case sets default ENFORCE_BF16 to NO
* Fix clang check
* Fix primType check issue
* Fix cpplint error
* MemoryInput/Output support bf16; enforce bf16 'NO' should enable snippets
* Disable BF16 FakeQuantize fusing test case
* Test case init supports AMX check
* Test cases for conv brgconv avx512/amx
* [WR] Work around enforced reorder bug and add NSPC to the deconv supported list
* Fix compilation issue
* [WA] Skip FakeQuantize fusing in bf16
* Mix legacy/new binary post ops
* Make nightly case run; tested on AMX/AVX512/AVX2
* [CPU] Add BF16 AMX test for Matmul
* Add CPU dump check tool
* Add verbose log
* Generate exec graph in CPU dump check tool
* Fix binary PReLU post ops
* Fix cpplint
* Update oneDNN version to fix AVX2 bug
* cpu_dump_check supports comparing dump files
* Add a new CPU_DEBUG_CAPS: OV_CPU_SUMMARY_PERF
* Change VERBOSE_LOG to DEBUG_LOG
* Fix oneDNN register_jit_code log
* Fix cpplint
* Add OV_CPU_DEBUG_LOG to control which debug logs to show
* Revert reorder WR
* Enhanced CPU debug logs and breakpoint support
* Enhanced cpu_dump_check with --ports
* Fix DEBUG_LOG compile issue
* Extend GroupDeconvolutionLayerCPUTest to add AMX test cases
* Add Node into DEBUG_LOG
* cpu_dump_check: dump results even when no port is specified
* Fix MergeTransposeAndReorder for blocked input
* Fix cpu_dump_check result names
* Enhance DEBUG_LOG on edges
* cpu_dump_check supports shape mismatch
* Fix bi-directional in-place
* cpu_dump_check supports inference_precision_hint f32
* Fix Windows dump failure
* Fix depthwise NWC conv
* Add rtol argument
* Add Windows debugbreak
* Fix pooling accuracy
* Remove invalid NSPC test param from GroupDeconvolutionLayerCPUTest
* Recover OV oneDNN fork
* Revert af4731a1f1 '[WA] remove layout compatibility check'
* [WA] Disable avx2 conv3d fusing case
* [WA] Disabled weights md transpose in FC to prevent perf degradations

Co-authored-by: dmitrygo <dmitry.gorokhov@intel.com>
Co-authored-by: Vladislav Golubev <vladislav.golubev@intel.com>
Co-authored-by: Zhang Yi3 <yi3.zhang@intel.com>
Co-authored-by: liubo-intel <bo4.liu@intel.com>
Co-authored-by: Luwei Zhou <luwei.zhou@intel.com>
Co-authored-by: Li, Tingqian <tingqian.li@intel.com>
Co-authored-by: xuchen-intel <chen.xu@intel.com>
Co-authored-by: ceciliapeng2011 <cecilia.peng@intel.com>
CPU Dump Check Tool
Compile the CPU plugin with -DENABLE_DEBUG_CAPS=ON; this tool then allows you to:
- dump each output tensor from the CPU plugin:
  python3 cpu_dump_check.py -m=/path/to/model dump1
- compare two dumps and analyze the differences:
  python3 cpu_dump_check.py -m=/path/to/model dump1 dump2
- visualize the first error map:
  python3 cpu_dump_check.py -m=/path/to/model dump1 dump2 -v
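The element-wise comparison the tool performs when diffing two dumps (with the `rtol` argument mentioned in the changelog) can be sketched roughly as follows. This is a minimal illustration, not the script's actual implementation: `compare_tensors` and the tolerance defaults are hypothetical, and the real tool additionally handles shape mismatches and renders an error map.

```python
def compare_tensors(ref, cur, rtol=1e-4, atol=1e-8):
    """Compare two flat lists of floats element-wise using the usual
    tolerance rule: |cur - ref| <= atol + rtol * |ref|.
    Returns (all_close, index_of_first_mismatch_or_None, max_abs_diff).
    Hypothetical sketch of a dump-diff check, not cpu_dump_check's API."""
    assert len(ref) == len(cur), "shape mismatch"
    first_bad = None
    max_diff = 0.0
    for i, (r, c) in enumerate(zip(ref, cur)):
        diff = abs(c - r)
        max_diff = max(max_diff, diff)
        if diff > atol + rtol * abs(r) and first_bad is None:
            first_bad = i  # remember the first failing element
    return first_bad is None, first_bad, max_diff

# Identical tensors pass; a perturbed element is flagged with its index
ref = [1.0, 2.0, 3.0]
print(compare_tensors(ref, [1.0, 2.0, 3.0]))  # (True, None, 0.0)
print(compare_tensors(ref, [1.0, 2.5, 3.0]))  # (False, 1, 0.5)
```

Reporting the index of the first mismatch alongside the maximum absolute difference is what makes a per-node dump diff actionable: it points at the exact element (and hence the node output) where the two runs diverge.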