* add _streams_info_table in Executor config
* change useHyperThreading init value
* restore cmake
* fix comments
* add calling enableCpuPinning property
* fix judgment about number of sockets in init_stream
* fix test case compile issue
* fix CI test case failure
* modify GetPerformanceStreams calling position
* add affinity in get_cpu_pinning
* modify E-core judgment
* do not bind cores on ADL
* fix ci issue, add get_num_numa_nodes()
* fix code style
* fix StreamsHasHigherPriority issue
* fix according to comments
* fix performance regression
* fix code style
* code style
* fix warning
* fix CI test failure
* fix ImportNetwork issue
* fix ci test case issue
* fix smoke_CachingSupportCase_CPU issue
* add ExportOptimalNumStreamsTest test
* modify test name
* modify ExportOptimalNumStreams test
---------
Co-authored-by: Chen Peter <peter.chen@intel.com>
* Update MULTI doc per current implementation
Signed-off-by: Peter Chen <peter.chen@intel.com>
* Update the description of Multi-Device execution mode
Co-authored-by: Karol Blaszczak <karol.blaszczak@intel.com>
* Remove sample code and video
1. Remove the sample code for removed behaviors
2. Remove the video to avoid confusion
Signed-off-by: Peter Chen <peter.chen@intel.com>
---------
Signed-off-by: Peter Chen <peter.chen@intel.com>
Co-authored-by: Karol Blaszczak <karol.blaszczak@intel.com>
* Intermediate state
* Remove old dyn batch path in the new api
* Remove legacy dyn batch support
* Remove dyn batch support field from the config
* Revert changes to the common part
* Revert accidental change in the test file
* Minor fixes
* Fix support for dyn batch without setting current
* Typo fix
* TypeRelaxed<>::clone_with_new_inputs thread safety fix
* Style
* Make TypeRelaxed<BaseOp>::clone_with_new_inputs copy node the same way as copy ctor of ov::Node
* Removed mutex field from intel_cpu::GraphContext
* Removed all about has_type_relaxed_ops field from the snippets subgraph
* Cloning test
* update auto architecture doc
* update auto architecture doc
* Apply suggestions from code review
Co-authored-by: Karol Blaszczak <karol.blaszczak@intel.com>
* update for comments
---------
Co-authored-by: Karol Blaszczak <karol.blaszczak@intel.com>
* [GPU] Fix levit-128s accuracy issue
Wrong batch dims for the eltwise fused into gemm.
-> The issue was that an incorrect batch size was derived for the fused eltwise used by gemm:
its rank differs from the source tensor because the eltwise tensor rank was reduced by mistake.
It reproduces only with batch 1 and a full tensor.
"Batch size" here means the product of all non-spatial dims, but the previous implementation used only the default batch dim.
Signed-off-by: hyunback <hyunback.kim@intel.com>
* use oneTBB for arm64
* force THREADING=TBB
* test: remove TBB_DIR for linux arm64
* update linux and mac arm64 packages
* update SHA256
* add comment
* disable add_rpath for tbb libraries on mac arm64
---------
Co-authored-by: Chen Peter <peter.chen@intel.com>
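The "force THREADING=TBB" change above corresponds to the build-time threading selection; a hedged example of the configure step (`THREADING` is the existing OpenVINO CMake option, the other flags and paths are placeholders):

```shell
# Force the (one)TBB threading backend, e.g. when building for arm64.
cmake -DTHREADING=TBB \
      -DCMAKE_BUILD_TYPE=Release \
      -S . -B build
```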
* [GPU] Resolve failing unit-tests on dGPU
+ Modified unit-tests of asymmetric conv with per-channel (WA for a oneDNN issue)
+ Modified conv unit-tests with padded input or output
+ Testing a oneDNN conv requires querying oneDNN for the format; applied this to the conv tests.
+ Modified the accuracy-checking logic in unit-tests that use a different format on dGPU.
+ A reorder from fsv16 to bfyx should not be optimized out if the feature count is not aligned to 16.
Signed-off-by: Min, Byungil <byungil.min@intel.com>