Quote: The Skylake microarchitecture implements a different state
machine than prior generations to manage the YMM state transition
associated with mixing SSE and AVX instructions.
It no longer saves the entire upper YMM state when executing
an SSE instruction when in “Modified and Unsaved” state,
but saves the upper bits of individual register.
As a result, mixing SSE and AVX instructions will experience
a penalty associated with partial register dependency of
the destination registers being used and additional blend
operation on the upper bits of the destination registers.
Penalties of this type have a large impact on OpenVINO's and oneDNN's kernels.
In short, mixing VEX and non-VEX instructions should be avoided.
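
A minimal sketch of one way to keep an AVX code path from leaving dirty
upper YMM state behind for non-VEX SSE code. It is an illustration only and
is not taken from OpenVINO or oneDNN: the names `sum_avx`, `sum_scalar` and
`sum_floats` are hypothetical, and `__attribute__((target("avx")))` and
`__builtin_cpu_supports` assume GCC or Clang on x86.

```cpp
#include <immintrin.h>
#include <cassert>
#include <cstddef>

// Hypothetical helper, not part of OpenVINO or oneDNN.
// target("avx") is a GCC/Clang extension: it lets this one function use
// VEX-encoded instructions without building the whole file with -mavx.
__attribute__((target("avx")))
static float sum_avx(const float* data, std::size_t n) {
    __m256 acc = _mm256_setzero_ps();
    std::size_t i = 0;
    // Accumulate eight floats per iteration in a 256-bit register.
    for (; i + 8 <= n; i += 8) {
        acc = _mm256_add_ps(acc, _mm256_loadu_ps(data + i));
    }
    float lanes[8];
    _mm256_storeu_ps(lanes, acc);
    // Clear the upper 128 bits of every YMM register so that non-VEX SSE
    // code running after this function starts from a clean upper state.
    _mm256_zeroupper();
    float total = 0.0f;
    for (float lane : lanes) total += lane;
    // Handle the tail that does not fill a full vector.
    for (; i < n; ++i) total += data[i];
    return total;
}

// Plain fallback for CPUs without AVX.
static float sum_scalar(const float* data, std::size_t n) {
    float total = 0.0f;
    for (std::size_t i = 0; i < n; ++i) total += data[i];
    return total;
}

// Dispatches at run time so the AVX path is only taken when supported.
float sum_floats(const float* data, std::size_t n) {
    assert(data != nullptr || n == 0);
    return __builtin_cpu_supports("avx") ? sum_avx(data, n)
                                         : sum_scalar(data, n);
}
```

Keeping all vector work inside a function compiled for AVX and issuing
`_mm256_zeroupper()` before returning is one common pattern; JIT-generated
kernels need the equivalent discipline at the instruction-emission level.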
* Implement bmp reader
* Use non-OS-specific functions
* Fix code style
* Move `i` declaration from `for` loop
Co-authored-by: Vladimir Dudnik <vladimir.dudnik@intel.com>
* Renamed ov::Function to ov::Model
* Fixed all for macos
* Fixed build
* Fixed build
* Revert changes in GPU plugin
* Fixed ngraphFunctions
* Fixed all for mac
* Fixed new test
* Fixed if for Windows
* Fixed unit tests and renamed Function in python API
* Fixed code style
* Fixed import
* Fixed conflict
* Fixed merge issues
* [CPU] Move to DetectionOutput-8
Signed-off-by: Roman Kazantsev <roman.kazantsev@intel.com>
* Fix build issue
Signed-off-by: Roman Kazantsev <roman.kazantsev@intel.com>
* Use ov namespace
Signed-off-by: Roman Kazantsev <roman.kazantsev@intel.com>
* Disable downgrading transformation in CPU explicitly
Signed-off-by: Roman Kazantsev <roman.kazantsev@intel.com>
* Correct functional layer tests
Signed-off-by: Roman Kazantsev <roman.kazantsev@intel.com>
* Revert "Correct functional layer tests"
This reverts commit 0428159fb8.
* Revert "Disable downgrading transformation in CPU explicitly"
This reverts commit 7cd0f48d5d.
* Correct upgrade transformation
Signed-off-by: Roman Kazantsev <roman.kazantsev@intel.com>
* Correct transformation tests
Signed-off-by: Roman Kazantsev <roman.kazantsev@intel.com>
* Disable downgrading transformation and enable upgrade on CPU
Signed-off-by: Roman Kazantsev <roman.kazantsev@intel.com>
* [GPU] Added more exception handling
* More compact exception
* Enable GPU_THROUGHPUT_AUTO for MAX_BATCH_SIZE option
* Changed the global variable for the default number of streams to a function
* Added ShuffleChannels operation into OpVersioning.opset_1_types set.
* Fixed the attribute 'version' of the MO operation ShuffleChannels.
* Reverted opset change for the operation ShuffleChannels.
* fix iteration count calculation for output record when start/end are negative and the division is non-integer
* added tests;
process case with both start/end
* test refactoring done
* Alignment of OV and ONNX models outputs naming
* Python tests adaptation to new naming rules
* New output naming rules
* Output name retrieval adaptation (tensor iterator node)
* Copying of tensor names during output replacement
* Multiout multinode subgraphs handling in the importer
* Proper replacement tensor naming
* Model zoo test runner adaptation
* Backwards compatible python tests runner adaptation
* If node adaptation
* Adaptation to changes in master
* Deprecation warning suppression
* Imports fix in compatibility tests
* If node adaptation to the new naming
* MaxPool python tests re-enabled
* ONNX Identity elimination adaptation
* XFAIL for the Identity op test
* Support for Param->Result models and identity op
* Fix of the ONNX Identity handling
* The test that fails only on windows temporarily disabled
* ONNX tensor names test adaptation
* Code cleanup
* Code formatting
* Obsolete helper removal
* One more spot where output name helper should be used
* PyApi fix for tensors with multiple names
* Don't set friendly names for unnamed ONNX nodes
* Revert "Don't set friendly names for unnamed ONNX nodes"
This reverts commit 92c7ac59b5.
* Missing dot...
* And now the mypy nonsense...
* Use get_any_name in Loop
* New way of naming result nodes in ONNX FE
* removed unnecessary PWL code & added tests to increase PWL code coverage
* fixed errors
* removed Configuration from PowerParamsTuple
* fixed a bug related to an incorrect PWL value for functions neglog and neghalflog
* Remove fp16 of Convert layer test from skip_tests.config.cpp as it works now
* update repo
* add op reference test of ExperimentalDetectronPriorGridGenerator
* implement actual_comparision_size for compare
* update slt for actual comparison size and add visitor api test
* fixed clang error