Contents:
- What is OpenVINO?
- Supported Hardware matrix
- License
- Documentation
- Tutorials
- Products which use OpenVINO
- System requirements
- How to build
- How to contribute
- Get support
- See also
What is OpenVINO toolkit?
OpenVINO™ is an open-source toolkit for optimizing and deploying AI inference.
- Boost deep learning performance in computer vision, automatic speech recognition, natural language processing and other common tasks
- Use models trained with popular frameworks like TensorFlow, PyTorch and more
- Reduce resource demands and efficiently deploy on a range of Intel® platforms from edge to cloud
This open-source version includes several components: namely the OpenVINO Model Converter (OVC), the OpenVINO™ Runtime, and the CPU, GPU, GNA, multi-device, and heterogeneous plugins that accelerate deep learning inference on Intel® CPUs and Intel® Processor Graphics. It supports pre-trained models from Open Model Zoo, along with 100+ open-source and public models in popular formats such as TensorFlow, ONNX, PaddlePaddle, MXNet, Caffe, and Kaldi.
Components
- OpenVINO™ Runtime - a set of C++ libraries with C and Python bindings providing a common API to deliver inference solutions on the platform of your choice.
- core - provides the base API for model representation and modification.
- inference - provides an API to infer models on the device.
- transformations - contains the set of common transformations which are used in OpenVINO plugins.
- low precision transformations - contains the set of transformations that are used in low precision models.
- bindings - contains all available OpenVINO bindings which are maintained by the OpenVINO team.
- Plugins - contains OpenVINO plugins which are maintained in open-source by the OpenVINO team. For more information, take a look at the list of supported devices.
- Frontends - contains available OpenVINO frontends that allow reading models from the native framework format.
- OpenVINO Model Converter (OVC) - a cross-platform command-line tool that facilitates the transition between training and deployment environments, and adjusts deep learning models for optimal execution on end-point target devices.
- Samples - applications in C, C++ and Python languages that show basic OpenVINO use cases.
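For example, converting a model to OpenVINO IR with OVC is a single command. This is a usage sketch: `model.onnx` is a placeholder for your own model file.

```shell
# Convert an ONNX model to OpenVINO IR (produces model.xml + model.bin).
# "model.onnx" is a placeholder path for your own model file.
ovc model.onnx --output_model model
```

The resulting IR can then be loaded with the Runtime's `read_model` API or passed to the sample applications.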
Supported Hardware matrix
The OpenVINO™ Runtime can infer models on different hardware devices. This section provides the list of supported devices.
| Device | Plugin | Library | Short Description |
|---|---|---|---|
| CPU | Intel CPU | openvino_intel_cpu_plugin | Intel Xeon with Intel® Advanced Vector Extensions 2 (Intel® AVX2), Intel® Advanced Vector Extensions 512 (Intel® AVX-512), and AVX512_BF16, Intel Core Processors with Intel AVX2, Intel Atom Processors with Intel® Streaming SIMD Extensions (Intel® SSE), Intel® Advanced Matrix Extensions (Intel® AMX) |
| ARM CPU | ARM CPU | openvino_arm_cpu_plugin | Raspberry Pi™ 4 Model B, Apple® Mac mini with Apple silicon |
| GPU | Intel GPU | openvino_intel_gpu_plugin | Intel Processor Graphics, including Intel HD Graphics and Intel Iris Graphics |
| GNA | Intel GNA | openvino_intel_gna_plugin | Intel Speech Enabling Developer Kit, Amazon Alexa* Premium Far-Field Developer Kit, Intel Pentium Silver J5005 Processor, Intel Pentium Silver N5000 Processor, Intel Celeron J4005 Processor, Intel Celeron J4105 Processor, Intel Celeron Processor N4100, Intel Celeron Processor N4000, Intel Core i3-8121U Processor, Intel Core i7-1065G7 Processor, Intel Core i7-1060G7 Processor, Intel Core i5-1035G4 Processor, Intel Core i5-1035G7 Processor, Intel Core i5-1035G1 Processor, Intel Core i5-1030G7 Processor, Intel Core i5-1030G4 Processor, Intel Core i3-1005G1 Processor, Intel Core i3-1000G1 Processor, Intel Core i3-1000G4 Processor |
OpenVINO™ Toolkit also contains several plugins which simplify loading models on several hardware devices:
| Plugin | Library | Short Description |
|---|---|---|
| Auto | openvino_auto_plugin | Auto plugin enables selecting Intel device for inference automatically |
| Auto Batch | openvino_auto_batch_plugin | Auto batch plugin performs on-the-fly automatic batching (i.e. grouping inference requests together) to improve device utilization, with no programming effort from the user |
| Hetero | openvino_hetero_plugin | Heterogeneous execution enables automatic inference splitting between several devices |
| Multi | openvino_auto_plugin | Multi plugin enables simultaneous inference of the same model on several devices in parallel |
License
OpenVINO™ Toolkit is licensed under Apache License Version 2.0. By contributing to the project, you agree to the license and copyright terms therein and release your contribution under these terms.
Telemetry
OpenVINO™ collects software performance and usage data for the purpose of improving OpenVINO™ tools. This data is collected directly by OpenVINO™ or through the use of Google Analytics 4. You can opt-out at any time by running the command:
opt_in_out --opt_out
More information is available at https://docs.openvino.ai/latest/openvino_docs_telemetry_information.html.
Documentation
User documentation
The latest documentation for the OpenVINO™ Toolkit is available here. This documentation contains detailed information about all OpenVINO components and provides everything you may need to create an application based on a binary OpenVINO distribution or on your own OpenVINO build without modifying the source code.
Developer documentation
Developer documentation covers the architectural decisions applied inside the OpenVINO components and contains everything needed to contribute to OpenVINO.
Tutorials
The list of OpenVINO tutorials:
Products which use OpenVINO
System requirements
The system requirements vary depending on platform and are available on dedicated pages:
How to build
See How to build OpenVINO to get more information about the OpenVINO build process.
How to contribute
See Contributions Welcome for good first issues.
See CONTRIBUTING for contribution details. Thank you!
Take the issue
If you wish to be assigned to an issue please add a comment with .take command.
Get support
Report questions, issues, and suggestions using:
- GitHub* Issues
- The `openvino` tag on StackOverflow*
- Forum
Additional Resources
- OpenVINO Wiki
- OpenVINO Storage
- Additional OpenVINO™ toolkit modules:
- Intel® Distribution of OpenVINO™ toolkit Product Page
- Intel® Distribution of OpenVINO™ toolkit Release Notes
- Neural Network Compression Framework (NNCF) - a suite of advanced algorithms for model inference optimization including quantization, filter pruning, binarization and sparsity
- OpenVINO™ Training Extensions (OTE) - a convenient environment to train deep learning models and convert them using OpenVINO™ for optimized inference.
- OpenVINO™ Model Server (OVMS) - a scalable, high-performance solution for serving deep learning models optimized for Intel architectures
- Computer Vision Annotation Tool (CVAT) - an online, interactive video and image annotation tool for computer vision purposes.
- Dataset Management Framework (Datumaro) - a framework and CLI tool to build, transform, and analyze datasets.
* Other names and brands may be claimed as the property of others.
