diff --git a/docs/OV_Runtime_UG/supported_plugins/GNA.md b/docs/OV_Runtime_UG/supported_plugins/GNA.md
index 4fdd2f7320d..f0a08710495 100644
--- a/docs/OV_Runtime_UG/supported_plugins/GNA.md
+++ b/docs/OV_Runtime_UG/supported_plugins/GNA.md
@@ -3,7 +3,7 @@
 The Intel® Gaussian & Neural Accelerator (GNA) is a low-power neural coprocessor for continuous inference at the edge.
 
 Intel® GNA is not intended to replace typical inference devices such as the
-CPU, graphics processing unit (GPU), or vision processing unit (VPU). It is designed for offloading
+CPU and GPU. It is designed for offloading
 continuous inference workloads including but not limited to noise reduction or
 speech recognition to save power and free CPU resources.
diff --git a/docs/glossary.md b/docs/glossary.md
index 714f561d85d..b4152697c05 100644
--- a/docs/glossary.md
+++ b/docs/glossary.md
@@ -26,7 +26,7 @@
 | LPR | License-Plate Recognition |
 | LRN | Local Response Normalization |
 | mAP | Mean Average Precision |
-| Intel(R) OneDNN | Intel(R) OneAPI Deep Neural Network Library |
+| Intel® OneDNN | Intel® OneAPI Deep Neural Network Library |
 | MO | Model Optimizer |
 | MVN | Mean Variance Normalization |
 | NCDHW | Number of images, Channels, Depth, Height, Width |
@@ -53,24 +53,56 @@
 
 ## Terms
 
-Glossary of terms used in the OpenVINO™
+Glossary of terms used in OpenVINO™
+@sphinxdirective
 
-| Term | Description |
-| :--- |:-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
-| Batch | Number of images to analyze during one call of infer. Maximum batch size is a property of the model and it is set before compiling of the model by the device.
In NHWC, NCHW and NCDHW image data layout representation, the N refers to the number of images in the batch |
-| Tensor | Memory container used for storing inputs, outputs of the model, weights and biases of the operations |
-| Device (Affinitity) | A preferred Intel(R) hardware device to run the inference (CPU, GPU, GNA, etc.) |
-| Extensibility mechanism, Custom layers | The mechanism that provides you with capabilities to extend the OpenVINO™ Runtime and Model Optimizer so that they can work with models containing operations that are not yet supported |
-| ov::Model | A class of the Model that OpenVINO™ Runtime reads from IR or converts from ONNX, PaddlePaddle formats. Consists of model structure, weights and biases |
-| ov::CompiledModel | An instance of the compiled model which allows the OpenVINO™ Runtime to request (several) infer requests and perform inference synchronously or asynchronously |
-| ov::InferRequest | A class that represents the end point of inference on the model compiled by the device and represented by a compiled model. Inputs are set here, outputs should be requested from this interface as well |
-| ov::ProfilingInfo | Represents basic inference profiling information per operation |
-| OpenVINO™ Runtime | A C++ library with a set of classes that you can use in your application to infer input tensors and get the results |
-| OpenVINO™ API | The basic default API for all supported devices, which allows you to load a model from Intermediate Representation or convert from ONNX, PaddlePaddle file formars, set input and output formats and execute the model on various devices |
-| OpenVINO™ Core | OpenVINO™ Core is a software component that manages inference on certain Intel(R) hardware devices: CPU, GPU, GNA, etc. |
-| ov::Layout | Image data layout refers to the representation of images batch. Layout shows a sequence of 4D or 5D tensor data in memory.
A typical NCHW format represents pixel in horizontal direction, rows by vertical dimension, planes by channel and images into batch. See also [Layout API Overview](./OV_Runtime_UG/layout_overview.md) |
-| ov::element::Type | Represents data element type. For example, f32 is 32-bit floating point, f16 is 16-bit floating point. |
+| Batch
+| Number of images to analyze during one call of infer. Maximum batch size is a property of the model set before its compilation. In NHWC, NCHW, and NCDHW image data layout representations, the 'N' refers to the number of images in the batch.
+
+| Device Affinity
+| A preferred hardware device to run inference (CPU, GPU, GNA, etc.).
+
+| Extensibility mechanism, Custom layers
+| The mechanism that provides you with capabilities to extend the OpenVINO™ Runtime and Model Optimizer so that they can work with models containing operations that are not yet supported.
+
+| layer / operation
+| In OpenVINO, both terms are treated synonymously. To avoid confusion, "layer" is being phased out and "operation" is the currently accepted term.
+
+| OpenVINO™ Core
+| OpenVINO™ Core is a software component that manages inference on certain Intel® hardware devices: CPU, GPU, GNA, etc.
+
+| OpenVINO™ API
+| The basic default API for all supported devices, which allows you to load a model from Intermediate Representation or convert it from the ONNX or PaddlePaddle file formats, set input and output formats, and execute the model on various devices.
+
+| OpenVINO™ Runtime
+| A C++ library with a set of classes that you can use in your application to infer input tensors and get the results.
+
+| ov::Model
+| A class representing a model that OpenVINO™ Runtime reads from IR or converts from the ONNX or PaddlePaddle format. Consists of the model structure, weights, and biases.
+
+| ov::CompiledModel
+| An instance of the compiled model which allows the OpenVINO™ Runtime to request (several) infer requests and perform inference synchronously or asynchronously.
+
+| ov::InferRequest
+| A class that represents the end point of inference on the model compiled by the device and represented by a compiled model. Inputs are set here, and outputs should be requested from this interface as well.
+
+| ov::ProfilingInfo
+| Represents basic inference profiling information per operation.
+
+| ov::Layout
+| Image data layout refers to the representation of an image batch. Layout shows a sequence of 4D or 5D tensor data in memory. A typical NCHW format represents pixels in the horizontal dimension, rows in the vertical dimension, planes by channel, and images in the batch. See also [Layout API Overview](./OV_Runtime_UG/layout_overview.md).
+
+| ov::element::Type
+| Represents a data element type. For example, f32 is 32-bit floating point and f16 is 16-bit floating point.
+
+| plugin / Inference Device / Inference Mode
+| OpenVINO makes hardware available for inference based on several core components. They used to be called "plugins" in earlier versions of the documentation, and you may still find this term in some articles. Because of their role in the software, they are now referred to as Devices and Modes ("virtual" devices). For a detailed description of the concept, refer to [Inference Modes](@ref openvino_docs_Runtime_Inference_Modes_Overview) and [Inference Devices](@ref openvino_docs_OV_UG_Working_with_devices).
+
+| Tensor
+| A memory container used for storing inputs and outputs of the model, as well as weights and biases of the operations.
+
+@endsphinxdirective
 
 ## See Also
diff --git a/docs/home.rst b/docs/home.rst
index b1ea7e7f4f1..6ccdff8252b 100644
--- a/docs/home.rst
+++ b/docs/home.rst
@@ -19,9 +19,9 @@
 Overview
 ~~~~~~~~
 
-OpenVINO enables you to optimize a deep learning model from almost any framework and deploy it with best-in-class performance on a range of Intel processors and other hardware platforms.
+OpenVINO enables you to optimize deep learning models from almost any framework and deploy them with best-in-class performance on a range of Intel hardware.
 
-A typical workflow with OpenVINO is shown below.
+A typical workflow with OpenVINO:
 
 .. container:: section
    :name: welcome-to-openvino-toolkit-s-documentation
@@ -63,7 +63,7 @@ A typical workflow with OpenVINO is shown below.
 High-Performance Deep Learning
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
-OpenVINO Runtime automatically optimizes deep learning pipelines using aggressive graph fusion, memory reuse, load balancing, and inferencing parallelism across CPU, GPU, VPU, and more.
+OpenVINO Runtime automatically optimizes deep learning pipelines using aggressive graph fusion, memory reuse, load balancing, and inference parallelism across CPU, GPU, and more.
 You can integrate and offload to accelerators additional operations for pre- and post-processing to reduce end-to-end latency and improve throughput.
 
 Model Quantization and Compression
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
 Boost your model’s speed even further with quantization and other state-of-the
@@ -94,7 +94,7 @@ Boost your model’s speed even further with quantization and other state-of-the
 **Enhanced App Start-Up Time**
 
- In applications where fast start-up is required, OpenVINO significantly reduces first-inference latency by using the CPU for initial inference and then switching to GPU or VPU once the model has been compiled and loaded to memory. Compiled models are cached to further improving start-up time.
+ In applications where fast start-up is required, OpenVINO significantly reduces first-inference latency by using the CPU for initial inference and then switching to another device once the model has been compiled and loaded to memory. Compiled models are cached, improving start-up time even more.
 Supported Devices
diff --git a/docs/install_guides/installing-openvino-overview.md b/docs/install_guides/installing-openvino-overview.md
index c116bc9c81d..7719f3bcddf 100644
--- a/docs/install_guides/installing-openvino-overview.md
+++ b/docs/install_guides/installing-openvino-overview.md
@@ -16,7 +16,7 @@
 Intel® Distribution of OpenVINO™ toolkit is a comprehensive toolkit for developing applications and solutions based on deep learning tasks, such as computer vision, automatic speech recognition, natural language processing, recommendation systems, and more. It provides high-performance and rich deployment options, from edge to cloud. Some of its advantages are:
 
 * Enables CNN-based and transformer-based deep learning inference on the edge or cloud.
-* Supports various execution modes across Intel® technologies: Intel® CPU, Intel® Integrated Graphics, Intel® Discrete Graphics, Intel® Neural Compute Stick 2, and Intel® Vision Accelerator Design with Intel® Movidius™ VPUs.
+* Supports various execution modes across Intel® technologies: Intel® CPU, Intel® Integrated Graphics, Intel® Discrete Graphics, and more.
 * Speeds time-to-market via an easy-to-use library of computer vision functions and pre-optimized kernels.
 * Compatible with models from a wide variety of frameworks, including TensorFlow, PyTorch, PaddlePaddle, ONNX, and more.
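The Batch and ov::Layout entries in the glossary hunk above can be made concrete with a short sketch. This plain-Python helper is illustrative only and not part of the patched docs (the function name is invented for this example): in NCHW memory order, `w` varies fastest and the batch index `n` slowest, so stepping the batch index advances the flat offset by a whole image's worth of elements.

```python
def nchw_offset(n, c, h, w, shape):
    """Flat memory offset of element (n, c, h, w) in an NCHW tensor.

    shape is (N, C, H, W); w varies fastest, the batch index n slowest.
    """
    N, C, H, W = shape
    assert n < N and c < C and h < H and w < W
    return ((n * C + c) * H + h) * W + w

# For a batch of two 3-channel 2x2 images (N=2, C=3, H=2, W=2),
# incrementing the batch index moves the offset by C*H*W = 12 elements.
print(nchw_offset(0, 0, 0, 0, (2, 3, 2, 2)))  # 0
print(nchw_offset(1, 0, 0, 0, (2, 3, 2, 2)))  # 12
```

The same arithmetic with the axes reordered yields NHWC or NCDHW offsets, which is why the glossary describes a layout purely as a sequence of tensor dimensions in memory.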