This guide explains how to use the benchmark_app to get performance numbers. It also explains how the performance numbers are reflected through internal inference performance counters and execution graphs, and includes information on using ITT and Intel® VTune™ Profiler to get performance insights.
## Test performance with the benchmark_app
### Prerequisites
To run benchmarks, you need both OpenVINO developer tools and Runtime installed. Follow the [Installation guide](../../install_guides/installing-model-dev-tools.md) and make sure to install the latest general release package with support for frameworks of the models you want to test.
To test the performance of your model, make sure you [prepare the model for use with OpenVINO](../../Documentation/model_introduction.md). For example, if you use [OpenVINO's automation tools](@ref omz_tools_downloader), these two lines of code will download the resnet-50-tf model and convert it to OpenVINO IR.
```bash
omz_downloader --name resnet-50-tf
omz_converter --name resnet-50-tf
```
### Running the benchmark application
For a detailed description, see the dedicated articles: [benchmark_app for C++](../../../samples/cpp/benchmark_app/README.md) and [benchmark_app for Python](../../../tools/benchmark_tool/README.md).
The benchmark_app includes a lot of device-specific options, but the primary usage is as simple as:
```bash
benchmark_app -m <model> -d <device> -i <input>
```
Each of the [OpenVINO supported devices](../../OV_Runtime_UG/supported_plugins/Supported_Devices.md) offers performance settings that contain command-line equivalents in the Benchmark app.
While these settings provide really low-level control for the optimal model performance on the _specific_ device, it is recommended to always start performance evaluation with the [OpenVINO High-Level Performance Hints](../../OV_Runtime_UG/performance_hints.md) first, like so:
- benchmark_app **-hint tput** -d 'device' -m 'path to your model'
- benchmark_app **-hint latency** -d 'device' -m 'path to your model'
### 1 - Select a Proper Set of Operations to Measure
When evaluating the performance of a model with OpenVINO Runtime, it is required to measure a proper set of operations. Remember the following tips:
- Avoid including one-time costs such as model loading.
- Track operations that occur outside OpenVINO Runtime (such as video decoding) separately.
> **NOTE**: Some image pre-processing can be baked into OpenVINO IR and accelerated accordingly. For more information, refer to [Embedding the Pre-processing](Additional_Optimizations.md) and [General Runtime Optimizations](../../optimization_guide/dldt_deployment_optimization_common.md).
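The tips above can be sketched in plain Python. In the sketch below, `load_model` is a hypothetical callable standing in for reading and compiling a model with any runtime (it is not part of the OpenVINO API); the point is that the one-time loading cost and the warm-up iterations are kept out of the steady-state timing loop:

```python
import time

def measure(load_model, n_iters=100, n_warmup=10):
    """Time model loading and steady-state inference separately.

    `load_model` is a hypothetical callable returning an `infer` callable;
    it stands in for reading/compiling a model with any inference runtime.
    """
    t0 = time.perf_counter()
    infer = load_model()               # one-time cost: measure it...
    load_s = time.perf_counter() - t0  # ...but keep it out of per-iteration stats

    for _ in range(n_warmup):          # discard warm-up iterations
        infer()

    times = []
    for _ in range(n_iters):
        t0 = time.perf_counter()
        infer()
        times.append(time.perf_counter() - t0)
    return load_s, times

# usage with a dummy "model" whose inference is just a computation
load_s, times = measure(lambda: (lambda: sum(range(1000))))
```

Anything happening outside `infer()` (for example, video decoding) would be timed in a separate, analogous loop.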
### 2 - Try to Get Credible Data
Performance conclusions should be built upon reproducible data. Performance measurements should be done with a large number of invocations of the same routine. Since the first iteration is almost always significantly slower than the subsequent ones, an aggregated value can be used for the execution time in final projections:
- If the time values vary too much, consider using the geometric mean.
- Be aware of throttling and other power oddities. A device can exist in one of several power states. When optimizing your model, consider fixing the device frequency for better reproducibility of performance data. However, end-to-end (application) benchmarking should also be performed under real operational conditions.
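As a minimal illustration of the aggregation advice above, the snippet below uses simulated per-iteration latencies (the numbers are made up for the example): the slow first iteration is dropped, and the median is compared against the geometric mean, which is the more suitable aggregate when values vary widely:

```python
import statistics

# simulated per-iteration latencies in ms; the first iteration is
# typically much slower (graph compilation, cache warm-up, etc.)
latencies_ms = [250.0, 12.1, 11.9, 12.3, 35.0, 12.0, 11.8, 12.2]

steady = latencies_ms[1:]                       # exclude the first iteration
median_ms = statistics.median(steady)           # robust default aggregate
geomean_ms = statistics.geometric_mean(steady)  # useful when values vary widely

print(f"median: {median_ms:.1f} ms, geomean: {geomean_ms:.1f} ms")
```

Note how the single throttling outlier (35.0 ms) pulls the geometric mean well above the median; reporting both makes such oddities visible.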
### 3 - Compare Performance with Native/Framework Code
When comparing the OpenVINO Runtime performance with the framework or another reference code, make sure that both versions are as similar as possible:
- When applicable, leverage the [Dynamic Shapes support](../../OV_Runtime_UG/ov_dynamic_shapes.md).
- If possible, demand the same accuracy. For example, TensorFlow allows `FP16` execution, so when comparing to that, make sure to test the OpenVINO Runtime with the `FP16` as well.
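The fairness checklist above can be wrapped in a small harness: every candidate gets the same warm-up and the same number of timed iterations, so their medians are comparable. The two callables here are dummies; in practice they would wrap OpenVINO Runtime and the reference framework, fed with the same input and precision:

```python
import statistics
import time

def timed_median_ms(fn, n_warmup=10, n_iters=50):
    """Median per-call latency of `fn`, after identical warm-up for any candidate."""
    for _ in range(n_warmup):
        fn()
    times = []
    for _ in range(n_iters):
        t0 = time.perf_counter()
        fn()
        times.append((time.perf_counter() - t0) * 1e3)
    return statistics.median(times)

# dummy stand-ins for "OpenVINO inference" and "framework inference"
def candidate_a():
    return sum(i * i for i in range(10_000))

def candidate_b():
    return sum(i * i for i in range(20_000))

a_ms = timed_median_ms(candidate_a)
b_ms = timed_median_ms(candidate_b)
print(f"A: {a_ms:.3f} ms, B: {b_ms:.3f} ms")
```

Keeping the harness identical for both sides removes one common source of apples-to-oranges comparisons.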
### Internal Inference Performance Counters and Execution Graphs <a name="performance-counters"></a>
More detailed insights into inference performance breakdown can be achieved with device-specific performance counters and/or execution graphs.
Both [C++](../../../samples/cpp/benchmark_app/README.md) and [Python](../../../tools/benchmark_tool/README.md) versions of the `benchmark_app` support a `-pc` command-line parameter that outputs internal execution breakdown.
For example, the table shown below is part of performance counters for quantized [TensorFlow implementation of ResNet-50](https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/public/resnet-50-tf) model inference on [CPU Plugin](../../OV_Runtime_UG/supported_plugins/CPU.md).
Keep in mind that since the device is CPU, the `realTime` wall-clock time and the `cpu` time are the same for each layer. Information about layer precision is also stored in the performance counters.
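As an illustration of how such counter dumps are typically post-processed, the sketch below aggregates per-layer `realTime` values by execution type. The record layout and the sample values are simplified assumptions for the example, not the exact benchmark_app output format:

```python
from collections import defaultdict

# hypothetical per-layer counters, loosely modeled on a `-pc` dump:
# (layer name, execution type, realTime in microseconds)
counters = [
    ("conv1", "jit_avx512_I8", 120.0),
    ("conv2", "jit_avx512_I8", 340.0),
    ("pool1", "jit_avx512_FP32", 25.0),
    ("fc1", "brgemm_avx512_I8", 210.0),
]

by_exec_type = defaultdict(float)
for _name, exec_type, real_time_us in counters:
    by_exec_type[exec_type] += real_time_us

total_us = sum(by_exec_type.values())
for exec_type, us in sorted(by_exec_type.items(), key=lambda kv: -kv[1]):
    print(f"{exec_type:20s} {us:8.1f} us ({us / total_us:.0%})")
```

Grouping by execution type quickly shows how much of the time runs in quantized (`I8`) kernels versus `FP32` fallbacks.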
Lastly, the performance statistics for both performance counters and execution graphs are averaged. Therefore, such data for [inputs of dynamic shapes](../../OV_Runtime_UG/ov_dynamic_shapes.md) should be measured carefully, preferably by isolating the specific shape and executing it multiple times in a loop, to gather reliable data.
### Use ITT to Get Performance Insights
In general, OpenVINO and its individual plugins are heavily instrumented with Intel® Instrumentation and Tracing Technology (ITT). Therefore, you can also compile OpenVINO from the source code with ITT enabled and use tools like [Intel® VTune™ Profiler](https://software.intel.com/en-us/vtune) to get a detailed inference performance breakdown and additional insights into the application-level performance on the timeline view.
The benchmark results below demonstrate high performance gains on several public neural networks on multiple Intel® CPUs, GPUs, and VPUs covering a broad performance range. The results may help you decide which hardware is best for your applications and solutions, or plan AI workloads on the Intel computing already included in your solutions.
Benchmarks are available for:
* [Intel® Distribution of OpenVINO™ toolkit](performance_benchmarks_openvino.md).
* [OpenVINO™ Model Server](performance_benchmarks_ovms.md).
You can also test performance for your system yourself, following the guide on [getting performance numbers](../MO_DG/prepare_model/Getting_performance_numbers.md).
Performance of a particular application can also be evaluated virtually using [Intel® DevCloud for the Edge](https://devcloud.intel.com/edge/). It is a remote development environment with access to Intel® hardware and the latest versions of the Intel® Distribution of the OpenVINO™ Toolkit. To learn more about it, visit [the website](https://www.intel.com/content/www/us/en/developer/tools/devcloud/edge/overview.html) or [create an account](https://www.intel.com/content/www/us/en/forms/idz/devcloud-registration.html?tgt=https://www.intel.com/content/www/us/en/secure/forms/devcloud-enrollment/account-provisioning.html).
.. dropdown:: How can I run the benchmark results on my own?

   All of the performance benchmarks are generated using the
   open-source tool within the Intel® Distribution of OpenVINO™ toolkit
   called ``benchmark_app``. This tool is available in both `C++ <https://github.com/openvinotoolkit/openvino/blob/master/samples/cpp/benchmark_app/README.md>`_ and `Python <https://github.com/openvinotoolkit/openvino/blob/master/tools/benchmark_tool/README.md>`_.
This section provides reference documents that guide you through the OpenVINO toolkit workflow, from preparing models, optimizing them, to deploying them in your own deep learning applications.
## Converting and Preparing Models
With [Model Downloader](@ref omz_tools_downloader) and [Model Optimizer](MO_DG/Deep_Learning_Model_Optimizer_DevGuide.md) guides, you will learn to download pre-trained models and convert them for use with OpenVINO™. You can use your own models or choose some from a broad selection provided in the [Open Model Zoo](./model_zoo.md).
## Optimization and Performance
In this section, you will find resources on [how to test inference performance](MO_DG/prepare_model/Getting_performance_numbers.md) and [how to increase it](optimization_guide/dldt_optimization_guide.md). This can be achieved by [optimizing the model](optimization_guide/model_optimization_guide.md) or [optimizing inference at runtime](optimization_guide/dldt_deployment_optimization_guide.md).
## Deploying Inference
This section explains the process of creating your own inference application using [OpenVINO™ Runtime](./OV_Runtime_UG/openvino_intro.md) and documents the [OpenVINO Runtime API](./api_references.html) for both Python and C++.
It also provides a [guide on deploying applications with OpenVINO](./OV_Runtime_UG/deployment/deployment_intro.md) and directs you to other sources on this topic.
## OpenVINO Ecosystem
Apart from the core components, OpenVINO offers tools, plugins, and extensions that revolve around it, even though they do not constitute necessary parts of its workflow. This section gives you an overview of [what makes up the OpenVINO Toolkit](./Documentation/openvino_ecosystem.md).
## Media Processing and Computer Vision Libraries
The OpenVINO™ toolkit also works with the following media processing frameworks and libraries:
* [Intel® Deep Learning Streamer (Intel® DL Streamer)](@ref openvino_docs_dlstreamer) — A streaming media analytics framework based on GStreamer, for creating complex media analytics pipelines optimized for Intel hardware platforms. Go to the Intel® DL Streamer [documentation](https://dlstreamer.github.io/) website to learn more.
* [Intel® oneAPI Video Processing Library (oneVPL)](https://www.intel.com/content/www/us/en/develop/documentation/oneapi-programming-guide/top/api-based-programming/intel-oneapi-video-processing-library-onevpl.html) — A programming interface for video decoding, encoding, and processing to build portable media pipelines on CPUs, GPUs, and other accelerators.
You can also add computer vision capabilities to your application using optimized versions of [OpenCV](https://opencv.org/).