Docs benchmarks page update port22.2 (#13165)

* update page and benchmark config data

benchmarks articles
update data tables
delete image

* hide / remove ovms benchmarks page

Co-authored-by: Ilya Churaev <ilya.churaev@intel.com>
Karol Blaszczak
2022-09-22 15:04:45 +02:00
committed by GitHub
parent ea92b38c44
commit 2e8acae6f2
9 changed files with 120 additions and 940 deletions

Binary file not shown.

Binary file not shown.

Binary file not shown.


@@ -7,7 +7,7 @@
:hidden:
openvino_docs_performance_benchmarks_openvino
openvino_docs_performance_benchmarks_ovms
@endsphinxdirective
@@ -19,6 +19,6 @@ The benchmark results below demonstrate high performance gains on several public
The following benchmarks are available:
* [Intel® Distribution of OpenVINO™ toolkit Benchmark Results](performance_benchmarks_openvino.md).
* [OpenVINO™ Model Server Benchmark Results](performance_benchmarks_ovms.md).
Performance of a particular application can also be evaluated virtually using [Intel® DevCloud for the Edge](https://devcloud.intel.com/edge/). It is a remote development environment with access to Intel® hardware and the latest versions of the Intel® Distribution of the OpenVINO™ Toolkit. To learn more about it, visit [the website](https://www.intel.com/content/www/us/en/developer/tools/devcloud/edge/overview.html) or [create an account](https://www.intel.com/content/www/us/en/forms/idz/devcloud-registration.html?tgt=https://www.intel.com/content/www/us/en/secure/forms/devcloud-enrollment/account-provisioning.html).


@@ -6,178 +6,188 @@
:hidden:
openvino_docs_performance_benchmarks_faq
Download Performance Data Spreadsheet in MS Excel Format <https://docs.openvino.ai/downloads/benchmark_files/OV-2022.1-Download-Excel.xlsx>
openvino_docs_performance_int8_vs_fp32
Performance Data Spreadsheet (download xlsx) <https://docs.openvino.ai/2022.2/_static/benchmarks_files/OV-2022.2-Performance-Data.xlsx>
@endsphinxdirective
Features and benefits of Intel® technologies depend on system configuration and may require enabled hardware, software or service activation. More information on this subject may be obtained from the original equipment manufacturer (OEM), official [Intel® web page](https://www.intel.com) or retailer.
## Platform Configurations
@sphinxdirective
:download:`A full list of HW platforms used for testing (along with their configuration)<../../../docs/benchmarks/files/Platform_list.pdf>`
@endsphinxdirective
For more specific information, refer to the [Configuration Details](https://docs.openvino.ai/resources/benchmark_files/system_configurations_2022.1.html) document.
## Benchmark Setup Information
This benchmark setup includes a single machine on which both the benchmark application and the OpenVINO™ installation reside. The presented performance benchmark numbers are based on realease 2022.1 of Intel® Distribution of OpenVINO™ toolkit.
This benchmark setup includes a single machine on which both the benchmark application and the OpenVINO™ installation reside. The presented performance benchmark numbers are based on release 2022.2 of the Intel® Distribution of OpenVINO™ toolkit.
The benchmark application loads the OpenVINO™ Runtime and executes inferences on the specified hardware (CPU, GPU or VPU). It measures the time spent on actual inferencing (excluding any pre or post processing) and then reports on the inferences per second (or Frames Per Second - FPS). For additional information on the benchmark application, refer to the entry 5 in the [FAQ section](performance_benchmarks_faq.md).
The benchmark application loads the OpenVINO™ Runtime and executes inference on the specified hardware (CPU, GPU, or VPU). It measures the time spent on actual inferencing (excluding any pre- or post-processing) and then reports the number of inferences per second (or Frames Per Second, FPS). For additional information on the benchmark application, refer to entry 5 in the [FAQ section](performance_benchmarks_faq.md).
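As a rough illustration of what the benchmark application measures (a sketch, not the actual `benchmark_app` code), only the time spent in the infer stage counts toward the reported FPS:

```python
import time

# Hypothetical stand-ins for the three stages of one inference request;
# in the real benchmark the infer stage is executed by the OpenVINO Runtime.
def preprocess(sample):
    return sample

def infer(data):
    time.sleep(0.002)  # pretend the device spends ~2 ms on inference
    return data

def postprocess(result):
    return result

n_requests = 5
infer_seconds = 0.0
for i in range(n_requests):
    data = preprocess(i)
    start = time.perf_counter()
    result = infer(data)
    infer_seconds += time.perf_counter() - start  # only the infer stage is timed
    postprocess(result)

fps = n_requests / infer_seconds  # reported as inferences (frames) per second
print(round(fps))
```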
Measuring inference performance involves many variables and is extremely use case and application dependent. Below are four parameters used for measurements, which are key elements to consider for a successful deep learning inference application:
Measuring inference performance involves many variables and is extremely use-case and application dependent. Below are four parameters used for measurements, which are key elements to consider for a successful deep learning inference application:
- **Throughput** - Measures the number of inferences delivered within a latency threshold (for example, number of FPS). When deploying a system with deep learning inference, select the throughput that delivers the best trade-off between latency and power for the price and performance that meets your requirements.
- **Value** - While throughput is important, what is more critical in edge AI deployments is the performance efficiency or performance-per-cost. Application performance in throughput per dollar of system cost is the best measure of value.
- **Efficiency** - System power is a key consideration from the edge to the data center. When selecting deep learning solutions, power efficiency (throughput/watt) is a critical factor to consider. Intel designs provide excellent power efficiency for running deep learning workloads.
- **Latency** - This parameter measures the synchronous execution of inference requests and is reported in milliseconds. Each inference request (i.e., preprocess, infer, postprocess) is allowed to complete before the next one is started. This performance metric is relevant in usage scenarios where a single image input needs to be acted upon as soon as possible. An example of that kind of a scenario would be real-time or near real-time applications, i.e., the response of an industrial robot to its environment or obstacle avoidance for autonomous vehicles.
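Throughput and latency can be related directly for synchronous execution; a minimal sketch with made-up per-request timings (not benchmark data):

```python
# Per-request durations in seconds; placeholder values, not measured results.
durations_s = [0.010, 0.012, 0.011, 0.009, 0.013]

total_time_s = sum(durations_s)
fps = len(durations_s) / total_time_s                     # throughput in FPS
avg_latency_ms = 1000 * total_time_s / len(durations_s)   # mean synchronous latency

print(round(fps, 1), round(avg_latency_ms, 2))
```

Note that for asynchronous, batched execution throughput can rise while per-request latency also rises, which is why both metrics are reported separately.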
For a listing of all platforms and configurations used for testing, refer to the following:
@sphinxdirective
* :download:`HW platforms (pdf) <_static/benchmarks_files/platform_list_22.2.pdf>`
* :download:`Configuration Details (xlsx) <_static/benchmarks_files/OV-2022.2-system-info-detailed.xlsx>`
@endsphinxdirective
## Benchmark Performance Results
Benchmark performance results below are based on testing as of March 17, 2022. They may not reflect all publicly available updates at the time of testing.
Benchmark performance results below are based on testing as of September 20, 2022. They may not reflect all publicly available updates at the time of testing.
<!-- See configuration disclosure for details. No product can be absolutely secure. -->
Performance varies by use, configuration and other factors, which are elaborated further at [www.intel.com/PerformanceIndex](https://www.intel.com/PerformanceIndex). Intel optimizations (for Intel® compilers or other products) may not optimize to the same degree for non-Intel products.
### bert-base-cased [124]
### bert-base-cased_onnx [124]
@sphinxdirective
.. raw:: html
<div class="chart-block" data-loadcsv="csv/bert-base-cased124.csv"></div>
<div class="chart-block" data-loadcsv="csv/bert-base-cased_onnx.csv"></div>
@endsphinxdirective
### bert-large-uncased-whole-word-masking-squad-int8-0001 [384]
### bert-large-uncased-whole-word-masking-squad-0001_onnx [384]
@sphinxdirective
.. raw:: html
<div class="chart-block" data-loadcsv="csv/bert-large-uncased-whole-word-masking-squad-int8-0001-384.csv"></div>
<div class="chart-block" data-loadcsv="csv/bert-large-uncased-whole-word-masking-squad-0001_onnx.csv"></div>
@endsphinxdirective
### deeplabv3-TF [513x513]
### deeplabv3_tf [513x513]
@sphinxdirective
.. raw:: html
<div class="chart-block" data-loadcsv="csv/deeplabv3-TF-513x513.csv"></div>
<div class="chart-block" data-loadcsv="csv/deeplabv3_tf.csv"></div>
@endsphinxdirective
### densenet-121-TF [224x224]
### densenet-121_tf [224x224]
@sphinxdirective
.. raw:: html
<div class="chart-block" data-loadcsv="csv/densenet-121-TF-224x224.csv"></div>
<div class="chart-block" data-loadcsv="csv/densenet-121_tf.csv"></div>
@endsphinxdirective
### efficientdet-d0 [512x512]
### efficientdet-d0_tf [512x512]
@sphinxdirective
.. raw:: html
<div class="chart-block" data-loadcsv="csv/efficientdet-d0-512x512.csv"></div>
<div class="chart-block" data-loadcsv="csv/efficientdet-d0_tf.csv"></div>
@endsphinxdirective
### faster-rcnn-resnet50-coco-TF [600x1024]
### mask_rcnn_resnet50_atrous_coco_tf [600x1024]
@sphinxdirective
.. raw:: html
<div class="chart-block" data-loadcsv="csv/faster_rcnn_resnet50_coco-TF-600x1024.csv"></div>
<div class="chart-block" data-loadcsv="csv/mask_rcnn_resnet50_atrous_coco_tf.csv"></div>
@endsphinxdirective
### inception-v4-TF [299x299]
### ssd-resnet34-1200_onnx [1200x1200]
@sphinxdirective
.. raw:: html
<div class="chart-block" data-loadcsv="csv/inception-v4-TF-299x299.csv"></div>
<div class="chart-block" data-loadcsv="csv/ssd-resnet34-1200_onnx.csv"></div>
@endsphinxdirective
### mobilenet-ssd-CF [300x300]
### resnet-50_tf [224x224]
@sphinxdirective
.. raw:: html
<div class="chart-block" data-loadcsv="csv/mobilenet-ssd-CF-300x300.csv"></div>
<div class="chart-block" data-loadcsv="csv/resnet-50_tf.csv"></div>
@endsphinxdirective
### mobilenet-v2-pytorch [224x224]
### resnet-50-pytorch_onnx [224x224]
@sphinxdirective
.. raw:: html
<div class="chart-block" data-loadcsv="csv/mobilenet-v2-pytorch-224x224.csv"></div>
<div class="chart-block" data-loadcsv="csv/resnet-50-pytorch_onnx.csv"></div>
@endsphinxdirective
### resnet-18-pytorch [224x224]
### yolo_v3_tiny_tf [416x416]
@sphinxdirective
.. raw:: html
<div class="chart-block" data-loadcsv="csv/resnet-18-pytorch-224x224.csv"></div>
<div class="chart-block" data-loadcsv="csv/yolo_v3_tiny_tf.csv"></div>
@endsphinxdirective
### resnet_50_TF [224x224]
### yolo_v4_tf2 [608x608]
@sphinxdirective
.. raw:: html
<div class="chart-block" data-loadcsv="csv/resnet-50-TF-224x224.csv"></div>
<div class="chart-block" data-loadcsv="csv/yolo_v4_tf2.csv"></div>
@endsphinxdirective
### ssd-resnet34-1200-onnx [1200x1200]
### googlenet-v4_tf [224x224]
@sphinxdirective
.. raw:: html
<div class="chart-block" data-loadcsv="csv/ssd-resnet34-1200-onnx-1200x1200.csv"></div>
<div class="chart-block" data-loadcsv="csv/googlenet-v4_tf.csv"></div>
@endsphinxdirective
### unet-camvid-onnx-0001 [368x480]
### ssd_mobilenet_v1_coco_tf [300x300]
@sphinxdirective
.. raw:: html
<div class="chart-block" data-loadcsv="csv/unet-camvid-onnx-0001-368x480.csv"></div>
<div class="chart-block" data-loadcsv="csv/ssd_mobilenet_v1_coco_tf.csv"></div>
@endsphinxdirective
### yolo-v3-tiny-tf [416x416]
### ssd_mobilenet_v2_coco_tf [300x300]
@sphinxdirective
.. raw:: html
<div class="chart-block" data-loadcsv="csv/yolo-v3-tiny-tf-416x416.csv"></div>
<div class="chart-block" data-loadcsv="csv/ssd_mobilenet_v2_coco_tf.csv"></div>
@endsphinxdirective
### yolo_v4-tf [608x608]
### unet-camvid-onnx-0001_onnx [368x480]
@sphinxdirective
.. raw:: html
<div class="chart-block" data-loadcsv="csv/yolo_v4-tf-608x608.csv"></div>
<div class="chart-block" data-loadcsv="csv/unet-camvid-onnx-0001_onnx.csv"></div>
@endsphinxdirective
## Disclaimers
Intel® Distribution of OpenVINO™ toolkit performance benchmark numbers are based on release 2022.2.
Intel technologies' features and benefits depend on system configuration and may require enabled hardware, software or service activation. Learn more at intel.com, or from the OEM or retailer. Performance results are based on testing as of September 20, 2022, and may not reflect all publicly available updates. See configuration disclosure for details. No product can be absolutely secure.
Performance varies by use, configuration and other factors. Learn more at [www.intel.com/PerformanceIndex](https://www.intel.com/PerformanceIndex).
Your costs and results may vary.
Intel optimizations, for Intel compilers or other products, may not optimize to the same degree for non-Intel products.
© Intel Corporation. Intel, the Intel logo, and other Intel marks are trademarks of Intel Corporation or its subsidiaries. Other names and brands may be claimed as the property of others.


@@ -1,497 +0,0 @@
# OpenVINO™ Model Server Benchmark Results {#openvino_docs_performance_benchmarks_ovms}
OpenVINO™ Model Server is an open-source, production-grade inference platform that exposes a set of models via a convenient inference API over gRPC or HTTP/REST. It employs the OpenVINO™ Runtime libraries from the Intel® Distribution of OpenVINO™ toolkit to extend workloads across Intel® hardware including CPU, GPU and others.
![OpenVINO™ Model Server](../img/performance_benchmarks_ovms_01.png)
## Measurement Methodology
OpenVINO™ Model Server is measured in a multiple-client, single-server configuration using two hardware platforms connected by an Ethernet network. The network bandwidth depends on the platforms and models under investigation, and it is set so that it does not become a bottleneck for workload intensity. This connection is dedicated solely to the performance measurements. The benchmark setup consists of four main parts:
![OVMS Benchmark Setup Diagram](../img/performance_benchmarks_ovms_02.png)
- **OpenVINO™ Model Server** -- It is launched as a Docker container on the server platform, where it listens for, and answers, requests from clients. It runs on the same system as the OpenVINO™ toolkit benchmark application in the corresponding benchmarks. Models served by it are placed in a local file system mounted into the Docker container. The OpenVINO™ Model Server instance communicates with the other components via ports over a dedicated Docker network.
- **Clients** -- They run on a separate physical system, referred to as the client platform. Clients are implemented in Python 3, based on the TensorFlow API, and work as parallel processes. Each client waits for a response from OpenVINO™ Model Server before it sends a new request. Clients are also responsible for verifying the responses.
- **Load Balancer** -- It works on the client platform, in a Docker container, using HAProxy. It is mainly responsible for counting requests forwarded from clients to OpenVINO™ Model Server, estimating their latency, and exposing this information via the Prometheus service. The reason for locating this part on the client side is to simulate a real-life scenario in which the physical network affects the reported metrics.
- **Execution Controller** -- It is launched on the client platform. It is responsible for synchronizing the whole measurement process, downloading metrics from the Load Balancer, and presenting the final report of the execution.
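The client behavior described above (synchronous requests, with response verification) can be sketched as follows; `send_request` is a hypothetical stub, not the real TensorFlow Serving API call:

```python
import time

def send_request(payload):
    """Stand-in for a gRPC/REST predict call to OpenVINO Model Server.
    Hypothetical stub: it only simulates network plus inference time."""
    time.sleep(0.001)
    return {"ok": True}

def run_client(num_requests):
    # Each client is synchronous: it waits for a response before sending
    # the next request, and it verifies every response it receives.
    latencies = []
    for i in range(num_requests):
        start = time.perf_counter()
        response = send_request({"inputs": i})
        latencies.append(time.perf_counter() - start)
        assert response["ok"]
    return latencies

lats = run_client(10)
print(len(lats))
```

In the real setup, many such clients run as parallel processes on the client platform, and the load balancer aggregates their request counts and latencies.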
## resnet-50-TF (INT8)
![](../img/throughput_ovms_resnet50_int8.png)
## resnet-50-TF (FP32)
![](../img/throughput_ovms_resnet50_fp32_bs_1.png)
## googlenet-v4-TF (FP32)
![](../img/throughput_ovms_googlenet4_fp32.png)
## yolo-v3-tf (FP32)
![](../img/throughput_ovms_yolo3_fp32.png)
## yolo-v4-tf (FP32)
![](../img/throughput_ovms_yolo4_fp32.png)
## brain-tumor-segmentation-0002
![](../img/throughput_ovms_braintumorsegmentation.png)
## alexnet
![](../img/throughput_ovms_alexnet.png)
## mobilenet-v3-large-1.0-224-TF (FP32)
![](../img/throughput_ovms_mobilenet3large_fp32.png)
## deeplabv3 (FP32)
![](../img/throughput_ovms_deeplabv3_fp32.png)
## bert-small-uncased-whole-word-masking-squad-int8-0002 (INT8)
![](../img/throughput_ovms_bertsmall_int8.png)
## bert-small-uncased-whole-word-masking-squad-0002 (FP32)
![](../img/throughput_ovms_bertsmall_fp32.png)
## 3D U-Net (FP32)
![](../img/throughput_ovms_3dunet.png)
## Image Compression for Improved Throughput
OpenVINO™ Model Server supports compressed binary input data (images in JPEG and PNG formats) for vision processing models. This feature improves overall performance on networks where bandwidth constitutes a system bottleneck. Examples of such use cases include wireless 5G communication, a typical 1 Gbit/s Ethernet network, and a scenario of multiple client machines issuing a high rate of inference requests to a single, central OpenVINO Model Server. Generally, the performance improvement grows with the compressibility of the data/image. Decompression on the server side is performed by the OpenCV library.
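A quick back-of-the-envelope calculation shows why compression matters on a 1 Gbit/s link; the JPEG size below is an assumed, illustrative value, not a measurement:

```python
# Compare how many images per second a 1 Gbit/s link can carry when inputs
# are sent as raw float32 tensors versus JPEG bytes. Sizes are illustrative.
h, w, c = 224, 224, 3
raw_fp32_bytes = h * w * c * 4          # uncompressed float32 tensor (~0.6 MB)
jpeg_bytes = 30_000                      # assumed typical JPEG of the same image

link_bytes_per_s = 1_000_000_000 // 8    # 1 Gbit/s expressed in bytes per second
max_fps_raw = link_bytes_per_s // raw_fp32_bytes
max_fps_jpeg = link_bytes_per_s // jpeg_bytes

print(raw_fp32_bytes, max_fps_raw, max_fps_jpeg)
```

With these assumptions the link caps out at a few hundred raw tensors per second but thousands of JPEG images per second, which is why the gain is largest when the network, not the server, is the bottleneck.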
### Supported Image Formats for OVMS Compression
- Always supported:
- Portable image format - `*.pbm`, `*.pgm`, `*.ppm`, `*.pxm`, `*.pnm`.
- Radiance HDR - `*.hdr`, `*.pic`.
- Sun rasters - `*.sr`, `*.ras`.
- Windows bitmaps - `*.bmp`, `*.dib`.
- Limited support (refer to OpenCV documentation):
- Raster and Vector geospatial data supported by GDAL.
- JPEG files - `*.jpeg`, `*.jpg`, `*.jpe`.
- Portable Network Graphics - `*.png`.
- TIFF files - `*.tiff`, `*.tif`.
- OpenEXR Image files - `*.exr`.
- JPEG 2000 files - `*.jp2`.
- WebP - `*.webp`.
### googlenet-v4-tf (FP32)
![](../img/throughput_ovms_1gbps_googlenet4_fp32.png)
### resnet-50-tf (INT8)
![](../img/throughput_ovms_1gbps_resnet50_int8.png)
### resnet-50-tf (FP32)
![](../img/throughput_ovms_1gbps_resnet50_fp32.png)
## Platform Configurations
OpenVINO™ Model Server performance benchmark numbers are based on release 2021.4. Performance results are based on testing as of June 17, 2021 and may not reflect all publicly available updates.
### Platform with Intel® Xeon® Platinum 8260M
@sphinxdirective
.. raw:: html
<table class="table">
<tr>
<th></th>
<th><strong>Server Platform</strong></th>
<th><strong>Client Platform</strong></th>
</tr>
<tr>
<td><strong>Motherboard</strong></td>
<td>Inspur YZMB-00882-104 NF5280M5</td>
<td>Intel® Server Board S2600WF H48104-872</td>
</tr>
<tr>
<td><strong>Memory</strong></td>
<td>Samsung 16 x 16GB @ 2666 MT/s DDR4</td>
<td>Hynix 16 x 16GB @ 2666 MT/s DDR4</td>
</tr>
<tr>
<td><strong>CPU</strong></td>
<td>Intel® Xeon® Platinum 8260M CPU @ 2.40GHz</td>
<td>Intel® Xeon® Gold 6252 CPU @ 2.10GHz</td>
</tr>
<tr>
<td><strong>Selected CPU Flags</strong></td>
<td>Hyper Threading, Turbo Boost, DL Boost</td>
<td>Hyper Threading, Turbo Boost, DL Boost</td>
</tr>
<tr>
<td><strong>CPU Thermal Design Power</strong></td>
<td>162 W</td>
<td>150 W</td>
</tr>
<tr>
<td><strong>Operating System</strong></td>
<td>Ubuntu 20.04.2 LTS</td>
<td>Ubuntu 20.04.2 LTS</td>
</tr>
<tr>
<td><strong>Kernel Version</strong></td>
<td>5.4.0-54-generic</td>
<td>5.4.0-65-generic</td>
</tr>
<tr>
<td><strong>BIOS Vendor</strong></td>
<td>American Megatrends Inc.</td>
<td>Intel® Corporation</td>
</tr>
<tr>
<td><strong>BIOS Version & Release</strong></td>
<td>4.1.16, date: 06/23/2020</td>
<td>SE5C620.86B.02.01, date: 03/26/2020</td>
</tr>
<tr>
<td><strong>Docker Version</strong></td>
<td>20.10.3</td>
<td>20.10.3</td>
</tr>
<tr>
<td><strong>Network Speed</strong></td>
<td colspan="2">40 Gb/s</td>
</tr>
</table>
@endsphinxdirective
### Platform with Intel® Xeon® Gold 6252
@sphinxdirective
.. raw:: html
<table class="table">
<tr>
<th></th>
<th><strong>Server Platform</strong></th>
<th><strong>Client Platform</strong></th>
</tr>
<tr>
<td><strong>Motherboard</strong></td>
<td>Intel® Server Board S2600WF H48104-872</td>
<td>Inspur YZMB-00882-104 NF5280M5</td>
</tr>
<tr>
<td><strong>Memory</strong></td>
<td>Hynix 16 x 16GB @ 2666 MT/s DDR4</td>
<td>Samsung 16 x 16GB @ 2666 MT/s DDR4</td>
</tr>
<tr>
<td><strong>CPU</strong></td>
<td>Intel® Xeon® Gold 6252 CPU @ 2.10GHz</td>
<td>Intel® Xeon® Platinum 8260M CPU @ 2.40GHz</td>
</tr>
<tr>
<td><strong>Selected CPU Flags</strong></td>
<td>Hyper Threading, Turbo Boost, DL Boost</td>
<td>Hyper Threading, Turbo Boost, DL Boost</td>
</tr>
<tr>
<td><strong>CPU Thermal Design Power</strong></td>
<td>150 W</td>
<td>162 W</td>
</tr>
<tr>
<td><strong>Operating System</strong></td>
<td>Ubuntu 20.04.2 LTS</td>
<td>Ubuntu 20.04.2 LTS</td>
</tr>
<tr>
<td><strong>Kernel Version</strong></td>
<td>5.4.0-65-generic</td>
<td>5.4.0-54-generic</td>
</tr>
<tr>
<td><strong>BIOS Vendor</strong></td>
<td>Intel® Corporation</td>
<td>American Megatrends Inc.</td>
</tr>
<tr>
<td><strong>BIOS Version and Release Date</strong></td>
<td>SE5C620.86B.02.01, date: 03/26/2020</td>
<td>4.1.16, date: 06/23/2020</td>
</tr>
<tr>
<td><strong>Docker Version</strong></td>
<td>20.10.3</td>
<td>20.10.3</td>
</tr>
<tr>
<td><strong>Network Speed</strong></td>
<td colspan="2" align="center">40 Gb/s</td>
</tr>
</table>
@endsphinxdirective
### Platform with Intel® Core™ i9-10920X
@sphinxdirective
.. raw:: html
<table class="table">
<tr>
<th></th>
<th><strong>Server Platform</strong></th>
<th><strong>Client Platform</strong></th>
</tr>
<tr>
<td><strong>Motherboard</strong></td>
<td>ASUSTeK COMPUTER INC. PRIME X299-A II</td>
<td>ASUSTeK COMPUTER INC. PRIME Z370-P</td>
</tr>
<tr>
<td><strong>Memory</strong></td>
<td>Corsair 4 x 16GB @ 2666 MT/s DDR4</td>
<td>Corsair 4 x 16GB @ 2133 MT/s DDR4</td>
</tr>
<tr>
<td><strong>CPU</strong></td>
<td>Intel® Core™ i9-10920X CPU @ 3.50GHz</td>
<td>Intel® Core™ i7-8700T CPU @ 2.40GHz</td>
</tr>
<tr>
<td><strong>Selected CPU Flags</strong></td>
<td>Hyper Threading, Turbo Boost, DL Boost</td>
<td>Hyper Threading, Turbo Boost</td>
</tr>
<tr>
<td><strong>CPU Thermal Design Power</strong></td>
<td>165 W</td>
<td>35 W</td>
</tr>
<tr>
<td><strong>Operating System</strong></td>
<td>Ubuntu 20.04.1 LTS</td>
<td>Ubuntu 20.04.1 LTS</td>
</tr>
<tr>
<td><strong>Kernel Version</strong></td>
<td>5.4.0-52-generic</td>
<td>5.4.0-56-generic</td>
</tr>
<tr>
<td><strong>BIOS Vendor</strong></td>
<td>American Megatrends Inc.</td>
<td>American Megatrends Inc.</td>
</tr>
<tr>
<td><strong>BIOS Version and Release Date</strong></td>
<td>0603, date: 03/05/2020</td>
<td>2401, date: 07/15/2019</td>
</tr>
<tr>
<td><strong>Docker Version</strong></td>
<td>19.03.13</td>
<td>19.03.14</td>
</tr>
<tr>
<td><strong>Network Speed</strong></td>
<td colspan="2" align="center">10 Gb/s</td>
</tr>
</table>
@endsphinxdirective
### Platform with Intel® Core™ i7-8700T
@sphinxdirective
.. raw:: html
<table class="table">
<tr>
<th></th>
<th><strong>Server Platform</strong></th>
<th><strong>Client Platform</strong></th>
</tr>
<tr>
<td><strong>Motherboard</strong></td>
<td>ASUSTeK COMPUTER INC. PRIME Z370-P</td>
<td>ASUSTeK COMPUTER INC. PRIME X299-A II</td>
</tr>
<tr>
<td><strong>Memory</strong></td>
<td>Corsair 4 x 16GB @ 2133 MT/s DDR4</td>
<td>Corsair 4 x 16GB @ 2666 MT/s DDR4</td>
</tr>
<tr>
<td><strong>CPU</strong></td>
<td>Intel® Core™ i7-8700T CPU @ 2.40GHz</td>
<td>Intel® Core™ i9-10920X CPU @ 3.50GHz</td>
</tr>
<tr>
<td><strong>Selected CPU Flags</strong></td>
<td>Hyper Threading, Turbo Boost</td>
<td>Hyper Threading, Turbo Boost, DL Boost</td>
</tr>
<tr>
<td><strong>CPU Thermal Design Power</strong></td>
<td>35 W</td>
<td>165 W</td>
</tr>
<tr>
<td><strong>Operating System</strong></td>
<td>Ubuntu 20.04.1 LTS</td>
<td>Ubuntu 20.04.1 LTS</td>
</tr>
<tr>
<td><strong>Kernel Version</strong></td>
<td>5.4.0-56-generic</td>
<td>5.4.0-52-generic</td>
</tr>
<tr>
<td><strong>BIOS Vendor</strong></td>
<td>American Megatrends Inc.</td>
<td>American Megatrends Inc.</td>
</tr>
<tr>
<td><strong>BIOS Version and Release Date</strong></td>
<td>2401, date: 07/15/2019</td>
<td>0603, date: 03/05/2020</td>
</tr>
<tr>
<td><strong>Docker Version</strong></td>
<td>19.03.14</td>
<td>19.03.13</td>
</tr>
<tr>
<td><strong>Network Speed</strong></td>
<td colspan="2" align="center">10 Gb/s</td>
</tr>
</table>
@endsphinxdirective
### Platform with Intel® Core™ i5-8500
@sphinxdirective
.. raw:: html
<table class="table">
<tr>
<th></th>
<th><strong>Server Platform</strong></th>
<th><strong>Client Platform</strong></th>
</tr>
<tr>
<td><strong>Motherboard</strong></td>
<td>ASUSTeK COMPUTER INC. PRIME Z370-A</td>
<td>Gigabyte Technology Co., Ltd. Z390 UD</td>
</tr>
<tr>
<td><strong>Memory</strong></td>
<td>Corsair 2 x 16GB @ 2133 MT/s DDR4</td>
<td>029E 4 x 8GB @ 2400 MT/s DDR4</td>
</tr>
<tr>
<td><strong>CPU</strong></td>
<td>Intel® Core™ i5-8500 CPU @ 3.00GHz</td>
<td>Intel® Core™ i3-8100 CPU @ 3.60GHz</td>
</tr>
<tr>
<td><strong>Selected CPU Flags</strong></td>
<td>Turbo Boost</td>
<td>-</td>
</tr>
<tr>
<td><strong>CPU Thermal Design Power</strong></td>
<td>65 W</td>
<td>65 W</td>
</tr>
<tr>
<td><strong>Operating System</strong></td>
<td>Ubuntu 20.04.1 LTS</td>
<td>Ubuntu 20.04.1 LTS</td>
</tr>
<tr>
<td><strong>Kernel Version</strong></td>
<td>5.4.0-52-generic</td>
<td>5.4.0-52-generic</td>
</tr>
<tr>
<td><strong>BIOS Vendor</strong></td>
<td>American Megatrends Inc.</td>
<td>American Megatrends Inc.</td>
</tr>
<tr>
<td><strong>BIOS Version and Release Date</strong></td>
<td>2401, date: 07/12/2019</td>
<td>F10j, date: 09/16/2020</td>
</tr>
<tr>
<td><strong>Docker Version</strong></td>
<td>19.03.13</td>
<td>20.10.0</td>
</tr>
<tr>
<td><strong>Network Speed</strong></td>
<td colspan="2" align="center">40 Gb/s</td>
</tr>
</table>
@endsphinxdirective
### Platform with Intel® Core™ i3-8100
@sphinxdirective
.. raw:: html
<table class="table">
<tr>
<th></th>
<th><strong>Server Platform</strong></th>
<th><strong>Client Platform</strong></th>
</tr>
<tr>
<td><strong>Motherboard</strong></td>
<td>Gigabyte Technology Co., Ltd. Z390 UD</td>
<td>ASUSTeK COMPUTER INC. PRIME Z370-A</td>
</tr>
<tr>
<td><strong>Memory</strong></td>
<td>029E 4 x 8GB @ 2400 MT/s DDR4</td>
<td>Corsair 2 x 16GB @ 2133 MT/s DDR4</td>
</tr>
<tr>
<td><strong>CPU</strong></td>
<td>Intel® Core™ i3-8100 CPU @ 3.60GHz</td>
<td>Intel® Core™ i5-8500 CPU @ 3.00GHz</td>
</tr>
<tr>
<td><strong>Selected CPU Flags</strong></td>
<td>-</td>
<td>Turbo Boost</td>
</tr>
<tr>
<td><strong>CPU Thermal Design Power</strong></td>
<td>65 W</td>
<td>65 W</td>
</tr>
<tr>
<td><strong>Operating System</strong></td>
<td>Ubuntu 20.04.1 LTS</td>
<td>Ubuntu 20.04.1 LTS</td>
</tr>
<tr>
<td><strong>Kernel Version</strong></td>
<td>5.4.0-52-generic</td>
<td>5.4.0-52-generic</td>
</tr>
<tr>
<td><strong>BIOS Vendor</strong></td>
<td>American Megatrends Inc.</td>
<td>American Megatrends Inc.</td>
</tr>
<tr>
<td><strong>BIOS Version and Release Date</strong></td>
<td>F10j, date: 09/16/2020</td>
<td>2401, date: 07/12/2019</td>
</tr>
<tr>
<td><strong>Docker Version</strong></td>
<td>20.10.0</td>
<td>19.03.13</td>
</tr>
<tr>
<td><strong>Network Speed</strong></td>
<td colspan="2" align="center">40 Gb/s</td>
</tr>
</table>
@endsphinxdirective


@@ -10,450 +10,120 @@ The following table presents the absolute accuracy drop calculated as the accura
<th></th>
<th></th>
<th></th>
<th>Intel® Core™ <br>i9-10920X CPU<br>@ 3.50GHZ (VNNI)</th>
<th>Intel® Core™ <br>i9-9820X CPU<br>@ 3.30GHz (AVX512)</th>
<th>Intel® Core™ <br>i7-6700K CPU<br>@ 4.0GHz (AVX2)</th>
<th>Intel® Core™ <br>i7-1185G7 CPU<br>@ 4.0GHz (TGL VNNI)</th>
<th>Intel® Core™ i9-12900K @ 3.2 GHz (AVX2)</th>
<th>Intel® Core™ i9-12900K @ 3.2 GHz (AVX2)</th>
<th>iGPU Gen12LP (Intel® Core™ i9-12900K @ 3.2 GHz)</th>
</tr>
<tr align="left">
<th>OpenVINO Benchmark <br>Model Name</th>
<th>Dataset</th>
<th>Metric Name</th>
<th colspan="4" align="center">Absolute Accuracy Drop, %</th>
<th colspan="3" align="center">Absolute Accuracy Drop, %</th>
</tr>
<tr>
<td>bert-base-cased</td>
<td>SST-2</td>
<td>accuracy</td>
<td>0.57</td>
<td>0.11</td>
<td>0.11</td>
<td>0.57</td>
<td>0.34</td>
<td>0.46</td>
</tr>
<tr>
<td>bert-large-uncased-whole-word-masking-squad-0001</td>
<td>SQUAD</td>
<td>F1</td>
<td>0.76</td>
<td>0.59</td>
<td>0.68</td>
<td>0.76</td>
</tr>
<td>0.87</td>
<td>1.11</td>
<td>0.70</td>
</tr>
<tr>
<td>brain-tumor-<br>segmentation-<br>0001-MXNET</td>
<td>BraTS</td>
<td>Dice-index@ <br>Mean@ <br>Overall Tumor</td>
<td>0.10</td>
<td>0.10</td>
<td>0.10</td>
<td>0.10</td>
</tr>
<tr>
<td>brain-tumor-<br>segmentation-<br>0001-ONNX</td>
<td>BraTS</td>
<td>Dice-index@ <br>Mean@ <br>Overall Tumor</td>
<td>0.11</td>
<td>0.12</td>
<td>0.12</td>
<td>0.11</td>
</tr>
<tr>
<td>deeplabv3-TF</td>
<td>deeplabv3</td>
<td>VOC2012</td>
<td>mean_iou</td>
<td>0.03</td>
<td>0.42</td>
<td>0.42</td>
<td>0.03</td>
<td>0.04</td>
<td>0.04</td>
<td>0.11</td>
</tr>
<tr>
<td>densenet-121-TF</td>
<td>densenet-121</td>
<td>ImageNet</td>
<td>accuracy@top1</td>
<td>0.50</td>
<td>0.56</td>
<td>0.56</td>
<td>0.50</td>
<td>0.63</td>
</tr>
<tr>
<td>efficientdet-d0-tf</td>
<td>efficientdet-d0</td>
<td>COCO2017</td>
<td>coco_precision</td>
<td>0.55</td>
<td>0.81</td>
<td>0.81</td>
<td>0.55</td>
<td>0.63</td>
<td>0.62</td>
<td>0.45</td>
</tr>
<tr>
<td>facenet-<br>20180408-<br>102900-TF</td>
<td>LFW_MTCNN</td>
<td>pairwise_<br>accuracy<br>_subsets</td>
<td>0.05</td>
<td>0.12</td>
<td>0.12</td>
<td>0.05</td>
</tr>
<tr>
<td>faster_rcnn_<br>resnet50_coco-TF</td>
<td>faster_rcnn_<br>resnet50_coco</td>
<td>COCO2017</td>
<td>coco_<br>precision</td>
<td>0.16</td>
<td>0.52</td>
<td>0.55</td>
<td>0.31</td>
</tr>
<tr>
<td>resnet-18</td>
<td>ImageNet</td>
<td>acc@top-1</td>
<td>0.16</td>
<td>0.16</td>
<td>0.16</td>
</tr>
<tr>
<td>googlenet-v3-tf</td>
<td>resnet-50</td>
<td>ImageNet</td>
<td>accuracy@top1</td>
<td>0.01</td>
<td>0.01</td>
<td>0.01</td>
<td>0.01</td>
</tr>
<tr>
<td>googlenet-v4-tf</td>
<td>ImageNet</td>
<td>accuracy@top1</td>
<td>acc@top-1</td>
<td>0.09</td>
<td>0.09</td>
<td>0.06</td>
<td>0.06</td>
<td>0.09</td>
</tr>
<tr>
<td>mask_rcnn_resnet50_<br>atrous_coco-tf</td>
<td>resnet-50-pytorch</td>
<td>ImageNet</td>
<td>acc@top-1</td>
<td>0.13</td>
<td>0.13</td>
<td>0.11</td>
</tr>
<tr>
<td>ssd-resnet34-1200</td>
<td>COCO2017</td>
<td>coco_orig_precision</td>
<td>0.02</td>
<td>0.10</td>
<td>0.10</td>
<td>0.02</td>
</tr>
<tr>
<td>mobilenet-<br>ssd-caffe</td>
<td>VOC2012</td>
<td>mAP</td>
<td>0.51</td>
<td>0.54</td>
<td>0.54</td>
<td>0.51</td>
</tr>
<tr>
<td>mobilenet-v2-1.0-<br>224-TF</td>
<td>ImageNet</td>
<td>acc@top-1</td>
<td>0.35</td>
<td>0.79</td>
<td>0.79</td>
<td>0.35</td>
</tr>
<tr>
<td>mobilenet-v2-<br>PYTORCH</td>
<td>ImageNet</td>
<td>acc@top-1</td>
<td>0.34</td>
<td>0.58</td>
<td>0.58</td>
<td>0.34</td>
</tr>
<tr>
<td>resnet-18-<br>pytorch</td>
<td>ImageNet</td>
<td>acc@top-1</td>
<td>0.29</td>
<td>0.25</td>
<td>0.25</td>
<td>0.29</td>
</tr>
<tr>
<td>resnet-50-<br>PYTORCH</td>
<td>ImageNet</td>
<td>acc@top-1</td>
<td>0.24</td>
<td>0.20</td>
<td>0.20</td>
<td>0.24</td>
</tr>
<tr>
<td>resnet-50-<br>TF</td>
<td>ImageNet</td>
<td>acc@top-1</td>
<td>0.10</td>
<td>COCO mAp</td>
<td>0.09</td>
<td>0.09</td>
<td>0.10</td>
<td>0.13</td>
</tr>
<tr>
<td>ssd_mobilenet_<br>v1_coco-tf</td>
<td>unet-camvid-onnx-0001</td>
<td>CamVid</td>
<td>mean_iou@mean</td>
<td>0.56</td>
<td>0.56</td>
<td>0.60</td>
</tr>
<tr>
<td>yolo-v3-tiny</td>
<td>COCO2017</td>
<td>coco_precision</td>
<td>0.23</td>
<td>3.06</td>
<td>3.06</td>
<td>COCO mAp</td>
<td>0.12</td>
<td>0.12</td>
<td>0.17</td>
</tr>
<tr>
<td>ssdlite_<br>mobilenet_<br>v2-TF</td>
<td>COCO2017</td>
<td>coco_precision</td>
<td>0.09</td>
<td>0.44</td>
<td>0.44</td>
<td>0.09</td>
</tr>
<tr>
<td>ssd-resnet34-<br>1200-onnx</td>
<td>yolo_v4</td>
<td>COCO2017</td>
<td>COCO mAp</td>
<td>0.09</td>
<td>0.08</td>
<td>0.09</td>
<td>0.09</td>
</tr>
<tr>
<td>unet-camvid-<br>onnx-0001</td>
<td>CamVid</td>
<td>mean_iou@mean</td>
<td>0.33</td>
<td>0.33</td>
<td>0.33</td>
<td>0.33</td>
</tr>
<tr>
<td>yolo-v3-tiny-tf</td>
<td>COCO2017</td>
<td>COCO mAp</td>
<td>0.05</td>
<td>0.08</td>
<td>0.08</td>
<td>0.05</td>
</tr>
<tr>
<td>yolo_v4-TF</td>
<td>COCO2017</td>
<td>COCO mAp</td>
<td>0.03</td>
<td>0.01</td>
<td>0.01</td>
<td>0.03</td>
<td>0.52</td>
<td>0.52</td>
<td>0.54</td>
</tr>
</table>
@endsphinxdirective
The table below illustrates the speed-up factor for the performance gain by switching from an FP32 representation of an OpenVINO™ supported model to its INT8 representation:
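The speed-up factor in the table is a plain throughput ratio; a minimal illustration with placeholder numbers:

```python
# Speed-up factor = INT8 throughput / FP32 throughput for the same model
# on the same hardware. The FPS values are placeholders, not benchmark results.
fps_fp32 = 250.0   # throughput with the FP32 model
fps_int8 = 600.0   # throughput with the FP16-INT8 model

speedup = fps_int8 / fps_fp32
print(round(speedup, 1))
```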
@sphinxdirective
.. raw:: html
<table class="table">
<tr align="left">
<th></th>
<th></th>
<th>Intel® Core™ <br>i7-8700T</th>
<th>Intel® Core™ <br>i7-1185G7</th>
<th>Intel® Xeon® <br>W-1290P</th>
<th>Intel® Xeon® <br>Platinum <br>8270</th>
</tr>
<tr align="left">
<th>OpenVINO <br>benchmark <br>model name</th>
<th>Dataset</th>
<th colspan="4" align="center">Throughput speed-up FP16-INT8 vs FP32</th>
</tr>
<tr>
<td>bert-base-cased</td>
<td>SST-2</td>
<td>1.5</td>
<td>3.0</td>
<td>1.4</td>
<td>2.4</td>
</tr>
<tr>
<td>bert-large-uncased-whole-word-masking-squad-0001</td>
<td>SQUAD</td>
<td>1.7</td>
<td>3.2</td>
<td>1.7</td>
<td>3.3</td>
</tr>
<tr>
<td>brain-tumor-<br>segmentation-<br>0001-MXNET</td>
<td>BraTS</td>
<td>1.6</td>
<td>2.0</td>
<td>1.9</td>
<td>2.1</td>
</tr>
<tr>
<td>brain-tumor-<br>segmentation-<br>0001-ONNX</td>
<td>BraTS</td>
<td>2.6</td>
<td>3.2</td>
<td>3.3</td>
<td>3.0</td>
</tr>
<tr>
<td>deeplabv3-TF</td>
<td>VOC2012</td>
<td>1.9</td>
<td>3.1</td>
<td>3.5</td>
<td>3.8</td>
</tr>
<tr>
<td>densenet-121-TF</td>
<td>ImageNet</td>
<td>1.7</td>
<td>3.3</td>
<td>1.9</td>
<td>3.7</td>
</tr>
<tr>
<td>efficientdet-d0-tf</td>
<td>COCO2017</td>
<td>1.6</td>
<td>1.9</td>
<td>2.5</td>
<td>2.3</td>
</tr>
<tr>
<td>facenet-<br>20180408-<br>102900-TF</td>
<td>LFW_MTCNN</td>
<td>2.1</td>
<td>3.5</td>
<td>2.4</td>
<td>3.4</td>
</tr>
<tr>
<td>faster_rcnn_<br>resnet50_coco-TF</td>
<td>COCO2017</td>
<td>1.9</td>
<td>3.7</td>
<td>1.9</td>
<td>3.3</td>
</tr>
<tr>
<td>googlenet-v3-tf</td>
<td>ImageNet</td>
<td>1.9</td>
<td>3.7</td>
<td>2.0</td>
<td>4.0</td>
</tr>
<tr>
<td>googlenet-v4-tf</td>
<td>ImageNet</td>
<td>1.9</td>
<td>3.7</td>
<td>2.0</td>
<td>4.2</td>
</tr>
<tr>
<td>mask_rcnn_resnet50_<br>atrous_coco-tf</td>
<td>COCO2017</td>
<td>1.6</td>
<td>3.6</td>
<td>1.6</td>
<td>2.3</td>
</tr>
<tr>
<td>mobilenet-<br>ssd-caffe</td>
<td>VOC2012</td>
<td>1.6</td>
<td>3.1</td>
<td>2.2</td>
<td>3.8</td>
</tr>
<tr>
<td>mobilenet-v2-1.0-<br>224-TF</td>
<td>ImageNet</td>
<td>1.5</td>
<td>2.4</td>
<td>2.1</td>
<td>3.3</td>
</tr>
<tr>
<td>mobilenet-v2-<br>PYTORCH</td>
<td>ImageNet</td>
<td>1.5</td>
<td>2.4</td>
<td>2.1</td>
<td>3.4</td>
</tr>
<tr>
<td>resnet-18-<br>pytorch</td>
<td>ImageNet</td>
<td>2.0</td>
<td>4.1</td>
<td>2.2</td>
<td>4.1</td>
</tr>
<tr>
<td>resnet-50-<br>PYTORCH</td>
<td>ImageNet</td>
<td>1.9</td>
<td>3.5</td>
<td>2.1</td>
<td>4.0</td>
</tr>
<tr>
<td>resnet-50-<br>TF</td>
<td>ImageNet</td>
<td>1.9</td>
<td>3.5</td>
<td>2.0</td>
<td>4.0</td>
</tr>
<tr>
<td>ssd_mobilenet_<br>v1_coco-tf</td>
<td>COCO2017</td>
<td>1.7</td>
<td>3.1</td>
<td>2.2</td>
<td>3.6</td>
</tr>
<tr>
<td>ssdlite_<br>mobilenet_<br>v2-TF</td>
<td>COCO2017</td>
<td>1.6</td>
<td>2.4</td>
<td>2.7</td>
<td>3.2</td>
</tr>
<tr>
<td>ssd-resnet34-<br>1200-onnx</td>
<td>COCO2017</td>
<td>1.7</td>
<td>4.0</td>
<td>1.7</td>
<td>3.2</td>
</tr>
<tr>
<td>unet-camvid-<br>onnx-0001</td>
<td>CamVid</td>
<td>1.6</td>
<td>4.6</td>
<td>1.6</td>
<td>6.2</td>
</tr>
<tr>
<td>yolo-v3-tiny-tf</td>
<td>COCO2017</td>
<td>1.8</td>
<td>3.4</td>
<td>2.0</td>
<td>3.5</td>
</tr>
<tr>
<td>yolo_v4-TF</td>
<td>COCO2017</td>
<td>2.3</td>
<td>3.4</td>
<td>2.4</td>
<td>3.1</td>
</tr>
</table>
@endsphinxdirective
![INT8 vs FP32 Comparison](../img/int8vsfp32.png)
@endsphinxdirective


@@ -1,3 +0,0 @@
version https://git-lfs.github.com/spec/v1
oid sha256:b9f29fd468777e09c1e02bdf23996c5a05c7aa14ccee73cb6c48e9afae39af16
size 30476