Update benchmarks articles (#14438)

Update performance_benchmarks_faq.md
Update performance_int8_vs_fp32.md
Karol Blaszczak 2022-12-06 17:37:40 +01:00 committed by GitHub
parent 4171f258b2
commit 88b116af66
2 changed files with 184 additions and 102 deletions

performance_benchmarks_faq.md

@@ -1,71 +1,153 @@
# Performance Information Frequently Asked Questions {#openvino_docs_performance_benchmarks_faq}
# Performance Information F.A.Q. {#openvino_docs_performance_benchmarks_faq}
The following questions (Q#) and answers (A) are related to published [performance benchmarks](./performance_benchmarks.md).
#### Q1: How often do performance benchmarks get updated?
**A**: New performance benchmarks are typically published on every `major.minor` release of the Intel® Distribution of OpenVINO™ toolkit.
@sphinxdirective
#### Q2: Where can I find the models used in the performance benchmarks?
**A**: All models used are included in the GitHub repository of [Open Model Zoo](https://github.com/openvinotoolkit/open_model_zoo).
.. dropdown:: How often do performance benchmarks get updated?
#### Q3: Will there be any new models added to the list used for benchmarking?
**A**: The models used in the performance benchmarks were chosen based on general adoption and usage in deployment scenarios. New models that support a diverse set of workloads and usage are added periodically.
New performance benchmarks are typically published on every
`major.minor` release of the Intel® Distribution of OpenVINO™ toolkit.
#### Q4: What does "CF" or "TF" in the graphs stand for?
**A**: The "CF" means "Caffe", and "TF" means "TensorFlow".
.. dropdown:: Where can I find the models used in the performance benchmarks?
#### Q5: How can I run the benchmark results on my own?
**A**: All of the performance benchmarks were generated using the open-source tool within the Intel® Distribution of OpenVINO™ toolkit called `benchmark_app`. This tool is available in both [C++](../../samples/cpp/benchmark_app/README.md) and [Python](../../tools/benchmark_tool/README.md).
All models used are included in the GitHub repository of `Open Model Zoo <https://github.com/openvinotoolkit/open_model_zoo>`_.
#### Q6: What image sizes are used for the classification network models?
**A**: The image size used in inference depends on the benchmarked network. The table below presents the list of input sizes for each network model:
.. dropdown:: Will there be any new models added to the list used for benchmarking?
| **Model** | **Public Network** | **Task** | **Input Size** (Height x Width) |
|------------------------------------------------------------------------------------------------------------------------------------|------------------------------------|-----------------------------|-----------------------------------|
| [bert-base-cased](https://github.com/PaddlePaddle/PaddleNLP/tree/v2.1.1) | BERT | question / answer | 124 |
| [bert-large-uncased-whole-word-masking-squad-int8-0001](https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/intel/bert-large-uncased-whole-word-masking-squad-int8-0001) | BERT-large | question / answer | 384 |
| [bert-small-uncased-whole-word-masking-squad-0002](https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/intel/bert-small-uncased-whole-word-masking-squad-0002) | BERT-small | question / answer | 384 |
| [brain-tumor-segmentation-0001-MXNET](https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/public/brain-tumor-segmentation-0001) | brain-tumor-segmentation-0001 | semantic segmentation | 128x128x128 |
| [brain-tumor-segmentation-0002-CF2](https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/public/brain-tumor-segmentation-0002) | brain-tumor-segmentation-0002 | semantic segmentation | 128x128x128 |
| [deeplabv3-TF](https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/public/deeplabv3) | DeepLab v3 Tf | semantic segmentation | 513x513 |
| [densenet-121-TF](https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/public/densenet-121-tf) | Densenet-121 Tf | classification | 224x224 |
| [efficientdet-d0](https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/public/efficientdet-d0-tf) | Efficientdet | object detection | 512x512 |
| [facenet-20180408-102900-TF](https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/public/facenet-20180408-102900) | FaceNet TF | face recognition | 160x160 |
| [Facedetection0200](https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/intel/face-detection-0200) | FaceDetection0200 | detection | 256x256 |
| [faster_rcnn_resnet50_coco-TF](https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/public/faster_rcnn_resnet50_coco) | Faster RCNN Tf | object detection | 600x1024 |
| [forward-tacotron-duration-prediction](https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/public/forward-tacotron) | ForwardTacotron | text to speech | 241 |
| [inception-v4-TF](https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/public/googlenet-v4-tf) | Inception v4 Tf (aka GoogleNet-V4) | classification | 299x299 |
| [inception-v3-TF](https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/public/googlenet-v3) | Inception v3 Tf | classification | 299x299 |
| [mask_rcnn_resnet50_atrous_coco](https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/public/mask_rcnn_resnet50_atrous_coco) | Mask R-CNN ResNet50 Atrous | instance segmentation | 800x1365 |
| [mobilenet-ssd-CF](https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/public/mobilenet-ssd) | SSD (MobileNet)_COCO-2017_Caffe | object detection | 300x300 |
| [mobilenet-v2-1.0-224-TF](https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/public/mobilenet-v2-1.0-224) | MobileNet v2 Tf | classification | 224x224 |
| [mobilenet-v2-pytorch](https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/public/mobilenet-v2-pytorch) | Mobilenet V2 PyTorch | classification | 224x224 |
| [Mobilenet-V3-small](https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/public/mobilenet-v3-small-1.0-224-tf) | Mobilenet-V3-1.0-224 | classification | 224x224 |
| [Mobilenet-V3-large](https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/public/mobilenet-v3-large-1.0-224-tf) | Mobilenet-V3-1.0-224 | classification | 224x224 |
| [pp-ocr-rec](https://github.com/PaddlePaddle/PaddleOCR/tree/release/2.1/) | PP-OCR | optical character recognition | 32x640 |
| [pp-yolo](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.1) | PP-YOLO | detection | 640x640 |
| [resnet-18-pytorch](https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/public/resnet-18-pytorch) | ResNet-18 PyTorch | classification | 224x224 |
| [resnet-50-pytorch](https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/public/resnet-50-pytorch) | ResNet-50 v1 PyTorch | classification | 224x224 |
| [resnet-50-TF](https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/public/resnet-50-tf) | ResNet-50_v1_ILSVRC-2012 | classification | 224x224 |
| [yolo_v4-TF](https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/public/yolo-v4-tf) | Yolo-V4 TF | object detection | 608x608 |
| [ssd_mobilenet_v1_coco-TF](https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/public/ssd_mobilenet_v1_coco) | ssd_mobilenet_v1_coco | object detection | 300x300 |
| [ssdlite_mobilenet_v2-TF](https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/public/ssdlite_mobilenet_v2) | ssdlite_mobilenet_v2 | object detection | 300x300 |
| [unet-camvid-onnx-0001](https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/intel/unet-camvid-onnx-0001) | U-Net | semantic segmentation | 368x480 |
| [yolo-v3-tiny-tf](https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/public/yolo-v3-tiny-tf) | YOLO v3 Tiny | object detection | 416x416 |
| [yolo-v3](https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/public/yolo-v3-tf) | YOLO v3 | object detection | 416x416 |
| [ssd-resnet34-1200-onnx](https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/public/ssd-resnet34-1200-onnx) | ssd-resnet34 onnx model | object detection | 1200x1200 |
The models used in the performance benchmarks were chosen based
on general adoption and usage in deployment scenarios. New models that
support a diverse set of workloads and usage are added periodically.
#### Q7: Where can I purchase the specific hardware used in the benchmarking?
**A**: Intel partners with vendors all over the world. For a list of hardware manufacturers, see the [Intel® AI: In Production Partners & Solutions Catalog](https://www.intel.com/content/www/us/en/internet-of-things/ai-in-production/partners-solutions-catalog.html). For more details, see the [Supported Devices](../OV_Runtime_UG/supported_plugins/Supported_Devices.md) documentation. Before purchasing any hardware, you can test and run models remotely using [Intel® DevCloud for the Edge](http://devcloud.intel.com/edge/).
.. dropdown:: How can I run the benchmark results on my own?
#### Q8: How can I optimize my models for better performance or accuracy?
**A**: A set of guidelines and recommendations for optimizing models is available in the [optimization guide](../optimization_guide/dldt_optimization_guide.md). Join the conversation in the [Community Forum](https://software.intel.com/en-us/forums/intel-distribution-of-openvino-toolkit) for further support.
All of the performance benchmarks were generated using the
open-source tool within the Intel® Distribution of OpenVINO™ toolkit
called `benchmark_app`. This tool is available in both `C++ <https://github.com/openvinotoolkit/openvino/blob/master/samples/cpp/benchmark_app/README.md>`_ and `Python <https://github.com/openvinotoolkit/openvino/blob/master/tools/benchmark_tool/README.md>`_.
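As an illustration only (this is not the source of ``benchmark_app``), the minimal Python sketch below approximates a throughput measurement with the OpenVINO Runtime API, similar to the tool's throughput mode; the model path, input data, and iteration count are placeholder assumptions.

.. code-block:: python

   import time
   import numpy as np
   from openvino.runtime import Core, AsyncInferQueue

   core = Core()
   model = core.read_model("model.xml")  # placeholder path to an IR model
   compiled = core.compile_model(model, "CPU", {"PERFORMANCE_HINT": "THROUGHPUT"})

   # Random data shaped like the first model input (static shape assumed).
   data = np.random.rand(*list(compiled.input(0).shape)).astype(np.float32)

   queue = AsyncInferQueue(compiled)  # request pool sized by the hint
   n_iters = 1000
   start = time.perf_counter()
   for _ in range(n_iters):
       queue.start_async({0: data})
   queue.wait_all()
   print(f"Throughput: {n_iters / (time.perf_counter() - start):.1f} FPS")

For real measurements, prefer running ``benchmark_app`` itself, as it also handles warm-up, streams, and reporting.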
#### Q9: Why are INT8 optimized models used for benchmarking on CPUs with no VNNI support?
**A**: The benefit of low-precision optimization using the OpenVINO™ toolkit model optimizer extends beyond processors supporting VNNI through Intel® DL Boost. The reduced bit width of INT8 compared to FP32 allows Intel® CPUs to process the data faster. Therefore, it offers better throughput on any converted model, regardless of the intrinsically supported low-precision optimizations within Intel® hardware. For a comparison of boost factors for different network models and a selection of Intel® CPU architectures, including AVX-2 with Intel® Core™ i7-8700T and AVX-512 (VNNI) with Intel® Xeon® 5218T and Intel® Xeon® 8270, refer to the [Model Accuracy for INT8 and FP32 Precision](performance_int8_vs_fp32.md) article.
.. dropdown:: What image sizes are used for the classification network models?
#### Q10: Where can I search for OpenVINO™ performance results based on HW-platforms?
**A**: The website format has changed to support the more common approach of searching for the performance results of a given neural network model on different HW-platforms, as opposed to reviewing the performance of a given HW-platform when working with different neural network models.
The image size used in inference depends on the benchmarked
network. The table below presents the list of input sizes for each
network model:
#### Q11: How is Latency measured?
**A**: Latency is measured by running the OpenVINO™ Runtime in synchronous mode. In this mode, each frame or image is processed through the entire set of stages (pre-processing, inference, post-processing) before the next frame or image is processed. This KPI is relevant for applications where inference on a single image is required, for example, the analysis of an ultrasound image in a medical application or the analysis of a seismic image in the oil & gas industry. Other use cases include real-time or near real-time applications, e.g., the response of an industrial robot to changes in its environment and obstacle avoidance for autonomous vehicles, where a quick response to the result of the inference is required.
.. list-table::
:header-rows: 1
* - Model
- Public Network
- Task
- Input Size
* - `bert-base-cased <https://github.com/PaddlePaddle/PaddleNLP/tree/v2.1.1>`__
- BERT
- question / answer
- 124
* - `bert-large-uncased-whole-word-masking-squad-int8-0001 <https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/intel/bert-large-uncased-whole-word-masking-squad-int8-0001>`__
- BERT-large
- question / answer
- 384
* - `deeplabv3-TF <https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/public/deeplabv3>`__
- DeepLab v3 Tf
- semantic segmentation
- 513x513
* - `densenet-121-TF <https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/public/densenet-121-tf>`__
- Densenet-121 Tf
- classification
- 224x224
* - `efficientdet-d0 <https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/public/efficientdet-d0-tf>`__
- Efficientdet
- object detection
- 512x512
* - `faster_rcnn_resnet50_coco-TF <https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/public/faster_rcnn_resnet50_coco>`__
- Faster RCNN Tf
- object detection
- 600x1024
* - `inception-v4-TF <https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/public/googlenet-v4-tf>`__
- Inception v4 Tf (aka GoogleNet-V4)
- classification
- 299x299
* - `mobilenet-ssd-CF <https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/public/mobilenet-ssd>`__
- SSD (MobileNet)_COCO-2017_Caffe
- object detection
- 300x300
* - `mobilenet-v2-pytorch <https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/public/mobilenet-v2-pytorch>`__
- Mobilenet V2 PyTorch
- classification
- 224x224
* - `resnet-18-pytorch <https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/public/resnet-18-pytorch>`__
- ResNet-18 PyTorch
- classification
- 224x224
* - `resnet-50-TF <https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/public/resnet-50-tf>`__
- ResNet-50_v1_ILSVRC-2012
- classification
- 224x224
* - `ssd-resnet34-1200-onnx <https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/public/ssd-resnet34-1200-onnx>`__
- ssd-resnet34 onnx model
- object detection
- 1200x1200
* - `unet-camvid-onnx-0001 <https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/intel/unet-camvid-onnx-0001>`__
- U-Net
- semantic segmentation
- 368x480
* - `yolo-v3-tiny-tf <https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/public/yolo-v3-tiny-tf>`__
- YOLO v3 Tiny
- object detection
- 416x416
* - `yolo_v4-TF <https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/public/yolo-v4-tf>`__
- Yolo-V4 TF
- object detection
- 608x608
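The sizes above are the inputs used for benchmarking. To double-check which size a downloaded model actually expects, or to change it, a minimal sketch with the OpenVINO Python API is shown below; the file name and the NCHW layout are assumptions for illustration.

.. code-block:: python

   from openvino.runtime import Core

   core = Core()
   model = core.read_model("resnet-50-tf.xml")  # placeholder IR file name

   # Print every input with the shape stored in the model.
   for model_input in model.inputs:
       print(model_input.any_name, model_input.partial_shape)

   # Reshape a single-input model to the size listed in the table,
   # e.g. 224x224 for ResNet-50 (batch 1, 3 channels, NCHW assumed).
   model.reshape([1, 3, 224, 224])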
.. dropdown:: Where can I purchase the specific hardware used in the benchmarking?
Intel partners with vendors all over the world. For a list of hardware manufacturers, see the
`Intel® AI: In Production Partners & Solutions Catalog <https://www.intel.com/content/www/us/en/internet-of-things/ai-in-production/partners-solutions-catalog.html>`__.
For more details, see the [Supported Devices](../OV_Runtime_UG/supported_plugins/Supported_Devices.md)
documentation. Before purchasing any hardware, you can test and run
models remotely using `Intel® DevCloud for the Edge <http://devcloud.intel.com/edge/>`__.
.. dropdown:: How can I optimize my models for better performance or accuracy?
A set of guidelines and recommendations for optimizing models is available in the
[optimization guide](../optimization_guide/dldt_optimization_guide.md).
Join the conversation in the `Community Forum <https://software.intel.com/en-us/forums/intel-distribution-of-openvino-toolkit>`__
for further support.
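As a starting point, the high-level performance hints covered in the optimization guide can be set when compiling a model. The sketch below is an illustration only; the model path is a placeholder.

.. code-block:: python

   from openvino.runtime import Core

   core = Core()
   model = core.read_model("model.xml")  # placeholder path to an IR model

   # Latency-oriented configuration: favors a low delay per single request.
   compiled_latency = core.compile_model(model, "CPU", {"PERFORMANCE_HINT": "LATENCY"})

   # Throughput-oriented configuration: favors overall frames per second.
   compiled_throughput = core.compile_model(model, "CPU", {"PERFORMANCE_HINT": "THROUGHPUT"})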
.. dropdown:: Why are INT8 optimized models used for benchmarking on CPUs with no VNNI support?
The benefit of low-precision optimization using the OpenVINO™
toolkit model optimizer extends beyond processors supporting VNNI
through Intel® DL Boost. The reduced bit width of INT8 compared to FP32
allows Intel® CPUs to process the data faster. Therefore, it offers
better throughput on any converted model, regardless of the
intrinsically supported low-precision optimizations within Intel®
hardware. For a comparison of boost factors for different network models
and a selection of Intel® CPU architectures, including AVX-2 with Intel®
Core™ i7-8700T, and AVX-512 (VNNI) with Intel® Xeon® 5218T and Intel®
Xeon® 8270, refer to the [Model Accuracy for INT8 and FP32 Precision](performance_int8_vs_fp32.md) article.
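For illustration, one way to obtain such an INT8 model is post-training quantization. The sketch below is only a hedged example: it assumes the NNCF ``nncf.quantize`` API and uses synthetic calibration data purely as a placeholder.

.. code-block:: python

   import numpy as np
   import nncf
   from openvino.runtime import Core

   core = Core()
   model = core.read_model("model_fp32.xml")  # placeholder FP32 IR

   # Placeholder calibration samples; use a representative dataset in practice.
   samples = [np.random.rand(1, 3, 224, 224).astype(np.float32) for _ in range(300)]
   calibration_dataset = nncf.Dataset(samples, lambda item: item)

   # The resulting INT8 model runs faster on both AVX2 and VNNI-capable CPUs.
   quantized_model = nncf.quantize(model, calibration_dataset)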
.. dropdown:: Where can I search for OpenVINO™ performance results based on HW-platforms?
The website format has changed to support the more common
approach of searching for the performance results of a given neural
network model on different HW-platforms, as opposed to reviewing the
performance of a given HW-platform when working with different neural
network models.
.. dropdown:: How is Latency measured?
Latency is measured by running the OpenVINO™ Runtime in
synchronous mode. In this mode, each frame or image is processed through
the entire set of stages (pre-processing, inference, post-processing)
before the next frame or image is processed. This KPI is relevant for
applications where inference on a single image is required, for
example, the analysis of an ultrasound image in a medical application
or the analysis of a seismic image in the oil & gas industry. Other use
cases include real-time or near real-time applications, e.g., the response of an
industrial robot to changes in its environment and obstacle avoidance
for autonomous vehicles, where a quick response to the result of the
inference is required.
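As a hedged illustration of this synchronous measurement (the model path and input data are placeholders; running ``benchmark_app`` in synchronous mode is the reference way to measure latency):

.. code-block:: python

   import time
   import numpy as np
   from openvino.runtime import Core

   core = Core()
   model = core.read_model("model.xml")  # placeholder path to an IR model
   compiled = core.compile_model(model, "CPU", {"PERFORMANCE_HINT": "LATENCY"})
   request = compiled.create_infer_request()

   # Random frame shaped like the first model input (static shape assumed).
   frame = np.random.rand(*list(compiled.input(0).shape)).astype(np.float32)

   latencies_ms = []
   for _ in range(100):
       start = time.perf_counter()
       request.infer({0: frame})  # synchronous: returns only when inference is done
       latencies_ms.append((time.perf_counter() - start) * 1000)

   print(f"Median latency: {sorted(latencies_ms)[len(latencies_ms) // 2]:.2f} ms")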
@endsphinxdirective

performance_int8_vs_fp32.md

@@ -5,16 +5,16 @@ The following table presents the absolute accuracy drop calculated as the accura
@sphinxdirective
.. raw:: html
<table class="table">
<table class="table" id="model-accuracy-and-perf-int8-fp32-table">
<tr align="left">
<th></th>
<th></th>
<th></th>
<th>Intel® Core™ i9-12900K @ 3.2 GHz (AVX2)</th>
<th>Intel® Xeon® 6338 @ 2.0 GHz (VNNI)</th>
<th>iGPU Gen12LP (Intel® Core™ i9-12900K @ 3.2 GHz)</th>
<th class="light-header">Intel® Core™ i9-12900K @ 3.2 GHz (AVX2)</th>
<th class="light-header">Intel® Xeon® 6338 @ 2.0 GHz (VNNI)</th>
<th class="light-header">iGPU Gen12LP (Intel® Core™ i9-12900K @ 3.2 GHz)</th>
</tr>
<tr align="left">
<tr align="left" class="header">
<th>OpenVINO Benchmark <br>Model Name</th>
<th>Dataset</th>
<th>Metric Name</th>
@@ -24,105 +24,105 @@ The following table presents the absolute accuracy drop calculated as the accura
<td>bert-base-cased</td>
<td>SST-2</td>
<td>accuracy</td>
<td>0.11</td>
<td>0.34</td>
<td>0.46</td>
<td class="data">0.11</td>
<td class="data">0.34</td>
<td class="data">0.46</td>
</tr>
<tr>
<td>bert-large-uncased-whole-word-masking-squad-0001</td>
<td>SQUAD</td>
<td>F1</td>
<td>0.87</td>
<td>1.11</td>
<td>0.70</td>
<td class="data">0.87</td>
<td class="data">1.11</td>
<td class="data">0.70</td>
</tr>
<tr>
<td>deeplabv3</td>
<td>VOC2012</td>
<td>mean_iou</td>
<td>0.04</td>
<td>0.04</td>
<td>0.11</td>
<td class="data">0.04</td>
<td class="data">0.04</td>
<td class="data">0.11</td>
</tr>
<tr>
<td>densenet-121</td>
<td>ImageNet</td>
<td>accuracy@top1</td>
<td>0.56</td>
<td>0.56</td>
<td>0.63</td>
<td class="data">0.56</td>
<td class="data">0.56</td>
<td class="data">0.63</td>
</tr>
<tr>
<td>efficientdet-d0</td>
<td>COCO2017</td>
<td>coco_precision</td>
<td>0.63</td>
<td>0.62</td>
<td>0.45</td>
<td class="data">0.63</td>
<td class="data">0.62</td>
<td class="data">0.45</td>
</tr>
<tr>
<td>faster_rcnn_<br>resnet50_coco</td>
<td>COCO2017</td>
<td>coco_<br>precision</td>
<td>0.52</td>
<td>0.55</td>
<td>0.31</td>
<td class="data">0.52</td>
<td class="data">0.55</td>
<td class="data">0.31</td>
</tr>
<tr>
<td>resnet-18</td>
<td>ImageNet</td>
<td>acc@top-1</td>
<td>0.16</td>
<td>0.16</td>
<td>0.16</td>
<td class="data">0.16</td>
<td class="data">0.16</td>
<td class="data">0.16</td>
</tr>
<tr>
<td>resnet-50</td>
<td>ImageNet</td>
<td>acc@top-1</td>
<td>0.09</td>
<td>0.09</td>
<td>0.09</td>
<td class="data">0.09</td>
<td class="data">0.09</td>
<td class="data">0.09</td>
</tr>
<tr>
<td>resnet-50-pytorch</td>
<td>ImageNet</td>
<td>acc@top-1</td>
<td>0.13</td>
<td>0.13</td>
<td>0.11</td>
<td class="data">0.13</td>
<td class="data">0.13</td>
<td class="data">0.11</td>
</tr>
<tr>
<td>ssd-resnet34-1200</td>
<td>COCO2017</td>
<td>COCO mAP</td>
<td>0.09</td>
<td>0.09</td>
<td>0.13</td>
<td class="data">0.09</td>
<td class="data">0.09</td>
<td class="data">0.13</td>
</tr>
<tr>
<td>unet-camvid-onnx-0001</td>
<td>CamVid</td>
<td>mean_iou@mean</td>
<td>0.56</td>
<td>0.56</td>
<td>0.60</td>
<td class="data">0.56</td>
<td class="data">0.56</td>
<td class="data">0.60</td>
</tr>
<tr>
<td>yolo-v3-tiny</td>
<td>COCO2017</td>
<td>COCO mAP</td>
<td>0.12</td>
<td>0.12</td>
<td>0.17</td>
<td class="data">0.12</td>
<td class="data">0.12</td>
<td class="data">0.17</td>
</tr>
<tr>
<td>yolo_v4</td>
<td>COCO2017</td>
<td>COCO mAP</td>
<td>0.52</td>
<td>0.52</td>
<td>0.54</td>
<td class="data">0.52</td>
<td class="data">0.52</td>
<td class="data">0.54</td>
</tr>
</table>
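For clarity on how to read the numbers: each cell is an absolute accuracy drop, i.e. the absolute difference between the FP32 and INT8 metric values, expressed in percentage points. A trivial sketch with hypothetical numbers:

.. code-block:: python

   def absolute_accuracy_drop(fp32_metric: float, int8_metric: float) -> float:
       """Absolute difference between FP32 and INT8 metric values (percentage points)."""
       return abs(fp32_metric - int8_metric)

   # Hypothetical example: 76.10% top-1 in FP32 vs 76.01% in INT8 -> 0.09
   print(absolute_accuracy_drop(76.10, 76.01))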