
Performance Information Frequently Asked Questions

The following questions (Q#) and answers (A) are related to published performance benchmarks.

Q1: How often do performance benchmarks get updated?

A: New performance benchmarks are typically published on every major.minor release of the Intel® Distribution of OpenVINO™ toolkit.

Q2: Where can I find the models used in the performance benchmarks?

A: All models used are included in the GitHub repository of Open Model Zoo.
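
For example, assuming the `openvino-dev` Python package is installed, the Open Model Zoo tools can fetch and convert any of the listed models; the model name below is just an illustration:

```sh
# Download a public model from Open Model Zoo
omz_downloader --name resnet-50-tf

# Convert it to OpenVINO IR format (not needed for models already distributed as IR)
omz_converter --name resnet-50-tf
```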

Q3: Will there be any new models added to the list used for benchmarking?

A: The models used in the performance benchmarks were chosen based on general adoption and usage in deployment scenarios. New models that support a diverse set of workloads and usage are added periodically.

Q4: What does "CF" or "TF" in the graphs stand for?

A: The "CF" means "Caffe", and "TF" means "TensorFlow".

Q5: How can I run the benchmark results on my own?

A: All of the performance benchmarks were generated with benchmark_app, an open-source tool included in the Intel® Distribution of OpenVINO™ toolkit. The tool is available in both C++ and Python.
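
For illustration, a minimal invocation might look like the following; the model path is a placeholder, `-d` selects the target device, and `-hint` picks a latency- or throughput-oriented configuration:

```sh
# Throughput-oriented benchmark of an IR model on the CPU
benchmark_app -m model.xml -d CPU -hint throughput

# Latency-oriented benchmark of the same model on a GPU
benchmark_app -m model.xml -d GPU -hint latency
```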

Q6: What image sizes are used for the classification network models?

A: The image size used in inference depends on the benchmarked network. The table below lists the input size for each network model:

| Model | Public Network | Task | Input Size (Height x Width) |
|---|---|---|---|
| bert-base-cased | BERT | question / answer | 124 |
| bert-large-uncased-whole-word-masking-squad-int8-0001 | BERT-large | question / answer | 384 |
| bert-small-uncased-whole-word-masking-squad-0002 | BERT-small | question / answer | 384 |
| brain-tumor-segmentation-0001-MXNET | brain-tumor-segmentation-0001 | semantic segmentation | 128x128x128 |
| brain-tumor-segmentation-0002-CF2 | brain-tumor-segmentation-0002 | semantic segmentation | 128x128x128 |
| deeplabv3-TF | DeepLab v3 TF | semantic segmentation | 513x513 |
| densenet-121-TF | DenseNet-121 TF | classification | 224x224 |
| efficientdet-d0 | EfficientDet-D0 | object detection | 512x512 |
| facenet-20180408-102900-TF | FaceNet TF | face recognition | 160x160 |
| face-detection-0200 | FaceDetection0200 | detection | 256x256 |
| faster_rcnn_resnet50_coco-TF | Faster R-CNN TF | object detection | 600x1024 |
| forward-tacotron-duration-prediction | ForwardTacotron | text to speech | 241 |
| inception-v4-TF | Inception v4 TF (aka GoogLeNet-V4) | classification | 299x299 |
| inception-v3-TF | Inception v3 TF | classification | 299x299 |
| mask_rcnn_resnet50_atrous_coco | Mask R-CNN ResNet50 Atrous | instance segmentation | 800x1365 |
| mobilenet-ssd-CF | SSD (MobileNet)_COCO-2017_Caffe | object detection | 300x300 |
| mobilenet-v2-1.0-224-TF | MobileNet v2 TF | classification | 224x224 |
| mobilenet-v2-pytorch | MobileNet v2 PyTorch | classification | 224x224 |
| mobilenet-v3-small | MobileNet-V3-1.0-224 | classification | 224x224 |
| mobilenet-v3-large | MobileNet-V3-1.0-224 | classification | 224x224 |
| pp-ocr-rec | PP-OCR | optical character recognition | 32x640 |
| pp-yolo | PP-YOLO | detection | 640x640 |
| resnet-18-pytorch | ResNet-18 PyTorch | classification | 224x224 |
| resnet-50-pytorch | ResNet-50 v1 PyTorch | classification | 224x224 |
| resnet-50-TF | ResNet-50_v1_ILSVRC-2012 | classification | 224x224 |
| yolo_v4-TF | YOLO v4 TF | object detection | 608x608 |
| ssd_mobilenet_v1_coco-TF | ssd_mobilenet_v1_coco | object detection | 300x300 |
| ssdlite_mobilenet_v2-TF | ssdlite_mobilenet_v2 | object detection | 300x300 |
| unet-camvid-onnx-0001 | U-Net | semantic segmentation | 368x480 |
| yolo-v3-tiny-tf | YOLO v3 Tiny | object detection | 416x416 |
| yolo-v3 | YOLO v3 | object detection | 416x416 |
| ssd-resnet34-1200-onnx | ssd-resnet34 ONNX model | object detection | 1200x1200 |
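
As a quick sanity check, the expected input shape of a downloaded model can be read directly from its IR with the OpenVINO™ Runtime Python API; the file path below is a placeholder:

```python
# Print the input names and shapes of an OpenVINO IR model
from openvino.runtime import Core

core = Core()
model = core.read_model("resnet-50-tf.xml")  # placeholder path to a downloaded IR

for model_input in model.inputs:
    print(model_input.any_name, model_input.partial_shape)
```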

Q7: Where can I purchase the specific hardware used in the benchmarking?

A: Intel partners with vendors all over the world. For a list of hardware manufacturers, see the Intel® AI: In Production Partners & Solutions Catalog. For more details, see the Supported Devices documentation. Before purchasing any hardware, you can test and run models remotely using Intel® DevCloud for the Edge.

Q8: How can I optimize my models for better performance or accuracy?

A: A set of guidelines and recommendations for optimizing models is available in the optimization guide. Join the conversation in the Community Forum for further support.

Q9: Why are INT8 optimized models used for benchmarking on CPUs with no VNNI support?

A: The benefit of low-precision optimization with the OpenVINO™ toolkit extends beyond processors that support VNNI through Intel® DL Boost. The reduced bit width of INT8 compared to FP32 allows Intel® CPUs to process data faster, so it offers better throughput on any converted model, regardless of the low-precision optimizations intrinsically supported by the Intel® hardware. For a comparison of boost factors for different network models on a selection of Intel® CPU architectures, including AVX-2 with Intel® Core™ i7-8700T and AVX-512 (VNNI) with Intel® Xeon® 5218T and Intel® Xeon® 8270, refer to the Model Accuracy for INT8 and FP32 Precision article.
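
This effect can be reproduced locally by benchmarking an FP32 and an INT8 variant of the same model with benchmark_app; the file names below are placeholders:

```sh
# Compare throughput of FP32 and INT8 variants of the same model on a CPU
benchmark_app -m model_fp32.xml -d CPU -hint throughput
benchmark_app -m model_int8.xml -d CPU -hint throughput
```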

Q10: Where can I search for OpenVINO™ performance results based on HW-platforms?

A: The website format has changed to support the more common approach of searching for the performance results of a given neural network model across different HW platforms, as opposed to reviewing the performance of a given HW platform with different neural network models.

Q11: How is Latency measured?

A: Latency is measured by running the OpenVINO™ Runtime in synchronous mode. In this mode, each frame or image is processed through the entire set of stages (pre-processing, inference, post-processing) before the next frame or image is processed. This KPI is relevant for applications where inference on a single image is required, for example, the analysis of an ultrasound image in a medical application or the analysis of a seismic image in the oil & gas industry. Other use cases include real-time or near real-time applications, e.g. the response of an industrial robot to changes in its environment or obstacle avoidance for autonomous vehicles, where a quick response to the result of the inference is required.
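
A minimal sketch of this synchronous measurement with the OpenVINO™ Runtime Python API is shown below; the model path and dummy input are placeholders, and `benchmark_app -api sync` performs the same kind of measurement with proper warm-up and statistics:

```python
# Measure average single-image latency with synchronous inference
import time
import numpy as np
from openvino.runtime import Core

core = Core()
compiled_model = core.compile_model(core.read_model("model.xml"), "CPU")  # placeholder IR
infer_request = compiled_model.create_infer_request()

# Dummy input matching the model's first input (static shape and float input assumed)
input_port = compiled_model.inputs[0]
input_data = np.random.rand(*input_port.shape).astype(np.float32)

infer_request.infer({input_port: input_data})  # warm-up run

runs = 100
start = time.perf_counter()
for _ in range(runs):
    infer_request.infer({input_port: input_data})  # pre-/post-processing omitted
latency_ms = (time.perf_counter() - start) / runs * 1000
print(f"Average latency: {latency_ms:.2f} ms")
```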