# INT8 vs FP32 Comparison on Select Networks and Platforms

The table below shows the speed-up factor, that is, the performance gain obtained by switching from the FP32 representation of an OpenVINO™ supported model to its INT8 representation.

@sphinxdirective

.. raw:: html

<table class="table">
  <tr align="left">
    <th></th>
    <th></th>
    <th>Intel® Core™ <br>i7-8700T</th>
    <th>Intel® Core™ <br>i7-1185G7</th>
    <th>Intel® Xeon® <br>W-1290P</th>
    <th>Intel® Xeon® <br>Platinum <br>8270</th>
  </tr>
  <tr align="left">
    <th>OpenVINO <br>benchmark <br>model name</th>
    <th>Dataset</th>
    <th colspan="4" align="center">Throughput speed-up FP16-INT8 vs FP32</th>
  </tr>
  <tr>
    <td>bert-large-<br>uncased-whole-word-<br>masking-squad-0001</td>
    <td>SQuAD</td>
    <td>1.6</td>
    <td>3.1</td>
    <td>1.5</td>
    <td>2.5</td>
  </tr>
  <tr>
    <td>brain-tumor-<br>segmentation-<br>0001-MXNET</td>
    <td>BraTS</td>
    <td>1.6</td>
    <td>2.0</td>
    <td>1.8</td>
    <td>1.8</td>
  </tr>
  <tr>
    <td>deeplabv3-TF</td>
    <td>VOC 2012<br>Segmentation</td>
    <td>1.9</td>
    <td>3.0</td>
    <td>2.8</td>
    <td>3.1</td>
  </tr>
  <tr>
    <td>densenet-121-TF</td>
    <td>ImageNet</td>
    <td>1.8</td>
    <td>3.5</td>
    <td>1.9</td>
    <td>3.8</td>
  </tr>
  <tr>
    <td>facenet-<br>20180408-<br>102900-TF</td>
    <td>LFW</td>
    <td>2.1</td>
    <td>3.6</td>
    <td>2.2</td>
    <td>3.7</td>
  </tr>
  <tr>
    <td>faster_rcnn_<br>resnet50_coco-TF</td>
    <td>MS COCO</td>
    <td>1.9</td>
    <td>3.7</td>
    <td>2.0</td>
    <td>3.4</td>
  </tr>
  <tr>
    <td>inception-v3-TF</td>
    <td>ImageNet</td>
    <td>1.9</td>
    <td>3.8</td>
    <td>2.0</td>
    <td>4.1</td>
  </tr>
  <tr>
    <td>mobilenet-<br>ssd-CF</td>
    <td>VOC2012</td>
    <td>1.6</td>
    <td>3.1</td>
    <td>1.9</td>
    <td>3.6</td>
  </tr>
  <tr>
    <td>mobilenet-v2-1.0-<br>224-TF</td>
    <td>ImageNet</td>
    <td>1.5</td>
    <td>2.4</td>
    <td>1.8</td>
    <td>3.9</td>
  </tr>
  <tr>
    <td>mobilenet-v2-<br>pytorch</td>
    <td>ImageNet</td>
    <td>1.7</td>
    <td>2.4</td>
    <td>1.9</td>
    <td>4.0</td>
  </tr>
  <tr>
    <td>resnet-18-<br>pytorch</td>
    <td>ImageNet</td>
    <td>1.9</td>
    <td>3.7</td>
    <td>2.1</td>
    <td>4.2</td>
  </tr>
  <tr>
    <td>resnet-50-<br>pytorch</td>
    <td>ImageNet</td>
    <td>1.9</td>
    <td>3.6</td>
    <td>2.0</td>
    <td>3.9</td>
  </tr>
  <tr>
    <td>resnet-50-<br>TF</td>
    <td>ImageNet</td>
    <td>1.9</td>
    <td>3.6</td>
    <td>2.0</td>
    <td>3.9</td>
  </tr>
  <tr>
    <td>squeezenet1.1-<br>CF</td>
    <td>ImageNet</td>
    <td>1.7</td>
    <td>3.2</td>
    <td>1.8</td>
    <td>3.4</td>
  </tr>
  <tr>
    <td>ssd_mobilenet_<br>v1_coco-tf</td>
    <td>VOC2012</td>
    <td>1.8</td>
    <td>3.1</td>
    <td>2.0</td>
    <td>3.6</td>
  </tr>
  <tr>
    <td>ssd300-CF</td>
    <td>MS COCO</td>
    <td>1.8</td>
    <td>4.2</td>
    <td>1.9</td>
    <td>3.9</td>
  </tr>
  <tr>
    <td>ssdlite_<br>mobilenet_<br>v2-TF</td>
    <td>MS COCO</td>
    <td>1.7</td>
    <td>2.5</td>
    <td>2.4</td>
    <td>3.5</td>
  </tr>
  <tr>
    <td>yolo_v4-TF</td>
    <td>MS COCO</td>
    <td>1.9</td>
    <td>3.6</td>
    <td>2.0</td>
    <td>3.4</td>
  </tr>
  <tr>
    <td>unet-camvid-onnx-0001</td>
    <td>MS COCO</td>
    <td>1.7</td>
    <td>3.9</td>
    <td>1.7</td>
    <td>3.7</td>
  </tr>
  <tr>
    <td>ssd-resnet34-<br>1200-onnx</td>
    <td>MS COCO</td>
    <td>1.7</td>
    <td>4.0</td>
    <td>1.7</td>
    <td>3.4</td>
  </tr>
  <tr>
    <td>googlenet-v4-tf</td>
    <td>ImageNet</td>
    <td>1.9</td>
    <td>3.9</td>
    <td>2.0</td>
    <td>4.1</td>
  </tr>
  <tr>
    <td>vgg19-caffe</td>
    <td>ImageNet</td>
    <td>1.9</td>
    <td>4.7</td>
    <td>2.0</td>
    <td>4.5</td>
  </tr>
  <tr>
    <td>yolo-v3-tiny-tf</td>
    <td>MS COCO</td>
    <td>1.7</td>
    <td>3.4</td>
    <td>1.9</td>
    <td>3.5</td>
  </tr>
</table>

@endsphinxdirective
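These speed-up factors are throughput ratios: INT8 throughput divided by FP32 throughput for the same model on the same platform. The sketch below illustrates that arithmetic; the throughput numbers in it are hypothetical placeholders, not measured results.

```python
# Minimal sketch: computing the FP32 -> INT8 throughput speed-up factor.
# The throughput values below are hypothetical placeholders; in practice
# they would come from benchmarking the FP32 and INT8 models separately,
# for example with the benchmark_app tool.

def speedup_factor(throughput_fp32: float, throughput_int8: float) -> float:
    """Return the INT8-vs-FP32 speed-up factor (e.g. 1.9 means 1.9x faster)."""
    return throughput_int8 / throughput_fp32

# Example: placeholder throughputs in frames per second (FPS).
fp32_fps = 210.0   # FP32 model throughput (hypothetical)
int8_fps = 399.0   # INT8 model throughput (hypothetical)

print(f"Speed-up: {speedup_factor(fp32_fps, int8_fps):.1f}x")  # Speed-up: 1.9x
```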

The following table shows the absolute accuracy drop, calculated as the difference in accuracy between the FP32 representation of a model and its INT8 representation.

@sphinxdirective

.. raw:: html

<table class="table">
  <tr align="left">
    <th></th>
    <th></th>
    <th></th>
    <th>Intel® Core™ <br>i9-10920X CPU<br>@ 3.50GHZ (VNNI)</th>
    <th>Intel® Core™ <br>i9-9820X CPU<br>@ 3.30GHz (AVX512)</th>
    <th>Intel® Core™ <br>i7-6700K CPU<br>@ 4.0GHz (AVX2)</th>
    <th>Intel® Core™ <br>i7-1185G7 CPU<br>@ 4.0GHz (TGL VNNI)</th>
  </tr>
  <tr align="left">
    <th>OpenVINO Benchmark <br>Model Name</th>
    <th>Dataset</th>
    <th>Metric Name</th>
    <th colspan="4" align="center">Absolute Accuracy Drop, %</th>
  </tr>
  <tr>
    <td>bert-large-uncased-whole-word-masking-squad-0001</td>
    <td>SQuAD</td>
    <td>F1</td>
    <td>0.62</td>
    <td>0.71</td>
    <td>0.62</td>
    <td>0.62</td>
  </tr>
  <tr>
    <td>brain-tumor-<br>segmentation-<br>0001-MXNET</td>
    <td>BraTS</td>
    <td>Dice-index@ <br>Mean@ <br>Overall Tumor</td>
    <td>0.08</td>
    <td>0.10</td>
    <td>0.10</td>
    <td>0.08</td>
  </tr>
  <tr>
    <td>deeplabv3-TF</td>
    <td>VOC 2012<br>Segmentation</td>
    <td>mean_iou</td>
    <td>0.09</td>
    <td>0.41</td>
    <td>0.41</td>
    <td>0.09</td>
  </tr>
  <tr>
    <td>densenet-121-TF</td>
    <td>ImageNet</td>
    <td>acc@top-1</td>
    <td>0.49</td>
    <td>0.56</td>
    <td>0.56</td>
    <td>0.49</td>
  </tr>
  <tr>
    <td>facenet-<br>20180408-<br>102900-TF</td>
    <td>LFW</td>
    <td>pairwise_<br>accuracy<br>_subsets</td>
    <td>0.05</td>
    <td>0.12</td>
    <td>0.12</td>
    <td>0.05</td>
  </tr>
  <tr>
    <td>faster_rcnn_<br>resnet50_coco-TF</td>
    <td>MS COCO</td>
    <td>coco_<br>precision</td>
    <td>0.09</td>
    <td>0.09</td>
    <td>0.09</td>
    <td>0.09</td>
  </tr>
  <tr>
    <td>inception-v3-TF</td>
    <td>ImageNet</td>
    <td>acc@top-1</td>
    <td>0.02</td>
    <td>0.01</td>
    <td>0.01</td>
    <td>0.02</td>
  </tr>
  <tr>
    <td>mobilenet-<br>ssd-CF</td>
    <td>VOC2012</td>
    <td>mAP</td>
    <td>0.06</td>
    <td>0.04</td>
    <td>0.04</td>
    <td>0.06</td>
  </tr>
  <tr>
    <td>mobilenet-v2-1.0-<br>224-TF</td>
    <td>ImageNet</td>
    <td>acc@top-1</td>
    <td>0.40</td>
    <td>0.76</td>
    <td>0.76</td>
    <td>0.40</td>
  </tr>
  <tr>
    <td>mobilenet-v2-<br>pytorch</td>
    <td>ImageNet</td>
    <td>acc@top-1</td>
    <td>0.36</td>
    <td>0.52</td>
    <td>0.52</td>
    <td>0.36</td>
  </tr>
  <tr>
    <td>resnet-18-<br>pytorch</td>
    <td>ImageNet</td>
    <td>acc@top-1</td>
    <td>0.25</td>
    <td>0.25</td>
    <td>0.25</td>
    <td>0.25</td>
  </tr>
  <tr>
    <td>resnet-50-<br>pytorch</td>
    <td>ImageNet</td>
    <td>acc@top-1</td>
    <td>0.19</td>
    <td>0.21</td>
    <td>0.21</td>
    <td>0.19</td>
  </tr>
  <tr>
    <td>resnet-50-<br>TF</td>
    <td>ImageNet</td>
    <td>acc@top-1</td>
    <td>0.11</td>
    <td>0.11</td>
    <td>0.11</td>
    <td>0.11</td>
  </tr>
  <tr>
    <td>squeezenet1.1-<br>CF</td>
    <td>ImageNet</td>
    <td>acc@top-1</td>
    <td>0.64</td>
    <td>0.66</td>
    <td>0.66</td>
    <td>0.64</td>
  </tr>
  <tr>
    <td>ssd_mobilenet_<br>v1_coco-tf</td>
    <td>VOC2012</td>
    <td>COCO mAP</td>
    <td>0.17</td>
    <td>2.96</td>
    <td>2.96</td>
    <td>0.17</td>
  </tr>
  <tr>
    <td>ssd300-CF</td>
    <td>MS COCO</td>
    <td>COCO mAP</td>
    <td>0.18</td>
    <td>3.06</td>
    <td>3.06</td>
    <td>0.18</td>
  </tr>
  <tr>
    <td>ssdlite_<br>mobilenet_<br>v2-TF</td>
    <td>MS COCO</td>
    <td>COCO mAP</td>
    <td>0.11</td>
    <td>0.43</td>
    <td>0.43</td>
    <td>0.11</td>
  </tr>
  <tr>
    <td>yolo_v4-TF</td>
    <td>MS COCO</td>
    <td>COCO mAP</td>
    <td>0.06</td>
    <td>0.03</td>
    <td>0.03</td>
    <td>0.06</td>
  </tr>
  <tr>
    <td>unet-camvid-<br>onnx-0001</td>
    <td>MS COCO</td>
    <td>COCO mAP</td>
    <td>0.29</td>
    <td>0.29</td>
    <td>0.31</td>
    <td>0.29</td>
  </tr>
  <tr>
    <td>ssd-resnet34-<br>1200-onnx</td>
    <td>MS COCO</td>
    <td>COCO mAP</td>
    <td>0.02</td>
    <td>0.03</td>
    <td>0.03</td>
    <td>0.02</td>
  </tr>
  <tr>
    <td>googlenet-v4-tf</td>
    <td>ImageNet</td>
    <td>acc@top-1</td>
    <td>0.08</td>
    <td>0.06</td>
    <td>0.06</td>
    <td>0.06</td>
  </tr>
  <tr>
    <td>vgg19-caffe</td>
    <td>ImageNet</td>
    <td>acc@top-1</td>
    <td>0.02</td>
    <td>0.04</td>
    <td>0.04</td>
    <td>0.02</td>
  </tr>
  <tr>
    <td>yolo-v3-tiny-tf</td>
    <td>MS COCO</td>
    <td>COCO mAP</td>
    <td>0.02</td>
    <td>0.6</td>
    <td>0.6</td>
    <td>0.02</td>
  </tr>
</table>

@endsphinxdirective
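The absolute accuracy drop is simply the difference between the FP32 and INT8 accuracy metrics, expressed in percentage points. A minimal sketch of that calculation follows; the metric values in it are hypothetical, not taken from the table above.

```python
# Minimal sketch: absolute accuracy drop between FP32 and INT8 models,
# expressed in percentage points. The metric values are hypothetical.

def absolute_accuracy_drop(metric_fp32: float, metric_int8: float) -> float:
    """Return |FP32 metric - INT8 metric| in percentage points."""
    return abs(metric_fp32 - metric_int8)

# Example: hypothetical acc@top-1 values for an ImageNet classifier.
fp32_top1 = 76.20  # FP32 top-1 accuracy, %
int8_top1 = 75.80  # INT8 top-1 accuracy, %

print(f"Absolute accuracy drop: {absolute_accuracy_drop(fp32_top1, int8_top1):.2f}%")
# Absolute accuracy drop: 0.40%
```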

*Figure: INT8 vs FP32 Comparison*

For more complete information about performance and benchmark results, visit www.intel.com/benchmarks and see the Optimization Notice and Legal Information.