Files
openvino/inference-engine/tools/benchmark_tool/README.md
Andrey Zaytsev 40eba6a2ef Feature/merge 2021 3 to master (#5307)
* Feature/azaytsev/cldnn doc fixes (#4600)

* Legal fixes, removed the Generating docs section

* Removed info regarding generating docs

Co-authored-by: Trawinski, Dariusz <dariusz.trawinski@intel.com>

* Feature/azaytsev/gna model link fixes (#4599)

* Added info on DockerHub CI Framework

* Feature/azaytsev/change layout (#3295)

* Changes according to feedback comments

* Replaced @ref's with html links

* Fixed links, added a title page for installing from repos and images, fixed formatting issues

* Added links

* minor fix

* Added DL Streamer to the list of components installed by default

* Link fixes

* Link fixes

* ovms doc fix (#2988)

* added OpenVINO Model Server

* ovms doc fixes

Co-authored-by: Trawinski, Dariusz <dariusz.trawinski@intel.com>

* Updated openvino_docs.xml

* Link Fixes

Co-authored-by: Trawinski, Dariusz <dariusz.trawinski@intel.com>

* Fix for broken CC in CPU plugin (#4595)

* Azure CI: Add "ref: releases/2021/3"

* Fixed clone rt info (#4597)

* [.ci/azure] Enable CC build (#4619)

* Formula fix (#4624)

* Fixed transformation to pull constants into Loop body (cherry-pick of PR 4591) (#4607)

* Cherry-pick of PR 4591

* Fixed typo

* Moved a check into the parameter_unchanged_after_iteration function

* Fixed KW hits (#4638)

* [CPU] Supported ANY layout for inputs in inferRequest (#4621)

* [.ci/azure] Add windows_conditional_compilation.yml (#4648) (#4655)

* Fix for MKLDNN constant layers execution (#4642)

* Fix for MKLDNN constant layers execution

* Single mkldnn::engine for all MKLDNN graphs

* Add workaround for control edges to support TF 2.4 RNN (#4634)

Signed-off-by: Roman Kazantsev <roman.kazantsev@intel.com>

* Corrected PyYAML dependency (#4598) (#4620)

5.4.2 is absent on PyPI

* [CPU] Statically analyzed issues. (#4637)

* Docs api (#4657)

* Updated API changes document

* Comment for CVS-49440

* Add documentation on how to convert QuartzNet model (#4664)

* Add documentation on how to convert QuartzNet model (#4422)

* Add documentation on how to convert QuartzNet model

* Apply review feedback

* Small fix

* Apply review feedback

* Apply suggestions from code review

Co-authored-by: Anastasiya Ageeva <anastasiya.ageeva@intel.com>

Co-authored-by: Anastasiya Ageeva <anastasiya.ageeva@intel.com>

* Add reference to file

Co-authored-by: Anastasiya Ageeva <anastasiya.ageeva@intel.com>

* Fixed bug in assign elimination transformation. (#4644)

* [doc] Updated PyPI support OSes (#4643) (#4662)

* [doc] Updated PyPI support OSes (#4643)

* Updated PyPI support OSes

* Added python versions for win and mac

* Update pypi-openvino-dev.md

* Update pypi-openvino-dev.md

* Update pypi-openvino-rt.md

* Update pypi-openvino-dev.md

Co-authored-by: Andrey Zaytsev <andrey.zaytsev@intel.com>

* [IE][VPU]: Fix empty output of CTCGreedyDecoderSeqLen (#4653)

* Allow the second output of CTCGreedyDecoderSeqLen to be nullptr in cases when it is not used but calculated in the Myriad plugin. In this case, parse the second output as FakeData
* It is a cherry-pick of #4652
* Update the firmware to release version

* [VPU] WA for Segmentation fault on dlclose() issue (#4645)

* Document TensorFlow 2* Update: Layers Support and Remove Beta Status (#4474) (#4711)

* Document TensorFlow 2* Update: Layers Support and Remove Beta Status

Signed-off-by: Roman Kazantsev <roman.kazantsev@intel.com>

* Update documentation based on latest test results and feedback

Signed-off-by: Roman Kazantsev <roman.kazantsev@intel.com>

* Remove ConvLSTM2D from supported layers list

Signed-off-by: Roman Kazantsev <roman.kazantsev@intel.com>

* Document Dot layer without limitation

Signed-off-by: Roman Kazantsev <roman.kazantsev@intel.com>

* Address feedback upon DenseFeatures and RNN operations

Signed-off-by: Roman Kazantsev <roman.kazantsev@intel.com>

* Do a grammar correction

Signed-off-by: Roman Kazantsev <roman.kazantsev@intel.com>

* Do a grammar correction based on feedback

Signed-off-by: Roman Kazantsev <roman.kazantsev@intel.com>

* Updated nGraph custom op documentation (#4604)

* Updated nGraph custom op documentation

* Fixed comments

* [IE CLDNN] Fix missing variable initializations and types (#4669)

* Fix NormalizeL2 creation in QueryNetwork (cherry pick from master PR 4310) (#4651)

* Updated documentation about the supported YOLOv3 model from ONNX (#4722) (#4726)

* Restored folded Operations for QueryNetwork (#4685)

* Restored folded Operations for QueryNetwork

* Fixed comment

* Add unfolded constant operations to supported layers map

* Add STN to list of supported models (#4728)

* Fix python API for Loop/TensorIterator/Assign/ReadValue operations

* Catch std::except in fuzz tests (#4695)

Fuzz tests must catch all expected exceptions from IE. IE is using C++ std
library which may raise standard exceptions which IE pass through.

* Docs update (#4626)

* Updated latency case desc to cover multi-socket machines

* updated opt guide a bit

* avoiding '#' which is interpreted as ref

* Update CPU.md

* Update docs/optimization_guide/dldt_optimization_guide.md

Co-authored-by: Alina Alborova <alina.alborova@intel.com>

* Update docs/optimization_guide/dldt_optimization_guide.md

Co-authored-by: Alina Alborova <alina.alborova@intel.com>

* Update docs/optimization_guide/dldt_optimization_guide.md

Co-authored-by: Alina Alborova <alina.alborova@intel.com>

* Update docs/optimization_guide/dldt_optimization_guide.md

Co-authored-by: Alina Alborova <alina.alborova@intel.com>

* Update docs/optimization_guide/dldt_optimization_guide.md

Co-authored-by: Alina Alborova <alina.alborova@intel.com>

Co-authored-by: Alina Alborova <alina.alborova@intel.com>

* Blocked dims hwc 2021/3 (#4729)

* Fix for BlockedDims

* Added test for HWC layout

* [GNA] Update documentation regarding splits and concatenations support (#4740)

* Added mo.py to wheel packages (#4731)

* Inserted a disclaimer (#4760)

* Fixed some klockwork issues in C API samples (#4767)

* Feature/vpu doc fixes 2021 3 (#4635)

* Documentation fixes and updates for VPU

* minor correction

* minor correction

* Fixed links

* updated supported layers list for vpu

* [DOCS] added iname/oname (#4735)

* [VPU] Limit dlclose() WA to be used for Ubuntu only (#4806)

* Fixed wrong link (#4817)

* MKLDNN weights cache key calculation algorithm changed (#4790)

* Updated PIP install instructions (#4821)

* Document YOLACT support (#4749)

* Document YOLACT support

* Add preprocessing section

* Apply suggestions from code review

Co-authored-by: Tatiana Savina <tatiana.savina@intel.com>

Co-authored-by: Tatiana Savina <tatiana.savina@intel.com>

* Add documentation on how to convert F3Net model (#4863)

* Add instruction for F3Net model pytorch->onnx conversion

* Fix style

* Fixed dead lock in telemetry (#4873)

* Fixed dead lock in telemetry

* Refactored TelemetrySender.send function

* Refactored send function implementation to avoid deadlocks

* Unit tests for telemetry sender function

* Added legal header

* avladimi/cvs-31369: Documented packages content to YUM/APT IGs (#4839)

* Documented runtime/dev packages content

* Minor formatting fixes

* Implemented review comments

* Update installing-openvino-apt.md

Co-authored-by: Andrey Zaytsev <andrey.zaytsev@intel.com>

* [DOC] Low-Precision 8-bit Integer Inference (#4834)

* [DOC] Low-Precision 8-bit Integer Inference

* [DOC] Low-Precision 8-bit Integer Inference: comment fixes

* Update docs/IE_DG/Int8Inference.md

Co-authored-by: Anastasiya Ageeva <anastasiya.ageeva@intel.com>

* Update docs/IE_DG/Int8Inference.md

Co-authored-by: Anastasiya Ageeva <anastasiya.ageeva@intel.com>

* Update docs/IE_DG/Int8Inference.md

Co-authored-by: Anastasiya Ageeva <anastasiya.ageeva@intel.com>

* Update docs/IE_DG/Int8Inference.md

Co-authored-by: Anastasiya Ageeva <anastasiya.ageeva@intel.com>

* Update docs/IE_DG/Int8Inference.md

Co-authored-by: Anastasiya Ageeva <anastasiya.ageeva@intel.com>

* Update docs/IE_DG/Int8Inference.md

Co-authored-by: Anastasiya Ageeva <anastasiya.ageeva@intel.com>

* Update docs/IE_DG/Int8Inference.md

Co-authored-by: Anastasiya Ageeva <anastasiya.ageeva@intel.com>

* Update docs/IE_DG/Int8Inference.md

Co-authored-by: Anastasiya Ageeva <anastasiya.ageeva@intel.com>

* Update docs/IE_DG/Int8Inference.md

Co-authored-by: Anastasiya Ageeva <anastasiya.ageeva@intel.com>

* Update docs/IE_DG/Int8Inference.md

Co-authored-by: Anastasiya Ageeva <anastasiya.ageeva@intel.com>

* [DOC] LPT comments fix

* [DOC] LPT comments fix: absolute links are updated to relative

* Update Int8Inference.md

* Update Int8Inference.md

Co-authored-by: Anastasiya Ageeva <anastasiya.ageeva@intel.com>
Co-authored-by: Andrey Zaytsev <andrey.zaytsev@intel.com>

* Avladimi/cherry pick from master (#4892)

* Fixed CVS-48061

* Reviewed and edited the Customization instructions

* Fixed broken links in the TOC

* Fixed links

* Fixed formatting in the IG for Raspberry

* Feature/benchmarks 2021 3 (#4910)

* added new topics, changed the intro text

* updated

* Updates

* Updates

* Updates

* Updates

* Updates

* Added yolo-v4-tf and unet-camvid-onnx graphs

* Date for pricing is updated to March 15th

* Feature/omz link changes (#4911)

* Changed labels for demos and model downloader

* Changed links to models and tools

* Changed links to models and tools

* Changed links to demos

* [cherry-pick] Extensibility docs review (#4915)

* Feature/ovsa docs 2021 3 (#4914)

* Updated to 2021-3, fixed formatting issues

* Fixed formatting issues

* Fixed formatting issues

* Fixed formatting issues

* Update ovsa_get_started.md

* Clarification of Low Latency Transformation and State API documentation (#4877)

* Assign/ReadValue, LowLatency and StateAPI clarifications

* Apply suggestions from code review: spelling mistakes

Co-authored-by: Anastasiya Ageeva <anastasiya.ageeva@intel.com>

* fixed wording

* cherry-pick missing commit to release branch: low latency documentation

* Resolve review remarks

Co-authored-by: Anastasiya Ageeva <anastasiya.ageeva@intel.com>
Co-authored-by: Svetlana Dolinina <svetlana.a.dolinina@intel.com>

* DevCloud call outs (#4904)

* [README.md] change latest release to 2021.3

* [49342] Update recommended CMake version on install guide in documentation (#4763)

* Inserted a disclaimer

* Another disclaimer

* Update installing-openvino-windows.md

* Update installing-openvino-windows.md

* Update installing-openvino-windows.md

* Feature/doc fixes 2021 3 (#4971)

* Made changes for CVS-50424

* Changes for CVS-49349

* Minor change for CVS-49349

* Changes for CVS-49343

* Cherry-pick #PR4254

* Replaced /opt/intel/openvino/ with /opt/intel/openvino_2021/ as the default target directory

* (CVS-50786) Added a new section Reference IMplementations to keep Speech Library and Speech Recognition Demos

* Doc fixes

* Replaced links to inference_engine_intro.md with Deep_Learning_Inference_Engine_DevGuide.md, fixed links

* Fixed link

* Fixes

* Fixes

* Reemoved Intel® Xeon® processor E family

* fixes for graphs (#5057)

* compression.configs.hardware config to package_data (#5066)

* update OpenCV version to 4.5.2 (#5069)

* update OpenCV version to 4.5.2

* Enable mo.front.common.extractors module (#5038)

* Enable mo.front.common.extractors module (#5018)

* Enable mo.front.common.extractors module

* Update package_BOM.txt

* Test MO wheel content

* fix doc iframe issue - 2021.3 (#5090)

* wrap with htmlonly

* wrap with htmlonly

* Add specification for ExperimentalDetectron* oprations (#5128)

* Feature/benchmarks 2021 3 ehl (#5191)

* Added EHL config

* Updated graphs

* improve table formatting

* Wrap <iframe> tag with \htmlonly \endhtmlonly to avoid build errors

* Updated graphs

* Fixed links to TDP and Price for 8380

* Add PyTorch section to the documentation (#4972)

* Add PyTorch section to the documentation

* Apply review feedback

* Remove section about loop

* Apply review feedback

* Apply review feedback

* Apply review feedback

* doc: add Red Hat docker registry (#5184) (#5253)

* Incorporate changes in master

Co-authored-by: Trawinski, Dariusz <dariusz.trawinski@intel.com>
Co-authored-by: Vladislav Volkov <vladislav.volkov@intel.com>
Co-authored-by: azhogov <alexander.zhogov@intel.com>
Co-authored-by: Ilya Churaev <ilya.churaev@intel.com>
Co-authored-by: Alina Kladieva <alina.kladieva@intel.com>
Co-authored-by: Evgeny Lazarev <evgeny.lazarev@intel.com>
Co-authored-by: Gorokhov Dmitriy <dmitry.gorokhov@intel.com>
Co-authored-by: Roman Kazantsev <roman.kazantsev@intel.com>
Co-authored-by: Mikhail Ryzhov <mikhail.ryzhov@intel.com>
Co-authored-by: Nikolay Shchegolev <nikolay.shchegolev@intel.com>
Co-authored-by: Ilya Lavrenov <ilya.lavrenov@intel.com>
Co-authored-by: Maxim Vafin <maxim.vafin@intel.com>
Co-authored-by: Anastasiya Ageeva <anastasiya.ageeva@intel.com>
Co-authored-by: Anastasia Popova <anastasia.popova@intel.com>
Co-authored-by: Maksim Doronin <maksim.doronin@intel.com>
Co-authored-by: Andrew Bakalin <andrew.bakalin@intel.com>
Co-authored-by: Mikhail Letavin <mikhail.letavin@intel.com>
Co-authored-by: Anton Chetverikov <Anton.Chetverikov@intel.com>
Co-authored-by: Ivan Tikhonov <ivan.tikhonov@intel.com>
Co-authored-by: Andrey Somsikov <andrey.somsikov@intel.com>
Co-authored-by: Maxim Shevtsov <maxim.y.shevtsov@intel.com>
Co-authored-by: Alina Alborova <alina.alborova@intel.com>
Co-authored-by: Elizaveta Lobanova <elizaveta.lobanova@intel.com>
Co-authored-by: Andrey Dmitriev <andrey.dmitriev@intel.com>
Co-authored-by: Helena Kloosterman <helena.kloosterman@intel.com>
Co-authored-by: Tatiana Savina <tatiana.savina@intel.com>
Co-authored-by: Edward Shogulin <edward.shogulin@intel.com>
Co-authored-by: Svetlana Dolinina <svetlana.a.dolinina@intel.com>
Co-authored-by: Alexey Suhov <alexey.suhov@intel.com>
Co-authored-by: Sergey Lyubimtsev <sergey.lyubimtsev@intel.com>
Co-authored-by: Dmitry Kurtaev <dmitry.kurtaev+github@gmail.com>
Co-authored-by: Nikolay Tyukaev <nikolay.tyukaev@intel.com>
Co-authored-by: Kate Generalova <kate.generalova@intel.com>
2021-04-19 20:19:17 +03:00

14 KiB

Benchmark Python* Tool

This topic demonstrates how to run the Benchmark Python* Tool, which performs inference using convolutional networks. Performance can be measured for two inference modes: synchronous (latency-oriented) and asynchronous (throughput-oriented).

NOTE: This topic describes usage of Python implementation of the Benchmark Tool. For the C++ implementation, refer to Benchmark C++ Tool.

Tip

: You also can work with the Benchmark Tool inside the OpenVINO™ [Deep Learning Workbench](@ref workbench_docs_Workbench_DG_Introduction) (DL Workbench). [DL Workbench](@ref workbench_docs_Workbench_DG_Introduction) is a platform built upon OpenVINO™ and provides a web-based graphical environment that enables you to optimize, fine-tune, analyze, visualize, and compare performance of deep learning models on various Intel® architecture configurations. In the DL Workbench, you can use most of OpenVINO™ toolkit components.
Proceed to an [easy installation from Docker](@ref workbench_docs_Workbench_DG_Install_from_Docker_Hub) to get started.

How It Works

Upon start-up, the application reads command-line parameters and loads a network and images/binary files to the Inference Engine plugin, which is chosen depending on a specified device. The number of infer requests and execution approach depend on the mode defined with the -api command-line parameter.

Note

: By default, Inference Engine samples, tools and demos expect input with BGR channels order. If you trained your model to work with RGB order, you need to manually rearrange the default channels order in the sample or demo application or reconvert your model using the Model Optimizer tool with --reverse_input_channels argument specified. For more information about the argument, refer to When to Reverse Input Channels section of Converting a Model Using General Conversion Parameters.

Synchronous API

For synchronous mode, the primary metric is latency. The application creates one infer request and executes the Infer method. A number of executions is defined by one of the two values:

  • Number of iterations defined with the -niter command-line argument
  • Time duration specified with the -t command-line argument
  • Both of them (execution will continue until both conditions are met)
  • Predefined duration if -niter and -t are not specified. Predefined duration value depends on device.

During the execution, the application collects two types of metrics:

  • Latency for each infer request executed with Infer method
  • Duration of all executions

Reported latency value is calculated as mean value of all collected latencies. Reported throughput value is a derivative from reported latency and additionally depends on batch size.

Asynchronous API

For asynchronous mode, the primary metric is throughput in frames per second (FPS). The application creates a certain number of infer requests and executes the StartAsync method. A number of executions is defined by one of the two values:

  • Number of iterations defined with the -niter command-line argument
  • Time duration specified with the -t command-line argument
  • Both of them (execution will continue until both conditions are met)
  • Predefined duration if -niter and -t are not specified. Predefined duration value depends on device.

The infer requests are executed asynchronously. Callback is used to wait for previous execution to complete. The application measures all infer requests executions and reports the throughput metric based on batch size and total execution duration.

Run the Tool

Before running the Benchmark tool, install the requirements:

pip install -r  requirements.txt

Notice that the benchmark_app usually produces optimal performance for any device out of the box.

So in most cases you don't need to play the app options explicitly and the plain device name is enough, for example, for CPU:

python3 benchmark_app.py -m <model> -i <input> -d CPU

But it is still may be non-optimal for some cases, especially for very small networks. More details can read in Introduction to Performance Topics.

Running the application with the -h or --help' option yields the following usage message:

usage: benchmark_app.py [-h] [-i PATH_TO_INPUT] -m PATH_TO_MODEL
                        [-d TARGET_DEVICE]
                        [-l PATH_TO_EXTENSION] [-c PATH_TO_CLDNN_CONFIG]
                        [-api {sync,async}] [-niter NUMBER_ITERATIONS]
                        [-b BATCH_SIZE]
                        [-stream_output [STREAM_OUTPUT]] [-t TIME]
                        [-progress [PROGRESS]] [-nstreams NUMBER_STREAMS]
                        [-nthreads NUMBER_THREADS] [-pin {YES,NO}]
                        [--exec_graph_path EXEC_GRAPH_PATH]
                        [-pc [PERF_COUNTS]]

Options:
  -h, --help            Show this help message and exit.
  -i PATH_TO_INPUT, --path_to_input PATH_TO_INPUT
                        Optional. Path to a folder with images and/or binaries
                        or to specific image or binary file.
  -m PATH_TO_MODEL, --path_to_model PATH_TO_MODEL
                        Required. Path to an .xml/.onnx/.prototxt file with a
                        trained model or to a .blob file with a trained
                        compiled model.
  -d TARGET_DEVICE, --target_device TARGET_DEVICE
                        Optional. Specify a target device to infer on: CPU,
                        GPU, FPGA, HDDL or MYRIAD.
                        Use "-d HETERO:<comma separated devices list>" format to specify HETERO plugin.
                        Use "-d MULTI:<comma separated devices list>" format to specify MULTI plugin.
                        The application looks for a suitable plugin for the specified device.
  -l PATH_TO_EXTENSION, --path_to_extension PATH_TO_EXTENSION
                        Optional. Required for CPU custom layers. Absolute
                        path to a shared library with the kernels
                        implementations.
  -c PATH_TO_CLDNN_CONFIG, --path_to_cldnn_config PATH_TO_CLDNN_CONFIG
                        Optional. Required for GPU custom kernels. Absolute
                        path to an .xml file with the kernels description.
  -api {sync,async}, --api_type {sync,async}
                        Optional. Enable using sync/async API. Default value
                        is async.
  -niter NUMBER_ITERATIONS, --number_iterations NUMBER_ITERATIONS
                        Optional. Number of iterations. If not specified, the
                        number of iterations is calculated depending on a
                        device.
  -b BATCH_SIZE, --batch_size BATCH_SIZE
                        Optional. Batch size value. If not specified, the
                        batch size value is determined from IR
  -stream_output [STREAM_OUTPUT]
                        Optional. Print progress as a plain text. When
                        specified, an interactive progress bar is replaced
                        with a multiline output.
  -t TIME, --time TIME  Optional. Time in seconds to execute topology.
  -progress [PROGRESS]  Optional. Show progress bar (can affect performance
                        measurement). Default values is "False".
  -shape SHAPE          Optional. Set shape for input. For example,
                        "input1[1,3,224,224],input2[1,4]" or "[1,3,224,224]"
                        in case of one input size.
  -layout LAYOUT        Optional. Prompts how network layouts should be
                        treated by application. For example,
                        "input1[NCHW],input2[NC]" or "[NCHW]" in case of one
                        input size.
  -nstreams NUMBER_STREAMS, --number_streams NUMBER_STREAMS
                       Optional. Number of streams to use for inference on the CPU/GPU in throughput mode
                       (for HETERO and MULTI device cases use format <device1>:<nstreams1>,<device2>:<nstreams2> or just <nstreams>).
                       Default value is determined automatically for a device.
                       Please note that although the automatic selection usually provides a reasonable performance,
                       it still may be non-optimal for some cases, especially for very small networks.
  -nthreads NUMBER_THREADS, --number_threads NUMBER_THREADS
                        Number of threads to use for inference on the CPU
                        (including HETERO  and MULTI cases).
  -pin {YES,NUMA,NO}, --infer_threads_pinning {YES,NUMA,NO}
                        Optional. Enable threads->cores ("YES", default), threads->(NUMA)nodes ("NUMA") or completely disable
                        ("NO") CPU threads pinning for CPU-involved inference.
  --exec_graph_path EXEC_GRAPH_PATH
                        Optional. Path to a file where to store executable
                        graph information serialized.
  -pc [PERF_COUNTS], --perf_counts [PERF_COUNTS]
                        Optional. Report performance counters.
  -ip "U8"/"FP16"/"FP32"    Optional. Specifies precision for all input layers of the network.
  -op "U8"/"FP16"/"FP32"    Optional. Specifies precision for all output layers of the network.
  -iop                      Optional. Specifies precision for input and output layers by name. Example: -iop "input:FP16, output:FP16". Notice that quotes are required. Overwrites precision from ip and op options for specified layers.

Running the application with the empty list of options yields the usage message given above and an error message.

Application supports topologies with one or more inputs. If a topology is not data sensitive, you can skip the input parameter. In this case, inputs are filled with random values. If a model has only image input(s), please a provide folder with images or a path to an image as input. If a model has some specific input(s) (not images), please prepare a binary file(s), which is filled with data of appropriate precision and provide a path to them as input. If a model has mixed input types, input folder should contain all required files. Image inputs are filled with image files one by one. Binary inputs are filled with binary inputs one by one.

To run the tool, you can use [public](@ref omz_models_group_public) or [Intel's](@ref omz_models_group_intel) pre-trained models from the Open Model Zoo. The models can be downloaded using the [Model Downloader](@ref omz_tools_downloader).

Note

: Before running the tool with a trained model, make sure the model is converted to the Inference Engine format (*.xml + *.bin) using the Model Optimizer tool.

Examples of Running the Tool

This section provides step-by-step instructions on how to run the Benchmark Tool with the googlenet-v1 public model on CPU or FPGA devices. As an input, the car.png file from the <INSTALL_DIR>/deployment_tools/demo/ directory is used.

NOTE: The Internet access is required to execute the following steps successfully. If you have access to the Internet through the proxy server only, please make sure that it is configured in your OS environment.

  1. Download the model. Go to the the Model Downloader directory and run the downloader.py script with specifying the model name and directory to download the model to:

    cd <INSTALL_DIR>/deployment_tools/open_model_zoo/tools/downloader
    
    python3 downloader.py --name googlenet-v1 -o <models_dir>
    
  2. Convert the model to the Inference Engine IR format. Go to the Model Optimizer directory and run the mo.py script with specifying the path to the model, model format (which must be FP32 for CPU and FPG) and output directory to generate the IR files:

    cd <INSTALL_DIR>/deployment_tools/model_optimizer
    
    python3 mo.py --input_model <models_dir>/public/googlenet-v1/googlenet-v1.caffemodel --data_type FP32 --output_dir <ir_dir>
    
  3. Run the tool with specifying the <INSTALL_DIR>/deployment_tools/demo/car.png file as an input image, the IR of the googlenet-v1 model and a device to perform inference on. The following commands demonstrate running the Benchmark Tool in the asynchronous mode on CPU and FPGA devices:

    • On CPU:
     python3 benchmark_app.py -m <ir_dir>/googlenet-v1.xml -d CPU -api async -i <INSTALL_DIR>/deployment_tools/demo/car.png --progress true -b 1
    
    • On FPGA:
    python3 benchmark_app.py -m <ir_dir>/googlenet-v1.xml -d HETERO:FPGA,CPU -api async -i <INSTALL_DIR>/deployment_tools/demo/car.png --progress true -b 1
    

The application outputs number of executed iterations, total duration of execution, latency and throughput. Additionally, if you set the -pc parameter, the application outputs performance counters. If you set -exec_graph_path, the application reports executable graph information serialized.

Below are fragments of sample output for CPU and FPGA devices:

  • For CPU:
    [Step 8/9] Measuring performance (Start inference asynchronously, 60000 ms duration, 4 inference requests in parallel using 4 streams)
    Progress: |................................| 100.00%
    
    [Step 9/9] Dumping statistics report
    Progress: |................................| 100.00%
    
    Count:      4408 iterations
    Duration:   60153.52 ms
    Latency:    51.8244 ms
    Throughput: 73.28 FPS
    
  • For FPGA:
    [Step 10/11] Measuring performance (Start inference asynchronously, 5 inference requests using 1 streams for CPU, limits: 120000 ms duration)
    Progress: |................................| 100%
    
    [Step 11/11] Dumping statistics report
    Count:      98075 iterations
    Duration:   120011.03 ms
    Latency:    5.65 ms
    Throughput: 817.22 FPS
    

See Also