openvino/docs/IE_DG/supported_plugins/MULTI.md
Andrey Zaytsev 322c874113 Feature/azaytsev/cherry picks from 2021 4 (#7389)
2021-09-07 19:21:41 +03:00


Multi-Device Plugin

Introducing the Multi-Device Plugin

The Multi-Device plugin automatically assigns inference requests to available computational devices to execute the requests in parallel. Potential gains are as follows:

  • Improved throughput that multiple devices can deliver (compared to single-device execution)
  • More consistent performance, since the devices can now share the inference burden (if one device becomes too busy, another device can take more of the load)

Note that with multi-device the application logic is left unchanged, so you don't need to explicitly load the network to every device, create and balance the inference requests, and so on. From the application's point of view, this is just another device that handles the actual machinery. The only requirement for leveraging the performance is to provide the multi-device (and hence the underlying devices) with enough inference requests to process. For example, if you were processing 4 cameras on the CPU (with 4 inference requests), you may now want to process more cameras (with more requests in flight) to keep CPU+GPU busy via multi-device.

The "setup" of Multi-Device can be described in three major steps:

  • First, configure each device as usual (e.g. via the conventional SetConfig method).
  • Second, load a network to the Multi-Device plugin created on top of a (prioritized) list of the configured devices. This is the only change that you need in your application.
  • Finally, just like with any other ExecutableNetwork (returned by LoadNetwork), create as many requests as needed to saturate the devices.

These steps are covered in detail below.

Defining and Configuring the Multi-Device plugin

Following the OpenVINO convention for device names, the Multi-Device plugin is named "MULTI". The only configuration option for the Multi-Device plugin is a prioritized list of devices to use:

| Parameter name | Parameter values | Default | Description |
| :--- | :--- | :--- | :--- |
| "MULTI_DEVICE_PRIORITIES" | comma-separated device names with no spaces | N/A | Prioritized list of devices |

You can use the name of the configuration option directly as a string, or use MultiDeviceConfigParams::KEY_MULTI_DEVICE_PRIORITIES from multi/multi_device_config.hpp, which defines the same string.

There are three ways to specify the devices to be used by "MULTI":

@snippet snippets/MULTI0.cpp part0

Notice that the priorities of the devices can be changed in real time for the executable network:

@snippet snippets/MULTI1.cpp part1

Finally, there is a way to specify the number of requests that the multi-device will internally keep for each device. Suppose your original app was running 4 cameras with 4 inference requests. You would probably want to share these 4 requests between the 2 devices used in the MULTI. The easiest way is to specify the number of requests for each device using parentheses, "MULTI:CPU(2),GPU(2)", and use the same 4 requests in your app. However, such an explicit configuration is not performance-portable and hence not recommended. The better way is to configure the individual devices and query the resulting number of requests to be used at the application level (see Configuring the Individual Devices and Creating the Multi-Device On Top).
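As a rough illustration of the parenthesized syntax, the sketch below assembles such a device string from (device name, request count) pairs. It uses only the C++ standard library; `MakeMultiDeviceString` is a hypothetical helper introduced here, not part of the Inference Engine API, and treating a count of 0 as "no explicit count" is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not an Inference Engine function): builds a device
// string such as "MULTI:CPU(2),GPU(2)" from (device name, request count) pairs.
// In this sketch a count of 0 means the device is listed by name only.
std::string MakeMultiDeviceString(
        const std::vector<std::pair<std::string, int>>& devices) {
    std::string result = "MULTI:";
    for (std::size_t i = 0; i < devices.size(); ++i) {
        assert(!devices[i].first.empty());  // a device name must not be empty
        if (i > 0) {
            result += ',';  // comma-separated, with no spaces
        }
        result += devices[i].first;
        if (devices[i].second > 0) {
            result += '(' + std::to_string(devices[i].second) + ')';
        }
    }
    return result;
}
```

Calling the helper with the pairs ("CPU", 2) and ("GPU", 2) produces the string from the paragraph above. The order of the pairs is preserved, so it doubles as the priority order.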

Enumerating Available Devices

Inference Engine now features a dedicated API to enumerate devices and their capabilities. See the Hello Query Device C++ Sample. This is an example of the output from the sample (truncated to the devices' names only):

./hello_query_device
Available devices: 
    Device: CPU
...
    Device: GPU.0
...
    Device: GPU.1
...
    Device: HDDL

A simple programmatic way to enumerate the devices and use them with the multi-device is as follows:

@snippet snippets/MULTI2.cpp part2
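The string handling involved can be sketched with the standard library alone. The helper below takes a list of device names, such as the `std::vector<std::string>` that `Core::GetAvailableDevices()` returns, and joins them into a single "MULTI:..." device string. `MakeMultiFromDevices` is a hypothetical name used for illustration, and the call that obtains the list is assumed to happen elsewhere.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: joins device names into "MULTI:<dev1>,<dev2>,...".
// The order of the input vector becomes the priority order.
std::string MakeMultiFromDevices(const std::vector<std::string>& devices) {
    std::string allDevices = "MULTI:";
    for (std::size_t i = 0; i < devices.size(); ++i) {
        assert(!devices[i].empty());  // a device name must not be empty
        if (i > 0) {
            allDevices += ',';  // separator goes between names only
        }
        allDevices += devices[i];
    }
    return allDevices;
}
```

For the device list shown in the sample output above, the result would be "MULTI:CPU,GPU.0,GPU.1,HDDL", which can then be passed wherever a device name is expected.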

Beyond the simple "CPU", "GPU", "HDDL" and so on, when multiple instances of a device are available, the names are more qualified. For example, this is how two Intel® Movidius™ Myriad™ X sticks are listed by the hello_query_device sample:

...
    Device: MYRIAD.1.2-ma2480
...
    Device: MYRIAD.1.4-ma2480

So the explicit configuration to use both would be "MULTI:MYRIAD.1.2-ma2480,MYRIAD.1.4-ma2480". Accordingly, the code that loops over only the available devices of the "MYRIAD" type is below:

@snippet snippets/MULTI3.cpp part3
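One possible way to express the "MYRIAD only" selection is a filter over the full list of qualified device names. This is a standard-library sketch under the assumption that the names follow the `<TYPE>.<suffix>` pattern shown above; `FilterDevicesByType` is a hypothetical helper, and the referenced snippet may obtain the list differently.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: keeps the devices that are either named exactly
// `type` ("MYRIAD") or qualified with it ("MYRIAD.1.2-ma2480").
std::vector<std::string> FilterDevicesByType(
        const std::vector<std::string>& devices, const std::string& type) {
    assert(!type.empty());  // an empty type would match nothing useful
    std::vector<std::string> matching;
    for (const auto& name : devices) {
        const bool exact = (name == type);
        // Require a '.' right after the type so that a different type
        // sharing the same prefix is not picked up by accident.
        const bool qualified = name.size() > type.size() &&
                               name.compare(0, type.size(), type) == 0 &&
                               name[type.size()] == '.';
        if (exact || qualified) {
            matching.push_back(name);
        }
    }
    return matching;
}
```

The filtered names can then be joined with commas after the "MULTI:" prefix to form the explicit configuration string.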

Configuring the Individual Devices and Creating the Multi-Device On Top

As discussed in the first section, you configure each individual device as usual and then just create the "MULTI" device on top:

@snippet snippets/MULTI4.cpp part4

Alternatively, you can combine all the individual device settings into a single config and load that, allowing the Multi-Device plugin to parse the settings and apply them to the right devices. See the code example in the next section.

Note that while the performance of accelerators combines really well with multi-device, CPU+GPU execution poses some performance caveats, as these devices share power, bandwidth and other resources. For example, it is recommended to enable the GPU throttling hint (which saves another CPU thread for the CPU inference). See the section Using the Multi-Device with OpenVINO Samples and Benchmarking the Performance below.

Querying the Optimal Number of Inference Requests

Note that until R2 you had to calculate the number of requests in your application for any device; for example, you had to know that Intel® Vision Accelerator Design with Intel® Movidius™ VPUs required at least 32 inference requests to perform well. Now you can use the new GetMetric API to query the optimal number of requests. Similarly, when using the multi-device you don't need to sum over the included devices yourself; you can query the metric directly:

@snippet snippets/MULTI5.cpp part5
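For comparison, the manual bookkeeping that the metric makes unnecessary might look like the sketch below. It assumes, as the paragraph above suggests, that the combined figure is the sum of the per-device figures. `SumOptimalRequests` is a hypothetical helper, and the per-device numbers are values the application would have had to know in advance.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical helper: adds up per-device request counts that the
// application would otherwise have to track by hand.
unsigned SumOptimalRequests(const std::map<std::string, unsigned>& perDevice) {
    unsigned total = 0;
    for (const auto& entry : perDevice) {
        assert(!entry.first.empty());  // every count must belong to a device
        total += entry.second;
    }
    return total;
}
```

With the 32 requests mentioned above for HDDL and an illustrative (made-up) figure of 4 for a GPU, the helper returns 36. Querying the metric on the executable network avoids hard-coding such numbers in the application.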

Using the Multi-Device with OpenVINO Samples and Benchmarking the Performance

Note that every OpenVINO sample that supports the "-d" (which stands for "device") command-line option transparently accepts the multi-device. The Benchmark Application is the best reference for the optimal usage of the multi-device. As discussed multiple times earlier, you don't need to set up the number of requests, CPU streams or threads, as the application provides optimal out-of-the-box performance. Below is an example command line to evaluate HDDL+GPU performance with it:

./benchmark_app -d MULTI:HDDL,GPU -m <model> -i <input> -niter 1000

Note that you can use the FP16 IR with the multi-device: the CPU automatically upconverts it to FP32, and the rest of the devices support it natively. Also note that no demos are (yet) fully optimized for the multi-device in terms of supporting the OPTIMAL_NUMBER_OF_INFER_REQUESTS metric, using the GPU streams/throttling, and so on.

Video: MULTI Plugin

See Also