Removed information about FPGA plugin (#7474)

Ilya Lavrenov 2021-09-13 14:01:49 +03:00 committed by GitHub
parent b11b1d44cb
commit b373cb844b
30 changed files with 37 additions and 163 deletions

View File

@ -19,7 +19,6 @@ Glossary {#openvino_docs_IE_DG_Glossary}
| ELU | Exponential Linear rectification Unit |
| FCN | Fully Convolutional Network |
| FP | Floating Point |
| FPGA | Field-Programmable Gate Array |
| GCC | GNU Compiler Collection |
| GPU | Graphics Processing Unit |
| HD | High Definition |

View File

@ -29,8 +29,6 @@ The function returns list of available devices, for example:
```
MYRIAD.1.2-ma2480
MYRIAD.1.4-ma2480
FPGA.0
FPGA.1
CPU
GPU.0
GPU.1

View File

@ -23,7 +23,7 @@ If transmitting data from one part of a network to another part in heterogeneous
In this case, you can define the heaviest part manually and set the affinity to avoid sending data back and forth many times during one inference.
## Annotation of Layers per Device and Default Fallback Policy
Default fallback policy decides which layer goes to which device automatically according to the support in dedicated plugins (FPGA, GPU, CPU, MYRIAD).
Default fallback policy decides which layer goes to which device automatically according to the support in dedicated plugins (GPU, CPU, MYRIAD).
Another way to annotate a network is to set affinity manually using <code>ngraph::Node::get_rt_info</code> with key `"affinity"`:
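A minimal sketch of such manual annotation (not part of this commit), assuming the 2021.x `ngraph::Variant` API; the rule of sending `Convolution` nodes to GPU and everything else to CPU is purely illustrative:
```cpp
#include <ie_core.hpp>
#include <ngraph/variant.hpp>

int main() {
    InferenceEngine::Core core;
    auto network = core.ReadNetwork("sample.xml");
    auto function = network.getFunction();

    // Illustrative rule: run Convolution nodes on GPU, everything else on CPU.
    for (auto&& node : function->get_ops()) {
        const std::string affinity =
            std::string(node->get_type_name()) == "Convolution" ? "GPU" : "CPU";
        node->get_rt_info()["affinity"] =
            std::make_shared<ngraph::VariantWrapper<std::string>>(affinity);
    }

    auto executable_network = core.LoadNetwork(network, "HETERO:GPU,CPU");
    return 0;
}
```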
@ -46,25 +46,16 @@ If you rely on the default affinity distribution, you can avoid calling <code>In
During loading of the network to the heterogeneous plugin, the network is divided into separate parts and loaded to the dedicated plugins.
Intermediate blobs between these subgraphs are allocated automatically in the most efficient way.
## Execution Precision
Precision for inference in heterogeneous plugin is defined by
* Precision of IR.
* Ability of final plugins to execute in precision defined in IR
Examples:
* If you want to execute GPU with CPU fallback with FP16 on GPU, you need to use only FP16 IR.
* If you want to execute on FPGA with CPU fallback, you can use any precision for IR. The execution on FPGA is defined by bitstream, the execution on CPU happens in FP32.
Samples can be used with the following command:
```sh
./object_detection_sample_ssd -m <path_to_model>/ModelSSD.xml -i <path_to_pictures>/picture.jpg -d HETERO:FPGA,CPU
./object_detection_sample_ssd -m <path_to_model>/ModelSSD.xml -i <path_to_pictures>/picture.jpg -d HETERO:GPU,CPU
```
where:
- `HETERO` stands for heterogeneous plugin
- `FPGA,CPU` points to fallback policy with priority on FPGA and fallback to CPU
- `GPU,CPU` points to fallback policy with priority on GPU and fallback to CPU
You can point more than two devices: `-d HETERO:FPGA,GPU,CPU`
You can point more than two devices: `-d HETERO:GPU,MYRIAD,CPU`
## Analyzing Heterogeneous Execution
After enabling the <code>KEY_HETERO_DUMP_GRAPH_DOT</code> config key, you can dump GraphViz* `.dot` files with per-layer device annotations.
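For reference, a minimal sketch of enabling this dump for a `HETERO:GPU,CPU` target (the model path is a placeholder):
```cpp
#include <ie_core.hpp>

int main() {
    using namespace InferenceEngine::PluginConfigParams;
    using namespace InferenceEngine::HeteroConfigParams;

    InferenceEngine::Core ie;
    auto network = ie.ReadNetwork("sample.xml");
    // With KEY_HETERO_DUMP_GRAPH_DOT enabled, per-layer device annotations are dumped
    // as GraphViz .dot files during LoadNetwork.
    auto execNetwork = ie.LoadNetwork(network, "HETERO:GPU,CPU", {{KEY_HETERO_DUMP_GRAPH_DOT, YES}});
    return 0;
}
```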

View File

@ -27,11 +27,9 @@ limitations under the License.
<tab type="usergroup" title="Installation Guides" url=""><!--automatically generated-->
<tab type="usergroup" title="Linux" url="@ref openvino_docs_install_guides_installing_openvino_linux">
<tab type="user" title="Install Intel® Distribution of OpenVINO™ toolkit for Linux* OS" url="@ref openvino_docs_install_guides_installing_openvino_linux"/>
<tab type="user" title="[DEPRECATED] Install Intel® Distribution of OpenVINO™ toolkit for Linux with FPGA Support" url="@ref openvino_docs_install_guides_installing_openvino_linux_fpga"/>
</tab>
<tab type="usergroup" title="Windows" url="@ref openvino_docs_install_guides_installing_openvino_windows">
<tab type="user" title="Install Intel® Distribution of OpenVINO™ toolkit for Windows* 10" url="@ref openvino_docs_install_guides_installing_openvino_windows"/>
<tab type="user" title="[DEPRECATED] Install Intel® Distribution of OpenVINO™ toolkit for Windows* with FPGA support" url="@ref openvino_docs_install_guides_installing_openvino_windows_fpga"/>
</tab>
<tab type="user" title="macOS" url="@ref openvino_docs_install_guides_installing_openvino_macos"/>
<tab type="user" title="Raspbian OS" url="@ref openvino_docs_install_guides_installing_openvino_raspbian"/>

View File

@ -353,14 +353,6 @@ docker run -itu root:root --rm --device=/dev/ion:/dev/ion -v /var/tmp:/var/tmp <
/bin/bash -c "apt update && apt install sudo && deployment_tools/demo/demo_security_barrier_camera.sh -d HDDL -sample-options -no_show"
```
## Use a Docker* Image for FPGA
Intel will be transitioning to the next-generation programmable deep-learning solution based on FPGAs in order to increase the level of customization possible in FPGA deep-learning. As part of this transition, future standard releases (i.e., non-LTS releases) of Intel® Distribution of OpenVINO™ toolkit will no longer include the Intel® Vision Accelerator Design with an Intel® Arria® 10 FPGA and the Intel® Programmable Acceleration Card with Intel® Arria® 10 GX FPGA.
Intel® Distribution of OpenVINO™ toolkit 2020.3.X LTS release will continue to support Intel® Vision Accelerator Design with an Intel® Arria® 10 FPGA and the Intel® Programmable Acceleration Card with Intel® Arria® 10 GX FPGA. For questions about next-generation programmable deep-learning solutions based on FPGAs, please talk to your sales representative or contact us to get the latest FPGA updates.
For instructions for previous releases with FPGA Support, see documentation for the [2020.4 version](https://docs.openvinotoolkit.org/2020.4/openvino_docs_install_guides_installing_openvino_docker_linux.html#use_a_docker_image_for_fpga) or lower.
## Troubleshooting
If you have proxy issues, please set up proxy settings for Docker. See the Proxy section in the [Install the DL Workbench from Docker Hub*](@ref workbench_docs_Workbench_DG_Run_Locally) topic.

View File

@ -1,21 +0,0 @@
# Install Intel® Distribution of OpenVINO™ toolkit for Linux* with FPGA Support {#openvino_docs_install_guides_installing_openvino_linux_fpga}
## Product Change Notice
Intel® Distribution of OpenVINO™ toolkit for Intel® Vision Accelerator Design with an Intel® Arria® 10 FPGA and the Intel® Programmable Acceleration Card with Intel® Arria® 10 GX FPGA
<table>
<tr>
<td><strong>Change Notice Begins</strong></td>
<td>July 2020</td>
</tr>
<tr>
<td><strong>Change Date</strong></td>
<td>October 2020</td>
</tr>
</table>
Intel will be transitioning to the next-generation programmable deep-learning solution based on FPGAs in order to increase the level of customization possible in FPGA deep-learning. As part of this transition, future standard releases (i.e., non-LTS releases) of Intel® Distribution of OpenVINO™ toolkit will no longer include the Intel® Vision Accelerator Design with an Intel® Arria® 10 FPGA and the Intel® Programmable Acceleration Card with Intel® Arria® 10 GX FPGA.
Intel® Distribution of OpenVINO™ toolkit 2020.3.X LTS release will continue to support Intel® Vision Accelerator Design with an Intel® Arria® 10 FPGA and the Intel® Programmable Acceleration Card with Intel® Arria® 10 GX FPGA. For questions about next-generation programmable deep-learning solutions based on FPGAs, please talk to your sales representative or contact us to get the latest FPGA updates.
For installation instructions for the last release of Intel® Distribution of OpenVINO™ toolkit for Linux* with FPGA Support, see documentation for the [2020.4 version](https://docs.openvinotoolkit.org/2020.4/openvino_docs_install_guides_installing_openvino_linux_fpga.html).

View File

@ -1,21 +0,0 @@
# Install Intel® Distribution of OpenVINO™ toolkit for Windows* with FPGA Support {#openvino_docs_install_guides_installing_openvino_windows_fpga}
## Product Change Notice
Intel® Distribution of OpenVINO™ toolkit for Intel® Vision Accelerator Design with an Intel® Arria® 10 FPGA and the Intel® Programmable Acceleration Card with Intel® Arria® 10 GX FPGA
<table>
<tr>
<td><strong>Change Notice Begins</strong></td>
<td>July 2020</td>
</tr>
<tr>
<td><strong>Change Date</strong></td>
<td>October 2020</td>
</tr>
</table>
Intel will be transitioning to the next-generation programmable deep-learning solution based on FPGAs in order to increase the level of customization possible in FPGA deep-learning. As part of this transition, future standard releases (i.e., non-LTS releases) of Intel® Distribution of OpenVINO™ toolkit will no longer include the Intel® Vision Accelerator Design with an Intel® Arria® 10 FPGA and the Intel® Programmable Acceleration Card with Intel® Arria® 10 GX FPGA.
Intel® Distribution of OpenVINO™ toolkit 2020.3.X LTS release will continue to support Intel® Vision Accelerator Design with an Intel® Arria® 10 FPGA and the Intel® Programmable Acceleration Card with Intel® Arria® 10 GX FPGA. For questions about next-generation programmable deep-learning solutions based on FPGAs, please talk to your sales representative or contact us to get the latest FPGA updates.
For installation instructions for the last release of Intel® Distribution of OpenVINO™ toolkit for Windows* with FPGA Support, see documentation for the [2020.4 version](https://docs.openvinotoolkit.org/2020.4/openvino_docs_install_guides_installing_openvino_windows_fpga.html).

View File

@ -196,16 +196,6 @@ Since Intel® Movidius™ Myriad™ X Visual Processing Unit (Intel® Movidius
Intel® Vision Accelerator Design with Intel® Movidius™ VPUs requires keeping at least 32 inference requests in flight to fully saturate the device.
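A rough sketch (not from the original document) of keeping that many requests in flight with the asynchronous API; the model path is a placeholder and `HDDL` is assumed as the device name for Intel® Vision Accelerator Design with Intel® Movidius™ VPUs:
```cpp
#include <ie_core.hpp>
#include <vector>

int main() {
    InferenceEngine::Core core;
    auto network = core.ReadNetwork("model.xml");
    auto executable_network = core.LoadNetwork(network, "HDDL");

    // Keep 32 requests in flight, the saturation point quoted above for this device.
    std::vector<InferenceEngine::InferRequest> requests;
    for (int i = 0; i < 32; ++i)
        requests.push_back(executable_network.CreateInferRequest());

    for (auto& request : requests)
        request.StartAsync();
    for (auto& request : requests)
        request.Wait(InferenceEngine::InferRequest::WaitMode::RESULT_READY);
    return 0;
}
```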
### FPGA <a name="fpga"></a>
Below are listed the most important tips for the efficient usage of the FPGA:
- Just like for the Intel® Movidius™ Myriad™ VPU flavors, for the FPGA, it is important to hide the communication overheads by running multiple inference requests in parallel. For examples, refer to the [Benchmark App Sample](../../inference-engine/samples/benchmark_app/README.md).
- Since the first inference iteration with FPGA is always significantly slower than the subsequent ones, make sure you run multiple iterations (all samples, except GUI-based demos, have the `-ni` or 'niter' option to do that).
- FPGA performance heavily depends on the bitstream.
- Number of the infer request per executable network is limited to five, so “channel” parallelism (keeping individual infer request per camera/video input) would not work beyond five inputs. Instead, you need to mux the inputs into some queue that will internally use a pool of (5) requests.
- In most scenarios, the FPGA acceleration is leveraged through <a href="heterogeneity">heterogeneous execution</a> with further specific tips.
## Heterogeneity <a name="heterogeneity"></a>
Heterogeneous execution (provided by the dedicated Inference Engine [“Hetero” plugin](../IE_DG/supported_plugins/HETERO.md)) enables scheduling a network inference across multiple devices.
@ -249,23 +239,15 @@ Every Inference Engine sample supports the `-d` (device) option.
For example, here is a command to run an [Object Detection Sample SSD Sample](../../inference-engine/samples/object_detection_sample_ssd/README.md):
```sh
./object_detection_sample_ssd -m <path_to_model>/ModelSSD.xml -i <path_to_pictures>/picture.jpg -d HETERO:FPGA,CPU
./object_detection_sample_ssd -m <path_to_model>/ModelSSD.xml -i <path_to_pictures>/picture.jpg -d HETERO:GPU,CPU
```
where:
- `HETERO` stands for Heterogeneous plugin.
- `FPGA,CPU` points to fallback policy with first priority on FPGA and further fallback to CPU.
- `GPU,CPU` points to fallback policy with first priority on GPU and further fallback to CPU.
You can point more than two devices: `-d HETERO:FPGA,GPU,CPU`.
### Heterogeneous Scenarios with FPGA <a name="heterogeneous-scenarios-fpga"></a>
As FPGA is considered as an inference accelerator, most performance issues are related to the fact that due to the fallback, the CPU can be still used quite heavily.
- Yet in most cases, the CPU does only small/lightweight layers, for example, post-processing (`SoftMax` in most classification models or `DetectionOutput` in the SSD*-based topologies). In that case, limiting the number of CPU threads with [`KEY_CPU_THREADS_NUM`](../IE_DG/supported_plugins/CPU.md) config would further reduce the CPU utilization without significantly degrading the overall performance.
- Also, if you are still using OpenVINO™ toolkit version earlier than R1 2019, or if you have recompiled the Inference Engine with OpenMP (say for backward compatibility), setting the `KMP_BLOCKTIME` environment variable to something less than default 200ms (we suggest 1ms) is particularly helpful. Use `KMP_BLOCKTIME=0` if the CPU subgraph is small.
> **NOTE**: General threading tips (see <a href="#note-on-app-level-threading">Note on the App-Level Threading</a>) apply well, even when the entire topology fits the FPGA, because there is still a host-side code for data pre- and post-processing.
You can point more than two devices: `-d HETERO:GPU,MYRIAD,CPU`.
### General Tips on GPU/CPU Execution <a name="tips-on-gpu-cpu-execution"></a>

View File

@ -12,7 +12,7 @@ auto function = network.getFunction();
// This example demonstrates how to perform default affinity initialization and then
// correct affinity manually for some layers
const std::string device = "HETERO:FPGA,CPU";
const std::string device = "HETERO:GPU,CPU";
// QueryNetworkResult object contains map layer -> device
InferenceEngine::QueryNetworkResult res = core.QueryNetwork(network, device, { });
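The hunk ends here. One plausible continuation (a sketch, not necessarily the file's actual code) writes the query result back to the nodes through the same `"affinity"` attribute; `core`, `network`, `function`, and `device` come from the snippet above, and `"layerName"` is a placeholder:
```cpp
// Override the automatically chosen device for one layer ("layerName" is a placeholder).
res.supportedLayersMap["layerName"] = "CPU";

// Propagate the per-layer decisions to the ngraph function via the "affinity" runtime attribute.
for (auto&& node : function->get_ops()) {
    auto& affinity = res.supportedLayersMap[node->get_friendly_name()];
    node->get_rt_info()["affinity"] =
        std::make_shared<ngraph::VariantWrapper<std::string>>(affinity);
}

auto executable_network = core.LoadNetwork(network, device);
```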

View File

@ -5,7 +5,7 @@ using namespace InferenceEngine;
//! [part2]
InferenceEngine::Core core;
auto network = core.ReadNetwork("sample.xml");
auto executable_network = core.LoadNetwork(network, "HETERO:FPGA,CPU");
auto executable_network = core.LoadNetwork(network, "HETERO:GPU,CPU");
//! [part2]
return 0;
}

View File

@ -1,17 +0,0 @@
#include <ie_core.hpp>
int main() {
using namespace InferenceEngine;
//! [part0]
using namespace InferenceEngine::PluginConfigParams;
using namespace InferenceEngine::HeteroConfigParams;
Core ie;
auto network = ie.ReadNetwork("sample.xml");
// ...
auto execNetwork = ie.LoadNetwork(network, "HETERO:FPGA,CPU", { {KEY_HETERO_DUMP_GRAPH_DOT, YES} });
//! [part0]
return 0;
}

View File

@ -501,7 +501,7 @@ INFERENCE_ENGINE_C_API(IE_NODISCARD IEStatusCode) ie_core_get_config(const ie_co
* @brief Gets available devices for neural network inference.
* @ingroup Core
* @param core A pointer to ie_core_t instance.
* @param avai_devices The devices are returned as { CPU, FPGA.0, FPGA.1, MYRIAD }
* @param avai_devices The devices are returned as { CPU, GPU.0, GPU.1, MYRIAD }
* If there is more than one device of a specific type, they are enumerated with a .# suffix
* @return Status code of the operation: OK(0) for success.
*/

View File

@ -9,7 +9,7 @@ from enum import Enum
supported_precisions = ['FP32', 'FP64', 'FP16', 'I64', 'U64', 'I32', 'U32',
'I16', 'I4', 'I8', 'U16', 'U4', 'U8', 'BOOL', 'BIN', 'BF16']
known_plugins = ['CPU', 'GPU', 'FPGA', 'MYRIAD', 'HETERO', 'HDDL', 'MULTI']
known_plugins = ['CPU', 'GPU', 'MYRIAD', 'HETERO', 'HDDL', 'MULTI']
layout_int_to_str_map = {0: 'ANY', 1: 'NCHW', 2: 'NHWC', 3: 'NCDHW', 4: 'NDHWC', 64: 'OIHW', 95: 'SCALAR', 96: 'C',
128: 'CHW', 192: 'HW', 193: 'NC', 194: 'CN', 200: 'BLOCKED'}

View File

@ -541,7 +541,7 @@ cdef class IECore:
def get_config(self, device_name: str, config_name: str):
return self.impl.getConfig(device_name.encode(), config_name.encode())
## A list of devices. The devices are returned as \[CPU, FPGA.0, FPGA.1, MYRIAD\].
## A list of devices. The devices are returned as \[CPU, GPU.0, GPU.1, MYRIAD\].
# If there is more than one device of a specific type, all of them are listed, each followed by a dot and a number.
@property
def available_devices(self):

View File

@ -5,7 +5,7 @@ OpenVINO™ toolkit quickly deploys applications and solutions that emulate huma
OpenVINO™ toolkit:
- Enables CNN-based deep learning inference on the edge
- Supports heterogeneous execution across an Intel® CPU, Intel® Integrated Graphics, Intel® FPGA, Intel® Neural Compute Stick 2, and Intel® Vision Accelerator Design with Intel® Movidius™ VPUs
- Supports heterogeneous execution across an Intel® CPU, Intel® Integrated Graphics, Intel® Neural Compute Stick 2, and Intel® Vision Accelerator Design with Intel® Movidius™ VPUs
- Speeds time-to-market via an easy-to-use library of computer vision functions and pre-optimized kernels
- Includes optimized calls for computer vision standards, including OpenCV\* and OpenCL™

View File

@ -140,7 +140,7 @@ To run the tool, you can use [public](@ref omz_models_group_public) or [Intel's]
## Examples of Running the Tool
This section provides step-by-step instructions on how to run the Benchmark Tool with the `googlenet-v1` public model on CPU or FPGA devices. As an input, the `car.png` file from the `<INSTALL_DIR>/deployment_tools/demo/` directory is used.
This section provides step-by-step instructions on how to run the Benchmark Tool with the `googlenet-v1` public model on CPU or GPU devices. As an input, the `car.png` file from the `<INSTALL_DIR>/deployment_tools/demo/` directory is used.
> **NOTE:** Internet access is required to execute the following steps successfully. If you have access to the Internet only through a proxy server, please make sure that it is configured in your OS environment.
@ -158,21 +158,21 @@ This section provides step-by-step instructions on how to run the Benchmark Tool
```sh
python3 mo.py --input_model <models_dir>/public/googlenet-v1/googlenet-v1.caffemodel --data_type FP32 --output_dir <ir_dir>
```
3. Run the tool with specifying the `<INSTALL_DIR>/deployment_tools/demo/car.png` file as an input image, the IR of the `googlenet-v1` model and a device to perform inference on. The following commands demonstrate running the Benchmark Tool in the asynchronous mode on CPU and FPGA devices:
3. Run the tool, specifying the `<INSTALL_DIR>/deployment_tools/demo/car.png` file as an input image, the IR of the `googlenet-v1` model, and a device to perform inference on. The following commands demonstrate running the Benchmark Tool in asynchronous mode on CPU and GPU devices:
* On CPU:
```sh
./benchmark_app -m <ir_dir>/googlenet-v1.xml -i <INSTALL_DIR>/deployment_tools/demo/car.png -d CPU -api async --progress true
```
* On FPGA:
* On GPU:
```sh
./benchmark_app -m <ir_dir>/googlenet-v1.xml -i <INSTALL_DIR>/deployment_tools/demo/car.png -d HETERO:FPGA,CPU -api async --progress true
./benchmark_app -m <ir_dir>/googlenet-v1.xml -i <INSTALL_DIR>/deployment_tools/demo/car.png -d GPU -api async --progress true
```
The application outputs the number of executed iterations, total duration of execution, latency, and throughput.
Additionally, if you set the `-report_type` parameter, the application outputs a statistics report. If you set the `-pc` parameter, the application outputs performance counters. If you set `-exec_graph_path`, the application reports the executable graph information, serialized. All measurements, including per-layer PM counters, are reported in milliseconds.
Below are fragments of sample output for CPU and FPGA devices:
Below are fragments of sample output for CPU and GPU devices:
* For CPU:
```
@ -189,7 +189,7 @@ Below are fragments of sample output for CPU and FPGA devices:
Throughput: 76.73 FPS
```
* For FPGA:
* For GPU:
```
[Step 10/11] Measuring performance (Start inference asynchronously, 5 inference requests using 4 streams for CPU, limits: 120000 ms duration)
Progress: [....................] 100% done

View File

@ -59,7 +59,6 @@ uint32_t deviceDefaultDeviceDurationInSeconds(const std::string& device) {
{"VPU", 60},
{"MYRIAD", 60},
{"HDDL", 60},
{"FPGA", 120},
{"UNKNOWN", 120}};
uint32_t duration = 0;
for (const auto& deviceDurationInSeconds : deviceDefaultDurationInSeconds) {

View File

@ -249,7 +249,7 @@ public:
/**
* @brief Returns devices available for neural networks inference
*
* @return A vector of devices. The devices are returned as { CPU, FPGA.0, FPGA.1, MYRIAD }
* @return A vector of devices. The devices are returned as { CPU, GPU.0, GPU.1, MYRIAD }
* If there is more than one device of a specific type, they are enumerated with a .# suffix.
*/
std::vector<std::string> GetAvailableDevices() const;
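A short usage sketch of this method (the device names in the comment are examples; the actual list depends on the host):
```cpp
#include <ie_core.hpp>
#include <iostream>

int main() {
    InferenceEngine::Core core;
    // Typically prints entries such as CPU, GPU.0, GPU.1, MYRIAD, one per line.
    for (const auto& device : core.GetAvailableDevices()) {
        std::cout << device << std::endl;
    }
    return 0;
}
```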

View File

@ -230,7 +230,7 @@ public:
/**
* @brief Returns devices available for neural networks inference
*
* @return A vector of devices. The devices are returned as { CPU, FPGA.0, FPGA.1, MYRIAD }
* @return A vector of devices. The devices are returned as { CPU, GPU.0, GPU.1, MYRIAD }
* If there is more than one device of a specific type, they are enumerated with a .# suffix.
*/
std::vector<std::string> get_available_devices() const;

View File

@ -598,7 +598,7 @@ public:
/**
* @brief Returns devices available for neural networks inference
*
* @return A vector of devices. The devices are returned as { CPU, FPGA.0, FPGA.1, MYRIAD }
* @return A vector of devices. The devices are returned as { CPU, GPU.0, GPU.1, MYRIAD }
* If there is more than one device of a specific type, they are enumerated with a .# suffix.
*/
std::vector<std::string> GetAvailableDevices() const override {
@ -766,7 +766,7 @@ public:
* @brief Sets config values for a plugin or set of plugins
* @param deviceName A device name to set config to
* If empty, config is set for all the plugins / plugin's meta-data
* @note `deviceName` is not allowed in form of MULTI:CPU, HETERO:FPGA,CPU, AUTO:CPU
* @note `deviceName` is not allowed in form of MULTI:CPU, HETERO:GPU,CPU, AUTO:CPU
* just simple forms like CPU, GPU, MULTI, GPU.0, etc
*/
void SetConfigForPlugins(const std::map<std::string, std::string>& configMap, const std::string& deviceName) {
@ -1132,7 +1132,7 @@ void Core::SetConfig(const std::map<std::string, std::string>& config, const std
"You can configure the devices with SetConfig before creating the AUTO on top.";
}
// GPU.0, FPGA.1 cases
// GPU.0, GPU.1 cases
if (deviceName.find(".") != std::string::npos) {
IE_THROW()
<< "SetConfig is supported only for device family itself (without particular device .#). "
@ -1314,7 +1314,7 @@ void Core::set_config(const ConfigMap& config, const std::string& deviceName) {
"You can configure the devices with SetConfig before creating the AUTO on top.";
}
// GPU.0, FPGA.1 cases
// GPU.0, GPU.1 cases
if (deviceName.find(".") != std::string::npos) {
IE_THROW()
<< "SetConfig is supported only for device family itself (without particular device .#). "

View File

@ -121,7 +121,7 @@ public:
/**
* @brief Returns devices available for neural networks inference
*
* @return A vector of devices. The devices are returned as { CPU, FPGA.0, FPGA.1, MYRIAD }
* @return A vector of devices. The devices are returned as { CPU, GPU.0, GPU.1, MYRIAD }
* If there is more than one device of a specific type, they are enumerated with a .# suffix.
*/
virtual std::vector<std::string> GetAvailableDevices() const = 0;

View File

@ -51,7 +51,7 @@ private:
/**
* @brief This is global point for getting task executor objects by string id.
* It's necessary in multiple asynchronous requests for having unique executors to avoid oversubscription.
* E.g. There 2 task executors for CPU device: one - in FPGA, another - in MKLDNN. Parallel execution both of them leads
* E.g. there are 2 task executors for the CPU device: one in GPU, another in MKLDNN. Running both of them in parallel leads
* to suboptimal CPU usage. It is more efficient to run the corresponding tasks one by one via a single executor.
* @ingroup ie_dev_api_threading
*/

View File

@ -107,7 +107,4 @@ const TestModel convReluNormPoolFcModelFP32 = getConvReluNormPoolFcModel(Inferen
const TestModel convReluNormPoolFcModelFP16 = getConvReluNormPoolFcModel(InferenceEngine::Precision::FP16);
const TestModel convReluNormPoolFcModelQ78 = getConvReluNormPoolFcModel(InferenceEngine::Precision::Q78);
class FPGAHangingTest : public BehaviorPluginTest {
};
#endif

View File

@ -17,7 +17,7 @@ def pytest_addoption(parser):
parser.addoption(
"--backend",
default="CPU",
choices=["CPU", "GPU", "FPGA", "HDDL", "MYRIAD", "HETERO", "TEMPLATE"],
choices=["CPU", "GPU", "HDDL", "MYRIAD", "HETERO", "TEMPLATE"],
help="Select target device",
)
parser.addoption(
@ -42,7 +42,6 @@ def pytest_configure(config):
# register additional markers
config.addinivalue_line("markers", "skip_on_cpu: Skip test on CPU")
config.addinivalue_line("markers", "skip_on_gpu: Skip test on GPU")
config.addinivalue_line("markers", "skip_on_fpga: Skip test on FPGA")
config.addinivalue_line("markers", "skip_on_hddl: Skip test on HDDL")
config.addinivalue_line("markers", "skip_on_myriad: Skip test on MYRIAD")
config.addinivalue_line("markers", "skip_on_hetero: Skip test on HETERO")
@ -58,7 +57,6 @@ def pytest_collection_modifyitems(config, items):
keywords = {
"CPU": "skip_on_cpu",
"GPU": "skip_on_gpu",
"FPGA": "skip_on_fpga",
"HDDL": "skip_on_hddl",
"MYRIAD": "skip_on_myriad",
"HETERO": "skip_on_hetero",
@ -68,7 +66,6 @@ def pytest_collection_modifyitems(config, items):
skip_markers = {
"CPU": pytest.mark.skip(reason="Skipping test on the CPU backend."),
"GPU": pytest.mark.skip(reason="Skipping test on the GPU backend."),
"FPGA": pytest.mark.skip(reason="Skipping test on the FPGA backend."),
"HDDL": pytest.mark.skip(reason="Skipping test on the HDDL backend."),
"MYRIAD": pytest.mark.skip(reason="Skipping test on the MYRIAD backend."),
"HETERO": pytest.mark.skip(reason="Skipping test on the HETERO backend."),

View File

@ -33,7 +33,7 @@ To run the demos, run demo_squeezenet_download_convert_run.sh or demo_security_b
./demo_squeezenet_download_convert_run.sh
The script allows to specify the target device to infer on using -d <CPU|GPU|MYRIAD|FPGA> option.
The script allows you to specify the target device to infer on using the -d <CPU|GPU|MYRIAD> option.
Classification Demo Using SqueezeNet
====================================

View File

@ -20,6 +20,6 @@ else:
if not os.path.exists(out_path):
os.makedirs(out_path)
# supported_devices : CPU, GPU, MYRIAD, FPGA
# supported_devices : CPU, GPU, MYRIAD
test_device = os.environ.get('TEST_DEVICE', 'CPU;GPU').split(';')
test_precision = os.environ.get('TEST_PRECISION', 'FP32;FP16').split(';')

View File

@ -79,7 +79,7 @@ Options:
compiled model.
-d TARGET_DEVICE, --target_device TARGET_DEVICE
Optional. Specify a target device to infer on: CPU,
GPU, FPGA, HDDL or MYRIAD.
GPU, HDDL or MYRIAD.
Use "-d HETERO:<comma separated devices list>" format to specify HETERO plugin.
Use "-d MULTI:<comma separated devices list>" format to specify MULTI plugin.
The application looks for a suitable plugin for the specified device.
@ -149,7 +149,7 @@ To run the tool, you can use [public](@ref omz_models_group_public) or [Intel's]
## Examples of Running the Tool
This section provides step-by-step instructions on how to run the Benchmark Tool with the `googlenet-v1` public model on CPU or FPGA devices. As an input, the `car.png` file from the `<INSTALL_DIR>/deployment_tools/demo/` directory is used.
This section provides step-by-step instructions on how to run the Benchmark Tool with the `googlenet-v1` public model on CPU or GPU devices. As an input, the `car.png` file from the `<INSTALL_DIR>/deployment_tools/demo/` directory is used.
> **NOTE:** Internet access is required to execute the following steps successfully. If you have access to the Internet only through a proxy server, please make sure that it is configured in your OS environment.
@ -167,22 +167,22 @@ This section provides step-by-step instructions on how to run the Benchmark Tool
```sh
python3 mo.py --input_model <models_dir>/public/googlenet-v1/googlenet-v1.caffemodel --data_type FP32 --output_dir <ir_dir>
```
3. Run the tool with specifying the `<INSTALL_DIR>/deployment_tools/demo/car.png` file as an input image, the IR of the `googlenet-v1` model and a device to perform inference on. The following commands demonstrate running the Benchmark Tool in the asynchronous mode on CPU and FPGA devices:
3. Run the tool, specifying the `<INSTALL_DIR>/deployment_tools/demo/car.png` file as an input image, the IR of the `googlenet-v1` model, and a device to perform inference on. The following commands demonstrate running the Benchmark Tool in asynchronous mode on CPU and GPU devices:
* On CPU:
```sh
python3 benchmark_app.py -m <ir_dir>/googlenet-v1.xml -d CPU -api async -i <INSTALL_DIR>/deployment_tools/demo/car.png --progress true -b 1
```
* On FPGA:
* On GPU:
```sh
python3 benchmark_app.py -m <ir_dir>/googlenet-v1.xml -d HETERO:FPGA,CPU -api async -i <INSTALL_DIR>/deployment_tools/demo/car.png --progress true -b 1
python3 benchmark_app.py -m <ir_dir>/googlenet-v1.xml -d GPU -api async -i <INSTALL_DIR>/deployment_tools/demo/car.png --progress true -b 1
```
The application outputs the number of executed iterations, total duration of execution, latency, and throughput.
Additionally, if you set the `-pc` parameter, the application outputs performance counters.
If you set `-exec_graph_path`, the application reports the executable graph information, serialized.
Below are fragments of sample output for CPU and FPGA devices:
Below are fragments of sample output for CPU and GPU devices:
* For CPU:
```
[Step 8/9] Measuring performance (Start inference asynchronously, 60000 ms duration, 4 inference requests in parallel using 4 streams)
@ -196,7 +196,7 @@ Below are fragments of sample output for CPU and FPGA devices:
Latency: 51.8244 ms
Throughput: 73.28 FPS
```
* For FPGA:
* For GPU:
```
[Step 10/11] Measuring performance (Start inference asynchronously, 5 inference requests using 1 streams for CPU, limits: 120000 ms duration)
Progress: |................................| 100%

View File

@ -4,7 +4,6 @@
VPU_DEVICE_NAME = 'VPU'
MYRIAD_DEVICE_NAME = 'MYRIAD'
HDDL_DEVICE_NAME = 'HDDL'
FPGA_DEVICE_NAME = 'FPGA'
CPU_DEVICE_NAME = 'CPU'
GPU_DEVICE_NAME = 'GPU'
HETERO_DEVICE_NAME = 'HETERO'
@ -25,7 +24,6 @@ DEVICE_DURATION_IN_SECS = {
VPU_DEVICE_NAME: 60,
MYRIAD_DEVICE_NAME: 60,
HDDL_DEVICE_NAME: 60,
FPGA_DEVICE_NAME: 120,
GNA_DEVICE_NAME: 60,
UNKNOWN_DEVICE_TYPE: 120
}

View File

@ -7,9 +7,6 @@ The tool compiles networks for the following target devices using corresponding
* Intel® Neural Compute Stick 2 (MYRIAD plugin)
> **NOTE**: Intel® Distribution of OpenVINO™ toolkit no longer supports the Intel® Vision Accelerator Design with an Intel® Arria® 10 FPGA and the Intel® Programmable Acceleration Card with Intel® Arria® 10 GX FPGA. To compile a network for those devices, use the Compile Tool from the Intel® Distribution of OpenVINO™ toolkit [2020.3 LTS release](https://docs.openvinotoolkit.org/2020.3/_inference_engine_tools_compile_tool_README.html).
The tool is delivered as an executable file that can be run on both Linux* and Windows*.
The tool is located in the `<INSTALLROOT>/deployment_tools/tools/compile_tool` directory.

View File

@ -83,10 +83,6 @@ static constexpr char tiling_cmx_limit_message[] =
" Value should be equal or greater than -1.\n"
" Overwrites value from config.";
// FPGA-specific
static constexpr char dla_arch_name[] =
"Optional. Specify architecture name used to compile executable network for FPGA device.";
DEFINE_bool(h, false, help_message);
DEFINE_string(m, "", model_message);
DEFINE_string(d, "", targetDeviceMessage);
@ -102,7 +98,6 @@ DEFINE_string(iol, "", iol_message);
DEFINE_string(VPU_NUMBER_OF_SHAVES, "", number_of_shaves_message);
DEFINE_string(VPU_NUMBER_OF_CMX_SLICES, "", number_of_cmx_slices_message);
DEFINE_string(VPU_TILING_CMX_LIMIT_KB, "", tiling_cmx_limit_message);
DEFINE_string(DLA_ARCH_NAME, "", dla_arch_name);
static void showUsage() {
std::cout << "compile_tool [OPTIONS]" << std::endl;
@ -124,9 +119,6 @@ static void showUsage() {
std::cout << " -VPU_NUMBER_OF_SHAVES <value> " << number_of_shaves_message << std::endl;
std::cout << " -VPU_NUMBER_OF_CMX_SLICES <value> " << number_of_cmx_slices_message << std::endl;
std::cout << " -VPU_TILING_CMX_LIMIT_KB <value> " << tiling_cmx_limit_message << std::endl;
std::cout << std::endl;
std::cout << " FPGA-specific options: " << std::endl;
std::cout << " -DLA_ARCH_NAME <value> " << dla_arch_name << std::endl;
std::cout << std::endl;
}
@ -179,7 +171,6 @@ static std::map<std::string, std::string> parseConfigFile(char comment = '#') {
static std::map<std::string, std::string> configure() {
const bool isMYRIAD = FLAGS_d.find("MYRIAD") != std::string::npos;
const bool isFPGA = FLAGS_d.find("FPGA") != std::string::npos;
auto config = parseConfigFile();
@ -197,12 +188,6 @@ static std::map<std::string, std::string> configure() {
}
}
if (isFPGA) {
if (!FLAGS_DLA_ARCH_NAME.empty()) {
config["DLIA_ARCH_NAME"] = FLAGS_DLA_ARCH_NAME;
}
}
return config;
}