Moved inference_engine samples to cpp folder (#8615)
* Moved inference_engine samples to cpp folder
* Fixed documentation links
* Fixed installation
* Fixed scripts
* Fixed cmake script
* Try to fix install
* Fixed samples
* Some fix
parent 03c8542357
commit f639e4e902
@@ -87,10 +87,8 @@ add_subdirectory(openvino)
add_subdirectory(ngraph)
add_subdirectory(inference-engine)
add_subdirectory(runtime)
add_subdirectory(samples)
include(cmake/extra_modules.cmake)
if(ENABLE_SAMPLES)
add_subdirectory(samples)
endif()
add_subdirectory(model-optimizer)
add_subdirectory(docs)
add_subdirectory(tools)
@@ -60,7 +60,7 @@ Low-Precision 8-bit integer models cannot be converted to BF16, even if bfloat16
Bfloat16 simulation mode is available on CPU and Intel® AVX-512 platforms that do not support the native `avx512_bf16` instruction. The simulator does not guarantee adequate performance.
To enable the Bfloat16 simulator:
* In [Benchmark App](../../inference-engine/samples/benchmark_app/README.md), add the `-enforcebf16=true` option
* In [Benchmark App](../../samples/cpp/benchmark_app/README.md), add the `-enforcebf16=true` option
* In the C++ API, set `KEY_ENFORCE_BF16` to `YES`
* In the C API:
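The hunk above mentions setting `KEY_ENFORCE_BF16` through the C++ API, but the corresponding snippet is cut off in this diff. A minimal sketch of what that call can look like with the Inference Engine C++ API is shown below; the model path and device name are placeholders, not part of the original change.

```cpp
#include <inference_engine.hpp>

int main() {
    InferenceEngine::Core core;

    // Hypothetical model path; substitute your own IR files.
    auto network = core.ReadNetwork("model.xml");

    // Ask the CPU plugin to run (or simulate) bfloat16 inference.
    core.SetConfig({{InferenceEngine::PluginConfigParams::KEY_ENFORCE_BF16,
                     InferenceEngine::PluginConfigParams::YES}},
                   "CPU");

    auto executable = core.LoadNetwork(network, "CPU");
    auto request = executable.CreateInferRequest();
    request.Infer();
    return 0;
}
```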
@@ -45,4 +45,4 @@ The following pages describe how to integrate custom _kernels_ into the Inferenc
* [Build an extension library using CMake*](Building.md)
* [Using Inference Engine Samples](../Samples_Overview.md)
* [Hello Shape Infer SSD sample](../../../inference-engine/samples/hello_reshape_ssd/README.md)
* [Hello Shape Infer SSD sample](../../../samples/cpp/hello_reshape_ssd/README.md)
@@ -2,7 +2,7 @@ Introduction to Inference Engine Device Query API {#openvino_docs_IE_DG_Inferenc
===============================
This section provides a high-level description of the process of querying different device properties and configuration values.
Refer to the [Hello Query Device Sample](../../inference-engine/samples/hello_query_device/README.md) sources and the [Multi-Device Plugin guide](supported_plugins/MULTI.md) for an example of using the Inference Engine Query API in user applications.
Refer to the [Hello Query Device Sample](../../samples/cpp/hello_query_device/README.md) sources and the [Multi-Device Plugin guide](supported_plugins/MULTI.md) for an example of using the Inference Engine Query API in user applications.

## Using the Inference Engine Query API in Your Code
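Both versions of the paragraph above point to the Hello Query Device sample. For readers following the diff, a minimal sketch of the Query API usage could look like the following; the metric chosen is just one example of what can be queried.

```cpp
#include <inference_engine.hpp>
#include <iostream>

int main() {
    InferenceEngine::Core core;

    // List every device the Inference Engine can see on this machine.
    for (const std::string& device : core.GetAvailableDevices()) {
        // FULL_DEVICE_NAME is one of the standard device metrics.
        auto fullName = core.GetMetric(device, METRIC_KEY(FULL_DEVICE_NAME)).as<std::string>();
        std::cout << device << ": " << fullName << std::endl;
    }
    return 0;
}
```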
@@ -41,7 +41,7 @@ After that, you should quantize the model by the [Model Quantizer](@ref omz_tool
## Inference

The simplest way to infer the model and collect performance counters is the [C++ Benchmark Application](../../inference-engine/samples/benchmark_app/README.md).
The simplest way to infer the model and collect performance counters is the [C++ Benchmark Application](../../samples/cpp/benchmark_app/README.md).
```sh
./benchmark_app -m resnet-50-tf.xml -d CPU -niter 1 -api sync -report_type average_counters -report_folder pc_report_dir
```
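The renamed link above points at benchmark_app for collecting performance counters. As a hedged sketch, the same counters can also be pulled programmatically through the C++ API; the `KEY_PERF_COUNT` switch usage and the model path below are illustrative assumptions, not part of the original guide.

```cpp
#include <inference_engine.hpp>
#include <iostream>

int main() {
    InferenceEngine::Core core;
    auto network = core.ReadNetwork("resnet-50-tf.xml");  // placeholder IR name

    // Enable per-layer profiling for this network.
    auto executable = core.LoadNetwork(network, "CPU",
        {{InferenceEngine::PluginConfigParams::KEY_PERF_COUNT,
          InferenceEngine::PluginConfigParams::YES}});

    auto request = executable.CreateInferRequest();
    request.Infer();

    // Print the execution time of every layer, similar in spirit to
    // benchmark_app -report_type average_counters.
    for (const auto& counter : request.GetPerformanceCounts()) {
        std::cout << counter.first << ": " << counter.second.realTime_uSec << " us" << std::endl;
    }
    return 0;
}
```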
@@ -2,7 +2,7 @@ Integrate the Inference Engine with Your Application {#openvino_docs_IE_DG_Integ
===============================
This section provides a high-level description of the process of integrating the Inference Engine into your application.
Refer to the [Hello Classification Sample](../../inference-engine/samples/hello_classification/README.md) sources
Refer to the [Hello Classification Sample](../../samples/cpp/hello_classification/README.md) sources
for an example of using the Inference Engine in applications.

## Use the Inference Engine API in Your Code
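Since this hunk only shows the updated sample link, here is a compact sketch of the integration flow that the surrounding section describes (read a network, load it on a device, run one synchronous inference); the file name and device are placeholders.

```cpp
#include <inference_engine.hpp>

int main() {
    InferenceEngine::Core core;

    // 1. Read the model (IR or ONNX); "model.xml" is a placeholder path.
    InferenceEngine::CNNNetwork network = core.ReadNetwork("model.xml");

    // 2. Load the network onto a device.
    InferenceEngine::ExecutableNetwork executable = core.LoadNetwork(network, "CPU");

    // 3. Create an infer request and run it synchronously.
    InferenceEngine::InferRequest request = executable.CreateInferRequest();
    request.Infer();

    // 4. Read the output blob by the name taken from the network description.
    std::string outputName = network.getOutputsInfo().begin()->first;
    InferenceEngine::Blob::Ptr output = request.GetBlob(outputName);
    (void)output;  // post-processing is application specific
    return 0;
}
```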
@@ -73,7 +73,7 @@ methods:
> Inference Engine expects two separate image planes (Y and UV). You must use a specific
> `InferenceEngine::NV12Blob` object instead of the default blob object and set this blob to
> the Inference Engine Infer Request using `InferenceEngine::InferRequest::SetBlob()`.
> Refer to [Hello NV12 Input Classification C++ Sample](../../inference-engine/samples/hello_nv12_input_classification/README.md)
> Refer to [Hello NV12 Input Classification C++ Sample](../../samples/cpp/hello_nv12_input_classification/README.md)
> for more details.

If you skip this step, the default values are set:
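The note above refers readers to the Hello NV12 sample for the `NV12Blob` details. A rough sketch of the idea, assuming the raw Y and UV planes are already in memory and the input was configured for the NV12 color format beforehand, might look like the following; the sizes, pointers, and helper name are illustrative.

```cpp
#include <inference_engine.hpp>
#include <cstdint>
#include <memory>
#include <string>

// Hypothetical helper: wraps pre-loaded NV12 planes and hands them to a request.
// Assumes the input's PreProcessInfo was set to ColorFormat::NV12 when the network was configured.
void set_nv12_input(InferenceEngine::InferRequest& request,
                    const std::string& inputName,
                    uint8_t* y_data, uint8_t* uv_data,
                    size_t height, size_t width) {
    using namespace InferenceEngine;

    // Wrap the two image planes of an NV12 frame as separate U8 blobs.
    TensorDesc y_desc(Precision::U8, {1, 1, height, width}, Layout::NHWC);
    TensorDesc uv_desc(Precision::U8, {1, 2, height / 2, width / 2}, Layout::NHWC);
    Blob::Ptr y_blob = make_shared_blob<uint8_t>(y_desc, y_data);
    Blob::Ptr uv_blob = make_shared_blob<uint8_t>(uv_desc, uv_data);

    // Combine them into a compound NV12 blob and hand it to the request.
    Blob::Ptr nv12_blob = std::make_shared<NV12Blob>(y_blob, uv_blob);
    request.SetBlob(inputName, nv12_blob);
}
```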
@@ -209,6 +209,6 @@ It's allowed to specify additional build options (e.g. to build CMake project on
### Run Your Application

Before running, make sure you completed the **Set the Environment Variables** section in the [OpenVINO Installation](../../inference-engine/samples/hello_nv12_input_classification/README.md) document so that the application can find the libraries.
Before running, make sure you completed the **Set the Environment Variables** section in the [OpenVINO Installation](../../samples/cpp/hello_nv12_input_classification/README.md) document so that the application can find the libraries.

[integration_process]: img/integration_process.png
@@ -29,7 +29,7 @@ Refer to the ENABLE_FP16_FOR_QUANTIZED_MODELS key in the [GPU Plugin documentati
One way to increase computational efficiency is batching, which combines many (potentially tens) of
input images to achieve optimal throughput. However, high batch size also comes with a
latency penalty. So, for more real-time oriented usages, lower batch sizes (as low as a single input) are used.
Refer to the [Benchmark App](../../inference-engine/samples/benchmark_app/README.md) sample, which allows measuring latency vs. throughput.
Refer to the [Benchmark App](../../samples/cpp/benchmark_app/README.md) sample, which allows measuring latency vs. throughput.

## Using Caching API for first inference latency optimization
Since the 2021.4 release, the Inference Engine provides the ability to enable internal caching of loaded networks.
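The caching paragraph above does not show the API itself; under the assumption that the 2021.4 `CACHE_DIR` configuration key is what the text refers to, enabling the cache from C++ is roughly as sketched below (the directory and model names are arbitrary examples).

```cpp
#include <inference_engine.hpp>

int main() {
    InferenceEngine::Core core;

    // Point the Inference Engine at a directory for compiled-network blobs.
    // "model_cache" is an arbitrary example path.
    core.SetConfig({{CONFIG_KEY(CACHE_DIR), "model_cache"}});

    // The first LoadNetwork populates the cache; later runs load much faster.
    auto network = core.ReadNetwork("model.xml");   // placeholder model
    auto executable = core.LoadNetwork(network, "CPU");
    (void)executable;
    return 0;
}
```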
@@ -42,7 +42,7 @@ To gain better performance on accelerators, such as VPU, the Inference Engine us
[Integrating Inference Engine in Your Application (current API)](Integrate_with_customer_application_new_API.md)).
The point is to amortize the costs of data transfers by pipelining; see [Async API explained](@ref omz_demos_object_detection_demo_cpp).
Since the pipelining relies on the availability of parallel slack, running multiple inference requests in parallel is essential.
Refer to the [Benchmark App](../../inference-engine/samples/benchmark_app/README.md) sample, which enables running a number of inference requests in parallel. Specifying different numbers of requests produces different throughput measurements.
Refer to the [Benchmark App](../../samples/cpp/benchmark_app/README.md) sample, which enables running a number of inference requests in parallel. Specifying different numbers of requests produces different throughput measurements.

## Best Latency on the Multi-Socket CPUs
Note that when latency is of concern, there are additional tips for multi-socket systems.
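To make the pipelining argument above concrete, here is a hedged sketch of keeping several infer requests in flight with the asynchronous C++ API; the request count and model path are arbitrary, and a real pipeline would refill inputs and restart requests instead of waiting once.

```cpp
#include <inference_engine.hpp>
#include <vector>

int main() {
    InferenceEngine::Core core;
    auto network = core.ReadNetwork("model.xml");          // placeholder model
    auto executable = core.LoadNetwork(network, "CPU");

    // Keep several requests in flight so data transfers and compute overlap.
    const size_t numRequests = 4;                           // illustrative value
    std::vector<InferenceEngine::InferRequest> requests;
    for (size_t i = 0; i < numRequests; ++i) {
        requests.push_back(executable.CreateInferRequest());
        requests.back().StartAsync();
    }

    // Wait for all of them to complete.
    for (auto& request : requests) {
        request.Wait(InferenceEngine::InferRequest::WaitMode::RESULT_READY);
    }
    return 0;
}
```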
@@ -70,7 +70,7 @@ OpenVINO™ toolkit provides a "throughput" mode that allows running multiple in
Internally, the execution resources are split/pinned into execution "streams".
Using this feature yields much better performance for networks that do not originally scale well with the number of threads (for example, lightweight topologies). This is especially pronounced on many-core server machines.

Run the [Benchmark App](../../inference-engine/samples/benchmark_app/README.md) and play with the number of infer requests running in parallel, as described in the next section.
Run the [Benchmark App](../../samples/cpp/benchmark_app/README.md) and play with the number of infer requests running in parallel, as described in the next section.
Try different values of the `-nstreams` argument from `1` to the number of CPU cores and find the one that provides the best performance.

The throughput mode relaxes the requirement to saturate the CPU by using a large batch: running multiple independent inference requests in parallel often gives much better performance than using a batch only.
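Beyond the `-nstreams` flag of benchmark_app mentioned above, the same throughput streams can be requested from application code. A sketch using the documented `CPU_THROUGHPUT_STREAMS` key follows; the values shown are examples only.

```cpp
#include <inference_engine.hpp>

int main() {
    InferenceEngine::Core core;

    // Let the CPU plugin pick a reasonable number of execution streams,
    // or pass an explicit number such as "4" instead of CPU_THROUGHPUT_AUTO.
    core.SetConfig({{InferenceEngine::PluginConfigParams::KEY_CPU_THROUGHPUT_STREAMS,
                     InferenceEngine::PluginConfigParams::CPU_THROUGHPUT_AUTO}},
                   "CPU");

    auto network = core.ReadNetwork("model.xml");   // placeholder model
    auto executable = core.LoadNetwork(network, "CPU");
    (void)executable;
    return 0;
}
```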
@@ -78,7 +78,7 @@ This allows you to simplify the app-logic, as you don't need to combine multiple
Instead, it is possible to keep a separate infer request per camera or another source of input and process the requests in parallel using Async API.

## Benchmark App
The [Benchmark App](../../inference-engine/samples/benchmark_app/README.md) sample is the best performance reference.
The [Benchmark App](../../samples/cpp/benchmark_app/README.md) sample is the best performance reference.
It has a lot of device-specific knobs, but the primary usage is as simple as:
```bash
$ ./benchmark_app -d GPU -m <model> -i <input>
```
@@ -10,35 +10,35 @@ After installation of Intel® Distribution of OpenVINO™ toolkit, C, C++ and P
Inference Engine sample applications include the following:

- **Speech Sample** - Acoustic model inference based on Kaldi neural networks and speech feature vectors.
  - [Automatic Speech Recognition C++ Sample](../../inference-engine/samples/speech_sample/README.md)
  - [Automatic Speech Recognition C++ Sample](../../samples/cpp/speech_sample/README.md)
  - [Automatic Speech Recognition Python Sample](../../samples/python/speech_sample/README.md)
- **Benchmark Application** – Estimates deep learning inference performance on supported devices for synchronous and asynchronous modes.
  - [Benchmark C++ Tool](../../inference-engine/samples/benchmark_app/README.md)
  - [Benchmark C++ Tool](../../samples/cpp/benchmark_app/README.md)
  - [Benchmark Python Tool](../../tools/benchmark_tool/README.md)
- **Hello Classification Sample** – Inference of image classification networks like AlexNet and GoogLeNet using Synchronous Inference Request API. Input of any size and layout can be set to an infer request, which will be pre-processed automatically during inference (the sample supports only images as inputs and supports Unicode paths).
  - [Hello Classification C++ Sample](../../inference-engine/samples/hello_classification/README.md)
  - [Hello Classification C++ Sample](../../samples/cpp/hello_classification/README.md)
  - [Hello Classification C Sample](../../samples/c/hello_classification/README.md)
  - [Hello Classification Python Sample](../../samples/python/hello_classification/README.md)
- **Hello NV12 Input Classification Sample** – Input of any size and layout can be provided to an infer request. The sample transforms the input to the NV12 color format and pre-processes it automatically during inference. The sample supports only images as inputs.
  - [Hello NV12 Input Classification C++ Sample](../../inference-engine/samples/hello_nv12_input_classification/README.md)
  - [Hello NV12 Input Classification C++ Sample](../../samples/cpp/hello_nv12_input_classification/README.md)
  - [Hello NV12 Input Classification C Sample](../../samples/c/hello_nv12_input_classification/README.md)
- **Hello Query Device Sample** – Query of available Inference Engine devices and their metrics, configuration values.
  - [Hello Query Device C++ Sample](../../inference-engine/samples/hello_query_device/README.md)
  - [Hello Query Device C++ Sample](../../samples/cpp/hello_query_device/README.md)
  - [Hello Query Device Python* Sample](../../samples/python/hello_query_device/README.md)
- **Hello Reshape SSD Sample** – Inference of SSD networks resized by ShapeInfer API according to an input size.
  - [Hello Reshape SSD C++ Sample](../../inference-engine/samples/hello_reshape_ssd/README.md)
  - [Hello Reshape SSD C++ Sample](../../samples/cpp/hello_reshape_ssd/README.md)
  - [Hello Reshape SSD Python Sample](../../samples/python/hello_reshape_ssd/README.md)
- **Image Classification Sample Async** – Inference of image classification networks like AlexNet and GoogLeNet using Asynchronous Inference Request API (the sample supports only images as inputs).
  - [Image Classification Async C++ Sample](../../inference-engine/samples/classification_sample_async/README.md)
  - [Image Classification Async C++ Sample](../../samples/cpp/classification_sample_async/README.md)
  - [Image Classification Async Python* Sample](../../samples/python/classification_sample_async/README.md)
- **Style Transfer Sample** – Style Transfer sample (the sample supports only images as inputs).
  - [Style Transfer C++ Sample](../../inference-engine/samples/style_transfer_sample/README.md)
  - [Style Transfer C++ Sample](../../samples/cpp/style_transfer_sample/README.md)
  - [Style Transfer Python* Sample](../../samples/python/style_transfer_sample/README.md)
- **nGraph Function Creation Sample** – Construction of the LeNet network using the nGraph function creation sample.
  - [nGraph Function Creation C++ Sample](../../inference-engine/samples/ngraph_function_creation_sample/README.md)
  - [nGraph Function Creation C++ Sample](../../samples/cpp/ngraph_function_creation_sample/README.md)
  - [nGraph Function Creation Python Sample](../../samples/python/ngraph_function_creation_sample/README.md)
- **Object Detection for SSD Sample** – Inference of object detection networks based on the SSD; this sample is a simplified version that supports only images as inputs.
  - [Object Detection SSD C++ Sample](../../inference-engine/samples/object_detection_sample_ssd/README.md)
  - [Object Detection SSD C++ Sample](../../samples/cpp/object_detection_sample_ssd/README.md)
  - [Object Detection SSD C Sample](../../samples/c/object_detection_sample_ssd/README.md)
  - [Object Detection SSD Python* Sample](../../samples/python/object_detection_sample_ssd/README.md)
@@ -48,7 +48,7 @@ Auto-device supports query device optimization capabilities in metric;
### Enumerating Available Devices

Inference Engine now features a dedicated API to enumerate devices and their capabilities.
See [Hello Query Device C++ Sample](../../../inference-engine/samples/hello_query_device/README.md).
See [Hello Query Device C++ Sample](../../../samples/cpp/hello_query_device/README.md).
This is the example output from the sample (truncated to the devices' names only):
@@ -99,7 +99,7 @@ CPU plugin removes a Power layer from a topology if it has the following paramet
The plugin supports the configuration parameters listed below.
All parameters must be set with the <code>InferenceEngine::Core::LoadNetwork()</code> method.
When specifying key values as raw strings (that is, when using Python API), omit the `KEY_` prefix.
Refer to the OpenVINO samples for usage examples: [Benchmark App](../../../inference-engine/samples/benchmark_app/README.md).
Refer to the OpenVINO samples for usage examples: [Benchmark App](../../../samples/cpp/benchmark_app/README.md).

These are general options, also supported by other plugins:
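Since the text above says configuration parameters are passed through `InferenceEngine::Core::LoadNetwork()`, here is a short hedged sketch of that pattern; the particular keys and values are examples, not a recommendation from the original document.

```cpp
#include <inference_engine.hpp>
#include <map>
#include <string>

int main() {
    InferenceEngine::Core core;
    auto network = core.ReadNetwork("model.xml");   // placeholder model

    // CPU plugin configuration passed directly to LoadNetwork().
    std::map<std::string, std::string> config = {
        {InferenceEngine::PluginConfigParams::KEY_PERF_COUNT,
         InferenceEngine::PluginConfigParams::NO},
        {InferenceEngine::PluginConfigParams::KEY_CPU_BIND_THREAD,
         InferenceEngine::PluginConfigParams::YES},
    };

    auto executable = core.LoadNetwork(network, "CPU", config);
    (void)executable;
    return 0;
}
```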
@@ -12,7 +12,7 @@ For an in-depth description of clDNN, see [Inference Engine source files](https:
* "GPU" is an alias for "GPU.0"
* If the system doesn't have an integrated GPU, then devices are enumerated starting from 0.

For demonstration purposes, see the [Hello Query Device C++ Sample](../../../inference-engine/samples/hello_query_device/README.md) that can print out the list of available devices with associated indices. Below is an example output (truncated to the device names only):
For demonstration purposes, see the [Hello Query Device C++ Sample](../../../samples/cpp/hello_query_device/README.md) that can print out the list of available devices with associated indices. Below is an example output (truncated to the device names only):

```sh
./hello_query_device
```
@@ -38,7 +38,7 @@ Notice that the priorities of the devices can be changed in real time for the ex
Finally, there is a way to specify the number of requests that the multi-device will internally keep for each device. Suppose your original app was running 4 cameras with 4 inference requests. You would probably want to share these 4 requests between the 2 devices used in the MULTI. The easiest way is to specify a number of requests for each device using parentheses: "MULTI:CPU(2),GPU(2)" and use the same 4 requests in your app. However, such an explicit configuration is not performance-portable and hence not recommended. Instead, the better way is to configure the individual devices and query the resulting number of requests to be used at the application level (see [Configuring the Individual Devices and Creating the Multi-Device On Top](#configuring-the-individual-devices-and-creating-the-multi-device-on-top)).

## Enumerating Available Devices
Inference Engine now features a dedicated API to enumerate devices and their capabilities. See [Hello Query Device C++ Sample](../../../inference-engine/samples/hello_query_device/README.md). This is example output from the sample (truncated to the devices' names only):
Inference Engine now features a dedicated API to enumerate devices and their capabilities. See [Hello Query Device C++ Sample](../../../samples/cpp/hello_query_device/README.md). This is example output from the sample (truncated to the devices' names only):

```sh
./hello_query_device
```
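For readers of the paragraph above, loading a network on the multi-device with an explicit per-device request split can be sketched as below; the `"MULTI:CPU(2),GPU(2)"` string comes straight from the text, while the model name and request loop are illustrative.

```cpp
#include <inference_engine.hpp>
#include <vector>

int main() {
    InferenceEngine::Core core;
    auto network = core.ReadNetwork("model.xml");   // placeholder model

    // Ask MULTI to keep 2 requests for the CPU and 2 for the GPU, as in the text.
    auto executable = core.LoadNetwork(network, "MULTI:CPU(2),GPU(2)");

    // Reuse the same 4 requests the original app had, e.g. one per camera.
    std::vector<InferenceEngine::InferRequest> requests;
    for (int i = 0; i < 4; ++i) {
        requests.push_back(executable.CreateInferRequest());
    }
    return 0;
}
```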
@@ -86,7 +86,7 @@ Notice that until R2 you had to calculate number of requests in your application

## Using the Multi-Device with OpenVINO Samples and Benchmarking the Performance
Notice that every OpenVINO sample that supports the "-d" (which stands for "device") command-line option transparently accepts the multi-device.
The [Benchmark Application](../../../inference-engine/samples/benchmark_app/README.md) is the best reference to the optimal usage of the multi-device. As discussed multiple times earlier, you don't need to set up the number of requests, CPU streams or threads, as the application provides optimal out-of-the-box performance.
The [Benchmark Application](../../../samples/cpp/benchmark_app/README.md) is the best reference to the optimal usage of the multi-device. As discussed multiple times earlier, you don't need to set up the number of requests, CPU streams or threads, as the application provides optimal out-of-the-box performance.
Below is an example command line to evaluate HDDL+GPU performance with it:
@@ -14,7 +14,7 @@ The IR will have two inputs: `input` for data and `ivector` for ivectors.

## Example: Run ASpIRE Chain TDNN Model with the Speech Recognition Sample

These instructions show how to run the converted model with the [Speech Recognition sample](../../../../../inference-engine/samples/speech_sample/README.md).
These instructions show how to run the converted model with the [Speech Recognition sample](../../../../../samples/cpp/speech_sample/README.md).
In this example, the input data contains one utterance from one speaker.

To follow the steps described below, you must first do the following:
@@ -109,4 +109,4 @@ speech_sample -i feats.ark,ivector_online_ie.ark -m final.xml -d CPU -o predicti

Results can be decoded as described in the "Use of Sample in Kaldi* Speech Recognition Pipeline" chapter
in [the Speech Recognition Sample description](../../../../../inference-engine/samples/speech_sample/README.md).
in [the Speech Recognition Sample description](../../../../../samples/cpp/speech_sample/README.md).
@@ -59,7 +59,7 @@ For example, if you downloaded the [pre-trained SSD InceptionV2 topology](http:/

Inference Engine comes with a number of samples to infer Object Detection API models, including:

* [Object Detection for SSD Sample](../../../../../inference-engine/samples/object_detection_sample_ssd/README.md) --- for RFCN, SSD and Faster R-CNNs
* [Object Detection for SSD Sample](../../../../../samples/cpp/object_detection_sample_ssd/README.md) --- for RFCN, SSD and Faster R-CNNs
* [Mask R-CNN Sample for TensorFlow* Object Detection API Models](@ref omz_demos_mask_rcnn_demo_cpp) --- for Mask R-CNNs

There are several important notes about feeding input images to the samples:
@@ -15,7 +15,7 @@ The models used in the performance benchmarks were chosen based on general adopt
CF means Caffe*, while TF means TensorFlow*.

#### 5. How can I run the benchmark results on my own?
All of the performance benchmarks were generated using the open-sourced tool within the Intel® Distribution of OpenVINO™ toolkit called `benchmark_app`, which is available in both [C++](../../inference-engine/samples/benchmark_app/README.md) and [Python](../../tools/benchmark_tool/README.md).
All of the performance benchmarks were generated using the open-sourced tool within the Intel® Distribution of OpenVINO™ toolkit called `benchmark_app`, which is available in both [C++](../../samples/cpp/benchmark_app/README.md) and [Python](../../tools/benchmark_tool/README.md).

#### 6. What image sizes are used for the classification network models?
The image size used in the inference depends on the network being benchmarked. The following table shows the list of input sizes for each network model.
@@ -1,4 +1,4 @@
openvino/inference-engine/samples/hello_reshape_ssd/README.md
openvino/samples/cpp/hello_reshape_ssd/README.md
openvino/docs/index.md
inference-engine/include/ie_icnn_network.hpp
openvino/docs/get_started/get_started_dl_workbench.md
@@ -113,7 +113,7 @@ When the script completes, you see the label and confidence for the top-10 categ

Top 10 results:

Image /home/user/dldt/inference-engine/samples/sample_data/car.png
Image /home/user/openvino/samples/cpp/sample_data/car.png

classid probability label
------- ----------- -----
@@ -366,7 +366,7 @@ When the Sample Application completes, you see the label and confidence for the
```sh
Top 10 results:

Image /home/user/dldt/inference-engine/samples/sample_data/car.png
Image /home/user/openvino/samples/cpp/sample_data/car.png

classid probability label
------- ----------- -----
```
|
@ -53,9 +53,9 @@ When evaluating performance of your model with the Inference Engine, you must me
|
||||
In the asynchronous case (see <a href="#new-request-based-api">Request-Based API and “GetBlob” Idiom</a>), the performance of an individual infer request is usually of less concern. Instead, you typically execute multiple requests asynchronously and measure the throughput in images per second by dividing the number of images that were processed by the processing time.
|
||||
In contrast, for latency-oriented tasks, the time to a single frame is more important.
|
||||
|
||||
Refer to the [Benchmark App](../../inference-engine/samples/benchmark_app/README.md) sample, which allows latency vs. throughput measuring.
|
||||
Refer to the [Benchmark App](../../samples/cpp/benchmark_app/README.md) sample, which allows latency vs. throughput measuring.
|
||||
|
||||
> **NOTE**: The [Benchmark App](../../inference-engine/samples/benchmark_app/README.md) sample also supports batching, that is, automatically packing multiple input images into a single request. However, high batch size results in a latency penalty. So for more real-time oriented usages, batch sizes that are as low as a single input are usually used. Still, devices like CPU, Intel®Movidius™ Myriad™ 2 VPU, Intel® Movidius™ Myriad™ X VPU, or Intel® Vision Accelerator Design with Intel® Movidius™ VPU require a number of parallel requests instead of batching to leverage the performance. Running multiple requests should be coupled with a device configured to the corresponding number of streams. See <a href="#cpu-streams">details on CPU streams</a> for an example.
|
||||
> **NOTE**: The [Benchmark App](../../samples/cpp/benchmark_app/README.md) sample also supports batching, that is, automatically packing multiple input images into a single request. However, high batch size results in a latency penalty. So for more real-time oriented usages, batch sizes that are as low as a single input are usually used. Still, devices like CPU, Intel®Movidius™ Myriad™ 2 VPU, Intel® Movidius™ Myriad™ X VPU, or Intel® Vision Accelerator Design with Intel® Movidius™ VPU require a number of parallel requests instead of batching to leverage the performance. Running multiple requests should be coupled with a device configured to the corresponding number of streams. See <a href="#cpu-streams">details on CPU streams</a> for an example.
|
||||
|
||||
[OpenVINO™ Deep Learning Workbench tool](https://docs.openvinotoolkit.org/latest/workbench_docs_Workbench_DG_Introduction.html) provides throughput versus latency charts for different numbers of streams, requests, and batch sizes to find the performance sweet spot.
|
||||
|
||||
@@ -63,7 +63,7 @@ Refer to the [Benchmark App](../../inference-engine/samples/benchmark_app/README

When comparing the Inference Engine performance with the framework or another reference code, make sure that both versions are as similar as possible:

- Wrap exactly the inference execution (refer to the [Benchmark App](../../inference-engine/samples/benchmark_app/README.md) sample for an example).
- Wrap exactly the inference execution (refer to the [Benchmark App](../../samples/cpp/benchmark_app/README.md) sample for an example).
- Track model loading time separately.
- Ensure the inputs are identical for the Inference Engine and the framework. For example, Caffe\* allows you to auto-populate the input with random values. Notice that it might give different performance than on real images.
- Similarly, for correct performance comparison, make sure the access pattern, for example, input layouts, is optimal for Inference Engine (currently, it is NCHW).
@@ -79,7 +79,7 @@ You need to build your performance conclusions on reproducible data. Do the perf
- If the warm-up run does not help or execution time still varies, you can try running a large number of iterations and then averaging the results.
- For time values that range too much, use the geometric mean (geomean).

Refer to the [Benchmark App](../../inference-engine/samples/benchmark_app/README.md) for code examples of performance measurements. Almost every sample, except interactive demos, has the `-ni` option to specify the number of iterations.
Refer to the [Benchmark App](../../samples/cpp/benchmark_app/README.md) for code examples of performance measurements. Almost every sample, except interactive demos, has the `-ni` option to specify the number of iterations.

## Model Optimizer Knobs Related to Performance <a name="mo-knobs-related-to-performance"></a>
@@ -121,9 +121,9 @@ for the multi-device execution:
(e.g., the number of requests in flight is not enough to saturate all devices).
- It is highly recommended to query the optimal number of inference requests directly from the instance of the ExecutionNetwork
(resulting from the LoadNetwork call with the specific multi-device configuration as a parameter).
Please refer to the code of the [Benchmark App](../../inference-engine/samples/benchmark_app/README.md) sample for details.
Please refer to the code of the [Benchmark App](../../samples/cpp/benchmark_app/README.md) sample for details.
- Notice that, for example, CPU+GPU execution performs better with certain knobs
which you can find in the code of the same [Benchmark App](../../inference-engine/samples/benchmark_app/README.md) sample.
which you can find in the code of the same [Benchmark App](../../samples/cpp/benchmark_app/README.md) sample.
One specific example is disabling GPU driver polling, which in turn requires multiple GPU streams (which is already a default for the GPU) to amortize slower
inference completion from the device to the host.
- Multi-device logic always attempts to save on the (e.g., inputs) data copies between device-agnostic, user-facing inference requests
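The recommendation above to query the optimal number of requests from the loaded network maps to the `OPTIMAL_NUMBER_OF_INFER_REQUESTS` metric of the executable network; a hedged sketch follows, with the device list and model name as placeholders.

```cpp
#include <inference_engine.hpp>
#include <iostream>

int main() {
    InferenceEngine::Core core;
    auto network = core.ReadNetwork("model.xml");   // placeholder model

    // Load on the multi-device and ask it how many requests it wants in flight.
    auto executable = core.LoadNetwork(network, "MULTI:CPU,GPU");
    auto optimal = executable.GetMetric(METRIC_KEY(OPTIMAL_NUMBER_OF_INFER_REQUESTS)).as<unsigned int>();
    std::cout << "Optimal number of infer requests: " << optimal << std::endl;
    return 0;
}
```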
@@ -169,7 +169,7 @@ This feature usually provides much better performance for the networks than batc
Compared with the batching, the parallelism is somewhat transposed (i.e. performed over inputs, and much less within CNN ops):


Try the [Benchmark App](../../inference-engine/samples/benchmark_app/README.md) sample and play with the number of streams running in parallel. The rule of thumb is trying up to the number of CPU cores on your machine.
Try the [Benchmark App](../../samples/cpp/benchmark_app/README.md) sample and play with the number of streams running in parallel. The rule of thumb is trying up to the number of CPU cores on your machine.
For example, on an 8-core CPU, compare the `-nstreams 1` (which is a legacy, latency-oriented scenario) to the 2, 4, and 8 streams.
Notice that on a multi-socket machine, the bare minimum of streams for a latency scenario equals the number of sockets.
@@ -190,13 +190,13 @@ Inference Engine relies on the [Compute Library for Deep Neural Networks (clDNN)
- If your application is simultaneously using the inference on the CPU or otherwise loads the host heavily, make sure that the OpenCL driver threads do not starve. You can use [CPU configuration options](../IE_DG/supported_plugins/CPU.md) to limit the number of inference threads for the CPU plugin.
- In the GPU-only scenario, a GPU driver might occupy a CPU core with spin-looped polling for completion. If the _CPU_ utilization is a concern, consider the `KEY_CLDND_PLUGIN_THROTTLE` configuration option.

> **NOTE**: See the [Benchmark App Sample](../../inference-engine/samples/benchmark_app/README.md) code for a usage example.
> **NOTE**: See the [Benchmark App Sample](../../samples/cpp/benchmark_app/README.md) code for a usage example.
Notice that while disabling the polling, this option might reduce the GPU performance, so usually this option is used with multiple [GPU streams](../IE_DG/supported_plugins/GPU.md).


### Intel® Movidius™ Myriad™ X Visual Processing Unit and Intel® Vision Accelerator Design with Intel® Movidius™ VPUs <a name="myriad"></a>

Since Intel® Movidius™ Myriad™ X Visual Processing Unit (Intel® Movidius™ Myriad™ 2 VPU) communicates with the host over USB, a minimum of four infer requests in flight is recommended to hide the data transfer costs. See <a href="#new-request-based-api">Request-Based API and “GetBlob” Idiom</a> and [Benchmark App Sample](../../inference-engine/samples/benchmark_app/README.md) for more information.
Since Intel® Movidius™ Myriad™ X Visual Processing Unit (Intel® Movidius™ Myriad™ 2 VPU) communicates with the host over USB, a minimum of four infer requests in flight is recommended to hide the data transfer costs. See <a href="#new-request-based-api">Request-Based API and “GetBlob” Idiom</a> and [Benchmark App Sample](../../samples/cpp/benchmark_app/README.md) for more information.

Intel® Vision Accelerator Design with Intel® Movidius™ VPUs requires keeping at least 32 inference requests in flight to fully saturate the device.
@@ -240,7 +240,7 @@ For general details on the heterogeneous plugin, refer to the [corresponding sec

Every Inference Engine sample supports the `-d` (device) option.

For example, here is a command to run an [Object Detection SSD Sample](../../inference-engine/samples/object_detection_sample_ssd/README.md):
For example, here is a command to run an [Object Detection SSD Sample](../../samples/cpp/object_detection_sample_ssd/README.md):

```sh
./object_detection_sample_ssd -m <path_to_model>/ModelSSD.xml -i <path_to_pictures>/picture.jpg -d HETERO:GPU,CPU
```
@@ -284,7 +284,7 @@ You can use the GraphViz\* utility or `.dot` converters (for example, to `.png`



You can also use performance data (in the [Benchmark App](../../inference-engine/samples/benchmark_app/README.md), it is the `-pc` option) to get performance data on each subgraph. Again, refer to the [HETERO plugin documentation](https://docs.openvinotoolkit.org/latest/openvino_docs_IE_DG_supported_plugins_HETERO.html#analyzing_heterogeneous_execution) and to <a href="#performance-counters">Internal Inference Performance Counters</a> for information on general counters.
You can also use performance data (in the [Benchmark App](../../samples/cpp/benchmark_app/README.md), it is the `-pc` option) to get performance data on each subgraph. Again, refer to the [HETERO plugin documentation](https://docs.openvinotoolkit.org/latest/openvino_docs_IE_DG_supported_plugins_HETERO.html#analyzing_heterogeneous_execution) and to <a href="#performance-counters">Internal Inference Performance Counters</a> for information on general counters.

## Optimizing Custom Kernels <a name="optimizing-custom-kernels"></a>
@@ -430,7 +430,7 @@ There are important performance caveats though: for example, the tasks that run

Also, if the inference is performed on the graphics processing unit (GPU), there is little gain in doing the encoding of the resulting video on the same GPU in parallel, for instance, because the device is already busy.

Refer to the [Object Detection SSD Demo](@ref omz_demos_object_detection_demo_cpp) (latency-oriented Async API showcase) and [Benchmark App Sample](../../inference-engine/samples/benchmark_app/README.md) (which has both latency and throughput-oriented modes) for complete examples of the Async API in action.
Refer to the [Object Detection SSD Demo](@ref omz_demos_object_detection_demo_cpp) (latency-oriented Async API showcase) and [Benchmark App Sample](../../samples/cpp/benchmark_app/README.md) (which has both latency and throughput-oriented modes) for complete examples of the Async API in action.

## Using Tools <a name="using-tools"></a>
|
@ -12,54 +12,7 @@ if(ENABLE_PYTHON)
|
||||
add_subdirectory(ie_bridges/python)
|
||||
endif()
|
||||
|
||||
add_subdirectory(samples)
|
||||
|
||||
# TODO: remove this
|
||||
foreach(sample benchmark_app classification_sample_async hello_classification
|
||||
hello_nv12_input_classification hello_query_device hello_reshape_ssd
|
||||
ngraph_function_creation_sample object_detection_sample_ssd
|
||||
speech_sample style_transfer_sample)
|
||||
if(TARGET ${sample})
|
||||
install(TARGETS ${sample}
|
||||
RUNTIME DESTINATION tests COMPONENT tests EXCLUDE_FROM_ALL)
|
||||
endif()
|
||||
endforeach()
|
||||
|
||||
if(TARGET format_reader)
|
||||
install(TARGETS format_reader
|
||||
RUNTIME DESTINATION ${IE_CPACK_RUNTIME_PATH} COMPONENT tests EXCLUDE_FROM_ALL
|
||||
LIBRARY DESTINATION ${IE_CPACK_LIBRARY_PATH} COMPONENT tests EXCLUDE_FROM_ALL)
|
||||
endif()
|
||||
|
||||
openvino_developer_export_targets(COMPONENT openvino_common TARGETS format_reader ie_samples_utils)
|
||||
|
||||
if(ENABLE_TESTS)
|
||||
add_subdirectory(tests_deprecated)
|
||||
add_subdirectory(tests)
|
||||
endif()
|
||||
|
||||
#
|
||||
# Install
|
||||
#
|
||||
|
||||
# install C++ samples
|
||||
|
||||
ie_cpack_add_component(cpp_samples DEPENDS cpp_samples_deps core)
|
||||
|
||||
if(UNIX)
|
||||
install(DIRECTORY samples/
|
||||
DESTINATION samples/cpp
|
||||
COMPONENT cpp_samples
|
||||
USE_SOURCE_PERMISSIONS
|
||||
PATTERN *.bat EXCLUDE
|
||||
PATTERN speech_libs_and_demos EXCLUDE
|
||||
PATTERN .clang-format EXCLUDE)
|
||||
elseif(WIN32)
|
||||
install(DIRECTORY samples/
|
||||
DESTINATION samples/cpp
|
||||
COMPONENT cpp_samples
|
||||
USE_SOURCE_PERMISSIONS
|
||||
PATTERN *.sh EXCLUDE
|
||||
PATTERN speech_libs_and_demos EXCLUDE
|
||||
PATTERN .clang-format EXCLUDE)
|
||||
endif()
|
||||
|
@@ -2,6 +2,27 @@
# SPDX-License-Identifier: Apache-2.0
#

add_subdirectory(cpp)

# TODO: remove this
foreach(sample benchmark_app classification_sample_async hello_classification
hello_nv12_input_classification hello_query_device hello_reshape_ssd
ngraph_function_creation_sample object_detection_sample_ssd
speech_sample style_transfer_sample)
if(TARGET ${sample})
install(TARGETS ${sample}
RUNTIME DESTINATION tests COMPONENT tests EXCLUDE_FROM_ALL)
endif()
endforeach()

if(TARGET format_reader)
install(TARGETS format_reader
RUNTIME DESTINATION ${IE_CPACK_RUNTIME_PATH} COMPONENT tests EXCLUDE_FROM_ALL
LIBRARY DESTINATION ${IE_CPACK_LIBRARY_PATH} COMPONENT tests EXCLUDE_FROM_ALL)
endif()

openvino_developer_export_targets(COMPONENT openvino_common TARGETS format_reader ie_samples_utils)

add_subdirectory(c)

# TODO: remove this
@@ -18,17 +39,40 @@ if(TARGET opencv_c_wrapper)
RUNTIME DESTINATION ${IE_CPACK_RUNTIME_PATH} COMPONENT tests EXCLUDE_FROM_ALL
LIBRARY DESTINATION ${IE_CPACK_LIBRARY_PATH} COMPONENT tests EXCLUDE_FROM_ALL)
endif()

#
# Install
#

# install C++ samples

ie_cpack_add_component(cpp_samples DEPENDS cpp_samples_deps core)

if(UNIX)
install(DIRECTORY cpp/
DESTINATION samples/cpp
COMPONENT cpp_samples
USE_SOURCE_PERMISSIONS
PATTERN *.bat EXCLUDE
PATTERN .clang-format EXCLUDE)
elseif(WIN32)
install(DIRECTORY cpp/
DESTINATION samples/cpp
COMPONENT cpp_samples
USE_SOURCE_PERMISSIONS
PATTERN *.sh EXCLUDE
PATTERN .clang-format EXCLUDE)
endif()

# install C samples

ie_cpack_add_component(c_samples DEPENDS core_c)

if(UNIX)
install(PROGRAMS ${IE_MAIN_SOURCE_DIR}/samples/build_samples.sh
install(PROGRAMS cpp/build_samples.sh
DESTINATION samples/c
COMPONENT c_samples)
elseif(WIN32)
install(PROGRAMS ${IE_MAIN_SOURCE_DIR}/samples/build_samples_msvc.bat
install(PROGRAMS cpp/build_samples_msvc.bat
DESTINATION samples/c
COMPONENT c_samples)
endif()
@@ -39,7 +83,7 @@ install(DIRECTORY c
PATTERN c/CMakeLists.txt EXCLUDE
PATTERN c/.clang-format EXCLUDE)

install(FILES ${IE_MAIN_SOURCE_DIR}/samples/CMakeLists.txt
install(FILES cpp/CMakeLists.txt
DESTINATION samples/c
COMPONENT c_samples)
@@ -2,4 +2,4 @@
# SPDX-License-Identifier: Apache-2.0
#

include("${InferenceEngine_SOURCE_DIR}/samples/CMakeLists.txt")
include("${OpenVINO_SOURCE_DIR}/samples/cpp/CMakeLists.txt")
@@ -18,7 +18,7 @@ Hello Classification C sample application demonstrates how to use the following
| Model Format | Inference Engine Intermediate Representation (\*.xml + \*.bin), ONNX (\*.onnx)
| Validated images | The sample uses OpenCV\* to [read input image](https://docs.opencv.org/master/d4/da8/group__imgcodecs.html#ga288b8b3da0892bd651fce07b3bbd3a56) (\*.bmp, \*.png)
| Supported devices | [All](../../../docs/IE_DG/supported_plugins/Supported_Devices.md) |
| Other language realization | [C++](../../../inference-engine/samples/hello_classification/README.md), [Python](../../python/hello_classification/README.md) |
| Other language realization | [C++](../../../samples/cpp/hello_classification/README.md), [Python](../../python/hello_classification/README.md) |

## How It Works
@@ -16,7 +16,7 @@ Basic Inference Engine API is covered by [Hello Classification C sample](../hell
| Model Format | Inference Engine Intermediate Representation (\*.xml + \*.bin), ONNX (\*.onnx)
| Validated images | An uncompressed image in the NV12 color format - \*.yuv
| Supported devices | [All](../../../docs/IE_DG/supported_plugins/Supported_Devices.md) |
| Other language realization | [C++](../../../inference-engine/samples/hello_nv12_input_classification/README.md) |
| Other language realization | [C++](../../../samples/cpp/hello_nv12_input_classification/README.md) |

## How It Works
@@ -24,7 +24,7 @@ Basic Inference Engine API is covered by [Hello Classification C sample](../hell
| Model Format | Inference Engine Intermediate Representation (.xml + .bin), ONNX (.onnx)
| Validated images | The sample uses OpenCV* to [read input image](https://docs.opencv.org/master/d4/da8/group__imgcodecs.html#ga288b8b3da0892bd651fce07b3bbd3a56) (.bmp, .png, .jpg)
| Supported devices | [All](../../../docs/IE_DG/supported_plugins/Supported_Devices.md) |
| Other language realization | [C++](../../../inference-engine/samples/object_detection_sample_ssd/README.md), [Python](../../python/object_detection_sample_ssd/README.md) |
| Other language realization | [C++](../../../samples/cpp/object_detection_sample_ssd/README.md), [Python](../../python/object_detection_sample_ssd/README.md) |

## How It Works
Some files were not shown because too many files have changed in this diff.