Adds support of user layouts to benchmark_app (#4002)
* Adds support of user layouts to benchmark_app
* Keep snake_case for python

Co-authored-by: Ilya Lavrenov <ilya.lavrenov@intel.com>
parent 2a2ef7d989
commit 4e3d7d23fc
@@ -5,7 +5,7 @@ This topic demonstrates how to use the Benchmark C++ Tool to estimate deep learn
 > **NOTE:** This topic describes usage of C++ implementation of the Benchmark Tool. For the Python* implementation, refer to [Benchmark Python* Tool](../../tools/benchmark_tool/README.md).
 
 > **TIP**: You also can work with the Benchmark Tool inside the OpenVINO™ [Deep Learning Workbench](@ref workbench_docs_Workbench_DG_Introduction) (DL Workbench).
 > [DL Workbench](@ref workbench_docs_Workbench_DG_Introduction) is a platform built upon OpenVINO™ and provides a web-based graphical environment that enables you to optimize, fine-tune, analyze, visualize, and compare
 > performance of deep learning models on various Intel® architecture
 > configurations. In the DL Workbench, you can use most of OpenVINO™ toolkit components.
 > <br>
@@ -75,11 +75,11 @@ benchmark_app [OPTION]
 Options:
 
 -h, --help Print a usage message
 -m "<path>" Required. Path to an .xml/.onnx/.prototxt file with a trained model or to a .blob files with a trained compiled model.
 -i "<path>" Optional. Path to a folder with images and/or binaries or to specific image or binary file.
 -d "<device>" Optional. Specify a target device to infer on (the list of available devices is shown below). Default value is CPU.
 Use "-d HETERO:<comma-separated_devices_list>" format to specify HETERO plugin.
 Use "-d MULTI:<comma-separated_devices_list>" format to specify MULTI plugin.
 The application looks for a suitable plugin for the specified device.
 -l "<absolute_path>" Required for CPU custom layers. Absolute path to a shared library with the kernels implementations.
 Or
@@ -92,14 +92,15 @@ Options:
 -t Optional. Time, in seconds, to execute topology.
 -progress Optional. Show progress bar (can affect performance measurement). Default values is "false".
 -shape Optional. Set shape for input. For example, "input1[1,3,224,224],input2[1,4]" or "[1,3,224,224]" in case of one input size.
+-layout Optional. Prompts how network layouts should be treated by application. For example, "input1[NCHW],input2[NC]" or "[NCHW]" in case of one input size.
 
 CPU-specific performance options:
 -nstreams "<integer>" Optional. Number of streams to use for inference on the CPU, GPU or MYRIAD devices
 (for HETERO and MULTI device cases use format <device1>:<nstreams1>,<device2>:<nstreams2> or just <nstreams>).
 Default value is determined automatically for a device.
 Please note that although the automatic selection usually provides a reasonable performance,
 it still may be non-optimal for some cases, especially for very small networks.
 Also, using nstreams>1 is inherently throughput-oriented option, while for the best-latency
 estimations the number of streams should be set to 1.
 -nthreads "<integer>" Optional. Number of threads to use for inference on the CPU (including HETERO and MULTI cases).
 -enforcebf16 Optional. Enforcing of floating point operations execution in bfloat16 precision on platforms with native bfloat16 support. By default, this key sets "true" on platforms with native bfloat16 support and "false" for other platforms. Use "-enforcebf16=false" to disable this feature.
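(The new `-layout` flag accepts the same `name[value]` syntax as `-shape`, and the two can be combined. An illustrative invocation, reusing the placeholder paths from the examples below, would be `./benchmark_app -m <ir_dir>/googlenet-v1.xml -i <INSTALL_DIR>/deployment_tools/demo/car.png -shape "[2,3,224,224]" -layout "[NCHW]"`: the shape request triggers a reshape, while the layout tells the tool which positions of that shape hold batch, channels, height and width when it fills the input blobs.)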
@@ -125,12 +126,12 @@ If a model has mixed input types, input folder should contain all required files
 To run the tool, you can use [public](@ref omz_models_public_index) or [Intel's](@ref omz_models_intel_index) pre-trained models from the Open Model Zoo. The models can be downloaded using the [Model Downloader](@ref omz_tools_downloader_README).
 
 > **NOTE**: Before running the tool with a trained model, make sure the model is converted to the Inference Engine format (\*.xml + \*.bin) using the [Model Optimizer tool](../../../docs/MO_DG/Deep_Learning_Model_Optimizer_DevGuide.md).
 >
 > The sample accepts models in ONNX format (.onnx) that do not require preprocessing.
 
 ## Examples of Running the Tool
 
 This section provides step-by-step instructions on how to run the Benchmark Tool with the `googlenet-v1` public model on CPU or FPGA devices. As an input, the `car.png` file from the `<INSTALL_DIR>/deployment_tools/demo/` directory is used.
 
 > **NOTE:** The Internet access is required to execute the following steps successfully. If you have access to the Internet through the proxy server only, please make sure that it is configured in your OS environment.
 
@@ -147,9 +148,9 @@ This section provides step-by-step instructions on how to run the Benchmark Tool
 ```
 ```sh
 python3 mo.py --input_model <models_dir>/public/googlenet-v1/googlenet-v1.caffemodel --data_type FP32 --output_dir <ir_dir>
 ```
 3. Run the tool with specifying the `<INSTALL_DIR>/deployment_tools/demo/car.png` file as an input image, the IR of the `googlenet-v1` model and a device to perform inference on. The following commands demonstrate running the Benchmark Tool in the asynchronous mode on CPU and FPGA devices:
 
 * On CPU:
 ```sh
 ./benchmark_app -m <ir_dir>/googlenet-v1.xml -i <INSTALL_DIR>/deployment_tools/demo/car.png -d CPU -api async --progress true
@@ -162,7 +163,7 @@ This section provides step-by-step instructions on how to run the Benchmark Tool
 The application outputs the number of executed iterations, total duration of execution, latency, and throughput.
 Additionally, if you set the `-report_type` parameter, the application outputs statistics report. If you set the `-pc` parameter, the application outputs performance counters. If you set `-exec_graph_path`, the application reports executable graph information serialized. All measurements including per-layer PM counters are reported in milliseconds.
 
 Below are fragments of sample output for CPU and FPGA devices:
 
 * For CPU:
 ```
@@ -1,4 +1,4 @@
-// Copyright (C) 2018-2020 Intel Corporation
+// Copyright (C) 2018-2021 Intel Corporation
 // SPDX-License-Identifier: Apache-2.0
 //
 
@@ -102,6 +102,9 @@ static const char dump_config_message[] = "Optional. Path to XML/YAML/JSON file
 static const char shape_message[] = "Optional. Set shape for input. For example, \"input1[1,3,224,224],input2[1,4]\" or \"[1,3,224,224]\""
 " in case of one input size.";
 
+static const char layout_message[] = "Optional. Prompts how network layouts should be treated by application. "
+"For example, \"input1[NCHW],input2[NC]\" or \"[NCHW]\" in case of one input size.";
+
 // @brief message for quantization bits
 static const char gna_qb_message[] = "Optional. Weight bits for quantization: 8 or 16 (default)";
 
@@ -189,6 +192,9 @@ DEFINE_string(dump_config, "", dump_config_message);
 /// @brief Define flag for input shape <br>
 DEFINE_string(shape, "", shape_message);
 
+/// @brief Define flag for layout shape <br>
+DEFINE_string(layout, "", layout_message);
+
 /// @brief Define flag for quantization bits (default 16)
 DEFINE_int32(qb, 16, gna_qb_message);
 
@@ -215,6 +221,7 @@ static void showUsage() {
     std::cout << " -t " << execution_time_message << std::endl;
     std::cout << " -progress " << progress_message << std::endl;
     std::cout << " -shape " << shape_message << std::endl;
+    std::cout << " -layout " << layout_message << std::endl;
     std::cout << std::endl << " device-specific performance options:" << std::endl;
     std::cout << " -nstreams \"<integer>\" " << infer_num_streams_message << std::endl;
    std::cout << " -nthreads \"<integer>\" " << infer_num_threads_message << std::endl;
@@ -1,4 +1,4 @@
-// Copyright (C) 2018-2020 Intel Corporation
+// Copyright (C) 2018-2021 Intel Corporation
 // SPDX-License-Identifier: Apache-2.0
 //
 
@@ -48,7 +48,7 @@ std::vector<std::string> filterFilesByExtensions(const std::vector<std::string>&
 void fillBlobImage(Blob::Ptr& inputBlob,
                    const std::vector<std::string>& filePaths,
                    const size_t& batchSize,
-                   const InputInfo& info,
+                   const benchmark_app::InputInfo& app_info,
                    const size_t& requestId,
                    const size_t& inputId,
                    const size_t& inputSize) {
@@ -60,7 +60,6 @@ void fillBlobImage(Blob::Ptr& inputBlob,
     // locked memory holder should be alive all time while access to its buffer happens
     auto minputHolder = minput->wmap();
     auto inputBlobData = minputHolder.as<uint8_t *>();
-    const TensorDesc& inputBlobDesc = inputBlob->getTensorDesc();
 
     /** Collect images data ptrs **/
     std::vector<std::shared_ptr<uint8_t>> vreader;
@@ -77,24 +76,30 @@ void fillBlobImage(Blob::Ptr& inputBlob,
         }
 
         /** Getting image data **/
-        TensorDesc desc = info.getTensorDesc();
-        std::shared_ptr<uint8_t> imageData(reader->getData(getTensorWidth(desc), getTensorHeight(desc)));
+        std::shared_ptr<uint8_t> imageData(reader->getData(app_info.width(), app_info.height()));
         if (imageData) {
             vreader.push_back(imageData);
         }
     }
 
     /** Fill input tensor with images. First b channel, then g and r channels **/
-    const size_t numChannels = getTensorChannels(inputBlobDesc);
-    const size_t imageSize = getTensorWidth(inputBlobDesc) * getTensorHeight(inputBlobDesc);
+    const size_t numChannels = app_info.channels();
+    const size_t width = app_info.width();
+    const size_t height = app_info.height();
     /** Iterate over all input images **/
     for (size_t imageId = 0; imageId < vreader.size(); ++imageId) {
-        /** Iterate over all pixel in image (b,g,r) **/
-        for (size_t pid = 0; pid < imageSize; pid++) {
-            /** Iterate over all channels **/
-            for (size_t ch = 0; ch < numChannels; ++ch) {
-                /** [images stride + channels stride + pixel id ] all in bytes **/
-                inputBlobData[imageId * imageSize * numChannels + ch * imageSize + pid] = vreader.at(imageId).get()[pid*numChannels + ch];
+        /** Iterate over all width **/
+        for (size_t w = 0; w < app_info.width(); ++w) {
+            /** Iterate over all height **/
+            for (size_t h = 0; h < app_info.height(); ++h) {
+                /** Iterate over all channels **/
+                for (size_t ch = 0; ch < numChannels; ++ch) {
+                    /** [images stride + channels stride + pixel id ] all in bytes **/
+                    size_t offset = imageId * numChannels * width * height +
+                        (((app_info.layout == "NCHW") || (app_info.layout == "CHW")) ?
+                        (ch * width * height + h * width + w) : (h * width * numChannels + w * numChannels + ch));
+                    inputBlobData[offset] = vreader.at(imageId).get()[h * width * numChannels + w * numChannels + ch];
+                }
             }
         }
     }
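The core of this change is the destination offset above: once a user layout can be supplied, the blob can no longer be assumed to be NCHW, so the write position depends on whether channels are stored as planes or interleaved per pixel. A standalone sketch of the two indexing schemes (not part of the patch; the sizes are invented):

```cpp
#include <cstddef>
#include <iostream>
#include <string>

// Flat offset of element (n, c, h, w) for a planar (NCHW/CHW) vs interleaved (NHWC/HWC) buffer.
// Mirrors the ternary used in fillBlobImage; the dimensions below are illustrative only.
static size_t offsetFor(const std::string& layout, size_t n, size_t c, size_t h, size_t w,
                        size_t C, size_t H, size_t W) {
    const size_t imageStride = C * H * W;                  // one full image
    return n * imageStride +
           ((layout == "NCHW" || layout == "CHW")
                ? (c * H * W + h * W + w)                  // channel planes first
                : (h * W * C + w * C + c));                // pixels first, channels interleaved
}

int main() {
    const size_t C = 3, H = 224, W = 224;                  // assumed dimensions
    // The same logical pixel lands at different flat positions depending on the layout.
    std::cout << "NCHW: " << offsetFor("NCHW", 0, 2, 10, 5, C, H, W) << "\n";  // 102597
    std::cout << "NHWC: " << offsetFor("NHWC", 0, 2, 10, 5, C, H, W) << "\n";  // 6737
    return 0;
}
```

The source buffer returned by the image reader is always packed HWC, which is why the right-hand side of the assignment in `fillBlobImage` indexes it as `h * width * numChannels + w * numChannels + ch` regardless of the requested layout.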
@@ -185,24 +190,23 @@ void fillBlobImInfo(Blob::Ptr& inputBlob,
 
 void fillBlobs(const std::vector<std::string>& inputFiles,
                const size_t& batchSize,
-               const InferenceEngine::ConstInputsDataMap& info,
+               benchmark_app::InputsInfo& app_inputs_info,
                std::vector<InferReqWrap::Ptr> requests) {
     std::vector<std::pair<size_t, size_t>> input_image_sizes;
-    for (const ConstInputsDataMap::value_type& item : info) {
-        if (isImage(item.second)) {
-            input_image_sizes.push_back(std::make_pair(getTensorWidth(item.second->getTensorDesc()),
-                                                       getTensorHeight(item.second->getTensorDesc())));
+    for (auto& item : app_inputs_info) {
+        if (item.second.isImage()) {
+            input_image_sizes.push_back(std::make_pair(item.second.width(), item.second.height()));
         }
-        slog::info << "Network input '" << item.first << "' precision " << item.second->getTensorDesc().getPrecision()
-                   << ", dimensions (" << item.second->getTensorDesc().getLayout() << "): ";
-        for (const auto& i : item.second->getTensorDesc().getDims()) {
+        slog::info << "Network input '" << item.first << "' precision " << item.second.precision
+                   << ", dimensions (" << item.second.layout << "): ";
+        for (const auto& i : item.second.shape) {
            slog::info << i << " ";
        }
        slog::info << slog::endl;
    }
 
    size_t imageInputCount = input_image_sizes.size();
-   size_t binaryInputCount = info.size() - imageInputCount;
+   size_t binaryInputCount = app_inputs_info.size() - imageInputCount;
 
    std::vector<std::string> binaryFiles;
    std::vector<std::string> imageFiles;
@@ -258,26 +262,28 @@ void fillBlobs(const std::vector<std::string>& inputFiles,
 
        size_t imageInputId = 0;
        size_t binaryInputId = 0;
-       for (const ConstInputsDataMap::value_type& item : info) {
+       for (auto& item : app_inputs_info) {
            Blob::Ptr inputBlob = requests.at(requestId)->getBlob(item.first);
-           if (isImage(inputBlob)) {
+           auto app_info = app_inputs_info.at(item.first);
+           auto precision = app_info.precision;
+           if (app_info.isImage()) {
                if (!imageFiles.empty()) {
                    // Fill with Images
-                   fillBlobImage(inputBlob, imageFiles, batchSize, *item.second, requestId, imageInputId++, imageInputCount);
+                   fillBlobImage(inputBlob, imageFiles, batchSize, app_info, requestId, imageInputId++, imageInputCount);
                    continue;
                }
            } else {
                if (!binaryFiles.empty()) {
                    // Fill with binary files
-                   if (item.second->getPrecision() == InferenceEngine::Precision::FP32) {
+                   if (precision == InferenceEngine::Precision::FP32) {
                        fillBlobBinary<float>(inputBlob, binaryFiles, batchSize, requestId, binaryInputId++, binaryInputCount);
-                   } else if (item.second->getPrecision() == InferenceEngine::Precision::FP16) {
+                   } else if (precision == InferenceEngine::Precision::FP16) {
                        fillBlobBinary<short>(inputBlob, binaryFiles, batchSize, requestId, binaryInputId++, binaryInputCount);
-                   } else if (item.second->getPrecision() == InferenceEngine::Precision::I32) {
+                   } else if (precision == InferenceEngine::Precision::I32) {
                        fillBlobBinary<int32_t>(inputBlob, binaryFiles, batchSize, requestId, binaryInputId++, binaryInputCount);
-                   } else if (item.second->getPrecision() == InferenceEngine::Precision::I64) {
+                   } else if (precision == InferenceEngine::Precision::I64) {
                        fillBlobBinary<int64_t>(inputBlob, binaryFiles, batchSize, requestId, binaryInputId++, binaryInputCount);
-                   } else if (item.second->getPrecision() == InferenceEngine::Precision::U8) {
+                   } else if (precision == InferenceEngine::Precision::U8) {
                        fillBlobBinary<uint8_t>(inputBlob, binaryFiles, batchSize, requestId, binaryInputId++, binaryInputCount);
                    } else {
                        THROW_IE_EXCEPTION << "Input precision is not supported for " << item.first;
@@ -285,18 +291,18 @@ void fillBlobs(const std::vector<std::string>& inputFiles,
                    continue;
                }
            }
 
-           if (isImageInfo(inputBlob) && (input_image_sizes.size() == 1)) {
+           if (app_info.isImageInfo() && (input_image_sizes.size() == 1)) {
                // Most likely it is image info: fill with image information
                auto image_size = input_image_sizes.at(0);
                slog::info << "Fill input '" << item.first << "' with image size " << image_size.first << "x"
                           << image_size.second << slog::endl;
-               if (item.second->getPrecision() == InferenceEngine::Precision::FP32) {
+               if (precision == InferenceEngine::Precision::FP32) {
                    fillBlobImInfo<float>(inputBlob, batchSize, image_size);
-               } else if (item.second->getPrecision() == InferenceEngine::Precision::FP16) {
+               } else if (precision == InferenceEngine::Precision::FP16) {
                    fillBlobImInfo<short>(inputBlob, batchSize, image_size);
-               } else if (item.second->getPrecision() == InferenceEngine::Precision::I32) {
+               } else if (precision == InferenceEngine::Precision::I32) {
                    fillBlobImInfo<int32_t>(inputBlob, batchSize, image_size);
-               } else if (item.second->getPrecision() == InferenceEngine::Precision::I64) {
+               } else if (precision == InferenceEngine::Precision::I64) {
                    fillBlobImInfo<int64_t>(inputBlob, batchSize, image_size);
                } else {
                    THROW_IE_EXCEPTION << "Input precision is not supported for image info!";
@@ -306,23 +312,23 @@ void fillBlobs(const std::vector<std::string>& inputFiles,
            }
            // Fill random
            slog::info << "Fill input '" << item.first << "' with random values ("
-                      << std::string((isImage(inputBlob) ? "image" : "some binary data"))
+                      << std::string((app_info.isImage() ? "image" : "some binary data"))
                       << " is expected)" << slog::endl;
-           if (item.second->getPrecision() == InferenceEngine::Precision::FP32) {
+           if (precision == InferenceEngine::Precision::FP32) {
                fillBlobRandom<float>(inputBlob);
-           } else if (item.second->getPrecision() == InferenceEngine::Precision::FP16) {
+           } else if (precision == InferenceEngine::Precision::FP16) {
                fillBlobRandom<short>(inputBlob);
-           } else if (item.second->getPrecision() == InferenceEngine::Precision::I32) {
+           } else if (precision == InferenceEngine::Precision::I32) {
                fillBlobRandom<int32_t>(inputBlob);
-           } else if (item.second->getPrecision() == InferenceEngine::Precision::I64) {
+           } else if (precision == InferenceEngine::Precision::I64) {
                fillBlobRandom<int64_t>(inputBlob);
-           } else if (item.second->getPrecision() == InferenceEngine::Precision::U8) {
+           } else if (precision == InferenceEngine::Precision::U8) {
                fillBlobRandom<uint8_t>(inputBlob);
-           } else if (item.second->getPrecision() == InferenceEngine::Precision::I8) {
+           } else if (precision == InferenceEngine::Precision::I8) {
                fillBlobRandom<int8_t>(inputBlob);
-           } else if (item.second->getPrecision() == InferenceEngine::Precision::U16) {
+           } else if (precision == InferenceEngine::Precision::U16) {
                fillBlobRandom<uint16_t>(inputBlob);
-           } else if (item.second->getPrecision() == InferenceEngine::Precision::I16) {
+           } else if (precision == InferenceEngine::Precision::I16) {
                fillBlobRandom<int16_t>(inputBlob);
            } else {
                THROW_IE_EXCEPTION << "Input precision is not supported for " << item.first;
@@ -1,4 +1,4 @@
-// Copyright (C) 2018-2020 Intel Corporation
+// Copyright (C) 2018-2021 Intel Corporation
 // SPDX-License-Identifier: Apache-2.0
 //
 
@@ -9,29 +9,10 @@
 
 #include <inference_engine.hpp>
 
+#include "utils.hpp"
 #include "infer_request_wrap.hpp"
 
-template<typename T>
-static bool isImage(const T &blob) {
-    auto descriptor = blob->getTensorDesc();
-    if (descriptor.getLayout() != InferenceEngine::NCHW) {
-        return false;
-    }
-    auto channels = descriptor.getDims()[1];
-    return channels == 3;
-}
-
-template<typename T>
-static bool isImageInfo(const T &blob) {
-    auto descriptor = blob->getTensorDesc();
-    if (descriptor.getLayout() != InferenceEngine::NC) {
-        return false;
-    }
-    auto channels = descriptor.getDims()[1];
-    return (channels >= 2);
-}
-
 void fillBlobs(const std::vector<std::string>& inputFiles,
                const size_t& batchSize,
-               const InferenceEngine::ConstInputsDataMap& info,
+               benchmark_app::InputsInfo& app_inputs_info,
                std::vector<InferReqWrap::Ptr> requests);
@@ -1,4 +1,4 @@
-// Copyright (C) 2018-2020 Intel Corporation
+// Copyright (C) 2018-2021 Intel Corporation
 // SPDX-License-Identifier: Apache-2.0
 //
 
@@ -320,6 +320,8 @@ int main(int argc, char *argv[]) {
         size_t batchSize = FLAGS_b;
         Precision precision = Precision::UNSPECIFIED;
         std::string topology_name = "";
+        benchmark_app::InputsInfo app_inputs_info;
+        std::string output_name;
         if (!isNetworkCompiled) {
             // ----------------- 4. Reading the Intermediate Representation network ----------------------------------------
             next_step();
@@ -345,15 +347,12 @@ int main(int argc, char *argv[]) {
             next_step();
             batchSize = cnnNetwork.getBatchSize();
             // Parse input shapes if specified
-            InferenceEngine::ICNNNetwork::InputShapes shapes = cnnNetwork.getInputShapes();
             bool reshape = false;
-            if (!FLAGS_shape.empty()) {
-                reshape |= updateShapes(shapes, FLAGS_shape, inputInfo);
-            }
-            if ((FLAGS_b != 0) && (batchSize != FLAGS_b)) {
-                reshape |= adjustShapesBatch(shapes, FLAGS_b, inputInfo);
-            }
+            app_inputs_info = getInputsInfo<InputInfo::Ptr>(FLAGS_shape, FLAGS_layout, FLAGS_b, inputInfo, reshape);
             if (reshape) {
+                InferenceEngine::ICNNNetwork::InputShapes shapes = {};
+                for (auto& item : app_inputs_info)
+                    shapes[item.first] = item.second.shape;
                 slog::info << "Reshaping network: " << getShapesString(shapes) << slog::endl;
                 startTime = Time::now();
                 cnnNetwork.reshape(shapes);
@@ -365,7 +364,9 @@ int main(int argc, char *argv[]) {
                     {"reshape network time (ms)", duration_ms}
                 });
             }
-            batchSize = cnnNetwork.getBatchSize();
+            // use batch size according to provided layout and shapes
+            batchSize = (!FLAGS_layout.empty()) ? getBatchSize(app_inputs_info) : cnnNetwork.getBatchSize();
+
             topology_name = cnnNetwork.getName();
             slog::info << (FLAGS_b != 0 ? "Network batch size was changed to: " : "Network batch size: ") << batchSize << slog::endl;
 
@@ -373,9 +374,10 @@ int main(int argc, char *argv[]) {
             next_step();
 
             for (auto& item : inputInfo) {
-                if (isImage(item.second)) {
+                if (app_inputs_info.at(item.first).isImage()) {
                     /** Set the precision of input data provided by the user, should be called before load of the network to the device **/
-                    item.second->setPrecision(Precision::U8);
+                    app_inputs_info.at(item.first).precision = Precision::U8;
+                    item.second->setPrecision(app_inputs_info.at(item.first).precision);
                 }
             }
             // ----------------- 7. Loading the model to the device --------------------------------------------------------
@@ -407,6 +409,7 @@ int main(int argc, char *argv[]) {
                 {
                     {"import network time (ms)", duration_ms}
                 });
+            app_inputs_info = getInputsInfo<InputInfo::CPtr>(FLAGS_shape, FLAGS_layout, FLAGS_b, exeNetwork.GetInputsInfo());
             if (batchSize == 0) {
                 batchSize = 1;
             }
@@ -485,8 +488,7 @@ int main(int argc, char *argv[]) {
         next_step();
 
         InferRequestsQueue inferRequestsQueue(exeNetwork, nireq);
-        const InferenceEngine::ConstInputsDataMap info(exeNetwork.GetInputsInfo());
-        fillBlobs(inputFiles, batchSize, info, inferRequestsQueue.requests);
+        fillBlobs(inputFiles, batchSize, app_inputs_info, inferRequestsQueue.requests);
 
         // ----------------- 10. Measuring performance ------------------------------------------------------------------
         size_t progressCnt = 0;
@@ -1,4 +1,4 @@
-// Copyright (C) 2018-2020 Intel Corporation
+// Copyright (C) 2018-2021 Intel Corporation
 // SPDX-License-Identifier: Apache-2.0
 //
 
@@ -8,6 +8,7 @@
 #include <vector>
 #include <map>
 #include <regex>
+#include <iostream>
 
 #include <samples/common.hpp>
 #include <samples/slog.hpp>
@@ -18,6 +19,41 @@
 #include <opencv2/core.hpp>
 #endif
 
+namespace benchmark_app {
+bool InputInfo::isImage() const {
+    if ((layout != "NCHW") && (layout != "NHWC") &&
+        (layout != "CHW") && (layout != "HWC"))
+        return false;
+    return (channels() == 3);
+}
+bool InputInfo::isImageInfo() const {
+    if (layout != "NC")
+        return false;
+    return (channels() >= 2);
+}
+size_t InputInfo::getDimentionByLayout(char character) const {
+    size_t pos = layout.find(character);
+    if (pos == std::string::npos)
+        throw std::runtime_error("Error: Can't get " + std::string(character, 1) + " from layout " + layout);
+    return shape.at(pos);
+}
+size_t InputInfo::width() const {
+    return getDimentionByLayout('W');
+}
+size_t InputInfo::height() const {
+    return getDimentionByLayout('H');
+}
+size_t InputInfo::channels() const {
+    return getDimentionByLayout('C');
+}
+size_t InputInfo::batch() const {
+    return getDimentionByLayout('N');
+}
+size_t InputInfo::depth() const {
+    return getDimentionByLayout('D');
+}
+} // namespace benchmark_app
+
 uint32_t deviceDefaultDeviceDurationInSeconds(const std::string& device) {
     static const std::map<std::string, uint32_t> deviceDefaultDurationInSeconds {
         { "CPU", 60 },
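The `InputInfo` accessors above resolve a dimension by finding its letter in the layout string and using that position as an index into the shape. A small standalone illustration (not part of the patch; the shape values are invented):

```cpp
#include <iostream>
#include <stdexcept>
#include <string>
#include <vector>

// Same idea as benchmark_app::InputInfo::getDimentionByLayout:
// the position of the letter in the layout string is the index into the shape.
static size_t dimByLayout(const std::string& layout, const std::vector<size_t>& shape, char dim) {
    size_t pos = layout.find(dim);
    if (pos == std::string::npos)
        throw std::runtime_error(std::string("no '") + dim + "' in layout " + layout);
    return shape.at(pos);
}

int main() {
    std::vector<size_t> shape = {1, 3, 224, 224};          // assumed input shape
    std::cout << dimByLayout("NCHW", shape, 'C') << "\n";  // 3   (index 1)
    std::cout << dimByLayout("NCHW", shape, 'W') << "\n";  // 224 (index 3)
    std::cout << dimByLayout("NHWC", shape, 'C') << "\n";  // 224 (index 3) - same shape read as NHWC
    return 0;
}
```

One small nit in the patch itself: `std::string(character, 1)` in the error message constructs `character`-many copies of the character `'\x01'` because the constructor arguments are swapped; `std::string(1, character)` is presumably what was intended.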
@@ -102,61 +138,20 @@ std::map<std::string, std::string> parseNStreamsValuePerDevice(const std::vector
     return result;
 }
 
-bool adjustShapesBatch(InferenceEngine::ICNNNetwork::InputShapes& shapes,
-                       const size_t batch_size, const InferenceEngine::InputsDataMap& input_info) {
-    bool updated = false;
-    for (auto& item : input_info) {
-        auto layout = item.second->getTensorDesc().getLayout();
-
-        int batch_index = -1;
-        if ((layout == InferenceEngine::Layout::NCHW) || (layout == InferenceEngine::Layout::NCDHW) ||
-            (layout == InferenceEngine::Layout::NHWC) || (layout == InferenceEngine::Layout::NDHWC) ||
-            (layout == InferenceEngine::Layout::NC)) {
-            batch_index = 0;
-        } else if (layout == InferenceEngine::Layout::CN) {
-            batch_index = 1;
-        }
-        if ((batch_index != -1) && (shapes.at(item.first).at(batch_index) != batch_size)) {
-            shapes[item.first][batch_index] = batch_size;
-            updated = true;
-        }
-    }
-    return updated;
-}
-
-bool updateShapes(InferenceEngine::ICNNNetwork::InputShapes& shapes,
-                  const std::string shapes_string, const InferenceEngine::InputsDataMap& input_info) {
-    bool updated = false;
-    std::string search_string = shapes_string;
-    auto start_pos = search_string.find_first_of('[');
-    while (start_pos != std::string::npos) {
-        auto end_pos = search_string.find_first_of(']');
-        if (end_pos == std::string::npos)
-            break;
-        auto input_name = search_string.substr(0, start_pos);
-        auto input_shape = search_string.substr(start_pos + 1, end_pos - start_pos - 1);
-        std::vector<size_t> parsed_shape;
-        for (auto& dim : split(input_shape, ',')) {
-            parsed_shape.push_back(std::stoi(dim));
-        }
-        if (!input_name.empty()) {
-            shapes[input_name] = parsed_shape;
-            updated = true;
-        } else {
-            for (auto& item : input_info) {
-                shapes[item.first] = parsed_shape;
-            }
-            updated = true;
-        }
-        search_string = search_string.substr(end_pos + 1);
-        if (search_string.empty() || search_string.front() != ',')
-            break;
-        search_string = search_string.substr(1);
-        start_pos = search_string.find_first_of('[');
-    }
-    if (!search_string.empty())
-        throw std::logic_error("Can't parse `shape` parameter: " + shapes_string);
-    return updated;
-}
+size_t getBatchSize(const benchmark_app::InputsInfo& inputs_info) {
+    size_t batch_size = 0;
+    for (auto& info : inputs_info) {
+        std::size_t batch_index = info.second.layout.find("N");
+        if (batch_index != std::string::npos) {
+            if (batch_size == 0)
+                batch_size = info.second.shape[batch_index];
+            else if (batch_size != info.second.shape[batch_index])
+                throw std::logic_error("Can't deterimine batch size: batch is different for different inputs!");
+        }
+    }
+    if (batch_size == 0)
+        batch_size = 1;
+    return batch_size;
+}
 
 std::string getShapesString(const InferenceEngine::ICNNNetwork::InputShapes& shapes) {
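A worked example of the new `getBatchSize` helper (values invented): for `-layout "input1[NCHW],input2[NC]"` with shapes `{2,3,224,224}` and `{2,3}`, the letter `N` sits at index 0 in both layout strings, both shapes report 2 at that position, so the function returns a batch of 2. If the two inputs disagreed it would throw, and if no input had an `N` dimension it would fall back to 1.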
@@ -1,4 +1,4 @@
-// Copyright (C) 2018-2020 Intel Corporation
+// Copyright (C) 2018-2021 Intel Corporation
 // SPDX-License-Identifier: Apache-2.0
 //
 
@@ -8,15 +8,119 @@
 #include <vector>
 #include <map>
 
+namespace benchmark_app {
+struct InputInfo {
+    InferenceEngine::Precision precision;
+    InferenceEngine::SizeVector shape;
+    std::string layout;
+    bool isImage() const;
+    bool isImageInfo() const;
+    size_t getDimentionByLayout(char character) const;
+    size_t width() const;
+    size_t height() const;
+    size_t channels() const;
+    size_t batch() const;
+    size_t depth() const;
+};
+using InputsInfo = std::map<std::string, InputInfo>;
+}
+
 std::vector<std::string> parseDevices(const std::string& device_string);
 uint32_t deviceDefaultDeviceDurationInSeconds(const std::string& device);
 std::map<std::string, std::string> parseNStreamsValuePerDevice(const std::vector<std::string>& devices,
                                                                const std::string& values_string);
-bool updateShapes(InferenceEngine::ICNNNetwork::InputShapes& shapes,
-                  const std::string shapes_string, const InferenceEngine::InputsDataMap& input_info);
-bool adjustShapesBatch(InferenceEngine::ICNNNetwork::InputShapes& shapes,
-                       const size_t batch_size, const InferenceEngine::InputsDataMap& input_info);
 std::string getShapesString(const InferenceEngine::ICNNNetwork::InputShapes& shapes);
+size_t getBatchSize(const benchmark_app::InputsInfo& inputs_info);
+std::vector<std::string> split(const std::string &s, char delim);
+
+template <typename T>
+std::map<std::string, std::string> parseInputParameters(const std::string parameter_string,
+                                                        const std::map<std::string, T>& input_info) {
+    // Parse parameter string like "input0[value0],input1[value1]" or "[value]" (applied to all inputs)
+    std::map<std::string, std::string> return_value;
+    std::string search_string = parameter_string;
+    auto start_pos = search_string.find_first_of('[');
+    while (start_pos != std::string::npos) {
+        auto end_pos = search_string.find_first_of(']');
+        if (end_pos == std::string::npos)
+            break;
+        auto input_name = search_string.substr(0, start_pos);
+        auto input_value = search_string.substr(start_pos + 1, end_pos - start_pos - 1);
+        if (!input_name.empty()) {
+            return_value[input_name] = input_value;
+        } else {
+            for (auto& item : input_info) {
+                return_value[item.first] = input_value;
+            }
+        }
+        search_string = search_string.substr(end_pos + 1);
+        if (search_string.empty() || search_string.front() != ',')
+            break;
+        search_string = search_string.substr(1);
+        start_pos = search_string.find_first_of('[');
+    }
+    if (!search_string.empty())
+        throw std::logic_error("Can't parse input parameter string: " + parameter_string);
+    return return_value;
+}
+
+template <typename T>
+benchmark_app::InputsInfo getInputsInfo(const std::string& shape_string,
+                                        const std::string& layout_string,
+                                        const size_t batch_size,
+                                        const std::map<std::string, T>& input_info,
+                                        bool& reshape_required) {
+    std::map<std::string, std::string> shape_map = parseInputParameters(shape_string, input_info);
+    std::map<std::string, std::string> layout_map = parseInputParameters(layout_string, input_info);
+    reshape_required = false;
+    benchmark_app::InputsInfo info_map;
+    for (auto& item : input_info) {
+        benchmark_app::InputInfo info;
+        auto name = item.first;
+        auto descriptor = item.second->getTensorDesc();
+        // Precision
+        info.precision = descriptor.getPrecision();
+        // Shape
+        if (shape_map.count(name)) {
+            std::vector<size_t> parsed_shape;
+            for (auto& dim : split(shape_map.at(name), ',')) {
+                parsed_shape.push_back(std::stoi(dim));
+            }
+            info.shape = parsed_shape;
+            reshape_required = true;
+        } else {
+            info.shape = descriptor.getDims();
+        }
+        // Layout
+        if (layout_map.count(name)) {
+            info.layout = layout_map.at(name);
+            std::transform(info.layout.begin(), info.layout.end(), info.layout.begin(), ::toupper);
+        } else {
+            std::stringstream ss;
+            ss << descriptor.getLayout();
+            info.layout = ss.str();
+        }
+        // Update shape with batch if needed
+        if (batch_size != 0) {
+            std::size_t batch_index = info.layout.find("N");
+            if ((batch_index != std::string::npos) && (info.shape.at(batch_index) != batch_size)) {
+                info.shape[batch_index] = batch_size;
+                reshape_required = true;
+            }
+        }
+        info_map[name] = info;
+    }
+    return info_map;
+}
+
+template <typename T>
+benchmark_app::InputsInfo getInputsInfo(const std::string& shape_string,
+                                        const std::string& layout_string,
+                                        const size_t batch_size,
+                                        const std::map<std::string, T>& input_info) {
+    bool reshape_required = false;
+    return getInputsInfo<T>(shape_string, layout_string, batch_size, input_info, reshape_required);
+}
+
 #ifdef USE_OPENCV
 void dump_config(const std::string& filename,
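For reference, here is what the `parseInputParameters` grammar accepts, as a standalone sketch that mirrors the template above using plain strings instead of Inference Engine types (the input names are invented):

```cpp
#include <iostream>
#include <map>
#include <stdexcept>
#include <string>

// Standalone re-implementation of the parsing rule used by parseInputParameters:
// "input0[value0],input1[value1]" or "[value]" (the latter applies to every known input).
static std::map<std::string, std::string> parse(const std::string& parameter_string,
                                                const std::map<std::string, std::string>& inputs) {
    std::map<std::string, std::string> result;
    std::string s = parameter_string;
    auto start = s.find_first_of('[');
    while (start != std::string::npos) {
        auto end = s.find_first_of(']');
        if (end == std::string::npos)
            break;
        auto name = s.substr(0, start);
        auto value = s.substr(start + 1, end - start - 1);
        if (!name.empty()) {
            result[name] = value;
        } else {
            for (const auto& item : inputs)
                result[item.first] = value;      // no name: apply to all inputs
        }
        s = s.substr(end + 1);
        if (s.empty() || s.front() != ',')
            break;
        s = s.substr(1);
        start = s.find_first_of('[');
    }
    if (!s.empty())
        throw std::logic_error("Can't parse input parameter string: " + parameter_string);
    return result;
}

int main() {
    std::map<std::string, std::string> inputs = {{"input1", ""}, {"input2", ""}};  // assumed input names
    for (const auto& kv : parse("input1[NCHW],input2[NC]", inputs))
        std::cout << kv.first << " -> " << kv.second << "\n";   // input1 -> NCHW, input2 -> NC
    for (const auto& kv : parse("[NHWC]", inputs))
        std::cout << kv.first << " -> " << kv.second << "\n";   // both inputs -> NHWC
    return 0;
}
```

`getInputsInfo` calls this helper twice, once for `-shape` and once for `-layout`, so the two options share the `input[value]` / `[value]` syntax.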
@@ -5,7 +5,7 @@ This topic demonstrates how to run the Benchmark Python* Tool, which performs in
 > **NOTE:** This topic describes usage of Python implementation of the Benchmark Tool. For the C++ implementation, refer to [Benchmark C++ Tool](../../samples/benchmark_app/README.md).
 
 > **TIP**: You also can work with the Benchmark Tool inside the OpenVINO™ [Deep Learning Workbench](@ref workbench_docs_Workbench_DG_Introduction) (DL Workbench).
 > [DL Workbench](@ref workbench_docs_Workbench_DG_Introduction) is a platform built upon OpenVINO™ and provides a web-based graphical environment that enables you to optimize, fine-tune, analyze, visualize, and compare
 > performance of deep learning models on various Intel® architecture
 > configurations. In the DL Workbench, you can use most of OpenVINO™ toolkit components.
 > <br>
@@ -109,11 +109,18 @@ Options:
 -t TIME, --time TIME Optional. Time in seconds to execute topology.
 -progress [PROGRESS] Optional. Show progress bar (can affect performance
 measurement). Default values is "False".
+-shape SHAPE Optional. Set shape for input. For example,
+"input1[1,3,224,224],input2[1,4]" or "[1,3,224,224]"
+in case of one input size.
+-layout LAYOUT Optional. Prompts how network layouts should be
+treated by application. For example,
+"input1[NCHW],input2[NC]" or "[NCHW]" in case of one
+input size.
 -nstreams NUMBER_STREAMS, --number_streams NUMBER_STREAMS
 Optional. Number of streams to use for inference on the CPU/GPU in throughput mode
 (for HETERO and MULTI device cases use format <device1>:<nstreams1>,<device2>:<nstreams2> or just <nstreams>).
 Default value is determined automatically for a device.
 Please note that although the automatic selection usually provides a reasonable performance,
 it still may be non-optimal for some cases, especially for very small networks.
 -nthreads NUMBER_THREADS, --number_threads NUMBER_THREADS
 Number of threads to use for inference on the CPU
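As with the C++ tool, the Python help text above uses the same `name[value]` syntax for `-shape` and `-layout`, so an invocation along the lines of `python3 benchmark_app.py -m <ir_dir>/googlenet-v1.xml -i <INSTALL_DIR>/deployment_tools/demo/car.png -layout "[NCHW]"` (paths as in the examples below) is the direct Python counterpart of the C++ usage shown earlier in this commit.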
@@ -142,7 +149,7 @@ To run the tool, you can use [public](@ref omz_models_public_index) or [Intel's]
 
 ## Examples of Running the Tool
 
 This section provides step-by-step instructions on how to run the Benchmark Tool with the `googlenet-v1` public model on CPU or FPGA devices. As an input, the `car.png` file from the `<INSTALL_DIR>/deployment_tools/demo/` directory is used.
 
 > **NOTE:** The Internet access is required to execute the following steps successfully. If you have access to the Internet through the proxy server only, please make sure that it is configured in your OS environment.
 
@@ -159,9 +166,9 @@ This section provides step-by-step instructions on how to run the Benchmark Tool
 ```
 ```sh
 python3 mo.py --input_model <models_dir>/public/googlenet-v1/googlenet-v1.caffemodel --data_type FP32 --output_dir <ir_dir>
 ```
 3. Run the tool with specifying the `<INSTALL_DIR>/deployment_tools/demo/car.png` file as an input image, the IR of the `googlenet-v1` model and a device to perform inference on. The following commands demonstrate running the Benchmark Tool in the asynchronous mode on CPU and FPGA devices:
 
 * On CPU:
 ```sh
 python3 benchmark_app.py -m <ir_dir>/googlenet-v1.xml -d CPU -api async -i <INSTALL_DIR>/deployment_tools/demo/car.png --progress true -b 1
@@ -175,7 +182,7 @@ The application outputs number of executed iterations, total duration of executi
 Additionally, if you set the `-pc` parameter, the application outputs performance counters.
 If you set `-exec_graph_path`, the application reports executable graph information serialized.
 
 Below are fragments of sample output for CPU and FPGA devices:
 * For CPU:
 ```
 [Step 8/9] Measuring performance (Start inference asynchronously, 60000 ms duration, 4 inference requests in parallel using 4 streams)
@@ -103,15 +103,19 @@ Options:
 -progress [PROGRESS] Optional. Show progress bar (can affect performance
 measurement). Default values is "False".
 -shape SHAPE Optional. Set shape for input. For example,
-"input1[1,3,224,224],input2[1,4]" or "[1,3,224,224]" in
-case of one input size.
+"input1[1,3,224,224],input2[1,4]" or "[1,3,224,224]"
+in case of one input size.
+-layout LAYOUT Optional. Prompts how network layouts should be
+treated by application. For example,
+"input1[NCHW],input2[NC]" or "[NCHW]" in case of one
+input size.
 -nstreams NUMBER_STREAMS, --number_streams NUMBER_STREAMS
 Optional. Number of streams to use for inference on the CPU/GPU/MYRIAD
 (for HETERO and MULTI device cases use format <device1>:<nstreams1>,<device2>:<nstreams2> or just <nstreams>).
 Default value is determined automatically for a device.
 Please note that although the automatic selection usually provides a reasonable performance,
 it still may be non-optimal for some cases, especially for very small networks.
 Also, using nstreams>1 is inherently throughput-oriented option, while for the best-latency
 estimations the number of streams should be set to 1.
 -enforcebf16 [ENFORCE_BFLOAT16], --enforce_bfloat16 [ENFORCE_BFLOAT16]
 Optional. Enforcing of floating point operations
@@ -11,8 +11,8 @@ from openvino.tools.benchmark.utils.logging import logger
 from openvino.tools.benchmark.utils.progress_bar import ProgressBar
 from openvino.tools.benchmark.utils.utils import next_step, config_network_inputs, get_number_iterations, \
     process_help_inference_string, print_perf_counters, dump_exec_graph, get_duration_in_milliseconds, \
-    get_command_line_arguments, parse_nstreams_value_per_device, parse_devices, update_shapes, \
-    adjust_shapes_batch, load_config, dump_config
+    get_command_line_arguments, parse_nstreams_value_per_device, parse_devices, get_inputs_info, \
+    get_batch_size, load_config, dump_config
 from openvino.tools.benchmark.utils.statistics_report import StatisticsReport, averageCntReport, detailedCntReport

@@ -193,15 +193,10 @@ def run(args):
     # --------------------- 5. Resizing network to match image sizes and given batch ---------------------------
     next_step()

-    shapes = {k: v.input_data.shape.copy() for k, v in ie_network.input_info.items()}
-    reshape = False
-    if args.shape:
-        reshape |= update_shapes(shapes, args.shape, ie_network.input_info)
-    if args.batch_size and args.batch_size != ie_network.batch_size:
-        reshape |= adjust_shapes_batch(shapes, args.batch_size, ie_network.input_info)
+    app_inputs_info, reshape = get_inputs_info(args.shape, args.layout, args.batch_size, ie_network.input_info)
     if reshape:
         start_time = datetime.utcnow()
+        shapes = { k : v.shape for k,v in app_inputs_info.items() }
         logger.info(
             'Reshaping network: {}'.format(', '.join("'{}': {}".format(k, v) for k, v in shapes.items())))
         ie_network.reshape(shapes)
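To make the new flow concrete, here is a minimal, self-contained sketch of what the call above is meant to produce: one description per input plus a flag that says whether `ie_network.reshape()` is needed at all. The `AppInputInfo` tuple and the values are illustrative stand-ins, not the tool's actual objects:

```python
# Illustrative stand-ins only; the real objects come from get_inputs_info().
from collections import namedtuple

AppInputInfo = namedtuple('AppInputInfo', ['precision', 'layout', 'shape'])

app_inputs_info = {'data': AppInputInfo('FP32', 'NCHW', [2, 3, 224, 224])}
reshape = True  # e.g. because -b 2 or -shape changed a dimension

if reshape:
    shapes = {k: v.shape for k, v in app_inputs_info.items()}
    print('Reshaping network:', shapes)  # -> {'data': [2, 3, 224, 224]}
```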
@@ -213,13 +208,15 @@ def run(args):
                               ('reshape network time (ms)', duration_ms)
                           ])

-    batch_size = ie_network.batch_size
-    logger.info('Network batch size: {}'.format(ie_network.batch_size))
+    # use batch size according to provided layout and shapes
+    batch_size = get_batch_size(app_inputs_info) if args.layout else ie_network.batch_size

+    logger.info('Network batch size: {}'.format(batch_size))

     # --------------------- 6. Configuring input of the model --------------------------------------------------
     next_step()

-    config_network_inputs(ie_network)
+    config_network_inputs(ie_network, app_inputs_info)

     # --------------------- 7. Loading the model to the device -------------------------------------------------
     next_step()
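When `-layout` is given, the reported batch size is read out of the user-supplied layout rather than from `ie_network.batch_size`. A rough illustration of that lookup, with assumed values:

```python
# Assumed values: with layout 'NCHW', the batch is the dimension at layout.index('N').
layout, shape = 'NCHW', [4, 3, 224, 224]
batch_index = layout.index('N') if 'N' in layout else -1
print(shape[batch_index] if batch_index != -1 else 1)  # -> 4
```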
@@ -253,6 +250,7 @@ def run(args):
                               [
                                   ('import network time (ms)', duration_ms)
                               ])
+        app_inputs_info, _ = get_inputs_info(args.shape, args.layout, args.batch_size, exe_network.input_info)
     if batch_size == 0:
         batch_size = 1

@@ -277,7 +275,7 @@ def run(args):
     if args.paths_to_input:
         for path in args.paths_to_input:
             paths_to_input.append(os.path.abspath(*path) if args.paths_to_input else None)
-    set_inputs(paths_to_input, batch_size, exe_network.input_info, infer_requests)
+    set_inputs(paths_to_input, batch_size, app_inputs_info, infer_requests)

     if statistics:
         statistics.add_parameters(StatisticsReport.Category.RUNTIME_CONFIG,
@@ -68,6 +68,10 @@ def parse_args():
     args.add_argument('-shape', type=str, required=False, default='',
                       help='Optional. '
                            'Set shape for input. For example, "input1[1,3,224,224],input2[1,4]" or "[1,3,224,224]" in case of one input size.')
+    args.add_argument('-layout', type=str, required=False, default='',
+                      help='Optional. '
+                           'Prompts how network layouts should be treated by application. '
+                           'For example, "input1[NCHW],input2[NC]" or "[NCHW]" in case of one input size.')
     args.add_argument('-nstreams', '--number_streams', type=str, required=False, default=None,
                       help='Optional. Number of streams to use for inference on the CPU/GPU/MYRIAD '
                            '(for HETERO and MULTI device cases use format <device1>:<nstreams1>,<device2>:<nstreams2> '
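The `-layout` value reuses the `name[value]` mini-syntax of `-shape`. A quick sketch of how such a string is expected to split per input; the regular expression mirrors the one used by the new `parse_input_parameters` helper further down, and the input names here are made up:

```python
import re

# "input1[NCHW],input2[NC]" -> {'input1': 'NCHW', 'input2': 'NC'}
# A bare "[NCHW]" (empty name) applies the one value to every input instead.
layout_string = 'input1[NCHW],input2[NC]'
matches = re.findall(r'(.*?)\[(.*?)\],?', layout_string)
print({name: value for name, value in matches if name})  # -> {'input1': 'NCHW', 'input2': 'NC'}
```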
@@ -83,7 +87,7 @@ def parse_args():
                       help='Number of threads to use for inference on the CPU, GNA '
                            '(including HETERO and MULTI cases).')
     args.add_argument('-pin', '--infer_threads_pinning', type=str, required=False, default='YES', choices=['YES', 'NO', 'NUMA'],
                       help='Optional. Enable threads->cores (\'YES\' is default value), threads->(NUMA)nodes (\'NUMA\') or completely disable (\'NO\')'
                            'CPU threads pinning for CPU-involved inference.')
     args.add_argument('-exec_graph_path', '--exec_graph_path', type=str, required=False,
                       help='Optional. Path to a file where to store executable graph information serialized.')
@@ -1,5 +1,5 @@
 """
- Copyright (C) 2018-2020 Intel Corporation
+ Copyright (C) 2018-2021 Intel Corporation

  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
@@ -22,21 +22,8 @@ from glob import glob
 from .constants import IMAGE_EXTENSIONS, BINARY_EXTENSIONS
 from .logging import logger

-def is_image(blob):
-    if blob.layout != "NCHW":
-        return False
-    channels = blob.shape[1]
-    return channels == 3
-
-
-def is_image_info(blob):
-    if blob.layout != "NC":
-        return False
-    channels = blob.shape[1]
-    return channels >= 2
-
-
-def set_inputs(paths_to_input, batch_size, input_info, requests):
-    requests_input_data = get_inputs(paths_to_input, batch_size, input_info, requests)
+def set_inputs(paths_to_input, batch_size, app_input_info, requests):
+    requests_input_data = get_inputs(paths_to_input, batch_size, app_input_info, requests)
     for i in range(len(requests)):
         inputs = requests[i].input_blobs
         for k, v in requests_input_data[i].items():
@@ -44,19 +31,20 @@ def set_inputs(paths_to_input, batch_size, input_info, requests):
                 raise Exception("No input with name {} found!".format(k))
             inputs[k].buffer[:] = v

-def get_inputs(paths_to_input, batch_size, input_info, requests):
+def get_inputs(paths_to_input, batch_size, app_input_info, requests):
     input_image_sizes = {}
-    for key in sorted(input_info.keys()):
-        if is_image(input_info[key].input_data):
-            input_image_sizes[key] = (input_info[key].input_data.shape[2], input_info[key].input_data.shape[3])
+    for key in sorted(app_input_info.keys()):
+        info = app_input_info[key]
+        if info.is_image:
+            input_image_sizes[key] = (info.width, info.height)
         logger.info("Network input '{}' precision {}, dimensions ({}): {}".format(key,
-                                                                                  input_info[key].input_data.precision,
-                                                                                  input_info[key].input_data.layout,
+                                                                                  info.precision,
+                                                                                  info.layout,
                                                                                   " ".join(str(x) for x in
-                                                                                           input_info[key].input_data.shape)))
+                                                                                           info.shape)))

     images_count = len(input_image_sizes.keys())
-    binaries_count = len(input_info) - images_count
+    binaries_count = len(app_input_info) - images_count

     image_files = list()
     binary_files = list()
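Note that the image size is now looked up through the layout string rather than through the hard-coded positions 2 and 3, so channels-first and channels-last inputs resolve to the same width and height. A tiny standalone sketch of that lookup, with assumed shapes:

```python
# Sketch only: 'W'/'H' are found by their position in the layout string.
def dim(shape, layout, character):
    return shape[layout.index(character)]

print(dim([1, 3, 224, 224], 'NCHW', 'W'), dim([1, 224, 224, 3], 'NHWC', 'W'))  # -> 224 224
```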
@@ -100,33 +88,34 @@ def get_inputs(paths_to_input, batch_size, input_info, requests):
     for request_id in range(0, len(requests)):
         logger.info("Infer Request {} filling".format(request_id))
         input_data = {}
-        keys = list(sorted(input_info.keys()))
+        keys = list(sorted(app_input_info.keys()))
         for key in keys:
-            if is_image(input_info[key].input_data):
+            info = app_input_info[key]
+            if info.is_image:
                 # input is image
                 if len(image_files) > 0:
                     input_data[key] = fill_blob_with_image(image_files, request_id, batch_size, keys.index(key),
-                                                           len(keys), input_info[key].input_data)
+                                                           len(keys), info)
                     continue

             # input is binary
             if len(binary_files):
                 input_data[key] = fill_blob_with_binary(binary_files, request_id, batch_size, keys.index(key),
-                                                        len(keys), input_info[key].input_data)
+                                                        len(keys), info)
                 continue

             # most likely input is image info
-            if is_image_info(input_info[key].input_data) and len(input_image_sizes) == 1:
+            if info.is_image_info and len(input_image_sizes) == 1:
                 image_size = input_image_sizes[list(input_image_sizes.keys()).pop()]
                 logger.info("Fill input '" + key + "' with image size " + str(image_size[0]) + "x" +
                             str(image_size[1]))
-                input_data[key] = fill_blob_with_image_info(image_size, input_info[key].input_data)
+                input_data[key] = fill_blob_with_image_info(image_size, info)
                 continue

             # fill with random data
-            logger.info("Fill input '{}' with random values ({} is expected)".format(key, "image" if is_image(
-                input_info[key].input_data) else "some binary data"))
-            input_data[key] = fill_blob_with_random(input_info[key].input_data)
+            logger.info("Fill input '{}' with random values ({} is expected)".format(key, "image"
+                        if info.is_image else "some binary data"))
+            input_data[key] = fill_blob_with_random(info)

         requests_input_data.append(input_data)

@@ -150,8 +139,8 @@ def get_files_by_extensions(paths_to_input, extensions):

     return input_files

-def fill_blob_with_image(image_paths, request_id, batch_size, input_id, input_size, layer):
-    shape = layer.shape
+def fill_blob_with_image(image_paths, request_id, batch_size, input_id, input_size, info):
+    shape = info.shape
     images = np.ndarray(shape)
     image_index = request_id * batch_size * input_size + input_id
     for b in range(batch_size):
@@ -159,15 +148,11 @@ def fill_blob_with_image(image_paths, request_id, batch_size, input_id, input_size, layer):
         image_filename = image_paths[image_index]
         logger.info('Prepare image {}'.format(image_filename))
         image = cv2.imread(image_filename)
-        new_im_size = tuple(shape[2:])
+        new_im_size = tuple((info.width, info.height))
         if image.shape[:-1] != new_im_size:
             logger.warning("Image is resized from ({}) to ({})".format(image.shape[:-1], new_im_size))
             image = cv2.resize(image, new_im_size)
-        if image.shape[0] != shape[2]:
-            image = image.transpose((2, 1, 0))
-        else:
+        if info.layout in ['NCHW', 'CHW']:
             image = image.transpose((2, 0, 1))
         images[b] = image

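With layouts in the picture, the HWC array that `cv2.imread` returns is only moved to channels-first when the target layout really is channels-first. A short standalone sketch of that step, using plain NumPy instead of OpenCV and assumed array contents:

```python
import numpy as np

image = np.zeros((224, 224, 3))          # stand-in for a cv2.imread(...) result (HWC)
layout = 'NCHW'
if layout in ['NCHW', 'CHW']:
    image = image.transpose((2, 0, 1))   # HWC -> CHW
print(image.shape)                       # -> (3, 224, 224)
```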
@@ -189,11 +174,13 @@ def get_dtype(precision):
         return format_map[precision]
     raise Exception("Can't find data type for precision: " + precision)

-def fill_blob_with_binary(binary_paths, request_id, batch_size, input_id, input_size, layer):
-    binaries = np.ndarray(layer.shape)
-    shape = get_blob_shape(layer, 1) # get blob shape for batch 1
+def fill_blob_with_binary(binary_paths, request_id, batch_size, input_id, input_size, info):
+    binaries = np.ndarray(info.shape)
+    shape = info.shape.copy()
+    if 'N' in info.layout:
+        shape[info.layout.index('N')] = 1
     binary_index = request_id * batch_size * input_size + input_id
-    dtype = get_dtype(layer.precision)
+    dtype = get_dtype(info.precision)
     for b in range(batch_size):
         binary_index %= len(binary_paths)
         binary_filename = binary_paths[binary_index]
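For binary inputs, the per-sample shape is now derived by forcing the batch ('N') dimension to 1 through the layout, replacing the old `get_blob_shape` helper. Roughly, under assumed values:

```python
# Assumed layout/shape; the 'N' position comes from the layout string.
layout, shape = 'NC', [8, 1024]
per_sample = shape.copy()
if 'N' in layout:
    per_sample[layout.index('N')] = 1
print(per_sample)  # -> [1, 1024]
```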
@@ -1,5 +1,5 @@
 """
- Copyright (C) 2018-2020 Intel Corporation
+ Copyright (C) 2018-2021 Intel Corporation

  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
@@ -17,7 +17,6 @@ from openvino.inference_engine import IENetwork,IECore

 from .constants import DEVICE_DURATION_IN_SECS, UNKNOWN_DEVICE_TYPE, \
     CPU_DEVICE_NAME, GPU_DEVICE_NAME
-from .inputs_filling import is_image
 from .logging import logger

 import json
@@ -61,13 +60,13 @@ def next_step(additional_info='', step_id=0):
     print(step_info_template)


-def config_network_inputs(ie_network: IENetwork):
+def config_network_inputs(ie_network: IENetwork, app_inputs_info):
     input_info = ie_network.input_info

     for key in input_info.keys():
-        if is_image(input_info[key].input_data):
+        if app_inputs_info[key].is_image:
             # Set the precision of input data provided by the user
             # Should be called before load of the network to the plugin
+            app_inputs_info[key].precision = 'U8'
             input_info[key].precision = 'U8'

@@ -227,33 +226,105 @@ def get_command_line_arguments(argv):
             parameters.append((arg_name, arg_value))
     return parameters

-def update_shapes(shapes, shapes_string: str, inputs_info):
-    updated = False
-    matches = re.findall(r'(.*?)\[(.*?)\],?', shapes_string)
-    if matches:
-        for match in matches:
-            input_name = match[0]
-            parsed_shape = [int(dim) for dim in match[1].split(',')]
-            if input_name != '':
-                shapes[input_name] = parsed_shape
-                updated = True
-            else:
-                shapes.update({ k:parsed_shape for k in shapes.keys() })
-                updated = True
-                break
-    else:
-        raise Exception("Can't parse `shape` parameter: {}".format(shapes_string))
-    return updated
-
-def adjust_shapes_batch(shapes, batch_size: int, inputs_info):
-    updated = False
-    for name, data in inputs_info.items():
-        layout = data.input_data.layout
-        batch_index = layout.index('N') if 'N' in layout else -1
-        if batch_index != -1 and shapes[name][batch_index] != batch_size:
-            shapes[name][batch_index] = batch_size
-            updated = True
-    return updated
+def parse_input_parameters(parameter_string, input_info):
+    # Parse parameter string like "input0[value0],input1[value1]" or "[value]" (applied to all inputs)
+    return_value = {}
+    if parameter_string:
+        matches = re.findall(r'(.*?)\[(.*?)\],?', parameter_string)
+        if matches:
+            for match in matches:
+                input_name, value = match
+                if input_name != '':
+                    return_value[input_name] = value
+                else:
+                    return_value = { k:value for k in input_info.keys() }
+                    break
+        else:
+            raise Exception("Can't parse input parameter: {}".format(parameter_string))
+    return return_value
+
+class InputInfo:
+    def __init__(self):
+        self.precision = None
+        self.layout = ""
+        self.shape = []
+
+    @property
+    def is_image(self):
+        if self.layout not in [ "NCHW", "NHWC", "CHW", "HWC" ]:
+            return False
+        return self.channels == 3
+
+    @property
+    def is_image_info(self):
+        if self.layout != "NC":
+            return False
+        return self.channels >= 2
+
+    def getDimentionByLayout(self, character):
+        if character not in self.layout:
+            raise Exception("Error: Can't get {} from layout {}".format(character, self.layout))
+        return self.shape[self.layout.index(character)]
+
+    @property
+    def width(self):
+        return self.getDimentionByLayout("W")
+
+    @property
+    def height(self):
+        return self.getDimentionByLayout("H")
+
+    @property
+    def channels(self):
+        return self.getDimentionByLayout("C")
+
+    @property
+    def batch(self):
+        return self.getDimentionByLayout("N")
+
+    @property
+    def depth(self):
+        return self.getDimentionByLayout("D")
+
+def get_inputs_info(shape_string, layout_string, batch_size, input_info):
+    shape_map = parse_input_parameters(shape_string, input_info)
+    layout_map = parse_input_parameters(layout_string, input_info)
+    reshape = False
+    info_map = {}
+    for name, descriptor in input_info.items():
+        info = InputInfo()
+        # Precision
+        info.precision = descriptor.precision
+        # Shape
+        if name in shape_map.keys():
+            parsed_shape = [int(dim) for dim in shape_map[name].split(',')]
+            info.shape = parsed_shape
+            reshape = True
+        else:
+            info.shape = descriptor.input_data.shape
+        # Layout
+        info.layout = layout_map[name].upper() if name in layout_map.keys() else descriptor.tensor_desc.layout
+        # Update shape with batch if needed
+        if batch_size != 0:
+            batch_index = info.layout.index('N') if 'N' in info.layout else -1
+            if batch_index != -1 and info.shape[batch_index] != batch_size:
+                info.shape[batch_index] = batch_size
+                reshape = True
+        info_map[name] = info
+    return info_map, reshape
+
+def get_batch_size(inputs_info):
+    batch_size = 0
+    for _, info in inputs_info.items():
+        batch_index = info.layout.index('N') if 'N' in info.layout else -1
+        if batch_index != -1:
+            if batch_size == 0:
+                batch_size = info.shape[batch_index]
+            elif batch_size != info.shape[batch_index]:
+                raise Exception("Can't deterimine batch size: batch is different for different inputs!")
+    if batch_size == 0:
+        batch_size = 1
+    return batch_size

 def show_available_devices():
     ie = IECore()
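Putting the new helpers together, the intended flow is: parse the `-shape`/`-layout` strings into per-input maps, build one `InputInfo` per network input, then read the batch from whichever inputs carry an 'N' dimension. The snippet below restates that flow with hard-coded stand-ins for the InferenceEngine input descriptors so it runs on its own; it is a sketch, not the tool's code verbatim:

```python
import re

def parse_params(parameter_string):
    # Same "name[value]" syntax as -shape / -layout (sketch; empty string -> no overrides).
    return dict(re.findall(r'(.*?)\[(.*?)\],?', parameter_string)) if parameter_string else {}

class Info:
    # Hypothetical, trimmed-down stand-in for the InputInfo class above.
    def __init__(self, layout, shape):
        self.layout, self.shape = layout, shape
    @property
    def batch(self):
        return self.shape[self.layout.index('N')] if 'N' in self.layout else None

# Assumed inputs; in the tool these come from ie_network.input_info.
shapes = {'data': [1, 3, 600, 600], 'im_info': [1, 3]}
layouts = parse_params('data[NCHW],im_info[NC]')

infos = {name: Info(layouts[name], shape) for name, shape in shapes.items()}
batches = {i.batch for i in infos.values() if i.batch is not None}
print(batches)  # -> {1}; more than one distinct value would be an error
```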