Files
openvino/samples/cpp/benchmark_app/main.cpp

1230 lines
58 KiB
C++
Raw Normal View History

// Copyright (C) 2018-2022 Intel Corporation
2018-11-23 16:19:43 +03:00
// SPDX-License-Identifier: Apache-2.0
//
#include <algorithm>
#include <chrono>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>
// clang-format off
#include "openvino/openvino.hpp"
#include "openvino/pass/serialize.hpp"
#include "gna/gna_config.hpp"
#include "gpu/gpu_config.hpp"
#include "samples/args_helper.hpp"
#include "samples/common.hpp"
#include "samples/slog.hpp"
2018-11-23 16:19:43 +03:00
2019-04-12 18:25:53 +03:00
#include "benchmark_app.hpp"
#include "infer_request_wrap.hpp"
#include "inputs_filling.hpp"
2019-04-12 18:25:53 +03:00
#include "progress_bar.hpp"
#include "remote_tensors_filling.hpp"
2019-04-12 18:25:53 +03:00
#include "statistics_report.hpp"
2019-08-09 19:02:42 +03:00
#include "utils.hpp"
// clang-format on
2018-11-23 16:19:43 +03:00
2019-08-09 19:02:42 +03:00
static const size_t progressBarDefaultTotalCount = 1000;
2018-11-23 16:19:43 +03:00
bool parse_and_check_command_line(int argc, char* argv[]) {
// ---------------------------Parsing and validating input
// arguments--------------------------------------
2019-04-12 18:25:53 +03:00
slog::info << "Parsing input parameters" << slog::endl;
gflags::ParseCommandLineNonHelpFlags(&argc, &argv, true);
2019-08-09 19:02:42 +03:00
if (FLAGS_help || FLAGS_h) {
show_usage();
2019-08-09 19:02:42 +03:00
showAvailableDevices();
2019-04-12 18:25:53 +03:00
return false;
}
2018-11-23 16:19:43 +03:00
2019-04-12 18:25:53 +03:00
if (FLAGS_m.empty()) {
show_usage();
2019-08-09 19:02:42 +03:00
throw std::logic_error("Model is required but not set. Please set -m option.");
2019-04-12 18:25:53 +03:00
}
2018-11-23 16:19:43 +03:00
if (FLAGS_latency_percentile > 100 || FLAGS_latency_percentile < 1) {
show_usage();
throw std::logic_error("The percentile value is incorrect. The applicable values range is [1, 100].");
}
2019-04-12 18:25:53 +03:00
if (FLAGS_api != "async" && FLAGS_api != "sync") {
2019-08-09 19:02:42 +03:00
throw std::logic_error("Incorrect API. Please set -api option to `sync` or `async` value.");
2019-04-12 18:25:53 +03:00
}
if (!FLAGS_hint.empty() && FLAGS_hint != "throughput" && FLAGS_hint != "tput" && FLAGS_hint != "latency" &&
FLAGS_hint != "cumulative_throughput" && FLAGS_hint != "ctput" && FLAGS_hint != "none") {
OV Performance Hints (CPU and GPU logic for selecting the actual configs), while AUTO/MULTI are passing them thru) (#6993) * rebasing the perf-modes-2021.3 to the 2021.4 Caveats: the (explicit) setting #streams is not disabled (as it was before for experiments with DLBenchmark), and the logic slighlty differ (streamsSet) (cherry picked from commit 1ae1edc0ed70fdea40f528fdaf8d00a9904d2a5c) * overriding streams (to force the TPUT mode to the DLBenchnark) (cherry picked from commit 7f506cda31abf35ac293d0dce32f602a0188c619) * disabling reducing #streams to fully mimic baseline c4df94d42d90a2bc3cd91d3d6844ba42f29bca7f of the 2021.3 (before experiments) (cherry picked from commit 85073dd1dd2c7d43a89c37c8f646313f6ddfc650) * clang/identation (cherry picked from commit 050a4155a923cee294c8689d685b39247b7a172a) * splitting the Transformation to general and CPU specific. Now hopefully,this fully mimics the baseline c4df94d42d90a2bc3cd91d3d6844ba42f29bca7f of the 2021.3 (before experiments), as the streams reduce num (as well as early exit on GRU/LSTM/TensorIterator) is deisabled (cherry picked from commit e98b2c1a67f2542a686543594b75b575ef515196) * disabling GRU/LSTM/TI + reducing of streams + 5D considered compute-limited only for int8 (cherry picked from commit 32b8d80dee18685ebf3d069bb4cd2172af7363b7) * refactored to avoid compute_limited_ratio, reverted the reducing #streams, removed LSTM from limitations (cherry picked from commit f2b972171b29cf599aae2407ceec2e6adb67e4e9) * isa-based threshold logic (cherry picked from commit b218457e1a93fcb3374eb9da948fdad2175ec33a) * mode->hint (cherry picked from commit ec20aa8ecaf3222f2a6fdfe9153cf6c9dfdd6a54) * optional PERFORMANCE_HINT_NUM_REQUESTS (cherry picked from commit 5a3883e3f36e7928c6391094ae10711c8e4c3b5c) * moving the perfHints to the common OV config class + initial tests (CPU only, as the actual AUTO/MULTI should be accommodated on the master) (cherry picked from commit (then fixed)45bafe7d527f466507dea0693aeed51be4ebf776) * AUTO support for PerfHints * MULTI support for PerfHints * Enabling Perf hints for the GPU plugin * brushing settings output a bit * disabling "throughput" perf hint being default (until OV 2.0) * uncommenting the logic which was disabled to force the DLBenchmark to use the throughput mode by default * removing dead and experimental code, and debug printfs * clang/code-style * code-review remarks * Moved the output of the actual params that the hint produced to the right place * aligning MULTI's GetConfig beh to HETERO's as captured in the preso (CVS-59960) ratified with the ArchForum * clang * benchmark_app brushing * Update inference-engine/samples/benchmark_app/README.md * propagating the perf hints thru one more scenario in the merged AUTO-MULTI * fixed mispint * Python benchmark_app update for perf hints * addresssing reviewers comments on the python benchmark_app * simplifying/brushing logic a bit * refactor the heuristic to the separate file (to be shared with iGPU soon) * refactor conversion of modes to the specific GPU config per feedback from Vladimir
2021-09-13 15:40:36 +03:00
throw std::logic_error("Incorrect performance hint. Please set -hint option to"
"`throughput`(tput), `latency', 'cumulative_throughput'(ctput) value or 'none'.");
OV Performance Hints (CPU and GPU logic for selecting the actual configs), while AUTO/MULTI are passing them thru) (#6993) * rebasing the perf-modes-2021.3 to the 2021.4 Caveats: the (explicit) setting #streams is not disabled (as it was before for experiments with DLBenchmark), and the logic slighlty differ (streamsSet) (cherry picked from commit 1ae1edc0ed70fdea40f528fdaf8d00a9904d2a5c) * overriding streams (to force the TPUT mode to the DLBenchnark) (cherry picked from commit 7f506cda31abf35ac293d0dce32f602a0188c619) * disabling reducing #streams to fully mimic baseline c4df94d42d90a2bc3cd91d3d6844ba42f29bca7f of the 2021.3 (before experiments) (cherry picked from commit 85073dd1dd2c7d43a89c37c8f646313f6ddfc650) * clang/identation (cherry picked from commit 050a4155a923cee294c8689d685b39247b7a172a) * splitting the Transformation to general and CPU specific. Now hopefully,this fully mimics the baseline c4df94d42d90a2bc3cd91d3d6844ba42f29bca7f of the 2021.3 (before experiments), as the streams reduce num (as well as early exit on GRU/LSTM/TensorIterator) is deisabled (cherry picked from commit e98b2c1a67f2542a686543594b75b575ef515196) * disabling GRU/LSTM/TI + reducing of streams + 5D considered compute-limited only for int8 (cherry picked from commit 32b8d80dee18685ebf3d069bb4cd2172af7363b7) * refactored to avoid compute_limited_ratio, reverted the reducing #streams, removed LSTM from limitations (cherry picked from commit f2b972171b29cf599aae2407ceec2e6adb67e4e9) * isa-based threshold logic (cherry picked from commit b218457e1a93fcb3374eb9da948fdad2175ec33a) * mode->hint (cherry picked from commit ec20aa8ecaf3222f2a6fdfe9153cf6c9dfdd6a54) * optional PERFORMANCE_HINT_NUM_REQUESTS (cherry picked from commit 5a3883e3f36e7928c6391094ae10711c8e4c3b5c) * moving the perfHints to the common OV config class + initial tests (CPU only, as the actual AUTO/MULTI should be accommodated on the master) (cherry picked from commit (then fixed)45bafe7d527f466507dea0693aeed51be4ebf776) * AUTO support for PerfHints * MULTI support for PerfHints * Enabling Perf hints for the GPU plugin * brushing settings output a bit * disabling "throughput" perf hint being default (until OV 2.0) * uncommenting the logic which was disabled to force the DLBenchmark to use the throughput mode by default * removing dead and experimental code, and debug printfs * clang/code-style * code-review remarks * Moved the output of the actual params that the hint produced to the right place * aligning MULTI's GetConfig beh to HETERO's as captured in the preso (CVS-59960) ratified with the ArchForum * clang * benchmark_app brushing * Update inference-engine/samples/benchmark_app/README.md * propagating the perf hints thru one more scenario in the merged AUTO-MULTI * fixed mispint * Python benchmark_app update for perf hints * addresssing reviewers comments on the python benchmark_app * simplifying/brushing logic a bit * refactor the heuristic to the separate file (to be shared with iGPU soon) * refactor conversion of modes to the specific GPU config per feedback from Vladimir
2021-09-13 15:40:36 +03:00
}
if (FLAGS_hint != "none" && (FLAGS_nstreams != "" || FLAGS_nthreads != 0 || FLAGS_pin != "")) {
throw std::logic_error("-nstreams, -nthreads and -pin options are fine tune options. To use them you "
"should explicitely set -hint option to none. This is not OpenVINO limitation "
"(those options can be used in OpenVINO together), but a benchmark_app UI rule.");
}
if (!FLAGS_report_type.empty() && FLAGS_report_type != noCntReport && FLAGS_report_type != averageCntReport &&
FLAGS_report_type != detailedCntReport) {
std::string err = "only " + std::string(noCntReport) + "/" + std::string(averageCntReport) + "/" +
std::string(detailedCntReport) +
" report types are supported (invalid -report_type option value)";
2019-04-12 18:25:53 +03:00
throw std::logic_error(err);
}
2019-10-04 19:26:43 +03:00
if ((FLAGS_report_type == averageCntReport) && ((FLAGS_d.find("MULTI") != std::string::npos))) {
throw std::logic_error("only " + std::string(detailedCntReport) + " report type is supported for MULTI device");
}
bool isNetworkCompiled = fileExt(FLAGS_m) == "blob";
bool isPrecisionSet = !(FLAGS_ip.empty() && FLAGS_op.empty() && FLAGS_iop.empty());
if (isNetworkCompiled && isPrecisionSet) {
std::string err = std::string("Cannot set precision for a compiled network. ") +
std::string("Please re-compile your network with required precision "
"using compile_tool");
throw std::logic_error(err);
}
2019-04-12 18:25:53 +03:00
return true;
}
2018-11-23 16:19:43 +03:00
2019-08-09 19:02:42 +03:00
static void next_step(const std::string additional_info = "") {
static size_t step_id = 0;
static const std::map<size_t, std::string> step_names = {
{1, "Parsing and validating input arguments"},
DOCS: ported changes from 2022.1 release branch (#11206) * Extensibility guide with FE extensions and remove OV_FRAMEWORK_MAP from docs * Rework of Extensibility Intro, adopted examples to missing OPENVINO_FRAMEWORK_MAP * Removed OPENVINO_FRAMEWORK_MAP reference * Frontend extension detailed documentation * Fixed distributed snippets * Fixed snippet inclusion in FE extension document and chapter headers * Fixed wrong name in a snippet reference * Fixed test for template extension due to changed number of loaded extensions * Update docs/Extensibility_UG/frontend_extensions.md Co-authored-by: Ivan Tikhonov <ivan.tikhonov@intel.com> * Minor fixes in extension snippets * Small grammar fix Co-authored-by: Ivan Tikhonov <ivan.tikhonov@intel.com> Co-authored-by: Ivan Tikhonov <ivan.tikhonov@intel.com> * DOCS: transition banner (#10973) * transition banner * minor fix * update transition banner * updates * update custom.js * updates * updates * Documentation fixes (#11044) * Benchmark app usage * Fixed link to the devices * More fixes * Update docs/OV_Runtime_UG/multi_device.md Co-authored-by: Sergey Lyubimtsev <sergey.lyubimtsev@intel.com> * Removed several hardcoded links Co-authored-by: Sergey Lyubimtsev <sergey.lyubimtsev@intel.com> * Updated documentation for compile_tool (#11049) * Added deployment guide (#11060) * Added deployment guide * Added local distribution * Updates * Fixed more indentations * Removed obsolete code snippets (#11061) * Removed obsolete code snippets * NCC style * Fixed NCC for BA * Add a troubleshooting issue for PRC installation (#11074) * updates * adding gna to linux * add missing reference * update * Update docs/install_guides/installing-model-dev-tools.md Co-authored-by: Sergey Lyubimtsev <sergey.lyubimtsev@intel.com> * Update docs/install_guides/installing-model-dev-tools.md Co-authored-by: Sergey Lyubimtsev <sergey.lyubimtsev@intel.com> * Update docs/install_guides/installing-model-dev-tools.md Co-authored-by: Sergey Lyubimtsev <sergey.lyubimtsev@intel.com> * Update docs/install_guides/installing-model-dev-tools.md Co-authored-by: Sergey Lyubimtsev <sergey.lyubimtsev@intel.com> * Update docs/install_guides/installing-model-dev-tools.md Co-authored-by: Sergey Lyubimtsev <sergey.lyubimtsev@intel.com> * update * minor updates * add gna item to yum and apt * add gna to get started page * update reference formatting * merge commit * add a troubleshooting issue * update * update * fix CVS-71846 Co-authored-by: Sergey Lyubimtsev <sergey.lyubimtsev@intel.com> * DOCS: fixed hardcoded links (#11100) * Fixes * Use links * applying reviewers comments to the Opt Guide (#11093) * applying reviewrs comments * fixed refs, more structuring (bold, bullets, etc) * refactoring tput/latency sections * next iteration (mostly latency), also brushed the auto-batching and other sections * updates sync/async images * common opts brushed * WIP tput redesigned * minor brushing of common and auto-batching * Tput fully refactored * fixed doc name in the link * moved int8 perf counters to the right section * fixed links * fixed broken quotes * fixed more links * add ref to the internals to the TOC * Added a note on the batch size Co-authored-by: Andrey Zaytsev <andrey.zaytsev@intel.com> * [80085] New images for docs (#11114) * change doc structure * fix manager tools * fix manager tools 3 step * fix manager tools 3 step * new img * new img for OV Runtime * fix steps * steps * fix intendents * change list * fix space * fix space * code snippets fix * change display * Benchmarks 2022 1 (#11130) * Minor fixes * Updates for 2022.1 * Edits according to the review * Edits according to review comments * Edits according to review comments * Edits according to review comments * Fixed table * Edits according to review comments * Removed config for Intel® Core™ i7-11850HE * Removed forward-tacotron-duration-prediction-241 graph * Added resnet-18-pytorch * Add info about Docker images in Deployment guide (#11136) * Renamed user guides (#11137) * fix screenshot (#11140) * More conservative recommendations on dynamic shapes usage in docs (#11161) * More conservative recommendations about using dynamic shapes * Duplicated statement from C++ part to Python part of reshape doc (no semantical changes) * Update ShapeInference.md (#11168) * Benchmarks 2022 1 updates (#11180) * Updated graphs * Quick fix for TODO in Dynamic Shapes article * Anchor link fixes * Fixed DM config (#11199) * DOCS: doxy sphinxtabs (#11027) * initial implementation of doxy sphinxtabs * fixes * fixes * fixes * fixes * fixes * WA for ignored visibility attribute * Fixes Co-authored-by: Sergey Lyalin <sergey.lyalin@intel.com> Co-authored-by: Ivan Tikhonov <ivan.tikhonov@intel.com> Co-authored-by: Nikolay Tyukaev <nikolay.tyukaev@intel.com> Co-authored-by: Sergey Lyubimtsev <sergey.lyubimtsev@intel.com> Co-authored-by: Yuan Xu <yuan1.xu@intel.com> Co-authored-by: Maxim Shevtsov <maxim.y.shevtsov@intel.com> Co-authored-by: Andrey Zaytsev <andrey.zaytsev@intel.com> Co-authored-by: Tatiana Savina <tatiana.savina@intel.com> Co-authored-by: Ilya Naumov <ilya.naumov@intel.com> Co-authored-by: Evgenya Stepyreva <evgenya.stepyreva@intel.com>
2022-03-24 22:27:29 +03:00
{2, "Loading OpenVINO Runtime"},
{3, "Setting device configuration"},
{4, "Reading network files"},
{5, "Resizing network to match image sizes and given batch"},
{6, "Configuring input of the model"},
{7, "Loading the model to the device"},
{8, "Setting optimal runtime parameters"},
Dynamic reshapes (#7788) * Merged and compiling * Fix for dynamic shape type * review fixes * renamed blob shape to tensor shape, small improvements * fix code style * added parsing of multiple shapes * store latency per group, add isIdleRequestAvailable() to Infer Queue * added cached random inputs * redesign pipeline, added new metrics(avg, max, min), added metrics per groups * fixed code style * small improvements * modified tensor parameters parsing * modified -i parameter parsing: added possibility to specify input names * implemented image cashing * added cashed blobs creating * added -pcseq flag, modified batch filling, changes fps formula * improvements * code formatting * code formatting2 * apply suggestions from review * replaced Buffer class with InferenceEngine Blobs * use batch size in blobs filling * added shared blob allocator to handle blob's data * fixed warnings & code style * allocate blobs * fix for networks with image info input * added comments & fixed codestyle * clear data in free() in SharedBlobAllocator * remove unnecessary check * Delimeter is changed to :: * stylefix * added layout from string function, small improvements * modified parsing to enable : in input parameters * small fixes * small fixes * added missed blob allocation, fixes * [TEST]added support for remote blobs * fix remote blobs * new inputs/files output format * removed vectors resize which caused bugs * made cl::Buffer type under ifdef, fix inputs filling * changed batch() function to not throwing exceptions * removed unused var * fix code style * replace empty name in input files with name from net input * restored old behaviour for static models * fix code style * fix warning - made const iterator * fix warning - remove reference in loop variable * added random and image_info input types to -i, fix problem with layout * replaced batch() with getBatchSize() in main * fix layout, shape, tensor shape parameters parsing * upd help messages for input, tensor shape and pcseq command * added buffer for cl output blobs, small fixes Signed-off-by: ivikhrev <ivan.vikhrev@intel.com> * added legacy mode * restore setBlob * code style formatting * move collecting latency for groups under flag * removed not applicable layouts * added hint to error message when wrong input name in -tensor_shape was specified * added new metrics to statistics report * Apply suggestions from code review * fix binary blobs filling when layout is CN * apply suggestions * moved file in the right place after rebase * improved -pcseq output * updated args and readme * removed TEMPLATE plugin registration * fix -shape arg decsription * enable providing several -i args as input * renamed legacy_mode to inference_only and made it default for static models, renamed tensor_shape to data_shape * upd readme * use getBlob() in inference only mode * fix old input type for static case * fix typo * upd readme * move log about benchmark mode to the measuring perfomance step * added class for latency metrics * upd readme, fix typos, renamed funcs * fix warning and upd parsing to avoid error with : in file paths * fix error on centos : error: use of deleted function ‘std::basic_stringstream<char>::basic_stringstream(const std::basic_stringstream<char>&) * added check for key in inputs * renamed input to inputs * adjust batch size for binary blobs * replaced warning with exception in bench mode defining * align measurement cycle with master Co-authored-by: ivikhrev <ivan.vikhrev@intel.com>
2021-12-17 12:20:43 +03:00
{9, "Creating infer requests and preparing input blobs with data"},
{10, "Measuring performance"},
{11, "Dumping statistics report"}};
2019-08-09 19:02:42 +03:00
step_id++;
OPENVINO_ASSERT(step_names.count(step_id) != 0,
"Step ID ",
step_id,
" is out of total steps number ",
step_names.size());
2019-08-09 19:02:42 +03:00
std::cout << "[Step " << step_id << "/" << step_names.size() << "] " << step_names.at(step_id)
<< (additional_info.empty() ? "" : " (" + additional_info + ")") << std::endl;
}
ov::hint::PerformanceMode get_performance_hint(const std::string& device, const ov::Core& core) {
ov::hint::PerformanceMode ov_perf_hint = ov::hint::PerformanceMode::UNDEFINED;
auto supported_properties = core.get_property(device, ov::supported_properties);
if (std::find(supported_properties.begin(), supported_properties.end(), ov::hint::performance_mode) !=
supported_properties.end()) {
if (FLAGS_hint != "") {
if (FLAGS_hint == "throughput" || FLAGS_hint == "tput") {
slog::warn << "Device(" << device << ") performance hint is set to THROUGHPUT" << slog::endl;
ov_perf_hint = ov::hint::PerformanceMode::THROUGHPUT;
} else if (FLAGS_hint == "latency") {
slog::warn << "Device(" << device << ") performance hint is set to LATENCY" << slog::endl;
ov_perf_hint = ov::hint::PerformanceMode::LATENCY;
} else if (FLAGS_hint == "cumulative_throughput" || FLAGS_hint == "ctput") {
slog::warn << "Device(" << device << ") performance hint is set to CUMULATIVE_THROUGHPUT" << slog::endl;
ov_perf_hint = ov::hint::PerformanceMode::CUMULATIVE_THROUGHPUT;
} else if (FLAGS_hint == "none") {
slog::warn << "No device(" << device << ") performance hint is set" << slog::endl;
ov_perf_hint = ov::hint::PerformanceMode::UNDEFINED;
}
} else {
ov_perf_hint =
FLAGS_api == "sync" ? ov::hint::PerformanceMode::LATENCY : ov::hint::PerformanceMode::THROUGHPUT;
slog::warn << "Performance hint was not explicitly specified in command line. "
"Device("
<< device << ") performance hint will be set to " << ov_perf_hint << "." << slog::endl;
}
} else {
if (FLAGS_hint != "") {
slog::warn << "Device(" << device << ") does not support performance hint property(-hint)." << slog::endl;
}
}
return ov_perf_hint;
}
2019-04-12 18:25:53 +03:00
/**
* @brief The entry point of the benchmark application
*/
int main(int argc, char* argv[]) {
2019-10-04 19:26:43 +03:00
std::shared_ptr<StatisticsReport> statistics;
2019-04-12 18:25:53 +03:00
try {
ov::CompiledModel compiledModel;
2020-02-11 22:48:49 +03:00
// ----------------- 1. Parsing and validating input arguments
// -------------------------------------------------
2019-08-09 19:02:42 +03:00
next_step();
2019-04-12 18:25:53 +03:00
if (!parse_and_check_command_line(argc, argv)) {
2019-04-12 18:25:53 +03:00
return 0;
2018-11-23 16:19:43 +03:00
}
2020-02-11 22:48:49 +03:00
bool isNetworkCompiled = fileExt(FLAGS_m) == "blob";
if (isNetworkCompiled) {
slog::info << "Network is compiled" << slog::endl;
}
std::vector<gflags::CommandLineFlagInfo> flags;
StatisticsReport::Parameters command_line_arguments;
gflags::GetAllFlags(&flags);
for (auto& flag : flags) {
if (!flag.is_default) {
command_line_arguments.emplace_back(flag.name, flag.name, flag.current_value);
2019-10-04 19:26:43 +03:00
}
}
if (!FLAGS_report_type.empty()) {
statistics = FLAGS_json_stats ? std::make_shared<StatisticsReportJSON>(
StatisticsReport::Config{FLAGS_report_type, FLAGS_report_folder})
: std::make_shared<StatisticsReport>(
StatisticsReport::Config{FLAGS_report_type, FLAGS_report_folder});
statistics->add_parameters(StatisticsReport::Category::COMMAND_LINE_PARAMETERS, command_line_arguments);
2019-10-04 19:26:43 +03:00
}
auto isFlagSetInCommandLine = [&command_line_arguments](const std::string& name) {
return (std::find_if(command_line_arguments.begin(),
command_line_arguments.end(),
[name](const StatisticsVariant& p) {
return p.json_name == name;
}) != command_line_arguments.end());
};
std::string device_name = FLAGS_d;
// Parse devices
auto devices = parse_devices(device_name);
2019-10-04 19:26:43 +03:00
// Parse nstreams per device
std::map<std::string, std::string> device_nstreams = parse_value_per_device(devices, FLAGS_nstreams);
std::map<std::string, std::string> device_infer_precision =
parse_value_per_device(devices, FLAGS_infer_precision);
// Load device config file if specified
std::map<std::string, ov::AnyMap> config;
if (!FLAGS_load_config.empty()) {
load_config(FLAGS_load_config, config);
}
Dynamic reshapes (#7788) * Merged and compiling * Fix for dynamic shape type * review fixes * renamed blob shape to tensor shape, small improvements * fix code style * added parsing of multiple shapes * store latency per group, add isIdleRequestAvailable() to Infer Queue * added cached random inputs * redesign pipeline, added new metrics(avg, max, min), added metrics per groups * fixed code style * small improvements * modified tensor parameters parsing * modified -i parameter parsing: added possibility to specify input names * implemented image cashing * added cashed blobs creating * added -pcseq flag, modified batch filling, changes fps formula * improvements * code formatting * code formatting2 * apply suggestions from review * replaced Buffer class with InferenceEngine Blobs * use batch size in blobs filling * added shared blob allocator to handle blob's data * fixed warnings & code style * allocate blobs * fix for networks with image info input * added comments & fixed codestyle * clear data in free() in SharedBlobAllocator * remove unnecessary check * Delimeter is changed to :: * stylefix * added layout from string function, small improvements * modified parsing to enable : in input parameters * small fixes * small fixes * added missed blob allocation, fixes * [TEST]added support for remote blobs * fix remote blobs * new inputs/files output format * removed vectors resize which caused bugs * made cl::Buffer type under ifdef, fix inputs filling * changed batch() function to not throwing exceptions * removed unused var * fix code style * replace empty name in input files with name from net input * restored old behaviour for static models * fix code style * fix warning - made const iterator * fix warning - remove reference in loop variable * added random and image_info input types to -i, fix problem with layout * replaced batch() with getBatchSize() in main * fix layout, shape, tensor shape parameters parsing * upd help messages for input, tensor shape and pcseq command * added buffer for cl output blobs, small fixes Signed-off-by: ivikhrev <ivan.vikhrev@intel.com> * added legacy mode * restore setBlob * code style formatting * move collecting latency for groups under flag * removed not applicable layouts * added hint to error message when wrong input name in -tensor_shape was specified * added new metrics to statistics report * Apply suggestions from code review * fix binary blobs filling when layout is CN * apply suggestions * moved file in the right place after rebase * improved -pcseq output * updated args and readme * removed TEMPLATE plugin registration * fix -shape arg decsription * enable providing several -i args as input * renamed legacy_mode to inference_only and made it default for static models, renamed tensor_shape to data_shape * upd readme * use getBlob() in inference only mode * fix old input type for static case * fix typo * upd readme * move log about benchmark mode to the measuring perfomance step * added class for latency metrics * upd readme, fix typos, renamed funcs * fix warning and upd parsing to avoid error with : in file paths * fix error on centos : error: use of deleted function ‘std::basic_stringstream<char>::basic_stringstream(const std::basic_stringstream<char>&) * added check for key in inputs * renamed input to inputs * adjust batch size for binary blobs * replaced warning with exception in bench mode defining * align measurement cycle with master Co-authored-by: ivikhrev <ivan.vikhrev@intel.com>
2021-12-17 12:20:43 +03:00
/** This vector stores paths to the processed images with input names**/
auto inputFiles = parse_input_arguments(gflags::GetArgvs());
2018-11-23 16:19:43 +03:00
DOCS: ported changes from 2022.1 release branch (#11206) * Extensibility guide with FE extensions and remove OV_FRAMEWORK_MAP from docs * Rework of Extensibility Intro, adopted examples to missing OPENVINO_FRAMEWORK_MAP * Removed OPENVINO_FRAMEWORK_MAP reference * Frontend extension detailed documentation * Fixed distributed snippets * Fixed snippet inclusion in FE extension document and chapter headers * Fixed wrong name in a snippet reference * Fixed test for template extension due to changed number of loaded extensions * Update docs/Extensibility_UG/frontend_extensions.md Co-authored-by: Ivan Tikhonov <ivan.tikhonov@intel.com> * Minor fixes in extension snippets * Small grammar fix Co-authored-by: Ivan Tikhonov <ivan.tikhonov@intel.com> Co-authored-by: Ivan Tikhonov <ivan.tikhonov@intel.com> * DOCS: transition banner (#10973) * transition banner * minor fix * update transition banner * updates * update custom.js * updates * updates * Documentation fixes (#11044) * Benchmark app usage * Fixed link to the devices * More fixes * Update docs/OV_Runtime_UG/multi_device.md Co-authored-by: Sergey Lyubimtsev <sergey.lyubimtsev@intel.com> * Removed several hardcoded links Co-authored-by: Sergey Lyubimtsev <sergey.lyubimtsev@intel.com> * Updated documentation for compile_tool (#11049) * Added deployment guide (#11060) * Added deployment guide * Added local distribution * Updates * Fixed more indentations * Removed obsolete code snippets (#11061) * Removed obsolete code snippets * NCC style * Fixed NCC for BA * Add a troubleshooting issue for PRC installation (#11074) * updates * adding gna to linux * add missing reference * update * Update docs/install_guides/installing-model-dev-tools.md Co-authored-by: Sergey Lyubimtsev <sergey.lyubimtsev@intel.com> * Update docs/install_guides/installing-model-dev-tools.md Co-authored-by: Sergey Lyubimtsev <sergey.lyubimtsev@intel.com> * Update docs/install_guides/installing-model-dev-tools.md Co-authored-by: Sergey Lyubimtsev <sergey.lyubimtsev@intel.com> * Update docs/install_guides/installing-model-dev-tools.md Co-authored-by: Sergey Lyubimtsev <sergey.lyubimtsev@intel.com> * Update docs/install_guides/installing-model-dev-tools.md Co-authored-by: Sergey Lyubimtsev <sergey.lyubimtsev@intel.com> * update * minor updates * add gna item to yum and apt * add gna to get started page * update reference formatting * merge commit * add a troubleshooting issue * update * update * fix CVS-71846 Co-authored-by: Sergey Lyubimtsev <sergey.lyubimtsev@intel.com> * DOCS: fixed hardcoded links (#11100) * Fixes * Use links * applying reviewers comments to the Opt Guide (#11093) * applying reviewrs comments * fixed refs, more structuring (bold, bullets, etc) * refactoring tput/latency sections * next iteration (mostly latency), also brushed the auto-batching and other sections * updates sync/async images * common opts brushed * WIP tput redesigned * minor brushing of common and auto-batching * Tput fully refactored * fixed doc name in the link * moved int8 perf counters to the right section * fixed links * fixed broken quotes * fixed more links * add ref to the internals to the TOC * Added a note on the batch size Co-authored-by: Andrey Zaytsev <andrey.zaytsev@intel.com> * [80085] New images for docs (#11114) * change doc structure * fix manager tools * fix manager tools 3 step * fix manager tools 3 step * new img * new img for OV Runtime * fix steps * steps * fix intendents * change list * fix space * fix space * code snippets fix * change display * Benchmarks 2022 1 (#11130) * Minor fixes * Updates for 2022.1 * Edits according to the review * Edits according to review comments * Edits according to review comments * Edits according to review comments * Fixed table * Edits according to review comments * Removed config for Intel® Core™ i7-11850HE * Removed forward-tacotron-duration-prediction-241 graph * Added resnet-18-pytorch * Add info about Docker images in Deployment guide (#11136) * Renamed user guides (#11137) * fix screenshot (#11140) * More conservative recommendations on dynamic shapes usage in docs (#11161) * More conservative recommendations about using dynamic shapes * Duplicated statement from C++ part to Python part of reshape doc (no semantical changes) * Update ShapeInference.md (#11168) * Benchmarks 2022 1 updates (#11180) * Updated graphs * Quick fix for TODO in Dynamic Shapes article * Anchor link fixes * Fixed DM config (#11199) * DOCS: doxy sphinxtabs (#11027) * initial implementation of doxy sphinxtabs * fixes * fixes * fixes * fixes * fixes * WA for ignored visibility attribute * Fixes Co-authored-by: Sergey Lyalin <sergey.lyalin@intel.com> Co-authored-by: Ivan Tikhonov <ivan.tikhonov@intel.com> Co-authored-by: Nikolay Tyukaev <nikolay.tyukaev@intel.com> Co-authored-by: Sergey Lyubimtsev <sergey.lyubimtsev@intel.com> Co-authored-by: Yuan Xu <yuan1.xu@intel.com> Co-authored-by: Maxim Shevtsov <maxim.y.shevtsov@intel.com> Co-authored-by: Andrey Zaytsev <andrey.zaytsev@intel.com> Co-authored-by: Tatiana Savina <tatiana.savina@intel.com> Co-authored-by: Ilya Naumov <ilya.naumov@intel.com> Co-authored-by: Evgenya Stepyreva <evgenya.stepyreva@intel.com>
2022-03-24 22:27:29 +03:00
// ----------------- 2. Loading the OpenVINO Runtime
// -----------------------------------------------------------
2019-08-09 19:02:42 +03:00
next_step();
2018-11-23 16:19:43 +03:00
ov::Core core;
Dynamic reshapes (#7788) * Merged and compiling * Fix for dynamic shape type * review fixes * renamed blob shape to tensor shape, small improvements * fix code style * added parsing of multiple shapes * store latency per group, add isIdleRequestAvailable() to Infer Queue * added cached random inputs * redesign pipeline, added new metrics(avg, max, min), added metrics per groups * fixed code style * small improvements * modified tensor parameters parsing * modified -i parameter parsing: added possibility to specify input names * implemented image cashing * added cashed blobs creating * added -pcseq flag, modified batch filling, changes fps formula * improvements * code formatting * code formatting2 * apply suggestions from review * replaced Buffer class with InferenceEngine Blobs * use batch size in blobs filling * added shared blob allocator to handle blob's data * fixed warnings & code style * allocate blobs * fix for networks with image info input * added comments & fixed codestyle * clear data in free() in SharedBlobAllocator * remove unnecessary check * Delimeter is changed to :: * stylefix * added layout from string function, small improvements * modified parsing to enable : in input parameters * small fixes * small fixes * added missed blob allocation, fixes * [TEST]added support for remote blobs * fix remote blobs * new inputs/files output format * removed vectors resize which caused bugs * made cl::Buffer type under ifdef, fix inputs filling * changed batch() function to not throwing exceptions * removed unused var * fix code style * replace empty name in input files with name from net input * restored old behaviour for static models * fix code style * fix warning - made const iterator * fix warning - remove reference in loop variable * added random and image_info input types to -i, fix problem with layout * replaced batch() with getBatchSize() in main * fix layout, shape, tensor shape parameters parsing * upd help messages for input, tensor shape and pcseq command * added buffer for cl output blobs, small fixes Signed-off-by: ivikhrev <ivan.vikhrev@intel.com> * added legacy mode * restore setBlob * code style formatting * move collecting latency for groups under flag * removed not applicable layouts * added hint to error message when wrong input name in -tensor_shape was specified * added new metrics to statistics report * Apply suggestions from code review * fix binary blobs filling when layout is CN * apply suggestions * moved file in the right place after rebase * improved -pcseq output * updated args and readme * removed TEMPLATE plugin registration * fix -shape arg decsription * enable providing several -i args as input * renamed legacy_mode to inference_only and made it default for static models, renamed tensor_shape to data_shape * upd readme * use getBlob() in inference only mode * fix old input type for static case * fix typo * upd readme * move log about benchmark mode to the measuring perfomance step * added class for latency metrics * upd readme, fix typos, renamed funcs * fix warning and upd parsing to avoid error with : in file paths * fix error on centos : error: use of deleted function ‘std::basic_stringstream<char>::basic_stringstream(const std::basic_stringstream<char>&) * added check for key in inputs * renamed input to inputs * adjust batch size for binary blobs * replaced warning with exception in bench mode defining * align measurement cycle with master Co-authored-by: ivikhrev <ivan.vikhrev@intel.com>
2021-12-17 12:20:43 +03:00
if (!FLAGS_extensions.empty()) {
// Extensions are loaded as a shared library
core.add_extension(FLAGS_extensions);
slog::info << "Extensions are loaded: " << FLAGS_extensions << slog::endl;
}
// Load clDNN Extensions
if ((FLAGS_d.find("GPU") != std::string::npos) && !FLAGS_c.empty()) {
// Override config if command line parameter is specified
if (!config.count("GPU"))
config["GPU"] = {};
config["GPU"][CONFIG_KEY(CONFIG_FILE)] = FLAGS_c;
}
if (config.count("GPU") && config.at("GPU").count(CONFIG_KEY(CONFIG_FILE))) {
auto ext = config.at("GPU").at(CONFIG_KEY(CONFIG_FILE)).as<std::string>();
core.set_property("GPU", {{CONFIG_KEY(CONFIG_FILE), ext}});
slog::info << "GPU extensions are loaded: " << ext << slog::endl;
2018-11-23 16:19:43 +03:00
}
slog::info << "OpenVINO: " << ov::get_openvino_version() << slog::endl;
2019-08-09 19:02:42 +03:00
slog::info << "Device info: " << slog::endl;
slog::info << core.get_versions(device_name) << slog::endl;
2018-11-23 16:19:43 +03:00
// ----------------- 3. Setting device configuration
// -----------------------------------------------------------
2019-08-09 19:02:42 +03:00
next_step();
auto getDeviceTypeFromName = [](std::string device) -> std::string {
return device.substr(0, device.find_first_of(".("));
};
// Set default values from dumped config
std::set<std::string> default_devices;
for (auto& device : devices) {
auto default_config = config.find(getDeviceTypeFromName(device));
if (default_config != config.end()) {
if (!config.count(device)) {
config[device] = default_config->second;
default_devices.emplace(default_config->first);
}
}
}
for (auto& device : default_devices) {
config.erase(device);
}
bool perf_counts = false;
// check if using the virtual device
auto if_auto = std::find(devices.begin(), devices.end(), "AUTO") != devices.end();
auto if_multi = std::find(devices.begin(), devices.end(), "MULTI") != devices.end();
// Remove the hardware devices if AUTO/MULTI appears in the devices list.
if (if_auto || if_multi) {
devices.clear();
std::string virtual_device;
if (if_auto) {
virtual_device = "AUTO";
devices.push_back("AUTO");
}
if (if_multi) {
virtual_device = "MULTI";
devices.push_back("MULTI");
}
parse_value_for_virtual_device(virtual_device, device_nstreams);
parse_value_for_virtual_device(virtual_device, device_infer_precision);
}
// Update config per device according to command line parameters
for (auto& device : devices) {
auto& device_config = config[device];
OV Performance Hints (CPU and GPU logic for selecting the actual configs), while AUTO/MULTI are passing them thru) (#6993) * rebasing the perf-modes-2021.3 to the 2021.4 Caveats: the (explicit) setting #streams is not disabled (as it was before for experiments with DLBenchmark), and the logic slighlty differ (streamsSet) (cherry picked from commit 1ae1edc0ed70fdea40f528fdaf8d00a9904d2a5c) * overriding streams (to force the TPUT mode to the DLBenchnark) (cherry picked from commit 7f506cda31abf35ac293d0dce32f602a0188c619) * disabling reducing #streams to fully mimic baseline c4df94d42d90a2bc3cd91d3d6844ba42f29bca7f of the 2021.3 (before experiments) (cherry picked from commit 85073dd1dd2c7d43a89c37c8f646313f6ddfc650) * clang/identation (cherry picked from commit 050a4155a923cee294c8689d685b39247b7a172a) * splitting the Transformation to general and CPU specific. Now hopefully,this fully mimics the baseline c4df94d42d90a2bc3cd91d3d6844ba42f29bca7f of the 2021.3 (before experiments), as the streams reduce num (as well as early exit on GRU/LSTM/TensorIterator) is deisabled (cherry picked from commit e98b2c1a67f2542a686543594b75b575ef515196) * disabling GRU/LSTM/TI + reducing of streams + 5D considered compute-limited only for int8 (cherry picked from commit 32b8d80dee18685ebf3d069bb4cd2172af7363b7) * refactored to avoid compute_limited_ratio, reverted the reducing #streams, removed LSTM from limitations (cherry picked from commit f2b972171b29cf599aae2407ceec2e6adb67e4e9) * isa-based threshold logic (cherry picked from commit b218457e1a93fcb3374eb9da948fdad2175ec33a) * mode->hint (cherry picked from commit ec20aa8ecaf3222f2a6fdfe9153cf6c9dfdd6a54) * optional PERFORMANCE_HINT_NUM_REQUESTS (cherry picked from commit 5a3883e3f36e7928c6391094ae10711c8e4c3b5c) * moving the perfHints to the common OV config class + initial tests (CPU only, as the actual AUTO/MULTI should be accommodated on the master) (cherry picked from commit (then fixed)45bafe7d527f466507dea0693aeed51be4ebf776) * AUTO support for PerfHints * MULTI support for PerfHints * Enabling Perf hints for the GPU plugin * brushing settings output a bit * disabling "throughput" perf hint being default (until OV 2.0) * uncommenting the logic which was disabled to force the DLBenchmark to use the throughput mode by default * removing dead and experimental code, and debug printfs * clang/code-style * code-review remarks * Moved the output of the actual params that the hint produced to the right place * aligning MULTI's GetConfig beh to HETERO's as captured in the preso (CVS-59960) ratified with the ArchForum * clang * benchmark_app brushing * Update inference-engine/samples/benchmark_app/README.md * propagating the perf hints thru one more scenario in the merged AUTO-MULTI * fixed mispint * Python benchmark_app update for perf hints * addresssing reviewers comments on the python benchmark_app * simplifying/brushing logic a bit * refactor the heuristic to the separate file (to be shared with iGPU soon) * refactor conversion of modes to the specific GPU config per feedback from Vladimir
2021-09-13 15:40:36 +03:00
// high-level performance modes
auto ov_perf_hint = get_performance_hint(device, core);
if (ov_perf_hint != ov::hint::PerformanceMode::UNDEFINED) {
device_config.emplace(ov::hint::performance_mode(ov_perf_hint));
OV Performance Hints (CPU and GPU logic for selecting the actual configs), while AUTO/MULTI are passing them thru) (#6993) * rebasing the perf-modes-2021.3 to the 2021.4 Caveats: the (explicit) setting #streams is not disabled (as it was before for experiments with DLBenchmark), and the logic slighlty differ (streamsSet) (cherry picked from commit 1ae1edc0ed70fdea40f528fdaf8d00a9904d2a5c) * overriding streams (to force the TPUT mode to the DLBenchnark) (cherry picked from commit 7f506cda31abf35ac293d0dce32f602a0188c619) * disabling reducing #streams to fully mimic baseline c4df94d42d90a2bc3cd91d3d6844ba42f29bca7f of the 2021.3 (before experiments) (cherry picked from commit 85073dd1dd2c7d43a89c37c8f646313f6ddfc650) * clang/identation (cherry picked from commit 050a4155a923cee294c8689d685b39247b7a172a) * splitting the Transformation to general and CPU specific. Now hopefully,this fully mimics the baseline c4df94d42d90a2bc3cd91d3d6844ba42f29bca7f of the 2021.3 (before experiments), as the streams reduce num (as well as early exit on GRU/LSTM/TensorIterator) is deisabled (cherry picked from commit e98b2c1a67f2542a686543594b75b575ef515196) * disabling GRU/LSTM/TI + reducing of streams + 5D considered compute-limited only for int8 (cherry picked from commit 32b8d80dee18685ebf3d069bb4cd2172af7363b7) * refactored to avoid compute_limited_ratio, reverted the reducing #streams, removed LSTM from limitations (cherry picked from commit f2b972171b29cf599aae2407ceec2e6adb67e4e9) * isa-based threshold logic (cherry picked from commit b218457e1a93fcb3374eb9da948fdad2175ec33a) * mode->hint (cherry picked from commit ec20aa8ecaf3222f2a6fdfe9153cf6c9dfdd6a54) * optional PERFORMANCE_HINT_NUM_REQUESTS (cherry picked from commit 5a3883e3f36e7928c6391094ae10711c8e4c3b5c) * moving the perfHints to the common OV config class + initial tests (CPU only, as the actual AUTO/MULTI should be accommodated on the master) (cherry picked from commit (then fixed)45bafe7d527f466507dea0693aeed51be4ebf776) * AUTO support for PerfHints * MULTI support for PerfHints * Enabling Perf hints for the GPU plugin * brushing settings output a bit * disabling "throughput" perf hint being default (until OV 2.0) * uncommenting the logic which was disabled to force the DLBenchmark to use the throughput mode by default * removing dead and experimental code, and debug printfs * clang/code-style * code-review remarks * Moved the output of the actual params that the hint produced to the right place * aligning MULTI's GetConfig beh to HETERO's as captured in the preso (CVS-59960) ratified with the ArchForum * clang * benchmark_app brushing * Update inference-engine/samples/benchmark_app/README.md * propagating the perf hints thru one more scenario in the merged AUTO-MULTI * fixed mispint * Python benchmark_app update for perf hints * addresssing reviewers comments on the python benchmark_app * simplifying/brushing logic a bit * refactor the heuristic to the separate file (to be shared with iGPU soon) * refactor conversion of modes to the specific GPU config per feedback from Vladimir
2021-09-13 15:40:36 +03:00
if (FLAGS_nireq != 0)
device_config.emplace(ov::hint::num_requests(FLAGS_nireq));
OV Performance Hints (CPU and GPU logic for selecting the actual configs), while AUTO/MULTI are passing them thru) (#6993) * rebasing the perf-modes-2021.3 to the 2021.4 Caveats: the (explicit) setting #streams is not disabled (as it was before for experiments with DLBenchmark), and the logic slighlty differ (streamsSet) (cherry picked from commit 1ae1edc0ed70fdea40f528fdaf8d00a9904d2a5c) * overriding streams (to force the TPUT mode to the DLBenchnark) (cherry picked from commit 7f506cda31abf35ac293d0dce32f602a0188c619) * disabling reducing #streams to fully mimic baseline c4df94d42d90a2bc3cd91d3d6844ba42f29bca7f of the 2021.3 (before experiments) (cherry picked from commit 85073dd1dd2c7d43a89c37c8f646313f6ddfc650) * clang/identation (cherry picked from commit 050a4155a923cee294c8689d685b39247b7a172a) * splitting the Transformation to general and CPU specific. Now hopefully,this fully mimics the baseline c4df94d42d90a2bc3cd91d3d6844ba42f29bca7f of the 2021.3 (before experiments), as the streams reduce num (as well as early exit on GRU/LSTM/TensorIterator) is deisabled (cherry picked from commit e98b2c1a67f2542a686543594b75b575ef515196) * disabling GRU/LSTM/TI + reducing of streams + 5D considered compute-limited only for int8 (cherry picked from commit 32b8d80dee18685ebf3d069bb4cd2172af7363b7) * refactored to avoid compute_limited_ratio, reverted the reducing #streams, removed LSTM from limitations (cherry picked from commit f2b972171b29cf599aae2407ceec2e6adb67e4e9) * isa-based threshold logic (cherry picked from commit b218457e1a93fcb3374eb9da948fdad2175ec33a) * mode->hint (cherry picked from commit ec20aa8ecaf3222f2a6fdfe9153cf6c9dfdd6a54) * optional PERFORMANCE_HINT_NUM_REQUESTS (cherry picked from commit 5a3883e3f36e7928c6391094ae10711c8e4c3b5c) * moving the perfHints to the common OV config class + initial tests (CPU only, as the actual AUTO/MULTI should be accommodated on the master) (cherry picked from commit (then fixed)45bafe7d527f466507dea0693aeed51be4ebf776) * AUTO support for PerfHints * MULTI support for PerfHints * Enabling Perf hints for the GPU plugin * brushing settings output a bit * disabling "throughput" perf hint being default (until OV 2.0) * uncommenting the logic which was disabled to force the DLBenchmark to use the throughput mode by default * removing dead and experimental code, and debug printfs * clang/code-style * code-review remarks * Moved the output of the actual params that the hint produced to the right place * aligning MULTI's GetConfig beh to HETERO's as captured in the preso (CVS-59960) ratified with the ArchForum * clang * benchmark_app brushing * Update inference-engine/samples/benchmark_app/README.md * propagating the perf hints thru one more scenario in the merged AUTO-MULTI * fixed mispint * Python benchmark_app update for perf hints * addresssing reviewers comments on the python benchmark_app * simplifying/brushing logic a bit * refactor the heuristic to the separate file (to be shared with iGPU soon) * refactor conversion of modes to the specific GPU config per feedback from Vladimir
2021-09-13 15:40:36 +03:00
}
// Set performance counter
if (isFlagSetInCommandLine("pc")) {
// set to user defined value
device_config.emplace(ov::enable_profiling(FLAGS_pc));
} else if (device_config.count(ov::enable_profiling.name()) &&
(device_config.at(ov::enable_profiling.name()).as<bool>())) {
slog::warn << "Performance counters for " << device
<< " device is turned on. To print results use -pc option." << slog::endl;
} else if (FLAGS_report_type == detailedCntReport || FLAGS_report_type == averageCntReport) {
slog::warn << "Turn on performance counters for " << device << " device since report type is "
<< FLAGS_report_type << "." << slog::endl;
device_config.emplace(ov::enable_profiling(true));
} else if (!FLAGS_exec_graph_path.empty()) {
slog::warn << "Turn on performance counters for " << device << " device due to execution graph dumping."
<< slog::endl;
device_config.emplace(ov::enable_profiling(true));
} else {
// set to default value
device_config.emplace(ov::enable_profiling(FLAGS_pc));
2020-04-13 21:17:23 +03:00
}
perf_counts = (device_config.at(ov::enable_profiling.name()).as<bool>()) ? true : perf_counts;
auto supported_properties = core.get_property(device, ov::supported_properties);
auto supported = [&](const std::string& key) {
return std::find(std::begin(supported_properties), std::end(supported_properties), key) !=
std::end(supported_properties);
};
OV Performance Hints (CPU and GPU logic for selecting the actual configs), while AUTO/MULTI are passing them thru) (#6993) * rebasing the perf-modes-2021.3 to the 2021.4 Caveats: the (explicit) setting #streams is not disabled (as it was before for experiments with DLBenchmark), and the logic slighlty differ (streamsSet) (cherry picked from commit 1ae1edc0ed70fdea40f528fdaf8d00a9904d2a5c) * overriding streams (to force the TPUT mode to the DLBenchnark) (cherry picked from commit 7f506cda31abf35ac293d0dce32f602a0188c619) * disabling reducing #streams to fully mimic baseline c4df94d42d90a2bc3cd91d3d6844ba42f29bca7f of the 2021.3 (before experiments) (cherry picked from commit 85073dd1dd2c7d43a89c37c8f646313f6ddfc650) * clang/identation (cherry picked from commit 050a4155a923cee294c8689d685b39247b7a172a) * splitting the Transformation to general and CPU specific. Now hopefully,this fully mimics the baseline c4df94d42d90a2bc3cd91d3d6844ba42f29bca7f of the 2021.3 (before experiments), as the streams reduce num (as well as early exit on GRU/LSTM/TensorIterator) is deisabled (cherry picked from commit e98b2c1a67f2542a686543594b75b575ef515196) * disabling GRU/LSTM/TI + reducing of streams + 5D considered compute-limited only for int8 (cherry picked from commit 32b8d80dee18685ebf3d069bb4cd2172af7363b7) * refactored to avoid compute_limited_ratio, reverted the reducing #streams, removed LSTM from limitations (cherry picked from commit f2b972171b29cf599aae2407ceec2e6adb67e4e9) * isa-based threshold logic (cherry picked from commit b218457e1a93fcb3374eb9da948fdad2175ec33a) * mode->hint (cherry picked from commit ec20aa8ecaf3222f2a6fdfe9153cf6c9dfdd6a54) * optional PERFORMANCE_HINT_NUM_REQUESTS (cherry picked from commit 5a3883e3f36e7928c6391094ae10711c8e4c3b5c) * moving the perfHints to the common OV config class + initial tests (CPU only, as the actual AUTO/MULTI should be accommodated on the master) (cherry picked from commit (then fixed)45bafe7d527f466507dea0693aeed51be4ebf776) * AUTO support for PerfHints * MULTI support for PerfHints * Enabling Perf hints for the GPU plugin * brushing settings output a bit * disabling "throughput" perf hint being default (until OV 2.0) * uncommenting the logic which was disabled to force the DLBenchmark to use the throughput mode by default * removing dead and experimental code, and debug printfs * clang/code-style * code-review remarks * Moved the output of the actual params that the hint produced to the right place * aligning MULTI's GetConfig beh to HETERO's as captured in the preso (CVS-59960) ratified with the ArchForum * clang * benchmark_app brushing * Update inference-engine/samples/benchmark_app/README.md * propagating the perf hints thru one more scenario in the merged AUTO-MULTI * fixed mispint * Python benchmark_app update for perf hints * addresssing reviewers comments on the python benchmark_app * simplifying/brushing logic a bit * refactor the heuristic to the separate file (to be shared with iGPU soon) * refactor conversion of modes to the specific GPU config per feedback from Vladimir
2021-09-13 15:40:36 +03:00
// the rest are individual per-device settings (overriding the values set with perf modes)
auto setThroughputStreams = [&]() {
std::string key = getDeviceTypeFromName(device) + "_THROUGHPUT_STREAMS";
auto it_device_nstreams = device_nstreams.find(device);
if (it_device_nstreams != device_nstreams.end()) {
// set to user defined value
if (supported(key)) {
device_config[key] = it_device_nstreams->second;
} else if (supported(ov::num_streams.name())) {
// Use API 2.0 key for streams
key = ov::num_streams.name();
device_config[key] = it_device_nstreams->second;
} else if (device == "MULTI" || device == "AUTO") {
// check if the element contains the hardware device property
auto value_vec = split(it_device_nstreams->second, ' ');
if (value_vec.size() == 1) {
key = ov::num_streams.name();
device_config[key] = it_device_nstreams->second;
} else {
// set device nstreams properties in the AUTO/MULTI plugin
std::stringstream strm(it_device_nstreams->second);
std::map<std::string, std::string> devices_property;
ov::util::Read<std::map<std::string, std::string>>{}(strm, devices_property);
for (auto it : devices_property) {
device_config.insert(
ov::device::properties(it.first, ov::num_streams(std::stoi(it.second))));
}
}
} else {
throw std::logic_error("Device " + device + " doesn't support config key '" + key + "' " +
"and '" + ov::num_streams.name() + "'!" +
"Please specify -nstreams for correct devices in format "
"<dev1>:<nstreams1>,<dev2>:<nstreams2>" +
" or via configuration file.");
}
} else if (ov_perf_hint == ov::hint::PerformanceMode::UNDEFINED && !device_config.count(key) &&
(FLAGS_api == "async")) {
slog::warn << "-nstreams default value is determined automatically for " << device
<< " device. "
"Although the automatic selection usually provides a "
"reasonable performance, "
"but it still may be non-optimal for some cases, for more "
"information look at README."
<< slog::endl;
if (std::string::npos == device.find("MYRIAD")) { // MYRIAD sets the default number of
// streams implicitly (without _AUTO)
if (supported(key)) {
device_config[key] = std::string(getDeviceTypeFromName(device) + "_THROUGHPUT_AUTO");
} else if (supported(ov::num_streams.name())) {
// Use API 2.0 key for streams
key = ov::num_streams.name();
device_config[key] = ov::streams::AUTO;
}
}
}
auto it_streams = device_config.find(ov::num_streams.name());
if (it_streams != device_config.end())
device_nstreams[device] = it_streams->second.as<std::string>();
};
2020-04-13 21:17:23 +03:00
auto set_infer_precision = [&] {
auto it_device_infer_precision = device_infer_precision.find(device);
if (it_device_infer_precision != device_infer_precision.end()) {
// set to user defined value
if (!supported(ov::hint::inference_precision.name())) {
throw std::logic_error("Device " + device + " doesn't support config key '" +
ov::hint::inference_precision.name() + "'! " +
"Please specify -infer_precision for correct devices in format "
"<dev1>:<infer_precision1>,<dev2>:<infer_precision2>" +
" or via configuration file.");
}
device_config.emplace(ov::hint::inference_precision(it_device_infer_precision->second));
}
};
auto fix_pin_option = [](const std::string& str) -> std::string {
if (str == "NO")
return "NONE";
else if (str == "YES")
return "CORE";
else
return str;
};
if (supported(ov::inference_num_threads.name()) && isFlagSetInCommandLine("nthreads")) {
device_config.emplace(ov::inference_num_threads(FLAGS_nthreads));
}
if (supported(ov::affinity.name()) && isFlagSetInCommandLine("pin")) {
device_config.emplace(ov::affinity(fix_pin_option(FLAGS_pin)));
}
if (device.find("CPU") != std::string::npos || device.find("GPU") != std::string::npos) {
// CPU supports few special performance-oriented keys
// for CPU and GPU execution, more throughput-oriented execution via streams
setThroughputStreams();
set_infer_precision();
} else if (device.find("MYRIAD") != std::string::npos) {
device_config.emplace(ov::log::level(ov::log::Level::WARNING));
setThroughputStreams();
} else if (device.find("GNA") != std::string::npos) {
set_infer_precision();
} else if (device.find("AUTO") != std::string::npos) {
setThroughputStreams();
set_infer_precision();
device_nstreams.erase(device);
} else if (device.find("MULTI") != std::string::npos) {
setThroughputStreams();
set_infer_precision();
if ((device_name.find("GPU") != std::string::npos) && (device_name.find("CPU") != std::string::npos)) {
slog::warn << "GPU throttling is turned on. Multi-device execution with "
2021-06-14 22:47:59 +09:00
"the CPU + GPU performs best with GPU throttling hint, "
<< "which releases another CPU thread (that is otherwise "
"used by the GPU driver for active polling)."
<< slog::endl;
device_config.insert(ov::device::properties("GPU", {{GPU_CONFIG_KEY(PLUGIN_THROTTLE), 1}}));
// limit threading for CPU portion of inference
if (!isFlagSetInCommandLine("pin")) {
auto it_affinity = device_config.find(ov::affinity.name());
if (it_affinity != device_config.end()) {
slog::warn << "Turn off threads pinning for " << device
<< " device since multi-scenario with GPU device is used." << slog::endl;
it_affinity->second = ov::Affinity::NONE;
}
}
2019-10-04 19:26:43 +03:00
}
device_nstreams.erase(device);
2019-08-09 19:02:42 +03:00
}
2018-11-23 16:19:43 +03:00
}
for (auto&& item : config) {
core.set_property(item.first, item.second);
}
2020-02-11 22:48:49 +03:00
size_t batchSize = FLAGS_b;
ov::element::Type type = ov::element::undefined;
2020-02-11 22:48:49 +03:00
std::string topology_name = "";
Dynamic reshapes (#7788) * Merged and compiling * Fix for dynamic shape type * review fixes * renamed blob shape to tensor shape, small improvements * fix code style * added parsing of multiple shapes * store latency per group, add isIdleRequestAvailable() to Infer Queue * added cached random inputs * redesign pipeline, added new metrics(avg, max, min), added metrics per groups * fixed code style * small improvements * modified tensor parameters parsing * modified -i parameter parsing: added possibility to specify input names * implemented image cashing * added cashed blobs creating * added -pcseq flag, modified batch filling, changes fps formula * improvements * code formatting * code formatting2 * apply suggestions from review * replaced Buffer class with InferenceEngine Blobs * use batch size in blobs filling * added shared blob allocator to handle blob's data * fixed warnings & code style * allocate blobs * fix for networks with image info input * added comments & fixed codestyle * clear data in free() in SharedBlobAllocator * remove unnecessary check * Delimeter is changed to :: * stylefix * added layout from string function, small improvements * modified parsing to enable : in input parameters * small fixes * small fixes * added missed blob allocation, fixes * [TEST]added support for remote blobs * fix remote blobs * new inputs/files output format * removed vectors resize which caused bugs * made cl::Buffer type under ifdef, fix inputs filling * changed batch() function to not throwing exceptions * removed unused var * fix code style * replace empty name in input files with name from net input * restored old behaviour for static models * fix code style * fix warning - made const iterator * fix warning - remove reference in loop variable * added random and image_info input types to -i, fix problem with layout * replaced batch() with getBatchSize() in main * fix layout, shape, tensor shape parameters parsing * upd help messages for input, tensor shape and pcseq command * added buffer for cl output blobs, small fixes Signed-off-by: ivikhrev <ivan.vikhrev@intel.com> * added legacy mode * restore setBlob * code style formatting * move collecting latency for groups under flag * removed not applicable layouts * added hint to error message when wrong input name in -tensor_shape was specified * added new metrics to statistics report * Apply suggestions from code review * fix binary blobs filling when layout is CN * apply suggestions * moved file in the right place after rebase * improved -pcseq output * updated args and readme * removed TEMPLATE plugin registration * fix -shape arg decsription * enable providing several -i args as input * renamed legacy_mode to inference_only and made it default for static models, renamed tensor_shape to data_shape * upd readme * use getBlob() in inference only mode * fix old input type for static case * fix typo * upd readme * move log about benchmark mode to the measuring perfomance step * added class for latency metrics * upd readme, fix typos, renamed funcs * fix warning and upd parsing to avoid error with : in file paths * fix error on centos : error: use of deleted function ‘std::basic_stringstream<char>::basic_stringstream(const std::basic_stringstream<char>&) * added check for key in inputs * renamed input to inputs * adjust batch size for binary blobs * replaced warning with exception in bench mode defining * align measurement cycle with master Co-authored-by: ivikhrev <ivan.vikhrev@intel.com>
2021-12-17 12:20:43 +03:00
std::vector<benchmark_app::InputsInfo> app_inputs_info;
std::string output_name;
[Caching] Add caching options to benchmark app (#4909) * Python API for LoadNetwork by model file name * BenchmarkApp: Add caching and LoadNetworkFromFile support 2 new options are introduced - cache_dir <dir> - enables models caching - load_from_file - use new perform "LoadNetwork" by model file name Using both parameters will achieve maximum performance of read/load network on startup Tests: 1) Run "benchmark_app -h". Help will display 2 new options. After available devices there will be list of devices with cache support 2) ./benchmark_app -d CPU -i <model.xml> -load_from_file Verify that some test steps are skipped (related to ReadNetwork, re-shaping etc) 3) Pre-requisite: support of caching shall be enabled for Template plugin ./benchmark_app -d TEMPLATE -i <model.onnx> -load_from_file -cache_dir someDir Verify that "someDir" is created and generated blob is available Run again, verify that loading works as well (should be faster as it will not load onnx model) 4) Run same test as (3), but without -load_from_file option. Verify that cache is properly created For some devices loadNetwork time shall be improved when cache is available * Removed additional timing prints * Correction from old code * Revert "Removed additional timing prints" Additional change - when .blob is chosen instead of .xml, it takes priority over caching flags * Removed new time printings As discussed, these time measurements like 'total first inference time' will be available in 'timeTests' scripts * Fix clang-format issues
2021-05-17 13:41:15 +03:00
// Takes priority over config from file
if (!FLAGS_cache_dir.empty()) {
core.set_property(ov::cache_dir(FLAGS_cache_dir));
[Caching] Add caching options to benchmark app (#4909) * Python API for LoadNetwork by model file name * BenchmarkApp: Add caching and LoadNetworkFromFile support 2 new options are introduced - cache_dir <dir> - enables models caching - load_from_file - use new perform "LoadNetwork" by model file name Using both parameters will achieve maximum performance of read/load network on startup Tests: 1) Run "benchmark_app -h". Help will display 2 new options. After available devices there will be list of devices with cache support 2) ./benchmark_app -d CPU -i <model.xml> -load_from_file Verify that some test steps are skipped (related to ReadNetwork, re-shaping etc) 3) Pre-requisite: support of caching shall be enabled for Template plugin ./benchmark_app -d TEMPLATE -i <model.onnx> -load_from_file -cache_dir someDir Verify that "someDir" is created and generated blob is available Run again, verify that loading works as well (should be faster as it will not load onnx model) 4) Run same test as (3), but without -load_from_file option. Verify that cache is properly created For some devices loadNetwork time shall be improved when cache is available * Removed additional timing prints * Correction from old code * Revert "Removed additional timing prints" Additional change - when .blob is chosen instead of .xml, it takes priority over caching flags * Removed new time printings As discussed, these time measurements like 'total first inference time' will be available in 'timeTests' scripts * Fix clang-format issues
2021-05-17 13:41:15 +03:00
}
// If set batch size, disable the auto batching
if (FLAGS_b > 0) {
core.set_property(ov::hint::allow_auto_batching(false));
}
Dynamic reshapes (#7788) * Merged and compiling * Fix for dynamic shape type * review fixes * renamed blob shape to tensor shape, small improvements * fix code style * added parsing of multiple shapes * store latency per group, add isIdleRequestAvailable() to Infer Queue * added cached random inputs * redesign pipeline, added new metrics(avg, max, min), added metrics per groups * fixed code style * small improvements * modified tensor parameters parsing * modified -i parameter parsing: added possibility to specify input names * implemented image cashing * added cashed blobs creating * added -pcseq flag, modified batch filling, changes fps formula * improvements * code formatting * code formatting2 * apply suggestions from review * replaced Buffer class with InferenceEngine Blobs * use batch size in blobs filling * added shared blob allocator to handle blob's data * fixed warnings & code style * allocate blobs * fix for networks with image info input * added comments & fixed codestyle * clear data in free() in SharedBlobAllocator * remove unnecessary check * Delimeter is changed to :: * stylefix * added layout from string function, small improvements * modified parsing to enable : in input parameters * small fixes * small fixes * added missed blob allocation, fixes * [TEST]added support for remote blobs * fix remote blobs * new inputs/files output format * removed vectors resize which caused bugs * made cl::Buffer type under ifdef, fix inputs filling * changed batch() function to not throwing exceptions * removed unused var * fix code style * replace empty name in input files with name from net input * restored old behaviour for static models * fix code style * fix warning - made const iterator * fix warning - remove reference in loop variable * added random and image_info input types to -i, fix problem with layout * replaced batch() with getBatchSize() in main * fix layout, shape, tensor shape parameters parsing * upd help messages for input, tensor shape and pcseq command * added buffer for cl output blobs, small fixes Signed-off-by: ivikhrev <ivan.vikhrev@intel.com> * added legacy mode * restore setBlob * code style formatting * move collecting latency for groups under flag * removed not applicable layouts * added hint to error message when wrong input name in -tensor_shape was specified * added new metrics to statistics report * Apply suggestions from code review * fix binary blobs filling when layout is CN * apply suggestions * moved file in the right place after rebase * improved -pcseq output * updated args and readme * removed TEMPLATE plugin registration * fix -shape arg decsription * enable providing several -i args as input * renamed legacy_mode to inference_only and made it default for static models, renamed tensor_shape to data_shape * upd readme * use getBlob() in inference only mode * fix old input type for static case * fix typo * upd readme * move log about benchmark mode to the measuring perfomance step * added class for latency metrics * upd readme, fix typos, renamed funcs * fix warning and upd parsing to avoid error with : in file paths * fix error on centos : error: use of deleted function ‘std::basic_stringstream<char>::basic_stringstream(const std::basic_stringstream<char>&) * added check for key in inputs * renamed input to inputs * adjust batch size for binary blobs * replaced warning with exception in bench mode defining * align measurement cycle with master Co-authored-by: ivikhrev <ivan.vikhrev@intel.com>
2021-12-17 12:20:43 +03:00
bool isDynamicNetwork = false;
[Caching] Add caching options to benchmark app (#4909) * Python API for LoadNetwork by model file name * BenchmarkApp: Add caching and LoadNetworkFromFile support 2 new options are introduced - cache_dir <dir> - enables models caching - load_from_file - use new perform "LoadNetwork" by model file name Using both parameters will achieve maximum performance of read/load network on startup Tests: 1) Run "benchmark_app -h". Help will display 2 new options. After available devices there will be list of devices with cache support 2) ./benchmark_app -d CPU -i <model.xml> -load_from_file Verify that some test steps are skipped (related to ReadNetwork, re-shaping etc) 3) Pre-requisite: support of caching shall be enabled for Template plugin ./benchmark_app -d TEMPLATE -i <model.onnx> -load_from_file -cache_dir someDir Verify that "someDir" is created and generated blob is available Run again, verify that loading works as well (should be faster as it will not load onnx model) 4) Run same test as (3), but without -load_from_file option. Verify that cache is properly created For some devices loadNetwork time shall be improved when cache is available * Removed additional timing prints * Correction from old code * Revert "Removed additional timing prints" Additional change - when .blob is chosen instead of .xml, it takes priority over caching flags * Removed new time printings As discussed, these time measurements like 'total first inference time' will be available in 'timeTests' scripts * Fix clang-format issues
2021-05-17 13:41:15 +03:00
if (FLAGS_load_from_file && !isNetworkCompiled) {
next_step();
slog::info << "Skipping the step for loading network from file" << slog::endl;
next_step();
slog::info << "Skipping the step for loading network from file" << slog::endl;
next_step();
slog::info << "Skipping the step for loading network from file" << slog::endl;
auto startTime = Time::now();
compiledModel = core.compile_model(FLAGS_m, device_name);
auto duration_ms = get_duration_ms_till_now(startTime);
slog::info << "Load network took " << double_to_string(duration_ms) << " ms" << slog::endl;
slog::info << "Original network I/O parameters:" << slog::endl;
printInputAndOutputsInfoShort(compiledModel);
[Caching] Add caching options to benchmark app (#4909) * Python API for LoadNetwork by model file name * BenchmarkApp: Add caching and LoadNetworkFromFile support 2 new options are introduced - cache_dir <dir> - enables models caching - load_from_file - use new perform "LoadNetwork" by model file name Using both parameters will achieve maximum performance of read/load network on startup Tests: 1) Run "benchmark_app -h". Help will display 2 new options. After available devices there will be list of devices with cache support 2) ./benchmark_app -d CPU -i <model.xml> -load_from_file Verify that some test steps are skipped (related to ReadNetwork, re-shaping etc) 3) Pre-requisite: support of caching shall be enabled for Template plugin ./benchmark_app -d TEMPLATE -i <model.onnx> -load_from_file -cache_dir someDir Verify that "someDir" is created and generated blob is available Run again, verify that loading works as well (should be faster as it will not load onnx model) 4) Run same test as (3), but without -load_from_file option. Verify that cache is properly created For some devices loadNetwork time shall be improved when cache is available * Removed additional timing prints * Correction from old code * Revert "Removed additional timing prints" Additional change - when .blob is chosen instead of .xml, it takes priority over caching flags * Removed new time printings As discussed, these time measurements like 'total first inference time' will be available in 'timeTests' scripts * Fix clang-format issues
2021-05-17 13:41:15 +03:00
if (statistics)
statistics->add_parameters(
StatisticsReport::Category::EXECUTION_RESULTS,
{StatisticsVariant("load network time (ms)", "load_network_time", duration_ms)});
convert_io_names_in_map(inputFiles, compiledModel.inputs());
app_inputs_info = get_inputs_info(FLAGS_shape,
FLAGS_layout,
batchSize,
FLAGS_data_shape,
inputFiles,
FLAGS_iscale,
FLAGS_imean,
compiledModel.inputs());
[Caching] Add caching options to benchmark app (#4909) * Python API for LoadNetwork by model file name * BenchmarkApp: Add caching and LoadNetworkFromFile support 2 new options are introduced - cache_dir <dir> - enables models caching - load_from_file - use new perform "LoadNetwork" by model file name Using both parameters will achieve maximum performance of read/load network on startup Tests: 1) Run "benchmark_app -h". Help will display 2 new options. After available devices there will be list of devices with cache support 2) ./benchmark_app -d CPU -i <model.xml> -load_from_file Verify that some test steps are skipped (related to ReadNetwork, re-shaping etc) 3) Pre-requisite: support of caching shall be enabled for Template plugin ./benchmark_app -d TEMPLATE -i <model.onnx> -load_from_file -cache_dir someDir Verify that "someDir" is created and generated blob is available Run again, verify that loading works as well (should be faster as it will not load onnx model) 4) Run same test as (3), but without -load_from_file option. Verify that cache is properly created For some devices loadNetwork time shall be improved when cache is available * Removed additional timing prints * Correction from old code * Revert "Removed additional timing prints" Additional change - when .blob is chosen instead of .xml, it takes priority over caching flags * Removed new time printings As discussed, these time measurements like 'total first inference time' will be available in 'timeTests' scripts * Fix clang-format issues
2021-05-17 13:41:15 +03:00
if (batchSize == 0) {
batchSize = 1;
}
[Caching] Add caching options to benchmark app (#4909) * Python API for LoadNetwork by model file name * BenchmarkApp: Add caching and LoadNetworkFromFile support 2 new options are introduced - cache_dir <dir> - enables models caching - load_from_file - use new perform "LoadNetwork" by model file name Using both parameters will achieve maximum performance of read/load network on startup Tests: 1) Run "benchmark_app -h". Help will display 2 new options. After available devices there will be list of devices with cache support 2) ./benchmark_app -d CPU -i <model.xml> -load_from_file Verify that some test steps are skipped (related to ReadNetwork, re-shaping etc) 3) Pre-requisite: support of caching shall be enabled for Template plugin ./benchmark_app -d TEMPLATE -i <model.onnx> -load_from_file -cache_dir someDir Verify that "someDir" is created and generated blob is available Run again, verify that loading works as well (should be faster as it will not load onnx model) 4) Run same test as (3), but without -load_from_file option. Verify that cache is properly created For some devices loadNetwork time shall be improved when cache is available * Removed additional timing prints * Correction from old code * Revert "Removed additional timing prints" Additional change - when .blob is chosen instead of .xml, it takes priority over caching flags * Removed new time printings As discussed, these time measurements like 'total first inference time' will be available in 'timeTests' scripts * Fix clang-format issues
2021-05-17 13:41:15 +03:00
} else if (!isNetworkCompiled) {
// ----------------- 4. Reading the Intermediate Representation network
// ----------------------------------------
2020-02-11 22:48:49 +03:00
next_step();
slog::info << "Loading network files" << slog::endl;
auto startTime = Time::now();
auto model = core.read_model(FLAGS_m);
auto duration_ms = get_duration_ms_till_now(startTime);
slog::info << "Read network took " << double_to_string(duration_ms) << " ms" << slog::endl;
slog::info << "Original network I/O parameters:" << slog::endl;
printInputAndOutputsInfoShort(*model);
2020-02-11 22:48:49 +03:00
if (statistics)
statistics->add_parameters(
StatisticsReport::Category::EXECUTION_RESULTS,
{StatisticsVariant("read network time (ms)", "read_network_time", duration_ms)});
2020-02-11 22:48:49 +03:00
const auto& inputInfo = std::const_pointer_cast<const ov::Model>(model)->inputs();
2020-02-11 22:48:49 +03:00
if (inputInfo.empty()) {
throw std::logic_error("no inputs info is provided");
}
// ----------------- 5. Resizing network to match image sizes and given
// batch ----------------------------------
[CPU] OneDNN 2.6 migration (#11627) * Migrate on OneDNN 2.7 * [CPU] Enabled brconv implementation * Post ops optimizations * [CPU] Enabled I8 precision on activations for Convolution node * [CPU][WA] Disabled Deconvolution + post ops fusing optimization * Fixed FQ post op optimization * [CPU] Optimize post ops processing * [WA] Add node name if tensor names are empty * [WA] remove layout compatibility chheck that leads to the fase-positive exceptions * [CPU] Optimize processing for FQ + Sum + FQ post ops pattern * [CPU][WA] Enabled ReduceSum -> AvgPool transformation due to perf issues * fix compiler error * rebase onednn master * cherry pick from 2.7 to 2.6 * [WA] make cpu case to run completed * fix xmm zero check * reopen 'FuseDeconvolutionAndSimpleOperation' Transform to fix CPU 'ConvolutionBackpropDataLayerTest' fail issue * [WR] Removed failed the ReduceMean tests caused by 21f3555. * group deconv may crash on memory out of bound * [WA] Remove the moc fail case by #af4731a1 * testcase conv maxpool will check brgconv instead of jit * test subgraph added nhwc format check * fix gemm bf16 win crash * fix avx2 groupconv accuracy problem * [WA] remove invalid FQ tests * WR to disable the LPT multiplyToGroupConv test because the transformation was disabled in d5e16f * add gemm int8 binary postops to fix GroupConvolutionQDqTransformation fail * add gemm int8 binary postops to fix GroupConvolutionQDqTransformation fail * fix gemm bf16 fail * Fix ConcatConvSumInPlaceTest * Add cpuDebugFuncTests target * [WA] bf16 crash due to MemoryInput/Output * OVClassBasicTest case typo * testcase subgraph sets default ENFORCE_BF16 to NO * fix clang check * Fix primType check issue * Fix cpplint error * MemoryInput/Output support bf16; Enforce bf16 'NO' should enable snipptes * disable BF16 fusing fakequant testcase * testcase init support amx check * testcase for conv brgconv avx512/amx * testcase for conv brgconv avx512/amx * WR enforce reorder bug and add NSPC into deconv supported list. * Compiling issue fix. * [WA] skip fakequantize fusing in bf16 * mix legacy/new binary postops * make nightly case run. tested on amx/avx512/avx2. * [CPU] Add BF16 AMX test for Matmul * Add CPU dump check tool * Add verbose log * Generate exec graph in cpu dump check tool * fix binary prelu post Ops * fix cpplint * Update ONEDNN version to fix AVX2 bug. * cpu dump check supports compare dump files * Add a new CPU_DEBUG_CAPS: OV_CPU_SUMMARY_PERF * change VERBOSE_LOG to DEBUG_LOG * fix oneDNN register_jit_code log * fix cpplint * Add OV_CPU_DEBUG_LOG controls debug logs to show * Revert reorder WR. * Enhanced CPU debug logs and breakpoint support * Enhanced cpu_dump_check with --ports * Fix DEBUG_LOG compile issue * GroupDeconvolutionLayerCPUTest extend to add amx test cases * Add Node into DBUEG_LOG * cpu_dump_check: Dump results even no port is specified * FIx MergeTransposeAndReorder for blocked input * Fix cpu_dump_check result names * Enhance DEBUG_LOG on edges * Cpu dump check support shape mismatch * Fix bi-directionl inplace * Cpu dump check support inference_precion_hing f32. * fix windows dump fail. * fix depthwise nwc conv * add rtol arg * win debugbreak * fix pooling accuracy * GroupDeconvolutionLayerCPUTest remove invalid test param for nspc * recover ov onednn fork * revert af4731a1f1e085f959d2612b656b50f75c0fbc98 '[WA] remove layout compatibility chheck' * [WA] disable avx2 conv3d fusing case * [WA] disable avx2 conv3d fusing case * [WA] Disabled weights md transpose in FC to prevent perf degradations Co-authored-by: dmitrygo <dmitry.gorokhov@intel.com> Co-authored-by: Vladislav Golubev <vladislav.golubev@intel.com> Co-authored-by: Zhang Yi3 <yi3.zhang@intel.com> Co-authored-by: liubo-intel <bo4.liu@intel.com> Co-authored-by: Luwei Zhou <luwei.zhou@intel.com> Co-authored-by: Li, Tingqian <tingqian.li@intel.com> Co-authored-by: xuchen-intel <chen.xu@intel.com> Co-authored-by: ceciliapeng2011 <cecilia.peng@intel.com>
2022-06-06 18:30:32 +08:00
for (auto& item : model->inputs()) {
if (item.get_tensor().get_names().empty()) {
item.get_tensor_ptr()->set_names(
std::unordered_set<std::string>{item.get_node_shared_ptr()->get_name()});
}
}
2020-02-11 22:48:49 +03:00
next_step();
convert_io_names_in_map(inputFiles, std::const_pointer_cast<const ov::Model>(model)->inputs());
// Parse input shapes if specified
bool reshape = false;
app_inputs_info = get_inputs_info(FLAGS_shape,
FLAGS_layout,
FLAGS_b,
FLAGS_data_shape,
inputFiles,
FLAGS_iscale,
FLAGS_imean,
inputInfo,
reshape);
if (reshape) {
Dynamic reshapes (#7788) * Merged and compiling * Fix for dynamic shape type * review fixes * renamed blob shape to tensor shape, small improvements * fix code style * added parsing of multiple shapes * store latency per group, add isIdleRequestAvailable() to Infer Queue * added cached random inputs * redesign pipeline, added new metrics(avg, max, min), added metrics per groups * fixed code style * small improvements * modified tensor parameters parsing * modified -i parameter parsing: added possibility to specify input names * implemented image cashing * added cashed blobs creating * added -pcseq flag, modified batch filling, changes fps formula * improvements * code formatting * code formatting2 * apply suggestions from review * replaced Buffer class with InferenceEngine Blobs * use batch size in blobs filling * added shared blob allocator to handle blob's data * fixed warnings & code style * allocate blobs * fix for networks with image info input * added comments & fixed codestyle * clear data in free() in SharedBlobAllocator * remove unnecessary check * Delimeter is changed to :: * stylefix * added layout from string function, small improvements * modified parsing to enable : in input parameters * small fixes * small fixes * added missed blob allocation, fixes * [TEST]added support for remote blobs * fix remote blobs * new inputs/files output format * removed vectors resize which caused bugs * made cl::Buffer type under ifdef, fix inputs filling * changed batch() function to not throwing exceptions * removed unused var * fix code style * replace empty name in input files with name from net input * restored old behaviour for static models * fix code style * fix warning - made const iterator * fix warning - remove reference in loop variable * added random and image_info input types to -i, fix problem with layout * replaced batch() with getBatchSize() in main * fix layout, shape, tensor shape parameters parsing * upd help messages for input, tensor shape and pcseq command * added buffer for cl output blobs, small fixes Signed-off-by: ivikhrev <ivan.vikhrev@intel.com> * added legacy mode * restore setBlob * code style formatting * move collecting latency for groups under flag * removed not applicable layouts * added hint to error message when wrong input name in -tensor_shape was specified * added new metrics to statistics report * Apply suggestions from code review * fix binary blobs filling when layout is CN * apply suggestions * moved file in the right place after rebase * improved -pcseq output * updated args and readme * removed TEMPLATE plugin registration * fix -shape arg decsription * enable providing several -i args as input * renamed legacy_mode to inference_only and made it default for static models, renamed tensor_shape to data_shape * upd readme * use getBlob() in inference only mode * fix old input type for static case * fix typo * upd readme * move log about benchmark mode to the measuring perfomance step * added class for latency metrics * upd readme, fix typos, renamed funcs * fix warning and upd parsing to avoid error with : in file paths * fix error on centos : error: use of deleted function ‘std::basic_stringstream<char>::basic_stringstream(const std::basic_stringstream<char>&) * added check for key in inputs * renamed input to inputs * adjust batch size for binary blobs * replaced warning with exception in bench mode defining * align measurement cycle with master Co-authored-by: ivikhrev <ivan.vikhrev@intel.com>
2021-12-17 12:20:43 +03:00
benchmark_app::PartialShapes shapes = {};
for (auto& item : app_inputs_info[0])
shapes[item.first] = item.second.partialShape;
slog::info << "Reshaping network: " << get_shapes_string(shapes) << slog::endl;
startTime = Time::now();
model->reshape(shapes);
duration_ms = get_duration_ms_till_now(startTime);
slog::info << "Reshape network took " << double_to_string(duration_ms) << " ms" << slog::endl;
if (statistics)
statistics->add_parameters(
StatisticsReport::Category::EXECUTION_RESULTS,
{StatisticsVariant("reshape network time (ms)", "reshape_network_time", duration_ms)});
2020-02-11 22:48:49 +03:00
}
// ----------------- 6. Configuring inputs and outputs
// ----------------------------------------------------------------------
next_step();
auto preproc = ov::preprocess::PrePostProcessor(model);
std::map<std::string, std::string> user_precisions_map;
if (!FLAGS_iop.empty()) {
user_precisions_map = parseArgMap(FLAGS_iop);
convert_io_names_in_map(user_precisions_map,
std::const_pointer_cast<const ov::Model>(model)->inputs(),
std::const_pointer_cast<const ov::Model>(model)->outputs());
}
const auto input_precision = FLAGS_ip.empty() ? ov::element::undefined : getPrecision2(FLAGS_ip);
const auto output_precision = FLAGS_op.empty() ? ov::element::undefined : getPrecision2(FLAGS_op);
const auto& inputs = model->inputs();
for (int i = 0; i < inputs.size(); i++) {
const auto& item = inputs[i];
auto iop_precision = ov::element::undefined;
auto type_to_set = ov::element::undefined;
std::string name;
try {
// Some tensors might have no names, get_any_name will throw exception in that case.
// -iop option will not work for those tensors.
name = item.get_any_name();
iop_precision = getPrecision2(user_precisions_map.at(item.get_any_name()));
} catch (...) {
}
if (iop_precision != ov::element::undefined) {
type_to_set = iop_precision;
} else if (input_precision != ov::element::undefined) {
type_to_set = input_precision;
} else if (!name.empty() && app_inputs_info[0].at(name).is_image()) {
// image input, set U8
type_to_set = ov::element::u8;
}
auto& in = preproc.input(item.get_any_name());
if (type_to_set != ov::element::undefined) {
in.tensor().set_element_type(type_to_set);
if (!name.empty()) {
for (auto& info : app_inputs_info) {
info.at(name).type = type_to_set;
}
}
}
// Explicitly set inputs layout.
if (!name.empty() && !app_inputs_info[0].at(name).layout.empty()) {
in.model().set_layout(app_inputs_info[0].at(name).layout);
}
}
const auto& outs = model->outputs();
for (int i = 0; i < outs.size(); i++) {
const auto& item = outs[i];
auto iop_precision = ov::element::undefined;
try {
// Some tensors might have no names, get_any_name will throw exception in that case.
// -iop option will not work for those tensors.
iop_precision = getPrecision2(user_precisions_map.at(item.get_any_name()));
} catch (...) {
}
if (iop_precision != ov::element::undefined) {
preproc.output(i).tensor().set_element_type(iop_precision);
} else if (output_precision != ov::element::undefined) {
preproc.output(i).tensor().set_element_type(output_precision);
}
}
model = preproc.build();
Dynamic reshapes (#7788) * Merged and compiling * Fix for dynamic shape type * review fixes * renamed blob shape to tensor shape, small improvements * fix code style * added parsing of multiple shapes * store latency per group, add isIdleRequestAvailable() to Infer Queue * added cached random inputs * redesign pipeline, added new metrics(avg, max, min), added metrics per groups * fixed code style * small improvements * modified tensor parameters parsing * modified -i parameter parsing: added possibility to specify input names * implemented image cashing * added cashed blobs creating * added -pcseq flag, modified batch filling, changes fps formula * improvements * code formatting * code formatting2 * apply suggestions from review * replaced Buffer class with InferenceEngine Blobs * use batch size in blobs filling * added shared blob allocator to handle blob's data * fixed warnings & code style * allocate blobs * fix for networks with image info input * added comments & fixed codestyle * clear data in free() in SharedBlobAllocator * remove unnecessary check * Delimeter is changed to :: * stylefix * added layout from string function, small improvements * modified parsing to enable : in input parameters * small fixes * small fixes * added missed blob allocation, fixes * [TEST]added support for remote blobs * fix remote blobs * new inputs/files output format * removed vectors resize which caused bugs * made cl::Buffer type under ifdef, fix inputs filling * changed batch() function to not throwing exceptions * removed unused var * fix code style * replace empty name in input files with name from net input * restored old behaviour for static models * fix code style * fix warning - made const iterator * fix warning - remove reference in loop variable * added random and image_info input types to -i, fix problem with layout * replaced batch() with getBatchSize() in main * fix layout, shape, tensor shape parameters parsing * upd help messages for input, tensor shape and pcseq command * added buffer for cl output blobs, small fixes Signed-off-by: ivikhrev <ivan.vikhrev@intel.com> * added legacy mode * restore setBlob * code style formatting * move collecting latency for groups under flag * removed not applicable layouts * added hint to error message when wrong input name in -tensor_shape was specified * added new metrics to statistics report * Apply suggestions from code review * fix binary blobs filling when layout is CN * apply suggestions * moved file in the right place after rebase * improved -pcseq output * updated args and readme * removed TEMPLATE plugin registration * fix -shape arg decsription * enable providing several -i args as input * renamed legacy_mode to inference_only and made it default for static models, renamed tensor_shape to data_shape * upd readme * use getBlob() in inference only mode * fix old input type for static case * fix typo * upd readme * move log about benchmark mode to the measuring perfomance step * added class for latency metrics * upd readme, fix typos, renamed funcs * fix warning and upd parsing to avoid error with : in file paths * fix error on centos : error: use of deleted function ‘std::basic_stringstream<char>::basic_stringstream(const std::basic_stringstream<char>&) * added check for key in inputs * renamed input to inputs * adjust batch size for binary blobs * replaced warning with exception in bench mode defining * align measurement cycle with master Co-authored-by: ivikhrev <ivan.vikhrev@intel.com>
2021-12-17 12:20:43 +03:00
// Check if network has dynamic shapes
auto input_info = app_inputs_info[0];
isDynamicNetwork = std::any_of(input_info.begin(),
input_info.end(),
[](const std::pair<std::string, benchmark_app::InputInfo>& i) {
return i.second.partialShape.is_dynamic();
});
topology_name = model->get_friendly_name();
// Calculate batch size according to provided layout and shapes (static case)
if (!isDynamicNetwork && app_inputs_info.size()) {
batchSize = get_batch_size(app_inputs_info.front());
2020-02-11 22:48:49 +03:00
slog::info << "Network batch size: " << batchSize << slog::endl;
} else if (batchSize == 0) {
batchSize = 1;
2020-02-11 22:48:49 +03:00
}
printInputAndOutputsInfoShort(*model);
// ----------------- 7. Loading the model to the device
// --------------------------------------------------------
2020-02-11 22:48:49 +03:00
next_step();
startTime = Time::now();
compiledModel = core.compile_model(model, device_name);
duration_ms = get_duration_ms_till_now(startTime);
slog::info << "Load network took " << double_to_string(duration_ms) << " ms" << slog::endl;
2020-02-11 22:48:49 +03:00
if (statistics)
statistics->add_parameters(
StatisticsReport::Category::EXECUTION_RESULTS,
{StatisticsVariant("load network time (ms)", "load_network_time", duration_ms)});
2020-02-11 22:48:49 +03:00
} else {
next_step();
slog::info << "Skipping the step for compiled network" << slog::endl;
next_step();
slog::info << "Skipping the step for compiled network" << slog::endl;
next_step();
slog::info << "Skipping the step for compiled network" << slog::endl;
// ----------------- 7. Loading the model to the device
// --------------------------------------------------------
2020-02-11 22:48:49 +03:00
next_step();
auto startTime = Time::now();
std::ifstream modelStream(FLAGS_m, std::ios_base::binary | std::ios_base::in);
if (!modelStream.is_open()) {
throw std::runtime_error("Cannot open model file " + FLAGS_m);
}
compiledModel = core.import_model(modelStream, device_name, {});
modelStream.close();
auto duration_ms = get_duration_ms_till_now(startTime);
slog::info << "Import network took " << double_to_string(duration_ms) << " ms" << slog::endl;
slog::info << "Original network I/O paramteters:" << slog::endl;
printInputAndOutputsInfoShort(compiledModel);
2020-02-11 22:48:49 +03:00
if (statistics)
statistics->add_parameters(
StatisticsReport::Category::EXECUTION_RESULTS,
{StatisticsVariant("import network time (ms)", "import_network_time", duration_ms)});
convert_io_names_in_map(inputFiles, compiledModel.inputs());
app_inputs_info = get_inputs_info(FLAGS_shape,
FLAGS_layout,
FLAGS_b,
FLAGS_data_shape,
inputFiles,
FLAGS_iscale,
FLAGS_imean,
compiledModel.inputs());
2020-02-11 22:48:49 +03:00
if (batchSize == 0) {
batchSize = 1;
}
}
Dynamic reshapes (#7788) * Merged and compiling * Fix for dynamic shape type * review fixes * renamed blob shape to tensor shape, small improvements * fix code style * added parsing of multiple shapes * store latency per group, add isIdleRequestAvailable() to Infer Queue * added cached random inputs * redesign pipeline, added new metrics(avg, max, min), added metrics per groups * fixed code style * small improvements * modified tensor parameters parsing * modified -i parameter parsing: added possibility to specify input names * implemented image cashing * added cashed blobs creating * added -pcseq flag, modified batch filling, changes fps formula * improvements * code formatting * code formatting2 * apply suggestions from review * replaced Buffer class with InferenceEngine Blobs * use batch size in blobs filling * added shared blob allocator to handle blob's data * fixed warnings & code style * allocate blobs * fix for networks with image info input * added comments & fixed codestyle * clear data in free() in SharedBlobAllocator * remove unnecessary check * Delimeter is changed to :: * stylefix * added layout from string function, small improvements * modified parsing to enable : in input parameters * small fixes * small fixes * added missed blob allocation, fixes * [TEST]added support for remote blobs * fix remote blobs * new inputs/files output format * removed vectors resize which caused bugs * made cl::Buffer type under ifdef, fix inputs filling * changed batch() function to not throwing exceptions * removed unused var * fix code style * replace empty name in input files with name from net input * restored old behaviour for static models * fix code style * fix warning - made const iterator * fix warning - remove reference in loop variable * added random and image_info input types to -i, fix problem with layout * replaced batch() with getBatchSize() in main * fix layout, shape, tensor shape parameters parsing * upd help messages for input, tensor shape and pcseq command * added buffer for cl output blobs, small fixes Signed-off-by: ivikhrev <ivan.vikhrev@intel.com> * added legacy mode * restore setBlob * code style formatting * move collecting latency for groups under flag * removed not applicable layouts * added hint to error message when wrong input name in -tensor_shape was specified * added new metrics to statistics report * Apply suggestions from code review * fix binary blobs filling when layout is CN * apply suggestions * moved file in the right place after rebase * improved -pcseq output * updated args and readme * removed TEMPLATE plugin registration * fix -shape arg decsription * enable providing several -i args as input * renamed legacy_mode to inference_only and made it default for static models, renamed tensor_shape to data_shape * upd readme * use getBlob() in inference only mode * fix old input type for static case * fix typo * upd readme * move log about benchmark mode to the measuring perfomance step * added class for latency metrics * upd readme, fix typos, renamed funcs * fix warning and upd parsing to avoid error with : in file paths * fix error on centos : error: use of deleted function ‘std::basic_stringstream<char>::basic_stringstream(const std::basic_stringstream<char>&) * added check for key in inputs * renamed input to inputs * adjust batch size for binary blobs * replaced warning with exception in bench mode defining * align measurement cycle with master Co-authored-by: ivikhrev <ivan.vikhrev@intel.com>
2021-12-17 12:20:43 +03:00
if (isDynamicNetwork && FLAGS_api == "sync") {
throw std::logic_error("Benchmarking of the model with dynamic shapes is available for async API only."
"Please use -api async -nstreams 1 -nireq 1 to emulate sync behavior");
}
// Defining of benchmark mode
// for static models inference only mode is used as default one
bool inferenceOnly = FLAGS_inference_only;
if (isDynamicNetwork) {
if (isFlagSetInCommandLine("inference_only") && inferenceOnly && app_inputs_info.size() != 1) {
throw std::logic_error(
"Dynamic models with different input data shapes must be benchmarked only in full mode.");
}
inferenceOnly = isFlagSetInCommandLine("inference_only") && inferenceOnly && app_inputs_info.size() == 1;
}
OV Performance Hints (CPU and GPU logic for selecting the actual configs), while AUTO/MULTI are passing them thru) (#6993) * rebasing the perf-modes-2021.3 to the 2021.4 Caveats: the (explicit) setting #streams is not disabled (as it was before for experiments with DLBenchmark), and the logic slighlty differ (streamsSet) (cherry picked from commit 1ae1edc0ed70fdea40f528fdaf8d00a9904d2a5c) * overriding streams (to force the TPUT mode to the DLBenchnark) (cherry picked from commit 7f506cda31abf35ac293d0dce32f602a0188c619) * disabling reducing #streams to fully mimic baseline c4df94d42d90a2bc3cd91d3d6844ba42f29bca7f of the 2021.3 (before experiments) (cherry picked from commit 85073dd1dd2c7d43a89c37c8f646313f6ddfc650) * clang/identation (cherry picked from commit 050a4155a923cee294c8689d685b39247b7a172a) * splitting the Transformation to general and CPU specific. Now hopefully,this fully mimics the baseline c4df94d42d90a2bc3cd91d3d6844ba42f29bca7f of the 2021.3 (before experiments), as the streams reduce num (as well as early exit on GRU/LSTM/TensorIterator) is deisabled (cherry picked from commit e98b2c1a67f2542a686543594b75b575ef515196) * disabling GRU/LSTM/TI + reducing of streams + 5D considered compute-limited only for int8 (cherry picked from commit 32b8d80dee18685ebf3d069bb4cd2172af7363b7) * refactored to avoid compute_limited_ratio, reverted the reducing #streams, removed LSTM from limitations (cherry picked from commit f2b972171b29cf599aae2407ceec2e6adb67e4e9) * isa-based threshold logic (cherry picked from commit b218457e1a93fcb3374eb9da948fdad2175ec33a) * mode->hint (cherry picked from commit ec20aa8ecaf3222f2a6fdfe9153cf6c9dfdd6a54) * optional PERFORMANCE_HINT_NUM_REQUESTS (cherry picked from commit 5a3883e3f36e7928c6391094ae10711c8e4c3b5c) * moving the perfHints to the common OV config class + initial tests (CPU only, as the actual AUTO/MULTI should be accommodated on the master) (cherry picked from commit (then fixed)45bafe7d527f466507dea0693aeed51be4ebf776) * AUTO support for PerfHints * MULTI support for PerfHints * Enabling Perf hints for the GPU plugin * brushing settings output a bit * disabling "throughput" perf hint being default (until OV 2.0) * uncommenting the logic which was disabled to force the DLBenchmark to use the throughput mode by default * removing dead and experimental code, and debug printfs * clang/code-style * code-review remarks * Moved the output of the actual params that the hint produced to the right place * aligning MULTI's GetConfig beh to HETERO's as captured in the preso (CVS-59960) ratified with the ArchForum * clang * benchmark_app brushing * Update inference-engine/samples/benchmark_app/README.md * propagating the perf hints thru one more scenario in the merged AUTO-MULTI * fixed mispint * Python benchmark_app update for perf hints * addresssing reviewers comments on the python benchmark_app * simplifying/brushing logic a bit * refactor the heuristic to the separate file (to be shared with iGPU soon) * refactor conversion of modes to the specific GPU config per feedback from Vladimir
2021-09-13 15:40:36 +03:00
// ----------------- 8. Querying optimal runtime parameters
// -----------------------------------------------------
2019-08-09 19:02:42 +03:00
next_step();
// output of the actual settings that the device selected
for (const auto& device : devices) {
auto supported_properties = compiledModel.get_property(ov::supported_properties);
slog::info << "Device: " << device << slog::endl;
for (const auto& cfg : supported_properties) {
try {
if (cfg == ov::supported_properties)
continue;
auto prop = compiledModel.get_property(cfg);
slog::info << " { " << cfg << " , " << prop.as<std::string>() << " }" << slog::endl;
} catch (const ov::Exception&) {
}
OV Performance Hints (CPU and GPU logic for selecting the actual configs), while AUTO/MULTI are passing them thru) (#6993) * rebasing the perf-modes-2021.3 to the 2021.4 Caveats: the (explicit) setting #streams is not disabled (as it was before for experiments with DLBenchmark), and the logic slighlty differ (streamsSet) (cherry picked from commit 1ae1edc0ed70fdea40f528fdaf8d00a9904d2a5c) * overriding streams (to force the TPUT mode to the DLBenchnark) (cherry picked from commit 7f506cda31abf35ac293d0dce32f602a0188c619) * disabling reducing #streams to fully mimic baseline c4df94d42d90a2bc3cd91d3d6844ba42f29bca7f of the 2021.3 (before experiments) (cherry picked from commit 85073dd1dd2c7d43a89c37c8f646313f6ddfc650) * clang/identation (cherry picked from commit 050a4155a923cee294c8689d685b39247b7a172a) * splitting the Transformation to general and CPU specific. Now hopefully,this fully mimics the baseline c4df94d42d90a2bc3cd91d3d6844ba42f29bca7f of the 2021.3 (before experiments), as the streams reduce num (as well as early exit on GRU/LSTM/TensorIterator) is deisabled (cherry picked from commit e98b2c1a67f2542a686543594b75b575ef515196) * disabling GRU/LSTM/TI + reducing of streams + 5D considered compute-limited only for int8 (cherry picked from commit 32b8d80dee18685ebf3d069bb4cd2172af7363b7) * refactored to avoid compute_limited_ratio, reverted the reducing #streams, removed LSTM from limitations (cherry picked from commit f2b972171b29cf599aae2407ceec2e6adb67e4e9) * isa-based threshold logic (cherry picked from commit b218457e1a93fcb3374eb9da948fdad2175ec33a) * mode->hint (cherry picked from commit ec20aa8ecaf3222f2a6fdfe9153cf6c9dfdd6a54) * optional PERFORMANCE_HINT_NUM_REQUESTS (cherry picked from commit 5a3883e3f36e7928c6391094ae10711c8e4c3b5c) * moving the perfHints to the common OV config class + initial tests (CPU only, as the actual AUTO/MULTI should be accommodated on the master) (cherry picked from commit (then fixed)45bafe7d527f466507dea0693aeed51be4ebf776) * AUTO support for PerfHints * MULTI support for PerfHints * Enabling Perf hints for the GPU plugin * brushing settings output a bit * disabling "throughput" perf hint being default (until OV 2.0) * uncommenting the logic which was disabled to force the DLBenchmark to use the throughput mode by default * removing dead and experimental code, and debug printfs * clang/code-style * code-review remarks * Moved the output of the actual params that the hint produced to the right place * aligning MULTI's GetConfig beh to HETERO's as captured in the preso (CVS-59960) ratified with the ArchForum * clang * benchmark_app brushing * Update inference-engine/samples/benchmark_app/README.md * propagating the perf hints thru one more scenario in the merged AUTO-MULTI * fixed mispint * Python benchmark_app update for perf hints * addresssing reviewers comments on the python benchmark_app * simplifying/brushing logic a bit * refactor the heuristic to the separate file (to be shared with iGPU soon) * refactor conversion of modes to the specific GPU config per feedback from Vladimir
2021-09-13 15:40:36 +03:00
}
}
2019-08-09 19:02:42 +03:00
// Update number of streams
for (auto&& ds : device_nstreams) {
try {
const std::string key = getDeviceTypeFromName(ds.first) + "_THROUGHPUT_STREAMS";
device_nstreams[ds.first] = core.get_property(ds.first, key).as<std::string>();
} catch (const ov::Exception&) {
device_nstreams[ds.first] = core.get_property(ds.first, ov::num_streams.name()).as<std::string>();
}
}
2019-08-09 19:02:42 +03:00
// Number of requests
uint32_t nireq = FLAGS_nireq;
if (nireq == 0) {
2020-04-13 21:17:23 +03:00
if (FLAGS_api == "sync") {
nireq = 1;
} else {
try {
nireq = compiledModel.get_property(ov::optimal_number_of_infer_requests);
} catch (const std::exception& ex) {
throw ov::Exception("Every device used with the benchmark_app should support " +
std::string(ov::optimal_number_of_infer_requests.name()) +
" Failed to query the metric for the " + device_name +
" with error:" + ex.what());
2020-04-13 21:17:23 +03:00
}
2018-11-23 16:19:43 +03:00
}
}
2019-08-09 19:02:42 +03:00
// Iteration limit
uint32_t niter = FLAGS_niter;
Dynamic reshapes (#7788) * Merged and compiling * Fix for dynamic shape type * review fixes * renamed blob shape to tensor shape, small improvements * fix code style * added parsing of multiple shapes * store latency per group, add isIdleRequestAvailable() to Infer Queue * added cached random inputs * redesign pipeline, added new metrics(avg, max, min), added metrics per groups * fixed code style * small improvements * modified tensor parameters parsing * modified -i parameter parsing: added possibility to specify input names * implemented image cashing * added cashed blobs creating * added -pcseq flag, modified batch filling, changes fps formula * improvements * code formatting * code formatting2 * apply suggestions from review * replaced Buffer class with InferenceEngine Blobs * use batch size in blobs filling * added shared blob allocator to handle blob's data * fixed warnings & code style * allocate blobs * fix for networks with image info input * added comments & fixed codestyle * clear data in free() in SharedBlobAllocator * remove unnecessary check * Delimeter is changed to :: * stylefix * added layout from string function, small improvements * modified parsing to enable : in input parameters * small fixes * small fixes * added missed blob allocation, fixes * [TEST]added support for remote blobs * fix remote blobs * new inputs/files output format * removed vectors resize which caused bugs * made cl::Buffer type under ifdef, fix inputs filling * changed batch() function to not throwing exceptions * removed unused var * fix code style * replace empty name in input files with name from net input * restored old behaviour for static models * fix code style * fix warning - made const iterator * fix warning - remove reference in loop variable * added random and image_info input types to -i, fix problem with layout * replaced batch() with getBatchSize() in main * fix layout, shape, tensor shape parameters parsing * upd help messages for input, tensor shape and pcseq command * added buffer for cl output blobs, small fixes Signed-off-by: ivikhrev <ivan.vikhrev@intel.com> * added legacy mode * restore setBlob * code style formatting * move collecting latency for groups under flag * removed not applicable layouts * added hint to error message when wrong input name in -tensor_shape was specified * added new metrics to statistics report * Apply suggestions from code review * fix binary blobs filling when layout is CN * apply suggestions * moved file in the right place after rebase * improved -pcseq output * updated args and readme * removed TEMPLATE plugin registration * fix -shape arg decsription * enable providing several -i args as input * renamed legacy_mode to inference_only and made it default for static models, renamed tensor_shape to data_shape * upd readme * use getBlob() in inference only mode * fix old input type for static case * fix typo * upd readme * move log about benchmark mode to the measuring perfomance step * added class for latency metrics * upd readme, fix typos, renamed funcs * fix warning and upd parsing to avoid error with : in file paths * fix error on centos : error: use of deleted function ‘std::basic_stringstream<char>::basic_stringstream(const std::basic_stringstream<char>&) * added check for key in inputs * renamed input to inputs * adjust batch size for binary blobs * replaced warning with exception in bench mode defining * align measurement cycle with master Co-authored-by: ivikhrev <ivan.vikhrev@intel.com>
2021-12-17 12:20:43 +03:00
size_t shape_groups_num = app_inputs_info.size();
2019-08-09 19:02:42 +03:00
if ((niter > 0) && (FLAGS_api == "async")) {
Dynamic reshapes (#7788) * Merged and compiling * Fix for dynamic shape type * review fixes * renamed blob shape to tensor shape, small improvements * fix code style * added parsing of multiple shapes * store latency per group, add isIdleRequestAvailable() to Infer Queue * added cached random inputs * redesign pipeline, added new metrics(avg, max, min), added metrics per groups * fixed code style * small improvements * modified tensor parameters parsing * modified -i parameter parsing: added possibility to specify input names * implemented image cashing * added cashed blobs creating * added -pcseq flag, modified batch filling, changes fps formula * improvements * code formatting * code formatting2 * apply suggestions from review * replaced Buffer class with InferenceEngine Blobs * use batch size in blobs filling * added shared blob allocator to handle blob's data * fixed warnings & code style * allocate blobs * fix for networks with image info input * added comments & fixed codestyle * clear data in free() in SharedBlobAllocator * remove unnecessary check * Delimeter is changed to :: * stylefix * added layout from string function, small improvements * modified parsing to enable : in input parameters * small fixes * small fixes * added missed blob allocation, fixes * [TEST]added support for remote blobs * fix remote blobs * new inputs/files output format * removed vectors resize which caused bugs * made cl::Buffer type under ifdef, fix inputs filling * changed batch() function to not throwing exceptions * removed unused var * fix code style * replace empty name in input files with name from net input * restored old behaviour for static models * fix code style * fix warning - made const iterator * fix warning - remove reference in loop variable * added random and image_info input types to -i, fix problem with layout * replaced batch() with getBatchSize() in main * fix layout, shape, tensor shape parameters parsing * upd help messages for input, tensor shape and pcseq command * added buffer for cl output blobs, small fixes Signed-off-by: ivikhrev <ivan.vikhrev@intel.com> * added legacy mode * restore setBlob * code style formatting * move collecting latency for groups under flag * removed not applicable layouts * added hint to error message when wrong input name in -tensor_shape was specified * added new metrics to statistics report * Apply suggestions from code review * fix binary blobs filling when layout is CN * apply suggestions * moved file in the right place after rebase * improved -pcseq output * updated args and readme * removed TEMPLATE plugin registration * fix -shape arg decsription * enable providing several -i args as input * renamed legacy_mode to inference_only and made it default for static models, renamed tensor_shape to data_shape * upd readme * use getBlob() in inference only mode * fix old input type for static case * fix typo * upd readme * move log about benchmark mode to the measuring perfomance step * added class for latency metrics * upd readme, fix typos, renamed funcs * fix warning and upd parsing to avoid error with : in file paths * fix error on centos : error: use of deleted function ‘std::basic_stringstream<char>::basic_stringstream(const std::basic_stringstream<char>&) * added check for key in inputs * renamed input to inputs * adjust batch size for binary blobs * replaced warning with exception in bench mode defining * align measurement cycle with master Co-authored-by: ivikhrev <ivan.vikhrev@intel.com>
2021-12-17 12:20:43 +03:00
if (shape_groups_num > nireq) {
niter = ((niter + shape_groups_num - 1) / shape_groups_num) * shape_groups_num;
if (FLAGS_niter != niter) {
slog::warn << "Number of iterations was aligned by data shape groups number from " << FLAGS_niter
<< " to " << niter << " using number of possible input shapes " << shape_groups_num
<< slog::endl;
}
} else {
niter = ((niter + nireq - 1) / nireq) * nireq;
if (FLAGS_niter != niter) {
slog::warn << "Number of iterations was aligned by request number from " << FLAGS_niter << " to "
<< niter << " using number of requests " << nireq << slog::endl;
}
2019-08-09 19:02:42 +03:00
}
}
2019-04-12 18:25:53 +03:00
2019-08-09 19:02:42 +03:00
// Time limit
uint32_t duration_seconds = 0;
if (FLAGS_t != 0) {
// time limit
duration_seconds = FLAGS_t;
} else if (FLAGS_niter == 0) {
// default time limit
duration_seconds = device_default_device_duration_in_seconds(device_name);
2019-04-12 18:25:53 +03:00
}
uint64_t duration_nanoseconds = get_duration_in_nanoseconds(duration_seconds);
2019-04-12 18:25:53 +03:00
2019-10-04 19:26:43 +03:00
if (statistics) {
statistics->add_parameters(
StatisticsReport::Category::RUNTIME_CONFIG,
StatisticsReport::Parameters(
{StatisticsVariant("benchmark mode", "benchmark_mode", inferenceOnly ? "inference only" : "full"),
StatisticsVariant("topology", "topology", topology_name),
StatisticsVariant("target device", "target_device", device_name),
StatisticsVariant("API", "api", FLAGS_api),
StatisticsVariant("precision", "precision", type.get_type_name()),
StatisticsVariant("batch size", "batch_size", batchSize),
StatisticsVariant("number of iterations", "iterations_num", niter),
StatisticsVariant("number of parallel infer requests", "nireq", nireq),
StatisticsVariant("duration (ms)", "duration", get_duration_in_milliseconds(duration_seconds))}));
2019-10-04 19:26:43 +03:00
for (auto& nstreams : device_nstreams) {
std::stringstream ss;
ss << "number of " << nstreams.first << " streams";
std::string dev_name = nstreams.first;
std::transform(dev_name.begin(), dev_name.end(), dev_name.begin(), [](unsigned char c) {
return c == ' ' ? '_' : std::tolower(c);
});
statistics->add_parameters(StatisticsReport::Category::RUNTIME_CONFIG,
{StatisticsVariant(ss.str(), dev_name + "_streams_num", nstreams.second)});
2019-10-04 19:26:43 +03:00
}
}
// ----------------- 9. Creating infer requests and filling input blobs
// ----------------------------------------
2019-08-09 19:02:42 +03:00
next_step();
2018-11-23 16:19:43 +03:00
InferRequestsQueue inferRequestsQueue(compiledModel, nireq, app_inputs_info.size(), FLAGS_pcseq);
Dynamic reshapes (#7788) * Merged and compiling * Fix for dynamic shape type * review fixes * renamed blob shape to tensor shape, small improvements * fix code style * added parsing of multiple shapes * store latency per group, add isIdleRequestAvailable() to Infer Queue * added cached random inputs * redesign pipeline, added new metrics(avg, max, min), added metrics per groups * fixed code style * small improvements * modified tensor parameters parsing * modified -i parameter parsing: added possibility to specify input names * implemented image cashing * added cashed blobs creating * added -pcseq flag, modified batch filling, changes fps formula * improvements * code formatting * code formatting2 * apply suggestions from review * replaced Buffer class with InferenceEngine Blobs * use batch size in blobs filling * added shared blob allocator to handle blob's data * fixed warnings & code style * allocate blobs * fix for networks with image info input * added comments & fixed codestyle * clear data in free() in SharedBlobAllocator * remove unnecessary check * Delimeter is changed to :: * stylefix * added layout from string function, small improvements * modified parsing to enable : in input parameters * small fixes * small fixes * added missed blob allocation, fixes * [TEST]added support for remote blobs * fix remote blobs * new inputs/files output format * removed vectors resize which caused bugs * made cl::Buffer type under ifdef, fix inputs filling * changed batch() function to not throwing exceptions * removed unused var * fix code style * replace empty name in input files with name from net input * restored old behaviour for static models * fix code style * fix warning - made const iterator * fix warning - remove reference in loop variable * added random and image_info input types to -i, fix problem with layout * replaced batch() with getBatchSize() in main * fix layout, shape, tensor shape parameters parsing * upd help messages for input, tensor shape and pcseq command * added buffer for cl output blobs, small fixes Signed-off-by: ivikhrev <ivan.vikhrev@intel.com> * added legacy mode * restore setBlob * code style formatting * move collecting latency for groups under flag * removed not applicable layouts * added hint to error message when wrong input name in -tensor_shape was specified * added new metrics to statistics report * Apply suggestions from code review * fix binary blobs filling when layout is CN * apply suggestions * moved file in the right place after rebase * improved -pcseq output * updated args and readme * removed TEMPLATE plugin registration * fix -shape arg decsription * enable providing several -i args as input * renamed legacy_mode to inference_only and made it default for static models, renamed tensor_shape to data_shape * upd readme * use getBlob() in inference only mode * fix old input type for static case * fix typo * upd readme * move log about benchmark mode to the measuring perfomance step * added class for latency metrics * upd readme, fix typos, renamed funcs * fix warning and upd parsing to avoid error with : in file paths * fix error on centos : error: use of deleted function ‘std::basic_stringstream<char>::basic_stringstream(const std::basic_stringstream<char>&) * added check for key in inputs * renamed input to inputs * adjust batch size for binary blobs * replaced warning with exception in bench mode defining * align measurement cycle with master Co-authored-by: ivikhrev <ivan.vikhrev@intel.com>
2021-12-17 12:20:43 +03:00
bool inputHasName = false;
if (inputFiles.size() > 0) {
inputHasName = inputFiles.begin()->first != "";
}
bool newInputType = isDynamicNetwork || inputHasName;
// create vector to store remote input blobs buffer
std::vector<::gpu::BufferType> clInputsBuffer;
bool useGpuMem = false;
std::map<std::string, ov::TensorVector> inputsData;
if (isFlagSetInCommandLine("use_device_mem")) {
Dynamic reshapes (#7788) * Merged and compiling * Fix for dynamic shape type * review fixes * renamed blob shape to tensor shape, small improvements * fix code style * added parsing of multiple shapes * store latency per group, add isIdleRequestAvailable() to Infer Queue * added cached random inputs * redesign pipeline, added new metrics(avg, max, min), added metrics per groups * fixed code style * small improvements * modified tensor parameters parsing * modified -i parameter parsing: added possibility to specify input names * implemented image cashing * added cashed blobs creating * added -pcseq flag, modified batch filling, changes fps formula * improvements * code formatting * code formatting2 * apply suggestions from review * replaced Buffer class with InferenceEngine Blobs * use batch size in blobs filling * added shared blob allocator to handle blob's data * fixed warnings & code style * allocate blobs * fix for networks with image info input * added comments & fixed codestyle * clear data in free() in SharedBlobAllocator * remove unnecessary check * Delimeter is changed to :: * stylefix * added layout from string function, small improvements * modified parsing to enable : in input parameters * small fixes * small fixes * added missed blob allocation, fixes * [TEST]added support for remote blobs * fix remote blobs * new inputs/files output format * removed vectors resize which caused bugs * made cl::Buffer type under ifdef, fix inputs filling * changed batch() function to not throwing exceptions * removed unused var * fix code style * replace empty name in input files with name from net input * restored old behaviour for static models * fix code style * fix warning - made const iterator * fix warning - remove reference in loop variable * added random and image_info input types to -i, fix problem with layout * replaced batch() with getBatchSize() in main * fix layout, shape, tensor shape parameters parsing * upd help messages for input, tensor shape and pcseq command * added buffer for cl output blobs, small fixes Signed-off-by: ivikhrev <ivan.vikhrev@intel.com> * added legacy mode * restore setBlob * code style formatting * move collecting latency for groups under flag * removed not applicable layouts * added hint to error message when wrong input name in -tensor_shape was specified * added new metrics to statistics report * Apply suggestions from code review * fix binary blobs filling when layout is CN * apply suggestions * moved file in the right place after rebase * improved -pcseq output * updated args and readme * removed TEMPLATE plugin registration * fix -shape arg decsription * enable providing several -i args as input * renamed legacy_mode to inference_only and made it default for static models, renamed tensor_shape to data_shape * upd readme * use getBlob() in inference only mode * fix old input type for static case * fix typo * upd readme * move log about benchmark mode to the measuring perfomance step * added class for latency metrics * upd readme, fix typos, renamed funcs * fix warning and upd parsing to avoid error with : in file paths * fix error on centos : error: use of deleted function ‘std::basic_stringstream<char>::basic_stringstream(const std::basic_stringstream<char>&) * added check for key in inputs * renamed input to inputs * adjust batch size for binary blobs * replaced warning with exception in bench mode defining * align measurement cycle with master Co-authored-by: ivikhrev <ivan.vikhrev@intel.com>
2021-12-17 12:20:43 +03:00
if (device_name.find("GPU") == 0) {
inputsData = ::gpu::get_remote_input_tensors(inputFiles,
app_inputs_info,
compiledModel,
clInputsBuffer,
inferRequestsQueue.requests.size());
Dynamic reshapes (#7788) * Merged and compiling * Fix for dynamic shape type * review fixes * renamed blob shape to tensor shape, small improvements * fix code style * added parsing of multiple shapes * store latency per group, add isIdleRequestAvailable() to Infer Queue * added cached random inputs * redesign pipeline, added new metrics(avg, max, min), added metrics per groups * fixed code style * small improvements * modified tensor parameters parsing * modified -i parameter parsing: added possibility to specify input names * implemented image cashing * added cashed blobs creating * added -pcseq flag, modified batch filling, changes fps formula * improvements * code formatting * code formatting2 * apply suggestions from review * replaced Buffer class with InferenceEngine Blobs * use batch size in blobs filling * added shared blob allocator to handle blob's data * fixed warnings & code style * allocate blobs * fix for networks with image info input * added comments & fixed codestyle * clear data in free() in SharedBlobAllocator * remove unnecessary check * Delimeter is changed to :: * stylefix * added layout from string function, small improvements * modified parsing to enable : in input parameters * small fixes * small fixes * added missed blob allocation, fixes * [TEST]added support for remote blobs * fix remote blobs * new inputs/files output format * removed vectors resize which caused bugs * made cl::Buffer type under ifdef, fix inputs filling * changed batch() function to not throwing exceptions * removed unused var * fix code style * replace empty name in input files with name from net input * restored old behaviour for static models * fix code style * fix warning - made const iterator * fix warning - remove reference in loop variable * added random and image_info input types to -i, fix problem with layout * replaced batch() with getBatchSize() in main * fix layout, shape, tensor shape parameters parsing * upd help messages for input, tensor shape and pcseq command * added buffer for cl output blobs, small fixes Signed-off-by: ivikhrev <ivan.vikhrev@intel.com> * added legacy mode * restore setBlob * code style formatting * move collecting latency for groups under flag * removed not applicable layouts * added hint to error message when wrong input name in -tensor_shape was specified * added new metrics to statistics report * Apply suggestions from code review * fix binary blobs filling when layout is CN * apply suggestions * moved file in the right place after rebase * improved -pcseq output * updated args and readme * removed TEMPLATE plugin registration * fix -shape arg decsription * enable providing several -i args as input * renamed legacy_mode to inference_only and made it default for static models, renamed tensor_shape to data_shape * upd readme * use getBlob() in inference only mode * fix old input type for static case * fix typo * upd readme * move log about benchmark mode to the measuring perfomance step * added class for latency metrics * upd readme, fix typos, renamed funcs * fix warning and upd parsing to avoid error with : in file paths * fix error on centos : error: use of deleted function ‘std::basic_stringstream<char>::basic_stringstream(const std::basic_stringstream<char>&) * added check for key in inputs * renamed input to inputs * adjust batch size for binary blobs * replaced warning with exception in bench mode defining * align measurement cycle with master Co-authored-by: ivikhrev <ivan.vikhrev@intel.com>
2021-12-17 12:20:43 +03:00
useGpuMem = true;
} else if (device_name.find("CPU") == 0) {
if (newInputType) {
inputsData = get_tensors(inputFiles, app_inputs_info);
Dynamic reshapes (#7788) * Merged and compiling * Fix for dynamic shape type * review fixes * renamed blob shape to tensor shape, small improvements * fix code style * added parsing of multiple shapes * store latency per group, add isIdleRequestAvailable() to Infer Queue * added cached random inputs * redesign pipeline, added new metrics(avg, max, min), added metrics per groups * fixed code style * small improvements * modified tensor parameters parsing * modified -i parameter parsing: added possibility to specify input names * implemented image cashing * added cashed blobs creating * added -pcseq flag, modified batch filling, changes fps formula * improvements * code formatting * code formatting2 * apply suggestions from review * replaced Buffer class with InferenceEngine Blobs * use batch size in blobs filling * added shared blob allocator to handle blob's data * fixed warnings & code style * allocate blobs * fix for networks with image info input * added comments & fixed codestyle * clear data in free() in SharedBlobAllocator * remove unnecessary check * Delimeter is changed to :: * stylefix * added layout from string function, small improvements * modified parsing to enable : in input parameters * small fixes * small fixes * added missed blob allocation, fixes * [TEST]added support for remote blobs * fix remote blobs * new inputs/files output format * removed vectors resize which caused bugs * made cl::Buffer type under ifdef, fix inputs filling * changed batch() function to not throwing exceptions * removed unused var * fix code style * replace empty name in input files with name from net input * restored old behaviour for static models * fix code style * fix warning - made const iterator * fix warning - remove reference in loop variable * added random and image_info input types to -i, fix problem with layout * replaced batch() with getBatchSize() in main * fix layout, shape, tensor shape parameters parsing * upd help messages for input, tensor shape and pcseq command * added buffer for cl output blobs, small fixes Signed-off-by: ivikhrev <ivan.vikhrev@intel.com> * added legacy mode * restore setBlob * code style formatting * move collecting latency for groups under flag * removed not applicable layouts * added hint to error message when wrong input name in -tensor_shape was specified * added new metrics to statistics report * Apply suggestions from code review * fix binary blobs filling when layout is CN * apply suggestions * moved file in the right place after rebase * improved -pcseq output * updated args and readme * removed TEMPLATE plugin registration * fix -shape arg decsription * enable providing several -i args as input * renamed legacy_mode to inference_only and made it default for static models, renamed tensor_shape to data_shape * upd readme * use getBlob() in inference only mode * fix old input type for static case * fix typo * upd readme * move log about benchmark mode to the measuring perfomance step * added class for latency metrics * upd readme, fix typos, renamed funcs * fix warning and upd parsing to avoid error with : in file paths * fix error on centos : error: use of deleted function ‘std::basic_stringstream<char>::basic_stringstream(const std::basic_stringstream<char>&) * added check for key in inputs * renamed input to inputs * adjust batch size for binary blobs * replaced warning with exception in bench mode defining * align measurement cycle with master Co-authored-by: ivikhrev <ivan.vikhrev@intel.com>
2021-12-17 12:20:43 +03:00
} else {
inputsData = get_tensors_static_case(
inputFiles.empty() ? std::vector<std::string>{} : inputFiles.begin()->second,
batchSize,
app_inputs_info[0],
nireq);
Dynamic reshapes (#7788) * Merged and compiling * Fix for dynamic shape type * review fixes * renamed blob shape to tensor shape, small improvements * fix code style * added parsing of multiple shapes * store latency per group, add isIdleRequestAvailable() to Infer Queue * added cached random inputs * redesign pipeline, added new metrics(avg, max, min), added metrics per groups * fixed code style * small improvements * modified tensor parameters parsing * modified -i parameter parsing: added possibility to specify input names * implemented image cashing * added cashed blobs creating * added -pcseq flag, modified batch filling, changes fps formula * improvements * code formatting * code formatting2 * apply suggestions from review * replaced Buffer class with InferenceEngine Blobs * use batch size in blobs filling * added shared blob allocator to handle blob's data * fixed warnings & code style * allocate blobs * fix for networks with image info input * added comments & fixed codestyle * clear data in free() in SharedBlobAllocator * remove unnecessary check * Delimeter is changed to :: * stylefix * added layout from string function, small improvements * modified parsing to enable : in input parameters * small fixes * small fixes * added missed blob allocation, fixes * [TEST]added support for remote blobs * fix remote blobs * new inputs/files output format * removed vectors resize which caused bugs * made cl::Buffer type under ifdef, fix inputs filling * changed batch() function to not throwing exceptions * removed unused var * fix code style * replace empty name in input files with name from net input * restored old behaviour for static models * fix code style * fix warning - made const iterator * fix warning - remove reference in loop variable * added random and image_info input types to -i, fix problem with layout * replaced batch() with getBatchSize() in main * fix layout, shape, tensor shape parameters parsing * upd help messages for input, tensor shape and pcseq command * added buffer for cl output blobs, small fixes Signed-off-by: ivikhrev <ivan.vikhrev@intel.com> * added legacy mode * restore setBlob * code style formatting * move collecting latency for groups under flag * removed not applicable layouts * added hint to error message when wrong input name in -tensor_shape was specified * added new metrics to statistics report * Apply suggestions from code review * fix binary blobs filling when layout is CN * apply suggestions * moved file in the right place after rebase * improved -pcseq output * updated args and readme * removed TEMPLATE plugin registration * fix -shape arg decsription * enable providing several -i args as input * renamed legacy_mode to inference_only and made it default for static models, renamed tensor_shape to data_shape * upd readme * use getBlob() in inference only mode * fix old input type for static case * fix typo * upd readme * move log about benchmark mode to the measuring perfomance step * added class for latency metrics * upd readme, fix typos, renamed funcs * fix warning and upd parsing to avoid error with : in file paths * fix error on centos : error: use of deleted function ‘std::basic_stringstream<char>::basic_stringstream(const std::basic_stringstream<char>&) * added check for key in inputs * renamed input to inputs * adjust batch size for binary blobs * replaced warning with exception in bench mode defining * align measurement cycle with master Co-authored-by: ivikhrev <ivan.vikhrev@intel.com>
2021-12-17 12:20:43 +03:00
}
} else {
throw ov::Exception("Requested device doesn't support `use_device_mem` option.");
Dynamic reshapes (#7788) * Merged and compiling * Fix for dynamic shape type * review fixes * renamed blob shape to tensor shape, small improvements * fix code style * added parsing of multiple shapes * store latency per group, add isIdleRequestAvailable() to Infer Queue * added cached random inputs * redesign pipeline, added new metrics(avg, max, min), added metrics per groups * fixed code style * small improvements * modified tensor parameters parsing * modified -i parameter parsing: added possibility to specify input names * implemented image cashing * added cashed blobs creating * added -pcseq flag, modified batch filling, changes fps formula * improvements * code formatting * code formatting2 * apply suggestions from review * replaced Buffer class with InferenceEngine Blobs * use batch size in blobs filling * added shared blob allocator to handle blob's data * fixed warnings & code style * allocate blobs * fix for networks with image info input * added comments & fixed codestyle * clear data in free() in SharedBlobAllocator * remove unnecessary check * Delimeter is changed to :: * stylefix * added layout from string function, small improvements * modified parsing to enable : in input parameters * small fixes * small fixes * added missed blob allocation, fixes * [TEST]added support for remote blobs * fix remote blobs * new inputs/files output format * removed vectors resize which caused bugs * made cl::Buffer type under ifdef, fix inputs filling * changed batch() function to not throwing exceptions * removed unused var * fix code style * replace empty name in input files with name from net input * restored old behaviour for static models * fix code style * fix warning - made const iterator * fix warning - remove reference in loop variable * added random and image_info input types to -i, fix problem with layout * replaced batch() with getBatchSize() in main * fix layout, shape, tensor shape parameters parsing * upd help messages for input, tensor shape and pcseq command * added buffer for cl output blobs, small fixes Signed-off-by: ivikhrev <ivan.vikhrev@intel.com> * added legacy mode * restore setBlob * code style formatting * move collecting latency for groups under flag * removed not applicable layouts * added hint to error message when wrong input name in -tensor_shape was specified * added new metrics to statistics report * Apply suggestions from code review * fix binary blobs filling when layout is CN * apply suggestions * moved file in the right place after rebase * improved -pcseq output * updated args and readme * removed TEMPLATE plugin registration * fix -shape arg decsription * enable providing several -i args as input * renamed legacy_mode to inference_only and made it default for static models, renamed tensor_shape to data_shape * upd readme * use getBlob() in inference only mode * fix old input type for static case * fix typo * upd readme * move log about benchmark mode to the measuring perfomance step * added class for latency metrics * upd readme, fix typos, renamed funcs * fix warning and upd parsing to avoid error with : in file paths * fix error on centos : error: use of deleted function ‘std::basic_stringstream<char>::basic_stringstream(const std::basic_stringstream<char>&) * added check for key in inputs * renamed input to inputs * adjust batch size for binary blobs * replaced warning with exception in bench mode defining * align measurement cycle with master Co-authored-by: ivikhrev <ivan.vikhrev@intel.com>
2021-12-17 12:20:43 +03:00
}
} else {
Dynamic reshapes (#7788) * Merged and compiling * Fix for dynamic shape type * review fixes * renamed blob shape to tensor shape, small improvements * fix code style * added parsing of multiple shapes * store latency per group, add isIdleRequestAvailable() to Infer Queue * added cached random inputs * redesign pipeline, added new metrics(avg, max, min), added metrics per groups * fixed code style * small improvements * modified tensor parameters parsing * modified -i parameter parsing: added possibility to specify input names * implemented image cashing * added cashed blobs creating * added -pcseq flag, modified batch filling, changes fps formula * improvements * code formatting * code formatting2 * apply suggestions from review * replaced Buffer class with InferenceEngine Blobs * use batch size in blobs filling * added shared blob allocator to handle blob's data * fixed warnings & code style * allocate blobs * fix for networks with image info input * added comments & fixed codestyle * clear data in free() in SharedBlobAllocator * remove unnecessary check * Delimeter is changed to :: * stylefix * added layout from string function, small improvements * modified parsing to enable : in input parameters * small fixes * small fixes * added missed blob allocation, fixes * [TEST]added support for remote blobs * fix remote blobs * new inputs/files output format * removed vectors resize which caused bugs * made cl::Buffer type under ifdef, fix inputs filling * changed batch() function to not throwing exceptions * removed unused var * fix code style * replace empty name in input files with name from net input * restored old behaviour for static models * fix code style * fix warning - made const iterator * fix warning - remove reference in loop variable * added random and image_info input types to -i, fix problem with layout * replaced batch() with getBatchSize() in main * fix layout, shape, tensor shape parameters parsing * upd help messages for input, tensor shape and pcseq command * added buffer for cl output blobs, small fixes Signed-off-by: ivikhrev <ivan.vikhrev@intel.com> * added legacy mode * restore setBlob * code style formatting * move collecting latency for groups under flag * removed not applicable layouts * added hint to error message when wrong input name in -tensor_shape was specified * added new metrics to statistics report * Apply suggestions from code review * fix binary blobs filling when layout is CN * apply suggestions * moved file in the right place after rebase * improved -pcseq output * updated args and readme * removed TEMPLATE plugin registration * fix -shape arg decsription * enable providing several -i args as input * renamed legacy_mode to inference_only and made it default for static models, renamed tensor_shape to data_shape * upd readme * use getBlob() in inference only mode * fix old input type for static case * fix typo * upd readme * move log about benchmark mode to the measuring perfomance step * added class for latency metrics * upd readme, fix typos, renamed funcs * fix warning and upd parsing to avoid error with : in file paths * fix error on centos : error: use of deleted function ‘std::basic_stringstream<char>::basic_stringstream(const std::basic_stringstream<char>&) * added check for key in inputs * renamed input to inputs * adjust batch size for binary blobs * replaced warning with exception in bench mode defining * align measurement cycle with master Co-authored-by: ivikhrev <ivan.vikhrev@intel.com>
2021-12-17 12:20:43 +03:00
if (newInputType) {
inputsData = get_tensors(inputFiles, app_inputs_info);
Dynamic reshapes (#7788) * Merged and compiling * Fix for dynamic shape type * review fixes * renamed blob shape to tensor shape, small improvements * fix code style * added parsing of multiple shapes * store latency per group, add isIdleRequestAvailable() to Infer Queue * added cached random inputs * redesign pipeline, added new metrics(avg, max, min), added metrics per groups * fixed code style * small improvements * modified tensor parameters parsing * modified -i parameter parsing: added possibility to specify input names * implemented image cashing * added cashed blobs creating * added -pcseq flag, modified batch filling, changes fps formula * improvements * code formatting * code formatting2 * apply suggestions from review * replaced Buffer class with InferenceEngine Blobs * use batch size in blobs filling * added shared blob allocator to handle blob's data * fixed warnings & code style * allocate blobs * fix for networks with image info input * added comments & fixed codestyle * clear data in free() in SharedBlobAllocator * remove unnecessary check * Delimeter is changed to :: * stylefix * added layout from string function, small improvements * modified parsing to enable : in input parameters * small fixes * small fixes * added missed blob allocation, fixes * [TEST]added support for remote blobs * fix remote blobs * new inputs/files output format * removed vectors resize which caused bugs * made cl::Buffer type under ifdef, fix inputs filling * changed batch() function to not throwing exceptions * removed unused var * fix code style * replace empty name in input files with name from net input * restored old behaviour for static models * fix code style * fix warning - made const iterator * fix warning - remove reference in loop variable * added random and image_info input types to -i, fix problem with layout * replaced batch() with getBatchSize() in main * fix layout, shape, tensor shape parameters parsing * upd help messages for input, tensor shape and pcseq command * added buffer for cl output blobs, small fixes Signed-off-by: ivikhrev <ivan.vikhrev@intel.com> * added legacy mode * restore setBlob * code style formatting * move collecting latency for groups under flag * removed not applicable layouts * added hint to error message when wrong input name in -tensor_shape was specified * added new metrics to statistics report * Apply suggestions from code review * fix binary blobs filling when layout is CN * apply suggestions * moved file in the right place after rebase * improved -pcseq output * updated args and readme * removed TEMPLATE plugin registration * fix -shape arg decsription * enable providing several -i args as input * renamed legacy_mode to inference_only and made it default for static models, renamed tensor_shape to data_shape * upd readme * use getBlob() in inference only mode * fix old input type for static case * fix typo * upd readme * move log about benchmark mode to the measuring perfomance step * added class for latency metrics * upd readme, fix typos, renamed funcs * fix warning and upd parsing to avoid error with : in file paths * fix error on centos : error: use of deleted function ‘std::basic_stringstream<char>::basic_stringstream(const std::basic_stringstream<char>&) * added check for key in inputs * renamed input to inputs * adjust batch size for binary blobs * replaced warning with exception in bench mode defining * align measurement cycle with master Co-authored-by: ivikhrev <ivan.vikhrev@intel.com>
2021-12-17 12:20:43 +03:00
} else {
inputsData = get_tensors_static_case(
inputFiles.empty() ? std::vector<std::string>{} : inputFiles.begin()->second,
batchSize,
app_inputs_info[0],
nireq);
Dynamic reshapes (#7788) * Merged and compiling * Fix for dynamic shape type * review fixes * renamed blob shape to tensor shape, small improvements * fix code style * added parsing of multiple shapes * store latency per group, add isIdleRequestAvailable() to Infer Queue * added cached random inputs * redesign pipeline, added new metrics(avg, max, min), added metrics per groups * fixed code style * small improvements * modified tensor parameters parsing * modified -i parameter parsing: added possibility to specify input names * implemented image cashing * added cashed blobs creating * added -pcseq flag, modified batch filling, changes fps formula * improvements * code formatting * code formatting2 * apply suggestions from review * replaced Buffer class with InferenceEngine Blobs * use batch size in blobs filling * added shared blob allocator to handle blob's data * fixed warnings & code style * allocate blobs * fix for networks with image info input * added comments & fixed codestyle * clear data in free() in SharedBlobAllocator * remove unnecessary check * Delimeter is changed to :: * stylefix * added layout from string function, small improvements * modified parsing to enable : in input parameters * small fixes * small fixes * added missed blob allocation, fixes * [TEST]added support for remote blobs * fix remote blobs * new inputs/files output format * removed vectors resize which caused bugs * made cl::Buffer type under ifdef, fix inputs filling * changed batch() function to not throwing exceptions * removed unused var * fix code style * replace empty name in input files with name from net input * restored old behaviour for static models * fix code style * fix warning - made const iterator * fix warning - remove reference in loop variable * added random and image_info input types to -i, fix problem with layout * replaced batch() with getBatchSize() in main * fix layout, shape, tensor shape parameters parsing * upd help messages for input, tensor shape and pcseq command * added buffer for cl output blobs, small fixes Signed-off-by: ivikhrev <ivan.vikhrev@intel.com> * added legacy mode * restore setBlob * code style formatting * move collecting latency for groups under flag * removed not applicable layouts * added hint to error message when wrong input name in -tensor_shape was specified * added new metrics to statistics report * Apply suggestions from code review * fix binary blobs filling when layout is CN * apply suggestions * moved file in the right place after rebase * improved -pcseq output * updated args and readme * removed TEMPLATE plugin registration * fix -shape arg decsription * enable providing several -i args as input * renamed legacy_mode to inference_only and made it default for static models, renamed tensor_shape to data_shape * upd readme * use getBlob() in inference only mode * fix old input type for static case * fix typo * upd readme * move log about benchmark mode to the measuring perfomance step * added class for latency metrics * upd readme, fix typos, renamed funcs * fix warning and upd parsing to avoid error with : in file paths * fix error on centos : error: use of deleted function ‘std::basic_stringstream<char>::basic_stringstream(const std::basic_stringstream<char>&) * added check for key in inputs * renamed input to inputs * adjust batch size for binary blobs * replaced warning with exception in bench mode defining * align measurement cycle with master Co-authored-by: ivikhrev <ivan.vikhrev@intel.com>
2021-12-17 12:20:43 +03:00
}
}
// ----------------- 10. Measuring performance
// ------------------------------------------------------------------
2019-08-09 19:02:42 +03:00
size_t progressCnt = 0;
size_t progressBarTotalCount = progressBarDefaultTotalCount;
size_t iteration = 0;
2019-04-12 18:25:53 +03:00
2019-08-09 19:02:42 +03:00
std::stringstream ss;
ss << "Start inference " << FLAGS_api << "hronously";
2019-08-09 19:02:42 +03:00
if (FLAGS_api == "async") {
if (!ss.str().empty()) {
ss << ", ";
}
ss << nireq << " inference requests";
std::stringstream device_ss;
for (auto& nstreams : device_nstreams) {
if (!device_ss.str().empty()) {
device_ss << ", ";
}
device_ss << nstreams.second << " streams for " << nstreams.first;
}
if (!device_ss.str().empty()) {
ss << " using " << device_ss.str();
2019-04-12 18:25:53 +03:00
}
}
2019-08-09 19:02:42 +03:00
ss << ", limits: ";
if (duration_seconds > 0) {
ss << get_duration_in_milliseconds(duration_seconds) << " ms duration";
2018-11-23 16:19:43 +03:00
}
2019-08-09 19:02:42 +03:00
if (niter != 0) {
if (duration_seconds == 0) {
progressBarTotalCount = niter;
}
if (duration_seconds > 0) {
ss << ", ";
}
ss << niter << " iterations";
}
Dynamic reshapes (#7788) * Merged and compiling * Fix for dynamic shape type * review fixes * renamed blob shape to tensor shape, small improvements * fix code style * added parsing of multiple shapes * store latency per group, add isIdleRequestAvailable() to Infer Queue * added cached random inputs * redesign pipeline, added new metrics(avg, max, min), added metrics per groups * fixed code style * small improvements * modified tensor parameters parsing * modified -i parameter parsing: added possibility to specify input names * implemented image cashing * added cashed blobs creating * added -pcseq flag, modified batch filling, changes fps formula * improvements * code formatting * code formatting2 * apply suggestions from review * replaced Buffer class with InferenceEngine Blobs * use batch size in blobs filling * added shared blob allocator to handle blob's data * fixed warnings & code style * allocate blobs * fix for networks with image info input * added comments & fixed codestyle * clear data in free() in SharedBlobAllocator * remove unnecessary check * Delimeter is changed to :: * stylefix * added layout from string function, small improvements * modified parsing to enable : in input parameters * small fixes * small fixes * added missed blob allocation, fixes * [TEST]added support for remote blobs * fix remote blobs * new inputs/files output format * removed vectors resize which caused bugs * made cl::Buffer type under ifdef, fix inputs filling * changed batch() function to not throwing exceptions * removed unused var * fix code style * replace empty name in input files with name from net input * restored old behaviour for static models * fix code style * fix warning - made const iterator * fix warning - remove reference in loop variable * added random and image_info input types to -i, fix problem with layout * replaced batch() with getBatchSize() in main * fix layout, shape, tensor shape parameters parsing * upd help messages for input, tensor shape and pcseq command * added buffer for cl output blobs, small fixes Signed-off-by: ivikhrev <ivan.vikhrev@intel.com> * added legacy mode * restore setBlob * code style formatting * move collecting latency for groups under flag * removed not applicable layouts * added hint to error message when wrong input name in -tensor_shape was specified * added new metrics to statistics report * Apply suggestions from code review * fix binary blobs filling when layout is CN * apply suggestions * moved file in the right place after rebase * improved -pcseq output * updated args and readme * removed TEMPLATE plugin registration * fix -shape arg decsription * enable providing several -i args as input * renamed legacy_mode to inference_only and made it default for static models, renamed tensor_shape to data_shape * upd readme * use getBlob() in inference only mode * fix old input type for static case * fix typo * upd readme * move log about benchmark mode to the measuring perfomance step * added class for latency metrics * upd readme, fix typos, renamed funcs * fix warning and upd parsing to avoid error with : in file paths * fix error on centos : error: use of deleted function ‘std::basic_stringstream<char>::basic_stringstream(const std::basic_stringstream<char>&) * added check for key in inputs * renamed input to inputs * adjust batch size for binary blobs * replaced warning with exception in bench mode defining * align measurement cycle with master Co-authored-by: ivikhrev <ivan.vikhrev@intel.com>
2021-12-17 12:20:43 +03:00
2019-08-09 19:02:42 +03:00
next_step(ss.str());
2018-11-23 16:19:43 +03:00
Dynamic reshapes (#7788) * Merged and compiling * Fix for dynamic shape type * review fixes * renamed blob shape to tensor shape, small improvements * fix code style * added parsing of multiple shapes * store latency per group, add isIdleRequestAvailable() to Infer Queue * added cached random inputs * redesign pipeline, added new metrics(avg, max, min), added metrics per groups * fixed code style * small improvements * modified tensor parameters parsing * modified -i parameter parsing: added possibility to specify input names * implemented image cashing * added cashed blobs creating * added -pcseq flag, modified batch filling, changes fps formula * improvements * code formatting * code formatting2 * apply suggestions from review * replaced Buffer class with InferenceEngine Blobs * use batch size in blobs filling * added shared blob allocator to handle blob's data * fixed warnings & code style * allocate blobs * fix for networks with image info input * added comments & fixed codestyle * clear data in free() in SharedBlobAllocator * remove unnecessary check * Delimeter is changed to :: * stylefix * added layout from string function, small improvements * modified parsing to enable : in input parameters * small fixes * small fixes * added missed blob allocation, fixes * [TEST]added support for remote blobs * fix remote blobs * new inputs/files output format * removed vectors resize which caused bugs * made cl::Buffer type under ifdef, fix inputs filling * changed batch() function to not throwing exceptions * removed unused var * fix code style * replace empty name in input files with name from net input * restored old behaviour for static models * fix code style * fix warning - made const iterator * fix warning - remove reference in loop variable * added random and image_info input types to -i, fix problem with layout * replaced batch() with getBatchSize() in main * fix layout, shape, tensor shape parameters parsing * upd help messages for input, tensor shape and pcseq command * added buffer for cl output blobs, small fixes Signed-off-by: ivikhrev <ivan.vikhrev@intel.com> * added legacy mode * restore setBlob * code style formatting * move collecting latency for groups under flag * removed not applicable layouts * added hint to error message when wrong input name in -tensor_shape was specified * added new metrics to statistics report * Apply suggestions from code review * fix binary blobs filling when layout is CN * apply suggestions * moved file in the right place after rebase * improved -pcseq output * updated args and readme * removed TEMPLATE plugin registration * fix -shape arg decsription * enable providing several -i args as input * renamed legacy_mode to inference_only and made it default for static models, renamed tensor_shape to data_shape * upd readme * use getBlob() in inference only mode * fix old input type for static case * fix typo * upd readme * move log about benchmark mode to the measuring perfomance step * added class for latency metrics * upd readme, fix typos, renamed funcs * fix warning and upd parsing to avoid error with : in file paths * fix error on centos : error: use of deleted function ‘std::basic_stringstream<char>::basic_stringstream(const std::basic_stringstream<char>&) * added check for key in inputs * renamed input to inputs * adjust batch size for binary blobs * replaced warning with exception in bench mode defining * align measurement cycle with master Co-authored-by: ivikhrev <ivan.vikhrev@intel.com>
2021-12-17 12:20:43 +03:00
if (inferenceOnly) {
slog::info << "BENCHMARK IS IN INFERENCE ONLY MODE." << slog::endl;
slog::info << "Input blobs will be filled once before performance measurements." << slog::endl;
} else {
slog::info << "BENCHMARK IS IN FULL MODE." << slog::endl;
slog::info << "Inputs setup stage will be included in performance measurements." << slog::endl;
}
// copy prepared data straight into inferRequest->getTensor()
Dynamic reshapes (#7788) * Merged and compiling * Fix for dynamic shape type * review fixes * renamed blob shape to tensor shape, small improvements * fix code style * added parsing of multiple shapes * store latency per group, add isIdleRequestAvailable() to Infer Queue * added cached random inputs * redesign pipeline, added new metrics(avg, max, min), added metrics per groups * fixed code style * small improvements * modified tensor parameters parsing * modified -i parameter parsing: added possibility to specify input names * implemented image cashing * added cashed blobs creating * added -pcseq flag, modified batch filling, changes fps formula * improvements * code formatting * code formatting2 * apply suggestions from review * replaced Buffer class with InferenceEngine Blobs * use batch size in blobs filling * added shared blob allocator to handle blob's data * fixed warnings & code style * allocate blobs * fix for networks with image info input * added comments & fixed codestyle * clear data in free() in SharedBlobAllocator * remove unnecessary check * Delimeter is changed to :: * stylefix * added layout from string function, small improvements * modified parsing to enable : in input parameters * small fixes * small fixes * added missed blob allocation, fixes * [TEST]added support for remote blobs * fix remote blobs * new inputs/files output format * removed vectors resize which caused bugs * made cl::Buffer type under ifdef, fix inputs filling * changed batch() function to not throwing exceptions * removed unused var * fix code style * replace empty name in input files with name from net input * restored old behaviour for static models * fix code style * fix warning - made const iterator * fix warning - remove reference in loop variable * added random and image_info input types to -i, fix problem with layout * replaced batch() with getBatchSize() in main * fix layout, shape, tensor shape parameters parsing * upd help messages for input, tensor shape and pcseq command * added buffer for cl output blobs, small fixes Signed-off-by: ivikhrev <ivan.vikhrev@intel.com> * added legacy mode * restore setBlob * code style formatting * move collecting latency for groups under flag * removed not applicable layouts * added hint to error message when wrong input name in -tensor_shape was specified * added new metrics to statistics report * Apply suggestions from code review * fix binary blobs filling when layout is CN * apply suggestions * moved file in the right place after rebase * improved -pcseq output * updated args and readme * removed TEMPLATE plugin registration * fix -shape arg decsription * enable providing several -i args as input * renamed legacy_mode to inference_only and made it default for static models, renamed tensor_shape to data_shape * upd readme * use getBlob() in inference only mode * fix old input type for static case * fix typo * upd readme * move log about benchmark mode to the measuring perfomance step * added class for latency metrics * upd readme, fix typos, renamed funcs * fix warning and upd parsing to avoid error with : in file paths * fix error on centos : error: use of deleted function ‘std::basic_stringstream<char>::basic_stringstream(const std::basic_stringstream<char>&) * added check for key in inputs * renamed input to inputs * adjust batch size for binary blobs * replaced warning with exception in bench mode defining * align measurement cycle with master Co-authored-by: ivikhrev <ivan.vikhrev@intel.com>
2021-12-17 12:20:43 +03:00
// for inference only mode
if (inferenceOnly) {
if (nireq < inputsData.begin()->second.size())
slog::warn << "Only " << nireq << " test configs will be used." << slog::endl;
size_t i = 0;
for (auto& inferRequest : inferRequestsQueue.requests) {
auto inputs = app_inputs_info[i % app_inputs_info.size()];
for (auto& item : inputs) {
auto inputName = item.first;
const auto& inputTensor = inputsData.at(inputName)[i % inputsData.at(inputName).size()];
// for remote blobs setTensor is used, they are already allocated on the device
Dynamic reshapes (#7788) * Merged and compiling * Fix for dynamic shape type * review fixes * renamed blob shape to tensor shape, small improvements * fix code style * added parsing of multiple shapes * store latency per group, add isIdleRequestAvailable() to Infer Queue * added cached random inputs * redesign pipeline, added new metrics(avg, max, min), added metrics per groups * fixed code style * small improvements * modified tensor parameters parsing * modified -i parameter parsing: added possibility to specify input names * implemented image cashing * added cashed blobs creating * added -pcseq flag, modified batch filling, changes fps formula * improvements * code formatting * code formatting2 * apply suggestions from review * replaced Buffer class with InferenceEngine Blobs * use batch size in blobs filling * added shared blob allocator to handle blob's data * fixed warnings & code style * allocate blobs * fix for networks with image info input * added comments & fixed codestyle * clear data in free() in SharedBlobAllocator * remove unnecessary check * Delimeter is changed to :: * stylefix * added layout from string function, small improvements * modified parsing to enable : in input parameters * small fixes * small fixes * added missed blob allocation, fixes * [TEST]added support for remote blobs * fix remote blobs * new inputs/files output format * removed vectors resize which caused bugs * made cl::Buffer type under ifdef, fix inputs filling * changed batch() function to not throwing exceptions * removed unused var * fix code style * replace empty name in input files with name from net input * restored old behaviour for static models * fix code style * fix warning - made const iterator * fix warning - remove reference in loop variable * added random and image_info input types to -i, fix problem with layout * replaced batch() with getBatchSize() in main * fix layout, shape, tensor shape parameters parsing * upd help messages for input, tensor shape and pcseq command * added buffer for cl output blobs, small fixes Signed-off-by: ivikhrev <ivan.vikhrev@intel.com> * added legacy mode * restore setBlob * code style formatting * move collecting latency for groups under flag * removed not applicable layouts * added hint to error message when wrong input name in -tensor_shape was specified * added new metrics to statistics report * Apply suggestions from code review * fix binary blobs filling when layout is CN * apply suggestions * moved file in the right place after rebase * improved -pcseq output * updated args and readme * removed TEMPLATE plugin registration * fix -shape arg decsription * enable providing several -i args as input * renamed legacy_mode to inference_only and made it default for static models, renamed tensor_shape to data_shape * upd readme * use getBlob() in inference only mode * fix old input type for static case * fix typo * upd readme * move log about benchmark mode to the measuring perfomance step * added class for latency metrics * upd readme, fix typos, renamed funcs * fix warning and upd parsing to avoid error with : in file paths * fix error on centos : error: use of deleted function ‘std::basic_stringstream<char>::basic_stringstream(const std::basic_stringstream<char>&) * added check for key in inputs * renamed input to inputs * adjust batch size for binary blobs * replaced warning with exception in bench mode defining * align measurement cycle with master Co-authored-by: ivikhrev <ivan.vikhrev@intel.com>
2021-12-17 12:20:43 +03:00
if (useGpuMem) {
inferRequest->set_tensor(inputName, inputTensor);
Dynamic reshapes (#7788) * Merged and compiling * Fix for dynamic shape type * review fixes * renamed blob shape to tensor shape, small improvements * fix code style * added parsing of multiple shapes * store latency per group, add isIdleRequestAvailable() to Infer Queue * added cached random inputs * redesign pipeline, added new metrics(avg, max, min), added metrics per groups * fixed code style * small improvements * modified tensor parameters parsing * modified -i parameter parsing: added possibility to specify input names * implemented image cashing * added cashed blobs creating * added -pcseq flag, modified batch filling, changes fps formula * improvements * code formatting * code formatting2 * apply suggestions from review * replaced Buffer class with InferenceEngine Blobs * use batch size in blobs filling * added shared blob allocator to handle blob's data * fixed warnings & code style * allocate blobs * fix for networks with image info input * added comments & fixed codestyle * clear data in free() in SharedBlobAllocator * remove unnecessary check * Delimeter is changed to :: * stylefix * added layout from string function, small improvements * modified parsing to enable : in input parameters * small fixes * small fixes * added missed blob allocation, fixes * [TEST]added support for remote blobs * fix remote blobs * new inputs/files output format * removed vectors resize which caused bugs * made cl::Buffer type under ifdef, fix inputs filling * changed batch() function to not throwing exceptions * removed unused var * fix code style * replace empty name in input files with name from net input * restored old behaviour for static models * fix code style * fix warning - made const iterator * fix warning - remove reference in loop variable * added random and image_info input types to -i, fix problem with layout * replaced batch() with getBatchSize() in main * fix layout, shape, tensor shape parameters parsing * upd help messages for input, tensor shape and pcseq command * added buffer for cl output blobs, small fixes Signed-off-by: ivikhrev <ivan.vikhrev@intel.com> * added legacy mode * restore setBlob * code style formatting * move collecting latency for groups under flag * removed not applicable layouts * added hint to error message when wrong input name in -tensor_shape was specified * added new metrics to statistics report * Apply suggestions from code review * fix binary blobs filling when layout is CN * apply suggestions * moved file in the right place after rebase * improved -pcseq output * updated args and readme * removed TEMPLATE plugin registration * fix -shape arg decsription * enable providing several -i args as input * renamed legacy_mode to inference_only and made it default for static models, renamed tensor_shape to data_shape * upd readme * use getBlob() in inference only mode * fix old input type for static case * fix typo * upd readme * move log about benchmark mode to the measuring perfomance step * added class for latency metrics * upd readme, fix typos, renamed funcs * fix warning and upd parsing to avoid error with : in file paths * fix error on centos : error: use of deleted function ‘std::basic_stringstream<char>::basic_stringstream(const std::basic_stringstream<char>&) * added check for key in inputs * renamed input to inputs * adjust batch size for binary blobs * replaced warning with exception in bench mode defining * align measurement cycle with master Co-authored-by: ivikhrev <ivan.vikhrev@intel.com>
2021-12-17 12:20:43 +03:00
} else {
auto requestTensor = inferRequest->get_tensor(inputName);
Dynamic reshapes (#7788) * Merged and compiling * Fix for dynamic shape type * review fixes * renamed blob shape to tensor shape, small improvements * fix code style * added parsing of multiple shapes * store latency per group, add isIdleRequestAvailable() to Infer Queue * added cached random inputs * redesign pipeline, added new metrics(avg, max, min), added metrics per groups * fixed code style * small improvements * modified tensor parameters parsing * modified -i parameter parsing: added possibility to specify input names * implemented image cashing * added cashed blobs creating * added -pcseq flag, modified batch filling, changes fps formula * improvements * code formatting * code formatting2 * apply suggestions from review * replaced Buffer class with InferenceEngine Blobs * use batch size in blobs filling * added shared blob allocator to handle blob's data * fixed warnings & code style * allocate blobs * fix for networks with image info input * added comments & fixed codestyle * clear data in free() in SharedBlobAllocator * remove unnecessary check * Delimeter is changed to :: * stylefix * added layout from string function, small improvements * modified parsing to enable : in input parameters * small fixes * small fixes * added missed blob allocation, fixes * [TEST]added support for remote blobs * fix remote blobs * new inputs/files output format * removed vectors resize which caused bugs * made cl::Buffer type under ifdef, fix inputs filling * changed batch() function to not throwing exceptions * removed unused var * fix code style * replace empty name in input files with name from net input * restored old behaviour for static models * fix code style * fix warning - made const iterator * fix warning - remove reference in loop variable * added random and image_info input types to -i, fix problem with layout * replaced batch() with getBatchSize() in main * fix layout, shape, tensor shape parameters parsing * upd help messages for input, tensor shape and pcseq command * added buffer for cl output blobs, small fixes Signed-off-by: ivikhrev <ivan.vikhrev@intel.com> * added legacy mode * restore setBlob * code style formatting * move collecting latency for groups under flag * removed not applicable layouts * added hint to error message when wrong input name in -tensor_shape was specified * added new metrics to statistics report * Apply suggestions from code review * fix binary blobs filling when layout is CN * apply suggestions * moved file in the right place after rebase * improved -pcseq output * updated args and readme * removed TEMPLATE plugin registration * fix -shape arg decsription * enable providing several -i args as input * renamed legacy_mode to inference_only and made it default for static models, renamed tensor_shape to data_shape * upd readme * use getBlob() in inference only mode * fix old input type for static case * fix typo * upd readme * move log about benchmark mode to the measuring perfomance step * added class for latency metrics * upd readme, fix typos, renamed funcs * fix warning and upd parsing to avoid error with : in file paths * fix error on centos : error: use of deleted function ‘std::basic_stringstream<char>::basic_stringstream(const std::basic_stringstream<char>&) * added check for key in inputs * renamed input to inputs * adjust batch size for binary blobs * replaced warning with exception in bench mode defining * align measurement cycle with master Co-authored-by: ivikhrev <ivan.vikhrev@intel.com>
2021-12-17 12:20:43 +03:00
if (isDynamicNetwork) {
requestTensor.set_shape(inputTensor.get_shape());
Dynamic reshapes (#7788) * Merged and compiling * Fix for dynamic shape type * review fixes * renamed blob shape to tensor shape, small improvements * fix code style * added parsing of multiple shapes * store latency per group, add isIdleRequestAvailable() to Infer Queue * added cached random inputs * redesign pipeline, added new metrics(avg, max, min), added metrics per groups * fixed code style * small improvements * modified tensor parameters parsing * modified -i parameter parsing: added possibility to specify input names * implemented image cashing * added cashed blobs creating * added -pcseq flag, modified batch filling, changes fps formula * improvements * code formatting * code formatting2 * apply suggestions from review * replaced Buffer class with InferenceEngine Blobs * use batch size in blobs filling * added shared blob allocator to handle blob's data * fixed warnings & code style * allocate blobs * fix for networks with image info input * added comments & fixed codestyle * clear data in free() in SharedBlobAllocator * remove unnecessary check * Delimeter is changed to :: * stylefix * added layout from string function, small improvements * modified parsing to enable : in input parameters * small fixes * small fixes * added missed blob allocation, fixes * [TEST]added support for remote blobs * fix remote blobs * new inputs/files output format * removed vectors resize which caused bugs * made cl::Buffer type under ifdef, fix inputs filling * changed batch() function to not throwing exceptions * removed unused var * fix code style * replace empty name in input files with name from net input * restored old behaviour for static models * fix code style * fix warning - made const iterator * fix warning - remove reference in loop variable * added random and image_info input types to -i, fix problem with layout * replaced batch() with getBatchSize() in main * fix layout, shape, tensor shape parameters parsing * upd help messages for input, tensor shape and pcseq command * added buffer for cl output blobs, small fixes Signed-off-by: ivikhrev <ivan.vikhrev@intel.com> * added legacy mode * restore setBlob * code style formatting * move collecting latency for groups under flag * removed not applicable layouts * added hint to error message when wrong input name in -tensor_shape was specified * added new metrics to statistics report * Apply suggestions from code review * fix binary blobs filling when layout is CN * apply suggestions * moved file in the right place after rebase * improved -pcseq output * updated args and readme * removed TEMPLATE plugin registration * fix -shape arg decsription * enable providing several -i args as input * renamed legacy_mode to inference_only and made it default for static models, renamed tensor_shape to data_shape * upd readme * use getBlob() in inference only mode * fix old input type for static case * fix typo * upd readme * move log about benchmark mode to the measuring perfomance step * added class for latency metrics * upd readme, fix typos, renamed funcs * fix warning and upd parsing to avoid error with : in file paths * fix error on centos : error: use of deleted function ‘std::basic_stringstream<char>::basic_stringstream(const std::basic_stringstream<char>&) * added check for key in inputs * renamed input to inputs * adjust batch size for binary blobs * replaced warning with exception in bench mode defining * align measurement cycle with master Co-authored-by: ivikhrev <ivan.vikhrev@intel.com>
2021-12-17 12:20:43 +03:00
}
copy_tensor_data(requestTensor, inputTensor);
Dynamic reshapes (#7788) * Merged and compiling * Fix for dynamic shape type * review fixes * renamed blob shape to tensor shape, small improvements * fix code style * added parsing of multiple shapes * store latency per group, add isIdleRequestAvailable() to Infer Queue * added cached random inputs * redesign pipeline, added new metrics(avg, max, min), added metrics per groups * fixed code style * small improvements * modified tensor parameters parsing * modified -i parameter parsing: added possibility to specify input names * implemented image cashing * added cashed blobs creating * added -pcseq flag, modified batch filling, changes fps formula * improvements * code formatting * code formatting2 * apply suggestions from review * replaced Buffer class with InferenceEngine Blobs * use batch size in blobs filling * added shared blob allocator to handle blob's data * fixed warnings & code style * allocate blobs * fix for networks with image info input * added comments & fixed codestyle * clear data in free() in SharedBlobAllocator * remove unnecessary check * Delimeter is changed to :: * stylefix * added layout from string function, small improvements * modified parsing to enable : in input parameters * small fixes * small fixes * added missed blob allocation, fixes * [TEST]added support for remote blobs * fix remote blobs * new inputs/files output format * removed vectors resize which caused bugs * made cl::Buffer type under ifdef, fix inputs filling * changed batch() function to not throwing exceptions * removed unused var * fix code style * replace empty name in input files with name from net input * restored old behaviour for static models * fix code style * fix warning - made const iterator * fix warning - remove reference in loop variable * added random and image_info input types to -i, fix problem with layout * replaced batch() with getBatchSize() in main * fix layout, shape, tensor shape parameters parsing * upd help messages for input, tensor shape and pcseq command * added buffer for cl output blobs, small fixes Signed-off-by: ivikhrev <ivan.vikhrev@intel.com> * added legacy mode * restore setBlob * code style formatting * move collecting latency for groups under flag * removed not applicable layouts * added hint to error message when wrong input name in -tensor_shape was specified * added new metrics to statistics report * Apply suggestions from code review * fix binary blobs filling when layout is CN * apply suggestions * moved file in the right place after rebase * improved -pcseq output * updated args and readme * removed TEMPLATE plugin registration * fix -shape arg decsription * enable providing several -i args as input * renamed legacy_mode to inference_only and made it default for static models, renamed tensor_shape to data_shape * upd readme * use getBlob() in inference only mode * fix old input type for static case * fix typo * upd readme * move log about benchmark mode to the measuring perfomance step * added class for latency metrics * upd readme, fix typos, renamed funcs * fix warning and upd parsing to avoid error with : in file paths * fix error on centos : error: use of deleted function ‘std::basic_stringstream<char>::basic_stringstream(const std::basic_stringstream<char>&) * added check for key in inputs * renamed input to inputs * adjust batch size for binary blobs * replaced warning with exception in bench mode defining * align measurement cycle with master Co-authored-by: ivikhrev <ivan.vikhrev@intel.com>
2021-12-17 12:20:43 +03:00
}
}
if (useGpuMem) {
auto outputTensors =
::gpu::get_remote_output_tensors(compiledModel, inferRequest->get_output_cl_buffer());
for (auto& output : compiledModel.outputs()) {
inferRequest->set_tensor(output.get_any_name(), outputTensors[output.get_any_name()]);
Dynamic reshapes (#7788) * Merged and compiling * Fix for dynamic shape type * review fixes * renamed blob shape to tensor shape, small improvements * fix code style * added parsing of multiple shapes * store latency per group, add isIdleRequestAvailable() to Infer Queue * added cached random inputs * redesign pipeline, added new metrics(avg, max, min), added metrics per groups * fixed code style * small improvements * modified tensor parameters parsing * modified -i parameter parsing: added possibility to specify input names * implemented image cashing * added cashed blobs creating * added -pcseq flag, modified batch filling, changes fps formula * improvements * code formatting * code formatting2 * apply suggestions from review * replaced Buffer class with InferenceEngine Blobs * use batch size in blobs filling * added shared blob allocator to handle blob's data * fixed warnings & code style * allocate blobs * fix for networks with image info input * added comments & fixed codestyle * clear data in free() in SharedBlobAllocator * remove unnecessary check * Delimeter is changed to :: * stylefix * added layout from string function, small improvements * modified parsing to enable : in input parameters * small fixes * small fixes * added missed blob allocation, fixes * [TEST]added support for remote blobs * fix remote blobs * new inputs/files output format * removed vectors resize which caused bugs * made cl::Buffer type under ifdef, fix inputs filling * changed batch() function to not throwing exceptions * removed unused var * fix code style * replace empty name in input files with name from net input * restored old behaviour for static models * fix code style * fix warning - made const iterator * fix warning - remove reference in loop variable * added random and image_info input types to -i, fix problem with layout * replaced batch() with getBatchSize() in main * fix layout, shape, tensor shape parameters parsing * upd help messages for input, tensor shape and pcseq command * added buffer for cl output blobs, small fixes Signed-off-by: ivikhrev <ivan.vikhrev@intel.com> * added legacy mode * restore setBlob * code style formatting * move collecting latency for groups under flag * removed not applicable layouts * added hint to error message when wrong input name in -tensor_shape was specified * added new metrics to statistics report * Apply suggestions from code review * fix binary blobs filling when layout is CN * apply suggestions * moved file in the right place after rebase * improved -pcseq output * updated args and readme * removed TEMPLATE plugin registration * fix -shape arg decsription * enable providing several -i args as input * renamed legacy_mode to inference_only and made it default for static models, renamed tensor_shape to data_shape * upd readme * use getBlob() in inference only mode * fix old input type for static case * fix typo * upd readme * move log about benchmark mode to the measuring perfomance step * added class for latency metrics * upd readme, fix typos, renamed funcs * fix warning and upd parsing to avoid error with : in file paths * fix error on centos : error: use of deleted function ‘std::basic_stringstream<char>::basic_stringstream(const std::basic_stringstream<char>&) * added check for key in inputs * renamed input to inputs * adjust batch size for binary blobs * replaced warning with exception in bench mode defining * align measurement cycle with master Co-authored-by: ivikhrev <ivan.vikhrev@intel.com>
2021-12-17 12:20:43 +03:00
}
}
++i;
}
}
2019-08-09 19:02:42 +03:00
// warming up - out of scope
auto inferRequest = inferRequestsQueue.get_idle_request();
2019-08-09 19:02:42 +03:00
if (!inferRequest) {
throw ov::Exception("No idle Infer Requests!");
2019-08-09 19:02:42 +03:00
}
Dynamic reshapes (#7788) * Merged and compiling * Fix for dynamic shape type * review fixes * renamed blob shape to tensor shape, small improvements * fix code style * added parsing of multiple shapes * store latency per group, add isIdleRequestAvailable() to Infer Queue * added cached random inputs * redesign pipeline, added new metrics(avg, max, min), added metrics per groups * fixed code style * small improvements * modified tensor parameters parsing * modified -i parameter parsing: added possibility to specify input names * implemented image cashing * added cashed blobs creating * added -pcseq flag, modified batch filling, changes fps formula * improvements * code formatting * code formatting2 * apply suggestions from review * replaced Buffer class with InferenceEngine Blobs * use batch size in blobs filling * added shared blob allocator to handle blob's data * fixed warnings & code style * allocate blobs * fix for networks with image info input * added comments & fixed codestyle * clear data in free() in SharedBlobAllocator * remove unnecessary check * Delimeter is changed to :: * stylefix * added layout from string function, small improvements * modified parsing to enable : in input parameters * small fixes * small fixes * added missed blob allocation, fixes * [TEST]added support for remote blobs * fix remote blobs * new inputs/files output format * removed vectors resize which caused bugs * made cl::Buffer type under ifdef, fix inputs filling * changed batch() function to not throwing exceptions * removed unused var * fix code style * replace empty name in input files with name from net input * restored old behaviour for static models * fix code style * fix warning - made const iterator * fix warning - remove reference in loop variable * added random and image_info input types to -i, fix problem with layout * replaced batch() with getBatchSize() in main * fix layout, shape, tensor shape parameters parsing * upd help messages for input, tensor shape and pcseq command * added buffer for cl output blobs, small fixes Signed-off-by: ivikhrev <ivan.vikhrev@intel.com> * added legacy mode * restore setBlob * code style formatting * move collecting latency for groups under flag * removed not applicable layouts * added hint to error message when wrong input name in -tensor_shape was specified * added new metrics to statistics report * Apply suggestions from code review * fix binary blobs filling when layout is CN * apply suggestions * moved file in the right place after rebase * improved -pcseq output * updated args and readme * removed TEMPLATE plugin registration * fix -shape arg decsription * enable providing several -i args as input * renamed legacy_mode to inference_only and made it default for static models, renamed tensor_shape to data_shape * upd readme * use getBlob() in inference only mode * fix old input type for static case * fix typo * upd readme * move log about benchmark mode to the measuring perfomance step * added class for latency metrics * upd readme, fix typos, renamed funcs * fix warning and upd parsing to avoid error with : in file paths * fix error on centos : error: use of deleted function ‘std::basic_stringstream<char>::basic_stringstream(const std::basic_stringstream<char>&) * added check for key in inputs * renamed input to inputs * adjust batch size for binary blobs * replaced warning with exception in bench mode defining * align measurement cycle with master Co-authored-by: ivikhrev <ivan.vikhrev@intel.com>
2021-12-17 12:20:43 +03:00
if (!inferenceOnly) {
auto inputs = app_inputs_info[0];
for (auto& item : inputs) {
auto inputName = item.first;
const auto& data = inputsData.at(inputName)[0];
inferRequest->set_tensor(inputName, data);
Dynamic reshapes (#7788) * Merged and compiling * Fix for dynamic shape type * review fixes * renamed blob shape to tensor shape, small improvements * fix code style * added parsing of multiple shapes * store latency per group, add isIdleRequestAvailable() to Infer Queue * added cached random inputs * redesign pipeline, added new metrics(avg, max, min), added metrics per groups * fixed code style * small improvements * modified tensor parameters parsing * modified -i parameter parsing: added possibility to specify input names * implemented image cashing * added cashed blobs creating * added -pcseq flag, modified batch filling, changes fps formula * improvements * code formatting * code formatting2 * apply suggestions from review * replaced Buffer class with InferenceEngine Blobs * use batch size in blobs filling * added shared blob allocator to handle blob's data * fixed warnings & code style * allocate blobs * fix for networks with image info input * added comments & fixed codestyle * clear data in free() in SharedBlobAllocator * remove unnecessary check * Delimeter is changed to :: * stylefix * added layout from string function, small improvements * modified parsing to enable : in input parameters * small fixes * small fixes * added missed blob allocation, fixes * [TEST]added support for remote blobs * fix remote blobs * new inputs/files output format * removed vectors resize which caused bugs * made cl::Buffer type under ifdef, fix inputs filling * changed batch() function to not throwing exceptions * removed unused var * fix code style * replace empty name in input files with name from net input * restored old behaviour for static models * fix code style * fix warning - made const iterator * fix warning - remove reference in loop variable * added random and image_info input types to -i, fix problem with layout * replaced batch() with getBatchSize() in main * fix layout, shape, tensor shape parameters parsing * upd help messages for input, tensor shape and pcseq command * added buffer for cl output blobs, small fixes Signed-off-by: ivikhrev <ivan.vikhrev@intel.com> * added legacy mode * restore setBlob * code style formatting * move collecting latency for groups under flag * removed not applicable layouts * added hint to error message when wrong input name in -tensor_shape was specified * added new metrics to statistics report * Apply suggestions from code review * fix binary blobs filling when layout is CN * apply suggestions * moved file in the right place after rebase * improved -pcseq output * updated args and readme * removed TEMPLATE plugin registration * fix -shape arg decsription * enable providing several -i args as input * renamed legacy_mode to inference_only and made it default for static models, renamed tensor_shape to data_shape * upd readme * use getBlob() in inference only mode * fix old input type for static case * fix typo * upd readme * move log about benchmark mode to the measuring perfomance step * added class for latency metrics * upd readme, fix typos, renamed funcs * fix warning and upd parsing to avoid error with : in file paths * fix error on centos : error: use of deleted function ‘std::basic_stringstream<char>::basic_stringstream(const std::basic_stringstream<char>&) * added check for key in inputs * renamed input to inputs * adjust batch size for binary blobs * replaced warning with exception in bench mode defining * align measurement cycle with master Co-authored-by: ivikhrev <ivan.vikhrev@intel.com>
2021-12-17 12:20:43 +03:00
}
if (useGpuMem) {
auto outputTensors =
::gpu::get_remote_output_tensors(compiledModel, inferRequest->get_output_cl_buffer());
for (auto& output : compiledModel.outputs()) {
inferRequest->set_tensor(output.get_any_name(), outputTensors[output.get_any_name()]);
Dynamic reshapes (#7788) * Merged and compiling * Fix for dynamic shape type * review fixes * renamed blob shape to tensor shape, small improvements * fix code style * added parsing of multiple shapes * store latency per group, add isIdleRequestAvailable() to Infer Queue * added cached random inputs * redesign pipeline, added new metrics(avg, max, min), added metrics per groups * fixed code style * small improvements * modified tensor parameters parsing * modified -i parameter parsing: added possibility to specify input names * implemented image cashing * added cashed blobs creating * added -pcseq flag, modified batch filling, changes fps formula * improvements * code formatting * code formatting2 * apply suggestions from review * replaced Buffer class with InferenceEngine Blobs * use batch size in blobs filling * added shared blob allocator to handle blob's data * fixed warnings & code style * allocate blobs * fix for networks with image info input * added comments & fixed codestyle * clear data in free() in SharedBlobAllocator * remove unnecessary check * Delimeter is changed to :: * stylefix * added layout from string function, small improvements * modified parsing to enable : in input parameters * small fixes * small fixes * added missed blob allocation, fixes * [TEST]added support for remote blobs * fix remote blobs * new inputs/files output format * removed vectors resize which caused bugs * made cl::Buffer type under ifdef, fix inputs filling * changed batch() function to not throwing exceptions * removed unused var * fix code style * replace empty name in input files with name from net input * restored old behaviour for static models * fix code style * fix warning - made const iterator * fix warning - remove reference in loop variable * added random and image_info input types to -i, fix problem with layout * replaced batch() with getBatchSize() in main * fix layout, shape, tensor shape parameters parsing * upd help messages for input, tensor shape and pcseq command * added buffer for cl output blobs, small fixes Signed-off-by: ivikhrev <ivan.vikhrev@intel.com> * added legacy mode * restore setBlob * code style formatting * move collecting latency for groups under flag * removed not applicable layouts * added hint to error message when wrong input name in -tensor_shape was specified * added new metrics to statistics report * Apply suggestions from code review * fix binary blobs filling when layout is CN * apply suggestions * moved file in the right place after rebase * improved -pcseq output * updated args and readme * removed TEMPLATE plugin registration * fix -shape arg decsription * enable providing several -i args as input * renamed legacy_mode to inference_only and made it default for static models, renamed tensor_shape to data_shape * upd readme * use getBlob() in inference only mode * fix old input type for static case * fix typo * upd readme * move log about benchmark mode to the measuring perfomance step * added class for latency metrics * upd readme, fix typos, renamed funcs * fix warning and upd parsing to avoid error with : in file paths * fix error on centos : error: use of deleted function ‘std::basic_stringstream<char>::basic_stringstream(const std::basic_stringstream<char>&) * added check for key in inputs * renamed input to inputs * adjust batch size for binary blobs * replaced warning with exception in bench mode defining * align measurement cycle with master Co-authored-by: ivikhrev <ivan.vikhrev@intel.com>
2021-12-17 12:20:43 +03:00
}
}
}
2019-04-12 18:25:53 +03:00
if (FLAGS_api == "sync") {
inferRequest->infer();
2019-08-09 19:02:42 +03:00
} else {
inferRequest->start_async();
2019-08-09 19:02:42 +03:00
}
Dynamic reshapes (#7788) * Merged and compiling * Fix for dynamic shape type * review fixes * renamed blob shape to tensor shape, small improvements * fix code style * added parsing of multiple shapes * store latency per group, add isIdleRequestAvailable() to Infer Queue * added cached random inputs * redesign pipeline, added new metrics(avg, max, min), added metrics per groups * fixed code style * small improvements * modified tensor parameters parsing * modified -i parameter parsing: added possibility to specify input names * implemented image cashing * added cashed blobs creating * added -pcseq flag, modified batch filling, changes fps formula * improvements * code formatting * code formatting2 * apply suggestions from review * replaced Buffer class with InferenceEngine Blobs * use batch size in blobs filling * added shared blob allocator to handle blob's data * fixed warnings & code style * allocate blobs * fix for networks with image info input * added comments & fixed codestyle * clear data in free() in SharedBlobAllocator * remove unnecessary check * Delimeter is changed to :: * stylefix * added layout from string function, small improvements * modified parsing to enable : in input parameters * small fixes * small fixes * added missed blob allocation, fixes * [TEST]added support for remote blobs * fix remote blobs * new inputs/files output format * removed vectors resize which caused bugs * made cl::Buffer type under ifdef, fix inputs filling * changed batch() function to not throwing exceptions * removed unused var * fix code style * replace empty name in input files with name from net input * restored old behaviour for static models * fix code style * fix warning - made const iterator * fix warning - remove reference in loop variable * added random and image_info input types to -i, fix problem with layout * replaced batch() with getBatchSize() in main * fix layout, shape, tensor shape parameters parsing * upd help messages for input, tensor shape and pcseq command * added buffer for cl output blobs, small fixes Signed-off-by: ivikhrev <ivan.vikhrev@intel.com> * added legacy mode * restore setBlob * code style formatting * move collecting latency for groups under flag * removed not applicable layouts * added hint to error message when wrong input name in -tensor_shape was specified * added new metrics to statistics report * Apply suggestions from code review * fix binary blobs filling when layout is CN * apply suggestions * moved file in the right place after rebase * improved -pcseq output * updated args and readme * removed TEMPLATE plugin registration * fix -shape arg decsription * enable providing several -i args as input * renamed legacy_mode to inference_only and made it default for static models, renamed tensor_shape to data_shape * upd readme * use getBlob() in inference only mode * fix old input type for static case * fix typo * upd readme * move log about benchmark mode to the measuring perfomance step * added class for latency metrics * upd readme, fix typos, renamed funcs * fix warning and upd parsing to avoid error with : in file paths * fix error on centos : error: use of deleted function ‘std::basic_stringstream<char>::basic_stringstream(const std::basic_stringstream<char>&) * added check for key in inputs * renamed input to inputs * adjust batch size for binary blobs * replaced warning with exception in bench mode defining * align measurement cycle with master Co-authored-by: ivikhrev <ivan.vikhrev@intel.com>
2021-12-17 12:20:43 +03:00
inferRequestsQueue.wait_all();
Dynamic reshapes (#7788) * Merged and compiling * Fix for dynamic shape type * review fixes * renamed blob shape to tensor shape, small improvements * fix code style * added parsing of multiple shapes * store latency per group, add isIdleRequestAvailable() to Infer Queue * added cached random inputs * redesign pipeline, added new metrics(avg, max, min), added metrics per groups * fixed code style * small improvements * modified tensor parameters parsing * modified -i parameter parsing: added possibility to specify input names * implemented image cashing * added cashed blobs creating * added -pcseq flag, modified batch filling, changes fps formula * improvements * code formatting * code formatting2 * apply suggestions from review * replaced Buffer class with InferenceEngine Blobs * use batch size in blobs filling * added shared blob allocator to handle blob's data * fixed warnings & code style * allocate blobs * fix for networks with image info input * added comments & fixed codestyle * clear data in free() in SharedBlobAllocator * remove unnecessary check * Delimeter is changed to :: * stylefix * added layout from string function, small improvements * modified parsing to enable : in input parameters * small fixes * small fixes * added missed blob allocation, fixes * [TEST]added support for remote blobs * fix remote blobs * new inputs/files output format * removed vectors resize which caused bugs * made cl::Buffer type under ifdef, fix inputs filling * changed batch() function to not throwing exceptions * removed unused var * fix code style * replace empty name in input files with name from net input * restored old behaviour for static models * fix code style * fix warning - made const iterator * fix warning - remove reference in loop variable * added random and image_info input types to -i, fix problem with layout * replaced batch() with getBatchSize() in main * fix layout, shape, tensor shape parameters parsing * upd help messages for input, tensor shape and pcseq command * added buffer for cl output blobs, small fixes Signed-off-by: ivikhrev <ivan.vikhrev@intel.com> * added legacy mode * restore setBlob * code style formatting * move collecting latency for groups under flag * removed not applicable layouts * added hint to error message when wrong input name in -tensor_shape was specified * added new metrics to statistics report * Apply suggestions from code review * fix binary blobs filling when layout is CN * apply suggestions * moved file in the right place after rebase * improved -pcseq output * updated args and readme * removed TEMPLATE plugin registration * fix -shape arg decsription * enable providing several -i args as input * renamed legacy_mode to inference_only and made it default for static models, renamed tensor_shape to data_shape * upd readme * use getBlob() in inference only mode * fix old input type for static case * fix typo * upd readme * move log about benchmark mode to the measuring perfomance step * added class for latency metrics * upd readme, fix typos, renamed funcs * fix warning and upd parsing to avoid error with : in file paths * fix error on centos : error: use of deleted function ‘std::basic_stringstream<char>::basic_stringstream(const std::basic_stringstream<char>&) * added check for key in inputs * renamed input to inputs * adjust batch size for binary blobs * replaced warning with exception in bench mode defining * align measurement cycle with master Co-authored-by: ivikhrev <ivan.vikhrev@intel.com>
2021-12-17 12:20:43 +03:00
auto duration_ms = inferRequestsQueue.get_latencies()[0];
slog::info << "First inference took " << double_to_string(duration_ms) << " ms" << slog::endl;
Dynamic reshapes (#7788) * Merged and compiling * Fix for dynamic shape type * review fixes * renamed blob shape to tensor shape, small improvements * fix code style * added parsing of multiple shapes * store latency per group, add isIdleRequestAvailable() to Infer Queue * added cached random inputs * redesign pipeline, added new metrics(avg, max, min), added metrics per groups * fixed code style * small improvements * modified tensor parameters parsing * modified -i parameter parsing: added possibility to specify input names * implemented image cashing * added cashed blobs creating * added -pcseq flag, modified batch filling, changes fps formula * improvements * code formatting * code formatting2 * apply suggestions from review * replaced Buffer class with InferenceEngine Blobs * use batch size in blobs filling * added shared blob allocator to handle blob's data * fixed warnings & code style * allocate blobs * fix for networks with image info input * added comments & fixed codestyle * clear data in free() in SharedBlobAllocator * remove unnecessary check * Delimeter is changed to :: * stylefix * added layout from string function, small improvements * modified parsing to enable : in input parameters * small fixes * small fixes * added missed blob allocation, fixes * [TEST]added support for remote blobs * fix remote blobs * new inputs/files output format * removed vectors resize which caused bugs * made cl::Buffer type under ifdef, fix inputs filling * changed batch() function to not throwing exceptions * removed unused var * fix code style * replace empty name in input files with name from net input * restored old behaviour for static models * fix code style * fix warning - made const iterator * fix warning - remove reference in loop variable * added random and image_info input types to -i, fix problem with layout * replaced batch() with getBatchSize() in main * fix layout, shape, tensor shape parameters parsing * upd help messages for input, tensor shape and pcseq command * added buffer for cl output blobs, small fixes Signed-off-by: ivikhrev <ivan.vikhrev@intel.com> * added legacy mode * restore setBlob * code style formatting * move collecting latency for groups under flag * removed not applicable layouts * added hint to error message when wrong input name in -tensor_shape was specified * added new metrics to statistics report * Apply suggestions from code review * fix binary blobs filling when layout is CN * apply suggestions * moved file in the right place after rebase * improved -pcseq output * updated args and readme * removed TEMPLATE plugin registration * fix -shape arg decsription * enable providing several -i args as input * renamed legacy_mode to inference_only and made it default for static models, renamed tensor_shape to data_shape * upd readme * use getBlob() in inference only mode * fix old input type for static case * fix typo * upd readme * move log about benchmark mode to the measuring perfomance step * added class for latency metrics * upd readme, fix typos, renamed funcs * fix warning and upd parsing to avoid error with : in file paths * fix error on centos : error: use of deleted function ‘std::basic_stringstream<char>::basic_stringstream(const std::basic_stringstream<char>&) * added check for key in inputs * renamed input to inputs * adjust batch size for binary blobs * replaced warning with exception in bench mode defining * align measurement cycle with master Co-authored-by: ivikhrev <ivan.vikhrev@intel.com>
2021-12-17 12:20:43 +03:00
if (statistics) {
statistics->add_parameters(
StatisticsReport::Category::EXECUTION_RESULTS,
{StatisticsVariant("first inference time (ms)", "first_inference_time", duration_ms)});
Dynamic reshapes (#7788) * Merged and compiling * Fix for dynamic shape type * review fixes * renamed blob shape to tensor shape, small improvements * fix code style * added parsing of multiple shapes * store latency per group, add isIdleRequestAvailable() to Infer Queue * added cached random inputs * redesign pipeline, added new metrics(avg, max, min), added metrics per groups * fixed code style * small improvements * modified tensor parameters parsing * modified -i parameter parsing: added possibility to specify input names * implemented image cashing * added cashed blobs creating * added -pcseq flag, modified batch filling, changes fps formula * improvements * code formatting * code formatting2 * apply suggestions from review * replaced Buffer class with InferenceEngine Blobs * use batch size in blobs filling * added shared blob allocator to handle blob's data * fixed warnings & code style * allocate blobs * fix for networks with image info input * added comments & fixed codestyle * clear data in free() in SharedBlobAllocator * remove unnecessary check * Delimeter is changed to :: * stylefix * added layout from string function, small improvements * modified parsing to enable : in input parameters * small fixes * small fixes * added missed blob allocation, fixes * [TEST]added support for remote blobs * fix remote blobs * new inputs/files output format * removed vectors resize which caused bugs * made cl::Buffer type under ifdef, fix inputs filling * changed batch() function to not throwing exceptions * removed unused var * fix code style * replace empty name in input files with name from net input * restored old behaviour for static models * fix code style * fix warning - made const iterator * fix warning - remove reference in loop variable * added random and image_info input types to -i, fix problem with layout * replaced batch() with getBatchSize() in main * fix layout, shape, tensor shape parameters parsing * upd help messages for input, tensor shape and pcseq command * added buffer for cl output blobs, small fixes Signed-off-by: ivikhrev <ivan.vikhrev@intel.com> * added legacy mode * restore setBlob * code style formatting * move collecting latency for groups under flag * removed not applicable layouts * added hint to error message when wrong input name in -tensor_shape was specified * added new metrics to statistics report * Apply suggestions from code review * fix binary blobs filling when layout is CN * apply suggestions * moved file in the right place after rebase * improved -pcseq output * updated args and readme * removed TEMPLATE plugin registration * fix -shape arg decsription * enable providing several -i args as input * renamed legacy_mode to inference_only and made it default for static models, renamed tensor_shape to data_shape * upd readme * use getBlob() in inference only mode * fix old input type for static case * fix typo * upd readme * move log about benchmark mode to the measuring perfomance step * added class for latency metrics * upd readme, fix typos, renamed funcs * fix warning and upd parsing to avoid error with : in file paths * fix error on centos : error: use of deleted function ‘std::basic_stringstream<char>::basic_stringstream(const std::basic_stringstream<char>&) * added check for key in inputs * renamed input to inputs * adjust batch size for binary blobs * replaced warning with exception in bench mode defining * align measurement cycle with master Co-authored-by: ivikhrev <ivan.vikhrev@intel.com>
2021-12-17 12:20:43 +03:00
}
inferRequestsQueue.reset_times();
2019-08-09 19:02:42 +03:00
Dynamic reshapes (#7788) * Merged and compiling * Fix for dynamic shape type * review fixes * renamed blob shape to tensor shape, small improvements * fix code style * added parsing of multiple shapes * store latency per group, add isIdleRequestAvailable() to Infer Queue * added cached random inputs * redesign pipeline, added new metrics(avg, max, min), added metrics per groups * fixed code style * small improvements * modified tensor parameters parsing * modified -i parameter parsing: added possibility to specify input names * implemented image cashing * added cashed blobs creating * added -pcseq flag, modified batch filling, changes fps formula * improvements * code formatting * code formatting2 * apply suggestions from review * replaced Buffer class with InferenceEngine Blobs * use batch size in blobs filling * added shared blob allocator to handle blob's data * fixed warnings & code style * allocate blobs * fix for networks with image info input * added comments & fixed codestyle * clear data in free() in SharedBlobAllocator * remove unnecessary check * Delimeter is changed to :: * stylefix * added layout from string function, small improvements * modified parsing to enable : in input parameters * small fixes * small fixes * added missed blob allocation, fixes * [TEST]added support for remote blobs * fix remote blobs * new inputs/files output format * removed vectors resize which caused bugs * made cl::Buffer type under ifdef, fix inputs filling * changed batch() function to not throwing exceptions * removed unused var * fix code style * replace empty name in input files with name from net input * restored old behaviour for static models * fix code style * fix warning - made const iterator * fix warning - remove reference in loop variable * added random and image_info input types to -i, fix problem with layout * replaced batch() with getBatchSize() in main * fix layout, shape, tensor shape parameters parsing * upd help messages for input, tensor shape and pcseq command * added buffer for cl output blobs, small fixes Signed-off-by: ivikhrev <ivan.vikhrev@intel.com> * added legacy mode * restore setBlob * code style formatting * move collecting latency for groups under flag * removed not applicable layouts * added hint to error message when wrong input name in -tensor_shape was specified * added new metrics to statistics report * Apply suggestions from code review * fix binary blobs filling when layout is CN * apply suggestions * moved file in the right place after rebase * improved -pcseq output * updated args and readme * removed TEMPLATE plugin registration * fix -shape arg decsription * enable providing several -i args as input * renamed legacy_mode to inference_only and made it default for static models, renamed tensor_shape to data_shape * upd readme * use getBlob() in inference only mode * fix old input type for static case * fix typo * upd readme * move log about benchmark mode to the measuring perfomance step * added class for latency metrics * upd readme, fix typos, renamed funcs * fix warning and upd parsing to avoid error with : in file paths * fix error on centos : error: use of deleted function ‘std::basic_stringstream<char>::basic_stringstream(const std::basic_stringstream<char>&) * added check for key in inputs * renamed input to inputs * adjust batch size for binary blobs * replaced warning with exception in bench mode defining * align measurement cycle with master Co-authored-by: ivikhrev <ivan.vikhrev@intel.com>
2021-12-17 12:20:43 +03:00
size_t processedFramesN = 0;
2020-02-11 22:48:49 +03:00
auto startTime = Time::now();
2019-08-09 19:02:42 +03:00
auto execTime = std::chrono::duration_cast<ns>(Time::now() - startTime).count();
/** Start inference & calculate performance **/
/** to align number if iterations to guarantee that last infer requests are
* executed in the same conditions **/
2019-08-09 19:02:42 +03:00
ProgressBar progressBar(progressBarTotalCount, FLAGS_stream_output, FLAGS_progress);
while ((niter != 0LL && iteration < niter) ||
(duration_nanoseconds != 0LL && (uint64_t)execTime < duration_nanoseconds) ||
2019-08-09 19:02:42 +03:00
(FLAGS_api == "async" && iteration % nireq != 0)) {
inferRequest = inferRequestsQueue.get_idle_request();
2019-08-09 19:02:42 +03:00
if (!inferRequest) {
throw ov::Exception("No idle Infer Requests!");
2019-08-09 19:02:42 +03:00
}
Dynamic reshapes (#7788) * Merged and compiling * Fix for dynamic shape type * review fixes * renamed blob shape to tensor shape, small improvements * fix code style * added parsing of multiple shapes * store latency per group, add isIdleRequestAvailable() to Infer Queue * added cached random inputs * redesign pipeline, added new metrics(avg, max, min), added metrics per groups * fixed code style * small improvements * modified tensor parameters parsing * modified -i parameter parsing: added possibility to specify input names * implemented image cashing * added cashed blobs creating * added -pcseq flag, modified batch filling, changes fps formula * improvements * code formatting * code formatting2 * apply suggestions from review * replaced Buffer class with InferenceEngine Blobs * use batch size in blobs filling * added shared blob allocator to handle blob's data * fixed warnings & code style * allocate blobs * fix for networks with image info input * added comments & fixed codestyle * clear data in free() in SharedBlobAllocator * remove unnecessary check * Delimeter is changed to :: * stylefix * added layout from string function, small improvements * modified parsing to enable : in input parameters * small fixes * small fixes * added missed blob allocation, fixes * [TEST]added support for remote blobs * fix remote blobs * new inputs/files output format * removed vectors resize which caused bugs * made cl::Buffer type under ifdef, fix inputs filling * changed batch() function to not throwing exceptions * removed unused var * fix code style * replace empty name in input files with name from net input * restored old behaviour for static models * fix code style * fix warning - made const iterator * fix warning - remove reference in loop variable * added random and image_info input types to -i, fix problem with layout * replaced batch() with getBatchSize() in main * fix layout, shape, tensor shape parameters parsing * upd help messages for input, tensor shape and pcseq command * added buffer for cl output blobs, small fixes Signed-off-by: ivikhrev <ivan.vikhrev@intel.com> * added legacy mode * restore setBlob * code style formatting * move collecting latency for groups under flag * removed not applicable layouts * added hint to error message when wrong input name in -tensor_shape was specified * added new metrics to statistics report * Apply suggestions from code review * fix binary blobs filling when layout is CN * apply suggestions * moved file in the right place after rebase * improved -pcseq output * updated args and readme * removed TEMPLATE plugin registration * fix -shape arg decsription * enable providing several -i args as input * renamed legacy_mode to inference_only and made it default for static models, renamed tensor_shape to data_shape * upd readme * use getBlob() in inference only mode * fix old input type for static case * fix typo * upd readme * move log about benchmark mode to the measuring perfomance step * added class for latency metrics * upd readme, fix typos, renamed funcs * fix warning and upd parsing to avoid error with : in file paths * fix error on centos : error: use of deleted function ‘std::basic_stringstream<char>::basic_stringstream(const std::basic_stringstream<char>&) * added check for key in inputs * renamed input to inputs * adjust batch size for binary blobs * replaced warning with exception in bench mode defining * align measurement cycle with master Co-authored-by: ivikhrev <ivan.vikhrev@intel.com>
2021-12-17 12:20:43 +03:00
if (!inferenceOnly) {
auto inputs = app_inputs_info[iteration % app_inputs_info.size()];
if (FLAGS_pcseq) {
inferRequest->set_latency_group_id(iteration % app_inputs_info.size());
Dynamic reshapes (#7788) * Merged and compiling * Fix for dynamic shape type * review fixes * renamed blob shape to tensor shape, small improvements * fix code style * added parsing of multiple shapes * store latency per group, add isIdleRequestAvailable() to Infer Queue * added cached random inputs * redesign pipeline, added new metrics(avg, max, min), added metrics per groups * fixed code style * small improvements * modified tensor parameters parsing * modified -i parameter parsing: added possibility to specify input names * implemented image cashing * added cashed blobs creating * added -pcseq flag, modified batch filling, changes fps formula * improvements * code formatting * code formatting2 * apply suggestions from review * replaced Buffer class with InferenceEngine Blobs * use batch size in blobs filling * added shared blob allocator to handle blob's data * fixed warnings & code style * allocate blobs * fix for networks with image info input * added comments & fixed codestyle * clear data in free() in SharedBlobAllocator * remove unnecessary check * Delimeter is changed to :: * stylefix * added layout from string function, small improvements * modified parsing to enable : in input parameters * small fixes * small fixes * added missed blob allocation, fixes * [TEST]added support for remote blobs * fix remote blobs * new inputs/files output format * removed vectors resize which caused bugs * made cl::Buffer type under ifdef, fix inputs filling * changed batch() function to not throwing exceptions * removed unused var * fix code style * replace empty name in input files with name from net input * restored old behaviour for static models * fix code style * fix warning - made const iterator * fix warning - remove reference in loop variable * added random and image_info input types to -i, fix problem with layout * replaced batch() with getBatchSize() in main * fix layout, shape, tensor shape parameters parsing * upd help messages for input, tensor shape and pcseq command * added buffer for cl output blobs, small fixes Signed-off-by: ivikhrev <ivan.vikhrev@intel.com> * added legacy mode * restore setBlob * code style formatting * move collecting latency for groups under flag * removed not applicable layouts * added hint to error message when wrong input name in -tensor_shape was specified * added new metrics to statistics report * Apply suggestions from code review * fix binary blobs filling when layout is CN * apply suggestions * moved file in the right place after rebase * improved -pcseq output * updated args and readme * removed TEMPLATE plugin registration * fix -shape arg decsription * enable providing several -i args as input * renamed legacy_mode to inference_only and made it default for static models, renamed tensor_shape to data_shape * upd readme * use getBlob() in inference only mode * fix old input type for static case * fix typo * upd readme * move log about benchmark mode to the measuring perfomance step * added class for latency metrics * upd readme, fix typos, renamed funcs * fix warning and upd parsing to avoid error with : in file paths * fix error on centos : error: use of deleted function ‘std::basic_stringstream<char>::basic_stringstream(const std::basic_stringstream<char>&) * added check for key in inputs * renamed input to inputs * adjust batch size for binary blobs * replaced warning with exception in bench mode defining * align measurement cycle with master Co-authored-by: ivikhrev <ivan.vikhrev@intel.com>
2021-12-17 12:20:43 +03:00
}
if (isDynamicNetwork) {
batchSize = get_batch_size(inputs);
if (!std::any_of(inputs.begin(),
inputs.end(),
[](const std::pair<const std::string, benchmark_app::InputInfo>& info) {
return ov::layout::has_batch(info.second.layout);
})) {
slog::warn
<< "No batch dimension was found, asssuming batch to be 1. Beware: this might affect "
"FPS calculation."
<< slog::endl;
}
Dynamic reshapes (#7788) * Merged and compiling * Fix for dynamic shape type * review fixes * renamed blob shape to tensor shape, small improvements * fix code style * added parsing of multiple shapes * store latency per group, add isIdleRequestAvailable() to Infer Queue * added cached random inputs * redesign pipeline, added new metrics(avg, max, min), added metrics per groups * fixed code style * small improvements * modified tensor parameters parsing * modified -i parameter parsing: added possibility to specify input names * implemented image cashing * added cashed blobs creating * added -pcseq flag, modified batch filling, changes fps formula * improvements * code formatting * code formatting2 * apply suggestions from review * replaced Buffer class with InferenceEngine Blobs * use batch size in blobs filling * added shared blob allocator to handle blob's data * fixed warnings & code style * allocate blobs * fix for networks with image info input * added comments & fixed codestyle * clear data in free() in SharedBlobAllocator * remove unnecessary check * Delimeter is changed to :: * stylefix * added layout from string function, small improvements * modified parsing to enable : in input parameters * small fixes * small fixes * added missed blob allocation, fixes * [TEST]added support for remote blobs * fix remote blobs * new inputs/files output format * removed vectors resize which caused bugs * made cl::Buffer type under ifdef, fix inputs filling * changed batch() function to not throwing exceptions * removed unused var * fix code style * replace empty name in input files with name from net input * restored old behaviour for static models * fix code style * fix warning - made const iterator * fix warning - remove reference in loop variable * added random and image_info input types to -i, fix problem with layout * replaced batch() with getBatchSize() in main * fix layout, shape, tensor shape parameters parsing * upd help messages for input, tensor shape and pcseq command * added buffer for cl output blobs, small fixes Signed-off-by: ivikhrev <ivan.vikhrev@intel.com> * added legacy mode * restore setBlob * code style formatting * move collecting latency for groups under flag * removed not applicable layouts * added hint to error message when wrong input name in -tensor_shape was specified * added new metrics to statistics report * Apply suggestions from code review * fix binary blobs filling when layout is CN * apply suggestions * moved file in the right place after rebase * improved -pcseq output * updated args and readme * removed TEMPLATE plugin registration * fix -shape arg decsription * enable providing several -i args as input * renamed legacy_mode to inference_only and made it default for static models, renamed tensor_shape to data_shape * upd readme * use getBlob() in inference only mode * fix old input type for static case * fix typo * upd readme * move log about benchmark mode to the measuring perfomance step * added class for latency metrics * upd readme, fix typos, renamed funcs * fix warning and upd parsing to avoid error with : in file paths * fix error on centos : error: use of deleted function ‘std::basic_stringstream<char>::basic_stringstream(const std::basic_stringstream<char>&) * added check for key in inputs * renamed input to inputs * adjust batch size for binary blobs * replaced warning with exception in bench mode defining * align measurement cycle with master Co-authored-by: ivikhrev <ivan.vikhrev@intel.com>
2021-12-17 12:20:43 +03:00
}
for (auto& item : inputs) {
auto inputName = item.first;
const auto& data = inputsData.at(inputName)[iteration % inputsData.at(inputName).size()];
inferRequest->set_tensor(inputName, data);
Dynamic reshapes (#7788) * Merged and compiling * Fix for dynamic shape type * review fixes * renamed blob shape to tensor shape, small improvements * fix code style * added parsing of multiple shapes * store latency per group, add isIdleRequestAvailable() to Infer Queue * added cached random inputs * redesign pipeline, added new metrics(avg, max, min), added metrics per groups * fixed code style * small improvements * modified tensor parameters parsing * modified -i parameter parsing: added possibility to specify input names * implemented image cashing * added cashed blobs creating * added -pcseq flag, modified batch filling, changes fps formula * improvements * code formatting * code formatting2 * apply suggestions from review * replaced Buffer class with InferenceEngine Blobs * use batch size in blobs filling * added shared blob allocator to handle blob's data * fixed warnings & code style * allocate blobs * fix for networks with image info input * added comments & fixed codestyle * clear data in free() in SharedBlobAllocator * remove unnecessary check * Delimeter is changed to :: * stylefix * added layout from string function, small improvements * modified parsing to enable : in input parameters * small fixes * small fixes * added missed blob allocation, fixes * [TEST]added support for remote blobs * fix remote blobs * new inputs/files output format * removed vectors resize which caused bugs * made cl::Buffer type under ifdef, fix inputs filling * changed batch() function to not throwing exceptions * removed unused var * fix code style * replace empty name in input files with name from net input * restored old behaviour for static models * fix code style * fix warning - made const iterator * fix warning - remove reference in loop variable * added random and image_info input types to -i, fix problem with layout * replaced batch() with getBatchSize() in main * fix layout, shape, tensor shape parameters parsing * upd help messages for input, tensor shape and pcseq command * added buffer for cl output blobs, small fixes Signed-off-by: ivikhrev <ivan.vikhrev@intel.com> * added legacy mode * restore setBlob * code style formatting * move collecting latency for groups under flag * removed not applicable layouts * added hint to error message when wrong input name in -tensor_shape was specified * added new metrics to statistics report * Apply suggestions from code review * fix binary blobs filling when layout is CN * apply suggestions * moved file in the right place after rebase * improved -pcseq output * updated args and readme * removed TEMPLATE plugin registration * fix -shape arg decsription * enable providing several -i args as input * renamed legacy_mode to inference_only and made it default for static models, renamed tensor_shape to data_shape * upd readme * use getBlob() in inference only mode * fix old input type for static case * fix typo * upd readme * move log about benchmark mode to the measuring perfomance step * added class for latency metrics * upd readme, fix typos, renamed funcs * fix warning and upd parsing to avoid error with : in file paths * fix error on centos : error: use of deleted function ‘std::basic_stringstream<char>::basic_stringstream(const std::basic_stringstream<char>&) * added check for key in inputs * renamed input to inputs * adjust batch size for binary blobs * replaced warning with exception in bench mode defining * align measurement cycle with master Co-authored-by: ivikhrev <ivan.vikhrev@intel.com>
2021-12-17 12:20:43 +03:00
}
if (useGpuMem) {
auto outputTensors =
::gpu::get_remote_output_tensors(compiledModel, inferRequest->get_output_cl_buffer());
for (auto& output : compiledModel.outputs()) {
inferRequest->set_tensor(output.get_any_name(), outputTensors[output.get_any_name()]);
Dynamic reshapes (#7788) * Merged and compiling * Fix for dynamic shape type * review fixes * renamed blob shape to tensor shape, small improvements * fix code style * added parsing of multiple shapes * store latency per group, add isIdleRequestAvailable() to Infer Queue * added cached random inputs * redesign pipeline, added new metrics(avg, max, min), added metrics per groups * fixed code style * small improvements * modified tensor parameters parsing * modified -i parameter parsing: added possibility to specify input names * implemented image cashing * added cashed blobs creating * added -pcseq flag, modified batch filling, changes fps formula * improvements * code formatting * code formatting2 * apply suggestions from review * replaced Buffer class with InferenceEngine Blobs * use batch size in blobs filling * added shared blob allocator to handle blob's data * fixed warnings & code style * allocate blobs * fix for networks with image info input * added comments & fixed codestyle * clear data in free() in SharedBlobAllocator * remove unnecessary check * Delimeter is changed to :: * stylefix * added layout from string function, small improvements * modified parsing to enable : in input parameters * small fixes * small fixes * added missed blob allocation, fixes * [TEST]added support for remote blobs * fix remote blobs * new inputs/files output format * removed vectors resize which caused bugs * made cl::Buffer type under ifdef, fix inputs filling * changed batch() function to not throwing exceptions * removed unused var * fix code style * replace empty name in input files with name from net input * restored old behaviour for static models * fix code style * fix warning - made const iterator * fix warning - remove reference in loop variable * added random and image_info input types to -i, fix problem with layout * replaced batch() with getBatchSize() in main * fix layout, shape, tensor shape parameters parsing * upd help messages for input, tensor shape and pcseq command * added buffer for cl output blobs, small fixes Signed-off-by: ivikhrev <ivan.vikhrev@intel.com> * added legacy mode * restore setBlob * code style formatting * move collecting latency for groups under flag * removed not applicable layouts * added hint to error message when wrong input name in -tensor_shape was specified * added new metrics to statistics report * Apply suggestions from code review * fix binary blobs filling when layout is CN * apply suggestions * moved file in the right place after rebase * improved -pcseq output * updated args and readme * removed TEMPLATE plugin registration * fix -shape arg decsription * enable providing several -i args as input * renamed legacy_mode to inference_only and made it default for static models, renamed tensor_shape to data_shape * upd readme * use getBlob() in inference only mode * fix old input type for static case * fix typo * upd readme * move log about benchmark mode to the measuring perfomance step * added class for latency metrics * upd readme, fix typos, renamed funcs * fix warning and upd parsing to avoid error with : in file paths * fix error on centos : error: use of deleted function ‘std::basic_stringstream<char>::basic_stringstream(const std::basic_stringstream<char>&) * added check for key in inputs * renamed input to inputs * adjust batch size for binary blobs * replaced warning with exception in bench mode defining * align measurement cycle with master Co-authored-by: ivikhrev <ivan.vikhrev@intel.com>
2021-12-17 12:20:43 +03:00
}
}
}
2019-08-09 19:02:42 +03:00
if (FLAGS_api == "sync") {
2019-04-12 18:25:53 +03:00
inferRequest->infer();
2018-11-23 16:19:43 +03:00
} else {
// As the inference request is currently idle, the wait() adds no
// additional overhead (and should return immediately). The primary
// reason for calling the method is exception checking/re-throwing.
// Callback, that governs the actual execution can handle errors as
// well, but as it uses just error codes it has no details like what()
// method of `std::exception` So, rechecking for any exceptions here.
2020-02-11 22:48:49 +03:00
inferRequest->wait();
inferRequest->start_async();
2018-11-23 16:19:43 +03:00
}
Dynamic reshapes (#7788) * Merged and compiling * Fix for dynamic shape type * review fixes * renamed blob shape to tensor shape, small improvements * fix code style * added parsing of multiple shapes * store latency per group, add isIdleRequestAvailable() to Infer Queue * added cached random inputs * redesign pipeline, added new metrics(avg, max, min), added metrics per groups * fixed code style * small improvements * modified tensor parameters parsing * modified -i parameter parsing: added possibility to specify input names * implemented image cashing * added cashed blobs creating * added -pcseq flag, modified batch filling, changes fps formula * improvements * code formatting * code formatting2 * apply suggestions from review * replaced Buffer class with InferenceEngine Blobs * use batch size in blobs filling * added shared blob allocator to handle blob's data * fixed warnings & code style * allocate blobs * fix for networks with image info input * added comments & fixed codestyle * clear data in free() in SharedBlobAllocator * remove unnecessary check * Delimeter is changed to :: * stylefix * added layout from string function, small improvements * modified parsing to enable : in input parameters * small fixes * small fixes * added missed blob allocation, fixes * [TEST]added support for remote blobs * fix remote blobs * new inputs/files output format * removed vectors resize which caused bugs * made cl::Buffer type under ifdef, fix inputs filling * changed batch() function to not throwing exceptions * removed unused var * fix code style * replace empty name in input files with name from net input * restored old behaviour for static models * fix code style * fix warning - made const iterator * fix warning - remove reference in loop variable * added random and image_info input types to -i, fix problem with layout * replaced batch() with getBatchSize() in main * fix layout, shape, tensor shape parameters parsing * upd help messages for input, tensor shape and pcseq command * added buffer for cl output blobs, small fixes Signed-off-by: ivikhrev <ivan.vikhrev@intel.com> * added legacy mode * restore setBlob * code style formatting * move collecting latency for groups under flag * removed not applicable layouts * added hint to error message when wrong input name in -tensor_shape was specified * added new metrics to statistics report * Apply suggestions from code review * fix binary blobs filling when layout is CN * apply suggestions * moved file in the right place after rebase * improved -pcseq output * updated args and readme * removed TEMPLATE plugin registration * fix -shape arg decsription * enable providing several -i args as input * renamed legacy_mode to inference_only and made it default for static models, renamed tensor_shape to data_shape * upd readme * use getBlob() in inference only mode * fix old input type for static case * fix typo * upd readme * move log about benchmark mode to the measuring perfomance step * added class for latency metrics * upd readme, fix typos, renamed funcs * fix warning and upd parsing to avoid error with : in file paths * fix error on centos : error: use of deleted function ‘std::basic_stringstream<char>::basic_stringstream(const std::basic_stringstream<char>&) * added check for key in inputs * renamed input to inputs * adjust batch size for binary blobs * replaced warning with exception in bench mode defining * align measurement cycle with master Co-authored-by: ivikhrev <ivan.vikhrev@intel.com>
2021-12-17 12:20:43 +03:00
++iteration;
2018-11-23 16:19:43 +03:00
2019-08-09 19:02:42 +03:00
execTime = std::chrono::duration_cast<ns>(Time::now() - startTime).count();
Dynamic reshapes (#7788) * Merged and compiling * Fix for dynamic shape type * review fixes * renamed blob shape to tensor shape, small improvements * fix code style * added parsing of multiple shapes * store latency per group, add isIdleRequestAvailable() to Infer Queue * added cached random inputs * redesign pipeline, added new metrics(avg, max, min), added metrics per groups * fixed code style * small improvements * modified tensor parameters parsing * modified -i parameter parsing: added possibility to specify input names * implemented image cashing * added cashed blobs creating * added -pcseq flag, modified batch filling, changes fps formula * improvements * code formatting * code formatting2 * apply suggestions from review * replaced Buffer class with InferenceEngine Blobs * use batch size in blobs filling * added shared blob allocator to handle blob's data * fixed warnings & code style * allocate blobs * fix for networks with image info input * added comments & fixed codestyle * clear data in free() in SharedBlobAllocator * remove unnecessary check * Delimeter is changed to :: * stylefix * added layout from string function, small improvements * modified parsing to enable : in input parameters * small fixes * small fixes * added missed blob allocation, fixes * [TEST]added support for remote blobs * fix remote blobs * new inputs/files output format * removed vectors resize which caused bugs * made cl::Buffer type under ifdef, fix inputs filling * changed batch() function to not throwing exceptions * removed unused var * fix code style * replace empty name in input files with name from net input * restored old behaviour for static models * fix code style * fix warning - made const iterator * fix warning - remove reference in loop variable * added random and image_info input types to -i, fix problem with layout * replaced batch() with getBatchSize() in main * fix layout, shape, tensor shape parameters parsing * upd help messages for input, tensor shape and pcseq command * added buffer for cl output blobs, small fixes Signed-off-by: ivikhrev <ivan.vikhrev@intel.com> * added legacy mode * restore setBlob * code style formatting * move collecting latency for groups under flag * removed not applicable layouts * added hint to error message when wrong input name in -tensor_shape was specified * added new metrics to statistics report * Apply suggestions from code review * fix binary blobs filling when layout is CN * apply suggestions * moved file in the right place after rebase * improved -pcseq output * updated args and readme * removed TEMPLATE plugin registration * fix -shape arg decsription * enable providing several -i args as input * renamed legacy_mode to inference_only and made it default for static models, renamed tensor_shape to data_shape * upd readme * use getBlob() in inference only mode * fix old input type for static case * fix typo * upd readme * move log about benchmark mode to the measuring perfomance step * added class for latency metrics * upd readme, fix typos, renamed funcs * fix warning and upd parsing to avoid error with : in file paths * fix error on centos : error: use of deleted function ‘std::basic_stringstream<char>::basic_stringstream(const std::basic_stringstream<char>&) * added check for key in inputs * renamed input to inputs * adjust batch size for binary blobs * replaced warning with exception in bench mode defining * align measurement cycle with master Co-authored-by: ivikhrev <ivan.vikhrev@intel.com>
2021-12-17 12:20:43 +03:00
processedFramesN += batchSize;
2019-04-12 18:25:53 +03:00
2019-08-09 19:02:42 +03:00
if (niter > 0) {
progressBar.add_progress(1);
2019-08-09 19:02:42 +03:00
} else {
// calculate how many progress intervals are covered by current
// iteration. depends on the current iteration time and time of each
// progress interval. Previously covered progress intervals must be
// skipped.
2019-08-09 19:02:42 +03:00
auto progressIntervalTime = duration_nanoseconds / progressBarTotalCount;
size_t newProgress = execTime / progressIntervalTime - progressCnt;
progressBar.add_progress(newProgress);
2019-08-09 19:02:42 +03:00
progressCnt += newProgress;
2018-11-23 16:19:43 +03:00
}
2019-08-09 19:02:42 +03:00
}
2018-11-23 16:19:43 +03:00
2019-08-09 19:02:42 +03:00
// wait the latest inference executions
inferRequestsQueue.wait_all();
2019-08-09 19:02:42 +03:00
LatencyMetrics generalLatency(inferRequestsQueue.get_latencies(), "", FLAGS_latency_percentile);
Dynamic reshapes (#7788) * Merged and compiling * Fix for dynamic shape type * review fixes * renamed blob shape to tensor shape, small improvements * fix code style * added parsing of multiple shapes * store latency per group, add isIdleRequestAvailable() to Infer Queue * added cached random inputs * redesign pipeline, added new metrics(avg, max, min), added metrics per groups * fixed code style * small improvements * modified tensor parameters parsing * modified -i parameter parsing: added possibility to specify input names * implemented image cashing * added cashed blobs creating * added -pcseq flag, modified batch filling, changes fps formula * improvements * code formatting * code formatting2 * apply suggestions from review * replaced Buffer class with InferenceEngine Blobs * use batch size in blobs filling * added shared blob allocator to handle blob's data * fixed warnings & code style * allocate blobs * fix for networks with image info input * added comments & fixed codestyle * clear data in free() in SharedBlobAllocator * remove unnecessary check * Delimeter is changed to :: * stylefix * added layout from string function, small improvements * modified parsing to enable : in input parameters * small fixes * small fixes * added missed blob allocation, fixes * [TEST]added support for remote blobs * fix remote blobs * new inputs/files output format * removed vectors resize which caused bugs * made cl::Buffer type under ifdef, fix inputs filling * changed batch() function to not throwing exceptions * removed unused var * fix code style * replace empty name in input files with name from net input * restored old behaviour for static models * fix code style * fix warning - made const iterator * fix warning - remove reference in loop variable * added random and image_info input types to -i, fix problem with layout * replaced batch() with getBatchSize() in main * fix layout, shape, tensor shape parameters parsing * upd help messages for input, tensor shape and pcseq command * added buffer for cl output blobs, small fixes Signed-off-by: ivikhrev <ivan.vikhrev@intel.com> * added legacy mode * restore setBlob * code style formatting * move collecting latency for groups under flag * removed not applicable layouts * added hint to error message when wrong input name in -tensor_shape was specified * added new metrics to statistics report * Apply suggestions from code review * fix binary blobs filling when layout is CN * apply suggestions * moved file in the right place after rebase * improved -pcseq output * updated args and readme * removed TEMPLATE plugin registration * fix -shape arg decsription * enable providing several -i args as input * renamed legacy_mode to inference_only and made it default for static models, renamed tensor_shape to data_shape * upd readme * use getBlob() in inference only mode * fix old input type for static case * fix typo * upd readme * move log about benchmark mode to the measuring perfomance step * added class for latency metrics * upd readme, fix typos, renamed funcs * fix warning and upd parsing to avoid error with : in file paths * fix error on centos : error: use of deleted function ‘std::basic_stringstream<char>::basic_stringstream(const std::basic_stringstream<char>&) * added check for key in inputs * renamed input to inputs * adjust batch size for binary blobs * replaced warning with exception in bench mode defining * align measurement cycle with master Co-authored-by: ivikhrev <ivan.vikhrev@intel.com>
2021-12-17 12:20:43 +03:00
std::vector<LatencyMetrics> groupLatencies = {};
if (FLAGS_pcseq && app_inputs_info.size() > 1) {
const auto& lat_groups = inferRequestsQueue.get_latency_groups();
for (int i = 0; i < lat_groups.size(); i++) {
const auto& lats = lat_groups[i];
std::string data_shapes_string = "";
for (auto& item : app_inputs_info[i]) {
data_shapes_string += item.first + get_shape_string(item.second.dataShape) + ",";
}
data_shapes_string =
data_shapes_string == "" ? "" : data_shapes_string.substr(0, data_shapes_string.size() - 1);
groupLatencies.emplace_back(lats, data_shapes_string, FLAGS_latency_percentile);
Dynamic reshapes (#7788) * Merged and compiling * Fix for dynamic shape type * review fixes * renamed blob shape to tensor shape, small improvements * fix code style * added parsing of multiple shapes * store latency per group, add isIdleRequestAvailable() to Infer Queue * added cached random inputs * redesign pipeline, added new metrics(avg, max, min), added metrics per groups * fixed code style * small improvements * modified tensor parameters parsing * modified -i parameter parsing: added possibility to specify input names * implemented image cashing * added cashed blobs creating * added -pcseq flag, modified batch filling, changes fps formula * improvements * code formatting * code formatting2 * apply suggestions from review * replaced Buffer class with InferenceEngine Blobs * use batch size in blobs filling * added shared blob allocator to handle blob's data * fixed warnings & code style * allocate blobs * fix for networks with image info input * added comments & fixed codestyle * clear data in free() in SharedBlobAllocator * remove unnecessary check * Delimeter is changed to :: * stylefix * added layout from string function, small improvements * modified parsing to enable : in input parameters * small fixes * small fixes * added missed blob allocation, fixes * [TEST]added support for remote blobs * fix remote blobs * new inputs/files output format * removed vectors resize which caused bugs * made cl::Buffer type under ifdef, fix inputs filling * changed batch() function to not throwing exceptions * removed unused var * fix code style * replace empty name in input files with name from net input * restored old behaviour for static models * fix code style * fix warning - made const iterator * fix warning - remove reference in loop variable * added random and image_info input types to -i, fix problem with layout * replaced batch() with getBatchSize() in main * fix layout, shape, tensor shape parameters parsing * upd help messages for input, tensor shape and pcseq command * added buffer for cl output blobs, small fixes Signed-off-by: ivikhrev <ivan.vikhrev@intel.com> * added legacy mode * restore setBlob * code style formatting * move collecting latency for groups under flag * removed not applicable layouts * added hint to error message when wrong input name in -tensor_shape was specified * added new metrics to statistics report * Apply suggestions from code review * fix binary blobs filling when layout is CN * apply suggestions * moved file in the right place after rebase * improved -pcseq output * updated args and readme * removed TEMPLATE plugin registration * fix -shape arg decsription * enable providing several -i args as input * renamed legacy_mode to inference_only and made it default for static models, renamed tensor_shape to data_shape * upd readme * use getBlob() in inference only mode * fix old input type for static case * fix typo * upd readme * move log about benchmark mode to the measuring perfomance step * added class for latency metrics * upd readme, fix typos, renamed funcs * fix warning and upd parsing to avoid error with : in file paths * fix error on centos : error: use of deleted function ‘std::basic_stringstream<char>::basic_stringstream(const std::basic_stringstream<char>&) * added check for key in inputs * renamed input to inputs * adjust batch size for binary blobs * replaced warning with exception in bench mode defining * align measurement cycle with master Co-authored-by: ivikhrev <ivan.vikhrev@intel.com>
2021-12-17 12:20:43 +03:00
}
}
double totalDuration = inferRequestsQueue.get_duration_in_milliseconds();
double fps = (FLAGS_api == "sync") ? batchSize * 1000.0 / generalLatency.median_or_percentile
Dynamic reshapes (#7788) * Merged and compiling * Fix for dynamic shape type * review fixes * renamed blob shape to tensor shape, small improvements * fix code style * added parsing of multiple shapes * store latency per group, add isIdleRequestAvailable() to Infer Queue * added cached random inputs * redesign pipeline, added new metrics(avg, max, min), added metrics per groups * fixed code style * small improvements * modified tensor parameters parsing * modified -i parameter parsing: added possibility to specify input names * implemented image cashing * added cashed blobs creating * added -pcseq flag, modified batch filling, changes fps formula * improvements * code formatting * code formatting2 * apply suggestions from review * replaced Buffer class with InferenceEngine Blobs * use batch size in blobs filling * added shared blob allocator to handle blob's data * fixed warnings & code style * allocate blobs * fix for networks with image info input * added comments & fixed codestyle * clear data in free() in SharedBlobAllocator * remove unnecessary check * Delimeter is changed to :: * stylefix * added layout from string function, small improvements * modified parsing to enable : in input parameters * small fixes * small fixes * added missed blob allocation, fixes * [TEST]added support for remote blobs * fix remote blobs * new inputs/files output format * removed vectors resize which caused bugs * made cl::Buffer type under ifdef, fix inputs filling * changed batch() function to not throwing exceptions * removed unused var * fix code style * replace empty name in input files with name from net input * restored old behaviour for static models * fix code style * fix warning - made const iterator * fix warning - remove reference in loop variable * added random and image_info input types to -i, fix problem with layout * replaced batch() with getBatchSize() in main * fix layout, shape, tensor shape parameters parsing * upd help messages for input, tensor shape and pcseq command * added buffer for cl output blobs, small fixes Signed-off-by: ivikhrev <ivan.vikhrev@intel.com> * added legacy mode * restore setBlob * code style formatting * move collecting latency for groups under flag * removed not applicable layouts * added hint to error message when wrong input name in -tensor_shape was specified * added new metrics to statistics report * Apply suggestions from code review * fix binary blobs filling when layout is CN * apply suggestions * moved file in the right place after rebase * improved -pcseq output * updated args and readme * removed TEMPLATE plugin registration * fix -shape arg decsription * enable providing several -i args as input * renamed legacy_mode to inference_only and made it default for static models, renamed tensor_shape to data_shape * upd readme * use getBlob() in inference only mode * fix old input type for static case * fix typo * upd readme * move log about benchmark mode to the measuring perfomance step * added class for latency metrics * upd readme, fix typos, renamed funcs * fix warning and upd parsing to avoid error with : in file paths * fix error on centos : error: use of deleted function ‘std::basic_stringstream<char>::basic_stringstream(const std::basic_stringstream<char>&) * added check for key in inputs * renamed input to inputs * adjust batch size for binary blobs * replaced warning with exception in bench mode defining * align measurement cycle with master Co-authored-by: ivikhrev <ivan.vikhrev@intel.com>
2021-12-17 12:20:43 +03:00
: 1000.0 * processedFramesN / totalDuration;
2019-10-04 19:26:43 +03:00
if (statistics) {
statistics->add_parameters(StatisticsReport::Category::EXECUTION_RESULTS,
{StatisticsVariant("total execution time (ms)", "execution_time", totalDuration),
StatisticsVariant("total number of iterations", "iterations_num", iteration)});
2019-10-04 19:26:43 +03:00
if (device_name.find("MULTI") == std::string::npos) {
std::string latency_label;
if (FLAGS_latency_percentile == 50) {
Dynamic reshapes (#7788) * Merged and compiling * Fix for dynamic shape type * review fixes * renamed blob shape to tensor shape, small improvements * fix code style * added parsing of multiple shapes * store latency per group, add isIdleRequestAvailable() to Infer Queue * added cached random inputs * redesign pipeline, added new metrics(avg, max, min), added metrics per groups * fixed code style * small improvements * modified tensor parameters parsing * modified -i parameter parsing: added possibility to specify input names * implemented image cashing * added cashed blobs creating * added -pcseq flag, modified batch filling, changes fps formula * improvements * code formatting * code formatting2 * apply suggestions from review * replaced Buffer class with InferenceEngine Blobs * use batch size in blobs filling * added shared blob allocator to handle blob's data * fixed warnings & code style * allocate blobs * fix for networks with image info input * added comments & fixed codestyle * clear data in free() in SharedBlobAllocator * remove unnecessary check * Delimeter is changed to :: * stylefix * added layout from string function, small improvements * modified parsing to enable : in input parameters * small fixes * small fixes * added missed blob allocation, fixes * [TEST]added support for remote blobs * fix remote blobs * new inputs/files output format * removed vectors resize which caused bugs * made cl::Buffer type under ifdef, fix inputs filling * changed batch() function to not throwing exceptions * removed unused var * fix code style * replace empty name in input files with name from net input * restored old behaviour for static models * fix code style * fix warning - made const iterator * fix warning - remove reference in loop variable * added random and image_info input types to -i, fix problem with layout * replaced batch() with getBatchSize() in main * fix layout, shape, tensor shape parameters parsing * upd help messages for input, tensor shape and pcseq command * added buffer for cl output blobs, small fixes Signed-off-by: ivikhrev <ivan.vikhrev@intel.com> * added legacy mode * restore setBlob * code style formatting * move collecting latency for groups under flag * removed not applicable layouts * added hint to error message when wrong input name in -tensor_shape was specified * added new metrics to statistics report * Apply suggestions from code review * fix binary blobs filling when layout is CN * apply suggestions * moved file in the right place after rebase * improved -pcseq output * updated args and readme * removed TEMPLATE plugin registration * fix -shape arg decsription * enable providing several -i args as input * renamed legacy_mode to inference_only and made it default for static models, renamed tensor_shape to data_shape * upd readme * use getBlob() in inference only mode * fix old input type for static case * fix typo * upd readme * move log about benchmark mode to the measuring perfomance step * added class for latency metrics * upd readme, fix typos, renamed funcs * fix warning and upd parsing to avoid error with : in file paths * fix error on centos : error: use of deleted function ‘std::basic_stringstream<char>::basic_stringstream(const std::basic_stringstream<char>&) * added check for key in inputs * renamed input to inputs * adjust batch size for binary blobs * replaced warning with exception in bench mode defining * align measurement cycle with master Co-authored-by: ivikhrev <ivan.vikhrev@intel.com>
2021-12-17 12:20:43 +03:00
latency_label = "Median latency (ms)";
} else {
latency_label = "latency (" + std::to_string(FLAGS_latency_percentile) + " percentile) (ms)";
}
statistics->add_parameters(
Dynamic reshapes (#7788) * Merged and compiling * Fix for dynamic shape type * review fixes * renamed blob shape to tensor shape, small improvements * fix code style * added parsing of multiple shapes * store latency per group, add isIdleRequestAvailable() to Infer Queue * added cached random inputs * redesign pipeline, added new metrics(avg, max, min), added metrics per groups * fixed code style * small improvements * modified tensor parameters parsing * modified -i parameter parsing: added possibility to specify input names * implemented image cashing * added cashed blobs creating * added -pcseq flag, modified batch filling, changes fps formula * improvements * code formatting * code formatting2 * apply suggestions from review * replaced Buffer class with InferenceEngine Blobs * use batch size in blobs filling * added shared blob allocator to handle blob's data * fixed warnings & code style * allocate blobs * fix for networks with image info input * added comments & fixed codestyle * clear data in free() in SharedBlobAllocator * remove unnecessary check * Delimeter is changed to :: * stylefix * added layout from string function, small improvements * modified parsing to enable : in input parameters * small fixes * small fixes * added missed blob allocation, fixes * [TEST]added support for remote blobs * fix remote blobs * new inputs/files output format * removed vectors resize which caused bugs * made cl::Buffer type under ifdef, fix inputs filling * changed batch() function to not throwing exceptions * removed unused var * fix code style * replace empty name in input files with name from net input * restored old behaviour for static models * fix code style * fix warning - made const iterator * fix warning - remove reference in loop variable * added random and image_info input types to -i, fix problem with layout * replaced batch() with getBatchSize() in main * fix layout, shape, tensor shape parameters parsing * upd help messages for input, tensor shape and pcseq command * added buffer for cl output blobs, small fixes Signed-off-by: ivikhrev <ivan.vikhrev@intel.com> * added legacy mode * restore setBlob * code style formatting * move collecting latency for groups under flag * removed not applicable layouts * added hint to error message when wrong input name in -tensor_shape was specified * added new metrics to statistics report * Apply suggestions from code review * fix binary blobs filling when layout is CN * apply suggestions * moved file in the right place after rebase * improved -pcseq output * updated args and readme * removed TEMPLATE plugin registration * fix -shape arg decsription * enable providing several -i args as input * renamed legacy_mode to inference_only and made it default for static models, renamed tensor_shape to data_shape * upd readme * use getBlob() in inference only mode * fix old input type for static case * fix typo * upd readme * move log about benchmark mode to the measuring perfomance step * added class for latency metrics * upd readme, fix typos, renamed funcs * fix warning and upd parsing to avoid error with : in file paths * fix error on centos : error: use of deleted function ‘std::basic_stringstream<char>::basic_stringstream(const std::basic_stringstream<char>&) * added check for key in inputs * renamed input to inputs * adjust batch size for binary blobs * replaced warning with exception in bench mode defining * align measurement cycle with master Co-authored-by: ivikhrev <ivan.vikhrev@intel.com>
2021-12-17 12:20:43 +03:00
StatisticsReport::Category::EXECUTION_RESULTS,
{StatisticsVariant(latency_label, "latency_median", generalLatency.median_or_percentile),
StatisticsVariant("Percentile boundary", "percentile_boundary", FLAGS_latency_percentile),
StatisticsVariant("Average latency (ms)", "latency_avg", generalLatency.avg),
StatisticsVariant("Min latency (ms)", "latency_min", generalLatency.min),
StatisticsVariant("Max latency (ms)", "latency_max", generalLatency.max)});
Dynamic reshapes (#7788) * Merged and compiling * Fix for dynamic shape type * review fixes * renamed blob shape to tensor shape, small improvements * fix code style * added parsing of multiple shapes * store latency per group, add isIdleRequestAvailable() to Infer Queue * added cached random inputs * redesign pipeline, added new metrics(avg, max, min), added metrics per groups * fixed code style * small improvements * modified tensor parameters parsing * modified -i parameter parsing: added possibility to specify input names * implemented image cashing * added cashed blobs creating * added -pcseq flag, modified batch filling, changes fps formula * improvements * code formatting * code formatting2 * apply suggestions from review * replaced Buffer class with InferenceEngine Blobs * use batch size in blobs filling * added shared blob allocator to handle blob's data * fixed warnings & code style * allocate blobs * fix for networks with image info input * added comments & fixed codestyle * clear data in free() in SharedBlobAllocator * remove unnecessary check * Delimeter is changed to :: * stylefix * added layout from string function, small improvements * modified parsing to enable : in input parameters * small fixes * small fixes * added missed blob allocation, fixes * [TEST]added support for remote blobs * fix remote blobs * new inputs/files output format * removed vectors resize which caused bugs * made cl::Buffer type under ifdef, fix inputs filling * changed batch() function to not throwing exceptions * removed unused var * fix code style * replace empty name in input files with name from net input * restored old behaviour for static models * fix code style * fix warning - made const iterator * fix warning - remove reference in loop variable * added random and image_info input types to -i, fix problem with layout * replaced batch() with getBatchSize() in main * fix layout, shape, tensor shape parameters parsing * upd help messages for input, tensor shape and pcseq command * added buffer for cl output blobs, small fixes Signed-off-by: ivikhrev <ivan.vikhrev@intel.com> * added legacy mode * restore setBlob * code style formatting * move collecting latency for groups under flag * removed not applicable layouts * added hint to error message when wrong input name in -tensor_shape was specified * added new metrics to statistics report * Apply suggestions from code review * fix binary blobs filling when layout is CN * apply suggestions * moved file in the right place after rebase * improved -pcseq output * updated args and readme * removed TEMPLATE plugin registration * fix -shape arg decsription * enable providing several -i args as input * renamed legacy_mode to inference_only and made it default for static models, renamed tensor_shape to data_shape * upd readme * use getBlob() in inference only mode * fix old input type for static case * fix typo * upd readme * move log about benchmark mode to the measuring perfomance step * added class for latency metrics * upd readme, fix typos, renamed funcs * fix warning and upd parsing to avoid error with : in file paths * fix error on centos : error: use of deleted function ‘std::basic_stringstream<char>::basic_stringstream(const std::basic_stringstream<char>&) * added check for key in inputs * renamed input to inputs * adjust batch size for binary blobs * replaced warning with exception in bench mode defining * align measurement cycle with master Co-authored-by: ivikhrev <ivan.vikhrev@intel.com>
2021-12-17 12:20:43 +03:00
if (FLAGS_pcseq && app_inputs_info.size() > 1) {
for (size_t i = 0; i < groupLatencies.size(); ++i) {
statistics->add_parameters(
StatisticsReport::Category::EXECUTION_RESULTS_GROUPPED,
{StatisticsVariant("Group Latencies", "group_latencies", groupLatencies[i])});
Dynamic reshapes (#7788) * Merged and compiling * Fix for dynamic shape type * review fixes * renamed blob shape to tensor shape, small improvements * fix code style * added parsing of multiple shapes * store latency per group, add isIdleRequestAvailable() to Infer Queue * added cached random inputs * redesign pipeline, added new metrics(avg, max, min), added metrics per groups * fixed code style * small improvements * modified tensor parameters parsing * modified -i parameter parsing: added possibility to specify input names * implemented image cashing * added cashed blobs creating * added -pcseq flag, modified batch filling, changes fps formula * improvements * code formatting * code formatting2 * apply suggestions from review * replaced Buffer class with InferenceEngine Blobs * use batch size in blobs filling * added shared blob allocator to handle blob's data * fixed warnings & code style * allocate blobs * fix for networks with image info input * added comments & fixed codestyle * clear data in free() in SharedBlobAllocator * remove unnecessary check * Delimeter is changed to :: * stylefix * added layout from string function, small improvements * modified parsing to enable : in input parameters * small fixes * small fixes * added missed blob allocation, fixes * [TEST]added support for remote blobs * fix remote blobs * new inputs/files output format * removed vectors resize which caused bugs * made cl::Buffer type under ifdef, fix inputs filling * changed batch() function to not throwing exceptions * removed unused var * fix code style * replace empty name in input files with name from net input * restored old behaviour for static models * fix code style * fix warning - made const iterator * fix warning - remove reference in loop variable * added random and image_info input types to -i, fix problem with layout * replaced batch() with getBatchSize() in main * fix layout, shape, tensor shape parameters parsing * upd help messages for input, tensor shape and pcseq command * added buffer for cl output blobs, small fixes Signed-off-by: ivikhrev <ivan.vikhrev@intel.com> * added legacy mode * restore setBlob * code style formatting * move collecting latency for groups under flag * removed not applicable layouts * added hint to error message when wrong input name in -tensor_shape was specified * added new metrics to statistics report * Apply suggestions from code review * fix binary blobs filling when layout is CN * apply suggestions * moved file in the right place after rebase * improved -pcseq output * updated args and readme * removed TEMPLATE plugin registration * fix -shape arg decsription * enable providing several -i args as input * renamed legacy_mode to inference_only and made it default for static models, renamed tensor_shape to data_shape * upd readme * use getBlob() in inference only mode * fix old input type for static case * fix typo * upd readme * move log about benchmark mode to the measuring perfomance step * added class for latency metrics * upd readme, fix typos, renamed funcs * fix warning and upd parsing to avoid error with : in file paths * fix error on centos : error: use of deleted function ‘std::basic_stringstream<char>::basic_stringstream(const std::basic_stringstream<char>&) * added check for key in inputs * renamed input to inputs * adjust batch size for binary blobs * replaced warning with exception in bench mode defining * align measurement cycle with master Co-authored-by: ivikhrev <ivan.vikhrev@intel.com>
2021-12-17 12:20:43 +03:00
}
}
2018-11-23 16:19:43 +03:00
}
statistics->add_parameters(StatisticsReport::Category::EXECUTION_RESULTS,
{StatisticsVariant("throughput", "throughput", fps)});
2019-04-12 18:25:53 +03:00
}
2019-08-09 19:02:42 +03:00
progressBar.finish();
// ----------------- 11. Dumping statistics report
// -------------------------------------------------------------
2019-08-09 19:02:42 +03:00
next_step();
2018-11-23 16:19:43 +03:00
if (!FLAGS_dump_config.empty()) {
dump_config(FLAGS_dump_config, config);
DOCS: ported changes from 2022.1 release branch (#11206) * Extensibility guide with FE extensions and remove OV_FRAMEWORK_MAP from docs * Rework of Extensibility Intro, adopted examples to missing OPENVINO_FRAMEWORK_MAP * Removed OPENVINO_FRAMEWORK_MAP reference * Frontend extension detailed documentation * Fixed distributed snippets * Fixed snippet inclusion in FE extension document and chapter headers * Fixed wrong name in a snippet reference * Fixed test for template extension due to changed number of loaded extensions * Update docs/Extensibility_UG/frontend_extensions.md Co-authored-by: Ivan Tikhonov <ivan.tikhonov@intel.com> * Minor fixes in extension snippets * Small grammar fix Co-authored-by: Ivan Tikhonov <ivan.tikhonov@intel.com> Co-authored-by: Ivan Tikhonov <ivan.tikhonov@intel.com> * DOCS: transition banner (#10973) * transition banner * minor fix * update transition banner * updates * update custom.js * updates * updates * Documentation fixes (#11044) * Benchmark app usage * Fixed link to the devices * More fixes * Update docs/OV_Runtime_UG/multi_device.md Co-authored-by: Sergey Lyubimtsev <sergey.lyubimtsev@intel.com> * Removed several hardcoded links Co-authored-by: Sergey Lyubimtsev <sergey.lyubimtsev@intel.com> * Updated documentation for compile_tool (#11049) * Added deployment guide (#11060) * Added deployment guide * Added local distribution * Updates * Fixed more indentations * Removed obsolete code snippets (#11061) * Removed obsolete code snippets * NCC style * Fixed NCC for BA * Add a troubleshooting issue for PRC installation (#11074) * updates * adding gna to linux * add missing reference * update * Update docs/install_guides/installing-model-dev-tools.md Co-authored-by: Sergey Lyubimtsev <sergey.lyubimtsev@intel.com> * Update docs/install_guides/installing-model-dev-tools.md Co-authored-by: Sergey Lyubimtsev <sergey.lyubimtsev@intel.com> * Update docs/install_guides/installing-model-dev-tools.md Co-authored-by: Sergey Lyubimtsev <sergey.lyubimtsev@intel.com> * Update docs/install_guides/installing-model-dev-tools.md Co-authored-by: Sergey Lyubimtsev <sergey.lyubimtsev@intel.com> * Update docs/install_guides/installing-model-dev-tools.md Co-authored-by: Sergey Lyubimtsev <sergey.lyubimtsev@intel.com> * update * minor updates * add gna item to yum and apt * add gna to get started page * update reference formatting * merge commit * add a troubleshooting issue * update * update * fix CVS-71846 Co-authored-by: Sergey Lyubimtsev <sergey.lyubimtsev@intel.com> * DOCS: fixed hardcoded links (#11100) * Fixes * Use links * applying reviewers comments to the Opt Guide (#11093) * applying reviewrs comments * fixed refs, more structuring (bold, bullets, etc) * refactoring tput/latency sections * next iteration (mostly latency), also brushed the auto-batching and other sections * updates sync/async images * common opts brushed * WIP tput redesigned * minor brushing of common and auto-batching * Tput fully refactored * fixed doc name in the link * moved int8 perf counters to the right section * fixed links * fixed broken quotes * fixed more links * add ref to the internals to the TOC * Added a note on the batch size Co-authored-by: Andrey Zaytsev <andrey.zaytsev@intel.com> * [80085] New images for docs (#11114) * change doc structure * fix manager tools * fix manager tools 3 step * fix manager tools 3 step * new img * new img for OV Runtime * fix steps * steps * fix intendents * change list * fix space * fix space * code snippets fix * change display * Benchmarks 2022 1 (#11130) * Minor fixes * Updates for 2022.1 * Edits according to the review * Edits according to review comments * Edits according to review comments * Edits according to review comments * Fixed table * Edits according to review comments * Removed config for Intel® Core™ i7-11850HE * Removed forward-tacotron-duration-prediction-241 graph * Added resnet-18-pytorch * Add info about Docker images in Deployment guide (#11136) * Renamed user guides (#11137) * fix screenshot (#11140) * More conservative recommendations on dynamic shapes usage in docs (#11161) * More conservative recommendations about using dynamic shapes * Duplicated statement from C++ part to Python part of reshape doc (no semantical changes) * Update ShapeInference.md (#11168) * Benchmarks 2022 1 updates (#11180) * Updated graphs * Quick fix for TODO in Dynamic Shapes article * Anchor link fixes * Fixed DM config (#11199) * DOCS: doxy sphinxtabs (#11027) * initial implementation of doxy sphinxtabs * fixes * fixes * fixes * fixes * fixes * WA for ignored visibility attribute * Fixes Co-authored-by: Sergey Lyalin <sergey.lyalin@intel.com> Co-authored-by: Ivan Tikhonov <ivan.tikhonov@intel.com> Co-authored-by: Nikolay Tyukaev <nikolay.tyukaev@intel.com> Co-authored-by: Sergey Lyubimtsev <sergey.lyubimtsev@intel.com> Co-authored-by: Yuan Xu <yuan1.xu@intel.com> Co-authored-by: Maxim Shevtsov <maxim.y.shevtsov@intel.com> Co-authored-by: Andrey Zaytsev <andrey.zaytsev@intel.com> Co-authored-by: Tatiana Savina <tatiana.savina@intel.com> Co-authored-by: Ilya Naumov <ilya.naumov@intel.com> Co-authored-by: Evgenya Stepyreva <evgenya.stepyreva@intel.com>
2022-03-24 22:27:29 +03:00
slog::info << "OpenVINO Runtime configuration settings were dumped to " << FLAGS_dump_config << slog::endl;
}
2019-04-12 18:25:53 +03:00
if (!FLAGS_exec_graph_path.empty()) {
2019-08-09 19:02:42 +03:00
try {
ov::serialize(compiledModel.get_runtime_model(), FLAGS_exec_graph_path);
2019-08-09 19:02:42 +03:00
slog::info << "executable graph is stored to " << FLAGS_exec_graph_path << slog::endl;
} catch (const std::exception& ex) {
2019-08-09 19:02:42 +03:00
slog::err << "Can't get executable graph: " << ex.what() << slog::endl;
}
}
2019-10-04 19:26:43 +03:00
if (perf_counts) {
std::vector<std::vector<ov::ProfilingInfo>> perfCounts;
2019-08-09 19:02:42 +03:00
for (size_t ireq = 0; ireq < nireq; ireq++) {
auto reqPerfCounts = inferRequestsQueue.requests[ireq]->get_performance_counts();
2019-10-04 19:26:43 +03:00
if (FLAGS_pc) {
2021-02-16 13:08:54 +09:00
slog::info << "Performance counts for " << ireq << "-th infer request:" << slog::endl;
printPerformanceCounts(reqPerfCounts, std::cout, getFullDeviceName(core, FLAGS_d), false);
2019-10-04 19:26:43 +03:00
}
perfCounts.push_back(reqPerfCounts);
}
if (statistics) {
statistics->dump_performance_counters(perfCounts);
2019-08-09 19:02:42 +03:00
}
2018-11-23 16:19:43 +03:00
}
2019-04-12 18:25:53 +03:00
2019-10-04 19:26:43 +03:00
if (statistics)
statistics->dump();
Dynamic reshapes (#7788) * Merged and compiling * Fix for dynamic shape type * review fixes * renamed blob shape to tensor shape, small improvements * fix code style * added parsing of multiple shapes * store latency per group, add isIdleRequestAvailable() to Infer Queue * added cached random inputs * redesign pipeline, added new metrics(avg, max, min), added metrics per groups * fixed code style * small improvements * modified tensor parameters parsing * modified -i parameter parsing: added possibility to specify input names * implemented image cashing * added cashed blobs creating * added -pcseq flag, modified batch filling, changes fps formula * improvements * code formatting * code formatting2 * apply suggestions from review * replaced Buffer class with InferenceEngine Blobs * use batch size in blobs filling * added shared blob allocator to handle blob's data * fixed warnings & code style * allocate blobs * fix for networks with image info input * added comments & fixed codestyle * clear data in free() in SharedBlobAllocator * remove unnecessary check * Delimeter is changed to :: * stylefix * added layout from string function, small improvements * modified parsing to enable : in input parameters * small fixes * small fixes * added missed blob allocation, fixes * [TEST]added support for remote blobs * fix remote blobs * new inputs/files output format * removed vectors resize which caused bugs * made cl::Buffer type under ifdef, fix inputs filling * changed batch() function to not throwing exceptions * removed unused var * fix code style * replace empty name in input files with name from net input * restored old behaviour for static models * fix code style * fix warning - made const iterator * fix warning - remove reference in loop variable * added random and image_info input types to -i, fix problem with layout * replaced batch() with getBatchSize() in main * fix layout, shape, tensor shape parameters parsing * upd help messages for input, tensor shape and pcseq command * added buffer for cl output blobs, small fixes Signed-off-by: ivikhrev <ivan.vikhrev@intel.com> * added legacy mode * restore setBlob * code style formatting * move collecting latency for groups under flag * removed not applicable layouts * added hint to error message when wrong input name in -tensor_shape was specified * added new metrics to statistics report * Apply suggestions from code review * fix binary blobs filling when layout is CN * apply suggestions * moved file in the right place after rebase * improved -pcseq output * updated args and readme * removed TEMPLATE plugin registration * fix -shape arg decsription * enable providing several -i args as input * renamed legacy_mode to inference_only and made it default for static models, renamed tensor_shape to data_shape * upd readme * use getBlob() in inference only mode * fix old input type for static case * fix typo * upd readme * move log about benchmark mode to the measuring perfomance step * added class for latency metrics * upd readme, fix typos, renamed funcs * fix warning and upd parsing to avoid error with : in file paths * fix error on centos : error: use of deleted function ‘std::basic_stringstream<char>::basic_stringstream(const std::basic_stringstream<char>&) * added check for key in inputs * renamed input to inputs * adjust batch size for binary blobs * replaced warning with exception in bench mode defining * align measurement cycle with master Co-authored-by: ivikhrev <ivan.vikhrev@intel.com>
2021-12-17 12:20:43 +03:00
// Performance metrics report
slog::info << "Count: " << iteration << " iterations" << slog::endl;
slog::info << "Duration: " << double_to_string(totalDuration) << " ms" << slog::endl;
if (device_name.find("MULTI") == std::string::npos) {
Dynamic reshapes (#7788) * Merged and compiling * Fix for dynamic shape type * review fixes * renamed blob shape to tensor shape, small improvements * fix code style * added parsing of multiple shapes * store latency per group, add isIdleRequestAvailable() to Infer Queue * added cached random inputs * redesign pipeline, added new metrics(avg, max, min), added metrics per groups * fixed code style * small improvements * modified tensor parameters parsing * modified -i parameter parsing: added possibility to specify input names * implemented image cashing * added cashed blobs creating * added -pcseq flag, modified batch filling, changes fps formula * improvements * code formatting * code formatting2 * apply suggestions from review * replaced Buffer class with InferenceEngine Blobs * use batch size in blobs filling * added shared blob allocator to handle blob's data * fixed warnings & code style * allocate blobs * fix for networks with image info input * added comments & fixed codestyle * clear data in free() in SharedBlobAllocator * remove unnecessary check * Delimeter is changed to :: * stylefix * added layout from string function, small improvements * modified parsing to enable : in input parameters * small fixes * small fixes * added missed blob allocation, fixes * [TEST]added support for remote blobs * fix remote blobs * new inputs/files output format * removed vectors resize which caused bugs * made cl::Buffer type under ifdef, fix inputs filling * changed batch() function to not throwing exceptions * removed unused var * fix code style * replace empty name in input files with name from net input * restored old behaviour for static models * fix code style * fix warning - made const iterator * fix warning - remove reference in loop variable * added random and image_info input types to -i, fix problem with layout * replaced batch() with getBatchSize() in main * fix layout, shape, tensor shape parameters parsing * upd help messages for input, tensor shape and pcseq command * added buffer for cl output blobs, small fixes Signed-off-by: ivikhrev <ivan.vikhrev@intel.com> * added legacy mode * restore setBlob * code style formatting * move collecting latency for groups under flag * removed not applicable layouts * added hint to error message when wrong input name in -tensor_shape was specified * added new metrics to statistics report * Apply suggestions from code review * fix binary blobs filling when layout is CN * apply suggestions * moved file in the right place after rebase * improved -pcseq output * updated args and readme * removed TEMPLATE plugin registration * fix -shape arg decsription * enable providing several -i args as input * renamed legacy_mode to inference_only and made it default for static models, renamed tensor_shape to data_shape * upd readme * use getBlob() in inference only mode * fix old input type for static case * fix typo * upd readme * move log about benchmark mode to the measuring perfomance step * added class for latency metrics * upd readme, fix typos, renamed funcs * fix warning and upd parsing to avoid error with : in file paths * fix error on centos : error: use of deleted function ‘std::basic_stringstream<char>::basic_stringstream(const std::basic_stringstream<char>&) * added check for key in inputs * renamed input to inputs * adjust batch size for binary blobs * replaced warning with exception in bench mode defining * align measurement cycle with master Co-authored-by: ivikhrev <ivan.vikhrev@intel.com>
2021-12-17 12:20:43 +03:00
slog::info << "Latency: " << slog::endl;
generalLatency.write_to_slog();
Dynamic reshapes (#7788) * Merged and compiling * Fix for dynamic shape type * review fixes * renamed blob shape to tensor shape, small improvements * fix code style * added parsing of multiple shapes * store latency per group, add isIdleRequestAvailable() to Infer Queue * added cached random inputs * redesign pipeline, added new metrics(avg, max, min), added metrics per groups * fixed code style * small improvements * modified tensor parameters parsing * modified -i parameter parsing: added possibility to specify input names * implemented image cashing * added cashed blobs creating * added -pcseq flag, modified batch filling, changes fps formula * improvements * code formatting * code formatting2 * apply suggestions from review * replaced Buffer class with InferenceEngine Blobs * use batch size in blobs filling * added shared blob allocator to handle blob's data * fixed warnings & code style * allocate blobs * fix for networks with image info input * added comments & fixed codestyle * clear data in free() in SharedBlobAllocator * remove unnecessary check * Delimeter is changed to :: * stylefix * added layout from string function, small improvements * modified parsing to enable : in input parameters * small fixes * small fixes * added missed blob allocation, fixes * [TEST]added support for remote blobs * fix remote blobs * new inputs/files output format * removed vectors resize which caused bugs * made cl::Buffer type under ifdef, fix inputs filling * changed batch() function to not throwing exceptions * removed unused var * fix code style * replace empty name in input files with name from net input * restored old behaviour for static models * fix code style * fix warning - made const iterator * fix warning - remove reference in loop variable * added random and image_info input types to -i, fix problem with layout * replaced batch() with getBatchSize() in main * fix layout, shape, tensor shape parameters parsing * upd help messages for input, tensor shape and pcseq command * added buffer for cl output blobs, small fixes Signed-off-by: ivikhrev <ivan.vikhrev@intel.com> * added legacy mode * restore setBlob * code style formatting * move collecting latency for groups under flag * removed not applicable layouts * added hint to error message when wrong input name in -tensor_shape was specified * added new metrics to statistics report * Apply suggestions from code review * fix binary blobs filling when layout is CN * apply suggestions * moved file in the right place after rebase * improved -pcseq output * updated args and readme * removed TEMPLATE plugin registration * fix -shape arg decsription * enable providing several -i args as input * renamed legacy_mode to inference_only and made it default for static models, renamed tensor_shape to data_shape * upd readme * use getBlob() in inference only mode * fix old input type for static case * fix typo * upd readme * move log about benchmark mode to the measuring perfomance step * added class for latency metrics * upd readme, fix typos, renamed funcs * fix warning and upd parsing to avoid error with : in file paths * fix error on centos : error: use of deleted function ‘std::basic_stringstream<char>::basic_stringstream(const std::basic_stringstream<char>&) * added check for key in inputs * renamed input to inputs * adjust batch size for binary blobs * replaced warning with exception in bench mode defining * align measurement cycle with master Co-authored-by: ivikhrev <ivan.vikhrev@intel.com>
2021-12-17 12:20:43 +03:00
if (FLAGS_pcseq && app_inputs_info.size() > 1) {
slog::info << "Latency for each data shape group:" << slog::endl;
for (size_t i = 0; i < app_inputs_info.size(); ++i) {
slog::info << (i + 1) << ".";
for (auto& item : app_inputs_info[i]) {
std::stringstream input_shape;
auto shape = item.second.dataShape;
std::copy(shape.begin(), shape.end() - 1, std::ostream_iterator<size_t>(input_shape, ","));
Dynamic reshapes (#7788) * Merged and compiling * Fix for dynamic shape type * review fixes * renamed blob shape to tensor shape, small improvements * fix code style * added parsing of multiple shapes * store latency per group, add isIdleRequestAvailable() to Infer Queue * added cached random inputs * redesign pipeline, added new metrics(avg, max, min), added metrics per groups * fixed code style * small improvements * modified tensor parameters parsing * modified -i parameter parsing: added possibility to specify input names * implemented image cashing * added cashed blobs creating * added -pcseq flag, modified batch filling, changes fps formula * improvements * code formatting * code formatting2 * apply suggestions from review * replaced Buffer class with InferenceEngine Blobs * use batch size in blobs filling * added shared blob allocator to handle blob's data * fixed warnings & code style * allocate blobs * fix for networks with image info input * added comments & fixed codestyle * clear data in free() in SharedBlobAllocator * remove unnecessary check * Delimeter is changed to :: * stylefix * added layout from string function, small improvements * modified parsing to enable : in input parameters * small fixes * small fixes * added missed blob allocation, fixes * [TEST]added support for remote blobs * fix remote blobs * new inputs/files output format * removed vectors resize which caused bugs * made cl::Buffer type under ifdef, fix inputs filling * changed batch() function to not throwing exceptions * removed unused var * fix code style * replace empty name in input files with name from net input * restored old behaviour for static models * fix code style * fix warning - made const iterator * fix warning - remove reference in loop variable * added random and image_info input types to -i, fix problem with layout * replaced batch() with getBatchSize() in main * fix layout, shape, tensor shape parameters parsing * upd help messages for input, tensor shape and pcseq command * added buffer for cl output blobs, small fixes Signed-off-by: ivikhrev <ivan.vikhrev@intel.com> * added legacy mode * restore setBlob * code style formatting * move collecting latency for groups under flag * removed not applicable layouts * added hint to error message when wrong input name in -tensor_shape was specified * added new metrics to statistics report * Apply suggestions from code review * fix binary blobs filling when layout is CN * apply suggestions * moved file in the right place after rebase * improved -pcseq output * updated args and readme * removed TEMPLATE plugin registration * fix -shape arg decsription * enable providing several -i args as input * renamed legacy_mode to inference_only and made it default for static models, renamed tensor_shape to data_shape * upd readme * use getBlob() in inference only mode * fix old input type for static case * fix typo * upd readme * move log about benchmark mode to the measuring perfomance step * added class for latency metrics * upd readme, fix typos, renamed funcs * fix warning and upd parsing to avoid error with : in file paths * fix error on centos : error: use of deleted function ‘std::basic_stringstream<char>::basic_stringstream(const std::basic_stringstream<char>&) * added check for key in inputs * renamed input to inputs * adjust batch size for binary blobs * replaced warning with exception in bench mode defining * align measurement cycle with master Co-authored-by: ivikhrev <ivan.vikhrev@intel.com>
2021-12-17 12:20:43 +03:00
input_shape << shape.back();
slog::info << " " << item.first << " : " << get_shape_string(item.second.dataShape);
Dynamic reshapes (#7788) * Merged and compiling * Fix for dynamic shape type * review fixes * renamed blob shape to tensor shape, small improvements * fix code style * added parsing of multiple shapes * store latency per group, add isIdleRequestAvailable() to Infer Queue * added cached random inputs * redesign pipeline, added new metrics(avg, max, min), added metrics per groups * fixed code style * small improvements * modified tensor parameters parsing * modified -i parameter parsing: added possibility to specify input names * implemented image cashing * added cashed blobs creating * added -pcseq flag, modified batch filling, changes fps formula * improvements * code formatting * code formatting2 * apply suggestions from review * replaced Buffer class with InferenceEngine Blobs * use batch size in blobs filling * added shared blob allocator to handle blob's data * fixed warnings & code style * allocate blobs * fix for networks with image info input * added comments & fixed codestyle * clear data in free() in SharedBlobAllocator * remove unnecessary check * Delimeter is changed to :: * stylefix * added layout from string function, small improvements * modified parsing to enable : in input parameters * small fixes * small fixes * added missed blob allocation, fixes * [TEST]added support for remote blobs * fix remote blobs * new inputs/files output format * removed vectors resize which caused bugs * made cl::Buffer type under ifdef, fix inputs filling * changed batch() function to not throwing exceptions * removed unused var * fix code style * replace empty name in input files with name from net input * restored old behaviour for static models * fix code style * fix warning - made const iterator * fix warning - remove reference in loop variable * added random and image_info input types to -i, fix problem with layout * replaced batch() with getBatchSize() in main * fix layout, shape, tensor shape parameters parsing * upd help messages for input, tensor shape and pcseq command * added buffer for cl output blobs, small fixes Signed-off-by: ivikhrev <ivan.vikhrev@intel.com> * added legacy mode * restore setBlob * code style formatting * move collecting latency for groups under flag * removed not applicable layouts * added hint to error message when wrong input name in -tensor_shape was specified * added new metrics to statistics report * Apply suggestions from code review * fix binary blobs filling when layout is CN * apply suggestions * moved file in the right place after rebase * improved -pcseq output * updated args and readme * removed TEMPLATE plugin registration * fix -shape arg decsription * enable providing several -i args as input * renamed legacy_mode to inference_only and made it default for static models, renamed tensor_shape to data_shape * upd readme * use getBlob() in inference only mode * fix old input type for static case * fix typo * upd readme * move log about benchmark mode to the measuring perfomance step * added class for latency metrics * upd readme, fix typos, renamed funcs * fix warning and upd parsing to avoid error with : in file paths * fix error on centos : error: use of deleted function ‘std::basic_stringstream<char>::basic_stringstream(const std::basic_stringstream<char>&) * added check for key in inputs * renamed input to inputs * adjust batch size for binary blobs * replaced warning with exception in bench mode defining * align measurement cycle with master Co-authored-by: ivikhrev <ivan.vikhrev@intel.com>
2021-12-17 12:20:43 +03:00
}
slog::info << slog::endl;
groupLatencies[i].write_to_slog();
Dynamic reshapes (#7788) * Merged and compiling * Fix for dynamic shape type * review fixes * renamed blob shape to tensor shape, small improvements * fix code style * added parsing of multiple shapes * store latency per group, add isIdleRequestAvailable() to Infer Queue * added cached random inputs * redesign pipeline, added new metrics(avg, max, min), added metrics per groups * fixed code style * small improvements * modified tensor parameters parsing * modified -i parameter parsing: added possibility to specify input names * implemented image cashing * added cashed blobs creating * added -pcseq flag, modified batch filling, changes fps formula * improvements * code formatting * code formatting2 * apply suggestions from review * replaced Buffer class with InferenceEngine Blobs * use batch size in blobs filling * added shared blob allocator to handle blob's data * fixed warnings & code style * allocate blobs * fix for networks with image info input * added comments & fixed codestyle * clear data in free() in SharedBlobAllocator * remove unnecessary check * Delimeter is changed to :: * stylefix * added layout from string function, small improvements * modified parsing to enable : in input parameters * small fixes * small fixes * added missed blob allocation, fixes * [TEST]added support for remote blobs * fix remote blobs * new inputs/files output format * removed vectors resize which caused bugs * made cl::Buffer type under ifdef, fix inputs filling * changed batch() function to not throwing exceptions * removed unused var * fix code style * replace empty name in input files with name from net input * restored old behaviour for static models * fix code style * fix warning - made const iterator * fix warning - remove reference in loop variable * added random and image_info input types to -i, fix problem with layout * replaced batch() with getBatchSize() in main * fix layout, shape, tensor shape parameters parsing * upd help messages for input, tensor shape and pcseq command * added buffer for cl output blobs, small fixes Signed-off-by: ivikhrev <ivan.vikhrev@intel.com> * added legacy mode * restore setBlob * code style formatting * move collecting latency for groups under flag * removed not applicable layouts * added hint to error message when wrong input name in -tensor_shape was specified * added new metrics to statistics report * Apply suggestions from code review * fix binary blobs filling when layout is CN * apply suggestions * moved file in the right place after rebase * improved -pcseq output * updated args and readme * removed TEMPLATE plugin registration * fix -shape arg decsription * enable providing several -i args as input * renamed legacy_mode to inference_only and made it default for static models, renamed tensor_shape to data_shape * upd readme * use getBlob() in inference only mode * fix old input type for static case * fix typo * upd readme * move log about benchmark mode to the measuring perfomance step * added class for latency metrics * upd readme, fix typos, renamed funcs * fix warning and upd parsing to avoid error with : in file paths * fix error on centos : error: use of deleted function ‘std::basic_stringstream<char>::basic_stringstream(const std::basic_stringstream<char>&) * added check for key in inputs * renamed input to inputs * adjust batch size for binary blobs * replaced warning with exception in bench mode defining * align measurement cycle with master Co-authored-by: ivikhrev <ivan.vikhrev@intel.com>
2021-12-17 12:20:43 +03:00
}
}
}
Dynamic reshapes (#7788) * Merged and compiling * Fix for dynamic shape type * review fixes * renamed blob shape to tensor shape, small improvements * fix code style * added parsing of multiple shapes * store latency per group, add isIdleRequestAvailable() to Infer Queue * added cached random inputs * redesign pipeline, added new metrics(avg, max, min), added metrics per groups * fixed code style * small improvements * modified tensor parameters parsing * modified -i parameter parsing: added possibility to specify input names * implemented image cashing * added cashed blobs creating * added -pcseq flag, modified batch filling, changes fps formula * improvements * code formatting * code formatting2 * apply suggestions from review * replaced Buffer class with InferenceEngine Blobs * use batch size in blobs filling * added shared blob allocator to handle blob's data * fixed warnings & code style * allocate blobs * fix for networks with image info input * added comments & fixed codestyle * clear data in free() in SharedBlobAllocator * remove unnecessary check * Delimeter is changed to :: * stylefix * added layout from string function, small improvements * modified parsing to enable : in input parameters * small fixes * small fixes * added missed blob allocation, fixes * [TEST]added support for remote blobs * fix remote blobs * new inputs/files output format * removed vectors resize which caused bugs * made cl::Buffer type under ifdef, fix inputs filling * changed batch() function to not throwing exceptions * removed unused var * fix code style * replace empty name in input files with name from net input * restored old behaviour for static models * fix code style * fix warning - made const iterator * fix warning - remove reference in loop variable * added random and image_info input types to -i, fix problem with layout * replaced batch() with getBatchSize() in main * fix layout, shape, tensor shape parameters parsing * upd help messages for input, tensor shape and pcseq command * added buffer for cl output blobs, small fixes Signed-off-by: ivikhrev <ivan.vikhrev@intel.com> * added legacy mode * restore setBlob * code style formatting * move collecting latency for groups under flag * removed not applicable layouts * added hint to error message when wrong input name in -tensor_shape was specified * added new metrics to statistics report * Apply suggestions from code review * fix binary blobs filling when layout is CN * apply suggestions * moved file in the right place after rebase * improved -pcseq output * updated args and readme * removed TEMPLATE plugin registration * fix -shape arg decsription * enable providing several -i args as input * renamed legacy_mode to inference_only and made it default for static models, renamed tensor_shape to data_shape * upd readme * use getBlob() in inference only mode * fix old input type for static case * fix typo * upd readme * move log about benchmark mode to the measuring perfomance step * added class for latency metrics * upd readme, fix typos, renamed funcs * fix warning and upd parsing to avoid error with : in file paths * fix error on centos : error: use of deleted function ‘std::basic_stringstream<char>::basic_stringstream(const std::basic_stringstream<char>&) * added check for key in inputs * renamed input to inputs * adjust batch size for binary blobs * replaced warning with exception in bench mode defining * align measurement cycle with master Co-authored-by: ivikhrev <ivan.vikhrev@intel.com>
2021-12-17 12:20:43 +03:00
slog::info << "Throughput: " << double_to_string(fps) << " FPS" << slog::endl;
2018-11-23 16:19:43 +03:00
} catch (const std::exception& ex) {
slog::err << ex.what() << slog::endl;
2019-10-04 19:26:43 +03:00
if (statistics) {
statistics->add_parameters(StatisticsReport::Category::EXECUTION_RESULTS,
{StatisticsVariant("error", "error", ex.what())});
2019-10-04 19:26:43 +03:00
statistics->dump();
}
2018-11-23 16:19:43 +03:00
return 3;
}
return 0;
}