[GPU] Update config api 2.0 (#9649)

Sergey Shlyapnikov 2022-02-03 13:04:36 +03:00 committed by GitHub
parent b34cb55081
commit ccf4f4e420
20 changed files with 1135 additions and 505 deletions


@ -5,10 +5,10 @@
.. toctree::
:maxdepth: 1
:hidden:
openvino_docs_IE_DG_supported_plugins_GPU_RemoteBlob_API
@endsphinxdirective
The GPU plugin uses the Intel® Compute Library for Deep Neural Networks (clDNN) to infer deep neural networks.
@ -114,9 +114,9 @@ When specifying key values as raw strings (that is, when using Python API), omit
| `KEY_CACHE_DIR` | `"<cache_dir>"` | `""` | Specifies a directory where compiled OCL binaries can be cached. First model loading generates the cache, and all subsequent LoadNetwork calls use precompiled kernels which significantly improves load time. If empty - caching is disabled |
| `KEY_PERF_COUNT` | `YES` / `NO` | `NO` | Collect performance counters during inference |
| `KEY_CONFIG_FILE` | `"<file1> [<file2> ...]"` | `""` | Load custom layer configuration files |
| `KEY_GPU_MODEL_`<br>`PRIORITY` | `GPU_MODEL_PRIORITY_<HIGH\|LOW>` <br/> `GPU_QUEUE_PRIORITY_<LOW\|HIGH\|MED\|DEFAULT>` <br/> `GPU_HOST_TASK_PRIORITY_<HIGH\|LOW\|ANY>` | `GPU_QUEUE_PRIORITY_DEFAULT` <br/> `\|GPU_HOST_TASK_PRIORITY_ANY` | Specifies two types of priority: host task priority and OpenCL queue priority.<br/><br/>Host task priority is specified by `GPU_HOST_TASK_PRIORITY_[level]` and has three levels: `HIGH`, `LOW`, and `ANY`. Note that `HIGH` and `LOW` take effect only when TBB is used for multithreading the LoadNetwork workload and the host processor is a hybrid type. On hybrid processors, if the task priority is set to `HIGH`, the task gets higher priority for core type selection, and vice versa. If the host processor is not hybrid or multithreading does not use TBB, the priority is set to `ANY`, which is the default.<br/><br/>OpenCL queue priority is specified by `GPU_QUEUE_PRIORITY_[level]` and has four levels: `HIGH`, `MED`, `LOW`, and `DEFAULT`, where the default value is `DEFAULT`. Before usage, make sure your OpenCL driver supports the appropriate extension.<br/><br/>`GPU_MODEL_PRIORITY` can be set as a combination of the two priority types, such as<br/>-`GPU_QUEUE_PRIORITY_HIGH\|GPU_HOST_TASK_PRIORITY_HIGH` or<br/>-`GPU_QUEUE_PRIORITY_LOW\|GPU_HOST_TASK_PRIORITY_HIGH`.<br/><br/>It can also be set at a more abstract level, `GPU_MODEL_PRIORITY_[level]`, which represents a combination of the two priorities as follows:<br/>-`GPU_MODEL_PRIORITY_HIGH` : `GPU_QUEUE_PRIORITY_HIGH\|GPU_HOST_TASK_PRIORITY_HIGH`<br/>-`GPU_MODEL_PRIORITY_LOW` : `GPU_QUEUE_PRIORITY_LOW\|GPU_HOST_TASK_PRIORITY_LOW`<br/><br/>The default of `KEY_GPU_MODEL_PRIORITY` is `GPU_QUEUE_PRIORITY_DEFAULT\|GPU_HOST_TASK_PRIORITY_ANY`.<br> |
| `KEY_GPU_HOST_`<br>`TASK_PRIORITY` | `GPU_HOST_TASK_PRIORITY_<HIGH\|MEDIUM\|LOW>` | `GPU_HOST_TASK_PRIORITY_MEDIUM` | This key instructs the GPU plugin which CPU core type to use for TBB affinity when loading a network. <br> It has three levels: HIGH, MEDIUM, and LOW, and only takes effect on hybrid CPUs. <br>- LOW - instructs the GPU Plugin to use LITTLE cores if they are available <br>- MEDIUM (DEFAULT) - instructs the GPU Plugin to use any available cores (BIG or LITTLE cores) <br>- HIGH - instructs the GPU Plugin to use BIG cores if they are available |
| `KEY_GPU_PLUGIN_`<br>`PRIORITY` | `<0-3>` | `0` | OpenCL queue priority (before usage, make sure your OpenCL driver supports the appropriate extension)<br> Higher value means higher priority for the OpenCL queue. 0 disables the setting. **Deprecated**. Use `KEY_GPU_MODEL_PRIORITY` instead |
| `KEY_GPU_PLUGIN_`<br>`THROTTLE` | `<0-3>` | `0` | OpenCL queue throttling (before usage, make sure your OpenCL driver supports the appropriate extension)<br> Lower value means lower driver thread priority and longer sleep time for it. 0 disables the setting. |
| `KEY_GPU_PLUGIN_`<br>`THROTTLE` | `<0-3>` | `2` | OpenCL queue throttling (before usage, make sure your OpenCL driver supports the appropriate extension)<br> Lower value means lower driver thread priority and longer sleep time for it. Has no effect if the driver does not support the required hint. |
| `KEY_CLDNN_ENABLE_`<br>`FP16_FOR_QUANTIZED_`<br>`MODELS` | `YES` / `NO` | `YES` | Allows using FP16+INT8 mixed precision mode, so non-quantized parts of a model will be executed in FP16 precision for FP16 IR. Does not affect quantized FP32 IRs |
| `KEY_GPU_NV12_`<br>`TWO_INPUTS` | `YES` / `NO` | `NO` | Controls preprocessing logic for NV12 input. If set to YES, the device graph expects the user to pass a biplanar NV12 blob as input, which is forwarded directly to the device execution graph. Otherwise, preprocessing via G-API is used to convert NV12->BGR, so the GPU graph has to expect a single input |
| `KEY_GPU_THROUGHPUT_`<br>`STREAMS` | `KEY_GPU_THROUGHPUT_AUTO`, or positive integer| 1 | Specifies the number of GPU "execution" streams for the throughput mode (an upper bound on the number of inference requests that can be executed simultaneously).<br>This option can be used to decrease GPU stall time by providing a more effective load from several streams. Increasing the number of streams is usually more effective for smaller topologies or smaller input sizes. Note that your application should provide enough parallel slack (e.g. running many inference requests) to leverage the full GPU bandwidth. Additional streams consume several times more GPU memory, so make sure the system has enough memory available to suit parallel stream execution. Multiple streams might also put additional load on the CPU. If CPU load increases, it can be regulated by setting an appropriate `KEY_GPU_PLUGIN_THROTTLE` option value (see above). If your target system has a relatively weak CPU, keep throttling low. <br>The default value is 1, which implies latency-oriented behavior.<br>`KEY_GPU_THROUGHPUT_AUTO` creates the bare minimum of streams needed to improve performance; this is the most portable option if you are not sure how many resources your target machine has (and what the optimal number of streams would be). <br> A positive integer value creates the requested number of streams. |
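
For reference, below is a minimal sketch of passing several of the keys above to `LoadNetwork` as raw strings. The key/value strings are the raw-string forms assumed from this table (the `KEY_` prefix omitted), and the combination shown is purely illustrative.

```cpp
#include <ie_core.hpp>

#include <map>
#include <string>

int main() {
    InferenceEngine::Core core;
    auto network = core.ReadNetwork("sample.xml");
    // Raw-string key/value forms assumed from the table above
    std::map<std::string, std::string> config = {
        {"GPU_THROUGHPUT_STREAMS", "2"},                            // two execution streams for throughput mode
        {"GPU_PLUGIN_THROTTLE", "1"},                               // lower driver thread priority
        {"GPU_HOST_TASK_PRIORITY", "GPU_HOST_TASK_PRIORITY_HIGH"},  // prefer BIG cores on hybrid CPUs
        {"CACHE_DIR", "gpu_cache"}                                  // cache compiled OCL binaries between runs
    };
    auto exeNetwork = core.LoadNetwork(network, "GPU", config);
    return 0;
}
```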


@ -1,12 +1,12 @@
#include <ie_core.hpp>
#include <openvino/runtime/core.hpp>
#include <openvino/runtime/intel_gpu/properties.hpp>
int main() {
using namespace InferenceEngine;
//! [part0]
InferenceEngine::Core core;
auto network = core.ReadNetwork("sample.xml");
auto exeNetwork = core.LoadNetwork(network, "GPU");
std::map<std::string, uint64_t> statistics_map = core.GetMetric("GPU", GPU_METRIC_KEY(MEMORY_STATISTICS));
ov::Core core;
auto model = core.read_model("sample.xml");
auto compiledModel = core.compile_model(model, "GPU");
std::map<std::string, uint64_t> statistics_map = core.get_property("GPU", ov::intel_gpu::memory_statistics);
//! [part0]
return 0;
}


@ -1,25 +1,25 @@
#include <ie_core.hpp>
#include <openvino/runtime/core.hpp>
#include <openvino/runtime/intel_gpu/properties.hpp>
int main() {
using namespace InferenceEngine;
//! [part1]
InferenceEngine::Core core;
CNNNetwork cnnNetwork = core.ReadNetwork("network.xml");
ov::Core core;
std::shared_ptr<ov::Model> model = core.read_model("network.xml");
uint32_t n_streams = 2;
int64_t available_device_mem_size = 3221225472;
ov::AnyMap options = {
ov::hint::model(model), // Required. Set the address of the target network. If this is not set, the MAX_BATCH_SIZE returns 1.
ov::streams::num(n_streams), // Optional. Set only when you want to estimate the max batch size for a specific number of throughput streams. Default is 1 or the number of throughput streams set by SetConfig.
ov::intel_gpu::hint::available_device_mem(available_device_mem_size) // Optional. Set only when you want to limit the available device mem size.
};
std::map<std::string, Parameter> options = {{"MODEL_PTR", cnnNetwork.getFunction()}}; // Required. Set the address of the target network. If this is not set, the MAX_BATCH_SIZE returns 1.
options.insert(std::make_pair("GPU_THROUGHPUT_STREAMS", n_streams)); // Optional. Set only when you want to estimate max batch size for a specific throughtput streams. Default is 1 or throughtput streams set by SetConfig.
options.insert(std::make_pair("AVAILABLE_DEVICE_MEM_SIZE", available_device_mem_size)); // Optional. Set only when you want to limit the available device mem size.
auto max_batch_size = core.GetMetric("GPU", GPU_METRIC_KEY(MAX_BATCH_SIZE), options).as<uint32_t>();
uint32_t max_batch_size = core.get_property("GPU", ov::max_batch_size, options);
//! [part1]
//! [part2]
std::map<std::string, Parameter> opt = {{"MODEL_PTR", cnnNetwork.getFunction()}}; // Required. Same usage as for the MAX_BATCH_SIZE above. If not set, the OPTIONAL_BATCH_SIZE returns 1.
// This is not entirely GPU-specific metric (so METRIC_KEY is used rather than GPU_METRIC_KEY below),
// but the GPU is the only device that supports that at the moment.
// For the GPU, the metric already accommodates the on-device memory limitation that the MAX_BATCH_SIZE poses,
// so OPTIMAL_BATCH_SIZE is always less than MAX_BATCH_SIZE. Unlike the latter, it is also aligned to a power of 2.
auto optimal_batch_size = core.GetMetric("GPU", METRIC_KEY(OPTIMAL_BATCH_SIZE), options).as<unsigned int>();
uint32_t optimal_batch_size = core.get_property("GPU", ov::optimal_batch_size, options);
//! [part2]
}


@ -51,11 +51,6 @@ DECLARE_GPU_METRIC_KEY(EXECUTION_UNITS_COUNT, int);
*/
DECLARE_GPU_METRIC_KEY(MEMORY_STATISTICS, std::map<std::string, uint64_t>);
/**
* @brief Metric to get maximum batch size which does not cause performance degradation due to memory swap impact.
*/
DECLARE_GPU_METRIC_KEY(MAX_BATCH_SIZE, uint32_t);
/**
* @brief Possible return value for OPTIMIZATION_CAPABILITIES metric
* - "HW_MATMUL" - Defines if device has hardware block for matrix multiplication
@ -76,48 +71,6 @@ namespace GPUConfigParams {
#define DECLARE_GPU_CONFIG_KEY(name) DECLARE_CONFIG_KEY(GPU_##name)
#define DECLARE_GPU_CONFIG_VALUE(name) DECLARE_CONFIG_VALUE(GPU_##name)
/**
* @brief This key instructs the GPU plugin to use two priorities of GPU configuration as follows:
* OpenCL queue priority hint as defined in https://www.khronos.org/registry/OpenCL/specs/opencl-2.1-extensions.pdf,
* it has 4 levels: High, Med, Low, and Default; the default is Default.
* Host task priority, which sets the CPU core type of the TBB affinity used in LoadNetwork.
* It has 3 levels: High, Low, and Any; the default is Any.
* It only takes effect on hybrid CPUs; if the device does not have a hybrid CPU, it is set to the default.
*
* There are two types of setting you can choose from: model-level setting and queue/host-task-level setting.
* The model-level setting is a predefined combination of OpenCL queue priority and host task priority.
* It provides only two levels: High and Low.
* Queue/Host Task level setting is the combination of OpenCL Queue priority and host task priority
* such as GPU_QUEUE_PRIORITY_HIGH|GPU_HOST_TASK_PRIORITY_HIGH.
* You can set each level of OpenCL queue priority and host task priority directly using this setting.
*
* The default value of GPU_MODEL_PRIORITY is "GPU_QUEUE_PRIORITY_DEFAULT|GPU_HOST_TASK_PRIORITY_ANY".
* The detailed option values are as follows:
* Model priority
* GPUConfigParams::GPU_MODEL_PRIORITY_HIGH - GPU_QUEUE_PRIORITY_HIGH|GPU_HOST_TASK_PRIORITY_HIGH
* GPUConfigParams::GPU_MODEL_PRIORITY_LOW - GPU_QUEUE_PRIORITY_LOW|GPU_HOST_TASK_PRIORITY_LOW
* OpenCL queue priority
* GPUConfigParams::GPU_QUEUE_PRIORITY_HIGH - mapped to CL_QUEUE_PRIORITY_HIGH_KHR
* GPUConfigParams::GPU_QUEUE_PRIORITY_MED - mapped to CL_QUEUE_PRIORITY_MED_KHR
* GPUConfigParams::GPU_QUEUE_PRIORITY_LOW - mapped to CL_QUEUE_PRIORITY_LOW_KHR
* GPUConfigParams::GPU_QUEUE_PRIORITY_DEFAULT - does not set the queue priority property in cl_queue_properties
* Host task priority
* GPUConfigParams::GPU_HOST_TASK_PRIORITY_HIGH - mapped to IStreamsExecutor::Config::BIG
* GPUConfigParams::GPU_HOST_TASK_PRIORITY_LOW - mapped to IStreamsExecutor::Config::LITTLE
* GPUConfigParams::GPU_HOST_TASK_PRIORITY_ANY - mapped to IStreamsExecutor::Config::ANY
*/
DECLARE_GPU_CONFIG_KEY(MODEL_PRIORITY);
DECLARE_GPU_CONFIG_VALUE(MODEL_PRIORITY_HIGH);
DECLARE_GPU_CONFIG_VALUE(MODEL_PRIORITY_LOW);
DECLARE_GPU_CONFIG_VALUE(QUEUE_PRIORITY_HIGH);
DECLARE_GPU_CONFIG_VALUE(QUEUE_PRIORITY_MED);
DECLARE_GPU_CONFIG_VALUE(QUEUE_PRIORITY_LOW);
DECLARE_GPU_CONFIG_VALUE(QUEUE_PRIORITY_DEFAULT);
DECLARE_GPU_CONFIG_VALUE(HOST_TASK_PRIORITY_HIGH);
DECLARE_GPU_CONFIG_VALUE(HOST_TASK_PRIORITY_LOW);
DECLARE_GPU_CONFIG_VALUE(HOST_TASK_PRIORITY_ANY);
/**
* @brief This key instructs the GPU plugin to use the OpenCL queue priority hint
* as defined in https://www.khronos.org/registry/OpenCL/specs/opencl-2.1-extensions.pdf
@ -134,6 +87,18 @@ DECLARE_GPU_CONFIG_KEY(PLUGIN_PRIORITY);
*/
DECLARE_GPU_CONFIG_KEY(PLUGIN_THROTTLE);
/**
* @brief This key instructs the GPU plugin which CPU core type to use for TBB affinity when loading a network.
* This option has 3 levels: HIGH, MEDIUM, and LOW. It only takes effect on hybrid CPUs.
* - LOW - instructs the GPU Plugin to use LITTLE cores if they are available
* - MEDIUM (DEFAULT) - instructs the GPU Plugin to use any available cores (BIG or LITTLE cores)
* - HIGH - instructs the GPU Plugin to use BIG cores if they are available
*/
DECLARE_GPU_CONFIG_KEY(HOST_TASK_PRIORITY);
DECLARE_GPU_CONFIG_VALUE(HOST_TASK_PRIORITY_HIGH);
DECLARE_GPU_CONFIG_VALUE(HOST_TASK_PRIORITY_MEDIUM);
DECLARE_GPU_CONFIG_VALUE(HOST_TASK_PRIORITY_LOW);
/**
* @brief This key should be set to correctly handle NV12 input without pre-processing.
* Turned off by default.


@ -131,6 +131,16 @@ DECLARE_METRIC_KEY(RANGE_FOR_STREAMS, std::tuple<unsigned int, unsigned int>);
*/
DECLARE_METRIC_KEY(OPTIMAL_BATCH_SIZE, unsigned int);
/**
* @brief Metric to get maximum batch size which does not cause performance degradation due to memory swap impact.
*
* Metric returns a value of unsigned int type.
* Note that the returned value may not be aligned to a power of 2.
* Also, MODEL_PTR is a required option for this metric, since the available max batch size depends on the model size.
* If MODEL_PTR is not given, the metric returns 1.
*/
DECLARE_METRIC_KEY(MAX_BATCH_SIZE, unsigned int);
/**
* @brief Metric to provide a hint for a range for number of async infer requests. If device supports streams,
* the metric provides range for number of IRs per stream.
@ -241,6 +251,15 @@ namespace PluginConfigParams {
#define CONFIG_VALUE(name) InferenceEngine::PluginConfigParams::name
#define DECLARE_CONFIG_VALUE(name) static constexpr auto name = #name
/**
* @brief (Optional) config key that defines which model should be given the more performant bounded resource first.
* It provides 3 levels: High, Medium, and Low. The default value is Medium
*/
DECLARE_CONFIG_KEY(MODEL_PRIORITY);
DECLARE_CONFIG_VALUE(MODEL_PRIORITY_HIGH);
DECLARE_CONFIG_VALUE(MODEL_PRIORITY_MED);
DECLARE_CONFIG_VALUE(MODEL_PRIORITY_LOW);
/**
* @brief High-level OpenVINO Performance Hints
* unlike low-level config keys that are individual (per-device), the hints are something that every device accepts


@ -0,0 +1,106 @@
// Copyright (C) 2022 Intel Corporation
// SPDX-License-Identifier: Apache-2.0
//
/**
* @brief A header for advanced hardware related properties for GPU plugin
* To use in set_property, compile_model, import_model, get_property methods
*
* @file openvino/runtime/intel_gpu/properties.hpp
*/
#pragma once
#include "openvino/runtime/properties.hpp"
namespace ov {
namespace intel_gpu {
/**
* @brief Read-only property which defines size of memory in bytes available for the device. For iGPU it returns host
* memory size, for dGPU - dedicated gpu memory size
*/
static constexpr Property<uint64_t, PropertyMutability::RO> device_total_mem_size{"GPU_DEVICE_TOTAL_MEM_SIZE"};
/**
* @brief Read-only property to get microarchitecture identifier in major.minor.revision format
*/
static constexpr Property<std::string, PropertyMutability::RO> uarch_version{"GPU_UARCH_VERSION"};
/**
* @brief Read-only property to get count of execution units for current GPU
*/
static constexpr Property<int32_t, PropertyMutability::RO> execution_units_count{"GPU_EXECUTION_UNITS_COUNT"};
/**
* @brief Read-only property to get statistics of GPU memory allocated by the engine for each allocation type.
* It contains information about current memory usage
*/
static constexpr Property<std::map<std::string, uint64_t>, PropertyMutability::RO> memory_statistics{
"GPU_MEMORY_STATISTICS"};
/**
* @brief Turning on this key enables unrolling of recurrent layers such as TensorIterator or Loop with a fixed iteration
* count. This key is turned on by default. Turning it on achieves better inference performance for loops with few
* iterations (fewer than 16, as a rule of thumb). Turning it off achieves better performance for both graph loading
* time and inference time with many iterations (greater than 16). Note that turning this key on increases the graph
* loading time in proportion to the iteration count.
* Thus, this key should be turned off if graph loading time is the most important target to optimize.*/
static constexpr Property<bool> enable_loop_unrolling{"GPU_ENABLE_LOOP_UNROLLING"};
namespace hint {
/**
* @brief This enum represents the possible value of ov::intel_gpu::hint::queue_throttle property:
* - LOW is used for CL_QUEUE_THROTTLE_LOW_KHR OpenCL throttle hint
* - MEDIUM (DEFAULT) is used for CL_QUEUE_THROTTLE_MED_KHR OpenCL throttle hint
* - HIGH is used for CL_QUEUE_THROTTLE_HIGH_KHR OpenCL throttle hint
*/
using ThrottleLevel = ov::hint::Priority;
/**
* @brief This key instructs the GPU plugin to use OpenCL queue throttle hints
* as defined in https://www.khronos.org/registry/OpenCL/specs/opencl-2.1-extensions.pdf,
* chapter 9.19. This option should be used with ov::intel_gpu::hint::ThrottleLevel values.
*/
static constexpr Property<ov::hint::Priority> queue_throttle{"GPU_QUEUE_THROTTLE"};
/**
* @brief This key instructs the GPU plugin to use the OpenCL queue priority hint
* as defined in https://www.khronos.org/registry/OpenCL/specs/opencl-2.1-extensions.pdf.
* This option should be used with ov::hint::Priority:
* - LOW is used for CL_QUEUE_PRIORITY_LOW_KHR OpenCL priority hint
* - MEDIUM (DEFAULT) is used for CL_QUEUE_PRIORITY_MED_KHR OpenCL priority hint
* - HIGH is used for CL_QUEUE_PRIORITY_HIGH_KHR OpenCL priority hint
*/
static constexpr Property<ov::hint::Priority> queue_priority{"GPU_QUEUE_PRIORITY"};
/**
* @brief This key instructs the GPU plugin which CPU core type to use for TBB affinity when loading a network.
* This option has 3 levels: HIGH, MEDIUM, and LOW. It only takes effect on hybrid CPUs.
* - LOW - instructs the GPU Plugin to use LITTLE cores if they are available
* - MEDIUM (DEFAULT) - instructs the GPU Plugin to use any available cores (BIG or LITTLE cores)
* - HIGH - instructs the GPU Plugin to use BIG cores if they are available
*/
static constexpr Property<ov::hint::Priority> host_task_priority{"OV_GPU_HOST_TASK_PRIORITY"};
/**
* @brief This key identifies available device memory size in bytes
*/
static constexpr Property<int64_t> available_device_mem{"AVAILABLE_DEVICE_MEM_SIZE"};
} // namespace hint
namespace memory_type {
/**
* @brief These keys instruct the GPU plugin to use surface/buffer memory type.
*/
static constexpr auto surface = "GPU_SURFACE"; //!< Native video decoder surface
static constexpr auto buffer = "GPU_BUFFER"; //!< OpenCL buffer
} // namespace memory_type
namespace capability {
/**
* @brief Possible return value for ov::device::capabilities property
*/
constexpr static const auto HW_MATMUL = "GPU_HW_MATMUL"; //!< Device has hardware block for matrix multiplication
} // namespace capability
} // namespace intel_gpu
} // namespace ov
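
The snippet below is a minimal sketch of how the properties declared in this header could be combined with the OpenVINO 2.0 API; the particular property values are illustrative assumptions, not recommendations.

```cpp
#include <openvino/runtime/core.hpp>
#include <openvino/runtime/intel_gpu/properties.hpp>

int main() {
    ov::Core core;
    auto model = core.read_model("sample.xml");

    // Query a read-only GPU-specific property
    uint64_t total_mem = core.get_property("GPU", ov::intel_gpu::device_total_mem_size);

    // Compile with GPU-specific hints (illustrative combination of the properties above)
    ov::AnyMap config = {
        ov::hint::model_priority(ov::hint::Priority::HIGH),
        ov::intel_gpu::hint::queue_throttle(ov::intel_gpu::hint::ThrottleLevel::LOW),
        ov::intel_gpu::hint::host_task_priority(ov::hint::Priority::HIGH),
        ov::intel_gpu::enable_loop_unrolling(false)
    };
    auto compiled_model = core.compile_model(model, "GPU", config);

    (void)total_mem;
    (void)compiled_model;
    return 0;
}
```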


@ -0,0 +1,177 @@
// Copyright (C) 2022 Intel Corporation
// SPDX-License-Identifier: Apache-2.0
//
/**
* @brief A header for properties of shared device contexts and shared device memory blobs for GPU plugin
* To use in constructors of Remote objects
*
* @file openvino/runtime/intel_gpu/remote_properties.hpp
*/
#pragma once
#include "openvino/runtime/properties.hpp"
namespace ov {
namespace intel_gpu {
using gpu_handle_param = void*;
/**
* @brief Enum to define the type of the shared context
*/
enum class ContextType {
OCL = 0, //!< Pure OpenCL context
VA_SHARED = 1, //!< Context shared with a video decoding device
};
/** @cond INTERNAL */
inline std::ostream& operator<<(std::ostream& os, const ContextType& context_type) {
switch (context_type) {
case ContextType::OCL:
return os << "OCL";
case ContextType::VA_SHARED:
return os << "VA_SHARED";
default:
IE_THROW() << "Unsupported context type";
}
}
inline std::istream& operator>>(std::istream& is, ContextType& context_type) {
std::string str;
is >> str;
if (str == "OCL") {
context_type = ContextType::OCL;
} else if (str == "VA_SHARED") {
context_type = ContextType::VA_SHARED;
} else {
IE_THROW() << "Unsupported context type: " + str;
}
return is;
}
/** @endcond */
/**
* @brief Shared device context type: can be either pure OpenCL (OCL)
* or shared video decoder (VA_SHARED) context
*/
static constexpr Property<ContextType> context_type{"CONTEXT_TYPE"};
/**
* @brief This key identifies OpenCL context handle
* in a shared context or shared memory blob parameter map
*/
static constexpr Property<gpu_handle_param> ocl_context{"OCL_CONTEXT"};
/**
* @brief This key identifies ID of device in OpenCL context
* if multiple devices are present in the context
*/
static constexpr Property<int> ocl_context_device_id{"OCL_CONTEXT_DEVICE_ID"};
/**
* @brief In case of multi-tile system,
* this key identifies tile within given context
*/
static constexpr Property<int> tile_id{"TILE_ID"};
/**
* @brief This key identifies OpenCL queue handle in a shared context
*/
static constexpr Property<gpu_handle_param> ocl_queue{"OCL_QUEUE"};
/**
* @brief This key identifies video acceleration device/display handle
* in a shared context or shared memory blob parameter map
*/
static constexpr Property<gpu_handle_param> va_device{"VA_DEVICE"};
/**
* @brief Enum to define the type of the shared memory buffer
*/
enum class SharedMemType {
OCL_BUFFER = 0, //!< Shared OpenCL buffer blob
OCL_IMAGE2D = 1, //!< Shared OpenCL 2D image blob
USM_USER_BUFFER = 2, //!< Shared USM pointer allocated by user
USM_HOST_BUFFER = 3, //!< Shared USM pointer type with host allocation type allocated by plugin
USM_DEVICE_BUFFER = 4, //!< Shared USM pointer type with device allocation type allocated by plugin
VA_SURFACE = 5, //!< Shared video decoder surface or D3D 2D texture blob
DX_BUFFER = 6 //!< Shared D3D buffer blob
};
/** @cond INTERNAL */
inline std::ostream& operator<<(std::ostream& os, const SharedMemType& share_mem_type) {
switch (share_mem_type) {
case SharedMemType::OCL_BUFFER:
return os << "OCL_BUFFER";
case SharedMemType::OCL_IMAGE2D:
return os << "OCL_IMAGE2D";
case SharedMemType::USM_USER_BUFFER:
return os << "USM_USER_BUFFER";
case SharedMemType::USM_HOST_BUFFER:
return os << "USM_HOST_BUFFER";
case SharedMemType::USM_DEVICE_BUFFER:
return os << "USM_DEVICE_BUFFER";
case SharedMemType::VA_SURFACE:
return os << "VA_SURFACE";
case SharedMemType::DX_BUFFER:
return os << "DX_BUFFER";
default:
IE_THROW() << "Unsupported memory type";
}
}
inline std::istream& operator>>(std::istream& is, SharedMemType& share_mem_type) {
std::string str;
is >> str;
if (str == "OCL_BUFFER") {
share_mem_type = SharedMemType::OCL_BUFFER;
} else if (str == "OCL_IMAGE2D") {
share_mem_type = SharedMemType::OCL_IMAGE2D;
} else if (str == "USM_USER_BUFFER") {
share_mem_type = SharedMemType::USM_USER_BUFFER;
} else if (str == "USM_HOST_BUFFER") {
share_mem_type = SharedMemType::USM_HOST_BUFFER;
} else if (str == "USM_DEVICE_BUFFER") {
share_mem_type = SharedMemType::USM_DEVICE_BUFFER;
} else if (str == "VA_SURFACE") {
share_mem_type = SharedMemType::VA_SURFACE;
} else if (str == "DX_BUFFER") {
share_mem_type = SharedMemType::DX_BUFFER;
} else {
IE_THROW() << "Unsupported memory type: " + str;
}
return is;
}
/** @endcond */
/**
* @brief This key identifies type of internal shared memory
* in a shared memory blob parameter map.
*/
static constexpr Property<SharedMemType> shared_mem_type{"SHARED_MEM_TYPE"};
/**
* @brief This key identifies OpenCL memory handle
* in a shared memory blob parameter map
*/
static constexpr Property<gpu_handle_param> mem_handle{"MEM_HANDLE"};
/**
* @brief This key identifies video decoder surface handle
* in a shared memory blob parameter map
*/
#ifdef _WIN32
static constexpr Property<gpu_handle_param> dev_object_handle{"DEV_OBJECT_HANDLE"};
#else
static constexpr Property<uint32_t> dev_object_handle{"DEV_OBJECT_HANDLE"};
#endif
/**
* @brief This key identifies video decoder surface plane
* in a shared memory blob parameter map
*/
static constexpr Property<uint32_t> va_plane{"VA_PLANE"};
} // namespace intel_gpu
} // namespace ov
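
As a rough illustration of how these remote properties fit together, the sketch below builds the parameter map for wrapping an existing OpenCL context. The handle value and the API that eventually consumes the map (a remote-context constructor) are assumptions and not part of this header.

```cpp
#include <openvino/runtime/core.hpp>
#include <openvino/runtime/intel_gpu/remote_properties.hpp>

// Hypothetical helper: pack a user-provided cl_context handle into the
// parameter map describing a shared OpenCL context for the GPU plugin.
ov::AnyMap make_ocl_context_params(void* user_cl_context, int device_id) {
    return ov::AnyMap{
        ov::intel_gpu::context_type(ov::intel_gpu::ContextType::OCL),  // pure OpenCL context
        ov::intel_gpu::ocl_context(user_cl_context),                   // OpenCL context handle
        ov::intel_gpu::ocl_context_device_id(device_id)                // device index within the context
    };
}

int main() {
    void* user_cl_context = nullptr;  // placeholder for a real cl_context handle
    auto params = make_ocl_context_params(user_cl_context, 0);
    // The map would then be passed to the GPU plugin's remote-context creation API.
    (void)params;
    return 0;
}
```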


@ -188,37 +188,38 @@ namespace hint {
static constexpr Property<element::Type, PropertyMutability::RW> inference_precision{"INFERENCE_PRECISION_HINT"};
/**
* @brief Enum to define possible model priorities hints
* @brief Enum to define possible priority hints
*/
enum class ModelPriority {
enum class Priority {
LOW = 0,
MEDIUM = 1,
HIGH = 2,
DEFAULT = MEDIUM,
};
/** @cond INTERNAL */
inline std::ostream& operator<<(std::ostream& os, const ModelPriority& model_priority) {
switch (model_priority) {
case ModelPriority::LOW:
inline std::ostream& operator<<(std::ostream& os, const Priority& priority) {
switch (priority) {
case Priority::LOW:
return os << "LOW";
case ModelPriority::MEDIUM:
case Priority::MEDIUM:
return os << "MEDIUM";
case ModelPriority::HIGH:
case Priority::HIGH:
return os << "HIGH";
default:
throw ov::Exception{"Unsupported performance measure hint"};
}
}
inline std::istream& operator>>(std::istream& is, ModelPriority& model_priority) {
inline std::istream& operator>>(std::istream& is, Priority& priority) {
std::string str;
is >> str;
if (str == "LOW") {
model_priority = ModelPriority::LOW;
priority = Priority::LOW;
} else if (str == "MEDIUM") {
model_priority = ModelPriority::MEDIUM;
priority = Priority::MEDIUM;
} else if (str == "HIGH") {
model_priority = ModelPriority::HIGH;
priority = Priority::HIGH;
} else {
throw ov::Exception{"Unsupported model priority: " + str};
}
@ -230,7 +231,7 @@ inline std::istream& operator>>(std::istream& is, ModelPriority& model_priority)
* @brief High-level OpenVINO model priority hint
* Defines what model should be provided with more performant bounded resource first
*/
static constexpr Property<ModelPriority> model_priority{"MODEL_PRIORITY"};
static constexpr Property<Priority> model_priority{"OV_MODEL_PRIORITY"};
/**
* @brief Enum to define possible performance mode hints
@ -279,6 +280,12 @@ static constexpr Property<PerformanceMode> performance_mode{"PERFORMANCE_HINT"};
* usually this value comes from the actual use-case (e.g. number of video-cameras, or other sources of inputs)
*/
static constexpr Property<uint32_t> num_requests{"PERFORMANCE_HINT_NUM_REQUESTS"};
/**
* @brief This key identifies a shared pointer to the ov::Model, required for some properties (ov::max_batch_size and
* ov::optimal_batch_size)
*/
static constexpr Property<std::shared_ptr<ov::Model>> model{"MODEL_PTR"};
} // namespace hint
/**
@ -385,14 +392,20 @@ static constexpr Property<std::tuple<unsigned int, unsigned int>, PropertyMutabi
*
* Property returns a value of unsigned int type.
* Returns the optimal batch size for a given network on the given device. The returned value is aligned to a power of 2.
* Also, MODEL_PTR is the required option for this metric since the optimal batch size depends on the model,
* so if the MODEL_PTR is not given, the result of the metric is always 1.
* Also, ov::hint::model is the required option for this metric since the optimal batch size depends on the model,
* so if the ov::hint::model is not given, the result of the metric is always 1.
* For the GPU the metric is queried automatically whenever the OpenVINO performance hint for the throughput is used,
* so that the result (>1) governs the automatic batching (transparently to the application).
* The automatic batching can be disabled with ALLOW_AUTO_BATCHING set to NO
*/
static constexpr Property<unsigned int, PropertyMutability::RO> optimal_batch_size{"OPTIMAL_BATCH_SIZE"};
/**
* @brief Read-only property to get maximum batch size which does not cause performance degradation due to memory swap
* impact.
*/
static constexpr Property<uint32_t, PropertyMutability::RO> max_batch_size{"MAX_BATCH_SIZE"};
/**
* @brief Read-only property to provide a hint for a range for number of async infer requests. If device supports
* streams, the metric provides range for number of IRs per stream.


@ -585,13 +585,13 @@ void MultiDeviceInferencePlugin::CheckConfig(const std::map<std::string, std::st
try {
int priority = -1;
if (kvp.second == "LOW") {
priority = static_cast<int>(ov::hint::ModelPriority::HIGH) - static_cast<int>(ov::hint::ModelPriority::LOW);
priority = static_cast<int>(ov::hint::Priority::HIGH) - static_cast<int>(ov::hint::Priority::LOW);
}
if (kvp.second == "MEDIUM") {
priority = static_cast<int>(ov::hint::ModelPriority::HIGH) - static_cast<int>(ov::hint::ModelPriority::MEDIUM);
priority = static_cast<int>(ov::hint::Priority::HIGH) - static_cast<int>(ov::hint::Priority::MEDIUM);
}
if (kvp.second == "HIGH") {
priority = static_cast<int>(ov::hint::ModelPriority::HIGH) - static_cast<int>(ov::hint::ModelPriority::HIGH);
priority = static_cast<int>(ov::hint::Priority::HIGH) - static_cast<int>(ov::hint::Priority::HIGH);
}
if (priority < 0) {
IE_THROW() << "Unsupported config value: " << kvp.second


@ -19,6 +19,8 @@
#include <utility>
#include <vector>
#include "openvino/runtime/intel_gpu/properties.hpp"
namespace AutoBatchPlugin {
using namespace InferenceEngine;
@ -748,7 +750,8 @@ InferenceEngine::IExecutableNetworkInternal::Ptr AutoBatchInferencePlugin::LoadN
auto report_footprint = [](std::shared_ptr<ICore> pCore, std::string device) -> size_t {
size_t footprint = 0;
// TODO: use the per-network metric (22.2) rather than plugin-level
auto stats = pCore->GetMetric(device, GPU_METRIC_KEY(MEMORY_STATISTICS)).as<std::map<std::string, uint64_t>>();
auto stats =
pCore->GetMetric(device, ov::intel_gpu::memory_statistics.name()).as<std::map<std::string, uint64_t>>();
for (auto s : stats)
if (s.first.find("_current") != std::string::npos)
footprint += s.second;


@ -8,8 +8,9 @@
#include <string>
#include "intel_gpu/plugin/custom_layer.hpp"
#include <ie_performance_hints.hpp>
#include "intel_gpu/graph/network.hpp"
#include "openvino/runtime/intel_gpu/properties.hpp"
#include <ie_performance_hints.hpp>
#include <threading/ie_cpu_streams_executor.hpp>
namespace ov {
@ -28,8 +29,8 @@ struct Config {
enableInt8(true),
nv12_two_inputs(false),
enable_fp16_for_quantized_models(true),
queuePriority(cldnn::priority_mode_types::disabled),
queueThrottle(cldnn::throttle_mode_types::disabled),
queuePriority(cldnn::priority_mode_types::med),
queueThrottle(cldnn::throttle_mode_types::med),
max_dynamic_batch(1),
customLayers({}),
tuningConfig(),
@ -53,6 +54,7 @@ struct Config {
}
void UpdateFromMap(const std::map<std::string, std::string>& configMap);
void adjustKeyMapValues();
static bool isNewApiProperty(std::string property);
std::string device_id;
uint16_t throughput_streams;


@ -87,8 +87,8 @@ struct engine_configuration {
bool enable_profiling = false,
queue_types queue_type = queue_types::out_of_order,
const std::string& sources_dumps_dir = std::string(),
priority_mode_types priority_mode = priority_mode_types::disabled,
throttle_mode_types throttle_mode = throttle_mode_types::disabled,
priority_mode_types priority_mode = priority_mode_types::med,
throttle_mode_types throttle_mode = throttle_mode_types::med,
bool use_memory_pool = true,
bool use_unified_shared_memory = true,
const std::string& kernels_cache_path = "",


@ -8,6 +8,7 @@
#include "intel_gpu/plugin/infer_request.hpp"
#include "intel_gpu/plugin/compiled_model.hpp"
#include "intel_gpu/plugin/async_infer_request.hpp"
#include "openvino/runtime/intel_gpu/properties.hpp"
#include <description_buffer.hpp>
#include <threading/ie_executor_manager.hpp>
@ -145,9 +146,27 @@ InferenceEngine::Parameter CompiledModel::GetConfig(const std::string &name) con
}
InferenceEngine::Parameter CompiledModel::GetMetric(const std::string &name) const {
if (name == METRIC_KEY(NETWORK_NAME)) {
if (name == ov::supported_properties) {
return decltype(ov::supported_properties)::value_type {
// Metrics
ov::PropertyName{ov::supported_properties.name(), PropertyMutability::RO},
ov::PropertyName{ov::model_name.name(), PropertyMutability::RO},
ov::PropertyName{ov::optimal_number_of_infer_requests.name(), PropertyMutability::RO},
// Configs
PropertyName{ov::enable_profiling.name(), PropertyMutability::RW},
PropertyName{ov::hint::model_priority.name(), PropertyMutability::RW},
PropertyName{ov::intel_gpu::hint::host_task_priority.name(), PropertyMutability::RW},
PropertyName{ov::intel_gpu::hint::queue_priority.name(), PropertyMutability::RW},
PropertyName{ov::intel_gpu::hint::queue_throttle.name(), PropertyMutability::RW},
PropertyName{ov::intel_gpu::enable_loop_unrolling.name(), PropertyMutability::RW},
PropertyName{ov::cache_dir.name(), PropertyMutability::RW},
PropertyName{ov::hint::performance_mode.name(), PropertyMutability::RW},
PropertyName{ov::compilation_num_threads.name(), PropertyMutability::RW}
};
} else if (name == ov::model_name) {
IE_ASSERT(!m_graphs.empty());
IE_SET_METRIC_RETURN(NETWORK_NAME, m_graphs[0]->getName());
return decltype(ov::model_name)::value_type {m_graphs[0]->getName()};
} else if (name == METRIC_KEY(SUPPORTED_METRICS)) {
std::vector<std::string> metrics;
metrics.push_back(METRIC_KEY(NETWORK_NAME));
@ -158,13 +177,14 @@ InferenceEngine::Parameter CompiledModel::GetMetric(const std::string &name) con
} else if (name == METRIC_KEY(SUPPORTED_CONFIG_KEYS)) {
std::vector<std::string> configKeys;
for (auto && value : m_config.key_config_map)
configKeys.push_back(value.first);
if (!Config::isNewApiProperty(value.first))
configKeys.push_back(value.first);
IE_SET_METRIC_RETURN(SUPPORTED_CONFIG_KEYS, configKeys);
} else if (name == METRIC_KEY(OPTIMAL_NUMBER_OF_INFER_REQUESTS)) {
} else if (name == ov::optimal_number_of_infer_requests) {
unsigned int nr = m_config.throughput_streams;
if (m_config.perfHintsConfig.ovPerfHint != CONFIG_VALUE(LATENCY))
nr *= 2;
IE_SET_METRIC_RETURN(OPTIMAL_NUMBER_OF_INFER_REQUESTS, nr);
return decltype(ov::optimal_number_of_infer_requests)::value_type {nr};
} else {
IE_THROW() << "Unsupported ExecutableNetwork metric: " << name;
}


@ -11,6 +11,7 @@
#include "file_utils.h"
#include "intel_gpu/plugin/device_config.hpp"
#include "intel_gpu/plugin/itt.hpp"
#include "openvino/runtime/intel_gpu/properties.hpp"
#include <ie_system_conf.h>
#include <thread>
@ -66,7 +67,8 @@ void Config::UpdateFromMap(const std::map<std::string, std::string>& configMap)
const auto hints = perfHintsConfig.SupportedKeys();
if (hints.end() != std::find(hints.begin(), hints.end(), key)) {
perfHintsConfig.SetConfig(key, val);
} else if (key.compare(PluginConfigParams::KEY_PERF_COUNT) == 0) {
} else if (key.compare(PluginConfigParams::KEY_PERF_COUNT) == 0 ||
key == ov::enable_profiling) {
if (val.compare(PluginConfigParams::YES) == 0) {
useProfiling = true;
} else if (val.compare(PluginConfigParams::NO) == 0) {
@ -100,66 +102,62 @@ void Config::UpdateFromMap(const std::map<std::string, std::string>& configMap)
}
switch (uVal) {
case 0:
queuePriority = cldnn::priority_mode_types::disabled;
case 2:
queuePriority = cldnn::priority_mode_types::med;
break;
case 1:
queuePriority = cldnn::priority_mode_types::low;
break;
case 2:
queuePriority = cldnn::priority_mode_types::med;
break;
case 3:
queuePriority = cldnn::priority_mode_types::high;
break;
default:
IE_THROW(ParameterMismatch) << "Unsupported queue priority value: " << uVal;
}
} else if (key.compare(GPUConfigParams::KEY_GPU_MODEL_PRIORITY) == 0) {
bool found_matched_value = false;
if (val.find(GPUConfigParams::GPU_MODEL_PRIORITY_HIGH) != std::string::npos) {
} else if (key == ov::intel_gpu::hint::queue_priority) {
std::stringstream ss(val);
ov::hint::Priority priority;
ss >> priority;
if (priority == ov::hint::Priority::HIGH)
queuePriority = cldnn::priority_mode_types::high;
task_exec_config._threadPreferredCoreType = IStreamsExecutor::Config::BIG;
found_matched_value = true;
} else if (val.find(GPUConfigParams::GPU_MODEL_PRIORITY_LOW) != std::string::npos) {
else if (priority == ov::hint::Priority::MEDIUM)
queuePriority = cldnn::priority_mode_types::med;
else
queuePriority = cldnn::priority_mode_types::low;
task_exec_config._threadPreferredCoreType = IStreamsExecutor::Config::LITTLE;
found_matched_value = true;
} else if (key.compare(PluginConfigParams::KEY_MODEL_PRIORITY) == 0 ||
key == ov::hint::model_priority) {
if (key == ov::hint::model_priority) {
std::stringstream ss(val);
ov::hint::Priority priority;
ss >> priority;
switch (priority) {
case ov::hint::Priority::LOW:
queuePriority = cldnn::priority_mode_types::low;
task_exec_config._threadPreferredCoreType = IStreamsExecutor::Config::LITTLE;
break;
case ov::hint::Priority::MEDIUM:
queuePriority = cldnn::priority_mode_types::med;
task_exec_config._threadPreferredCoreType = IStreamsExecutor::Config::ANY;
break;
case ov::hint::Priority::HIGH:
queuePriority = cldnn::priority_mode_types::high;
task_exec_config._threadPreferredCoreType = IStreamsExecutor::Config::BIG;
break;
}
} else {
if (val.find(GPUConfigParams::GPU_QUEUE_PRIORITY_HIGH) != std::string::npos) {
if (val.find(PluginConfigParams::MODEL_PRIORITY_HIGH) != std::string::npos) {
queuePriority = cldnn::priority_mode_types::high;
found_matched_value = true;
} else if (val.find(GPUConfigParams::GPU_QUEUE_PRIORITY_MED) != std::string::npos) {
queuePriority = cldnn::priority_mode_types::med;
found_matched_value = true;
} else if (val.find(GPUConfigParams::GPU_QUEUE_PRIORITY_LOW) != std::string::npos) {
queuePriority = cldnn::priority_mode_types::low;
found_matched_value = true;
} else if (val.find(GPUConfigParams::GPU_QUEUE_PRIORITY_DEFAULT) != std::string::npos) {
queuePriority = cldnn::priority_mode_types::disabled;
found_matched_value = true;
} else { // default is disabled
queuePriority = cldnn::priority_mode_types::disabled;
}
if (val.find(GPUConfigParams::GPU_HOST_TASK_PRIORITY_HIGH) != std::string::npos) {
task_exec_config._threadPreferredCoreType = IStreamsExecutor::Config::BIG;
found_matched_value = true;
} else if (val.find(GPUConfigParams::GPU_HOST_TASK_PRIORITY_LOW) != std::string::npos) {
} else if (val.find(PluginConfigParams::MODEL_PRIORITY_LOW) != std::string::npos) {
queuePriority = cldnn::priority_mode_types::low;
task_exec_config._threadPreferredCoreType = IStreamsExecutor::Config::LITTLE;
found_matched_value = true;
} else if (val.find(GPUConfigParams::GPU_HOST_TASK_PRIORITY_ANY) != std::string::npos) {
task_exec_config._threadPreferredCoreType = IStreamsExecutor::Config::ANY;
found_matched_value = true;
} else { // default is any
} else if (val.find(PluginConfigParams::MODEL_PRIORITY_MED) != std::string::npos) {
queuePriority = cldnn::priority_mode_types::med;
task_exec_config._threadPreferredCoreType = IStreamsExecutor::Config::ANY;
} else {
IE_THROW() << "Not found appropriate value for config key " << PluginConfigParams::KEY_MODEL_PRIORITY << ".\n";
}
}
if (!found_matched_value) {
IE_THROW() << "Not found appropriate value for property key " << GPUConfigParams::KEY_GPU_PLUGIN_PRIORITY
<< ".\n Expected Plugin priority such as GPU_PLUGIN_PRIORITY_HIGH / GPU_PLUGIN_PRIORITY_LOW or\n"
<< " Combination of queue priority(HIGH, MED, LOW, and DISABLED) and host task priority(HIGH, LOW, and ANY)"
<< " such as GPU_QUEUE_PRIORITY_HIGH | GPU_HOST_TASK_PRIORITY_HIGH";
}
if (getAvailableCoresTypes().size() > 1) {
if (task_exec_config._threadPreferredCoreType == IStreamsExecutor::Config::BIG
|| task_exec_config._threadPreferredCoreType == IStreamsExecutor::Config::LITTLE) {
@ -181,20 +179,28 @@ void Config::UpdateFromMap(const std::map<std::string, std::string>& configMap)
}
switch (uVal) {
case 0:
queueThrottle = cldnn::throttle_mode_types::disabled;
case 2:
queueThrottle = cldnn::throttle_mode_types::med;
break;
case 1:
queueThrottle = cldnn::throttle_mode_types::low;
break;
case 2:
queueThrottle = cldnn::throttle_mode_types::med;
break;
case 3:
queueThrottle = cldnn::throttle_mode_types::high;
break;
default:
IE_THROW(ParameterMismatch) << "Unsupported queue throttle value: " << uVal;
}
} else if (key == ov::intel_gpu::hint::queue_throttle) {
std::stringstream ss(val);
ov::intel_gpu::hint::ThrottleLevel throttle;
ss >> throttle;
if (throttle == ov::intel_gpu::hint::ThrottleLevel::HIGH)
queueThrottle = cldnn::throttle_mode_types::high;
else if (throttle == ov::intel_gpu::hint::ThrottleLevel::MEDIUM)
queueThrottle = cldnn::throttle_mode_types::med;
else
queueThrottle = cldnn::throttle_mode_types::low;
} else if (key.compare(PluginConfigParams::KEY_CONFIG_FILE) == 0) {
std::stringstream ss(val);
std::istream_iterator<std::string> begin(ss);
@ -232,7 +238,8 @@ void Config::UpdateFromMap(const std::map<std::string, std::string>& configMap)
graph_dumps_dir = val;
createDirectory(graph_dumps_dir);
}
} else if (key.compare(PluginConfigParams::KEY_CACHE_DIR) == 0) {
} else if (key.compare(PluginConfigParams::KEY_CACHE_DIR) == 0 ||
key == ov::cache_dir) {
if (!val.empty()) {
kernels_cache_dir = val;
createDirectory(kernels_cache_dir);
@ -250,7 +257,8 @@ void Config::UpdateFromMap(const std::map<std::string, std::string>& configMap)
} else {
IE_THROW(NotFound) << "Unsupported property value by plugin: " << val;
}
} else if (key.compare(PluginConfigParams::KEY_GPU_THROUGHPUT_STREAMS) == 0) {
} else if (key.compare(PluginConfigParams::KEY_GPU_THROUGHPUT_STREAMS) == 0 ||
key == ov::streams::num) {
if (val.compare(PluginConfigParams::GPU_THROUGHPUT_AUTO) == 0) {
throughput_streams = GetDefaultNStreamsForThroughputMode();
} else {
@ -265,13 +273,14 @@ void Config::UpdateFromMap(const std::map<std::string, std::string>& configMap)
if (val_i > 0)
throughput_streams = static_cast<uint16_t>(val_i);
}
} else if (key.compare(PluginConfigParams::KEY_DEVICE_ID) == 0) {
} else if (key.compare(PluginConfigParams::KEY_DEVICE_ID) == 0 ||
key == ov::device::id) {
// Validate whether the passed value is a positive number.
try {
int val_i = std::stoi(val);
(void)val_i;
} catch (const std::exception&) {
IE_THROW() << "Wrong value for property key " << PluginConfigParams::KEY_DEVICE_ID
IE_THROW() << "Wrong value for property key " << ov::device::id.name()
<< ". DeviceIDs are only represented by positive numbers";
}
// Set this value.
@ -301,7 +310,8 @@ void Config::UpdateFromMap(const std::map<std::string, std::string>& configMap)
} else {
IE_THROW(NotFound) << "Unsupported KEY_CLDNN_ENABLE_FP16_FOR_QUANTIZED_MODELS flag value: " << val;
}
} else if (key.compare(GPUConfigParams::KEY_GPU_MAX_NUM_THREADS) == 0) {
} else if (key.compare(GPUConfigParams::KEY_GPU_MAX_NUM_THREADS) == 0 ||
key == ov::compilation_num_threads) {
int max_threads = std::max(1, static_cast<int>(std::thread::hardware_concurrency()));
try {
int val_i = std::stoi(val);
@ -314,7 +324,8 @@ void Config::UpdateFromMap(const std::map<std::string, std::string>& configMap)
<< "\nSpecify the number of threads use for build as an integer."
<< "\nOut of range value will be set as a default value, maximum concurrent threads.";
}
} else if (key.compare(GPUConfigParams::KEY_GPU_ENABLE_LOOP_UNROLLING) == 0) {
} else if (key.compare(GPUConfigParams::KEY_GPU_ENABLE_LOOP_UNROLLING) == 0 ||
key == ov::intel_gpu::enable_loop_unrolling) {
if (val.compare(PluginConfigParams::YES) == 0) {
enable_loop_unrolling = true;
} else if (val.compare(PluginConfigParams::NO) == 0) {
@ -322,6 +333,27 @@ void Config::UpdateFromMap(const std::map<std::string, std::string>& configMap)
} else {
IE_THROW(ParameterMismatch) << "Unsupported KEY_GPU_ENABLE_LOOP_UNROLLING flag value: " << val;
}
} else if (key == ov::intel_gpu::hint::host_task_priority) {
std::stringstream ss(val);
ov::hint::Priority priority;
ss >> priority;
if (priority == ov::hint::Priority::HIGH)
task_exec_config._threadPreferredCoreType = IStreamsExecutor::Config::BIG;
else if (priority == ov::hint::Priority::LOW)
task_exec_config._threadPreferredCoreType = IStreamsExecutor::Config::LITTLE;
else if (priority == ov::hint::Priority::MEDIUM)
task_exec_config._threadPreferredCoreType = IStreamsExecutor::Config::ANY;
else
IE_THROW(NotFound) << "Unsupported host task priority by plugin: " << val;
} else if (key.compare(GPUConfigParams::KEY_GPU_HOST_TASK_PRIORITY) == 0) {
if (val.compare(GPUConfigParams::GPU_HOST_TASK_PRIORITY_HIGH) == 0)
task_exec_config._threadPreferredCoreType = IStreamsExecutor::Config::BIG;
else if (val.compare(GPUConfigParams::GPU_HOST_TASK_PRIORITY_MEDIUM) == 0)
task_exec_config._threadPreferredCoreType = IStreamsExecutor::Config::ANY;
else if (val.compare(GPUConfigParams::GPU_HOST_TASK_PRIORITY_LOW) == 0)
task_exec_config._threadPreferredCoreType = IStreamsExecutor::Config::LITTLE;
else
IE_THROW(NotFound) << "Unsupported host task priority by plugin: " << val;
} else {
IE_THROW(NotFound) << "Unsupported property key by plugin: " << key;
}
@ -332,10 +364,13 @@ void Config::UpdateFromMap(const std::map<std::string, std::string>& configMap)
void Config::adjustKeyMapValues() {
OV_ITT_SCOPED_TASK(itt::domains::intel_gpu_plugin, "Config::AdjustKeyMapValues");
if (useProfiling)
if (useProfiling) {
key_config_map[PluginConfigParams::KEY_PERF_COUNT] = PluginConfigParams::YES;
else
key_config_map[ov::enable_profiling.name()] = PluginConfigParams::YES;
} else {
key_config_map[PluginConfigParams::KEY_PERF_COUNT] = PluginConfigParams::NO;
key_config_map[ov::enable_profiling.name()] = PluginConfigParams::NO;
}
if (dumpCustomKernels)
key_config_map[PluginConfigParams::KEY_DUMP_KERNELS] = PluginConfigParams::YES;
@ -371,25 +406,19 @@ void Config::adjustKeyMapValues() {
key_config_map[CLDNNConfigParams::KEY_CLDNN_ENABLE_FP16_FOR_QUANTIZED_MODELS] = PluginConfigParams::NO;
{
std::stringstream s;
if (queuePriority == cldnn::priority_mode_types::high && task_exec_config._threadPreferredCoreType == IStreamsExecutor::Config::BIG) {
key_config_map[GPUConfigParams::KEY_GPU_MODEL_PRIORITY] = GPUConfigParams::GPU_MODEL_PRIORITY_HIGH;
s << ov::hint::Priority::HIGH;
key_config_map[ov::hint::model_priority.name()] = s.str();
key_config_map[PluginConfigParams::KEY_MODEL_PRIORITY] = PluginConfigParams::MODEL_PRIORITY_HIGH;
} else if (queuePriority == cldnn::priority_mode_types::low && task_exec_config._threadPreferredCoreType == IStreamsExecutor::Config::LITTLE) {
key_config_map[GPUConfigParams::KEY_GPU_MODEL_PRIORITY] = GPUConfigParams::GPU_MODEL_PRIORITY_LOW;
} else {
std::string val_plugin_priority;
switch (queuePriority) {
case cldnn::priority_mode_types::low: val_plugin_priority = GPUConfigParams::GPU_QUEUE_PRIORITY_LOW; break;
case cldnn::priority_mode_types::med: val_plugin_priority = GPUConfigParams::GPU_QUEUE_PRIORITY_MED; break;
case cldnn::priority_mode_types::high: val_plugin_priority = GPUConfigParams::GPU_QUEUE_PRIORITY_HIGH; break;
default: val_plugin_priority = GPUConfigParams::GPU_QUEUE_PRIORITY_DEFAULT; break;
}
val_plugin_priority += "|";
switch (task_exec_config._threadPreferredCoreType) {
case IStreamsExecutor::Config::LITTLE: val_plugin_priority += GPUConfigParams::GPU_HOST_TASK_PRIORITY_HIGH; break;
case IStreamsExecutor::Config::BIG: val_plugin_priority += GPUConfigParams::GPU_HOST_TASK_PRIORITY_LOW; break;
case IStreamsExecutor::Config::ANY:default: val_plugin_priority += GPUConfigParams::GPU_HOST_TASK_PRIORITY_ANY; break;
}
key_config_map[GPUConfigParams::KEY_GPU_PLUGIN_PRIORITY] = val_plugin_priority;
s << ov::hint::Priority::LOW;
key_config_map[ov::hint::model_priority.name()] = s.str();
key_config_map[PluginConfigParams::KEY_MODEL_PRIORITY] = PluginConfigParams::MODEL_PRIORITY_LOW;
} else if (queuePriority == cldnn::priority_mode_types::med && task_exec_config._threadPreferredCoreType == IStreamsExecutor::Config::ANY) {
s << ov::hint::Priority::MEDIUM;
key_config_map[ov::hint::model_priority.name()] = s.str();
key_config_map[PluginConfigParams::KEY_MODEL_PRIORITY] = PluginConfigParams::MODEL_PRIORITY_MED;
}
}
{
@ -403,6 +432,16 @@ void Config::adjustKeyMapValues() {
key_config_map[CLDNNConfigParams::KEY_CLDNN_PLUGIN_PRIORITY] = qp;
key_config_map[GPUConfigParams::KEY_GPU_PLUGIN_PRIORITY] = qp;
}
{
std::stringstream ss;
if (queuePriority == cldnn::priority_mode_types::high)
ss << ov::hint::Priority::HIGH;
else if (queuePriority == cldnn::priority_mode_types::low)
ss << ov::hint::Priority::LOW;
else
ss << ov::hint::Priority::MEDIUM;
key_config_map[ov::intel_gpu::hint::queue_priority.name()] = ss.str();
}
{
std::string qt = "0";
switch (queueThrottle) {
@ -414,6 +453,32 @@ void Config::adjustKeyMapValues() {
key_config_map[CLDNNConfigParams::KEY_CLDNN_PLUGIN_THROTTLE] = qt;
key_config_map[GPUConfigParams::KEY_GPU_PLUGIN_THROTTLE] = qt;
}
{
std::stringstream ss;
if (queueThrottle == cldnn::throttle_mode_types::high)
ss << ov::intel_gpu::hint::ThrottleLevel::HIGH;
else if (queueThrottle == cldnn::throttle_mode_types::low)
ss << ov::intel_gpu::hint::ThrottleLevel::LOW;
else
ss << ov::intel_gpu::hint::ThrottleLevel::MEDIUM;
key_config_map[ov::intel_gpu::hint::queue_throttle.name()] = ss.str();
}
{
std::stringstream s;
if (task_exec_config._threadPreferredCoreType == IStreamsExecutor::Config::LITTLE) {
s << ov::hint::Priority::LOW;
key_config_map[ov::intel_gpu::hint::host_task_priority.name()] = s.str();
key_config_map[GPUConfigParams::KEY_GPU_HOST_TASK_PRIORITY] = GPUConfigParams::GPU_HOST_TASK_PRIORITY_LOW;
} else if (task_exec_config._threadPreferredCoreType == IStreamsExecutor::Config::BIG) {
s << ov::hint::Priority::HIGH;
key_config_map[ov::intel_gpu::hint::host_task_priority.name()] = s.str();
key_config_map[GPUConfigParams::KEY_GPU_HOST_TASK_PRIORITY] = GPUConfigParams::GPU_HOST_TASK_PRIORITY_HIGH;
} else {
s << ov::hint::Priority::MEDIUM;
key_config_map[ov::intel_gpu::hint::host_task_priority.name()] = s.str();
key_config_map[GPUConfigParams::KEY_GPU_HOST_TASK_PRIORITY] = GPUConfigParams::GPU_HOST_TASK_PRIORITY_MEDIUM;
}
}
{
std::string tm = PluginConfigParams::TUNING_DISABLED;
switch (tuningConfig.mode) {
@ -430,21 +495,45 @@ void Config::adjustKeyMapValues() {
key_config_map[CLDNNConfigParams::KEY_CLDNN_GRAPH_DUMPS_DIR] = graph_dumps_dir;
key_config_map[CLDNNConfigParams::KEY_CLDNN_SOURCES_DUMPS_DIR] = sources_dumps_dir;
key_config_map[PluginConfigParams::KEY_CACHE_DIR] = kernels_cache_dir;
key_config_map[ov::cache_dir.name()] = kernels_cache_dir;
key_config_map[PluginConfigParams::KEY_GPU_THROUGHPUT_STREAMS] = std::to_string(throughput_streams);
key_config_map[PluginConfigParams::KEY_DEVICE_ID] = device_id;
key_config_map[PluginConfigParams::KEY_CONFIG_FILE] = "";
key_config_map[GPUConfigParams::KEY_GPU_MAX_NUM_THREADS] = std::to_string(task_exec_config._streams);
key_config_map[ov::streams::num.name()] = std::to_string(throughput_streams);
if (enable_loop_unrolling)
key_config_map[PluginConfigParams::KEY_DEVICE_ID] = device_id;
key_config_map[ov::device::id.name()] = device_id;
key_config_map[PluginConfigParams::KEY_CONFIG_FILE] = "";
key_config_map[GPUConfigParams::KEY_GPU_MAX_NUM_THREADS] = std::to_string(task_exec_config._streams);
key_config_map[ov::compilation_num_threads.name()] = std::to_string(task_exec_config._streams);
if (enable_loop_unrolling) {
key_config_map[GPUConfigParams::KEY_GPU_ENABLE_LOOP_UNROLLING] = PluginConfigParams::YES;
else
key_config_map[ov::intel_gpu::enable_loop_unrolling.name()] = PluginConfigParams::YES;
} else {
key_config_map[GPUConfigParams::KEY_GPU_ENABLE_LOOP_UNROLLING] = PluginConfigParams::NO;
key_config_map[ov::intel_gpu::enable_loop_unrolling.name()] = PluginConfigParams::NO;
}
key_config_map[PluginConfigParams::KEY_PERFORMANCE_HINT]= perfHintsConfig.ovPerfHint;
key_config_map[ov::hint::performance_mode.name()]= perfHintsConfig.ovPerfHint;
key_config_map[PluginConfigParams::KEY_PERFORMANCE_HINT_NUM_REQUESTS] =
std::to_string(perfHintsConfig.ovPerfHintNumRequests);
}
bool Config::isNewApiProperty(std::string property) {
static const std::set<std::string> new_api_keys {
ov::hint::model_priority.name(),
ov::intel_gpu::hint::host_task_priority.name(),
ov::intel_gpu::hint::queue_priority.name(),
ov::intel_gpu::hint::queue_throttle.name(),
ov::compilation_num_threads.name(),
};
return new_api_keys.find(property) != new_api_keys.end();
}
void Configs::CreateConfig(std::string device_id) {
if (configs.find(device_id) == configs.end()) {
configs.emplace(device_id, Config(device_id));


@ -16,6 +16,7 @@
#include <ie_ngraph_utils.hpp>
#include <ie_algorithm.hpp>
#include "openvino/runtime/intel_gpu/properties.hpp"
#include "intel_gpu/plugin/plugin.hpp"
#include "intel_gpu/plugin/compiled_model.hpp"
#include "intel_gpu/plugin/transformations_pipeline.hpp"
@ -23,6 +24,7 @@
#include "intel_gpu/plugin/itt.hpp"
#include "gpu/gpu_config.hpp"
#include "cpp_interfaces/interface/ie_internal_plugin_config.hpp"
#include "ie_icore.hpp"
#include <transformations/rt_info/fused_names_attribute.hpp>
@ -328,9 +330,11 @@ InferenceEngine::RemoteContext::Ptr Plugin::GetDefaultContext(const AnyMap& para
}
void Plugin::SetConfig(const std::map<std::string, std::string> &config) {
streamsSet = (config.find(PluginConfigParams::KEY_GPU_THROUGHPUT_STREAMS) != config.end());
streamsSet = config.find(PluginConfigParams::KEY_GPU_THROUGHPUT_STREAMS) != config.end() ||
config.find(ov::streams::num.name()) != config.end();
throttlingSet = config.find(GPUConfigParams::KEY_GPU_PLUGIN_THROTTLE) != config.end() ||
config.find(CLDNNConfigParams::KEY_CLDNN_PLUGIN_THROTTLE) != config.end();
config.find(CLDNNConfigParams::KEY_CLDNN_PLUGIN_THROTTLE) != config.end() ||
config.find(ov::intel_gpu::hint::queue_throttle.name()) != config.end();
std::string device_id;
if (config.find(PluginConfigInternalParams::KEY_CONFIG_DEVICE_ID) != config.end()) {
device_id = config.at(PluginConfigInternalParams::KEY_CONFIG_DEVICE_ID);
@ -521,8 +525,8 @@ Parameter Plugin::GetConfig(const std::string& name, const std::map<std::string,
Parameter result;
std::string device_id;
if (options.find(PluginConfigParams::KEY_DEVICE_ID) != options.end()) {
device_id = options.find(PluginConfigParams::KEY_DEVICE_ID)->second.as<std::string>();
if (options.find(ov::device::id.name()) != options.end()) {
device_id = options.find(ov::device::id.name())->second.as<std::string>();
}
Config config = _impl->m_configs.GetConfig(device_id);
@ -606,13 +610,43 @@ static float GetGOPS(cldnn::device_info info, cldnn::data_types dt) {
Parameter Plugin::GetMetric(const std::string& name, const std::map<std::string, Parameter>& options) const {
OV_ITT_SCOPED_TASK(itt::domains::intel_gpu_plugin, "Plugin::GetMetric");
GPU_DEBUG_GET_INSTANCE(debug_config);
std::string device_id = GetConfig(CONFIG_KEY(DEVICE_ID), options);
std::string device_id = GetConfig(ov::device::id.name(), options);
auto iter = device_map.find(device_id);
auto device = iter != device_map.end() ? iter->second : device_map.begin()->second;
auto device_info = device->get_info();
bool is_new_api = GetCore()->isNewAPI();
if (name == METRIC_KEY(SUPPORTED_METRICS)) {
if (name == ov::supported_properties) {
return decltype(ov::supported_properties)::value_type {
// Metrics
ov::PropertyName{ov::supported_properties.name(), PropertyMutability::RO},
ov::PropertyName{ov::available_devices.name(), PropertyMutability::RO},
ov::PropertyName{ov::range_for_async_infer_requests.name(), PropertyMutability::RO},
ov::PropertyName{ov::range_for_streams.name(), PropertyMutability::RO},
ov::PropertyName{ov::optimal_batch_size.name(), PropertyMutability::RO},
ov::PropertyName{ov::max_batch_size.name(), PropertyMutability::RO},
ov::PropertyName{ov::device::full_name.name(), PropertyMutability::RO},
ov::PropertyName{ov::device::type.name(), PropertyMutability::RO},
ov::PropertyName{ov::device::gops.name(), PropertyMutability::RO},
ov::PropertyName{ov::device::capabilities.name(), PropertyMutability::RO},
ov::PropertyName{ov::intel_gpu::device_total_mem_size.name(), PropertyMutability::RO},
ov::PropertyName{ov::intel_gpu::uarch_version.name(), PropertyMutability::RO},
ov::PropertyName{ov::intel_gpu::execution_units_count.name(), PropertyMutability::RO},
ov::PropertyName{ov::intel_gpu::memory_statistics.name(), PropertyMutability::RO},
// Configs
PropertyName{ov::enable_profiling.name(), PropertyMutability::RW},
PropertyName{ov::hint::model_priority.name(), PropertyMutability::RW},
PropertyName{ov::intel_gpu::hint::host_task_priority.name(), PropertyMutability::RW},
PropertyName{ov::intel_gpu::hint::queue_priority.name(), PropertyMutability::RW},
PropertyName{ov::intel_gpu::hint::queue_throttle.name(), PropertyMutability::RW},
PropertyName{ov::intel_gpu::enable_loop_unrolling.name(), PropertyMutability::RW},
PropertyName{ov::cache_dir.name(), PropertyMutability::RW},
PropertyName{ov::hint::performance_mode.name(), PropertyMutability::RW},
PropertyName{ov::compilation_num_threads.name(), PropertyMutability::RW}
};
} else if (name == METRIC_KEY(SUPPORTED_METRICS)) {
std::vector<std::string> metrics;
metrics.push_back(METRIC_KEY(AVAILABLE_DEVICES));
metrics.push_back(METRIC_KEY(SUPPORTED_METRICS));
@ -624,7 +658,7 @@ Parameter Plugin::GetMetric(const std::string& name, const std::map<std::string,
metrics.push_back(METRIC_KEY(DEVICE_TYPE));
metrics.push_back(METRIC_KEY(DEVICE_GOPS));
metrics.push_back(METRIC_KEY(OPTIMAL_BATCH_SIZE));
metrics.push_back(GPU_METRIC_KEY(MAX_BATCH_SIZE));
metrics.push_back(METRIC_KEY(MAX_BATCH_SIZE));
metrics.push_back(GPU_METRIC_KEY(DEVICE_TOTAL_MEM_SIZE));
metrics.push_back(GPU_METRIC_KEY(UARCH_VERSION));
metrics.push_back(GPU_METRIC_KEY(EXECUTION_UNITS_COUNT));
@ -634,22 +668,36 @@ Parameter Plugin::GetMetric(const std::string& name, const std::map<std::string,
std::vector<std::string> availableDevices = { };
for (auto const& dev : device_map)
availableDevices.push_back(dev.first);
IE_SET_METRIC_RETURN(AVAILABLE_DEVICES, availableDevices);
} else if (name == GPU_METRIC_KEY(DEVICE_TOTAL_MEM_SIZE)) {
IE_SET_METRIC_RETURN(GPU_DEVICE_TOTAL_MEM_SIZE, device_info.max_global_mem_size);
} else if (name == METRIC_KEY(DEVICE_TYPE)) {
auto dev_type = device_info.dev_type == cldnn::device_type::discrete_gpu ? Metrics::DeviceType::discrete : Metrics::DeviceType::integrated;
IE_SET_METRIC_RETURN(DEVICE_TYPE, dev_type);
} else if (name == METRIC_KEY(DEVICE_GOPS)) {
std::map<InferenceEngine::Precision, float> gops;
gops[InferenceEngine::Precision::I8] = GetGOPS(device_info, cldnn::data_types::i8);
gops[InferenceEngine::Precision::U8] = GetGOPS(device_info, cldnn::data_types::u8);
gops[InferenceEngine::Precision::FP16] = GetGOPS(device_info, cldnn::data_types::f16);
gops[InferenceEngine::Precision::FP32] = GetGOPS(device_info, cldnn::data_types::f32);
IE_SET_METRIC_RETURN(DEVICE_GOPS, gops);
} else if (name == GPU_METRIC_KEY(EXECUTION_UNITS_COUNT)) {
IE_SET_METRIC_RETURN(GPU_EXECUTION_UNITS_COUNT, device_info.execution_units_count);
} else if (name == GPU_METRIC_KEY(UARCH_VERSION)) {
return decltype(ov::available_devices)::value_type {availableDevices};
} else if (name == ov::intel_gpu::device_total_mem_size) {
return decltype(ov::intel_gpu::device_total_mem_size)::value_type {device_info.max_global_mem_size};
} else if (name == ov::device::type) {
if (is_new_api) {
auto dev_type = device_info.dev_type == cldnn::device_type::discrete_gpu ? ov::device::Type::DISCRETE : ov::device::Type::INTEGRATED;
return decltype(ov::device::type)::value_type {dev_type};
} else {
auto dev_type = device_info.dev_type == cldnn::device_type::discrete_gpu ? Metrics::DeviceType::discrete : Metrics::DeviceType::integrated;
IE_SET_METRIC_RETURN(DEVICE_TYPE, dev_type);
}
} else if (name == ov::device::gops) {
if (is_new_api) {
std::map<element::Type, float> gops;
gops[element::i8] = GetGOPS(device_info, cldnn::data_types::i8);
gops[element::u8] = GetGOPS(device_info, cldnn::data_types::u8);
gops[element::f16] = GetGOPS(device_info, cldnn::data_types::f16);
gops[element::f32] = GetGOPS(device_info, cldnn::data_types::f32);
return decltype(ov::device::gops)::value_type {gops};
} else {
std::map<InferenceEngine::Precision, float> gops;
gops[InferenceEngine::Precision::I8] = GetGOPS(device_info, cldnn::data_types::i8);
gops[InferenceEngine::Precision::U8] = GetGOPS(device_info, cldnn::data_types::u8);
gops[InferenceEngine::Precision::FP16] = GetGOPS(device_info, cldnn::data_types::f16);
gops[InferenceEngine::Precision::FP32] = GetGOPS(device_info, cldnn::data_types::f32);
IE_SET_METRIC_RETURN(DEVICE_GOPS, gops);
}
} else if (name == ov::intel_gpu::execution_units_count) {
return static_cast<decltype(ov::intel_gpu::execution_units_count)::value_type>(device_info.execution_units_count);
} else if (name == ov::intel_gpu::uarch_version) {
std::stringstream s;
if (device_info.gfx_ver.major == 0 && device_info.gfx_ver.minor == 0 && device_info.gfx_ver.revision == 0) {
s << "unknown";
@ -658,27 +706,28 @@ Parameter Plugin::GetMetric(const std::string& name, const std::map<std::string,
<< static_cast<int>(device_info.gfx_ver.minor) << "."
<< static_cast<int>(device_info.gfx_ver.revision);
}
IE_SET_METRIC_RETURN(GPU_UARCH_VERSION, s.str());
} else if (name == METRIC_KEY(OPTIMAL_BATCH_SIZE)) {
return decltype(ov::intel_gpu::uarch_version)::value_type {s.str()};
} else if (name == METRIC_KEY(OPTIMAL_BATCH_SIZE) ||
name == ov::optimal_batch_size) {
auto next_pow_of_2 = [] (float x) {
return pow(2, ceil(log(x)/log(2)));
return pow(2, ceil(std::log(x)/std::log(2)));
};
auto closest_pow_of_2 = [] (float x) {
return pow(2, floor(log(x)/log(2)));
return pow(2, floor(std::log(x)/std::log(2)));
};
GPU_DEBUG_GET_INSTANCE(debug_config);
auto model_param = options.find("MODEL_PTR");
auto model_param = options.find(ov::hint::model.name());
if (model_param == options.end()) {
GPU_DEBUG_IF(debug_config->verbose >= 1) {
GPU_DEBUG_COUT << "[GPU_OPTIMAL_BATCH_SIZE] MODELS_PTR is not set: return 1" << std::endl;
GPU_DEBUG_COUT << "[GPU_OPTIMAL_BATCH_SIZE] ov::hint::model is not set: return 1" << std::endl;
}
IE_SET_METRIC_RETURN(OPTIMAL_BATCH_SIZE, static_cast<unsigned int>(1));
return decltype(ov::optimal_batch_size)::value_type {static_cast<unsigned int>(1)};
}
std::shared_ptr<ngraph::Function> model;
try {
model = model_param->second.as<std::shared_ptr<ngraph::Function>>();
} catch (...) {
IE_THROW() << "[GPU_OPTIMAL_BATCH_SIZE] MODEL_PTR should be std::shared_ptr<ngraph::Function> type";
IE_THROW() << "[GPU_OPTIMAL_BATCH_SIZE] ov::hint::model should be std::shared_ptr<ov::Model> type";
}
GPU_DEBUG_IF(debug_config->verbose >= 1) {
GPU_DEBUG_COUT << "DEVICE_INFO:"
@ -718,9 +767,9 @@ Parameter Plugin::GetMetric(const std::string& name, const std::map<std::string,
if (memPressure.max_mem_tolerance != ov::MemBandwidthPressure::UNKNOWN)
batch = std::max(1.0, 16 * closest_pow_of_2(memPressure.max_mem_tolerance));
std::map<std::string, InferenceEngine::Parameter> options_for_max_batch;
options_for_max_batch["MODEL_PTR"] = model;
options_for_max_batch[ov::hint::model.name()] = model;
options_for_max_batch["GPU_THROUGHPUT_STREAMS"] = CONFIG_VALUE(GPU_THROUGHPUT_AUTO);
auto max_batch_size = GetMetric(GPU_METRIC_KEY(MAX_BATCH_SIZE), options_for_max_batch).as<unsigned int>();
auto max_batch_size = GetMetric(ov::max_batch_size.name(), options_for_max_batch).as<unsigned int>();
unsigned int closest = closest_pow_of_2(max_batch_size);
batch = std::min(closest, batch);
batch = std::min(256u, batch); //batch 256 is a max
@ -729,37 +778,42 @@ Parameter Plugin::GetMetric(const std::string& name, const std::map<std::string,
GPU_DEBUG_COUT << "MAX_BATCH: " << max_batch_size << std::endl;
GPU_DEBUG_COUT << "ACTUAL OPTIMAL BATCH: " << batch << std::endl;
}
IE_SET_METRIC_RETURN(OPTIMAL_BATCH_SIZE, batch);
} else if (name == METRIC_KEY(FULL_DEVICE_NAME)) {
return decltype(ov::optimal_batch_size)::value_type {batch};
} else if (name == ov::device::full_name) {
auto deviceName = StringRightTrim(device_info.dev_name, "NEO", false);
deviceName += std::string(" (") + (device_info.dev_type == cldnn::device_type::discrete_gpu ? "dGPU" : "iGPU") + ")";
IE_SET_METRIC_RETURN(FULL_DEVICE_NAME, deviceName);
return decltype(ov::device::full_name)::value_type {deviceName};
} else if (name == METRIC_KEY(SUPPORTED_CONFIG_KEYS)) {
std::vector<std::string> configKeys;
for (auto opt : _impl->m_configs.GetConfig(device_id).key_config_map)
configKeys.push_back(opt.first);
for (auto opt : _impl->m_configs.GetConfig(device_id).key_config_map) {
// Exclude new API properties
if (!Config::isNewApiProperty(opt.first))
configKeys.push_back(opt.first);
}
IE_SET_METRIC_RETURN(SUPPORTED_CONFIG_KEYS, configKeys);
} else if (name == METRIC_KEY(OPTIMIZATION_CAPABILITIES)) {
} else if (name == ov::device::capabilities) {
std::vector<std::string> capabilities;
capabilities.push_back(METRIC_VALUE(FP32));
capabilities.push_back(METRIC_VALUE(BIN));
capabilities.push_back(METRIC_VALUE(BATCHED_BLOB));
capabilities.push_back(ov::device::capability::FP32);
capabilities.push_back(ov::device::capability::BIN);
if (!is_new_api)
capabilities.push_back(METRIC_VALUE(BATCHED_BLOB));
if (device_info.supports_fp16)
capabilities.push_back(METRIC_VALUE(FP16));
capabilities.push_back(ov::device::capability::FP16);
if (device_info.supports_imad || device_info.supports_immad)
capabilities.push_back(METRIC_VALUE(INT8));
capabilities.push_back(ov::device::capability::INT8);
if (device_info.supports_immad)
capabilities.push_back(METRIC_VALUE(GPU_HW_MATMUL));
capabilities.push_back(ov::intel_gpu::capability::HW_MATMUL);
IE_SET_METRIC_RETURN(OPTIMIZATION_CAPABILITIES, capabilities);
} else if (name == METRIC_KEY(RANGE_FOR_ASYNC_INFER_REQUESTS)) {
return decltype(ov::device::capabilities)::value_type {capabilities};
} else if (name == ov::range_for_async_infer_requests) {
std::tuple<unsigned int, unsigned int, unsigned int> range = std::make_tuple(1, 2, 1);
IE_SET_METRIC_RETURN(RANGE_FOR_ASYNC_INFER_REQUESTS, range);
} else if (name == METRIC_KEY(RANGE_FOR_STREAMS)) {
} else if (name == ov::range_for_streams) {
std::tuple<unsigned int, unsigned int> range = std::make_tuple(1, 2);
IE_SET_METRIC_RETURN(RANGE_FOR_STREAMS, range);
} else if (name == GPU_METRIC_KEY(MEMORY_STATISTICS)) {
} else if (name == GPU_METRIC_KEY(MEMORY_STATISTICS) ||
name == ov::intel_gpu::memory_statistics) {
std::map<std::string, uint64_t> statistics;
for (auto const &item : statistics_map) {
// Before collecting memory statistics of each context, it's updated with the latest memory statistics from engine.
@ -772,12 +826,13 @@ Parameter Plugin::GetMetric(const std::string& name, const std::map<std::string,
}
}
}
IE_SET_METRIC_RETURN(GPU_MEMORY_STATISTICS, statistics);
} else if (name == GPU_METRIC_KEY(MAX_BATCH_SIZE)) {
return decltype(ov::intel_gpu::memory_statistics)::value_type {statistics};
} else if (name == METRIC_KEY(MAX_BATCH_SIZE) ||
name == ov::max_batch_size) {
const auto& config = _impl->m_configs.GetConfig(device_id);
uint32_t n_streams = static_cast<uint32_t>(config.throughput_streams);
uint64_t occupied_device_mem = 0;
auto statistic_result = GetMetric(GPU_METRIC_KEY(MEMORY_STATISTICS), options).as<std::map<std::string, uint64_t>>();
auto statistic_result = GetMetric(ov::intel_gpu::memory_statistics.name(), options).as<std::map<std::string, uint64_t>>();
auto occupied_usm_dev = statistic_result.find("usm_device_current");
if (occupied_usm_dev != statistic_result.end()) {
occupied_device_mem = occupied_usm_dev->second;
@ -791,11 +846,11 @@ Parameter Plugin::GetMetric(const std::string& name, const std::map<std::string,
int64_t max_batch_size = 1;
if (options.find("MODEL_PTR") == options.end()) {
if (options.find(ov::hint::model.name()) == options.end()) {
GPU_DEBUG_IF(debug_config->verbose >= 1) {
GPU_DEBUG_COUT << "[GPU_MAX_BATCH_SIZE] MODELS_PTR is not set: return 1" << std::endl;
}
IE_SET_METRIC_RETURN(GPU_MAX_BATCH_SIZE, static_cast<int32_t>(max_batch_size));
return decltype(ov::max_batch_size)::value_type {static_cast<uint32_t>(max_batch_size)};
}
if (options.find("GPU_THROUGHPUT_STREAMS") != options.end()) {
try {
@ -816,26 +871,27 @@ Parameter Plugin::GetMetric(const std::string& name, const std::map<std::string,
GPU_DEBUG_COUT << "[GPU_MAX_BATCH_SIZE] n_streams : " << n_streams << std::endl;
}
if (options.find("AVAILABLE_DEVICE_MEM_SIZE") != options.end()) {
if (options.find(ov::intel_gpu::hint::available_device_mem.name()) != options.end()) {
try {
available_device_mem = std::min(static_cast<int64_t>(available_device_mem), options.find("AVAILABLE_DEVICE_MEM_SIZE")->second.as<int64_t>());
available_device_mem = std::min(static_cast<int64_t>(available_device_mem),
options.find(ov::intel_gpu::hint::available_device_mem.name())->second.as<int64_t>());
GPU_DEBUG_IF(debug_config->verbose >= 2) {
GPU_DEBUG_COUT << "[GPU_MAX_BATCH_SIZE] available memory is reset by user " << available_device_mem << std::endl;
}
} catch (...) {
IE_THROW() << "[GPU_MAX_BATCH_SIZE] bad casting: AVAILABLE_DEVICE_MEM_SIZE should be int64_t type";
IE_THROW() << "[GPU_MAX_BATCH_SIZE] bad casting: ov::intel_gpu::hint::available_device_mem should be int64_t type";
}
if (available_device_mem < 0) {
IE_THROW() << "[GPU_MAX_BATCH_SIZE] AVAILABLE_DEVICE_MEM_SIZE value should be greater than 0 for max batch size calculation";
IE_THROW() << "[GPU_MAX_BATCH_SIZE] ov::intel_gpu::hint::available_device_mem value should be greater than 0 for max batch size calculation";
}
}
std::shared_ptr<ngraph::Function> model;
auto model_param = options.find("MODEL_PTR")->second;
auto model_param = options.find(ov::hint::model.name())->second;
try {
model = model_param.as<std::shared_ptr<ngraph::Function>>();
} catch (...) {
IE_THROW() << "[GPU_MAX_BATCH_SIZE] MODEL_PTR should be std::shared_ptr<ngraph::Function> type";
IE_THROW() << "[GPU_MAX_BATCH_SIZE] ov::hint::model should be std::shared_ptr<ov::Model> type";
}
InferenceEngine::CNNNetwork network(model);
@ -913,7 +969,7 @@ Parameter Plugin::GetMetric(const std::string& name, const std::map<std::string,
GPU_DEBUG_COUT << "[GPU_MAX_BATCH_SIZE] Failed in reshape or build program " << e.what() << std::endl;
}
}
IE_SET_METRIC_RETURN(GPU_MAX_BATCH_SIZE, static_cast<int32_t>(max_batch_size));
return decltype(ov::max_batch_size)::value_type {static_cast<uint32_t>(max_batch_size)};
} else {
IE_THROW() << "Unsupported metric key " << name;
}
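
With GetMetric resolving both the legacy METRIC_KEY names and the API 2.0 properties, applications on the new API read these values through ov::Core::get_property. A short sketch of that, assuming a machine with a "GPU" device; the properties used are the ones handled in the branches above:

#include <iostream>
#include <openvino/runtime/core.hpp>
#include <openvino/runtime/intel_gpu/properties.hpp>

int main() {
    ov::Core core;
    // Read-only metrics now advertised via ov::supported_properties.
    auto name     = core.get_property("GPU", ov::device::full_name);
    auto mem_size = core.get_property("GPU", ov::intel_gpu::device_total_mem_size);
    std::cout << name << ": " << mem_size << " bytes of device memory" << std::endl;
    return 0;
}
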

@ -94,22 +94,19 @@ ocl_queue_type command_queues_builder::build(const cl::Context& context, const c
}
void command_queues_builder::set_priority_mode(priority_mode_types priority, bool extension_support) {
if (priority != priority_mode_types::disabled && !extension_support) {
CLDNN_ERROR_MESSAGE("Command queues builders - priority_mode",
std::string("The param priority_mode is set in engine_configuration, ")
.append("but cl_khr_priority_hints or cl_khr_create_command_queue ")
.append("is not supported by current OpenCL implementation."));
if (extension_support) {
_priority_mode = priority;
} else {
_priority_mode = priority_mode_types::disabled;
}
_priority_mode = priority;
}
void command_queues_builder::set_throttle_mode(throttle_mode_types throttle, bool extension_support) {
if (throttle != throttle_mode_types::disabled && !extension_support) {
CLDNN_ERROR_MESSAGE("Command queues builders - throttle_mode",
std::string("The param throttle_mode is set in engine_configuration, ")
.append("but cl_khr_throttle_hints is not supported by current OpenCL implementation."));
if (extension_support) {
_throttle_mode = throttle;
} else {
_throttle_mode = throttle_mode_types::disabled;
}
_throttle_mode = throttle;
}
void command_queues_builder::set_supports_queue_families(bool extension_support) {
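
The two setters above now degrade silently when the required OpenCL extension (cl_khr_priority_hints / cl_khr_create_command_queue, or cl_khr_throttle_hints) is missing, instead of raising CLDNN_ERROR_MESSAGE. A minimal, self-contained sketch of that fallback pattern; the enum and function names here are hypothetical and only mirror the hunk above:

#include <iostream>

// Hypothetical stand-in for the builder's mode enum; only the fallback logic matters.
enum class priority_mode_types { disabled, low, med, high };

priority_mode_types resolve_priority(priority_mode_types requested, bool extension_support) {
    // Keep the request only if the OpenCL extension is available,
    // otherwise silently fall back to 'disabled'.
    return extension_support ? requested : priority_mode_types::disabled;
}

int main() {
    auto mode = resolve_priority(priority_mode_types::high, /*extension_support=*/false);
    std::cout << (mode == priority_mode_types::disabled ? "disabled" : "enabled") << std::endl;
    return 0;
}
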

@ -2,7 +2,10 @@
// SPDX-License-Identifier: Apache-2.0
//
#include <thread>
#include "behavior/ov_plugin/core_integration.hpp"
#include "openvino/runtime/intel_gpu/properties.hpp"
#ifdef _WIN32
# include "gpu/gpu_context_api_dx.hpp"
@ -123,6 +126,439 @@ INSTANTIATE_TEST_SUITE_P(nightly_OVClassGetMetricTest,
OVClassGetMetricTest_GPU_EXECUTION_UNITS_COUNT,
::testing::Values("GPU"));
using OVClassGetPropertyTest_GPU = OVClassBaseTestP;
TEST_P(OVClassGetPropertyTest_GPU, GetMetricAvailableDevicesAndPrintNoThrow) {
ov::Core ie;
std::vector<std::string> properties;
ASSERT_NO_THROW(properties = ie.get_property(deviceName, ov::available_devices));
std::cout << "AVAILABLE_DEVICES: ";
for (const auto& prop : properties) {
std::cout << prop << " ";
}
std::cout << std::endl;
OV_ASSERT_PROPERTY_SUPPORTED(ov::available_devices);
}
TEST_P(OVClassGetPropertyTest_GPU, GetMetricRangeForAsyncInferRequestsAndPrintNoThrow) {
ov::Core ie;
std::tuple<unsigned int, unsigned int, unsigned int> property;
ASSERT_NO_THROW(property = ie.get_property(deviceName, ov::range_for_async_infer_requests));
std::cout << "RANGE_FOR_ASYNC_INFER_REQUESTS: " << std::get<0>(property) << " " <<
std::get<1>(property) << " " <<
std::get<2>(property) << std::endl;
OV_ASSERT_PROPERTY_SUPPORTED(ov::range_for_async_infer_requests);
}
TEST_P(OVClassGetPropertyTest_GPU, GetMetricRangeForStreamsAndPrintNoThrow) {
ov::Core ie;
std::tuple<unsigned int, unsigned int> property;
ASSERT_NO_THROW(property = ie.get_property(deviceName, ov::range_for_streams));
std::cout << "RANGE_FOR_STREAMS: " << std::get<0>(property) << " " <<
std::get<1>(property) << std::endl;
OV_ASSERT_PROPERTY_SUPPORTED(ov::range_for_streams);
}
TEST_P(OVClassGetPropertyTest_GPU, GetMetricOptimalBatchSizeAndPrintNoThrow) {
ov::Core ie;
unsigned int property;
ASSERT_NO_THROW(property = ie.get_property(deviceName, ov::optimal_batch_size));
std::cout << "OPTIMAL_BATCH_SIZE: " << property << std::endl;
OV_ASSERT_PROPERTY_SUPPORTED(ov::optimal_batch_size);
}
TEST_P(OVClassGetPropertyTest_GPU, GetMetricFullNameAndPrintNoThrow) {
ov::Core ie;
std::string property;
ASSERT_NO_THROW(property = ie.get_property(deviceName, ov::device::full_name));
std::cout << "FULL_DEVICE_NAME: " << property << std::endl;
OV_ASSERT_PROPERTY_SUPPORTED(ov::device::full_name);
}
TEST_P(OVClassGetPropertyTest_GPU, GetMetricTypeAndPrintNoThrow) {
ov::Core ie;
ov::device::Type property = ov::device::Type::INTEGRATED;
ASSERT_NO_THROW(property = ie.get_property(deviceName, ov::device::type));
std::cout << "DEVICE_TYPE: " << property << std::endl;
OV_ASSERT_PROPERTY_SUPPORTED(ov::device::type);
}
TEST_P(OVClassGetPropertyTest_GPU, GetMetricGopsAndPrintNoThrow) {
ov::Core ie;
std::map<ov::element::Type, float> properties;
ASSERT_NO_THROW(properties = ie.get_property(deviceName, ov::device::gops));
std::cout << "DEVICE_GOPS: " << std::endl;
for (const auto& prop : properties) {
std::cout << "- " << prop.first << ": " << prop.second << std::endl;
}
OV_ASSERT_PROPERTY_SUPPORTED(ov::device::gops);
}
TEST_P(OVClassGetPropertyTest_GPU, GetMetricCapabilitiesAndPrintNoThrow) {
ov::Core ie;
std::vector<std::string> properties;
ASSERT_NO_THROW(properties = ie.get_property(deviceName, ov::device::capabilities));
std::cout << "OPTIMIZATION_CAPABILITIES: " << std::endl;
for (const auto& prop : properties) {
std::cout << "- " << prop << std::endl;
}
OV_ASSERT_PROPERTY_SUPPORTED(ov::device::capabilities);
}
TEST_P(OVClassGetPropertyTest_GPU, GetMetricDeviceTotalMemSizeAndPrintNoThrow) {
ov::Core ie;
uint64_t property;
ASSERT_NO_THROW(property = ie.get_property(deviceName, ov::intel_gpu::device_total_mem_size));
std::cout << "GPU_DEVICE_TOTAL_MEM_SIZE: " << property << std::endl;
OV_ASSERT_PROPERTY_SUPPORTED(ov::intel_gpu::device_total_mem_size);
}
TEST_P(OVClassGetPropertyTest_GPU, GetMetricUarchVersionAndPrintNoThrow) {
ov::Core ie;
std::string property;
ASSERT_NO_THROW(property = ie.get_property(deviceName, ov::intel_gpu::uarch_version));
std::cout << "GPU_UARCH_VERSION: " << property << std::endl;
OV_ASSERT_PROPERTY_SUPPORTED(ov::intel_gpu::uarch_version);
}
TEST_P(OVClassGetPropertyTest_GPU, GetMetricExecutionUnitsCountAndPrintNoThrow) {
ov::Core ie;
int32_t property = 0;
ASSERT_NO_THROW(property = ie.get_property(deviceName, ov::intel_gpu::execution_units_count));
std::cout << "GPU_EXECUTION_UNITS_COUNT: " << property << std::endl;
OV_ASSERT_PROPERTY_SUPPORTED(ov::intel_gpu::execution_units_count);
}
TEST_P(OVClassGetPropertyTest_GPU, GetMetricMemoryStatisticsAndPrintNoThrow) {
ov::Core ie;
std::map<std::string, uint64_t> properties;
ASSERT_NO_THROW(properties = ie.get_property(deviceName, ov::intel_gpu::memory_statistics));
std::cout << "GPU_MEMORY_STATISTICS: " << std::endl;
for (const auto& prop : properties) {
std::cout << " " << prop.first << " - " << prop.second << std::endl;
}
OV_ASSERT_PROPERTY_SUPPORTED(ov::intel_gpu::memory_statistics);
}
TEST_P(OVClassGetPropertyTest_GPU, GetMetricMaxBatchSizeAndPrintNoThrow) {
ov::Core ie;
uint32_t property;
ASSERT_NO_THROW(property = ie.get_property(deviceName, ov::max_batch_size));
std::cout << "GPU_MAX_BATCH_SIZE: " << property << std::endl;
OV_ASSERT_PROPERTY_SUPPORTED(ov::max_batch_size);
}
TEST_P(OVClassGetPropertyTest_GPU, CanSetDefaultValueBackToPluginNewAPI) {
ov::Core ie;
std::vector<ov::PropertyName> properties;
ASSERT_NO_THROW(properties = ie.get_property(deviceName, ov::supported_properties));
std::cout << "SUPPORTED_PROPERTIES:" << std::endl;
for (const auto& property : properties) {
ov::Any prop;
if (property.is_mutable()) {
std::cout << "RW: " << property << " ";
ASSERT_NO_THROW(prop = ie.get_property(deviceName, property));
prop.print(std::cout);
std::cout << std::endl;
ASSERT_NO_THROW(ie.set_property(deviceName, {{property, prop}}));
} else {
std::cout << "RO: " << property << " ";
ASSERT_NO_THROW(prop = ie.get_property(deviceName, property));
prop.print(std::cout);
std::cout << std::endl;
}
}
OV_ASSERT_PROPERTY_SUPPORTED(ov::supported_properties);
}
INSTANTIATE_TEST_SUITE_P(nightly_OVClassGetMetricTest,
OVClassGetPropertyTest_GPU,
::testing::Values("GPU"));
using OVClassGetMetricTest_GPU_OPTIMAL_BATCH_SIZE = OVClassBaseTestP;
TEST_P(OVClassGetMetricTest_GPU_OPTIMAL_BATCH_SIZE, GetMetricAndPrintNoThrow) {
SKIP_IF_CURRENT_TEST_IS_DISABLED()
ov::Core ie;
unsigned int p;
ov::AnyMap _options = {ov::hint::model(simpleNetwork)};
ASSERT_NO_THROW(p = ie.get_property(deviceName, ov::optimal_batch_size.name(), _options));
std::cout << "GPU device optimal batch size: " << p << std::endl;
OV_ASSERT_PROPERTY_SUPPORTED(ov::optimal_batch_size);
}
INSTANTIATE_TEST_SUITE_P(
nightly_OVClassExecutableNetworkGetMetricTest, OVClassGetMetricTest_GPU_OPTIMAL_BATCH_SIZE,
::testing::Values("GPU")
);
using OVClassGetMetricTest_GPU_MAX_BATCH_SIZE_DEFAULT = OVClassBaseTestP;
TEST_P(OVClassGetMetricTest_GPU_MAX_BATCH_SIZE_DEFAULT, GetMetricAndPrintNoThrow) {
SKIP_IF_CURRENT_TEST_IS_DISABLED()
ov::Core ie;
unsigned int p;
ov::AnyMap _options = {ov::hint::model(simpleNetwork)};
ASSERT_NO_THROW(p = ie.get_property(deviceName, ov::max_batch_size.name(), _options));
std::cout << "GPU device max available batch size: " << p << std::endl;
OV_ASSERT_PROPERTY_SUPPORTED(ov::max_batch_size);
}
INSTANTIATE_TEST_SUITE_P(
nightly_IEClassExecutableNetworkGetMetricTest, OVClassGetMetricTest_GPU_MAX_BATCH_SIZE_DEFAULT,
::testing::Values("GPU")
);
using OVClassGetMetricTest_GPU_MAX_BATCH_SIZE_STREAM_DEVICE_MEM = OVClassBaseTestP;
TEST_P(OVClassGetMetricTest_GPU_MAX_BATCH_SIZE_STREAM_DEVICE_MEM, GetMetricAndPrintNoThrow) {
SKIP_IF_CURRENT_TEST_IS_DISABLED()
ov::Core ie;
unsigned int p;
auto exec_net1 = ie.compile_model(simpleNetwork, deviceName);
uint32_t n_streams = 2;
int64_t available_device_mem_size = 1073741824;
ov::AnyMap _options = {ov::hint::model(simpleNetwork),
ov::streams::num(n_streams),
ov::intel_gpu::hint::available_device_mem(available_device_mem_size)};
ASSERT_NO_THROW(p = ie.get_property(deviceName, ov::max_batch_size.name(), _options));
std::cout << "GPU device max available batch size: " << p << std::endl;
OV_ASSERT_PROPERTY_SUPPORTED(ov::max_batch_size);
}
INSTANTIATE_TEST_SUITE_P(
nightly_IEClassExecutableNetworkGetMetricTest, OVClassGetMetricTest_GPU_MAX_BATCH_SIZE_STREAM_DEVICE_MEM,
::testing::Values("GPU")
);
using OVClassGetMetricTest_GPU_MEMORY_STATISTICS_DEFAULT = OVClassBaseTestP;
TEST_P(OVClassGetMetricTest_GPU_MEMORY_STATISTICS_DEFAULT, GetMetricAndPrintNoThrow) {
SKIP_IF_CURRENT_TEST_IS_DISABLED()
ov::Core ie;
std::map<std::string, uint64_t> p;
auto exec_net = ie.compile_model(simpleNetwork, deviceName);
ASSERT_NO_THROW(p = ie.get_property(deviceName, ov::intel_gpu::memory_statistics));
ASSERT_FALSE(p.empty());
std::cout << "Memory Statistics: " << std::endl;
for (auto &&kv : p) {
ASSERT_NE(kv.second, 0);
std::cout << kv.first << ": " << kv.second << " bytes" << std::endl;
}
OV_ASSERT_PROPERTY_SUPPORTED(ov::intel_gpu::memory_statistics);
}
INSTANTIATE_TEST_SUITE_P(
nightly_IEClassGetMetricTest, OVClassGetMetricTest_GPU_MEMORY_STATISTICS_DEFAULT,
::testing::Values("GPU")
);
using OVClassGetMetricTest_GPU_MEMORY_STATISTICS_MULTIPLE_NETWORKS = OVClassBaseTestP;
TEST_P(OVClassGetMetricTest_GPU_MEMORY_STATISTICS_MULTIPLE_NETWORKS, GetMetricAndPrintNoThrow) {
SKIP_IF_CURRENT_TEST_IS_DISABLED()
ov::Core ie;
std::map<std::string, uint64_t> t1;
std::map<std::string, uint64_t> t2;
auto exec_net1 = ie.compile_model(simpleNetwork, deviceName);
ASSERT_NO_THROW(t1 = ie.get_property(deviceName, ov::intel_gpu::memory_statistics));
ASSERT_FALSE(t1.empty());
for (auto &&kv : t1) {
ASSERT_NE(kv.second, 0);
}
auto exec_net2 = ie.compile_model(simpleNetwork, deviceName);
ASSERT_NO_THROW(t2 = ie.get_property(deviceName, ov::intel_gpu::memory_statistics));
ASSERT_FALSE(t2.empty());
for (auto &&kv : t2) {
ASSERT_NE(kv.second, 0);
auto iter = t1.find(kv.first);
if (iter != t1.end()) {
ASSERT_EQ(kv.second, t1[kv.first] * 2);
}
}
OV_ASSERT_PROPERTY_SUPPORTED(ov::intel_gpu::memory_statistics);
}
INSTANTIATE_TEST_SUITE_P(
nightly_IEClassGetMetricTest, OVClassGetMetricTest_GPU_MEMORY_STATISTICS_MULTIPLE_NETWORKS,
::testing::Values("GPU")
);
using OVClassGetMetricTest_GPU_MEMORY_STATISTICS_CHECK_VALUES = OVClassBaseTestP;
TEST_P(OVClassGetMetricTest_GPU_MEMORY_STATISTICS_CHECK_VALUES, GetMetricAndPrintNoThrow) {
SKIP_IF_CURRENT_TEST_IS_DISABLED()
ov::Core ie;
std::map<std::string, uint64_t> t1;
ASSERT_NO_THROW(t1 = ie.get_property(deviceName, ov::intel_gpu::memory_statistics));
ASSERT_TRUE(t1.empty());
{
auto exec_net1 = ie.compile_model(simpleNetwork, deviceName);
std::map<std::string, uint64_t> t2;
ASSERT_NO_THROW(t2 = ie.get_property(deviceName, ov::intel_gpu::memory_statistics));
ASSERT_FALSE(t2.empty());
for (auto &&kv : t2) {
ASSERT_NE(kv.second, 0);
}
{
auto exec_net2 = ie.compile_model(actualNetwork, deviceName);
std::map<std::string, uint64_t> t3;
ASSERT_NO_THROW(t3 = ie.get_property(deviceName, ov::intel_gpu::memory_statistics));
ASSERT_FALSE(t3.empty());
for (auto &&kv : t3) {
ASSERT_NE(kv.second, 0);
}
}
std::map<std::string, uint64_t> t4;
ASSERT_NO_THROW(t4 = ie.get_property(deviceName, ov::intel_gpu::memory_statistics));
ASSERT_FALSE(t4.empty());
for (auto &&kv : t4) {
ASSERT_NE(kv.second, 0);
if (kv.first.find("_cur") != std::string::npos) {
auto iter = t2.find(kv.first);
if (iter != t2.end()) {
ASSERT_EQ(t2[kv.first], kv.second);
}
}
}
}
std::map<std::string, uint64_t> t5;
ASSERT_NO_THROW(t5 = ie.get_property(deviceName, ov::intel_gpu::memory_statistics));
ASSERT_FALSE(t5.empty());
for (auto &&kv : t5) {
if (kv.first.find("_cur") != std::string::npos) {
ASSERT_EQ(kv.second, 0);
}
}
OV_ASSERT_PROPERTY_SUPPORTED(ov::intel_gpu::memory_statistics);
}
INSTANTIATE_TEST_SUITE_P(
nightly_IEClassGetMetricTest, OVClassGetMetricTest_GPU_MEMORY_STATISTICS_CHECK_VALUES,
::testing::Values("GPU")
);
using OVClassGetMetricTest_GPU_MEMORY_STATISTICS_MULTI_THREADS = OVClassBaseTestP;
TEST_P(OVClassGetMetricTest_GPU_MEMORY_STATISTICS_MULTI_THREADS, GetMetricAndPrintNoThrow) {
SKIP_IF_CURRENT_TEST_IS_DISABLED()
ov::Core ie;
std::map<std::string, uint64_t> t1;
std::map<std::string, uint64_t> t2;
std::atomic<uint32_t> counter{0u};
std::vector<std::thread> threads(2);
// key: thread id, value: executable network
std::map<uint32_t, ov::CompiledModel> exec_net_map;
std::vector<std::shared_ptr<ngraph::Function>> networks;
networks.emplace_back(simpleNetwork);
networks.emplace_back(simpleNetwork);
auto exec_net1 = ie.compile_model(simpleNetwork, deviceName);
ASSERT_NO_THROW(t1 = ie.get_property(deviceName, ov::intel_gpu::memory_statistics));
ASSERT_FALSE(t1.empty());
for (auto &&kv : t1) {
ASSERT_NE(kv.second, 0);
}
for (auto & thread : threads) {
thread = std::thread([&](){
auto value = counter++;
exec_net_map[value] = ie.compile_model(networks[value], deviceName);
});
}
for (auto & thread : threads) {
if (thread.joinable()) {
thread.join();
}
}
ASSERT_NO_THROW(t2 = ie.get_property(deviceName, ov::intel_gpu::memory_statistics));
ASSERT_FALSE(t2.empty());
for (auto &&kv : t2) {
ASSERT_NE(kv.second, 0);
auto iter = t1.find(kv.first);
if (iter != t1.end()) {
ASSERT_EQ(kv.second, t1[kv.first] * 3);
}
}
OV_ASSERT_PROPERTY_SUPPORTED(ov::intel_gpu::memory_statistics);
}
INSTANTIATE_TEST_SUITE_P(
nightly_IEClassGetMetricTest, OVClassGetMetricTest_GPU_MEMORY_STATISTICS_MULTI_THREADS,
::testing::Values("GPU")
);
//
// IE Class GetConfig
//

@ -13,7 +13,7 @@ namespace {
DefaultConfigurationTest,
::testing::Combine(
::testing::Values(CommonTestUtils::DEVICE_GPU),
::testing::Values(DefaultParameter{GPU_CONFIG_KEY(PLUGIN_THROTTLE), InferenceEngine::Parameter{std::string{"0"}}})),
::testing::Values(DefaultParameter{GPU_CONFIG_KEY(PLUGIN_THROTTLE), InferenceEngine::Parameter{std::string{"2"}}})),
DefaultConfigurationTest::getTestCaseName);
IE_SUPPRESS_DEPRECATED_START
@ -164,10 +164,6 @@ namespace {
{{InferenceEngine::GPUConfigParams::KEY_GPU_PLUGIN_THROTTLE, "1"}},
{{InferenceEngine::GPUConfigParams::KEY_GPU_PLUGIN_PRIORITY, "0"}},
{{InferenceEngine::GPUConfigParams::KEY_GPU_PLUGIN_PRIORITY, "1"}},
{{InferenceEngine::GPUConfigParams::KEY_GPU_MODEL_PRIORITY, InferenceEngine::GPUConfigParams::GPU_QUEUE_PRIORITY_HIGH
+ std::string("|") + InferenceEngine::GPUConfigParams::GPU_HOST_TASK_PRIORITY_ANY}},
{{InferenceEngine::GPUConfigParams::KEY_GPU_MODEL_PRIORITY, InferenceEngine::GPUConfigParams::GPU_QUEUE_PRIORITY_LOW
+ std::string("|") + InferenceEngine::GPUConfigParams::GPU_HOST_TASK_PRIORITY_ANY}},
{{InferenceEngine::GPUConfigParams::KEY_GPU_MAX_NUM_THREADS, "1"}},
{{InferenceEngine::GPUConfigParams::KEY_GPU_MAX_NUM_THREADS, "4"}},
{{InferenceEngine::GPUConfigParams::KEY_GPU_ENABLE_LOOP_UNROLLING, InferenceEngine::PluginConfigParams::YES}},

@ -115,72 +115,6 @@ INSTANTIATE_TEST_SUITE_P(
::testing::Values("GPU")
);
using IEClassGetMetricTest_GPU_OPTIMAL_BATCH_SIZE = BehaviorTestsUtils::IEClassBaseTestP;
TEST_P(IEClassGetMetricTest_GPU_OPTIMAL_BATCH_SIZE, GetMetricAndPrintNoThrow) {
SKIP_IF_CURRENT_TEST_IS_DISABLED()
InferenceEngine::Core ie;
InferenceEngine::Parameter p;
std::map<std::string, InferenceEngine::Parameter> _options = {{"MODEL_PTR", simpleCnnNetwork.getFunction()}};
ASSERT_NO_THROW(p = ie.GetMetric(deviceName, METRIC_KEY(OPTIMAL_BATCH_SIZE), _options).as<unsigned int>());
unsigned int t = p;
std::cout << "GPU device optimal batch size: " << t << std::endl;
ASSERT_METRIC_SUPPORTED_IE(METRIC_KEY(OPTIMAL_BATCH_SIZE));
}
INSTANTIATE_TEST_SUITE_P(
nightly_IEClassExecutableNetworkGetMetricTest, IEClassGetMetricTest_GPU_OPTIMAL_BATCH_SIZE,
::testing::Values("GPU")
);
using IEClassGetMetricTest_GPU_MAX_BATCH_SIZE_DEFAULT = BehaviorTestsUtils::IEClassBaseTestP;
TEST_P(IEClassGetMetricTest_GPU_MAX_BATCH_SIZE_DEFAULT, GetMetricAndPrintNoThrow) {
SKIP_IF_CURRENT_TEST_IS_DISABLED()
InferenceEngine::Core ie;
InferenceEngine::Parameter p;
std::map<std::string, InferenceEngine::Parameter> _options = {{"MODEL_PTR", simpleCnnNetwork.getFunction()}};
ASSERT_NO_THROW(p = ie.GetMetric(deviceName, GPU_METRIC_KEY(MAX_BATCH_SIZE), _options).as<uint32_t>());
uint32_t t = p;
std::cout << "GPU device max available batch size: " << t << std::endl;
ASSERT_METRIC_SUPPORTED_IE(GPU_METRIC_KEY(MAX_BATCH_SIZE));
}
INSTANTIATE_TEST_SUITE_P(
nightly_IEClassExecutableNetworkGetMetricTest, IEClassGetMetricTest_GPU_MAX_BATCH_SIZE_DEFAULT,
::testing::Values("GPU")
);
using IEClassGetMetricTest_GPU_MAX_BATCH_SIZE_STREAM_DEVICE_MEM = BehaviorTestsUtils::IEClassBaseTestP;
TEST_P(IEClassGetMetricTest_GPU_MAX_BATCH_SIZE_STREAM_DEVICE_MEM, GetMetricAndPrintNoThrow) {
SKIP_IF_CURRENT_TEST_IS_DISABLED()
InferenceEngine::Core ie;
InferenceEngine::Parameter p;
uint32_t n_streams = 2;
int64_t available_device_mem_size = 1073741824;
std::map<std::string, InferenceEngine::Parameter> _options = {{"MODEL_PTR", simpleCnnNetwork.getFunction()}};
_options.insert(std::make_pair("GPU_THROUGHPUT_STREAMS", n_streams));
_options.insert(std::make_pair("AVAILABLE_DEVICE_MEM_SIZE", available_device_mem_size));
ASSERT_NO_THROW(p = ie.GetMetric(deviceName, GPU_METRIC_KEY(MAX_BATCH_SIZE), _options).as<uint32_t>());
uint32_t t = p;
std::cout << "GPU device max available batch size: " << t << std::endl;
ASSERT_METRIC_SUPPORTED_IE(GPU_METRIC_KEY(MAX_BATCH_SIZE));
}
INSTANTIATE_TEST_SUITE_P(
nightly_IEClassExecutableNetworkGetMetricTest, IEClassGetMetricTest_GPU_MAX_BATCH_SIZE_STREAM_DEVICE_MEM,
::testing::Values("GPU")
);
using IEClassGetMetricTest_GPU_UARCH_VERSION = BehaviorTestsUtils::IEClassBaseTestP;
TEST_P(IEClassGetMetricTest_GPU_UARCH_VERSION, GetMetricAndPrintNoThrow) {
SKIP_IF_CURRENT_TEST_IS_DISABLED()
@ -219,189 +153,6 @@ INSTANTIATE_TEST_SUITE_P(
::testing::Values("GPU")
);
using IEClassGetMetricTest_GPU_MEMORY_STATISTICS_DEFAULT = BehaviorTestsUtils::IEClassBaseTestP;
TEST_P(IEClassGetMetricTest_GPU_MEMORY_STATISTICS_DEFAULT, GetMetricAndPrintNoThrow) {
SKIP_IF_CURRENT_TEST_IS_DISABLED()
InferenceEngine::Core ie;
InferenceEngine::Parameter p;
InferenceEngine::ExecutableNetwork exec_net = ie.LoadNetwork(simpleCnnNetwork, deviceName);
ASSERT_NO_THROW(p = ie.GetMetric(deviceName, GPU_METRIC_KEY(MEMORY_STATISTICS)));
std::map<std::string, uint64_t> t = p;
ASSERT_FALSE(t.empty());
std::cout << "Memory Statistics: " << std::endl;
for (auto &&kv : t) {
ASSERT_NE(kv.second, 0);
std::cout << kv.first << ": " << kv.second << " bytes" << std::endl;
}
ASSERT_METRIC_SUPPORTED_IE(GPU_METRIC_KEY(MEMORY_STATISTICS));
}
INSTANTIATE_TEST_SUITE_P(
nightly_IEClassGetMetricTest, IEClassGetMetricTest_GPU_MEMORY_STATISTICS_DEFAULT,
::testing::Values("GPU")
);
using IEClassGetMetricTest_GPU_MEMORY_STATISTICS_MULTIPLE_NETWORKS = BehaviorTestsUtils::IEClassBaseTestP;
TEST_P(IEClassGetMetricTest_GPU_MEMORY_STATISTICS_MULTIPLE_NETWORKS, GetMetricAndPrintNoThrow) {
SKIP_IF_CURRENT_TEST_IS_DISABLED()
InferenceEngine::Core ie;
InferenceEngine::Parameter p;
InferenceEngine::ExecutableNetwork exec_net1 = ie.LoadNetwork(simpleCnnNetwork, deviceName);
ASSERT_NO_THROW(p = ie.GetMetric(deviceName, GPU_METRIC_KEY(MEMORY_STATISTICS)));
std::map<std::string, uint64_t> t1 = p;
ASSERT_FALSE(t1.empty());
for (auto &&kv : t1) {
ASSERT_NE(kv.second, 0);
}
InferenceEngine::ExecutableNetwork exec_net2 = ie.LoadNetwork(simpleCnnNetwork, deviceName);
ASSERT_NO_THROW(p = ie.GetMetric(deviceName, GPU_METRIC_KEY(MEMORY_STATISTICS)));
std::map<std::string, uint64_t> t2 = p;
ASSERT_FALSE(t2.empty());
for (auto &&kv : t2) {
ASSERT_NE(kv.second, 0);
auto iter = t1.find(kv.first);
if (iter != t1.end()) {
ASSERT_EQ(kv.second, t1[kv.first] * 2);
}
}
ASSERT_METRIC_SUPPORTED_IE(GPU_METRIC_KEY(MEMORY_STATISTICS));
}
INSTANTIATE_TEST_SUITE_P(
nightly_IEClassGetMetricTest, IEClassGetMetricTest_GPU_MEMORY_STATISTICS_MULTIPLE_NETWORKS,
::testing::Values("GPU")
);
using IEClassGetMetricTest_GPU_MEMORY_STATISTICS_CHECK_VALUES = BehaviorTestsUtils::IEClassBaseTestP;
TEST_P(IEClassGetMetricTest_GPU_MEMORY_STATISTICS_CHECK_VALUES, GetMetricAndPrintNoThrow) {
SKIP_IF_CURRENT_TEST_IS_DISABLED()
InferenceEngine::Core ie;
InferenceEngine::Parameter p;
ASSERT_NO_THROW(p = ie.GetMetric(deviceName, GPU_METRIC_KEY(MEMORY_STATISTICS)));
std::map<std::string, uint64_t> t1 = p;
ASSERT_TRUE(t1.empty());
{
InferenceEngine::ExecutableNetwork exec_net1 = ie.LoadNetwork(simpleCnnNetwork, deviceName);
ASSERT_NO_THROW(p = ie.GetMetric(deviceName, GPU_METRIC_KEY(MEMORY_STATISTICS)));
std::map<std::string, uint64_t> t2 = p;
ASSERT_FALSE(t2.empty());
for (auto &&kv : t2) {
ASSERT_NE(kv.second, 0);
}
{
InferenceEngine::ExecutableNetwork exec_net2 = ie.LoadNetwork(actualCnnNetwork, deviceName);
ASSERT_NO_THROW(p = ie.GetMetric(deviceName, GPU_METRIC_KEY(MEMORY_STATISTICS)));
std::map<std::string, uint64_t> t3 = p;
ASSERT_FALSE(t3.empty());
for (auto &&kv : t3) {
ASSERT_NE(kv.second, 0);
}
}
ASSERT_NO_THROW(p = ie.GetMetric(deviceName, GPU_METRIC_KEY(MEMORY_STATISTICS)));
std::map<std::string, uint64_t> t4 = p;
ASSERT_FALSE(t4.empty());
for (auto &&kv : t4) {
ASSERT_NE(kv.second, 0);
if (kv.first.find("_cur") != std::string::npos) {
auto iter = t2.find(kv.first);
if (iter != t2.end()) {
ASSERT_EQ(t2[kv.first], kv.second);
}
}
}
}
ASSERT_NO_THROW(p = ie.GetMetric(deviceName, GPU_METRIC_KEY(MEMORY_STATISTICS)));
std::map<std::string, uint64_t> t5 = p;
ASSERT_FALSE(t5.empty());
for (auto &&kv : t5) {
if (kv.first.find("_cur") != std::string::npos) {
ASSERT_EQ(kv.second, 0);
}
}
ASSERT_METRIC_SUPPORTED_IE(GPU_METRIC_KEY(MEMORY_STATISTICS));
}
INSTANTIATE_TEST_SUITE_P(
nightly_IEClassGetMetricTest, IEClassGetMetricTest_GPU_MEMORY_STATISTICS_CHECK_VALUES,
::testing::Values("GPU")
);
using IEClassGetMetricTest_GPU_MEMORY_STATISTICS_MULTI_THREADS = BehaviorTestsUtils::IEClassBaseTestP;
TEST_P(IEClassGetMetricTest_GPU_MEMORY_STATISTICS_MULTI_THREADS, GetMetricAndPrintNoThrow) {
SKIP_IF_CURRENT_TEST_IS_DISABLED()
InferenceEngine::Core ie;
InferenceEngine::Parameter p;
std::atomic<uint32_t> counter{0u};
std::vector<std::thread> threads(2);
// key: thread id, value: executable network
std::map<uint32_t, InferenceEngine::ExecutableNetwork> exec_net_map;
std::vector<InferenceEngine::CNNNetwork> networks;
networks.emplace_back(simpleCnnNetwork);
networks.emplace_back(simpleCnnNetwork);
InferenceEngine::ExecutableNetwork exec_net1 = ie.LoadNetwork(simpleCnnNetwork, deviceName);
ASSERT_NO_THROW(p = ie.GetMetric(deviceName, GPU_METRIC_KEY(MEMORY_STATISTICS)));
std::map<std::string, uint64_t> t1 = p;
ASSERT_FALSE(t1.empty());
for (auto &&kv : t1) {
ASSERT_NE(kv.second, 0);
}
for (auto & thread : threads) {
thread = std::thread([&](){
auto value = counter++;
exec_net_map[value] = ie.LoadNetwork(networks[value], deviceName);
});
}
for (auto & thread : threads) {
if (thread.joinable()) {
thread.join();
}
}
ASSERT_NO_THROW(p = ie.GetMetric(deviceName, GPU_METRIC_KEY(MEMORY_STATISTICS)));
std::map<std::string, uint64_t> t2 = p;
ASSERT_FALSE(t2.empty());
for (auto &&kv : t2) {
ASSERT_NE(kv.second, 0);
auto iter = t1.find(kv.first);
if (iter != t1.end()) {
ASSERT_EQ(kv.second, t1[kv.first] * 3);
}
}
ASSERT_METRIC_SUPPORTED_IE(GPU_METRIC_KEY(MEMORY_STATISTICS));
}
INSTANTIATE_TEST_SUITE_P(
nightly_IEClassGetMetricTest, IEClassGetMetricTest_GPU_MEMORY_STATISTICS_MULTI_THREADS,
::testing::Values("GPU")
);
//
// IE Class GetConfig
//

@ -300,16 +300,16 @@ TEST_P(OVClassBasicTestP, smoke_SetConfigHeteroNoThrow) {
TEST(OVClassBasicTest, smoke_SetConfigAutoNoThrows) {
ov::Core ie = createCoreWithTemplate();
ov::hint::ModelPriority value;
OV_ASSERT_NO_THROW(ie.set_property(CommonTestUtils::DEVICE_AUTO, ov::hint::model_priority(ov::hint::ModelPriority::LOW)));
ov::hint::Priority value;
OV_ASSERT_NO_THROW(ie.set_property(CommonTestUtils::DEVICE_AUTO, ov::hint::model_priority(ov::hint::Priority::LOW)));
OV_ASSERT_NO_THROW(value = ie.get_property(CommonTestUtils::DEVICE_AUTO, ov::hint::model_priority));
EXPECT_EQ(value, ov::hint::ModelPriority::LOW);
OV_ASSERT_NO_THROW(ie.set_property(CommonTestUtils::DEVICE_AUTO, ov::hint::model_priority(ov::hint::ModelPriority::MEDIUM)));
EXPECT_EQ(value, ov::hint::Priority::LOW);
OV_ASSERT_NO_THROW(ie.set_property(CommonTestUtils::DEVICE_AUTO, ov::hint::model_priority(ov::hint::Priority::MEDIUM)));
OV_ASSERT_NO_THROW(value = ie.get_property(CommonTestUtils::DEVICE_AUTO, ov::hint::model_priority));
EXPECT_EQ(value, ov::hint::ModelPriority::MEDIUM);
OV_ASSERT_NO_THROW(ie.set_property(CommonTestUtils::DEVICE_AUTO, ov::hint::model_priority(ov::hint::ModelPriority::HIGH)));
EXPECT_EQ(value, ov::hint::Priority::MEDIUM);
OV_ASSERT_NO_THROW(ie.set_property(CommonTestUtils::DEVICE_AUTO, ov::hint::model_priority(ov::hint::Priority::HIGH)));
OV_ASSERT_NO_THROW(value = ie.get_property(CommonTestUtils::DEVICE_AUTO, ov::hint::model_priority));
EXPECT_EQ(value, ov::hint::ModelPriority::HIGH);
EXPECT_EQ(value, ov::hint::Priority::HIGH);
}
TEST_P(OVClassSpecificDeviceTestSetConfig, SetConfigSpecificDeviceNoThrow) {