diff --git a/docs/OV_Runtime_UG/supported_plugins/CPU.md b/docs/OV_Runtime_UG/supported_plugins/CPU.md index df17fd10f0d..14b569e3e42 100644 --- a/docs/OV_Runtime_UG/supported_plugins/CPU.md +++ b/docs/OV_Runtime_UG/supported_plugins/CPU.md @@ -105,14 +105,14 @@ to query ``ov::device::capabilities`` property, which should contain ``BF16`` in :fragment: [part0] -If the model has been converted to ``bf16``, the ``ov::inference_precision`` is set to ``ov::element::bf16`` and can be checked via +If the model has been converted to ``bf16``, the ``ov::hint::inference_precision`` is set to ``ov::element::bf16`` and can be checked via the ``ov::CompiledModel::get_property`` call. The code below demonstrates how to get the element type: .. doxygensnippet:: snippets/cpu/Bfloat16Inference1.cpp :language: py :fragment: [part1] -To infer the model in ``f32`` precision instead of ``bf16`` on targets with native ``bf16`` support, set the ``ov::inference_precision`` to ``ov::element::f32``. +To infer the model in ``f32`` precision instead of ``bf16`` on targets with native ``bf16`` support, set the ``ov::hint::inference_precision`` to ``ov::element::f32``. .. tab-set:: @@ -134,11 +134,11 @@ To infer the model in ``f32`` precision instead of ``bf16`` on targets with nati The ``Bfloat16`` software simulation mode is available on CPUs with IntelĀ® AVX-512 instruction set that do not support the native ``avx512_bf16`` instruction. This mode is used for development purposes and it does not guarantee good performance. -To enable the simulation, the ``ov::inference_precision`` has to be explicitly set to ``ov::element::bf16``. +To enable the simulation, the ``ov::hint::inference_precision`` has to be explicitly set to ``ov::element::bf16``. .. note:: - If ``ov::inference_precision`` is set to ``ov::element::bf16`` on a CPU without native bfloat16 support or bfloat16 simulation mode, an exception is thrown. + If ``ov::hint::inference_precision`` is set to ``ov::element::bf16`` on a CPU without native bfloat16 support or bfloat16 simulation mode, an exception is thrown. .. note:: @@ -292,7 +292,7 @@ Read-write Properties All parameters must be set before calling ``ov::Core::compile_model()`` in order to take effect or passed as additional argument to ``ov::Core::compile_model()`` - ``ov::enable_profiling`` -- ``ov::inference_precision`` +- ``ov::hint::inference_precision`` - ``ov::hint::performance_mode`` - ``ov::hint::num_request`` - ``ov::num_streams`` diff --git a/docs/OV_Runtime_UG/supported_plugins/GNA.md b/docs/OV_Runtime_UG/supported_plugins/GNA.md index 7faace9e172..1b07d5de1bf 100644 --- a/docs/OV_Runtime_UG/supported_plugins/GNA.md +++ b/docs/OV_Runtime_UG/supported_plugins/GNA.md @@ -140,7 +140,7 @@ quantization hints based on statistics for the provided dataset. * Accuracy (i16 weights) * Performance (i8 weights) -For POT quantized models, the ``ov::inference_precision`` property has no effect except in cases described in the +For POT quantized models, the ``ov::hint::inference_precision`` property has no effect except in cases described in the :ref:`Model and Operation Limitations section <#model-and-operation-limitations>`. 
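A minimal sketch of the renamed property in use on the CPU plugin, combining the two Bfloat16Inference snippets touched by this patch. The "sample.xml" model path is taken from those snippets; everything else is an illustrative assumption, not part of the patch itself.

    #include <openvino/openvino.hpp>

    int main() {
        ov::Core core;
        auto model = core.read_model("sample.xml");  // model path reused from the docs snippets

        // Ask the CPU plugin to keep execution in f32 even on bf16-capable hardware,
        // using the hint-namespaced property introduced by this change.
        auto compiled_model = core.compile_model(model, "CPU",
                                                 ov::hint::inference_precision(ov::element::f32));

        // The effective precision can be read back from the compiled model.
        ov::element::Type precision = compiled_model.get_property(ov::hint::inference_precision);
        return precision == ov::element::f32 ? 0 : 1;
    }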
@@ -268,7 +268,7 @@ In order to take effect, the following parameters must be set before model compi - ov::cache_dir - ov::enable_profiling -- ov::inference_precision +- ov::hint::inference_precision - ov::hint::num_requests - ov::intel_gna::compile_target - ov::intel_gna::firmware_model_image_path @@ -354,7 +354,7 @@ Support for 2D Convolutions using POT For POT to successfully work with the models including GNA3.0 2D convolutions, the following requirements must be met: * All convolution parameters are natively supported by HW (see tables above). -* The runtime precision is explicitly set by the ``ov::inference_precision`` property as ``i8`` for the models produced by +* The runtime precision is explicitly set by the ``ov::hint::inference_precision`` property as ``i8`` for the models produced by the ``performance mode`` of POT, and as ``i16`` for the models produced by the ``accuracy mode`` of POT. diff --git a/docs/OV_Runtime_UG/supported_plugins/GPU.md b/docs/OV_Runtime_UG/supported_plugins/GPU.md index d2e5bce1d37..3be45dc12eb 100644 --- a/docs/OV_Runtime_UG/supported_plugins/GPU.md +++ b/docs/OV_Runtime_UG/supported_plugins/GPU.md @@ -327,7 +327,7 @@ All parameters must be set before calling ``ov::Core::compile_model()`` in order - ov::hint::performance_mode - ov::hint::execution_mode - ov::hint::num_requests -- ov::inference_precision +- ov::hint::inference_precision - ov::num_streams - ov::compilation_num_threads - ov::device::id diff --git a/docs/optimization_guide/dldt_deployment_optimization_guide.md b/docs/optimization_guide/dldt_deployment_optimization_guide.md index 9ae41edc37a..8dfad7f088d 100644 --- a/docs/optimization_guide/dldt_deployment_optimization_guide.md +++ b/docs/optimization_guide/dldt_deployment_optimization_guide.md @@ -16,7 +16,7 @@ Runtime optimization, or deployment optimization, focuses on tuning inference parameters and execution means (e.g., the optimum number of requests executed simultaneously). Unlike model-level optimizations, they are highly specific to the hardware and case they are used for, and often come at a cost. -`ov::inference_precision `__ is a "typical runtime configuration" which trades accuracy for performance, allowing ``fp16/bf16`` execution for the layers that remain in ``fp32`` after quantization of the original ``fp32`` model. +`ov::hint::inference_precision `__ is a "typical runtime configuration" which trades accuracy for performance, allowing ``fp16/bf16`` execution for the layers that remain in ``fp32`` after quantization of the original ``fp32`` model. Therefore, optimization should start with defining the use case. For example, if it is about processing millions of samples by overnight jobs in data centers, throughput could be prioritized over latency. On the other hand, real-time usages would likely trade off throughput to deliver the results at minimal latency. A combined scenario is also possible, targeting the highest possible throughput, while maintaining a specific latency threshold. 
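As a rough illustration of the "typical runtime configuration" described above, the precision hint is usually passed together with a performance hint at compile time; this mirrors the updated ov_properties_api.cpp snippet below. The device name and model path are assumptions for the sketch, not values mandated by the patch.

    #include <openvino/openvino.hpp>

    int main() {
        ov::Core core;
        auto model = core.read_model("sample.xml");

        // Throughput-oriented execution, but with layers kept in f32:
        // both hints now live in the ov::hint namespace.
        auto compiled_model = core.compile_model(model, "CPU",
            ov::hint::performance_mode(ov::hint::PerformanceMode::THROUGHPUT),
            ov::hint::inference_precision(ov::element::f32));

        auto infer_request = compiled_model.create_infer_request();
        return 0;
    }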
diff --git a/docs/snippets/cpu/Bfloat16Inference1.cpp b/docs/snippets/cpu/Bfloat16Inference1.cpp index 58f42ebfcaf..51850c6018d 100644 --- a/docs/snippets/cpu/Bfloat16Inference1.cpp +++ b/docs/snippets/cpu/Bfloat16Inference1.cpp @@ -6,7 +6,7 @@ using namespace InferenceEngine; ov::Core core; auto network = core.read_model("sample.xml"); auto exec_network = core.compile_model(network, "CPU"); -auto inference_precision = exec_network.get_property(ov::inference_precision); +auto inference_precision = exec_network.get_property(ov::hint::inference_precision); //! [part1] return 0; diff --git a/docs/snippets/cpu/Bfloat16Inference2.cpp b/docs/snippets/cpu/Bfloat16Inference2.cpp index 762329269fc..c06a6491b89 100644 --- a/docs/snippets/cpu/Bfloat16Inference2.cpp +++ b/docs/snippets/cpu/Bfloat16Inference2.cpp @@ -4,7 +4,7 @@ int main() { using namespace InferenceEngine; //! [part2] ov::Core core; -core.set_property("CPU", ov::inference_precision(ov::element::f32)); +core.set_property("CPU", ov::hint::inference_precision(ov::element::f32)); //! [part2] return 0; diff --git a/docs/snippets/ov_hetero.cpp b/docs/snippets/ov_hetero.cpp index 2f5cf3f5c9e..791340afff5 100644 --- a/docs/snippets/ov_hetero.cpp +++ b/docs/snippets/ov_hetero.cpp @@ -49,7 +49,7 @@ auto compiled_model = core.compile_model(model, "HETERO", // profiling is enabled only for GPU ov::device::properties("GPU", ov::enable_profiling(true)), // FP32 inference precision only for CPU - ov::device::properties("CPU", ov::inference_precision(ov::element::f32)) + ov::device::properties("CPU", ov::hint::inference_precision(ov::element::f32)) ); //! [configure_fallback_devices] } diff --git a/docs/snippets/ov_properties_api.cpp b/docs/snippets/ov_properties_api.cpp index 7815291ee7b..e5f1ff7648f 100644 --- a/docs/snippets/ov_properties_api.cpp +++ b/docs/snippets/ov_properties_api.cpp @@ -19,7 +19,7 @@ auto model = core.read_model("sample.xml"); //! [compile_model_with_property] auto compiled_model = core.compile_model(model, "CPU", ov::hint::performance_mode(ov::hint::PerformanceMode::THROUGHPUT), - ov::inference_precision(ov::element::f32)); + ov::hint::inference_precision(ov::element::f32)); //! [compile_model_with_property] } diff --git a/docs/snippets/ov_properties_migration.cpp b/docs/snippets/ov_properties_migration.cpp index 6ee3279395c..7be66b4a1d1 100644 --- a/docs/snippets/ov_properties_migration.cpp +++ b/docs/snippets/ov_properties_migration.cpp @@ -25,7 +25,7 @@ auto model = core.read_model("sample.xml"); auto compiled_model = core.compile_model(model, "MULTI", ov::device::priorities("GPU", "CPU"), ov::hint::performance_mode(ov::hint::PerformanceMode::THROUGHPUT), - ov::inference_precision(ov::element::f32)); + ov::hint::inference_precision(ov::element::f32)); //! [core_compile_model] //! [compiled_model_set_property] diff --git a/samples/cpp/benchmark_app/benchmark_app.hpp b/samples/cpp/benchmark_app/benchmark_app.hpp index 50fe8e8dac1..1631861b18c 100644 --- a/samples/cpp/benchmark_app/benchmark_app.hpp +++ b/samples/cpp/benchmark_app/benchmark_app.hpp @@ -327,7 +327,7 @@ DEFINE_string(nstreams, "", infer_num_streams_message); /// @brief Define flag for inference only mode
DEFINE_bool(inference_only, true, inference_only_message); -/// @brief Define flag for inference precision +/// @brief Define flag for inference precision hint DEFINE_string(infer_precision, "", inference_precision_message); /// @brief Specify precision for all input layers of the network diff --git a/samples/cpp/benchmark_app/main.cpp b/samples/cpp/benchmark_app/main.cpp index 99e268dc9dd..107de484a1b 100644 --- a/samples/cpp/benchmark_app/main.cpp +++ b/samples/cpp/benchmark_app/main.cpp @@ -481,17 +481,17 @@ int main(int argc, char* argv[]) { auto it_device_infer_precision = device_infer_precision.find(device); if (it_device_infer_precision != device_infer_precision.end()) { // set to user defined value - if (supported(ov::inference_precision.name())) { - device_config.emplace(ov::inference_precision(it_device_infer_precision->second)); + if (supported(ov::hint::inference_precision.name())) { + device_config.emplace(ov::hint::inference_precision(it_device_infer_precision->second)); } else if (is_virtual_device(device)) { update_device_config_for_virtual_device(it_device_infer_precision->second, device_config, - ov::inference_precision, + ov::hint::inference_precision, is_dev_set_property, is_load_config); } else { throw std::logic_error("Device " + device + " doesn't support config key '" + - ov::inference_precision.name() + "'! " + + ov::hint::inference_precision.name() + "'! " + "Please specify -infer_precision for correct devices in format " ":,:" + " or via configuration file."); diff --git a/samples/cpp/benchmark_app/utils.cpp b/samples/cpp/benchmark_app/utils.cpp index 8c53f3d1924..48940059dfe 100644 --- a/samples/cpp/benchmark_app/utils.cpp +++ b/samples/cpp/benchmark_app/utils.cpp @@ -200,7 +200,7 @@ void update_device_config_for_virtual_device(const std::string& value, const auto& device_value = it.second; if (device_config.find(ov::device::properties.name()) == device_config.end() || (is_load_config && is_dev_set_property[device_name])) { - // Create ov::device::properties with ov::num_stream/ov::inference_precision and + // Create ov::device::properties with ov::num_stream/ov::hint::inference_precision and // 1. Insert this ov::device::properties into device config if this // ov::device::properties isn't existed. Otherwise, // 2. Replace the existed ov::device::properties within device config. diff --git a/samples/cpp/speech_sample/main.cpp b/samples/cpp/speech_sample/main.cpp index 7553ee7bf22..3c17b8e3488 100644 --- a/samples/cpp/speech_sample/main.cpp +++ b/samples/cpp/speech_sample/main.cpp @@ -220,7 +220,7 @@ int main(int argc, char* argv[]) { gnaPluginConfig[ov::intel_gna::scale_factors_per_input.name()] = scale_factors_per_input; } } - gnaPluginConfig[ov::inference_precision.name()] = (FLAGS_qb == 8) ? ov::element::i8 : ov::element::i16; + gnaPluginConfig[ov::hint::inference_precision.name()] = (FLAGS_qb == 8) ? 
ov::element::i8 : ov::element::i16; const std::unordered_map StringHWGenerationMap{ {"GNA_TARGET_1_0", ov::intel_gna::HWGeneration::GNA_1_0}, {"GNA_TARGET_2_0", ov::intel_gna::HWGeneration::GNA_2_0}, diff --git a/src/bindings/python/src/pyopenvino/core/properties/properties.cpp b/src/bindings/python/src/pyopenvino/core/properties/properties.cpp index 3bc97410508..51c88082a86 100644 --- a/src/bindings/python/src/pyopenvino/core/properties/properties.cpp +++ b/src/bindings/python/src/pyopenvino/core/properties/properties.cpp @@ -39,7 +39,6 @@ void regmodule_properties(py::module m) { wrap_property_RO(m_properties, ov::optimal_batch_size, "optimal_batch_size"); wrap_property_RO(m_properties, ov::max_batch_size, "max_batch_size"); wrap_property_RO(m_properties, ov::range_for_async_infer_requests, "range_for_async_infer_requests"); - wrap_property_RW(m_properties, ov::inference_precision, "inference_precision"); // Submodule hint py::module m_hint = diff --git a/src/bindings/python/tests/test_runtime/test_properties.py b/src/bindings/python/tests/test_runtime/test_properties.py index f525ed23e11..1b00524b549 100644 --- a/src/bindings/python/tests/test_runtime/test_properties.py +++ b/src/bindings/python/tests/test_runtime/test_properties.py @@ -215,7 +215,6 @@ def test_properties_ro(ov_property_ro, expected_value): ((properties.Affinity.NONE, properties.Affinity.NONE),), ), (properties.force_tbb_terminate, "FORCE_TBB_TERMINATE", ((True, True),)), - (properties.inference_precision, "INFERENCE_PRECISION_HINT", ((Type.f32, Type.f32),)), (properties.hint.inference_precision, "INFERENCE_PRECISION_HINT", ((Type.f32, Type.f32),)), ( properties.hint.model_priority, @@ -342,12 +341,12 @@ def test_properties_device_properties(): {"CPU": {"NUM_STREAMS": 2}}) check({"CPU": make_dict(properties.streams.num(2))}, {"CPU": {"NUM_STREAMS": properties.streams.Num(2)}}) - check({"GPU": make_dict(properties.inference_precision(Type.f32))}, + check({"GPU": make_dict(properties.hint.inference_precision(Type.f32))}, {"GPU": {"INFERENCE_PRECISION_HINT": Type.f32}}) - check({"CPU": make_dict(properties.streams.num(2), properties.inference_precision(Type.f32))}, + check({"CPU": make_dict(properties.streams.num(2), properties.hint.inference_precision(Type.f32))}, {"CPU": {"INFERENCE_PRECISION_HINT": Type.f32, "NUM_STREAMS": properties.streams.Num(2)}}) - check({"CPU": make_dict(properties.streams.num(2), properties.inference_precision(Type.f32)), - "GPU": make_dict(properties.streams.num(1), properties.inference_precision(Type.f16))}, + check({"CPU": make_dict(properties.streams.num(2), properties.hint.inference_precision(Type.f32)), + "GPU": make_dict(properties.streams.num(1), properties.hint.inference_precision(Type.f16))}, {"CPU": {"INFERENCE_PRECISION_HINT": Type.f32, "NUM_STREAMS": properties.streams.Num(2)}, "GPU": {"INFERENCE_PRECISION_HINT": Type.f16, "NUM_STREAMS": properties.streams.Num(1)}}) @@ -420,7 +419,7 @@ def test_single_property_setting(device): properties.cache_dir("./"), properties.inference_num_threads(9), properties.affinity(properties.Affinity.NONE), - properties.inference_precision(Type.f32), + properties.hint.inference_precision(Type.f32), properties.hint.performance_mode(properties.hint.PerformanceMode.LATENCY), properties.hint.scheduling_core_type(properties.hint.SchedulingCoreType.PCORE_ONLY), properties.hint.use_hyper_threading(True), @@ -434,7 +433,7 @@ def test_single_property_setting(device): properties.cache_dir(): "./", properties.inference_num_threads(): 9, properties.affinity(): 
properties.Affinity.NONE, - properties.inference_precision(): Type.f32, + properties.hint.inference_precision(): Type.f32, properties.hint.performance_mode(): properties.hint.PerformanceMode.LATENCY, properties.hint.scheduling_core_type(): properties.hint.SchedulingCoreType.PCORE_ONLY, properties.hint.use_hyper_threading(): True, diff --git a/src/inference/include/openvino/runtime/properties.hpp b/src/inference/include/openvino/runtime/properties.hpp index bff6714cbd5..497e8a8a322 100644 --- a/src/inference/include/openvino/runtime/properties.hpp +++ b/src/inference/include/openvino/runtime/properties.hpp @@ -233,22 +233,16 @@ static constexpr Property model_name{"NETWO static constexpr Property optimal_number_of_infer_requests{ "OPTIMAL_NUMBER_OF_INFER_REQUESTS"}; -/** - * @brief Hint for device to use specified precision for inference - * @ingroup ov_runtime_cpp_prop_api - */ -static constexpr Property inference_precision{"INFERENCE_PRECISION_HINT"}; - /** * @brief Namespace with hint properties */ namespace hint { /** - * @brief An alias for inference_precision property for backward compatibility + * @brief Hint for device to use specified precision for inference * @ingroup ov_runtime_cpp_prop_api */ -using ov::inference_precision; +static constexpr Property inference_precision{"INFERENCE_PRECISION_HINT"}; /** * @brief Enum to define possible priorities hints @@ -271,7 +265,7 @@ inline std::ostream& operator<<(std::ostream& os, const Priority& priority) { case Priority::HIGH: return os << "HIGH"; default: - OPENVINO_THROW("Unsupported performance measure hint"); + OPENVINO_THROW("Unsupported model priority value"); } } diff --git a/src/plugins/intel_cpu/src/config.cpp b/src/plugins/intel_cpu/src/config.cpp index 8407639d873..cc0f8d621fc 100644 --- a/src/plugins/intel_cpu/src/config.cpp +++ b/src/plugins/intel_cpu/src/config.cpp @@ -176,7 +176,7 @@ void Config::readProperties(const std::map &prop) { if (!device_id.empty()) { IE_THROW() << "CPU plugin supports only '' as device id"; } - } else if (key == ov::inference_precision.name()) { + } else if (key == ov::hint::inference_precision.name()) { if (val == "bf16") { if (dnnl::impl::cpu::x64::mayiuse(dnnl::impl::cpu::x64::avx512_core)) { enforceBF16 = true; @@ -186,7 +186,7 @@ void Config::readProperties(const std::map &prop) { } else if (val == "f32") { enforceBF16 = false; } else { - IE_THROW() << "Wrong value for property key " << ov::inference_precision.name() + IE_THROW() << "Wrong value for property key " << ov::hint::inference_precision.name() << ". 
Supported values: bf16, f32"; } } else if (PluginConfigInternalParams::KEY_CPU_RUNTIME_CACHE_CAPACITY == key) { diff --git a/src/plugins/intel_cpu/src/exec_network.cpp b/src/plugins/intel_cpu/src/exec_network.cpp index 3617829e86b..ed9f4e01e79 100644 --- a/src/plugins/intel_cpu/src/exec_network.cpp +++ b/src/plugins/intel_cpu/src/exec_network.cpp @@ -309,7 +309,7 @@ InferenceEngine::Parameter ExecNetwork::GetMetric(const std::string &name) const RO_property(ov::affinity.name()), RO_property(ov::inference_num_threads.name()), RO_property(ov::enable_profiling.name()), - RO_property(ov::inference_precision.name()), + RO_property(ov::hint::inference_precision.name()), RO_property(ov::hint::performance_mode.name()), RO_property(ov::hint::num_requests.name()), RO_property(ov::hint::scheduling_core_type.name()), @@ -347,10 +347,10 @@ InferenceEngine::Parameter ExecNetwork::GetMetric(const std::string &name) const } else if (name == ov::enable_profiling.name()) { const bool perfCount = config.collectPerfCounters; return decltype(ov::enable_profiling)::value_type(perfCount); - } else if (name == ov::inference_precision) { + } else if (name == ov::hint::inference_precision) { const auto enforceBF16 = config.enforceBF16; const auto inference_precision = enforceBF16 ? ov::element::bf16 : ov::element::f32; - return decltype(ov::inference_precision)::value_type(inference_precision); + return decltype(ov::hint::inference_precision)::value_type(inference_precision); } else if (name == ov::hint::performance_mode) { const auto perfHint = ov::util::from_string(config.perfHintsConfig.ovPerfHint, ov::hint::performance_mode); return perfHint; diff --git a/src/plugins/intel_cpu/src/plugin.cpp b/src/plugins/intel_cpu/src/plugin.cpp index 1cb154e5c48..9b0a3705bcc 100644 --- a/src/plugins/intel_cpu/src/plugin.cpp +++ b/src/plugins/intel_cpu/src/plugin.cpp @@ -577,10 +577,10 @@ Parameter Engine::GetConfig(const std::string& name, const std::map& config) { } } else if (key == ov::hint::performance_mode) { performance_mode = ov::util::from_string(value, ov::hint::performance_mode); - } else if (key == ov::inference_precision) { + } else if (key == ov::hint::inference_precision) { inference_precision = ov::util::from_string(value); if ((inference_precision != ov::element::i8) && (inference_precision != ov::element::i16)) { THROW_GNA_EXCEPTION << "Unsupported precision of GNA hardware, should be I16 or I8, but was: " << value; @@ -187,7 +187,7 @@ void Config::UpdateFromMap(const std::map& config) { << value; } // Update gnaPrecision basing on execution_mode only if inference_precision is not set - if (config.count(ov::inference_precision.name()) == 0) { + if (config.count(ov::hint::inference_precision.name()) == 0) { gnaPrecision = execution_mode == ov::hint::ExecutionMode::PERFORMANCE ? InferenceEngine::Precision::I8 : InferenceEngine::Precision::I16; } @@ -320,7 +320,7 @@ void Config::AdjustKeyMapValues() { gnaFlags.exclusive_async_requests ? 
PluginConfigParams::YES : PluginConfigParams::NO; keyConfigMap[ov::hint::performance_mode.name()] = ov::util::to_string(performance_mode); if (inference_precision != ov::element::undefined) { - keyConfigMap[ov::inference_precision.name()] = ov::util::to_string(inference_precision); + keyConfigMap[ov::hint::inference_precision.name()] = ov::util::to_string(inference_precision); } else { keyConfigMap[GNA_CONFIG_KEY(PRECISION)] = gnaPrecision.name(); } @@ -355,7 +355,7 @@ Parameter Config::GetParameter(const std::string& name) const { return DeviceToHwGeneration(target->get_user_set_compile_target()); } else if (name == ov::hint::performance_mode) { return performance_mode; - } else if (name == ov::inference_precision) { + } else if (name == ov::hint::inference_precision) { return inference_precision; } else { auto result = keyConfigMap.find(name); @@ -375,7 +375,7 @@ const Parameter Config::GetImpactingModelCompilationProperties(bool compiled) { {ov::intel_gna::compile_target.name(), model_mutability}, {ov::intel_gna::pwl_design_algorithm.name(), model_mutability}, {ov::intel_gna::pwl_max_error_percent.name(), model_mutability}, - {ov::inference_precision.name(), model_mutability}, + {ov::hint::inference_precision.name(), model_mutability}, {ov::hint::execution_mode.name(), model_mutability}, {ov::hint::num_requests.name(), model_mutability}, }; diff --git a/src/plugins/intel_gna/tests/functional/shared_tests_instances/behavior/ov_executable_network/get_metric.cpp b/src/plugins/intel_gna/tests/functional/shared_tests_instances/behavior/ov_executable_network/get_metric.cpp index f57484850d3..72b43943020 100644 --- a/src/plugins/intel_gna/tests/functional/shared_tests_instances/behavior/ov_executable_network/get_metric.cpp +++ b/src/plugins/intel_gna/tests/functional/shared_tests_instances/behavior/ov_executable_network/get_metric.cpp @@ -193,7 +193,7 @@ INSTANTIATE_TEST_SUITE_P( ::testing::Combine( ::testing::Values("GNA"), ::testing::Values(ov::intel_gna::scale_factors_per_input(std::map{{"0", 1.0f}}), - ov::inference_precision(ngraph::element::i8), + ov::hint::inference_precision(ngraph::element::i8), ov::hint::num_requests(2), ov::intel_gna::pwl_design_algorithm(ov::intel_gna::PWLDesignAlgorithm::UNIFORM_DISTRIBUTION), ov::intel_gna::pwl_max_error_percent(0.2), @@ -221,8 +221,8 @@ INSTANTIATE_TEST_SUITE_P( ov::intel_gna::execution_mode(ov::intel_gna::ExecutionMode::SW_FP32), ov::intel_gna::execution_mode(ov::intel_gna::ExecutionMode::AUTO), ov::intel_gna::scale_factors_per_input(std::map{{"input", 1.0f}}), - ov::inference_precision(ov::element::i8), - ov::inference_precision(ov::element::i16), + ov::hint::inference_precision(ov::element::i8), + ov::hint::inference_precision(ov::element::i16), ov::hint::performance_mode(ov::hint::PerformanceMode::LATENCY), ov::hint::performance_mode(ov::hint::PerformanceMode::THROUGHPUT), ov::hint::performance_mode(ov::hint::PerformanceMode::UNDEFINED), diff --git a/src/plugins/intel_gna/tests/functional/shared_tests_instances/behavior/ov_plugin/core_integration.cpp b/src/plugins/intel_gna/tests/functional/shared_tests_instances/behavior/ov_plugin/core_integration.cpp index 1aa99fe9df9..2e644951b56 100644 --- a/src/plugins/intel_gna/tests/functional/shared_tests_instances/behavior/ov_plugin/core_integration.cpp +++ b/src/plugins/intel_gna/tests/functional/shared_tests_instances/behavior/ov_plugin/core_integration.cpp @@ -116,11 +116,11 @@ TEST(OVClassBasicTest, smoke_SetConfigAfterCreatedPrecisionHint) { ov::Core core; ov::element::Type precision; - 
OV_ASSERT_NO_THROW(precision = core.get_property("GNA", ov::inference_precision)); + OV_ASSERT_NO_THROW(precision = core.get_property("GNA", ov::hint::inference_precision)); ASSERT_EQ(ov::element::undefined, precision); - OV_ASSERT_NO_THROW(core.set_property("GNA", ov::inference_precision(ov::element::i8))); - OV_ASSERT_NO_THROW(precision = core.get_property("GNA", ov::inference_precision)); + OV_ASSERT_NO_THROW(core.set_property("GNA", ov::hint::inference_precision(ov::element::i8))); + OV_ASSERT_NO_THROW(precision = core.get_property("GNA", ov::hint::inference_precision)); ASSERT_EQ(ov::element::i8, precision); OPENVINO_SUPPRESS_DEPRECATED_START @@ -128,23 +128,23 @@ TEST(OVClassBasicTest, smoke_SetConfigAfterCreatedPrecisionHint) { OV_ASSERT_NO_THROW(precision = core.get_property("GNA", ov::hint::inference_precision)); OPENVINO_SUPPRESS_DEPRECATED_END - OV_ASSERT_NO_THROW(core.set_property("GNA", ov::inference_precision(ov::element::i16))); - OV_ASSERT_NO_THROW(precision = core.get_property("GNA", ov::inference_precision)); + OV_ASSERT_NO_THROW(core.set_property("GNA", ov::hint::inference_precision(ov::element::i16))); + OV_ASSERT_NO_THROW(precision = core.get_property("GNA", ov::hint::inference_precision)); ASSERT_EQ(ov::element::i16, precision); - OV_ASSERT_NO_THROW(core.set_property("GNA", {{ov::inference_precision.name(), "I8"}})); - OV_ASSERT_NO_THROW(precision = core.get_property("GNA", ov::inference_precision)); + OV_ASSERT_NO_THROW(core.set_property("GNA", {{ov::hint::inference_precision.name(), "I8"}})); + OV_ASSERT_NO_THROW(precision = core.get_property("GNA", ov::hint::inference_precision)); ASSERT_EQ(ov::element::i8, precision); - OV_ASSERT_NO_THROW(core.set_property("GNA", {{ov::inference_precision.name(), "I16"}})); - OV_ASSERT_NO_THROW(precision = core.get_property("GNA", ov::inference_precision)); + OV_ASSERT_NO_THROW(core.set_property("GNA", {{ov::hint::inference_precision.name(), "I16"}})); + OV_ASSERT_NO_THROW(precision = core.get_property("GNA", ov::hint::inference_precision)); ASSERT_EQ(ov::element::i16, precision); OV_ASSERT_NO_THROW( - core.set_property("GNA", {ov::inference_precision(ov::element::i8), {GNA_CONFIG_KEY(PRECISION), "I16"}})); - ASSERT_THROW(core.set_property("GNA", ov::inference_precision(ov::element::i32)), ov::Exception); - ASSERT_THROW(core.set_property("GNA", ov::inference_precision(ov::element::undefined)), ov::Exception); - ASSERT_THROW(core.set_property("GNA", {{ov::inference_precision.name(), "ABC"}}), ov::Exception); + core.set_property("GNA", {ov::hint::inference_precision(ov::element::i8), {GNA_CONFIG_KEY(PRECISION), "I16"}})); + ASSERT_THROW(core.set_property("GNA", ov::hint::inference_precision(ov::element::i32)), ov::Exception); + ASSERT_THROW(core.set_property("GNA", ov::hint::inference_precision(ov::element::undefined)), ov::Exception); + ASSERT_THROW(core.set_property("GNA", {{ov::hint::inference_precision.name(), "ABC"}}), ov::Exception); } TEST(OVClassBasicTest, smoke_SetConfigAfterCreatedPerformanceHint) { diff --git a/src/plugins/intel_gna/tests/unit/gna_export_import_test.cpp b/src/plugins/intel_gna/tests/unit/gna_export_import_test.cpp index 0856912be3e..707c60a0591 100644 --- a/src/plugins/intel_gna/tests/unit/gna_export_import_test.cpp +++ b/src/plugins/intel_gna/tests/unit/gna_export_import_test.cpp @@ -169,7 +169,7 @@ protected: TEST_F(GNAExportImportTest, ExportImportI16) { const ov::AnyMap gna_config = {ov::intel_gna::execution_mode(ov::intel_gna::ExecutionMode::SW_EXACT), - 
ov::inference_precision(ngraph::element::i16)}; + ov::hint::inference_precision(ngraph::element::i16)}; exported_file_name = "export_test.bin"; ExportModel(exported_file_name, gna_config); ImportModel(exported_file_name, gna_config); @@ -177,7 +177,7 @@ TEST_F(GNAExportImportTest, ExportImportI16) { TEST_F(GNAExportImportTest, ExportImportI8) { const ov::AnyMap gna_config = {ov::intel_gna::execution_mode(ov::intel_gna::ExecutionMode::SW_EXACT), - ov::inference_precision(ngraph::element::i8)}; + ov::hint::inference_precision(ngraph::element::i8)}; exported_file_name = "export_test.bin"; ExportModel(exported_file_name, gna_config); ImportModel(exported_file_name, gna_config); diff --git a/src/plugins/intel_gna/tests/unit/gna_hw_precision_test.cpp b/src/plugins/intel_gna/tests/unit/gna_hw_precision_test.cpp index ce771167da4..cd7d27997b4 100644 --- a/src/plugins/intel_gna/tests/unit/gna_hw_precision_test.cpp +++ b/src/plugins/intel_gna/tests/unit/gna_hw_precision_test.cpp @@ -85,13 +85,13 @@ TEST_F(GNAHwPrecisionTest, GNAHwPrecisionTestDefault) { TEST_F(GNAHwPrecisionTest, GNAHwPrecisionTestI16) { Run({ov::intel_gna::execution_mode(ov::intel_gna::ExecutionMode::SW_EXACT), - ov::inference_precision(ngraph::element::i16)}); + ov::hint::inference_precision(ngraph::element::i16)}); compare(ngraph::element::i16, ngraph::element::i32, sizeof(int16_t), sizeof(uint32_t)); } TEST_F(GNAHwPrecisionTest, GNAHwPrecisionTestI8) { Run({ov::intel_gna::execution_mode(ov::intel_gna::ExecutionMode::SW_EXACT), - ov::inference_precision(ngraph::element::i8)}); + ov::hint::inference_precision(ngraph::element::i8)}); compare(ngraph::element::i16, ngraph::element::i32, sizeof(int8_t), @@ -100,7 +100,7 @@ TEST_F(GNAHwPrecisionTest, GNAHwPrecisionTestI8) { TEST_F(GNAHwPrecisionTest, GNAHwPrecisionTestI8LP) { Run({ov::intel_gna::execution_mode(ov::intel_gna::ExecutionMode::SW_EXACT), - ov::inference_precision(ngraph::element::i8)}, + ov::hint::inference_precision(ngraph::element::i8)}, true); compare(ngraph::element::i8, ngraph::element::i32, sizeof(int8_t), sizeof(int8_t)); } diff --git a/src/plugins/intel_gna/tests/unit/gna_input_preproc_test.cpp b/src/plugins/intel_gna/tests/unit/gna_input_preproc_test.cpp index a06b7b12ff6..e9b0edb4637 100644 --- a/src/plugins/intel_gna/tests/unit/gna_input_preproc_test.cpp +++ b/src/plugins/intel_gna/tests/unit/gna_input_preproc_test.cpp @@ -122,13 +122,13 @@ INSTANTIATE_TEST_SUITE_P( // gna config map {ov::intel_gna::execution_mode(ov::intel_gna::ExecutionMode::SW_EXACT), ov::intel_gna::scale_factors_per_input(std::map{{"0", 1.0f}}), - ov::inference_precision(ngraph::element::i16)}, + ov::hint::inference_precision(ngraph::element::i16)}, {ov::intel_gna::execution_mode(ov::intel_gna::ExecutionMode::SW_EXACT), ov::intel_gna::scale_factors_per_input(std::map{{"0", 8.0f}}), - ov::inference_precision(ngraph::element::i16)}, + ov::hint::inference_precision(ngraph::element::i16)}, {ov::intel_gna::execution_mode(ov::intel_gna::ExecutionMode::SW_EXACT), ov::intel_gna::scale_factors_per_input(std::map{{"0", 0.125f}}), - ov::inference_precision(ngraph::element::i16)}, + ov::hint::inference_precision(ngraph::element::i16)}, }), ::testing::Values(true), // gna device ::testing::Values(false), // use low precision @@ -148,13 +148,13 @@ INSTANTIATE_TEST_SUITE_P( // gna config map {ov::intel_gna::execution_mode(ov::intel_gna::ExecutionMode::SW_EXACT), ov::intel_gna::scale_factors_per_input(std::map{{"0", 1.0f}}), - ov::inference_precision(ngraph::element::i8)}, + 
ov::hint::inference_precision(ngraph::element::i8)}, {ov::intel_gna::execution_mode(ov::intel_gna::ExecutionMode::SW_EXACT), ov::intel_gna::scale_factors_per_input(std::map{{"0", 4.0f}}), - ov::inference_precision(ngraph::element::i8)}, + ov::hint::inference_precision(ngraph::element::i8)}, {ov::intel_gna::execution_mode(ov::intel_gna::ExecutionMode::SW_EXACT), ov::intel_gna::scale_factors_per_input(std::map{{"0", 0.25f}}), - ov::inference_precision(ngraph::element::i8)}, + ov::hint::inference_precision(ngraph::element::i8)}, }), ::testing::Values(true), // gna device ::testing::Values(true), // use low precision @@ -200,13 +200,13 @@ INSTANTIATE_TEST_SUITE_P( // gna config map {ov::intel_gna::execution_mode(ov::intel_gna::ExecutionMode::SW_EXACT), ov::intel_gna::scale_factors_per_input(std::map{{"0", 1.0f}}), - ov::inference_precision(ngraph::element::i16)}, + ov::hint::inference_precision(ngraph::element::i16)}, {ov::intel_gna::execution_mode(ov::intel_gna::ExecutionMode::SW_EXACT), ov::intel_gna::scale_factors_per_input(std::map{{"0", 4.0f}}), - ov::inference_precision(ngraph::element::i16)}, + ov::hint::inference_precision(ngraph::element::i16)}, {ov::intel_gna::execution_mode(ov::intel_gna::ExecutionMode::SW_EXACT), ov::intel_gna::scale_factors_per_input(std::map{{"0", 0.25f}}), - ov::inference_precision(ngraph::element::i16)}, + ov::hint::inference_precision(ngraph::element::i16)}, }), ::testing::Values(true), // gna device ::testing::Values(false), // use low precision @@ -227,13 +227,13 @@ INSTANTIATE_TEST_SUITE_P( // gna config map, {ov::intel_gna::execution_mode(ov::intel_gna::ExecutionMode::SW_EXACT), ov::intel_gna::scale_factors_per_input(std::map{{"0", 1.0f}}), - ov::inference_precision(ngraph::element::i8)}, + ov::hint::inference_precision(ngraph::element::i8)}, {ov::intel_gna::execution_mode(ov::intel_gna::ExecutionMode::SW_EXACT), ov::intel_gna::scale_factors_per_input(std::map{{"0", 10.0f}}), - ov::inference_precision(ngraph::element::i8)}, + ov::hint::inference_precision(ngraph::element::i8)}, {ov::intel_gna::execution_mode(ov::intel_gna::ExecutionMode::SW_EXACT), ov::intel_gna::scale_factors_per_input(std::map{{"0", 20.0f}}), - ov::inference_precision(ngraph::element::i8)}, + ov::hint::inference_precision(ngraph::element::i8)}, }), ::testing::Values(true), // gna device ::testing::Values(true), // use low precision @@ -254,10 +254,10 @@ INSTANTIATE_TEST_SUITE_P( // gna config map {ov::intel_gna::execution_mode(ov::intel_gna::ExecutionMode::SW_EXACT), ov::intel_gna::scale_factors_per_input(std::map{{"0", 1.0f}}), - ov::inference_precision(ngraph::element::i16)}, + ov::hint::inference_precision(ngraph::element::i16)}, {ov::intel_gna::execution_mode(ov::intel_gna::ExecutionMode::SW_EXACT), ov::intel_gna::scale_factors_per_input(std::map{{"0", 8.0f}}), - ov::inference_precision(ngraph::element::i16)}, + ov::hint::inference_precision(ngraph::element::i16)}, }), ::testing::Values(true), // gna device ::testing::Values(false), // use low precision @@ -278,10 +278,10 @@ INSTANTIATE_TEST_SUITE_P( // gna config map {ov::intel_gna::execution_mode(ov::intel_gna::ExecutionMode::SW_EXACT), ov::intel_gna::scale_factors_per_input(std::map{{"0", 1.0f}}), - ov::inference_precision(ngraph::element::i8)}, + ov::hint::inference_precision(ngraph::element::i8)}, {ov::intel_gna::execution_mode(ov::intel_gna::ExecutionMode::SW_EXACT), ov::intel_gna::scale_factors_per_input(std::map{{"0", 4.0f}}), - ov::inference_precision(ngraph::element::i8)}, + 
ov::hint::inference_precision(ngraph::element::i8)}, }), ::testing::Values(true), // gna device ::testing::Values(true), // use low precision diff --git a/src/plugins/intel_gna/tests/unit/gna_plugin_config_test.cpp b/src/plugins/intel_gna/tests/unit/gna_plugin_config_test.cpp index c3a9aeaf7d8..62a1cba9f22 100644 --- a/src/plugins/intel_gna/tests/unit/gna_plugin_config_test.cpp +++ b/src/plugins/intel_gna/tests/unit/gna_plugin_config_test.cpp @@ -247,9 +247,9 @@ TEST_F(GNAPluginConfigTest, GnaConfigExecutionModeUpdatesGnaPrecision) { } TEST_F(GNAPluginConfigTest, GnaConfigInferencePrecisionUpdatesGnaPrecision) { - SetAndCompare(ov::inference_precision.name(), ov::util::to_string(ov::element::i8)); + SetAndCompare(ov::hint::inference_precision.name(), ov::util::to_string(ov::element::i8)); EXPECT_EQ(config.gnaPrecision, InferenceEngine::Precision::I8); - SetAndCompare(ov::inference_precision.name(), ov::util::to_string(ov::element::i16)); + SetAndCompare(ov::hint::inference_precision.name(), ov::util::to_string(ov::element::i16)); EXPECT_EQ(config.gnaPrecision, InferenceEngine::Precision::I16); } @@ -257,7 +257,7 @@ TEST_F(GNAPluginConfigTest, GnaConfigInferencePrecisionHasHigherPriorityI16) { SetAndCompare(GNA_CONFIG_KEY(PRECISION), Precision(Precision::I8).name()); SetAndCompare(ov::hint::execution_mode.name(), ov::util::to_string(ov::hint::ExecutionMode::PERFORMANCE)); - SetAndCompare(ov::inference_precision.name(), ov::util::to_string(ov::element::i16)); + SetAndCompare(ov::hint::inference_precision.name(), ov::util::to_string(ov::element::i16)); EXPECT_EQ(config.gnaPrecision, InferenceEngine::Precision::I16); } @@ -265,6 +265,6 @@ TEST_F(GNAPluginConfigTest, GnaConfigInferencePrecisionHasHigherPriorityI8) { SetAndCompare(GNA_CONFIG_KEY(PRECISION), Precision(Precision::I16).name()); SetAndCompare(ov::hint::execution_mode.name(), ov::util::to_string(ov::hint::ExecutionMode::ACCURACY)); - SetAndCompare(ov::inference_precision.name(), ov::util::to_string(ov::element::i8)); + SetAndCompare(ov::hint::inference_precision.name(), ov::util::to_string(ov::element::i8)); EXPECT_EQ(config.gnaPrecision, InferenceEngine::Precision::I8); } diff --git a/src/plugins/intel_gpu/src/plugin/compiled_model.cpp b/src/plugins/intel_gpu/src/plugin/compiled_model.cpp index a45891801ea..5d2e7e22749 100644 --- a/src/plugins/intel_gpu/src/plugin/compiled_model.cpp +++ b/src/plugins/intel_gpu/src/plugin/compiled_model.cpp @@ -325,7 +325,7 @@ InferenceEngine::Parameter CompiledModel::GetMetric(const std::string &name) con ov::PropertyName{ov::compilation_num_threads.name(), PropertyMutability::RO}, ov::PropertyName{ov::num_streams.name(), PropertyMutability::RO}, ov::PropertyName{ov::hint::num_requests.name(), PropertyMutability::RO}, - ov::PropertyName{ov::inference_precision.name(), PropertyMutability::RO}, + ov::PropertyName{ov::hint::inference_precision.name(), PropertyMutability::RO}, ov::PropertyName{ov::device::id.name(), PropertyMutability::RO}, ov::PropertyName{ov::execution_devices.name(), PropertyMutability::RO} }; diff --git a/src/plugins/intel_gpu/src/plugin/legacy_api_helper.cpp b/src/plugins/intel_gpu/src/plugin/legacy_api_helper.cpp index 56c92bc84bc..4945e76716b 100644 --- a/src/plugins/intel_gpu/src/plugin/legacy_api_helper.cpp +++ b/src/plugins/intel_gpu/src/plugin/legacy_api_helper.cpp @@ -14,7 +14,7 @@ bool LegacyAPIHelper::is_new_api_property(const std::pair& static const std::vector new_properties_list = { ov::intel_gpu::hint::queue_priority.name(), ov::intel_gpu::hint::queue_throttle.name(), 
- ov::inference_precision.name(), + ov::hint::inference_precision.name(), ov::compilation_num_threads.name(), ov::num_streams.name(), }; diff --git a/src/plugins/intel_gpu/src/plugin/plugin.cpp b/src/plugins/intel_gpu/src/plugin/plugin.cpp index ddf75aefaaf..d652fa9f354 100644 --- a/src/plugins/intel_gpu/src/plugin/plugin.cpp +++ b/src/plugins/intel_gpu/src/plugin/plugin.cpp @@ -671,7 +671,7 @@ Parameter Plugin::GetMetric(const std::string& name, const std::map Plugin::get_supported_properties() const { ov::PropertyName{ov::compilation_num_threads.name(), PropertyMutability::RW}, ov::PropertyName{ov::num_streams.name(), PropertyMutability::RW}, ov::PropertyName{ov::hint::num_requests.name(), PropertyMutability::RW}, - ov::PropertyName{ov::inference_precision.name(), PropertyMutability::RW}, + ov::PropertyName{ov::hint::inference_precision.name(), PropertyMutability::RW}, ov::PropertyName{ov::device::id.name(), PropertyMutability::RW}, }; diff --git a/src/plugins/intel_gpu/src/runtime/execution_config.cpp b/src/plugins/intel_gpu/src/runtime/execution_config.cpp index d73dfda4ec9..7abc6e759b5 100644 --- a/src/plugins/intel_gpu/src/runtime/execution_config.cpp +++ b/src/plugins/intel_gpu/src/runtime/execution_config.cpp @@ -40,7 +40,7 @@ void ExecutionConfig::set_default() { std::make_tuple(ov::cache_dir, ""), std::make_tuple(ov::num_streams, 1), std::make_tuple(ov::compilation_num_threads, std::max(1, static_cast(std::thread::hardware_concurrency()))), - std::make_tuple(ov::inference_precision, ov::element::f16, InferencePrecisionValidator()), + std::make_tuple(ov::hint::inference_precision, ov::element::f16, InferencePrecisionValidator()), std::make_tuple(ov::hint::model_priority, ov::hint::Priority::MEDIUM), std::make_tuple(ov::hint::performance_mode, ov::hint::PerformanceMode::LATENCY, PerformanceModeValidator()), std::make_tuple(ov::hint::execution_mode, ov::hint::ExecutionMode::PERFORMANCE), @@ -123,14 +123,14 @@ Any ExecutionConfig::get_property(const std::string& name) const { void ExecutionConfig::apply_execution_hints(const cldnn::device_info& info) { if (is_set_by_user(ov::hint::execution_mode)) { const auto mode = get_property(ov::hint::execution_mode); - if (!is_set_by_user(ov::inference_precision)) { + if (!is_set_by_user(ov::hint::inference_precision)) { if (mode == ov::hint::ExecutionMode::ACCURACY) { - set_property(ov::inference_precision(ov::element::f32)); + set_property(ov::hint::inference_precision(ov::element::f32)); } else if (mode == ov::hint::ExecutionMode::PERFORMANCE) { if (info.supports_fp16) - set_property(ov::inference_precision(ov::element::f16)); + set_property(ov::hint::inference_precision(ov::element::f16)); else - set_property(ov::inference_precision(ov::element::f32)); + set_property(ov::hint::inference_precision(ov::element::f32)); } } } diff --git a/src/tests/functional/plugin/gpu/behavior/inference_precision.cpp b/src/tests/functional/plugin/gpu/behavior/inference_precision.cpp index 215ea840516..2032b71cebe 100644 --- a/src/tests/functional/plugin/gpu/behavior/inference_precision.cpp +++ b/src/tests/functional/plugin/gpu/behavior/inference_precision.cpp @@ -39,7 +39,7 @@ TEST_P(InferencePrecisionTests, smoke_canSetInferencePrecisionAndInfer) { std::tie(model_precision, inference_precision) = GetParam(); auto function = ov::test::behavior::getDefaultNGraphFunctionForTheDevice(CommonTestUtils::DEVICE_GPU, {1, 1, 32, 32}, model_precision); ov::CompiledModel compiled_model; - OV_ASSERT_NO_THROW(compiled_model = core->compile_model(function, 
CommonTestUtils::DEVICE_GPU, ov::inference_precision(inference_precision))); + OV_ASSERT_NO_THROW(compiled_model = core->compile_model(function, CommonTestUtils::DEVICE_GPU, ov::hint::inference_precision(inference_precision))); auto req = compiled_model.create_infer_request(); OV_ASSERT_NO_THROW(req.infer()); } @@ -67,7 +67,7 @@ TEST(ExecutionModeTest, SetCompileGetInferPrecisionAndExecMode) { core.set_property(CommonTestUtils::DEVICE_GPU, ov::hint::execution_mode(ov::hint::ExecutionMode::PERFORMANCE)); auto model = ngraph::builder::subgraph::makeConvPoolRelu(); { - auto compiled_model = core.compile_model(model, CommonTestUtils::DEVICE_GPU, ov::inference_precision(ov::element::f32)); + auto compiled_model = core.compile_model(model, CommonTestUtils::DEVICE_GPU, ov::hint::inference_precision(ov::element::f32)); ASSERT_EQ(ov::hint::ExecutionMode::PERFORMANCE, compiled_model.get_property(ov::hint::execution_mode)); ASSERT_EQ(ov::element::f32, compiled_model.get_property(ov::hint::inference_precision)); } diff --git a/src/tests/functional/plugin/gpu/concurrency/gpu_concurrency_tests.cpp b/src/tests/functional/plugin/gpu/concurrency/gpu_concurrency_tests.cpp index eda756c53eb..846f5b17731 100644 --- a/src/tests/functional/plugin/gpu/concurrency/gpu_concurrency_tests.cpp +++ b/src/tests/functional/plugin/gpu/concurrency/gpu_concurrency_tests.cpp @@ -55,7 +55,7 @@ TEST_P(OVConcurrencyTest, canInferTwoExecNets) { auto fn = fn_ptrs[i]; auto exec_net = ie.compile_model(fn_ptrs[i], CommonTestUtils::DEVICE_GPU, - ov::num_streams(num_streams), ov::inference_precision(ov::element::f32)); + ov::num_streams(num_streams), ov::hint::inference_precision(ov::element::f32)); auto input = fn_ptrs[i]->get_parameters().at(0); auto output = fn_ptrs[i]->get_results().at(0); @@ -115,7 +115,7 @@ TEST(canSwapTensorsBetweenInferRequests, inputs) { auto fn = ngraph::builder::subgraph::makeSplitMultiConvConcat(); auto ie = ov::Core(); - auto compiled_model = ie.compile_model(fn, CommonTestUtils::DEVICE_GPU, ov::inference_precision(ov::element::f32)); + auto compiled_model = ie.compile_model(fn, CommonTestUtils::DEVICE_GPU, ov::hint::inference_precision(ov::element::f32)); const int infer_requests_num = 2; ov::InferRequest infer_request1 = compiled_model.create_infer_request(); @@ -193,7 +193,7 @@ TEST(smoke_InferRequestDeviceMemoryAllocation, usmHostIsNotChanged) { auto fn = ngraph::builder::subgraph::makeDetectionOutput(ngraph::element::Type_t::f32); auto ie = ov::Core(); - auto compiled_model = ie.compile_model(fn, CommonTestUtils::DEVICE_GPU, ov::inference_precision(ov::element::f32)); + auto compiled_model = ie.compile_model(fn, CommonTestUtils::DEVICE_GPU, ov::hint::inference_precision(ov::element::f32)); ov::InferRequest infer_request1 = compiled_model.create_infer_request(); ov::InferRequest infer_request2 = compiled_model.create_infer_request(); @@ -232,7 +232,7 @@ TEST(smoke_InferRequestDeviceMemoryAllocation, canSetSystemHostTensor) { auto fn = ngraph::builder::subgraph::makeDetectionOutput(ngraph::element::Type_t::f32); auto ie = ov::Core(); - auto compiled_model = ie.compile_model(fn, CommonTestUtils::DEVICE_GPU, ov::inference_precision(ov::element::f32)); + auto compiled_model = ie.compile_model(fn, CommonTestUtils::DEVICE_GPU, ov::hint::inference_precision(ov::element::f32)); ov::InferRequest infer_request1 = compiled_model.create_infer_request(); ov::InferRequest infer_request2 = compiled_model.create_infer_request(); @@ -258,7 +258,7 @@ TEST(canSwapTensorsBetweenInferRequests, outputs) { auto fn = 
ngraph::builder::subgraph::makeSplitMultiConvConcat(); auto ie = ov::Core(); - auto compiled_model = ie.compile_model(fn, CommonTestUtils::DEVICE_GPU, ov::inference_precision(ov::element::f32)); + auto compiled_model = ie.compile_model(fn, CommonTestUtils::DEVICE_GPU, ov::hint::inference_precision(ov::element::f32)); const int infer_requests_num = 2; ov::InferRequest infer_request1 = compiled_model.create_infer_request(); diff --git a/src/tests/functional/plugin/gpu/remote_blob_tests/cldnn_remote_blob_tests.cpp b/src/tests/functional/plugin/gpu/remote_blob_tests/cldnn_remote_blob_tests.cpp index 96904a9eead..ecf8575d4fb 100644 --- a/src/tests/functional/plugin/gpu/remote_blob_tests/cldnn_remote_blob_tests.cpp +++ b/src/tests/functional/plugin/gpu/remote_blob_tests/cldnn_remote_blob_tests.cpp @@ -40,7 +40,7 @@ public: {CONFIG_KEY(AUTO_BATCH_TIMEOUT) , "0"}, }; } - config.insert({ov::inference_precision.name(), "f32"}); + config.insert({ov::hint::inference_precision.name(), "f32"}); fn_ptr = ov::test::behavior::getDefaultNGraphFunctionForTheDevice(with_auto_batching ? CommonTestUtils::DEVICE_BATCH : deviceName); } static std::string getTestCaseName(const testing::TestParamInfo& obj) { @@ -230,7 +230,7 @@ TEST_P(RemoteBlob_Test, smoke_canInferOnUserContext) { auto blob = FuncTestUtils::createAndFillBlob(net.getInputsInfo().begin()->second->getTensorDesc()); auto ie = PluginCache::get().ie(); - auto exec_net_regular = ie->LoadNetwork(net, deviceName, {{ov::inference_precision.name(), "f32"}}); + auto exec_net_regular = ie->LoadNetwork(net, deviceName, {{ov::hint::inference_precision.name(), "f32"}}); // regular inference auto inf_req_regular = exec_net_regular.CreateInferRequest(); @@ -277,7 +277,7 @@ TEST_P(RemoteBlob_Test, smoke_canInferOnUserQueue_out_of_order) { auto blob = FuncTestUtils::createAndFillBlob(net.getInputsInfo().begin()->second->getTensorDesc()); auto ie = PluginCache::get().ie(); - auto exec_net_regular = ie->LoadNetwork(net, deviceName, {{ov::inference_precision.name(), "f32"}}); + auto exec_net_regular = ie->LoadNetwork(net, deviceName, {{ov::hint::inference_precision.name(), "f32"}}); // regular inference auto inf_req_regular = exec_net_regular.CreateInferRequest(); @@ -305,7 +305,7 @@ TEST_P(RemoteBlob_Test, smoke_canInferOnUserQueue_out_of_order) { // In this scenario we create shared OCL queue and run simple pre-process action and post-process action (buffer copies in both cases) // without calling thread blocks auto remote_context = make_shared_context(*ie, deviceName, ocl_instance->_queue.get()); - auto exec_net_shared = ie->LoadNetwork(net, remote_context, {{ov::inference_precision.name(), "f32"}}); + auto exec_net_shared = ie->LoadNetwork(net, remote_context, {{ov::hint::inference_precision.name(), "f32"}}); auto inf_req_shared = exec_net_shared.CreateInferRequest(); // Allocate shared buffers for input and output data which will be set to infer request @@ -375,7 +375,7 @@ TEST_P(RemoteBlob_Test, smoke_canInferOnUserQueue_in_order) { auto blob = FuncTestUtils::createAndFillBlob(net.getInputsInfo().begin()->second->getTensorDesc()); auto ie = PluginCache::get().ie(); - auto exec_net_regular = ie->LoadNetwork(net, deviceName, {{ov::inference_precision.name(), "f32"}}); + auto exec_net_regular = ie->LoadNetwork(net, deviceName, {{ov::hint::inference_precision.name(), "f32"}}); // regular inference auto inf_req_regular = exec_net_regular.CreateInferRequest(); @@ -404,7 +404,7 @@ TEST_P(RemoteBlob_Test, smoke_canInferOnUserQueue_in_order) { // In this scenario we create 
shared OCL queue and run simple pre-process action and post-process action (buffer copies in both cases) // without calling thread blocks auto remote_context = make_shared_context(*ie, deviceName, ocl_instance->_queue.get()); - auto exec_net_shared = ie->LoadNetwork(net, remote_context, {{ov::inference_precision.name(), "f32"}}); + auto exec_net_shared = ie->LoadNetwork(net, remote_context, {{ov::hint::inference_precision.name(), "f32"}}); auto inf_req_shared = exec_net_shared.CreateInferRequest(); // Allocate shared buffers for input and output data which will be set to infer request @@ -469,7 +469,7 @@ TEST_P(RemoteBlob_Test, smoke_canInferOnUserQueue_infer_call_many_times) { auto blob = FuncTestUtils::createAndFillBlob(net.getInputsInfo().begin()->second->getTensorDesc()); auto ie = PluginCache::get().ie(); - auto exec_net_regular = ie->LoadNetwork(net, deviceName, {{ov::inference_precision.name(), "f32"}}); + auto exec_net_regular = ie->LoadNetwork(net, deviceName, {{ov::hint::inference_precision.name(), "f32"}}); // regular inference auto inf_req_regular = exec_net_regular.CreateInferRequest(); @@ -498,7 +498,7 @@ TEST_P(RemoteBlob_Test, smoke_canInferOnUserQueue_infer_call_many_times) { // In this scenario we create shared OCL queue and run simple pre-process action and post-process action (buffer copies in both cases) // without calling thread blocks auto remote_context = make_shared_context(*ie, deviceName, ocl_instance->_queue.get()); - auto exec_net_shared = ie->LoadNetwork(net, remote_context, {{ov::inference_precision.name(), "f32"}}); + auto exec_net_shared = ie->LoadNetwork(net, remote_context, {{ov::hint::inference_precision.name(), "f32"}}); auto inf_req_shared = exec_net_shared.CreateInferRequest(); // Allocate shared buffers for input and output data which will be set to infer request @@ -601,7 +601,7 @@ TEST_P(BatchedBlob_Test, canInputNV12) { /* XXX: is it correct to set KEY_CLDNN_NV12_TWO_INPUTS in case of remote blob? 
*/ auto exec_net_b = ie.LoadNetwork(net_remote, CommonTestUtils::DEVICE_GPU, - { { GPUConfigParams::KEY_GPU_NV12_TWO_INPUTS, PluginConfigParams::YES}, {ov::inference_precision.name(), "f32"} }); + { { GPUConfigParams::KEY_GPU_NV12_TWO_INPUTS, PluginConfigParams::YES}, {ov::hint::inference_precision.name(), "f32"} }); auto inf_req_remote = exec_net_b.CreateInferRequest(); auto cldnn_context = exec_net_b.GetContext(); cl_context ctx = std::dynamic_pointer_cast(cldnn_context)->get(); @@ -670,7 +670,7 @@ TEST_P(BatchedBlob_Test, canInputNV12) { net_local.getInputsInfo().begin()->second->setPrecision(Precision::U8); net_local.getInputsInfo().begin()->second->getPreProcess().setColorFormat(ColorFormat::NV12); - auto exec_net_b1 = ie.LoadNetwork(net_local, CommonTestUtils::DEVICE_GPU, {{ov::inference_precision.name(), "f32"}}); + auto exec_net_b1 = ie.LoadNetwork(net_local, CommonTestUtils::DEVICE_GPU, {{ov::hint::inference_precision.name(), "f32"}}); auto inf_req_local = exec_net_b1.CreateInferRequest(); @@ -742,7 +742,7 @@ TEST_P(TwoNets_Test, canInferTwoExecNets) { auto exec_net = ie.LoadNetwork(net, CommonTestUtils::DEVICE_GPU, {{PluginConfigParams::KEY_GPU_THROUGHPUT_STREAMS, std::to_string(num_streams)}, - {ov::inference_precision.name(), "f32"}}); + {ov::hint::inference_precision.name(), "f32"}}); for (int j = 0; j < num_streams * num_requests; j++) { outputs.push_back(net.getOutputsInfo().begin()->first); diff --git a/src/tests/functional/plugin/gpu/shared_tests_instances/behavior/ov_plugin/core_integration.cpp b/src/tests/functional/plugin/gpu/shared_tests_instances/behavior/ov_plugin/core_integration.cpp index 7c223fc92b8..c70581d9984 100644 --- a/src/tests/functional/plugin/gpu/shared_tests_instances/behavior/ov_plugin/core_integration.cpp +++ b/src/tests/functional/plugin/gpu/shared_tests_instances/behavior/ov_plugin/core_integration.cpp @@ -350,13 +350,13 @@ TEST_P(OVClassGetPropertyTest_GPU, GetAndSetInferencePrecisionNoThrow) { auto value = ov::element::undefined; const auto expected_default_precision = ov::element::f16; - OV_ASSERT_NO_THROW(value = ie.get_property(target_device, ov::inference_precision)); + OV_ASSERT_NO_THROW(value = ie.get_property(target_device, ov::hint::inference_precision)); ASSERT_EQ(expected_default_precision, value); const auto forced_precision = ov::element::f32; - OV_ASSERT_NO_THROW(ie.set_property(target_device, ov::inference_precision(forced_precision))); - OV_ASSERT_NO_THROW(value = ie.get_property(target_device, ov::inference_precision)); + OV_ASSERT_NO_THROW(ie.set_property(target_device, ov::hint::inference_precision(forced_precision))); + OV_ASSERT_NO_THROW(value = ie.get_property(target_device, ov::hint::inference_precision)); ASSERT_EQ(value, forced_precision); OPENVINO_SUPPRESS_DEPRECATED_START @@ -728,7 +728,7 @@ auto gpuCorrectConfigsWithSecondaryProperties = []() { return std::vector{ {ov::device::properties(CommonTestUtils::DEVICE_GPU, ov::hint::execution_mode(ov::hint::ExecutionMode::PERFORMANCE), - ov::inference_precision(ov::element::f32))}, + ov::hint::inference_precision(ov::element::f32))}, {ov::device::properties(CommonTestUtils::DEVICE_GPU, ov::hint::performance_mode(ov::hint::PerformanceMode::THROUGHPUT), ov::hint::allow_auto_batching(false))}, @@ -821,7 +821,7 @@ TEST_P(OVClassGetMetricTest_CACHING_PROPERTIES, GetMetricAndPrintNoThrow) { ov::device::architecture.name(), ov::intel_gpu::execution_units_count.name(), ov::intel_gpu::driver_version.name(), - ov::inference_precision.name(), + ov::hint::inference_precision.name(), 
ov::hint::execution_mode.name(), }; diff --git a/src/tests/functional/plugin/shared/src/execution_graph_tests/normalize_l2_decomposition.cpp b/src/tests/functional/plugin/shared/src/execution_graph_tests/normalize_l2_decomposition.cpp index 5de247f6f0d..5eb9fb1b402 100644 --- a/src/tests/functional/plugin/shared/src/execution_graph_tests/normalize_l2_decomposition.cpp +++ b/src/tests/functional/plugin/shared/src/execution_graph_tests/normalize_l2_decomposition.cpp @@ -36,7 +36,7 @@ TEST_P(ExecGrapDecomposeNormalizeL2, CheckIfDecomposeAppliedForNonContiguousAxes auto core = ov::Core(); ov::AnyMap config; if (device_name == CommonTestUtils::DEVICE_GPU) - config.insert(ov::inference_precision(ov::element::f32)); + config.insert(ov::hint::inference_precision(ov::element::f32)); const auto compiled_model = core.compile_model(model, device_name, config); ASSERT_TRUE(model->get_ops().size() < compiled_model.get_runtime_model()->get_ops().size()); // decomposition applied @@ -56,7 +56,7 @@ TEST_P(ExecGrapDecomposeNormalizeL2, CheckIfDecomposeAppliedForNormalizeOverAllA auto core = ov::Core(); ov::AnyMap config; if (device_name == CommonTestUtils::DEVICE_GPU) - config.insert(ov::inference_precision(ov::element::f32)); + config.insert(ov::hint::inference_precision(ov::element::f32)); const auto compiled_model = core.compile_model(model, device_name, config); ASSERT_TRUE(model->get_ops().size() < compiled_model.get_runtime_model()->get_ops().size()); // decomposition applied @@ -76,7 +76,7 @@ TEST_P(ExecGrapDecomposeNormalizeL2, CheckIfDecomposeNotAppliedForNotSorted) { auto core = ov::Core(); ov::AnyMap config; if (device_name == CommonTestUtils::DEVICE_GPU) - config.insert(ov::inference_precision(ov::element::f32)); + config.insert(ov::hint::inference_precision(ov::element::f32)); const auto compiled_model = core.compile_model(model, device_name, config); ASSERT_TRUE(model->get_ops().size() >= compiled_model.get_runtime_model()->get_ops().size()); // decomposition not applied @@ -96,7 +96,7 @@ TEST_P(ExecGrapDecomposeNormalizeL2, CheckIfDecomposeNotAppliedForSingleAxis) { auto core = ov::Core(); ov::AnyMap config; if (device_name == CommonTestUtils::DEVICE_GPU) - config.insert(ov::inference_precision(ov::element::f32)); + config.insert(ov::hint::inference_precision(ov::element::f32)); const auto compiled_model = core.compile_model(model, device_name, config); ASSERT_TRUE(model->get_ops().size() >= compiled_model.get_runtime_model()->get_ops().size()); // decomposition not applied diff --git a/src/tests/functional/shared_test_classes/src/base/ov_subgraph.cpp b/src/tests/functional/shared_test_classes/src/base/ov_subgraph.cpp index ee6c57ca694..f4d36beefa5 100644 --- a/src/tests/functional/shared_test_classes/src/base/ov_subgraph.cpp +++ b/src/tests/functional/shared_test_classes/src/base/ov_subgraph.cpp @@ -226,7 +226,7 @@ void SubgraphBaseTest::compile_model() { break; } } - configuration.insert({ov::inference_precision.name(), hint}); + configuration.insert({ov::hint::inference_precision.name(), hint}); } compiledModel = core->compile_model(function, targetDevice, configuration); diff --git a/src/tests/functional/shared_test_classes/src/base/snippets_test_utils.cpp b/src/tests/functional/shared_test_classes/src/base/snippets_test_utils.cpp index 3ea4432c33a..30560a943cf 100644 --- a/src/tests/functional/shared_test_classes/src/base/snippets_test_utils.cpp +++ b/src/tests/functional/shared_test_classes/src/base/snippets_test_utils.cpp @@ -54,7 +54,7 @@ void 
SnippetsTestsCommon::validateOriginalLayersNamesByType(const std::string& l ASSERT_TRUE(false) << "Layer type '" << layerType << "' was not found in compiled model"; } void SnippetsTestsCommon::setInferenceType(ov::element::Type type) { - configuration.emplace(ov::inference_precision(type)); + configuration.emplace(ov::hint::inference_precision(type)); } } // namespace test
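To summarize the migration this patch performs at call sites: the root-namespace `ov::inference_precision` definition is removed and the property is defined only in the `ov::hint` namespace, while the underlying string key "INFERENCE_PRECISION_HINT" is unchanged. A minimal sketch, assuming a CPU target:

    // Before this patch (root-namespace property, now removed):
    //   core.set_property("CPU", ov::inference_precision(ov::element::f32));
    #include <openvino/openvino.hpp>

    int main() {
        ov::Core core;

        // New spelling of the property:
        core.set_property("CPU", ov::hint::inference_precision(ov::element::f32));

        // String-key configuration keeps working, since the key is still
        // "INFERENCE_PRECISION_HINT":
        core.set_property("CPU", {{ov::hint::inference_precision.name(), "f32"}});
        return 0;
    }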