[AUTO BATCH PLUGIN] Enable API 2.0 for auto batch plugin (#18172)

Squashed commit history (each commit signed off by Zhai, Xuejun / xuejun <xuejun.zhai@intel.com>):

* [AUTO BATCH PLUGIN] enable API 2.0 for auto batch plugin
* [AUTO BATCH PLUGIN] disable auto batch plugin unit tests temporarily
* [AUTO BATCH PLUGIN] remove test with ov::auto_batch_timeout(-1), because the variable is unsigned int
* [AUTO BATCH PLUGIN] fix compiler error caused by std::atomic_uint32_t
* [Remote Context] fix review comments
* [AUTO BATCH PLUGIN] fix compiler warnings
* [AUTO BATCH PLUGIN] fix compiler warnings
* [AUTO BATCH PLUGIN] fix test error
* [AUTO BATCH PLUGIN] fix CI test error in CPU functional test case, caused by the batched model losing rt_info
* [AUTO BATCH PLUGIN] fix CI build error caused by an unused variable
* [AUTO BATCH PLUGIN] use ov::threading
* [AUTO BATCH PLUGIN] clean up code where the batched request shares buffers with the non-batched request
* [AUTO BATCH PLUGIN] clean code & fix format issue
* [AUTO BATCH PLUGIN] clean code & fix format issue
* [AUTO BATCH PLUGIN] add implementations of get_default_context() & create_context() and remove the test config with AUTO_BATCH_TIMEOUT(-1)
* [AUTO BATCH PLUGIN] fix failing GPU test with auto batch
* [AUTO BATCH PLUGIN] fix warning
* [AUTO BATCH PLUGIN] fix get_default_context() issue
* [AUTO BATCH PLUGIN] fix "using namespace" redundancy
* [AUTO BATCH PLUGIN] modify variable naming style
* [AUTO BATCH PLUGIN] fix CI test error caused by tensor references in the virtual plugin
* [AUTO BATCH PLUGIN] implement get_profiling()
* [AUTO BATCH PLUGIN] remove get_context() from the auto batch compiled model, using the interface from the parent class
* [AUTO BATCH PLUGIN] implement create_context() & get_default_context() for the auto batch plugin
* [AUTO BATCH PLUGIN] fix format issue
* [AUTO BATCH PLUGIN] implement the auto batch remote context
* [AUTO BATCH PLUGIN] fix error after merging with master
* [AUTO BATCH PLUGIN] fix compiler error caused by updating master
* [AUTO BATCH PLUGIN] refactor remote context in the auto batch plugin
* [AUTO BATCH PLUGIN] add unit test cases for the auto batch plugin
* [AUTO BATCH PLUGIN] fix CI warning caused by an unused variable & add unit tests for remote context
* [AUTO BATCH PLUGIN] fix review comments
* [AUTO BATCH PLUGIN] add a virtual property for get_context() in icompiled_model & implement it in the auto batch plugin
* [AUTO BATCH PLUGIN] add ov::loaded_from_cache support
* [AUTO BATCH PLUGIN] fix error caused by updating with master
* [AUTO BATCH PLUGIN] fix review comments (repeated across nine follow-up commits)
* [AUTO BATCH PLUGIN] fix unit test error
* [AUTO BATCH PLUGIN] fix conflict
* [AUTO BATCH PLUGIN] fix error caused by updating master
* [AUTO BATCH PLUGIN] fix review comments

---------

Signed-off-by: Zhai, Xuejun <xuejun.zhai@intel.com>
Signed-off-by: xuejun <xuejun.zhai@intel.com>
This commit is contained in: parent 9cd39455fc, commit ba76b45194
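For context on what the migrated plugin exposes to applications, here is a minimal user-side sketch of the OpenVINO 2.0 (ov::) API that this PR targets. The device string and model path are placeholders, not taken from this PR.

#include <openvino/openvino.hpp>

int main() {
    ov::Core core;
    auto model = core.read_model("model.xml");  // placeholder model path
    // Explicit batching: wrap the real device in the BATCH meta-device and
    // tune how long requests are collected before a partial batch is flushed.
    auto compiled = core.compile_model(model,
                                       "BATCH:GPU",                  // assumes a GPU plugin is available
                                       ov::auto_batch_timeout(100)); // timeout in milliseconds
    auto request = compiled.create_infer_request();
    request.start_async();
    request.wait();
    return 0;
}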
@@ -9,57 +9,57 @@
 namespace ov {
 namespace autobatch_plugin {
 
-AsyncInferRequest::AsyncInferRequest(const SyncInferRequest::Ptr& inferRequest,
-                                     InferenceEngine::SoIInferRequestInternal& inferRequestWithoutBatch,
-                                     const InferenceEngine::ITaskExecutor::Ptr& callbackExecutor)
-    : AsyncInferRequestThreadSafeDefault(inferRequest, nullptr, callbackExecutor),
-      m_infer_request_without_batch(inferRequestWithoutBatch),
-      m_sync_infer_request{inferRequest} {
+AsyncInferRequest::AsyncInferRequest(const std::shared_ptr<SyncInferRequest>& request,
+                                     const ov::SoPtr<ov::IAsyncInferRequest>& request_without_batch,
+                                     const std::shared_ptr<ov::threading::ITaskExecutor>& callback_executor)
+    : ov::IAsyncInferRequest(request, nullptr, callback_executor),
+      m_sync_request(request),
+      m_request_without_batch(request_without_batch) {
     // this executor starts the inference while the task (checking the result) is passed to the next stage
-    struct ThisRequestExecutor : public InferenceEngine::ITaskExecutor {
+    struct ThisRequestExecutor : public ov::threading::ITaskExecutor {
         explicit ThisRequestExecutor(AsyncInferRequest* _this_) : _this{_this_} {}
-        void run(InferenceEngine::Task task) override {
-            auto& workerInferRequest = _this->m_sync_infer_request->m_batched_request_wrapper;
-            std::pair<AsyncInferRequest*, InferenceEngine::Task> t;
+        void run(ov::threading::Task task) override {
+            auto workerInferRequest = _this->m_sync_request->m_batched_request_wrapper;
+            std::pair<AsyncInferRequest*, ov::threading::Task> t;
             t.first = _this;
             t.second = std::move(task);
-            workerInferRequest._tasks.push(t);
+            workerInferRequest->_tasks.push(t);
             // it is ok to call size() here as the queue only grows (and the bulk removal happens under the mutex)
-            const int sz = static_cast<int>(workerInferRequest._tasks.size());
-            if (sz == workerInferRequest._batchSize) {
-                workerInferRequest._cond.notify_one();
+            const int sz = static_cast<int>(workerInferRequest->_tasks.size());
+            if (sz == workerInferRequest->_batch_size) {
+                workerInferRequest->_cond.notify_one();
            }
        };
        AsyncInferRequest* _this = nullptr;
    };
-    _pipeline = {
-        {/*TaskExecutor*/ std::make_shared<ThisRequestExecutor>(this), /*task*/ [this] {
-            if (this->m_sync_infer_request->m_exceptionPtr) // if the exception happened in the batch1 fallback
-                std::rethrow_exception(this->m_sync_infer_request->m_exceptionPtr);
-            auto& batchReq = this->m_sync_infer_request->m_batched_request_wrapper;
-            if (batchReq.m_exceptionPtr) // when the batchN execution failed
-                std::rethrow_exception(batchReq.m_exceptionPtr);
-            // in the case of non-batched execution the blobs were set explicitly
-            if (SyncInferRequest::eExecutionFlavor::BATCH_EXECUTED ==
-                this->m_sync_infer_request->m_batched_request_status)
-                this->m_sync_infer_request->CopyOutputsIfNeeded();
-        }}};
+    m_pipeline = {{/*TaskExecutor*/ std::make_shared<ThisRequestExecutor>(this), /*task*/ [this] {
+                       if (this->m_sync_request->m_exception_ptr) // if the exception happened in the batch1 fallback
+                           std::rethrow_exception(this->m_sync_request->m_exception_ptr);
+                       auto batchReq = this->m_sync_request->m_batched_request_wrapper;
+                       if (batchReq->_exception_ptr) // when the batchN execution failed
+                           std::rethrow_exception(batchReq->_exception_ptr);
+                       // in the case of non-batched execution the tensors were set explicitly
+                       if (SyncInferRequest::eExecutionFlavor::BATCH_EXECUTED ==
+                           this->m_sync_request->m_batched_request_status) {
+                           this->m_sync_request->copy_outputs_if_needed();
+                       }
+                   }}};
 }
 
-std::map<std::string, InferenceEngine::InferenceEngineProfileInfo> AsyncInferRequest::GetPerformanceCounts() const {
-    CheckState();
-    if (SyncInferRequest::eExecutionFlavor::BATCH_EXECUTED == m_sync_infer_request->m_batched_request_status)
-        return m_sync_infer_request->m_batched_request_wrapper._inferRequestBatched->GetPerformanceCounts();
+std::vector<ov::ProfilingInfo> AsyncInferRequest::get_profiling_info() const {
+    check_state();
+    if (SyncInferRequest::eExecutionFlavor::BATCH_EXECUTED == m_sync_request->m_batched_request_status)
+        return m_sync_request->get_profiling_info();
     else
-        return m_infer_request_without_batch->GetPerformanceCounts();
+        return m_request_without_batch->get_profiling_info();
 }
 
-void AsyncInferRequest::Infer_ThreadUnsafe() {
-    InferUsingAsync();
+void AsyncInferRequest::infer_thread_unsafe() {
+    start_async_thread_unsafe();
 }
 
 AsyncInferRequest::~AsyncInferRequest() {
-    StopAndWait();
+    stop_and_wait();
 }
 } // namespace autobatch_plugin
 } // namespace ov
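The new m_pipeline stage above defers each task into the worker's queue and wakes the worker only once a full batch has accumulated. Below is a self-contained sketch of that collect-then-notify pattern using only the standard library; the names are illustrative stand-ins, not the plugin's actual types.

#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>

// Illustrative stand-in for the worker that owns the batched request.
struct Worker {
    size_t batch_size = 4;
    std::queue<std::function<void()>> tasks;
    std::mutex mutex;
    std::condition_variable cond;

    // Mirrors the idea of ThisRequestExecutor::run(): enqueue the completion task
    // and wake the worker thread only when a full batch has been collected.
    void submit(std::function<void()> task) {
        std::lock_guard<std::mutex> lock(mutex);
        tasks.push(std::move(task));
        if (tasks.size() == batch_size)
            cond.notify_one();
    }
};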
@@ -4,28 +4,27 @@
 
 ///////////////////////////////////////////////////////////////////////////////////////////////////
 #pragma once
-#include "cpp_interfaces/impl/ie_infer_async_request_thread_safe_default.hpp"
 
+#include "openvino/runtime/iasync_infer_request.hpp"
 #include "sync_infer_request.hpp"
 
 namespace ov {
 namespace autobatch_plugin {
-class AsyncInferRequest : public InferenceEngine::AsyncInferRequestThreadSafeDefault {
+class AsyncInferRequest : public ov::IAsyncInferRequest {
 public:
     using Ptr = std::shared_ptr<AsyncInferRequest>;
-    explicit AsyncInferRequest(const SyncInferRequest::Ptr& inferRequest,
-                               InferenceEngine::SoIInferRequestInternal& inferRequestWithoutBatch,
-                               const InferenceEngine::ITaskExecutor::Ptr& callbackExecutor);
+    AsyncInferRequest(const std::shared_ptr<SyncInferRequest>& request,
+                      const ov::SoPtr<ov::IAsyncInferRequest>& request_without_batch,
+                      const std::shared_ptr<ov::threading::ITaskExecutor>& callback_executor);
 
-    void Infer_ThreadUnsafe() override;
+    void infer_thread_unsafe() override;
 
     virtual ~AsyncInferRequest();
 
-    std::map<std::string, InferenceEngine::InferenceEngineProfileInfo> GetPerformanceCounts() const override;
+    std::vector<ov::ProfilingInfo> get_profiling_info() const override;
 
-    InferenceEngine::SoIInferRequestInternal m_infer_request_without_batch;
+    std::shared_ptr<ov::autobatch_plugin::SyncInferRequest> m_sync_request;
 
-    SyncInferRequest::Ptr m_sync_infer_request;
+    ov::SoPtr<ov::IAsyncInferRequest> m_request_without_batch;
 };
 } // namespace autobatch_plugin
 } // namespace ov
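With get_profiling_info() overridden as above, per-layer statistics flow through the standard 2.0 request API. A short user-side sketch follows; it assumes profiling was enabled when the model was compiled.

#include <iostream>
#include <openvino/openvino.hpp>

void print_profiling(ov::InferRequest& request) {
    // Works the same whether the batched path or the batch-1 fallback executed.
    for (const auto& info : request.get_profiling_info()) {
        std::cout << info.node_name << ": "
                  << info.real_time.count() << " us" << std::endl;
    }
}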
@@ -6,29 +6,29 @@
 #include "compiled_model.hpp"
 
 #include "async_infer_request.hpp"
-#include "ie_performance_hints.hpp"
 #include "sync_infer_request.hpp"
 
 namespace ov {
 namespace autobatch_plugin {
-CompiledModel::CompiledModel(const InferenceEngine::SoExecutableNetworkInternal& networkWithBatch,
-                             const InferenceEngine::SoExecutableNetworkInternal& networkWithoutBatch,
-                             const DeviceInformation& networkDevice,
-                             const std::unordered_map<std::string, InferenceEngine::Parameter>& config,
-                             const std::set<std::string>& batchedInputs,
-                             const std::set<std::string>& batchedOutputs)
-    : InferenceEngine::ExecutableNetworkThreadSafeDefault(nullptr,
-                                                          std::make_shared<InferenceEngine::ImmediateExecutor>()),
-      m_model_with_batch{networkWithBatch},
-      m_model_without_batch{networkWithoutBatch},
-      m_config{config},
-      m_batched_inputs(batchedInputs),
-      m_batched_outputs(batchedOutputs) {
+CompiledModel::CompiledModel(const std::shared_ptr<ov::Model>& model,
+                             const std::shared_ptr<const ov::IPlugin>& plugin,
+                             const ov::AnyMap& config,
+                             const DeviceInformation& device_info,
+                             const std::set<std::string>& batched_inputs,
+                             const std::set<std::string>& batched_outputs,
+                             const ov::SoPtr<ov::ICompiledModel>& compiled_model_with_batch,
+                             const ov::SoPtr<ov::ICompiledModel>& compiled_model_without_batch,
+                             const ov::SoPtr<ov::IRemoteContext>& context)
+    : ov::ICompiledModel(model, plugin, context),
+      m_config(config),
+      m_batched_inputs(batched_inputs),
+      m_batched_outputs(batched_outputs),
+      m_compiled_model_with_batch(compiled_model_with_batch),
+      m_compiled_model_without_batch(compiled_model_without_batch) {
     // WA for gcc 4.8 ( fails compilation with member init-list)
-    m_device_info = networkDevice;
-    auto time_out = config.find(CONFIG_KEY(AUTO_BATCH_TIMEOUT));
-    IE_ASSERT(time_out != config.end());
-    m_timeout = ParseTimeoutValue(time_out->second.as<std::string>());
+    m_device_info = device_info;
+    auto time_out = config.find(ov::auto_batch_timeout.name());
+    OPENVINO_ASSERT(time_out != config.end(), "No timeout property be set in config, default will be used!");
+    m_time_out = time_out->second.as<std::uint32_t>();
 }
 
 CompiledModel::~CompiledModel() {
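The constructor above now reads the timeout straight from the ov::AnyMap instead of parsing a string. A small sketch of that lookup in isolation; the helper name and the fallback value of 1000 ms (the plugin's default) are illustrative.

#include <openvino/core/any.hpp>
#include <openvino/runtime/properties.hpp>

uint32_t read_timeout(const ov::AnyMap& config) {
    auto it = config.find(ov::auto_batch_timeout.name());
    // ov::Any::as<uint32_t>() converts whatever the user passed (string or integer).
    return it != config.end() ? it->second.as<uint32_t>() : 1000u;  // 1000 ms default
}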
@ -39,63 +39,38 @@ CompiledModel::~CompiledModel() {
|
||||
m_worker_requests.clear();
|
||||
}
|
||||
|
||||
unsigned int CompiledModel::ParseTimeoutValue(const std::string& s) {
|
||||
auto val = std::stoi(s);
|
||||
if (val < 0)
|
||||
IE_THROW(ParameterMismatch) << "Value for the " << CONFIG_KEY(AUTO_BATCH_TIMEOUT) << " should be unsigned int";
|
||||
return val;
|
||||
}
|
||||
|
||||
std::shared_ptr<InferenceEngine::RemoteContext> CompiledModel::GetContext() const {
|
||||
return m_model_without_batch->GetContext();
|
||||
}
|
||||
|
||||
InferenceEngine::IInferRequestInternal::Ptr CompiledModel::CreateInferRequestImpl(
|
||||
InferenceEngine::InputsDataMap networkInputs,
|
||||
InferenceEngine::OutputsDataMap networkOutputs) {
|
||||
std::shared_ptr<ov::ISyncInferRequest> CompiledModel::create_sync_infer_request() const {
|
||||
auto workerRequestPtrAndId = GetWorkerInferRequest();
|
||||
return std::make_shared<SyncInferRequest>(networkInputs,
|
||||
networkOutputs,
|
||||
workerRequestPtrAndId.first,
|
||||
workerRequestPtrAndId.second,
|
||||
m_device_info.batch_for_device,
|
||||
m_batched_inputs,
|
||||
m_batched_outputs);
|
||||
auto async_infer_request = std::make_shared<ov::autobatch_plugin::SyncInferRequest>(
|
||||
std::dynamic_pointer_cast<const ov::autobatch_plugin::CompiledModel>(shared_from_this()),
|
||||
workerRequestPtrAndId.first,
|
||||
workerRequestPtrAndId.second,
|
||||
m_device_info.device_batch_size,
|
||||
m_batched_inputs,
|
||||
m_batched_outputs);
|
||||
return async_infer_request;
|
||||
}
|
||||
|
||||
InferenceEngine::IInferRequestInternal::Ptr CompiledModel::CreateInferRequestImpl(
|
||||
const std::vector<std::shared_ptr<const ov::Node>>& inputs,
|
||||
const std::vector<std::shared_ptr<const ov::Node>>& outputs) {
|
||||
if (!this->_plugin || !_plugin->IsNewAPI())
|
||||
return nullptr;
|
||||
auto workerRequestPtrAndId = GetWorkerInferRequest();
|
||||
return std::make_shared<SyncInferRequest>(inputs,
|
||||
outputs,
|
||||
workerRequestPtrAndId.first,
|
||||
workerRequestPtrAndId.second,
|
||||
m_device_info.batch_for_device,
|
||||
m_batched_inputs,
|
||||
m_batched_outputs);
|
||||
}
|
||||
|
||||
std::pair<CompiledModel::WorkerInferRequest&, int> CompiledModel::GetWorkerInferRequest() {
|
||||
std::pair<std::shared_ptr<ov::autobatch_plugin::CompiledModel::WorkerInferRequest>, int>
|
||||
CompiledModel::GetWorkerInferRequest() const {
|
||||
auto num = m_num_requests_created++;
|
||||
std::lock_guard<std::mutex> lock(m_worker_requests_mutex);
|
||||
auto batch_id = num % m_device_info.batch_for_device;
|
||||
auto batch_id = num % m_device_info.device_batch_size;
|
||||
if (!batch_id) { // need new request
|
||||
m_worker_requests.push_back(std::make_shared<WorkerInferRequest>());
|
||||
auto workerRequestPtr = m_worker_requests.back().get();
|
||||
workerRequestPtr->_inferRequestBatched = {m_model_with_batch->CreateInferRequest(), m_model_with_batch._so};
|
||||
workerRequestPtr->_batchSize = m_device_info.batch_for_device;
|
||||
workerRequestPtr->_completionTasks.resize(workerRequestPtr->_batchSize);
|
||||
workerRequestPtr->_inferRequestBatched->SetCallback(
|
||||
workerRequestPtr->_infer_request_batched = {m_compiled_model_with_batch->create_infer_request(),
|
||||
m_compiled_model_with_batch._so};
|
||||
workerRequestPtr->_batch_size = m_device_info.device_batch_size;
|
||||
workerRequestPtr->_completion_tasks.resize(workerRequestPtr->_batch_size);
|
||||
workerRequestPtr->_infer_request_batched->set_callback(
|
||||
[workerRequestPtr](std::exception_ptr exceptionPtr) mutable {
|
||||
if (exceptionPtr)
|
||||
workerRequestPtr->m_exceptionPtr = exceptionPtr;
|
||||
IE_ASSERT(workerRequestPtr->_completionTasks.size() == (size_t)workerRequestPtr->_batchSize);
|
||||
workerRequestPtr->_exception_ptr = exceptionPtr;
|
||||
OPENVINO_ASSERT(workerRequestPtr->_completion_tasks.size() == (size_t)workerRequestPtr->_batch_size);
|
||||
// notify the individual requests on the completion
|
||||
for (int c = 0; c < workerRequestPtr->_batchSize; c++) {
|
||||
workerRequestPtr->_completionTasks[c]();
|
||||
for (int c = 0; c < workerRequestPtr->_batch_size; c++) {
|
||||
workerRequestPtr->_completion_tasks[c]();
|
||||
}
|
||||
// reset the timeout
|
||||
workerRequestPtr->_cond.notify_one();
|
||||
@@ -106,7 +81,7 @@ std::pair<CompiledModel::WorkerInferRequest&, int> CompiledModel::GetWorkerInfer
             std::cv_status status;
             {
                 std::unique_lock<std::mutex> lock(workerRequestPtr->_mutex);
-                status = workerRequestPtr->_cond.wait_for(lock, std::chrono::milliseconds(m_timeout));
+                status = workerRequestPtr->_cond.wait_for(lock, std::chrono::milliseconds(m_time_out));
             }
             if (m_terminate) {
                 break;
@ -114,38 +89,38 @@ std::pair<CompiledModel::WorkerInferRequest&, int> CompiledModel::GetWorkerInfer
|
||||
// as we pop the tasks from the queue only here
|
||||
// it is ok to call size() (as the _tasks can only grow in parallel)
|
||||
const int sz = static_cast<int>(workerRequestPtr->_tasks.size());
|
||||
if (sz == workerRequestPtr->_batchSize) {
|
||||
std::pair<AsyncInferRequest*, InferenceEngine::Task> t;
|
||||
if (sz == workerRequestPtr->_batch_size) {
|
||||
std::pair<ov::autobatch_plugin::AsyncInferRequest*, ov::threading::Task> t;
|
||||
for (int n = 0; n < sz; n++) {
|
||||
IE_ASSERT(workerRequestPtr->_tasks.try_pop(t));
|
||||
workerRequestPtr->_completionTasks[n] = std::move(t.second);
|
||||
t.first->m_sync_infer_request->CopyInputsIfNeeded();
|
||||
t.first->m_sync_infer_request->m_batched_request_status =
|
||||
SyncInferRequest::eExecutionFlavor::BATCH_EXECUTED;
|
||||
OPENVINO_ASSERT(workerRequestPtr->_tasks.try_pop(t));
|
||||
workerRequestPtr->_completion_tasks[n] = std::move(t.second);
|
||||
t.first->m_sync_request->copy_inputs_if_needed();
|
||||
t.first->m_sync_request->m_batched_request_status =
|
||||
ov::autobatch_plugin::SyncInferRequest::eExecutionFlavor::BATCH_EXECUTED;
|
||||
}
|
||||
workerRequestPtr->_inferRequestBatched->StartAsync();
|
||||
workerRequestPtr->_infer_request_batched->start_async();
|
||||
} else if ((status == std::cv_status::timeout) && sz) {
|
||||
// timeout to collect the batch is over, have to execute the requests in the batch1 mode
|
||||
std::pair<AsyncInferRequest*, InferenceEngine::Task> t;
|
||||
std::pair<ov::autobatch_plugin::AsyncInferRequest*, ov::threading::Task> t;
|
||||
// popping all tasks collected by the moment of the time-out and execute each with batch1
|
||||
std::atomic<int> arrived = {0};
|
||||
std::promise<void> all_completed;
|
||||
auto all_completed_future = all_completed.get_future();
|
||||
for (int n = 0; n < sz; n++) {
|
||||
IE_ASSERT(workerRequestPtr->_tasks.try_pop(t));
|
||||
t.first->m_infer_request_without_batch->SetCallback(
|
||||
OPENVINO_ASSERT(workerRequestPtr->_tasks.try_pop(t));
|
||||
t.first->m_request_without_batch->set_callback(
|
||||
[t, sz, &arrived, &all_completed](std::exception_ptr p) {
|
||||
if (p)
|
||||
t.first->m_sync_infer_request->m_exceptionPtr = p;
|
||||
t.first->m_sync_request->m_exception_ptr = p;
|
||||
t.second();
|
||||
if (sz == ++arrived)
|
||||
if (sz == ++arrived) {
|
||||
all_completed.set_value();
|
||||
}
|
||||
});
|
||||
t.first->m_sync_infer_request->m_batched_request_status =
|
||||
SyncInferRequest::eExecutionFlavor::TIMEOUT_EXECUTED;
|
||||
t.first->m_sync_infer_request->SetBlobsToAnotherRequest(
|
||||
t.first->m_infer_request_without_batch);
|
||||
t.first->m_infer_request_without_batch->StartAsync();
|
||||
t.first->m_sync_request->m_batched_request_status =
|
||||
ov::autobatch_plugin::SyncInferRequest::eExecutionFlavor::TIMEOUT_EXECUTED;
|
||||
t.first->m_sync_request->set_tensors_to_another_request(t.first->m_request_without_batch);
|
||||
t.first->m_request_without_batch->start_async();
|
||||
}
|
||||
all_completed_future.get();
|
||||
// now when all the tasks for this batch are completed, start waiting for the timeout again
|
||||
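The worker loop in this hunk waits for either a full batch or the timeout, then falls back to batch-1 execution for whatever was collected. A self-contained sketch of that wait-for-N-or-timeout decision, standard library only, with illustrative names:

#include <chrono>
#include <condition_variable>
#include <mutex>
#include <queue>

// Returns true if a full batch was collected before the timeout expired.
bool wait_for_batch(std::queue<int>& tasks, size_t batch_size, std::mutex& m,
                    std::condition_variable& cond, std::chrono::milliseconds timeout) {
    std::unique_lock<std::mutex> lock(m);
    // Mirrors the plugin's wait_for(): wake up on notify_one() or when the timeout elapses.
    cond.wait_for(lock, timeout);
    // Full batch -> run batched; otherwise each collected request runs as batch-1.
    return tasks.size() >= batch_size;
}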
@ -154,93 +129,103 @@ std::pair<CompiledModel::WorkerInferRequest&, int> CompiledModel::GetWorkerInfer
|
||||
}
|
||||
});
|
||||
}
|
||||
return {*m_worker_requests.back(), static_cast<int>(batch_id)};
|
||||
return {m_worker_requests.back(), static_cast<int>(batch_id)};
|
||||
}
|
||||
|
||||
InferenceEngine::IInferRequestInternal::Ptr CompiledModel::CreateInferRequest() {
|
||||
if (!m_model_with_batch) {
|
||||
auto res = m_model_without_batch->CreateInferRequest();
|
||||
res->setPointerToExecutableNetworkInternal(shared_from_this());
|
||||
res->setPointerToSo(m_model_without_batch._so);
|
||||
_so = m_model_without_batch._so;
|
||||
std::shared_ptr<ov::IAsyncInferRequest> CompiledModel::create_infer_request() const {
|
||||
if (!m_compiled_model_with_batch) {
|
||||
auto res = m_compiled_model_without_batch->create_infer_request();
|
||||
for (auto& iter : res->get_inputs()) {
|
||||
auto&& tensor = res->get_tensor(iter);
|
||||
if (!tensor._so)
|
||||
tensor._so = m_compiled_model_without_batch._so;
|
||||
}
|
||||
for (auto& iter : res->get_outputs()) {
|
||||
auto&& tensor = res->get_tensor(iter);
|
||||
if (!tensor._so)
|
||||
tensor._so = m_compiled_model_without_batch._so;
|
||||
}
|
||||
return res;
|
||||
}
|
||||
// trying to create the new API request first
|
||||
InferenceEngine::IInferRequestInternal::Ptr syncRequestImpl = CreateInferRequestImpl(_parameters, _results);
|
||||
if (!syncRequestImpl)
|
||||
syncRequestImpl = CreateInferRequestImpl(_networkInputs, _networkOutputs);
|
||||
syncRequestImpl->setPointerToExecutableNetworkInternal(shared_from_this());
|
||||
InferenceEngine::SoIInferRequestInternal inferRequestWithoutBatch = {m_model_without_batch->CreateInferRequest(),
|
||||
m_model_without_batch._so};
|
||||
return std::make_shared<AsyncInferRequest>(std::static_pointer_cast<SyncInferRequest>(syncRequestImpl),
|
||||
inferRequestWithoutBatch,
|
||||
_callbackExecutor);
|
||||
|
||||
auto sync_res = create_sync_infer_request();
|
||||
|
||||
ov::SoPtr<ov::IAsyncInferRequest> infer_request_without_batch = {
|
||||
m_compiled_model_without_batch->create_infer_request(),
|
||||
m_compiled_model_without_batch._so};
|
||||
return std::make_shared<ov::autobatch_plugin::AsyncInferRequest>(
|
||||
std::dynamic_pointer_cast<ov::autobatch_plugin::SyncInferRequest>(sync_res),
|
||||
infer_request_without_batch,
|
||||
get_callback_executor());
|
||||
}
|
||||
|
||||
std::shared_ptr<ngraph::Function> CompiledModel::GetExecGraphInfo() {
|
||||
return m_model_with_batch && m_model_with_batch->GetExecGraphInfo() ? m_model_with_batch->GetExecGraphInfo()
|
||||
: m_model_without_batch->GetExecGraphInfo();
|
||||
std::shared_ptr<const ov::Model> CompiledModel::get_runtime_model() const {
|
||||
return m_compiled_model_with_batch ? m_compiled_model_with_batch->get_runtime_model()
|
||||
: m_compiled_model_without_batch->get_runtime_model();
|
||||
}
|
||||
|
||||
void CompiledModel::SetConfig(const std::map<std::string, InferenceEngine::Parameter>& user_config) {
|
||||
auto timeout = user_config.find(CONFIG_KEY(AUTO_BATCH_TIMEOUT));
|
||||
if (timeout == user_config.end() || user_config.size() > 1) {
|
||||
IE_THROW() << "The only config that can be changed on the fly for the AutoBatching the is the "
|
||||
<< CONFIG_KEY(AUTO_BATCH_TIMEOUT);
|
||||
void CompiledModel::set_property(const ov::AnyMap& properties) {
|
||||
auto time_out = properties.find(ov::auto_batch_timeout.name());
|
||||
if (time_out == properties.end() || properties.size() > 1) {
|
||||
OPENVINO_THROW("The only config that can be changed on the fly for the AutoBatching is the ",
|
||||
ov::auto_batch_timeout.name());
|
||||
} else {
|
||||
m_timeout = ParseTimeoutValue(timeout->second.as<std::string>());
|
||||
m_time_out = time_out->second.as<std::uint32_t>();
|
||||
}
|
||||
}
|
||||
|
||||
InferenceEngine::Parameter CompiledModel::GetConfig(const std::string& name) const {
|
||||
ov::Any CompiledModel::get_property(const std::string& name) const {
|
||||
auto it = m_config.find(name);
|
||||
if (it != m_config.end()) {
|
||||
return it->second;
|
||||
} else {
|
||||
// find config key among networks config keys
|
||||
auto param = m_model_without_batch->GetMetric(METRIC_KEY(SUPPORTED_CONFIG_KEYS));
|
||||
for (auto&& configKey : param.as<std::vector<std::string>>()) {
|
||||
if (configKey == name) {
|
||||
return m_model_without_batch->GetConfig(configKey);
|
||||
auto modelSupportedProperties = m_compiled_model_without_batch->get_property(ov::supported_properties.name());
|
||||
for (auto&& property : modelSupportedProperties.as<std::vector<ov::PropertyName>>()) {
|
||||
if (property == name) {
|
||||
return m_compiled_model_without_batch->get_property(property);
|
||||
}
|
||||
}
|
||||
IE_THROW(NotFound) << name << " not found in the ExecutableNetwork config";
|
||||
if (name == ov::optimal_number_of_infer_requests.name()) {
|
||||
uint32_t num_request = 0;
|
||||
try {
|
||||
num_request =
|
||||
m_compiled_model_without_batch->get_property(ov::hint::num_requests.name()).as<std::uint32_t>();
|
||||
if (num_request == 0) // no limitations from user, let's deduce the full blown #requests
|
||||
// (multiplied by the devices capabilities to run multiple <batched> requests for further perf)
|
||||
num_request =
|
||||
m_device_info.device_batch_size *
|
||||
m_compiled_model_without_batch->get_property(ov::optimal_number_of_infer_requests.name())
|
||||
.as<uint32_t>();
|
||||
} catch (const ov::Exception&) {
|
||||
}
|
||||
num_request =
|
||||
std::max(num_request, m_device_info.device_batch_size); // round up to the possible user's value
|
||||
return num_request;
|
||||
} else if (name == ov::model_name.name()) {
|
||||
return m_compiled_model_without_batch->get_property(name);
|
||||
OPENVINO_SUPPRESS_DEPRECATED_START
|
||||
} else if (name == METRIC_KEY(SUPPORTED_METRICS)) {
|
||||
return std::vector<std::string>{ov::optimal_number_of_infer_requests.name(),
|
||||
METRIC_KEY(SUPPORTED_METRICS),
|
||||
ov::model_name.name(),
|
||||
METRIC_KEY(SUPPORTED_CONFIG_KEYS),
|
||||
ov::execution_devices.name()};
|
||||
} else if (name == METRIC_KEY(SUPPORTED_CONFIG_KEYS)) {
|
||||
return std::vector<std::string>{ov::auto_batch_timeout.name()};
|
||||
} else if (name == ov::execution_devices) {
|
||||
return m_compiled_model_without_batch->get_property(name);
|
||||
} else if (name == ov::loaded_from_cache) {
|
||||
return m_compiled_model_without_batch->get_property(ov::loaded_from_cache.name());
|
||||
} else {
|
||||
OPENVINO_THROW("Unsupported Compiled Model Property: ", name);
|
||||
}
|
||||
}
|
||||
OPENVINO_SUPPRESS_DEPRECATED_END
|
||||
}
|
||||
|
||||
InferenceEngine::Parameter CompiledModel::GetMetric(const std::string& name) const {
|
||||
if (name == METRIC_KEY(OPTIMAL_NUMBER_OF_INFER_REQUESTS)) {
|
||||
auto reqs = 0;
|
||||
try {
|
||||
auto hint = m_model_without_batch->GetConfig(CONFIG_KEY(PERFORMANCE_HINT_NUM_REQUESTS)).as<std::string>();
|
||||
reqs = InferenceEngine::PerfHintsConfig::CheckPerformanceHintRequestValue(hint);
|
||||
if (!reqs) // no limitations from user, let's deduce the full blown #requests
|
||||
// (multiplied by the devices capabilities to run multiple <batched> requests for further perf)
|
||||
reqs =
|
||||
m_device_info.batch_for_device *
|
||||
m_model_without_batch->GetMetric(METRIC_KEY(OPTIMAL_NUMBER_OF_INFER_REQUESTS)).as<unsigned int>();
|
||||
} catch (const InferenceEngine::Exception&) {
|
||||
}
|
||||
reqs = std::max(reqs, m_device_info.batch_for_device); // round up to the possible user's value
|
||||
IE_SET_METRIC_RETURN(OPTIMAL_NUMBER_OF_INFER_REQUESTS, reqs);
|
||||
} else if (name == METRIC_KEY(NETWORK_NAME)) {
|
||||
IE_SET_METRIC_RETURN(NETWORK_NAME,
|
||||
m_model_without_batch->GetMetric(METRIC_KEY(NETWORK_NAME)).as<std::string>());
|
||||
} else if (name == METRIC_KEY(SUPPORTED_METRICS)) {
|
||||
IE_SET_METRIC_RETURN(SUPPORTED_METRICS,
|
||||
{METRIC_KEY(OPTIMAL_NUMBER_OF_INFER_REQUESTS),
|
||||
METRIC_KEY(SUPPORTED_METRICS),
|
||||
METRIC_KEY(NETWORK_NAME),
|
||||
METRIC_KEY(SUPPORTED_CONFIG_KEYS),
|
||||
ov::execution_devices.name()});
|
||||
} else if (name == METRIC_KEY(SUPPORTED_CONFIG_KEYS)) {
|
||||
IE_SET_METRIC_RETURN(SUPPORTED_CONFIG_KEYS,
|
||||
{CONFIG_KEY(AUTO_BATCH_TIMEOUT)}); // only timeout can be changed on the fly
|
||||
} else if (name == ov::execution_devices) {
|
||||
return m_model_without_batch->GetMetric(name);
|
||||
} else {
|
||||
IE_THROW() << "Unsupported Network metric: " << name;
|
||||
}
|
||||
void CompiledModel::export_model(std::ostream& model) const {
|
||||
OPENVINO_NOT_IMPLEMENTED;
|
||||
}
|
||||
|
||||
} // namespace autobatch_plugin
|
||||
|
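get_property() above derives ov::optimal_number_of_infer_requests from the device batch size so applications can keep enough requests in flight to fill batches. From the application side this is queried through the usual 2.0 call; a minimal sketch:

#include <openvino/openvino.hpp>

uint32_t optimal_requests(const ov::CompiledModel& compiled_model) {
    // For BATCH:* devices this is at least the device batch size.
    return compiled_model.get_property(ov::optimal_number_of_infer_requests);
}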
@ -5,79 +5,75 @@
|
||||
///////////////////////////////////////////////////////////////////////////////////////////////////
|
||||
#pragma once
|
||||
|
||||
#include <map>
|
||||
#include <condition_variable>
|
||||
#include <thread>
|
||||
|
||||
#include "cpp_interfaces/impl/ie_executable_network_thread_safe_default.hpp"
|
||||
#include "ie_metric_helpers.hpp"
|
||||
#include "openvino/runtime/iasync_infer_request.hpp"
|
||||
#include "openvino/runtime/icompiled_model.hpp"
|
||||
#include "openvino/runtime/threading/thread_safe_containers.hpp"
|
||||
#include "plugin.hpp"
|
||||
#include "threading/ie_thread_safe_containers.hpp"
|
||||
|
||||
namespace ov {
|
||||
namespace autobatch_plugin {
|
||||
|
||||
class AsyncInferRequest;
|
||||
|
||||
class CompiledModel : public InferenceEngine::ExecutableNetworkThreadSafeDefault {
|
||||
class CompiledModel : public ov::ICompiledModel {
|
||||
public:
|
||||
using Ptr = std::shared_ptr<CompiledModel>;
|
||||
struct WorkerInferRequest {
|
||||
using Ptr = std::shared_ptr<WorkerInferRequest>;
|
||||
InferenceEngine::SoIInferRequestInternal _inferRequestBatched;
|
||||
int _batchSize;
|
||||
InferenceEngine::ThreadSafeQueueWithSize<std::pair<AsyncInferRequest*, InferenceEngine::Task>> _tasks;
|
||||
std::vector<InferenceEngine::Task> _completionTasks;
|
||||
ov::SoPtr<ov::IAsyncInferRequest> _infer_request_batched;
|
||||
int _batch_size;
|
||||
ov::threading::ThreadSafeQueueWithSize<std::pair<ov::autobatch_plugin::AsyncInferRequest*, ov::threading::Task>>
|
||||
_tasks;
|
||||
std::vector<ov::threading::Task> _completion_tasks;
|
||||
std::thread _thread;
|
||||
std::condition_variable _cond;
|
||||
std::mutex _mutex;
|
||||
std::exception_ptr m_exceptionPtr;
|
||||
std::exception_ptr _exception_ptr;
|
||||
};
|
||||
|
||||
CompiledModel(const InferenceEngine::SoExecutableNetworkInternal& networkForDevice,
|
||||
const InferenceEngine::SoExecutableNetworkInternal& networkForDeviceWithoutBatch,
|
||||
const DeviceInformation& networkDevices,
|
||||
const std::unordered_map<std::string, InferenceEngine::Parameter>& config,
|
||||
const std::set<std::string>& batchedIntputs,
|
||||
const std::set<std::string>& batchedOutputs);
|
||||
CompiledModel(const std::shared_ptr<ov::Model>& model,
|
||||
const std::shared_ptr<const ov::IPlugin>& plugin,
|
||||
const ov::AnyMap& config,
|
||||
const DeviceInformation& device_info,
|
||||
const std::set<std::string>& batched_inputs,
|
||||
const std::set<std::string>& batched_outputs,
|
||||
const ov::SoPtr<ov::ICompiledModel>& compiled_model_with_batch,
|
||||
const ov::SoPtr<ov::ICompiledModel>& compiled_model_without_batch,
|
||||
const ov::SoPtr<ov::IRemoteContext>& context);
|
||||
|
||||
void SetConfig(const std::map<std::string, InferenceEngine::Parameter>& config) override;
|
||||
void set_property(const ov::AnyMap& properties) override;
|
||||
|
||||
InferenceEngine::Parameter GetConfig(const std::string& name) const override;
|
||||
ov::Any get_property(const std::string& name) const override;
|
||||
|
||||
InferenceEngine::Parameter GetMetric(const std::string& name) const override;
|
||||
std::shared_ptr<ov::IAsyncInferRequest> create_infer_request() const override;
|
||||
|
||||
InferenceEngine::IInferRequestInternal::Ptr CreateInferRequest() override;
|
||||
std::shared_ptr<const ov::Model> get_runtime_model() const override;
|
||||
|
||||
InferenceEngine::IInferRequestInternal::Ptr CreateInferRequestImpl(
|
||||
InferenceEngine::InputsDataMap networkInputs,
|
||||
InferenceEngine::OutputsDataMap networkOutputs) override;
|
||||
|
||||
InferenceEngine::IInferRequestInternal::Ptr CreateInferRequestImpl(
|
||||
const std::vector<std::shared_ptr<const ov::Node>>& inputs,
|
||||
const std::vector<std::shared_ptr<const ov::Node>>& outputs) override;
|
||||
|
||||
std::shared_ptr<InferenceEngine::RemoteContext> GetContext() const override;
|
||||
|
||||
std::shared_ptr<ngraph::Function> GetExecGraphInfo() override;
|
||||
void export_model(std::ostream& model) const override;
|
||||
|
||||
virtual ~CompiledModel();
|
||||
|
||||
protected:
|
||||
std::shared_ptr<ov::ISyncInferRequest> create_sync_infer_request() const override;
|
||||
static unsigned int ParseTimeoutValue(const std::string&);
|
||||
std::atomic_bool m_terminate = {false};
|
||||
ov::AnyMap m_config;
|
||||
DeviceInformation m_device_info;
|
||||
InferenceEngine::SoExecutableNetworkInternal m_model_with_batch;
|
||||
InferenceEngine::SoExecutableNetworkInternal m_model_without_batch;
|
||||
|
||||
std::pair<WorkerInferRequest&, int> GetWorkerInferRequest();
|
||||
std::vector<WorkerInferRequest::Ptr> m_worker_requests;
|
||||
std::mutex m_worker_requests_mutex;
|
||||
std::pair<std::shared_ptr<ov::autobatch_plugin::CompiledModel::WorkerInferRequest>, int> GetWorkerInferRequest()
|
||||
const;
|
||||
mutable std::vector<std::shared_ptr<WorkerInferRequest>> m_worker_requests;
|
||||
mutable std::mutex m_worker_requests_mutex;
|
||||
|
||||
std::unordered_map<std::string, InferenceEngine::Parameter> m_config;
|
||||
std::atomic_size_t m_num_requests_created = {0};
|
||||
std::atomic_int m_timeout = {0}; // in ms
|
||||
mutable std::atomic_size_t m_num_requests_created = {0};
|
||||
std::atomic<std::uint32_t> m_time_out = {0}; // in ms
|
||||
|
||||
const std::set<std::string> m_batched_inputs;
|
||||
const std::set<std::string> m_batched_outputs;
|
||||
|
||||
ov::SoPtr<ov::ICompiledModel> m_compiled_model_with_batch;
|
||||
ov::SoPtr<ov::ICompiledModel> m_compiled_model_without_batch;
|
||||
};
|
||||
} // namespace autobatch_plugin
|
||||
} // namespace ov
|
||||
|
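The WorkerInferRequest structure declared in the header above stores its pending tasks in an ov::threading::ThreadSafeQueueWithSize. The following is a simplified, self-contained stand-in that only illustrates the queue semantics the worker relies on; it is not the OpenVINO implementation.

#include <mutex>
#include <queue>
#include <utility>

template <typename T>
class SimpleThreadSafeQueue {
public:
    void push(T value) {
        std::lock_guard<std::mutex> lock(m_mutex);
        m_queue.push(std::move(value));
    }
    bool try_pop(T& value) {
        std::lock_guard<std::mutex> lock(m_mutex);
        if (m_queue.empty())
            return false;
        value = std::move(m_queue.front());
        m_queue.pop();
        return true;
    }
    size_t size() const {
        std::lock_guard<std::mutex> lock(m_mutex);
        return m_queue.size();  // safe to read: producers only grow the queue
    }

private:
    mutable std::mutex m_mutex;
    std::queue<T> m_queue;
};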
@ -7,10 +7,6 @@
|
||||
#include "plugin.hpp"
|
||||
|
||||
#include "compiled_model.hpp"
|
||||
#include "ie_icore.hpp"
|
||||
#include "ie_metric_helpers.hpp"
|
||||
#include "ie_ngraph_utils.hpp"
|
||||
#include "ie_performance_hints.hpp"
|
||||
#include "openvino/core/dimension_tracker.hpp"
|
||||
#include "openvino/pass/manager.hpp"
|
||||
#include "openvino/runtime/intel_gpu/properties.hpp"
|
||||
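After this include cleanup the plugin pulls in the FindBatch pass and DimensionTracker directly; a usage sketch of those two headers appears further below, after the batch-dimension detection hunk.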
@ -19,227 +15,220 @@
|
||||
#include "transformations/common_optimizations/dimension_tracking.hpp"
|
||||
#include "transformations/init_node_info.hpp"
|
||||
#include "transformations/utils/utils.hpp"
|
||||
OPENVINO_SUPPRESS_DEPRECATED_START
|
||||
#include "ie_layouts.h"
|
||||
OPENVINO_SUPPRESS_DEPRECATED_END
|
||||
|
||||
namespace ov {
|
||||
namespace autobatch_plugin {
|
||||
|
||||
OPENVINO_SUPPRESS_DEPRECATED_START
|
||||
std::vector<std::string> supported_configKeys = {CONFIG_KEY(AUTO_BATCH_DEVICE_CONFIG),
|
||||
ov::device::priorities.name(),
|
||||
CONFIG_KEY(AUTO_BATCH_TIMEOUT),
|
||||
CONFIG_KEY(CACHE_DIR)};
|
||||
namespace {
|
||||
ov::auto_batch_timeout.name(),
|
||||
ov::cache_dir.name()};
|
||||
OPENVINO_SUPPRESS_DEPRECATED_END
|
||||
|
||||
std::map<std::string, std::string> mergeConfigs(std::map<std::string, std::string> config,
|
||||
const std::map<std::string, std::string>& user_config) {
|
||||
inline ov::AnyMap merge_properties(ov::AnyMap config, const ov::AnyMap& user_config) {
|
||||
for (auto&& kvp : user_config) {
|
||||
config[kvp.first] = kvp.second;
|
||||
}
|
||||
return config;
|
||||
}
|
||||
|
||||
} // namespace
|
||||
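merge_properties() above gives per-call properties precedence over the plugin-level configuration. A tiny illustration of that precedence with made-up values:

#include <openvino/core/any.hpp>
#include <openvino/runtime/properties.hpp>

ov::AnyMap example_merge() {
    ov::AnyMap plugin_defaults = {ov::auto_batch_timeout(1000)};
    ov::AnyMap user_config = {ov::auto_batch_timeout(50)};
    // Same loop as merge_properties(): user entries overwrite the defaults.
    for (auto&& kvp : user_config)
        plugin_defaults[kvp.first] = kvp.second;
    return plugin_defaults;  // auto_batch_timeout is now 50
}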
|
||||
DeviceInformation Plugin::ParseBatchDevice(const std::string& deviceWithBatch) {
|
||||
auto&& d = deviceWithBatch;
|
||||
auto openingBracket = d.find_first_of('(');
|
||||
auto closingBracket = d.find_first_of(')', openingBracket);
|
||||
auto deviceName = d.substr(0, openingBracket);
|
||||
DeviceInformation Plugin::parse_batch_device(const std::string& device_with_batch) {
|
||||
auto openingBracket = device_with_batch.find_first_of('(');
|
||||
auto closingBracket = device_with_batch.find_first_of(')', openingBracket);
|
||||
auto deviceName = device_with_batch.substr(0, openingBracket);
|
||||
|
||||
int batch = 0;
|
||||
if (closingBracket != std::string::npos && openingBracket < closingBracket) {
|
||||
batch = std::stol(d.substr(openingBracket + 1, closingBracket - 1));
|
||||
batch = std::stol(device_with_batch.substr(openingBracket + 1, closingBracket - 1));
|
||||
|
||||
if (batch <= 0) {
|
||||
IE_THROW() << "Batch value for '" << deviceName << "' must be > 0, while " << batch << "is passed";
|
||||
OPENVINO_THROW("Batch value for '", deviceName, "' must be > 0, while ", batch, "is passed");
|
||||
}
|
||||
}
|
||||
return {deviceName, {{}}, batch};
|
||||
return {deviceName, {{}}, static_cast<uint32_t>(batch)};
|
||||
}
|
||||
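parse_batch_device() above accepts device strings of the form "GPU(4)". A standalone sketch of the same parsing logic, kept independent of the plugin types (the return convention of 0 meaning "batch not specified" matches the function above):

#include <cstdint>
#include <stdexcept>
#include <string>
#include <utility>

// Returns {device name, requested batch size}; 0 means the batch was not specified.
std::pair<std::string, uint32_t> parse_batch_device_example(const std::string& device_with_batch) {
    auto opening = device_with_batch.find_first_of('(');
    auto closing = device_with_batch.find_first_of(')', opening);
    std::string device = device_with_batch.substr(0, opening);
    uint32_t batch = 0;
    if (closing != std::string::npos && opening < closing) {
        long value = std::stol(device_with_batch.substr(opening + 1, closing - opening - 1));
        if (value <= 0)
            throw std::runtime_error("Batch value for '" + device + "' must be > 0");
        batch = static_cast<uint32_t>(value);
    }
    return {device, batch};
}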
|
||||
DeviceInformation Plugin::ParseMetaDevice(const std::string& devicesBatchCfg,
|
||||
const std::map<std::string, std::string>& user_config) const {
|
||||
auto metaDevice = ParseBatchDevice(devicesBatchCfg);
|
||||
metaDevice.config = GetCore()->GetSupportedConfig(metaDevice.device_name, user_config);
|
||||
|
||||
DeviceInformation Plugin::parse_meta_device(const std::string& devices_batch_config,
|
||||
const ov::AnyMap& user_config) const {
|
||||
auto meta_device = parse_batch_device(devices_batch_config);
|
||||
meta_device.device_config = get_core()->get_supported_property(meta_device.device_name, user_config);
|
||||
// check that no irrelevant config-keys left
|
||||
for (const auto& k : user_config) {
|
||||
const auto& name = k.first;
|
||||
if (metaDevice.config.find(name) == metaDevice.config.end() &&
|
||||
if (meta_device.device_config.find(name) == meta_device.device_config.end() &&
|
||||
!ov::util::contains(supported_configKeys, name)) {
|
||||
IE_THROW() << "Unsupported config key: " << name;
|
||||
OPENVINO_THROW("Unsupported config key: ", name);
|
||||
}
|
||||
}
|
||||
return metaDevice;
|
||||
return meta_device;
|
||||
}
|
||||
|
||||
InferenceEngine::RemoteContext::Ptr Plugin::CreateContext(const InferenceEngine::ParamMap& remote_properties) {
|
||||
auto cfg = remote_properties;
|
||||
auto it = cfg.find(CONFIG_KEY(AUTO_BATCH_DEVICE_CONFIG));
|
||||
if (it == cfg.end())
|
||||
it = cfg.find(ov::device::priorities.name());
|
||||
if (it == cfg.end())
|
||||
IE_THROW() << "Value for KEY_AUTO_BATCH_DEVICE_CONFIG is not set";
|
||||
ov::SoPtr<ov::IRemoteContext> Plugin::create_context(const ov::AnyMap& remote_properties) const {
|
||||
auto full_properties = remote_properties;
|
||||
OPENVINO_SUPPRESS_DEPRECATED_START
|
||||
auto it = full_properties.find(CONFIG_KEY(AUTO_BATCH_DEVICE_CONFIG));
|
||||
OPENVINO_SUPPRESS_DEPRECATED_END
|
||||
if (it == full_properties.end())
|
||||
it = full_properties.find(ov::device::priorities.name());
|
||||
if (it == full_properties.end())
|
||||
OPENVINO_THROW("Value for ov::device::priorities is not set");
|
||||
|
||||
auto val = it->second.as<std::string>();
|
||||
auto core = GetCore();
|
||||
if (!core)
|
||||
return nullptr;
|
||||
auto metaDevice = ParseMetaDevice(val, std::map<std::string, std::string>());
|
||||
cfg.erase(it);
|
||||
return core->CreateContext(metaDevice.device_name, cfg);
|
||||
auto metaDevice = parse_meta_device(val, ov::AnyMap());
|
||||
full_properties.erase(it);
|
||||
return get_core()->create_context(metaDevice.device_name, full_properties);
|
||||
}
|
||||
|
||||
InferenceEngine::Parameter Plugin::GetConfig(
|
||||
const std::string& name,
|
||||
const std::map<std::string, InferenceEngine::Parameter>& user_options) const {
|
||||
ov::Any Plugin::get_property(const std::string& name, const ov::AnyMap& arguments) const {
|
||||
OPENVINO_SUPPRESS_DEPRECATED_START
|
||||
if (supported_configKeys.end() != std::find(supported_configKeys.begin(), supported_configKeys.end(), name)) {
|
||||
auto it = _config.find(name);
|
||||
if (it == _config.end()) {
|
||||
IE_THROW() << "Value for " << name << " is not set";
|
||||
auto it = m_plugin_config.find(name);
|
||||
if (it == m_plugin_config.end()) {
|
||||
OPENVINO_THROW("The Value is not set for ", name);
|
||||
} else {
|
||||
return {it->second};
|
||||
}
|
||||
} else {
|
||||
IE_THROW() << "Unsupported config key: " << name;
|
||||
}
|
||||
}
|
||||
|
||||
void Plugin::CheckConfig(const std::map<std::string, std::string>& user_config) {
|
||||
for (auto&& kvp : user_config) {
|
||||
const auto name = kvp.first;
|
||||
const auto val = kvp.second;
|
||||
if (supported_configKeys.end() == std::find(supported_configKeys.begin(), supported_configKeys.end(), name))
|
||||
IE_THROW() << "Unsupported config key: " << name;
|
||||
if (name == CONFIG_KEY(AUTO_BATCH_DEVICE_CONFIG) || name == ov::device::priorities.name()) {
|
||||
ParseBatchDevice(val);
|
||||
} else if (name == CONFIG_KEY(AUTO_BATCH_TIMEOUT)) {
|
||||
try {
|
||||
auto t = std::stoi(val);
|
||||
if (t < 0)
|
||||
IE_THROW(ParameterMismatch);
|
||||
} catch (const std::exception&) {
|
||||
IE_THROW(ParameterMismatch)
|
||||
<< " Expecting unsigned int value for " << CONFIG_KEY(AUTO_BATCH_TIMEOUT) << " got " << val;
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
void Plugin::SetConfig(const std::map<std::string, std::string>& user_config) {
|
||||
CheckConfig(user_config);
|
||||
for (auto&& kvp : user_config) {
|
||||
_config[kvp.first] = kvp.second;
|
||||
}
|
||||
}
|
||||
|
||||
static const InferenceEngine::Version version = {{2, 1}, CI_BUILD_NUMBER, "AutoBatchPlugin"};
|
||||
IE_DEFINE_PLUGIN_CREATE_FUNCTION(Plugin, version)
|
||||
|
||||
Plugin::Plugin() {
|
||||
_pluginName = "BATCH";
|
||||
_config[CONFIG_KEY(AUTO_BATCH_TIMEOUT)] = "1000"; // default value, in ms
|
||||
}
|
||||
|
||||
InferenceEngine::Parameter Plugin::GetMetric(
|
||||
const std::string& name,
|
||||
const std::map<std::string, InferenceEngine::Parameter>& user_options) const {
|
||||
if (name == METRIC_KEY(SUPPORTED_METRICS)) {
|
||||
std::vector<std::string> metrics;
|
||||
metrics.push_back(METRIC_KEY(SUPPORTED_METRICS));
|
||||
metrics.push_back(METRIC_KEY(FULL_DEVICE_NAME));
|
||||
metrics.push_back(METRIC_KEY(SUPPORTED_CONFIG_KEYS));
|
||||
IE_SET_METRIC_RETURN(SUPPORTED_METRICS, metrics);
|
||||
} else if (name == METRIC_KEY(SUPPORTED_METRICS)) {
|
||||
return std::vector<std::string>{METRIC_KEY(SUPPORTED_METRICS),
|
||||
ov::device::full_name.name(),
|
||||
METRIC_KEY(SUPPORTED_CONFIG_KEYS)};
|
||||
} else if (name == ov::supported_properties.name()) {
|
||||
return std::vector<ov::PropertyName>{
|
||||
ov::PropertyName{ov::supported_properties.name(), ov::PropertyMutability::RO},
|
||||
ov::PropertyName{ov::device::full_name.name(), ov::PropertyMutability::RO}};
|
||||
} else if (name == ov::internal::supported_properties.name()) {
|
||||
return decltype(ov::internal::supported_properties)::value_type{};
|
||||
} else if (name == METRIC_KEY(FULL_DEVICE_NAME)) {
|
||||
IE_SET_METRIC_RETURN(FULL_DEVICE_NAME, _pluginName);
|
||||
} else if (name == ov::device::full_name.name()) {
|
||||
return get_device_name();
|
||||
} else if (name == METRIC_KEY(SUPPORTED_CONFIG_KEYS)) {
|
||||
IE_SET_METRIC_RETURN(SUPPORTED_CONFIG_KEYS, supported_configKeys);
|
||||
return supported_configKeys;
|
||||
} else {
|
||||
IE_THROW(NotFound) << "Unsupported metric key " << name;
|
||||
OPENVINO_THROW("Unsupported property: ", name);
|
||||
}
|
||||
OPENVINO_SUPPRESS_DEPRECATED_END
|
||||
}
|
||||
|
||||
void Plugin::set_property(const ov::AnyMap& properties) {
|
||||
for (auto&& c : properties) {
|
||||
const auto& name = c.first;
|
||||
const auto& val = c.second;
|
||||
if (supported_configKeys.end() == std::find(supported_configKeys.begin(), supported_configKeys.end(), name))
|
||||
OPENVINO_THROW("Unsupported config key: ", name);
|
||||
OPENVINO_SUPPRESS_DEPRECATED_START
|
||||
if (name == CONFIG_KEY(AUTO_BATCH_DEVICE_CONFIG) || name == ov::device::priorities.name()) {
|
||||
parse_batch_device(val.as<std::string>());
|
||||
} else if (name == ov::auto_batch_timeout.name()) {
|
||||
try {
|
||||
auto t = val.as<uint32_t>();
|
||||
if (t < 0)
|
||||
OPENVINO_THROW("The value for ", ov::auto_batch_timeout.name(), " should > 0, which is ", t);
|
||||
} catch (const std::exception&) {
|
||||
OPENVINO_THROW(" Expecting unsigned int value for ",
|
||||
ov::auto_batch_timeout.name(),
|
||||
" got ",
|
||||
val.as<uint32_t>());
|
||||
}
|
||||
}
|
||||
OPENVINO_SUPPRESS_DEPRECATED_END
|
||||
m_plugin_config[name] = val;
|
||||
}
|
||||
}
|
||||
|
||||
InferenceEngine::IExecutableNetworkInternal::Ptr Plugin::LoadExeNetworkImpl(
|
||||
const InferenceEngine::CNNNetwork& network,
|
||||
const std::map<std::string, std::string>& user_config) {
|
||||
return LoadNetworkImpl(network, nullptr, user_config);
|
||||
static const ov::Version version = {CI_BUILD_NUMBER, "openvino_auto_batch_plugin"};
|
||||
OV_DEFINE_PLUGIN_CREATE_FUNCTION(Plugin, version)
|
||||
|
||||
Plugin::Plugin() {
|
||||
set_device_name("BATCH");
|
||||
m_plugin_config.insert(ov::auto_batch_timeout(1000)); // default value (ms)
|
||||
}
|
||||
|
||||
InferenceEngine::IExecutableNetworkInternal::Ptr Plugin::LoadNetworkImpl(
|
||||
const InferenceEngine::CNNNetwork& network,
|
||||
const std::shared_ptr<InferenceEngine::RemoteContext> ctx,
|
||||
const std::map<std::string, std::string>& user_config) {
|
||||
auto core = GetCore();
|
||||
std::shared_ptr<ov::ICompiledModel> Plugin::compile_model(const std::shared_ptr<const ov::Model>& model,
|
||||
const ov::AnyMap& properties) const {
|
||||
return compile_model(model, properties, {});
|
||||
}
|
||||
|
||||
std::shared_ptr<ov::ICompiledModel> Plugin::compile_model(const std::shared_ptr<const ov::Model>& model,
|
||||
const ov::AnyMap& properties,
|
||||
const ov::SoPtr<ov::IRemoteContext>& context) const {
|
||||
auto core = get_core();
|
||||
if (core == nullptr) {
|
||||
IE_THROW() << "Please, work with Auto-Batching device via InferencEngine::Core object";
|
||||
OPENVINO_THROW("Please, work with Auto-Batching device via InferencEngine::Core object");
|
||||
}
|
||||
auto fullConfig = mergeConfigs(_config, user_config);
|
||||
auto device_batch = fullConfig.find(CONFIG_KEY(AUTO_BATCH_DEVICE_CONFIG));
|
||||
if (device_batch == fullConfig.end())
|
||||
device_batch = fullConfig.find(ov::device::priorities.name());
|
||||
if (device_batch == fullConfig.end()) {
|
||||
IE_THROW() << "KEY_AUTO_BATCH key is not set for BATCH device";
|
||||
|
||||
// merge configs from func properties and m_plugin_config
|
||||
auto full_properties = merge_properties(m_plugin_config, properties);
|
||||
OPENVINO_SUPPRESS_DEPRECATED_START
|
||||
auto device_batch = full_properties.find(CONFIG_KEY(AUTO_BATCH_DEVICE_CONFIG));
|
||||
if (device_batch == full_properties.end())
|
||||
device_batch = full_properties.find(ov::device::priorities.name());
|
||||
if (device_batch == full_properties.end()) {
|
||||
OPENVINO_THROW("ov::device::priorities key for AUTO NATCH is not set for BATCH device");
|
||||
}
|
||||
auto metaDevice = ParseMetaDevice(device_batch->second, user_config);
|
||||
const auto& deviceName = metaDevice.device_name;
|
||||
const auto& deviceConfig = metaDevice.config;
|
||||
auto deviceConfigNoAutoBatch = deviceConfig;
|
||||
OPENVINO_SUPPRESS_DEPRECATED_END
|
||||
auto meta_device = parse_meta_device(device_batch->second.as<std::string>(), properties);
|
||||
|
||||
const auto& device_name = meta_device.device_name;
|
||||
const auto& device_config = meta_device.device_config;
|
||||
auto device_config_no_auto_batch = device_config;
|
||||
// avoid recursive auto-batching
|
||||
deviceConfigNoAutoBatch[CONFIG_KEY(ALLOW_AUTO_BATCHING)] = CONFIG_VALUE(NO);
|
||||
device_config_no_auto_batch[ov::hint::allow_auto_batching.name()] = false;
|
||||
|
||||
std::set<std::string> batched_inputs;
|
||||
std::set<std::string> batched_outputs;
|
||||
// check that the auto-batching is applicable in general
|
||||
try {
|
||||
// if applicable, the Auto-Batching is implicitly enabled via the performance hints
|
||||
const auto tput = CONFIG_VALUE(THROUGHPUT);
|
||||
const bool bTputInPlg = core->GetConfig(deviceName, CONFIG_KEY(PERFORMANCE_HINT)).as<std::string>() == tput;
|
||||
const auto& mode = deviceConfig.find(CONFIG_KEY(PERFORMANCE_HINT));
|
||||
const bool bTputInLoadCfg = (mode != deviceConfig.end() && mode->second == tput);
|
||||
const bool enable_tput_plugin =
|
||||
core->get_property(device_name, ov::hint::performance_mode) == ov::hint::PerformanceMode::THROUGHPUT;
|
||||
const auto& performance_mode = device_config.find(ov::hint::performance_mode.name());
|
||||
const bool enable_tput_cfg = (performance_mode != device_config.end() &&
|
||||
performance_mode->second == ov::hint::PerformanceMode::THROUGHPUT);
|
||||
// if the auto-batching is enabled implicitly, check the dims carefully, to avoid outstanding failures
|
||||
const bool check_dims = (bTputInPlg || bTputInLoadCfg);
|
||||
InferenceEngine::CNNNetwork clonedNetwork(InferenceEngine::details::cloneNetwork(network));
|
||||
auto function = clonedNetwork.getFunction();
|
||||
const bool check_dims = (enable_tput_plugin || enable_tput_cfg);
|
||||
// find the batch dim
|
||||
ov::pass::Manager m;
|
||||
m.register_pass<ov::pass::InitNodeInfo>();
|
||||
m.register_pass<ov::pass::FindBatch>(false, check_dims);
|
||||
m.run_passes(function);
|
||||
auto cloned_model = model->clone();
|
||||
ov::pass::Manager pass_manager;
|
||||
pass_manager.register_pass<ov::pass::InitNodeInfo>();
|
||||
pass_manager.register_pass<ov::pass::FindBatch>(false, check_dims);
|
||||
pass_manager.run_passes(cloned_model);
|
||||
// do not reshape/re-batch originally batched networks and when there are no inputs with the N* layouts
|
||||
// input(s) should have the batch dim as the first dim (current limitation of the auto-batching impl)
|
||||
const auto& params = function->get_parameters();
|
||||
const auto& params = cloned_model->get_parameters();
|
||||
for (size_t input_id = 0; input_id < params.size(); input_id++) {
|
||||
const auto& input = params[input_id];
|
||||
const auto& shape = input->get_partial_shape();
|
||||
// currently no plugin support batched execution for dynamic networks
|
||||
if (shape.is_dynamic())
|
||||
IE_THROW(NotImplemented) << "Auto-batching does not support dynamic networks!";
|
||||
OPENVINO_THROW("Auto-batching does not support dynamic networks!");
|
||||
// check the batch dim: either 0th (and the original batch size of 1) or none
|
||||
if (shape.size() && ov::DimensionTracker::get_label(shape[0])) {
|
||||
const auto& static_shape = input->get_shape();
|
||||
if (static_shape[0] != 1)
|
||||
IE_THROW(NotImplemented) << "Auto-batching does not reshape/re-batch originally batched networks!";
|
||||
OPENVINO_THROW("Auto-batching does not reshape/re-batch originally batched networks!");
|
||||
batched_inputs.insert(
|
||||
ov::op::util::get_ie_output_name(params[input_id]->output(0))); // batched dim for the input
|
||||
} else {
|
||||
// if the 0-th dim is not for the batch, then we support only the case when NONE dimension is batch
|
||||
for (size_t s = 1; s < shape.size(); s++)
|
||||
if (ov::DimensionTracker::get_label(shape[s]))
|
||||
IE_THROW(NotImplemented)
|
||||
<< "Auto-batching operates only networks with inputs/outputs batched by 0th dimension";
|
||||
OPENVINO_THROW(
|
||||
"Auto-batching operates only networks with inputs/outputs batched by 0th dimension");
|
||||
}
|
||||
}
|
||||
const auto& results = function->get_results();
|
||||
const auto& results = cloned_model->get_results();
|
||||
for (size_t output_id = 0; output_id < results.size(); output_id++) {
|
||||
const auto& output = results[output_id];
|
||||
const auto& shape = output->get_output_partial_shape(0);
|
||||
if (shape.is_dynamic())
|
||||
IE_THROW(NotImplemented) << "Auto-batching does not support dynamic networks!";
|
||||
OPENVINO_THROW("Auto-batching does not support dynamic networks!");
|
||||
// check the batch dim: either 0th (and the original batch size of 1) or none
|
||||
if (shape.size() && ov::DimensionTracker::get_label(shape[0])) {
|
||||
if (shape[0] != 1)
|
||||
IE_THROW(NotImplemented) << "Auto-batching does not reshape/re-batch originally batched networks!";
|
||||
OPENVINO_THROW("Auto-batching does not reshape/re-batch originally batched networks!");
|
||||
const auto& node = output->input_value(0);
|
||||
batched_outputs.insert(
|
||||
ov::op::util::get_ie_output_name(ov::Output<const ov::Node>(node.get_node(), node.get_index())));
|
||||
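The batched-input detection above relies on the FindBatch pass labelling the batch dimension of the cloned model. A condensed sketch of that detection step, mirroring the calls in this hunk (same internal headers, error handling omitted, assumes the model has at least one parameter):

#include <openvino/core/dimension_tracker.hpp>
#include <openvino/core/model.hpp>
#include <openvino/pass/manager.hpp>
#include "transformations/common_optimizations/dimension_tracking.hpp"
#include "transformations/init_node_info.hpp"

bool first_input_is_batched(const std::shared_ptr<ov::Model>& model, bool check_dims) {
    auto cloned = model->clone();
    ov::pass::Manager manager;
    manager.register_pass<ov::pass::InitNodeInfo>();
    manager.register_pass<ov::pass::FindBatch>(false, check_dims);  // labels the batch dimension, if any
    manager.run_passes(cloned);
    const auto& shape = cloned->get_parameters().front()->get_partial_shape();
    // The 0th dimension carries a label only when FindBatch identified it as the batch dim.
    return shape.size() && ov::DimensionTracker::get_label(shape[0]) != 0;
}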
@ -247,116 +236,177 @@ InferenceEngine::IExecutableNetworkInternal::Ptr Plugin::LoadNetworkImpl(
|
||||
// if the 0-th dim is not for the batch, then we support only the case when NONE dimension is batch
|
||||
for (size_t s = 1; s < shape.size(); s++)
|
||||
if (ov::DimensionTracker::get_label(shape[s]))
|
||||
IE_THROW(NotImplemented)
|
||||
<< "Auto-batching operates only networks with outputs batched by 0th dimension";
|
||||
OPENVINO_THROW("Auto-batching operates only networks with outputs batched by 0th dimension");
|
||||
}
|
||||
}
|
||||
if (!batched_inputs.size() || !batched_outputs.size())
|
||||
IE_THROW(NotImplemented)
|
||||
<< "Auto-batching supports only networks with inputs/outputs featuring batched dim!";
|
||||
} catch (const InferenceEngine::Exception&) {
|
||||
metaDevice.batch_for_device = 1;
|
||||
OPENVINO_THROW("Auto-batching supports only networks with inputs/outputs featuring batched dim!");
|
||||
} catch (const ov::Exception&) {
|
||||
meta_device.device_batch_size = 1;
|
||||
}
|
||||
|
||||
if (!metaDevice.batch_for_device) {
|
||||
unsigned int requests = 0;
|
||||
if (!meta_device.device_batch_size) {
|
||||
// batch size is not set explicitly via device name e.g. BATCH:GPU(4)
|
||||
// let's query the optimal batch size
|
||||
std::map<std::string, InferenceEngine::Parameter> options;
|
||||
options["MODEL_PTR"] = std::const_pointer_cast<ngraph::Function>(network.getFunction());
|
||||
auto optBatchSize = core->GetMetric(deviceName, METRIC_KEY(OPTIMAL_BATCH_SIZE), options).as<unsigned int>();
|
||||
auto res = core->GetConfig(deviceName, CONFIG_KEY(PERFORMANCE_HINT_NUM_REQUESTS)).as<std::string>();
|
||||
requests = InferenceEngine::PerfHintsConfig::CheckPerformanceHintRequestValue(res);
|
||||
const auto& reqs = user_config.find(CONFIG_KEY(PERFORMANCE_HINT_NUM_REQUESTS));
|
||||
if (reqs != user_config.end())
|
||||
requests = static_cast<unsigned int>(
|
||||
InferenceEngine::PerfHintsConfig::CheckPerformanceHintRequestValue(reqs->second));
|
||||
// auto cloned_model = model->clone();
|
||||
ov::AnyMap options = {ov::hint::model(std::const_pointer_cast<ov::Model>(model))};
|
||||
unsigned int opt_batch_size = core->get_property(device_name, ov::optimal_batch_size, options);
|
||||
auto requests = core->get_property(device_name, ov::hint::num_requests);
|
||||
const auto& reqs = properties.find(ov::hint::num_requests.name());
|
||||
if (reqs != properties.end())
|
||||
requests = reqs->second.as<unsigned int>();
|
||||
if (requests)
|
||||
optBatchSize = std::max(1u, std::min(requests, optBatchSize));
|
||||
if (optBatchSize > 2) // batching is usually in-efficient for batch<4 (as batch1 kernels are heavily optimized)
|
||||
metaDevice.batch_for_device = optBatchSize;
|
||||
opt_batch_size = std::max(1u, std::min(requests, opt_batch_size));
|
||||
if (opt_batch_size >
|
||||
2) // batching is usually in-efficient for batch<4 (as batch1 kernels are heavily optimized)
|
||||
meta_device.device_batch_size = opt_batch_size;
|
||||
else
|
||||
metaDevice.batch_for_device = 1;
|
||||
meta_device.device_batch_size = 1;
|
||||
}
|
||||
|
||||
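Editor's note: the batch size chosen above is the device-reported optimal batch size, clamped by the application's ov::hint::num_requests and dropped back to 1 when batching would not pay off. A minimal standalone sketch of that decision (plain C++, no OpenVINO dependency; the threshold of 2 mirrors the comment in the diff):

// Example (not part of the PR): mirrors the clamping logic above.
// opt_batch_size comes from ov::optimal_batch_size, requests from ov::hint::num_requests (0 == unset).
#include <algorithm>
#include <cstdint>
#include <iostream>

uint32_t choose_device_batch(uint32_t opt_batch_size, uint32_t requests) {
    if (requests)  // never create a batch larger than the number of requests the app will run
        opt_batch_size = std::max(1u, std::min(requests, opt_batch_size));
    // batch 2 and below is usually not worth it: batch-1 kernels are heavily optimized
    return opt_batch_size > 2 ? opt_batch_size : 1;
}

int main() {
    std::cout << choose_device_batch(32, 0) << "\n";  // 32
    std::cout << choose_device_batch(32, 4) << "\n";  // 4
    std::cout << choose_device_batch(32, 2) << "\n";  // 1 (too small to batch)
}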
auto report_footprint = [](std::shared_ptr<InferenceEngine::ICore> pCore, std::string device) -> size_t {
auto report_footprint = [](std::shared_ptr<ICore> pCore, std::string device) -> size_t {
size_t footprint = 0;
// TODO: use the per-network metric (22.2) rather than plugin-level
auto stats =
pCore->GetMetric(device, ov::intel_gpu::memory_statistics.name()).as<std::map<std::string, uint64_t>>();
// TODO: use the per-model metric (22.2) rather than plugin-level
auto stats = pCore->get_property(device, ov::intel_gpu::memory_statistics);
for (const auto& s : stats)
footprint += s.second;
return footprint;
};

size_t batch1_footprint = 0;
if (deviceName.find("GPU") != std::string::npos)
batch1_footprint = report_footprint(core, deviceName);
auto executableNetworkWithoutBatch = ctx ? core->LoadNetwork(network, ctx, deviceConfigNoAutoBatch)
: core->LoadNetwork(network, deviceName, deviceConfigNoAutoBatch);
if (deviceName.find("GPU") != std::string::npos) {
batch1_footprint = report_footprint(core, deviceName) - batch1_footprint;
if (device_name.find("GPU") != std::string::npos)
batch1_footprint = report_footprint(core, device_name);
auto compiled_model_without_batch = context ? core->compile_model(model, context, device_config_no_auto_batch)
: core->compile_model(model, device_name, device_config_no_auto_batch);
if (device_name.find("GPU") != std::string::npos) {
batch1_footprint = report_footprint(core, device_name) - batch1_footprint;
if (batch1_footprint) {
const auto total_mem =
GetCore()->GetMetric(deviceName, GPU_METRIC_KEY(DEVICE_TOTAL_MEM_SIZE)).as<uint64_t>();
const auto total_mem = core->get_property(device_name, ov::intel_gpu::device_total_mem_size);
const int estimated_batch = static_cast<int>((total_mem - batch1_footprint) / batch1_footprint);
int closest = static_cast<int>(pow(2, floor(std::log(estimated_batch) / std::log(2))));
closest = std::max(1, closest);
metaDevice.batch_for_device = std::min(metaDevice.batch_for_device, closest);
meta_device.device_batch_size = std::min(static_cast<int>(meta_device.device_batch_size), closest);
}
}
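Editor's note: for GPU targets the code above measures the memory footprint of a batch-1 compilation and derives an upper bound on the batch from the remaining device memory, rounded down to a power of two. A standalone sketch of that arithmetic (illustrative only, not the plugin's API):

// Example (not part of the PR): power-of-two estimate of how many batch-1 footprints still fit in device memory.
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <iostream>

int estimate_max_batch(uint64_t total_mem, uint64_t batch1_footprint) {
    const int estimated = static_cast<int>((total_mem - batch1_footprint) / batch1_footprint);
    const int closest = static_cast<int>(std::pow(2, std::floor(std::log2(std::max(estimated, 1)))));
    return std::max(1, closest);
}

int main() {
    // 8 GiB of device memory, ~300 MiB per batch-1 compilation: estimate 26, rounded down to 16
    std::cout << estimate_max_batch(8ull << 30, 300ull << 20) << "\n";
}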
// auto-batch settings
std::unordered_map<std::string, InferenceEngine::Parameter> networkConfig;
for (const auto& c : fullConfig) {
ov::AnyMap compiled_model_config;
for (const auto& c : full_properties) {
if (supported_configKeys.end() != std::find(supported_configKeys.begin(), supported_configKeys.end(), c.first))
networkConfig.insert(c);
compiled_model_config.insert(c);
}

InferenceEngine::SoExecutableNetworkInternal executableNetworkWithBatch;
if (metaDevice.batch_for_device > 1 && batched_inputs.size()) {
ov::SoPtr<ov::ICompiledModel> compiled_model_with_batch;
auto reshaped = model->clone();
if (meta_device.device_batch_size > 1 && batched_inputs.size()) {
try {
InferenceEngine::CNNNetwork reshaped(InferenceEngine::details::cloneNetwork(network));
InferenceEngine::ICNNNetwork::InputShapes shapes = reshaped.getInputShapes();
for (const auto& input : batched_inputs)
shapes[input][0] = metaDevice.batch_for_device;
reshaped.reshape(shapes);
executableNetworkWithBatch = ctx ? core->LoadNetwork(reshaped, ctx, deviceConfigNoAutoBatch)
: core->LoadNetwork(reshaped, deviceName, deviceConfigNoAutoBatch);
} catch (const InferenceEngine::Exception&) {
metaDevice.batch_for_device = 1;
auto inputs = reshaped->inputs();
std::map<ov::Output<ov::Node>, ov::PartialShape> partial_shapes;
for (auto& input : inputs) {
auto input_shape = input.get_shape();
if (batched_inputs.find(ov::op::util::get_ie_output_name(input)) != batched_inputs.end()) {
input_shape[0] = meta_device.device_batch_size;
}
partial_shapes.insert({input, ov::PartialShape(input_shape)});
}

reshaped->reshape(partial_shapes);

OPENVINO_SUPPRESS_DEPRECATED_START
for (auto&& input : reshaped->inputs()) {
auto& rt_info = input.get_rt_info();
auto it = rt_info.find("ie_legacy_td");
if (it != rt_info.end()) {
auto td = it->second.as<InferenceEngine::TensorDesc>();
rt_info["ie_legacy_td"] =
InferenceEngine::TensorDesc(td.getPrecision(), input.get_shape(), td.getLayout());
}
}
for (auto&& result : reshaped->get_results()) {
auto output = result->input_value(0);
auto& rt_info = output.get_rt_info();
auto it = rt_info.find("ie_legacy_td");
if (it != rt_info.end()) {
auto td = it->second.as<InferenceEngine::TensorDesc>();
rt_info["ie_legacy_td"] =
InferenceEngine::TensorDesc(td.getPrecision(), output.get_shape(), td.getLayout());
}
}
OPENVINO_SUPPRESS_DEPRECATED_END

compiled_model_with_batch = context
? core->compile_model(reshaped, context, device_config_no_auto_batch)
: core->compile_model(reshaped, device_name, device_config_no_auto_batch);
} catch (const ov::Exception&) {
meta_device.device_batch_size = 1;
}
}
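Editor's note: only the dimensions recorded in batched_inputs are bumped to the chosen device batch; all other shapes are preserved, and the legacy "ie_legacy_td" rt_info is refreshed so the reshaped model keeps a consistent TensorDesc. From an application point of view, the same reshape can be done before compilation with the map-based ov::Model::reshape overload used above; a minimal sketch (the model path is hypothetical):

// Example (not part of the PR): reshape the 0th (batch) dimension of every input to 4.
#include <map>
#include <openvino/openvino.hpp>

int main() {
    ov::Core core;
    auto model = core.read_model("model.xml");  // hypothetical path

    std::map<ov::Output<ov::Node>, ov::PartialShape> new_shapes;
    for (auto& input : model->inputs()) {
        ov::PartialShape shape = input.get_partial_shape();
        shape[0] = 4;  // only the batch dimension changes
        new_shapes.emplace(input, shape);
    }
    model->reshape(new_shapes);

    auto compiled = core.compile_model(model, "CPU");
    (void)compiled;
}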
return std::make_shared<CompiledModel>(executableNetworkWithBatch,
executableNetworkWithoutBatch,
metaDevice,
networkConfig,
ov::SoPtr<ov::IRemoteContext> device_context;
if (!context) {
OPENVINO_SUPPRESS_DEPRECATED_START
try {
device_context = compiled_model_without_batch->get_context();
if (!device_context._so)
device_context._so = compiled_model_without_batch._so;
} catch (const ov::NotImplemented&) {
} catch (const InferenceEngine::NotImplemented&) {
}
OPENVINO_SUPPRESS_DEPRECATED_END
} else {
device_context = context;
}

return std::make_shared<CompiledModel>(model->clone(),
shared_from_this(),
compiled_model_config,
meta_device,
batched_inputs,
batched_outputs);
batched_outputs,
compiled_model_with_batch,
compiled_model_without_batch,
device_context);
}
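Editor's note: the `_so` assignments matter. ov::SoPtr pairs an interface pointer with the shared-object handle of the plugin that created it, so the hardware plugin's library cannot be unloaded while the context (or a tensor) obtained from it is still referenced. A reduced sketch of that idea (hypothetical names, not the actual ov::SoPtr definition):

// Example (not part of the PR): keep the shared library handle alive
// for as long as any object created by that library is referenced.
#include <memory>

template <typename T>
struct SoHolder {
    std::shared_ptr<T> ptr;    // object implemented inside the plugin library
    std::shared_ptr<void> so;  // opaque handle that keeps the .so/.dll loaded

    T* operator->() const { return ptr.get(); }
};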
InferenceEngine::IExecutableNetworkInternal::Ptr Plugin::LoadExeNetworkImpl(
const InferenceEngine::CNNNetwork& network,
const std::shared_ptr<InferenceEngine::RemoteContext>& context,
const std::map<std::string, std::string>& user_config) {
return LoadNetworkImpl(network, context, user_config);
}

InferenceEngine::QueryNetworkResult Plugin::QueryNetwork(const InferenceEngine::CNNNetwork& network,
const std::map<std::string, std::string>& user_config) const {
auto core = GetCore();
if (!core)
return InferenceEngine::QueryNetworkResult();
auto cfg = user_config;
ov::SupportedOpsMap Plugin::query_model(const std::shared_ptr<const ov::Model>& model,
const ov::AnyMap& properties) const {
OPENVINO_ASSERT(model, "OpenVINO Model is empty!");
OPENVINO_ASSERT(get_core(), "Core is missing!");
auto cfg = properties;
for (const auto& c : cfg) {
OPENVINO_SUPPRESS_DEPRECATED_START
if (c.first == CONFIG_KEY(AUTO_BATCH_DEVICE_CONFIG) || c.first == ov::device::priorities.name()) {
auto val = c.second;
cfg.erase(c.first);
auto metaDevice = ParseMetaDevice(val, cfg);
return core->QueryNetwork(network, metaDevice.device_name, cfg);
auto metaDevice = parse_meta_device(val.as<std::string>(), cfg);
return get_core()->query_model(model, metaDevice.device_name, cfg);
}
OPENVINO_SUPPRESS_DEPRECATED_END
}
IE_THROW() << "Value for KEY_AUTO_BATCH_DEVICE_CONFIG is not set";
OPENVINO_THROW("Value for ov::device::priorities for AUTO BATCH PLUGIN is not set");
}

ov::SoPtr<ov::IRemoteContext> Plugin::get_default_context(const ov::AnyMap& remote_properties) const {
OPENVINO_SUPPRESS_DEPRECATED_START
auto it = remote_properties.find(CONFIG_KEY(AUTO_BATCH_DEVICE_CONFIG));
OPENVINO_SUPPRESS_DEPRECATED_END
if (it == remote_properties.end())
it = remote_properties.find(ov::device::priorities.name());
if (it == remote_properties.end())
OPENVINO_THROW("Value for ov::device::priorities is not set");

auto val = it->second.as<std::string>();
auto metaDevice = parse_meta_device(val, ov::AnyMap());
return get_core()->get_default_context(metaDevice.device_name);
}

std::shared_ptr<ov::ICompiledModel> Plugin::import_model(std::istream& model, const ov::AnyMap& properties) const {
OPENVINO_NOT_IMPLEMENTED;
}

std::shared_ptr<ov::ICompiledModel> Plugin::import_model(std::istream& model,
const ov::SoPtr<ov::IRemoteContext>& context,
const ov::AnyMap& properties) const {
OPENVINO_NOT_IMPLEMENTED;
}
} // namespace autobatch_plugin
} // namespace ov
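Editor's note: for reference, this is roughly how an application exercises the paths above. The meta device string follows the BATCH:<device>(<size>) convention mentioned in the diff's comments; treat the snippet as a sketch, not text taken from the PR:

// Example (not part of the PR): compiling through the auto-batch meta device.
#include <openvino/openvino.hpp>

int main() {
    ov::Core core;
    auto model = core.read_model("model.xml");  // hypothetical path

    // Explicit batch size: the plugin skips the optimal-batch query and uses 4.
    auto explicit_batch = core.compile_model(model, "BATCH:GPU(4)");

    // No explicit size: the plugin asks the device for ov::optimal_batch_size
    // and clamps it by ov::hint::num_requests, as implemented above.
    auto auto_batch = core.compile_model(model, "BATCH:GPU", ov::hint::num_requests(8));

    (void)explicit_batch;
    (void)auto_batch;
}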
@@ -7,8 +7,9 @@

#include <map>

#include "cpp_interfaces/impl/ie_executable_network_thread_safe_default.hpp"
#include "cpp_interfaces/interface/ie_iplugin_internal.hpp"
#include "ie/ie_plugin_config.hpp"
#include "openvino/runtime/iplugin.hpp"
#include "openvino/runtime/properties.hpp"

#ifdef AUTOBATCH_UNITTEST
# define autobatch_plugin mock_autobatch_plugin
@@ -19,40 +20,39 @@ namespace autobatch_plugin {

struct DeviceInformation {
std::string device_name;
std::map<std::string, std::string> config;
int batch_for_device;
ov::AnyMap device_config;
uint32_t device_batch_size;
};

class Plugin : public InferenceEngine::IInferencePlugin {
class Plugin : public ov::IPlugin {
public:
Plugin();

virtual ~Plugin() = default;

InferenceEngine::IExecutableNetworkInternal::Ptr LoadExeNetworkImpl(
const InferenceEngine::CNNNetwork& network,
const std::map<std::string, std::string>& config) override;
std::shared_ptr<ov::ICompiledModel> compile_model(const std::shared_ptr<const ov::Model>& model,
const ov::AnyMap& properties) const override;

InferenceEngine::IExecutableNetworkInternal::Ptr LoadExeNetworkImpl(
const InferenceEngine::CNNNetwork& network,
const std::shared_ptr<InferenceEngine::RemoteContext>& context,
const std::map<std::string, std::string>& config) override;
std::shared_ptr<ov::ICompiledModel> compile_model(const std::shared_ptr<const ov::Model>& model,
const ov::AnyMap& properties,
const ov::SoPtr<ov::IRemoteContext>& context) const override;

void SetConfig(const std::map<std::string, std::string>& config) override;
void set_property(const ov::AnyMap& properties) override;

void CheckConfig(const std::map<std::string, std::string>& config);
ov::Any get_property(const std::string& name, const ov::AnyMap& arguments) const override;

InferenceEngine::Parameter GetConfig(
const std::string& name,
const std::map<std::string, InferenceEngine::Parameter>& options) const override;
ov::SupportedOpsMap query_model(const std::shared_ptr<const ov::Model>& model,
const ov::AnyMap& properties) const override;

InferenceEngine::QueryNetworkResult QueryNetwork(const InferenceEngine::CNNNetwork& network,
const std::map<std::string, std::string>& config) const override;
InferenceEngine::Parameter GetMetric(
const std::string& name,
const std::map<std::string, InferenceEngine::Parameter>& options) const override;
ov::SoPtr<ov::IRemoteContext> create_context(const ov::AnyMap& remote_properties) const override;

InferenceEngine::RemoteContext::Ptr CreateContext(const InferenceEngine::ParamMap&) override;
ov::SoPtr<ov::IRemoteContext> get_default_context(const ov::AnyMap& remote_properties) const override;

std::shared_ptr<ov::ICompiledModel> import_model(std::istream& model, const ov::AnyMap& properties) const override;

std::shared_ptr<ov::ICompiledModel> import_model(std::istream& model,
const ov::SoPtr<ov::IRemoteContext>& context,
const ov::AnyMap& properties) const override;

#ifdef AUTOBATCH_UNITTEST

@@ -61,15 +61,12 @@ public:

protected:
#endif
DeviceInformation ParseMetaDevice(const std::string& devicesBatchCfg,
const std::map<std::string, std::string>& config) const;
DeviceInformation parse_meta_device(const std::string& devices_batch_config, const ov::AnyMap& user_config) const;

static DeviceInformation ParseBatchDevice(const std::string& deviceWithBatch);
static DeviceInformation parse_batch_device(const std::string& device_with_batch);

InferenceEngine::IExecutableNetworkInternal::Ptr LoadNetworkImpl(
const InferenceEngine::CNNNetwork& network,
const std::shared_ptr<InferenceEngine::RemoteContext> context,
const std::map<std::string, std::string>& config);
private:
mutable ov::AnyMap m_plugin_config;
};
} // namespace autobatch_plugin
} // namespace ov
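Editor's note: parse_batch_device (formerly ParseBatchDevice) splits a "DEVICE(n)" string into the target device and the requested batch size. A standalone illustration of that kind of parsing (a hypothetical helper, not the plugin's implementation):

// Example (not part of the PR): "GPU(4)" -> {"GPU", 4}, "GPU" -> {"GPU", 0}.
#include <cstdint>
#include <iostream>
#include <stdexcept>
#include <string>

struct ParsedDevice {
    std::string name;
    uint32_t batch = 0;  // 0 means "let the plugin decide"
};

ParsedDevice parse_device_with_batch(const std::string& device_with_batch) {
    ParsedDevice result;
    const auto open = device_with_batch.find('(');
    if (open == std::string::npos) {
        result.name = device_with_batch;
        return result;
    }
    const auto close = device_with_batch.find(')', open);
    if (close == std::string::npos)
        throw std::runtime_error("Expected closing ')' in " + device_with_batch);
    result.name = device_with_batch.substr(0, open);
    result.batch = static_cast<uint32_t>(std::stoul(device_with_batch.substr(open + 1, close - open - 1)));
    return result;
}

int main() {
    const auto d = parse_device_with_batch("GPU(4)");
    std::cout << d.name << " " << d.batch << "\n";  // GPU 4
}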
@@ -5,324 +5,104 @@
///////////////////////////////////////////////////////////////////////////////////////////////////
#include "sync_infer_request.hpp"

#include "openvino/core/type/element_type_traits.hpp"
#include "openvino/runtime/make_tensor.hpp"
#include "transformations/utils/utils.hpp"

namespace ov {
namespace autobatch_plugin {

template <InferenceEngine::Precision::ePrecision precision>
InferenceEngine::Blob::Ptr create_shared_blob_on_top_of_batched_blob(InferenceEngine::Blob::Ptr batched_blob,
inline ov::SoPtr<ov::ITensor> create_shared_tensor_on_batched_tensor(ov::SoPtr<ov::ITensor> batched_tensor,
std::string name,
const std::set<std::string>& batched_names,
size_t batch_id,
size_t batch_num) {
typedef typename InferenceEngine::PrecisionTrait<precision>::value_type TYPE;
typedef typename std::add_pointer<TYPE>::type TYPEPTR;
auto ptr = batched_blob->buffer().as<TYPEPTR>();
auto sizePerBatch = batched_blob->size() / batch_num;
InferenceEngine::SizeVector dims = batched_blob->getTensorDesc().getDims();
auto ptr = static_cast<uint8_t*>(batched_tensor->data());
auto size_per_batch = batched_tensor->get_byte_size() / batch_num;
auto batched_shape = batched_tensor->get_shape();
// for performance reason (copy avoidance) current impl of the auto-batching supports only batching by 0th dim
if (batched_names.count(name)) {
dims[0] = 1;
return InferenceEngine::make_shared_blob<TYPE>({precision, dims, batched_blob->getTensorDesc().getLayout()},
ptr + sizePerBatch * batch_id,
sizePerBatch);
batched_shape[0] = 1;
return {ov::make_tensor(batched_tensor->get_element_type(), batched_shape, ptr + size_per_batch * batch_id),
batched_tensor._so};
} else {
// same blob for all requests (e.g. constants)
return InferenceEngine::make_shared_blob<TYPE>({precision, dims, batched_blob->getTensorDesc().getLayout()},
ptr);
return {ov::make_tensor(batched_tensor->get_element_type(), batched_shape, ptr), batched_tensor._so};
}
}
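Editor's note: create_shared_tensor_on_batched_tensor wraps each request's slice of the batched buffer without copying. For an input batched along dim 0, request batch_id simply sees the bytes starting at batch_id * (byte_size / batch_num). A dependency-free sketch of that arithmetic:

// Example (not part of the PR): zero-copy per-request views over one batched buffer.
#include <cstddef>
#include <iostream>
#include <vector>

int main() {
    const size_t batch_num = 4;
    const size_t elems_per_batch = 6;  // e.g. a 1x2x3 tensor per request
    std::vector<float> batched(batch_num * elems_per_batch, 0.f);

    // Each "request" writes through a pointer into the shared buffer, no copy involved.
    for (size_t batch_id = 0; batch_id < batch_num; ++batch_id) {
        float* view = batched.data() + batch_id * elems_per_batch;
        for (size_t i = 0; i < elems_per_batch; ++i)
            view[i] = static_cast<float>(batch_id);  // request batch_id fills its own slice
    }

    std::cout << batched[0] << " " << batched[elems_per_batch] << "\n";  // 0 1
}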
SyncInferRequest::SyncInferRequest(const std::vector<std::shared_ptr<const ov::Node>>& inputs,
const std::vector<std::shared_ptr<const ov::Node>>& outputs,
CompiledModel::WorkerInferRequest& workerRequest,
int batch_id,
int num_batch,
const std::set<std::string>& batchedInputs,
const std::set<std::string>& batchedOutputs)
: IInferRequestInternal(inputs, outputs),
m_batched_request_wrapper(workerRequest),
SyncInferRequest::SyncInferRequest(
const std::shared_ptr<const ov::autobatch_plugin::CompiledModel>& compiled_model,
const std::shared_ptr<ov::autobatch_plugin::CompiledModel::WorkerInferRequest>& worker_request,
int batch_id,
int num_batch,
const std::set<std::string>& batched_inputs,
const std::set<std::string>& batched_outputs)
: ov::ISyncInferRequest(compiled_model),
m_batched_request_wrapper(worker_request),
m_batch_id(batch_id),
m_batch_size(num_batch) {
ShareBlobsWithBatchRequest(batchedInputs, batchedOutputs);
share_tensors_with_batched_req(batched_inputs, batched_outputs);
}

SyncInferRequest::SyncInferRequest(const InferenceEngine::InputsDataMap& networkInputs,
const InferenceEngine::OutputsDataMap& networkOutputs,
CompiledModel::WorkerInferRequest& workerRequest,
int batch_id,
int num_batch,
const std::set<std::string>& batchedInputs,
const std::set<std::string>& batchedOutputs)
: IInferRequestInternal(networkInputs, networkOutputs),
m_batched_request_wrapper(workerRequest),
m_batch_id(batch_id),
m_batch_size(num_batch) {
ShareBlobsWithBatchRequest(batchedInputs, batchedOutputs);
void SyncInferRequest::share_tensors_with_batched_req(const std::set<std::string>& batched_inputs,
const std::set<std::string>& batched_outputs) {
for (const auto& it : get_inputs()) {
auto name = ov::op::util::get_ie_output_name(it);
ov::SoPtr<ov::ITensor> res;
auto batched_tensor = m_batched_request_wrapper->_infer_request_batched->get_tensor(it);
if (!batched_tensor._so)
batched_tensor._so = m_batched_request_wrapper->_infer_request_batched._so;
res = create_shared_tensor_on_batched_tensor(batched_tensor, name, batched_inputs, m_batch_id, m_batch_size);
set_tensor(it, res);
}

for (const auto& it : get_outputs()) {
auto name = ov::op::util::get_ie_output_name(it.get_node_shared_ptr()->input_value(0));
ov::SoPtr<ov::ITensor> res;
auto batched_tensor = m_batched_request_wrapper->_infer_request_batched->get_tensor(it);
if (!batched_tensor._so)
batched_tensor._so = m_batched_request_wrapper->_infer_request_batched._so;
res = create_shared_tensor_on_batched_tensor(batched_tensor, name, batched_outputs, m_batch_id, m_batch_size);
set_tensor(it, res);
}
}
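Editor's note: because every per-request tensor created here aliases the single batched allocation, concurrent requests are only safe while their batch_id slices stay disjoint; names not found in the batched sets (constants) deliberately share the whole buffer. A small check that captures the invariant (illustrative):

// Example (not part of the PR): slices [id * size_per_batch, (id + 1) * size_per_batch) must not overlap.
#include <cassert>
#include <cstddef>

bool slices_disjoint(size_t size_per_batch, size_t id_a, size_t id_b) {
    const size_t a_begin = id_a * size_per_batch, a_end = a_begin + size_per_batch;
    const size_t b_begin = id_b * size_per_batch, b_end = b_begin + size_per_batch;
    return a_end <= b_begin || b_end <= a_begin;
}

int main() {
    assert(slices_disjoint(1024, 0, 1));   // different requests, different slices
    assert(!slices_disjoint(1024, 2, 2));  // same request obviously overlaps itself
    return 0;
}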
void SyncInferRequest::ShareBlobsWithBatchRequest(const std::set<std::string>& batchedInputs,
|
||||
const std::set<std::string>& batchedOutputs) {
|
||||
// Allocate all input blobs
|
||||
for (const auto& it : _networkInputs) {
|
||||
auto blob = m_batched_request_wrapper._inferRequestBatched->GetBlob(it.first);
|
||||
InferenceEngine::Blob::Ptr res;
|
||||
switch (it.second->getTensorDesc().getPrecision()) {
|
||||
case InferenceEngine::Precision::FP32:
|
||||
res = create_shared_blob_on_top_of_batched_blob<InferenceEngine::Precision::FP32>(
|
||||
m_batched_request_wrapper._inferRequestBatched->GetBlob(it.first),
|
||||
it.first,
|
||||
batchedInputs,
|
||||
m_batch_id,
|
||||
m_batch_size);
|
||||
break;
|
||||
case InferenceEngine::Precision::I32:
|
||||
res = create_shared_blob_on_top_of_batched_blob<InferenceEngine::Precision::I32>(
|
||||
m_batched_request_wrapper._inferRequestBatched->GetBlob(it.first),
|
||||
it.first,
|
||||
batchedInputs,
|
||||
m_batch_id,
|
||||
m_batch_size);
|
||||
break;
|
||||
case InferenceEngine::Precision::I8:
|
||||
res = create_shared_blob_on_top_of_batched_blob<InferenceEngine::Precision::I8>(
|
||||
m_batched_request_wrapper._inferRequestBatched->GetBlob(it.first),
|
||||
it.first,
|
||||
batchedInputs,
|
||||
m_batch_id,
|
||||
m_batch_size);
|
||||
break;
|
||||
case InferenceEngine::Precision::I16:
|
||||
res = create_shared_blob_on_top_of_batched_blob<InferenceEngine::Precision::I16>(
|
||||
m_batched_request_wrapper._inferRequestBatched->GetBlob(it.first),
|
||||
it.first,
|
||||
batchedInputs,
|
||||
m_batch_id,
|
||||
m_batch_size);
|
||||
break;
|
||||
case InferenceEngine::Precision::U16:
|
||||
res = create_shared_blob_on_top_of_batched_blob<InferenceEngine::Precision::U16>(
|
||||
m_batched_request_wrapper._inferRequestBatched->GetBlob(it.first),
|
||||
it.first,
|
||||
batchedInputs,
|
||||
m_batch_id,
|
||||
m_batch_size);
|
||||
break;
|
||||
case InferenceEngine::Precision::U32:
|
||||
res = create_shared_blob_on_top_of_batched_blob<InferenceEngine::Precision::U32>(
|
||||
m_batched_request_wrapper._inferRequestBatched->GetBlob(it.first),
|
||||
it.first,
|
||||
batchedInputs,
|
||||
m_batch_id,
|
||||
m_batch_size);
|
||||
break;
|
||||
case InferenceEngine::Precision::FP64:
|
||||
res = create_shared_blob_on_top_of_batched_blob<InferenceEngine::Precision::FP64>(
|
||||
m_batched_request_wrapper._inferRequestBatched->GetBlob(it.first),
|
||||
it.first,
|
||||
batchedInputs,
|
||||
m_batch_id,
|
||||
m_batch_size);
|
||||
break;
|
||||
case InferenceEngine::Precision::FP16:
|
||||
res = create_shared_blob_on_top_of_batched_blob<InferenceEngine::Precision::FP16>(
|
||||
m_batched_request_wrapper._inferRequestBatched->GetBlob(it.first),
|
||||
it.first,
|
||||
batchedInputs,
|
||||
m_batch_id,
|
||||
m_batch_size);
|
||||
break;
|
||||
case InferenceEngine::Precision::BF16:
|
||||
res = create_shared_blob_on_top_of_batched_blob<InferenceEngine::Precision::BF16>(
|
||||
m_batched_request_wrapper._inferRequestBatched->GetBlob(it.first),
|
||||
it.first,
|
||||
batchedInputs,
|
||||
m_batch_id,
|
||||
m_batch_size);
|
||||
break;
|
||||
case InferenceEngine::Precision::U64:
|
||||
res = create_shared_blob_on_top_of_batched_blob<InferenceEngine::Precision::U64>(
|
||||
m_batched_request_wrapper._inferRequestBatched->GetBlob(it.first),
|
||||
it.first,
|
||||
batchedInputs,
|
||||
m_batch_id,
|
||||
m_batch_size);
|
||||
break;
|
||||
case InferenceEngine::Precision::I64:
|
||||
res = create_shared_blob_on_top_of_batched_blob<InferenceEngine::Precision::I64>(
|
||||
m_batched_request_wrapper._inferRequestBatched->GetBlob(it.first),
|
||||
it.first,
|
||||
batchedInputs,
|
||||
m_batch_id,
|
||||
m_batch_size);
|
||||
break;
|
||||
case InferenceEngine::Precision::U8:
|
||||
res = create_shared_blob_on_top_of_batched_blob<InferenceEngine::Precision::U8>(
|
||||
m_batched_request_wrapper._inferRequestBatched->GetBlob(it.first),
|
||||
it.first,
|
||||
batchedInputs,
|
||||
m_batch_id,
|
||||
m_batch_size);
|
||||
break;
|
||||
case InferenceEngine::Precision::BOOL:
|
||||
res = create_shared_blob_on_top_of_batched_blob<InferenceEngine::Precision::BOOL>(
|
||||
m_batched_request_wrapper._inferRequestBatched->GetBlob(it.first),
|
||||
it.first,
|
||||
batchedInputs,
|
||||
m_batch_id,
|
||||
m_batch_size);
|
||||
break;
|
||||
default:
|
||||
IE_THROW() << "Unsupported input precision " << it.second->getTensorDesc().getPrecision();
|
||||
void SyncInferRequest::set_tensors_to_another_request(ov::SoPtr<ov::IAsyncInferRequest>& req) {
|
||||
for (const auto& it : get_inputs()) {
|
||||
// this request is already in BUSY state, so using the internal functions safely
|
||||
auto tensor = get_tensor(it);
|
||||
OPENVINO_ASSERT(tensor != nullptr, "The tensor is empty!");
|
||||
auto type = tensor->get_element_type();
|
||||
if (req->get_tensor(it)->data(type) != tensor->data(type)) {
|
||||
req->set_tensor(it, tensor);
|
||||
}
|
||||
_inputs[it.first] = res;
|
||||
}
|
||||
// Allocate all output blobs
|
||||
for (const auto& it : _networkOutputs) {
|
||||
auto blob = m_batched_request_wrapper._inferRequestBatched->GetBlob(it.first);
|
||||
InferenceEngine::Blob::Ptr res;
|
||||
switch (it.second->getTensorDesc().getPrecision()) {
|
||||
case InferenceEngine::Precision::FP32:
|
||||
res = create_shared_blob_on_top_of_batched_blob<InferenceEngine::Precision::FP32>(
|
||||
m_batched_request_wrapper._inferRequestBatched->GetBlob(it.first),
|
||||
it.first,
|
||||
batchedOutputs,
|
||||
m_batch_id,
|
||||
m_batch_size);
|
||||
break;
|
||||
case InferenceEngine::Precision::I32:
|
||||
res = create_shared_blob_on_top_of_batched_blob<InferenceEngine::Precision::I32>(
|
||||
m_batched_request_wrapper._inferRequestBatched->GetBlob(it.first),
|
||||
it.first,
|
||||
batchedOutputs,
|
||||
m_batch_id,
|
||||
m_batch_size);
|
||||
break;
|
||||
case InferenceEngine::Precision::I8:
|
||||
res = create_shared_blob_on_top_of_batched_blob<InferenceEngine::Precision::I8>(
|
||||
m_batched_request_wrapper._inferRequestBatched->GetBlob(it.first),
|
||||
it.first,
|
||||
batchedOutputs,
|
||||
m_batch_id,
|
||||
m_batch_size);
|
||||
break;
|
||||
case InferenceEngine::Precision::I16:
|
||||
res = create_shared_blob_on_top_of_batched_blob<InferenceEngine::Precision::I16>(
|
||||
m_batched_request_wrapper._inferRequestBatched->GetBlob(it.first),
|
||||
it.first,
|
||||
batchedOutputs,
|
||||
m_batch_id,
|
||||
m_batch_size);
|
||||
break;
|
||||
case InferenceEngine::Precision::U16:
|
||||
res = create_shared_blob_on_top_of_batched_blob<InferenceEngine::Precision::U16>(
|
||||
m_batched_request_wrapper._inferRequestBatched->GetBlob(it.first),
|
||||
it.first,
|
||||
batchedOutputs,
|
||||
m_batch_id,
|
||||
m_batch_size);
|
||||
break;
|
||||
case InferenceEngine::Precision::U32:
|
||||
res = create_shared_blob_on_top_of_batched_blob<InferenceEngine::Precision::U32>(
|
||||
m_batched_request_wrapper._inferRequestBatched->GetBlob(it.first),
|
||||
it.first,
|
||||
batchedOutputs,
|
||||
m_batch_id,
|
||||
m_batch_size);
|
||||
break;
|
||||
case InferenceEngine::Precision::FP64:
|
||||
res = create_shared_blob_on_top_of_batched_blob<InferenceEngine::Precision::FP64>(
|
||||
m_batched_request_wrapper._inferRequestBatched->GetBlob(it.first),
|
||||
it.first,
|
||||
batchedOutputs,
|
||||
m_batch_id,
|
||||
m_batch_size);
|
||||
break;
|
||||
case InferenceEngine::Precision::FP16:
|
||||
res = create_shared_blob_on_top_of_batched_blob<InferenceEngine::Precision::FP16>(
|
||||
m_batched_request_wrapper._inferRequestBatched->GetBlob(it.first),
|
||||
it.first,
|
||||
batchedOutputs,
|
||||
m_batch_id,
|
||||
m_batch_size);
|
||||
break;
|
||||
case InferenceEngine::Precision::BF16:
|
||||
res = create_shared_blob_on_top_of_batched_blob<InferenceEngine::Precision::BF16>(
|
||||
m_batched_request_wrapper._inferRequestBatched->GetBlob(it.first),
|
||||
it.first,
|
||||
batchedOutputs,
|
||||
m_batch_id,
|
||||
m_batch_size);
|
||||
break;
|
||||
case InferenceEngine::Precision::U64:
|
||||
res = create_shared_blob_on_top_of_batched_blob<InferenceEngine::Precision::U64>(
|
||||
m_batched_request_wrapper._inferRequestBatched->GetBlob(it.first),
|
||||
it.first,
|
||||
batchedOutputs,
|
||||
m_batch_id,
|
||||
m_batch_size);
|
||||
break;
|
||||
case InferenceEngine::Precision::I64:
|
||||
res = create_shared_blob_on_top_of_batched_blob<InferenceEngine::Precision::I64>(
|
||||
m_batched_request_wrapper._inferRequestBatched->GetBlob(it.first),
|
||||
it.first,
|
||||
batchedOutputs,
|
||||
m_batch_id,
|
||||
m_batch_size);
|
||||
break;
|
||||
case InferenceEngine::Precision::U8:
|
||||
res = create_shared_blob_on_top_of_batched_blob<InferenceEngine::Precision::U8>(
|
||||
m_batched_request_wrapper._inferRequestBatched->GetBlob(it.first),
|
||||
it.first,
|
||||
batchedOutputs,
|
||||
m_batch_id,
|
||||
m_batch_size);
|
||||
break;
|
||||
case InferenceEngine::Precision::BOOL:
|
||||
res = create_shared_blob_on_top_of_batched_blob<InferenceEngine::Precision::BOOL>(
|
||||
m_batched_request_wrapper._inferRequestBatched->GetBlob(it.first),
|
||||
it.first,
|
||||
batchedOutputs,
|
||||
m_batch_id,
|
||||
m_batch_size);
|
||||
break;
|
||||
default:
|
||||
IE_THROW(NotImplemented) << "Unsupported input precision " << it.second->getTensorDesc().getPrecision();
|
||||
for (const auto& it : get_outputs()) {
|
||||
// this request is already in BUSY state, so using the internal functions safely
|
||||
auto tensor = get_tensor(it);
|
||||
OPENVINO_ASSERT(tensor != nullptr, "The tensor is empty!");
|
||||
auto type = tensor->get_element_type();
|
||||
if (req->get_tensor(it)->data(type) != tensor->data(type)) {
|
||||
req->set_tensor(it, tensor);
|
||||
}
|
||||
_outputs[it.first] = res;
|
||||
}
|
||||
}
|
||||
void SyncInferRequest::SetBlobsToAnotherRequest(InferenceEngine::SoIInferRequestInternal& req) {
|
||||
for (const auto& it : _networkInputs) {
|
||||
auto& name = it.first;
|
||||
// this request is already in BUSY state, so using the internal functions safely
|
||||
auto blob = GetBlob(name);
|
||||
if (req->GetBlob(name) != blob)
|
||||
req->SetBlob(name, blob);
|
||||
}
|
||||
for (const auto& it : _networkOutputs) {
|
||||
auto& name = it.first;
|
||||
// this request is already in BUSY state, so using the internal functions safely
|
||||
auto blob = GetBlob(name);
|
||||
if (req->GetBlob(name) != blob)
|
||||
req->SetBlob(name, blob);
|
||||
}
|
||||
}
|
||||
|
||||
void SyncInferRequest::CopyInputsIfNeeded() {
for (const auto& it : _networkInputs) {
auto& name = it.first;
void SyncInferRequest::copy_inputs_if_needed() {
for (const auto& it : get_inputs()) {
// this request is already in BUSY state, so using the internal functions safely
CopyBlobIfNeeded(GetBlob(name), m_batched_request_wrapper._inferRequestBatched->GetBlob(name), true);
auto dst_tensor = m_batched_request_wrapper->_infer_request_batched->get_tensor(it);
copy_tensor_if_needed(get_tensor(it), dst_tensor, true);
}
}

void SyncInferRequest::CopyBlobIfNeeded(InferenceEngine::Blob::CPtr src, InferenceEngine::Blob::Ptr dst, bool bInput) {
auto bufferDst = dst->buffer();
auto ptrDst = bufferDst.as<char*>();
auto bufferSrc = src->cbuffer();
auto ptrSrc = bufferSrc.as<const char*>();
ptrdiff_t szDst = dst->byteSize();
ptrdiff_t szSrc = src->byteSize();
void SyncInferRequest::copy_tensor_if_needed(const ov::SoPtr<ov::ITensor>& src,
ov::SoPtr<ov::ITensor>& dst,
const bool bInput) {
auto ptrDst = static_cast<char*>(dst->data());
auto ptrSrc = static_cast<char*>(src->data());
ptrdiff_t szDst = dst->get_byte_size();
ptrdiff_t szSrc = src->get_byte_size();
if (bInput) {
ptrdiff_t offset = szSrc != szDst ? m_batch_id * szDst / m_batch_size : 0;
if ((ptrDst + offset) == ptrSrc)
@@ -338,12 +118,29 @@ void SyncInferRequest::CopyBlobIfNeeded(InferenceEngine::Blob::CPtr src, Inferen
}
}

void SyncInferRequest::CopyOutputsIfNeeded() {
for (const auto& it : _networkOutputs) {
auto& name = it.first;
void SyncInferRequest::copy_outputs_if_needed() {
for (const auto& it : get_outputs()) {
// this request is already in BUSY state, so using the internal functions safely
CopyBlobIfNeeded(m_batched_request_wrapper._inferRequestBatched->GetBlob(name), GetBlob(name), false);
auto dst_tensor = get_tensor(it);
copy_tensor_if_needed(m_batched_request_wrapper->_infer_request_batched->get_tensor(it), dst_tensor, false);
}
}

void SyncInferRequest::infer() {
OPENVINO_NOT_IMPLEMENTED;
}

std::vector<ov::SoPtr<ov::IVariableState>> SyncInferRequest::query_state() const {
auto states = m_batched_request_wrapper->_infer_request_batched->query_state();
for (auto&& state : states) {
if (!state._so)
state._so = m_batched_request_wrapper->_infer_request_batched._so;
}
return states;
}

std::vector<ov::ProfilingInfo> SyncInferRequest::get_profiling_info() const {
return m_batched_request_wrapper->_infer_request_batched->get_profiling_info();
}
} // namespace autobatch_plugin
} // namespace ov
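Editor's note: copy_tensor_if_needed is the copy fallback: inputs are scattered into this request's slice of the batched buffer, outputs are gathered back out of it, and the copy is skipped when the pointers already alias (the shared-tensor case above). A reduced sketch of that offset logic over raw buffers (illustrative only):

// Example (not part of the PR): when sizes differ by the batch factor,
// the smaller buffer maps onto slice `batch_id` of the larger one.
#include <cstddef>
#include <cstring>

void copy_if_needed(const char* src, size_t src_size,
                    char* dst, size_t dst_size,
                    size_t batch_id, size_t batch_size, bool is_input) {
    if (is_input) {
        // per-request input -> batched input: write into this request's slice
        const std::ptrdiff_t offset = (src_size != dst_size) ? batch_id * dst_size / batch_size : 0;
        if (dst + offset == src)
            return;  // already shared, nothing to copy
        std::memcpy(dst + offset, src, src_size);
    } else {
        // batched output -> per-request output: read from this request's slice
        const std::ptrdiff_t offset = (src_size != dst_size) ? batch_id * src_size / batch_size : 0;
        if (src + offset == dst)
            return;
        std::memcpy(dst, src + offset, dst_size);
    }
}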
@@ -6,40 +6,36 @@
#pragma once

#include "compiled_model.hpp"
#include "cpp_interfaces/interface/ie_iinfer_request_internal.hpp"
#include "openvino/runtime/isync_infer_request.hpp"

namespace ov {
namespace autobatch_plugin {

class SyncInferRequest : public InferenceEngine::IInferRequestInternal {
class SyncInferRequest : public ov::ISyncInferRequest {
public:
using Ptr = std::shared_ptr<SyncInferRequest>;
explicit SyncInferRequest(const InferenceEngine::InputsDataMap& networkInputs,
const InferenceEngine::OutputsDataMap& networkOutputs,
CompiledModel::WorkerInferRequest& workerRequestPtr,
int batch_id,
int num_batch,
const std::set<std::string>& batchedIntputs,
const std::set<std::string>& batchedOutputs);

explicit SyncInferRequest(const std::vector<std::shared_ptr<const ov::Node>>& inputs,
const std::vector<std::shared_ptr<const ov::Node>>& outputs,
CompiledModel::WorkerInferRequest& workerRequestPtr,
int batch_id,
int num_batch,
const std::set<std::string>& batchedIntputs,
const std::set<std::string>& batchedOutputs);
SyncInferRequest(const std::shared_ptr<const ov::autobatch_plugin::CompiledModel>& compiled_model,
const std::shared_ptr<ov::autobatch_plugin::CompiledModel::WorkerInferRequest>& worker_request,
int batch_id,
int num_batch,
const std::set<std::string>& batched_inputs,
const std::set<std::string>& batched_outputs);

// Batch-Device impl specific: sets the data (blobs from the device request to the batched device request)
void SetBlobsToAnotherRequest(InferenceEngine::SoIInferRequestInternal& req);
void set_tensors_to_another_request(ov::SoPtr<ov::IAsyncInferRequest>& req);

void CopyInputsIfNeeded();
void copy_inputs_if_needed();

void CopyOutputsIfNeeded();
void copy_outputs_if_needed();

CompiledModel::WorkerInferRequest& m_batched_request_wrapper;
void infer() override;

std::exception_ptr m_exceptionPtr;
std::vector<ov::SoPtr<ov::IVariableState>> query_state() const override;

std::vector<ov::ProfilingInfo> get_profiling_info() const override;

std::shared_ptr<ov::autobatch_plugin::CompiledModel::WorkerInferRequest> m_batched_request_wrapper;

std::exception_ptr m_exception_ptr;

enum eExecutionFlavor : uint8_t {
NOT_EXECUTED,
@@ -48,10 +44,11 @@ public:
} m_batched_request_status = eExecutionFlavor::NOT_EXECUTED;

protected:
void CopyBlobIfNeeded(InferenceEngine::Blob::CPtr src, InferenceEngine::Blob::Ptr dst, bool bInput);
void copy_tensor_if_needed(const ov::SoPtr<ov::ITensor>& src, ov::SoPtr<ov::ITensor>& dst, const bool bInput);

void share_tensors_with_batched_req(const std::set<std::string>& batched_inputs,
const std::set<std::string>& batched_outputs);

void ShareBlobsWithBatchRequest(const std::set<std::string>& batchedIntputs,
const std::set<std::string>& batchedOutputs);
size_t m_batch_id;

size_t m_batch_size;
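Editor's note: m_batched_request_status records which path actually served a request, so waiting and error propagation can be routed accordingly; the values used elsewhere in this PR are BATCH_EXECUTED (the shared batched request ran) and TIMEOUT_EXECUTED (the request fell back to the per-request path). A compact sketch of how such a flag drives the wait logic (illustrative, not the plugin code):

// Example (not part of the PR): the waiter only joins the path that actually ran.
#include <cstdint>
#include <iostream>

enum class ExecutionFlavor : uint8_t { NOT_EXECUTED, BATCH_EXECUTED, TIMEOUT_EXECUTED };

const char* describe(ExecutionFlavor f) {
    switch (f) {
    case ExecutionFlavor::BATCH_EXECUTED:   return "wait on the shared batched request";
    case ExecutionFlavor::TIMEOUT_EXECUTED: return "wait on the per-request fallback";
    default:                                return "nothing was started yet";
    }
}

int main() {
    std::cout << describe(ExecutionFlavor::TIMEOUT_EXECUTED) << "\n";
}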
@@ -13,8 +13,6 @@ namespace {

const std::vector<ov::AnyMap> auto_batch_inproperties = {
{ov::num_streams(-100)},
{{CONFIG_KEY(AUTO_BATCH_DEVICE_CONFIG), std::string(CommonTestUtils::DEVICE_TEMPLATE) + "(4)"},
{ov::auto_batch_timeout(-1)}},
{ov::device::id("UNSUPPORTED_DEVICE_ID_STRING")},
};

@@ -53,7 +53,7 @@ INSTANTIATE_TEST_SUITE_P(smoke_AutoBatching_test_uint,
::testing::Combine(::testing::Values(std::string(CommonTestUtils::DEVICE_BATCH) + ":" +
CommonTestUtils::DEVICE_TEMPLATE),
::testing::Values(DefaultParameter{ov::auto_batch_timeout.name(),
InferenceEngine::Parameter{1000}})),
InferenceEngine::Parameter{uint32_t(1000)}})),
DefaultConfigurationTest::getTestCaseName);

} // namespace

@@ -8,8 +8,6 @@ using namespace BehaviorTestsDefinitions;
namespace {
auto auto_batch_inconfigs = []() {
return std::vector<std::map<std::string, std::string>>{
{{CONFIG_KEY(AUTO_BATCH_DEVICE_CONFIG), CommonTestUtils::DEVICE_TEMPLATE},
{ov::auto_batch_timeout.name(), "-1"}},
{{CONFIG_KEY(AUTO_BATCH_DEVICE_CONFIG), CommonTestUtils::DEVICE_TEMPLATE},
{ov::hint::performance_mode.name(), "DOESN'T EXIST"}},
{{CONFIG_KEY(AUTO_BATCH_DEVICE_CONFIG), CommonTestUtils::DEVICE_TEMPLATE},
src/plugins/auto_batch/tests/unit/async_infer_request_test.cpp (new file, 324 lines)
@@ -0,0 +1,324 @@
||||
// Copyright (C) 2018-2023 Intel Corporation
|
||||
// SPDX-License-Identifier: Apache-2.0
|
||||
//
|
||||
|
||||
#include <gmock/gmock.h>
|
||||
#include <gtest/gtest.h>
|
||||
|
||||
#include "mock_common.hpp"
|
||||
#include "ngraph_functions/subgraph_builders.hpp"
|
||||
#include "openvino/core/dimension_tracker.hpp"
|
||||
#include "openvino/core/type/element_type.hpp"
|
||||
#include "openvino/runtime/threading/immediate_executor.hpp"
|
||||
#include "transformations/utils/utils.hpp"
|
||||
#include "unit_test_utils/mocks/cpp_interfaces/interface/mock_icore.hpp"
|
||||
using ::testing::_;
|
||||
using ::testing::AnyNumber;
|
||||
using ::testing::AtLeast;
|
||||
using ::testing::Eq;
|
||||
using ::testing::MatcherCast;
|
||||
using ::testing::Matches;
|
||||
using ::testing::NiceMock;
|
||||
using ::testing::Return;
|
||||
using ::testing::ReturnRef;
|
||||
using ::testing::StrEq;
|
||||
using ::testing::StrNe;
|
||||
using ::testing::Throw;
|
||||
|
||||
using AutoBatchRequestTestParams = std::tuple<uint32_t, // batch_size
|
||||
ov::element::Type_t, // data type
|
||||
uint32_t>; // inference interval
|
||||
|
||||
class AutoBatchAsyncInferRequestTest : public ::testing::TestWithParam<AutoBatchRequestTestParams> {
|
||||
public:
|
||||
std::shared_ptr<ov::Model> m_model;
|
||||
std::shared_ptr<ov::Model> m_batched_model;
|
||||
std::shared_ptr<NiceMock<MockICore>> m_core;
|
||||
std::shared_ptr<NiceMock<MockAutoBatchInferencePlugin>> m_auto_batch_plugin;
|
||||
|
||||
std::shared_ptr<NiceMock<MockIPlugin>> m_hardware_plugin;
|
||||
|
||||
std::shared_ptr<NiceMock<MockICompiledModel>> m_i_compile_model_without_batch;
|
||||
ov::SoPtr<ov::ICompiledModel> m_compile_model_without_batch;
|
||||
|
||||
std::shared_ptr<NiceMock<MockICompiledModel>> m_i_compile_model_with_batch;
|
||||
ov::SoPtr<ov::ICompiledModel> m_compile_model_with_batch;
|
||||
|
||||
ov::AnyMap m_config;
|
||||
DeviceInformation m_device_info;
|
||||
std::set<std::string> m_batched_inputs;
|
||||
std::set<std::string> m_batched_outputs;
|
||||
ov::SoPtr<ov::IRemoteContext> m_remote_context;
|
||||
|
||||
std::shared_ptr<CompiledModel> m_auto_batch_compile_model;
|
||||
|
||||
std::shared_ptr<NiceMock<MockISyncInferRequest>> m_sync_infer_request_with_batch;
|
||||
|
||||
std::shared_ptr<NiceMock<MockIAsyncInferRequest>> m_async_infer_request_with_batch;
|
||||
|
||||
std::shared_ptr<NiceMock<MockISyncInferRequest>> m_sync_infer_request_without_batch;
|
||||
|
||||
std::shared_ptr<NiceMock<MockIAsyncInferRequest>> m_async_infer_request_without_batch;
|
||||
|
||||
std::shared_ptr<ov::threading::ImmediateExecutor> m_executor;
|
||||
|
||||
std::shared_ptr<CompiledModel::WorkerInferRequest> workerRequestPtr;
|
||||
|
||||
uint32_t m_batch_size;
|
||||
ov::element::Type_t m_element_type;
|
||||
uint32_t m_infer_interval;
|
||||
|
||||
std::vector<std::shared_ptr<AsyncInferRequest>> m_auto_batch_async_infer_requests;
|
||||
|
||||
std::vector<ov::ProfilingInfo> m_profiling_info;
|
||||
|
||||
bool m_terminate;
|
||||
|
||||
static std::string getTestCaseName(testing::TestParamInfo<AutoBatchRequestTestParams> obj) {
|
||||
uint32_t batch_size, infer_interval;
|
||||
ov::element::Type_t element_type;
|
||||
std::tie(batch_size, element_type, infer_interval) = obj.param;
|
||||
|
||||
std::string res;
|
||||
res = "batch_size_" + std::to_string(batch_size);
|
||||
res += "_element_type_" + std::to_string(static_cast<int>(element_type));
|
||||
if (infer_interval > 0)
|
||||
res += "_infer_interval_" + std::to_string(infer_interval);
|
||||
return res;
|
||||
}
|
||||
|
||||
void TearDown() override {
|
||||
m_terminate = true;
|
||||
m_profiling_info.clear();
|
||||
m_auto_batch_async_infer_requests.clear();
|
||||
m_auto_batch_plugin.reset();
|
||||
m_model.reset();
|
||||
m_batched_model.reset();
|
||||
m_core.reset();
|
||||
m_i_compile_model_without_batch.reset();
|
||||
m_compile_model_without_batch = {};
|
||||
m_i_compile_model_with_batch.reset();
|
||||
m_compile_model_with_batch = {};
|
||||
m_auto_batch_compile_model.reset();
|
||||
m_sync_infer_request_without_batch.reset();
|
||||
m_async_infer_request_without_batch.reset();
|
||||
m_executor.reset();
|
||||
clear_worker();
|
||||
workerRequestPtr.reset();
|
||||
m_sync_infer_request_with_batch.reset();
|
||||
m_async_infer_request_with_batch.reset();
|
||||
}
|
||||
|
||||
void SetUp() override {
|
||||
std::tie(m_batch_size, m_element_type, m_infer_interval) = this->GetParam();
|
||||
m_terminate = false;
|
||||
std::vector<size_t> inputShape = {1, 3, 24, 24};
|
||||
m_model = ngraph::builder::subgraph::makeMultiSingleConv(inputShape, m_element_type);
|
||||
|
||||
prepare_input(m_model, m_batch_size);
|
||||
|
||||
m_core = std::shared_ptr<NiceMock<MockICore>>(new NiceMock<MockICore>());
|
||||
|
||||
m_auto_batch_plugin =
|
||||
std::shared_ptr<NiceMock<MockAutoBatchInferencePlugin>>(new NiceMock<MockAutoBatchInferencePlugin>());
|
||||
|
||||
m_hardware_plugin = std::shared_ptr<NiceMock<MockIPlugin>>(new NiceMock<MockIPlugin>());
|
||||
|
||||
m_auto_batch_plugin->set_core(m_core);
|
||||
m_i_compile_model_without_batch = std::make_shared<NiceMock<MockICompiledModel>>(m_model, m_hardware_plugin);
|
||||
m_compile_model_without_batch = {m_i_compile_model_without_batch, {}};
|
||||
|
||||
m_config = {{"AUTO_BATCH_TIMEOUT", "200"}};
|
||||
|
||||
m_device_info = {"CPU", {}, m_batch_size};
|
||||
|
||||
auto reshaped = m_model->clone();
|
||||
auto inputs = reshaped->inputs();
|
||||
std::map<ov::Output<ov::Node>, ov::PartialShape> partial_shapes;
|
||||
for (auto& input : inputs) {
|
||||
auto input_shape = input.get_shape();
|
||||
if (m_batched_inputs.find(ov::op::util::get_ie_output_name(input)) != m_batched_inputs.end()) {
|
||||
input_shape[0] = m_batch_size;
|
||||
}
|
||||
partial_shapes.insert({input, ov::PartialShape(input_shape)});
|
||||
}
|
||||
|
||||
reshaped->reshape(partial_shapes);
|
||||
|
||||
m_i_compile_model_with_batch = std::make_shared<NiceMock<MockICompiledModel>>(reshaped, m_hardware_plugin);
|
||||
m_compile_model_with_batch = {m_i_compile_model_with_batch, {}};
|
||||
|
||||
ASSERT_NO_THROW(m_auto_batch_compile_model = std::make_shared<CompiledModel>(m_model->clone(),
|
||||
m_auto_batch_plugin,
|
||||
m_config,
|
||||
m_device_info,
|
||||
m_batched_inputs,
|
||||
m_batched_outputs,
|
||||
m_compile_model_with_batch,
|
||||
m_compile_model_without_batch,
|
||||
m_remote_context));
|
||||
|
||||
m_sync_infer_request_with_batch =
|
||||
std::make_shared<NiceMock<MockISyncInferRequest>>(m_i_compile_model_with_batch);
|
||||
|
||||
m_executor = std::make_shared<ov::threading::ImmediateExecutor>();
|
||||
|
||||
m_async_infer_request_with_batch =
|
||||
std::make_shared<NiceMock<MockIAsyncInferRequest>>(m_sync_infer_request_with_batch, m_executor, nullptr);
|
||||
|
||||
m_sync_infer_request_without_batch =
|
||||
std::make_shared<NiceMock<MockISyncInferRequest>>(m_i_compile_model_without_batch);
|
||||
|
||||
m_async_infer_request_without_batch =
|
||||
std::make_shared<NiceMock<MockIAsyncInferRequest>>(m_sync_infer_request_without_batch, m_executor, nullptr);
|
||||
|
||||
m_profiling_info = {};
|
||||
}
|
||||
|
||||
void create_worker(int batch_size) {
workerRequestPtr = std::make_shared<CompiledModel::WorkerInferRequest>();

workerRequestPtr->_infer_request_batched = {m_async_infer_request_with_batch, {}};
workerRequestPtr->_batch_size = batch_size;
workerRequestPtr->_completion_tasks.resize(workerRequestPtr->_batch_size);
workerRequestPtr->_infer_request_batched->set_callback([this](std::exception_ptr exceptionPtr) mutable {
if (exceptionPtr)
workerRequestPtr->_exception_ptr = exceptionPtr;
});

ON_CALL(*m_async_infer_request_with_batch, start_async()).WillByDefault([this]() {
OPENVINO_ASSERT(workerRequestPtr->_completion_tasks.size() == (size_t)workerRequestPtr->_batch_size);
for (int c = 0; c < workerRequestPtr->_batch_size; c++) {
workerRequestPtr->_completion_tasks[c]();
}
workerRequestPtr->_cond.notify_one();
});

workerRequestPtr->_thread = std::thread([this] {
while (1) {
std::cv_status status;
{
std::unique_lock<std::mutex> lock(workerRequestPtr->_mutex);
status = workerRequestPtr->_cond.wait_for(lock, std::chrono::milliseconds(10));
}
if (m_terminate) {
break;
} else {
// as we pop the tasks from the queue only here
// it is ok to call size() (as the _tasks can only grow in parallel)
const int sz = static_cast<int>(workerRequestPtr->_tasks.size());
if (sz == workerRequestPtr->_batch_size) {
std::pair<ov::autobatch_plugin::AsyncInferRequest*, ov::threading::Task> t;
for (int n = 0; n < sz; n++) {
OPENVINO_ASSERT(workerRequestPtr->_tasks.try_pop(t));
workerRequestPtr->_completion_tasks[n] = std::move(t.second);
t.first->m_sync_request->copy_inputs_if_needed();
t.first->m_sync_request->m_batched_request_status =
ov::autobatch_plugin::SyncInferRequest::eExecutionFlavor::BATCH_EXECUTED;
}
workerRequestPtr->_infer_request_batched->start_async();
} else if ((status == std::cv_status::timeout) && sz) {
std::pair<AsyncInferRequest*, ov::threading::Task> t;
for (int n = 0; n < sz; n++) {
IE_ASSERT(workerRequestPtr->_tasks.try_pop(t));
t.first->m_sync_request->m_batched_request_status =
SyncInferRequest::eExecutionFlavor::TIMEOUT_EXECUTED;
t.first->m_request_without_batch->start_async();
t.second();
}
}
}
}
});
return;
}

void clear_worker() {
workerRequestPtr->_infer_request_batched = {};
workerRequestPtr->_completion_tasks.clear();
workerRequestPtr->_thread.join();
}
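Editor's note: the worker thread above mirrors the plugin's dispatch rule: run the batched request as soon as a full batch of tasks is queued, otherwise flush whatever is pending through the per-request path when the timeout expires. The same rule in a dependency-free form (illustrative, std::condition_variable only):

// Example (not part of the PR): batch-or-timeout dispatch loop.
#include <chrono>
#include <condition_variable>
#include <deque>
#include <functional>
#include <iostream>
#include <mutex>
#include <thread>

int main() {
    std::mutex m;
    std::condition_variable cv;
    std::deque<std::function<void()>> tasks;
    const size_t batch = 4;
    bool stop = false;

    std::thread worker([&] {
        std::unique_lock<std::mutex> lock(m);
        while (!stop) {
            // wake up on a new task or on timeout, whichever comes first
            cv.wait_for(lock, std::chrono::milliseconds(10));
            if (tasks.size() >= batch) {
                std::cout << "run batched request for " << tasks.size() << " tasks\n";
                tasks.clear();
            } else if (!tasks.empty()) {
                std::cout << "timeout: run " << tasks.size() << " task(s) unbatched\n";
                tasks.clear();
            }
        }
    });

    {
        std::lock_guard<std::mutex> lock(m);
        for (size_t i = 0; i < batch; ++i)
            tasks.emplace_back([] {});
    }
    cv.notify_one();

    std::this_thread::sleep_for(std::chrono::milliseconds(50));
    { std::lock_guard<std::mutex> lock(m); stop = true; }
    cv.notify_one();
    worker.join();
}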
||||
|
||||
void prepare_input(std::shared_ptr<ov::Model>& model, int batch_size) {
|
||||
const auto& params = model->get_parameters();
|
||||
for (size_t i = 0; i < params.size(); i++) {
|
||||
m_batched_inputs.insert(ov::op::util::get_ie_output_name(params[i]->output(0)));
|
||||
}
|
||||
const auto& results = model->get_results();
|
||||
for (size_t i = 0; i < results.size(); i++) {
|
||||
const auto& output = results[i];
|
||||
const auto& node = output->input_value(0);
|
||||
m_batched_outputs.insert(
|
||||
ov::op::util::get_ie_output_name(ov::Output<const ov::Node>(node.get_node(), node.get_index())));
|
||||
}
|
||||
}
|
||||
};
|
||||
|
||||
TEST_P(AutoBatchAsyncInferRequestTest, AutoBatchRequestCreateTestCase) {
|
||||
prepare_input(m_model, m_batch_size);
|
||||
create_worker(m_batch_size);
|
||||
|
||||
for (uint32_t batch_id = 0; batch_id < m_batch_size; batch_id++) {
|
||||
auto req = std::make_shared<SyncInferRequest>(m_auto_batch_compile_model,
|
||||
workerRequestPtr,
|
||||
batch_id,
|
||||
m_batch_size,
|
||||
m_batched_inputs,
|
||||
m_batched_outputs);
|
||||
EXPECT_NE(req, nullptr);
|
||||
|
||||
auto asyncInferRequest = std::make_shared<AsyncInferRequest>(req, m_async_infer_request_without_batch, nullptr);
|
||||
EXPECT_NE(asyncInferRequest, nullptr);
|
||||
m_auto_batch_async_infer_requests.emplace_back(asyncInferRequest);
|
||||
}
|
||||
}
|
||||
|
||||
TEST_P(AutoBatchAsyncInferRequestTest, AutoBatchAsyncInferRequestStartAsyncTest) {
|
||||
prepare_input(m_model, m_batch_size);
|
||||
create_worker(m_batch_size);
|
||||
|
||||
for (uint32_t batch_id = 0; batch_id < m_batch_size; batch_id++) {
|
||||
auto req = std::make_shared<SyncInferRequest>(m_auto_batch_compile_model,
|
||||
workerRequestPtr,
|
||||
batch_id,
|
||||
m_batch_size,
|
||||
m_batched_inputs,
|
||||
m_batched_outputs);
|
||||
EXPECT_NE(req, nullptr);
|
||||
|
||||
auto asyncInferRequest = std::make_shared<AsyncInferRequest>(req, m_async_infer_request_without_batch, nullptr);
|
||||
EXPECT_NE(asyncInferRequest, nullptr);
|
||||
m_auto_batch_async_infer_requests.emplace_back(asyncInferRequest);
|
||||
}
|
||||
|
||||
for (auto& req : m_auto_batch_async_infer_requests) {
|
||||
if (m_infer_interval > 0)
|
||||
std::this_thread::sleep_for(std::chrono::milliseconds(m_infer_interval));
|
||||
EXPECT_NO_THROW(req->start_async());
|
||||
}
|
||||
|
||||
for (auto& req : m_auto_batch_async_infer_requests) {
|
||||
EXPECT_NO_THROW(req->wait());
|
||||
}
|
||||
}
|
||||
|
||||
std::vector<ov::element::Type_t> element_type_param{ov::element::Type_t::f16,
|
||||
ov::element::Type_t::f32,
|
||||
ov::element::Type_t::f64,
|
||||
ov::element::Type_t::i8,
|
||||
ov::element::Type_t::i16,
|
||||
ov::element::Type_t::i32,
|
||||
ov::element::Type_t::i64,
|
||||
ov::element::Type_t::u8,
|
||||
ov::element::Type_t::u16,
|
||||
ov::element::Type_t::u32,
|
||||
ov::element::Type_t::u64};
|
||||
const std::vector<uint32_t> batch_size_param{1, 8, 16, 32, 64, 128};
|
||||
const std::vector<uint32_t> infer_interval_timeout_param{0, 10};
|
||||
|
||||
INSTANTIATE_TEST_SUITE_P(smoke_AutoBatch_BehaviorTests,
|
||||
AutoBatchAsyncInferRequestTest,
|
||||
::testing::Combine(::testing::ValuesIn(batch_size_param),
|
||||
::testing::ValuesIn(element_type_param),
|
||||
::testing::ValuesIn(infer_interval_timeout_param)),
|
||||
AutoBatchAsyncInferRequestTest::getTestCaseName);
|
@ -1,397 +0,0 @@
|
||||
// Copyright (C) 2018-2023 Intel Corporation
|
||||
// SPDX-License-Identifier: Apache-2.0
|
||||
//
|
||||
|
||||
#include <gmock/gmock.h>
|
||||
#include <gtest/gtest.h>
|
||||
|
||||
#include <thread>
|
||||
|
||||
#include "cpp_interfaces/interface/ie_iplugin_internal.hpp"
|
||||
#include "ie_ngraph_utils.hpp"
|
||||
#include "mock_auto_batch_plugin.hpp"
|
||||
#include "ngraph_functions/subgraph_builders.hpp"
|
||||
#include "transformations/utils/utils.hpp"
|
||||
#include "unit_test_utils/mocks/cpp_interfaces/impl/mock_inference_plugin_internal.hpp"
|
||||
#include "unit_test_utils/mocks/cpp_interfaces/interface/mock_icore.hpp"
|
||||
#include "unit_test_utils/mocks/cpp_interfaces/interface/mock_iexecutable_network_internal.hpp"
|
||||
#include "unit_test_utils/mocks/cpp_interfaces/interface/mock_iinference_plugin.hpp"
|
||||
#include "unit_test_utils/mocks/cpp_interfaces/interface/mock_ivariable_state_internal.hpp"
|
||||
#include "unit_test_utils/mocks/cpp_interfaces/mock_task_executor.hpp"
|
||||
|
||||
using ::testing::_;
|
||||
using ::testing::AnyNumber;
|
||||
using ::testing::AtLeast;
|
||||
using ::testing::Eq;
|
||||
using ::testing::MatcherCast;
|
||||
using ::testing::Matches;
|
||||
using ::testing::NiceMock;
|
||||
using ::testing::Return;
|
||||
using ::testing::ReturnRef;
|
||||
using ::testing::StrEq;
|
||||
using ::testing::StrNe;
|
||||
using ::testing::Throw;
|
||||
using namespace ov::mock_autobatch_plugin;
|
||||
using namespace InferenceEngine;
|
||||
|
||||
using AutoBatchRequestTestParams = std::tuple<int, // batch_size
|
||||
ngraph::element::Type_t, // data type
|
||||
int>; // inference interval
|
||||
class AutoBatchRequestTest : public ::testing::TestWithParam<AutoBatchRequestTestParams> {
|
||||
public:
|
||||
// Mock inferRequest
|
||||
std::shared_ptr<NiceMock<MockIInferRequestInternal>> mockInferRequestBatched;
|
||||
|
||||
std::vector<std::shared_ptr<SyncInferRequest>> autoBatchInferRequests;
|
||||
std::map<std::string, InferenceEngine::Blob::Ptr> blobMap;
|
||||
|
||||
std::vector<std::shared_ptr<const ov::Node>> inputs, outputs;
|
||||
std::set<std::string> batchedInputs, batchedOutputs;
|
||||
std::shared_ptr<CompiledModel::WorkerInferRequest> workerRequestPtr;
|
||||
|
||||
public:
|
||||
static std::string getTestCaseName(testing::TestParamInfo<AutoBatchRequestTestParams> obj) {
|
||||
int batch_size, infer_interval;
|
||||
ngraph::element::Type_t element_type;
|
||||
std::tie(batch_size, element_type, infer_interval) = obj.param;
|
||||
|
||||
std::string res;
|
||||
res = "batch_size_" + std::to_string(batch_size);
|
||||
res += "_element_type_" + std::to_string(static_cast<int>(element_type));
|
||||
if (infer_interval > 0)
|
||||
res += "_infer_interval_" + std::to_string(infer_interval);
|
||||
return res;
|
||||
}
|
||||
|
||||
void TearDown() override {
|
||||
mockInferRequestBatched = {};
|
||||
autoBatchInferRequests.clear();
|
||||
blobMap.clear();
|
||||
|
||||
inputs.clear();
|
||||
outputs.clear();
|
||||
batchedInputs.clear();
|
||||
batchedOutputs.clear();
|
||||
clear_worker();
|
||||
}
|
||||
|
||||
void SetUp() override {
|
||||
mockInferRequestBatched = std::make_shared<NiceMock<MockIInferRequestInternal>>();
|
||||
}
|
||||
|
||||
void create_worker(int batch_size) {
|
||||
workerRequestPtr = std::make_shared<CompiledModel::WorkerInferRequest>();
|
||||
|
||||
workerRequestPtr->_inferRequestBatched = {mockInferRequestBatched, {}};
|
||||
workerRequestPtr->_batchSize = batch_size;
|
||||
workerRequestPtr->_completionTasks.resize(workerRequestPtr->_batchSize);
|
||||
workerRequestPtr->_inferRequestBatched->SetCallback([this](std::exception_ptr exceptionPtr) mutable {
|
||||
if (exceptionPtr)
|
||||
workerRequestPtr->m_exceptionPtr = exceptionPtr;
|
||||
});
|
||||
workerRequestPtr->_thread = std::thread([] {
|
||||
std::this_thread::sleep_for(std::chrono::milliseconds(10));
|
||||
});
|
||||
return;
|
||||
}
|
||||
|
||||
void clear_worker() {
|
||||
workerRequestPtr->_inferRequestBatched = {};
|
||||
workerRequestPtr->_completionTasks.clear();
|
||||
workerRequestPtr->_thread.join();
|
||||
}
|
||||
|
||||
void prepare_input(std::shared_ptr<ov::Model>& function, int batch_size) {
|
||||
for (auto& input : function->inputs()) {
|
||||
std::shared_ptr<const ov::Node> n = input.get_node_shared_ptr();
|
||||
inputs.emplace_back(n);
|
||||
}
|
||||
|
||||
for (auto& output : function->outputs()) {
|
||||
std::shared_ptr<const ov::Node> n = output.get_node_shared_ptr();
|
||||
outputs.emplace_back(n);
|
||||
}
|
||||
|
||||
const auto& params = function->get_parameters();
|
||||
for (size_t i = 0; i < params.size(); i++) {
|
||||
batchedInputs.insert(ov::op::util::get_ie_output_name(params[i]->output(0)));
|
||||
}
|
||||
const auto& results = function->get_results();
|
||||
for (size_t i = 0; i < results.size(); i++) {
|
||||
const auto& output = results[i];
|
||||
const auto& node = output->input_value(0);
|
||||
batchedOutputs.insert(
|
||||
ov::op::util::get_ie_output_name(ov::Output<const ov::Node>(node.get_node(), node.get_index())));
|
||||
}
|
||||
|
||||
ON_CALL(*mockInferRequestBatched, GetBlob(StrEq(*batchedInputs.begin())))
|
||||
.WillByDefault([this, batch_size](const std::string& name) {
|
||||
auto item = blobMap.find(name);
|
||||
if (item != blobMap.end()) {
|
||||
return item->second;
|
||||
}
|
||||
auto shape = inputs[0]->get_shape();
|
||||
shape[0] = batch_size;
|
||||
auto element_type = inputs[0]->get_element_type();
|
||||
InferenceEngine::TensorDesc tensorDesc = {InferenceEngine::details::convertPrecision(element_type),
|
||||
shape,
|
||||
InferenceEngine::TensorDesc::getLayoutByRank(shape.size())};
|
||||
auto blob = make_blob_with_precision(tensorDesc);
|
||||
blob->allocate();
|
||||
blobMap[name] = blob;
|
||||
return blob;
|
||||
});
|
||||
|
||||
ON_CALL(*mockInferRequestBatched, GetBlob(StrEq(*batchedOutputs.begin())))
|
||||
.WillByDefault([this, batch_size](const std::string& name) {
|
||||
auto item = blobMap.find(name);
|
||||
if (item != blobMap.end()) {
|
||||
return item->second;
|
||||
}
|
||||
auto shape = outputs[0]->get_shape();
|
||||
shape[0] = batch_size;
|
||||
auto element_type = outputs[0]->get_element_type();
|
||||
InferenceEngine::TensorDesc tensorDesc = {InferenceEngine::details::convertPrecision(element_type),
|
||||
shape,
|
||||
InferenceEngine::TensorDesc::getLayoutByRank(shape.size())};
|
||||
auto blob = make_blob_with_precision(tensorDesc);
|
||||
blob->allocate();
|
||||
blobMap[name] = blob;
|
||||
return blob;
|
||||
});
|
||||
}
|
||||
};
|
||||
|
||||
TEST_P(AutoBatchRequestTest, AutoBatchRequestCreateTestCase) {
|
||||
int batch_size, infer_interval;
|
||||
ngraph::element::Type_t element_type;
|
||||
std::tie(batch_size, element_type, infer_interval) = this->GetParam();
|
||||
|
||||
std::vector<size_t> inputShape = {1, 3, 24, 24};
|
||||
auto function = ngraph::builder::subgraph::makeMultiSingleConv(inputShape, element_type);
|
||||
prepare_input(function, batch_size);
|
||||
create_worker(batch_size);
|
||||
|
||||
for (int batch_id = 0; batch_id < batch_size; batch_id++) {
|
||||
auto req = std::make_shared<SyncInferRequest>(inputs,
|
||||
outputs,
|
||||
*workerRequestPtr,
|
||||
batch_id,
|
||||
batch_size,
|
||||
batchedInputs,
|
||||
batchedOutputs);
|
||||
EXPECT_NE(req, nullptr);
|
||||
autoBatchInferRequests.emplace_back(req);
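
// Each per-slot request is expected to share the batched request's blob memory:
// its blob pointer should equal the batched blob's base address plus batch_id * per-slot byte size.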
|
||||
|
||||
std::vector<std::string> names = {*batchedInputs.begin(), *batchedOutputs.begin()};
|
||||
for (auto& name : names) {
|
||||
auto blob = req->GetBlob(name);
|
||||
auto ptr = blob->buffer().as<char*>();
|
||||
auto size = blob->byteSize();
|
||||
auto batch_blob = mockInferRequestBatched->GetBlob(name);
|
||||
auto batch_ptr = batch_blob->buffer().as<char*>();
|
||||
EXPECT_EQ(ptr, batch_ptr + size * batch_id);
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
TEST_P(AutoBatchRequestTest, AutoBatchRequestCopyBlobTestCase) {
|
||||
int batch_size, infer_interval;
|
||||
ngraph::element::Type_t element_type;
|
||||
std::tie(batch_size, element_type, infer_interval) = this->GetParam();
|
||||
|
||||
std::vector<size_t> inputShape = {1, 3, 24, 24};
|
||||
auto function = ngraph::builder::subgraph::makeMultiSingleConv(inputShape, element_type);
|
||||
prepare_input(function, batch_size);
|
||||
create_worker(batch_size);
|
||||
|
||||
for (int batch_id = 0; batch_id < batch_size; batch_id++) {
|
||||
auto req = std::make_shared<SyncInferRequest>(inputs,
|
||||
outputs,
|
||||
*workerRequestPtr,
|
||||
batch_id,
|
||||
batch_size,
|
||||
batchedInputs,
|
||||
batchedOutputs);
|
||||
EXPECT_NE(req, nullptr);
|
||||
autoBatchInferRequests.emplace_back(req);
|
||||
|
||||
EXPECT_NO_THROW(req->CopyInputsIfNeeded());
|
||||
EXPECT_NO_THROW(req->CopyOutputsIfNeeded());
|
||||
}
|
||||
}
|
||||
|
||||
class AutoBatchAsyncInferRequestTest : public AutoBatchRequestTest {
|
||||
public:
|
||||
std::shared_ptr<NiceMock<MockIInferRequestInternal>> mockInferRequestWithoutBatched;
|
||||
MockTaskExecutor::Ptr mockTaskExecutor;
|
||||
std::vector<AsyncInferRequest::Ptr> autoBatchAsyncInferRequestVec;
|
||||
bool terminate;
|
||||
|
||||
public:
|
||||
void TearDown() override {
|
||||
terminate = true;
|
||||
autoBatchAsyncInferRequestVec.clear();
|
||||
AutoBatchRequestTest::TearDown();
|
||||
mockInferRequestWithoutBatched = {};
|
||||
}
|
||||
|
||||
void SetUp() override {
|
||||
AutoBatchRequestTest::SetUp();
|
||||
mockInferRequestWithoutBatched = std::make_shared<NiceMock<MockIInferRequestInternal>>();
|
||||
terminate = false;
|
||||
|
||||
mockTaskExecutor = std::make_shared<MockTaskExecutor>();
|
||||
}
|
||||
|
||||
void create_worker(int batch_size) {
|
||||
workerRequestPtr = std::make_shared<CompiledModel::WorkerInferRequest>();
|
||||
|
||||
workerRequestPtr->_inferRequestBatched = {mockInferRequestBatched, {}};
|
||||
workerRequestPtr->_batchSize = batch_size;
|
||||
workerRequestPtr->_completionTasks.resize(workerRequestPtr->_batchSize);
|
||||
workerRequestPtr->_inferRequestBatched->SetCallback([this](std::exception_ptr exceptionPtr) mutable {
|
||||
if (exceptionPtr)
|
||||
workerRequestPtr->m_exceptionPtr = exceptionPtr;
|
||||
});
|
||||
|
||||
ON_CALL(*mockInferRequestBatched, StartAsync()).WillByDefault([this]() {
|
||||
IE_ASSERT(workerRequestPtr->_completionTasks.size() == (size_t)workerRequestPtr->_batchSize);
|
||||
for (int c = 0; c < workerRequestPtr->_batchSize; c++) {
|
||||
workerRequestPtr->_completionTasks[c]();
|
||||
}
|
||||
workerRequestPtr->_cond.notify_one();
|
||||
});
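
// This worker thread is a test-side sketch of the auto-batch scheduling loop: it waits up to
// ~10 ms for tasks to queue up, fires the batched request once a full batch has been collected,
// and on timeout falls back to running each pending request individually (the "without batch" path).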
|
||||
|
||||
workerRequestPtr->_thread = std::thread([this] {
|
||||
while (1) {
|
||||
std::cv_status status;
|
||||
{
|
||||
std::unique_lock<std::mutex> lock(workerRequestPtr->_mutex);
|
||||
status = workerRequestPtr->_cond.wait_for(lock, std::chrono::milliseconds(10));
|
||||
}
|
||||
if (terminate) {
|
||||
break;
|
||||
} else {
|
||||
const int sz = static_cast<int>(workerRequestPtr->_tasks.size());
|
||||
if (sz == workerRequestPtr->_batchSize) {
|
||||
std::pair<AsyncInferRequest*, InferenceEngine::Task> t;
|
||||
for (int n = 0; n < sz; n++) {
|
||||
IE_ASSERT(workerRequestPtr->_tasks.try_pop(t));
|
||||
workerRequestPtr->_completionTasks[n] = std::move(t.second);
|
||||
t.first->m_sync_infer_request->m_batched_request_status =
|
||||
SyncInferRequest::eExecutionFlavor::BATCH_EXECUTED;
|
||||
}
|
||||
workerRequestPtr->_inferRequestBatched->StartAsync();
|
||||
} else if ((status == std::cv_status::timeout) && sz) {
|
||||
std::pair<AsyncInferRequest*, InferenceEngine::Task> t;
|
||||
for (int n = 0; n < sz; n++) {
|
||||
IE_ASSERT(workerRequestPtr->_tasks.try_pop(t));
|
||||
t.first->m_sync_infer_request->m_batched_request_status =
|
||||
SyncInferRequest::eExecutionFlavor::TIMEOUT_EXECUTED;
|
||||
t.first->m_infer_request_without_batch->StartAsync();
|
||||
t.second();
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
});
|
||||
return;
|
||||
}
|
||||
};
|
||||
|
||||
TEST_P(AutoBatchAsyncInferRequestTest, AutoBatchAsyncInferRequestCreateTest) {
|
||||
int batch_size, infer_interval;
|
||||
ngraph::element::Type_t element_type;
|
||||
std::tie(batch_size, element_type, infer_interval) = this->GetParam();
|
||||
|
||||
std::vector<size_t> inputShape = {1, 3, 24, 24};
|
||||
auto function = ngraph::builder::subgraph::makeMultiSingleConv(inputShape, element_type);
|
||||
prepare_input(function, batch_size);
|
||||
create_worker(batch_size);
|
||||
|
||||
for (int batch_id = 0; batch_id < batch_size; batch_id++) {
|
||||
auto autoRequestImpl = std::make_shared<SyncInferRequest>(inputs,
|
||||
outputs,
|
||||
*workerRequestPtr,
|
||||
batch_id,
|
||||
batch_size,
|
||||
batchedInputs,
|
||||
batchedOutputs);
|
||||
EXPECT_NE(autoRequestImpl, nullptr);
|
||||
autoBatchInferRequests.emplace_back(autoRequestImpl);
|
||||
|
||||
InferenceEngine::SoIInferRequestInternal inferRequestWithoutBatched = {mockInferRequestWithoutBatched, {}};
|
||||
auto asyncInferRequest =
|
||||
std::make_shared<AsyncInferRequest>(autoRequestImpl, inferRequestWithoutBatched, nullptr);
|
||||
EXPECT_NE(asyncInferRequest, nullptr);
|
||||
autoBatchAsyncInferRequestVec.emplace_back(asyncInferRequest);
|
||||
}
|
||||
}
|
||||
|
||||
TEST_P(AutoBatchAsyncInferRequestTest, AutoBatchAsyncInferRequestStartAsyncTest) {
|
||||
int batch_size, infer_interval;
|
||||
ngraph::element::Type_t element_type;
|
||||
std::tie(batch_size, element_type, infer_interval) = this->GetParam();
|
||||
|
||||
std::vector<size_t> inputShape = {1, 3, 24, 24};
|
||||
auto function = ngraph::builder::subgraph::makeMultiSingleConv(inputShape, element_type);
|
||||
prepare_input(function, batch_size);
|
||||
create_worker(batch_size);
|
||||
|
||||
for (int batch_id = 0; batch_id < batch_size; batch_id++) {
|
||||
auto autoRequestImpl = std::make_shared<SyncInferRequest>(inputs,
|
||||
outputs,
|
||||
*workerRequestPtr,
|
||||
batch_id,
|
||||
batch_size,
|
||||
batchedInputs,
|
||||
batchedOutputs);
|
||||
EXPECT_NE(autoRequestImpl, nullptr);
|
||||
autoBatchInferRequests.emplace_back(autoRequestImpl);
|
||||
|
||||
InferenceEngine::SoIInferRequestInternal inferRequestWithoutBatched = {mockInferRequestWithoutBatched, {}};
|
||||
auto asyncInferRequest =
|
||||
std::make_shared<AsyncInferRequest>(autoRequestImpl, inferRequestWithoutBatched, nullptr);
|
||||
EXPECT_NE(asyncInferRequest, nullptr);
|
||||
autoBatchAsyncInferRequestVec.emplace_back(asyncInferRequest);
|
||||
}
|
||||
|
||||
for (auto& req : autoBatchAsyncInferRequestVec) {
|
||||
if (infer_interval > 0)
|
||||
std::this_thread::sleep_for(std::chrono::milliseconds(infer_interval));
|
||||
EXPECT_NO_THROW(req->StartAsync());
|
||||
}
|
||||
|
||||
for (auto& req : autoBatchAsyncInferRequestVec)
|
||||
EXPECT_NO_THROW(req->Wait(InferRequest::WaitMode::RESULT_READY));
|
||||
}
|
||||
|
||||
const std::vector<ngraph::element::Type_t> element_type{ngraph::element::Type_t::f16,
|
||||
ngraph::element::Type_t::f32,
|
||||
ngraph::element::Type_t::f64,
|
||||
ngraph::element::Type_t::i8,
|
||||
ngraph::element::Type_t::i16,
|
||||
ngraph::element::Type_t::i32,
|
||||
ngraph::element::Type_t::i64,
|
||||
ngraph::element::Type_t::u8,
|
||||
ngraph::element::Type_t::u16,
|
||||
ngraph::element::Type_t::u32,
|
||||
ngraph::element::Type_t::u64};
|
||||
const std::vector<int> batch_size{1, 8, 16, 32, 64, 128};
|
||||
const std::vector<int> infer_interval{0};
|
||||
const std::vector<int> infer_interval_timeout{0, 10};
|
||||
|
||||
INSTANTIATE_TEST_SUITE_P(smoke_AutoBatch_BehaviorTests,
|
||||
AutoBatchRequestTest,
|
||||
::testing::Combine(::testing::ValuesIn(batch_size),
|
||||
::testing::ValuesIn(element_type),
|
||||
::testing::ValuesIn(infer_interval)),
|
||||
AutoBatchRequestTest::getTestCaseName);
|
||||
|
||||
INSTANTIATE_TEST_SUITE_P(smoke_AutoBatch_BehaviorTests,
|
||||
AutoBatchAsyncInferRequestTest,
|
||||
::testing::Combine(::testing::ValuesIn(batch_size),
|
||||
::testing::ValuesIn(element_type),
|
||||
::testing::ValuesIn(infer_interval_timeout)),
|
||||
AutoBatchAsyncInferRequestTest::getTestCaseName);
|
@@ -0,0 +1,148 @@
// Copyright (C) 2018-2023 Intel Corporation
// SPDX-License-Identifier: Apache-2.0
//

#include <gmock/gmock.h>
#include <gtest/gtest.h>

#include "mock_common.hpp"
#include "ngraph_functions/subgraph_builders.hpp"
#include "openvino/core/dimension_tracker.hpp"
#include "openvino/runtime/threading/immediate_executor.hpp"
#include "unit_test_utils/mocks/cpp_interfaces/interface/mock_icore.hpp"

using ::testing::_;
using ::testing::AnyNumber;
using ::testing::AtLeast;
using ::testing::Eq;
using ::testing::MatcherCast;
using ::testing::Matches;
using ::testing::NiceMock;
using ::testing::Return;
using ::testing::ReturnRef;
using ::testing::StrEq;
using ::testing::StrNe;
using ::testing::Throw;

using namespace ov::mock_autobatch_plugin;

using CreateInferRequestTestParams = std::tuple<int, // batch_size
|
||||
int>; // inferReq number
|
||||
|
||||
class CompileModelCreateInferRequestTest : public ::testing::TestWithParam<CreateInferRequestTestParams> {
|
||||
public:
|
||||
std::shared_ptr<ov::Model> m_model;
|
||||
std::shared_ptr<NiceMock<MockICore>> m_core;
|
||||
std::shared_ptr<NiceMock<MockAutoBatchInferencePlugin>> m_auto_batch_plugin;
|
||||
|
||||
std::shared_ptr<NiceMock<MockICompiledModel>> m_i_compile_model_without_batch;
|
||||
ov::SoPtr<ov::ICompiledModel> m_compile_model_without_batch;
|
||||
|
||||
std::shared_ptr<NiceMock<MockICompiledModel>> m_i_compile_model_with_batch;
|
||||
ov::SoPtr<ov::ICompiledModel> m_compile_model_with_batch;
|
||||
|
||||
ov::AnyMap m_config;
|
||||
DeviceInformation m_device_info;
|
||||
std::set<std::string> m_batched_inputs;
|
||||
std::set<std::string> m_batched_outputs;
|
||||
ov::SoPtr<ov::IRemoteContext> m_remote_context;
|
||||
|
||||
std::shared_ptr<MockAutoBatchCompileModel> m_auto_batch_compile_model;
|
||||
|
||||
std::shared_ptr<NiceMock<MockISyncInferRequest>> m_sync_infer_request;
|
||||
|
||||
std::shared_ptr<ov::threading::ImmediateExecutor> m_executor;
|
||||
|
||||
uint32_t m_batch_size;
|
||||
int m_infer_request_num;
|
||||
|
||||
public:
|
||||
static std::string getTestCaseName(testing::TestParamInfo<CreateInferRequestTestParams> obj) {
|
||||
int batch_size;
|
||||
int infer_num;
|
||||
std::tie(batch_size, infer_num) = obj.param;
|
||||
|
||||
std::string res;
|
||||
res = "batch_size_" + std::to_string(batch_size);
|
||||
res += "_infer_num_" + std::to_string(infer_num);
|
||||
return res;
|
||||
}
|
||||
|
||||
void TearDown() override {
|
||||
m_auto_batch_plugin.reset();
|
||||
m_model.reset();
|
||||
m_core.reset();
|
||||
m_i_compile_model_without_batch.reset();
|
||||
m_compile_model_without_batch = {};
|
||||
m_i_compile_model_with_batch.reset();
|
||||
m_compile_model_with_batch = {};
|
||||
m_auto_batch_compile_model.reset();
|
||||
m_sync_infer_request.reset();
|
||||
m_executor.reset();
|
||||
}
|
||||
|
||||
void SetUp() override {
|
||||
std::tie(m_batch_size, m_infer_request_num) = this->GetParam();
|
||||
m_model = ngraph::builder::subgraph::makeMultiSingleConv();
|
||||
m_core = std::shared_ptr<NiceMock<MockICore>>(new NiceMock<MockICore>());
|
||||
|
||||
m_auto_batch_plugin =
|
||||
std::shared_ptr<NiceMock<MockAutoBatchInferencePlugin>>(new NiceMock<MockAutoBatchInferencePlugin>());
|
||||
|
||||
m_auto_batch_plugin->set_core(m_core);
|
||||
m_i_compile_model_without_batch = std::make_shared<NiceMock<MockICompiledModel>>(m_model, m_auto_batch_plugin);
|
||||
m_compile_model_without_batch = {m_i_compile_model_without_batch, {}};
|
||||
|
||||
m_config = {{"AUTO_BATCH_TIMEOUT", "200"}};
|
||||
|
||||
m_device_info = {"CPU", {}, m_batch_size};
|
||||
m_batched_inputs = {"Parameter_0"};
|
||||
m_batched_outputs = {"Convolution_20"};
|
||||
|
||||
if (m_batch_size > 1) {
|
||||
m_i_compile_model_with_batch = std::make_shared<NiceMock<MockICompiledModel>>(m_model, m_auto_batch_plugin);
|
||||
m_compile_model_with_batch = {m_i_compile_model_with_batch, {}};
|
||||
}
|
||||
|
||||
ASSERT_NO_THROW(m_auto_batch_compile_model =
|
||||
std::make_shared<MockAutoBatchCompileModel>(m_model->clone(),
|
||||
m_auto_batch_plugin,
|
||||
m_config,
|
||||
m_device_info,
|
||||
m_batched_inputs,
|
||||
m_batched_outputs,
|
||||
m_compile_model_with_batch,
|
||||
m_compile_model_without_batch,
|
||||
m_remote_context));
|
||||
|
||||
m_sync_infer_request = std::make_shared<NiceMock<MockISyncInferRequest>>(m_i_compile_model_without_batch);
|
||||
|
||||
m_executor = std::make_shared<ov::threading::ImmediateExecutor>();
|
||||
|
||||
ON_CALL(*m_i_compile_model_without_batch, create_infer_request()).WillByDefault([this]() {
|
||||
return std::make_shared<NiceMock<MockIAsyncInferRequest>>(m_sync_infer_request, m_executor, nullptr);
|
||||
});
|
||||
|
||||
EXPECT_CALL(*m_auto_batch_compile_model, create_sync_infer_request())
|
||||
.WillRepeatedly(Return(m_sync_infer_request));
|
||||
}
|
||||
};
|
||||
|
||||
TEST_P(CompileModelCreateInferRequestTest, CreateInferRequestTestCases) {
|
||||
std::vector<std::shared_ptr<ov::IAsyncInferRequest>> inferReqs;
|
||||
std::shared_ptr<ov::IAsyncInferRequest> inferReq;
|
||||
for (int i = 0; i < m_infer_request_num; i++) {
|
||||
EXPECT_NO_THROW(inferReq = m_auto_batch_compile_model->create_infer_request());
|
||||
EXPECT_NE(inferReq, nullptr);
|
||||
inferReqs.push_back(inferReq);
|
||||
}
|
||||
inferReqs.clear();
|
||||
}
|
||||
|
||||
const std::vector<int> requests_num{1, 8, 16, 64};
|
||||
const std::vector<int> batch_size{1, 8, 16, 32, 128, 256};
|
||||
|
||||
INSTANTIATE_TEST_SUITE_P(smoke_AutoBatch_BehaviorTests,
|
||||
CompileModelCreateInferRequestTest,
|
||||
::testing::Combine(::testing::ValuesIn(batch_size), ::testing::ValuesIn(requests_num)),
|
||||
CompileModelCreateInferRequestTest::getTestCaseName);
|
@@ -0,0 +1,171 @@
|
||||
// Copyright (C) 2018-2023 Intel Corporation
|
||||
// SPDX-License-Identifier: Apache-2.0
|
||||
//
|
||||
|
||||
#include <gmock/gmock.h>
|
||||
#include <gtest/gtest.h>
|
||||
|
||||
#include "mock_common.hpp"
|
||||
#include "ngraph_functions/subgraph_builders.hpp"
|
||||
#include "openvino/core/dimension_tracker.hpp"
|
||||
#include "unit_test_utils/mocks/cpp_interfaces/interface/mock_icore.hpp"
|
||||
|
||||
using ::testing::_;
|
||||
using ::testing::AnyNumber;
|
||||
using ::testing::AtLeast;
|
||||
using ::testing::Eq;
|
||||
using ::testing::MatcherCast;
|
||||
using ::testing::Matches;
|
||||
using ::testing::NiceMock;
|
||||
using ::testing::Return;
|
||||
using ::testing::ReturnRef;
|
||||
using ::testing::StrEq;
|
||||
using ::testing::StrNe;
|
||||
using ::testing::Throw;
|
||||
|
||||
using namespace ov::mock_autobatch_plugin;
|
||||
|
||||
using get_property_param = std::tuple<std::string, // Property need to be set
|
||||
bool>; // Throw exception
|
||||
|
||||
class CompileModelGetPropertyTest : public ::testing::TestWithParam<get_property_param> {
|
||||
public:
|
||||
std::string m_properity_name;
|
||||
bool m_throw_exception;
|
||||
std::shared_ptr<NiceMock<MockICore>> m_core;
|
||||
std::shared_ptr<NiceMock<MockAutoBatchInferencePlugin>> m_plugin;
|
||||
std::shared_ptr<ov::Model> m_model;
|
||||
|
||||
ov::SoPtr<MockICompiledModel> m_mock_compile_model;
|
||||
std::shared_ptr<MockICompiledModel> m_mock_i_compile_model;
|
||||
std::shared_ptr<NiceMock<MockIPlugin>> m_hardware_plugin;
|
||||
|
||||
std::shared_ptr<ov::ICompiledModel> auto_batch_compile_model;
|
||||
|
||||
public:
|
||||
static std::string getTestCaseName(testing::TestParamInfo<get_property_param> obj) {
|
||||
std::string properity_name;
|
||||
bool throw_exception;
|
||||
std::tie(properity_name, throw_exception) = obj.param;
|
||||
|
||||
std::string res;
|
||||
res += "_" + properity_name;
|
||||
if (throw_exception)
|
||||
res += "throw";
|
||||
|
||||
return res;
|
||||
}
|
||||
|
||||
void TearDown() override {
|
||||
m_core.reset();
|
||||
m_plugin.reset();
|
||||
m_model.reset();
|
||||
m_mock_i_compile_model.reset();
|
||||
m_mock_compile_model = {};
|
||||
auto_batch_compile_model.reset();
|
||||
}
|
||||
|
||||
void SetUp() override {
|
||||
std::tie(m_properity_name, m_throw_exception) = this->GetParam();
|
||||
m_model = ngraph::builder::subgraph::makeMultiSingleConv();
|
||||
m_core = std::shared_ptr<NiceMock<MockICore>>(new NiceMock<MockICore>());
|
||||
m_plugin =
|
||||
std::shared_ptr<NiceMock<MockAutoBatchInferencePlugin>>(new NiceMock<MockAutoBatchInferencePlugin>());
|
||||
m_plugin->set_core(m_core);
|
||||
m_hardware_plugin = std::shared_ptr<NiceMock<MockIPlugin>>(new NiceMock<MockIPlugin>());
|
||||
m_mock_i_compile_model = std::make_shared<NiceMock<MockICompiledModel>>(m_model, m_hardware_plugin);
|
||||
m_mock_compile_model = {m_mock_i_compile_model, {}};
|
||||
|
||||
ON_CALL(*m_core,
|
||||
compile_model(MatcherCast<const std::shared_ptr<const ov::Model>&>(_),
|
||||
MatcherCast<const std::string&>(_),
|
||||
_))
|
||||
.WillByDefault(Return(m_mock_compile_model));
|
||||
|
||||
ON_CALL(*m_core,
|
||||
compile_model(MatcherCast<const std::shared_ptr<const ov::Model>&>(_),
|
||||
MatcherCast<const ov::SoPtr<ov::IRemoteContext>&>(_),
|
||||
_))
|
||||
.WillByDefault(Return(m_mock_compile_model));
|
||||
|
||||
ON_CALL(*m_core, get_property(_, StrEq("PERFORMANCE_HINT")))
|
||||
.WillByDefault(Return(ov::hint::PerformanceMode::THROUGHPUT));
|
||||
|
||||
ON_CALL(*m_core, get_property(_, StrEq("OPTIMAL_BATCH_SIZE"), _))
|
||||
.WillByDefault(Return(static_cast<unsigned int>(16)));
|
||||
|
||||
ON_CALL(*m_core, get_property(_, StrEq("PERFORMANCE_HINT_NUM_REQUESTS")))
|
||||
.WillByDefault(Return(static_cast<uint32_t>(12)));
|
||||
|
||||
ON_CALL(*m_core, get_property(_, StrEq("GPU_MEMORY_STATISTICS"), _))
|
||||
.WillByDefault([](const std::string& device, const std::string& key, const ov::AnyMap& options) {
|
||||
std::map<std::string, uint64_t> ret = {{"xyz", 1024}};
|
||||
return ret;
|
||||
});
|
||||
|
||||
ON_CALL(*m_core, get_property(_, StrEq("GPU_DEVICE_TOTAL_MEM_SIZE"), _)).WillByDefault(Return("10240"));
|
||||
|
||||
const ov::AnyMap configs = {{"AUTO_BATCH_TIMEOUT", "200"}, {"AUTO_BATCH_DEVICE_CONFIG", "CPU(16)"}};
|
||||
ASSERT_NO_THROW(auto_batch_compile_model = m_plugin->compile_model(m_model, configs));
|
||||
|
||||
std::string network_name = m_model.get()->get_name();
|
||||
std::vector<ov::PropertyName> supported_props = {ov::optimal_batch_size, ov::cache_dir};
|
||||
|
||||
ON_CALL(*m_mock_i_compile_model.get(), get_property(StrEq(ov::supported_properties.name())))
|
||||
.WillByDefault(Return(ov::Any(supported_props)));
|
||||
|
||||
ON_CALL(*m_mock_i_compile_model.get(), get_property(StrEq("PERFORMANCE_HINT_NUM_REQUESTS")))
|
||||
.WillByDefault(Return("0"));
|
||||
|
||||
ON_CALL(*m_mock_i_compile_model.get(), get_property(StrEq("OPTIMAL_NUMBER_OF_INFER_REQUESTS")))
|
||||
.WillByDefault(Return("12"));
|
||||
|
||||
ON_CALL(*m_mock_i_compile_model.get(), get_property(StrEq("NETWORK_NAME")))
|
||||
.WillByDefault(Return(network_name.c_str()));
|
||||
|
||||
ON_CALL(*m_mock_i_compile_model.get(), get_property(StrEq("EXECUTION_DEVICES"))).WillByDefault(Return("CPU"));
|
||||
|
||||
ON_CALL(*m_mock_i_compile_model.get(), get_property(StrEq("SUPPORTED_CONFIG_KEYS")))
|
||||
.WillByDefault(Return("CPU"));
|
||||
|
||||
ON_CALL(*m_mock_i_compile_model.get(), get_property(StrEq("SUPPORTED_CONFIG_KEYS")))
|
||||
.WillByDefault([](const std::string& name) {
|
||||
std::vector<std::string> res_config;
|
||||
res_config.emplace_back("CACHE_DIR");
|
||||
res_config.emplace_back("OPTIMAL_BATCH_SIZE");
|
||||
return res_config;
|
||||
});
|
||||
|
||||
ON_CALL(*m_mock_i_compile_model.get(), get_property(StrEq("CACHE_DIR"))).WillByDefault(Return("./abc"));
|
||||
|
||||
ON_CALL(*m_mock_i_compile_model.get(), get_property(StrEq("OPTIMAL_BATCH_SIZE"))).WillByDefault(Return("16"));
|
||||
}
|
||||
};
|
||||
|
||||
TEST_P(CompileModelGetPropertyTest, CompileModelGetPropertyTestCase) {
|
||||
if (m_throw_exception)
|
||||
ASSERT_ANY_THROW(auto_batch_compile_model->get_property(m_properity_name));
|
||||
else
|
||||
ASSERT_NO_THROW(auto_batch_compile_model->get_property(m_properity_name));
|
||||
}
|
||||
|
||||
const std::vector<get_property_param> compile_model_get_property_param_test = {
|
||||
get_property_param{METRIC_KEY(OPTIMAL_NUMBER_OF_INFER_REQUESTS), false},
|
||||
get_property_param{METRIC_KEY(NETWORK_NAME), false},
|
||||
get_property_param{METRIC_KEY(SUPPORTED_METRICS), false},
|
||||
get_property_param{METRIC_KEY(SUPPORTED_CONFIG_KEYS), false},
|
||||
get_property_param{ov::execution_devices.name(), false},
|
||||
get_property_param{CONFIG_KEY(AUTO_BATCH_DEVICE_CONFIG), false},
|
||||
get_property_param{CONFIG_KEY(AUTO_BATCH_TIMEOUT), false},
|
||||
get_property_param{CONFIG_KEY(CACHE_DIR), false},
|
||||
// Config handled by the dependent (underlying) plugin
|
||||
get_property_param{"OPTIMAL_BATCH_SIZE", false},
|
||||
// Incorrect Property
|
||||
get_property_param{"INCORRECT_METRIC", true},
|
||||
get_property_param{"INCORRECT_CONFIG", true},
|
||||
};
|
||||
|
||||
INSTANTIATE_TEST_SUITE_P(smoke_AutoBatch_BehaviorTests,
|
||||
CompileModelGetPropertyTest,
|
||||
::testing::ValuesIn(compile_model_get_property_param_test),
|
||||
CompileModelGetPropertyTest::getTestCaseName);
|
@@ -0,0 +1,99 @@
|
||||
// Copyright (C) 2018-2023 Intel Corporation
|
||||
// SPDX-License-Identifier: Apache-2.0
|
||||
//
|
||||
|
||||
#include <gmock/gmock.h>
|
||||
#include <gtest/gtest.h>
|
||||
|
||||
#include "mock_common.hpp"
|
||||
#include "ngraph_functions/subgraph_builders.hpp"
|
||||
#include "openvino/core/dimension_tracker.hpp"
|
||||
#include "unit_test_utils/mocks/cpp_interfaces/interface/mock_icore.hpp"
|
||||
|
||||
using ::testing::_;
|
||||
using ::testing::AnyNumber;
|
||||
using ::testing::AtLeast;
|
||||
using ::testing::Eq;
|
||||
using ::testing::MatcherCast;
|
||||
using ::testing::Matches;
|
||||
using ::testing::NiceMock;
|
||||
using ::testing::Return;
|
||||
using ::testing::ReturnRef;
|
||||
using ::testing::StrEq;
|
||||
using ::testing::StrNe;
|
||||
using ::testing::Throw;
|
||||
|
||||
using namespace ov::mock_autobatch_plugin;
|
||||
|
||||
class CompileModelGetRuntimeModelTest : public ::testing::Test {
|
||||
public:
|
||||
std::shared_ptr<NiceMock<MockICore>> m_core;
|
||||
std::shared_ptr<NiceMock<MockAutoBatchInferencePlugin>> m_plugin;
|
||||
std::shared_ptr<ov::Model> m_model;
|
||||
|
||||
ov::SoPtr<MockICompiledModel> m_mock_compile_model;
|
||||
std::shared_ptr<MockICompiledModel> m_mock_i_compile_model;
|
||||
std::shared_ptr<ov::ICompiledModel> m_auto_batch_compile_model;
|
||||
|
||||
std::shared_ptr<NiceMock<MockIPlugin>> m_hardware_plugin;
|
||||
|
||||
public:
|
||||
void TearDown() override {
|
||||
m_core.reset();
|
||||
m_plugin.reset();
|
||||
m_model.reset();
|
||||
m_mock_i_compile_model.reset();
|
||||
m_mock_compile_model = {};
|
||||
m_auto_batch_compile_model.reset();
|
||||
}
|
||||
|
||||
void SetUp() override {
|
||||
m_model = ngraph::builder::subgraph::makeMultiSingleConv();
|
||||
m_core = std::shared_ptr<NiceMock<MockICore>>(new NiceMock<MockICore>());
|
||||
m_plugin =
|
||||
std::shared_ptr<NiceMock<MockAutoBatchInferencePlugin>>(new NiceMock<MockAutoBatchInferencePlugin>());
|
||||
m_plugin->set_core(m_core);
|
||||
m_hardware_plugin = std::shared_ptr<NiceMock<MockIPlugin>>(new NiceMock<MockIPlugin>());
|
||||
m_mock_i_compile_model = std::make_shared<NiceMock<MockICompiledModel>>(m_model, m_hardware_plugin);
|
||||
m_mock_compile_model = {m_mock_i_compile_model, {}};
|
||||
|
||||
ON_CALL(*m_core,
|
||||
compile_model(MatcherCast<const std::shared_ptr<const ov::Model>&>(_),
|
||||
MatcherCast<const std::string&>(_),
|
||||
_))
|
||||
.WillByDefault(Return(m_mock_compile_model));
|
||||
|
||||
ON_CALL(*m_core,
|
||||
compile_model(MatcherCast<const std::shared_ptr<const ov::Model>&>(_),
|
||||
MatcherCast<const ov::SoPtr<ov::IRemoteContext>&>(_),
|
||||
_))
|
||||
.WillByDefault(Return(m_mock_compile_model));
|
||||
|
||||
ON_CALL(*m_core, get_property(_, StrEq("PERFORMANCE_HINT")))
|
||||
.WillByDefault(Return(ov::hint::PerformanceMode::THROUGHPUT));
|
||||
|
||||
ON_CALL(*m_core, get_property(_, StrEq("OPTIMAL_BATCH_SIZE"), _))
|
||||
.WillByDefault(Return(static_cast<unsigned int>(16)));
|
||||
|
||||
ON_CALL(*m_core, get_property(_, StrEq("PERFORMANCE_HINT_NUM_REQUESTS")))
|
||||
.WillByDefault(Return(static_cast<uint32_t>(12)));
|
||||
|
||||
ON_CALL(*m_core, get_property(_, StrEq("GPU_MEMORY_STATISTICS"), _))
|
||||
.WillByDefault([](const std::string& device, const std::string& key, const ov::AnyMap& options) {
|
||||
std::map<std::string, uint64_t> ret = {{"xyz", 1024}};
|
||||
return ret;
|
||||
});
|
||||
|
||||
ON_CALL(*m_core, get_property(_, StrEq("GPU_DEVICE_TOTAL_MEM_SIZE"), _)).WillByDefault(Return("10240"));
|
||||
|
||||
ON_CALL(*m_mock_i_compile_model.get(), get_runtime_model()).WillByDefault(Return(m_model));
|
||||
|
||||
const ov::AnyMap configs = {{"AUTO_BATCH_TIMEOUT", "200"}, {"AUTO_BATCH_DEVICE_CONFIG", "CPU(16)"}};
|
||||
|
||||
ASSERT_NO_THROW(m_auto_batch_compile_model = m_plugin->compile_model(m_model, configs));
|
||||
}
|
||||
};
|
||||
|
||||
TEST_F(CompileModelGetRuntimeModelTest, CompileModelGetRuntimeModelTestCase) {
|
||||
ASSERT_NO_THROW(m_auto_batch_compile_model->get_runtime_model());
|
||||
}
|
@@ -0,0 +1,132 @@
|
||||
// Copyright (C) 2018-2023 Intel Corporation
|
||||
// SPDX-License-Identifier: Apache-2.0
|
||||
//
|
||||
|
||||
#include <gmock/gmock.h>
|
||||
#include <gtest/gtest.h>
|
||||
|
||||
#include "mock_common.hpp"
|
||||
#include "ngraph_functions/subgraph_builders.hpp"
|
||||
#include "openvino/core/dimension_tracker.hpp"
|
||||
#include "unit_test_utils/mocks/cpp_interfaces/interface/mock_icore.hpp"
|
||||
|
||||
using ::testing::_;
|
||||
using ::testing::AnyNumber;
|
||||
using ::testing::AtLeast;
|
||||
using ::testing::Eq;
|
||||
using ::testing::MatcherCast;
|
||||
using ::testing::Matches;
|
||||
using ::testing::NiceMock;
|
||||
using ::testing::Return;
|
||||
using ::testing::ReturnRef;
|
||||
using ::testing::StrEq;
|
||||
using ::testing::StrNe;
|
||||
using ::testing::Throw;
|
||||
|
||||
using namespace ov::mock_autobatch_plugin;
|
||||
|
||||
using set_property_param = std::tuple<ov::AnyMap, // Property need to be set
|
||||
bool>; // Throw exception
|
||||
|
||||
class CompileModelSetPropertyTest : public ::testing::TestWithParam<set_property_param> {
|
||||
public:
|
||||
ov::AnyMap m_properities;
|
||||
bool m_throw_exception;
|
||||
std::shared_ptr<NiceMock<MockICore>> m_core;
|
||||
std::shared_ptr<NiceMock<MockAutoBatchInferencePlugin>> m_plugin;
|
||||
std::shared_ptr<ov::Model> m_model;
|
||||
|
||||
// Mock execNetwork
|
||||
ov::SoPtr<MockICompiledModel> m_mock_compile_model;
|
||||
std::shared_ptr<MockICompiledModel> m_mock_i_compile_model;
|
||||
std::shared_ptr<NiceMock<MockIPlugin>> m_hardware_plugin;
|
||||
|
||||
std::shared_ptr<ov::ICompiledModel> m_auto_batch_compile_model;
|
||||
|
||||
public:
|
||||
static std::string getTestCaseName(testing::TestParamInfo<set_property_param> obj) {
|
||||
ov::AnyMap properities;
|
||||
bool throw_exception;
|
||||
std::tie(properities, throw_exception) = obj.param;
|
||||
|
||||
std::string res;
|
||||
for (auto& c : properities) {
|
||||
res += "_" + c.first + "_" + c.second.as<std::string>();
|
||||
}
|
||||
if (throw_exception)
|
||||
res += "throw";
|
||||
|
||||
return res;
|
||||
}
|
||||
|
||||
void TearDown() override {
|
||||
m_core.reset();
|
||||
m_plugin.reset();
|
||||
m_model.reset();
|
||||
m_mock_i_compile_model.reset();
|
||||
m_mock_compile_model = {};
|
||||
m_auto_batch_compile_model.reset();
|
||||
}
|
||||
|
||||
void SetUp() override {
|
||||
std::tie(m_properities, m_throw_exception) = this->GetParam();
|
||||
m_model = ngraph::builder::subgraph::makeMultiSingleConv();
|
||||
m_core = std::shared_ptr<NiceMock<MockICore>>(new NiceMock<MockICore>());
|
||||
m_plugin =
|
||||
std::shared_ptr<NiceMock<MockAutoBatchInferencePlugin>>(new NiceMock<MockAutoBatchInferencePlugin>());
|
||||
m_plugin->set_core(m_core);
|
||||
m_hardware_plugin = std::shared_ptr<NiceMock<MockIPlugin>>(new NiceMock<MockIPlugin>());
|
||||
m_mock_i_compile_model = std::make_shared<NiceMock<MockICompiledModel>>(m_model, m_hardware_plugin);
|
||||
m_mock_compile_model = {m_mock_i_compile_model, {}};
|
||||
|
||||
ON_CALL(*m_core,
|
||||
compile_model(MatcherCast<const std::shared_ptr<const ov::Model>&>(_),
|
||||
MatcherCast<const std::string&>(_),
|
||||
_))
|
||||
.WillByDefault(Return(m_mock_compile_model));
|
||||
|
||||
ON_CALL(*m_core,
|
||||
compile_model(MatcherCast<const std::shared_ptr<const ov::Model>&>(_),
|
||||
MatcherCast<const ov::SoPtr<ov::IRemoteContext>&>(_),
|
||||
_))
|
||||
.WillByDefault(Return(m_mock_compile_model));
|
||||
|
||||
ON_CALL(*m_core, get_property(_, StrEq("PERFORMANCE_HINT")))
|
||||
.WillByDefault(Return(ov::hint::PerformanceMode::THROUGHPUT));
|
||||
|
||||
ON_CALL(*m_core, get_property(_, StrEq("OPTIMAL_BATCH_SIZE"), _))
|
||||
.WillByDefault(Return(static_cast<unsigned int>(16)));
|
||||
|
||||
ON_CALL(*m_core, get_property(_, StrEq("PERFORMANCE_HINT_NUM_REQUESTS")))
|
||||
.WillByDefault(Return(static_cast<uint32_t>(12)));
|
||||
|
||||
ON_CALL(*m_core, get_property(_, StrEq("GPU_MEMORY_STATISTICS"), _))
|
||||
.WillByDefault([](const std::string& device, const std::string& key, const ov::AnyMap& options) {
|
||||
std::map<std::string, uint64_t> ret = {{"xyz", 1024}};
|
||||
return ret;
|
||||
});
|
||||
|
||||
ON_CALL(*m_core, get_property(_, StrEq("GPU_DEVICE_TOTAL_MEM_SIZE"), _)).WillByDefault(Return("10240"));
|
||||
|
||||
const ov::AnyMap configs = {{"AUTO_BATCH_TIMEOUT", "200"}, {"AUTO_BATCH_DEVICE_CONFIG", "CPU(16)"}};
|
||||
|
||||
ASSERT_NO_THROW(m_auto_batch_compile_model = m_plugin->compile_model(m_model, configs));
|
||||
}
|
||||
};
|
||||
|
||||
TEST_P(CompileModelSetPropertyTest, CompileModelSetPropertyTestCase) {
|
||||
if (m_throw_exception)
|
||||
ASSERT_ANY_THROW(m_auto_batch_compile_model->set_property(m_properities));
|
||||
else
|
||||
ASSERT_NO_THROW(m_auto_batch_compile_model->set_property(m_properities));
|
||||
}
|
||||
|
||||
const std::vector<set_property_param> compile_model_set_property_param_test = {
|
||||
set_property_param{{{CONFIG_KEY(AUTO_BATCH_TIMEOUT), std::uint32_t(100)}}, false},
|
||||
set_property_param{{{"INCORRECT_CONFIG", 2}}, true},
|
||||
};
|
||||
|
||||
INSTANTIATE_TEST_SUITE_P(smoke_AutoBatch_BehaviorTests,
|
||||
CompileModelSetPropertyTest,
|
||||
::testing::ValuesIn(compile_model_set_property_param_test),
|
||||
CompileModelSetPropertyTest::getTestCaseName);
|
@@ -1,133 +0,0 @@
|
||||
// Copyright (C) 2018-2023 Intel Corporation
|
||||
// SPDX-License-Identifier: Apache-2.0
|
||||
//
|
||||
|
||||
#include <gmock/gmock.h>
|
||||
#include <gtest/gtest.h>
|
||||
|
||||
#include "cpp_interfaces/interface/ie_iplugin_internal.hpp"
|
||||
#include "mock_auto_batch_plugin.hpp"
|
||||
#include "ngraph_functions/subgraph_builders.hpp"
|
||||
#include "unit_test_utils/mocks/cpp_interfaces/impl/mock_inference_plugin_internal.hpp"
|
||||
#include "unit_test_utils/mocks/cpp_interfaces/interface/mock_icore.hpp"
|
||||
#include "unit_test_utils/mocks/cpp_interfaces/interface/mock_iexecutable_network_internal.hpp"
|
||||
#include "unit_test_utils/mocks/cpp_interfaces/interface/mock_iinference_plugin.hpp"
|
||||
#include "unit_test_utils/mocks/cpp_interfaces/interface/mock_ivariable_state_internal.hpp"
|
||||
|
||||
using ::testing::_;
|
||||
using ::testing::AnyNumber;
|
||||
using ::testing::AtLeast;
|
||||
using ::testing::Eq;
|
||||
using ::testing::MatcherCast;
|
||||
using ::testing::Matches;
|
||||
using ::testing::NiceMock;
|
||||
using ::testing::Return;
|
||||
using ::testing::ReturnRef;
|
||||
using ::testing::StrEq;
|
||||
using ::testing::StrNe;
|
||||
using ::testing::Throw;
|
||||
using namespace ov::mock_autobatch_plugin;
|
||||
using namespace InferenceEngine;
|
||||
|
||||
using CreateInferRequestTestParams = std::tuple<int, // batch_size
|
||||
int>; // inferReq number
|
||||
class CreateInferRequestTest : public ::testing::TestWithParam<CreateInferRequestTestParams> {
|
||||
public:
|
||||
std::shared_ptr<NiceMock<MockICore>> core;
|
||||
std::shared_ptr<NiceMock<MockAutoBatchInferencePlugin>> plugin;
|
||||
|
||||
// Mock execNetwork
|
||||
std::shared_ptr<NiceMock<MockIExecutableNetworkInternal>> mockIExecNet;
|
||||
ov::SoPtr<IExecutableNetworkInternal> mockExecNetwork;
|
||||
std::shared_ptr<NiceMock<MockIInferencePlugin>> mockIPlugin;
|
||||
std::shared_ptr<InferenceEngine::IInferencePlugin> mockPlugin;
|
||||
ov::SoPtr<IExecutableNetworkInternal> batchedExecNetwork;
|
||||
|
||||
std::shared_ptr<CompiledModel> actualExecNet;
|
||||
std::vector<std::shared_ptr<NiceMock<MockIInferRequestInternal>>> inferRequestVec;
|
||||
|
||||
public:
|
||||
static std::string getTestCaseName(testing::TestParamInfo<CreateInferRequestTestParams> obj) {
|
||||
int batch_size;
|
||||
int infer_num;
|
||||
std::tie(batch_size, infer_num) = obj.param;
|
||||
|
||||
std::string res;
|
||||
res = "batch_size_" + std::to_string(batch_size);
|
||||
res += "_infer_num_" + std::to_string(infer_num);
|
||||
return res;
|
||||
}
|
||||
|
||||
void TearDown() override {
|
||||
core.reset();
|
||||
plugin.reset();
|
||||
mockIExecNet.reset();
|
||||
mockExecNetwork = {};
|
||||
batchedExecNetwork = {};
|
||||
mockPlugin = {};
|
||||
actualExecNet.reset();
|
||||
inferRequestVec.clear();
|
||||
}
|
||||
|
||||
void SetUp() override {
|
||||
mockIExecNet = std::make_shared<NiceMock<MockIExecutableNetworkInternal>>();
|
||||
mockIPlugin = std::make_shared<NiceMock<MockIInferencePlugin>>();
|
||||
ON_CALL(*mockIPlugin, LoadNetwork(MatcherCast<const CNNNetwork&>(_), _)).WillByDefault(Return(mockIExecNet));
|
||||
mockPlugin = mockIPlugin;
|
||||
mockExecNetwork =
|
||||
ov::SoPtr<InferenceEngine::IExecutableNetworkInternal>(mockPlugin->LoadNetwork(CNNNetwork{}, {}), {});
|
||||
batchedExecNetwork = {};
|
||||
|
||||
core = std::shared_ptr<NiceMock<MockICore>>(new NiceMock<MockICore>());
|
||||
plugin = std::shared_ptr<NiceMock<MockAutoBatchInferencePlugin>>(new NiceMock<MockAutoBatchInferencePlugin>());
|
||||
plugin->SetCore(core);
|
||||
|
||||
// Create inferRequest
|
||||
ON_CALL(*mockIExecNet.get(), CreateInferRequest()).WillByDefault([this]() {
|
||||
auto inferReq = std::make_shared<NiceMock<MockIInferRequestInternal>>();
|
||||
inferRequestVec.push_back(inferReq);
|
||||
return inferReq;
|
||||
});
|
||||
}
|
||||
|
||||
CompiledModel::Ptr createAutoBatchExecutableNetwork(int batch_size) {
|
||||
DeviceInformation metaDevice = {"CPU", {}, batch_size};
|
||||
std::unordered_map<std::string, InferenceEngine::Parameter> config = {{CONFIG_KEY(AUTO_BATCH_TIMEOUT), "200"}};
|
||||
std::set<std::string> batched_inputs = {"Parameter_0"};
|
||||
std::set<std::string> batched_outputs = {"Convolution_20"};
|
||||
|
||||
if (batch_size > 1)
|
||||
batchedExecNetwork =
|
||||
ov::SoPtr<InferenceEngine::IExecutableNetworkInternal>(mockPlugin->LoadNetwork(CNNNetwork{}, {}), {});
|
||||
return std::make_shared<CompiledModel>(batchedExecNetwork,
|
||||
mockExecNetwork,
|
||||
metaDevice,
|
||||
config,
|
||||
batched_inputs,
|
||||
batched_outputs);
|
||||
}
|
||||
};
|
||||
|
||||
TEST_P(CreateInferRequestTest, CreateInferRequestTestCases) {
|
||||
int batch_size;
|
||||
int infer_num;
|
||||
std::tie(batch_size, infer_num) = this->GetParam();
|
||||
|
||||
actualExecNet = createAutoBatchExecutableNetwork(batch_size);
|
||||
std::vector<InferenceEngine::IInferRequestInternal::Ptr> inferReqs;
|
||||
InferenceEngine::IInferRequestInternal::Ptr inferReq;
|
||||
for (int i = 0; i < infer_num; i++) {
|
||||
EXPECT_NO_THROW(inferReq = actualExecNet->CreateInferRequest());
|
||||
EXPECT_NE(inferReq, nullptr);
|
||||
inferReqs.push_back(inferReq);
|
||||
}
|
||||
inferReqs.clear();
|
||||
}
|
||||
|
||||
const std::vector<int> requests_num{1, 8, 16, 64};
|
||||
const std::vector<int> batch_size{1, 8, 16, 32, 128, 256};
|
||||
|
||||
INSTANTIATE_TEST_SUITE_P(smoke_AutoBatch_BehaviorTests,
|
||||
CreateInferRequestTest,
|
||||
::testing::Combine(::testing::ValuesIn(batch_size), ::testing::ValuesIn(requests_num)),
|
||||
CreateInferRequestTest::getTestCaseName);
|
@@ -1,201 +0,0 @@
|
||||
// Copyright (C) 2018-2023 Intel Corporation
|
||||
// SPDX-License-Identifier: Apache-2.0
|
||||
//
|
||||
|
||||
#include <gmock/gmock.h>
|
||||
#include <gtest/gtest.h>
|
||||
|
||||
#include "cpp_interfaces/interface/ie_iplugin_internal.hpp"
|
||||
#include "mock_auto_batch_plugin.hpp"
|
||||
#include "ngraph_functions/subgraph_builders.hpp"
|
||||
#include "unit_test_utils/mocks/cpp_interfaces/impl/mock_inference_plugin_internal.hpp"
|
||||
#include "unit_test_utils/mocks/cpp_interfaces/interface/mock_icore.hpp"
|
||||
#include "unit_test_utils/mocks/cpp_interfaces/interface/mock_iexecutable_network_internal.hpp"
|
||||
#include "unit_test_utils/mocks/cpp_interfaces/interface/mock_iinference_plugin.hpp"
|
||||
#include "unit_test_utils/mocks/cpp_interfaces/interface/mock_ivariable_state_internal.hpp"
|
||||
|
||||
using ::testing::_;
|
||||
using ::testing::AnyNumber;
|
||||
using ::testing::AtLeast;
|
||||
using ::testing::Eq;
|
||||
using ::testing::MatcherCast;
|
||||
using ::testing::Matches;
|
||||
using ::testing::NiceMock;
|
||||
using ::testing::Return;
|
||||
using ::testing::ReturnRef;
|
||||
using ::testing::StrEq;
|
||||
using ::testing::StrNe;
|
||||
using ::testing::Throw;
|
||||
using namespace ov::mock_autobatch_plugin;
|
||||
using namespace InferenceEngine;
|
||||
|
||||
using ExecNetworkParams = std::tuple<std::string, // Key name
|
||||
int, // GetMetric(0) or GetConfig(1) or SetConfig(3)
|
||||
bool>; // Throw exception
|
||||
class ExecNetworkTest : public ::testing::TestWithParam<ExecNetworkParams> {
|
||||
public:
|
||||
std::shared_ptr<NiceMock<MockICore>> core;
|
||||
std::shared_ptr<NiceMock<MockAutoBatchInferencePlugin>> plugin;
|
||||
|
||||
// Mock execNetwork
|
||||
std::shared_ptr<NiceMock<MockIExecutableNetworkInternal>> mockIExecNet;
|
||||
ov::SoPtr<IExecutableNetworkInternal> mockExecNetwork;
|
||||
std::shared_ptr<NiceMock<MockIInferencePlugin>> mockIPlugin;
|
||||
std::shared_ptr<InferenceEngine::IInferencePlugin> mockPlugin;
|
||||
|
||||
InferenceEngine::IExecutableNetworkInternal::Ptr actualExecNet;
|
||||
|
||||
public:
|
||||
static std::string getTestCaseName(testing::TestParamInfo<ExecNetworkParams> obj) {
|
||||
std::string name;
|
||||
bool throw_exception;
|
||||
int action;
|
||||
std::tie(name, action, throw_exception) = obj.param;
|
||||
|
||||
std::string res;
|
||||
switch (action) {
|
||||
case 0:
|
||||
res += "GetMetric_" + name;
|
||||
break;
|
||||
case 1:
|
||||
res += "GetConfig_" + name;
|
||||
break;
|
||||
case 3:
|
||||
res += "SetConfig_" + name;
|
||||
break;
|
||||
default:
|
||||
res += "error_" + name;
|
||||
}
|
||||
|
||||
if (throw_exception)
|
||||
res += "throw";
|
||||
|
||||
return res;
|
||||
}
|
||||
|
||||
void TearDown() override {
|
||||
core.reset();
|
||||
plugin.reset();
|
||||
mockIExecNet.reset();
|
||||
mockExecNetwork = {};
|
||||
mockPlugin = {};
|
||||
actualExecNet.reset();
|
||||
}
|
||||
|
||||
void SetUp() override {
|
||||
mockIExecNet = std::make_shared<NiceMock<MockIExecutableNetworkInternal>>();
|
||||
auto mockIPluginPtr = std::make_shared<NiceMock<MockIInferencePlugin>>();
|
||||
ON_CALL(*mockIPluginPtr, LoadNetwork(MatcherCast<const CNNNetwork&>(_), _)).WillByDefault(Return(mockIExecNet));
|
||||
mockPlugin = mockIPluginPtr;
|
||||
EXPECT_CALL(*mockIPluginPtr, LoadNetwork(MatcherCast<const CNNNetwork&>(_), _)).Times(1);
|
||||
mockExecNetwork =
|
||||
ov::SoPtr<InferenceEngine::IExecutableNetworkInternal>(mockPlugin->LoadNetwork(CNNNetwork{}, {}), {});
|
||||
|
||||
core = std::shared_ptr<NiceMock<MockICore>>(new NiceMock<MockICore>());
|
||||
plugin = std::shared_ptr<NiceMock<MockAutoBatchInferencePlugin>>(new NiceMock<MockAutoBatchInferencePlugin>());
|
||||
plugin->SetCore(core);
|
||||
|
||||
ON_CALL(*plugin, ParseBatchDevice).WillByDefault([this](const std::string& batchDevice) {
|
||||
return plugin->Plugin::ParseBatchDevice(batchDevice);
|
||||
});
|
||||
ON_CALL(*core, LoadNetwork(MatcherCast<const CNNNetwork&>(_), MatcherCast<const std::string&>(_), _))
|
||||
.WillByDefault(Return(mockExecNetwork));
|
||||
ON_CALL(*core,
|
||||
LoadNetwork(MatcherCast<const CNNNetwork&>(_),
|
||||
MatcherCast<const std::shared_ptr<InferenceEngine::RemoteContext>&>(_),
|
||||
_))
|
||||
.WillByDefault(Return(mockExecNetwork));
|
||||
ON_CALL(*core, GetConfig(_, StrEq("PERFORMANCE_HINT"))).WillByDefault(Return("THROUGHPUT"));
|
||||
ON_CALL(*core, GetMetric(_, StrEq("OPTIMAL_BATCH_SIZE"), _)).WillByDefault(Return("16"));
|
||||
ON_CALL(*core, GetConfig(_, StrEq("PERFORMANCE_HINT_NUM_REQUESTS"))).WillByDefault(Return("12"));
|
||||
ON_CALL(*core, GetMetric(_, StrEq("GPU_MEMORY_STATISTICS"), _))
|
||||
.WillByDefault([](const std::string& device, const std::string& key, const ov::AnyMap& options) {
|
||||
std::map<std::string, uint64_t> ret = {{"xyz", 1024}};
|
||||
return ret;
|
||||
});
|
||||
ON_CALL(*core, GetMetric(_, StrEq("GPU_DEVICE_TOTAL_MEM_SIZE"), _)).WillByDefault(Return("10240"));
|
||||
auto graph = ngraph::builder::subgraph::makeMultiSingleConv();
|
||||
auto net = CNNNetwork(graph);
|
||||
|
||||
const std::map<std::string, std::string> configs = {{"AUTO_BATCH_TIMEOUT", "200"},
|
||||
{"AUTO_BATCH_DEVICE_CONFIG", "CPU(16)"}};
|
||||
ASSERT_NO_THROW(actualExecNet = plugin->LoadNetworkImpl(net, {}, configs));
|
||||
|
||||
ON_CALL(*mockIExecNet, GetConfig(StrEq("PERFORMANCE_HINT_NUM_REQUESTS"))).WillByDefault(Return("0"));
|
||||
ON_CALL(*mockIExecNet, GetMetric(StrEq("OPTIMAL_NUMBER_OF_INFER_REQUESTS"))).WillByDefault(Return("12"));
|
||||
ON_CALL(*mockIExecNet, GetMetric(StrEq("NETWORK_NAME"))).WillByDefault(Return("network_name"));
|
||||
ON_CALL(*mockIExecNet, GetMetric(StrEq("EXECUTION_DEVICES"))).WillByDefault(Return("CPU"));
|
||||
ON_CALL(*mockIExecNet, GetMetric(StrEq("SUPPORTED_CONFIG_KEYS"))).WillByDefault(Return("CPU"));
|
||||
ON_CALL(*mockIExecNet, GetMetric(StrEq("SUPPORTED_CONFIG_KEYS"))).WillByDefault([](const std::string& name) {
|
||||
std::vector<std::string> res_config;
|
||||
res_config.emplace_back("CACHE_DIR");
|
||||
res_config.emplace_back("OPTIMAL_BATCH_SIZE");
|
||||
return res_config;
|
||||
});
|
||||
ON_CALL(*mockIExecNet, GetConfig(StrEq("CACHE_DIR"))).WillByDefault(Return("./abc"));
|
||||
ON_CALL(*mockIExecNet, GetConfig(StrEq("OPTIMAL_BATCH_SIZE"))).WillByDefault(Return("16"));
|
||||
}
|
||||
};
|
||||
|
||||
TEST_P(ExecNetworkTest, ExecNetworkGetConfigMetricTestCase) {
|
||||
std::string name;
|
||||
bool throw_exception;
|
||||
int action;
|
||||
std::tie(name, action, throw_exception) = this->GetParam();
|
||||
|
||||
std::map<std::string, InferenceEngine::Parameter> config;
|
||||
|
||||
switch (action) {
|
||||
case 0: {
|
||||
if (throw_exception)
|
||||
ASSERT_ANY_THROW(actualExecNet->GetMetric(name));
|
||||
else
|
||||
ASSERT_NO_THROW(actualExecNet->GetMetric(name));
|
||||
break;
|
||||
}
|
||||
case 1: {
|
||||
if (throw_exception)
|
||||
ASSERT_ANY_THROW(actualExecNet->GetConfig(name));
|
||||
else
|
||||
ASSERT_NO_THROW(actualExecNet->GetConfig(name));
|
||||
break;
|
||||
}
|
||||
case 3: {
|
||||
config[name] = InferenceEngine::Parameter(100);
|
||||
if (throw_exception)
|
||||
ASSERT_ANY_THROW(actualExecNet->SetConfig(config));
|
||||
else
|
||||
ASSERT_NO_THROW(actualExecNet->SetConfig(config));
|
||||
break;
|
||||
}
|
||||
default:
|
||||
break;
|
||||
}
|
||||
}
|
||||
|
||||
const std::vector<ExecNetworkParams> testConfigs = {
|
||||
// Metric
|
||||
ExecNetworkParams{METRIC_KEY(OPTIMAL_NUMBER_OF_INFER_REQUESTS), 0, false},
|
||||
ExecNetworkParams{METRIC_KEY(NETWORK_NAME), 0, false},
|
||||
ExecNetworkParams{METRIC_KEY(SUPPORTED_METRICS), 0, false},
|
||||
ExecNetworkParams{METRIC_KEY(SUPPORTED_CONFIG_KEYS), 0, false},
|
||||
ExecNetworkParams{ov::execution_devices.name(), 0, false},
|
||||
// Config in autobatch
|
||||
ExecNetworkParams{CONFIG_KEY(AUTO_BATCH_DEVICE_CONFIG), 1, false},
|
||||
ExecNetworkParams{CONFIG_KEY(AUTO_BATCH_TIMEOUT), 1, false},
|
||||
ExecNetworkParams{CONFIG_KEY(CACHE_DIR), 1, false},
|
||||
// Config in dependent plugin
|
||||
ExecNetworkParams{"OPTIMAL_BATCH_SIZE", 1, false},
|
||||
// Incorrect Metric
|
||||
ExecNetworkParams{"INCORRECT_METRIC", 0, true},
|
||||
// Incorrect config
|
||||
ExecNetworkParams{"INCORRECT_CONFIG", 1, true},
|
||||
// Set Config
|
||||
ExecNetworkParams{CONFIG_KEY(AUTO_BATCH_TIMEOUT), 2, false},
|
||||
ExecNetworkParams{"INCORRECT_CONFIG", 2, true},
|
||||
};
|
||||
|
||||
INSTANTIATE_TEST_SUITE_P(smoke_AutoBatch_BehaviorTests,
|
||||
ExecNetworkTest,
|
||||
::testing::ValuesIn(testConfigs),
|
||||
ExecNetworkTest::getTestCaseName);
|
@@ -1,313 +0,0 @@
|
||||
// Copyright (C) 2018-2023 Intel Corporation
|
||||
// SPDX-License-Identifier: Apache-2.0
|
||||
//
|
||||
|
||||
#include <gmock/gmock.h>
|
||||
#include <gtest/gtest.h>
|
||||
|
||||
#include "cpp_interfaces/interface/ie_iplugin_internal.hpp"
|
||||
#include "mock_auto_batch_plugin.hpp"
|
||||
#include "ngraph_functions/subgraph_builders.hpp"
|
||||
#include "openvino/core/dimension_tracker.hpp"
|
||||
#include "unit_test_utils/mocks/cpp_interfaces/impl/mock_inference_plugin_internal.hpp"
|
||||
#include "unit_test_utils/mocks/cpp_interfaces/interface/mock_icore.hpp"
|
||||
#include "unit_test_utils/mocks/cpp_interfaces/interface/mock_iexecutable_network_internal.hpp"
|
||||
#include "unit_test_utils/mocks/cpp_interfaces/interface/mock_iinference_plugin.hpp"
|
||||
#include "unit_test_utils/mocks/cpp_interfaces/interface/mock_ivariable_state_internal.hpp"
|
||||
|
||||
using ::testing::_;
|
||||
using ::testing::AnyNumber;
|
||||
using ::testing::AtLeast;
|
||||
using ::testing::Eq;
|
||||
using ::testing::MatcherCast;
|
||||
using ::testing::Matches;
|
||||
using ::testing::NiceMock;
|
||||
using ::testing::Return;
|
||||
using ::testing::ReturnRef;
|
||||
using ::testing::StrEq;
|
||||
using ::testing::StrNe;
|
||||
using ::testing::Throw;
|
||||
using namespace ov::mock_autobatch_plugin;
|
||||
using namespace InferenceEngine;
|
||||
|
||||
using PluginLoadNetworkParams = std::tuple<std::map<std::string, std::string>, // Parameters
|
||||
std::map<std::string, std::string>, // Config
|
||||
int>; // Batch Size
|
||||
class PluginLoadNetworkTest : public ::testing::TestWithParam<PluginLoadNetworkParams> {
|
||||
public:
|
||||
std::shared_ptr<NiceMock<MockICore>> core;
|
||||
std::shared_ptr<NiceMock<MockAutoBatchInferencePlugin>> plugin;
|
||||
|
||||
// Mock CPU execNetwork
|
||||
std::shared_ptr<NiceMock<MockIExecutableNetworkInternal>> cpuMockIExecNet;
|
||||
ov::SoPtr<IExecutableNetworkInternal> cpuMockExecNetwork;
|
||||
std::shared_ptr<NiceMock<MockIInferencePlugin>> cpuMockIPlugin;
|
||||
std::shared_ptr<InferenceEngine::IInferencePlugin> cpuMockPlugin;
|
||||
|
||||
public:
|
||||
static std::string getTestCaseName(testing::TestParamInfo<PluginLoadNetworkParams> obj) {
|
||||
std::map<std::string, std::string> params;
|
||||
std::map<std::string, std::string> configs;
|
||||
int batch_size;
|
||||
std::tie(params, configs, batch_size) = obj.param;
|
||||
|
||||
std::string res;
|
||||
for (auto& c : params) {
|
||||
res += "_" + c.first + "_" + c.second;
|
||||
}
|
||||
for (auto& c : configs) {
|
||||
res += "_" + c.first + "_" + c.second;
|
||||
}
|
||||
res += "_" + std::to_string(batch_size);
|
||||
return res;
|
||||
}
|
||||
|
||||
void TearDown() override {
|
||||
core.reset();
|
||||
plugin.reset();
|
||||
cpuMockIExecNet.reset();
|
||||
cpuMockExecNetwork = {};
|
||||
cpuMockPlugin = {};
|
||||
}
|
||||
|
||||
void SetUp() override {
|
||||
cpuMockIExecNet = std::make_shared<NiceMock<MockIExecutableNetworkInternal>>();
|
||||
auto cpuMockIPluginPtr = std::make_shared<NiceMock<MockIInferencePlugin>>();
|
||||
ON_CALL(*cpuMockIPluginPtr, LoadNetwork(MatcherCast<const CNNNetwork&>(_), _))
|
||||
.WillByDefault(Return(cpuMockIExecNet));
|
||||
cpuMockPlugin = cpuMockIPluginPtr;
|
||||
EXPECT_CALL(*cpuMockIPluginPtr, LoadNetwork(MatcherCast<const CNNNetwork&>(_), _)).Times(1);
|
||||
cpuMockExecNetwork =
|
||||
ov::SoPtr<InferenceEngine::IExecutableNetworkInternal>(cpuMockPlugin->LoadNetwork(CNNNetwork{}, {}), {});
|
||||
|
||||
core = std::shared_ptr<NiceMock<MockICore>>(new NiceMock<MockICore>());
|
||||
plugin = std::shared_ptr<NiceMock<MockAutoBatchInferencePlugin>>(new NiceMock<MockAutoBatchInferencePlugin>());
|
||||
plugin->SetCore(core);
|
||||
|
||||
ON_CALL(*plugin, ParseBatchDevice).WillByDefault([this](const std::string& batchDevice) {
|
||||
return plugin->Plugin::ParseBatchDevice(batchDevice);
|
||||
});
|
||||
ON_CALL(*core, LoadNetwork(MatcherCast<const CNNNetwork&>(_), MatcherCast<const std::string&>(_), _))
|
||||
.WillByDefault(Return(cpuMockExecNetwork));
|
||||
ON_CALL(*core,
|
||||
LoadNetwork(MatcherCast<const CNNNetwork&>(_),
|
||||
MatcherCast<const std::shared_ptr<InferenceEngine::RemoteContext>&>(_),
|
||||
_))
|
||||
.WillByDefault(Return(cpuMockExecNetwork));
|
||||
}
|
||||
};
|
||||
|
||||
TEST_P(PluginLoadNetworkTest, PluginLoadNetworkTestCase) {
|
||||
std::map<std::string, std::string> params;
|
||||
std::map<std::string, std::string> configs;
|
||||
int batch_size;
|
||||
std::tie(params, configs, batch_size) = this->GetParam();
|
||||
|
||||
ON_CALL(*core, GetConfig(_, StrEq("PERFORMANCE_HINT"))).WillByDefault(Return(params["PERFORMANCE_HINT"]));
|
||||
ON_CALL(*core, GetMetric(_, StrEq("OPTIMAL_BATCH_SIZE"), _)).WillByDefault(Return(params["OPTIMAL_BATCH_SIZE"]));
|
||||
ON_CALL(*core, GetConfig(_, StrEq("PERFORMANCE_HINT_NUM_REQUESTS")))
|
||||
.WillByDefault(Return(params["PERFORMANCE_HINT_NUM_REQUESTS"]));
|
||||
|
||||
ON_CALL(*core, GetMetric(_, StrEq("GPU_MEMORY_STATISTICS"), _))
|
||||
.WillByDefault([&params](const std::string& device, const std::string& key, const ov::AnyMap& options) {
|
||||
static int flag = 0;
|
||||
ov::Any value = params[key];
|
||||
uint64_t data = flag * value.as<uint64_t>();
|
||||
std::map<std::string, uint64_t> ret = {{"xyz", data}};
|
||||
flag = flag ? 0 : 1;
|
||||
return ret;
|
||||
});
|
||||
|
||||
ON_CALL(*core, GetMetric(_, StrEq("GPU_DEVICE_TOTAL_MEM_SIZE"), _))
|
||||
.WillByDefault(Return(params["GPU_DEVICE_TOTAL_MEM_SIZE"]));
|
||||
|
||||
auto graph = ngraph::builder::subgraph::makeMultiSingleConv();
|
||||
auto net = CNNNetwork(graph);
|
||||
ASSERT_NO_THROW(plugin->LoadNetworkImpl(net, {}, configs));
|
||||
}
|
||||
|
||||
TEST_P(PluginLoadNetworkTest, PluginLoadBatchedNetworkTestCase) {
|
||||
std::map<std::string, std::string> params;
|
||||
std::map<std::string, std::string> configs;
|
||||
int batch_size;
|
||||
std::tie(params, configs, batch_size) = this->GetParam();
|
||||
|
||||
ON_CALL(*core, GetConfig(_, StrEq("PERFORMANCE_HINT"))).WillByDefault(Return(params["PERFORMANCE_HINT"]));
|
||||
ON_CALL(*core, GetMetric(_, StrEq("OPTIMAL_BATCH_SIZE"), _)).WillByDefault(Return(params["OPTIMAL_BATCH_SIZE"]));
|
||||
ON_CALL(*core, GetConfig(_, StrEq("PERFORMANCE_HINT_NUM_REQUESTS")))
|
||||
.WillByDefault(Return(params["PERFORMANCE_HINT_NUM_REQUESTS"]));
|
||||
|
||||
ON_CALL(*core, GetMetric(_, StrEq("GPU_MEMORY_STATISTICS"), _))
|
||||
.WillByDefault([&params](const std::string& device, const std::string& key, const ov::AnyMap& options) {
|
||||
static int flag = 0;
|
||||
ov::Any value = params[key];
|
||||
uint64_t data = flag * value.as<uint64_t>();
|
||||
std::map<std::string, uint64_t> ret = {{"xyz", data}};
|
||||
flag = flag ? 0 : 1;
|
||||
return ret;
|
||||
});
|
||||
|
||||
ON_CALL(*core, GetMetric(_, StrEq("GPU_DEVICE_TOTAL_MEM_SIZE"), _))
|
||||
.WillByDefault(Return(params["GPU_DEVICE_TOTAL_MEM_SIZE"]));
|
||||
|
||||
auto graph = ngraph::builder::subgraph::makeConvPoolReluNonZero({1, 1, 32, 32});
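// Mark the leading dimension with a dimension-tracker label so it can be treated as the
// batch dimension when the model is reshaped below (a sketch of the intent, as assumed here).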
|
||||
auto batch = ov::Dimension(5);
|
||||
ov::DimensionTracker::set_label(batch, 11);
|
||||
auto p_shape = ov::PartialShape{batch, 1, 32, 32};
|
||||
graph->reshape(p_shape);
|
||||
auto net = CNNNetwork(graph);
|
||||
InferenceEngine::IExecutableNetworkInternal::Ptr execNet;
|
||||
ASSERT_NO_THROW(execNet = plugin->LoadNetworkImpl(net, {}, configs));
|
||||
|
||||
ON_CALL(*cpuMockIExecNet, GetConfig(StrEq("PERFORMANCE_HINT_NUM_REQUESTS"))).WillByDefault(Return("0"));
|
||||
ON_CALL(*cpuMockIExecNet, GetMetric(StrEq("OPTIMAL_NUMBER_OF_INFER_REQUESTS"))).WillByDefault(Return("1"));
|
||||
|
||||
InferenceEngine::Parameter res;
|
||||
ASSERT_NO_THROW(res = execNet->GetMetric("OPTIMAL_NUMBER_OF_INFER_REQUESTS"));
|
||||
EXPECT_EQ(1, std::atoi(res.as<std::string>().c_str()));
|
||||
}
|
||||
|
||||
TEST_P(PluginLoadNetworkTest, PluginLoadNetworkGetMetricTestCase) {
|
||||
std::map<std::string, std::string> params;
|
||||
std::map<std::string, std::string> configs;
|
||||
int batch_size;
|
||||
std::tie(params, configs, batch_size) = this->GetParam();
|
||||
|
||||
ON_CALL(*core, GetConfig(_, StrEq("PERFORMANCE_HINT"))).WillByDefault(Return(params["PERFORMANCE_HINT"]));
|
||||
ON_CALL(*core, GetMetric(_, StrEq("OPTIMAL_BATCH_SIZE"), _)).WillByDefault(Return(params["OPTIMAL_BATCH_SIZE"]));
|
||||
ON_CALL(*core, GetConfig(_, StrEq("PERFORMANCE_HINT_NUM_REQUESTS")))
|
||||
.WillByDefault(Return(params["PERFORMANCE_HINT_NUM_REQUESTS"]));
|
||||
|
||||
ON_CALL(*core, GetMetric(_, StrEq("GPU_MEMORY_STATISTICS"), _))
|
||||
.WillByDefault([&params](const std::string& device, const std::string& key, const ov::AnyMap& options) {
|
||||
static int flag = 0;
|
||||
ov::Any value = params[key];
|
||||
uint64_t data = flag * value.as<uint64_t>();
|
||||
std::map<std::string, uint64_t> ret = {{"xyz", data}};
|
||||
flag = flag ? 0 : 1;
|
||||
return ret;
|
||||
});
|
||||
|
||||
ON_CALL(*core, GetMetric(_, StrEq("GPU_DEVICE_TOTAL_MEM_SIZE"), _))
|
||||
.WillByDefault(Return(params["GPU_DEVICE_TOTAL_MEM_SIZE"]));
|
||||
|
||||
auto graph = ngraph::builder::subgraph::makeMultiSingleConv();
|
||||
auto net = CNNNetwork(graph);
|
||||
InferenceEngine::IExecutableNetworkInternal::Ptr execNet;
|
||||
ASSERT_NO_THROW(execNet = plugin->LoadNetworkImpl(net, {}, configs));
|
||||
|
||||
std::string network_name = graph.get()->get_name();
|
||||
ON_CALL(*cpuMockIExecNet, GetConfig(StrEq("PERFORMANCE_HINT_NUM_REQUESTS"))).WillByDefault(Return("0"));
|
||||
ON_CALL(*cpuMockIExecNet, GetMetric(StrEq("OPTIMAL_NUMBER_OF_INFER_REQUESTS"))).WillByDefault(Return("1"));
|
||||
ON_CALL(*cpuMockIExecNet, GetMetric(StrEq("NETWORK_NAME"))).WillByDefault(Return(network_name.c_str()));
|
||||
ON_CALL(*cpuMockIExecNet, GetMetric(StrEq("EXECUTION_DEVICES"))).WillByDefault(Return("CPU"));
|
||||
|
||||
InferenceEngine::Parameter res;
|
||||
ASSERT_NO_THROW(res = execNet->GetMetric("OPTIMAL_NUMBER_OF_INFER_REQUESTS"));
|
||||
EXPECT_EQ(batch_size, std::atoi(res.as<std::string>().c_str()));
|
||||
|
||||
ASSERT_NO_THROW(res = execNet->GetMetric("NETWORK_NAME"));
|
||||
EXPECT_EQ(network_name, res.as<std::string>());
|
||||
|
||||
ASSERT_NO_THROW(res = execNet->GetMetric("SUPPORTED_METRICS"));
|
||||
|
||||
ASSERT_NO_THROW(res = execNet->GetMetric("EXECUTION_DEVICES"));
|
||||
EXPECT_STREQ("CPU", res.as<std::string>().c_str());
|
||||
|
||||
ASSERT_ANY_THROW(execNet->GetMetric("XYZ"));
|
||||
}
|
||||
|
||||
const std::vector<PluginLoadNetworkParams> testConfigs = {
|
||||
// Case 1: explicitly apply the batch size given in AUTO_BATCH_DEVICE_CONFIG
|
||||
PluginLoadNetworkParams{{{"PERFORMANCE_HINT", "THROUGHPUT"},
|
||||
{"OPTIMAL_BATCH_SIZE", "16"},
|
||||
{"PERFORMANCE_HINT_NUM_REQUESTS", "12"},
|
||||
{"GPU_MEMORY_STATISTICS", "1024000"},
|
||||
{"GPU_DEVICE_TOTAL_MEM_SIZE", "4096000000"}},
|
||||
{{"AUTO_BATCH_TIMEOUT", "200"}, {"AUTO_BATCH_DEVICE_CONFIG", "CPU(32)"}},
|
||||
32},
|
||||
PluginLoadNetworkParams{{{"PERFORMANCE_HINT", "THROUGHPUT"},
|
||||
{"OPTIMAL_BATCH_SIZE", "16"},
|
||||
{"PERFORMANCE_HINT_NUM_REQUESTS", "12"},
|
||||
{"GPU_MEMORY_STATISTICS", "1024000"},
|
||||
{"GPU_DEVICE_TOTAL_MEM_SIZE", "4096000000"}},
|
||||
{{"AUTO_BATCH_TIMEOUT", "200"}, {"AUTO_BATCH_DEVICE_CONFIG", "GPU(32)"}},
|
||||
32},
|
||||
// Case 2: CPU batch size is determined by the min of opt_batch_size and infReq_num.
// PERFORMANCE_HINT_NUM_REQUESTS is taken from the config if present, otherwise from core->GetConfig.
|
||||
PluginLoadNetworkParams{{{"PERFORMANCE_HINT", "THROUGHPUT"},
|
||||
{"OPTIMAL_BATCH_SIZE", "16"},
|
||||
{"PERFORMANCE_HINT_NUM_REQUESTS", "12"},
|
||||
{"GPU_MEMORY_STATISTICS", "1024000"},
|
||||
{"GPU_DEVICE_TOTAL_MEM_SIZE", "4096000000"}},
|
||||
{{"AUTO_BATCH_TIMEOUT", "200"}, {"AUTO_BATCH_DEVICE_CONFIG", "CPU"}},
|
||||
12},
|
||||
PluginLoadNetworkParams{{{"PERFORMANCE_HINT", "THROUGHPUT"},
|
||||
{"OPTIMAL_BATCH_SIZE", "8"},
|
||||
{"PERFORMANCE_HINT_NUM_REQUESTS", "16"},
|
||||
{"GPU_MEMORY_STATISTICS", "1024000"},
|
||||
{"GPU_DEVICE_TOTAL_MEM_SIZE", "4096000000"}},
|
||||
{{"AUTO_BATCH_TIMEOUT", "200"}, {"AUTO_BATCH_DEVICE_CONFIG", "CPU"}},
|
||||
8},
|
||||
PluginLoadNetworkParams{{{"PERFORMANCE_HINT", "THROUGHPUT"},
|
||||
{"OPTIMAL_BATCH_SIZE", "8"},
|
||||
{"PERFORMANCE_HINT_NUM_REQUESTS", "2"},
|
||||
{"GPU_MEMORY_STATISTICS", "1024000"},
|
||||
{"GPU_DEVICE_TOTAL_MEM_SIZE", "4096000000"}},
|
||||
{{"AUTO_BATCH_TIMEOUT", "200"}, {"AUTO_BATCH_DEVICE_CONFIG", "CPU"}},
|
||||
1},
|
||||
// PluginLoadNetworkParams{{{"PERFORMANCE_HINT", "THROUGHPUT"},
|
||||
// {"OPTIMAL_BATCH_SIZE", "32"},
|
||||
// {"PERFORMANCE_HINT_NUM_REQUESTS", "16"},
|
||||
// {"GPU_MEMORY_STATISTICS", "1024000"},
|
||||
// {"GPU_DEVICE_TOTAL_MEM_SIZE", "4096000000"}},
|
||||
// {{"AUTO_BATCH_TIMEOUT", "200"}, {"AUTO_BATCH_DEVICE_CONFIG", "CPU"},
|
||||
// {"PERFORMANCE_HINT_NUM_REQUESTS", "12"}},
|
||||
// 12},
|
||||
//
|
||||
// Case 3: the GPU batch size is determined by
// 1) the min of opt_batch_size and infReq_num, and
// 2) available_mem / one_graph_mem_footprint, rounded down to a power of 2.
// The final batch_size is the min of 1) and 2); a hedged sketch of this
// derivation follows the test suite instantiation below.
|
||||
PluginLoadNetworkParams{{{"PERFORMANCE_HINT", "THROUGHPUT"},
|
||||
{"OPTIMAL_BATCH_SIZE", "16"},
|
||||
{"PERFORMANCE_HINT_NUM_REQUESTS", "12"},
|
||||
{"GPU_MEMORY_STATISTICS", "1000"},
|
||||
{"GPU_DEVICE_TOTAL_MEM_SIZE", "5000"}},
|
||||
{{"AUTO_BATCH_TIMEOUT", "200"}, {"AUTO_BATCH_DEVICE_CONFIG", "GPU"}},
|
||||
4},
|
||||
PluginLoadNetworkParams{{{"PERFORMANCE_HINT", "THROUGHPUT"},
|
||||
{"OPTIMAL_BATCH_SIZE", "16"},
|
||||
{"PERFORMANCE_HINT_NUM_REQUESTS", "12"},
|
||||
{"GPU_MEMORY_STATISTICS", "1024000"},
|
||||
{"GPU_DEVICE_TOTAL_MEM_SIZE", "40960000"}},
|
||||
{{"AUTO_BATCH_TIMEOUT", "200"}, {"AUTO_BATCH_DEVICE_CONFIG", "GPU"}},
|
||||
12},
|
||||
PluginLoadNetworkParams{{{"PERFORMANCE_HINT", "THROUGHPUT"},
|
||||
{"OPTIMAL_BATCH_SIZE", "32"},
|
||||
{"PERFORMANCE_HINT_NUM_REQUESTS", "24"},
|
||||
{"GPU_MEMORY_STATISTICS", "1000"},
|
||||
{"GPU_DEVICE_TOTAL_MEM_SIZE", "18000"}},
|
||||
{{"AUTO_BATCH_TIMEOUT", "200"}, {"AUTO_BATCH_DEVICE_CONFIG", "GPU"}},
|
||||
16},
|
||||
PluginLoadNetworkParams{{{"PERFORMANCE_HINT", "THROUGHPUT"},
|
||||
{"OPTIMAL_BATCH_SIZE", "32"},
|
||||
{"PERFORMANCE_HINT_NUM_REQUESTS", "48"},
|
||||
{"GPU_MEMORY_STATISTICS", "1000"},
|
||||
{"GPU_DEVICE_TOTAL_MEM_SIZE", "180000"}},
|
||||
{{"AUTO_BATCH_TIMEOUT", "200"}, {"AUTO_BATCH_DEVICE_CONFIG", "GPU"}},
|
||||
32},
|
||||
// Case 4: LATENCY hint with an explicit batch size in AUTO_BATCH_DEVICE_CONFIG
|
||||
PluginLoadNetworkParams{{{"PERFORMANCE_HINT", "LATENCY"},
|
||||
{"OPTIMAL_BATCH_SIZE", "16"},
|
||||
{"PERFORMANCE_HINT_NUM_REQUESTS", "12"},
|
||||
{"GPU_MEMORY_STATISTICS", "1024000"},
|
||||
{"GPU_DEVICE_TOTAL_MEM_SIZE", "4096000000"}},
|
||||
{{"AUTO_BATCH_TIMEOUT", "200"}, {"AUTO_BATCH_DEVICE_CONFIG", "CPU(32)"}},
|
||||
32},
|
||||
};
|
||||
|
||||
INSTANTIATE_TEST_SUITE_P(smoke_AutoBatch_BehaviorTests,
|
||||
PluginLoadNetworkTest,
|
||||
::testing::ValuesIn(testConfigs),
|
||||
PluginLoadNetworkTest::getTestCaseName);
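// ---------------------------------------------------------------------------
// For reference: the expected batch sizes in testConfigs above follow from the
// device properties mocked in the fixtures. The helper below is a minimal,
// hedged sketch of that derivation for readers of the test data only; it is
// NOT the plugin's implementation, deduce_batch_size() does not exist in the
// sources, and the "too few requests" fallback is inferred from the test
// expectations rather than taken from the plugin code.
#include <algorithm>
#include <cstdint>

static uint32_t deduce_batch_size(uint32_t optimal_batch_size,  // OPTIMAL_BATCH_SIZE
                                  uint32_t hint_num_requests,   // PERFORMANCE_HINT_NUM_REQUESTS
                                  uint64_t total_mem,           // GPU_DEVICE_TOTAL_MEM_SIZE (0 for CPU)
                                  uint64_t graph_footprint,     // GPU_MEMORY_STATISTICS (0 for CPU)
                                  uint32_t explicit_batch) {    // N from "DEVICE(N)", 0 if absent
    if (explicit_batch > 0)
        return explicit_batch;  // Cases 1 and 4: AUTO_BATCH_DEVICE_CONFIG wins
    if (hint_num_requests < optimal_batch_size / 2)
        return 1;  // too few requests to make batching worthwhile
    // Case 2: min of the optimal batch size and the requested number of infer requests
    uint32_t batch = std::min(optimal_batch_size, hint_num_requests);
    // Case 3 (GPU only): additionally cap by available memory, as a power of 2
    if (total_mem != 0 && graph_footprint != 0) {
        uint32_t mem_cap = 1;
        while (mem_cap * 2 * graph_footprint <= total_mem)
            mem_cap *= 2;
        batch = std::min(batch, mem_cap);
    }
    return batch;
}
// Example: deduce_batch_size(32, 24, 18000, 1000, 0) == 16, matching Case 3 above.
// ---------------------------------------------------------------------------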
|
@@ -1,36 +0,0 @@
|
||||
// Copyright (C) 2018-2023 Intel Corporation
|
||||
// SPDX-License-Identifier: Apache-2.0
|
||||
//
|
||||
|
||||
#pragma once
|
||||
#include <gmock/gmock.h>
|
||||
|
||||
#include <iostream>
|
||||
|
||||
#include "async_infer_request.hpp"
|
||||
#include "compiled_model.hpp"
|
||||
#include "ie_icore.hpp"
|
||||
#include "plugin.hpp"
|
||||
#include "sync_infer_request.hpp"
|
||||
|
||||
using namespace ov::mock_autobatch_plugin;
|
||||
|
||||
class MockAutoBatchInferencePlugin : public Plugin {
|
||||
public:
|
||||
MOCK_METHOD((DeviceInformation),
|
||||
ParseMetaDevices,
|
||||
(const std::string&, (const std::map<std::string, std::string>&)),
|
||||
(const));
|
||||
MOCK_METHOD((DeviceInformation), ParseBatchDevice, (const std::string&), ());
|
||||
|
||||
MOCK_METHOD((InferenceEngine::Parameter),
|
||||
GetMetric,
|
||||
(const std::string&, (const std::map<std::string, InferenceEngine::Parameter>&)),
|
||||
(const, override));
|
||||
};
|
||||
|
||||
class MockAutoBatchExecutableNetwork : public CompiledModel {
|
||||
public:
|
||||
MOCK_METHOD((InferenceEngine::Parameter), GetConfig, (const std::string&), (const, override));
|
||||
MOCK_METHOD((InferenceEngine::Parameter), GetMetric, (const std::string&), (const, override));
|
||||
};
|
src/plugins/auto_batch/tests/unit/mock_common.hpp (new file, 139 lines)
@@ -0,0 +1,139 @@
|
||||
// Copyright (C) 2018-2023 Intel Corporation
|
||||
// SPDX-License-Identifier: Apache-2.0
|
||||
//
|
||||
|
||||
#pragma once
|
||||
#include <gmock/gmock.h>
|
||||
|
||||
#include <iostream>
|
||||
|
||||
#include "async_infer_request.hpp"
|
||||
#include "compiled_model.hpp"
|
||||
#include "ie_icore.hpp"
|
||||
#include "openvino/runtime/make_tensor.hpp"
|
||||
#include "plugin.hpp"
|
||||
#include "sync_infer_request.hpp"
|
||||
|
||||
using namespace ov::mock_autobatch_plugin;
|
||||
|
||||
class MockIPlugin : public ov::IPlugin {
|
||||
public:
|
||||
MockIPlugin() {
|
||||
set_device_name("HWPLUGIN");
|
||||
}
|
||||
MOCK_METHOD(std::shared_ptr<ov::ICompiledModel>,
|
||||
compile_model,
|
||||
(const std::shared_ptr<const ov::Model>&, const ov::AnyMap&),
|
||||
(const, override));
|
||||
MOCK_METHOD(std::shared_ptr<ov::ICompiledModel>,
|
||||
compile_model,
|
||||
(const std::shared_ptr<const ov::Model>&, const ov::AnyMap&, const ov::SoPtr<ov::IRemoteContext>&),
|
||||
(const, override));
|
||||
MOCK_METHOD(ov::Any, get_property, (const std::string&, const ov::AnyMap&), (const, override));
|
||||
MOCK_METHOD(void, set_property, (const ov::AnyMap&), (override));
|
||||
MOCK_METHOD(ov::SoPtr<ov::IRemoteContext>, create_context, (const ov::AnyMap&), (const, override));
|
||||
MOCK_METHOD(ov::SoPtr<ov::IRemoteContext>, get_default_context, (const ov::AnyMap&), (const, override));
|
||||
MOCK_METHOD(std::shared_ptr<ov::ICompiledModel>,
|
||||
import_model,
|
||||
(std::istream&, const ov::AnyMap&),
|
||||
(const, override));
|
||||
MOCK_METHOD(std::shared_ptr<ov::ICompiledModel>,
|
||||
import_model,
|
||||
(std::istream&, const ov::SoPtr<ov::IRemoteContext>&, const ov::AnyMap&),
|
||||
(const, override));
|
||||
MOCK_METHOD(ov::SupportedOpsMap,
|
||||
query_model,
|
||||
(const std::shared_ptr<const ov::Model>&, const ov::AnyMap&),
|
||||
(const, override));
|
||||
};
|
||||
|
||||
class MockAutoBatchInferencePlugin : public Plugin {
|
||||
public:
|
||||
MOCK_METHOD((ov::Any), get_property, (const std::string&, const ov::AnyMap&), (const, override));
|
||||
};
|
||||
|
||||
class MockICompiledModel : public ov::ICompiledModel {
|
||||
public:
|
||||
MockICompiledModel(const std::shared_ptr<const ov::Model>& model, const std::shared_ptr<const ov::IPlugin>& plugin)
|
||||
: ov::ICompiledModel(model, plugin) {}
|
||||
MOCK_METHOD(std::shared_ptr<ov::ISyncInferRequest>, create_sync_infer_request, (), (const, override));
|
||||
MOCK_METHOD(ov::Any, get_property, (const std::string&), (const, override));
|
||||
MOCK_METHOD(void, set_property, (const ov::AnyMap&), (override));
|
||||
MOCK_METHOD(void, export_model, (std::ostream&), (const, override));
|
||||
MOCK_METHOD(std::shared_ptr<const ov::Model>, get_runtime_model, (), (const, override));
|
||||
MOCK_METHOD(std::shared_ptr<ov::IAsyncInferRequest>, create_infer_request, (), (const, override));
|
||||
};
|
||||
|
||||
class MockAutoBatchCompileModel : public CompiledModel {
|
||||
public:
|
||||
MockAutoBatchCompileModel(const std::shared_ptr<ov::Model>& model,
|
||||
const std::shared_ptr<const ov::IPlugin>& plugin,
|
||||
const ov::AnyMap& config,
|
||||
const DeviceInformation& device_info,
|
||||
const std::set<std::string>& batched_inputs,
|
||||
const std::set<std::string>& batched_outputs,
|
||||
const ov::SoPtr<ov::ICompiledModel>& compiled_model_with_batch,
|
||||
const ov::SoPtr<ov::ICompiledModel>& compiled_model_without_batch,
|
||||
const ov::SoPtr<ov::IRemoteContext>& context)
|
||||
: CompiledModel(model,
|
||||
plugin,
|
||||
config,
|
||||
device_info,
|
||||
batched_inputs,
|
||||
batched_outputs,
|
||||
compiled_model_with_batch,
|
||||
compiled_model_without_batch,
|
||||
context) {}
|
||||
MOCK_METHOD(std::shared_ptr<ov::ISyncInferRequest>, create_sync_infer_request, (), (const, override));
|
||||
};
|
||||
|
||||
class MockISyncInferRequest : public ov::ISyncInferRequest {
|
||||
public:
|
||||
MockISyncInferRequest(const std::shared_ptr<const MockICompiledModel>& compiled_model)
|
||||
: ov::ISyncInferRequest(compiled_model) {
|
||||
OPENVINO_ASSERT(compiled_model);
|
||||
// Allocate input/output tensors
|
||||
for (const auto& input : get_inputs()) {
|
||||
allocate_tensor(input, [this, input](ov::SoPtr<ov::ITensor>& tensor) {
|
||||
// Can add a check to avoid double work in case of shared tensors
|
||||
allocate_tensor_impl(tensor,
|
||||
input.get_element_type(),
|
||||
input.get_partial_shape().is_dynamic() ? ov::Shape{0} : input.get_shape());
|
||||
});
|
||||
}
|
||||
for (const auto& output : get_outputs()) {
|
||||
allocate_tensor(output, [this, output](ov::SoPtr<ov::ITensor>& tensor) {
|
||||
// Can add a check to avoid double work in case of shared tensors
|
||||
allocate_tensor_impl(tensor,
|
||||
output.get_element_type(),
|
||||
output.get_partial_shape().is_dynamic() ? ov::Shape{0} : output.get_shape());
|
||||
});
|
||||
}
|
||||
}
|
||||
MOCK_METHOD(std::vector<ov::ProfilingInfo>, get_profiling_info, (), (const, override));
|
||||
MOCK_METHOD(void, infer, (), (override));
|
||||
MOCK_METHOD(std::vector<ov::SoPtr<ov::IVariableState>>, query_state, (), (const, override));
|
||||
~MockISyncInferRequest() = default;
|
||||
|
||||
private:
|
||||
void allocate_tensor_impl(ov::SoPtr<ov::ITensor>& tensor,
|
||||
const ov::element::Type& element_type,
|
||||
const ov::Shape& shape) {
|
||||
if (!tensor || tensor->get_element_type() != element_type) {
|
||||
tensor = ov::make_tensor(element_type, shape);
|
||||
} else {
|
||||
tensor->set_shape(shape);
|
||||
}
|
||||
}
|
||||
};
|
||||
|
||||
class MockIAsyncInferRequest : public ov::IAsyncInferRequest {
|
||||
public:
|
||||
MockIAsyncInferRequest(const std::shared_ptr<IInferRequest>& request,
|
||||
const std::shared_ptr<ov::threading::ITaskExecutor>& task_executor,
|
||||
const std::shared_ptr<ov::threading::ITaskExecutor>& callback_executor)
|
||||
: IAsyncInferRequest(request, task_executor, callback_executor) {
|
||||
m_pipeline = {};
|
||||
}
|
||||
MOCK_METHOD(void, start_async, (), (override));
|
||||
};
|
@@ -0,0 +1,81 @@
|
||||
// Copyright (C) 2018-2023 Intel Corporation
|
||||
// SPDX-License-Identifier: Apache-2.0
|
||||
//
|
||||
|
||||
#include <gmock/gmock.h>
|
||||
#include <gtest/gtest.h>
|
||||
|
||||
#include "mock_common.hpp"
|
||||
|
||||
using ::testing::_;
|
||||
using ::testing::AnyNumber;
|
||||
using ::testing::AtLeast;
|
||||
using ::testing::Eq;
|
||||
using ::testing::NiceMock;
|
||||
using ::testing::Return;
|
||||
using ::testing::ReturnRef;
|
||||
using ::testing::StrEq;
|
||||
using ::testing::StrNe;
|
||||
using ::testing::Throw;
|
||||
|
||||
using namespace ov::mock_autobatch_plugin;
|
||||
|
||||
using batch_device_config_params = std::tuple<std::string, // Batch devices
|
||||
std::string, // Expected device name
|
||||
int, // Expected batch size
|
||||
bool // Throw exception
|
||||
>;
|
||||
|
||||
class ParseBatchDeviceTest : public ::testing::TestWithParam<batch_device_config_params> {
|
||||
public:
|
||||
std::string m_batch_device_config;
|
||||
std::string m_device_name;
|
||||
int m_batch_size;
|
||||
bool m_throw_exception;
|
||||
std::shared_ptr<NiceMock<MockAutoBatchInferencePlugin>> m_plugin;
|
||||
|
||||
public:
|
||||
static std::string getTestCaseName(testing::TestParamInfo<batch_device_config_params> obj) {
|
||||
std::string batch_device_config;
|
||||
std::string device_name;
|
||||
int batch_size;
|
||||
bool throw_exception;
|
||||
std::tie(batch_device_config, device_name, batch_size, throw_exception) = obj.param;
|
||||
std::string res = batch_device_config;
|
||||
if (throw_exception)
|
||||
res += "_throw";
|
||||
return res;
|
||||
}
|
||||
|
||||
void TearDown() override {
|
||||
m_plugin.reset();
|
||||
}
|
||||
|
||||
void SetUp() override {
|
||||
std::tie(m_batch_device_config, m_device_name, m_batch_size, m_throw_exception) = this->GetParam();
|
||||
m_plugin =
|
||||
std::shared_ptr<NiceMock<MockAutoBatchInferencePlugin>>(new NiceMock<MockAutoBatchInferencePlugin>());
|
||||
}
|
||||
};
|
||||
|
||||
TEST_P(ParseBatchDeviceTest, ParseBatchDeviceTestCase) {
|
||||
if (m_throw_exception) {
|
||||
ASSERT_ANY_THROW(m_plugin->parse_batch_device(m_batch_device_config));
|
||||
} else {
|
||||
auto result = m_plugin->parse_batch_device(m_batch_device_config);
|
||||
EXPECT_EQ(result.device_name, m_device_name);
|
||||
EXPECT_EQ(result.device_batch_size, m_batch_size);
|
||||
}
|
||||
}
|
||||
|
||||
const std::vector<batch_device_config_params> batch_device_test_configs = {
|
||||
batch_device_config_params{"CPU(4)", "CPU", 4, false},
|
||||
batch_device_config_params{"GPU(8)", "GPU", 8, false},
|
||||
batch_device_config_params{"CPU(0)", "CPU", 0, true},
|
||||
batch_device_config_params{"GPU(-1)", "GPU", 0, true},
|
||||
};
|
||||
|
||||
INSTANTIATE_TEST_SUITE_P(smoke_AutoBatch_BehaviorTests,
|
||||
ParseBatchDeviceTest,
|
||||
::testing::ValuesIn(batch_device_test_configs),
|
||||
ParseBatchDeviceTest::getTestCaseName);
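// ---------------------------------------------------------------------------
// The cases in batch_device_test_configs pin down the expected parsing rules:
// a device name optionally followed by a positive batch size in parentheses,
// with zero or negative sizes rejected. The stand-alone sketch below mirrors
// that contract for readability; parsed_batch_device and
// parse_batch_device_sketch are hypothetical names, not the plugin's real
// declarations.
#include <stdexcept>
#include <string>

struct parsed_batch_device {
    std::string device_name;
    int device_batch_size = 0;  // 0 means "not specified"
};

static parsed_batch_device parse_batch_device_sketch(const std::string& cfg) {
    parsed_batch_device result;
    const auto open = cfg.find('(');
    if (open == std::string::npos) {
        result.device_name = cfg;  // e.g. "CPU" with no explicit batch size
        return result;
    }
    const auto close = cfg.find(')', open);
    if (close == std::string::npos)
        throw std::runtime_error("Malformed batch device config: " + cfg);
    result.device_name = cfg.substr(0, open);  // "CPU(4)" -> "CPU"
    result.device_batch_size = std::stoi(cfg.substr(open + 1, close - open - 1));
    if (result.device_batch_size <= 0)  // "CPU(0)" and "GPU(-1)" must throw
        throw std::runtime_error("Batch size must be positive: " + cfg);
    return result;
}
// ---------------------------------------------------------------------------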
|
src/plugins/auto_batch/tests/unit/parse_meta_device_test.cpp (new file, 143 lines)
@@ -0,0 +1,143 @@
|
||||
// Copyright (C) 2018-2023 Intel Corporation
|
||||
// SPDX-License-Identifier: Apache-2.0
|
||||
//
|
||||
|
||||
#include <gmock/gmock.h>
|
||||
#include <gtest/gtest.h>
|
||||
|
||||
#include "mock_common.hpp"
|
||||
#include "unit_test_utils/mocks/cpp_interfaces/interface/mock_icore.hpp"
|
||||
|
||||
using ::testing::_;
|
||||
using ::testing::AnyNumber;
|
||||
using ::testing::AtLeast;
|
||||
using ::testing::Eq;
|
||||
using ::testing::NiceMock;
|
||||
using ::testing::Return;
|
||||
using ::testing::ReturnRef;
|
||||
using ::testing::StrEq;
|
||||
using ::testing::StrNe;
|
||||
using ::testing::Throw;
|
||||
|
||||
using namespace ov::mock_autobatch_plugin;
|
||||
|
||||
using meta_device_params = std::tuple<std::string, // Device batch cfg
|
||||
ov::AnyMap, // property map
|
||||
DeviceInformation, // Expected result
|
||||
bool>; // Throw exception
|
||||
|
||||
const std::vector<std::string> cpu_supported_properties = {
|
||||
"CACHE_DIR",
|
||||
};
|
||||
|
||||
const std::vector<std::string> gpu_supported_properties = {
|
||||
"CACHE_DIR",
|
||||
"OPTIMAL_BATCH_SIZE",
|
||||
};
|
||||
|
||||
class ParseMetaDeviceTest : public ::testing::TestWithParam<meta_device_params> {
|
||||
public:
|
||||
std::shared_ptr<NiceMock<MockICore>> m_core;
|
||||
std::shared_ptr<NiceMock<MockAutoBatchInferencePlugin>> m_plugin;
|
||||
|
||||
std::string m_batch_cfg;
|
||||
ov::AnyMap m_config;
|
||||
DeviceInformation m_expected_device_info;
|
||||
bool m_throw_exception;
|
||||
|
||||
public:
|
||||
static std::string getTestCaseName(testing::TestParamInfo<meta_device_params> obj) {
|
||||
std::string batch_cfg;
|
||||
ov::AnyMap config;
|
||||
DeviceInformation info;
|
||||
bool throw_exception;
|
||||
|
||||
std::tie(batch_cfg, config, info, throw_exception) = obj.param;
|
||||
std::string res = batch_cfg;
|
||||
for (auto& c : config) {
|
||||
res += "_" + c.first + "_" + c.second.as<std::string>();
|
||||
}
|
||||
if (throw_exception)
|
||||
res += "_throw";
|
||||
return res;
|
||||
}
|
||||
|
||||
void TearDown() override {
|
||||
m_core.reset();
|
||||
m_plugin.reset();
|
||||
}
|
||||
|
||||
void SetUp() override {
|
||||
m_core = std::shared_ptr<NiceMock<MockICore>>(new NiceMock<MockICore>());
|
||||
m_plugin =
|
||||
std::shared_ptr<NiceMock<MockAutoBatchInferencePlugin>>(new NiceMock<MockAutoBatchInferencePlugin>());
|
||||
m_plugin->set_core(m_core);
|
||||
|
||||
std::tie(m_batch_cfg, m_config, m_expected_device_info, m_throw_exception) = this->GetParam();
|
||||
|
||||
ON_CALL(*m_core, get_supported_property)
|
||||
.WillByDefault([](const std::string& device, const ov::AnyMap& configs) {
|
||||
ov::AnyMap res_config;
|
||||
if (device == "CPU") {
|
||||
for (auto& c : configs) {
|
||||
if (std::find(begin(cpu_supported_properties), end(cpu_supported_properties), c.first) !=
|
||||
cpu_supported_properties.end())
|
||||
res_config[c.first] = c.second;
|
||||
}
|
||||
} else if (device == "GPU") {
|
||||
for (auto& c : configs) {
|
||||
if (std::find(begin(gpu_supported_properties), end(gpu_supported_properties), c.first) !=
|
||||
gpu_supported_properties.end())
|
||||
res_config[c.first] = c.second;
|
||||
}
|
||||
}
|
||||
return res_config;
|
||||
});
|
||||
}
|
||||
|
||||
bool compare(ov::AnyMap a, ov::AnyMap b) {
|
||||
if (a.size() != b.size())
|
||||
return false;
|
||||
|
||||
for (auto& it : a) {
|
||||
auto item = b.find(it.first);
|
||||
if (item == b.end())
|
||||
return false;
|
||||
if (it.second != item->second)
|
||||
return false;
|
||||
}
|
||||
return true;
|
||||
}
|
||||
};
|
||||
|
||||
TEST_P(ParseMetaDeviceTest, ParseMetaDeviceTestCase) {
|
||||
if (m_throw_exception) {
|
||||
ASSERT_ANY_THROW(m_plugin->parse_meta_device(m_batch_cfg, m_config));
|
||||
} else {
|
||||
auto result = m_plugin->parse_meta_device(m_batch_cfg, m_config);
|
||||
EXPECT_EQ(result.device_name, m_expected_device_info.device_name);
|
||||
EXPECT_EQ(result.device_batch_size, m_expected_device_info.device_batch_size);
|
||||
EXPECT_TRUE(compare(result.device_config, m_expected_device_info.device_config));
|
||||
}
|
||||
}
|
||||
|
||||
const std::vector<meta_device_params> meta_device_test_configs = {
|
||||
meta_device_params{"CPU(4)", {}, DeviceInformation{"CPU", {}, 4}, false},
|
||||
meta_device_params{"CPU(4)", {{}}, DeviceInformation{"CPU", {{}}, 4}, true},
|
||||
meta_device_params{"CPU(4)", {{"CACHE_DIR", "./"}}, DeviceInformation{"CPU", {{"CACHE_DIR", "./"}}, 4}, false},
|
||||
meta_device_params{"GPU(4)", {{"CACHE_DIR", "./"}}, DeviceInformation{"GPU", {{"CACHE_DIR", "./"}}, 4}, false},
|
||||
meta_device_params{"GPU(8)",
|
||||
{{"CACHE_DIR", "./"}, {"OPTIMAL_BATCH_SIZE", "16"}},
|
||||
DeviceInformation{"GPU", {{"CACHE_DIR", "./"}, {"OPTIMAL_BATCH_SIZE", "16"}}, 8},
|
||||
false},
|
||||
meta_device_params{"CPU(4)", {{"OPTIMAL_BATCH_SIZE", "16"}}, DeviceInformation{"CPU", {{}}, 4}, true},
|
||||
meta_device_params{"CPU(4)",
|
||||
{{"CACHE_DIR", "./"}, {"OPTIMAL_BATCH_SIZE", "16"}},
|
||||
DeviceInformation{"CPU", {{"CACHE_DIR", "./"}}, 4},
|
||||
true},
|
||||
};
|
||||
|
||||
INSTANTIATE_TEST_SUITE_P(smoke_AutoBatch_BehaviorTests,
|
||||
ParseMetaDeviceTest,
|
||||
::testing::ValuesIn(meta_device_test_configs),
|
||||
ParseMetaDeviceTest::getTestCaseName);
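// ---------------------------------------------------------------------------
// The throwing cases in meta_device_test_configs all carry a property the
// target device does not support: the mocked get_supported_property() keeps
// only CACHE_DIR for CPU (plus OPTIMAL_BATCH_SIZE for GPU), so a CPU config
// with OPTIMAL_BATCH_SIZE cannot be forwarded. A hedged sketch of that check,
// using a hypothetical helper name, is shown below.
#include <algorithm>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

static void check_forwardable_properties(const std::vector<std::string>& supported,
                                         const std::map<std::string, std::string>& requested) {
    for (const auto& item : requested) {
        if (std::find(supported.begin(), supported.end(), item.first) == supported.end())
            throw std::runtime_error("Property not supported by the target device: " + item.first);
    }
}
// check_forwardable_properties({"CACHE_DIR"}, {{"CACHE_DIR", "./"}});          // OK
// check_forwardable_properties({"CACHE_DIR"}, {{"OPTIMAL_BATCH_SIZE", "16"}}); // throws
// ---------------------------------------------------------------------------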
|
src/plugins/auto_batch/tests/unit/plugin_compile_model_test.cpp (new file, 232 lines)
@@ -0,0 +1,232 @@
|
||||
// Copyright (C) 2018-2023 Intel Corporation
|
||||
// SPDX-License-Identifier: Apache-2.0
|
||||
//
|
||||
|
||||
#include <gmock/gmock.h>
|
||||
#include <gtest/gtest.h>
|
||||
|
||||
#include "mock_common.hpp"
|
||||
#include "ngraph_functions/subgraph_builders.hpp"
|
||||
#include "openvino/core/dimension_tracker.hpp"
|
||||
#include "unit_test_utils/mocks/cpp_interfaces/interface/mock_icore.hpp"
|
||||
|
||||
using ::testing::_;
|
||||
using ::testing::AnyNumber;
|
||||
using ::testing::AtLeast;
|
||||
using ::testing::Eq;
|
||||
using ::testing::MatcherCast;
|
||||
using ::testing::Matches;
|
||||
using ::testing::NiceMock;
|
||||
using ::testing::Return;
|
||||
using ::testing::ReturnRef;
|
||||
using ::testing::StrEq;
|
||||
using ::testing::StrNe;
|
||||
using ::testing::Throw;
|
||||
|
||||
using namespace ov::mock_autobatch_plugin;
|
||||
|
||||
using plugin_compile_model_param = std::tuple<ov::AnyMap, // Core Properties
|
||||
ov::AnyMap, // Plugin Properties
|
||||
uint32_t>; // batch size
|
||||
|
||||
class PluginCompileModelTest : public ::testing::TestWithParam<plugin_compile_model_param> {
|
||||
public:
|
||||
ov::AnyMap m_core_properities;
|
||||
ov::AnyMap m_plugin_properities;
|
||||
int m_batch_size;
|
||||
|
||||
std::shared_ptr<NiceMock<MockICore>> m_core;
|
||||
std::shared_ptr<NiceMock<MockAutoBatchInferencePlugin>> m_plugin;
|
||||
std::shared_ptr<ov::Model> m_model;
|
||||
ov::SoPtr<ov::IRemoteContext> m_remote_context;
|
||||
|
||||
ov::SoPtr<MockICompiledModel> m_mock_compile_model;
|
||||
std::shared_ptr<MockICompiledModel> m_mock_i_compile_model;
|
||||
std::shared_ptr<NiceMock<MockIPlugin>> m_hardware_plugin;
|
||||
|
||||
public:
|
||||
static std::string getTestCaseName(testing::TestParamInfo<plugin_compile_model_param> obj) {
|
||||
ov::AnyMap core_properities;
|
||||
ov::AnyMap plugin_properities;
|
||||
uint32_t expect_batch_size;
|
||||
std::tie(core_properities, plugin_properities, expect_batch_size) = obj.param;
|
||||
|
||||
std::string res;
|
||||
for (auto& c : core_properities) {
|
||||
res += "_" + c.first + "_" + c.second.as<std::string>();
|
||||
}
|
||||
for (auto& c : plugin_properities) {
|
||||
res += "_" + c.first + "_" + c.second.as<std::string>();
|
||||
}
|
||||
res += "_" + std::to_string(expect_batch_size);
|
||||
return res;
|
||||
}
|
||||
|
||||
void TearDown() override {
|
||||
m_core.reset();
|
||||
m_plugin.reset();
|
||||
m_model.reset();
|
||||
m_remote_context = {};
|
||||
m_mock_i_compile_model.reset();
|
||||
m_mock_compile_model = {};
|
||||
}
|
||||
|
||||
void SetUp() override {
|
||||
std::tie(m_core_properities, m_plugin_properities, m_batch_size) = this->GetParam();
|
||||
m_core = std::shared_ptr<NiceMock<MockICore>>(new NiceMock<MockICore>());
|
||||
m_plugin =
|
||||
std::shared_ptr<NiceMock<MockAutoBatchInferencePlugin>>(new NiceMock<MockAutoBatchInferencePlugin>());
|
||||
m_plugin->set_core(m_core);
|
||||
m_hardware_plugin = std::shared_ptr<NiceMock<MockIPlugin>>(new NiceMock<MockIPlugin>());
|
||||
m_mock_i_compile_model = std::make_shared<NiceMock<MockICompiledModel>>(m_model, m_hardware_plugin);
|
||||
m_mock_compile_model = {m_mock_i_compile_model, {}};
|
||||
|
||||
ON_CALL(*m_core, get_property(_, StrEq("PERFORMANCE_HINT")))
|
||||
.WillByDefault(Return(m_core_properities["PERFORMANCE_HINT"]));
|
||||
|
||||
ON_CALL(*m_core, get_property(_, StrEq("OPTIMAL_BATCH_SIZE"), _))
|
||||
.WillByDefault(Return(m_core_properities["OPTIMAL_BATCH_SIZE"]));
|
||||
|
||||
ON_CALL(*m_core, get_property(_, StrEq("PERFORMANCE_HINT_NUM_REQUESTS")))
|
||||
.WillByDefault(Return(m_core_properities["PERFORMANCE_HINT_NUM_REQUESTS"]));
|
||||
|
||||
ON_CALL(*m_core, get_property(_, StrEq("GPU_MEMORY_STATISTICS"), _))
|
||||
.WillByDefault([&](const std::string& device, const std::string& key, const ov::AnyMap& options) {
|
||||
static int flag = 0;
|
||||
ov::Any value = m_core_properities[key];
|
||||
uint64_t data = flag * value.as<uint64_t>();
|
||||
std::map<std::string, uint64_t> ret = {{"xyz", data}};
|
||||
flag = flag ? 0 : 1;
|
||||
return ret;
|
||||
});
|
||||
|
||||
ON_CALL(*m_core, get_property(_, StrEq("GPU_DEVICE_TOTAL_MEM_SIZE"), _))
|
||||
.WillByDefault(Return(m_core_properities["GPU_DEVICE_TOTAL_MEM_SIZE"]));
|
||||
|
||||
ON_CALL(*m_core,
|
||||
compile_model(MatcherCast<const std::shared_ptr<const ov::Model>&>(_),
|
||||
MatcherCast<const std::string&>(_),
|
||||
_))
|
||||
.WillByDefault(Return(m_mock_compile_model));
|
||||
|
||||
ON_CALL(*m_core,
|
||||
compile_model(MatcherCast<const std::shared_ptr<const ov::Model>&>(_),
|
||||
MatcherCast<const ov::SoPtr<ov::IRemoteContext>&>(_),
|
||||
_))
|
||||
.WillByDefault(Return(m_mock_compile_model));
|
||||
}
|
||||
};
|
||||
|
||||
TEST_P(PluginCompileModelTest, PluginCompileModelTestCase) {
|
||||
m_model = ngraph::builder::subgraph::makeMultiSingleConv();
|
||||
ASSERT_NO_THROW(m_plugin->compile_model(m_model, m_plugin_properities));
|
||||
}
|
||||
|
||||
TEST_P(PluginCompileModelTest, PluginCompileModelWithRemoteContextTestCase) {
|
||||
m_model = ngraph::builder::subgraph::makeMultiSingleConv();
|
||||
ASSERT_NO_THROW(m_plugin->compile_model(m_model, m_plugin_properities, m_remote_context));
|
||||
}
|
||||
|
||||
TEST_P(PluginCompileModelTest, PluginCompileModelBatchedModelTestCase) {
|
||||
m_model = ngraph::builder::subgraph::makeConvPoolReluNonZero({1, 1, 32, 32});
|
||||
auto batch = ov::Dimension(5);
|
||||
ov::DimensionTracker::set_label(batch, 11);
|
||||
auto p_shape = ov::PartialShape{batch, 1, 32, 32};
|
||||
m_model->reshape(p_shape);
|
||||
ASSERT_NO_THROW(m_plugin->compile_model(m_model, m_plugin_properities));
|
||||
}
|
||||
|
||||
TEST_P(PluginCompileModelTest, PluginCompileModelBatchedModelWithRemoteContextTestCase) {
|
||||
m_model = ngraph::builder::subgraph::makeConvPoolReluNonZero({1, 1, 32, 32});
|
||||
auto batch = ov::Dimension(5);
|
||||
ov::DimensionTracker::set_label(batch, 11);
|
||||
auto p_shape = ov::PartialShape{batch, 1, 32, 32};
|
||||
m_model->reshape(p_shape);
|
||||
ASSERT_NO_THROW(m_plugin->compile_model(m_model, m_plugin_properities, m_remote_context));
|
||||
}
|
||||
|
||||
const std::vector<plugin_compile_model_param> plugin_compile_model_param_test = {
|
||||
// Case 1: explicitly apply the batch size given in AUTO_BATCH_DEVICE_CONFIG
|
||||
plugin_compile_model_param{{{"PERFORMANCE_HINT", ov::hint::PerformanceMode::THROUGHPUT},
|
||||
{"OPTIMAL_BATCH_SIZE", static_cast<unsigned int>(16)},
|
||||
{"PERFORMANCE_HINT_NUM_REQUESTS", static_cast<uint32_t>(12)},
|
||||
{"GPU_MEMORY_STATISTICS", "1024000"},
|
||||
{"GPU_DEVICE_TOTAL_MEM_SIZE", "4096000000"}},
|
||||
{{"AUTO_BATCH_TIMEOUT", "200"}, {"AUTO_BATCH_DEVICE_CONFIG", "CPU(32)"}},
|
||||
32},
|
||||
plugin_compile_model_param{{{"PERFORMANCE_HINT", ov::hint::PerformanceMode::THROUGHPUT},
|
||||
{"OPTIMAL_BATCH_SIZE", static_cast<unsigned int>(16)},
|
||||
{"PERFORMANCE_HINT_NUM_REQUESTS", static_cast<uint32_t>(12)},
|
||||
{"GPU_MEMORY_STATISTICS", "1024000"},
|
||||
{"GPU_DEVICE_TOTAL_MEM_SIZE", "4096000000"}},
|
||||
{{"AUTO_BATCH_TIMEOUT", "200"}, {"AUTO_BATCH_DEVICE_CONFIG", "GPU(32)"}},
|
||||
32},
|
||||
// Case 2: CPU batch size is determined by the min of opt_batch_size and infReq_num.
// PERFORMANCE_HINT_NUM_REQUESTS is taken from the config if present, otherwise via core->get_property.
|
||||
plugin_compile_model_param{{{"PERFORMANCE_HINT", ov::hint::PerformanceMode::THROUGHPUT},
|
||||
{"OPTIMAL_BATCH_SIZE", static_cast<unsigned int>(16)},
|
||||
{"PERFORMANCE_HINT_NUM_REQUESTS", static_cast<uint32_t>(12)},
|
||||
{"GPU_MEMORY_STATISTICS", "1024000"},
|
||||
{"GPU_DEVICE_TOTAL_MEM_SIZE", "4096000000"}},
|
||||
{{"AUTO_BATCH_TIMEOUT", "200"}, {"AUTO_BATCH_DEVICE_CONFIG", "CPU"}},
|
||||
12},
|
||||
plugin_compile_model_param{{{"PERFORMANCE_HINT", ov::hint::PerformanceMode::THROUGHPUT},
|
||||
{"OPTIMAL_BATCH_SIZE", static_cast<unsigned int>(8)},
|
||||
{"PERFORMANCE_HINT_NUM_REQUESTS", static_cast<uint32_t>(16)},
|
||||
{"GPU_MEMORY_STATISTICS", "1024000"},
|
||||
{"GPU_DEVICE_TOTAL_MEM_SIZE", "4096000000"}},
|
||||
{{"AUTO_BATCH_TIMEOUT", "200"}, {"AUTO_BATCH_DEVICE_CONFIG", "CPU"}},
|
||||
8},
|
||||
plugin_compile_model_param{{{"PERFORMANCE_HINT", ov::hint::PerformanceMode::THROUGHPUT},
|
||||
{"OPTIMAL_BATCH_SIZE", static_cast<unsigned int>(8)},
|
||||
{"PERFORMANCE_HINT_NUM_REQUESTS", static_cast<uint32_t>(2)},
|
||||
{"GPU_MEMORY_STATISTICS", "1024000"},
|
||||
{"GPU_DEVICE_TOTAL_MEM_SIZE", "4096000000"}},
|
||||
{{"AUTO_BATCH_TIMEOUT", "200"}, {"AUTO_BATCH_DEVICE_CONFIG", "CPU"}},
|
||||
1},
|
||||
// Case 3: the GPU batch size is determined by
// 1) the min of opt_batch_size and infReq_num, and
// 2) available_mem / one_graph_mem_footprint, rounded down to a power of 2.
// The final m_batch_size is the min of 1) and 2).
|
||||
plugin_compile_model_param{{{"PERFORMANCE_HINT", ov::hint::PerformanceMode::THROUGHPUT},
|
||||
{"OPTIMAL_BATCH_SIZE", static_cast<unsigned int>(16)},
|
||||
{"PERFORMANCE_HINT_NUM_REQUESTS", static_cast<uint32_t>(12)},
|
||||
{"GPU_MEMORY_STATISTICS", "1000"},
|
||||
{"GPU_DEVICE_TOTAL_MEM_SIZE", "5000"}},
|
||||
{{"AUTO_BATCH_TIMEOUT", "200"}, {"AUTO_BATCH_DEVICE_CONFIG", "GPU"}},
|
||||
4},
|
||||
plugin_compile_model_param{{{"PERFORMANCE_HINT", ov::hint::PerformanceMode::THROUGHPUT},
|
||||
{"OPTIMAL_BATCH_SIZE", static_cast<unsigned int>(16)},
|
||||
{"PERFORMANCE_HINT_NUM_REQUESTS", static_cast<uint32_t>(12)},
|
||||
{"GPU_MEMORY_STATISTICS", "1024000"},
|
||||
{"GPU_DEVICE_TOTAL_MEM_SIZE", "40960000"}},
|
||||
{{"AUTO_BATCH_TIMEOUT", "200"}, {"AUTO_BATCH_DEVICE_CONFIG", "GPU"}},
|
||||
12},
|
||||
plugin_compile_model_param{{{"PERFORMANCE_HINT", ov::hint::PerformanceMode::THROUGHPUT},
|
||||
{"OPTIMAL_BATCH_SIZE", static_cast<unsigned int>(32)},
|
||||
{"PERFORMANCE_HINT_NUM_REQUESTS", static_cast<uint32_t>(24)},
|
||||
{"GPU_MEMORY_STATISTICS", "1000"},
|
||||
{"GPU_DEVICE_TOTAL_MEM_SIZE", "18000"}},
|
||||
{{"AUTO_BATCH_TIMEOUT", "200"}, {"AUTO_BATCH_DEVICE_CONFIG", "GPU"}},
|
||||
16},
|
||||
plugin_compile_model_param{{{"PERFORMANCE_HINT", ov::hint::PerformanceMode::THROUGHPUT},
|
||||
{"OPTIMAL_BATCH_SIZE", static_cast<unsigned int>(32)},
|
||||
{"PERFORMANCE_HINT_NUM_REQUESTS", static_cast<uint32_t>(48)},
|
||||
{"GPU_MEMORY_STATISTICS", "1000"},
|
||||
{"GPU_DEVICE_TOTAL_MEM_SIZE", "180000"}},
|
||||
{{"AUTO_BATCH_TIMEOUT", "200"}, {"AUTO_BATCH_DEVICE_CONFIG", "GPU"}},
|
||||
32},
|
||||
// Case 4: LATENCY hint with an explicit batch size in AUTO_BATCH_DEVICE_CONFIG
|
||||
plugin_compile_model_param{{{"PERFORMANCE_HINT", ov::hint::PerformanceMode::LATENCY},
|
||||
{"OPTIMAL_BATCH_SIZE", static_cast<unsigned int>(16)},
|
||||
{"PERFORMANCE_HINT_NUM_REQUESTS", static_cast<uint32_t>(12)},
|
||||
{"GPU_MEMORY_STATISTICS", "1024000"},
|
||||
{"GPU_DEVICE_TOTAL_MEM_SIZE", "4096000000"}},
|
||||
{{"AUTO_BATCH_TIMEOUT", "200"}, {"AUTO_BATCH_DEVICE_CONFIG", "CPU(32)"}},
|
||||
32},
|
||||
};
|
||||
|
||||
INSTANTIATE_TEST_SUITE_P(smoke_AutoBatch_BehaviorTests,
|
||||
PluginCompileModelTest,
|
||||
::testing::ValuesIn(plugin_compile_model_param_test),
|
||||
PluginCompileModelTest::getTestCaseName);
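// ---------------------------------------------------------------------------
// Note for readers comparing this with the legacy-API test above: with API 2.0
// the mocked core properties are typed ov::Any values (for example unsigned
// int for OPTIMAL_BATCH_SIZE) instead of plain strings, and numeric strings
// still convert on demand. A minimal illustration, not taken from the test
// code itself:
//
//     ov::AnyMap props{{"OPTIMAL_BATCH_SIZE", static_cast<unsigned int>(16)},
//                      {"GPU_DEVICE_TOTAL_MEM_SIZE", "4096000000"}};
//     auto optimal   = props.at("OPTIMAL_BATCH_SIZE").as<unsigned int>();     // 16
//     auto total_mem = props.at("GPU_DEVICE_TOTAL_MEM_SIZE").as<uint64_t>();  // 4096000000
// ---------------------------------------------------------------------------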
|
src/plugins/auto_batch/tests/unit/plugin_get_property_test.cpp (new file, 102 lines)
@@ -0,0 +1,102 @@
|
||||
// Copyright (C) 2018-2023 Intel Corporation
|
||||
// SPDX-License-Identifier: Apache-2.0
|
||||
//
|
||||
|
||||
#include <gmock/gmock.h>
|
||||
#include <gtest/gtest.h>
|
||||
|
||||
#include "mock_common.hpp"
|
||||
#include "unit_test_utils/mocks/cpp_interfaces/interface/mock_icore.hpp"
|
||||
|
||||
using ::testing::_;
|
||||
using ::testing::AnyNumber;
|
||||
using ::testing::AtLeast;
|
||||
using ::testing::Eq;
|
||||
using ::testing::NiceMock;
|
||||
using ::testing::Return;
|
||||
using ::testing::ReturnRef;
|
||||
using ::testing::StrEq;
|
||||
using ::testing::StrNe;
|
||||
using ::testing::Throw;
|
||||
|
||||
using namespace ov::mock_autobatch_plugin;
|
||||
|
||||
using get_property_params = std::tuple<std::string, // Get Property Name
|
||||
bool>; // Throw exception
|
||||
|
||||
const char supported_metric[] = "SUPPORTED_METRICS FULL_DEVICE_NAME SUPPORTED_CONFIG_KEYS";
|
||||
const char supported_config_keys[] = "AUTO_BATCH_DEVICE_CONFIG MULTI_DEVICE_PRIORITIES AUTO_BATCH_TIMEOUT CACHE_DIR";
|
||||
|
||||
class GetPropertyTest : public ::testing::TestWithParam<get_property_params> {
|
||||
public:
|
||||
std::string m_property_name;
|
||||
bool m_throw_exception;
|
||||
std::shared_ptr<NiceMock<MockAutoBatchInferencePlugin>> m_plugin;
|
||||
|
||||
public:
|
||||
static std::string getTestCaseName(testing::TestParamInfo<get_property_params> obj) {
|
||||
std::string property_name;
|
||||
bool throw_exception;
|
||||
|
||||
std::tie(property_name, throw_exception) = obj.param;
|
||||
std::string res = "";
|
||||
|
||||
if (!property_name.empty()) {
|
||||
res += "GetProperty_" + property_name;
|
||||
}
|
||||
if (throw_exception)
|
||||
res += "_throw";
|
||||
return res;
|
||||
}
|
||||
|
||||
void TearDown() override {
|
||||
m_plugin.reset();
|
||||
}
|
||||
|
||||
void SetUp() override {
|
||||
std::tie(m_property_name, m_throw_exception) = this->GetParam();
|
||||
m_plugin =
|
||||
std::shared_ptr<NiceMock<MockAutoBatchInferencePlugin>>(new NiceMock<MockAutoBatchInferencePlugin>());
|
||||
|
||||
ON_CALL(*m_plugin, get_property).WillByDefault([this](const std::string& name, const ov::AnyMap& arguments) {
|
||||
return m_plugin->Plugin::get_property(name, arguments);
|
||||
});
|
||||
}
|
||||
};
|
||||
|
||||
TEST_P(GetPropertyTest, GetPropertyTestCase) {
|
||||
ov::AnyMap options = {};
|
||||
if (m_throw_exception) {
|
||||
ASSERT_ANY_THROW(m_plugin->get_property(m_property_name, options));
|
||||
} else {
|
||||
ov::Any value;
|
||||
ASSERT_NO_THROW(value = m_plugin->get_property(m_property_name, options));
|
||||
if (m_property_name == METRIC_KEY(SUPPORTED_METRICS)) {
|
||||
EXPECT_EQ(value.as<std::string>(), supported_metric);
|
||||
return;
|
||||
}
|
||||
if (m_property_name == ov::device::full_name.name()) {
|
||||
EXPECT_EQ(value.as<std::string>(), "BATCH");
|
||||
return;
|
||||
}
|
||||
if (m_property_name == METRIC_KEY(SUPPORTED_CONFIG_KEYS)) {
|
||||
EXPECT_EQ(value.as<std::string>(), supported_config_keys);
|
||||
return;
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
const std::vector<get_property_params> get_property_params_test = {
|
||||
get_property_params{"AUTO_BATCH_TIMEOUT", false},
|
||||
get_property_params{"AUTO_BATCH_DEVICE_CONFIG", true},
|
||||
get_property_params{"CACHE_DIR", true},
|
||||
get_property_params{METRIC_KEY(SUPPORTED_METRICS), false},
|
||||
get_property_params{METRIC_KEY(SUPPORTED_CONFIG_KEYS), false},
|
||||
get_property_params{"CPU_THREADS_NUM", true},
|
||||
get_property_params{"PERFORMANCE_HINT", true},
|
||||
};
|
||||
|
||||
INSTANTIATE_TEST_SUITE_P(smoke_AutoBatch_BehaviorTests,
|
||||
GetPropertyTest,
|
||||
::testing::ValuesIn(get_property_params_test),
|
||||
GetPropertyTest::getTestCaseName);
|
@@ -0,0 +1,91 @@
|
||||
// Copyright (C) 2018-2023 Intel Corporation
|
||||
// SPDX-License-Identifier: Apache-2.0
|
||||
//
|
||||
|
||||
#include <gmock/gmock.h>
|
||||
#include <gtest/gtest.h>
|
||||
|
||||
#include "mock_common.hpp"
|
||||
#include "ngraph_functions/subgraph_builders.hpp"
|
||||
#include "unit_test_utils/mocks/cpp_interfaces/interface/mock_icore.hpp"
|
||||
|
||||
using ::testing::_;
|
||||
using ::testing::AnyNumber;
|
||||
using ::testing::AtLeast;
|
||||
using ::testing::Eq;
|
||||
using ::testing::NiceMock;
|
||||
using ::testing::Return;
|
||||
using ::testing::ReturnRef;
|
||||
using ::testing::StrEq;
|
||||
using ::testing::StrNe;
|
||||
using ::testing::Throw;
|
||||
|
||||
using namespace ov::mock_autobatch_plugin;
|
||||
|
||||
using query_model_params = std::tuple<ov::AnyMap, // Set Property
|
||||
bool>;
|
||||
|
||||
class QueryModelTest : public ::testing::TestWithParam<query_model_params> {
|
||||
public:
|
||||
ov::AnyMap m_properties;
|
||||
bool m_throw_exception;
|
||||
std::shared_ptr<NiceMock<MockICore>> m_core;
|
||||
std::shared_ptr<NiceMock<MockAutoBatchInferencePlugin>> m_plugin;
|
||||
std::shared_ptr<ov::Model> m_model;
|
||||
ov::SupportedOpsMap m_supported_ops_map;
|
||||
|
||||
public:
|
||||
static std::string getTestCaseName(testing::TestParamInfo<query_model_params> obj) {
|
||||
ov::AnyMap properties;
|
||||
bool throw_exception;
|
||||
|
||||
std::tie(properties, throw_exception) = obj.param;
|
||||
std::string res = "";
|
||||
if (properties.size() > 0) {
|
||||
res += "QueryModel_";
|
||||
for (auto& it : properties) {
|
||||
res += it.first + "_" + it.second.as<std::string>() + "_";
|
||||
}
|
||||
}
|
||||
if (throw_exception)
|
||||
res += "_throw";
|
||||
return res;
|
||||
}
|
||||
|
||||
void TearDown() override {
|
||||
m_core.reset();
|
||||
m_plugin.reset();
|
||||
m_model.reset();
|
||||
}
|
||||
|
||||
void SetUp() override {
|
||||
std::tie(m_properties, m_throw_exception) = this->GetParam();
|
||||
m_model = ngraph::builder::subgraph::makeMultiSingleConv();
|
||||
m_core = std::shared_ptr<NiceMock<MockICore>>(new NiceMock<MockICore>());
|
||||
m_plugin =
|
||||
std::shared_ptr<NiceMock<MockAutoBatchInferencePlugin>>(new NiceMock<MockAutoBatchInferencePlugin>());
|
||||
m_plugin->set_core(m_core);
|
||||
|
||||
ON_CALL(*m_core, query_model).WillByDefault(Return(m_supported_ops_map));
|
||||
}
|
||||
};
|
||||
|
||||
TEST_P(QueryModelTest, QueryModelTestCase) {
|
||||
if (m_throw_exception) {
|
||||
ASSERT_ANY_THROW(m_plugin->query_model(m_model, m_properties));
|
||||
} else {
|
||||
ASSERT_NO_THROW(m_plugin->query_model(m_model, m_properties));
|
||||
}
|
||||
}
|
||||
|
||||
const std::vector<query_model_params> query_model_params_test = {
|
||||
query_model_params{{{}}, true},
|
||||
query_model_params{{{"AUTO_BATCH_TIMEOUT", "200"}}, true},
|
||||
query_model_params{{{"AUTO_BATCH_DEVICE_CONFIG", "CPU(4)"}}, false},
|
||||
query_model_params{{{"AUTO_BATCH_TIMEOUT", "200"}, {"AUTO_BATCH_DEVICE_CONFIG", "CPU(4)"}}, false},
|
||||
};
|
||||
|
||||
INSTANTIATE_TEST_SUITE_P(smoke_AutoBatch_BehaviorTests,
|
||||
QueryModelTest,
|
||||
::testing::ValuesIn(query_model_params_test),
|
||||
QueryModelTest::getTestCaseName);
|
@@ -0,0 +1,88 @@
|
||||
// Copyright (C) 2018-2023 Intel Corporation
|
||||
// SPDX-License-Identifier: Apache-2.0
|
||||
//
|
||||
|
||||
#include <gmock/gmock.h>
|
||||
#include <gtest/gtest.h>
|
||||
|
||||
#include "mock_common.hpp"
|
||||
|
||||
using ::testing::_;
|
||||
using ::testing::AnyNumber;
|
||||
using ::testing::AtLeast;
|
||||
using ::testing::Eq;
|
||||
using ::testing::NiceMock;
|
||||
using ::testing::Return;
|
||||
using ::testing::ReturnRef;
|
||||
using ::testing::StrEq;
|
||||
using ::testing::StrNe;
|
||||
using ::testing::Throw;
|
||||
|
||||
using namespace ov::mock_autobatch_plugin;
|
||||
|
||||
using set_property_params = std::tuple<ov::AnyMap, // Set Property
|
||||
bool>;
|
||||
|
||||
class SetPropertyTest : public ::testing::TestWithParam<set_property_params> {
|
||||
public:
|
||||
ov::AnyMap m_properties;
|
||||
bool m_throw_exception;
|
||||
std::shared_ptr<NiceMock<MockAutoBatchInferencePlugin>> m_plugin;
|
||||
|
||||
public:
|
||||
static std::string getTestCaseName(testing::TestParamInfo<set_property_params> obj) {
|
||||
ov::AnyMap properties;
|
||||
bool throw_exception;
|
||||
|
||||
std::tie(properties, throw_exception) = obj.param;
|
||||
std::string res = "";
|
||||
if (properties.size() > 0) {
|
||||
res += "SetProperty_";
|
||||
for (auto& it : properties) {
|
||||
res += it.first + "_" + it.second.as<std::string>() + "_";
|
||||
}
|
||||
}
|
||||
if (throw_exception)
|
||||
res += "_throw";
|
||||
return res;
|
||||
}
|
||||
|
||||
void TearDown() override {
|
||||
m_plugin.reset();
|
||||
}
|
||||
|
||||
void SetUp() override {
|
||||
std::tie(m_properties, m_throw_exception) = this->GetParam();
|
||||
m_plugin =
|
||||
std::shared_ptr<NiceMock<MockAutoBatchInferencePlugin>>(new NiceMock<MockAutoBatchInferencePlugin>());
|
||||
}
|
||||
};
|
||||
|
||||
TEST_P(SetPropertyTest, SetPropertyTestCase) {
|
||||
if (m_properties.size() == 0) {
|
||||
ASSERT_NO_THROW(m_plugin->set_property(m_properties));
|
||||
return;
|
||||
}
|
||||
|
||||
if (m_throw_exception) {
|
||||
ASSERT_ANY_THROW(m_plugin->set_property(m_properties));
|
||||
} else {
|
||||
ASSERT_NO_THROW(m_plugin->set_property(m_properties));
|
||||
}
|
||||
}
|
||||
|
||||
const std::vector<set_property_params> plugin_set_property_params_test = {
|
||||
set_property_params{{{"AUTO_BATCH_TIMEOUT", "200"}}, false},
|
||||
set_property_params{{{"AUTO_BATCH_DEVICE_CONFIG", "CPU(4)"}}, false},
|
||||
set_property_params{{{"CACHE_DIR", "./xyz"}}, false},
|
||||
set_property_params{{{"AUTO_BATCH_TIMEOUT", "200"}, {"AUTO_BATCH_DEVICE_CONFIG", "CPU(4)"}}, false},
|
||||
set_property_params{{{"AUTO_BATCH_TIMEOUT", "200"}, {"AUTO_BATCH_DEVICE_CONFIG", "CPU(4)"}, {"CACHE_DIR", "./xyz"}},
|
||||
false},
|
||||
set_property_params{{{"XYZ", "200"}}, true},
|
||||
set_property_params{{{"XYZ", "200"}, {"AUTO_BATCH_DEVICE_CONFIG", "CPU(4)"}, {"CACHE_DIR", "./xyz"}}, true},
|
||||
};
|
||||
|
||||
INSTANTIATE_TEST_SUITE_P(smoke_AutoBatch_BehaviorTests,
|
||||
SetPropertyTest,
|
||||
::testing::ValuesIn(plugin_set_property_params_test),
|
||||
SetPropertyTest::getTestCaseName);
|
@@ -1,397 +0,0 @@
|
||||
// Copyright (C) 2018-2023 Intel Corporation
|
||||
// SPDX-License-Identifier: Apache-2.0
|
||||
//
|
||||
|
||||
#include <gmock/gmock.h>
|
||||
#include <gtest/gtest.h>
|
||||
|
||||
#include "mock_auto_batch_plugin.hpp"
|
||||
#include "unit_test_utils/mocks/cpp_interfaces/impl/mock_inference_plugin_internal.hpp"
|
||||
#include "unit_test_utils/mocks/cpp_interfaces/interface/mock_icore.hpp"
|
||||
#include "unit_test_utils/mocks/cpp_interfaces/interface/mock_iinference_plugin.hpp"
|
||||
|
||||
using ::testing::_;
|
||||
using ::testing::AnyNumber;
|
||||
using ::testing::AtLeast;
|
||||
using ::testing::Eq;
|
||||
using ::testing::NiceMock;
|
||||
using ::testing::Return;
|
||||
using ::testing::ReturnRef;
|
||||
using ::testing::StrEq;
|
||||
using ::testing::StrNe;
|
||||
using ::testing::Throw;
|
||||
using namespace ov::mock_autobatch_plugin;
|
||||
using BatchDeviceConfigParams = std::tuple<std::string, // Batch devices
|
||||
std::string, // Expected device name
|
||||
int, // Expected batch size
|
||||
bool // Throw exception
|
||||
>;
|
||||
using MetricConfigParams = std::tuple<std::string, std::string, bool>;
|
||||
using MetaDeviceParams = std::tuple<std::string, // Device batch cfg
|
||||
std::map<std::string, std::string>, // Config
|
||||
DeviceInformation, // Expected result
|
||||
bool>; // Throw exception
|
||||
using SetGetConfigParams = std::tuple<std::map<std::string, std::string>, // Set Config
|
||||
std::string, // Get Config
|
||||
bool>; // Throw exception
|
||||
|
||||
const std::vector<std::string> cpu_supported_properties = {
|
||||
"CACHE_DIR",
|
||||
};
|
||||
const std::vector<std::string> gpu_supported_properties = {
|
||||
"CACHE_DIR",
|
||||
"OPTIMAL_BATCH_SIZE",
|
||||
};
|
||||
|
||||
class SetGetConfigTest : public ::testing::TestWithParam<SetGetConfigParams> {
|
||||
public:
|
||||
std::shared_ptr<NiceMock<MockICore>> core;
|
||||
std::shared_ptr<NiceMock<MockAutoBatchInferencePlugin>> plugin;
|
||||
|
||||
public:
|
||||
static std::string getTestCaseName(testing::TestParamInfo<SetGetConfigParams> obj) {
|
||||
std::map<std::string, std::string> set_config;
|
||||
std::string get_config;
|
||||
bool throw_exception;
|
||||
|
||||
std::tie(set_config, get_config, throw_exception) = obj.param;
|
||||
std::string res = "";
|
||||
if (set_config.size() > 0) {
|
||||
res += "GetConfig_";
|
||||
for (auto& it : set_config) {
|
||||
res += it.first + "_" + it.second + "_";
|
||||
}
|
||||
}
|
||||
if (!get_config.empty()) {
|
||||
res += "GetConfig_" + get_config;
|
||||
}
|
||||
if (throw_exception)
|
||||
res += "_throw";
|
||||
return res;
|
||||
}
|
||||
|
||||
void TearDown() override {
|
||||
core.reset();
|
||||
plugin.reset();
|
||||
}
|
||||
|
||||
void SetUp() override {
|
||||
core = std::shared_ptr<NiceMock<MockICore>>(new NiceMock<MockICore>());
|
||||
plugin = std::shared_ptr<NiceMock<MockAutoBatchInferencePlugin>>(new NiceMock<MockAutoBatchInferencePlugin>());
|
||||
plugin->SetCore(core);
|
||||
|
||||
ON_CALL(*plugin, ParseBatchDevice).WillByDefault([this](const std::string& batchDevice) {
|
||||
return plugin->Plugin::ParseBatchDevice(batchDevice);
|
||||
});
|
||||
}
|
||||
};
|
||||
|
||||
TEST_P(SetGetConfigTest, SetConfigTestCase) {
|
||||
std::map<std::string, std::string> set_config;
|
||||
std::string temp;
|
||||
bool throw_exception;
|
||||
std::tie(set_config, temp, throw_exception) = this->GetParam();
|
||||
|
||||
if (set_config.size() == 0) {
|
||||
ASSERT_NO_THROW(plugin->SetConfig(set_config));
|
||||
return;
|
||||
}
|
||||
|
||||
if (throw_exception) {
|
||||
ASSERT_ANY_THROW(plugin->SetConfig(set_config));
|
||||
} else {
|
||||
ASSERT_NO_THROW(plugin->SetConfig(set_config));
|
||||
}
|
||||
}
|
||||
|
||||
TEST_P(SetGetConfigTest, GetConfigTestCase) {
|
||||
std::map<std::string, std::string> temp;
|
||||
std::string get_config;
|
||||
bool throw_exception;
|
||||
std::tie(temp, get_config, throw_exception) = this->GetParam();
|
||||
|
||||
if (get_config.empty() || temp.size() > 0) {
|
||||
return;
|
||||
}
|
||||
|
||||
std::map<std::string, InferenceEngine::Parameter> options = {};
|
||||
if (throw_exception) {
|
||||
ASSERT_ANY_THROW(plugin->GetConfig(get_config, options));
|
||||
} else {
|
||||
ASSERT_NO_THROW(plugin->GetConfig(get_config, options));
|
||||
}
|
||||
}
|
||||
|
||||
TEST_P(SetGetConfigTest, SetGetConfigTestCase) {
|
||||
std::map<std::string, std::string> set_config;
|
||||
std::string get_config;
|
||||
bool throw_exception;
|
||||
std::tie(set_config, get_config, throw_exception) = this->GetParam();
|
||||
|
||||
if (get_config.empty() || set_config.size() == 0) {
|
||||
return;
|
||||
}
|
||||
|
||||
std::map<std::string, InferenceEngine::Parameter> options = {};
|
||||
ASSERT_NO_THROW(plugin->SetConfig(set_config));
|
||||
InferenceEngine::Parameter result;
|
||||
ASSERT_NO_THROW(result = plugin->GetConfig(get_config, options));
|
||||
EXPECT_EQ(result.as<std::string>(), set_config[get_config]);
|
||||
}
|
||||
|
||||
class ParseMetaDeviceTest : public ::testing::TestWithParam<MetaDeviceParams> {
|
||||
public:
|
||||
std::shared_ptr<NiceMock<MockICore>> core;
|
||||
std::shared_ptr<NiceMock<MockAutoBatchInferencePlugin>> plugin;
|
||||
|
||||
public:
|
||||
static std::string getTestCaseName(testing::TestParamInfo<MetaDeviceParams> obj) {
|
||||
std::string batch_cfg;
|
||||
std::map<std::string, std::string> config;
|
||||
DeviceInformation info;
|
||||
bool throw_exception;
|
||||
|
||||
std::tie(batch_cfg, config, info, throw_exception) = obj.param;
|
||||
std::string res = batch_cfg;
|
||||
for (auto& c : config) {
|
||||
res += "_" + c.first + "_" + c.second;
|
||||
}
|
||||
if (throw_exception)
|
||||
res += "_throw";
|
||||
return res;
|
||||
}
|
||||
|
||||
void TearDown() override {
|
||||
core.reset();
|
||||
plugin.reset();
|
||||
}
|
||||
|
||||
void SetUp() override {
|
||||
core = std::shared_ptr<NiceMock<MockICore>>(new NiceMock<MockICore>());
|
||||
plugin = std::shared_ptr<NiceMock<MockAutoBatchInferencePlugin>>(new NiceMock<MockAutoBatchInferencePlugin>());
|
||||
plugin->SetCore(core);
|
||||
|
||||
ON_CALL(*core, GetSupportedConfig)
|
||||
.WillByDefault([](const std::string& device, const std::map<std::string, std::string>& configs) {
|
||||
std::map<std::string, std::string> res_config;
|
||||
if (device == "CPU") {
|
||||
for (auto& c : configs) {
|
||||
if (std::find(begin(cpu_supported_properties), end(cpu_supported_properties), c.first) !=
|
||||
cpu_supported_properties.end())
|
||||
res_config[c.first] = c.second;
|
||||
}
|
||||
} else if (device == "GPU") {
|
||||
for (auto& c : configs) {
|
||||
if (std::find(begin(gpu_supported_properties), end(gpu_supported_properties), c.first) !=
|
||||
gpu_supported_properties.end())
|
||||
res_config[c.first] = c.second;
|
||||
}
|
||||
}
|
||||
return res_config;
|
||||
});
|
||||
|
||||
ON_CALL(*plugin, ParseBatchDevice).WillByDefault([this](const std::string& batchDevice) {
|
||||
return plugin->Plugin::ParseBatchDevice(batchDevice);
|
||||
});
|
||||
}
|
||||
|
||||
bool compare(std::map<std::string, std::string> a, std::map<std::string, std::string> b) {
|
||||
if (a.size() != b.size())
|
||||
return false;
|
||||
|
||||
for (auto& it : a) {
|
||||
auto item = b.find(it.first);
|
||||
if (item == b.end())
|
||||
return false;
|
||||
if (it.second != item->second)
|
||||
return false;
|
||||
}
|
||||
return true;
|
||||
}
|
||||
};
|
||||
|
||||
TEST_P(ParseMetaDeviceTest, ParseMetaDeviceTestCase) {
|
||||
std::string batch_cfg;
|
||||
std::map<std::string, std::string> config;
|
||||
DeviceInformation expected;
|
||||
bool throw_exception;
|
||||
|
||||
std::tie(batch_cfg, config, expected, throw_exception) = this->GetParam();
|
||||
|
||||
if (throw_exception) {
|
||||
ASSERT_ANY_THROW(plugin->ParseMetaDevice(batch_cfg, config));
|
||||
} else {
|
||||
auto result = plugin->ParseMetaDevice(batch_cfg, config);
|
||||
EXPECT_EQ(result.device_name, expected.device_name);
|
||||
EXPECT_EQ(result.batch_for_device, expected.batch_for_device);
|
||||
EXPECT_TRUE(compare(result.config, expected.config));
|
||||
}
|
||||
}
|
||||
|
||||
class ParseBatchDeviceTest : public ::testing::TestWithParam<BatchDeviceConfigParams> {
|
||||
public:
|
||||
std::shared_ptr<NiceMock<MockICore>> core;
|
||||
std::shared_ptr<NiceMock<MockAutoBatchInferencePlugin>> plugin;
|
||||
|
||||
public:
|
||||
static std::string getTestCaseName(testing::TestParamInfo<BatchDeviceConfigParams> obj) {
|
||||
std::string batchDevice;
|
||||
std::string deviceName;
|
||||
int batchSize;
|
||||
bool throw_exception;
|
||||
std::tie(batchDevice, deviceName, batchSize, throw_exception) = obj.param;
|
||||
return batchDevice;
|
||||
}
|
||||
|
||||
void TearDown() override {
|
||||
core.reset();
|
||||
plugin.reset();
|
||||
}
|
||||
|
||||
void SetUp() override {
|
||||
core = std::shared_ptr<NiceMock<MockICore>>(new NiceMock<MockICore>());
|
||||
plugin = std::shared_ptr<NiceMock<MockAutoBatchInferencePlugin>>(new NiceMock<MockAutoBatchInferencePlugin>());
|
||||
plugin->SetCore(core);
|
||||
|
||||
ON_CALL(*plugin, ParseBatchDevice).WillByDefault([this](const std::string& batchDevice) {
|
||||
return plugin->Plugin::ParseBatchDevice(batchDevice);
|
||||
});
|
||||
}
|
||||
};
|
||||
|
||||
TEST_P(ParseBatchDeviceTest, ParseBatchDeviceTestCase) {
|
||||
std::string batchDevice;
|
||||
std::string deviceName;
|
||||
int batchSize;
|
||||
bool throw_exception;
|
||||
std::tie(batchDevice, deviceName, batchSize, throw_exception) = this->GetParam();
|
||||
|
||||
if (throw_exception) {
|
||||
ASSERT_ANY_THROW(plugin->ParseBatchDevice(batchDevice));
|
||||
} else {
|
||||
auto result = plugin->ParseBatchDevice(batchDevice);
|
||||
EXPECT_EQ(result.device_name, deviceName);
|
||||
EXPECT_EQ(result.batch_for_device, batchSize);
|
||||
}
|
||||
}
|
||||
|
||||
class PluginMetricTest : public ::testing::TestWithParam<MetricConfigParams> {
|
||||
public:
|
||||
std::shared_ptr<NiceMock<MockICore>> core;
|
||||
std::shared_ptr<NiceMock<MockAutoBatchInferencePlugin>> plugin;
|
||||
|
||||
public:
|
||||
static std::string getTestCaseName(testing::TestParamInfo<MetricConfigParams> obj) {
|
||||
std::string metricName;
|
||||
std::string value;
|
||||
bool throw_exception;
|
||||
std::tie(metricName, value, throw_exception) = obj.param;
|
||||
return "Metric_" + metricName;
|
||||
}
|
||||
|
||||
void TearDown() override {
|
||||
core.reset();
|
||||
plugin.reset();
|
||||
}
|
||||
|
||||
void SetUp() override {
|
||||
core = std::shared_ptr<NiceMock<MockICore>>(new NiceMock<MockICore>());
|
||||
plugin = std::shared_ptr<NiceMock<MockAutoBatchInferencePlugin>>(new NiceMock<MockAutoBatchInferencePlugin>());
|
||||
plugin->SetCore(core);
|
||||
|
||||
ON_CALL(*plugin, GetMetric)
|
||||
.WillByDefault(
|
||||
[this](const std::string& name, const std::map<std::string, InferenceEngine::Parameter>& options) {
|
||||
return plugin->Plugin::GetMetric(name, options);
|
||||
});
|
||||
}
|
||||
};
|
||||
|
||||
TEST_P(PluginMetricTest, GetPluginMetricTest) {
|
||||
std::string metricName;
|
||||
std::string expected;
|
||||
bool throw_exception;
|
||||
std::tie(metricName, expected, throw_exception) = this->GetParam();
|
||||
|
||||
if (throw_exception) {
|
||||
ASSERT_ANY_THROW(plugin->GetMetric(metricName, {}));
|
||||
} else {
|
||||
auto value = plugin->GetMetric(metricName, {});
|
||||
EXPECT_EQ(value.as<std::string>(), expected);
|
||||
}
|
||||
}
|
||||
|
||||
const char supported_metric[] = "SUPPORTED_METRICS FULL_DEVICE_NAME SUPPORTED_CONFIG_KEYS";
|
||||
const char supported_config_keys[] = "AUTO_BATCH_DEVICE_CONFIG MULTI_DEVICE_PRIORITIES AUTO_BATCH_TIMEOUT CACHE_DIR";
|
||||
|
||||
const std::vector<BatchDeviceConfigParams> batchDeviceTestConfigs = {
|
||||
BatchDeviceConfigParams{"CPU(4)", "CPU", 4, false},
|
||||
BatchDeviceConfigParams{"GPU(8)", "GPU", 8, false},
|
||||
BatchDeviceConfigParams{"CPU(0)", "CPU", 0, true},
|
||||
BatchDeviceConfigParams{"GPU(-1)", "GPU", 0, true},
|
||||
};
|
||||
|
||||
const std::vector<MetricConfigParams> metricTestConfigs = {
|
||||
MetricConfigParams{METRIC_KEY(SUPPORTED_METRICS), supported_metric, false},
|
||||
MetricConfigParams{METRIC_KEY(FULL_DEVICE_NAME), "BATCH", false},
|
||||
MetricConfigParams{METRIC_KEY(SUPPORTED_CONFIG_KEYS), supported_config_keys, false},
|
||||
MetricConfigParams{"CPU_THREADS_NUM", "16", true},
|
||||
MetricConfigParams{"PERFORMANCE_HINT", "LATENCY", true},
|
||||
};
|
||||
|
const std::vector<MetaDeviceParams> testMetaDeviceConfigs = {
    MetaDeviceParams{"CPU(4)", {}, DeviceInformation{"CPU", {}, 4}, false},
    MetaDeviceParams{"CPU(4)", {{}}, DeviceInformation{"CPU", {{}}, 4}, true},
    MetaDeviceParams{"CPU(4)", {{"CACHE_DIR", "./"}}, DeviceInformation{"CPU", {{"CACHE_DIR", "./"}}, 4}, false},
    MetaDeviceParams{"GPU(4)", {{"CACHE_DIR", "./"}}, DeviceInformation{"GPU", {{"CACHE_DIR", "./"}}, 4}, false},
    MetaDeviceParams{"GPU(8)",
                     {{"CACHE_DIR", "./"}, {"OPTIMAL_BATCH_SIZE", "16"}},
                     DeviceInformation{"GPU", {{"CACHE_DIR", "./"}, {"OPTIMAL_BATCH_SIZE", "16"}}, 8},
                     false},
    MetaDeviceParams{"CPU(4)", {{"OPTIMAL_BATCH_SIZE", "16"}}, DeviceInformation{"CPU", {{}}, 4}, true},
    MetaDeviceParams{"CPU(4)",
                     {{"CACHE_DIR", "./"}, {"OPTIMAL_BATCH_SIZE", "16"}},
                     DeviceInformation{"CPU", {{"CACHE_DIR", "./"}}, 4},
                     true},
};

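// SetGetConfigParams is presumably {configs to set, config key to get, expect-throw flag}: the
// supported keys (AUTO_BATCH_TIMEOUT, AUTO_BATCH_DEVICE_CONFIG, CACHE_DIR) can be set, unknown
// keys such as "XYZ" are rejected, and getting AUTO_BATCH_DEVICE_CONFIG or CACHE_DIR before they
// were set throws, while AUTO_BATCH_TIMEOUT presumably falls back to a default value.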
const std::vector<SetGetConfigParams> testSetGetConfigParams = {
    // Set Config
    SetGetConfigParams{{{"AUTO_BATCH_TIMEOUT", "200"}}, {}, false},
    SetGetConfigParams{{{"AUTO_BATCH_DEVICE_CONFIG", "CPU(4)"}}, {}, false},
    SetGetConfigParams{{{"CACHE_DIR", "./xyz"}}, {}, false},
    SetGetConfigParams{{{"AUTO_BATCH_TIMEOUT", "200"}, {"AUTO_BATCH_DEVICE_CONFIG", "CPU(4)"}}, {}, false},
    SetGetConfigParams{{{"AUTO_BATCH_TIMEOUT", "200"}, {"AUTO_BATCH_DEVICE_CONFIG", "CPU(4)"}, {"CACHE_DIR", "./xyz"}},
                       {},
                       false},
    SetGetConfigParams{{{"XYZ", "200"}}, {}, true},
    SetGetConfigParams{{{"XYZ", "200"}, {"AUTO_BATCH_DEVICE_CONFIG", "CPU(4)"}, {"CACHE_DIR", "./xyz"}}, {}, true},
    // Get Config
    SetGetConfigParams{{}, "AUTO_BATCH_TIMEOUT", false},
    SetGetConfigParams{{}, "AUTO_BATCH_DEVICE_CONFIG", true},
    SetGetConfigParams{{}, "CACHE_DIR", true},
    // Set and get Config
    SetGetConfigParams{{{"AUTO_BATCH_TIMEOUT", "200"}}, "AUTO_BATCH_TIMEOUT", false},
    SetGetConfigParams{{{"AUTO_BATCH_DEVICE_CONFIG", "CPU(4)"}}, "AUTO_BATCH_DEVICE_CONFIG", false},
    SetGetConfigParams{{{"CACHE_DIR", "./abc"}}, "CACHE_DIR", false},
};

INSTANTIATE_TEST_SUITE_P(smoke_AutoBatch_BehaviorTests,
                         SetGetConfigTest,
                         ::testing::ValuesIn(testSetGetConfigParams),
                         SetGetConfigTest::getTestCaseName);

INSTANTIATE_TEST_SUITE_P(smoke_AutoBatch_BehaviorTests,
                         ParseBatchDeviceTest,
                         ::testing::ValuesIn(batchDeviceTestConfigs),
                         ParseBatchDeviceTest::getTestCaseName);

INSTANTIATE_TEST_SUITE_P(smoke_AutoBatch_BehaviorTests,
                         PluginMetricTest,
                         ::testing::ValuesIn(metricTestConfigs),
                         PluginMetricTest::getTestCaseName);

INSTANTIATE_TEST_SUITE_P(smoke_AutoBatch_BehaviorTests,
                         ParseMetaDeviceTest,
                         ::testing::ValuesIn(testMetaDeviceConfigs),
                         ParseMetaDeviceTest::getTestCaseName);

src/plugins/auto_batch/tests/unit/sync_infer_request_test.cpp (new file, 257 lines)
@@ -0,0 +1,257 @@
// Copyright (C) 2018-2023 Intel Corporation
// SPDX-License-Identifier: Apache-2.0
//

#include <gmock/gmock.h>
#include <gtest/gtest.h>

#include "mock_common.hpp"
#include "ngraph_functions/subgraph_builders.hpp"
#include "openvino/core/dimension_tracker.hpp"
#include "openvino/core/type/element_type.hpp"
#include "openvino/runtime/threading/immediate_executor.hpp"
#include "transformations/utils/utils.hpp"
#include "unit_test_utils/mocks/cpp_interfaces/interface/mock_icore.hpp"

using ::testing::_;
using ::testing::AnyNumber;
using ::testing::AtLeast;
using ::testing::Eq;
using ::testing::MatcherCast;
using ::testing::Matches;
using ::testing::NiceMock;
using ::testing::Return;
using ::testing::ReturnRef;
using ::testing::StrEq;
using ::testing::StrNe;
using ::testing::Throw;

using AutoBatchRequestTestParams = std::tuple<uint32_t,              // batch_size
                                              ov::element::Type_t>;  // data type

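// The fixture below builds the object graph an auto-batch SyncInferRequest depends on: a mocked
// core and plugin, mocked compiled models with and without batching, an async request wrapping a
// mocked sync request, and a CompiledModel::WorkerInferRequest that the per-slot requests attach
// to. Batch size and element type come from the test parameters.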
class AutoBatchRequestTest : public ::testing::TestWithParam<AutoBatchRequestTestParams> {
public:
    std::shared_ptr<ov::Model> m_model;
    std::shared_ptr<NiceMock<MockICore>> m_core;
    std::shared_ptr<NiceMock<MockAutoBatchInferencePlugin>> m_auto_batch_plugin;

    std::shared_ptr<NiceMock<MockICompiledModel>> m_i_compile_model_without_batch;
    ov::SoPtr<ov::ICompiledModel> m_compile_model_without_batch;

    std::shared_ptr<NiceMock<MockICompiledModel>> m_i_compile_model_with_batch;
    ov::SoPtr<ov::ICompiledModel> m_compile_model_with_batch;

    ov::AnyMap m_config;
    DeviceInformation m_device_info;
    std::set<std::string> m_batched_inputs;
    std::set<std::string> m_batched_outputs;
    ov::SoPtr<ov::IRemoteContext> m_remote_context;

    std::shared_ptr<MockAutoBatchCompileModel> m_auto_batch_compile_model;

    std::shared_ptr<NiceMock<MockISyncInferRequest>> m_sync_infer_request_with_batch;

    std::shared_ptr<NiceMock<MockIAsyncInferRequest>> m_async_infer_request_with_batch;

    std::shared_ptr<ov::threading::ImmediateExecutor> m_executor;

    std::shared_ptr<CompiledModel::WorkerInferRequest> workerRequestPtr;

    uint32_t m_batch_size;
    ov::element::Type_t m_element_type;

    std::vector<std::shared_ptr<SyncInferRequest>> m_auto_batch_infer_requests;

    std::vector<ov::ProfilingInfo> m_profiling_info;

    static std::string getTestCaseName(testing::TestParamInfo<AutoBatchRequestTestParams> obj) {
        uint32_t batch_size;
        ov::element::Type_t element_type;
        std::tie(batch_size, element_type) = obj.param;

        std::string res;
        res = "batch_size_" + std::to_string(batch_size);
        res += "_element_type_" + std::to_string(static_cast<int>(element_type));
        return res;
    }

    void TearDown() override {
        m_profiling_info.clear();
        m_auto_batch_infer_requests.clear();
        m_auto_batch_plugin.reset();
        m_model.reset();
        m_core.reset();
        m_i_compile_model_without_batch.reset();
        m_compile_model_without_batch = {};
        m_i_compile_model_with_batch.reset();
        m_compile_model_with_batch = {};
        m_auto_batch_compile_model.reset();
        m_sync_infer_request_with_batch.reset();
        m_async_infer_request_with_batch.reset();
        m_executor.reset();
        clear_worker();
        workerRequestPtr.reset();
    }

    void SetUp() override {
        std::tie(m_batch_size, m_element_type) = this->GetParam();
        std::vector<size_t> inputShape = {1, 3, 24, 24};
        m_model = ngraph::builder::subgraph::makeMultiSingleConv(inputShape, m_element_type);
        m_core = std::shared_ptr<NiceMock<MockICore>>(new NiceMock<MockICore>());

        m_auto_batch_plugin =
            std::shared_ptr<NiceMock<MockAutoBatchInferencePlugin>>(new NiceMock<MockAutoBatchInferencePlugin>());

        m_auto_batch_plugin->set_core(m_core);
        m_i_compile_model_without_batch = std::make_shared<NiceMock<MockICompiledModel>>(m_model, m_auto_batch_plugin);
        m_compile_model_without_batch = {m_i_compile_model_without_batch, {}};

        m_config = {{"AUTO_BATCH_TIMEOUT", "200"}};

        m_device_info = {"CPU", {}, m_batch_size};
        m_batched_inputs = {"Parameter_0"};
        m_batched_outputs = {"Convolution_20"};

        m_i_compile_model_with_batch = std::make_shared<NiceMock<MockICompiledModel>>(m_model, m_auto_batch_plugin);
        m_compile_model_with_batch = {m_i_compile_model_with_batch, {}};

        ASSERT_NO_THROW(m_auto_batch_compile_model =
                            std::make_shared<MockAutoBatchCompileModel>(m_model->clone(),
                                                                        m_auto_batch_plugin,
                                                                        m_config,
                                                                        m_device_info,
                                                                        m_batched_inputs,
                                                                        m_batched_outputs,
                                                                        m_compile_model_with_batch,
                                                                        m_compile_model_without_batch,
                                                                        m_remote_context));

        m_sync_infer_request_with_batch =
            std::make_shared<NiceMock<MockISyncInferRequest>>(m_i_compile_model_with_batch);

        m_executor = std::make_shared<ov::threading::ImmediateExecutor>();

        m_async_infer_request_with_batch =
            std::make_shared<NiceMock<MockIAsyncInferRequest>>(m_sync_infer_request_with_batch, m_executor, nullptr);

        m_profiling_info = {};
    }

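    // create_worker builds a minimal CompiledModel::WorkerInferRequest around the mocked batched
    // async request: it sizes the completion-task list to the batch size, propagates any exception
    // reported by the batched request's callback, and starts a stub thread that is joined later in
    // clear_worker().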
    void create_worker(int batch_size) {
        workerRequestPtr = std::make_shared<CompiledModel::WorkerInferRequest>();

        workerRequestPtr->_infer_request_batched = {m_async_infer_request_with_batch, {}};
        workerRequestPtr->_batch_size = batch_size;
        workerRequestPtr->_completion_tasks.resize(workerRequestPtr->_batch_size);
        workerRequestPtr->_infer_request_batched->set_callback([this](std::exception_ptr exceptionPtr) mutable {
            if (exceptionPtr)
                workerRequestPtr->_exception_ptr = exceptionPtr;
        });
        workerRequestPtr->_thread = std::thread([] {
            std::this_thread::sleep_for(std::chrono::milliseconds(10));
        });
        return;
    }

    void clear_worker() {
        workerRequestPtr->_infer_request_batched = {};
        workerRequestPtr->_completion_tasks.clear();
        workerRequestPtr->_thread.join();
    }

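    // prepare_input records the legacy (IE) tensor names of every model parameter and result as
    // the batched inputs/outputs that SyncInferRequest is expected to gather/scatter; the
    // batch_size argument is currently unused.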
    void prepare_input(std::shared_ptr<ov::Model>& model, int batch_size) {
        const auto& params = model->get_parameters();
        for (size_t i = 0; i < params.size(); i++) {
            m_batched_inputs.insert(ov::op::util::get_ie_output_name(params[i]->output(0)));
        }
        const auto& results = model->get_results();
        for (size_t i = 0; i < results.size(); i++) {
            const auto& output = results[i];
            const auto& node = output->input_value(0);
            m_batched_outputs.insert(
                ov::op::util::get_ie_output_name(ov::Output<const ov::Node>(node.get_node(), node.get_index())));
        }
    }
};

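// The tests below create SyncInferRequest objects on top of the shared worker request (one per
// batch slot in the first case, a single slot-0 request in the others) and check construction,
// input/output copy handling, and profiling-info forwarding.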
TEST_P(AutoBatchRequestTest, AutoBatchRequestCreateTestCase) {
    prepare_input(m_model, m_batch_size);
    create_worker(m_batch_size);

    for (uint32_t batch_id = 0; batch_id < m_batch_size; batch_id++) {
        auto req = std::make_shared<SyncInferRequest>(m_auto_batch_compile_model,
                                                      workerRequestPtr,
                                                      batch_id,
                                                      m_batch_size,
                                                      m_batched_inputs,
                                                      m_batched_outputs);
        EXPECT_NE(req, nullptr);
        m_auto_batch_infer_requests.emplace_back(req);
    }
}

TEST_P(AutoBatchRequestTest, AutoBatchRequestCopyInputTensorTestCase) {
    prepare_input(m_model, m_batch_size);
    create_worker(m_batch_size);

    auto req = std::make_shared<SyncInferRequest>(m_auto_batch_compile_model,
                                                  workerRequestPtr,
                                                  0,
                                                  m_batch_size,
                                                  m_batched_inputs,
                                                  m_batched_outputs);
    EXPECT_NE(req, nullptr);
    m_auto_batch_infer_requests.emplace_back(req);

    EXPECT_NO_THROW(req->copy_inputs_if_needed());
}

TEST_P(AutoBatchRequestTest, AutoBatchRequestCopyOutputTensorTestCase) {
    prepare_input(m_model, m_batch_size);
    create_worker(m_batch_size);

    auto req = std::make_shared<SyncInferRequest>(m_auto_batch_compile_model,
                                                  workerRequestPtr,
                                                  0,
                                                  m_batch_size,
                                                  m_batched_inputs,
                                                  m_batched_outputs);
    EXPECT_NE(req, nullptr);
    m_auto_batch_infer_requests.emplace_back(req);

    EXPECT_NO_THROW(req->copy_outputs_if_needed());
}

TEST_P(AutoBatchRequestTest, AutoBatchRequestGetProfilingInfoTestCase) {
    prepare_input(m_model, m_batch_size);
    create_worker(m_batch_size);

    auto req = std::make_shared<SyncInferRequest>(m_auto_batch_compile_model,
                                                  workerRequestPtr,
                                                  0,
                                                  m_batch_size,
                                                  m_batched_inputs,
                                                  m_batched_outputs);
    EXPECT_NE(req, nullptr);

    ON_CALL(*m_sync_infer_request_with_batch, get_profiling_info()).WillByDefault(Return(m_profiling_info));

    EXPECT_NO_THROW(req->get_profiling_info());
}

std::vector<ov::element::Type_t> element_type{ov::element::Type_t::f16,
                                              ov::element::Type_t::f32,
                                              ov::element::Type_t::f64,
                                              ov::element::Type_t::i8,
                                              ov::element::Type_t::i16,
                                              ov::element::Type_t::i32,
                                              ov::element::Type_t::i64,
                                              ov::element::Type_t::u8,
                                              ov::element::Type_t::u16,
                                              ov::element::Type_t::u32,
                                              ov::element::Type_t::u64};
const std::vector<uint32_t> batch_size{1, 8, 16, 32, 64, 128};

INSTANTIATE_TEST_SUITE_P(smoke_AutoBatch_BehaviorTests,
                         AutoBatchRequestTest,
                         ::testing::Combine(::testing::ValuesIn(batch_size), ::testing::ValuesIn(element_type)),
                         AutoBatchRequestTest::getTestCaseName);

@@ -16,8 +16,6 @@ const std::vector<ov::AnyMap> inproperties = {
 
 const std::vector<ov::AnyMap> auto_batch_inproperties = {
     {ov::num_streams(-100)},
-    {{CONFIG_KEY(AUTO_BATCH_DEVICE_CONFIG), std::string(CommonTestUtils::DEVICE_CPU) + "(4)"},
-     {ov::auto_batch_timeout(-1)}},
 };
 
 INSTANTIATE_TEST_SUITE_P(smoke_BehaviorTests,

@@ -57,6 +57,6 @@ INSTANTIATE_TEST_SUITE_P(
     ::testing::Combine(
         ::testing::Values(std::string(CommonTestUtils::DEVICE_BATCH) + ":" + CommonTestUtils::DEVICE_GPU),
         ::testing::Values(DefaultParameter{ov::auto_batch_timeout.name(),
-                                           InferenceEngine::Parameter{1000}})),
+                                           InferenceEngine::Parameter{uint32_t(1000)}})),
     DefaultConfigurationTest::getTestCaseName);
 } // namespace AutoBatchingTests

@@ -89,8 +89,6 @@ namespace {
 
 auto auto_batch_inconfigs = []() {
     return std::vector<std::map<std::string, std::string>>{
-        {{CONFIG_KEY(AUTO_BATCH_DEVICE_CONFIG), CommonTestUtils::DEVICE_GPU},
-         {CONFIG_KEY(AUTO_BATCH_TIMEOUT), "-1"}},
         {{CONFIG_KEY(AUTO_BATCH_DEVICE_CONFIG), CommonTestUtils::DEVICE_GPU},
          {InferenceEngine::PluginConfigParams::KEY_PERFORMANCE_HINT, "DOESN'T EXIST"}},
         {{CONFIG_KEY(AUTO_BATCH_DEVICE_CONFIG), CommonTestUtils::DEVICE_GPU},

@@ -16,8 +16,6 @@ const std::vector<ov::AnyMap> inproperties = {
 
 const std::vector<ov::AnyMap> auto_batch_inproperties = {
     {ov::device::id("UNSUPPORTED_DEVICE_ID_STRING")},
-    {{CONFIG_KEY(AUTO_BATCH_DEVICE_CONFIG), std::string(CommonTestUtils::DEVICE_TEMPLATE) + "(4)"},
-     {ov::auto_batch_timeout(-1)}},
 };
 
 INSTANTIATE_TEST_SUITE_P(smoke_BehaviorTests,

@@ -15,9 +15,7 @@ const std::vector<ov::AnyMap> inproperties = {
     {ov::device::id("UNSUPPORTED_DEVICE_ID_STRING")},
 };
 
-const std::vector<ov::AnyMap> auto_batch_inproperties = {
-    {{ov::auto_batch_timeout(-1)}},
-};
+const std::vector<ov::AnyMap> auto_batch_inproperties = {};
 
 INSTANTIATE_TEST_SUITE_P(ov_compiled_model_mandatory, OVClassCompiledModelPropertiesIncorrectTests,
     ::testing::Combine(

@@ -16,9 +16,7 @@ const std::vector<ov::AnyMap> inproperties = {
     {ov::device::id("UNSUPPORTED_DEVICE_ID_STRING")},
 };
 
-const std::vector<ov::AnyMap> auto_batch_inproperties = {
-    {{ov::auto_batch_timeout(-1)}},
-};
+const std::vector<ov::AnyMap> auto_batch_inproperties = {};
 
 INSTANTIATE_TEST_SUITE_P(ov_plugin_mandatory, OVPropertiesIncorrectTests,
     ::testing::Combine(

@@ -58,6 +58,7 @@ public:
    MOCK_CONST_METHOD3(GetMetric, ov::Any(const std::string&, const std::string&, const ov::AnyMap&));
    MOCK_CONST_METHOD2(GetConfig, ov::Any(const std::string&, const std::string&));
    MOCK_CONST_METHOD3(get_property, ov::Any(const std::string&, const std::string&, const ov::AnyMap&));
    MOCK_CONST_METHOD2(get_property, ov::Any(const std::string&, const std::string&));
    MOCK_CONST_METHOD0(GetAvailableDevices, std::vector<std::string>());
    MOCK_CONST_METHOD1(DeviceSupportsModelCaching, bool(const std::string&));  // NOLINT not a cast to bool
    MOCK_METHOD2(GetSupportedConfig,