[AUTO BATCH PLUGIN] Enable API 2.0 for auto batch plugin (#18172)

Squashed commit history (each commit signed off by Zhai, Xuejun / xuejun <xuejun.zhai@intel.com>):

* [AUTO BATCH PLUGIN] enable API 2.0 for auto batch plugin
* [AUTO BATCH PLUGIN] disable auto batch plugin unit tests temporarily
* [AUTO BATCH PLUGIN] remove test with ov::auto_batch_timeout(-1), because the variable is unsigned int
* [AUTO BATCH PLUGIN] fix compiler error caused by std::atomic_uint32_t
* [Remote Context] fix review comments
* [AUTO BATCH PLUGIN] fix compiler warnings
* [AUTO BATCH PLUGIN] fix compiler warnings
* [AUTO BATCH PLUGIN] fix test error
* [AUTO BATCH PLUGIN] fix CI test error in CPU functional test case, caused by the batched model losing rt_info
* [AUTO BATCH PLUGIN] fix CI build error caused by an unused variable
* [AUTO BATCH PLUGIN] use ov::threading
* [AUTO BATCH PLUGIN] clean up code where the batched request shares buffers with the non-batched request
* [AUTO BATCH PLUGIN] clean code & fix format issue
* [AUTO BATCH PLUGIN] clean code & fix format issue
* [AUTO BATCH PLUGIN] add implementations of get_default_context() & create_context() and remove the test config with AUTO_BATCH_TIMEOUT(-1)
* [AUTO BATCH PLUGIN] fix failing GPU test with auto batch
* [AUTO BATCH PLUGIN] fix warning
* [AUTO BATCH PLUGIN] fix get_default_context() issue
* [AUTO BATCH PLUGIN] fix "using namespace" redundancy
* [AUTO BATCH PLUGIN] modify variable naming style
* [AUTO BATCH PLUGIN] fix CI test error caused by tensor references in the virtual plugin
* [AUTO BATCH PLUGIN] implement get_profiling()
* [AUTO BATCH PLUGIN] remove get_context() from the auto batch compiled model, using the interface from the parent class
* [AUTO BATCH PLUGIN] implement create_context() & get_default_context() for the auto batch plugin
* [AUTO BATCH PLUGIN] fix format issue
* [AUTO BATCH PLUGIN] implement the auto batch remote context
* [AUTO BATCH PLUGIN] fix error after merging with master
* [AUTO BATCH PLUGIN] fix compiler error caused by updating master
* [AUTO BATCH PLUGIN] refactor remote context in the auto batch plugin
* [AUTO BATCH PLUGIN] add unit test cases for the auto batch plugin
* [AUTO BATCH PLUGIN] fix CI warning caused by an unused variable & add unit tests for remote context
* [AUTO BATCH PLUGIN] fix review comments
* [AUTO BATCH PLUGIN] add a virtual property for get_context() in icompiled_model & implement it in the auto batch plugin
* [AUTO BATCH PLUGIN] add ov::loaded_from_cache support
* [AUTO BATCH PLUGIN] fix error caused by updating with master
* [AUTO BATCH PLUGIN] fix review comments (repeated across nine follow-up commits)
* [AUTO BATCH PLUGIN] fix unit test error
* [AUTO BATCH PLUGIN] fix conflict
* [AUTO BATCH PLUGIN] fix error caused by updating master
* [AUTO BATCH PLUGIN] fix review comments

---------

Signed-off-by: Zhai, Xuejun <xuejun.zhai@intel.com>
Signed-off-by: xuejun <xuejun.zhai@intel.com>
This commit is contained in: parent 9cd39455fc, commit ba76b45194
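For context on what the migrated plugin exposes to applications, here is a minimal user-side sketch of the OpenVINO 2.0 (ov::) API that this PR targets. The device string and model path are placeholders, not taken from this PR.

#include <openvino/openvino.hpp>

int main() {
    ov::Core core;
    auto model = core.read_model("model.xml");  // placeholder model path
    // Explicit batching: wrap the real device in the BATCH meta-device and
    // tune how long requests are collected before a partial batch is flushed.
    auto compiled = core.compile_model(model,
                                       "BATCH:GPU",                  // assumes a GPU plugin is available
                                       ov::auto_batch_timeout(100)); // timeout in milliseconds
    auto request = compiled.create_infer_request();
    request.start_async();
    request.wait();
    return 0;
}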
@@ -9,57 +9,57 @@
 namespace ov {
 namespace autobatch_plugin {
 
-AsyncInferRequest::AsyncInferRequest(const SyncInferRequest::Ptr& inferRequest,
-                                     InferenceEngine::SoIInferRequestInternal& inferRequestWithoutBatch,
-                                     const InferenceEngine::ITaskExecutor::Ptr& callbackExecutor)
-    : AsyncInferRequestThreadSafeDefault(inferRequest, nullptr, callbackExecutor),
-      m_infer_request_without_batch(inferRequestWithoutBatch),
-      m_sync_infer_request{inferRequest} {
+AsyncInferRequest::AsyncInferRequest(const std::shared_ptr<SyncInferRequest>& request,
+                                     const ov::SoPtr<ov::IAsyncInferRequest>& request_without_batch,
+                                     const std::shared_ptr<ov::threading::ITaskExecutor>& callback_executor)
+    : ov::IAsyncInferRequest(request, nullptr, callback_executor),
+      m_sync_request(request),
+      m_request_without_batch(request_without_batch) {
     // this executor starts the inference while the task (checking the result) is passed to the next stage
-    struct ThisRequestExecutor : public InferenceEngine::ITaskExecutor {
+    struct ThisRequestExecutor : public ov::threading::ITaskExecutor {
         explicit ThisRequestExecutor(AsyncInferRequest* _this_) : _this{_this_} {}
-        void run(InferenceEngine::Task task) override {
-            auto& workerInferRequest = _this->m_sync_infer_request->m_batched_request_wrapper;
-            std::pair<AsyncInferRequest*, InferenceEngine::Task> t;
+        void run(ov::threading::Task task) override {
+            auto workerInferRequest = _this->m_sync_request->m_batched_request_wrapper;
+            std::pair<AsyncInferRequest*, ov::threading::Task> t;
             t.first = _this;
             t.second = std::move(task);
-            workerInferRequest._tasks.push(t);
+            workerInferRequest->_tasks.push(t);
             // it is ok to call size() here as the queue only grows (and the bulk removal happens under the mutex)
-            const int sz = static_cast<int>(workerInferRequest._tasks.size());
-            if (sz == workerInferRequest._batchSize) {
-                workerInferRequest._cond.notify_one();
+            const int sz = static_cast<int>(workerInferRequest->_tasks.size());
+            if (sz == workerInferRequest->_batch_size) {
+                workerInferRequest->_cond.notify_one();
            }
        };
        AsyncInferRequest* _this = nullptr;
    };
-    _pipeline = {
-        {/*TaskExecutor*/ std::make_shared<ThisRequestExecutor>(this), /*task*/ [this] {
-            if (this->m_sync_infer_request->m_exceptionPtr) // if the exception happened in the batch1 fallback
-                std::rethrow_exception(this->m_sync_infer_request->m_exceptionPtr);
-            auto& batchReq = this->m_sync_infer_request->m_batched_request_wrapper;
-            if (batchReq.m_exceptionPtr) // when the batchN execution failed
-                std::rethrow_exception(batchReq.m_exceptionPtr);
-            // in the case of non-batched execution the blobs were set explicitly
-            if (SyncInferRequest::eExecutionFlavor::BATCH_EXECUTED ==
-                this->m_sync_infer_request->m_batched_request_status)
-                this->m_sync_infer_request->CopyOutputsIfNeeded();
-        }}};
+    m_pipeline = {{/*TaskExecutor*/ std::make_shared<ThisRequestExecutor>(this), /*task*/ [this] {
+                       if (this->m_sync_request->m_exception_ptr) // if the exception happened in the batch1 fallback
+                           std::rethrow_exception(this->m_sync_request->m_exception_ptr);
+                       auto batchReq = this->m_sync_request->m_batched_request_wrapper;
+                       if (batchReq->_exception_ptr) // when the batchN execution failed
+                           std::rethrow_exception(batchReq->_exception_ptr);
+                       // in the case of non-batched execution the tensors were set explicitly
+                       if (SyncInferRequest::eExecutionFlavor::BATCH_EXECUTED ==
+                           this->m_sync_request->m_batched_request_status) {
+                           this->m_sync_request->copy_outputs_if_needed();
+                       }
+                   }}};
 }
 
-std::map<std::string, InferenceEngine::InferenceEngineProfileInfo> AsyncInferRequest::GetPerformanceCounts() const {
-    CheckState();
-    if (SyncInferRequest::eExecutionFlavor::BATCH_EXECUTED == m_sync_infer_request->m_batched_request_status)
-        return m_sync_infer_request->m_batched_request_wrapper._inferRequestBatched->GetPerformanceCounts();
+std::vector<ov::ProfilingInfo> AsyncInferRequest::get_profiling_info() const {
+    check_state();
+    if (SyncInferRequest::eExecutionFlavor::BATCH_EXECUTED == m_sync_request->m_batched_request_status)
+        return m_sync_request->get_profiling_info();
     else
-        return m_infer_request_without_batch->GetPerformanceCounts();
+        return m_request_without_batch->get_profiling_info();
 }
 
-void AsyncInferRequest::Infer_ThreadUnsafe() {
-    InferUsingAsync();
+void AsyncInferRequest::infer_thread_unsafe() {
+    start_async_thread_unsafe();
 }
 
 AsyncInferRequest::~AsyncInferRequest() {
-    StopAndWait();
+    stop_and_wait();
 }
 } // namespace autobatch_plugin
 } // namespace ov
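The new m_pipeline stage above defers each task into the worker's queue and wakes the worker only once a full batch has accumulated. Below is a self-contained sketch of that collect-then-notify pattern using only the standard library; the names are illustrative stand-ins, not the plugin's actual types.

#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>

// Illustrative stand-in for the worker that owns the batched request.
struct Worker {
    size_t batch_size = 4;
    std::queue<std::function<void()>> tasks;
    std::mutex mutex;
    std::condition_variable cond;

    // Mirrors the idea of ThisRequestExecutor::run(): enqueue the completion task
    // and wake the worker thread only when a full batch has been collected.
    void submit(std::function<void()> task) {
        std::lock_guard<std::mutex> lock(mutex);
        tasks.push(std::move(task));
        if (tasks.size() == batch_size)
            cond.notify_one();
    }
};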
@@ -4,28 +4,27 @@
 
 ///////////////////////////////////////////////////////////////////////////////////////////////////
 #pragma once
-#include "cpp_interfaces/impl/ie_infer_async_request_thread_safe_default.hpp"
 
+#include "openvino/runtime/iasync_infer_request.hpp"
 #include "sync_infer_request.hpp"
 
 namespace ov {
 namespace autobatch_plugin {
-class AsyncInferRequest : public InferenceEngine::AsyncInferRequestThreadSafeDefault {
+class AsyncInferRequest : public ov::IAsyncInferRequest {
 public:
     using Ptr = std::shared_ptr<AsyncInferRequest>;
-    explicit AsyncInferRequest(const SyncInferRequest::Ptr& inferRequest,
-                               InferenceEngine::SoIInferRequestInternal& inferRequestWithoutBatch,
-                               const InferenceEngine::ITaskExecutor::Ptr& callbackExecutor);
+    AsyncInferRequest(const std::shared_ptr<SyncInferRequest>& request,
+                      const ov::SoPtr<ov::IAsyncInferRequest>& request_without_batch,
+                      const std::shared_ptr<ov::threading::ITaskExecutor>& callback_executor);
 
-    void Infer_ThreadUnsafe() override;
+    void infer_thread_unsafe() override;
 
     virtual ~AsyncInferRequest();
 
-    std::map<std::string, InferenceEngine::InferenceEngineProfileInfo> GetPerformanceCounts() const override;
+    std::vector<ov::ProfilingInfo> get_profiling_info() const override;
 
-    InferenceEngine::SoIInferRequestInternal m_infer_request_without_batch;
+    std::shared_ptr<ov::autobatch_plugin::SyncInferRequest> m_sync_request;
 
-    SyncInferRequest::Ptr m_sync_infer_request;
+    ov::SoPtr<ov::IAsyncInferRequest> m_request_without_batch;
 };
 } // namespace autobatch_plugin
 } // namespace ov
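With get_profiling_info() overridden as above, per-layer statistics flow through the standard 2.0 request API. A short user-side sketch follows; it assumes profiling was enabled when the model was compiled.

#include <iostream>
#include <openvino/openvino.hpp>

void print_profiling(ov::InferRequest& request) {
    // Works the same whether the batched path or the batch-1 fallback executed.
    for (const auto& info : request.get_profiling_info()) {
        std::cout << info.node_name << ": "
                  << info.real_time.count() << " us" << std::endl;
    }
}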
@@ -6,29 +6,29 @@
 #include "compiled_model.hpp"
 
 #include "async_infer_request.hpp"
-#include "ie_performance_hints.hpp"
 #include "sync_infer_request.hpp"
 
 namespace ov {
 namespace autobatch_plugin {
-CompiledModel::CompiledModel(const InferenceEngine::SoExecutableNetworkInternal& networkWithBatch,
-                             const InferenceEngine::SoExecutableNetworkInternal& networkWithoutBatch,
-                             const DeviceInformation& networkDevice,
-                             const std::unordered_map<std::string, InferenceEngine::Parameter>& config,
-                             const std::set<std::string>& batchedInputs,
-                             const std::set<std::string>& batchedOutputs)
-    : InferenceEngine::ExecutableNetworkThreadSafeDefault(nullptr,
-                                                          std::make_shared<InferenceEngine::ImmediateExecutor>()),
-      m_model_with_batch{networkWithBatch},
-      m_model_without_batch{networkWithoutBatch},
-      m_config{config},
-      m_batched_inputs(batchedInputs),
-      m_batched_outputs(batchedOutputs) {
+CompiledModel::CompiledModel(const std::shared_ptr<ov::Model>& model,
+                             const std::shared_ptr<const ov::IPlugin>& plugin,
+                             const ov::AnyMap& config,
+                             const DeviceInformation& device_info,
+                             const std::set<std::string>& batched_inputs,
+                             const std::set<std::string>& batched_outputs,
+                             const ov::SoPtr<ov::ICompiledModel>& compiled_model_with_batch,
+                             const ov::SoPtr<ov::ICompiledModel>& compiled_model_without_batch,
+                             const ov::SoPtr<ov::IRemoteContext>& context)
+    : ov::ICompiledModel(model, plugin, context),
+      m_config(config),
+      m_batched_inputs(batched_inputs),
+      m_batched_outputs(batched_outputs),
+      m_compiled_model_with_batch(compiled_model_with_batch),
+      m_compiled_model_without_batch(compiled_model_without_batch) {
     // WA for gcc 4.8 ( fails compilation with member init-list)
-    m_device_info = networkDevice;
-    auto time_out = config.find(CONFIG_KEY(AUTO_BATCH_TIMEOUT));
-    IE_ASSERT(time_out != config.end());
-    m_timeout = ParseTimeoutValue(time_out->second.as<std::string>());
+    m_device_info = device_info;
+    auto time_out = config.find(ov::auto_batch_timeout.name());
+    OPENVINO_ASSERT(time_out != config.end(), "No timeout property be set in config, default will be used!");
+    m_time_out = time_out->second.as<std::uint32_t>();
 }
 
 CompiledModel::~CompiledModel() {
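The constructor above now reads the timeout straight from the ov::AnyMap instead of parsing a string. A small sketch of that lookup in isolation; the helper name and the fallback value of 1000 ms (the plugin's default) are illustrative.

#include <openvino/core/any.hpp>
#include <openvino/runtime/properties.hpp>

uint32_t read_timeout(const ov::AnyMap& config) {
    auto it = config.find(ov::auto_batch_timeout.name());
    // ov::Any::as<uint32_t>() converts whatever the user passed (string or integer).
    return it != config.end() ? it->second.as<uint32_t>() : 1000u;  // 1000 ms default
}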
@ -39,63 +39,38 @@ CompiledModel::~CompiledModel() {
|
||||
m_worker_requests.clear();
|
||||
}
|
||||
|
||||
unsigned int CompiledModel::ParseTimeoutValue(const std::string& s) {
|
||||
auto val = std::stoi(s);
|
||||
if (val < 0)
|
||||
IE_THROW(ParameterMismatch) << "Value for the " << CONFIG_KEY(AUTO_BATCH_TIMEOUT) << " should be unsigned int";
|
||||
return val;
|
||||
}
|
||||
|
||||
std::shared_ptr<InferenceEngine::RemoteContext> CompiledModel::GetContext() const {
|
||||
return m_model_without_batch->GetContext();
|
||||
}
|
||||
|
||||
InferenceEngine::IInferRequestInternal::Ptr CompiledModel::CreateInferRequestImpl(
|
||||
InferenceEngine::InputsDataMap networkInputs,
|
||||
InferenceEngine::OutputsDataMap networkOutputs) {
|
||||
std::shared_ptr<ov::ISyncInferRequest> CompiledModel::create_sync_infer_request() const {
|
||||
auto workerRequestPtrAndId = GetWorkerInferRequest();
|
||||
return std::make_shared<SyncInferRequest>(networkInputs,
|
||||
networkOutputs,
|
||||
workerRequestPtrAndId.first,
|
||||
workerRequestPtrAndId.second,
|
||||
m_device_info.batch_for_device,
|
||||
m_batched_inputs,
|
||||
m_batched_outputs);
|
||||
auto async_infer_request = std::make_shared<ov::autobatch_plugin::SyncInferRequest>(
|
||||
std::dynamic_pointer_cast<const ov::autobatch_plugin::CompiledModel>(shared_from_this()),
|
||||
workerRequestPtrAndId.first,
|
||||
workerRequestPtrAndId.second,
|
||||
m_device_info.device_batch_size,
|
||||
m_batched_inputs,
|
||||
m_batched_outputs);
|
||||
return async_infer_request;
|
||||
}
|
||||
|
||||
InferenceEngine::IInferRequestInternal::Ptr CompiledModel::CreateInferRequestImpl(
|
||||
const std::vector<std::shared_ptr<const ov::Node>>& inputs,
|
||||
const std::vector<std::shared_ptr<const ov::Node>>& outputs) {
|
||||
if (!this->_plugin || !_plugin->IsNewAPI())
|
||||
return nullptr;
|
||||
auto workerRequestPtrAndId = GetWorkerInferRequest();
|
||||
return std::make_shared<SyncInferRequest>(inputs,
|
||||
outputs,
|
||||
workerRequestPtrAndId.first,
|
||||
workerRequestPtrAndId.second,
|
||||
m_device_info.batch_for_device,
|
||||
m_batched_inputs,
|
||||
m_batched_outputs);
|
||||
}
|
||||
|
||||
std::pair<CompiledModel::WorkerInferRequest&, int> CompiledModel::GetWorkerInferRequest() {
|
||||
std::pair<std::shared_ptr<ov::autobatch_plugin::CompiledModel::WorkerInferRequest>, int>
|
||||
CompiledModel::GetWorkerInferRequest() const {
|
||||
auto num = m_num_requests_created++;
|
||||
std::lock_guard<std::mutex> lock(m_worker_requests_mutex);
|
||||
auto batch_id = num % m_device_info.batch_for_device;
|
||||
auto batch_id = num % m_device_info.device_batch_size;
|
||||
if (!batch_id) { // need new request
|
||||
m_worker_requests.push_back(std::make_shared<WorkerInferRequest>());
|
||||
auto workerRequestPtr = m_worker_requests.back().get();
|
||||
workerRequestPtr->_inferRequestBatched = {m_model_with_batch->CreateInferRequest(), m_model_with_batch._so};
|
||||
workerRequestPtr->_batchSize = m_device_info.batch_for_device;
|
||||
workerRequestPtr->_completionTasks.resize(workerRequestPtr->_batchSize);
|
||||
workerRequestPtr->_inferRequestBatched->SetCallback(
|
||||
workerRequestPtr->_infer_request_batched = {m_compiled_model_with_batch->create_infer_request(),
|
||||
m_compiled_model_with_batch._so};
|
||||
workerRequestPtr->_batch_size = m_device_info.device_batch_size;
|
||||
workerRequestPtr->_completion_tasks.resize(workerRequestPtr->_batch_size);
|
||||
workerRequestPtr->_infer_request_batched->set_callback(
|
||||
[workerRequestPtr](std::exception_ptr exceptionPtr) mutable {
|
||||
if (exceptionPtr)
|
||||
workerRequestPtr->m_exceptionPtr = exceptionPtr;
|
||||
IE_ASSERT(workerRequestPtr->_completionTasks.size() == (size_t)workerRequestPtr->_batchSize);
|
||||
workerRequestPtr->_exception_ptr = exceptionPtr;
|
||||
OPENVINO_ASSERT(workerRequestPtr->_completion_tasks.size() == (size_t)workerRequestPtr->_batch_size);
|
||||
// notify the individual requests on the completion
|
||||
for (int c = 0; c < workerRequestPtr->_batchSize; c++) {
|
||||
workerRequestPtr->_completionTasks[c]();
|
||||
for (int c = 0; c < workerRequestPtr->_batch_size; c++) {
|
||||
workerRequestPtr->_completion_tasks[c]();
|
||||
}
|
||||
// reset the timeout
|
||||
workerRequestPtr->_cond.notify_one();
|
||||
@@ -106,7 +81,7 @@ std::pair<CompiledModel::WorkerInferRequest&, int> CompiledModel::GetWorkerInfer
             std::cv_status status;
             {
                 std::unique_lock<std::mutex> lock(workerRequestPtr->_mutex);
-                status = workerRequestPtr->_cond.wait_for(lock, std::chrono::milliseconds(m_timeout));
+                status = workerRequestPtr->_cond.wait_for(lock, std::chrono::milliseconds(m_time_out));
             }
             if (m_terminate) {
                 break;
@ -114,38 +89,38 @@ std::pair<CompiledModel::WorkerInferRequest&, int> CompiledModel::GetWorkerInfer
|
||||
// as we pop the tasks from the queue only here
|
||||
// it is ok to call size() (as the _tasks can only grow in parallel)
|
||||
const int sz = static_cast<int>(workerRequestPtr->_tasks.size());
|
||||
if (sz == workerRequestPtr->_batchSize) {
|
||||
std::pair<AsyncInferRequest*, InferenceEngine::Task> t;
|
||||
if (sz == workerRequestPtr->_batch_size) {
|
||||
std::pair<ov::autobatch_plugin::AsyncInferRequest*, ov::threading::Task> t;
|
||||
for (int n = 0; n < sz; n++) {
|
||||
IE_ASSERT(workerRequestPtr->_tasks.try_pop(t));
|
||||
workerRequestPtr->_completionTasks[n] = std::move(t.second);
|
||||
t.first->m_sync_infer_request->CopyInputsIfNeeded();
|
||||
t.first->m_sync_infer_request->m_batched_request_status =
|
||||
SyncInferRequest::eExecutionFlavor::BATCH_EXECUTED;
|
||||
OPENVINO_ASSERT(workerRequestPtr->_tasks.try_pop(t));
|
||||
workerRequestPtr->_completion_tasks[n] = std::move(t.second);
|
||||
t.first->m_sync_request->copy_inputs_if_needed();
|
||||
t.first->m_sync_request->m_batched_request_status =
|
||||
ov::autobatch_plugin::SyncInferRequest::eExecutionFlavor::BATCH_EXECUTED;
|
||||
}
|
||||
workerRequestPtr->_inferRequestBatched->StartAsync();
|
||||
workerRequestPtr->_infer_request_batched->start_async();
|
||||
} else if ((status == std::cv_status::timeout) && sz) {
|
||||
// timeout to collect the batch is over, have to execute the requests in the batch1 mode
|
||||
std::pair<AsyncInferRequest*, InferenceEngine::Task> t;
|
||||
std::pair<ov::autobatch_plugin::AsyncInferRequest*, ov::threading::Task> t;
|
||||
// popping all tasks collected by the moment of the time-out and execute each with batch1
|
||||
std::atomic<int> arrived = {0};
|
||||
std::promise<void> all_completed;
|
||||
auto all_completed_future = all_completed.get_future();
|
||||
for (int n = 0; n < sz; n++) {
|
||||
IE_ASSERT(workerRequestPtr->_tasks.try_pop(t));
|
||||
t.first->m_infer_request_without_batch->SetCallback(
|
||||
OPENVINO_ASSERT(workerRequestPtr->_tasks.try_pop(t));
|
||||
t.first->m_request_without_batch->set_callback(
|
||||
[t, sz, &arrived, &all_completed](std::exception_ptr p) {
|
||||
if (p)
|
||||
t.first->m_sync_infer_request->m_exceptionPtr = p;
|
||||
t.first->m_sync_request->m_exception_ptr = p;
|
||||
t.second();
|
||||
if (sz == ++arrived)
|
||||
if (sz == ++arrived) {
|
||||
all_completed.set_value();
|
||||
}
|
||||
});
|
||||
t.first->m_sync_infer_request->m_batched_request_status =
|
||||
SyncInferRequest::eExecutionFlavor::TIMEOUT_EXECUTED;
|
||||
t.first->m_sync_infer_request->SetBlobsToAnotherRequest(
|
||||
t.first->m_infer_request_without_batch);
|
||||
t.first->m_infer_request_without_batch->StartAsync();
|
||||
t.first->m_sync_request->m_batched_request_status =
|
||||
ov::autobatch_plugin::SyncInferRequest::eExecutionFlavor::TIMEOUT_EXECUTED;
|
||||
t.first->m_sync_request->set_tensors_to_another_request(t.first->m_request_without_batch);
|
||||
t.first->m_request_without_batch->start_async();
|
||||
}
|
||||
all_completed_future.get();
|
||||
// now when all the tasks for this batch are completed, start waiting for the timeout again
|
||||
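The worker loop in this hunk waits for either a full batch or the timeout, then falls back to batch-1 execution for whatever was collected. A self-contained sketch of that wait-for-N-or-timeout decision, standard library only, with illustrative names:

#include <chrono>
#include <condition_variable>
#include <mutex>
#include <queue>

// Returns true if a full batch was collected before the timeout expired.
bool wait_for_batch(std::queue<int>& tasks, size_t batch_size, std::mutex& m,
                    std::condition_variable& cond, std::chrono::milliseconds timeout) {
    std::unique_lock<std::mutex> lock(m);
    // Mirrors the plugin's wait_for(): wake up on notify_one() or when the timeout elapses.
    cond.wait_for(lock, timeout);
    // Full batch -> run batched; otherwise each collected request runs as batch-1.
    return tasks.size() >= batch_size;
}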
@ -154,93 +129,103 @@ std::pair<CompiledModel::WorkerInferRequest&, int> CompiledModel::GetWorkerInfer
|
||||
}
|
||||
});
|
||||
}
|
||||
return {*m_worker_requests.back(), static_cast<int>(batch_id)};
|
||||
return {m_worker_requests.back(), static_cast<int>(batch_id)};
|
||||
}
|
||||
|
||||
InferenceEngine::IInferRequestInternal::Ptr CompiledModel::CreateInferRequest() {
|
||||
if (!m_model_with_batch) {
|
||||
auto res = m_model_without_batch->CreateInferRequest();
|
||||
res->setPointerToExecutableNetworkInternal(shared_from_this());
|
||||
res->setPointerToSo(m_model_without_batch._so);
|
||||
_so = m_model_without_batch._so;
|
||||
std::shared_ptr<ov::IAsyncInferRequest> CompiledModel::create_infer_request() const {
|
||||
if (!m_compiled_model_with_batch) {
|
||||
auto res = m_compiled_model_without_batch->create_infer_request();
|
||||
for (auto& iter : res->get_inputs()) {
|
||||
auto&& tensor = res->get_tensor(iter);
|
||||
if (!tensor._so)
|
||||
tensor._so = m_compiled_model_without_batch._so;
|
||||
}
|
||||
for (auto& iter : res->get_outputs()) {
|
||||
auto&& tensor = res->get_tensor(iter);
|
||||
if (!tensor._so)
|
||||
tensor._so = m_compiled_model_without_batch._so;
|
||||
}
|
||||
return res;
|
||||
}
|
||||
// trying to create the new API request first
|
||||
InferenceEngine::IInferRequestInternal::Ptr syncRequestImpl = CreateInferRequestImpl(_parameters, _results);
|
||||
if (!syncRequestImpl)
|
||||
syncRequestImpl = CreateInferRequestImpl(_networkInputs, _networkOutputs);
|
||||
syncRequestImpl->setPointerToExecutableNetworkInternal(shared_from_this());
|
||||
InferenceEngine::SoIInferRequestInternal inferRequestWithoutBatch = {m_model_without_batch->CreateInferRequest(),
|
||||
m_model_without_batch._so};
|
||||
return std::make_shared<AsyncInferRequest>(std::static_pointer_cast<SyncInferRequest>(syncRequestImpl),
|
||||
inferRequestWithoutBatch,
|
||||
_callbackExecutor);
|
||||
|
||||
auto sync_res = create_sync_infer_request();
|
||||
|
||||
ov::SoPtr<ov::IAsyncInferRequest> infer_request_without_batch = {
|
||||
m_compiled_model_without_batch->create_infer_request(),
|
||||
m_compiled_model_without_batch._so};
|
||||
return std::make_shared<ov::autobatch_plugin::AsyncInferRequest>(
|
||||
std::dynamic_pointer_cast<ov::autobatch_plugin::SyncInferRequest>(sync_res),
|
||||
infer_request_without_batch,
|
||||
get_callback_executor());
|
||||
}
|
||||
|
||||
std::shared_ptr<ngraph::Function> CompiledModel::GetExecGraphInfo() {
|
||||
return m_model_with_batch && m_model_with_batch->GetExecGraphInfo() ? m_model_with_batch->GetExecGraphInfo()
|
||||
: m_model_without_batch->GetExecGraphInfo();
|
||||
std::shared_ptr<const ov::Model> CompiledModel::get_runtime_model() const {
|
||||
return m_compiled_model_with_batch ? m_compiled_model_with_batch->get_runtime_model()
|
||||
: m_compiled_model_without_batch->get_runtime_model();
|
||||
}
|
||||
|
||||
void CompiledModel::SetConfig(const std::map<std::string, InferenceEngine::Parameter>& user_config) {
|
||||
auto timeout = user_config.find(CONFIG_KEY(AUTO_BATCH_TIMEOUT));
|
||||
if (timeout == user_config.end() || user_config.size() > 1) {
|
||||
IE_THROW() << "The only config that can be changed on the fly for the AutoBatching the is the "
|
||||
<< CONFIG_KEY(AUTO_BATCH_TIMEOUT);
|
||||
void CompiledModel::set_property(const ov::AnyMap& properties) {
|
||||
auto time_out = properties.find(ov::auto_batch_timeout.name());
|
||||
if (time_out == properties.end() || properties.size() > 1) {
|
||||
OPENVINO_THROW("The only config that can be changed on the fly for the AutoBatching is the ",
|
||||
ov::auto_batch_timeout.name());
|
||||
} else {
|
||||
m_timeout = ParseTimeoutValue(timeout->second.as<std::string>());
|
||||
m_time_out = time_out->second.as<std::uint32_t>();
|
||||
}
|
||||
}
|
||||
|
||||
InferenceEngine::Parameter CompiledModel::GetConfig(const std::string& name) const {
|
||||
ov::Any CompiledModel::get_property(const std::string& name) const {
|
||||
auto it = m_config.find(name);
|
||||
if (it != m_config.end()) {
|
||||
return it->second;
|
||||
} else {
|
||||
// find config key among networks config keys
|
||||
auto param = m_model_without_batch->GetMetric(METRIC_KEY(SUPPORTED_CONFIG_KEYS));
|
||||
for (auto&& configKey : param.as<std::vector<std::string>>()) {
|
||||
if (configKey == name) {
|
||||
return m_model_without_batch->GetConfig(configKey);
|
||||
auto modelSupportedProperties = m_compiled_model_without_batch->get_property(ov::supported_properties.name());
|
||||
for (auto&& property : modelSupportedProperties.as<std::vector<ov::PropertyName>>()) {
|
||||
if (property == name) {
|
||||
return m_compiled_model_without_batch->get_property(property);
|
||||
}
|
||||
}
|
||||
IE_THROW(NotFound) << name << " not found in the ExecutableNetwork config";
|
||||
if (name == ov::optimal_number_of_infer_requests.name()) {
|
||||
uint32_t num_request = 0;
|
||||
try {
|
||||
num_request =
|
||||
m_compiled_model_without_batch->get_property(ov::hint::num_requests.name()).as<std::uint32_t>();
|
||||
if (num_request == 0) // no limitations from user, let's deduce the full blown #requests
|
||||
// (multiplied by the devices capabilities to run multiple <batched> requests for further perf)
|
||||
num_request =
|
||||
m_device_info.device_batch_size *
|
||||
m_compiled_model_without_batch->get_property(ov::optimal_number_of_infer_requests.name())
|
||||
.as<uint32_t>();
|
||||
} catch (const ov::Exception&) {
|
||||
}
|
||||
num_request =
|
||||
std::max(num_request, m_device_info.device_batch_size); // round up to the possible user's value
|
||||
return num_request;
|
||||
} else if (name == ov::model_name.name()) {
|
||||
return m_compiled_model_without_batch->get_property(name);
|
||||
OPENVINO_SUPPRESS_DEPRECATED_START
|
||||
} else if (name == METRIC_KEY(SUPPORTED_METRICS)) {
|
||||
return std::vector<std::string>{ov::optimal_number_of_infer_requests.name(),
|
||||
METRIC_KEY(SUPPORTED_METRICS),
|
||||
ov::model_name.name(),
|
||||
METRIC_KEY(SUPPORTED_CONFIG_KEYS),
|
||||
ov::execution_devices.name()};
|
||||
} else if (name == METRIC_KEY(SUPPORTED_CONFIG_KEYS)) {
|
||||
return std::vector<std::string>{ov::auto_batch_timeout.name()};
|
||||
} else if (name == ov::execution_devices) {
|
||||
return m_compiled_model_without_batch->get_property(name);
|
||||
} else if (name == ov::loaded_from_cache) {
|
||||
return m_compiled_model_without_batch->get_property(ov::loaded_from_cache.name());
|
||||
} else {
|
||||
OPENVINO_THROW("Unsupported Compiled Model Property: ", name);
|
||||
}
|
||||
}
|
||||
OPENVINO_SUPPRESS_DEPRECATED_END
|
||||
}
|
||||
|
||||
InferenceEngine::Parameter CompiledModel::GetMetric(const std::string& name) const {
|
||||
if (name == METRIC_KEY(OPTIMAL_NUMBER_OF_INFER_REQUESTS)) {
|
||||
auto reqs = 0;
|
||||
try {
|
||||
auto hint = m_model_without_batch->GetConfig(CONFIG_KEY(PERFORMANCE_HINT_NUM_REQUESTS)).as<std::string>();
|
||||
reqs = InferenceEngine::PerfHintsConfig::CheckPerformanceHintRequestValue(hint);
|
||||
if (!reqs) // no limitations from user, let's deduce the full blown #requests
|
||||
// (multiplied by the devices capabilities to run multiple <batched> requests for further perf)
|
||||
reqs =
|
||||
m_device_info.batch_for_device *
|
||||
m_model_without_batch->GetMetric(METRIC_KEY(OPTIMAL_NUMBER_OF_INFER_REQUESTS)).as<unsigned int>();
|
||||
} catch (const InferenceEngine::Exception&) {
|
||||
}
|
||||
reqs = std::max(reqs, m_device_info.batch_for_device); // round up to the possible user's value
|
||||
IE_SET_METRIC_RETURN(OPTIMAL_NUMBER_OF_INFER_REQUESTS, reqs);
|
||||
} else if (name == METRIC_KEY(NETWORK_NAME)) {
|
||||
IE_SET_METRIC_RETURN(NETWORK_NAME,
|
||||
m_model_without_batch->GetMetric(METRIC_KEY(NETWORK_NAME)).as<std::string>());
|
||||
} else if (name == METRIC_KEY(SUPPORTED_METRICS)) {
|
||||
IE_SET_METRIC_RETURN(SUPPORTED_METRICS,
|
||||
{METRIC_KEY(OPTIMAL_NUMBER_OF_INFER_REQUESTS),
|
||||
METRIC_KEY(SUPPORTED_METRICS),
|
||||
METRIC_KEY(NETWORK_NAME),
|
||||
METRIC_KEY(SUPPORTED_CONFIG_KEYS),
|
||||
ov::execution_devices.name()});
|
||||
} else if (name == METRIC_KEY(SUPPORTED_CONFIG_KEYS)) {
|
||||
IE_SET_METRIC_RETURN(SUPPORTED_CONFIG_KEYS,
|
||||
{CONFIG_KEY(AUTO_BATCH_TIMEOUT)}); // only timeout can be changed on the fly
|
||||
} else if (name == ov::execution_devices) {
|
||||
return m_model_without_batch->GetMetric(name);
|
||||
} else {
|
||||
IE_THROW() << "Unsupported Network metric: " << name;
|
||||
}
|
||||
void CompiledModel::export_model(std::ostream& model) const {
|
||||
OPENVINO_NOT_IMPLEMENTED;
|
||||
}
|
||||
|
||||
} // namespace autobatch_plugin
|
||||
|
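get_property() above derives ov::optimal_number_of_infer_requests from the device batch size so applications can keep enough requests in flight to fill batches. From the application side this is queried through the usual 2.0 call; a minimal sketch:

#include <openvino/openvino.hpp>

uint32_t optimal_requests(const ov::CompiledModel& compiled_model) {
    // For BATCH:* devices this is at least the device batch size.
    return compiled_model.get_property(ov::optimal_number_of_infer_requests);
}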
@ -5,79 +5,75 @@
|
||||
///////////////////////////////////////////////////////////////////////////////////////////////////
|
||||
#pragma once
|
||||
|
||||
#include <map>
|
||||
#include <condition_variable>
|
||||
#include <thread>
|
||||
|
||||
#include "cpp_interfaces/impl/ie_executable_network_thread_safe_default.hpp"
|
||||
#include "ie_metric_helpers.hpp"
|
||||
#include "openvino/runtime/iasync_infer_request.hpp"
|
||||
#include "openvino/runtime/icompiled_model.hpp"
|
||||
#include "openvino/runtime/threading/thread_safe_containers.hpp"
|
||||
#include "plugin.hpp"
|
||||
#include "threading/ie_thread_safe_containers.hpp"
|
||||
|
||||
namespace ov {
|
||||
namespace autobatch_plugin {
|
||||
|
||||
class AsyncInferRequest;
|
||||
|
||||
class CompiledModel : public InferenceEngine::ExecutableNetworkThreadSafeDefault {
|
||||
class CompiledModel : public ov::ICompiledModel {
|
||||
public:
|
||||
using Ptr = std::shared_ptr<CompiledModel>;
|
||||
struct WorkerInferRequest {
|
||||
using Ptr = std::shared_ptr<WorkerInferRequest>;
|
||||
InferenceEngine::SoIInferRequestInternal _inferRequestBatched;
|
||||
int _batchSize;
|
||||
InferenceEngine::ThreadSafeQueueWithSize<std::pair<AsyncInferRequest*, InferenceEngine::Task>> _tasks;
|
||||
std::vector<InferenceEngine::Task> _completionTasks;
|
||||
ov::SoPtr<ov::IAsyncInferRequest> _infer_request_batched;
|
||||
int _batch_size;
|
||||
ov::threading::ThreadSafeQueueWithSize<std::pair<ov::autobatch_plugin::AsyncInferRequest*, ov::threading::Task>>
|
||||
_tasks;
|
||||
std::vector<ov::threading::Task> _completion_tasks;
|
||||
std::thread _thread;
|
||||
std::condition_variable _cond;
|
||||
std::mutex _mutex;
|
||||
std::exception_ptr m_exceptionPtr;
|
||||
std::exception_ptr _exception_ptr;
|
||||
};
|
||||
|
||||
CompiledModel(const InferenceEngine::SoExecutableNetworkInternal& networkForDevice,
|
||||
const InferenceEngine::SoExecutableNetworkInternal& networkForDeviceWithoutBatch,
|
||||
const DeviceInformation& networkDevices,
|
||||
const std::unordered_map<std::string, InferenceEngine::Parameter>& config,
|
||||
const std::set<std::string>& batchedIntputs,
|
||||
const std::set<std::string>& batchedOutputs);
|
||||
CompiledModel(const std::shared_ptr<ov::Model>& model,
|
||||
const std::shared_ptr<const ov::IPlugin>& plugin,
|
||||
const ov::AnyMap& config,
|
||||
const DeviceInformation& device_info,
|
||||
const std::set<std::string>& batched_inputs,
|
||||
const std::set<std::string>& batched_outputs,
|
||||
const ov::SoPtr<ov::ICompiledModel>& compiled_model_with_batch,
|
||||
const ov::SoPtr<ov::ICompiledModel>& compiled_model_without_batch,
|
||||
const ov::SoPtr<ov::IRemoteContext>& context);
|
||||
|
||||
void SetConfig(const std::map<std::string, InferenceEngine::Parameter>& config) override;
|
||||
void set_property(const ov::AnyMap& properties) override;
|
||||
|
||||
InferenceEngine::Parameter GetConfig(const std::string& name) const override;
|
||||
ov::Any get_property(const std::string& name) const override;
|
||||
|
||||
InferenceEngine::Parameter GetMetric(const std::string& name) const override;
|
||||
std::shared_ptr<ov::IAsyncInferRequest> create_infer_request() const override;
|
||||
|
||||
InferenceEngine::IInferRequestInternal::Ptr CreateInferRequest() override;
|
||||
std::shared_ptr<const ov::Model> get_runtime_model() const override;
|
||||
|
||||
InferenceEngine::IInferRequestInternal::Ptr CreateInferRequestImpl(
|
||||
InferenceEngine::InputsDataMap networkInputs,
|
||||
InferenceEngine::OutputsDataMap networkOutputs) override;
|
||||
|
||||
InferenceEngine::IInferRequestInternal::Ptr CreateInferRequestImpl(
|
||||
const std::vector<std::shared_ptr<const ov::Node>>& inputs,
|
||||
const std::vector<std::shared_ptr<const ov::Node>>& outputs) override;
|
||||
|
||||
std::shared_ptr<InferenceEngine::RemoteContext> GetContext() const override;
|
||||
|
||||
std::shared_ptr<ngraph::Function> GetExecGraphInfo() override;
|
||||
void export_model(std::ostream& model) const override;
|
||||
|
||||
virtual ~CompiledModel();
|
||||
|
||||
protected:
|
||||
std::shared_ptr<ov::ISyncInferRequest> create_sync_infer_request() const override;
|
||||
static unsigned int ParseTimeoutValue(const std::string&);
|
||||
std::atomic_bool m_terminate = {false};
|
||||
ov::AnyMap m_config;
|
||||
DeviceInformation m_device_info;
|
||||
InferenceEngine::SoExecutableNetworkInternal m_model_with_batch;
|
||||
InferenceEngine::SoExecutableNetworkInternal m_model_without_batch;
|
||||
|
||||
std::pair<WorkerInferRequest&, int> GetWorkerInferRequest();
|
||||
std::vector<WorkerInferRequest::Ptr> m_worker_requests;
|
||||
std::mutex m_worker_requests_mutex;
|
||||
std::pair<std::shared_ptr<ov::autobatch_plugin::CompiledModel::WorkerInferRequest>, int> GetWorkerInferRequest()
|
||||
const;
|
||||
mutable std::vector<std::shared_ptr<WorkerInferRequest>> m_worker_requests;
|
||||
mutable std::mutex m_worker_requests_mutex;
|
||||
|
||||
std::unordered_map<std::string, InferenceEngine::Parameter> m_config;
|
||||
std::atomic_size_t m_num_requests_created = {0};
|
||||
std::atomic_int m_timeout = {0}; // in ms
|
||||
mutable std::atomic_size_t m_num_requests_created = {0};
|
||||
std::atomic<std::uint32_t> m_time_out = {0}; // in ms
|
||||
|
||||
const std::set<std::string> m_batched_inputs;
|
||||
const std::set<std::string> m_batched_outputs;
|
||||
|
||||
ov::SoPtr<ov::ICompiledModel> m_compiled_model_with_batch;
|
||||
ov::SoPtr<ov::ICompiledModel> m_compiled_model_without_batch;
|
||||
};
|
||||
} // namespace autobatch_plugin
|
||||
} // namespace ov
|
||||
|
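The WorkerInferRequest structure declared in the header above stores its pending tasks in an ov::threading::ThreadSafeQueueWithSize. The following is a simplified, self-contained stand-in that only illustrates the queue semantics the worker relies on; it is not the OpenVINO implementation.

#include <mutex>
#include <queue>
#include <utility>

template <typename T>
class SimpleThreadSafeQueue {
public:
    void push(T value) {
        std::lock_guard<std::mutex> lock(m_mutex);
        m_queue.push(std::move(value));
    }
    bool try_pop(T& value) {
        std::lock_guard<std::mutex> lock(m_mutex);
        if (m_queue.empty())
            return false;
        value = std::move(m_queue.front());
        m_queue.pop();
        return true;
    }
    size_t size() const {
        std::lock_guard<std::mutex> lock(m_mutex);
        return m_queue.size();  // safe to read: producers only grow the queue
    }

private:
    mutable std::mutex m_mutex;
    std::queue<T> m_queue;
};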
@ -7,10 +7,6 @@
|
||||
#include "plugin.hpp"
|
||||
|
||||
#include "compiled_model.hpp"
|
||||
#include "ie_icore.hpp"
|
||||
#include "ie_metric_helpers.hpp"
|
||||
#include "ie_ngraph_utils.hpp"
|
||||
#include "ie_performance_hints.hpp"
|
||||
#include "openvino/core/dimension_tracker.hpp"
|
||||
#include "openvino/pass/manager.hpp"
|
||||
#include "openvino/runtime/intel_gpu/properties.hpp"
|
||||
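After this include cleanup the plugin pulls in the FindBatch pass and DimensionTracker directly; a usage sketch of those two headers appears further below, after the batch-dimension detection hunk.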
@ -19,227 +15,220 @@
|
||||
#include "transformations/common_optimizations/dimension_tracking.hpp"
|
||||
#include "transformations/init_node_info.hpp"
|
||||
#include "transformations/utils/utils.hpp"
|
||||
OPENVINO_SUPPRESS_DEPRECATED_START
|
||||
#include "ie_layouts.h"
|
||||
OPENVINO_SUPPRESS_DEPRECATED_END
|
||||
|
||||
namespace ov {
|
||||
namespace autobatch_plugin {
|
||||
|
||||
OPENVINO_SUPPRESS_DEPRECATED_START
|
||||
std::vector<std::string> supported_configKeys = {CONFIG_KEY(AUTO_BATCH_DEVICE_CONFIG),
|
||||
ov::device::priorities.name(),
|
||||
CONFIG_KEY(AUTO_BATCH_TIMEOUT),
|
||||
CONFIG_KEY(CACHE_DIR)};
|
||||
namespace {
|
||||
ov::auto_batch_timeout.name(),
|
||||
ov::cache_dir.name()};
|
||||
OPENVINO_SUPPRESS_DEPRECATED_END
|
||||
|
||||
std::map<std::string, std::string> mergeConfigs(std::map<std::string, std::string> config,
|
||||
const std::map<std::string, std::string>& user_config) {
|
||||
inline ov::AnyMap merge_properties(ov::AnyMap config, const ov::AnyMap& user_config) {
|
||||
for (auto&& kvp : user_config) {
|
||||
config[kvp.first] = kvp.second;
|
||||
}
|
||||
return config;
|
||||
}
|
||||
|
||||
} // namespace
|
||||
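merge_properties() above gives per-call properties precedence over the plugin-level configuration. A tiny illustration of that precedence with made-up values:

#include <openvino/core/any.hpp>
#include <openvino/runtime/properties.hpp>

ov::AnyMap example_merge() {
    ov::AnyMap plugin_defaults = {ov::auto_batch_timeout(1000)};
    ov::AnyMap user_config = {ov::auto_batch_timeout(50)};
    // Same loop as merge_properties(): user entries overwrite the defaults.
    for (auto&& kvp : user_config)
        plugin_defaults[kvp.first] = kvp.second;
    return plugin_defaults;  // auto_batch_timeout is now 50
}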
|
||||
DeviceInformation Plugin::ParseBatchDevice(const std::string& deviceWithBatch) {
|
||||
auto&& d = deviceWithBatch;
|
||||
auto openingBracket = d.find_first_of('(');
|
||||
auto closingBracket = d.find_first_of(')', openingBracket);
|
||||
auto deviceName = d.substr(0, openingBracket);
|
||||
DeviceInformation Plugin::parse_batch_device(const std::string& device_with_batch) {
|
||||
auto openingBracket = device_with_batch.find_first_of('(');
|
||||
auto closingBracket = device_with_batch.find_first_of(')', openingBracket);
|
||||
auto deviceName = device_with_batch.substr(0, openingBracket);
|
||||
|
||||
int batch = 0;
|
||||
if (closingBracket != std::string::npos && openingBracket < closingBracket) {
|
||||
batch = std::stol(d.substr(openingBracket + 1, closingBracket - 1));
|
||||
batch = std::stol(device_with_batch.substr(openingBracket + 1, closingBracket - 1));
|
||||
|
||||
if (batch <= 0) {
|
||||
IE_THROW() << "Batch value for '" << deviceName << "' must be > 0, while " << batch << "is passed";
|
||||
OPENVINO_THROW("Batch value for '", deviceName, "' must be > 0, while ", batch, "is passed");
|
||||
}
|
||||
}
|
||||
return {deviceName, {{}}, batch};
|
||||
return {deviceName, {{}}, static_cast<uint32_t>(batch)};
|
||||
}
|
||||
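parse_batch_device() above accepts device strings of the form "GPU(4)". A standalone sketch of the same parsing logic, kept independent of the plugin types (the return convention of 0 meaning "batch not specified" matches the function above):

#include <cstdint>
#include <stdexcept>
#include <string>
#include <utility>

// Returns {device name, requested batch size}; 0 means the batch was not specified.
std::pair<std::string, uint32_t> parse_batch_device_example(const std::string& device_with_batch) {
    auto opening = device_with_batch.find_first_of('(');
    auto closing = device_with_batch.find_first_of(')', opening);
    std::string device = device_with_batch.substr(0, opening);
    uint32_t batch = 0;
    if (closing != std::string::npos && opening < closing) {
        long value = std::stol(device_with_batch.substr(opening + 1, closing - opening - 1));
        if (value <= 0)
            throw std::runtime_error("Batch value for '" + device + "' must be > 0");
        batch = static_cast<uint32_t>(value);
    }
    return {device, batch};
}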
|
||||
DeviceInformation Plugin::ParseMetaDevice(const std::string& devicesBatchCfg,
|
||||
const std::map<std::string, std::string>& user_config) const {
|
||||
auto metaDevice = ParseBatchDevice(devicesBatchCfg);
|
||||
metaDevice.config = GetCore()->GetSupportedConfig(metaDevice.device_name, user_config);
|
||||
|
||||
DeviceInformation Plugin::parse_meta_device(const std::string& devices_batch_config,
|
||||
const ov::AnyMap& user_config) const {
|
||||
auto meta_device = parse_batch_device(devices_batch_config);
|
||||
meta_device.device_config = get_core()->get_supported_property(meta_device.device_name, user_config);
|
||||
// check that no irrelevant config-keys left
|
||||
for (const auto& k : user_config) {
|
||||
const auto& name = k.first;
|
||||
if (metaDevice.config.find(name) == metaDevice.config.end() &&
|
||||
if (meta_device.device_config.find(name) == meta_device.device_config.end() &&
|
||||
!ov::util::contains(supported_configKeys, name)) {
|
||||
IE_THROW() << "Unsupported config key: " << name;
|
||||
OPENVINO_THROW("Unsupported config key: ", name);
|
||||
}
|
||||
}
|
||||
return metaDevice;
|
||||
return meta_device;
|
||||
}
|
||||
|
||||
InferenceEngine::RemoteContext::Ptr Plugin::CreateContext(const InferenceEngine::ParamMap& remote_properties) {
|
||||
auto cfg = remote_properties;
|
||||
auto it = cfg.find(CONFIG_KEY(AUTO_BATCH_DEVICE_CONFIG));
|
||||
if (it == cfg.end())
|
||||
it = cfg.find(ov::device::priorities.name());
|
||||
if (it == cfg.end())
|
||||
IE_THROW() << "Value for KEY_AUTO_BATCH_DEVICE_CONFIG is not set";
|
||||
ov::SoPtr<ov::IRemoteContext> Plugin::create_context(const ov::AnyMap& remote_properties) const {
|
||||
auto full_properties = remote_properties;
|
||||
OPENVINO_SUPPRESS_DEPRECATED_START
|
||||
auto it = full_properties.find(CONFIG_KEY(AUTO_BATCH_DEVICE_CONFIG));
|
||||
OPENVINO_SUPPRESS_DEPRECATED_END
|
||||
if (it == full_properties.end())
|
||||
it = full_properties.find(ov::device::priorities.name());
|
||||
if (it == full_properties.end())
|
||||
OPENVINO_THROW("Value for ov::device::priorities is not set");
|
||||
|
||||
auto val = it->second.as<std::string>();
|
||||
auto core = GetCore();
|
||||
if (!core)
|
||||
return nullptr;
|
||||
auto metaDevice = ParseMetaDevice(val, std::map<std::string, std::string>());
|
||||
cfg.erase(it);
|
||||
return core->CreateContext(metaDevice.device_name, cfg);
|
||||
auto metaDevice = parse_meta_device(val, ov::AnyMap());
|
||||
full_properties.erase(it);
|
||||
return get_core()->create_context(metaDevice.device_name, full_properties);
|
||||
}
|
||||
|
||||
InferenceEngine::Parameter Plugin::GetConfig(
|
||||
const std::string& name,
|
||||
const std::map<std::string, InferenceEngine::Parameter>& user_options) const {
|
||||
ov::Any Plugin::get_property(const std::string& name, const ov::AnyMap& arguments) const {
|
||||
OPENVINO_SUPPRESS_DEPRECATED_START
|
||||
if (supported_configKeys.end() != std::find(supported_configKeys.begin(), supported_configKeys.end(), name)) {
|
||||
auto it = _config.find(name);
|
||||
if (it == _config.end()) {
|
||||
IE_THROW() << "Value for " << name << " is not set";
|
||||
auto it = m_plugin_config.find(name);
|
||||
if (it == m_plugin_config.end()) {
|
||||
OPENVINO_THROW("The Value is not set for ", name);
|
||||
} else {
|
||||
return {it->second};
|
||||
}
|
||||
} else {
|
||||
IE_THROW() << "Unsupported config key: " << name;
|
||||
}
|
||||
}
|
||||
|
||||
void Plugin::CheckConfig(const std::map<std::string, std::string>& user_config) {
|
||||
for (auto&& kvp : user_config) {
|
||||
const auto name = kvp.first;
|
||||
const auto val = kvp.second;
|
||||
if (supported_configKeys.end() == std::find(supported_configKeys.begin(), supported_configKeys.end(), name))
|
||||
IE_THROW() << "Unsupported config key: " << name;
|
||||
if (name == CONFIG_KEY(AUTO_BATCH_DEVICE_CONFIG) || name == ov::device::priorities.name()) {
|
||||
ParseBatchDevice(val);
|
||||
} else if (name == CONFIG_KEY(AUTO_BATCH_TIMEOUT)) {
|
||||
try {
|
||||
auto t = std::stoi(val);
|
||||
if (t < 0)
|
||||
IE_THROW(ParameterMismatch);
|
||||
} catch (const std::exception&) {
|
||||
IE_THROW(ParameterMismatch)
|
||||
<< " Expecting unsigned int value for " << CONFIG_KEY(AUTO_BATCH_TIMEOUT) << " got " << val;
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
void Plugin::SetConfig(const std::map<std::string, std::string>& user_config) {
|
||||
CheckConfig(user_config);
|
||||
for (auto&& kvp : user_config) {
|
||||
_config[kvp.first] = kvp.second;
|
||||
}
|
||||
}
|
||||
|
||||
static const InferenceEngine::Version version = {{2, 1}, CI_BUILD_NUMBER, "AutoBatchPlugin"};
|
||||
IE_DEFINE_PLUGIN_CREATE_FUNCTION(Plugin, version)
|
||||
|
||||
Plugin::Plugin() {
|
||||
_pluginName = "BATCH";
|
||||
_config[CONFIG_KEY(AUTO_BATCH_TIMEOUT)] = "1000"; // default value, in ms
|
||||
}
|
||||
|
||||
InferenceEngine::Parameter Plugin::GetMetric(
|
||||
const std::string& name,
|
||||
const std::map<std::string, InferenceEngine::Parameter>& user_options) const {
|
||||
if (name == METRIC_KEY(SUPPORTED_METRICS)) {
|
||||
std::vector<std::string> metrics;
|
||||
metrics.push_back(METRIC_KEY(SUPPORTED_METRICS));
|
||||
metrics.push_back(METRIC_KEY(FULL_DEVICE_NAME));
|
||||
metrics.push_back(METRIC_KEY(SUPPORTED_CONFIG_KEYS));
|
||||
IE_SET_METRIC_RETURN(SUPPORTED_METRICS, metrics);
|
||||
} else if (name == METRIC_KEY(SUPPORTED_METRICS)) {
|
||||
return std::vector<std::string>{METRIC_KEY(SUPPORTED_METRICS),
|
||||
ov::device::full_name.name(),
|
||||
METRIC_KEY(SUPPORTED_CONFIG_KEYS)};
|
||||
} else if (name == ov::supported_properties.name()) {
|
||||
return std::vector<ov::PropertyName>{
|
||||
ov::PropertyName{ov::supported_properties.name(), ov::PropertyMutability::RO},
|
||||
ov::PropertyName{ov::device::full_name.name(), ov::PropertyMutability::RO}};
|
||||
} else if (name == ov::internal::supported_properties.name()) {
|
||||
return decltype(ov::internal::supported_properties)::value_type{};
|
||||
} else if (name == METRIC_KEY(FULL_DEVICE_NAME)) {
|
||||
IE_SET_METRIC_RETURN(FULL_DEVICE_NAME, _pluginName);
|
||||
} else if (name == ov::device::full_name.name()) {
|
||||
return get_device_name();
|
||||
} else if (name == METRIC_KEY(SUPPORTED_CONFIG_KEYS)) {
|
||||
IE_SET_METRIC_RETURN(SUPPORTED_CONFIG_KEYS, supported_configKeys);
|
||||
return supported_configKeys;
|
||||
} else {
|
||||
IE_THROW(NotFound) << "Unsupported metric key " << name;
|
||||
OPENVINO_THROW("Unsupported property: ", name);
|
||||
}
|
||||
OPENVINO_SUPPRESS_DEPRECATED_END
|
||||
}
|
||||
|
||||
void Plugin::set_property(const ov::AnyMap& properties) {
|
||||
for (auto&& c : properties) {
|
||||
const auto& name = c.first;
|
||||
const auto& val = c.second;
|
||||
if (supported_configKeys.end() == std::find(supported_configKeys.begin(), supported_configKeys.end(), name))
|
||||
OPENVINO_THROW("Unsupported config key: ", name);
|
||||
OPENVINO_SUPPRESS_DEPRECATED_START
|
||||
if (name == CONFIG_KEY(AUTO_BATCH_DEVICE_CONFIG) || name == ov::device::priorities.name()) {
|
||||
parse_batch_device(val.as<std::string>());
|
||||
} else if (name == ov::auto_batch_timeout.name()) {
|
||||
try {
|
||||
auto t = val.as<uint32_t>();
|
||||
if (t < 0)
|
||||
OPENVINO_THROW("The value for ", ov::auto_batch_timeout.name(), " should > 0, which is ", t);
|
||||
} catch (const std::exception&) {
|
||||
OPENVINO_THROW(" Expecting unsigned int value for ",
|
||||
ov::auto_batch_timeout.name(),
|
||||
" got ",
|
||||
val.as<uint32_t>());
|
||||
}
|
||||
}
|
||||
OPENVINO_SUPPRESS_DEPRECATED_END
|
||||
m_plugin_config[name] = val;
|
||||
}
|
||||
}
|
||||
|
||||
InferenceEngine::IExecutableNetworkInternal::Ptr Plugin::LoadExeNetworkImpl(
|
||||
const InferenceEngine::CNNNetwork& network,
|
||||
const std::map<std::string, std::string>& user_config) {
|
||||
return LoadNetworkImpl(network, nullptr, user_config);
|
||||
static const ov::Version version = {CI_BUILD_NUMBER, "openvino_auto_batch_plugin"};
|
||||
OV_DEFINE_PLUGIN_CREATE_FUNCTION(Plugin, version)
|
||||
|
||||
Plugin::Plugin() {
|
||||
set_device_name("BATCH");
|
||||
m_plugin_config.insert(ov::auto_batch_timeout(1000)); // default value (ms)
|
||||
}
|
||||
|
||||
InferenceEngine::IExecutableNetworkInternal::Ptr Plugin::LoadNetworkImpl(
|
||||
const InferenceEngine::CNNNetwork& network,
|
||||
const std::shared_ptr<InferenceEngine::RemoteContext> ctx,
|
||||
const std::map<std::string, std::string>& user_config) {
|
||||
auto core = GetCore();
|
||||
std::shared_ptr<ov::ICompiledModel> Plugin::compile_model(const std::shared_ptr<const ov::Model>& model,
|
||||
const ov::AnyMap& properties) const {
|
||||
return compile_model(model, properties, {});
|
||||
}
|
||||
|
||||
std::shared_ptr<ov::ICompiledModel> Plugin::compile_model(const std::shared_ptr<const ov::Model>& model,
|
||||
const ov::AnyMap& properties,
|
||||
const ov::SoPtr<ov::IRemoteContext>& context) const {
|
||||
auto core = get_core();
|
||||
if (core == nullptr) {
|
||||
IE_THROW() << "Please, work with Auto-Batching device via InferencEngine::Core object";
|
||||
OPENVINO_THROW("Please, work with Auto-Batching device via InferencEngine::Core object");
|
||||
}
|
||||
auto fullConfig = mergeConfigs(_config, user_config);
|
||||
auto device_batch = fullConfig.find(CONFIG_KEY(AUTO_BATCH_DEVICE_CONFIG));
|
||||
if (device_batch == fullConfig.end())
|
||||
device_batch = fullConfig.find(ov::device::priorities.name());
|
||||
if (device_batch == fullConfig.end()) {
|
||||
IE_THROW() << "KEY_AUTO_BATCH key is not set for BATCH device";
|
||||
|
||||
// merge configs from func properties and m_plugin_config
|
||||
auto full_properties = merge_properties(m_plugin_config, properties);
|
||||
OPENVINO_SUPPRESS_DEPRECATED_START
|
||||
auto device_batch = full_properties.find(CONFIG_KEY(AUTO_BATCH_DEVICE_CONFIG));
|
||||
if (device_batch == full_properties.end())
|
||||
device_batch = full_properties.find(ov::device::priorities.name());
|
||||
if (device_batch == full_properties.end()) {
|
||||
OPENVINO_THROW("ov::device::priorities key for AUTO NATCH is not set for BATCH device");
|
||||
}
|
||||
auto metaDevice = ParseMetaDevice(device_batch->second, user_config);
|
||||
const auto& deviceName = metaDevice.device_name;
|
||||
const auto& deviceConfig = metaDevice.config;
|
||||
auto deviceConfigNoAutoBatch = deviceConfig;
|
||||
OPENVINO_SUPPRESS_DEPRECATED_END
|
||||
auto meta_device = parse_meta_device(device_batch->second.as<std::string>(), properties);
|
||||
|
||||
const auto& device_name = meta_device.device_name;
|
||||
const auto& device_config = meta_device.device_config;
|
||||
auto device_config_no_auto_batch = device_config;
|
||||
// avoid recursive auto-batching
|
||||
deviceConfigNoAutoBatch[CONFIG_KEY(ALLOW_AUTO_BATCHING)] = CONFIG_VALUE(NO);
|
||||
device_config_no_auto_batch[ov::hint::allow_auto_batching.name()] = false;
|
||||
|
||||
std::set<std::string> batched_inputs;
|
||||
std::set<std::string> batched_outputs;
|
||||
// check that the auto-batching is applicable in general
|
||||
try {
|
||||
// if applicable, the Auto-Batching is implicitly enabled via the performance hints
|
||||
const auto tput = CONFIG_VALUE(THROUGHPUT);
|
||||
const bool bTputInPlg = core->GetConfig(deviceName, CONFIG_KEY(PERFORMANCE_HINT)).as<std::string>() == tput;
|
||||
const auto& mode = deviceConfig.find(CONFIG_KEY(PERFORMANCE_HINT));
|
||||
const bool bTputInLoadCfg = (mode != deviceConfig.end() && mode->second == tput);
|
||||
const bool enable_tput_plugin =
|
||||
core->get_property(device_name, ov::hint::performance_mode) == ov::hint::PerformanceMode::THROUGHPUT;
|
||||
const auto& performance_mode = device_config.find(ov::hint::performance_mode.name());
|
||||
const bool enable_tput_cfg = (performance_mode != device_config.end() &&
|
||||
performance_mode->second == ov::hint::PerformanceMode::THROUGHPUT);
|
||||
// if the auto-batching is enabled implicitly, check the dims carefully, to avoid outstanding failures
|
||||
const bool check_dims = (bTputInPlg || bTputInLoadCfg);
|
||||
InferenceEngine::CNNNetwork clonedNetwork(InferenceEngine::details::cloneNetwork(network));
|
||||
auto function = clonedNetwork.getFunction();
|
||||
const bool check_dims = (enable_tput_plugin || enable_tput_cfg);
|
||||
// find the batch dim
|
||||
ov::pass::Manager m;
|
||||
m.register_pass<ov::pass::InitNodeInfo>();
|
||||
m.register_pass<ov::pass::FindBatch>(false, check_dims);
|
||||
m.run_passes(function);
|
||||
auto cloned_model = model->clone();
|
||||
ov::pass::Manager pass_manager;
|
||||
pass_manager.register_pass<ov::pass::InitNodeInfo>();
|
||||
pass_manager.register_pass<ov::pass::FindBatch>(false, check_dims);
|
||||
pass_manager.run_passes(cloned_model);
|
||||
// do not reshape/re-batch originally batched networks and when there are no inputs with the N* layouts
|
||||
// input(s) should have the batch dim as the first dim (current limitation of the auto-batching impl)
|
||||
const auto& params = function->get_parameters();
|
||||
const auto& params = cloned_model->get_parameters();
|
||||
for (size_t input_id = 0; input_id < params.size(); input_id++) {
|
||||
const auto& input = params[input_id];
|
||||
const auto& shape = input->get_partial_shape();
|
||||
// currently no plugin support batched execution for dynamic networks
|
||||
if (shape.is_dynamic())
|
||||
IE_THROW(NotImplemented) << "Auto-batching does not support dynamic networks!";
|
||||
OPENVINO_THROW("Auto-batching does not support dynamic networks!");
|
||||
// check the batch dim: either 0th (and the original batch size of 1) or none
|
||||
if (shape.size() && ov::DimensionTracker::get_label(shape[0])) {
|
||||
const auto& static_shape = input->get_shape();
|
||||
if (static_shape[0] != 1)
|
||||
IE_THROW(NotImplemented) << "Auto-batching does not reshape/re-batch originally batched networks!";
|
||||
OPENVINO_THROW("Auto-batching does not reshape/re-batch originally batched networks!");
|
||||
batched_inputs.insert(
|
||||
ov::op::util::get_ie_output_name(params[input_id]->output(0))); // batched dim for the input
|
||||
} else {
|
||||
// if the 0-th dim is not for the batch, then we support only the case when NONE dimension is batch
|
||||
for (size_t s = 1; s < shape.size(); s++)
|
||||
if (ov::DimensionTracker::get_label(shape[s]))
|
||||
IE_THROW(NotImplemented)
|
||||
<< "Auto-batching operates only networks with inputs/outputs batched by 0th dimension";
|
||||
OPENVINO_THROW(
|
||||
"Auto-batching operates only networks with inputs/outputs batched by 0th dimension");
|
||||
}
|
||||
}
|
||||
const auto& results = function->get_results();
|
||||
const auto& results = cloned_model->get_results();
|
||||
for (size_t output_id = 0; output_id < results.size(); output_id++) {
|
||||
const auto& output = results[output_id];
|
||||
const auto& shape = output->get_output_partial_shape(0);
|
||||
if (shape.is_dynamic())
|
||||
IE_THROW(NotImplemented) << "Auto-batching does not support dynamic networks!";
|
||||
OPENVINO_THROW("Auto-batching does not support dynamic networks!");
|
||||
// check the batch dim: either 0th (and the original batch size of 1) or none
|
||||
if (shape.size() && ov::DimensionTracker::get_label(shape[0])) {
|
||||
if (shape[0] != 1)
|
||||
IE_THROW(NotImplemented) << "Auto-batching does not reshape/re-batch originally batched networks!";
|
||||
OPENVINO_THROW("Auto-batching does not reshape/re-batch originally batched networks!");
|
||||
const auto& node = output->input_value(0);
|
||||
batched_outputs.insert(
|
||||
ov::op::util::get_ie_output_name(ov::Output<const ov::Node>(node.get_node(), node.get_index())));
|
||||
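The batched-input detection above relies on the FindBatch pass labelling the batch dimension of the cloned model. A condensed sketch of that detection step, mirroring the calls in this hunk (same internal headers, error handling omitted, assumes the model has at least one parameter):

#include <openvino/core/dimension_tracker.hpp>
#include <openvino/core/model.hpp>
#include <openvino/pass/manager.hpp>
#include "transformations/common_optimizations/dimension_tracking.hpp"
#include "transformations/init_node_info.hpp"

bool first_input_is_batched(const std::shared_ptr<ov::Model>& model, bool check_dims) {
    auto cloned = model->clone();
    ov::pass::Manager manager;
    manager.register_pass<ov::pass::InitNodeInfo>();
    manager.register_pass<ov::pass::FindBatch>(false, check_dims);  // labels the batch dimension, if any
    manager.run_passes(cloned);
    const auto& shape = cloned->get_parameters().front()->get_partial_shape();
    // The 0th dimension carries a label only when FindBatch identified it as the batch dim.
    return shape.size() && ov::DimensionTracker::get_label(shape[0]) != 0;
}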
@ -247,116 +236,177 @@ InferenceEngine::IExecutableNetworkInternal::Ptr Plugin::LoadNetworkImpl(
|
||||
// if the 0-th dim is not for the batch, then we support only the case when NONE dimension is batch
|
||||
for (size_t s = 1; s < shape.size(); s++)
|
||||
if (ov::DimensionTracker::get_label(shape[s]))
|
||||
IE_THROW(NotImplemented)
|
||||
<< "Auto-batching operates only networks with outputs batched by 0th dimension";
|
||||
OPENVINO_THROW("Auto-batching operates only networks with outputs batched by 0th dimension");
|
||||
}
|
||||
}
|
||||
if (!batched_inputs.size() || !batched_outputs.size())
|
||||
IE_THROW(NotImplemented)
|
||||
<< "Auto-batching supports only networks with inputs/outputs featuring batched dim!";
|
||||
} catch (const InferenceEngine::Exception&) {
|
||||
metaDevice.batch_for_device = 1;
|
||||
OPENVINO_THROW("Auto-batching supports only networks with inputs/outputs featuring batched dim!");
|
||||
} catch (const ov::Exception&) {
|
||||
meta_device.device_batch_size = 1;
|
||||
}
|
||||
|
||||
if (!metaDevice.batch_for_device) {
|
||||
unsigned int requests = 0;
|
||||
if (!meta_device.device_batch_size) {
|
||||
// batch size is not set explicitly via device name e.g. BATCH:GPU(4)
|
||||
// let's query the optimal batch size
|
||||
std::map<std::string, InferenceEngine::Parameter> options;
|
||||
options["MODEL_PTR"] = std::const_pointer_cast<ngraph::Function>(network.getFunction());
|
||||
auto optBatchSize = core->GetMetric(deviceName, METRIC_KEY(OPTIMAL_BATCH_SIZE), options).as<unsigned int>();
|
||||
auto res = core->GetConfig(deviceName, CONFIG_KEY(PERFORMANCE_HINT_NUM_REQUESTS)).as<std::string>();
|
||||
requests = InferenceEngine::PerfHintsConfig::CheckPerformanceHintRequestValue(res);
|
||||
const auto& reqs = user_config.find(CONFIG_KEY(PERFORMANCE_HINT_NUM_REQUESTS));
|
||||
if (reqs != user_config.end())
|
||||
requests = static_cast<unsigned int>(
|
||||
InferenceEngine::PerfHintsConfig::CheckPerformanceHintRequestValue(reqs->second));
|
||||
// auto cloned_model = model->clone();
|
||||
ov::AnyMap options = {ov::hint::model(std::const_pointer_cast<ov::Model>(model))};
|
||||
unsigned int opt_batch_size = core->get_property(device_name, ov::optimal_batch_size, options);
|
||||
auto requests = core->get_property(device_name, ov::hint::num_requests);
|
||||
const auto& reqs = properties.find(ov::hint::num_requests.name());
|
||||
if (reqs != properties.end())
|
||||
requests = reqs->second.as<unsigned int>();
|
||||
if (requests)
|
||||
optBatchSize = std::max(1u, std::min(requests, optBatchSize));
|
||||
if (optBatchSize > 2) // batching is usually in-efficient for batch<4 (as batch1 kernels are heavily optimized)
|
||||
metaDevice.batch_for_device = optBatchSize;
|
||||
opt_batch_size = std::max(1u, std::min(requests, opt_batch_size));
|
||||
if (opt_batch_size >
|
||||
2) // batching is usually in-efficient for batch<4 (as batch1 kernels are heavily optimized)
|
||||
meta_device.device_batch_size = opt_batch_size;
|
||||
else
|
||||
metaDevice.batch_for_device = 1;
|
||||
meta_device.device_batch_size = 1;
|
||||
}
|
||||
|
||||
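Editor's note: the batch size chosen above is the device-reported optimal batch size, clamped by the application's ov::hint::num_requests and dropped back to 1 when batching would not pay off. A minimal standalone sketch of that decision (plain C++, no OpenVINO dependency; the threshold of 2 mirrors the comment in the diff):

// Example (not part of the PR): mirrors the clamping logic above.
// opt_batch_size comes from ov::optimal_batch_size, requests from ov::hint::num_requests (0 == unset).
#include <algorithm>
#include <cstdint>
#include <iostream>

uint32_t choose_device_batch(uint32_t opt_batch_size, uint32_t requests) {
    if (requests)  // never create a batch larger than the number of requests the app will run
        opt_batch_size = std::max(1u, std::min(requests, opt_batch_size));
    // batch 2 and below is usually not worth it: batch-1 kernels are heavily optimized
    return opt_batch_size > 2 ? opt_batch_size : 1;
}

int main() {
    std::cout << choose_device_batch(32, 0) << "\n";  // 32
    std::cout << choose_device_batch(32, 4) << "\n";  // 4
    std::cout << choose_device_batch(32, 2) << "\n";  // 1 (too small to batch)
}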
auto report_footprint = [](std::shared_ptr<InferenceEngine::ICore> pCore, std::string device) -> size_t {
auto report_footprint = [](std::shared_ptr<ICore> pCore, std::string device) -> size_t {
size_t footprint = 0;
// TODO: use the per-network metric (22.2) rather than plugin-level
auto stats =
pCore->GetMetric(device, ov::intel_gpu::memory_statistics.name()).as<std::map<std::string, uint64_t>>();
// TODO: use the per-model metric (22.2) rather than plugin-level
auto stats = pCore->get_property(device, ov::intel_gpu::memory_statistics);
for (const auto& s : stats)
footprint += s.second;
return footprint;
};

size_t batch1_footprint = 0;
if (deviceName.find("GPU") != std::string::npos)
batch1_footprint = report_footprint(core, deviceName);
auto executableNetworkWithoutBatch = ctx ? core->LoadNetwork(network, ctx, deviceConfigNoAutoBatch)
: core->LoadNetwork(network, deviceName, deviceConfigNoAutoBatch);
if (deviceName.find("GPU") != std::string::npos) {
batch1_footprint = report_footprint(core, deviceName) - batch1_footprint;
if (device_name.find("GPU") != std::string::npos)
batch1_footprint = report_footprint(core, device_name);
auto compiled_model_without_batch = context ? core->compile_model(model, context, device_config_no_auto_batch)
: core->compile_model(model, device_name, device_config_no_auto_batch);
if (device_name.find("GPU") != std::string::npos) {
batch1_footprint = report_footprint(core, device_name) - batch1_footprint;
if (batch1_footprint) {
const auto total_mem =
GetCore()->GetMetric(deviceName, GPU_METRIC_KEY(DEVICE_TOTAL_MEM_SIZE)).as<uint64_t>();
const auto total_mem = core->get_property(device_name, ov::intel_gpu::device_total_mem_size);
const int estimated_batch = static_cast<int>((total_mem - batch1_footprint) / batch1_footprint);
int closest = static_cast<int>(pow(2, floor(std::log(estimated_batch) / std::log(2))));
closest = std::max(1, closest);
metaDevice.batch_for_device = std::min(metaDevice.batch_for_device, closest);
meta_device.device_batch_size = std::min(static_cast<int>(meta_device.device_batch_size), closest);
}
}
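Editor's note: for GPU targets the code above measures the memory footprint of a batch-1 compilation and derives an upper bound on the batch from the remaining device memory, rounded down to a power of two. A standalone sketch of that arithmetic (illustrative only, not the plugin's API):

// Example (not part of the PR): power-of-two estimate of how many batch-1 footprints still fit in device memory.
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <iostream>

int estimate_max_batch(uint64_t total_mem, uint64_t batch1_footprint) {
    const int estimated = static_cast<int>((total_mem - batch1_footprint) / batch1_footprint);
    const int closest = static_cast<int>(std::pow(2, std::floor(std::log2(std::max(estimated, 1)))));
    return std::max(1, closest);
}

int main() {
    // 8 GiB of device memory, ~300 MiB per batch-1 compilation: estimate 26, rounded down to 16
    std::cout << estimate_max_batch(8ull << 30, 300ull << 20) << "\n";
}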
// auto-batch settings
std::unordered_map<std::string, InferenceEngine::Parameter> networkConfig;
for (const auto& c : fullConfig) {
ov::AnyMap compiled_model_config;
for (const auto& c : full_properties) {
if (supported_configKeys.end() != std::find(supported_configKeys.begin(), supported_configKeys.end(), c.first))
networkConfig.insert(c);
compiled_model_config.insert(c);
}

InferenceEngine::SoExecutableNetworkInternal executableNetworkWithBatch;
if (metaDevice.batch_for_device > 1 && batched_inputs.size()) {
ov::SoPtr<ov::ICompiledModel> compiled_model_with_batch;
auto reshaped = model->clone();
if (meta_device.device_batch_size > 1 && batched_inputs.size()) {
try {
InferenceEngine::CNNNetwork reshaped(InferenceEngine::details::cloneNetwork(network));
InferenceEngine::ICNNNetwork::InputShapes shapes = reshaped.getInputShapes();
for (const auto& input : batched_inputs)
shapes[input][0] = metaDevice.batch_for_device;
reshaped.reshape(shapes);
executableNetworkWithBatch = ctx ? core->LoadNetwork(reshaped, ctx, deviceConfigNoAutoBatch)
: core->LoadNetwork(reshaped, deviceName, deviceConfigNoAutoBatch);
} catch (const InferenceEngine::Exception&) {
metaDevice.batch_for_device = 1;
auto inputs = reshaped->inputs();
std::map<ov::Output<ov::Node>, ov::PartialShape> partial_shapes;
for (auto& input : inputs) {
auto input_shape = input.get_shape();
if (batched_inputs.find(ov::op::util::get_ie_output_name(input)) != batched_inputs.end()) {
input_shape[0] = meta_device.device_batch_size;
}
partial_shapes.insert({input, ov::PartialShape(input_shape)});
}

reshaped->reshape(partial_shapes);

OPENVINO_SUPPRESS_DEPRECATED_START
for (auto&& input : reshaped->inputs()) {
auto& rt_info = input.get_rt_info();
auto it = rt_info.find("ie_legacy_td");
if (it != rt_info.end()) {
auto td = it->second.as<InferenceEngine::TensorDesc>();
rt_info["ie_legacy_td"] =
InferenceEngine::TensorDesc(td.getPrecision(), input.get_shape(), td.getLayout());
}
}
for (auto&& result : reshaped->get_results()) {
auto output = result->input_value(0);
auto& rt_info = output.get_rt_info();
auto it = rt_info.find("ie_legacy_td");
if (it != rt_info.end()) {
auto td = it->second.as<InferenceEngine::TensorDesc>();
rt_info["ie_legacy_td"] =
InferenceEngine::TensorDesc(td.getPrecision(), output.get_shape(), td.getLayout());
}
}
OPENVINO_SUPPRESS_DEPRECATED_END

compiled_model_with_batch = context
? core->compile_model(reshaped, context, device_config_no_auto_batch)
: core->compile_model(reshaped, device_name, device_config_no_auto_batch);
} catch (const ov::Exception&) {
meta_device.device_batch_size = 1;
}
}
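Editor's note: only the dimensions recorded in batched_inputs are bumped to the chosen device batch; all other shapes are preserved, and the legacy "ie_legacy_td" rt_info is refreshed so the reshaped model keeps a consistent TensorDesc. From an application point of view, the same reshape can be done before compilation with the map-based ov::Model::reshape overload used above; a minimal sketch (the model path is hypothetical):

// Example (not part of the PR): reshape the 0th (batch) dimension of every input to 4.
#include <map>
#include <openvino/openvino.hpp>

int main() {
    ov::Core core;
    auto model = core.read_model("model.xml");  // hypothetical path

    std::map<ov::Output<ov::Node>, ov::PartialShape> new_shapes;
    for (auto& input : model->inputs()) {
        ov::PartialShape shape = input.get_partial_shape();
        shape[0] = 4;  // only the batch dimension changes
        new_shapes.emplace(input, shape);
    }
    model->reshape(new_shapes);

    auto compiled = core.compile_model(model, "CPU");
    (void)compiled;
}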
return std::make_shared<CompiledModel>(executableNetworkWithBatch,
executableNetworkWithoutBatch,
metaDevice,
networkConfig,
ov::SoPtr<ov::IRemoteContext> device_context;
if (!context) {
OPENVINO_SUPPRESS_DEPRECATED_START
try {
device_context = compiled_model_without_batch->get_context();
if (!device_context._so)
device_context._so = compiled_model_without_batch._so;
} catch (const ov::NotImplemented&) {
} catch (const InferenceEngine::NotImplemented&) {
}
OPENVINO_SUPPRESS_DEPRECATED_END
} else {
device_context = context;
}

return std::make_shared<CompiledModel>(model->clone(),
shared_from_this(),
compiled_model_config,
meta_device,
batched_inputs,
batched_outputs);
batched_outputs,
compiled_model_with_batch,
compiled_model_without_batch,
device_context);
}
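Editor's note: the `_so` assignments matter. ov::SoPtr pairs an interface pointer with the shared-object handle of the plugin that created it, so the hardware plugin's library cannot be unloaded while the context (or a tensor) obtained from it is still referenced. A reduced sketch of that idea (hypothetical names, not the actual ov::SoPtr definition):

// Example (not part of the PR): keep the shared library handle alive
// for as long as any object created by that library is referenced.
#include <memory>

template <typename T>
struct SoHolder {
    std::shared_ptr<T> ptr;    // object implemented inside the plugin library
    std::shared_ptr<void> so;  // opaque handle that keeps the .so/.dll loaded

    T* operator->() const { return ptr.get(); }
};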
InferenceEngine::IExecutableNetworkInternal::Ptr Plugin::LoadExeNetworkImpl(
const InferenceEngine::CNNNetwork& network,
const std::shared_ptr<InferenceEngine::RemoteContext>& context,
const std::map<std::string, std::string>& user_config) {
return LoadNetworkImpl(network, context, user_config);
}

InferenceEngine::QueryNetworkResult Plugin::QueryNetwork(const InferenceEngine::CNNNetwork& network,
const std::map<std::string, std::string>& user_config) const {
auto core = GetCore();
if (!core)
return InferenceEngine::QueryNetworkResult();
auto cfg = user_config;
ov::SupportedOpsMap Plugin::query_model(const std::shared_ptr<const ov::Model>& model,
const ov::AnyMap& properties) const {
OPENVINO_ASSERT(model, "OpenVINO Model is empty!");
OPENVINO_ASSERT(get_core(), "Core is missing!");
auto cfg = properties;
for (const auto& c : cfg) {
OPENVINO_SUPPRESS_DEPRECATED_START
if (c.first == CONFIG_KEY(AUTO_BATCH_DEVICE_CONFIG) || c.first == ov::device::priorities.name()) {
auto val = c.second;
cfg.erase(c.first);
auto metaDevice = ParseMetaDevice(val, cfg);
return core->QueryNetwork(network, metaDevice.device_name, cfg);
auto metaDevice = parse_meta_device(val.as<std::string>(), cfg);
return get_core()->query_model(model, metaDevice.device_name, cfg);
}
OPENVINO_SUPPRESS_DEPRECATED_END
}
IE_THROW() << "Value for KEY_AUTO_BATCH_DEVICE_CONFIG is not set";
OPENVINO_THROW("Value for ov::device::priorities for AUTO BATCH PLUGIN is not set");
}

ov::SoPtr<ov::IRemoteContext> Plugin::get_default_context(const ov::AnyMap& remote_properties) const {
OPENVINO_SUPPRESS_DEPRECATED_START
auto it = remote_properties.find(CONFIG_KEY(AUTO_BATCH_DEVICE_CONFIG));
OPENVINO_SUPPRESS_DEPRECATED_END
if (it == remote_properties.end())
it = remote_properties.find(ov::device::priorities.name());
if (it == remote_properties.end())
OPENVINO_THROW("Value for ov::device::priorities is not set");

auto val = it->second.as<std::string>();
auto metaDevice = parse_meta_device(val, ov::AnyMap());
return get_core()->get_default_context(metaDevice.device_name);
}

std::shared_ptr<ov::ICompiledModel> Plugin::import_model(std::istream& model, const ov::AnyMap& properties) const {
OPENVINO_NOT_IMPLEMENTED;
}

std::shared_ptr<ov::ICompiledModel> Plugin::import_model(std::istream& model,
const ov::SoPtr<ov::IRemoteContext>& context,
const ov::AnyMap& properties) const {
OPENVINO_NOT_IMPLEMENTED;
}
} // namespace autobatch_plugin
} // namespace ov
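Editor's note: for reference, this is roughly how an application exercises the paths above. The meta device string follows the BATCH:<device>(<size>) convention mentioned in the diff's comments; treat the snippet as a sketch, not text taken from the PR:

// Example (not part of the PR): compiling through the auto-batch meta device.
#include <openvino/openvino.hpp>

int main() {
    ov::Core core;
    auto model = core.read_model("model.xml");  // hypothetical path

    // Explicit batch size: the plugin skips the optimal-batch query and uses 4.
    auto explicit_batch = core.compile_model(model, "BATCH:GPU(4)");

    // No explicit size: the plugin asks the device for ov::optimal_batch_size
    // and clamps it by ov::hint::num_requests, as implemented above.
    auto auto_batch = core.compile_model(model, "BATCH:GPU", ov::hint::num_requests(8));

    (void)explicit_batch;
    (void)auto_batch;
}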
@@ -7,8 +7,9 @@

#include <map>

#include "cpp_interfaces/impl/ie_executable_network_thread_safe_default.hpp"
#include "cpp_interfaces/interface/ie_iplugin_internal.hpp"
#include "ie/ie_plugin_config.hpp"
#include "openvino/runtime/iplugin.hpp"
#include "openvino/runtime/properties.hpp"

#ifdef AUTOBATCH_UNITTEST
# define autobatch_plugin mock_autobatch_plugin
@@ -19,40 +20,39 @@ namespace autobatch_plugin {

struct DeviceInformation {
std::string device_name;
std::map<std::string, std::string> config;
int batch_for_device;
ov::AnyMap device_config;
uint32_t device_batch_size;
};

class Plugin : public InferenceEngine::IInferencePlugin {
class Plugin : public ov::IPlugin {
public:
Plugin();

virtual ~Plugin() = default;

InferenceEngine::IExecutableNetworkInternal::Ptr LoadExeNetworkImpl(
const InferenceEngine::CNNNetwork& network,
const std::map<std::string, std::string>& config) override;
std::shared_ptr<ov::ICompiledModel> compile_model(const std::shared_ptr<const ov::Model>& model,
const ov::AnyMap& properties) const override;

InferenceEngine::IExecutableNetworkInternal::Ptr LoadExeNetworkImpl(
const InferenceEngine::CNNNetwork& network,
const std::shared_ptr<InferenceEngine::RemoteContext>& context,
const std::map<std::string, std::string>& config) override;
std::shared_ptr<ov::ICompiledModel> compile_model(const std::shared_ptr<const ov::Model>& model,
const ov::AnyMap& properties,
const ov::SoPtr<ov::IRemoteContext>& context) const override;

void SetConfig(const std::map<std::string, std::string>& config) override;
void set_property(const ov::AnyMap& properties) override;

void CheckConfig(const std::map<std::string, std::string>& config);
ov::Any get_property(const std::string& name, const ov::AnyMap& arguments) const override;

InferenceEngine::Parameter GetConfig(
const std::string& name,
const std::map<std::string, InferenceEngine::Parameter>& options) const override;
ov::SupportedOpsMap query_model(const std::shared_ptr<const ov::Model>& model,
const ov::AnyMap& properties) const override;

InferenceEngine::QueryNetworkResult QueryNetwork(const InferenceEngine::CNNNetwork& network,
const std::map<std::string, std::string>& config) const override;
InferenceEngine::Parameter GetMetric(
const std::string& name,
const std::map<std::string, InferenceEngine::Parameter>& options) const override;
ov::SoPtr<ov::IRemoteContext> create_context(const ov::AnyMap& remote_properties) const override;

InferenceEngine::RemoteContext::Ptr CreateContext(const InferenceEngine::ParamMap&) override;
ov::SoPtr<ov::IRemoteContext> get_default_context(const ov::AnyMap& remote_properties) const override;

std::shared_ptr<ov::ICompiledModel> import_model(std::istream& model, const ov::AnyMap& properties) const override;

std::shared_ptr<ov::ICompiledModel> import_model(std::istream& model,
const ov::SoPtr<ov::IRemoteContext>& context,
const ov::AnyMap& properties) const override;

#ifdef AUTOBATCH_UNITTEST

@@ -61,15 +61,12 @@ public:

protected:
#endif
DeviceInformation ParseMetaDevice(const std::string& devicesBatchCfg,
const std::map<std::string, std::string>& config) const;
DeviceInformation parse_meta_device(const std::string& devices_batch_config, const ov::AnyMap& user_config) const;

static DeviceInformation ParseBatchDevice(const std::string& deviceWithBatch);
static DeviceInformation parse_batch_device(const std::string& device_with_batch);

InferenceEngine::IExecutableNetworkInternal::Ptr LoadNetworkImpl(
const InferenceEngine::CNNNetwork& network,
const std::shared_ptr<InferenceEngine::RemoteContext> context,
const std::map<std::string, std::string>& config);
private:
mutable ov::AnyMap m_plugin_config;
};
} // namespace autobatch_plugin
} // namespace ov
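Editor's note: parse_batch_device (formerly ParseBatchDevice) splits a "DEVICE(n)" string into the target device and the requested batch size. A standalone illustration of that kind of parsing (a hypothetical helper, not the plugin's implementation):

// Example (not part of the PR): "GPU(4)" -> {"GPU", 4}, "GPU" -> {"GPU", 0}.
#include <cstdint>
#include <iostream>
#include <stdexcept>
#include <string>

struct ParsedDevice {
    std::string name;
    uint32_t batch = 0;  // 0 means "let the plugin decide"
};

ParsedDevice parse_device_with_batch(const std::string& device_with_batch) {
    ParsedDevice result;
    const auto open = device_with_batch.find('(');
    if (open == std::string::npos) {
        result.name = device_with_batch;
        return result;
    }
    const auto close = device_with_batch.find(')', open);
    if (close == std::string::npos)
        throw std::runtime_error("Expected closing ')' in " + device_with_batch);
    result.name = device_with_batch.substr(0, open);
    result.batch = static_cast<uint32_t>(std::stoul(device_with_batch.substr(open + 1, close - open - 1)));
    return result;
}

int main() {
    const auto d = parse_device_with_batch("GPU(4)");
    std::cout << d.name << " " << d.batch << "\n";  // GPU 4
}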
@@ -5,324 +5,104 @@
///////////////////////////////////////////////////////////////////////////////////////////////////
#include "sync_infer_request.hpp"

#include "openvino/core/type/element_type_traits.hpp"
#include "openvino/runtime/make_tensor.hpp"
#include "transformations/utils/utils.hpp"

namespace ov {
namespace autobatch_plugin {

template <InferenceEngine::Precision::ePrecision precision>
InferenceEngine::Blob::Ptr create_shared_blob_on_top_of_batched_blob(InferenceEngine::Blob::Ptr batched_blob,
inline ov::SoPtr<ov::ITensor> create_shared_tensor_on_batched_tensor(ov::SoPtr<ov::ITensor> batched_tensor,
std::string name,
const std::set<std::string>& batched_names,
size_t batch_id,
size_t batch_num) {
typedef typename InferenceEngine::PrecisionTrait<precision>::value_type TYPE;
typedef typename std::add_pointer<TYPE>::type TYPEPTR;
auto ptr = batched_blob->buffer().as<TYPEPTR>();
auto sizePerBatch = batched_blob->size() / batch_num;
InferenceEngine::SizeVector dims = batched_blob->getTensorDesc().getDims();
auto ptr = static_cast<uint8_t*>(batched_tensor->data());
auto size_per_batch = batched_tensor->get_byte_size() / batch_num;
auto batched_shape = batched_tensor->get_shape();
// for performance reason (copy avoidance) current impl of the auto-batching supports only batching by 0th dim
if (batched_names.count(name)) {
dims[0] = 1;
return InferenceEngine::make_shared_blob<TYPE>({precision, dims, batched_blob->getTensorDesc().getLayout()},
ptr + sizePerBatch * batch_id,
sizePerBatch);
batched_shape[0] = 1;
return {ov::make_tensor(batched_tensor->get_element_type(), batched_shape, ptr + size_per_batch * batch_id),
batched_tensor._so};
} else {
// same blob for all requests (e.g. constants)
return InferenceEngine::make_shared_blob<TYPE>({precision, dims, batched_blob->getTensorDesc().getLayout()},
ptr);
return {ov::make_tensor(batched_tensor->get_element_type(), batched_shape, ptr), batched_tensor._so};
}
}
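Editor's note: create_shared_tensor_on_batched_tensor wraps each request's slice of the batched buffer without copying. For an input batched along dim 0, request batch_id simply sees the bytes starting at batch_id * (byte_size / batch_num). A dependency-free sketch of that arithmetic:

// Example (not part of the PR): zero-copy per-request views over one batched buffer.
#include <cstddef>
#include <iostream>
#include <vector>

int main() {
    const size_t batch_num = 4;
    const size_t elems_per_batch = 6;  // e.g. a 1x2x3 tensor per request
    std::vector<float> batched(batch_num * elems_per_batch, 0.f);

    // Each "request" writes through a pointer into the shared buffer, no copy involved.
    for (size_t batch_id = 0; batch_id < batch_num; ++batch_id) {
        float* view = batched.data() + batch_id * elems_per_batch;
        for (size_t i = 0; i < elems_per_batch; ++i)
            view[i] = static_cast<float>(batch_id);  // request batch_id fills its own slice
    }

    std::cout << batched[0] << " " << batched[elems_per_batch] << "\n";  // 0 1
}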
SyncInferRequest::SyncInferRequest(const std::vector<std::shared_ptr<const ov::Node>>& inputs,
const std::vector<std::shared_ptr<const ov::Node>>& outputs,
CompiledModel::WorkerInferRequest& workerRequest,
int batch_id,
int num_batch,
const std::set<std::string>& batchedInputs,
const std::set<std::string>& batchedOutputs)
: IInferRequestInternal(inputs, outputs),
m_batched_request_wrapper(workerRequest),
SyncInferRequest::SyncInferRequest(
const std::shared_ptr<const ov::autobatch_plugin::CompiledModel>& compiled_model,
const std::shared_ptr<ov::autobatch_plugin::CompiledModel::WorkerInferRequest>& worker_request,
int batch_id,
int num_batch,
const std::set<std::string>& batched_inputs,
const std::set<std::string>& batched_outputs)
: ov::ISyncInferRequest(compiled_model),
m_batched_request_wrapper(worker_request),
m_batch_id(batch_id),
m_batch_size(num_batch) {
ShareBlobsWithBatchRequest(batchedInputs, batchedOutputs);
share_tensors_with_batched_req(batched_inputs, batched_outputs);
}

SyncInferRequest::SyncInferRequest(const InferenceEngine::InputsDataMap& networkInputs,
const InferenceEngine::OutputsDataMap& networkOutputs,
CompiledModel::WorkerInferRequest& workerRequest,
int batch_id,
int num_batch,
const std::set<std::string>& batchedInputs,
const std::set<std::string>& batchedOutputs)
: IInferRequestInternal(networkInputs, networkOutputs),
m_batched_request_wrapper(workerRequest),
m_batch_id(batch_id),
m_batch_size(num_batch) {
ShareBlobsWithBatchRequest(batchedInputs, batchedOutputs);
void SyncInferRequest::share_tensors_with_batched_req(const std::set<std::string>& batched_inputs,
const std::set<std::string>& batched_outputs) {
for (const auto& it : get_inputs()) {
auto name = ov::op::util::get_ie_output_name(it);
ov::SoPtr<ov::ITensor> res;
auto batched_tensor = m_batched_request_wrapper->_infer_request_batched->get_tensor(it);
if (!batched_tensor._so)
batched_tensor._so = m_batched_request_wrapper->_infer_request_batched._so;
res = create_shared_tensor_on_batched_tensor(batched_tensor, name, batched_inputs, m_batch_id, m_batch_size);
set_tensor(it, res);
}

for (const auto& it : get_outputs()) {
auto name = ov::op::util::get_ie_output_name(it.get_node_shared_ptr()->input_value(0));
ov::SoPtr<ov::ITensor> res;
auto batched_tensor = m_batched_request_wrapper->_infer_request_batched->get_tensor(it);
if (!batched_tensor._so)
batched_tensor._so = m_batched_request_wrapper->_infer_request_batched._so;
res = create_shared_tensor_on_batched_tensor(batched_tensor, name, batched_outputs, m_batch_id, m_batch_size);
set_tensor(it, res);
}
}
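Editor's note: because every per-request tensor created here aliases the single batched allocation, concurrent requests are only safe while their batch_id slices stay disjoint; names not found in the batched sets (constants) deliberately share the whole buffer. A small check that captures the invariant (illustrative):

// Example (not part of the PR): slices [id * size_per_batch, (id + 1) * size_per_batch) must not overlap.
#include <cassert>
#include <cstddef>

bool slices_disjoint(size_t size_per_batch, size_t id_a, size_t id_b) {
    const size_t a_begin = id_a * size_per_batch, a_end = a_begin + size_per_batch;
    const size_t b_begin = id_b * size_per_batch, b_end = b_begin + size_per_batch;
    return a_end <= b_begin || b_end <= a_begin;
}

int main() {
    assert(slices_disjoint(1024, 0, 1));   // different requests, different slices
    assert(!slices_disjoint(1024, 2, 2));  // same request obviously overlaps itself
    return 0;
}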
void SyncInferRequest::ShareBlobsWithBatchRequest(const std::set<std::string>& batchedInputs,
|
||||
const std::set<std::string>& batchedOutputs) {
|
||||
// Allocate all input blobs
|
||||
for (const auto& it : _networkInputs) {
|
||||
auto blob = m_batched_request_wrapper._inferRequestBatched->GetBlob(it.first);
|
||||
InferenceEngine::Blob::Ptr res;
|
||||
switch (it.second->getTensorDesc().getPrecision()) {
|
||||
case InferenceEngine::Precision::FP32:
|
||||
res = create_shared_blob_on_top_of_batched_blob<InferenceEngine::Precision::FP32>(
|
||||
m_batched_request_wrapper._inferRequestBatched->GetBlob(it.first),
|
||||
it.first,
|
||||
batchedInputs,
|
||||
m_batch_id,
|
||||
m_batch_size);
|
||||
break;
|
||||
case InferenceEngine::Precision::I32:
|
||||
res = create_shared_blob_on_top_of_batched_blob<InferenceEngine::Precision::I32>(
|
||||
m_batched_request_wrapper._inferRequestBatched->GetBlob(it.first),
|
||||
it.first,
|
||||
batchedInputs,
|
||||
m_batch_id,
|
||||
m_batch_size);
|
||||
break;
|
||||
case InferenceEngine::Precision::I8:
|
||||
res = create_shared_blob_on_top_of_batched_blob<InferenceEngine::Precision::I8>(
|
||||
m_batched_request_wrapper._inferRequestBatched->GetBlob(it.first),
|
||||
it.first,
|
||||
batchedInputs,
|
||||
m_batch_id,
|
||||
m_batch_size);
|
||||
break;
|
||||
case InferenceEngine::Precision::I16:
|
||||
res = create_shared_blob_on_top_of_batched_blob<InferenceEngine::Precision::I16>(
|
||||
m_batched_request_wrapper._inferRequestBatched->GetBlob(it.first),
|
||||
it.first,
|
||||
batchedInputs,
|
||||
m_batch_id,
|
||||
m_batch_size);
|
||||
break;
|
||||
case InferenceEngine::Precision::U16:
|
||||
res = create_shared_blob_on_top_of_batched_blob<InferenceEngine::Precision::U16>(
|
||||
m_batched_request_wrapper._inferRequestBatched->GetBlob(it.first),
|
||||
it.first,
|
||||
batchedInputs,
|
||||
m_batch_id,
|
||||
m_batch_size);
|
||||
break;
|
||||
case InferenceEngine::Precision::U32:
|
||||
res = create_shared_blob_on_top_of_batched_blob<InferenceEngine::Precision::U32>(
|
||||
m_batched_request_wrapper._inferRequestBatched->GetBlob(it.first),
|
||||
it.first,
|
||||
batchedInputs,
|
||||
m_batch_id,
|
||||
m_batch_size);
|
||||
break;
|
||||
case InferenceEngine::Precision::FP64:
|
||||
res = create_shared_blob_on_top_of_batched_blob<InferenceEngine::Precision::FP64>(
|
||||
m_batched_request_wrapper._inferRequestBatched->GetBlob(it.first),
|
||||
it.first,
|
||||
batchedInputs,
|
||||
m_batch_id,
|
||||
m_batch_size);
|
||||
break;
|
||||
case InferenceEngine::Precision::FP16:
|
||||
res = create_shared_blob_on_top_of_batched_blob<InferenceEngine::Precision::FP16>(
|
||||
m_batched_request_wrapper._inferRequestBatched->GetBlob(it.first),
|
||||
it.first,
|
||||
batchedInputs,
|
||||
m_batch_id,
|
||||
m_batch_size);
|
||||
break;
|
||||
case InferenceEngine::Precision::BF16:
|
||||
res = create_shared_blob_on_top_of_batched_blob<InferenceEngine::Precision::BF16>(
|
||||
m_batched_request_wrapper._inferRequestBatched->GetBlob(it.first),
|
||||
it.first,
|
||||
batchedInputs,
|
||||
m_batch_id,
|
||||
m_batch_size);
|
||||
break;
|
||||
case InferenceEngine::Precision::U64:
|
||||
res = create_shared_blob_on_top_of_batched_blob<InferenceEngine::Precision::U64>(
|
||||
m_batched_request_wrapper._inferRequestBatched->GetBlob(it.first),
|
||||
it.first,
|
||||
batchedInputs,
|
||||
m_batch_id,
|
||||
m_batch_size);
|
||||
break;
|
||||
case InferenceEngine::Precision::I64:
|
||||
res = create_shared_blob_on_top_of_batched_blob<InferenceEngine::Precision::I64>(
|
||||
m_batched_request_wrapper._inferRequestBatched->GetBlob(it.first),
|
||||
it.first,
|
||||
batchedInputs,
|
||||
m_batch_id,
|
||||
m_batch_size);
|
||||
break;
|
||||
case InferenceEngine::Precision::U8:
|
||||
res = create_shared_blob_on_top_of_batched_blob<InferenceEngine::Precision::U8>(
|
||||
m_batched_request_wrapper._inferRequestBatched->GetBlob(it.first),
|
||||
it.first,
|
||||
batchedInputs,
|
||||
m_batch_id,
|
||||
m_batch_size);
|
||||
break;
|
||||
case InferenceEngine::Precision::BOOL:
|
||||
res = create_shared_blob_on_top_of_batched_blob<InferenceEngine::Precision::BOOL>(
|
||||
m_batched_request_wrapper._inferRequestBatched->GetBlob(it.first),
|
||||
it.first,
|
||||
batchedInputs,
|
||||
m_batch_id,
|
||||
m_batch_size);
|
||||
break;
|
||||
default:
|
||||
IE_THROW() << "Unsupported input precision " << it.second->getTensorDesc().getPrecision();
|
||||
void SyncInferRequest::set_tensors_to_another_request(ov::SoPtr<ov::IAsyncInferRequest>& req) {
|
||||
for (const auto& it : get_inputs()) {
|
||||
// this request is already in BUSY state, so using the internal functions safely
|
||||
auto tensor = get_tensor(it);
|
||||
OPENVINO_ASSERT(tensor != nullptr, "The tensor is empty!");
|
||||
auto type = tensor->get_element_type();
|
||||
if (req->get_tensor(it)->data(type) != tensor->data(type)) {
|
||||
req->set_tensor(it, tensor);
|
||||
}
|
||||
_inputs[it.first] = res;
|
||||
}
|
||||
// Allocate all output blobs
|
||||
for (const auto& it : _networkOutputs) {
|
||||
auto blob = m_batched_request_wrapper._inferRequestBatched->GetBlob(it.first);
|
||||
InferenceEngine::Blob::Ptr res;
|
||||
switch (it.second->getTensorDesc().getPrecision()) {
|
||||
case InferenceEngine::Precision::FP32:
|
||||
res = create_shared_blob_on_top_of_batched_blob<InferenceEngine::Precision::FP32>(
|
||||
m_batched_request_wrapper._inferRequestBatched->GetBlob(it.first),
|
||||
it.first,
|
||||
batchedOutputs,
|
||||
m_batch_id,
|
||||
m_batch_size);
|
||||
break;
|
||||
case InferenceEngine::Precision::I32:
|
||||
res = create_shared_blob_on_top_of_batched_blob<InferenceEngine::Precision::I32>(
|
||||
m_batched_request_wrapper._inferRequestBatched->GetBlob(it.first),
|
||||
it.first,
|
||||
batchedOutputs,
|
||||
m_batch_id,
|
||||
m_batch_size);
|
||||
break;
|
||||
case InferenceEngine::Precision::I8:
|
||||
res = create_shared_blob_on_top_of_batched_blob<InferenceEngine::Precision::I8>(
|
||||
m_batched_request_wrapper._inferRequestBatched->GetBlob(it.first),
|
||||
it.first,
|
||||
batchedOutputs,
|
||||
m_batch_id,
|
||||
m_batch_size);
|
||||
break;
|
||||
case InferenceEngine::Precision::I16:
|
||||
res = create_shared_blob_on_top_of_batched_blob<InferenceEngine::Precision::I16>(
|
||||
m_batched_request_wrapper._inferRequestBatched->GetBlob(it.first),
|
||||
it.first,
|
||||
batchedOutputs,
|
||||
m_batch_id,
|
||||
m_batch_size);
|
||||
break;
|
||||
case InferenceEngine::Precision::U16:
|
||||
res = create_shared_blob_on_top_of_batched_blob<InferenceEngine::Precision::U16>(
|
||||
m_batched_request_wrapper._inferRequestBatched->GetBlob(it.first),
|
||||
it.first,
|
||||
batchedOutputs,
|
||||
m_batch_id,
|
||||
m_batch_size);
|
||||
break;
|
||||
case InferenceEngine::Precision::U32:
|
||||
res = create_shared_blob_on_top_of_batched_blob<InferenceEngine::Precision::U32>(
|
||||
m_batched_request_wrapper._inferRequestBatched->GetBlob(it.first),
|
||||
it.first,
|
||||
batchedOutputs,
|
||||
m_batch_id,
|
||||
m_batch_size);
|
||||
break;
|
||||
case InferenceEngine::Precision::FP64:
|
||||
res = create_shared_blob_on_top_of_batched_blob<InferenceEngine::Precision::FP64>(
|
||||
m_batched_request_wrapper._inferRequestBatched->GetBlob(it.first),
|
||||
it.first,
|
||||
batchedOutputs,
|
||||
m_batch_id,
|
||||
m_batch_size);
|
||||
break;
|
||||
case InferenceEngine::Precision::FP16:
|
||||
res = create_shared_blob_on_top_of_batched_blob<InferenceEngine::Precision::FP16>(
|
||||
m_batched_request_wrapper._inferRequestBatched->GetBlob(it.first),
|
||||
it.first,
|
||||
batchedOutputs,
|
||||
m_batch_id,
|
||||
m_batch_size);
|
||||
break;
|
||||
case InferenceEngine::Precision::BF16:
|
||||
res = create_shared_blob_on_top_of_batched_blob<InferenceEngine::Precision::BF16>(
|
||||
m_batched_request_wrapper._inferRequestBatched->GetBlob(it.first),
|
||||
it.first,
|
||||
batchedOutputs,
|
||||
m_batch_id,
|
||||
m_batch_size);
|
||||
break;
|
||||
case InferenceEngine::Precision::U64:
|
||||
res = create_shared_blob_on_top_of_batched_blob<InferenceEngine::Precision::U64>(
|
||||
m_batched_request_wrapper._inferRequestBatched->GetBlob(it.first),
|
||||
it.first,
|
||||
batchedOutputs,
|
||||
m_batch_id,
|
||||
m_batch_size);
|
||||
break;
|
||||
case InferenceEngine::Precision::I64:
|
||||
res = create_shared_blob_on_top_of_batched_blob<InferenceEngine::Precision::I64>(
|
||||
m_batched_request_wrapper._inferRequestBatched->GetBlob(it.first),
|
||||
it.first,
|
||||
batchedOutputs,
|
||||
m_batch_id,
|
||||
m_batch_size);
|
||||
break;
|
||||
case InferenceEngine::Precision::U8:
|
||||
res = create_shared_blob_on_top_of_batched_blob<InferenceEngine::Precision::U8>(
|
||||
m_batched_request_wrapper._inferRequestBatched->GetBlob(it.first),
|
||||
it.first,
|
||||
batchedOutputs,
|
||||
m_batch_id,
|
||||
m_batch_size);
|
||||
break;
|
||||
case InferenceEngine::Precision::BOOL:
|
||||
res = create_shared_blob_on_top_of_batched_blob<InferenceEngine::Precision::BOOL>(
|
||||
m_batched_request_wrapper._inferRequestBatched->GetBlob(it.first),
|
||||
it.first,
|
||||
batchedOutputs,
|
||||
m_batch_id,
|
||||
m_batch_size);
|
||||
break;
|
||||
default:
|
||||
IE_THROW(NotImplemented) << "Unsupported input precision " << it.second->getTensorDesc().getPrecision();
|
||||
for (const auto& it : get_outputs()) {
|
||||
// this request is already in BUSY state, so using the internal functions safely
|
||||
auto tensor = get_tensor(it);
|
||||
OPENVINO_ASSERT(tensor != nullptr, "The tensor is empty!");
|
||||
auto type = tensor->get_element_type();
|
||||
if (req->get_tensor(it)->data(type) != tensor->data(type)) {
|
||||
req->set_tensor(it, tensor);
|
||||
}
|
||||
_outputs[it.first] = res;
|
||||
}
|
||||
}
|
||||
void SyncInferRequest::SetBlobsToAnotherRequest(InferenceEngine::SoIInferRequestInternal& req) {
|
||||
for (const auto& it : _networkInputs) {
|
||||
auto& name = it.first;
|
||||
// this request is already in BUSY state, so using the internal functions safely
|
||||
auto blob = GetBlob(name);
|
||||
if (req->GetBlob(name) != blob)
|
||||
req->SetBlob(name, blob);
|
||||
}
|
||||
for (const auto& it : _networkOutputs) {
|
||||
auto& name = it.first;
|
||||
// this request is already in BUSY state, so using the internal functions safely
|
||||
auto blob = GetBlob(name);
|
||||
if (req->GetBlob(name) != blob)
|
||||
req->SetBlob(name, blob);
|
||||
}
|
||||
}
|
||||
|
||||
void SyncInferRequest::CopyInputsIfNeeded() {
for (const auto& it : _networkInputs) {
auto& name = it.first;
void SyncInferRequest::copy_inputs_if_needed() {
for (const auto& it : get_inputs()) {
// this request is already in BUSY state, so using the internal functions safely
CopyBlobIfNeeded(GetBlob(name), m_batched_request_wrapper._inferRequestBatched->GetBlob(name), true);
auto dst_tensor = m_batched_request_wrapper->_infer_request_batched->get_tensor(it);
copy_tensor_if_needed(get_tensor(it), dst_tensor, true);
}
}

void SyncInferRequest::CopyBlobIfNeeded(InferenceEngine::Blob::CPtr src, InferenceEngine::Blob::Ptr dst, bool bInput) {
auto bufferDst = dst->buffer();
auto ptrDst = bufferDst.as<char*>();
auto bufferSrc = src->cbuffer();
auto ptrSrc = bufferSrc.as<const char*>();
ptrdiff_t szDst = dst->byteSize();
ptrdiff_t szSrc = src->byteSize();
void SyncInferRequest::copy_tensor_if_needed(const ov::SoPtr<ov::ITensor>& src,
ov::SoPtr<ov::ITensor>& dst,
const bool bInput) {
auto ptrDst = static_cast<char*>(dst->data());
auto ptrSrc = static_cast<char*>(src->data());
ptrdiff_t szDst = dst->get_byte_size();
ptrdiff_t szSrc = src->get_byte_size();
if (bInput) {
ptrdiff_t offset = szSrc != szDst ? m_batch_id * szDst / m_batch_size : 0;
if ((ptrDst + offset) == ptrSrc)
@@ -338,12 +118,29 @@ void SyncInferRequest::CopyBlobIfNeeded(InferenceEngine::Blob::CPtr src, Inferen
}
}

void SyncInferRequest::CopyOutputsIfNeeded() {
for (const auto& it : _networkOutputs) {
auto& name = it.first;
void SyncInferRequest::copy_outputs_if_needed() {
for (const auto& it : get_outputs()) {
// this request is already in BUSY state, so using the internal functions safely
CopyBlobIfNeeded(m_batched_request_wrapper._inferRequestBatched->GetBlob(name), GetBlob(name), false);
auto dst_tensor = get_tensor(it);
copy_tensor_if_needed(m_batched_request_wrapper->_infer_request_batched->get_tensor(it), dst_tensor, false);
}
}

void SyncInferRequest::infer() {
OPENVINO_NOT_IMPLEMENTED;
}

std::vector<ov::SoPtr<ov::IVariableState>> SyncInferRequest::query_state() const {
auto states = m_batched_request_wrapper->_infer_request_batched->query_state();
for (auto&& state : states) {
if (!state._so)
state._so = m_batched_request_wrapper->_infer_request_batched._so;
}
return states;
}

std::vector<ov::ProfilingInfo> SyncInferRequest::get_profiling_info() const {
return m_batched_request_wrapper->_infer_request_batched->get_profiling_info();
}
} // namespace autobatch_plugin
} // namespace ov
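Editor's note: copy_tensor_if_needed is the copy fallback: inputs are scattered into this request's slice of the batched buffer, outputs are gathered back out of it, and the copy is skipped when the pointers already alias (the shared-tensor case above). A reduced sketch of that offset logic over raw buffers (illustrative only):

// Example (not part of the PR): when sizes differ by the batch factor,
// the smaller buffer maps onto slice `batch_id` of the larger one.
#include <cstddef>
#include <cstring>

void copy_if_needed(const char* src, size_t src_size,
                    char* dst, size_t dst_size,
                    size_t batch_id, size_t batch_size, bool is_input) {
    if (is_input) {
        // per-request input -> batched input: write into this request's slice
        const std::ptrdiff_t offset = (src_size != dst_size) ? batch_id * dst_size / batch_size : 0;
        if (dst + offset == src)
            return;  // already shared, nothing to copy
        std::memcpy(dst + offset, src, src_size);
    } else {
        // batched output -> per-request output: read from this request's slice
        const std::ptrdiff_t offset = (src_size != dst_size) ? batch_id * src_size / batch_size : 0;
        if (src + offset == dst)
            return;
        std::memcpy(dst, src + offset, dst_size);
    }
}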
@@ -6,40 +6,36 @@
#pragma once

#include "compiled_model.hpp"
#include "cpp_interfaces/interface/ie_iinfer_request_internal.hpp"
#include "openvino/runtime/isync_infer_request.hpp"

namespace ov {
namespace autobatch_plugin {

class SyncInferRequest : public InferenceEngine::IInferRequestInternal {
class SyncInferRequest : public ov::ISyncInferRequest {
public:
using Ptr = std::shared_ptr<SyncInferRequest>;
explicit SyncInferRequest(const InferenceEngine::InputsDataMap& networkInputs,
const InferenceEngine::OutputsDataMap& networkOutputs,
CompiledModel::WorkerInferRequest& workerRequestPtr,
int batch_id,
int num_batch,
const std::set<std::string>& batchedIntputs,
const std::set<std::string>& batchedOutputs);

explicit SyncInferRequest(const std::vector<std::shared_ptr<const ov::Node>>& inputs,
const std::vector<std::shared_ptr<const ov::Node>>& outputs,
CompiledModel::WorkerInferRequest& workerRequestPtr,
int batch_id,
int num_batch,
const std::set<std::string>& batchedIntputs,
const std::set<std::string>& batchedOutputs);
SyncInferRequest(const std::shared_ptr<const ov::autobatch_plugin::CompiledModel>& compiled_model,
const std::shared_ptr<ov::autobatch_plugin::CompiledModel::WorkerInferRequest>& worker_request,
int batch_id,
int num_batch,
const std::set<std::string>& batched_inputs,
const std::set<std::string>& batched_outputs);

// Batch-Device impl specific: sets the data (blobs from the device request to the batched device request)
void SetBlobsToAnotherRequest(InferenceEngine::SoIInferRequestInternal& req);
void set_tensors_to_another_request(ov::SoPtr<ov::IAsyncInferRequest>& req);

void CopyInputsIfNeeded();
void copy_inputs_if_needed();

void CopyOutputsIfNeeded();
void copy_outputs_if_needed();

CompiledModel::WorkerInferRequest& m_batched_request_wrapper;
void infer() override;

std::exception_ptr m_exceptionPtr;
std::vector<ov::SoPtr<ov::IVariableState>> query_state() const override;

std::vector<ov::ProfilingInfo> get_profiling_info() const override;

std::shared_ptr<ov::autobatch_plugin::CompiledModel::WorkerInferRequest> m_batched_request_wrapper;

std::exception_ptr m_exception_ptr;

enum eExecutionFlavor : uint8_t {
NOT_EXECUTED,
@@ -48,10 +44,11 @@ public:
} m_batched_request_status = eExecutionFlavor::NOT_EXECUTED;

protected:
void CopyBlobIfNeeded(InferenceEngine::Blob::CPtr src, InferenceEngine::Blob::Ptr dst, bool bInput);
void copy_tensor_if_needed(const ov::SoPtr<ov::ITensor>& src, ov::SoPtr<ov::ITensor>& dst, const bool bInput);

void share_tensors_with_batched_req(const std::set<std::string>& batched_inputs,
const std::set<std::string>& batched_outputs);

void ShareBlobsWithBatchRequest(const std::set<std::string>& batchedIntputs,
const std::set<std::string>& batchedOutputs);
size_t m_batch_id;

size_t m_batch_size;
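Editor's note: m_batched_request_status records which path actually served a request, so waiting and error propagation can be routed accordingly; the values used elsewhere in this PR are BATCH_EXECUTED (the shared batched request ran) and TIMEOUT_EXECUTED (the request fell back to the per-request path). A compact sketch of how such a flag drives the wait logic (illustrative, not the plugin code):

// Example (not part of the PR): the waiter only joins the path that actually ran.
#include <cstdint>
#include <iostream>

enum class ExecutionFlavor : uint8_t { NOT_EXECUTED, BATCH_EXECUTED, TIMEOUT_EXECUTED };

const char* describe(ExecutionFlavor f) {
    switch (f) {
    case ExecutionFlavor::BATCH_EXECUTED:   return "wait on the shared batched request";
    case ExecutionFlavor::TIMEOUT_EXECUTED: return "wait on the per-request fallback";
    default:                                return "nothing was started yet";
    }
}

int main() {
    std::cout << describe(ExecutionFlavor::TIMEOUT_EXECUTED) << "\n";
}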
@@ -13,8 +13,6 @@ namespace {

const std::vector<ov::AnyMap> auto_batch_inproperties = {
{ov::num_streams(-100)},
{{CONFIG_KEY(AUTO_BATCH_DEVICE_CONFIG), std::string(CommonTestUtils::DEVICE_TEMPLATE) + "(4)"},
{ov::auto_batch_timeout(-1)}},
{ov::device::id("UNSUPPORTED_DEVICE_ID_STRING")},
};

@@ -53,7 +53,7 @@ INSTANTIATE_TEST_SUITE_P(smoke_AutoBatching_test_uint,
::testing::Combine(::testing::Values(std::string(CommonTestUtils::DEVICE_BATCH) + ":" +
CommonTestUtils::DEVICE_TEMPLATE),
::testing::Values(DefaultParameter{ov::auto_batch_timeout.name(),
InferenceEngine::Parameter{1000}})),
InferenceEngine::Parameter{uint32_t(1000)}})),
DefaultConfigurationTest::getTestCaseName);

} // namespace

@@ -8,8 +8,6 @@ using namespace BehaviorTestsDefinitions;
namespace {
auto auto_batch_inconfigs = []() {
return std::vector<std::map<std::string, std::string>>{
{{CONFIG_KEY(AUTO_BATCH_DEVICE_CONFIG), CommonTestUtils::DEVICE_TEMPLATE},
{ov::auto_batch_timeout.name(), "-1"}},
{{CONFIG_KEY(AUTO_BATCH_DEVICE_CONFIG), CommonTestUtils::DEVICE_TEMPLATE},
{ov::hint::performance_mode.name(), "DOESN'T EXIST"}},
{{CONFIG_KEY(AUTO_BATCH_DEVICE_CONFIG), CommonTestUtils::DEVICE_TEMPLATE},
src/plugins/auto_batch/tests/unit/async_infer_request_test.cpp (new file, 324 lines)
@@ -0,0 +1,324 @@
||||
// Copyright (C) 2018-2023 Intel Corporation
|
||||
// SPDX-License-Identifier: Apache-2.0
|
||||
//
|
||||
|
||||
#include <gmock/gmock.h>
|
||||
#include <gtest/gtest.h>
|
||||
|
||||
#include "mock_common.hpp"
|
||||
#include "ngraph_functions/subgraph_builders.hpp"
|
||||
#include "openvino/core/dimension_tracker.hpp"
|
||||
#include "openvino/core/type/element_type.hpp"
|
||||
#include "openvino/runtime/threading/immediate_executor.hpp"
|
||||
#include "transformations/utils/utils.hpp"
|
||||
#include "unit_test_utils/mocks/cpp_interfaces/interface/mock_icore.hpp"
|
||||
using ::testing::_;
|
||||
using ::testing::AnyNumber;
|
||||
using ::testing::AtLeast;
|
||||
using ::testing::Eq;
|
||||
using ::testing::MatcherCast;
|
||||
using ::testing::Matches;
|
||||
using ::testing::NiceMock;
|
||||
using ::testing::Return;
|
||||
using ::testing::ReturnRef;
|
||||
using ::testing::StrEq;
|
||||
using ::testing::StrNe;
|
||||
using ::testing::Throw;
|
||||
|
||||
using AutoBatchRequestTestParams = std::tuple<uint32_t, // batch_size
|
||||
ov::element::Type_t, // data type
|
||||
uint32_t>; // inference interval
|
||||
|
||||
class AutoBatchAsyncInferRequestTest : public ::testing::TestWithParam<AutoBatchRequestTestParams> {
|
||||
public:
|
||||
std::shared_ptr<ov::Model> m_model;
|
||||
std::shared_ptr<ov::Model> m_batched_model;
|
||||
std::shared_ptr<NiceMock<MockICore>> m_core;
|
||||
std::shared_ptr<NiceMock<MockAutoBatchInferencePlugin>> m_auto_batch_plugin;
|
||||
|
||||
std::shared_ptr<NiceMock<MockIPlugin>> m_hardware_plugin;
|
||||
|
||||
std::shared_ptr<NiceMock<MockICompiledModel>> m_i_compile_model_without_batch;
|
||||
ov::SoPtr<ov::ICompiledModel> m_compile_model_without_batch;
|
||||
|
||||
std::shared_ptr<NiceMock<MockICompiledModel>> m_i_compile_model_with_batch;
|
||||
ov::SoPtr<ov::ICompiledModel> m_compile_model_with_batch;
|
||||
|
||||
ov::AnyMap m_config;
|
||||
DeviceInformation m_device_info;
|
||||
std::set<std::string> m_batched_inputs;
|
||||
std::set<std::string> m_batched_outputs;
|
||||
ov::SoPtr<ov::IRemoteContext> m_remote_context;
|
||||
|
||||
std::shared_ptr<CompiledModel> m_auto_batch_compile_model;
|
||||
|
||||
std::shared_ptr<NiceMock<MockISyncInferRequest>> m_sync_infer_request_with_batch;
|
||||
|
||||
std::shared_ptr<NiceMock<MockIAsyncInferRequest>> m_async_infer_request_with_batch;
|
||||
|
||||
std::shared_ptr<NiceMock<MockISyncInferRequest>> m_sync_infer_request_without_batch;
|
||||
|
||||
std::shared_ptr<NiceMock<MockIAsyncInferRequest>> m_async_infer_request_without_batch;
|
||||
|
||||
std::shared_ptr<ov::threading::ImmediateExecutor> m_executor;
|
||||
|
||||
std::shared_ptr<CompiledModel::WorkerInferRequest> workerRequestPtr;
|
||||
|
||||
uint32_t m_batch_size;
|
||||
ov::element::Type_t m_element_type;
|
||||
uint32_t m_infer_interval;
|
||||
|
||||
std::vector<std::shared_ptr<AsyncInferRequest>> m_auto_batch_async_infer_requests;
|
||||
|
||||
std::vector<ov::ProfilingInfo> m_profiling_info;
|
||||
|
||||
bool m_terminate;
|
||||
|
||||
static std::string getTestCaseName(testing::TestParamInfo<AutoBatchRequestTestParams> obj) {
|
||||
uint32_t batch_size, infer_interval;
|
||||
ov::element::Type_t element_type;
|
||||
std::tie(batch_size, element_type, infer_interval) = obj.param;
|
||||
|
||||
std::string res;
|
||||
res = "batch_size_" + std::to_string(batch_size);
|
||||
res += "_element_type_" + std::to_string(static_cast<int>(element_type));
|
||||
if (infer_interval > 0)
|
||||
res += "_infer_interval_" + std::to_string(infer_interval);
|
||||
return res;
|
||||
}
|
||||
|
||||
void TearDown() override {
|
||||
m_terminate = true;
|
||||
m_profiling_info.clear();
|
||||
m_auto_batch_async_infer_requests.clear();
|
||||
m_auto_batch_plugin.reset();
|
||||
m_model.reset();
|
||||
m_batched_model.reset();
|
||||
m_core.reset();
|
||||
m_i_compile_model_without_batch.reset();
|
||||
m_compile_model_without_batch = {};
|
||||
m_i_compile_model_with_batch.reset();
|
||||
m_compile_model_with_batch = {};
|
||||
m_auto_batch_compile_model.reset();
|
||||
m_sync_infer_request_without_batch.reset();
|
||||
m_async_infer_request_without_batch.reset();
|
||||
m_executor.reset();
|
||||
clear_worker();
|
||||
workerRequestPtr.reset();
|
||||
m_sync_infer_request_with_batch.reset();
|
||||
m_async_infer_request_with_batch.reset();
|
||||
}
|
||||
|
||||
void SetUp() override {
|
||||
std::tie(m_batch_size, m_element_type, m_infer_interval) = this->GetParam();
|
||||
m_terminate = false;
|
||||
std::vector<size_t> inputShape = {1, 3, 24, 24};
|
||||
m_model = ngraph::builder::subgraph::makeMultiSingleConv(inputShape, m_element_type);
|
||||
|
||||
prepare_input(m_model, m_batch_size);
|
||||
|
||||
m_core = std::shared_ptr<NiceMock<MockICore>>(new NiceMock<MockICore>());
|
||||
|
||||
m_auto_batch_plugin =
|
||||
std::shared_ptr<NiceMock<MockAutoBatchInferencePlugin>>(new NiceMock<MockAutoBatchInferencePlugin>());
|
||||
|
||||
m_hardware_plugin = std::shared_ptr<NiceMock<MockIPlugin>>(new NiceMock<MockIPlugin>());
|
||||
|
||||
m_auto_batch_plugin->set_core(m_core);
|
||||
m_i_compile_model_without_batch = std::make_shared<NiceMock<MockICompiledModel>>(m_model, m_hardware_plugin);
|
||||
m_compile_model_without_batch = {m_i_compile_model_without_batch, {}};
|
||||
|
||||
m_config = {{"AUTO_BATCH_TIMEOUT", "200"}};
|
||||
|
||||
m_device_info = {"CPU", {}, m_batch_size};
|
||||
|
||||
auto reshaped = m_model->clone();
|
||||
auto inputs = reshaped->inputs();
|
||||
std::map<ov::Output<ov::Node>, ov::PartialShape> partial_shapes;
|
||||
for (auto& input : inputs) {
|
||||
auto input_shape = input.get_shape();
|
||||
if (m_batched_inputs.find(ov::op::util::get_ie_output_name(input)) != m_batched_inputs.end()) {
|
||||
input_shape[0] = m_batch_size;
|
||||
}
|
||||
partial_shapes.insert({input, ov::PartialShape(input_shape)});
|
||||
}
|
||||
|
||||
reshaped->reshape(partial_shapes);
|
||||
|
||||
m_i_compile_model_with_batch = std::make_shared<NiceMock<MockICompiledModel>>(reshaped, m_hardware_plugin);
|
||||
m_compile_model_with_batch = {m_i_compile_model_with_batch, {}};
|
||||
|
||||
ASSERT_NO_THROW(m_auto_batch_compile_model = std::make_shared<CompiledModel>(m_model->clone(),
|
||||
m_auto_batch_plugin,
|
||||
m_config,
|
||||
m_device_info,
|
||||
m_batched_inputs,
|
||||
m_batched_outputs,
|
||||
m_compile_model_with_batch,
|
||||
m_compile_model_without_batch,
|
||||
m_remote_context));
|
||||
|
||||
m_sync_infer_request_with_batch =
|
||||
std::make_shared<NiceMock<MockISyncInferRequest>>(m_i_compile_model_with_batch);
|
||||
|
||||
m_executor = std::make_shared<ov::threading::ImmediateExecutor>();
|
||||
|
||||
m_async_infer_request_with_batch =
|
||||
std::make_shared<NiceMock<MockIAsyncInferRequest>>(m_sync_infer_request_with_batch, m_executor, nullptr);
|
||||
|
||||
m_sync_infer_request_without_batch =
|
||||
std::make_shared<NiceMock<MockISyncInferRequest>>(m_i_compile_model_without_batch);
|
||||
|
||||
m_async_infer_request_without_batch =
|
||||
std::make_shared<NiceMock<MockIAsyncInferRequest>>(m_sync_infer_request_without_batch, m_executor, nullptr);
|
||||
|
||||
m_profiling_info = {};
|
||||
}
|
||||
|
||||
void create_worker(int batch_size) {
workerRequestPtr = std::make_shared<CompiledModel::WorkerInferRequest>();

workerRequestPtr->_infer_request_batched = {m_async_infer_request_with_batch, {}};
workerRequestPtr->_batch_size = batch_size;
workerRequestPtr->_completion_tasks.resize(workerRequestPtr->_batch_size);
workerRequestPtr->_infer_request_batched->set_callback([this](std::exception_ptr exceptionPtr) mutable {
if (exceptionPtr)
workerRequestPtr->_exception_ptr = exceptionPtr;
});

ON_CALL(*m_async_infer_request_with_batch, start_async()).WillByDefault([this]() {
OPENVINO_ASSERT(workerRequestPtr->_completion_tasks.size() == (size_t)workerRequestPtr->_batch_size);
for (int c = 0; c < workerRequestPtr->_batch_size; c++) {
workerRequestPtr->_completion_tasks[c]();
}
workerRequestPtr->_cond.notify_one();
});

workerRequestPtr->_thread = std::thread([this] {
while (1) {
std::cv_status status;
{
std::unique_lock<std::mutex> lock(workerRequestPtr->_mutex);
status = workerRequestPtr->_cond.wait_for(lock, std::chrono::milliseconds(10));
}
if (m_terminate) {
break;
} else {
// as we pop the tasks from the queue only here
// it is ok to call size() (as the _tasks can only grow in parallel)
const int sz = static_cast<int>(workerRequestPtr->_tasks.size());
if (sz == workerRequestPtr->_batch_size) {
std::pair<ov::autobatch_plugin::AsyncInferRequest*, ov::threading::Task> t;
for (int n = 0; n < sz; n++) {
OPENVINO_ASSERT(workerRequestPtr->_tasks.try_pop(t));
workerRequestPtr->_completion_tasks[n] = std::move(t.second);
t.first->m_sync_request->copy_inputs_if_needed();
t.first->m_sync_request->m_batched_request_status =
ov::autobatch_plugin::SyncInferRequest::eExecutionFlavor::BATCH_EXECUTED;
}
workerRequestPtr->_infer_request_batched->start_async();
} else if ((status == std::cv_status::timeout) && sz) {
std::pair<AsyncInferRequest*, ov::threading::Task> t;
for (int n = 0; n < sz; n++) {
IE_ASSERT(workerRequestPtr->_tasks.try_pop(t));
t.first->m_sync_request->m_batched_request_status =
SyncInferRequest::eExecutionFlavor::TIMEOUT_EXECUTED;
t.first->m_request_without_batch->start_async();
t.second();
}
}
}
}
});
return;
}

void clear_worker() {
workerRequestPtr->_infer_request_batched = {};
workerRequestPtr->_completion_tasks.clear();
workerRequestPtr->_thread.join();
}
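Editor's note: the worker thread above mirrors the plugin's dispatch rule: run the batched request as soon as a full batch of tasks is queued, otherwise flush whatever is pending through the per-request path when the timeout expires. The same rule in a dependency-free form (illustrative, std::condition_variable only):

// Example (not part of the PR): batch-or-timeout dispatch loop.
#include <chrono>
#include <condition_variable>
#include <deque>
#include <functional>
#include <iostream>
#include <mutex>
#include <thread>

int main() {
    std::mutex m;
    std::condition_variable cv;
    std::deque<std::function<void()>> tasks;
    const size_t batch = 4;
    bool stop = false;

    std::thread worker([&] {
        std::unique_lock<std::mutex> lock(m);
        while (!stop) {
            // wake up on a new task or on timeout, whichever comes first
            cv.wait_for(lock, std::chrono::milliseconds(10));
            if (tasks.size() >= batch) {
                std::cout << "run batched request for " << tasks.size() << " tasks\n";
                tasks.clear();
            } else if (!tasks.empty()) {
                std::cout << "timeout: run " << tasks.size() << " task(s) unbatched\n";
                tasks.clear();
            }
        }
    });

    {
        std::lock_guard<std::mutex> lock(m);
        for (size_t i = 0; i < batch; ++i)
            tasks.emplace_back([] {});
    }
    cv.notify_one();

    std::this_thread::sleep_for(std::chrono::milliseconds(50));
    { std::lock_guard<std::mutex> lock(m); stop = true; }
    cv.notify_one();
    worker.join();
}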
||||
|
||||
void prepare_input(std::shared_ptr<ov::Model>& model, int batch_size) {
|
||||
const auto& params = model->get_parameters();
|
||||
for (size_t i = 0; i < params.size(); i++) {
|
||||
m_batched_inputs.insert(ov::op::util::get_ie_output_name(params[i]->output(0)));
|
||||
}
|
||||
const auto& results = model->get_results();
|
||||
for (size_t i = 0; i < results.size(); i++) {
|
||||
const auto& output = results[i];
|
||||
const auto& node = output->input_value(0);
|
||||
m_batched_outputs.insert(
|
||||
ov::op::util::get_ie_output_name(ov::Output<const ov::Node>(node.get_node(), node.get_index())));
|
||||
}
|
||||
}
|
||||
};
|
||||
|
||||
TEST_P(AutoBatchAsyncInferRequestTest, AutoBatchRequestCreateTestCase) {
|
||||
prepare_input(m_model, m_batch_size);
|
||||
create_worker(m_batch_size);
|
||||
|
||||
for (uint32_t batch_id = 0; batch_id < m_batch_size; batch_id++) {
|
||||
auto req = std::make_shared<SyncInferRequest>(m_auto_batch_compile_model,
|
||||
workerRequestPtr,
|
||||
batch_id,
|
||||
m_batch_size,
|
||||
m_batched_inputs,
|
||||
m_batched_outputs);
|
||||
EXPECT_NE(req, nullptr);
|
||||
|
||||
auto asyncInferRequest = std::make_shared<AsyncInferRequest>(req, m_async_infer_request_without_batch, nullptr);
|
||||
EXPECT_NE(asyncInferRequest, nullptr);
|
||||
m_auto_batch_async_infer_requests.emplace_back(asyncInferRequest);
|
||||
}
|
||||
}
|
||||
|
||||
TEST_P(AutoBatchAsyncInferRequestTest, AutoBatchAsyncInferRequestStartAsyncTest) {
|
||||
prepare_input(m_model, m_batch_size);
|
||||
create_worker(m_batch_size);
|
||||
|
||||
for (uint32_t batch_id = 0; batch_id < m_batch_size; batch_id++) {
|
||||
auto req = std::make_shared<SyncInferRequest>(m_auto_batch_compile_model,
|
||||
workerRequestPtr,
|
||||
batch_id,
|
||||
m_batch_size,
|
||||
m_batched_inputs,
|
||||
m_batched_outputs);
|
||||
EXPECT_NE(req, nullptr);
|
||||
|
||||
auto asyncInferRequest = std::make_shared<AsyncInferRequest>(req, m_async_infer_request_without_batch, nullptr);
|
||||
EXPECT_NE(asyncInferRequest, nullptr);
|
||||
m_auto_batch_async_infer_requests.emplace_back(asyncInferRequest);
|
||||
}
|
||||
|
||||
for (auto& req : m_auto_batch_async_infer_requests) {
|
||||
if (m_infer_interval > 0)
|
||||
std::this_thread::sleep_for(std::chrono::milliseconds(m_infer_interval));
|
||||
EXPECT_NO_THROW(req->start_async());
|
||||
}
|
||||
|
||||
for (auto& req : m_auto_batch_async_infer_requests) {
|
||||
EXPECT_NO_THROW(req->wait());
|
||||
}
|
||||
}
|
||||
|
||||
std::vector<ov::element::Type_t> element_type_param{ov::element::Type_t::f16,
|
||||
ov::element::Type_t::f32,
|
||||
ov::element::Type_t::f64,
|
||||
ov::element::Type_t::i8,
|
||||
ov::element::Type_t::i16,
|
||||
ov::element::Type_t::i32,
|
||||
ov::element::Type_t::i64,
|
||||
ov::element::Type_t::u8,
|
||||
ov::element::Type_t::u16,
|
||||
ov::element::Type_t::u32,
|
||||
ov::element::Type_t::u64};
|
||||
const std::vector<uint32_t> batch_size_param{1, 8, 16, 32, 64, 128};
|
||||
const std::vector<uint32_t> infer_interval_timeout_param{0, 10};
|
||||
|
||||
INSTANTIATE_TEST_SUITE_P(smoke_AutoBatch_BehaviorTests,
|
||||
AutoBatchAsyncInferRequestTest,
|
||||
::testing::Combine(::testing::ValuesIn(batch_size_param),
|
||||
::testing::ValuesIn(element_type_param),
|
||||
::testing::ValuesIn(infer_interval_timeout_param)),
|
||||
AutoBatchAsyncInferRequestTest::getTestCaseName);
|
@ -1,397 +0,0 @@
|
||||
// Copyright (C) 2018-2023 Intel Corporation
|
||||
// SPDX-License-Identifier: Apache-2.0
|
||||
//
|
||||
|
||||
#include <gmock/gmock.h>
|
||||
#include <gtest/gtest.h>
|
||||
|
||||
#include <thread>
|
||||
|
||||
#include "cpp_interfaces/interface/ie_iplugin_internal.hpp"
|
||||
#include "ie_ngraph_utils.hpp"
|
||||
#include "mock_auto_batch_plugin.hpp"
|
||||
#include "ngraph_functions/subgraph_builders.hpp"
|
||||
#include "transformations/utils/utils.hpp"
|
||||
#include "unit_test_utils/mocks/cpp_interfaces/impl/mock_inference_plugin_internal.hpp"
|
||||
#include "unit_test_utils/mocks/cpp_interfaces/interface/mock_icore.hpp"
|
||||
#include "unit_test_utils/mocks/cpp_interfaces/interface/mock_iexecutable_network_internal.hpp"
|
||||
#include "unit_test_utils/mocks/cpp_interfaces/interface/mock_iinference_plugin.hpp"
|
||||
#include "unit_test_utils/mocks/cpp_interfaces/interface/mock_ivariable_state_internal.hpp"
|
||||
#include "unit_test_utils/mocks/cpp_interfaces/mock_task_executor.hpp"
|
||||
|
||||
using ::testing::_;
|
||||
using ::testing::AnyNumber;
|
||||
using ::testing::AtLeast;
|
||||
using ::testing::Eq;
|
||||
using ::testing::MatcherCast;
|
||||
using ::testing::Matches;
|
||||
using ::testing::NiceMock;
|
||||
using ::testing::Return;
|
||||
using ::testing::ReturnRef;
|
||||
using ::testing::StrEq;
|
||||
using ::testing::StrNe;
|
||||
using ::testing::Throw;
|
||||
using namespace ov::mock_autobatch_plugin;
|
||||
using namespace InferenceEngine;
|
||||
|
||||
using AutoBatchRequestTestParams = std::tuple<int, // batch_size
|
||||
ngraph::element::Type_t, // data type
|
||||
int>; // inference interval
|
||||
class AutoBatchRequestTest : public ::testing::TestWithParam<AutoBatchRequestTestParams> {
|
||||
public:
|
||||
// Mock inferRequest
|
||||
std::shared_ptr<NiceMock<MockIInferRequestInternal>> mockInferRequestBatched;
|
||||
|
||||
std::vector<std::shared_ptr<SyncInferRequest>> autoBatchInferRequests;
|
||||
std::map<std::string, InferenceEngine::Blob::Ptr> blobMap;
|
||||
|
||||
std::vector<std::shared_ptr<const ov::Node>> inputs, outputs;
|
||||
std::set<std::string> batchedInputs, batchedOutputs;
|
||||
std::shared_ptr<CompiledModel::WorkerInferRequest> workerRequestPtr;
|
||||
|
||||
public:
|
||||
static std::string getTestCaseName(testing::TestParamInfo<AutoBatchRequestTestParams> obj) {
|
||||
int batch_size, infer_interval;
|
||||
ngraph::element::Type_t element_type;
|
||||
std::tie(batch_size, element_type, infer_interval) = obj.param;
|
||||
|
||||
std::string res;
|
||||
res = "batch_size_" + std::to_string(batch_size);
|
||||
res += "_element_type_" + std::to_string(static_cast<int>(element_type));
|
||||
if (infer_interval > 0)
|
||||
res += "_infer_interval_" + std::to_string(infer_interval);
|
||||
return res;
|
||||
}
|
||||
|
||||
void TearDown() override {
|
||||
mockInferRequestBatched = {};
|
||||
autoBatchInferRequests.clear();
|
||||
blobMap.clear();
|
||||
|
||||
inputs.clear();
|
||||
outputs.clear();
|
||||
batchedInputs.clear();
|
||||
batchedOutputs.clear();
|
||||
clear_worker();
|
||||
}
|
||||
|
||||
void SetUp() override {
|
||||
mockInferRequestBatched = std::make_shared<NiceMock<MockIInferRequestInternal>>();
|
||||
}
|
||||
|
||||
void create_worker(int batch_size) {
|
||||
workerRequestPtr = std::make_shared<CompiledModel::WorkerInferRequest>();
|
||||
|
||||
workerRequestPtr->_inferRequestBatched = {mockInferRequestBatched, {}};
|
||||
workerRequestPtr->_batchSize = batch_size;
|
||||
workerRequestPtr->_completionTasks.resize(workerRequestPtr->_batchSize);
|
||||
workerRequestPtr->_inferRequestBatched->SetCallback([this](std::exception_ptr exceptionPtr) mutable {
|
||||
if (exceptionPtr)
|
||||
workerRequestPtr->m_exceptionPtr = exceptionPtr;
|
||||
});
|
||||
workerRequestPtr->_thread = std::thread([] {
|
||||
std::this_thread::sleep_for(std::chrono::milliseconds(10));
|
||||
});
|
||||
return;
|
||||
}
|
||||
|
||||
void clear_worker() {
|
||||
workerRequestPtr->_inferRequestBatched = {};
|
||||
workerRequestPtr->_completionTasks.clear();
|
||||
workerRequestPtr->_thread.join();
|
||||
}
|
||||
|
||||
void prepare_input(std::shared_ptr<ov::Model>& function, int batch_size) {
|
||||
for (auto& input : function->inputs()) {
|
||||
std::shared_ptr<const ov::Node> n = input.get_node_shared_ptr();
|
||||
inputs.emplace_back(n);
|
||||
}
|
||||
|
||||
for (auto& output : function->outputs()) {
|
||||
std::shared_ptr<const ov::Node> n = output.get_node_shared_ptr();
|
||||
outputs.emplace_back(n);
|
||||
}
|
||||
|
||||
const auto& params = function->get_parameters();
|
||||
for (size_t i = 0; i < params.size(); i++) {
|
||||
batchedInputs.insert(ov::op::util::get_ie_output_name(params[i]->output(0)));
|
||||
}
|
||||
const auto& results = function->get_results();
|
||||
for (size_t i = 0; i < results.size(); i++) {
|
||||
const auto& output = results[i];
|
||||
const auto& node = output->input_value(0);
|
||||
batchedOutputs.insert(
|
||||
ov::op::util::get_ie_output_name(ov::Output<const ov::Node>(node.get_node(), node.get_index())));
|
||||
}
|
||||
|
||||
ON_CALL(*mockInferRequestBatched, GetBlob(StrEq(*batchedInputs.begin())))
|
||||
.WillByDefault([this, batch_size](const std::string& name) {
|
||||
auto item = blobMap.find(name);
|
||||
if (item != blobMap.end()) {
|
||||
return item->second;
|
||||
}
|
||||
auto shape = inputs[0]->get_shape();
|
||||
shape[0] = batch_size;
|
||||
auto element_type = inputs[0]->get_element_type();
|
||||
InferenceEngine::TensorDesc tensorDesc = {InferenceEngine::details::convertPrecision(element_type),
|
||||
shape,
|
||||
InferenceEngine::TensorDesc::getLayoutByRank(shape.size())};
|
||||
auto blob = make_blob_with_precision(tensorDesc);
|
||||
blob->allocate();
|
||||
blobMap[name] = blob;
|
||||
return blob;
|
||||
});
|
||||
|
||||
ON_CALL(*mockInferRequestBatched, GetBlob(StrEq(*batchedOutputs.begin())))
|
||||
.WillByDefault([this, batch_size](const std::string& name) {
|
||||
auto item = blobMap.find(name);
|
||||
if (item != blobMap.end()) {
|
||||
return item->second;
|
||||
}
|
||||
auto shape = outputs[0]->get_shape();
|
||||
shape[0] = batch_size;
|
||||
auto element_type = outputs[0]->get_element_type();
|
||||
InferenceEngine::TensorDesc tensorDesc = {InferenceEngine::details::convertPrecision(element_type),
|
||||
shape,
|
||||
InferenceEngine::TensorDesc::getLayoutByRank(shape.size())};
|
||||
auto blob = make_blob_with_precision(tensorDesc);
|
||||
blob->allocate();
|
||||
blobMap[name] = blob;
|
||||
return blob;
|
||||
});
|
||||
}
|
||||
};
|
||||
|
||||
TEST_P(AutoBatchRequestTest, AutoBatchRequestCreateTestCase) {
|
||||
int batch_size, infer_interval;
|
||||
ngraph::element::Type_t element_type;
|
||||
std::tie(batch_size, element_type, infer_interval) = this->GetParam();
|
||||
|
||||
std::vector<size_t> inputShape = {1, 3, 24, 24};
|
||||
auto function = ngraph::builder::subgraph::makeMultiSingleConv(inputShape, element_type);
|
||||
prepare_input(function, batch_size);
|
||||
create_worker(batch_size);
|
||||
|
||||
for (int batch_id = 0; batch_id < batch_size; batch_id++) {
|
||||
auto req = std::make_shared<SyncInferRequest>(inputs,
|
||||
outputs,
|
||||
*workerRequestPtr,
|
||||
batch_id,
|
||||
batch_size,
|
||||
batchedInputs,
|
||||
batchedOutputs);
|
||||
EXPECT_NE(req, nullptr);
|
||||
autoBatchInferRequests.emplace_back(req);
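
// Each per-slot request is expected to share the batched request's blob memory:
// its blob pointer should equal the batched blob's base address plus batch_id * per-slot byte size.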
|
||||
|
||||
std::vector<std::string> names = {*batchedInputs.begin(), *batchedOutputs.begin()};
|
||||
for (auto& name : names) {
|
||||
auto blob = req->GetBlob(name);
|
||||
auto ptr = blob->buffer().as<char*>();
|
||||
auto size = blob->byteSize();
|
||||
auto batch_blob = mockInferRequestBatched->GetBlob(name);
|
||||
auto batch_ptr = batch_blob->buffer().as<char*>();
|
||||
EXPECT_EQ(ptr, batch_ptr + size * batch_id);
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
TEST_P(AutoBatchRequestTest, AutoBatchRequestCopyBlobTestCase) {
|
||||
int batch_size, infer_interval;
|
||||
ngraph::element::Type_t element_type;
|
||||
std::tie(batch_size, element_type, infer_interval) = this->GetParam();
|
||||
|
||||
std::vector<size_t> inputShape = {1, 3, 24, 24};
|
||||
auto function = ngraph::builder::subgraph::makeMultiSingleConv(inputShape, element_type);
|
||||
prepare_input(function, batch_size);
|
||||
create_worker(batch_size);
|
||||
|
||||
for (int batch_id = 0; batch_id < batch_size; batch_id++) {
|
||||
auto req = std::make_shared<SyncInferRequest>(inputs,
|
||||
outputs,
|
||||
*workerRequestPtr,
|
||||
batch_id,
|
||||
batch_size,
|
||||
batchedInputs,
|
||||
batchedOutputs);
|
||||
EXPECT_NE(req, nullptr);
|
||||
autoBatchInferRequests.emplace_back(req);
|
||||
|
||||
EXPECT_NO_THROW(req->CopyInputsIfNeeded());
|
||||
EXPECT_NO_THROW(req->CopyOutputsIfNeeded());
|
||||
}
|
||||
}
|
||||
|
||||
class AutoBatchAsyncInferRequestTest : public AutoBatchRequestTest {
|
||||
public:
|
||||
std::shared_ptr<NiceMock<MockIInferRequestInternal>> mockInferRequestWithoutBatched;
|
||||
MockTaskExecutor::Ptr mockTaskExecutor;
|
||||
std::vector<AsyncInferRequest::Ptr> autoBatchAsyncInferRequestVec;
|
||||
bool terminate;
|
||||
|
||||
public:
|
||||
void TearDown() override {
|
||||
terminate = true;
|
||||
autoBatchAsyncInferRequestVec.clear();
|
||||
AutoBatchRequestTest::TearDown();
|
||||
mockInferRequestWithoutBatched = {};
|
||||
}
|
||||
|
||||
void SetUp() override {
|
||||
AutoBatchRequestTest::SetUp();
|
||||
mockInferRequestWithoutBatched = std::make_shared<NiceMock<MockIInferRequestInternal>>();
|
||||
terminate = false;
|
||||
|
||||
mockTaskExecutor = std::make_shared<MockTaskExecutor>();
|
||||
}
|
||||
|
||||
void create_worker(int batch_size) {
|
||||
workerRequestPtr = std::make_shared<CompiledModel::WorkerInferRequest>();
|
||||
|
||||
workerRequestPtr->_inferRequestBatched = {mockInferRequestBatched, {}};
|
||||
workerRequestPtr->_batchSize = batch_size;
|
||||
workerRequestPtr->_completionTasks.resize(workerRequestPtr->_batchSize);
|
||||
workerRequestPtr->_inferRequestBatched->SetCallback([this](std::exception_ptr exceptionPtr) mutable {
|
||||
if (exceptionPtr)
|
||||
workerRequestPtr->m_exceptionPtr = exceptionPtr;
|
||||
});
|
||||
|
||||
ON_CALL(*mockInferRequestBatched, StartAsync()).WillByDefault([this]() {
|
||||
IE_ASSERT(workerRequestPtr->_completionTasks.size() == (size_t)workerRequestPtr->_batchSize);
|
||||
for (int c = 0; c < workerRequestPtr->_batchSize; c++) {
|
||||
workerRequestPtr->_completionTasks[c]();
|
||||
}
|
||||
workerRequestPtr->_cond.notify_one();
|
||||
});
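
// This worker thread is a test-side sketch of the auto-batch scheduling loop: it waits up to
// ~10 ms for tasks to queue up, fires the batched request once a full batch has been collected,
// and on timeout falls back to running each pending request individually (the "without batch" path).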
|
||||
|
||||
workerRequestPtr->_thread = std::thread([this] {
|
||||
while (1) {
|
||||
std::cv_status status;
|
||||
{
|
||||
std::unique_lock<std::mutex> lock(workerRequestPtr->_mutex);
|
||||
status = workerRequestPtr->_cond.wait_for(lock, std::chrono::milliseconds(10));
|
||||
}
|
||||
if (terminate) {
|
||||
break;
|
||||
} else {
|
||||
const int sz = static_cast<int>(workerRequestPtr->_tasks.size());
|
||||
if (sz == workerRequestPtr->_batchSize) {
|
||||
std::pair<AsyncInferRequest*, InferenceEngine::Task> t;
|
||||
for (int n = 0; n < sz; n++) {
|
||||
IE_ASSERT(workerRequestPtr->_tasks.try_pop(t));
|
||||
workerRequestPtr->_completionTasks[n] = std::move(t.second);
|
||||
t.first->m_sync_infer_request->m_batched_request_status =
|
||||
SyncInferRequest::eExecutionFlavor::BATCH_EXECUTED;
|
||||
}
|
||||
workerRequestPtr->_inferRequestBatched->StartAsync();
|
||||
} else if ((status == std::cv_status::timeout) && sz) {
|
||||
std::pair<AsyncInferRequest*, InferenceEngine::Task> t;
|
||||
for (int n = 0; n < sz; n++) {
|
||||
IE_ASSERT(workerRequestPtr->_tasks.try_pop(t));
|
||||
t.first->m_sync_infer_request->m_batched_request_status =
|
||||
SyncInferRequest::eExecutionFlavor::TIMEOUT_EXECUTED;
|
||||
t.first->m_infer_request_without_batch->StartAsync();
|
||||
t.second();
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
});
|
||||
return;
|
||||
}
|
||||
};
|
||||
|
||||
TEST_P(AutoBatchAsyncInferRequestTest, AutoBatchAsyncInferRequestCreateTest) {
|
||||
int batch_size, infer_interval;
|
||||
ngraph::element::Type_t element_type;
|
||||
std::tie(batch_size, element_type, infer_interval) = this->GetParam();
|
||||
|
||||
std::vector<size_t> inputShape = {1, 3, 24, 24};
|
||||
auto function = ngraph::builder::subgraph::makeMultiSingleConv(inputShape, element_type);
|
||||
prepare_input(function, batch_size);
|
||||
create_worker(batch_size);
|
||||
|
||||
for (int batch_id = 0; batch_id < batch_size; batch_id++) {
|
||||
auto autoRequestImpl = std::make_shared<SyncInferRequest>(inputs,
|
||||
outputs,
|
||||
*workerRequestPtr,
|
||||
batch_id,
|
||||
batch_size,
|
||||
batchedInputs,
|
||||
batchedOutputs);
|
||||
EXPECT_NE(autoRequestImpl, nullptr);
|
||||
autoBatchInferRequests.emplace_back(autoRequestImpl);
|
||||
|
||||
InferenceEngine::SoIInferRequestInternal inferRequestWithoutBatched = {mockInferRequestWithoutBatched, {}};
|
||||
auto asyncInferRequest =
|
||||
std::make_shared<AsyncInferRequest>(autoRequestImpl, inferRequestWithoutBatched, nullptr);
|
||||
EXPECT_NE(asyncInferRequest, nullptr);
|
||||
autoBatchAsyncInferRequestVec.emplace_back(asyncInferRequest);
|
||||
}
|
||||
}
|
||||
|
||||
TEST_P(AutoBatchAsyncInferRequestTest, AutoBatchAsyncInferRequestStartAsyncTest) {
|
||||
int batch_size, infer_interval;
|
||||
ngraph::element::Type_t element_type;
|
||||
std::tie(batch_size, element_type, infer_interval) = this->GetParam();
|
||||
|
||||
std::vector<size_t> inputShape = {1, 3, 24, 24};
|
||||
auto function = ngraph::builder::subgraph::makeMultiSingleConv(inputShape, element_type);
|
||||
prepare_input(function, batch_size);
|
||||
create_worker(batch_size);
|
||||
|
||||
for (int batch_id = 0; batch_id < batch_size; batch_id++) {
|
||||
auto autoRequestImpl = std::make_shared<SyncInferRequest>(inputs,
|
||||
outputs,
|
||||
*workerRequestPtr,
|
||||
batch_id,
|
||||
batch_size,
|
||||
batchedInputs,
|
||||
batchedOutputs);
|
||||
EXPECT_NE(autoRequestImpl, nullptr);
|
||||
autoBatchInferRequests.emplace_back(autoRequestImpl);
|
||||
|
||||
InferenceEngine::SoIInferRequestInternal inferRequestWithoutBatched = {mockInferRequestWithoutBatched, {}};
|
||||
auto asyncInferRequest =
|
||||
std::make_shared<AsyncInferRequest>(autoRequestImpl, inferRequestWithoutBatched, nullptr);
|
||||
EXPECT_NE(asyncInferRequest, nullptr);
|
||||
autoBatchAsyncInferRequestVec.emplace_back(asyncInferRequest);
|
||||
}
|
||||
|
||||
for (auto& req : autoBatchAsyncInferRequestVec) {
|
||||
if (infer_interval > 0)
|
||||
std::this_thread::sleep_for(std::chrono::milliseconds(infer_interval));
|
||||
EXPECT_NO_THROW(req->StartAsync());
|
||||
}
|
||||
|
||||
for (auto& req : autoBatchAsyncInferRequestVec)
|
||||
EXPECT_NO_THROW(req->Wait(InferRequest::WaitMode::RESULT_READY));
|
||||
}
|
||||
|
||||
const std::vector<ngraph::element::Type_t> element_type{ngraph::element::Type_t::f16,
|
||||
ngraph::element::Type_t::f32,
|
||||
ngraph::element::Type_t::f64,
|
||||
ngraph::element::Type_t::i8,
|
||||
ngraph::element::Type_t::i16,
|
||||
ngraph::element::Type_t::i32,
|
||||
ngraph::element::Type_t::i64,
|
||||
ngraph::element::Type_t::u8,
|
||||
ngraph::element::Type_t::u16,
|
||||
ngraph::element::Type_t::u32,
|
||||
ngraph::element::Type_t::u64};
|
||||
const std::vector<int> batch_size{1, 8, 16, 32, 64, 128};
|
||||
const std::vector<int> infer_interval{0};
|
||||
const std::vector<int> infer_interval_timeout{0, 10};
|
||||
|
||||
INSTANTIATE_TEST_SUITE_P(smoke_AutoBatch_BehaviorTests,
|
||||
AutoBatchRequestTest,
|
||||
::testing::Combine(::testing::ValuesIn(batch_size),
|
||||
::testing::ValuesIn(element_type),
|
||||
::testing::ValuesIn(infer_interval)),
|
||||
AutoBatchRequestTest::getTestCaseName);
|
||||
|
||||
INSTANTIATE_TEST_SUITE_P(smoke_AutoBatch_BehaviorTests,
|
||||
AutoBatchAsyncInferRequestTest,
|
||||
::testing::Combine(::testing::ValuesIn(batch_size),
|
||||
::testing::ValuesIn(element_type),
|
||||
::testing::ValuesIn(infer_interval_timeout)),
|
||||
AutoBatchAsyncInferRequestTest::getTestCaseName);
|
@@ -0,0 +1,148 @@
// Copyright (C) 2018-2023 Intel Corporation
// SPDX-License-Identifier: Apache-2.0
//

#include <gmock/gmock.h>
#include <gtest/gtest.h>

#include "mock_common.hpp"
#include "ngraph_functions/subgraph_builders.hpp"
#include "openvino/core/dimension_tracker.hpp"
#include "openvino/runtime/threading/immediate_executor.hpp"
#include "unit_test_utils/mocks/cpp_interfaces/interface/mock_icore.hpp"

using ::testing::_;
using ::testing::AnyNumber;
using ::testing::AtLeast;
using ::testing::Eq;
using ::testing::MatcherCast;
using ::testing::Matches;
using ::testing::NiceMock;
using ::testing::Return;
using ::testing::ReturnRef;
using ::testing::StrEq;
using ::testing::StrNe;
using ::testing::Throw;

using namespace ov::mock_autobatch_plugin;

using CreateInferRequestTestParams = std::tuple<int, // batch_size
|
||||
int>; // inferReq number
|
||||
|
||||
class CompileModelCreateInferRequestTest : public ::testing::TestWithParam<CreateInferRequestTestParams> {
|
||||
public:
|
||||
std::shared_ptr<ov::Model> m_model;
|
||||
std::shared_ptr<NiceMock<MockICore>> m_core;
|
||||
std::shared_ptr<NiceMock<MockAutoBatchInferencePlugin>> m_auto_batch_plugin;
|
||||
|
||||
std::shared_ptr<NiceMock<MockICompiledModel>> m_i_compile_model_without_batch;
|
||||
ov::SoPtr<ov::ICompiledModel> m_compile_model_without_batch;
|
||||
|
||||
std::shared_ptr<NiceMock<MockICompiledModel>> m_i_compile_model_with_batch;
|
||||
ov::SoPtr<ov::ICompiledModel> m_compile_model_with_batch;
|
||||
|
||||
ov::AnyMap m_config;
|
||||
DeviceInformation m_device_info;
|
||||
std::set<std::string> m_batched_inputs;
|
||||
std::set<std::string> m_batched_outputs;
|
||||
ov::SoPtr<ov::IRemoteContext> m_remote_context;
|
||||
|
||||
std::shared_ptr<MockAutoBatchCompileModel> m_auto_batch_compile_model;
|
||||
|
||||
std::shared_ptr<NiceMock<MockISyncInferRequest>> m_sync_infer_request;
|
||||
|
||||
std::shared_ptr<ov::threading::ImmediateExecutor> m_executor;
|
||||
|
||||
uint32_t m_batch_size;
|
||||
int m_infer_request_num;
|
||||
|
||||
public:
|
||||
static std::string getTestCaseName(testing::TestParamInfo<CreateInferRequestTestParams> obj) {
|
||||
int batch_size;
|
||||
int infer_num;
|
||||
std::tie(batch_size, infer_num) = obj.param;
|
||||
|
||||
std::string res;
|
||||
res = "batch_size_" + std::to_string(batch_size);
|
||||
res += "_infer_num_" + std::to_string(infer_num);
|
||||
return res;
|
||||
}
|
||||
|
||||
void TearDown() override {
|
||||
m_auto_batch_plugin.reset();
|
||||
m_model.reset();
|
||||
m_core.reset();
|
||||
m_i_compile_model_without_batch.reset();
|
||||
m_compile_model_without_batch = {};
|
||||
m_i_compile_model_with_batch.reset();
|
||||
m_compile_model_with_batch = {};
|
||||
m_auto_batch_compile_model.reset();
|
||||
m_sync_infer_request.reset();
|
||||
m_executor.reset();
|
||||
}
|
||||
|
||||
void SetUp() override {
|
||||
std::tie(m_batch_size, m_infer_request_num) = this->GetParam();
|
||||
m_model = ngraph::builder::subgraph::makeMultiSingleConv();
|
||||
m_core = std::shared_ptr<NiceMock<MockICore>>(new NiceMock<MockICore>());
|
||||
|
||||
m_auto_batch_plugin =
|
||||
std::shared_ptr<NiceMock<MockAutoBatchInferencePlugin>>(new NiceMock<MockAutoBatchInferencePlugin>());
|
||||
|
||||
m_auto_batch_plugin->set_core(m_core);
|
||||
m_i_compile_model_without_batch = std::make_shared<NiceMock<MockICompiledModel>>(m_model, m_auto_batch_plugin);
|
||||
m_compile_model_without_batch = {m_i_compile_model_without_batch, {}};
|
||||
|
||||
m_config = {{"AUTO_BATCH_TIMEOUT", "200"}};
|
||||
|
||||
m_device_info = {"CPU", {}, m_batch_size};
|
||||
m_batched_inputs = {"Parameter_0"};
|
||||
m_batched_outputs = {"Convolution_20"};
|
||||
|
||||
if (m_batch_size > 1) {
|
||||
m_i_compile_model_with_batch = std::make_shared<NiceMock<MockICompiledModel>>(m_model, m_auto_batch_plugin);
|
||||
m_compile_model_with_batch = {m_i_compile_model_with_batch, {}};
|
||||
}
|
||||
|
||||
ASSERT_NO_THROW(m_auto_batch_compile_model =
|
||||
std::make_shared<MockAutoBatchCompileModel>(m_model->clone(),
|
||||
m_auto_batch_plugin,
|
||||
m_config,
|
||||
m_device_info,
|
||||
m_batched_inputs,
|
||||
m_batched_outputs,
|
||||
m_compile_model_with_batch,
|
||||
m_compile_model_without_batch,
|
||||
m_remote_context));
|
||||
|
||||
m_sync_infer_request = std::make_shared<NiceMock<MockISyncInferRequest>>(m_i_compile_model_without_batch);
|
||||
|
||||
m_executor = std::make_shared<ov::threading::ImmediateExecutor>();
|
||||
|
||||
ON_CALL(*m_i_compile_model_without_batch, create_infer_request()).WillByDefault([this]() {
|
||||
return std::make_shared<NiceMock<MockIAsyncInferRequest>>(m_sync_infer_request, m_executor, nullptr);
|
||||
});
|
||||
|
||||
EXPECT_CALL(*m_auto_batch_compile_model, create_sync_infer_request())
|
||||
.WillRepeatedly(Return(m_sync_infer_request));
|
||||
}
|
||||
};
|
||||
|
||||
TEST_P(CompileModelCreateInferRequestTest, CreateInferRequestTestCases) {
|
||||
std::vector<std::shared_ptr<ov::IAsyncInferRequest>> inferReqs;
|
||||
std::shared_ptr<ov::IAsyncInferRequest> inferReq;
|
||||
for (int i = 0; i < m_infer_request_num; i++) {
|
||||
EXPECT_NO_THROW(inferReq = m_auto_batch_compile_model->create_infer_request());
|
||||
EXPECT_NE(inferReq, nullptr);
|
||||
inferReqs.push_back(inferReq);
|
||||
}
|
||||
inferReqs.clear();
|
||||
}
|
||||
|
||||
const std::vector<int> requests_num{1, 8, 16, 64};
|
||||
const std::vector<int> batch_size{1, 8, 16, 32, 128, 256};
|
||||
|
||||
INSTANTIATE_TEST_SUITE_P(smoke_AutoBatch_BehaviorTests,
|
||||
CompileModelCreateInferRequestTest,
|
||||
::testing::Combine(::testing::ValuesIn(batch_size), ::testing::ValuesIn(requests_num)),
|
||||
CompileModelCreateInferRequestTest::getTestCaseName);
|
@@ -0,0 +1,171 @@
|
||||
// Copyright (C) 2018-2023 Intel Corporation
|
||||
// SPDX-License-Identifier: Apache-2.0
|
||||
//
|
||||
|
||||
#include <gmock/gmock.h>
|
||||
#include <gtest/gtest.h>
|
||||
|
||||
#include "mock_common.hpp"
|
||||
#include "ngraph_functions/subgraph_builders.hpp"
|
||||
#include "openvino/core/dimension_tracker.hpp"
|
||||
#include "unit_test_utils/mocks/cpp_interfaces/interface/mock_icore.hpp"
|
||||
|
||||
using ::testing::_;
|
||||
using ::testing::AnyNumber;
|
||||
using ::testing::AtLeast;
|
||||
using ::testing::Eq;
|
||||
using ::testing::MatcherCast;
|
||||
using ::testing::Matches;
|
||||
using ::testing::NiceMock;
|
||||
using ::testing::Return;
|
||||
using ::testing::ReturnRef;
|
||||
using ::testing::StrEq;
|
||||
using ::testing::StrNe;
|
||||
using ::testing::Throw;
|
||||
|
||||
using namespace ov::mock_autobatch_plugin;
|
||||
|
||||
using get_property_param = std::tuple<std::string, // Property need to be set
|
||||
bool>; // Throw exception
|
||||
|
||||
class CompileModelGetPropertyTest : public ::testing::TestWithParam<get_property_param> {
|
||||
public:
|
||||
std::string m_properity_name;
|
||||
bool m_throw_exception;
|
||||
std::shared_ptr<NiceMock<MockICore>> m_core;
|
||||
std::shared_ptr<NiceMock<MockAutoBatchInferencePlugin>> m_plugin;
|
||||
std::shared_ptr<ov::Model> m_model;
|
||||
|
||||
ov::SoPtr<MockICompiledModel> m_mock_compile_model;
|
||||
std::shared_ptr<MockICompiledModel> m_mock_i_compile_model;
|
||||
std::shared_ptr<NiceMock<MockIPlugin>> m_hardware_plugin;
|
||||
|
||||
std::shared_ptr<ov::ICompiledModel> auto_batch_compile_model;
|
||||
|
||||
public:
|
||||
static std::string getTestCaseName(testing::TestParamInfo<get_property_param> obj) {
|
||||
std::string properity_name;
|
||||
bool throw_exception;
|
||||
std::tie(properity_name, throw_exception) = obj.param;
|
||||
|
||||
std::string res;
|
||||
res += "_" + properity_name;
|
||||
if (throw_exception)
|
||||
res += "throw";
|
||||
|
||||
return res;
|
||||
}
|
||||
|
||||
void TearDown() override {
|
||||
m_core.reset();
|
||||
m_plugin.reset();
|
||||
m_model.reset();
|
||||
m_mock_i_compile_model.reset();
|
||||
m_mock_compile_model = {};
|
||||
auto_batch_compile_model.reset();
|
||||
}
|
||||
|
||||
void SetUp() override {
|
||||
std::tie(m_properity_name, m_throw_exception) = this->GetParam();
|
||||
m_model = ngraph::builder::subgraph::makeMultiSingleConv();
|
||||
m_core = std::shared_ptr<NiceMock<MockICore>>(new NiceMock<MockICore>());
|
||||
m_plugin =
|
||||
std::shared_ptr<NiceMock<MockAutoBatchInferencePlugin>>(new NiceMock<MockAutoBatchInferencePlugin>());
|
||||
m_plugin->set_core(m_core);
|
||||
m_hardware_plugin = std::shared_ptr<NiceMock<MockIPlugin>>(new NiceMock<MockIPlugin>());
|
||||
m_mock_i_compile_model = std::make_shared<NiceMock<MockICompiledModel>>(m_model, m_hardware_plugin);
|
||||
m_mock_compile_model = {m_mock_i_compile_model, {}};
|
||||
|
||||
ON_CALL(*m_core,
|
||||
compile_model(MatcherCast<const std::shared_ptr<const ov::Model>&>(_),
|
||||
MatcherCast<const std::string&>(_),
|
||||
_))
|
||||
.WillByDefault(Return(m_mock_compile_model));
|
||||
|
||||
ON_CALL(*m_core,
|
||||
compile_model(MatcherCast<const std::shared_ptr<const ov::Model>&>(_),
|
||||
MatcherCast<const ov::SoPtr<ov::IRemoteContext>&>(_),
|
||||
_))
|
||||
.WillByDefault(Return(m_mock_compile_model));
|
||||
|
||||
ON_CALL(*m_core, get_property(_, StrEq("PERFORMANCE_HINT")))
|
||||
.WillByDefault(Return(ov::hint::PerformanceMode::THROUGHPUT));
|
||||
|
||||
ON_CALL(*m_core, get_property(_, StrEq("OPTIMAL_BATCH_SIZE"), _))
|
||||
.WillByDefault(Return(static_cast<unsigned int>(16)));
|
||||
|
||||
ON_CALL(*m_core, get_property(_, StrEq("PERFORMANCE_HINT_NUM_REQUESTS")))
|
||||
.WillByDefault(Return(static_cast<uint32_t>(12)));
|
||||
|
||||
ON_CALL(*m_core, get_property(_, StrEq("GPU_MEMORY_STATISTICS"), _))
|
||||
.WillByDefault([](const std::string& device, const std::string& key, const ov::AnyMap& options) {
|
||||
std::map<std::string, uint64_t> ret = {{"xyz", 1024}};
|
||||
return ret;
|
||||
});
|
||||
|
||||
ON_CALL(*m_core, get_property(_, StrEq("GPU_DEVICE_TOTAL_MEM_SIZE"), _)).WillByDefault(Return("10240"));
|
||||
|
||||
const ov::AnyMap configs = {{"AUTO_BATCH_TIMEOUT", "200"}, {"AUTO_BATCH_DEVICE_CONFIG", "CPU(16)"}};
|
||||
ASSERT_NO_THROW(auto_batch_compile_model = m_plugin->compile_model(m_model, configs));
|
||||
|
||||
std::string network_name = m_model.get()->get_name();
|
||||
std::vector<ov::PropertyName> supported_props = {ov::optimal_batch_size, ov::cache_dir};
|
||||
|
||||
ON_CALL(*m_mock_i_compile_model.get(), get_property(StrEq(ov::supported_properties.name())))
|
||||
.WillByDefault(Return(ov::Any(supported_props)));
|
||||
|
||||
ON_CALL(*m_mock_i_compile_model.get(), get_property(StrEq("PERFORMANCE_HINT_NUM_REQUESTS")))
|
||||
.WillByDefault(Return("0"));
|
||||
|
||||
ON_CALL(*m_mock_i_compile_model.get(), get_property(StrEq("OPTIMAL_NUMBER_OF_INFER_REQUESTS")))
|
||||
.WillByDefault(Return("12"));
|
||||
|
||||
ON_CALL(*m_mock_i_compile_model.get(), get_property(StrEq("NETWORK_NAME")))
|
||||
.WillByDefault(Return(network_name.c_str()));
|
||||
|
||||
ON_CALL(*m_mock_i_compile_model.get(), get_property(StrEq("EXECUTION_DEVICES"))).WillByDefault(Return("CPU"));
|
||||
|
||||
ON_CALL(*m_mock_i_compile_model.get(), get_property(StrEq("SUPPORTED_CONFIG_KEYS")))
|
||||
.WillByDefault(Return("CPU"));
|
||||
|
||||
ON_CALL(*m_mock_i_compile_model.get(), get_property(StrEq("SUPPORTED_CONFIG_KEYS")))
|
||||
.WillByDefault([](const std::string& name) {
|
||||
std::vector<std::string> res_config;
|
||||
res_config.emplace_back("CACHE_DIR");
|
||||
res_config.emplace_back("OPTIMAL_BATCH_SIZE");
|
||||
return res_config;
|
||||
});
|
||||
|
||||
ON_CALL(*m_mock_i_compile_model.get(), get_property(StrEq("CACHE_DIR"))).WillByDefault(Return("./abc"));
|
||||
|
||||
ON_CALL(*m_mock_i_compile_model.get(), get_property(StrEq("OPTIMAL_BATCH_SIZE"))).WillByDefault(Return("16"));
|
||||
}
|
||||
};
|
||||
|
||||
TEST_P(CompileModelGetPropertyTest, CompileModelGetPropertyTestCase) {
|
||||
if (m_throw_exception)
|
||||
ASSERT_ANY_THROW(auto_batch_compile_model->get_property(m_properity_name));
|
||||
else
|
||||
ASSERT_NO_THROW(auto_batch_compile_model->get_property(m_properity_name));
|
||||
}
|
||||
|
||||
const std::vector<get_property_param> compile_model_get_property_param_test = {
|
||||
get_property_param{METRIC_KEY(OPTIMAL_NUMBER_OF_INFER_REQUESTS), false},
|
||||
get_property_param{METRIC_KEY(NETWORK_NAME), false},
|
||||
get_property_param{METRIC_KEY(SUPPORTED_METRICS), false},
|
||||
get_property_param{METRIC_KEY(SUPPORTED_CONFIG_KEYS), false},
|
||||
get_property_param{ov::execution_devices.name(), false},
|
||||
get_property_param{CONFIG_KEY(AUTO_BATCH_DEVICE_CONFIG), false},
|
||||
get_property_param{CONFIG_KEY(AUTO_BATCH_TIMEOUT), false},
|
||||
get_property_param{CONFIG_KEY(CACHE_DIR), false},
|
||||
// Config handled by the dependent (underlying) plugin
|
||||
get_property_param{"OPTIMAL_BATCH_SIZE", false},
|
||||
// Incorrect Property
|
||||
get_property_param{"INCORRECT_METRIC", true},
|
||||
get_property_param{"INCORRECT_CONFIG", true},
|
||||
};
|
||||
|
||||
INSTANTIATE_TEST_SUITE_P(smoke_AutoBatch_BehaviorTests,
|
||||
CompileModelGetPropertyTest,
|
||||
::testing::ValuesIn(compile_model_get_property_param_test),
|
||||
CompileModelGetPropertyTest::getTestCaseName);
|
@@ -0,0 +1,99 @@
|
||||
// Copyright (C) 2018-2023 Intel Corporation
|
||||
// SPDX-License-Identifier: Apache-2.0
|
||||
//
|
||||
|
||||
#include <gmock/gmock.h>
|
||||
#include <gtest/gtest.h>
|
||||
|
||||
#include "mock_common.hpp"
|
||||
#include "ngraph_functions/subgraph_builders.hpp"
|
||||
#include "openvino/core/dimension_tracker.hpp"
|
||||
#include "unit_test_utils/mocks/cpp_interfaces/interface/mock_icore.hpp"
|
||||
|
||||
using ::testing::_;
|
||||
using ::testing::AnyNumber;
|
||||
using ::testing::AtLeast;
|
||||
using ::testing::Eq;
|
||||
using ::testing::MatcherCast;
|
||||
using ::testing::Matches;
|
||||
using ::testing::NiceMock;
|
||||
using ::testing::Return;
|
||||
using ::testing::ReturnRef;
|
||||
using ::testing::StrEq;
|
||||
using ::testing::StrNe;
|
||||
using ::testing::Throw;
|
||||
|
||||
using namespace ov::mock_autobatch_plugin;
|
||||
|
||||
class CompileModelGetRuntimeModelTest : public ::testing::Test {
|
||||
public:
|
||||
std::shared_ptr<NiceMock<MockICore>> m_core;
|
||||
std::shared_ptr<NiceMock<MockAutoBatchInferencePlugin>> m_plugin;
|
||||
std::shared_ptr<ov::Model> m_model;
|
||||
|
||||
ov::SoPtr<MockICompiledModel> m_mock_compile_model;
|
||||
std::shared_ptr<MockICompiledModel> m_mock_i_compile_model;
|
||||
std::shared_ptr<ov::ICompiledModel> m_auto_batch_compile_model;
|
||||
|
||||
std::shared_ptr<NiceMock<MockIPlugin>> m_hardware_plugin;
|
||||
|
||||
public:
|
||||
void TearDown() override {
|
||||
m_core.reset();
|
||||
m_plugin.reset();
|
||||
m_model.reset();
|
||||
m_mock_i_compile_model.reset();
|
||||
m_mock_compile_model = {};
|
||||
m_auto_batch_compile_model.reset();
|
||||
}
|
||||
|
||||
void SetUp() override {
|
||||
m_model = ngraph::builder::subgraph::makeMultiSingleConv();
|
||||
m_core = std::shared_ptr<NiceMock<MockICore>>(new NiceMock<MockICore>());
|
||||
m_plugin =
|
||||
std::shared_ptr<NiceMock<MockAutoBatchInferencePlugin>>(new NiceMock<MockAutoBatchInferencePlugin>());
|
||||
m_plugin->set_core(m_core);
|
||||
m_hardware_plugin = std::shared_ptr<NiceMock<MockIPlugin>>(new NiceMock<MockIPlugin>());
|
||||
m_mock_i_compile_model = std::make_shared<NiceMock<MockICompiledModel>>(m_model, m_hardware_plugin);
|
||||
m_mock_compile_model = {m_mock_i_compile_model, {}};
|
||||
|
||||
ON_CALL(*m_core,
|
||||
compile_model(MatcherCast<const std::shared_ptr<const ov::Model>&>(_),
|
||||
MatcherCast<const std::string&>(_),
|
||||
_))
|
||||
.WillByDefault(Return(m_mock_compile_model));
|
||||
|
||||
ON_CALL(*m_core,
|
||||
compile_model(MatcherCast<const std::shared_ptr<const ov::Model>&>(_),
|
||||
MatcherCast<const ov::SoPtr<ov::IRemoteContext>&>(_),
|
||||
_))
|
||||
.WillByDefault(Return(m_mock_compile_model));
|
||||
|
||||
ON_CALL(*m_core, get_property(_, StrEq("PERFORMANCE_HINT")))
|
||||
.WillByDefault(Return(ov::hint::PerformanceMode::THROUGHPUT));
|
||||
|
||||
ON_CALL(*m_core, get_property(_, StrEq("OPTIMAL_BATCH_SIZE"), _))
|
||||
.WillByDefault(Return(static_cast<unsigned int>(16)));
|
||||
|
||||
ON_CALL(*m_core, get_property(_, StrEq("PERFORMANCE_HINT_NUM_REQUESTS")))
|
||||
.WillByDefault(Return(static_cast<uint32_t>(12)));
|
||||
|
||||
ON_CALL(*m_core, get_property(_, StrEq("GPU_MEMORY_STATISTICS"), _))
|
||||
.WillByDefault([](const std::string& device, const std::string& key, const ov::AnyMap& options) {
|
||||
std::map<std::string, uint64_t> ret = {{"xyz", 1024}};
|
||||
return ret;
|
||||
});
|
||||
|
||||
ON_CALL(*m_core, get_property(_, StrEq("GPU_DEVICE_TOTAL_MEM_SIZE"), _)).WillByDefault(Return("10240"));
|
||||
|
||||
ON_CALL(*m_mock_i_compile_model.get(), get_runtime_model()).WillByDefault(Return(m_model));
|
||||
|
||||
const ov::AnyMap configs = {{"AUTO_BATCH_TIMEOUT", "200"}, {"AUTO_BATCH_DEVICE_CONFIG", "CPU(16)"}};
|
||||
|
||||
ASSERT_NO_THROW(m_auto_batch_compile_model = m_plugin->compile_model(m_model, configs));
|
||||
}
|
||||
};
|
||||
|
||||
TEST_F(CompileModelGetRuntimeModelTest, CompileModelGetRuntimeModelTestCase) {
|
||||
ASSERT_NO_THROW(m_auto_batch_compile_model->get_runtime_model());
|
||||
}
|
@@ -0,0 +1,132 @@
|
||||
// Copyright (C) 2018-2023 Intel Corporation
|
||||
// SPDX-License-Identifier: Apache-2.0
|
||||
//
|
||||
|
||||
#include <gmock/gmock.h>
|
||||
#include <gtest/gtest.h>
|
||||
|
||||
#include "mock_common.hpp"
|
||||
#include "ngraph_functions/subgraph_builders.hpp"
|
||||
#include "openvino/core/dimension_tracker.hpp"
|
||||
#include "unit_test_utils/mocks/cpp_interfaces/interface/mock_icore.hpp"
|
||||
|
||||
using ::testing::_;
|
||||
using ::testing::AnyNumber;
|
||||
using ::testing::AtLeast;
|
||||
using ::testing::Eq;
|
||||
using ::testing::MatcherCast;
|
||||
using ::testing::Matches;
|
||||
using ::testing::NiceMock;
|
||||
using ::testing::Return;
|
||||
using ::testing::ReturnRef;
|
||||
using ::testing::StrEq;
|
||||
using ::testing::StrNe;
|
||||
using ::testing::Throw;
|
||||
|
||||
using namespace ov::mock_autobatch_plugin;
|
||||
|
||||
using set_property_param = std::tuple<ov::AnyMap, // Property need to be set
|
||||
bool>; // Throw exception
|
||||
|
||||
class CompileModelSetPropertyTest : public ::testing::TestWithParam<set_property_param> {
|
||||
public:
|
||||
ov::AnyMap m_properities;
|
||||
bool m_throw_exception;
|
||||
std::shared_ptr<NiceMock<MockICore>> m_core;
|
||||
std::shared_ptr<NiceMock<MockAutoBatchInferencePlugin>> m_plugin;
|
||||
std::shared_ptr<ov::Model> m_model;
|
||||
|
||||
// Mock execNetwork
|
||||
ov::SoPtr<MockICompiledModel> m_mock_compile_model;
|
||||
std::shared_ptr<MockICompiledModel> m_mock_i_compile_model;
|
||||
std::shared_ptr<NiceMock<MockIPlugin>> m_hardware_plugin;
|
||||
|
||||
std::shared_ptr<ov::ICompiledModel> m_auto_batch_compile_model;
|
||||
|
||||
public:
|
||||
static std::string getTestCaseName(testing::TestParamInfo<set_property_param> obj) {
|
||||
ov::AnyMap properities;
|
||||
bool throw_exception;
|
||||
std::tie(properities, throw_exception) = obj.param;
|
||||
|
||||
std::string res;
|
||||
for (auto& c : properities) {
|
||||
res += "_" + c.first + "_" + c.second.as<std::string>();
|
||||
}
|
||||
if (throw_exception)
|
||||
res += "throw";
|
||||
|
||||
return res;
|
||||
}
|
||||
|
||||
void TearDown() override {
|
||||
m_core.reset();
|
||||
m_plugin.reset();
|
||||
m_model.reset();
|
||||
m_mock_i_compile_model.reset();
|
||||
m_mock_compile_model = {};
|
||||
m_auto_batch_compile_model.reset();
|
||||
}
|
||||
|
||||
void SetUp() override {
|
||||
std::tie(m_properities, m_throw_exception) = this->GetParam();
|
||||
m_model = ngraph::builder::subgraph::makeMultiSingleConv();
|
||||
m_core = std::shared_ptr<NiceMock<MockICore>>(new NiceMock<MockICore>());
|
||||
m_plugin =
|
||||
std::shared_ptr<NiceMock<MockAutoBatchInferencePlugin>>(new NiceMock<MockAutoBatchInferencePlugin>());
|
||||
m_plugin->set_core(m_core);
|
||||
m_hardware_plugin = std::shared_ptr<NiceMock<MockIPlugin>>(new NiceMock<MockIPlugin>());
|
||||
m_mock_i_compile_model = std::make_shared<NiceMock<MockICompiledModel>>(m_model, m_hardware_plugin);
|
||||
m_mock_compile_model = {m_mock_i_compile_model, {}};
|
||||
|
||||
ON_CALL(*m_core,
|
||||
compile_model(MatcherCast<const std::shared_ptr<const ov::Model>&>(_),
|
||||
MatcherCast<const std::string&>(_),
|
||||
_))
|
||||
.WillByDefault(Return(m_mock_compile_model));
|
||||
|
||||
ON_CALL(*m_core,
|
||||
compile_model(MatcherCast<const std::shared_ptr<const ov::Model>&>(_),
|
||||
MatcherCast<const ov::SoPtr<ov::IRemoteContext>&>(_),
|
||||
_))
|
||||
.WillByDefault(Return(m_mock_compile_model));
|
||||
|
||||
ON_CALL(*m_core, get_property(_, StrEq("PERFORMANCE_HINT")))
|
||||
.WillByDefault(Return(ov::hint::PerformanceMode::THROUGHPUT));
|
||||
|
||||
ON_CALL(*m_core, get_property(_, StrEq("OPTIMAL_BATCH_SIZE"), _))
|
||||
.WillByDefault(Return(static_cast<unsigned int>(16)));
|
||||
|
||||
ON_CALL(*m_core, get_property(_, StrEq("PERFORMANCE_HINT_NUM_REQUESTS")))
|
||||
.WillByDefault(Return(static_cast<uint32_t>(12)));
|
||||
|
||||
ON_CALL(*m_core, get_property(_, StrEq("GPU_MEMORY_STATISTICS"), _))
|
||||
.WillByDefault([](const std::string& device, const std::string& key, const ov::AnyMap& options) {
|
||||
std::map<std::string, uint64_t> ret = {{"xyz", 1024}};
|
||||
return ret;
|
||||
});
|
||||
|
||||
ON_CALL(*m_core, get_property(_, StrEq("GPU_DEVICE_TOTAL_MEM_SIZE"), _)).WillByDefault(Return("10240"));
|
||||
|
||||
const ov::AnyMap configs = {{"AUTO_BATCH_TIMEOUT", "200"}, {"AUTO_BATCH_DEVICE_CONFIG", "CPU(16)"}};
|
||||
|
||||
ASSERT_NO_THROW(m_auto_batch_compile_model = m_plugin->compile_model(m_model, configs));
|
||||
}
|
||||
};
|
||||
|
||||
TEST_P(CompileModelSetPropertyTest, CompileModelSetPropertyTestCase) {
|
||||
if (m_throw_exception)
|
||||
ASSERT_ANY_THROW(m_auto_batch_compile_model->set_property(m_properities));
|
||||
else
|
||||
ASSERT_NO_THROW(m_auto_batch_compile_model->set_property(m_properities));
|
||||
}
|
||||
|
||||
const std::vector<set_property_param> compile_model_set_property_param_test = {
|
||||
set_property_param{{{CONFIG_KEY(AUTO_BATCH_TIMEOUT), std::uint32_t(100)}}, false},
|
||||
set_property_param{{{"INCORRECT_CONFIG", 2}}, true},
|
||||
};
|
||||
|
||||
INSTANTIATE_TEST_SUITE_P(smoke_AutoBatch_BehaviorTests,
|
||||
CompileModelSetPropertyTest,
|
||||
::testing::ValuesIn(compile_model_set_property_param_test),
|
||||
CompileModelSetPropertyTest::getTestCaseName);
|
@@ -1,133 +0,0 @@
|
||||
// Copyright (C) 2018-2023 Intel Corporation
|
||||
// SPDX-License-Identifier: Apache-2.0
|
||||
//
|
||||
|
||||
#include <gmock/gmock.h>
|
||||
#include <gtest/gtest.h>
|
||||
|
||||
#include "cpp_interfaces/interface/ie_iplugin_internal.hpp"
|
||||
#include "mock_auto_batch_plugin.hpp"
|
||||
#include "ngraph_functions/subgraph_builders.hpp"
|
||||
#include "unit_test_utils/mocks/cpp_interfaces/impl/mock_inference_plugin_internal.hpp"
|
||||
#include "unit_test_utils/mocks/cpp_interfaces/interface/mock_icore.hpp"
|
||||
#include "unit_test_utils/mocks/cpp_interfaces/interface/mock_iexecutable_network_internal.hpp"
|
||||
#include "unit_test_utils/mocks/cpp_interfaces/interface/mock_iinference_plugin.hpp"
|
||||
#include "unit_test_utils/mocks/cpp_interfaces/interface/mock_ivariable_state_internal.hpp"
|
||||
|
||||
using ::testing::_;
|
||||
using ::testing::AnyNumber;
|
||||
using ::testing::AtLeast;
|
||||
using ::testing::Eq;
|
||||
using ::testing::MatcherCast;
|
||||
using ::testing::Matches;
|
||||
using ::testing::NiceMock;
|
||||
using ::testing::Return;
|
||||
using ::testing::ReturnRef;
|
||||
using ::testing::StrEq;
|
||||
using ::testing::StrNe;
|
||||
using ::testing::Throw;
|
||||
using namespace ov::mock_autobatch_plugin;
|
||||
using namespace InferenceEngine;
|
||||
|
||||
using CreateInferRequestTestParams = std::tuple<int, // batch_size
|
||||
int>; // inferReq number
|
||||
class CreateInferRequestTest : public ::testing::TestWithParam<CreateInferRequestTestParams> {
|
||||
public:
|
||||
std::shared_ptr<NiceMock<MockICore>> core;
|
||||
std::shared_ptr<NiceMock<MockAutoBatchInferencePlugin>> plugin;
|
||||
|
||||
// Mock execNetwork
|
||||
std::shared_ptr<NiceMock<MockIExecutableNetworkInternal>> mockIExecNet;
|
||||
ov::SoPtr<IExecutableNetworkInternal> mockExecNetwork;
|
||||
std::shared_ptr<NiceMock<MockIInferencePlugin>> mockIPlugin;
|
||||
std::shared_ptr<InferenceEngine::IInferencePlugin> mockPlugin;
|
||||
ov::SoPtr<IExecutableNetworkInternal> batchedExecNetwork;
|
||||
|
||||
std::shared_ptr<CompiledModel> actualExecNet;
|
||||
std::vector<std::shared_ptr<NiceMock<MockIInferRequestInternal>>> inferRequestVec;
|
||||
|
||||
public:
|
||||
static std::string getTestCaseName(testing::TestParamInfo<CreateInferRequestTestParams> obj) {
|
||||
int batch_size;
|
||||
int infer_num;
|
||||
std::tie(batch_size, infer_num) = obj.param;
|
||||
|
||||
std::string res;
|
||||
res = "batch_size_" + std::to_string(batch_size);
|
||||
res += "_infer_num_" + std::to_string(infer_num);
|
||||
return res;
|
||||
}
|
||||
|
||||
void TearDown() override {
|
||||
core.reset();
|
||||
plugin.reset();
|
||||
mockIExecNet.reset();
|
||||
mockExecNetwork = {};
|
||||
batchedExecNetwork = {};
|
||||
mockPlugin = {};
|
||||
actualExecNet.reset();
|
||||
inferRequestVec.clear();
|
||||
}
|
||||
|
||||
void SetUp() override {
|
||||
mockIExecNet = std::make_shared<NiceMock<MockIExecutableNetworkInternal>>();
|
||||
mockIPlugin = std::make_shared<NiceMock<MockIInferencePlugin>>();
|
||||
ON_CALL(*mockIPlugin, LoadNetwork(MatcherCast<const CNNNetwork&>(_), _)).WillByDefault(Return(mockIExecNet));
|
||||
mockPlugin = mockIPlugin;
|
||||
mockExecNetwork =
|
||||
ov::SoPtr<InferenceEngine::IExecutableNetworkInternal>(mockPlugin->LoadNetwork(CNNNetwork{}, {}), {});
|
||||
batchedExecNetwork = {};
|
||||
|
||||
core = std::shared_ptr<NiceMock<MockICore>>(new NiceMock<MockICore>());
|
||||
plugin = std::shared_ptr<NiceMock<MockAutoBatchInferencePlugin>>(new NiceMock<MockAutoBatchInferencePlugin>());
|
||||
plugin->SetCore(core);
|
||||
|
||||
// Create inferRequest
|
||||
ON_CALL(*mockIExecNet.get(), CreateInferRequest()).WillByDefault([this]() {
|
||||
auto inferReq = std::make_shared<NiceMock<MockIInferRequestInternal>>();
|
||||
inferRequestVec.push_back(inferReq);
|
||||
return inferReq;
|
||||
});
|
||||
}
|
||||
|
||||
CompiledModel::Ptr createAutoBatchExecutableNetwork(int batch_size) {
|
||||
DeviceInformation metaDevice = {"CPU", {}, batch_size};
|
||||
std::unordered_map<std::string, InferenceEngine::Parameter> config = {{CONFIG_KEY(AUTO_BATCH_TIMEOUT), "200"}};
|
||||
std::set<std::string> batched_inputs = {"Parameter_0"};
|
||||
std::set<std::string> batched_outputs = {"Convolution_20"};
|
||||
|
||||
if (batch_size > 1)
|
||||
batchedExecNetwork =
|
||||
ov::SoPtr<InferenceEngine::IExecutableNetworkInternal>(mockPlugin->LoadNetwork(CNNNetwork{}, {}), {});
|
||||
return std::make_shared<CompiledModel>(batchedExecNetwork,
|
||||
mockExecNetwork,
|
||||
metaDevice,
|
||||
config,
|
||||
batched_inputs,
|
||||
batched_outputs);
|
||||
}
|
||||
};
|
||||
|
||||
TEST_P(CreateInferRequestTest, CreateInferRequestTestCases) {
|
||||
int batch_size;
|
||||
int infer_num;
|
||||
std::tie(batch_size, infer_num) = this->GetParam();
|
||||
|
||||
actualExecNet = createAutoBatchExecutableNetwork(batch_size);
|
||||
std::vector<InferenceEngine::IInferRequestInternal::Ptr> inferReqs;
|
||||
InferenceEngine::IInferRequestInternal::Ptr inferReq;
|
||||
for (int i = 0; i < infer_num; i++) {
|
||||
EXPECT_NO_THROW(inferReq = actualExecNet->CreateInferRequest());
|
||||
EXPECT_NE(inferReq, nullptr);
|
||||
inferReqs.push_back(inferReq);
|
||||
}
|
||||
inferReqs.clear();
|
||||
}
|
||||
|
||||
const std::vector<int> requests_num{1, 8, 16, 64};
|
||||
const std::vector<int> batch_size{1, 8, 16, 32, 128, 256};
|
||||
|
||||
INSTANTIATE_TEST_SUITE_P(smoke_AutoBatch_BehaviorTests,
|
||||
CreateInferRequestTest,
|
||||
::testing::Combine(::testing::ValuesIn(batch_size), ::testing::ValuesIn(requests_num)),
|
||||
CreateInferRequestTest::getTestCaseName);
|
@@ -1,201 +0,0 @@
|
||||
// Copyright (C) 2018-2023 Intel Corporation
|
||||
// SPDX-License-Identifier: Apache-2.0
|
||||
//
|
||||
|
||||
#include <gmock/gmock.h>
|
||||
#include <gtest/gtest.h>
|
||||
|
||||
#include "cpp_interfaces/interface/ie_iplugin_internal.hpp"
|
||||
#include "mock_auto_batch_plugin.hpp"
|
||||
#include "ngraph_functions/subgraph_builders.hpp"
|
||||
#include "unit_test_utils/mocks/cpp_interfaces/impl/mock_inference_plugin_internal.hpp"
|
||||
#include "unit_test_utils/mocks/cpp_interfaces/interface/mock_icore.hpp"
|
||||
#include "unit_test_utils/mocks/cpp_interfaces/interface/mock_iexecutable_network_internal.hpp"
|
||||
#include "unit_test_utils/mocks/cpp_interfaces/interface/mock_iinference_plugin.hpp"
|
||||
#include "unit_test_utils/mocks/cpp_interfaces/interface/mock_ivariable_state_internal.hpp"
|
||||
|
||||
using ::testing::_;
|
||||
using ::testing::AnyNumber;
|
||||
using ::testing::AtLeast;
|
||||
using ::testing::Eq;
|
||||
using ::testing::MatcherCast;
|
||||
using ::testing::Matches;
|
||||
using ::testing::NiceMock;
|
||||
using ::testing::Return;
|
||||
using ::testing::ReturnRef;
|
||||
using ::testing::StrEq;
|
||||
using ::testing::StrNe;
|
||||
using ::testing::Throw;
|
||||
using namespace ov::mock_autobatch_plugin;
|
||||
using namespace InferenceEngine;
|
||||
|
||||
using ExecNetworkParams = std::tuple<std::string, // Key name
|
||||
int, // GetMetric(0) or GetConfig(1) or SetConfig(3)
|
||||
bool>; // Throw exception
|
||||
class ExecNetworkTest : public ::testing::TestWithParam<ExecNetworkParams> {
|
||||
public:
|
||||
std::shared_ptr<NiceMock<MockICore>> core;
|
||||
std::shared_ptr<NiceMock<MockAutoBatchInferencePlugin>> plugin;
|
||||
|
||||
// Mock execNetwork
|
||||
std::shared_ptr<NiceMock<MockIExecutableNetworkInternal>> mockIExecNet;
|
||||
ov::SoPtr<IExecutableNetworkInternal> mockExecNetwork;
|
||||
std::shared_ptr<NiceMock<MockIInferencePlugin>> mockIPlugin;
|
||||
std::shared_ptr<InferenceEngine::IInferencePlugin> mockPlugin;
|
||||
|
||||
InferenceEngine::IExecutableNetworkInternal::Ptr actualExecNet;
|
||||
|
||||
public:
|
||||
static std::string getTestCaseName(testing::TestParamInfo<ExecNetworkParams> obj) {
|
||||
std::string name;
|
||||
bool throw_exception;
|
||||
int action;
|
||||
std::tie(name, action, throw_exception) = obj.param;
|
||||
|
||||
std::string res;
|
||||
switch (action) {
|
||||
case 0:
|
||||
res += "GetMetric_" + name;
|
||||
break;
|
||||
case 1:
|
||||
res += "GetConfig_" + name;
|
||||
break;
|
||||
case 3:
|
||||
res += "SetConfig_" + name;
|
||||
break;
|
||||
default:
|
||||
res += "error_" + name;
|
||||
}
|
||||
|
||||
if (throw_exception)
|
||||
res += "throw";
|
||||
|
||||
return res;
|
||||
}
|
||||
|
||||
void TearDown() override {
|
||||
core.reset();
|
||||
plugin.reset();
|
||||
mockIExecNet.reset();
|
||||
mockExecNetwork = {};
|
||||
mockPlugin = {};
|
||||
actualExecNet.reset();
|
||||
}
|
||||
|
||||
void SetUp() override {
|
||||
mockIExecNet = std::make_shared<NiceMock<MockIExecutableNetworkInternal>>();
|
||||
auto mockIPluginPtr = std::make_shared<NiceMock<MockIInferencePlugin>>();
|
||||
ON_CALL(*mockIPluginPtr, LoadNetwork(MatcherCast<const CNNNetwork&>(_), _)).WillByDefault(Return(mockIExecNet));
|
||||
mockPlugin = mockIPluginPtr;
|
||||
EXPECT_CALL(*mockIPluginPtr, LoadNetwork(MatcherCast<const CNNNetwork&>(_), _)).Times(1);
|
||||
mockExecNetwork =
|
||||
ov::SoPtr<InferenceEngine::IExecutableNetworkInternal>(mockPlugin->LoadNetwork(CNNNetwork{}, {}), {});
|
||||
|
||||
core = std::shared_ptr<NiceMock<MockICore>>(new NiceMock<MockICore>());
|
||||
plugin = std::shared_ptr<NiceMock<MockAutoBatchInferencePlugin>>(new NiceMock<MockAutoBatchInferencePlugin>());
|
||||
plugin->SetCore(core);
|
||||
|
||||
ON_CALL(*plugin, ParseBatchDevice).WillByDefault([this](const std::string& batchDevice) {
|
||||
return plugin->Plugin::ParseBatchDevice(batchDevice);
|
||||
});
|
||||
ON_CALL(*core, LoadNetwork(MatcherCast<const CNNNetwork&>(_), MatcherCast<const std::string&>(_), _))
|
||||
.WillByDefault(Return(mockExecNetwork));
|
||||
ON_CALL(*core,
|
||||
LoadNetwork(MatcherCast<const CNNNetwork&>(_),
|
||||
MatcherCast<const std::shared_ptr<InferenceEngine::RemoteContext>&>(_),
|
||||
_))
|
||||
.WillByDefault(Return(mockExecNetwork));
|
||||
ON_CALL(*core, GetConfig(_, StrEq("PERFORMANCE_HINT"))).WillByDefault(Return("THROUGHPUT"));
|
||||
ON_CALL(*core, GetMetric(_, StrEq("OPTIMAL_BATCH_SIZE"), _)).WillByDefault(Return("16"));
|
||||
ON_CALL(*core, GetConfig(_, StrEq("PERFORMANCE_HINT_NUM_REQUESTS"))).WillByDefault(Return("12"));
|
||||
ON_CALL(*core, GetMetric(_, StrEq("GPU_MEMORY_STATISTICS"), _))
|
||||
.WillByDefault([](const std::string& device, const std::string& key, const ov::AnyMap& options) {
|
||||
std::map<std::string, uint64_t> ret = {{"xyz", 1024}};
|
||||
return ret;
|
||||
});
|
||||
ON_CALL(*core, GetMetric(_, StrEq("GPU_DEVICE_TOTAL_MEM_SIZE"), _)).WillByDefault(Return("10240"));
|
||||
auto graph = ngraph::builder::subgraph::makeMultiSingleConv();
|
||||
auto net = CNNNetwork(graph);
|
||||
|
||||
const std::map<std::string, std::string> configs = {{"AUTO_BATCH_TIMEOUT", "200"},
|
||||
{"AUTO_BATCH_DEVICE_CONFIG", "CPU(16)"}};
|
||||
ASSERT_NO_THROW(actualExecNet = plugin->LoadNetworkImpl(net, {}, configs));
|
||||
|
||||
ON_CALL(*mockIExecNet, GetConfig(StrEq("PERFORMANCE_HINT_NUM_REQUESTS"))).WillByDefault(Return("0"));
|
||||
ON_CALL(*mockIExecNet, GetMetric(StrEq("OPTIMAL_NUMBER_OF_INFER_REQUESTS"))).WillByDefault(Return("12"));
|
||||
ON_CALL(*mockIExecNet, GetMetric(StrEq("NETWORK_NAME"))).WillByDefault(Return("network_name"));
|
||||
ON_CALL(*mockIExecNet, GetMetric(StrEq("EXECUTION_DEVICES"))).WillByDefault(Return("CPU"));
|
||||
ON_CALL(*mockIExecNet, GetMetric(StrEq("SUPPORTED_CONFIG_KEYS"))).WillByDefault(Return("CPU"));
|
||||
ON_CALL(*mockIExecNet, GetMetric(StrEq("SUPPORTED_CONFIG_KEYS"))).WillByDefault([](const std::string& name) {
|
||||
std::vector<std::string> res_config;
|
||||
res_config.emplace_back("CACHE_DIR");
|
||||
res_config.emplace_back("OPTIMAL_BATCH_SIZE");
|
||||
return res_config;
|
||||
});
|
||||
ON_CALL(*mockIExecNet, GetConfig(StrEq("CACHE_DIR"))).WillByDefault(Return("./abc"));
|
||||
ON_CALL(*mockIExecNet, GetConfig(StrEq("OPTIMAL_BATCH_SIZE"))).WillByDefault(Return("16"));
|
||||
}
|
||||
};
|
||||
|
||||
TEST_P(ExecNetworkTest, ExecNetworkGetConfigMetricTestCase) {
|
||||
std::string name;
|
||||
bool throw_exception;
|
||||
int action;
|
||||
std::tie(name, action, throw_exception) = this->GetParam();
|
||||
|
||||
std::map<std::string, InferenceEngine::Parameter> config;
|
||||
|
||||
switch (action) {
|
||||
case 0: {
|
||||
if (throw_exception)
|
||||
ASSERT_ANY_THROW(actualExecNet->GetMetric(name));
|
||||
else
|
||||
ASSERT_NO_THROW(actualExecNet->GetMetric(name));
|
||||
break;
|
||||
}
|
||||
case 1: {
|
||||
if (throw_exception)
|
||||
ASSERT_ANY_THROW(actualExecNet->GetConfig(name));
|
||||
else
|
||||
ASSERT_NO_THROW(actualExecNet->GetConfig(name));
|
||||
break;
|
||||
}
|
||||
case 3: {
|
||||
config[name] = InferenceEngine::Parameter(100);
|
||||
if (throw_exception)
|
||||
ASSERT_ANY_THROW(actualExecNet->SetConfig(config));
|
||||
else
|
||||
ASSERT_NO_THROW(actualExecNet->SetConfig(config));
|
||||
break;
|
||||
}
|
||||
default:
|
||||
break;
|
||||
}
|
||||
}
|
||||
|
||||
const std::vector<ExecNetworkParams> testConfigs = {
|
||||
// Metric
|
||||
ExecNetworkParams{METRIC_KEY(OPTIMAL_NUMBER_OF_INFER_REQUESTS), 0, false},
|
||||
ExecNetworkParams{METRIC_KEY(NETWORK_NAME), 0, false},
|
||||
ExecNetworkParams{METRIC_KEY(SUPPORTED_METRICS), 0, false},
|
||||
ExecNetworkParams{METRIC_KEY(SUPPORTED_CONFIG_KEYS), 0, false},
|
||||
ExecNetworkParams{ov::execution_devices.name(), 0, false},
|
||||
// Config in autobatch
|
||||
ExecNetworkParams{CONFIG_KEY(AUTO_BATCH_DEVICE_CONFIG), 1, false},
|
||||
ExecNetworkParams{CONFIG_KEY(AUTO_BATCH_TIMEOUT), 1, false},
|
||||
ExecNetworkParams{CONFIG_KEY(CACHE_DIR), 1, false},
|
||||
// Config in dependent plugin
|
||||
ExecNetworkParams{"OPTIMAL_BATCH_SIZE", 1, false},
|
||||
// Incorrect Metric
|
||||
ExecNetworkParams{"INCORRECT_METRIC", 0, true},
|
||||
// Incorrect config
|
||||
ExecNetworkParams{"INCORRECT_CONFIG", 1, true},
|
||||
// Set Config
|
||||
ExecNetworkParams{CONFIG_KEY(AUTO_BATCH_TIMEOUT), 2, false},
|
||||
ExecNetworkParams{"INCORRECT_CONFIG", 2, true},
|
||||
};
|
||||
|
||||
INSTANTIATE_TEST_SUITE_P(smoke_AutoBatch_BehaviorTests,
|
||||
ExecNetworkTest,
|
||||
::testing::ValuesIn(testConfigs),
|
||||
ExecNetworkTest::getTestCaseName);
|
@@ -1,313 +0,0 @@
|
||||
// Copyright (C) 2018-2023 Intel Corporation
|
||||
// SPDX-License-Identifier: Apache-2.0
|
||||
//
|
||||
|
||||
#include <gmock/gmock.h>
|
||||
#include <gtest/gtest.h>
|
||||
|
||||
#include "cpp_interfaces/interface/ie_iplugin_internal.hpp"
|
||||
#include "mock_auto_batch_plugin.hpp"
|
||||
#include "ngraph_functions/subgraph_builders.hpp"
|
||||
#include "openvino/core/dimension_tracker.hpp"
|
||||
#include "unit_test_utils/mocks/cpp_interfaces/impl/mock_inference_plugin_internal.hpp"
|
||||
#include "unit_test_utils/mocks/cpp_interfaces/interface/mock_icore.hpp"
|
||||
#include "unit_test_utils/mocks/cpp_interfaces/interface/mock_iexecutable_network_internal.hpp"
|
||||
#include "unit_test_utils/mocks/cpp_interfaces/interface/mock_iinference_plugin.hpp"
|
||||
#include "unit_test_utils/mocks/cpp_interfaces/interface/mock_ivariable_state_internal.hpp"
|
||||
|
||||
using ::testing::_;
|
||||
using ::testing::AnyNumber;
|
||||
using ::testing::AtLeast;
|
||||
using ::testing::Eq;
|
||||
using ::testing::MatcherCast;
|
||||
using ::testing::Matches;
|
||||
using ::testing::NiceMock;
|
||||
using ::testing::Return;
|
||||
using ::testing::ReturnRef;
|
||||
using ::testing::StrEq;
|
||||
using ::testing::StrNe;
|
||||
using ::testing::Throw;
|
||||
using namespace ov::mock_autobatch_plugin;
|
||||
using namespace InferenceEngine;
|
||||
|
||||
using PluginLoadNetworkParams = std::tuple<std::map<std::string, std::string>, // Parameters
|
||||
std::map<std::string, std::string>, // Config
|
||||
int>; // Batch Size
|
||||
class PluginLoadNetworkTest : public ::testing::TestWithParam<PluginLoadNetworkParams> {
|
||||
public:
|
||||
std::shared_ptr<NiceMock<MockICore>> core;
|
||||
std::shared_ptr<NiceMock<MockAutoBatchInferencePlugin>> plugin;
|
||||
|
||||
// Mock CPU execNetwork
|
||||
std::shared_ptr<NiceMock<MockIExecutableNetworkInternal>> cpuMockIExecNet;
|
||||
ov::SoPtr<IExecutableNetworkInternal> cpuMockExecNetwork;
|
||||
std::shared_ptr<NiceMock<MockIInferencePlugin>> cpuMockIPlugin;
|
||||
std::shared_ptr<InferenceEngine::IInferencePlugin> cpuMockPlugin;
|
||||
|
||||
public:
|
||||
static std::string getTestCaseName(testing::TestParamInfo<PluginLoadNetworkParams> obj) {
|
||||
std::map<std::string, std::string> params;
|
||||
std::map<std::string, std::string> configs;
|
||||
int batch_size;
|
||||
std::tie(params, configs, batch_size) = obj.param;
|
||||
|
||||
std::string res;
|
||||
for (auto& c : params) {
|
||||
res += "_" + c.first + "_" + c.second;
|
||||
}
|
||||
for (auto& c : configs) {
|
||||
res += "_" + c.first + "_" + c.second;
|
||||
}
|
||||
res += "_" + std::to_string(batch_size);
|
||||
return res;
|
||||
}
|
||||
|
||||
void TearDown() override {
|
||||
core.reset();
|
||||
plugin.reset();
|
||||
cpuMockIExecNet.reset();
|
||||
cpuMockExecNetwork = {};
|
||||
cpuMockPlugin = {};
|
||||
}
|
||||
|
||||
void SetUp() override {
|
||||
cpuMockIExecNet = std::make_shared<NiceMock<MockIExecutableNetworkInternal>>();
|
||||
auto cpuMockIPluginPtr = std::make_shared<NiceMock<MockIInferencePlugin>>();
|
||||
ON_CALL(*cpuMockIPluginPtr, LoadNetwork(MatcherCast<const CNNNetwork&>(_), _))
|
||||
.WillByDefault(Return(cpuMockIExecNet));
|
||||
cpuMockPlugin = cpuMockIPluginPtr;
|
||||
EXPECT_CALL(*cpuMockIPluginPtr, LoadNetwork(MatcherCast<const CNNNetwork&>(_), _)).Times(1);
|
||||
cpuMockExecNetwork =
|
||||
ov::SoPtr<InferenceEngine::IExecutableNetworkInternal>(cpuMockPlugin->LoadNetwork(CNNNetwork{}, {}), {});
|
||||
|
||||
core = std::shared_ptr<NiceMock<MockICore>>(new NiceMock<MockICore>());
|
||||
plugin = std::shared_ptr<NiceMock<MockAutoBatchInferencePlugin>>(new NiceMock<MockAutoBatchInferencePlugin>());
|
||||
plugin->SetCore(core);
|
||||
|
||||
ON_CALL(*plugin, ParseBatchDevice).WillByDefault([this](const std::string& batchDevice) {
|
||||
return plugin->Plugin::ParseBatchDevice(batchDevice);
|
||||
});
|
||||
ON_CALL(*core, LoadNetwork(MatcherCast<const CNNNetwork&>(_), MatcherCast<const std::string&>(_), _))
|
||||
.WillByDefault(Return(cpuMockExecNetwork));
|
||||
ON_CALL(*core,
|
||||
LoadNetwork(MatcherCast<const CNNNetwork&>(_),
|
||||
MatcherCast<const std::shared_ptr<InferenceEngine::RemoteContext>&>(_),
|
||||
_))
|
||||
.WillByDefault(Return(cpuMockExecNetwork));
|
||||
}
|
||||
};
|
||||
|
||||
TEST_P(PluginLoadNetworkTest, PluginLoadNetworkTestCase) {
|
||||
std::map<std::string, std::string> params;
|
||||
std::map<std::string, std::string> configs;
|
||||
int batch_size;
|
||||
std::tie(params, configs, batch_size) = this->GetParam();
|
||||
|
||||
ON_CALL(*core, GetConfig(_, StrEq("PERFORMANCE_HINT"))).WillByDefault(Return(params["PERFORMANCE_HINT"]));
|
||||
ON_CALL(*core, GetMetric(_, StrEq("OPTIMAL_BATCH_SIZE"), _)).WillByDefault(Return(params["OPTIMAL_BATCH_SIZE"]));
|
||||
ON_CALL(*core, GetConfig(_, StrEq("PERFORMANCE_HINT_NUM_REQUESTS")))
|
||||
.WillByDefault(Return(params["PERFORMANCE_HINT_NUM_REQUESTS"]));
|
||||
|
||||
ON_CALL(*core, GetMetric(_, StrEq("GPU_MEMORY_STATISTICS"), _))
|
||||
.WillByDefault([&params](const std::string& device, const std::string& key, const ov::AnyMap& options) {
|
||||
static int flag = 0;
|
||||
ov::Any value = params[key];
|
||||
uint64_t data = flag * value.as<uint64_t>();
|
||||
std::map<std::string, uint64_t> ret = {{"xyz", data}};
|
||||
flag = flag ? 0 : 1;
|
||||
return ret;
|
||||
});
|
||||
|
||||
ON_CALL(*core, GetMetric(_, StrEq("GPU_DEVICE_TOTAL_MEM_SIZE"), _))
|
||||
.WillByDefault(Return(params["GPU_DEVICE_TOTAL_MEM_SIZE"]));
|
||||
|
||||
auto graph = ngraph::builder::subgraph::makeMultiSingleConv();
|
||||
auto net = CNNNetwork(graph);
|
||||
ASSERT_NO_THROW(plugin->LoadNetworkImpl(net, {}, configs));
|
||||
}
|
||||
|
||||
TEST_P(PluginLoadNetworkTest, PluginLoadBatchedNetworkTestCase) {
|
||||
std::map<std::string, std::string> params;
|
||||
std::map<std::string, std::string> configs;
|
||||
int batch_size;
|
||||
std::tie(params, configs, batch_size) = this->GetParam();
|
||||
|
||||
ON_CALL(*core, GetConfig(_, StrEq("PERFORMANCE_HINT"))).WillByDefault(Return(params["PERFORMANCE_HINT"]));
|
||||
ON_CALL(*core, GetMetric(_, StrEq("OPTIMAL_BATCH_SIZE"), _)).WillByDefault(Return(params["OPTIMAL_BATCH_SIZE"]));
|
||||
ON_CALL(*core, GetConfig(_, StrEq("PERFORMANCE_HINT_NUM_REQUESTS")))
|
||||
.WillByDefault(Return(params["PERFORMANCE_HINT_NUM_REQUESTS"]));
|
||||
|
||||
ON_CALL(*core, GetMetric(_, StrEq("GPU_MEMORY_STATISTICS"), _))
|
||||
.WillByDefault([&params](const std::string& device, const std::string& key, const ov::AnyMap& options) {
|
||||
static int flag = 0;
|
||||
ov::Any value = params[key];
|
||||
uint64_t data = flag * value.as<uint64_t>();
|
||||
std::map<std::string, uint64_t> ret = {{"xyz", data}};
|
||||
flag = flag ? 0 : 1;
|
||||
return ret;
|
||||
});
|
||||
|
||||
ON_CALL(*core, GetMetric(_, StrEq("GPU_DEVICE_TOTAL_MEM_SIZE"), _))
|
||||
.WillByDefault(Return(params["GPU_DEVICE_TOTAL_MEM_SIZE"]));
|
||||
|
||||
auto graph = ngraph::builder::subgraph::makeConvPoolReluNonZero({1, 1, 32, 32});
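// Mark the leading dimension with a dimension-tracker label so it can be treated as the
// batch dimension when the model is reshaped below (a sketch of the intent, as assumed here).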
|
||||
auto batch = ov::Dimension(5);
|
||||
ov::DimensionTracker::set_label(batch, 11);
|
||||
auto p_shape = ov::PartialShape{batch, 1, 32, 32};
|
||||
graph->reshape(p_shape);
|
||||
auto net = CNNNetwork(graph);
|
||||
InferenceEngine::IExecutableNetworkInternal::Ptr execNet;
|
||||
ASSERT_NO_THROW(execNet = plugin->LoadNetworkImpl(net, {}, configs));
|
||||
|
||||
ON_CALL(*cpuMockIExecNet, GetConfig(StrEq("PERFORMANCE_HINT_NUM_REQUESTS"))).WillByDefault(Return("0"));
|
||||
ON_CALL(*cpuMockIExecNet, GetMetric(StrEq("OPTIMAL_NUMBER_OF_INFER_REQUESTS"))).WillByDefault(Return("1"));
|
||||
|
||||
InferenceEngine::Parameter res;
|
||||
ASSERT_NO_THROW(res = execNet->GetMetric("OPTIMAL_NUMBER_OF_INFER_REQUESTS"));
|
||||
EXPECT_EQ(1, std::atoi(res.as<std::string>().c_str()));
|
||||
}
|
||||
|
||||
TEST_P(PluginLoadNetworkTest, PluginLoadNetworkGetMetricTestCase) {
|
||||
std::map<std::string, std::string> params;
|
||||
std::map<std::string, std::string> configs;
|
||||
int batch_size;
|
||||
std::tie(params, configs, batch_size) = this->GetParam();
|
||||
|
||||
ON_CALL(*core, GetConfig(_, StrEq("PERFORMANCE_HINT"))).WillByDefault(Return(params["PERFORMANCE_HINT"]));
|
||||
ON_CALL(*core, GetMetric(_, StrEq("OPTIMAL_BATCH_SIZE"), _)).WillByDefault(Return(params["OPTIMAL_BATCH_SIZE"]));
|
||||
ON_CALL(*core, GetConfig(_, StrEq("PERFORMANCE_HINT_NUM_REQUESTS")))
|
||||
.WillByDefault(Return(params["PERFORMANCE_HINT_NUM_REQUESTS"]));
|
||||
|
||||
ON_CALL(*core, GetMetric(_, StrEq("GPU_MEMORY_STATISTICS"), _))
|
||||
.WillByDefault([&params](const std::string& device, const std::string& key, const ov::AnyMap& options) {
|
||||
static int flag = 0;
|
||||
ov::Any value = params[key];
|
||||
uint64_t data = flag * value.as<uint64_t>();
|
||||
std::map<std::string, uint64_t> ret = {{"xyz", data}};
|
||||
flag = flag ? 0 : 1;
|
||||
return ret;
|
||||
});
|
||||
|
||||
ON_CALL(*core, GetMetric(_, StrEq("GPU_DEVICE_TOTAL_MEM_SIZE"), _))
|
||||
.WillByDefault(Return(params["GPU_DEVICE_TOTAL_MEM_SIZE"]));
|
||||
|
||||
auto graph = ngraph::builder::subgraph::makeMultiSingleConv();
|
||||
auto net = CNNNetwork(graph);
|
||||
InferenceEngine::IExecutableNetworkInternal::Ptr execNet;
|
||||
ASSERT_NO_THROW(execNet = plugin->LoadNetworkImpl(net, {}, configs));
|
||||
|
||||
std::string network_name = graph.get()->get_name();
|
||||
ON_CALL(*cpuMockIExecNet, GetConfig(StrEq("PERFORMANCE_HINT_NUM_REQUESTS"))).WillByDefault(Return("0"));
|
||||
ON_CALL(*cpuMockIExecNet, GetMetric(StrEq("OPTIMAL_NUMBER_OF_INFER_REQUESTS"))).WillByDefault(Return("1"));
|
||||
ON_CALL(*cpuMockIExecNet, GetMetric(StrEq("NETWORK_NAME"))).WillByDefault(Return(network_name.c_str()));
|
||||
ON_CALL(*cpuMockIExecNet, GetMetric(StrEq("EXECUTION_DEVICES"))).WillByDefault(Return("CPU"));
|
||||
|
||||
InferenceEngine::Parameter res;
|
||||
ASSERT_NO_THROW(res = execNet->GetMetric("OPTIMAL_NUMBER_OF_INFER_REQUESTS"));
|
||||
EXPECT_EQ(batch_size, std::atoi(res.as<std::string>().c_str()));
|
||||
|
||||
ASSERT_NO_THROW(res = execNet->GetMetric("NETWORK_NAME"));
|
||||
EXPECT_EQ(network_name, res.as<std::string>());
|
||||
|
||||
ASSERT_NO_THROW(res = execNet->GetMetric("SUPPORTED_METRICS"));
|
||||
|
||||
ASSERT_NO_THROW(res = execNet->GetMetric("EXECUTION_DEVICES"));
|
||||
EXPECT_STREQ("CPU", res.as<std::string>().c_str());
|
||||
|
||||
ASSERT_ANY_THROW(execNet->GetMetric("XYZ"));
|
||||
}
|
||||
|
||||
const std::vector<PluginLoadNetworkParams> testConfigs = {
|
||||
// Case 1: explicitly apply the batch size given in AUTO_BATCH_DEVICE_CONFIG
|
||||
PluginLoadNetworkParams{{{"PERFORMANCE_HINT", "THROUGHPUT"},
|
||||
{"OPTIMAL_BATCH_SIZE", "16"},
|
||||
{"PERFORMANCE_HINT_NUM_REQUESTS", "12"},
|
||||
{"GPU_MEMORY_STATISTICS", "1024000"},
|
||||
{"GPU_DEVICE_TOTAL_MEM_SIZE", "4096000000"}},
|
||||
{{"AUTO_BATCH_TIMEOUT", "200"}, {"AUTO_BATCH_DEVICE_CONFIG", "CPU(32)"}},
|
||||
32},
|
||||
PluginLoadNetworkParams{{{"PERFORMANCE_HINT", "THROUGHPUT"},
|
||||
{"OPTIMAL_BATCH_SIZE", "16"},
|
||||
{"PERFORMANCE_HINT_NUM_REQUESTS", "12"},
|
||||
{"GPU_MEMORY_STATISTICS", "1024000"},
|
||||
{"GPU_DEVICE_TOTAL_MEM_SIZE", "4096000000"}},
|
||||
{{"AUTO_BATCH_TIMEOUT", "200"}, {"AUTO_BATCH_DEVICE_CONFIG", "GPU(32)"}},
|
||||
32},
|
||||
// Case 2: CPU batch size is determined by the min of opt_batch_size and infReq_num.
// PERFORMANCE_HINT_NUM_REQUESTS is taken from the config if present, otherwise from core->GetConfig.
|
||||
PluginLoadNetworkParams{{{"PERFORMANCE_HINT", "THROUGHPUT"},
|
||||
{"OPTIMAL_BATCH_SIZE", "16"},
|
||||
{"PERFORMANCE_HINT_NUM_REQUESTS", "12"},
|
||||
{"GPU_MEMORY_STATISTICS", "1024000"},
|
||||
{"GPU_DEVICE_TOTAL_MEM_SIZE", "4096000000"}},
|
||||
{{"AUTO_BATCH_TIMEOUT", "200"}, {"AUTO_BATCH_DEVICE_CONFIG", "CPU"}},
|
||||
12},
|
||||
PluginLoadNetworkParams{{{"PERFORMANCE_HINT", "THROUGHPUT"},
|
||||
{"OPTIMAL_BATCH_SIZE", "8"},
|
||||
{"PERFORMANCE_HINT_NUM_REQUESTS", "16"},
|
||||
{"GPU_MEMORY_STATISTICS", "1024000"},
|
||||
{"GPU_DEVICE_TOTAL_MEM_SIZE", "4096000000"}},
|
||||
{{"AUTO_BATCH_TIMEOUT", "200"}, {"AUTO_BATCH_DEVICE_CONFIG", "CPU"}},
|
||||
8},
|
||||
PluginLoadNetworkParams{{{"PERFORMANCE_HINT", "THROUGHPUT"},
|
||||
{"OPTIMAL_BATCH_SIZE", "8"},
|
||||
{"PERFORMANCE_HINT_NUM_REQUESTS", "2"},
|
||||
{"GPU_MEMORY_STATISTICS", "1024000"},
|
||||
{"GPU_DEVICE_TOTAL_MEM_SIZE", "4096000000"}},
|
||||
{{"AUTO_BATCH_TIMEOUT", "200"}, {"AUTO_BATCH_DEVICE_CONFIG", "CPU"}},
|
||||
1},
|
||||
// PluginLoadNetworkParams{{{"PERFORMANCE_HINT", "THROUGHPUT"},
|
||||
// {"OPTIMAL_BATCH_SIZE", "32"},
|
||||
// {"PERFORMANCE_HINT_NUM_REQUESTS", "16"},
|
||||
// {"GPU_MEMORY_STATISTICS", "1024000"},
|
||||
// {"GPU_DEVICE_TOTAL_MEM_SIZE", "4096000000"}},
|
||||
// {{"AUTO_BATCH_TIMEOUT", "200"}, {"AUTO_BATCH_DEVICE_CONFIG", "CPU"},
|
||||
// {"PERFORMANCE_HINT_NUM_REQUESTS", "12"}},
|
||||
// 12},
|
||||
//
|
||||
// Case 3: the GPU batch size is determined by
// 1) the min of opt_batch_size and infReq_num, and
// 2) available_mem / one_graph_mem_footprint, rounded down to a power of 2.
// The final batch_size is the min of 1) and 2); a hedged sketch of this
// derivation follows the test suite instantiation below.
|
||||
PluginLoadNetworkParams{{{"PERFORMANCE_HINT", "THROUGHPUT"},
|
||||
{"OPTIMAL_BATCH_SIZE", "16"},
|
||||
{"PERFORMANCE_HINT_NUM_REQUESTS", "12"},
|
||||
{"GPU_MEMORY_STATISTICS", "1000"},
|
||||
{"GPU_DEVICE_TOTAL_MEM_SIZE", "5000"}},
|
||||
{{"AUTO_BATCH_TIMEOUT", "200"}, {"AUTO_BATCH_DEVICE_CONFIG", "GPU"}},
|
||||
4},
|
||||
PluginLoadNetworkParams{{{"PERFORMANCE_HINT", "THROUGHPUT"},
|
||||
{"OPTIMAL_BATCH_SIZE", "16"},
|
||||
{"PERFORMANCE_HINT_NUM_REQUESTS", "12"},
|
||||
{"GPU_MEMORY_STATISTICS", "1024000"},
|
||||
{"GPU_DEVICE_TOTAL_MEM_SIZE", "40960000"}},
|
||||
{{"AUTO_BATCH_TIMEOUT", "200"}, {"AUTO_BATCH_DEVICE_CONFIG", "GPU"}},
|
||||
12},
|
||||
PluginLoadNetworkParams{{{"PERFORMANCE_HINT", "THROUGHPUT"},
|
||||
{"OPTIMAL_BATCH_SIZE", "32"},
|
||||
{"PERFORMANCE_HINT_NUM_REQUESTS", "24"},
|
||||
{"GPU_MEMORY_STATISTICS", "1000"},
|
||||
{"GPU_DEVICE_TOTAL_MEM_SIZE", "18000"}},
|
||||
{{"AUTO_BATCH_TIMEOUT", "200"}, {"AUTO_BATCH_DEVICE_CONFIG", "GPU"}},
|
||||
16},
|
||||
PluginLoadNetworkParams{{{"PERFORMANCE_HINT", "THROUGHPUT"},
|
||||
{"OPTIMAL_BATCH_SIZE", "32"},
|
||||
{"PERFORMANCE_HINT_NUM_REQUESTS", "48"},
|
||||
{"GPU_MEMORY_STATISTICS", "1000"},
|
||||
{"GPU_DEVICE_TOTAL_MEM_SIZE", "180000"}},
|
||||
{{"AUTO_BATCH_TIMEOUT", "200"}, {"AUTO_BATCH_DEVICE_CONFIG", "GPU"}},
|
||||
32},
|
||||
// Case 4: LATENCY hint with an explicit batch size in AUTO_BATCH_DEVICE_CONFIG
|
||||
PluginLoadNetworkParams{{{"PERFORMANCE_HINT", "LATENCY"},
|
||||
{"OPTIMAL_BATCH_SIZE", "16"},
|
||||
{"PERFORMANCE_HINT_NUM_REQUESTS", "12"},
|
||||
{"GPU_MEMORY_STATISTICS", "1024000"},
|
||||
{"GPU_DEVICE_TOTAL_MEM_SIZE", "4096000000"}},
|
||||
{{"AUTO_BATCH_TIMEOUT", "200"}, {"AUTO_BATCH_DEVICE_CONFIG", "CPU(32)"}},
|
||||
32},
|
||||
};
|
||||
|
||||
INSTANTIATE_TEST_SUITE_P(smoke_AutoBatch_BehaviorTests,
|
||||
PluginLoadNetworkTest,
|
||||
::testing::ValuesIn(testConfigs),
|
||||
PluginLoadNetworkTest::getTestCaseName);
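// ---------------------------------------------------------------------------
// For reference: the expected batch sizes in testConfigs above follow from the
// device properties mocked in the fixtures. The helper below is a minimal,
// hedged sketch of that derivation for readers of the test data only; it is
// NOT the plugin's implementation, deduce_batch_size() does not exist in the
// sources, and the "too few requests" fallback is inferred from the test
// expectations rather than taken from the plugin code.
#include <algorithm>
#include <cstdint>

static uint32_t deduce_batch_size(uint32_t optimal_batch_size,  // OPTIMAL_BATCH_SIZE
                                  uint32_t hint_num_requests,   // PERFORMANCE_HINT_NUM_REQUESTS
                                  uint64_t total_mem,           // GPU_DEVICE_TOTAL_MEM_SIZE (0 for CPU)
                                  uint64_t graph_footprint,     // GPU_MEMORY_STATISTICS (0 for CPU)
                                  uint32_t explicit_batch) {    // N from "DEVICE(N)", 0 if absent
    if (explicit_batch > 0)
        return explicit_batch;  // Cases 1 and 4: AUTO_BATCH_DEVICE_CONFIG wins
    if (hint_num_requests < optimal_batch_size / 2)
        return 1;  // too few requests to make batching worthwhile
    // Case 2: min of the optimal batch size and the requested number of infer requests
    uint32_t batch = std::min(optimal_batch_size, hint_num_requests);
    // Case 3 (GPU only): additionally cap by available memory, as a power of 2
    if (total_mem != 0 && graph_footprint != 0) {
        uint32_t mem_cap = 1;
        while (mem_cap * 2 * graph_footprint <= total_mem)
            mem_cap *= 2;
        batch = std::min(batch, mem_cap);
    }
    return batch;
}
// Example: deduce_batch_size(32, 24, 18000, 1000, 0) == 16, matching Case 3 above.
// ---------------------------------------------------------------------------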
|
@@ -1,36 +0,0 @@
|
||||
// Copyright (C) 2018-2023 Intel Corporation
|
||||
// SPDX-License-Identifier: Apache-2.0
|
||||
//
|
||||
|
||||
#pragma once
|
||||
#include <gmock/gmock.h>
|
||||
|
||||
#include <iostream>
|
||||
|
||||
#include "async_infer_request.hpp"
|
||||
#include "compiled_model.hpp"
|
||||
#include "ie_icore.hpp"
|
||||
#include "plugin.hpp"
|
||||
#include "sync_infer_request.hpp"
|
||||
|
||||
using namespace ov::mock_autobatch_plugin;
|
||||
|
||||
class MockAutoBatchInferencePlugin : public Plugin {
|
||||
public:
|
||||
MOCK_METHOD((DeviceInformation),
|
||||
ParseMetaDevices,
|
||||
(const std::string&, (const std::map<std::string, std::string>&)),
|
||||
(const));
|
||||
MOCK_METHOD((DeviceInformation), ParseBatchDevice, (const std::string&), ());
|
||||
|
||||
MOCK_METHOD((InferenceEngine::Parameter),
|
||||
GetMetric,
|
||||
(const std::string&, (const std::map<std::string, InferenceEngine::Parameter>&)),
|
||||
(const, override));
|
||||
};
|
||||
|
||||
class MockAutoBatchExecutableNetwork : public CompiledModel {
|
||||
public:
|
||||
MOCK_METHOD((InferenceEngine::Parameter), GetConfig, (const std::string&), (const, override));
|
||||
MOCK_METHOD((InferenceEngine::Parameter), GetMetric, (const std::string&), (const, override));
|
||||
};
|
src/plugins/auto_batch/tests/unit/mock_common.hpp (new file, 139 lines)
@@ -0,0 +1,139 @@
|
||||
// Copyright (C) 2018-2023 Intel Corporation
|
||||
// SPDX-License-Identifier: Apache-2.0
|
||||
//
|
||||
|
||||
#pragma once
|
||||
#include <gmock/gmock.h>
|
||||
|
||||
#include <iostream>
|
||||
|
||||
#include "async_infer_request.hpp"
|
||||
#include "compiled_model.hpp"
|
||||
#include "ie_icore.hpp"
|
||||
#include "openvino/runtime/make_tensor.hpp"
|
||||
#include "plugin.hpp"
|
||||
#include "sync_infer_request.hpp"
|
||||
|
||||
using namespace ov::mock_autobatch_plugin;
|
||||
|
||||
class MockIPlugin : public ov::IPlugin {
|
||||
public:
|
||||
MockIPlugin() {
|
||||
set_device_name("HWPLUGIN");
|
||||
}
|
||||
MOCK_METHOD(std::shared_ptr<ov::ICompiledModel>,
|
||||
compile_model,
|
||||
(const std::shared_ptr<const ov::Model>&, const ov::AnyMap&),
|
||||
(const, override));
|
||||
MOCK_METHOD(std::shared_ptr<ov::ICompiledModel>,
|
||||
compile_model,
|
||||
(const std::shared_ptr<const ov::Model>&, const ov::AnyMap&, const ov::SoPtr<ov::IRemoteContext>&),
|
||||
(const, override));
|
||||
MOCK_METHOD(ov::Any, get_property, (const std::string&, const ov::AnyMap&), (const, override));
|
||||
MOCK_METHOD(void, set_property, (const ov::AnyMap&), (override));
|
||||
MOCK_METHOD(ov::SoPtr<ov::IRemoteContext>, create_context, (const ov::AnyMap&), (const, override));
|
||||
MOCK_METHOD(ov::SoPtr<ov::IRemoteContext>, get_default_context, (const ov::AnyMap&), (const, override));
|
||||
MOCK_METHOD(std::shared_ptr<ov::ICompiledModel>,
|
||||
import_model,
|
||||
(std::istream&, const ov::AnyMap&),
|
||||
(const, override));
|
||||
MOCK_METHOD(std::shared_ptr<ov::ICompiledModel>,
|
||||
import_model,
|
||||
(std::istream&, const ov::SoPtr<ov::IRemoteContext>&, const ov::AnyMap&),
|
||||
(const, override));
|
||||
MOCK_METHOD(ov::SupportedOpsMap,
|
||||
query_model,
|
||||
(const std::shared_ptr<const ov::Model>&, const ov::AnyMap&),
|
||||
(const, override));
|
||||
};
|
||||
|
||||
class MockAutoBatchInferencePlugin : public Plugin {
|
||||
public:
|
||||
MOCK_METHOD((ov::Any), get_property, (const std::string&, const ov::AnyMap&), (const, override));
|
||||
};
|
||||
|
||||
class MockICompiledModel : public ov::ICompiledModel {
|
||||
public:
|
||||
MockICompiledModel(const std::shared_ptr<const ov::Model>& model, const std::shared_ptr<const ov::IPlugin>& plugin)
|
||||
: ov::ICompiledModel(model, plugin) {}
|
||||
MOCK_METHOD(std::shared_ptr<ov::ISyncInferRequest>, create_sync_infer_request, (), (const, override));
|
||||
MOCK_METHOD(ov::Any, get_property, (const std::string&), (const, override));
|
||||
MOCK_METHOD(void, set_property, (const ov::AnyMap&), (override));
|
||||
MOCK_METHOD(void, export_model, (std::ostream&), (const, override));
|
||||
MOCK_METHOD(std::shared_ptr<const ov::Model>, get_runtime_model, (), (const, override));
|
||||
MOCK_METHOD(std::shared_ptr<ov::IAsyncInferRequest>, create_infer_request, (), (const, override));
|
||||
};
|
||||
|
||||
class MockAutoBatchCompileModel : public CompiledModel {
|
||||
public:
|
||||
MockAutoBatchCompileModel(const std::shared_ptr<ov::Model>& model,
|
||||
const std::shared_ptr<const ov::IPlugin>& plugin,
|
||||
const ov::AnyMap& config,
|
||||
const DeviceInformation& device_info,
|
||||
const std::set<std::string>& batched_inputs,
|
||||
const std::set<std::string>& batched_outputs,
|
||||
const ov::SoPtr<ov::ICompiledModel>& compiled_model_with_batch,
|
||||
const ov::SoPtr<ov::ICompiledModel>& compiled_model_without_batch,
|
||||
const ov::SoPtr<ov::IRemoteContext>& context)
|
||||
: CompiledModel(model,
|
||||
plugin,
|
||||
config,
|
||||
device_info,
|
||||
batched_inputs,
|
||||
batched_outputs,
|
||||
compiled_model_with_batch,
|
||||
compiled_model_without_batch,
|
||||
context) {}
|
||||
MOCK_METHOD(std::shared_ptr<ov::ISyncInferRequest>, create_sync_infer_request, (), (const, override));
|
||||
};
|
||||
|
||||
class MockISyncInferRequest : public ov::ISyncInferRequest {
|
||||
public:
|
||||
MockISyncInferRequest(const std::shared_ptr<const MockICompiledModel>& compiled_model)
|
||||
: ov::ISyncInferRequest(compiled_model) {
|
||||
OPENVINO_ASSERT(compiled_model);
|
||||
// Allocate input/output tensors
|
||||
for (const auto& input : get_inputs()) {
|
||||
allocate_tensor(input, [this, input](ov::SoPtr<ov::ITensor>& tensor) {
|
||||
// Can add a check to avoid double work in case of shared tensors
|
||||
allocate_tensor_impl(tensor,
|
||||
input.get_element_type(),
|
||||
input.get_partial_shape().is_dynamic() ? ov::Shape{0} : input.get_shape());
|
||||
});
|
||||
}
|
||||
for (const auto& output : get_outputs()) {
|
||||
allocate_tensor(output, [this, output](ov::SoPtr<ov::ITensor>& tensor) {
|
||||
// Can add a check to avoid double work in case of shared tensors
|
||||
allocate_tensor_impl(tensor,
|
||||
output.get_element_type(),
|
||||
output.get_partial_shape().is_dynamic() ? ov::Shape{0} : output.get_shape());
|
||||
});
|
||||
}
|
||||
}
|
||||
MOCK_METHOD(std::vector<ov::ProfilingInfo>, get_profiling_info, (), (const, override));
|
||||
MOCK_METHOD(void, infer, (), (override));
|
||||
MOCK_METHOD(std::vector<ov::SoPtr<ov::IVariableState>>, query_state, (), (const, override));
|
||||
~MockISyncInferRequest() = default;
|
||||
|
||||
private:
|
||||
void allocate_tensor_impl(ov::SoPtr<ov::ITensor>& tensor,
|
||||
const ov::element::Type& element_type,
|
||||
const ov::Shape& shape) {
|
||||
if (!tensor || tensor->get_element_type() != element_type) {
|
||||
tensor = ov::make_tensor(element_type, shape);
|
||||
} else {
|
||||
tensor->set_shape(shape);
|
||||
}
|
||||
}
|
||||
};
|
||||
|
||||
class MockIAsyncInferRequest : public ov::IAsyncInferRequest {
|
||||
public:
|
||||
MockIAsyncInferRequest(const std::shared_ptr<IInferRequest>& request,
|
||||
const std::shared_ptr<ov::threading::ITaskExecutor>& task_executor,
|
||||
const std::shared_ptr<ov::threading::ITaskExecutor>& callback_executor)
|
||||
: IAsyncInferRequest(request, task_executor, callback_executor) {
|
||||
m_pipeline = {};
|
||||
}
|
||||
MOCK_METHOD(void, start_async, (), (override));
|
||||
};
|
@@ -0,0 +1,81 @@
|
||||
// Copyright (C) 2018-2023 Intel Corporation
|
||||
// SPDX-License-Identifier: Apache-2.0
|
||||
//
|
||||
|
||||
#include <gmock/gmock.h>
|
||||
#include <gtest/gtest.h>
|
||||
|
||||
#include "mock_common.hpp"
|
||||
|
||||
using ::testing::_;
|
||||
using ::testing::AnyNumber;
|
||||
using ::testing::AtLeast;
|
||||
using ::testing::Eq;
|
||||
using ::testing::NiceMock;
|
||||
using ::testing::Return;
|
||||
using ::testing::ReturnRef;
|
||||
using ::testing::StrEq;
|
||||
using ::testing::StrNe;
|
||||
using ::testing::Throw;
|
||||
|
||||
using namespace ov::mock_autobatch_plugin;
|
||||
|
||||
using batch_device_config_params = std::tuple<std::string, // Batch devices
|
||||
std::string, // Expected device name
|
||||
int, // Expected batch size
|
||||
bool // Throw exception
|
||||
>;
|
||||
|
||||
class ParseBatchDeviceTest : public ::testing::TestWithParam<batch_device_config_params> {
|
||||
public:
|
||||
std::string m_batch_device_config;
|
||||
std::string m_device_name;
|
||||
int m_batch_size;
|
||||
bool m_throw_exception;
|
||||
std::shared_ptr<NiceMock<MockAutoBatchInferencePlugin>> m_plugin;
|
||||
|
||||
public:
|
||||
static std::string getTestCaseName(testing::TestParamInfo<batch_device_config_params> obj) {
|
||||
std::string batch_device_config;
|
||||
std::string device_name;
|
||||
int batch_size;
|
||||
bool throw_exception;
|
||||
std::tie(batch_device_config, device_name, batch_size, throw_exception) = obj.param;
|
||||
std::string res = batch_device_config;
|
||||
if (throw_exception)
|
||||
res += "_throw";
|
||||
return res;
|
||||
}
|
||||
|
||||
void TearDown() override {
|
||||
m_plugin.reset();
|
||||
}
|
||||
|
||||
void SetUp() override {
|
||||
std::tie(m_batch_device_config, m_device_name, m_batch_size, m_throw_exception) = this->GetParam();
|
||||
m_plugin =
|
||||
std::shared_ptr<NiceMock<MockAutoBatchInferencePlugin>>(new NiceMock<MockAutoBatchInferencePlugin>());
|
||||
}
|
||||
};
|
||||
|
||||
TEST_P(ParseBatchDeviceTest, ParseBatchDeviceTestCase) {
|
||||
if (m_throw_exception) {
|
||||
ASSERT_ANY_THROW(m_plugin->parse_batch_device(m_batch_device_config));
|
||||
} else {
|
||||
auto result = m_plugin->parse_batch_device(m_batch_device_config);
|
||||
EXPECT_EQ(result.device_name, m_device_name);
|
||||
EXPECT_EQ(result.device_batch_size, m_batch_size);
|
||||
}
|
||||
}
|
||||
|
||||
const std::vector<batch_device_config_params> batch_device_test_configs = {
|
||||
batch_device_config_params{"CPU(4)", "CPU", 4, false},
|
||||
batch_device_config_params{"GPU(8)", "GPU", 8, false},
|
||||
batch_device_config_params{"CPU(0)", "CPU", 0, true},
|
||||
batch_device_config_params{"GPU(-1)", "GPU", 0, true},
|
||||
};
|
||||
|
||||
INSTANTIATE_TEST_SUITE_P(smoke_AutoBatch_BehaviorTests,
|
||||
ParseBatchDeviceTest,
|
||||
::testing::ValuesIn(batch_device_test_configs),
|
||||
ParseBatchDeviceTest::getTestCaseName);
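// ---------------------------------------------------------------------------
// The cases in batch_device_test_configs pin down the expected parsing rules:
// a device name optionally followed by a positive batch size in parentheses,
// with zero or negative sizes rejected. The stand-alone sketch below mirrors
// that contract for readability; parsed_batch_device and
// parse_batch_device_sketch are hypothetical names, not the plugin's real
// declarations.
#include <stdexcept>
#include <string>

struct parsed_batch_device {
    std::string device_name;
    int device_batch_size = 0;  // 0 means "not specified"
};

static parsed_batch_device parse_batch_device_sketch(const std::string& cfg) {
    parsed_batch_device result;
    const auto open = cfg.find('(');
    if (open == std::string::npos) {
        result.device_name = cfg;  // e.g. "CPU" with no explicit batch size
        return result;
    }
    const auto close = cfg.find(')', open);
    if (close == std::string::npos)
        throw std::runtime_error("Malformed batch device config: " + cfg);
    result.device_name = cfg.substr(0, open);  // "CPU(4)" -> "CPU"
    result.device_batch_size = std::stoi(cfg.substr(open + 1, close - open - 1));
    if (result.device_batch_size <= 0)  // "CPU(0)" and "GPU(-1)" must throw
        throw std::runtime_error("Batch size must be positive: " + cfg);
    return result;
}
// ---------------------------------------------------------------------------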
|
src/plugins/auto_batch/tests/unit/parse_meta_device_test.cpp (new file, 143 lines)
@@ -0,0 +1,143 @@
|
||||
// Copyright (C) 2018-2023 Intel Corporation
|
||||
// SPDX-License-Identifier: Apache-2.0
|
||||
//
|
||||
|
||||
#include <gmock/gmock.h>
|
||||
#include <gtest/gtest.h>
|
||||
|
||||
#include "mock_common.hpp"
|
||||
#include "unit_test_utils/mocks/cpp_interfaces/interface/mock_icore.hpp"
|
||||
|
||||
using ::testing::_;
|
||||
using ::testing::AnyNumber;
|
||||
using ::testing::AtLeast;
|
||||
using ::testing::Eq;
|
||||
using ::testing::NiceMock;
|
||||
using ::testing::Return;
|
||||
using ::testing::ReturnRef;
|
||||
using ::testing::StrEq;
|
||||
using ::testing::StrNe;
|
||||
using ::testing::Throw;
|
||||
|
||||
using namespace ov::mock_autobatch_plugin;
|
||||
|
||||
using meta_device_params = std::tuple<std::string, // Device batch cfg
|
||||
ov::AnyMap, // property map
|
||||
DeviceInformation, // Expected result
|
||||
bool>; // Throw exception
|
||||
|
||||
const std::vector<std::string> cpu_supported_properties = {
|
||||
"CACHE_DIR",
|
||||
};
|
||||
|
||||
const std::vector<std::string> gpu_supported_properties = {
|
||||
"CACHE_DIR",
|
||||
"OPTIMAL_BATCH_SIZE",
|
||||
};
|
||||
|
||||
class ParseMetaDeviceTest : public ::testing::TestWithParam<meta_device_params> {
|
||||
public:
|
||||
std::shared_ptr<NiceMock<MockICore>> m_core;
|
||||
std::shared_ptr<NiceMock<MockAutoBatchInferencePlugin>> m_plugin;
|
||||
|
||||
std::string m_batch_cfg;
|
||||
ov::AnyMap m_config;
|
||||
DeviceInformation m_expected_device_info;
|
||||
bool m_throw_exception;
|
||||
|
||||
public:
|
||||
static std::string getTestCaseName(testing::TestParamInfo<meta_device_params> obj) {
|
||||
std::string batch_cfg;
|
||||
ov::AnyMap config;
|
||||
DeviceInformation info;
|
||||
bool throw_exception;
|
||||
|
||||
std::tie(batch_cfg, config, info, throw_exception) = obj.param;
|
||||
std::string res = batch_cfg;
|
||||
for (auto& c : config) {
|
||||
res += "_" + c.first + "_" + c.second.as<std::string>();
|
||||
}
|
||||
if (throw_exception)
|
||||
res += "_throw";
|
||||
return res;
|
||||
}
|
||||
|
||||
void TearDown() override {
|
||||
m_core.reset();
|
||||
m_plugin.reset();
|
||||
}
|
||||
|
||||
void SetUp() override {
|
||||
m_core = std::shared_ptr<NiceMock<MockICore>>(new NiceMock<MockICore>());
|
||||
m_plugin =
|
||||
std::shared_ptr<NiceMock<MockAutoBatchInferencePlugin>>(new NiceMock<MockAutoBatchInferencePlugin>());
|
||||
m_plugin->set_core(m_core);
|
||||
|
||||
std::tie(m_batch_cfg, m_config, m_expected_device_info, m_throw_exception) = this->GetParam();
|
||||
|
||||
ON_CALL(*m_core, get_supported_property)
|
||||
.WillByDefault([](const std::string& device, const ov::AnyMap& configs) {
|
||||
ov::AnyMap res_config;
|
||||
if (device == "CPU") {
|
||||
for (auto& c : configs) {
|
||||
if (std::find(begin(cpu_supported_properties), end(cpu_supported_properties), c.first) !=
|
||||
cpu_supported_properties.end())
|
||||
res_config[c.first] = c.second;
|
||||
}
|
||||
} else if (device == "GPU") {
|
||||
for (auto& c : configs) {
|
||||
if (std::find(begin(gpu_supported_properties), end(gpu_supported_properties), c.first) !=
|
||||
gpu_supported_properties.end())
|
||||
res_config[c.first] = c.second;
|
||||
}
|
||||
}
|
||||
return res_config;
|
||||
});
|
||||
}
|
||||
|
||||
bool compare(ov::AnyMap a, ov::AnyMap b) {
|
||||
if (a.size() != b.size())
|
||||
return false;
|
||||
|
||||
for (auto& it : a) {
|
||||
auto item = b.find(it.first);
|
||||
if (item == b.end())
|
||||
return false;
|
||||
if (it.second != item->second)
|
||||
return false;
|
||||
}
|
||||
return true;
|
||||
}
|
||||
};
|
||||
|
||||
TEST_P(ParseMetaDeviceTest, ParseMetaDeviceTestCase) {
|
||||
if (m_throw_exception) {
|
||||
ASSERT_ANY_THROW(m_plugin->parse_meta_device(m_batch_cfg, m_config));
|
||||
} else {
|
||||
auto result = m_plugin->parse_meta_device(m_batch_cfg, m_config);
|
||||
EXPECT_EQ(result.device_name, m_expected_device_info.device_name);
|
||||
EXPECT_EQ(result.device_batch_size, m_expected_device_info.device_batch_size);
|
||||
EXPECT_TRUE(compare(result.device_config, m_expected_device_info.device_config));
|
||||
}
|
||||
}
|
||||
|
||||
const std::vector<meta_device_params> meta_device_test_configs = {
|
||||
meta_device_params{"CPU(4)", {}, DeviceInformation{"CPU", {}, 4}, false},
|
||||
meta_device_params{"CPU(4)", {{}}, DeviceInformation{"CPU", {{}}, 4}, true},
|
||||
meta_device_params{"CPU(4)", {{"CACHE_DIR", "./"}}, DeviceInformation{"CPU", {{"CACHE_DIR", "./"}}, 4}, false},
|
||||
meta_device_params{"GPU(4)", {{"CACHE_DIR", "./"}}, DeviceInformation{"GPU", {{"CACHE_DIR", "./"}}, 4}, false},
|
||||
meta_device_params{"GPU(8)",
|
||||
{{"CACHE_DIR", "./"}, {"OPTIMAL_BATCH_SIZE", "16"}},
|
||||
DeviceInformation{"GPU", {{"CACHE_DIR", "./"}, {"OPTIMAL_BATCH_SIZE", "16"}}, 8},
|
||||
false},
|
||||
meta_device_params{"CPU(4)", {{"OPTIMAL_BATCH_SIZE", "16"}}, DeviceInformation{"CPU", {{}}, 4}, true},
|
||||
meta_device_params{"CPU(4)",
|
||||
{{"CACHE_DIR", "./"}, {"OPTIMAL_BATCH_SIZE", "16"}},
|
||||
DeviceInformation{"CPU", {{"CACHE_DIR", "./"}}, 4},
|
||||
true},
|
||||
};
|
||||
|
||||
INSTANTIATE_TEST_SUITE_P(smoke_AutoBatch_BehaviorTests,
|
||||
ParseMetaDeviceTest,
|
||||
::testing::ValuesIn(meta_device_test_configs),
|
||||
ParseMetaDeviceTest::getTestCaseName);
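// ---------------------------------------------------------------------------
// The throwing cases in meta_device_test_configs all carry a property the
// target device does not support: the mocked get_supported_property() keeps
// only CACHE_DIR for CPU (plus OPTIMAL_BATCH_SIZE for GPU), so a CPU config
// with OPTIMAL_BATCH_SIZE cannot be forwarded. A hedged sketch of that check,
// using a hypothetical helper name, is shown below.
#include <algorithm>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

static void check_forwardable_properties(const std::vector<std::string>& supported,
                                         const std::map<std::string, std::string>& requested) {
    for (const auto& item : requested) {
        if (std::find(supported.begin(), supported.end(), item.first) == supported.end())
            throw std::runtime_error("Property not supported by the target device: " + item.first);
    }
}
// check_forwardable_properties({"CACHE_DIR"}, {{"CACHE_DIR", "./"}});          // OK
// check_forwardable_properties({"CACHE_DIR"}, {{"OPTIMAL_BATCH_SIZE", "16"}}); // throws
// ---------------------------------------------------------------------------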
|
src/plugins/auto_batch/tests/unit/plugin_compile_model_test.cpp (new file, 232 lines)
@@ -0,0 +1,232 @@
|
||||
// Copyright (C) 2018-2023 Intel Corporation
|
||||
// SPDX-License-Identifier: Apache-2.0
|
||||
//
|
||||
|
||||
#include <gmock/gmock.h>
|
||||
#include <gtest/gtest.h>
|
||||
|
||||
#include "mock_common.hpp"
|
||||
#include "ngraph_functions/subgraph_builders.hpp"
|
||||
#include "openvino/core/dimension_tracker.hpp"
|
||||
#include "unit_test_utils/mocks/cpp_interfaces/interface/mock_icore.hpp"
|
||||
|
||||
using ::testing::_;
|
||||
using ::testing::AnyNumber;
|
||||
using ::testing::AtLeast;
|
||||
using ::testing::Eq;
|
||||
using ::testing::MatcherCast;
|
||||
using ::testing::Matches;
|
||||
using ::testing::NiceMock;
|
||||
using ::testing::Return;
|
||||
using ::testing::ReturnRef;
|
||||
using ::testing::StrEq;
|
||||
using ::testing::StrNe;
|
||||
using ::testing::Throw;
|
||||
|
||||
using namespace ov::mock_autobatch_plugin;
|
||||
|
||||
using plugin_compile_model_param = std::tuple<ov::AnyMap, // Core Properties
|
||||
ov::AnyMap, // Plugin Properties
|
||||
uint32_t>; // batch size
|
||||
|
||||
class PluginCompileModelTest : public ::testing::TestWithParam<plugin_compile_model_param> {
|
||||
public:
|
||||
ov::AnyMap m_core_properities;
|
||||
ov::AnyMap m_plugin_properities;
|
||||
int m_batch_size;
|
||||
|
||||
std::shared_ptr<NiceMock<MockICore>> m_core;
|
||||
std::shared_ptr<NiceMock<MockAutoBatchInferencePlugin>> m_plugin;
|
||||
std::shared_ptr<ov::Model> m_model;
|
||||
ov::SoPtr<ov::IRemoteContext> m_remote_context;
|
||||
|
||||
ov::SoPtr<MockICompiledModel> m_mock_compile_model;
|
||||
std::shared_ptr<MockICompiledModel> m_mock_i_compile_model;
|
||||
std::shared_ptr<NiceMock<MockIPlugin>> m_hardware_plugin;
|
||||
|
||||
public:
|
||||
static std::string getTestCaseName(testing::TestParamInfo<plugin_compile_model_param> obj) {
|
||||
ov::AnyMap core_properities;
|
||||
ov::AnyMap plugin_properities;
|
||||
uint32_t expect_batch_size;
|
||||
std::tie(core_properities, plugin_properities, expect_batch_size) = obj.param;
|
||||
|
||||
std::string res;
|
||||
for (auto& c : core_properities) {
|
||||
res += "_" + c.first + "_" + c.second.as<std::string>();
|
||||
}
|
||||
for (auto& c : plugin_properities) {
|
||||
res += "_" + c.first + "_" + c.second.as<std::string>();
|
||||
}
|
||||
res += "_" + std::to_string(expect_batch_size);
|
||||
return res;
|
||||
}
|
||||
|
||||
void TearDown() override {
|
||||
m_core.reset();
|
||||
m_plugin.reset();
|
||||
m_model.reset();
|
||||
m_remote_context = {};
|
||||
m_mock_i_compile_model.reset();
|
||||
m_mock_compile_model = {};
|
||||
}
|
||||
|
||||
void SetUp() override {
|
||||
std::tie(m_core_properities, m_plugin_properities, m_batch_size) = this->GetParam();
|
||||
m_core = std::shared_ptr<NiceMock<MockICore>>(new NiceMock<MockICore>());
|
||||
m_plugin =
|
||||
std::shared_ptr<NiceMock<MockAutoBatchInferencePlugin>>(new NiceMock<MockAutoBatchInferencePlugin>());
|
||||
m_plugin->set_core(m_core);
|
||||
m_hardware_plugin = std::shared_ptr<NiceMock<MockIPlugin>>(new NiceMock<MockIPlugin>());
|
||||
m_mock_i_compile_model = std::make_shared<NiceMock<MockICompiledModel>>(m_model, m_hardware_plugin);
|
||||
m_mock_compile_model = {m_mock_i_compile_model, {}};
|
||||
|
||||
ON_CALL(*m_core, get_property(_, StrEq("PERFORMANCE_HINT")))
|
||||
.WillByDefault(Return(m_core_properities["PERFORMANCE_HINT"]));
|
||||
|
||||
ON_CALL(*m_core, get_property(_, StrEq("OPTIMAL_BATCH_SIZE"), _))
|
||||
.WillByDefault(Return(m_core_properities["OPTIMAL_BATCH_SIZE"]));
|
||||
|
||||
ON_CALL(*m_core, get_property(_, StrEq("PERFORMANCE_HINT_NUM_REQUESTS")))
|
||||
.WillByDefault(Return(m_core_properities["PERFORMANCE_HINT_NUM_REQUESTS"]));
|
||||
|
||||
ON_CALL(*m_core, get_property(_, StrEq("GPU_MEMORY_STATISTICS"), _))
|
||||
.WillByDefault([&](const std::string& device, const std::string& key, const ov::AnyMap& options) {
|
||||
static int flag = 0;
|
||||
ov::Any value = m_core_properities[key];
|
||||
uint64_t data = flag * value.as<uint64_t>();
|
||||
std::map<std::string, uint64_t> ret = {{"xyz", data}};
|
||||
flag = flag ? 0 : 1;
|
||||
return ret;
|
||||
});
|
||||
|
||||
ON_CALL(*m_core, get_property(_, StrEq("GPU_DEVICE_TOTAL_MEM_SIZE"), _))
|
||||
.WillByDefault(Return(m_core_properities["GPU_DEVICE_TOTAL_MEM_SIZE"]));
|
||||
|
||||
ON_CALL(*m_core,
|
||||
compile_model(MatcherCast<const std::shared_ptr<const ov::Model>&>(_),
|
||||
MatcherCast<const std::string&>(_),
|
||||
_))
|
||||
.WillByDefault(Return(m_mock_compile_model));
|
||||
|
||||
ON_CALL(*m_core,
|
||||
compile_model(MatcherCast<const std::shared_ptr<const ov::Model>&>(_),
|
||||
MatcherCast<const ov::SoPtr<ov::IRemoteContext>&>(_),
|
||||
_))
|
||||
.WillByDefault(Return(m_mock_compile_model));
|
||||
}
|
||||
};
|
||||
|
||||
TEST_P(PluginCompileModelTest, PluginCompileModelTestCase) {
|
||||
m_model = ngraph::builder::subgraph::makeMultiSingleConv();
|
||||
ASSERT_NO_THROW(m_plugin->compile_model(m_model, m_plugin_properities));
|
||||
}
|
||||
|
||||
TEST_P(PluginCompileModelTest, PluginCompileModelWithRemoteContextTestCase) {
|
||||
m_model = ngraph::builder::subgraph::makeMultiSingleConv();
|
||||
ASSERT_NO_THROW(m_plugin->compile_model(m_model, m_plugin_properities, m_remote_context));
|
||||
}
|
||||
|
||||
TEST_P(PluginCompileModelTest, PluginCompileModelBatchedModelTestCase) {
|
||||
m_model = ngraph::builder::subgraph::makeConvPoolReluNonZero({1, 1, 32, 32});
|
||||
auto batch = ov::Dimension(5);
|
||||
ov::DimensionTracker::set_label(batch, 11);
|
||||
auto p_shape = ov::PartialShape{batch, 1, 32, 32};
|
||||
m_model->reshape(p_shape);
|
||||
ASSERT_NO_THROW(m_plugin->compile_model(m_model, m_plugin_properities));
|
||||
}
|
||||
|
||||
TEST_P(PluginCompileModelTest, PluginCompileModelBatchedModelWithRemoteContextTestCase) {
|
||||
m_model = ngraph::builder::subgraph::makeConvPoolReluNonZero({1, 1, 32, 32});
|
||||
auto batch = ov::Dimension(5);
|
||||
ov::DimensionTracker::set_label(batch, 11);
|
||||
auto p_shape = ov::PartialShape{batch, 1, 32, 32};
|
||||
m_model->reshape(p_shape);
|
||||
ASSERT_NO_THROW(m_plugin->compile_model(m_model, m_plugin_properities, m_remote_context));
|
||||
}
|
||||
|
||||
const std::vector<plugin_compile_model_param> plugin_compile_model_param_test = {
|
||||
// Case 1: explicitly apply the batch size given in AUTO_BATCH_DEVICE_CONFIG
|
||||
plugin_compile_model_param{{{"PERFORMANCE_HINT", ov::hint::PerformanceMode::THROUGHPUT},
|
||||
{"OPTIMAL_BATCH_SIZE", static_cast<unsigned int>(16)},
|
||||
{"PERFORMANCE_HINT_NUM_REQUESTS", static_cast<uint32_t>(12)},
|
||||
{"GPU_MEMORY_STATISTICS", "1024000"},
|
||||
{"GPU_DEVICE_TOTAL_MEM_SIZE", "4096000000"}},
|
||||
{{"AUTO_BATCH_TIMEOUT", "200"}, {"AUTO_BATCH_DEVICE_CONFIG", "CPU(32)"}},
|
||||
32},
|
||||
plugin_compile_model_param{{{"PERFORMANCE_HINT", ov::hint::PerformanceMode::THROUGHPUT},
|
||||
{"OPTIMAL_BATCH_SIZE", static_cast<unsigned int>(16)},
|
||||
{"PERFORMANCE_HINT_NUM_REQUESTS", static_cast<uint32_t>(12)},
|
||||
{"GPU_MEMORY_STATISTICS", "1024000"},
|
||||
{"GPU_DEVICE_TOTAL_MEM_SIZE", "4096000000"}},
|
||||
{{"AUTO_BATCH_TIMEOUT", "200"}, {"AUTO_BATCH_DEVICE_CONFIG", "GPU(32)"}},
|
||||
32},
|
||||
// Case 2: CPU batch size is determined by the min of opt_batch_size and infReq_num.
// PERFORMANCE_HINT_NUM_REQUESTS is taken from the config if present, otherwise via core->get_property.
|
||||
plugin_compile_model_param{{{"PERFORMANCE_HINT", ov::hint::PerformanceMode::THROUGHPUT},
|
||||
{"OPTIMAL_BATCH_SIZE", static_cast<unsigned int>(16)},
|
||||
{"PERFORMANCE_HINT_NUM_REQUESTS", static_cast<uint32_t>(12)},
|
||||
{"GPU_MEMORY_STATISTICS", "1024000"},
|
||||
{"GPU_DEVICE_TOTAL_MEM_SIZE", "4096000000"}},
|
||||
{{"AUTO_BATCH_TIMEOUT", "200"}, {"AUTO_BATCH_DEVICE_CONFIG", "CPU"}},
|
||||
12},
|
||||
plugin_compile_model_param{{{"PERFORMANCE_HINT", ov::hint::PerformanceMode::THROUGHPUT},
|
||||
{"OPTIMAL_BATCH_SIZE", static_cast<unsigned int>(8)},
|
||||
{"PERFORMANCE_HINT_NUM_REQUESTS", static_cast<uint32_t>(16)},
|
||||
{"GPU_MEMORY_STATISTICS", "1024000"},
|
||||
{"GPU_DEVICE_TOTAL_MEM_SIZE", "4096000000"}},
|
||||
{{"AUTO_BATCH_TIMEOUT", "200"}, {"AUTO_BATCH_DEVICE_CONFIG", "CPU"}},
|
||||
8},
|
||||
plugin_compile_model_param{{{"PERFORMANCE_HINT", ov::hint::PerformanceMode::THROUGHPUT},
|
||||
{"OPTIMAL_BATCH_SIZE", static_cast<unsigned int>(8)},
|
||||
{"PERFORMANCE_HINT_NUM_REQUESTS", static_cast<uint32_t>(2)},
|
||||
{"GPU_MEMORY_STATISTICS", "1024000"},
|
||||
{"GPU_DEVICE_TOTAL_MEM_SIZE", "4096000000"}},
|
||||
{{"AUTO_BATCH_TIMEOUT", "200"}, {"AUTO_BATCH_DEVICE_CONFIG", "CPU"}},
|
||||
1},
|
||||
// Case 3: the GPU batch size is determined by
// 1) the min of opt_batch_size and infReq_num, and
// 2) available_mem / one_graph_mem_footprint, rounded down to a power of 2.
// The final m_batch_size is the min of 1) and 2).
|
||||
plugin_compile_model_param{{{"PERFORMANCE_HINT", ov::hint::PerformanceMode::THROUGHPUT},
|
||||
{"OPTIMAL_BATCH_SIZE", static_cast<unsigned int>(16)},
|
||||
{"PERFORMANCE_HINT_NUM_REQUESTS", static_cast<uint32_t>(12)},
|
||||
{"GPU_MEMORY_STATISTICS", "1000"},
|
||||
{"GPU_DEVICE_TOTAL_MEM_SIZE", "5000"}},
|
||||
{{"AUTO_BATCH_TIMEOUT", "200"}, {"AUTO_BATCH_DEVICE_CONFIG", "GPU"}},
|
||||
4},
|
||||
plugin_compile_model_param{{{"PERFORMANCE_HINT", ov::hint::PerformanceMode::THROUGHPUT},
|
||||
{"OPTIMAL_BATCH_SIZE", static_cast<unsigned int>(16)},
|
||||
{"PERFORMANCE_HINT_NUM_REQUESTS", static_cast<uint32_t>(12)},
|
||||
{"GPU_MEMORY_STATISTICS", "1024000"},
|
||||
{"GPU_DEVICE_TOTAL_MEM_SIZE", "40960000"}},
|
||||
{{"AUTO_BATCH_TIMEOUT", "200"}, {"AUTO_BATCH_DEVICE_CONFIG", "GPU"}},
|
||||
12},
|
||||
plugin_compile_model_param{{{"PERFORMANCE_HINT", ov::hint::PerformanceMode::THROUGHPUT},
|
||||
{"OPTIMAL_BATCH_SIZE", static_cast<unsigned int>(32)},
|
||||
{"PERFORMANCE_HINT_NUM_REQUESTS", static_cast<uint32_t>(24)},
|
||||
{"GPU_MEMORY_STATISTICS", "1000"},
|
||||
{"GPU_DEVICE_TOTAL_MEM_SIZE", "18000"}},
|
||||
{{"AUTO_BATCH_TIMEOUT", "200"}, {"AUTO_BATCH_DEVICE_CONFIG", "GPU"}},
|
||||
16},
|
||||
plugin_compile_model_param{{{"PERFORMANCE_HINT", ov::hint::PerformanceMode::THROUGHPUT},
|
||||
{"OPTIMAL_BATCH_SIZE", static_cast<unsigned int>(32)},
|
||||
{"PERFORMANCE_HINT_NUM_REQUESTS", static_cast<uint32_t>(48)},
|
||||
{"GPU_MEMORY_STATISTICS", "1000"},
|
||||
{"GPU_DEVICE_TOTAL_MEM_SIZE", "180000"}},
|
||||
{{"AUTO_BATCH_TIMEOUT", "200"}, {"AUTO_BATCH_DEVICE_CONFIG", "GPU"}},
|
||||
32},
|
||||
// Case 4: LATENCY hint with an explicit batch size in AUTO_BATCH_DEVICE_CONFIG
|
||||
plugin_compile_model_param{{{"PERFORMANCE_HINT", ov::hint::PerformanceMode::LATENCY},
|
||||
{"OPTIMAL_BATCH_SIZE", static_cast<unsigned int>(16)},
|
||||
{"PERFORMANCE_HINT_NUM_REQUESTS", static_cast<uint32_t>(12)},
|
||||
{"GPU_MEMORY_STATISTICS", "1024000"},
|
||||
{"GPU_DEVICE_TOTAL_MEM_SIZE", "4096000000"}},
|
||||
{{"AUTO_BATCH_TIMEOUT", "200"}, {"AUTO_BATCH_DEVICE_CONFIG", "CPU(32)"}},
|
||||
32},
|
||||
};
|
||||
|
||||
INSTANTIATE_TEST_SUITE_P(smoke_AutoBatch_BehaviorTests,
|
||||
PluginCompileModelTest,
|
||||
::testing::ValuesIn(plugin_compile_model_param_test),
|
||||
PluginCompileModelTest::getTestCaseName);
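// ---------------------------------------------------------------------------
// Note for readers comparing this with the legacy-API test above: with API 2.0
// the mocked core properties are typed ov::Any values (for example unsigned
// int for OPTIMAL_BATCH_SIZE) instead of plain strings, and numeric strings
// still convert on demand. A minimal illustration, not taken from the test
// code itself:
//
//     ov::AnyMap props{{"OPTIMAL_BATCH_SIZE", static_cast<unsigned int>(16)},
//                      {"GPU_DEVICE_TOTAL_MEM_SIZE", "4096000000"}};
//     auto optimal   = props.at("OPTIMAL_BATCH_SIZE").as<unsigned int>();     // 16
//     auto total_mem = props.at("GPU_DEVICE_TOTAL_MEM_SIZE").as<uint64_t>();  // 4096000000
// ---------------------------------------------------------------------------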
|
src/plugins/auto_batch/tests/unit/plugin_get_property_test.cpp (new file, 102 lines)
@@ -0,0 +1,102 @@
|
||||
// Copyright (C) 2018-2023 Intel Corporation
|
||||
// SPDX-License-Identifier: Apache-2.0
|
||||
//
|
||||
|
||||
#include <gmock/gmock.h>
|
||||
#include <gtest/gtest.h>
|
||||
|
||||
#include "mock_common.hpp"
|
||||
#include "unit_test_utils/mocks/cpp_interfaces/interface/mock_icore.hpp"
|
||||
|
||||
using ::testing::_;
|
||||
using ::testing::AnyNumber;
|
||||
using ::testing::AtLeast;
|
||||
using ::testing::Eq;
|
||||
using ::testing::NiceMock;
|
||||
using ::testing::Return;
|
||||
using ::testing::ReturnRef;
|
||||
using ::testing::StrEq;
|
||||
using ::testing::StrNe;
|
||||
using ::testing::Throw;
|
||||
|
||||
using namespace ov::mock_autobatch_plugin;
|
||||
|
||||
using get_property_params = std::tuple<std::string, // Get Property Name
|
||||
bool>; // Throw exception
|
||||
|
||||
const char supported_metric[] = "SUPPORTED_METRICS FULL_DEVICE_NAME SUPPORTED_CONFIG_KEYS";
|
||||
const char supported_config_keys[] = "AUTO_BATCH_DEVICE_CONFIG MULTI_DEVICE_PRIORITIES AUTO_BATCH_TIMEOUT CACHE_DIR";
|
||||
|
||||
class GetPropertyTest : public ::testing::TestWithParam<get_property_params> {
|
||||
public:
|
||||
std::string m_property_name;
|
||||
bool m_throw_exception;
|
||||
std::shared_ptr<NiceMock<MockAutoBatchInferencePlugin>> m_plugin;
|
||||
|
||||
public:
|
||||
static std::string getTestCaseName(testing::TestParamInfo<get_property_params> obj) {
|
||||
std::string property_name;
|
||||
bool throw_exception;
|
||||
|
||||
std::tie(property_name, throw_exception) = obj.param;
|
||||
std::string res = "";
|
||||
|
||||
if (!property_name.empty()) {
|
||||
res += "GetProperty_" + property_name;
|
||||
}
|
||||
if (throw_exception)
|
||||
res += "_throw";
|
||||
return res;
|
||||
}
|
||||
|
||||
void TearDown() override {
|
||||
m_plugin.reset();
|
||||
}
|
||||
|
||||
void SetUp() override {
|
||||
std::tie(m_property_name, m_throw_exception) = this->GetParam();
|
||||
m_plugin =
|
||||
std::shared_ptr<NiceMock<MockAutoBatchInferencePlugin>>(new NiceMock<MockAutoBatchInferencePlugin>());
|
||||
|
||||
ON_CALL(*m_plugin, get_property).WillByDefault([this](const std::string& name, const ov::AnyMap& arguments) {
|
||||
return m_plugin->Plugin::get_property(name, arguments);
|
||||
});
|
||||
}
|
||||
};
|
||||
|
||||
TEST_P(GetPropertyTest, GetPropertyTestCase) {
|
||||
ov::AnyMap options = {};
|
||||
if (m_throw_exception) {
|
||||
ASSERT_ANY_THROW(m_plugin->get_property(m_property_name, options));
|
||||
} else {
|
||||
ov::Any value;
|
||||
ASSERT_NO_THROW(value = m_plugin->get_property(m_property_name, options));
|
||||
if (m_property_name == METRIC_KEY(SUPPORTED_METRICS)) {
|
||||
EXPECT_EQ(value.as<std::string>(), supported_metric);
|
||||
return;
|
||||
}
|
||||
if (m_property_name == ov::device::full_name.name()) {
|
||||
EXPECT_EQ(value.as<std::string>(), "BATCH");
|
||||
return;
|
||||
}
|
||||
if (m_property_name == METRIC_KEY(SUPPORTED_CONFIG_KEYS)) {
|
||||
EXPECT_EQ(value.as<std::string>(), supported_config_keys);
|
||||
return;
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
const std::vector<get_property_params> get_property_params_test = {
|
||||
get_property_params{"AUTO_BATCH_TIMEOUT", false},
|
||||
get_property_params{"AUTO_BATCH_DEVICE_CONFIG", true},
|
||||
get_property_params{"CACHE_DIR", true},
|
||||
get_property_params{METRIC_KEY(SUPPORTED_METRICS), false},
|
||||
get_property_params{METRIC_KEY(SUPPORTED_CONFIG_KEYS), false},
|
||||
get_property_params{"CPU_THREADS_NUM", true},
|
||||
get_property_params{"PERFORMANCE_HINT", true},
|
||||
};
|
||||
|
||||
INSTANTIATE_TEST_SUITE_P(smoke_AutoBatch_BehaviorTests,
|
||||
GetPropertyTest,
|
||||
::testing::ValuesIn(get_property_params_test),
|
||||
GetPropertyTest::getTestCaseName);
|
@@ -0,0 +1,91 @@
|
||||
// Copyright (C) 2018-2023 Intel Corporation
|
||||
// SPDX-License-Identifier: Apache-2.0
|
||||
//
|
||||
|
||||
#include <gmock/gmock.h>
|
||||
#include <gtest/gtest.h>
|
||||
|
||||
#include "mock_common.hpp"
|
||||
#include "ngraph_functions/subgraph_builders.hpp"
|
||||
#include "unit_test_utils/mocks/cpp_interfaces/interface/mock_icore.hpp"
|
||||
|
||||
using ::testing::_;
|
||||
using ::testing::AnyNumber;
|
||||
using ::testing::AtLeast;
|
||||
using ::testing::Eq;
|
||||
using ::testing::NiceMock;
|
||||
using ::testing::Return;
|
||||
using ::testing::ReturnRef;
|
||||
using ::testing::StrEq;
|
||||
using ::testing::StrNe;
|
||||
using ::testing::Throw;
|
||||
|
||||
using namespace ov::mock_autobatch_plugin;
|
||||
|
||||
using query_model_params = std::tuple<ov::AnyMap, // Set Property
|
||||
bool>;
|
||||
|
||||
class QueryModelTest : public ::testing::TestWithParam<query_model_params> {
|
||||
public:
|
||||
ov::AnyMap m_properties;
|
||||
bool m_throw_exception;
|
||||
std::shared_ptr<NiceMock<MockICore>> m_core;
|
||||
std::shared_ptr<NiceMock<MockAutoBatchInferencePlugin>> m_plugin;
|
||||
std::shared_ptr<ov::Model> m_model;
|
||||
ov::SupportedOpsMap m_supported_ops_map;
|
||||
|
||||
public:
|
||||
static std::string getTestCaseName(testing::TestParamInfo<query_model_params> obj) {
|
||||
ov::AnyMap properties;
|
||||
bool throw_exception;
|
||||
|
||||
std::tie(properties, throw_exception) = obj.param;
|
||||
std::string res = "";
|
||||
if (properties.size() > 0) {
|
||||
res += "QueryModel_";
|
||||
for (auto& it : properties) {
|
||||
res += it.first + "_" + it.second.as<std::string>() + "_";
|
||||
}
|
||||
}
|
||||
if (throw_exception)
|
||||
res += "_throw";
|
||||
return res;
|
||||
}
|
||||
|
||||
void TearDown() override {
|
||||
m_core.reset();
|
||||
m_plugin.reset();
|
||||
m_model.reset();
|
||||
}
|
||||
|
||||
void SetUp() override {
|
||||
std::tie(m_properties, m_throw_exception) = this->GetParam();
|
||||
m_model = ngraph::builder::subgraph::makeMultiSingleConv();
|
||||
m_core = std::shared_ptr<NiceMock<MockICore>>(new NiceMock<MockICore>());
|
||||
m_plugin =
|
||||
std::shared_ptr<NiceMock<MockAutoBatchInferencePlugin>>(new NiceMock<MockAutoBatchInferencePlugin>());
|
||||
m_plugin->set_core(m_core);
|
||||
|
||||
ON_CALL(*m_core, query_model).WillByDefault(Return(m_supported_ops_map));
|
||||
}
|
||||
};
|
||||
|
||||
TEST_P(QueryModelTest, QueryModelTestCase) {
|
||||
if (m_throw_exception) {
|
||||
ASSERT_ANY_THROW(m_plugin->query_model(m_model, m_properties));
|
||||
} else {
|
||||
ASSERT_NO_THROW(m_plugin->query_model(m_model, m_properties));
|
||||
}
|
||||
}
|
||||
|
||||
const std::vector<query_model_params> query_model_params_test = {
|
||||
query_model_params{{{}}, true},
|
||||
query_model_params{{{"AUTO_BATCH_TIMEOUT", "200"}}, true},
|
||||
query_model_params{{{"AUTO_BATCH_DEVICE_CONFIG", "CPU(4)"}}, false},
|
||||
query_model_params{{{"AUTO_BATCH_TIMEOUT", "200"}, {"AUTO_BATCH_DEVICE_CONFIG", "CPU(4)"}}, false},
|
||||
};
|
||||
|
||||
INSTANTIATE_TEST_SUITE_P(smoke_AutoBatch_BehaviorTests,
|
||||
QueryModelTest,
|
||||
::testing::ValuesIn(query_model_params_test),
|
||||
QueryModelTest::getTestCaseName);
|
@@ -0,0 +1,88 @@
|
||||
// Copyright (C) 2018-2023 Intel Corporation
|
||||
// SPDX-License-Identifier: Apache-2.0
|
||||
//
|
||||
|
||||
#include <gmock/gmock.h>
|
||||
#include <gtest/gtest.h>
|
||||
|
||||
#include "mock_common.hpp"
|
||||
|
||||
using ::testing::_;
|
||||
using ::testing::AnyNumber;
|
||||
using ::testing::AtLeast;
|
||||
using ::testing::Eq;
|
||||
using ::testing::NiceMock;
|
||||
using ::testing::Return;
|
||||
using ::testing::ReturnRef;
|
||||
using ::testing::StrEq;
|
||||
using ::testing::StrNe;
|
||||
using ::testing::Throw;
|
||||
|
||||
using namespace ov::mock_autobatch_plugin;
|
||||
|
||||
using set_property_params = std::tuple<ov::AnyMap, // Set Property
|
||||
bool>;
|
||||
|
||||
class SetPropertyTest : public ::testing::TestWithParam<set_property_params> {
|
||||
public:
|
||||
ov::AnyMap m_properties;
|
||||
bool m_throw_exception;
|
||||
std::shared_ptr<NiceMock<MockAutoBatchInferencePlugin>> m_plugin;
|
||||
|
||||
public:
|
||||
static std::string getTestCaseName(testing::TestParamInfo<set_property_params> obj) {
|
||||
ov::AnyMap properties;
|
||||
bool throw_exception;
|
||||
|
||||
std::tie(properties, throw_exception) = obj.param;
|
||||
std::string res = "";
|
||||
if (properties.size() > 0) {
|
||||
res += "SetProperty_";
|
||||
for (auto& it : properties) {
|
||||
res += it.first + "_" + it.second.as<std::string>() + "_";
|
||||
}
|
||||
}
|
||||
if (throw_exception)
|
||||
res += "_throw";
|
||||
return res;
|
||||
}
|
||||
|
||||
void TearDown() override {
|
||||
m_plugin.reset();
|
||||
}
|
||||
|
||||
void SetUp() override {
|
||||
std::tie(m_properties, m_throw_exception) = this->GetParam();
|
||||
m_plugin =
|
||||
std::shared_ptr<NiceMock<MockAutoBatchInferencePlugin>>(new NiceMock<MockAutoBatchInferencePlugin>());
|
||||
}
|
||||
};
|
||||
|
||||
TEST_P(SetPropertyTest, SetPropertyTestCase) {
|
||||
if (m_properties.size() == 0) {
|
||||
ASSERT_NO_THROW(m_plugin->set_property(m_properties));
|
||||
return;
|
||||
}
|
||||
|
||||
if (m_throw_exception) {
|
||||
ASSERT_ANY_THROW(m_plugin->set_property(m_properties));
|
||||
} else {
|
||||
ASSERT_NO_THROW(m_plugin->set_property(m_properties));
|
||||
}
|
||||
}
|
||||
|
||||
const std::vector<set_property_params> plugin_set_property_params_test = {
|
||||
set_property_params{{{"AUTO_BATCH_TIMEOUT", "200"}}, false},
|
||||
set_property_params{{{"AUTO_BATCH_DEVICE_CONFIG", "CPU(4)"}}, false},
|
||||
set_property_params{{{"CACHE_DIR", "./xyz"}}, false},
|
||||
set_property_params{{{"AUTO_BATCH_TIMEOUT", "200"}, {"AUTO_BATCH_DEVICE_CONFIG", "CPU(4)"}}, false},
|
||||
set_property_params{{{"AUTO_BATCH_TIMEOUT", "200"}, {"AUTO_BATCH_DEVICE_CONFIG", "CPU(4)"}, {"CACHE_DIR", "./xyz"}},
|
||||
false},
|
||||
set_property_params{{{"XYZ", "200"}}, true},
|
||||
set_property_params{{{"XYZ", "200"}, {"AUTO_BATCH_DEVICE_CONFIG", "CPU(4)"}, {"CACHE_DIR", "./xyz"}}, true},
|
||||
};
|
||||
|
||||
INSTANTIATE_TEST_SUITE_P(smoke_AutoBatch_BehaviorTests,
|
||||
SetPropertyTest,
|
||||
::testing::ValuesIn(plugin_set_property_params_test),
|
||||
SetPropertyTest::getTestCaseName);
|
@@ -1,397 +0,0 @@
|
||||
// Copyright (C) 2018-2023 Intel Corporation
|
||||
// SPDX-License-Identifier: Apache-2.0
|
||||
//
|
||||
|
||||
#include <gmock/gmock.h>
|
||||
#include <gtest/gtest.h>
|
||||
|
||||
#include "mock_auto_batch_plugin.hpp"
|
||||
#include "unit_test_utils/mocks/cpp_interfaces/impl/mock_inference_plugin_internal.hpp"
|
||||
#include "unit_test_utils/mocks/cpp_interfaces/interface/mock_icore.hpp"
|
||||
#include "unit_test_utils/mocks/cpp_interfaces/interface/mock_iinference_plugin.hpp"
|
||||
|
||||
using ::testing::_;
|
||||
using ::testing::AnyNumber;
|
||||
using ::testing::AtLeast;
|
||||
using ::testing::Eq;
|
||||
using ::testing::NiceMock;
|
||||
using ::testing::Return;
|
||||
using ::testing::ReturnRef;
|
||||
using ::testing::StrEq;
|
||||
using ::testing::StrNe;
|
||||
using ::testing::Throw;
|
||||
using namespace ov::mock_autobatch_plugin;
|
||||
using BatchDeviceConfigParams = std::tuple<std::string, // Batch devices
|
||||
std::string, // Expected device name
|
||||
int, // Expected batch size
|
||||
bool // Throw exception
|
||||
>;
|
||||
using MetricConfigParams = std::tuple<std::string, std::string, bool>;
|
||||
using MetaDeviceParams = std::tuple<std::string, // Device batch cfg
|
||||
std::map<std::string, std::string>, // Config
|
||||
DeviceInformation, // Expected result
|
||||
bool>; // Throw exception
|
||||
using SetGetConfigParams = std::tuple<std::map<std::string, std::string>, // Set Config
|
||||
std::string, // Get Config
|
||||
bool>; // Throw exception
|
||||
|
||||
const std::vector<std::string> cpu_supported_properties = {
|
||||
"CACHE_DIR",
|
||||
};
|
||||
const std::vector<std::string> gpu_supported_properties = {
|
||||
"CACHE_DIR",
|
||||
"OPTIMAL_BATCH_SIZE",
|
||||
};
|
||||
|
||||
class SetGetConfigTest : public ::testing::TestWithParam<SetGetConfigParams> {
|
||||
public:
|
||||
std::shared_ptr<NiceMock<MockICore>> core;
|
||||
std::shared_ptr<NiceMock<MockAutoBatchInferencePlugin>> plugin;
|
||||
|
||||
public:
|
||||
static std::string getTestCaseName(testing::TestParamInfo<SetGetConfigParams> obj) {
|
||||
std::map<std::string, std::string> set_config;
|
||||
std::string get_config;
|
||||
bool throw_exception;
|
||||
|
||||
std::tie(set_config, get_config, throw_exception) = obj.param;
|
||||
std::string res = "";
|
||||
if (set_config.size() > 0) {
|
||||
res += "GetConfig_";
|
||||
for (auto& it : set_config) {
|
||||
res += it.first + "_" + it.second + "_";
|
||||
}
|
||||
}
|
||||
if (!get_config.empty()) {
|
||||
res += "GetConfig_" + get_config;
|
||||
}
|
||||
if (throw_exception)
|
||||
res += "_throw";
|
||||
return res;
|
||||
}
|
||||
|
||||
void TearDown() override {
|
||||
core.reset();
|
||||
plugin.reset();
|
||||
}
|
||||
|
||||
void SetUp() override {
|
||||
core = std::shared_ptr<NiceMock<MockICore>>(new NiceMock<MockICore>());
|
||||
plugin = std::shared_ptr<NiceMock<MockAutoBatchInferencePlugin>>(new NiceMock<MockAutoBatchInferencePlugin>());
|
||||
plugin->SetCore(core);
|
||||
|
||||
ON_CALL(*plugin, ParseBatchDevice).WillByDefault([this](const std::string& batchDevice) {
|
||||
return plugin->Plugin::ParseBatchDevice(batchDevice);
|
||||
});
|
||||
}
|
||||
};
|
||||
|
||||
TEST_P(SetGetConfigTest, SetConfigTestCase) {
|
||||
std::map<std::string, std::string> set_config;
|
||||
std::string temp;
|
||||
bool throw_exception;
|
||||
std::tie(set_config, temp, throw_exception) = this->GetParam();
|
||||
|
||||
if (set_config.size() == 0) {
|
||||
ASSERT_NO_THROW(plugin->SetConfig(set_config));
|
||||
return;
|
||||
}
|
||||
|
||||
if (throw_exception) {
|
||||
ASSERT_ANY_THROW(plugin->SetConfig(set_config));
|
||||
} else {
|
||||
ASSERT_NO_THROW(plugin->SetConfig(set_config));
|
||||
}
|
||||
}
|
||||
|
||||
TEST_P(SetGetConfigTest, GetConfigTestCase) {
|
||||
std::map<std::string, std::string> temp;
|
||||
std::string get_config;
|
||||
bool throw_exception;
|
||||
std::tie(temp, get_config, throw_exception) = this->GetParam();
|
||||
|
||||
if (get_config.empty() || temp.size() > 0) {
|
||||
return;
|
||||
}
|
||||
|
||||
std::map<std::string, InferenceEngine::Parameter> options = {};
|
||||
if (throw_exception) {
|
||||
ASSERT_ANY_THROW(plugin->GetConfig(get_config, options));
|
||||
} else {
|
||||
ASSERT_NO_THROW(plugin->GetConfig(get_config, options));
|
||||
}
|
||||
}
|
||||
|
||||
TEST_P(SetGetConfigTest, SetGetConfigTestCase) {
|
||||
std::map<std::string, std::string> set_config;
|
||||
std::string get_config;
|
||||
bool throw_exception;
|
||||
std::tie(set_config, get_config, throw_exception) = this->GetParam();
|
||||
|
||||
if (get_config.empty() || set_config.size() == 0) {
|
||||
return;
|
||||
}
|
||||
|
||||
std::map<std::string, InferenceEngine::Parameter> options = {};
|
||||
ASSERT_NO_THROW(plugin->SetConfig(set_config));
|
||||
InferenceEngine::Parameter result;
|
||||
ASSERT_NO_THROW(result = plugin->GetConfig(get_config, options));
|
||||
EXPECT_EQ(result.as<std::string>(), set_config[get_config]);
|
||||
}
|
||||
|
||||
class ParseMetaDeviceTest : public ::testing::TestWithParam<MetaDeviceParams> {
|
||||
public:
|
||||
std::shared_ptr<NiceMock<MockICore>> core;
|
||||
std::shared_ptr<NiceMock<MockAutoBatchInferencePlugin>> plugin;
|
||||
|
||||
public:
|
||||
static std::string getTestCaseName(testing::TestParamInfo<MetaDeviceParams> obj) {
|
||||
std::string batch_cfg;
|
||||
std::map<std::string, std::string> config;
|
||||
DeviceInformation info;
|
||||
bool throw_exception;
|
||||
|
||||
std::tie(batch_cfg, config, info, throw_exception) = obj.param;
|
||||
std::string res = batch_cfg;
|
||||
for (auto& c : config) {
|
||||
res += "_" + c.first + "_" + c.second;
|
||||
}
|
||||
if (throw_exception)
|
||||
res += "_throw";
|
||||
return res;
|
||||
}
|
||||
|
||||
void TearDown() override {
|
||||
core.reset();
|
||||
plugin.reset();
|
||||
}
|
||||
|
||||
void SetUp() override {
|
||||
core = std::shared_ptr<NiceMock<MockICore>>(new NiceMock<MockICore>());
|
||||
plugin = std::shared_ptr<NiceMock<MockAutoBatchInferencePlugin>>(new NiceMock<MockAutoBatchInferencePlugin>());
|
||||
plugin->SetCore(core);
|
||||
|
||||
ON_CALL(*core, GetSupportedConfig)
|
||||
.WillByDefault([](const std::string& device, const std::map<std::string, std::string>& configs) {
|
||||
std::map<std::string, std::string> res_config;
|
||||
if (device == "CPU") {
|
||||
for (auto& c : configs) {
|
||||
if (std::find(begin(cpu_supported_properties), end(cpu_supported_properties), c.first) !=
|
||||
cpu_supported_properties.end())
|
||||
res_config[c.first] = c.second;
|
||||
}
|
||||
} else if (device == "GPU") {
|
||||
for (auto& c : configs) {
|
||||
if (std::find(begin(gpu_supported_properties), end(gpu_supported_properties), c.first) !=
|
||||
gpu_supported_properties.end())
|
||||
res_config[c.first] = c.second;
|
||||
}
|
||||
}
|
||||
return res_config;
|
||||
});
|
||||
|
||||
ON_CALL(*plugin, ParseBatchDevice).WillByDefault([this](const std::string& batchDevice) {
|
||||
return plugin->Plugin::ParseBatchDevice(batchDevice);
|
||||
});
|
||||
}
|
||||
|
||||
bool compare(std::map<std::string, std::string> a, std::map<std::string, std::string> b) {
|
||||
if (a.size() != b.size())
|
||||
return false;
|
||||
|
||||
for (auto& it : a) {
|
||||
auto item = b.find(it.first);
|
||||
if (item == b.end())
|
||||
return false;
|
||||
if (it.second != item->second)
|
||||
return false;
|
||||
}
|
||||
return true;
|
||||
}
|
||||
};
|
||||
|
||||
TEST_P(ParseMetaDeviceTest, ParseMetaDeviceTestCase) {
|
||||
std::string batch_cfg;
|
||||
std::map<std::string, std::string> config;
|
||||
DeviceInformation expected;
|
||||
bool throw_exception;
|
||||
|
||||
std::tie(batch_cfg, config, expected, throw_exception) = this->GetParam();
|
||||
|
||||
if (throw_exception) {
|
||||
ASSERT_ANY_THROW(plugin->ParseMetaDevice(batch_cfg, config));
|
||||
} else {
|
||||
auto result = plugin->ParseMetaDevice(batch_cfg, config);
|
||||
EXPECT_EQ(result.device_name, expected.device_name);
|
||||
EXPECT_EQ(result.batch_for_device, expected.batch_for_device);
|
||||
EXPECT_TRUE(compare(result.config, expected.config));
|
||||
}
|
||||
}
|
||||
|
||||
class ParseBatchDeviceTest : public ::testing::TestWithParam<BatchDeviceConfigParams> {
|
||||
public:
|
||||
std::shared_ptr<NiceMock<MockICore>> core;
|
||||
std::shared_ptr<NiceMock<MockAutoBatchInferencePlugin>> plugin;
|
||||
|
||||
public:
|
||||
static std::string getTestCaseName(testing::TestParamInfo<BatchDeviceConfigParams> obj) {
|
||||
std::string batchDevice;
|
||||
std::string deviceName;
|
||||
int batchSize;
|
||||
bool throw_exception;
|
||||
std::tie(batchDevice, deviceName, batchSize, throw_exception) = obj.param;
|
||||
return batchDevice;
|
||||
}
|
||||
|
||||
void TearDown() override {
|
||||
core.reset();
|
||||
plugin.reset();
|
||||
}
|
||||
|
||||
void SetUp() override {
|
||||
core = std::shared_ptr<NiceMock<MockICore>>(new NiceMock<MockICore>());
|
||||
plugin = std::shared_ptr<NiceMock<MockAutoBatchInferencePlugin>>(new NiceMock<MockAutoBatchInferencePlugin>());
|
||||
plugin->SetCore(core);
|
||||
|
||||
ON_CALL(*plugin, ParseBatchDevice).WillByDefault([this](const std::string& batchDevice) {
|
||||
return plugin->Plugin::ParseBatchDevice(batchDevice);
|
||||
});
|
||||
}
|
||||
};
|
||||
|
||||
TEST_P(ParseBatchDeviceTest, ParseBatchDeviceTestCase) {
|
||||
std::string batchDevice;
|
||||
std::string deviceName;
|
||||
int batchSize;
|
||||
bool throw_exception;
|
||||
std::tie(batchDevice, deviceName, batchSize, throw_exception) = this->GetParam();
|
||||
|
||||
if (throw_exception) {
|
||||
ASSERT_ANY_THROW(plugin->ParseBatchDevice(batchDevice));
|
||||
} else {
|
||||
auto result = plugin->ParseBatchDevice(batchDevice);
|
||||
EXPECT_EQ(result.device_name, deviceName);
|
||||
EXPECT_EQ(result.batch_for_device, batchSize);
|
||||
}
|
||||
}
|
||||
|
||||
class PluginMetricTest : public ::testing::TestWithParam<MetricConfigParams> {
|
||||
public:
|
||||
std::shared_ptr<NiceMock<MockICore>> core;
|
||||
std::shared_ptr<NiceMock<MockAutoBatchInferencePlugin>> plugin;
|
||||
|
||||
public:
|
||||
static std::string getTestCaseName(testing::TestParamInfo<MetricConfigParams> obj) {
|
||||
std::string metricName;
|
||||
std::string value;
|
||||
bool throw_exception;
|
||||
std::tie(metricName, value, throw_exception) = obj.param;
|
||||
return "Metric_" + metricName;
|
||||
}
|
||||
|
||||
void TearDown() override {
|
||||
core.reset();
|
||||
plugin.reset();
|
||||
}
|
||||
|
||||
void SetUp() override {
|
||||
core = std::shared_ptr<NiceMock<MockICore>>(new NiceMock<MockICore>());
|
||||
plugin = std::shared_ptr<NiceMock<MockAutoBatchInferencePlugin>>(new NiceMock<MockAutoBatchInferencePlugin>());
|
||||
plugin->SetCore(core);
|
||||
|
||||
ON_CALL(*plugin, GetMetric)
|
||||
.WillByDefault(
|
||||
[this](const std::string& name, const std::map<std::string, InferenceEngine::Parameter>& options) {
|
||||
return plugin->Plugin::GetMetric(name, options);
|
||||
});
|
||||
}
|
||||
};
|
||||
|
||||
TEST_P(PluginMetricTest, GetPluginMetricTest) {
|
||||
std::string metricName;
|
||||
std::string expected;
|
||||
bool throw_exception;
|
||||
std::tie(metricName, expected, throw_exception) = this->GetParam();
|
||||
|
||||
if (throw_exception) {
|
||||
ASSERT_ANY_THROW(plugin->GetMetric(metricName, {}));
|
||||
} else {
|
||||
auto value = plugin->GetMetric(metricName, {});
|
||||
EXPECT_EQ(value.as<std::string>(), expected);
|
||||
}
|
||||
}
|
||||
|
||||
const char supported_metric[] = "SUPPORTED_METRICS FULL_DEVICE_NAME SUPPORTED_CONFIG_KEYS";
|
||||
const char supported_config_keys[] = "AUTO_BATCH_DEVICE_CONFIG MULTI_DEVICE_PRIORITIES AUTO_BATCH_TIMEOUT CACHE_DIR";
|
||||
|
||||
const std::vector<BatchDeviceConfigParams> batchDeviceTestConfigs = {
|
||||
BatchDeviceConfigParams{"CPU(4)", "CPU", 4, false},
|
||||
BatchDeviceConfigParams{"GPU(8)", "GPU", 8, false},
|
||||
BatchDeviceConfigParams{"CPU(0)", "CPU", 0, true},
|
||||
BatchDeviceConfigParams{"GPU(-1)", "GPU", 0, true},
|
||||
};
|
||||
|
||||
const std::vector<MetricConfigParams> metricTestConfigs = {
|
||||
MetricConfigParams{METRIC_KEY(SUPPORTED_METRICS), supported_metric, false},
|
||||
MetricConfigParams{METRIC_KEY(FULL_DEVICE_NAME), "BATCH", false},
|
||||
MetricConfigParams{METRIC_KEY(SUPPORTED_CONFIG_KEYS), supported_config_keys, false},
|
||||
MetricConfigParams{"CPU_THREADS_NUM", "16", true},
|
||||
MetricConfigParams{"PERFORMANCE_HINT", "LATENCY", true},
|
||||
};
|
||||
|
const std::vector<MetaDeviceParams> testMetaDeviceConfigs = {
    MetaDeviceParams{"CPU(4)", {}, DeviceInformation{"CPU", {}, 4}, false},
    MetaDeviceParams{"CPU(4)", {{}}, DeviceInformation{"CPU", {{}}, 4}, true},
    MetaDeviceParams{"CPU(4)", {{"CACHE_DIR", "./"}}, DeviceInformation{"CPU", {{"CACHE_DIR", "./"}}, 4}, false},
    MetaDeviceParams{"GPU(4)", {{"CACHE_DIR", "./"}}, DeviceInformation{"GPU", {{"CACHE_DIR", "./"}}, 4}, false},
    MetaDeviceParams{"GPU(8)",
                     {{"CACHE_DIR", "./"}, {"OPTIMAL_BATCH_SIZE", "16"}},
                     DeviceInformation{"GPU", {{"CACHE_DIR", "./"}, {"OPTIMAL_BATCH_SIZE", "16"}}, 8},
                     false},
    MetaDeviceParams{"CPU(4)", {{"OPTIMAL_BATCH_SIZE", "16"}}, DeviceInformation{"CPU", {{}}, 4}, true},
    MetaDeviceParams{"CPU(4)",
                     {{"CACHE_DIR", "./"}, {"OPTIMAL_BATCH_SIZE", "16"}},
                     DeviceInformation{"CPU", {{"CACHE_DIR", "./"}}, 4},
                     true},
};

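// SetGetConfigParams is presumably {configs to set, config key to get, expect-throw flag}: the
// supported keys (AUTO_BATCH_TIMEOUT, AUTO_BATCH_DEVICE_CONFIG, CACHE_DIR) can be set, unknown
// keys such as "XYZ" are rejected, and getting AUTO_BATCH_DEVICE_CONFIG or CACHE_DIR before they
// were set throws, while AUTO_BATCH_TIMEOUT presumably falls back to a default value.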
const std::vector<SetGetConfigParams> testSetGetConfigParams = {
    // Set Config
    SetGetConfigParams{{{"AUTO_BATCH_TIMEOUT", "200"}}, {}, false},
    SetGetConfigParams{{{"AUTO_BATCH_DEVICE_CONFIG", "CPU(4)"}}, {}, false},
    SetGetConfigParams{{{"CACHE_DIR", "./xyz"}}, {}, false},
    SetGetConfigParams{{{"AUTO_BATCH_TIMEOUT", "200"}, {"AUTO_BATCH_DEVICE_CONFIG", "CPU(4)"}}, {}, false},
    SetGetConfigParams{{{"AUTO_BATCH_TIMEOUT", "200"}, {"AUTO_BATCH_DEVICE_CONFIG", "CPU(4)"}, {"CACHE_DIR", "./xyz"}},
                       {},
                       false},
    SetGetConfigParams{{{"XYZ", "200"}}, {}, true},
    SetGetConfigParams{{{"XYZ", "200"}, {"AUTO_BATCH_DEVICE_CONFIG", "CPU(4)"}, {"CACHE_DIR", "./xyz"}}, {}, true},
    // Get Config
    SetGetConfigParams{{}, "AUTO_BATCH_TIMEOUT", false},
    SetGetConfigParams{{}, "AUTO_BATCH_DEVICE_CONFIG", true},
    SetGetConfigParams{{}, "CACHE_DIR", true},
    // Set and get Config
    SetGetConfigParams{{{"AUTO_BATCH_TIMEOUT", "200"}}, "AUTO_BATCH_TIMEOUT", false},
    SetGetConfigParams{{{"AUTO_BATCH_DEVICE_CONFIG", "CPU(4)"}}, "AUTO_BATCH_DEVICE_CONFIG", false},
    SetGetConfigParams{{{"CACHE_DIR", "./abc"}}, "CACHE_DIR", false},
};

INSTANTIATE_TEST_SUITE_P(smoke_AutoBatch_BehaviorTests,
                         SetGetConfigTest,
                         ::testing::ValuesIn(testSetGetConfigParams),
                         SetGetConfigTest::getTestCaseName);

INSTANTIATE_TEST_SUITE_P(smoke_AutoBatch_BehaviorTests,
                         ParseBatchDeviceTest,
                         ::testing::ValuesIn(batchDeviceTestConfigs),
                         ParseBatchDeviceTest::getTestCaseName);

INSTANTIATE_TEST_SUITE_P(smoke_AutoBatch_BehaviorTests,
                         PluginMetricTest,
                         ::testing::ValuesIn(metricTestConfigs),
                         PluginMetricTest::getTestCaseName);

INSTANTIATE_TEST_SUITE_P(smoke_AutoBatch_BehaviorTests,
                         ParseMetaDeviceTest,
                         ::testing::ValuesIn(testMetaDeviceConfigs),
                         ParseMetaDeviceTest::getTestCaseName);

src/plugins/auto_batch/tests/unit/sync_infer_request_test.cpp (new file, 257 lines)
@@ -0,0 +1,257 @@
// Copyright (C) 2018-2023 Intel Corporation
// SPDX-License-Identifier: Apache-2.0
//

#include <gmock/gmock.h>
#include <gtest/gtest.h>

#include "mock_common.hpp"
#include "ngraph_functions/subgraph_builders.hpp"
#include "openvino/core/dimension_tracker.hpp"
#include "openvino/core/type/element_type.hpp"
#include "openvino/runtime/threading/immediate_executor.hpp"
#include "transformations/utils/utils.hpp"
#include "unit_test_utils/mocks/cpp_interfaces/interface/mock_icore.hpp"

using ::testing::_;
using ::testing::AnyNumber;
using ::testing::AtLeast;
using ::testing::Eq;
using ::testing::MatcherCast;
using ::testing::Matches;
using ::testing::NiceMock;
using ::testing::Return;
using ::testing::ReturnRef;
using ::testing::StrEq;
using ::testing::StrNe;
using ::testing::Throw;

using AutoBatchRequestTestParams = std::tuple<uint32_t,              // batch_size
                                              ov::element::Type_t>;  // data type

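// The fixture below builds the object graph an auto-batch SyncInferRequest depends on: a mocked
// core and plugin, mocked compiled models with and without batching, an async request wrapping a
// mocked sync request, and a CompiledModel::WorkerInferRequest that the per-slot requests attach
// to. Batch size and element type come from the test parameters.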
class AutoBatchRequestTest : public ::testing::TestWithParam<AutoBatchRequestTestParams> {
public:
    std::shared_ptr<ov::Model> m_model;
    std::shared_ptr<NiceMock<MockICore>> m_core;
    std::shared_ptr<NiceMock<MockAutoBatchInferencePlugin>> m_auto_batch_plugin;

    std::shared_ptr<NiceMock<MockICompiledModel>> m_i_compile_model_without_batch;
    ov::SoPtr<ov::ICompiledModel> m_compile_model_without_batch;

    std::shared_ptr<NiceMock<MockICompiledModel>> m_i_compile_model_with_batch;
    ov::SoPtr<ov::ICompiledModel> m_compile_model_with_batch;

    ov::AnyMap m_config;
    DeviceInformation m_device_info;
    std::set<std::string> m_batched_inputs;
    std::set<std::string> m_batched_outputs;
    ov::SoPtr<ov::IRemoteContext> m_remote_context;

    std::shared_ptr<MockAutoBatchCompileModel> m_auto_batch_compile_model;

    std::shared_ptr<NiceMock<MockISyncInferRequest>> m_sync_infer_request_with_batch;

    std::shared_ptr<NiceMock<MockIAsyncInferRequest>> m_async_infer_request_with_batch;

    std::shared_ptr<ov::threading::ImmediateExecutor> m_executor;

    std::shared_ptr<CompiledModel::WorkerInferRequest> workerRequestPtr;

    uint32_t m_batch_size;
    ov::element::Type_t m_element_type;

    std::vector<std::shared_ptr<SyncInferRequest>> m_auto_batch_infer_requests;

    std::vector<ov::ProfilingInfo> m_profiling_info;

    static std::string getTestCaseName(testing::TestParamInfo<AutoBatchRequestTestParams> obj) {
        uint32_t batch_size;
        ov::element::Type_t element_type;
        std::tie(batch_size, element_type) = obj.param;

        std::string res;
        res = "batch_size_" + std::to_string(batch_size);
        res += "_element_type_" + std::to_string(static_cast<int>(element_type));
        return res;
    }

    void TearDown() override {
        m_profiling_info.clear();
        m_auto_batch_infer_requests.clear();
        m_auto_batch_plugin.reset();
        m_model.reset();
        m_core.reset();
        m_i_compile_model_without_batch.reset();
        m_compile_model_without_batch = {};
        m_i_compile_model_with_batch.reset();
        m_compile_model_with_batch = {};
        m_auto_batch_compile_model.reset();
        m_sync_infer_request_with_batch.reset();
        m_async_infer_request_with_batch.reset();
        m_executor.reset();
        clear_worker();
        workerRequestPtr.reset();
    }

    void SetUp() override {
        std::tie(m_batch_size, m_element_type) = this->GetParam();
        std::vector<size_t> inputShape = {1, 3, 24, 24};
        m_model = ngraph::builder::subgraph::makeMultiSingleConv(inputShape, m_element_type);
        m_core = std::shared_ptr<NiceMock<MockICore>>(new NiceMock<MockICore>());

        m_auto_batch_plugin =
            std::shared_ptr<NiceMock<MockAutoBatchInferencePlugin>>(new NiceMock<MockAutoBatchInferencePlugin>());

        m_auto_batch_plugin->set_core(m_core);
        m_i_compile_model_without_batch = std::make_shared<NiceMock<MockICompiledModel>>(m_model, m_auto_batch_plugin);
        m_compile_model_without_batch = {m_i_compile_model_without_batch, {}};

        m_config = {{"AUTO_BATCH_TIMEOUT", "200"}};

        m_device_info = {"CPU", {}, m_batch_size};
        m_batched_inputs = {"Parameter_0"};
        m_batched_outputs = {"Convolution_20"};

        m_i_compile_model_with_batch = std::make_shared<NiceMock<MockICompiledModel>>(m_model, m_auto_batch_plugin);
        m_compile_model_with_batch = {m_i_compile_model_with_batch, {}};

        ASSERT_NO_THROW(m_auto_batch_compile_model =
                            std::make_shared<MockAutoBatchCompileModel>(m_model->clone(),
                                                                        m_auto_batch_plugin,
                                                                        m_config,
                                                                        m_device_info,
                                                                        m_batched_inputs,
                                                                        m_batched_outputs,
                                                                        m_compile_model_with_batch,
                                                                        m_compile_model_without_batch,
                                                                        m_remote_context));

        m_sync_infer_request_with_batch =
            std::make_shared<NiceMock<MockISyncInferRequest>>(m_i_compile_model_with_batch);

        m_executor = std::make_shared<ov::threading::ImmediateExecutor>();

        m_async_infer_request_with_batch =
            std::make_shared<NiceMock<MockIAsyncInferRequest>>(m_sync_infer_request_with_batch, m_executor, nullptr);

        m_profiling_info = {};
    }

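    // create_worker builds a minimal CompiledModel::WorkerInferRequest around the mocked batched
    // async request: it sizes the completion-task list to the batch size, propagates any exception
    // reported by the batched request's callback, and starts a stub thread that is joined later in
    // clear_worker().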
    void create_worker(int batch_size) {
        workerRequestPtr = std::make_shared<CompiledModel::WorkerInferRequest>();

        workerRequestPtr->_infer_request_batched = {m_async_infer_request_with_batch, {}};
        workerRequestPtr->_batch_size = batch_size;
        workerRequestPtr->_completion_tasks.resize(workerRequestPtr->_batch_size);
        workerRequestPtr->_infer_request_batched->set_callback([this](std::exception_ptr exceptionPtr) mutable {
            if (exceptionPtr)
                workerRequestPtr->_exception_ptr = exceptionPtr;
        });
        workerRequestPtr->_thread = std::thread([] {
            std::this_thread::sleep_for(std::chrono::milliseconds(10));
        });
        return;
    }

    void clear_worker() {
        workerRequestPtr->_infer_request_batched = {};
        workerRequestPtr->_completion_tasks.clear();
        workerRequestPtr->_thread.join();
    }

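    // prepare_input records the legacy (IE) tensor names of every model parameter and result as
    // the batched inputs/outputs that SyncInferRequest is expected to gather/scatter; the
    // batch_size argument is currently unused.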
    void prepare_input(std::shared_ptr<ov::Model>& model, int batch_size) {
        const auto& params = model->get_parameters();
        for (size_t i = 0; i < params.size(); i++) {
            m_batched_inputs.insert(ov::op::util::get_ie_output_name(params[i]->output(0)));
        }
        const auto& results = model->get_results();
        for (size_t i = 0; i < results.size(); i++) {
            const auto& output = results[i];
            const auto& node = output->input_value(0);
            m_batched_outputs.insert(
                ov::op::util::get_ie_output_name(ov::Output<const ov::Node>(node.get_node(), node.get_index())));
        }
    }
};

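// The tests below create SyncInferRequest objects on top of the shared worker request (one per
// batch slot in the first case, a single slot-0 request in the others) and check construction,
// input/output copy handling, and profiling-info forwarding.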
TEST_P(AutoBatchRequestTest, AutoBatchRequestCreateTestCase) {
    prepare_input(m_model, m_batch_size);
    create_worker(m_batch_size);

    for (uint32_t batch_id = 0; batch_id < m_batch_size; batch_id++) {
        auto req = std::make_shared<SyncInferRequest>(m_auto_batch_compile_model,
                                                      workerRequestPtr,
                                                      batch_id,
                                                      m_batch_size,
                                                      m_batched_inputs,
                                                      m_batched_outputs);
        EXPECT_NE(req, nullptr);
        m_auto_batch_infer_requests.emplace_back(req);
    }
}

TEST_P(AutoBatchRequestTest, AutoBatchRequestCopyInputTensorTestCase) {
    prepare_input(m_model, m_batch_size);
    create_worker(m_batch_size);

    auto req = std::make_shared<SyncInferRequest>(m_auto_batch_compile_model,
                                                  workerRequestPtr,
                                                  0,
                                                  m_batch_size,
                                                  m_batched_inputs,
                                                  m_batched_outputs);
    EXPECT_NE(req, nullptr);
    m_auto_batch_infer_requests.emplace_back(req);

    EXPECT_NO_THROW(req->copy_inputs_if_needed());
}

TEST_P(AutoBatchRequestTest, AutoBatchRequestCopyOutputTensorTestCase) {
    prepare_input(m_model, m_batch_size);
    create_worker(m_batch_size);

    auto req = std::make_shared<SyncInferRequest>(m_auto_batch_compile_model,
                                                  workerRequestPtr,
                                                  0,
                                                  m_batch_size,
                                                  m_batched_inputs,
                                                  m_batched_outputs);
    EXPECT_NE(req, nullptr);
    m_auto_batch_infer_requests.emplace_back(req);

    EXPECT_NO_THROW(req->copy_outputs_if_needed());
}

TEST_P(AutoBatchRequestTest, AutoBatchRequestGetProfilingInfoTestCase) {
    prepare_input(m_model, m_batch_size);
    create_worker(m_batch_size);

    auto req = std::make_shared<SyncInferRequest>(m_auto_batch_compile_model,
                                                  workerRequestPtr,
                                                  0,
                                                  m_batch_size,
                                                  m_batched_inputs,
                                                  m_batched_outputs);
    EXPECT_NE(req, nullptr);

    ON_CALL(*m_sync_infer_request_with_batch, get_profiling_info()).WillByDefault(Return(m_profiling_info));

    EXPECT_NO_THROW(req->get_profiling_info());
}

std::vector<ov::element::Type_t> element_type{ov::element::Type_t::f16,
                                              ov::element::Type_t::f32,
                                              ov::element::Type_t::f64,
                                              ov::element::Type_t::i8,
                                              ov::element::Type_t::i16,
                                              ov::element::Type_t::i32,
                                              ov::element::Type_t::i64,
                                              ov::element::Type_t::u8,
                                              ov::element::Type_t::u16,
                                              ov::element::Type_t::u32,
                                              ov::element::Type_t::u64};
const std::vector<uint32_t> batch_size{1, 8, 16, 32, 64, 128};

INSTANTIATE_TEST_SUITE_P(smoke_AutoBatch_BehaviorTests,
                         AutoBatchRequestTest,
                         ::testing::Combine(::testing::ValuesIn(batch_size), ::testing::ValuesIn(element_type)),
                         AutoBatchRequestTest::getTestCaseName);

@@ -16,8 +16,6 @@ const std::vector<ov::AnyMap> inproperties = {
 
 const std::vector<ov::AnyMap> auto_batch_inproperties = {
     {ov::num_streams(-100)},
-    {{CONFIG_KEY(AUTO_BATCH_DEVICE_CONFIG), std::string(CommonTestUtils::DEVICE_CPU) + "(4)"},
-     {ov::auto_batch_timeout(-1)}},
 };
 
 INSTANTIATE_TEST_SUITE_P(smoke_BehaviorTests,

@@ -57,6 +57,6 @@ INSTANTIATE_TEST_SUITE_P(
     ::testing::Combine(
         ::testing::Values(std::string(CommonTestUtils::DEVICE_BATCH) + ":" + CommonTestUtils::DEVICE_GPU),
         ::testing::Values(DefaultParameter{ov::auto_batch_timeout.name(),
-                                           InferenceEngine::Parameter{1000}})),
+                                           InferenceEngine::Parameter{uint32_t(1000)}})),
     DefaultConfigurationTest::getTestCaseName);
 } // namespace AutoBatchingTests

@@ -89,8 +89,6 @@ namespace {
 
 auto auto_batch_inconfigs = []() {
     return std::vector<std::map<std::string, std::string>>{
-        {{CONFIG_KEY(AUTO_BATCH_DEVICE_CONFIG), CommonTestUtils::DEVICE_GPU},
-         {CONFIG_KEY(AUTO_BATCH_TIMEOUT), "-1"}},
         {{CONFIG_KEY(AUTO_BATCH_DEVICE_CONFIG), CommonTestUtils::DEVICE_GPU},
          {InferenceEngine::PluginConfigParams::KEY_PERFORMANCE_HINT, "DOESN'T EXIST"}},
         {{CONFIG_KEY(AUTO_BATCH_DEVICE_CONFIG), CommonTestUtils::DEVICE_GPU},

@@ -16,8 +16,6 @@ const std::vector<ov::AnyMap> inproperties = {
 
 const std::vector<ov::AnyMap> auto_batch_inproperties = {
     {ov::device::id("UNSUPPORTED_DEVICE_ID_STRING")},
-    {{CONFIG_KEY(AUTO_BATCH_DEVICE_CONFIG), std::string(CommonTestUtils::DEVICE_TEMPLATE) + "(4)"},
-     {ov::auto_batch_timeout(-1)}},
 };
 
 INSTANTIATE_TEST_SUITE_P(smoke_BehaviorTests,

@@ -15,9 +15,7 @@ const std::vector<ov::AnyMap> inproperties = {
     {ov::device::id("UNSUPPORTED_DEVICE_ID_STRING")},
 };
 
-const std::vector<ov::AnyMap> auto_batch_inproperties = {
-    {{ov::auto_batch_timeout(-1)}},
-};
+const std::vector<ov::AnyMap> auto_batch_inproperties = {};
 
 INSTANTIATE_TEST_SUITE_P(ov_compiled_model_mandatory, OVClassCompiledModelPropertiesIncorrectTests,
     ::testing::Combine(

@@ -16,9 +16,7 @@ const std::vector<ov::AnyMap> inproperties = {
     {ov::device::id("UNSUPPORTED_DEVICE_ID_STRING")},
 };
 
-const std::vector<ov::AnyMap> auto_batch_inproperties = {
-    {{ov::auto_batch_timeout(-1)}},
-};
+const std::vector<ov::AnyMap> auto_batch_inproperties = {};
 
 INSTANTIATE_TEST_SUITE_P(ov_plugin_mandatory, OVPropertiesIncorrectTests,
     ::testing::Combine(

@@ -58,6 +58,7 @@ public:
    MOCK_CONST_METHOD3(GetMetric, ov::Any(const std::string&, const std::string&, const ov::AnyMap&));
    MOCK_CONST_METHOD2(GetConfig, ov::Any(const std::string&, const std::string&));
    MOCK_CONST_METHOD3(get_property, ov::Any(const std::string&, const std::string&, const ov::AnyMap&));
    MOCK_CONST_METHOD2(get_property, ov::Any(const std::string&, const std::string&));
    MOCK_CONST_METHOD0(GetAvailableDevices, std::vector<std::string>());
    MOCK_CONST_METHOD1(DeviceSupportsModelCaching, bool(const std::string&));  // NOLINT not a cast to bool
    MOCK_METHOD2(GetSupportedConfig,