Delete the deprecated LowLatency (version1) transformation (#17965)
* Delete the deprecated LowLatency (version1) transformation * delete LowLatency refs from the docs
This commit is contained in:
parent
cff083f83d
commit
74100670ac
@ -1,140 +0,0 @@
# [DEPRECATED] The LowLatency Transformation {#openvino_docs_OV_UG_lowlatency_deprecated}

@sphinxdirective

The deprecated LowLatency transformation changes the structure of a network containing :doc:`TensorIterator <openvino_docs_ops_infrastructure_TensorIterator_1>` and :doc:`Loop <openvino_docs_ops_infrastructure_Loop_5>` operations by adding the ability to work with the state, inserting :doc:`Assign <openvino_docs_ops_infrastructure_Assign_3>` / :doc:`ReadValue <openvino_docs_ops_infrastructure_ReadValue_3>` layers, as shown in the picture below.

.. image:: _static/images/applying_low_latency.svg

After applying the transformation, ``ReadValue`` operations can receive other operations as an input, as shown in the picture above. These inputs should set the initial value for the initialization of the ``ReadValue`` operations. However, such initialization is not supported in the current State API implementation. Input values are ignored, and the initial values for the ``ReadValue`` operations are set to 0 unless otherwise specified by the user via the :ref:`State API <openvino-state-api>`.

Steps to Apply LowLatency
#########################

1. Get CNNNetwork. Either way is acceptable:

   * :doc:`from IR or ONNX model <openvino_docs_OV_UG_Integrate_OV_with_your_application>`
   * :doc:`from ov::Model <openvino_docs_OV_UG_Model_Representation>`
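For the first option, a minimal, hedged sketch of obtaining a CNNNetwork from an IR file (the ``model.xml`` path is a placeholder; the weights file is looked up next to it):

```cpp
#include <inference_engine.hpp>

int main() {
    InferenceEngine::Core core;
    // Read the model from IR; "model.xml" is a placeholder path.
    InferenceEngine::CNNNetwork cnnNetwork = core.ReadNetwork("model.xml");
    return 0;
}
```
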
2. :doc:`Reshape <openvino_docs_OV_UG_ShapeInference>` the CNNNetwork if necessary.

   An example of such a **necessary case** is when the ``sequence_lengths`` dimension of the input is greater than 1. It means that the ``TensorIterator`` layer will have ``number_iterations`` > 1, so the inputs of the network should be reshaped to set ``sequence_dimension`` to exactly 1.

   Usually, the following exception, which occurs when trying to infer the network in a plugin after the transformation has been applied, indicates the need for the reshape:

   ``C++ exception with description "Function is incorrect. The Assign and ReadValue operations must be used in pairs in the network."``

   It means that there are several pairs of ``Assign``/``ReadValue`` operations with the same ``variable_id`` in the network, because the operations were inserted into each iteration of the ``TensorIterator``.

   .. code-block:: cpp

      // Network before reshape: Parameter (name: X, shape: [2 (sequence_lengths), 1, 16]) -> TensorIterator (num_iteration = 2, axis = 0) -> ...

      cnnNetwork.reshape({{"X", {1, 1, 16}}});

      // Network after reshape: Parameter (name: X, shape: [1 (sequence_lengths), 1, 16]) -> TensorIterator (num_iteration = 1, axis = 0) -> ...
3. Apply the LowLatency transformation.

   .. code-block:: cpp

      #include "ie_transformations.hpp"

      ...

      InferenceEngine::LowLatency(cnnNetwork);
**State naming rule**: the name of a state is a concatenation of three names: the original ``TensorIterator`` operation, the parameter of the body, and the suffix ``variable_`` + ``id`` (zero-based indexing, restarted for each ``TensorIterator``). Use these rules to predict the name of the inserted state after the transformation is applied. For example:

.. code-block:: cpp

   // Precondition in ngraph::function.
   // Created TensorIterator and Parameter in the body of TensorIterator with names
   std::string tensor_iterator_name = "TI_name";
   std::string body_parameter_name = "param_name";
   std::string idx = "0"; // it is the first variable in the network

   // The state will be named "TI_name/param_name/variable_0"
   auto state_name = tensor_iterator_name + "/" + body_parameter_name + "/" + "variable_" + idx;

   InferenceEngine::CNNNetwork cnnNetwork = InferenceEngine::CNNNetwork{function};
   InferenceEngine::LowLatency(cnnNetwork);

   InferenceEngine::ExecutableNetwork executableNetwork = core->LoadNetwork(/*cnnNetwork, targetDevice, configuration*/);

   // Try to find the variable by name.
   auto states = executableNetwork.QueryState();
   for (auto& state : states) {
       auto name = state.GetName();
       if (name == state_name) {
           // some actions
       }
   }
4. Use the state API. See the :ref:`OpenVINO state API <openvino-state-api>` and the :ref:`Example of stateful network inference <example-of-stateful-model-inference>` sections.
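A brief, hedged sketch of working with states through the State API (assuming ``executableNetwork`` was created as in the previous step; ``QueryState`` and the ``VariableState`` methods are part of the Inference Engine API):

```cpp
// Enumerate the variable states of the executable network,
// reset each one before processing a new sequence, and
// take a snapshot of the current state value.
auto states = executableNetwork.QueryState();
for (auto& state : states) {
    state.Reset();  // set the state to its initial value
    InferenceEngine::Blob::CPtr snapshot = state.GetState();
    (void)snapshot;  // e.g. inspect or store the value here
}
```
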
Known Limitations for the LowLatency
####################################

1. Parameters connected directly to ``ReadValue`` operations (states) after the transformation is applied are not allowed.

   Unnecessary parameters may remain in the graph after applying the transformation. Automatic handling of this case inside the transformation is currently not possible. Such parameters should be removed manually from the `ngraph::Function <classngraph.html#doxid-classngraph-1a14d7fe7c605267b52c145579e12d2a5f>`__ or replaced with a constant.

   .. image:: _static/images/low_latency_limitation_1.svg
      :scale: 70 %

   **Current solutions:**

   * Replace a parameter with a constant (freeze it) with the ``[0, 0, 0 … 0]`` value via the :doc:`Model Optimizer CLI <openvino_docs_MO_DG_prepare_model_convert_model_Converting_Model>`: the ``--input`` or ``--freeze_placeholder_with_value`` parameter.
   * Use the nGraph API to replace a parameter with a constant, as shown in the example below:

   .. code-block:: cpp

      // nGraph example: how to replace a Parameter with a Constant.
      auto func = cnnNetwork.getFunction();
      // Create the new Constant with zero values.
      auto new_const = std::make_shared<ngraph::opset6::Constant>( /*type, shape, std::vector with zeros*/ );
      for (const auto& param : func->get_parameters()) {
          // Try to find the problematic Parameter by name.
          if (param->get_friendly_name() == "param_name") {
              // Replace the problematic Parameter with the Constant.
              ngraph::replace_node(param, new_const);
              // Remove the problematic Parameter from the ngraph::Function.
              func->remove_parameter(param);
          }
      }
2. The reshape precondition cannot always be executed, which prevents the transformation from being applied correctly.

   Networks can be non-reshapable. The most common reason is that shape values are hardcoded in a constant somewhere in the network.

   .. image:: _static/images/low_latency_limitation_2.svg
      :scale: 70 %

   **Current solutions:**

   * Trim the non-reshapable layers via the :doc:`Model Optimizer CLI <openvino_docs_MO_DG_prepare_model_convert_model_Converting_Model>`: the ``--input`` and ``--output`` parameters. For example, the parameter and the problematic constant (as shown in the picture above) can be trimmed using the ``--input Reshape_layer_name`` command-line option.
   * Use the nGraph API to replace the problematic constant, as shown in the example below:

   .. code-block:: cpp

      // nGraph example: how to replace a Constant holding hardcoded shape values with a new one holding the correct values.
      // Assume we know which Constant (const_with_hardcoded_shape) prevents the reshape from being applied.
      // Then we can find this Constant by name in the network and replace it with a new one with the correct shape.
      auto func = cnnNetwork.getFunction();
      // Create the new Constant with the correct shape.
      // For the example shown in the picture above, the new values of the Constant should be 1, 1, 10 instead of 1, 49, 10.
      auto new_const = std::make_shared<ngraph::opset6::Constant>( /*type, shape, value_with_correct_shape*/ );
      for (const auto& node : func->get_ops()) {
          // Try to find the problematic Constant by name.
          if (node->get_friendly_name() == "name_of_non_reshapable_const") {
              auto const_with_hardcoded_shape = std::dynamic_pointer_cast<ngraph::opset6::Constant>(node);
              // Replace the problematic Constant with the new one. Do this for all the problematic Constants
              // in the network, then you can apply the reshape feature.
              ngraph::replace_node(const_with_hardcoded_shape, new_const);
          }
      }

@endsphinxdirective
@ -7,7 +7,6 @@
:hidden:

openvino_docs_OV_UG_lowlatency2
openvino_docs_OV_UG_lowlatency_deprecated
Several use cases require processing of data sequences. When the length of a sequence is known and small enough,
@ -197,11 +196,7 @@ refer to the speech sample and a demo in the :doc:`Samples Overview <openvino_do
LowLatency Transformations
##########################
If the original framework does not have a special API for working with states, the OpenVINO representation will not contain ``Assign``/``ReadValue`` layers after importing the model. For example, if the original ONNX model contains RNN operations, the OpenVINO IR will contain :doc:`TensorIterator <openvino_docs_ops_infrastructure_TensorIterator_1>` operations, and the values will be obtained only after execution of the whole ``TensorIterator`` primitive. Intermediate values from each iteration will not be available. Working with these intermediate values of each iteration is enabled by the special :doc:`LowLatency2 <openvino_docs_OV_UG_lowlatency2>` transformation, which also helps receive these values with low latency after each infer request.
.. note::

   It is recommended to use LowLatency2, as the LowLatency transformation has been deprecated.
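Applying LowLatency2 follows the same pass-manager pattern as the deprecated transformation; a hedged sketch (assuming a ``model`` of type ``std::shared_ptr<ov::Model>`` already exists):

```cpp
#include <openvino/pass/low_latency.hpp>
#include <openvino/pass/manager.hpp>

// Register and run the LowLatency2 pass on an ov::Model.
ov::pass::Manager manager;
manager.register_pass<ov::pass::LowLatency2>();
manager.run_passes(model);
```
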
TensorIterator/Loop operations
++++++++++++++++++++++++++++++
@ -1,485 +0,0 @@
// Copyright (C) 2018-2023 Intel Corporation
// SPDX-License-Identifier: Apache-2.0
//

#include <gtest/gtest.h>

#include <memory>
#include <ngraph/function.hpp>
#include <ngraph/opsets/opset6.hpp>
#include <ngraph/pass/low_latency.hpp>
#include <ngraph/pass/manager.hpp>
#include <queue>
#include <string>
#include <transformations/control_flow/unroll_tensor_iterator.hpp>
#include <transformations/init_node_info.hpp>

#include "common_test_utils/ngraph_test_utils.hpp"

using namespace testing;
using namespace ngraph;

TEST(TransformationTests, LowLatencyLSTM) {
    std::shared_ptr<ngraph::Function> f(nullptr), f_ref(nullptr);
    {
        auto X = std::make_shared<opset6::Parameter>(element::f32, Shape{1, 1, 16});
        X->set_friendly_name("X");
        auto H_init = std::make_shared<opset6::Parameter>(element::f32, Shape{1, 128});
        H_init->set_friendly_name("H_init");
        auto C_init = std::make_shared<opset6::Parameter>(element::f32, Shape{1, 128});
        C_init->set_friendly_name("C_init");

        auto Xi = std::make_shared<opset6::Parameter>(element::f32, Shape{1, 1, 16});
        Xi->set_friendly_name("Xi");
        auto H_t = std::make_shared<opset6::Parameter>(element::f32, Shape{1, 128});
        H_t->set_friendly_name("H_t");
        auto C_t = std::make_shared<opset6::Parameter>(element::f32, Shape{1, 128});
        C_t->set_friendly_name("C_t");

        // Body
        auto axis = ngraph::opset6::Constant::create(ngraph::element::i64, ngraph::Shape{}, {0});
        auto squeeze = std::make_shared<opset6::Squeeze>(Xi, axis);

        auto w_val = std::vector<float>(512 * 16, 0);
        auto r_val = std::vector<float>(512 * 128, 0);
        auto b_val = std::vector<float>(512, 0);
        auto W = ngraph::opset6::Constant::create(ngraph::element::f32, ngraph::Shape{512, 16}, w_val);
        auto R = ngraph::opset6::Constant::create(ngraph::element::f32, ngraph::Shape{512, 128}, r_val);
        auto B = ngraph::opset6::Constant::create(ngraph::element::f32, ngraph::Shape{512}, b_val);

        auto lstm_cell = std::make_shared<opset6::LSTMCell>(squeeze, H_t, C_t, W, R, B, 128);
        auto res_1 = std::make_shared<opset6::Result>(lstm_cell->output(0));
        auto unsqueeze = std::make_shared<opset6::Unsqueeze>(lstm_cell->output(0), axis);
        auto res_2 = std::make_shared<opset6::Result>(unsqueeze);
        auto res_3 = std::make_shared<opset6::Result>(lstm_cell->output(1));
        auto body =
            std::make_shared<ngraph::Function>(OutputVector{res_1, res_2, res_3}, ParameterVector{Xi, H_t, C_t});

        auto tensor_iterator = std::make_shared<opset6::TensorIterator>();
        tensor_iterator->set_body(body);
        tensor_iterator->set_friendly_name("LSTMTensorIterator");

        tensor_iterator->set_merged_input(C_t, C_init, res_3);
        tensor_iterator->set_sliced_input(Xi, X, 0, 1, 1, -1, 0);
        tensor_iterator->set_merged_input(H_t, H_init, res_1);

        auto out0 = tensor_iterator->get_iter_value(res_1, -1);
        auto out1 = tensor_iterator->get_concatenated_slices(res_2, 0, 1, 1, -1, 0);

        auto res_ti_1 = std::make_shared<opset6::Result>(tensor_iterator->output(1));
        auto res_ti_2 = std::make_shared<opset6::Result>(tensor_iterator->output(0));
        f = std::make_shared<ngraph::Function>(ngraph::NodeVector{res_ti_1, res_ti_2},
                                               ngraph::ParameterVector{X, H_init, C_init});

        ngraph::pass::Manager manager;
        manager.register_pass<ov::pass::InitNodeInfo>();
        NGRAPH_SUPPRESS_DEPRECATED_START
        manager.register_pass<ngraph::pass::LowLatency>();
        NGRAPH_SUPPRESS_DEPRECATED_END
        manager.register_pass<ov::pass::UnrollTensorIterator>();
        manager.run_passes(f);
    }
    {
        auto Xi = std::make_shared<opset6::Parameter>(element::f32, Shape{1, 1, 16});
        auto H_t = std::make_shared<opset6::Parameter>(element::f32, Shape{1, 128});
        auto C_t = std::make_shared<opset6::Parameter>(element::f32, Shape{1, 128});

        const std::string variable_name_H("LSTMTensorIterator/H_t/variable_2");
        const std::string variable_name_C("LSTMTensorIterator/C_t/variable_0");
        auto variable_H =
            std::make_shared<Variable>(VariableInfo{PartialShape::dynamic(), element::dynamic, variable_name_H});
        auto variable_C =
            std::make_shared<Variable>(VariableInfo{PartialShape::dynamic(), element::dynamic, variable_name_C});
        auto read_value_H = std::make_shared<opset6::ReadValue>(H_t, variable_H);
        auto read_value_C = std::make_shared<opset6::ReadValue>(C_t, variable_C);
        // Body
        auto axis = ngraph::opset6::Constant::create(ngraph::element::i64, ngraph::Shape{}, {0});
        auto squeeze = std::make_shared<opset6::Squeeze>(Xi, axis);

        auto w_val = std::vector<float>(512 * 16, 0);
        auto r_val = std::vector<float>(512 * 128, 0);
        auto b_val = std::vector<float>(512, 0);
        auto W = ngraph::opset6::Constant::create(ngraph::element::f32, ngraph::Shape{512, 16}, w_val);
        auto R = ngraph::opset6::Constant::create(ngraph::element::f32, ngraph::Shape{512, 128}, r_val);
        auto B = ngraph::opset6::Constant::create(ngraph::element::f32, ngraph::Shape{512}, b_val);

        auto lstm_cell = std::make_shared<opset6::LSTMCell>(squeeze, read_value_H, read_value_C, W, R, B, 128);
        auto assign_H = std::make_shared<opset6::Assign>(lstm_cell->output(0), variable_H);
        auto assign_C = std::make_shared<opset6::Assign>(lstm_cell->output(1), variable_C);
        auto unsqueeze = std::make_shared<opset6::Unsqueeze>(lstm_cell->output(0), axis);
        auto res_2 = std::make_shared<opset6::Result>(unsqueeze);
        auto res_1 = std::make_shared<opset6::Result>(lstm_cell->output(0));
        f_ref = std::make_shared<ngraph::Function>(OutputVector{res_1, res_2}, ParameterVector{Xi, H_t, C_t});
        f_ref->add_sinks({assign_C, assign_H});
        assign_H->add_control_dependency(read_value_H);
        assign_C->add_control_dependency(read_value_C);
    }
    auto res = compare_functions(f, f_ref, true, false, false, true, true);
    ASSERT_TRUE(res.first) << res.second;
}

TEST(TransformationTests, LowLatencyGRU) {
    std::shared_ptr<ngraph::Function> f(nullptr), f_ref(nullptr);
    {
        auto X = std::make_shared<opset6::Parameter>(element::f32, Shape{1, 1, 16});
        auto Y = std::make_shared<opset6::Parameter>(element::f32, Shape{1, 128});

        auto Xi = std::make_shared<opset6::Parameter>(element::f32, Shape{1, 1, 16});
        auto Yi = std::make_shared<opset6::Parameter>(element::f32, Shape{1, 128});
        Yi->set_friendly_name("Yi");

        // Body
        auto axis = ngraph::opset6::Constant::create(ngraph::element::i64, ngraph::Shape{}, {0});
        auto squeeze = std::make_shared<opset6::Squeeze>(Xi, axis);

        auto w_val = std::vector<float>(384 * 16, 0);
        auto r_val = std::vector<float>(384 * 128, 0);
        auto b_val = std::vector<float>(384, 0);
        auto W = ngraph::opset6::Constant::create(ngraph::element::f32, ngraph::Shape{384, 16}, w_val);
        auto R = ngraph::opset6::Constant::create(ngraph::element::f32, ngraph::Shape{384, 128}, r_val);
        auto B = ngraph::opset6::Constant::create(ngraph::element::f32, ngraph::Shape{384}, b_val);

        auto gru_cell = std::make_shared<opset6::GRUCell>(squeeze, Yi, W, R, B, 128);
        auto res_1 = std::make_shared<opset6::Result>(gru_cell);
        auto unsqueeze = std::make_shared<opset6::Unsqueeze>(gru_cell, axis);
        auto res_2 = std::make_shared<opset6::Result>(unsqueeze);
        auto body = std::make_shared<ngraph::Function>(OutputVector{res_1, res_2}, ParameterVector{Xi, Yi});

        auto tensor_iterator = std::make_shared<opset6::TensorIterator>();
        tensor_iterator->set_body(body);
        tensor_iterator->set_friendly_name("GRUTensorIterator");

        tensor_iterator->set_sliced_input(Xi, X, 0, 1, 1, -1, 0);
        tensor_iterator->set_merged_input(Yi, Y, res_1);

        auto out0 = tensor_iterator->get_iter_value(res_1, -1);
        auto out1 = tensor_iterator->get_concatenated_slices(res_2, 0, 1, 1, -1, 0);

        auto res_ti_1 = std::make_shared<opset6::Result>(tensor_iterator->output(1));
        f = std::make_shared<ngraph::Function>(ngraph::NodeVector{res_ti_1}, ngraph::ParameterVector{X, Y});

        ngraph::pass::Manager manager;
        manager.register_pass<ov::pass::InitNodeInfo>();
        NGRAPH_SUPPRESS_DEPRECATED_START
        manager.register_pass<ngraph::pass::LowLatency>();
        NGRAPH_SUPPRESS_DEPRECATED_END
        manager.register_pass<ov::pass::UnrollTensorIterator>();
        manager.run_passes(f);

        ASSERT_NO_THROW(check_rt_info(f));
    }
    {
        auto Xi = std::make_shared<opset6::Parameter>(element::f32, Shape{1, 1, 16});
        auto H_t = std::make_shared<opset6::Parameter>(element::f32, Shape{1, 128});

        const std::string variable_name_H("GRUTensorIterator/Yi/variable");
        auto variable_H =
            std::make_shared<Variable>(VariableInfo{PartialShape::dynamic(), element::dynamic, variable_name_H});
        auto read_value_H = std::make_shared<opset6::ReadValue>(H_t, variable_H);
        // Body
        auto axis = ngraph::opset6::Constant::create(ngraph::element::i64, ngraph::Shape{}, {0});
        auto squeeze = std::make_shared<opset6::Squeeze>(Xi, axis);

        auto w_val = std::vector<float>(384 * 16, 0);
        auto r_val = std::vector<float>(384 * 128, 0);
        auto b_val = std::vector<float>(384, 0);
        auto W = ngraph::opset6::Constant::create(ngraph::element::f32, ngraph::Shape{384, 16}, w_val);
        auto R = ngraph::opset6::Constant::create(ngraph::element::f32, ngraph::Shape{384, 128}, r_val);
        auto B = ngraph::opset6::Constant::create(ngraph::element::f32, ngraph::Shape{384}, b_val);

        auto rnn_cell = std::make_shared<opset6::GRUCell>(squeeze, read_value_H, W, R, B, 128);
        auto assign_H = std::make_shared<opset6::Assign>(rnn_cell->output(0), variable_H);
        auto res_1 = std::make_shared<opset6::Result>(assign_H);
        auto unsqueeze = std::make_shared<opset6::Unsqueeze>(rnn_cell->output(0), axis);
        auto res_2 = std::make_shared<opset6::Result>(unsqueeze);
        f_ref = std::make_shared<ngraph::Function>(OutputVector{unsqueeze}, ParameterVector{Xi, H_t});
        f_ref->add_sinks({assign_H});
        assign_H->add_control_dependency(read_value_H);
    }
    auto res = compare_functions(f, f_ref);
    ASSERT_TRUE(res.first) << res.second;
}

TEST(TransformationTests, LowLatencyRNN) {
    std::shared_ptr<ngraph::Function> f(nullptr), f_ref(nullptr);
    {
        auto X = std::make_shared<opset6::Parameter>(element::f32, Shape{1, 1, 16});
        auto Y = std::make_shared<opset6::Parameter>(element::f32, Shape{1, 128});

        auto Xi = std::make_shared<opset6::Parameter>(element::f32, Shape{1, 1, 16});
        auto Yi = std::make_shared<opset6::Parameter>(element::f32, Shape{1, 128});
        Yi->set_friendly_name("Yi");

        // Body
        auto axis = ngraph::opset6::Constant::create(ngraph::element::i64, ngraph::Shape{}, {0});
        auto squeeze = std::make_shared<opset6::Squeeze>(Xi, axis);

        auto w_val = std::vector<float>(128 * 16, 0);
        auto r_val = std::vector<float>(128 * 128, 0);
        auto b_val = std::vector<float>(128, 0);
        auto W = ngraph::opset6::Constant::create(ngraph::element::f32, ngraph::Shape{128, 16}, w_val);
        auto R = ngraph::opset6::Constant::create(ngraph::element::f32, ngraph::Shape{128, 128}, r_val);
        auto B = ngraph::opset6::Constant::create(ngraph::element::f32, ngraph::Shape{128}, b_val);

        auto rnn_cell = std::make_shared<opset6::RNNCell>(squeeze, Yi, W, R, B, 128);
        auto res_1 = std::make_shared<opset6::Result>(rnn_cell);
        auto unsqueeze = std::make_shared<opset6::Unsqueeze>(rnn_cell, axis);
        auto res_2 = std::make_shared<opset6::Result>(unsqueeze);
        auto body = std::make_shared<ngraph::Function>(OutputVector{res_1, res_2}, ParameterVector{Xi, Yi});

        auto tensor_iterator = std::make_shared<opset6::TensorIterator>();
        tensor_iterator->set_body(body);
        tensor_iterator->set_friendly_name("RNNTensorIterator");

        tensor_iterator->set_sliced_input(Xi, X, 0, 1, 1, -1, 0);
        tensor_iterator->set_merged_input(Yi, Y, res_1);

        auto out0 = tensor_iterator->get_iter_value(res_1, -1);
        auto out1 = tensor_iterator->get_concatenated_slices(res_2, 0, 1, 1, -1, 0);

        auto res_ti_1 = std::make_shared<opset6::Result>(tensor_iterator->output(1));
        f = std::make_shared<ngraph::Function>(ngraph::NodeVector{res_ti_1}, ngraph::ParameterVector{X, Y});

        ngraph::pass::Manager manager;
        manager.register_pass<ov::pass::InitNodeInfo>();
        NGRAPH_SUPPRESS_DEPRECATED_START
        manager.register_pass<ngraph::pass::LowLatency>();
        NGRAPH_SUPPRESS_DEPRECATED_END
        manager.register_pass<ov::pass::UnrollTensorIterator>();
        manager.run_passes(f);

        ASSERT_NO_THROW(check_rt_info(f));
    }
    {
        auto Xi = std::make_shared<opset6::Parameter>(element::f32, Shape{1, 1, 16});
        auto H_t = std::make_shared<opset6::Parameter>(element::f32, Shape{1, 128});

        const std::string variable_name_H("RNNTensorIterator/Yi/variable");
        auto variable_H =
            std::make_shared<Variable>(VariableInfo{PartialShape::dynamic(), element::dynamic, variable_name_H});
        auto read_value_H = std::make_shared<opset6::ReadValue>(H_t, variable_H);
        // Body
        auto axis = ngraph::opset6::Constant::create(ngraph::element::i64, ngraph::Shape{}, {0});
        auto squeeze = std::make_shared<opset6::Squeeze>(Xi, axis);

        auto w_val = std::vector<float>(128 * 16, 0);
        auto r_val = std::vector<float>(128 * 128, 0);
        auto b_val = std::vector<float>(128, 0);
        auto W = ngraph::opset6::Constant::create(ngraph::element::f32, ngraph::Shape{128, 16}, w_val);
        auto R = ngraph::opset6::Constant::create(ngraph::element::f32, ngraph::Shape{128, 128}, r_val);
        auto B = ngraph::opset6::Constant::create(ngraph::element::f32, ngraph::Shape{128}, b_val);

        auto rnn_cell = std::make_shared<opset6::RNNCell>(squeeze, read_value_H, W, R, B, 128);
        auto assign_H = std::make_shared<opset6::Assign>(rnn_cell->output(0), variable_H);
        auto res_1 = std::make_shared<opset6::Result>(assign_H);
        auto unsqueeze = std::make_shared<opset6::Unsqueeze>(rnn_cell->output(0), axis);
        auto res_2 = std::make_shared<opset6::Result>(unsqueeze);
        f_ref = std::make_shared<ngraph::Function>(OutputVector{unsqueeze}, ParameterVector{Xi, H_t});
        f_ref->add_sinks({assign_H});
        assign_H->add_control_dependency(read_value_H);
    }
    auto res = compare_functions(f, f_ref);
    ASSERT_TRUE(res.first) << res.second;
}

TEST(TransformationTests, LowLatencyLSTMReshape) {
    std::shared_ptr<ngraph::Function> f(nullptr), f_ref(nullptr);
    {
        auto X = std::make_shared<opset6::Parameter>(element::f32, Shape{2, 1, 16});
        auto H = std::make_shared<opset6::Parameter>(element::f32, Shape{1, 128});
        auto C = std::make_shared<opset6::Parameter>(element::f32, Shape{1, 128});

        auto Xi = std::make_shared<opset6::Parameter>(element::f32, Shape{1, 1, 16});
        auto H_t = std::make_shared<opset6::Parameter>(element::f32, Shape{1, 128});
        H_t->set_friendly_name("H_t");
        auto C_t = std::make_shared<opset6::Parameter>(element::f32, Shape{1, 128});
        C_t->set_friendly_name("C_t");

        // Body
        auto axis = ngraph::opset6::Constant::create(ngraph::element::i64, ngraph::Shape{}, {0});
        auto squeeze = std::make_shared<opset6::Squeeze>(Xi, axis);

        auto w_val = std::vector<float>(512 * 16, 0);
        auto r_val = std::vector<float>(512 * 128, 0);
        auto b_val = std::vector<float>(512, 0);
        auto W = ngraph::opset6::Constant::create(ngraph::element::f32, ngraph::Shape{512, 16}, w_val);
        auto R = ngraph::opset6::Constant::create(ngraph::element::f32, ngraph::Shape{512, 128}, r_val);
        auto B = ngraph::opset6::Constant::create(ngraph::element::f32, ngraph::Shape{512}, b_val);

        auto lstm_cell = std::make_shared<opset6::LSTMCell>(squeeze, H_t, C_t, W, R, B, 128);
        auto res_1 = std::make_shared<opset6::Result>(lstm_cell->output(0));
        auto unsqueeze = std::make_shared<opset6::Unsqueeze>(lstm_cell, axis);
        auto res_2 = std::make_shared<opset6::Result>(unsqueeze);
        auto res_3 = std::make_shared<opset6::Result>(lstm_cell->output(1));
        auto body =
            std::make_shared<ngraph::Function>(OutputVector{res_1, res_2, res_3}, ParameterVector{Xi, H_t, C_t});

        auto tensor_iterator = std::make_shared<opset6::TensorIterator>();
        tensor_iterator->set_body(body);
        tensor_iterator->set_friendly_name("LSTMTensorIterator");

        tensor_iterator->set_merged_input(C_t, C, res_3);
        tensor_iterator->set_sliced_input(Xi, X, 0, 1, 1, -1, 0);
        tensor_iterator->set_merged_input(H_t, H, res_1);

        auto out0 = tensor_iterator->get_iter_value(res_1, -1);
        auto out1 = tensor_iterator->get_concatenated_slices(res_2, 0, 1, 1, -1, 0);

        auto res_ti_1 = std::make_shared<opset6::Result>(tensor_iterator->output(1));
        auto res_ti_2 = std::make_shared<opset6::Result>(tensor_iterator->output(0));
        f = std::make_shared<ngraph::Function>(ngraph::NodeVector{res_ti_1, res_ti_2},
                                               ngraph::ParameterVector{X, H, C});

        // Reshape
        // change the number of iterations of the TI: 2 -> 1
        auto new_X = std::make_shared<opset6::Parameter>(element::f32, Shape{1, 1, 16});
        f->replace_parameter(0, new_X);
        f->validate_nodes_and_infer_types();

        ngraph::pass::Manager manager;
        manager.register_pass<ov::pass::InitNodeInfo>();
        NGRAPH_SUPPRESS_DEPRECATED_START
        manager.register_pass<ngraph::pass::LowLatency>();
        NGRAPH_SUPPRESS_DEPRECATED_END
        manager.register_pass<ov::pass::UnrollTensorIterator>();
        manager.run_passes(f);
    }
    {
        auto Xi = std::make_shared<opset6::Parameter>(element::f32, Shape{1, 1, 16});
        auto H_t = std::make_shared<opset6::Parameter>(element::f32, Shape{1, 128});
        auto C_t = std::make_shared<opset6::Parameter>(element::f32, Shape{1, 128});

        const std::string variable_name_H("LSTMTensorIterator/H_t/variable_2");
        const std::string variable_name_C("LSTMTensorIterator/C_t/variable_0");
        auto variable_H =
            std::make_shared<Variable>(VariableInfo{PartialShape::dynamic(), element::dynamic, variable_name_H});
        auto variable_C =
            std::make_shared<Variable>(VariableInfo{PartialShape::dynamic(), element::dynamic, variable_name_C});
        auto read_value_H = std::make_shared<opset6::ReadValue>(H_t, variable_H);
        auto read_value_C = std::make_shared<opset6::ReadValue>(C_t, variable_C);
        // Body
        auto axis = ngraph::opset6::Constant::create(ngraph::element::i64, ngraph::Shape{}, {0});
        auto squeeze = std::make_shared<opset6::Squeeze>(Xi, axis);

        auto w_val = std::vector<float>(512 * 16, 0);
        auto r_val = std::vector<float>(512 * 128, 0);
        auto b_val = std::vector<float>(512, 0);
        auto W = ngraph::opset6::Constant::create(ngraph::element::f32, ngraph::Shape{512, 16}, w_val);
        auto R = ngraph::opset6::Constant::create(ngraph::element::f32, ngraph::Shape{512, 128}, r_val);
        auto B = ngraph::opset6::Constant::create(ngraph::element::f32, ngraph::Shape{512}, b_val);

        auto lstm_cell = std::make_shared<opset6::LSTMCell>(squeeze, read_value_H, read_value_C, W, R, B, 128);
        auto assign_H = std::make_shared<opset6::Assign>(lstm_cell->output(0), variable_H);
        auto assign_C = std::make_shared<opset6::Assign>(lstm_cell->output(1), variable_C);
        auto unsqueeze = std::make_shared<opset6::Unsqueeze>(lstm_cell->output(0), axis);
        auto res_2 = std::make_shared<opset6::Result>(unsqueeze);
        auto res_1 = std::make_shared<opset6::Result>(lstm_cell->output(0));
|
|
||||||
f_ref = std::make_shared<ngraph::Function>(OutputVector{res_1, res_2}, ParameterVector{Xi, H_t, C_t});
|
|
||||||
f_ref->add_sinks({assign_C, assign_H});
|
|
||||||
assign_H->add_control_dependency(read_value_H);
|
|
||||||
assign_C->add_control_dependency(read_value_C);
|
|
||||||
}
|
|
||||||
auto res = compare_functions(f, f_ref);
|
|
||||||
ASSERT_TRUE(res.first) << res.second;
|
|
||||||
}
|
|
||||||
|
|
||||||
TEST(TransformationTests, LowLatencyLSTM_Loop) {
|
|
||||||
std::shared_ptr<ngraph::Function> f(nullptr), f_ref(nullptr);
|
|
||||||
{
|
|
||||||
auto X = std::make_shared<opset6::Parameter>(element::f32, Shape{1, 1, 16});
|
|
||||||
auto H_init = std::make_shared<opset6::Parameter>(element::f32, Shape{1, 128});
|
|
||||||
auto C_init = std::make_shared<opset6::Parameter>(element::f32, Shape{1, 128});
|
|
||||||
|
|
||||||
auto Xi = std::make_shared<opset6::Parameter>(element::f32, Shape{1, 1, 16});
|
|
||||||
auto H_t = std::make_shared<opset6::Parameter>(element::f32, Shape{1, 128});
|
|
||||||
H_t->set_friendly_name("H_t");
|
|
||||||
auto C_t = std::make_shared<opset6::Parameter>(element::f32, Shape{1, 128});
|
|
||||||
C_t->set_friendly_name("C_t");
|
|
||||||
|
|
||||||
// Body
|
|
||||||
auto axis = ngraph::opset6::Constant::create(ngraph::element::i64, ngraph::Shape{}, {0});
|
|
||||||
auto squeeze = std::make_shared<opset6::Squeeze>(Xi, axis);
|
|
||||||
|
|
||||||
auto w_val = std::vector<float>(512 * 16, 0);
|
|
||||||
auto r_val = std::vector<float>(512 * 128, 0);
|
|
||||||
auto b_val = std::vector<float>(512, 0);
|
|
||||||
auto W = ngraph::opset6::Constant::create(ngraph::element::f32, ngraph::Shape{512, 16}, w_val);
|
|
||||||
auto R = ngraph::opset6::Constant::create(ngraph::element::f32, ngraph::Shape{512, 128}, r_val);
|
|
||||||
auto B = ngraph::opset6::Constant::create(ngraph::element::f32, ngraph::Shape{512}, b_val);
|
|
||||||
|
|
||||||
auto lstm_cell = std::make_shared<opset6::LSTMCell>(squeeze, H_t, C_t, W, R, B, 128);
|
|
||||||
auto res_1 = std::make_shared<opset6::Result>(lstm_cell->output(0));
|
|
||||||
auto unsqueeze = std::make_shared<opset6::Unsqueeze>(lstm_cell->output(0), axis);
|
|
||||||
auto res_2 = std::make_shared<opset6::Result>(unsqueeze);
|
|
||||||
auto res_3 = std::make_shared<opset6::Result>(lstm_cell->output(1));
|
|
||||||
auto body_condition =
|
|
||||||
std::make_shared<ngraph::opset6::Constant>(ngraph::element::boolean, ngraph::Shape{1}, false);
|
|
||||||
auto body = std::make_shared<ngraph::Function>(OutputVector{res_1, res_2, res_3, body_condition},
|
|
||||||
ParameterVector{Xi, H_t, C_t});
|
|
||||||
|
|
||||||
auto trip_count = std::make_shared<ngraph::opset6::Constant>(ngraph::element::i64, ngraph::Shape{}, 10);
|
|
||||||
auto exec_condition =
|
|
||||||
std::make_shared<ngraph::opset6::Constant>(ngraph::element::boolean, ngraph::Shape{}, true);
|
|
||||||
auto loop = std::make_shared<opset6::Loop>(trip_count, exec_condition);
|
|
||||||
loop->set_special_body_ports({-1, 3});
|
|
||||||
loop->set_function(body);
|
|
||||||
loop->set_friendly_name("LSTMLoop");
|
|
||||||
|
|
||||||
loop->set_merged_input(C_t, C_init, res_3);
|
|
||||||
loop->set_sliced_input(Xi, X, 0, 1, 1, -1, 0);
|
|
||||||
loop->set_merged_input(H_t, H_init, res_1);
|
|
||||||
|
|
||||||
auto out0 = loop->get_iter_value(res_1, -1);
|
|
||||||
auto out1 = loop->get_concatenated_slices(res_2, 0, 1, 1, -1, 0);
|
|
||||||
|
|
||||||
auto res_ti_1 = std::make_shared<opset6::Result>(loop->output(1));
|
|
||||||
auto res_ti_2 = std::make_shared<opset6::Result>(loop->output(0));
|
|
||||||
f = std::make_shared<ngraph::Function>(ngraph::NodeVector{res_ti_1, res_ti_2},
|
|
||||||
ngraph::ParameterVector{X, H_init, C_init});
|
|
||||||
|
|
||||||
ngraph::pass::Manager manager;
|
|
||||||
manager.register_pass<ov::pass::InitNodeInfo>();
|
|
||||||
NGRAPH_SUPPRESS_DEPRECATED_START
|
|
||||||
manager.register_pass<ngraph::pass::LowLatency>();
|
|
||||||
NGRAPH_SUPPRESS_DEPRECATED_END
|
|
||||||
manager.register_pass<ov::pass::UnrollTensorIterator>();
|
|
||||||
manager.run_passes(f);
|
|
||||||
}
|
|
||||||
{
|
|
||||||
auto Xi = std::make_shared<opset6::Parameter>(element::f32, Shape{1, 1, 16});
|
|
||||||
auto H_t = std::make_shared<opset6::Parameter>(element::f32, Shape{1, 128});
|
|
||||||
auto C_t = std::make_shared<opset6::Parameter>(element::f32, Shape{1, 128});
|
|
||||||
|
|
||||||
const std::string variable_name_H("LSTMLoop/H_t/variable_2");
|
|
||||||
const std::string variable_name_C("LSTMLoop/C_t/variable_0");
|
|
||||||
auto variable_H =
|
|
||||||
std::make_shared<Variable>(VariableInfo{PartialShape::dynamic(), element::dynamic, variable_name_H});
|
|
||||||
auto variable_C =
|
|
||||||
std::make_shared<Variable>(VariableInfo{PartialShape::dynamic(), element::dynamic, variable_name_C});
|
|
||||||
auto read_value_H = std::make_shared<opset6::ReadValue>(H_t, variable_H);
|
|
||||||
auto read_value_C = std::make_shared<opset6::ReadValue>(C_t, variable_C);
|
|
||||||
// Body
|
|
||||||
auto axis = ngraph::opset6::Constant::create(ngraph::element::i64, ngraph::Shape{}, {0});
|
|
||||||
auto squeeze = std::make_shared<opset6::Squeeze>(Xi, axis);
|
|
||||||
|
|
||||||
auto w_val = std::vector<float>(512 * 16, 0);
|
|
||||||
auto r_val = std::vector<float>(512 * 128, 0);
|
|
||||||
auto b_val = std::vector<float>(512, 0);
|
|
||||||
auto W = ngraph::opset6::Constant::create(ngraph::element::f32, ngraph::Shape{512, 16}, w_val);
|
|
||||||
auto R = ngraph::opset6::Constant::create(ngraph::element::f32, ngraph::Shape{512, 128}, r_val);
|
|
||||||
auto B = ngraph::opset6::Constant::create(ngraph::element::f32, ngraph::Shape{512}, b_val);
|
|
||||||
|
|
||||||
auto lstm_cell = std::make_shared<opset6::LSTMCell>(squeeze, read_value_H, read_value_C, W, R, B, 128);
|
|
||||||
auto assign_H = std::make_shared<opset6::Assign>(lstm_cell->output(0), variable_H);
|
|
||||||
auto assign_C = std::make_shared<opset6::Assign>(lstm_cell->output(1), variable_C);
|
|
||||||
auto unsqueeze = std::make_shared<opset6::Unsqueeze>(lstm_cell->output(0), axis);
|
|
||||||
auto res_2 = std::make_shared<opset6::Result>(unsqueeze);
|
|
||||||
auto res_1 = std::make_shared<opset6::Result>(lstm_cell->output(0));
|
|
||||||
f_ref = std::make_shared<ngraph::Function>(OutputVector{res_1, res_2}, ParameterVector{Xi, H_t, C_t});
|
|
||||||
f_ref->add_sinks({assign_C, assign_H});
|
|
||||||
assign_H->add_control_dependency(read_value_H);
|
|
||||||
assign_C->add_control_dependency(read_value_C);
|
|
||||||
}
|
|
||||||
auto res = compare_functions(f, f_ref);
|
|
||||||
ASSERT_TRUE(res.first) << res.second;
|
|
||||||
}
|
|
@@ -775,71 +775,6 @@ TEST(TransformationTests, LowLatency2_LSTM_Loop_several_iterations) {
     ASSERT_TRUE(res.first) << res.second;
 }
 
-TEST(TransformationTests, LowLatencyLSTM_LLTv1_LLTv2) {
-    std::shared_ptr<Function> f(nullptr), f_ref(nullptr);
-    {
-        auto X = std::make_shared<Parameter>(element::f32, Shape{1, 1, 16});
-        auto H_init = std::make_shared<Parameter>(element::f32, Shape{1, 128});
-        auto C_init = std::make_shared<Parameter>(element::f32, Shape{1, 128});
-
-        auto Xi = std::make_shared<Parameter>(element::f32, Shape{1, 1, 16});
-        auto H_t = std::make_shared<Parameter>(element::f32, Shape{1, 128});
-        auto C_t = std::make_shared<Parameter>(element::f32, Shape{1, 128});
-
-        // Body
-        auto axis = Constant::create(element::i64, Shape{}, {0});
-        auto squeeze = std::make_shared<Squeeze>(Xi, axis);
-
-        auto w_val = std::vector<float>(512 * 16, 0);
-        auto r_val = std::vector<float>(512 * 128, 0);
-        auto b_val = std::vector<float>(512, 0);
-        auto W = Constant::create(element::f32, Shape{512, 16}, w_val);
-        auto R = Constant::create(element::f32, Shape{512, 128}, r_val);
-        auto B = Constant::create(element::f32, Shape{512}, b_val);
-
-        auto lstm_cell = std::make_shared<LSTMCell>(squeeze, H_t, C_t, W, R, B, 128);
-        auto res_1 = std::make_shared<Result>(lstm_cell->output(0));
-        auto unsqueeze = std::make_shared<Unsqueeze>(lstm_cell->output(0), axis);
-        auto res_2 = std::make_shared<Result>(unsqueeze);
-        auto res_3 = std::make_shared<Result>(lstm_cell->output(1));
-        auto body = std::make_shared<Function>(OutputVector{res_1, res_2, res_3}, ParameterVector{Xi, H_t, C_t});
-
-        auto tensor_iterator = std::make_shared<TensorIterator>();
-        tensor_iterator->set_body(body);
-        tensor_iterator->set_friendly_name("LSTMTensorIterator");
-
-        tensor_iterator->set_merged_input(C_t, C_init, res_3);
-        tensor_iterator->set_sliced_input(Xi, X, 0, 1, 1, -1, 0);
-        tensor_iterator->set_merged_input(H_t, H_init, res_1);
-
-        auto out0 = tensor_iterator->get_iter_value(res_1, -1);
-        auto out1 = tensor_iterator->get_concatenated_slices(res_2, 0, 1, 1, -1, 0);
-
-        auto res_ti_1 = std::make_shared<Result>(tensor_iterator->output(1));
-        auto res_ti_2 = std::make_shared<Result>(tensor_iterator->output(0));
-        f = std::make_shared<Function>(NodeVector{res_ti_1, res_ti_2}, ParameterVector{X, H_init, C_init});
-
-        auto f_2 = f->clone();
-        pass::Manager manager_2;
-        manager_2.register_pass<ov::pass::InitNodeInfo>();
-        NGRAPH_SUPPRESS_DEPRECATED_START
-        manager_2.register_pass<ngraph::pass::LowLatency>();
-        NGRAPH_SUPPRESS_DEPRECATED_END
-        EXPECT_NO_THROW(manager_2.run_passes(f_2));
-
-        pass::Manager manager;
-        manager.register_pass<ov::pass::InitNodeInfo>();
-        NGRAPH_SUPPRESS_DEPRECATED_START
-        manager.register_pass<ngraph::pass::LowLatency>();
-        NGRAPH_SUPPRESS_DEPRECATED_END
-        // LLT v2 doesn't insert Assign/ReadValue ops, they are already inserted
-        // but unrolls TI/Loop
-        manager.register_pass<ov::pass::LowLatency2>();
-
-        EXPECT_NO_THROW(manager.run_passes(f));
-    }
-}
-
 namespace {
 using OutPtr = Output<Node>;
 enum class RNNType : size_t {
@@ -23,41 +23,8 @@
 
 namespace ngraph {
 namespace pass {
-/**
- * @brief The transformation finds all TensorIterator/Loop layers in the network,
- * processes all back edges that describe a connection between Result and Parameter
- * of the TensorIterator body,and inserts ReadValue layer between Parameter
- * and the next layers after this Parameter, and Assign layer after the layers
- * before the Result layer. Supported platforms: CPU, GNA.
- *
- * The example below describes the changes to the inner part (body, back edges) of the
- * Tensor Iterator layer.
- * [] - TensorIterator body
- * () - new layer
- *
- * before applying the transformation:
- * back_edge_1 -> [Parameter -> some layers ... -> Result ] -> back_edge_1
- *
- * after applying the transformation:
- * back_edge_1 -> [Parameter -> (ReadValue layer) -> some layers ... -> (Assign layer) ]
- *                                                                      \
- *                                                                       -> Result ] -> back_edge_1
- *
- * It is recommended to use this transformation in conjunction with the Reshape feature to
- * set sequence dimension to 1 and with the UnrollTensorIterator transformation.
- * For convenience, we have already enabled the unconditional execution of the
- * UnrollTensorIterator transformation when using the LowLatency transformation for
- * CPU, GNA plugins, no action is required here.
- * After applying both of these transformations, the resulting network can be inferred step
- * by step, the states will store between inferences.
- */
-
-class NGRAPH_API_DEPRECATED NGRAPH_API LowLatency : public ngraph::pass::MatcherPass {
-public:
-    NGRAPH_RTTI_DECLARATION;
-    LowLatency();
-};
-
 using ov::pass::LowLatency2;
 
 }  // namespace pass
 }  // namespace ngraph
@@ -17,9 +17,6 @@
 #include <openvino/opsets/opset9.hpp>
 #include <openvino/util/log.hpp>
 
-NGRAPH_SUPPRESS_DEPRECATED_START
-NGRAPH_RTTI_DEFINITION(ngraph::pass::LowLatency, "LowLatency");
-
 using namespace std;
 
 namespace {
@@ -28,72 +25,6 @@ string generate_variable_name(const string& op_name, const string& param_name, i
 }
 
 }  // namespace
-ngraph::pass::LowLatency::LowLatency() {
-    MATCHER_SCOPE(LowLatency);
-    auto tensor_iterator = ov::pass::pattern::wrap_type<opset6::TensorIterator, opset6::Loop>();
-    ov::matcher_pass_callback callback = [](ov::pass::pattern::Matcher& m) {
-        const auto& sub_graph_op = std::dynamic_pointer_cast<ngraph::op::util::SubGraphOp>(m.get_match_root());
-        if (!sub_graph_op) {
-            return false;
-        }
-
-        if (const auto& loop = std::dynamic_pointer_cast<opset6::Loop>(sub_graph_op)) {
-            const auto& trip_count = std::dynamic_pointer_cast<opset6::Constant>(loop->get_input_node_shared_ptr(0));
-            const auto& num_iter = loop->get_num_iterations();
-            if (trip_count && num_iter > 0 && trip_count->get_output_target_inputs(0).size() == 1) {
-                auto single_iter = std::make_shared<opset6::Constant>(ov::element::i64, Shape{}, 1);
-                replace_node(trip_count, single_iter);
-            } else {
-                // count of iterations is dynamic;
-                return false;
-            }
-        }
-        // Mark the TI layer to be unrolled. Enable unconditional ti unrolling for all plugins.
-        auto& rt_info = sub_graph_op->get_rt_info();
-        rt_info["UNROLL_TI"] = int64_t(1);
-
-        int64_t variable_id = 0;
-        std::vector<std::shared_ptr<ngraph::op::Sink>> assigns;
-        const auto& func = sub_graph_op->get_function();
-        for (const auto& in : sub_graph_op->get_input_descriptions()) {
-            // Process all back edges
-            if (const auto& merged_in =
-                    std::dynamic_pointer_cast<ngraph::op::util::SubGraphOp::MergedInputDescription>(in)) {
-                // Insert ReadValue nodes: Parameter -> (new ReadValue) -> consumers
-                const auto& inputs_to =
-                    func->get_parameters().at(merged_in->m_body_parameter_index)->get_output_target_inputs(0);
-                const std::string variable_name(generate_variable_name(
-                    sub_graph_op->get_friendly_name(),
-                    func->get_parameters().at(merged_in->m_body_parameter_index)->get_friendly_name(),
-                    variable_id));
-                auto variable = std::make_shared<Variable>(
-                    VariableInfo{ov::PartialShape::dynamic(), element::dynamic, variable_name});
-                auto read_value =
-                    std::make_shared<opset6::ReadValue>(func->get_parameters().at(merged_in->m_body_parameter_index),
-                                                        variable);
-                read_value->set_friendly_name(variable_name);
-                for (const auto& input_to : inputs_to) {
-                    input_to.replace_source_output(read_value->output(0));
-                }
-
-                // insert Assign nodes: provider -> (new Assign) -> Result
-                const auto res = func->get_results().at(merged_in->m_body_value_index);
-                auto assign = std::make_shared<opset6::Assign>(res->input_value(0), variable);
-                // control dependency so that ReadValue is processed before Assign
-                assign->add_control_dependency(read_value);
-                assigns.emplace_back(assign);
-            }
-            variable_id++;
-        }
-        // save Assign in the func so that it gets into graph traversals and isn't deleted.
-        func->add_sinks(assigns);
-        return false;
-    };
-
-    auto m = std::make_shared<ov::pass::pattern::Matcher>(tensor_iterator, matcher_name);
-    register_matcher(m, callback);
-}
-NGRAPH_SUPPRESS_DEPRECATED_END
-
 namespace {
 
@@ -15,47 +15,6 @@
 
 namespace InferenceEngine {
 
-/**
- * @deprecated Use InferenceEngine::lowLatency2 instead. This transformation will be removed in 2023.1.
- * @brief The transformation finds all TensorIterator layers in the network, processes all back
- * edges that describe a connection between Result and Parameter of the TensorIterator body,
- * and inserts ReadValue layer between Parameter and the next layers after this Parameter,
- * and Assign layer after the layers before the Result layer.
- * Supported platforms: CPU, GNA.
- *
- * The example below describes the changes to the inner part (body, back edges) of the TensorIterator layer.
- * [] - TensorIterator body
- * () - new layer
- *
- * before applying the transformation:
- * back_edge_1 -> [Parameter -> some layers ... -> Result ] -> back_edge_1
- *
- * after applying the transformation:
- * back_edge_1 -> [Parameter -> (ReadValue layer) -> some layers ... -> (Assign layer) ]
- *                                                                      \
- *                                                                       -> Result ] -> back_edge_1
- *
- * It is recommended to use this transformation in conjunction with the Reshape feature to set sequence
- * dimension to 1 and with the UnrollTensorIterator transformation.
- * For convenience, we have already enabled the unconditional execution of the UnrollTensorIterator
- * transformation when using the LowLatency transformation for CPU, GNA plugins, no action is required here.
- * After applying both of these transformations, the resulting network can be inferred step by
- * step, the states will store between inferences.
- *
- * An illustrative example, not real API:
- *
- * network->reshape(...) // Set sequence dimension to 1, recalculating shapes. Optional, depends on the network.
- * LowLatency(network) // Applying LowLatency and UnrollTensorIterator transformations.
- * network->infer (...) // Calculating new values for states.
- * // All states are stored between inferences via Assign, ReadValue layers.
- * network->infer (...) // Using stored states, calculating new values for states.
- *
- * @param network A network to apply LowLatency transformation
- */
-INFERENCE_ENGINE_DEPRECATED("This transformation will be removed in 2023.1. "
-                            "Use InferenceEngine::lowLatency2 instead.")
-INFERENCE_ENGINE_API_CPP(void) LowLatency(InferenceEngine::CNNNetwork& network);
-
 /**
  * @brief The transformation finds all TensorIterator/Loop layers in the network,
  * processes all back edges that describe a connection between Result and Parameter
@@ -9,15 +9,6 @@
 
 using namespace InferenceEngine;
 
-void InferenceEngine::LowLatency(InferenceEngine::CNNNetwork& network) {
-    auto function = network.getFunction();
-    ngraph::pass::Manager manager;
-    NGRAPH_SUPPRESS_DEPRECATED_START
-    manager.register_pass<ngraph::pass::LowLatency>();
-    NGRAPH_SUPPRESS_DEPRECATED_END
-    manager.run_passes(function);
-}
-
 void InferenceEngine::lowLatency2(InferenceEngine::CNNNetwork& network, bool use_const_initializer) {
     auto function = network.getFunction();
     ngraph::pass::Manager manager;
@@ -8,8 +8,6 @@
 namespace SubgraphTestsDefinitions {
 std::vector<ngraph::helpers::MemoryTransformation> transformation {
         ngraph::helpers::MemoryTransformation::NONE,
-        ngraph::helpers::MemoryTransformation::LOW_LATENCY,
-        ngraph::helpers::MemoryTransformation::LOW_LATENCY_REGULAR_API,
         ngraph::helpers::MemoryTransformation::LOW_LATENCY_V2,
         ngraph::helpers::MemoryTransformation::LOW_LATENCY_V2_REGULAR_API
 };
@@ -10,8 +10,6 @@ namespace {
 
 std::vector<ngraph::helpers::MemoryTransformation> transformation {
         ngraph::helpers::MemoryTransformation::NONE,
-        ngraph::helpers::MemoryTransformation::LOW_LATENCY,
-        ngraph::helpers::MemoryTransformation::LOW_LATENCY_REGULAR_API,
         ngraph::helpers::MemoryTransformation::LOW_LATENCY_V2,
         ngraph::helpers::MemoryTransformation::LOW_LATENCY_V2_REGULAR_API
 };
@@ -9,8 +9,6 @@
 namespace SubgraphTestsDefinitions {
 std::vector<ngraph::helpers::MemoryTransformation> transformation{
     ngraph::helpers::MemoryTransformation::NONE,
-    ngraph::helpers::MemoryTransformation::LOW_LATENCY,
-    ngraph::helpers::MemoryTransformation::LOW_LATENCY_REGULAR_API,
     ngraph::helpers::MemoryTransformation::LOW_LATENCY_V2,
     ngraph::helpers::MemoryTransformation::LOW_LATENCY_V2_REGULAR_API};
 
@@ -11,8 +11,6 @@ namespace {
 
 std::vector<ngraph::helpers::MemoryTransformation> transformation{
     ngraph::helpers::MemoryTransformation::NONE,
-    ngraph::helpers::MemoryTransformation::LOW_LATENCY,
-    ngraph::helpers::MemoryTransformation::LOW_LATENCY_REGULAR_API,
     ngraph::helpers::MemoryTransformation::LOW_LATENCY_V2,
     ngraph::helpers::MemoryTransformation::LOW_LATENCY_V2_REGULAR_API};
 
@@ -308,18 +308,7 @@ namespace SubgraphTestsDefinitions {
 void MemoryLSTMCellTest::ApplyLowLatency() {
     // Calculate values after LowLatency transformation
     CreatePureTensorIteratorModel();
-    if (transformation == ngraph::helpers::MemoryTransformation::LOW_LATENCY) {
-        function->validate_nodes_and_infer_types();
-        // Apply LowLatency (insert Assigns/ReadValues) and UnrollTensorIterator
-        pass::Manager manager;
-        NGRAPH_SUPPRESS_DEPRECATED_START
-        manager.register_pass<ngraph::pass::LowLatency>();
-        NGRAPH_SUPPRESS_DEPRECATED_END  // LowLatency enables UnrollTI
-        manager.run_passes(function);
-        bool ti_found = helpers::is_tensor_iterator_exist(function);
-        EXPECT_EQ(ti_found, true);
-        LoadNetwork();
-    } else if (transformation == ngraph::helpers::MemoryTransformation::LOW_LATENCY_V2) {
+    if (transformation == ngraph::helpers::MemoryTransformation::LOW_LATENCY_V2) {
         function->validate_nodes_and_infer_types();
         // Apply LowLatency (insert Assigns/ReadValues) and UnrollTensorIterator
 
@@ -329,18 +318,6 @@
         bool ti_found = helpers::is_tensor_iterator_exist(function);
         EXPECT_EQ(ti_found, false);
         LoadNetwork();
-    } else if (transformation == ngraph::helpers::MemoryTransformation::LOW_LATENCY_REGULAR_API) {
-        cnnNetwork = InferenceEngine::CNNNetwork{function};
-        IE_SUPPRESS_DEPRECATED_START
-        InferenceEngine::LowLatency(cnnNetwork);
-        IE_SUPPRESS_DEPRECATED_END
-
-        bool ti_found = helpers::is_tensor_iterator_exist(cnnNetwork.getFunction());
-        EXPECT_EQ(ti_found, true);
-
-        ConfigureNetwork();
-        executableNetwork = core->LoadNetwork(cnnNetwork, targetDevice, configuration);
-        inferRequest = executableNetwork.CreateInferRequest();
     } else if (transformation == ngraph::helpers::MemoryTransformation::LOW_LATENCY_V2_REGULAR_API) {
         cnnNetwork = InferenceEngine::CNNNetwork{function};
         InferenceEngine::lowLatency2(cnnNetwork);
@@ -425,18 +425,7 @@ void MultipleLSTMCellTest::InitMemory() {
 void MultipleLSTMCellTest::ApplyLowLatency() {
     // Calculate values after LowLatency transformation
     CreatePureTensorIteratorModel();
-    if (transformation == ngraph::helpers::MemoryTransformation::LOW_LATENCY) {
-        function->validate_nodes_and_infer_types();
-        // Apply LowLatency (insert Assigns/ReadValues) and UnrollTensorIterator
-        pass::Manager manager;
-        NGRAPH_SUPPRESS_DEPRECATED_START
-        manager.register_pass<ngraph::pass::LowLatency>();
-        NGRAPH_SUPPRESS_DEPRECATED_END  // LowLatency enables UnrollTI
-        manager.run_passes(function);
-        bool ti_found = helpers::is_tensor_iterator_exist(function);
-        EXPECT_EQ(ti_found, true);
-        LoadNetwork();
-    } else if (transformation == ngraph::helpers::MemoryTransformation::LOW_LATENCY_V2) {
+    if (transformation == ngraph::helpers::MemoryTransformation::LOW_LATENCY_V2) {
         function->validate_nodes_and_infer_types();
         // Apply LowLatency (insert Assigns/ReadValues) and UnrollTensorIterator
 
@@ -446,18 +435,6 @@ void MultipleLSTMCellTest::ApplyLowLatency() {
         bool ti_found = helpers::is_tensor_iterator_exist(function);
         EXPECT_EQ(ti_found, false);
         LoadNetwork();
-    } else if (transformation == ngraph::helpers::MemoryTransformation::LOW_LATENCY_REGULAR_API) {
-        cnnNetwork = InferenceEngine::CNNNetwork{function};
-        IE_SUPPRESS_DEPRECATED_START
-        InferenceEngine::LowLatency(cnnNetwork);
-        IE_SUPPRESS_DEPRECATED_END
-
-        bool ti_found = helpers::is_tensor_iterator_exist(cnnNetwork.getFunction());
-        EXPECT_EQ(ti_found, true);
-
-        ConfigureNetwork();
-        executableNetwork = core->LoadNetwork(cnnNetwork, targetDevice, configuration);
-        inferRequest = executableNetwork.CreateInferRequest();
     } else if (transformation == ngraph::helpers::MemoryTransformation::LOW_LATENCY_V2_REGULAR_API) {
         cnnNetwork = InferenceEngine::CNNNetwork{function};
         InferenceEngine::lowLatency2(cnnNetwork);
@@ -226,8 +226,6 @@ enum class SequenceTestsMode {
 
 enum class MemoryTransformation {
     NONE,
-    LOW_LATENCY,
-    LOW_LATENCY_REGULAR_API,
     LOW_LATENCY_V2,
     LOW_LATENCY_V2_REGULAR_API,
     LOW_LATENCY_V2_ORIGINAL_INIT
@@ -895,15 +895,9 @@ std::ostream& operator<<(std::ostream & os, MemoryTransformation type) {
     case MemoryTransformation::LOW_LATENCY_V2:
         os << "LOW_LATENCY_V2";
         break;
-    case MemoryTransformation::LOW_LATENCY:
-        os << "LOW_LATENCY";
-        break;
     case MemoryTransformation::LOW_LATENCY_V2_REGULAR_API:
         os << "LOW_LATENCY_V2_REGULAR_API";
         break;
-    case MemoryTransformation::LOW_LATENCY_REGULAR_API:
-        os << "LOW_LATENCY_REGULAR_API";
-        break;
     case MemoryTransformation::LOW_LATENCY_V2_ORIGINAL_INIT:
         os << "LOW_LATENCY_V2_ORIGINAL_INIT";
         break;