Compare commits


41 Commits

Author SHA1 Message Date
Sergey Shlyapnikov
3c63a0afe9 [GPU] Add set_arguments() call in case of dynamic dependencies of static inst (#21664) 2023-12-22 17:56:59 +01:00
Anastasia Kuporosova
5da132bdfe [PyOV] Delete compatibility tests (#21820)
Co-authored-by: Michal Lukaszewski <michal.lukaszewski@intel.com>
Co-authored-by: Ilya Lavrenov <ilya.lavrenov@intel.com>
2023-12-22 17:13:21 +01:00
Maksim Doronin
4f9e8603c4 Get path to openvino.dll from test infra (#21760)
* Get path to openvino.dll from test infra

* clang fixes

* add get_path_name

* more clang fixes

---------

Co-authored-by: Ilya Lavrenov <ilya.lavrenov@intel.com>
2023-12-22 17:07:57 +01:00
Oleg Pipikin
26576b0a30 Rework makeConstant builder (#21319)
* Rework make_constant node builder

* Fix

* Apply comments

* Apply comments

* Fix

---------

Co-authored-by: Pavel Durandin <pavel.durandin@intel.com>
2023-12-22 14:57:05 +01:00
Georgy Krivoruchko
b0feb2f632 [ONNX] Refactoring tests on API 2.0 (#21840)
* Tests in onnx_import_rnn.in.cpp moved to API 2.0

* Tests in onnx_import_signal.in.cpp moved to API 2.0
2023-12-22 14:51:15 +01:00
Sergey Shlyapnikov
ef3526c3c0 [GPU] Reduce unnecessary dynamic impls cloning (#21833)
Co-authored-by: Pavel Durandin <pavel.durandin@intel.com>
2023-12-22 17:15:04 +04:00
Xuejun Zhai
670824fe2c [Test] move softmax.hpp from single_layer_tests/ to single_op/ (#21824)
Signed-off-by: Zhai, Xuejun <xuejun.zhai@intel.com>
2023-12-22 15:49:37 +04:00
Ilya Lavrenov
6b1d2f8978 Additional clean-up for legacy components removal (#21841) 2023-12-22 15:47:40 +04:00
Ilya Lavrenov
a0d3610518 Revert IE cmake config back (#21842) 2023-12-22 15:43:24 +04:00
Surya Siddharth Pemmaraju
a88679aeb8 Fixed a bug in input validation for torch.compile options (#21787)
* Fixed bug in input validation for torch.compile options

* Added default device if device is None

* Addressed PR comments

---------

Co-authored-by: Ilya Lavrenov <ilya.lavrenov@intel.com>
2023-12-22 13:51:08 +04:00
Ilya Lavrenov
c59ddbab69 Removed legacy components from installation (#21803)
* Removed legacy components from installation

* Updated OMZ submodule
2023-12-22 10:38:26 +01:00
Ilya Lavrenov
36441797b7 Fixed GHA Windows workflow (#21836) 2023-12-22 12:55:16 +04:00
Anastasiia Pnevskaia
cc22a93395 [DOCS] Moved Troubleshooting Reshape article to legacy section. (#21819)
* Moved Troubleshooting Reshape article to legacy section.

* Minor correction.

* Link corrected.

* Removed not working link.
2023-12-22 12:52:12 +04:00
Mateusz Tabaka
2fcaa88af5 LSTMCellFusion - support transposed/not transposed weights (#21780)
* LSTMCellFusion - support transposed/not transposed weights

* add comment describing fused subgraph
2023-12-22 12:03:17 +04:00
Denis Orlov
bc121c06c7 Move ONNX test parameters from Azure folder to ONNX frontend test folder (#21827)
* Remove remaining files for Azure pipelines

* Move the files instead of removal

* Fix the folder ref in linux.yml

* Fix the folder ref in linux_arm64.yml
2023-12-22 03:37:57 +04:00
Anastasia Kuporosova
62dda1bd3e [PyOV] delete test models from tests (#21821) 2023-12-21 17:45:40 +01:00
Anastasia Kuporosova
d10f49441d [PyOV] Remove deprecated py api (#21675)
* [PyOV][Draft] Remove deprecated py api

* clean up

* remove symbol

* fix failure

* fix ci

* remove shared_memory from arguments

* update tests
2023-12-21 16:57:22 +01:00
Tatiana Savina
243602929f [DOCS] PT and TF tutorial (#21815)
* add docs initial draft

* fix tabs

* move to learn ov section

* split the doc

* change model prep structure

* change content

* restructure conversion docs

* change docs order

* fix format issues

* fix format

* change blocks

* fix links

* fix build

* fix link

* change tutorial

* add links and fix wording

* change header

* fix name
2023-12-21 14:57:41 +01:00
Vishniakov Nikolai
3546b0e15d [JS OV] Specify JS API GitHub config (#21785)
* Extend CODEOWNERS by js api responsibles

* Extend smart ci by js component

* Fix codeowners

* Fix component definition

* Update components.yml

* Update components.yml

---------

Co-authored-by: Ilya Lavrenov <ilya.lavrenov@intel.com>
2023-12-21 13:43:36 +00:00
Anastasiia Pnevskaia
3ab5ee861d Fixed names for GraphDef. (#21799) 2023-12-21 17:04:05 +04:00
Ilya Lavrenov
80618b0498 Drop POT (#21805)
* Drop POT

* Removed POT transformations
2023-12-21 16:46:37 +04:00
Ilya Lavrenov
c79ae17bbf Removed GNA plugin from 2024.0 (#21800) 2023-12-21 13:39:06 +01:00
Vitaliy Urusovskij
1fd3399cdf Shared execution_graph_tests to API2.0 (#21718) 2023-12-21 13:27:24 +01:00
Roman Lyamin
46ebee6be6 [Transformations] Support precision conversion for PriorBoxClustered (#21795) 2023-12-21 16:24:42 +04:00
Georgy Krivoruchko
f980ad8f4c [ONNX] Refactoring tests on API 2.0 (#21784)
* Tests in onnx_import_quant.in.cpp moved to API 2.0
* Tests in onnx_import_reshape.in.cpp moved to API 2.0
* Update src/frontends/onnx/tests/onnx_import_reshape.in.cpp
* Update onnx_import_reshape.in.cpp
---------
Co-authored-by: Vitaliy Urusovskij <vitaliy.urusovskij@intel.com>
2023-12-21 15:20:04 +04:00
Anastasiia Pnevskaia
12faade22f [TF FE] Complex type support extended for Separate Bass model. (#21477)
* Complex type support extended, fixed some errors.
* Tests correction.
* FloorDiv, TensorListConcatV2 fixed.
* FloorDiv test added.
* Corrected imports.
2023-12-21 15:17:12 +04:00
Maciej Smyk
71fca88d4b [DOCS] Removal of unused images + fixing image directives (#21792)
* removal of unused images

* removal of duplicate image files

* removal of unused files + fix for image directive

* removal of unused images

* removal of unused images + fix for protecting_model.rst

* overview document graph update
2023-12-21 10:37:01 +01:00
Ilya Lavrenov
9b52171d29 Updated OpenVINO version to 2024.0 (#21790) 2023-12-21 10:33:07 +01:00
Sofya Balandina
1cc1a3fe35 [op conformance] Made fixes to allign with accuracy validation (#21782) 2023-12-21 13:23:47 +04:00
Karol Blaszczak
fcb7ca6edd [DOCS] NPU update master (#21811) 2023-12-21 10:08:33 +01:00
Ilya Lavrenov
2463acc5b0 Removed speech_sample (#21801) 2023-12-21 13:06:57 +04:00
rghvsh
1a0f0ccd2a [ONNX] Extend ONNX Frontend with BitwiseOr-18 operator (#21755)
* Create bitwise_or.hpp
* Update ops_bridge.cpp
* Added protobuf(.prototxt) files via upload.
* Update onnx_import.in.cpp
* Update test_backend.py
* Skip "test_bitwise_or_ui64_bcast_3v1d_cpu"
2023-12-21 12:27:28 +04:00
Roman Lyamin
0709a35cb0 [GPU] Small fix for dynamic_shape_gather_opts pass (#21807) 2023-12-21 11:42:03 +04:00
Oleg Pipikin
12a9d55c3e Use new builders instead of old ones in tests (#21742)
* Use new make_convolution instead of old one

* Use new make_group_convolution instead of old one

* Use new make_convolution_backprop_data instead of old one

* Use new make_group_convolution_backprop_data instead of old one

* Use new make_binary_conv instead of old one

* Remove makeVariadicSplit builder

* Use new make_activation instead of old one

* Use new make_eltwise instead of old one

* Use new make_embedding_bag_packed_sum  instead of old one

* Remove augru builder

* Fix clang-format

* Fix
2023-12-21 11:26:39 +04:00
Xuejun Zhai
ca5bf95e26 Xuejun/behavior tests core threading (#21775)
* [CPU Func Test] upgrade preprocessing related tests to 2.0

Signed-off-by: Zhai, Xuejun <xuejun.zhai@intel.com>

* [CPU Func Test] upgrade version related tests to 2.0

Signed-off-by: Zhai, Xuejun <xuejun.zhai@intel.com>

* [CPU Func Test] upgrade core threading related tests to 2.0

Signed-off-by: Zhai, Xuejun <xuejun.zhai@intel.com>

* Revert "[CPU Func Test] upgrade preprocessing related tests to 2.0"

This reverts commit abc27eabd2.

---------

Signed-off-by: Zhai, Xuejun <xuejun.zhai@intel.com>
2023-12-21 11:04:06 +04:00
Vitaliy Urusovskij
56df9bc75e Test utils to API2.0 + leftovers (#21731)
* Move shared `snippets` tests to API2.0

* Clean test utils from API1.0

* Move tests leftovers to API2.0

* Remove commented

* ClangFormat

* Fix comment

* Remove `single_layer/` includes from ported CPU tests

* Remove extra `InferenceEngine` usage

* CppLint

* Remove `single_layer/` usage from fusing_test_utils

* Remove `single_layer/` usage from GPU func tests
2023-12-21 10:02:58 +04:00
Ilya Lavrenov
8c03c991cd Removed deployment manager (#21802) 2023-12-21 01:02:51 +04:00
River Li
dc64268564 Remove ov::hint::PerformanceMode::UNDEFINED (#21592)
* Remove ov::hint::PerformanceMode::UNDEFINED

* Update for reviewer comments and build issue

* Fix build error - may be used uninitialized

* Update

---------

Co-authored-by: Ilya Lavrenov <ilya.lavrenov@intel.com>
2023-12-20 21:15:26 +04:00
Alina Kladieva
3d3bb51de9 [GHA] Fix smart ci for docs changes in linux arm64 (#21798)
Update linux_arm64.yml
2023-12-20 21:13:17 +04:00
Alina Kladieva
2b950a65b3 [GHA] Add links to Smart CI doc to GHA overview. Align folder structure (#21794)
Update custom_actions.md
2023-12-20 19:53:48 +04:00
yanlan song
15e43c6f5b support user dynamism output buffer (#21647)
* test gpu user output buffer dynamism

Signed-off-by: fishbell <bell.song@intel.com>

* enable test

Signed-off-by: fishbell <bell.song@intel.com>

* check valid pointer

Signed-off-by: fishbell <bell.song@intel.com>

* update case

Signed-off-by: fishbell <bell.song@intel.com>

* cpplint

Signed-off-by: fishbell <bell.song@intel.com>

---------

Signed-off-by: fishbell <bell.song@intel.com>
2023-12-20 19:40:16 +04:00
1760 changed files with 4107 additions and 182305 deletions

.ci/pot/Jenkinsfile

@@ -1,14 +0,0 @@
-#!groovy
-properties([
-    parameters([
-        string(defaultValue: '',
-               description: 'Pipeline shared library version (branch/tag/commit). Determined automatically if empty',
-               name: 'library_version')
-    ])
-])
-loadOpenVinoLibrary {
-    potEntrypoint(this)
-}

.gitattributes

@@ -65,15 +65,3 @@
 *.vsdx filter=lfs diff=lfs merge=lfs -text
 *.bmp filter=lfs diff=lfs merge=lfs -text
 *.svg filter=lfs diff=lfs merge=lfs -text
-#POT attributes
-tools/pot/tests/data/test_cases_refs/* filter=lfs diff=lfs merge=lfs -text
-tools/pot/tests/data/models/*/* filter=lfs diff=lfs merge=lfs -text
-tools/pot/tests/data/reference_models/* filter=lfs diff=lfs merge=lfs -text
-tools/pot/tests/data/video/* filter=lfs diff=lfs merge=lfs -text
-tools/pot/tests/data/reference_fake_quantize_conf/* filter=lfs diff=lfs merge=lfs -text
-/tools/pot/tests/** -pot_package
-/tools/pot/tools/auxilary/** -pot_package
-/tools/pot/tools/run_series_experiments.py -pot_package
-/tools/pot/.pylintrc -pot_package
-/tools/pot/README_dev.md -pot_package

.github/CODEOWNERS

@@ -28,6 +28,7 @@
 /src/bindings/python/ @openvinotoolkit/openvino-ie-python-api-maintainers
 /src/bindings/c/ @openvinotoolkit/openvino-c-api-maintainers
+/src/bindings/js/ @openvinotoolkit/openvino-js-api-maintainers
 /src/common/*transformations/ @openvinotoolkit/openvino-ie-transformations-maintainers
 /src/core/ @openvinotoolkit/openvino-ngraph-maintainers
@@ -35,6 +36,7 @@
 /samples/c/ @openvinotoolkit/openvino-samples-maintainers @openvinotoolkit/openvino-c-api-maintainers
 /samples/cpp/ @openvinotoolkit/openvino-samples-maintainers @openvinotoolkit/openvino-maintainers
 /samples/python/ @openvinotoolkit/openvino-samples-maintainers @openvinotoolkit/openvino-ie-python-api-maintainers
+/samples/js/ @openvinotoolkit/openvino-samples-maintainers @openvinotoolkit/openvino-js-api-maintainers
 /thirdparty/zlib/ @openvinotoolkit/openvino-samples-maintainers
 /thirdparty/json/ @openvinotoolkit/openvino-samples-maintainers
 /thirdparty/gflags/ @openvinotoolkit/openvino-samples-maintainers
@@ -58,9 +60,6 @@
 /src/tests/**/gpu/ @openvinotoolkit/openvino-ie-gpu-maintainers
 /thirdparty/ocl/ @openvinotoolkit/openvino-ie-gpu-maintainers @openvinotoolkit/openvino-ie-gpu-developers
-# OpenVINO GNA:
-/src/plugins/intel_gna/ @openvinotoolkit/openvino-ie-gna-maintainers
 # OpenVINO Auto (MULTI) plugin:
 /src/plugins/auto/ @openvinotoolkit/openvino-ie-auto-multi-maintainers
@@ -103,8 +102,7 @@
 /tools/openvino_dev/ @openvinotoolkit/openvino-tools-maintainers @openvinotoolkit/openvino-ie-python-api-maintainers
 /tools/mo/ @openvinotoolkit/openvino-mo-maintainers
 /tools/ovc/ @openvinotoolkit/openvino-mo-maintainers
-/tools/pot/ @openvinotoolkit/openvino-pot-maintainers
-/thirdparty/open_model_zoo/ @openvinotoolkit/omz-maintainers @openvinotoolkit/openvino-pot-maintainers
+/thirdparty/open_model_zoo/ @openvinotoolkit/omz-maintainers
 # Documentation
 /docs/ @openvinotoolkit/openvino-docs-maintainers
@@ -118,7 +116,6 @@
 /docs/snippets/ @openvinotoolkit/openvino-docs-maintainers @openvinotoolkit/openvino-ie-maintainers
 /docs/OV_Runtime_UG/supported_plugins/ARM_CPU.md @openvinotoolkit/openvino-docs-maintainers @openvinotoolkit/openvino_contrib-arm_plugin-maintainers
 /docs/OV_Runtime_UG/supported_plugins/CPU.md @openvinotoolkit/openvino-docs-maintainers @openvinotoolkit/openvino-ie-cpu-maintainers
-/docs/OV_Runtime_UG/supported_plugins/GNA.md @openvinotoolkit/openvino-docs-maintainers @openvinotoolkit/openvino-ie-gna-maintainers
 /docs/OV_Runtime_UG/supported_plugins/GPU*.md @openvinotoolkit/openvino-docs-maintainers @openvinotoolkit/openvino-ie-gpu-maintainers
 # Configuration management


@@ -41,9 +41,7 @@ body:
   options:
     - CPU
     - GPU
-    - GNA
-    - NCS2 (Intel Movidius)
-    - HDDL
+    - NPU
     - AUTO
     - HETERO
     - BATCH


@@ -13,12 +13,10 @@ LP_transformations:
     - TFL_FE
     - ONNX_FE
     - PDPD_FE
-    - POT
 preprocessing:
   revalidate:
     - inference
-    - GNA
     - C_API
     - Python_API
@@ -29,6 +27,7 @@ CPU:
   revalidate:
     - C_API
     - Python_API
+    - JS_API
     - samples
     - ONNX_RT
     - PyTorch_FE
@@ -48,18 +47,10 @@ GPU:
     - IR_FE
     - PROXY
-GNA:
-  build:
-    - HETERO
-    - AUTO_BATCH
-    - TEMPLATE
-    - IR_FE
 HETERO:
   revalidate:
     - CPU
     - GPU
-    - GNA
     - HETERO
     - AUTO_BATCH
     - TEMPLATE
@@ -72,7 +63,6 @@ AUTO_BATCH:
   revalidate:
     - CPU
     - GPU
-    - GNA
     - HETERO
     - AUTO_BATCH
     - TEMPLATE
@@ -85,7 +75,6 @@ TEMPLATE:
   revalidate:
     - CPU
     - GPU
-    - GNA
     - HETERO
     - AUTO_BATCH
     - TEMPLATE
@@ -115,6 +104,7 @@ IR_FE:
   revalidate:
     - C_API
     - Python_API
+    - JS_API
     - samples
   build:
     - CPU
@@ -166,7 +156,6 @@ Python_API:
   revalidate:
     - samples
     - MO
-    - POT
     - tools
   build:
     - CPU
@@ -181,6 +170,13 @@ Python_API:
     - TFL_FE
     - PyTorch_FE
+JS_API:
+  revalidate:
+    - samples
+  build:
+    - CPU
+    - IR_FE
 samples:
   build:
     - CPU
@@ -194,7 +190,6 @@ IE_Tests:
   revalidate:
     - CPU
     - GPU
-    - GNA
     - HETERO
     - AUTO_BATCH
     - TEMPLATE
@@ -204,16 +199,9 @@ IE_Tests:
     - IR_FE
 MO:
   revalidate:
-    - POT
   build:
     - Python_API
-POT:
-  build:
-    - CPU
-    - Python_API
 tools:
   build:
     - CPU


@@ -61,23 +61,6 @@ updates:
       dependency-type: "production"
     versioning-strategy: increase-if-necessary
-  # POT requirements
-  - package-ecosystem: pip
-    directory: "/tools/pot"
-    schedule:
-      interval: "daily"
-      time: "09:00"
-      timezone: "Asia/Dubai"
-    open-pull-requests-limit: 3
-    assignees:
-      - "AlexKoff88"
-      - "KodiaqQ"
-      - "jiwaszki"
-      - "p-wysocki"
-      - "akuporos"
-      - "rkazants"
-    versioning-strategy: increase-if-necessary
   #
   # Python Samples
   #


@@ -6,7 +6,6 @@
 "openvino-ci",
 "openvino-pushbot",
 "workbench-ci-bot",
-"openvino-pot-ci",
 "sysicvvpux",
 "ote-ci-bot"
 ],
@@ -24,7 +23,6 @@
 "openvino-docs-maintainers": "category: docs",
 "openvino-ie-maintainers": "category: inference",
 "openvino-ie-cpu-maintainers": "category: CPU",
-"openvino-ie-gna-maintainers": "category: GNA",
 "openvino-ie-gpu-maintainers": "category: GPU",
 "openvino-ie-lpt-maintainers": "category: LP transformations",
 "openvino-ie-transformations-maintainers": "category: transformations",
@@ -43,7 +41,6 @@
 "openvino-scripts-maintainers": "category: build",
 "openvino-tests-maintainers": "category: IE Tests",
 "openvino-tools-maintainers": "category: tools",
-"openvino-pot-maintainers": "category: POT",
 "openvino-configuration-mgmt": "category: dependency_changes",
 "openvino-samples-maintainers": "category: samples",
 "openvino-c-api-maintainers": "category: C API"

.github/labeler.yml

@@ -67,9 +67,6 @@
 - 'src/frontends/common/include/openvino/frontend/extension.hpp'
 - 'src/frontends/common/include/openvino/frontend/extension/**/*'
-'category: GNA':
-- 'src/plugins/intel_gna/**/*'
 'category: GPU':
 - 'src/plugins/intel_gpu/**/*'
 - 'thirdparty/ocl/**/*'
@@ -109,21 +106,24 @@
 'category: packaging':
 - 'cmake/**/packaging/**/*'
 - 'src/bindings/python/wheel/**/*'
+- any: ['src/bindings/js/node/CMakeLists.txt',
+       'src/bindings/js/node/package.json',
+       'src/bindings/js/node/package-lock.json']
 - 'tools/openvino_dev/**/*'
 'category: PDPD FE':
 - 'src/frontends/paddle/**/*'
 - 'tests/layer_tests/py_frontend_tests/test_paddle_frontend.py'
-'category: POT':
-- 'tools/pot/**/*'
 'category: preprocessing':
 - 'src/common/preprocessing/**/*'
 'category: Python API':
 - 'src/bindings/python/**/*'
+'category: JS API':
+- 'src/bindings/js/**/*'
 'category: samples':
 - 'samples/**/*'
 - 'thirdparty/zlib/**/*'
@@ -160,7 +160,6 @@
 'category: tools':
 - any: ['tools/**',
-        '!tools/pot/**/*',
         '!tools/mo/**/*',
         '!tools/ovc/**/*']


@@ -62,7 +62,7 @@ jobs:
 repository: 'openvinotoolkit/openvino_contrib'
 path: ${{ env.OPENVINO_CONTRIB_REPO }}
 submodules: 'true'
-ref: 'releases/2023/3'
+ref: 'master'
 #
 # Dependencies


@@ -157,22 +157,6 @@ jobs:
     ${INSTALL_TEST_DIR}/ov_transformations_tests --gtest_print_time=1 \
       --gtest_output=xml:${INSTALL_TEST_DIR}/TEST-Transformations.xml
-- name: Legacy Transformations func tests
-  if: fromJSON(inputs.affected-components).GNA.test &&
-      (runner.os != 'macOS' && runner.arch != 'ARM64')
-  run: |
-    source ${INSTALL_DIR}/setupvars.sh
-    ${INSTALL_TEST_DIR}/ov_legacy_transformations_tests --gtest_print_time=1 \
-      --gtest_output=xml:${INSTALL_TEST_DIR}/TEST-LegacyTransformations.xml
-- name: Inference Engine 1.0 unit tests
-  if: fromJSON(inputs.affected-components).GNA.test &&
-      (runner.os != 'macOS' && runner.arch != 'ARM64')
-  run: |
-    source ${INSTALL_DIR}/setupvars.sh
-    ${INSTALL_TEST_DIR}/InferenceEngineUnitTests --gtest_print_time=1 \
-      --gtest_output=xml:${INSTALL_TEST_DIR}/TEST-InferenceEngineUnitTests.xml
 - name: Common test utils tests
   run: |
     source ${INSTALL_DIR}/setupvars.sh


@@ -121,14 +121,6 @@ jobs:
 # Tests
 #
-- name: Python API 1.0 Tests
-  # if: fromJSON(inputs.affected-components).Python_API.test # Ticket: 127101
-  run: |
-    python3 -m pytest -s ${INSTALL_TEST_DIR}/pyngraph \
-      --junitxml=${INSTALL_TEST_DIR}/TEST-Pyngraph.xml \
-      --ignore=${INSTALL_TEST_DIR}/pyngraph/tests_compatibility/test_onnx/test_zoo_models.py \
-      --ignore=${INSTALL_TEST_DIR}/pyngraph/tests_compatibility/test_onnx/test_backend.py
 - name: Python API 2.0 Tests
   # if: ${{ fromJSON(inputs.affected-components).Python_API.test && runner.arch != 'ARM64' }} # Ticket: 126380, 127101
   run: |


@@ -69,7 +69,7 @@ jobs:
 DEVELOPER_PACKAGE_DIR: /__w/openvino/openvino/developer_package_install
 BUILD_DIR: /__w/openvino/openvino/openvino_build
 SCCACHE_AZURE_KEY_PREFIX: ubuntu20_x86_64_Release
-ONNX_RUNTIME_UTILS: /__w/openvino/openvino/openvino/.ci/azure/ci_utils/onnxruntime
+ONNX_RUNTIME_UTILS: /__w/openvino/openvino/openvino/src/frontends/onnx/tests/ci_utils/onnxruntime
 if: "!needs.smart_ci.outputs.skip_workflow"
 steps:
@@ -90,7 +90,7 @@ jobs:
 repository: 'openvinotoolkit/openvino_contrib'
 path: ${{ env.OPENVINO_CONTRIB_REPO }}
 submodules: 'true'
-ref: 'releases/2023/3'
+ref: 'master'
 #
 # Print system info
@@ -521,7 +521,7 @@ jobs:
 with:
 repository: 'openvinotoolkit/openvino_contrib'
 path: ${{ env.OPENVINO_CONTRIB_REPO }}
-ref: 'releases/2023/3'
+ref: 'master'
 #
 # Dependencies


@@ -21,6 +21,7 @@ jobs:
 runs-on: ubuntu-latest
 outputs:
   affected_components: "${{ steps.smart_ci.outputs.affected_components }}"
+  skip_workflow: "${{ steps.smart_ci.outputs.skip_workflow }}"
 steps:
   - name: checkout action
     uses: actions/checkout@v4
@@ -36,6 +37,8 @@ jobs:
 commit_sha: ${{ github.sha }}
 component_pattern: "category: (.*)"
 repo_token: ${{ secrets.GITHUB_TOKEN }}
+skip_when_only_listed_labels_set: 'docs'
+skip_when_only_listed_files_changed: '*.md,*.rst,*.png,*.jpg,*.svg'
 - name: Show affected components
 run: |
@@ -68,8 +71,8 @@ jobs:
 DEVELOPER_PACKAGE_DIR: /__w/openvino/openvino/developer_package_install
 BUILD_DIR: /__w/openvino/openvino/openvino_build
 SCCACHE_AZURE_KEY_PREFIX: 'ubuntu20_aarch64_Release'
-ONNX_RUNTIME_UTILS: /__w/openvino/openvino/openvino/.ci/azure/ci_utils/onnxruntime
-if: "!fromJSON(needs.smart_ci.outputs.affected_components).docs_only"
+ONNX_RUNTIME_UTILS: /__w/openvino/openvino/openvino/src/frontends/onnx/tests/ci_utils/onnxruntime
+if: "!needs.smart_ci.outputs.skip_workflow"
 steps:
 - name: Install git
@@ -87,7 +90,7 @@ jobs:
 repository: 'openvinotoolkit/openvino_contrib'
 path: ${{ env.OPENVINO_CONTRIB_REPO }}
 submodules: 'true'
-ref: 'releases/2023/3'
+ref: 'master'
 #
 # Print system info


@@ -86,7 +86,7 @@ jobs:
 repository: 'openvinotoolkit/testdata'
 path: ${{ env.MODELS_PATH }}
 lfs: 'true'
-ref: 'releases/2023/3'
+ref: 'master'
 #
 # Print system info
@@ -147,7 +147,6 @@ jobs:
       -DENABLE_TESTS=ON \
       -DENABLE_CPPLINT=OFF \
       -DENABLE_NCC_STYLE=OFF \
-      -DENABLE_INTEL_GNA=OFF \
       -DCMAKE_COMPILE_WARNING_AS_ERROR=ON \
       -DENABLE_PROFILING_ITT=ON \
       -DSELECTIVE_BUILD=COLLECT \
@@ -269,7 +268,7 @@ jobs:
 repository: 'openvinotoolkit/testdata'
 path: ${{ env.MODELS_PATH }}
 lfs: 'true'
-ref: 'releases/2023/3'
+ref: 'master'
 - name: Download selective build statistics package
   uses: actions/download-artifact@v3
@@ -303,7 +302,6 @@ jobs:
       -DSELECTIVE_BUILD=ON \
       -DENABLE_TEMPLATE=OFF \
       -DENABLE_INTEL_GPU=OFF \
-      -DENABLE_INTEL_GNA=OFF \
       -DENABLE_OV_TF_FRONTEND=OFF \
       -DENABLE_OV_TF_LITE_FRONTEND=OFF \
       -DENABLE_OV_PADDLE_FRONTEND=OFF \


@@ -72,7 +72,7 @@ jobs:
 with:
 repository: 'openvinotoolkit/openvino_contrib'
 path: 'openvino_contrib'
-ref: 'releases/2023/3'
+ref: 'master'
 #
 # Print system info
@@ -364,12 +364,6 @@ jobs:
 # TODO: replace with Python API tests requirements
 python3 -m pip install -r ${{ env.INSTALL_TEST_DIR }}/mo/requirements_dev.txt
-- name: Python API 1.0 Tests
-  #if: fromJSON(needs.smart_ci.outputs.affected_components).Python_API.test # Ticket: 127101
-  shell: cmd
-  run: |
-    python3 -m pytest -s ${{ env.INSTALL_TEST_DIR }}/pyngraph ${{ env.PYTHON_STATIC_ARGS }} --junitxml=${{ env.INSTALL_TEST_DIR }}/TEST-Pyngraph.xml --ignore=${{ env.INSTALL_TEST_DIR }}/pyngraph/tests_compatibility/test_onnx/test_zoo_models.py
 - name: Python API 2.0 Tests
   #if: fromJSON(needs.smart_ci.outputs.affected_components).Python_API.test # Ticket: 127101
   shell: cmd
@@ -617,18 +611,6 @@ jobs:
   run: |
     call "${{ env.INSTALL_DIR }}\\setupvars.bat" && ${{ env.INSTALL_TEST_DIR }}/ov_transformations_tests --gtest_print_time=1 --gtest_output=xml:${{ env.INSTALL_TEST_DIR }}/TEST-Transformations.xml
-- name: Legacy Transformations func tests
-  if: fromJSON(needs.smart_ci.outputs.affected_components).GNA.test
-  shell: cmd
-  run: |
-    call "${{ env.INSTALL_DIR }}\\setupvars.bat" && ${{ env.INSTALL_TEST_DIR }}/ov_legacy_transformations_tests --gtest_print_time=1 --gtest_output=xml:${{ env.INSTALL_TEST_DIR }}/TEST-LegacyTransformations.xml
-- name: Inference Engine 1.0 unit tests
-  if: fromJSON(needs.smart_ci.outputs.affected_components).GNA.test
-  shell: cmd
-  run: |
-    call "${{ env.INSTALL_DIR }}\\setupvars.bat" && ${{ env.INSTALL_TEST_DIR }}/InferenceEngineUnitTests --gtest_print_time=1 --gtest_output=xml:${{ env.INSTALL_TEST_DIR }}/TEST-InferenceEngineUnitTests.xml
 - name: Common test utils tests
   shell: cmd
   run: |
@@ -656,12 +638,6 @@ jobs:
   run: |
     call "${{ env.INSTALL_DIR }}\\setupvars.bat" && ${{ env.INSTALL_TEST_DIR }}/ov_op_conformance_tests --gtest_print_time=1 --gtest_filter="*OpImpl*" --gtest_output=xml:${{ env.INSTALL_TEST_DIR }}/TEST-TemplateOpImplTests.xml
-- name: GNA plugin unit tests
-  if: fromJSON(needs.smart_ci.outputs.affected_components).GNA.test
-  shell: cmd
-  run: |
-    call "${{ env.INSTALL_DIR }}\\setupvars.bat" && ${{ env.INSTALL_TEST_DIR }}/ov_gna_unit_tests --gtest_print_time=1 --gtest_output=xml:${{ env.INSTALL_TEST_DIR }}/TEST-GNAUnitTests.xml
 - name: AUTO unit tests
   if: fromJSON(needs.smart_ci.outputs.affected_components).AUTO.test
   shell: cmd


@@ -77,7 +77,7 @@ jobs:
 repository: 'openvinotoolkit/testdata'
 path: 'testdata'
 lfs: 'true'
-ref: 'releases/2023/3'
+ref: 'master'
 #
 # Print system info
@@ -140,7 +140,6 @@ jobs:
       -DENABLE_TESTS=ON `
       -DENABLE_CPPLINT=OFF `
       -DENABLE_NCC_STYLE=OFF `
-      -DENABLE_INTEL_GNA=OFF `
       -DCMAKE_COMPILE_WARNING_AS_ERROR=ON `
       -DENABLE_PROFILING_ITT=ON `
       -DSELECTIVE_BUILD=COLLECT `
@@ -275,7 +274,7 @@ jobs:
 repository: 'openvinotoolkit/testdata'
 path: 'testdata'
 lfs: 'true'
-ref: 'releases/2023/3'
+ref: 'master'
 - name: Download selective build statistics package
   uses: actions/download-artifact@v3
@@ -306,7 +305,6 @@ jobs:
       -DSELECTIVE_BUILD=ON `
       -DENABLE_TEMPLATE=OFF `
       -DENABLE_INTEL_GPU=OFF `
-      -DENABLE_INTEL_GNA=OFF `
       -DENABLE_OV_TF_FRONTEND=OFF `
       -DENABLE_OV_TF_LITE_FRONTEND=OFF `
       -DENABLE_OV_PADDLE_FRONTEND=OFF `


@@ -33,7 +33,7 @@ OpenVINO™ is an open-source toolkit for optimizing and deploying AI inference.
 - Reduce resource demands and efficiently deploy on a range of Intel® platforms from edge to cloud
-This open-source version includes several components: namely [OpenVINO Model Converter (OVC)], [OpenVINO™ Runtime], as well as CPU, GPU, GNA, multi device and heterogeneous plugins to accelerate deep learning inference on Intel® CPUs and Intel® Processor Graphics.
+This open-source version includes several components: namely [OpenVINO Model Converter (OVC)], [OpenVINO™ Runtime], as well as CPU, GPU, multi device and heterogeneous plugins to accelerate deep learning inference on Intel® CPUs and Intel® Processor Graphics.
 It supports pre-trained models from [Open Model Zoo], along with 100+ open
 source and public models in popular formats such as TensorFlow, ONNX, PaddlePaddle, MXNet, Caffe, Kaldi.
@@ -82,12 +82,6 @@ The OpenVINO™ Runtime can infer models on different hardware devices. This sec
 <td><b><i><a href="./src/plugins/intel_gpu">openvino_intel_gpu_plugin</a></i></b></td>
 <td>Intel Processor Graphics, including Intel HD Graphics and Intel Iris Graphics</td>
 </tr>
-<tr>
-<td>GNA</td>
-<td><a href="https://docs.openvino.ai/2023.2/openvino_docs_OV_UG_supported_plugins_GNA.html#doxid-openvino-docs-o-v-u-g-supported-plugins-g-n-a">Intel GNA</a></td>
-<td><b><i><a href="./src/plugins/intel_gna">openvino_intel_gna_plugin</a></i></b></td>
-<td>Intel Speech Enabling Developer Kit, Amazon Alexa* Premium Far-Field Developer Kit, Intel Pentium Silver J5005 Processor, Intel Pentium Silver N5000 Processor, Intel Celeron J4005 Processor, Intel Celeron J4105 Processor, Intel Celeron Processor N4100, Intel Celeron Processor N4000, Intel Core i3-8121U Processor, Intel Core i7-1065G7 Processor, Intel Core i7-1060G7 Processor, Intel Core i5-1035G4 Processor, Intel Core i5-1035G7 Processor, Intel Core i5-1035G1 Processor, Intel Core i5-1030G7 Processor, Intel Core i5-1030G4 Processor, Intel Core i3-1005G1 Processor, Intel Core i3-1000G1 Processor, Intel Core i3-1000G4 Processor</td>
-</tr>
 </tbody>
 </table>


@@ -89,13 +89,6 @@ if(ENABLE_INTEL_CPU)
         PREFIX "${OV_COVERAGE_BASE_DIRECTORY}")
 endif()
-if(ENABLE_INTEL_GNA)
-    ov_coverage_extract(INPUT "openvino" OUTPUT "intel_gna_plugin"
-        PATTERNS "${OV_COVERAGE_BASE_DIRECTORY}/src/plugins/intel_gna/*")
-    ov_coverage_genhtml(INFO_FILE "intel_gna_plugin"
-        PREFIX "${OV_COVERAGE_BASE_DIRECTORY}")
-endif()
 if (ENABLE_INTEL_GPU)
     ov_coverage_extract(INPUT "openvino" OUTPUT "intel_gpu_plugin"
         PATTERNS "${OV_COVERAGE_BASE_DIRECTORY}/src/plugins/intel_gpu/*")


@@ -221,47 +221,3 @@ Build oneTBB from sources and set TBBROOT environment var before OpenVINO cmake
     update_deps_cache(TBBBIND_2_5_ROOT "${TBBBIND_2_5}" "Path to TBBBIND_2_5 root folder")
     update_deps_cache(TBBBIND_2_5_DIR "${TBBBIND_2_5}/cmake" "Path to TBBBIND_2_5 cmake folder")
 endfunction()
-if(ENABLE_INTEL_GNA)
-    reset_deps_cache(
-        GNA_EXT_DIR
-        GNA_PLATFORM_DIR
-        GNA_KERNEL_LIB_NAME
-        GNA_LIBS_LIST
-        GNA_LIB_DIR
-        libGNA_INCLUDE_DIRS
-        libGNA_LIBRARIES_BASE_PATH)
-    set(GNA_VERSION "03.05.00.2116")
-    set(GNA_HASH "960350567702bda17276ac4c060d7524fb7ce7ced785004bd861c81ff2bfe2c5")
-    set(FILES_TO_EXTRACT_LIST gna_${GNA_VERSION}/include)
-    if(WIN32)
-        LIST(APPEND FILES_TO_EXTRACT_LIST gna_${GNA_VERSION}/win64)
-    else()
-        LIST(APPEND FILES_TO_EXTRACT_LIST gna_${GNA_VERSION}/linux)
-    endif()
-    RESOLVE_DEPENDENCY(GNA_EXT_DIR
-        ARCHIVE_UNIFIED "gna/gna_${GNA_VERSION}.zip"
-        TARGET_PATH "${TEMP}/gna_${GNA_VERSION}"
-        VERSION_REGEX ".*_([0-9]+.[0-9]+.[0-9]+.[0-9]+).*"
-        FILES_TO_EXTRACT FILES_TO_EXTRACT_LIST
-        SHA256 ${GNA_HASH}
-        USE_NEW_LOCATION TRUE)
-    update_deps_cache(GNA_EXT_DIR "${GNA_EXT_DIR}" "Path to GNA root folder")
-    debug_message(STATUS "gna=" ${GNA_EXT_DIR})
-    if (WIN32)
-        set(GNA_PLATFORM_DIR win64 CACHE STRING "" FORCE)
-    elseif (UNIX)
-        set(GNA_PLATFORM_DIR linux CACHE STRING "" FORCE)
-    else ()
-        message(FATAL_ERROR "GNA not supported on this platform, only linux, and windows")
-    endif ()
-    set(GNA_LIB_DIR x64 CACHE STRING "" FORCE)
-    set(GNA_PATH ${GNA_EXT_DIR}/${GNA_PLATFORM_DIR}/${GNA_LIB_DIR} CACHE STRING "" FORCE)
-    if(NOT BUILD_SHARED_LIBS)
-        list(APPEND PATH_VARS "GNA_PATH")
-    endif()
-endif()


@@ -22,8 +22,6 @@ macro(ov_archive_cpack_set_dirs)
 # common "archive" package locations
 # TODO: move current variables to OpenVINO specific locations
 set(OV_CPACK_INCLUDEDIR runtime/include)
-set(OV_CPACK_IE_CMAKEDIR runtime/cmake)
-set(OV_CPACK_NGRAPH_CMAKEDIR runtime/cmake)
 set(OV_CPACK_OPENVINO_CMAKEDIR runtime/cmake)
 set(OV_CPACK_DOCDIR docs)
 set(OV_CPACK_LICENSESDIR licenses)
@@ -89,7 +87,6 @@ macro(ov_define_component_include_rules)
 set(OV_CPACK_COMP_NPM_EXCLUDE_ALL EXCLUDE_FROM_ALL)
 # tools
 set(OV_CPACK_COMP_OPENVINO_DEV_REQ_FILES_EXCLUDE_ALL EXCLUDE_FROM_ALL)
-unset(OV_CPACK_COMP_DEPLOYMENT_MANAGER_EXCLUDE_ALL)
 # scripts
 unset(OV_CPACK_COMP_INSTALL_DEPENDENCIES_EXCLUDE_ALL)
 unset(OV_CPACK_COMP_SETUPVARS_EXCLUDE_ALL)


@@ -21,13 +21,9 @@ macro(ov_common_libraries_cpack_set_dirs)
 endif()
 set(OV_CPACK_ARCHIVEDIR ${CMAKE_INSTALL_LIBDIR})
 if(CPACK_GENERATOR MATCHES "^(CONAN|VCPKG)$")
-    set(OV_CPACK_IE_CMAKEDIR ${CMAKE_INSTALL_DATADIR}/openvino)
-    set(OV_CPACK_NGRAPH_CMAKEDIR ${CMAKE_INSTALL_DATADIR}/openvino)
     set(OV_CPACK_OPENVINO_CMAKEDIR ${CMAKE_INSTALL_DATADIR}/openvino)
     set(OV_CPACK_PLUGINSDIR ${OV_CPACK_RUNTIMEDIR})
 else()
-    set(OV_CPACK_IE_CMAKEDIR ${CMAKE_INSTALL_LIBDIR}/cmake/inferenceengine${OpenVINO_VERSION})
-    set(OV_CPACK_NGRAPH_CMAKEDIR ${CMAKE_INSTALL_LIBDIR}/cmake/ngraph${OpenVINO_VERSION})
     set(OV_CPACK_OPENVINO_CMAKEDIR ${CMAKE_INSTALL_LIBDIR}/cmake/openvino${OpenVINO_VERSION})
     set(OV_CPACK_PLUGINSDIR ${OV_CPACK_RUNTIMEDIR}/openvino-${OpenVINO_VERSION})
 endif()
@@ -105,7 +101,6 @@ macro(ov_define_component_include_rules)
 set(OV_CPACK_COMP_NPM_EXCLUDE_ALL EXCLUDE_FROM_ALL)
 # tools
 set(OV_CPACK_COMP_OPENVINO_DEV_REQ_FILES_EXCLUDE_ALL EXCLUDE_FROM_ALL)
-set(OV_CPACK_COMP_DEPLOYMENT_MANAGER_EXCLUDE_ALL EXCLUDE_FROM_ALL)
 # scripts
 set(OV_CPACK_COMP_INSTALL_DEPENDENCIES_EXCLUDE_ALL EXCLUDE_FROM_ALL)
 set(OV_CPACK_COMP_SETUPVARS_EXCLUDE_ALL EXCLUDE_FROM_ALL)


@@ -30,8 +30,6 @@ macro(ov_debian_cpack_set_dirs)
 set(OV_CPACK_LIBRARYDIR ${OV_CPACK_RUNTIMEDIR})
 set(OV_CPACK_ARCHIVEDIR ${OV_CPACK_RUNTIMEDIR})
 set(OV_CPACK_PLUGINSDIR ${OV_CPACK_RUNTIMEDIR}/openvino-${OpenVINO_VERSION})
-set(OV_CPACK_IE_CMAKEDIR ${OV_CPACK_RUNTIMEDIR}/cmake/inferenceengine${OpenVINO_VERSION})
-set(OV_CPACK_NGRAPH_CMAKEDIR ${OV_CPACK_RUNTIMEDIR}/cmake/ngraph${OpenVINO_VERSION})
 set(OV_CPACK_OPENVINO_CMAKEDIR ${OV_CPACK_RUNTIMEDIR}/cmake/openvino${OpenVINO_VERSION})
 set(OV_CPACK_DOCDIR ${CMAKE_INSTALL_DATADIR}/doc/openvino-${OpenVINO_VERSION})
 set(OV_CPACK_LICENSESDIR ${OV_CPACK_DOCDIR}/licenses)
@@ -110,7 +108,6 @@ macro(ov_define_component_include_rules)
 set(OV_CPACK_COMP_NPM_EXCLUDE_ALL EXCLUDE_FROM_ALL)
 # tools
 set(OV_CPACK_COMP_OPENVINO_DEV_REQ_FILES_EXCLUDE_ALL EXCLUDE_FROM_ALL)
-set(OV_CPACK_COMP_DEPLOYMENT_MANAGER_EXCLUDE_ALL EXCLUDE_FROM_ALL)
 # scripts
 set(OV_CPACK_COMP_INSTALL_DEPENDENCIES_EXCLUDE_ALL EXCLUDE_FROM_ALL)
 set(OV_CPACK_COMP_SETUPVARS_EXCLUDE_ALL EXCLUDE_FROM_ALL)


@@ -14,49 +14,18 @@ set(CMAKE_SKIP_INSTALL_RPATH OFF)
 #
 macro(ov_npm_cpack_set_dirs)
     set(OV_CPACK_INCLUDEDIR .)
-    set(OV_CPACK_IE_CMAKEDIR .)
-    set(OV_CPACK_NGRAPH_CMAKEDIR .)
     set(OV_CPACK_OPENVINO_CMAKEDIR .)
     set(OV_CPACK_DOCDIR .)
-    set(OV_CPACK_LICENSESDIR .)
+    set(OV_CPACK_LICENSESDIR licenses)
     set(OV_CPACK_SAMPLESDIR .)
-    set(OV_CPACK_WHEELSDIR .)
     set(OV_CPACK_TOOLSDIR .)
     set(OV_CPACK_DEVREQDIR .)
     set(OV_CPACK_PYTHONDIR .)
-    if(WIN32)
-        set(OV_CPACK_LIBRARYDIR .)
-        set(OV_CPACK_RUNTIMEDIR .)
-        set(OV_CPACK_ARCHIVEDIR .)
-    elseif(APPLE)
-        set(OV_CPACK_LIBRARYDIR .)
-        set(OV_CPACK_RUNTIMEDIR .)
-        set(OV_CPACK_ARCHIVEDIR .)
-    else()
-        set(OV_CPACK_LIBRARYDIR .)
-        set(OV_CPACK_RUNTIMEDIR .)
-        set(OV_CPACK_ARCHIVEDIR .)
-    endif()
+    set(OV_CPACK_LIBRARYDIR .)
+    set(OV_CPACK_ARCHIVEDIR .)
     set(OV_CPACK_PLUGINSDIR .)
     # non-native stuff
-    set(OV_CPACK_SHAREDIR .)
+    unset(OV_CPACK_SHAREDIR)
     # skipped during debian packaging
     set(OV_CPACK_WHEELSDIR .)
 endmacro()
ov_npm_cpack_set_dirs()
@@ -93,7 +62,6 @@ macro(ov_define_component_include_rules)
unset(OV_CPACK_COMP_NPM_EXCLUDE_ALL)
# tools
set(OV_CPACK_COMP_OPENVINO_DEV_REQ_FILES_EXCLUDE_ALL EXCLUDE_FROM_ALL)
set(OV_CPACK_COMP_DEPLOYMENT_MANAGER_EXCLUDE_ALL EXCLUDE_FROM_ALL)
# scripts
set(OV_CPACK_COMP_INSTALL_DEPENDENCIES_EXCLUDE_ALL EXCLUDE_FROM_ALL)
set(OV_CPACK_COMP_SETUPVARS_EXCLUDE_ALL EXCLUDE_FROM_ALL)


@@ -54,8 +54,6 @@ macro(ov_archive_cpack_set_dirs)
# common "archive" package locations
# TODO: move current variables to OpenVINO specific locations
set(OV_CPACK_INCLUDEDIR runtime/include)
set(OV_CPACK_IE_CMAKEDIR runtime/cmake)
set(OV_CPACK_NGRAPH_CMAKEDIR runtime/cmake)
set(OV_CPACK_OPENVINO_CMAKEDIR runtime/cmake)
set(OV_CPACK_DOCDIR docs)
set(OV_CPACK_LICENSESDIR licenses)
@@ -121,7 +119,6 @@ macro(ov_define_component_include_rules)
set(OV_CPACK_COMP_NPM_EXCLUDE_ALL EXCLUDE_FROM_ALL)
# tools
unset(OV_CPACK_COMP_OPENVINO_DEV_REQ_FILES_EXCLUDE_ALL)
unset(OV_CPACK_COMP_DEPLOYMENT_MANAGER_EXCLUDE_ALL)
# scripts
unset(OV_CPACK_COMP_INSTALL_DEPENDENCIES_EXCLUDE_ALL)
unset(OV_CPACK_COMP_SETUPVARS_EXCLUDE_ALL)


@@ -149,7 +149,6 @@ macro(ov_define_component_names)
set(OV_CPACK_COMP_NPM "ov_node_addon")
# tools
set(OV_CPACK_COMP_OPENVINO_DEV_REQ_FILES "openvino_dev_req_files")
set(OV_CPACK_COMP_DEPLOYMENT_MANAGER "deployment_manager")
# scripts
set(OV_CPACK_COMP_INSTALL_DEPENDENCIES "install_dependencies")
set(OV_CPACK_COMP_SETUPVARS "setupvars")


@@ -17,8 +17,6 @@ macro(ov_rpm_cpack_set_dirs)
set(OV_CPACK_RUNTIMEDIR ${CMAKE_INSTALL_LIBDIR})
set(OV_CPACK_ARCHIVEDIR ${CMAKE_INSTALL_LIBDIR})
set(OV_CPACK_PLUGINSDIR ${CMAKE_INSTALL_LIBDIR}/openvino-${OpenVINO_VERSION})
set(OV_CPACK_IE_CMAKEDIR ${CMAKE_INSTALL_LIBDIR}/cmake/inferenceengine${OpenVINO_VERSION})
set(OV_CPACK_NGRAPH_CMAKEDIR ${CMAKE_INSTALL_LIBDIR}/cmake/ngraph${OpenVINO_VERSION})
set(OV_CPACK_OPENVINO_CMAKEDIR ${CMAKE_INSTALL_LIBDIR}/cmake/openvino${OpenVINO_VERSION})
set(OV_CPACK_DOCDIR ${CMAKE_INSTALL_DATADIR}/doc/openvino-${OpenVINO_VERSION})
set(OV_CPACK_LICENSESDIR ${OV_CPACK_DOCDIR}/licenses)
@@ -60,7 +58,7 @@ ov_override_component_names()
#
# Override include / exclude rules for components
# This is required to exclude some files from installation
# (e.g. rpm packages don't require setupvars scripts or deployment_manager)
# (e.g. rpm packages don't require setupvars scripts or others)
#
macro(ov_define_component_include_rules)
@@ -101,7 +99,6 @@ macro(ov_define_component_include_rules)
set(OV_CPACK_COMP_NPM_EXCLUDE_ALL EXCLUDE_FROM_ALL)
# tools
set(OV_CPACK_COMP_OPENVINO_DEV_REQ_FILES_EXCLUDE_ALL EXCLUDE_FROM_ALL)
set(OV_CPACK_COMP_DEPLOYMENT_MANAGER_EXCLUDE_ALL EXCLUDE_FROM_ALL)
# scripts
set(OV_CPACK_COMP_INSTALL_DEPENDENCIES_EXCLUDE_ALL EXCLUDE_FROM_ALL)
set(OV_CPACK_COMP_SETUPVARS_EXCLUDE_ALL EXCLUDE_FROM_ALL)


@@ -82,33 +82,19 @@ macro(ov_parse_ci_build_number repo_root)
return()
endif()
set(ie_version_hpp "${OpenVINO_SOURCE_DIR}/src/inference/include/ie/ie_version.hpp")
if(NOT EXISTS ${ie_version_hpp})
message(FATAL_ERROR "File ie_version.hpp with IE_VERSION definitions is not found")
endif()
set(ov_version_hpp "${OpenVINO_SOURCE_DIR}/src/core/include/openvino/core/version.hpp")
if(NOT EXISTS ${ov_version_hpp})
message(FATAL_ERROR "File openvino/core/version.hpp with OPENVINO_VERSION definitions is not found")
endif()
file(STRINGS "${ie_version_hpp}" IE_VERSION_PARTS REGEX "#define IE_VERSION_[A-Z]+[ ]+" )
file(STRINGS "${ov_version_hpp}" OV_VERSION_PARTS REGEX "#define OPENVINO_VERSION_[A-Z]+[ ]+" )
foreach(suffix MAJOR MINOR PATCH)
set(ie_version_name "IE_VERSION_${suffix}")
set(ov_version_name "OpenVINO_VERSION_${suffix}")
set(ov_version_name_hpp "OPENVINO_VERSION_${suffix}")
string(REGEX REPLACE ".+${ie_version_name}[ ]+([0-9]+).*" "\\1"
${ie_version_name}_HPP "${IE_VERSION_PARTS}")
string(REGEX REPLACE ".+${ov_version_name_hpp}[ ]+([0-9]+).*" "\\1"
${ov_version_name}_HPP "${OV_VERSION_PARTS}")
if(NOT ${ie_version_name}_HPP EQUAL ${ov_version_name}_HPP)
message(FATAL_ERROR "${ov_version_name} (${${ov_version_name}_HPP})"
" and ${ie_version_name} (${${ie_version_name}_HPP}) are not equal")
endif()
endforeach()
foreach(var OpenVINO_VERSION_MAJOR OpenVINO_VERSION_MINOR OpenVINO_VERSION_PATCH)


@@ -103,13 +103,6 @@ endif()
ov_dependent_option (ENABLE_TBBBIND_2_5 "Enable TBBBind_2_5 static usage in OpenVINO runtime" ${ENABLE_TBBBIND_2_5_DEFAULT} "THREADING MATCHES TBB; NOT APPLE" OFF)
ov_dependent_option (ENABLE_TBB_RELEASE_ONLY "Only Release TBB libraries are linked to the OpenVINO Runtime binaries" ON "THREADING MATCHES TBB;LINUX" OFF)
ov_dependent_option (ENABLE_INTEL_GNA "GNA support for OpenVINO Runtime" ON
"NOT APPLE;NOT ANDROID;X86_64;CMAKE_CXX_COMPILER_VERSION VERSION_GREATER_EQUAL 5.4" OFF)
ov_dependent_option (ENABLE_INTEL_GNA_DEBUG "GNA debug build" OFF "ENABLE_INTEL_GNA" OFF)
ov_dependent_option (ENABLE_V7_SERIALIZE "enables serialization to IR v7" OFF "ENABLE_INTEL_GNA" OFF)
ov_dependent_option (ENABLE_IR_V7_READER "Enables IR v7 reader" ${BUILD_SHARED_LIBS} "ENABLE_TESTS;ENABLE_INTEL_GNA" OFF)
ov_dependent_option (ENABLE_GAPI_PREPROCESSING "Enables G-API preprocessing" ON "NOT MINGW64" OFF)
ov_option (ENABLE_MULTI "Enables MULTI Device Plugin" ON)


@@ -51,8 +51,6 @@ macro(ov_cpack_settings)
NOT item MATCHES "^${OV_CPACK_COMP_PYTHON_OPENVINO}_python.*" AND
# because in case of .deb package, pyopenvino_package_python${Python3_VERSION_MAJOR}${Python3_VERSION_MINOR} is installed
(NOT item MATCHES "^${OV_CPACK_COMP_PYTHON_OPENVINO_PACKAGE}_python.*" OR ENABLE_PYTHON_PACKAGING) AND
# see ticket # 82605
NOT item STREQUAL "gna" AND
# temporary block nvidia
NOT item STREQUAL "nvidia" AND
# don't install Intel OpenMP
@@ -90,6 +88,7 @@ macro(ov_cpack_settings)
2023.0.0 2023.0.1 2023.0.2 2023.0.3
2023.1.0
2023.2.0
2023.3.0 2023.3.1 2023.3.2 2023.3.3 2023.3.4 2023.3.5
)
#
@@ -182,23 +181,6 @@ macro(ov_cpack_settings)
set(gpu_copyright "generic")
endif()
# intel-gna
if(ENABLE_INTEL_GNA AND "gna" IN_LIST CPACK_COMPONENTS_ALL)
set(CPACK_COMPONENT_GNA_DESCRIPTION "Intel® Gaussian Neural Accelerator inference plugin")
set(CPACK_COMPONENT_GNA_DEPENDS "${OV_CPACK_COMP_CORE}")
set(CPACK_DEBIAN_GNA_PACKAGE_NAME "libopenvino-intel-gna-plugin-${cpack_name_ver}")
# since we have libgna.so we need to call ldconfig and have `def_triggers` here
set(CPACK_DEBIAN_GNA_PACKAGE_CONTROL_EXTRA "${def_postinst};${def_postrm};${def_triggers}")
ov_debian_add_lintian_suppression(gna
# package name matches libopenvino_intel_gna_plugin.so
# but lintian looks at libgna.so.2 since it's a versioned library
"package-name-doesnt-match-sonames")
set(gna_copyright "generic")
_ov_add_plugin(gna OFF)
endif()
# # add pseudo plugins are recommended to core component
# if(pseudo_plugins_recommends)
# # see https://superuser.com/questions/70031/what-is-the-difference-between-recommended-and-suggested-packages-ubuntu.


@@ -37,8 +37,6 @@ macro(ov_cpack_settings)
NOT item MATCHES "^${OV_CPACK_COMP_PYTHON_OPENVINO}_python.*" AND
# because in case of .rpm package, pyopenvino_package_python${Python3_VERSION_MAJOR}${Python3_VERSION_MINOR} is installed
(NOT item MATCHES "^${OV_CPACK_COMP_PYTHON_OPENVINO_PACKAGE}_python.*" OR ENABLE_PYTHON_PACKAGING) AND
# see ticket # 82605
NOT item STREQUAL "gna" AND
# temporary block nvidia
NOT item STREQUAL "nvidia" AND
# don't install Intel OpenMP
@@ -76,6 +74,7 @@ macro(ov_cpack_settings)
2023.0.0 2023.0.1 2023.0.2 2023.0.3
2023.1.0
2023.2.0
2023.3.0 2023.3.1 2023.3.2 2023.3.3 2023.3.4 2023.3.5
)
find_host_program(rpmlint_PROGRAM NAMES rpmlint DOC "Path to rpmlint")
@@ -178,15 +177,6 @@ macro(ov_cpack_settings)
set(gpu_copyright "generic")
endif()
# intel-gna
if(ENABLE_INTEL_GNA AND "gna" IN_LIST CPACK_COMPONENTS_ALL)
set(CPACK_COMPONENT_GNA_DESCRIPTION "Intel® Gaussian Neural Accelerator inference plugin")
set(CPACK_RPM_GNA_PACKAGE_REQUIRES "${core_package}")
set(CPACK_RPM_GNA_PACKAGE_NAME "libopenvino-intel-gna-plugin-${cpack_name_ver}")
_ov_add_package(plugin_packages gna)
set(gna_copyright "generic")
endif()
#
# Frontends
#


@@ -79,8 +79,6 @@ endforeach()
set(PACKAGE_PREFIX_DIR ${_ie_package_prefix_dir})
unset(_ie_package_prefix_dir)
set_and_check(InferenceEngine_INCLUDE_DIRS "@PACKAGE_OV_INCLUDE_DIR@")
check_required_components(${CMAKE_FIND_PACKAGE_NAME})
if(_ie_need_package_name_reset)


@@ -417,20 +417,6 @@ macro(_ov_find_intel_gpu_dependencies)
unset(_OV_ENABLE_ONEDNN_FOR_GPU)
endmacro()
macro(_ov_find_intel_gna_dependencies)
set(_OV_ENABLE_INTEL_GNA "@ENABLE_INTEL_GNA@")
if(_OV_ENABLE_INTEL_GNA)
set_and_check(GNA_PATH "@PACKAGE_GNA_PATH@")
_ov_find_dependency(libGNA
COMPONENTS KERNEL
CONFIG
PATHS "${CMAKE_CURRENT_LIST_DIR}"
NO_DEFAULT_PATH)
unset(GNA_PATH)
endif()
unset(_OV_ENABLE_INTEL_GNA)
endmacro()
macro(_ov_find_protobuf_frontend_dependency)
set(_OV_ENABLE_SYSTEM_PROTOBUF "@ENABLE_SYSTEM_PROTOBUF@")
set(_OV_PROTOBUF_PACKAGE_CONFIG "@protobuf_config@")
@@ -518,7 +504,6 @@ if(NOT _OV_ENABLE_OPENVINO_BUILD_SHARED)
# plugin dependencies
_ov_find_intel_cpu_dependencies()
_ov_find_intel_gpu_dependencies()
_ov_find_intel_gna_dependencies()
endif()
_ov_find_dependency(Threads)


@@ -81,7 +81,7 @@ GPU
were used during OpenVINO internal validation: 22.43 for Ubuntu 22.04, 21.48
for Ubuntu 20.04 and 21.49 for Red Hat Enterprise Linux 8.
NPU and GNA
NPU
#############################
.. tab-set::
@@ -91,13 +91,6 @@ NPU and GNA
* Ubuntu 22.04 long-term support (LTS), 64-bit
* Windows 11, 64-bit
.. tab-item:: Operating Systems for GNA
* Ubuntu 22.04 long-term support (LTS), 64-bit
* Ubuntu 20.04 long-term support (LTS), 64-bit
* Windows 10, 64-bit
* Windows 11, 64-bit
.. tab-item:: Additional considerations
* These Accelerators require drivers that are not included in the
@@ -130,7 +123,7 @@ Operating systems and developer environment
Higher versions of kernel might be required for 10th Gen Intel® Core™ Processors,
11th Gen Intel® Core™ Processors, 11th Gen Intel® Core™ Processors S-Series Processors,
12th Gen Intel® Core™ Processors, 13th Gen Intel® Core™ Processors, Intel® Core™ Ultra
Processors, or 4th Gen Intel® Xeon® Scalable Processors to support CPU, GPU, GNA or
Processors, or 4th Gen Intel® Xeon® Scalable Processors to support CPU, GPU or
hybrid-cores CPU capabilities.
.. tab-item:: Windows


@@ -82,7 +82,7 @@ offering.
than the remaining ones, their support has been discontinued. Converting them to the
ONNX format is a possible way of retaining them in the OpenVINO-based pipeline.
| :doc:`See the previous conversion instructions <mxnet_caffe_kaldi>`
| :doc:`See the currently supported frameworks <Supported_Model_Formats>`
| :doc:`See the currently supported frameworks <openvino_docs_model_processing_introduction>`
| **Post-training Optimization Tool (POT)**


@@ -7,7 +7,7 @@
The code described here has been **deprecated!** Do not use it to avoid working with a legacy solution. It will be kept for some time to ensure backwards compatibility, but **you should not use** it in contemporary applications.
This guide describes a deprecated conversion method. The guide on the new and recommended method can be found in the :doc:`Supported Model Formats <Supported_Model_Formats>` article.
This guide describes a deprecated conversion method. The guide on the new and recommended method can be found in the :doc:`Supported Model Formats <openvino_docs_model_processing_introduction>` article.
.. toctree::
:maxdepth: 1


@@ -12,7 +12,7 @@ Converting TensorFlow FaceNet Models
The code described here has been **deprecated!** Do not use it to avoid working with a legacy solution. It will be kept for some time to ensure backwards compatibility, but **you should not use** it in contemporary applications.
This guide describes a deprecated conversion method. The guide on the new and recommended method can be found in the :doc:`Supported Model Formats <Supported_Model_Formats>` article.
This guide describes a deprecated conversion method. The guide on the new and recommended method can be found in the :doc:`Supported Model Formats <openvino_docs_model_processing_introduction>` article.
`Public pre-trained FaceNet models <https://github.com/davidsandberg/facenet#pre-trained-models>`__ contain both the training
and inference parts of the graph. Switching between these two states is managed with a placeholder value.


@@ -75,7 +75,7 @@ Glossary of terms used in OpenVINO™
| Number of images to analyze during one call of infer. Maximum batch size is a property of the model set before its compilation. In NHWC, NCHW, and NCDHW image data layout representations, the 'N' refers to the number of images in the batch.
| *Device Affinity*
| A preferred hardware device to run inference (CPU, GPU, GNA, etc.).
| A preferred hardware device to run inference (CPU, GPU, NPU, etc.).
| *Extensibility mechanism, Custom layers*
| The mechanism that provides you with capabilities to extend the OpenVINO™ Runtime and model conversion API so that they can work with models containing operations that are not yet supported.
@@ -87,7 +87,7 @@ Glossary of terms used in OpenVINO™
| The Conversion API is used to import and convert models trained in popular frameworks to a format usable by other OpenVINO components. Model conversion API is represented by a Python ``openvino.convert_model()`` method and ``ovc`` command-line tool.
| *OpenVINO™ Core*
| OpenVINO™ Core is a software component that manages inference on certain Intel(R) hardware devices: CPU, GPU, GNA, etc.
| OpenVINO™ Core is a software component that manages inference on certain Intel(R) hardware devices: CPU, GPU, NPU, etc.
| *OpenVINO™ API*
| The basic default API for all supported devices, which allows you to load a model from Intermediate Representation or convert from ONNX, PaddlePaddle, TensorFlow, TensorFlow Lite file formats, set input and output formats and execute the model on various devices.


@@ -17,6 +17,7 @@ Learn OpenVINO
Interactive Tutorials (Python) <tutorials>
Sample Applications (Python & C++) <openvino_docs_OV_UG_Samples_Overview>
Generative AI Optimization and Deployment <gen_ai_guide>
Import TensorFlow and PyTorch Models <openvino_docs_model_processing_introduction_draft>
This section will help you get a hands-on experience with OpenVINO even if you are just starting
@@ -32,3 +33,6 @@ as well as an experienced user.
| :doc:`Optimize and Deploy Generative AI Models <gen_ai_guide>`
| Detailed information on how OpenVINO accelerates Generative AI use cases and what models it supports. This tutorial provides instructions for running Generative AI models using Hugging Face Optimum Intel and Native OpenVINO APIs.
| :doc:`Import TensorFlow and PyTorch Models <openvino_docs_model_processing_introduction_draft>`
| Learn about different import methods for TensorFlow and PyTorch models.


@@ -26,7 +26,7 @@ OpenVINO offers multiple workflows, depending on the use case and personal or pr
This section will give you a detailed view of how you can go from preparing your model,
through optimizing it, to executing inference, and deploying your solution.
Once you obtain a model in one of the :doc:`supported model formats <Supported_Model_Formats>`,
Once you obtain a model in one of the :doc:`supported model formats <openvino_docs_model_processing_introduction>`,
you can decide how to proceed:
.. tab-set::
@@ -48,6 +48,34 @@ you can decide how to proceed:
:align: center
:alt: OpenVINO workflow diagram for performance
OpenVINO uses the following functions for reading, converting, and saving models:
.. tab-set::
.. tab-item:: read_model
* Creates an ov.Model from a file.
* Supported file formats: OpenVINO IR, ONNX, PaddlePaddle, TensorFlow and TensorFlow Lite. PyTorch files are not directly supported.
* OpenVINO files are read directly while other formats are converted automatically.
.. tab-item:: compile_model
* Creates an ov.CompiledModel from a file or ov.Model object.
* Supported file formats: OpenVINO IR, ONNX, PaddlePaddle, TensorFlow and TensorFlow Lite. PyTorch files are not directly supported.
* OpenVINO files are read directly while other formats are converted automatically.
.. tab-item:: convert_model
* Creates an ov.Model from a file or Python memory object.
* Supported file formats: ONNX, PaddlePaddle, TensorFlow and TensorFlow Lite.
* Supported framework objects: PaddlePaddle, TensorFlow and PyTorch.
* This method is only available in the Python API.
.. tab-item:: save_model
* Saves an ov.Model to OpenVINO IR format.
* Compresses weights to FP16 by default.
* This method is only available in the Python API.
| :doc:`Model Preparation <openvino_docs_model_processing_introduction>`


@@ -245,7 +245,7 @@ Build OpenVINO with conditional compilation enabled:
cd %OPENVINO_HOME%
md build_cc
cd build_cc
cmake -G Ninja -Wno-dev -DCMAKE_BUILD_TYPE=Debug -DENABLE_CPPLINT=OFF -DCMAKE_VERBOSE_MAKEFILE=ON -DCMAKE_COMPILE_WARNING_AS_ERROR=OFF -DENABLE_FASTER_BUILD=ON -DENABLE_SANITIZER=OFF -DTHREADING=TBB -DBUILD_SHARED_LIBS=OFF -DENABLE_PROFILING_ITT=ON -DSELECTIVE_BUILD=COLLECT -DENABLE_INTEL_GPU=OFF -DENABLE_INTEL_GNA=OFF -DENABLE_MULTI=OFF -DENABLE_AUTO=OFF -DENABLE_AUTO_BATCH=OFF -DENABLE_HETERO=OFF -DENABLE_TEMPLATE=OFF -DENABLE_OV_ONNX_FRONTEND=OFF -DENABLE_OV_PADDLE_FRONTEND=OFF -DENABLE_OV_PYTORCH_FRONTEND=OFF -DENABLE_OV_TF_FRONTEND=OFF -DCMAKE_INSTALL_PREFIX=install ..
cmake -G Ninja -Wno-dev -DCMAKE_BUILD_TYPE=Debug -DENABLE_CPPLINT=OFF -DCMAKE_VERBOSE_MAKEFILE=ON -DCMAKE_COMPILE_WARNING_AS_ERROR=OFF -DENABLE_FASTER_BUILD=ON -DENABLE_SANITIZER=OFF -DTHREADING=TBB -DBUILD_SHARED_LIBS=OFF -DENABLE_PROFILING_ITT=ON -DSELECTIVE_BUILD=COLLECT -DENABLE_INTEL_GPU=OFF -DENABLE_MULTI=OFF -DENABLE_AUTO=OFF -DENABLE_AUTO_BATCH=OFF -DENABLE_HETERO=OFF -DENABLE_TEMPLATE=OFF -DENABLE_OV_ONNX_FRONTEND=OFF -DENABLE_OV_PADDLE_FRONTEND=OFF -DENABLE_OV_PYTORCH_FRONTEND=OFF -DENABLE_OV_TF_FRONTEND=OFF -DCMAKE_INSTALL_PREFIX=install ..
cmake --build . --config Debug
@@ -278,7 +278,7 @@ Generate final optimal binaries size of OpenVINO package
md build
cd build
cmake -G "Visual Studio 16 2019" -A x64 -DENABLE_CPPLINT=OFF -DCMAKE_VERBOSE_MAKEFILE=ON -DCMAKE_COMPILE_WARNING_AS_ERROR=OFF -DCMAKE_BUILD_TYPE=Release -DENABLE_FASTER_BUILD=ON -DENABLE_PROFILING_ITT=OFF -DSELECTIVE_BUILD=ON -DENABLE_INTEL_GPU=OFF -DENABLE_INTEL_GNA=OFF -DENABLE_MULTI=OFF -DENABLE_AUTO=OFF -DENABLE_AUTO_BATCH=OFF -DENABLE_HETERO=OFF -DENABLE_TEMPLATE=OFF -DENABLE_OV_ONNX_FRONTEND=OFF -DENABLE_OV_PADDLE_FRONTEND=OFF -DENABLE_OV_PYTORCH_FRONTEND=OFF -DENABLE_OV_TF_FRONTEND=OFF -DSELECTIVE_BUILD_STAT=%OPENVINO_HOME%\cc_data\*.csv -DBUILD_SHARED_LIBS=OFF -DENABLE_LTO=ON -DENABLE_ONEDNN_FOR_GPU=OFF -DENABLE_GAPI_PREPROCESSING=OFF -DENABLE_OV_TF_LITE_FRONTEND=OFF -DENABLE_PROFILING_FIRST_INFERENCE=OFF ..
cmake -G "Visual Studio 16 2019" -A x64 -DENABLE_CPPLINT=OFF -DCMAKE_VERBOSE_MAKEFILE=ON -DCMAKE_COMPILE_WARNING_AS_ERROR=OFF -DCMAKE_BUILD_TYPE=Release -DENABLE_FASTER_BUILD=ON -DENABLE_PROFILING_ITT=OFF -DSELECTIVE_BUILD=ON -DENABLE_INTEL_GPU=OFF -DENABLE_MULTI=OFF -DENABLE_AUTO=OFF -DENABLE_AUTO_BATCH=OFF -DENABLE_HETERO=OFF -DENABLE_TEMPLATE=OFF -DENABLE_OV_ONNX_FRONTEND=OFF -DENABLE_OV_PADDLE_FRONTEND=OFF -DENABLE_OV_PYTORCH_FRONTEND=OFF -DENABLE_OV_TF_FRONTEND=OFF -DSELECTIVE_BUILD_STAT=%OPENVINO_HOME%\cc_data\*.csv -DBUILD_SHARED_LIBS=OFF -DENABLE_LTO=ON -DENABLE_ONEDNN_FOR_GPU=OFF -DENABLE_GAPI_PREPROCESSING=OFF -DENABLE_OV_TF_LITE_FRONTEND=OFF -DENABLE_PROFILING_FIRST_INFERENCE=OFF ..
cmake --build . --config Release


@@ -12,36 +12,33 @@ Model Preparation
:maxdepth: 1
:hidden:
Convert to OpenVINO Model <openvino_docs_OV_Converter_UG_prepare_model_convert_model_Convert_Model_IR>
Conversion Parameters <openvino_docs_OV_Converter_UG_Conversion_Options>
Setting Input Shapes <openvino_docs_OV_Converter_UG_prepare_model_convert_model_Converting_Model>
Convert from PyTorch <openvino_docs_OV_Converter_UG_prepare_model_convert_model_Convert_Model_From_PyTorch>
Convert from TensorFlow <openvino_docs_OV_Converter_UG_prepare_model_convert_model_Convert_Model_From_TensorFlow>
Convert from ONNX <openvino_docs_OV_Converter_UG_prepare_model_convert_model_Convert_Model_From_ONNX>
Convert from TensorFlow_Lite <openvino_docs_OV_Converter_UG_prepare_model_convert_model_Convert_Model_From_TensorFlow_Lite>
Convert from PaddlePaddle <openvino_docs_OV_Converter_UG_prepare_model_convert_model_Convert_Model_From_Paddle>
Supported_Model_Formats
PyVision preprocessing <pytorch_vision>
You can obtain a model in one of :doc:`supported formats <Supported_Model_Formats>`
You can obtain a model in one of the supported formats, **PyTorch, TensorFlow, TensorFlow Lite, ONNX, and PaddlePaddle**,
in many ways. The easiest one is to download it from an online database,
such as `TensorFlow Hub <https://tfhub.dev/>`__, `Hugging Face <https://huggingface.co/>`__,
and `Torchvision models <https://pytorch.org/hub/>`__. Now you have two options:
* Skip model conversion and run inference directly from the source format. Conversion
will still be performed but it will happen automatically and "under the hood."
* Skip model conversion and `run inference <openvino_docs_OV_UG_Integrate_OV_with_your_application>`__ directly from the **TensorFlow, TensorFlow Lite, ONNX, and PaddlePaddle** source format. Conversion
will still be performed but it will happen automatically and "under the hood".
This option, while convenient, offers lower performance and stability, as well as
fewer optimization options.
* Explicitly convert the model to :doc:`OpenVINO IR <openvino_ir>`.
* Explicitly :doc:`convert the model to OpenVINO IR <openvino_docs_OV_Converter_UG_prepare_model_convert_model_Convert_Model_IR>`.
This approach offers the best possible results and is the recommended one,
especially for for production-ready solutions. Explicit conversion can be done in two ways:
especially for production-ready solutions. Consider storing your model in this format to minimize first-inference latency,
perform model optimizations, and save space on your drive, in some cases. Explicit conversion can be done in two ways:
* the Python API functions (``openvino.convert_model`` and ``openvino.save_model``)
* the ``ovc`` command line tool.
Once saved as :doc:`OpenVINO IR <openvino_ir>` (a set of ``.xml`` and ``.bin`` files),
* the `Python API functions <#convert-a-model-with-python-convert-model>`__ (``openvino.convert_model`` and ``openvino.save_model``)
* the `ovc <#convert-a-model-in-cli-ovc>`__ command line tool.
Once saved as :doc:`OpenVINO IR <openvino_docs_OV_Converter_UG_prepare_model_convert_model_Convert_Model_IR>` (a set of ``.xml`` and ``.bin`` files),
the model may be deployed with maximum performance. Because it is already optimized
for OpenVINO inference, it can be read, compiled, and inferred with no additional delay.
for `OpenVINO inference <openvino_docs_OV_UG_Integrate_OV_with_your_application>`__, it can be read, compiled, and inferred with no additional delay.
.. note::
@@ -51,13 +48,37 @@ and `Torchvision models <https://pytorch.org/hub/>`__. Now you have two options:
with ``openvino.tools.mo.convert_model`` or the ``mo`` CLI tool.
For more details, see the :doc:`Model Conversion API Transition Guide <openvino_docs_OV_Converter_UG_prepare_model_convert_model_MO_OVC_transition>`.
For PyTorch models, `Python API <#convert-a-model-with-python-convert-model>`__ is the only conversion option.
TensorFlow may present additional considerations; see :doc:`TensorFlow Frontend Capabilities and Limitations <openvino_docs_MO_DG_TensorFlow_Frontend>`.
Model States
##############################################
There are three states a model in OpenVINO can be in: saved on disk, loaded but not compiled (``ov.Model``), or loaded and compiled (``ov.CompiledModel``).
| **Saved on disk**
| A model in this state consists of one or more files that fully represent the neural network. A model can be stored in different ways. For example:
| OpenVINO IR: pair of .xml and .bin files
| ONNX: .onnx file
| TensorFlow: directory with a .pb file and two subfolders or just a .pb file
| TensorFlow Lite: .tflite file
| PaddlePaddle: .pdmodel file
| **Loaded but not compiled**
| A model object (``ov.Model``) is created in memory either by parsing a file or converting an existing framework object. Inference cannot be done with this object yet, as it is not attached to any specific device, but it allows customizations such as reshaping its inputs, applying quantization, or even adding preprocessing steps before the model is compiled.
| **Loaded and compiled**
| This state is achieved when one or more devices are specified for a model object to run on (``ov.CompiledModel``), allowing device optimizations to be made and enabling inference.
For more information on each function, see the :doc:`OpenVINO workflow <openvino_workflow>` page.
Convert a Model with Python: ``convert_model``
##############################################
The Model conversion API in Python uses the ``openvino.convert_model`` function,
turning a given model into the `openvino.Model <api/ie_python_api/_autosummary/openvino.runtime.Model.html>`__
object and loading it to memory. Now it can be: saved to a drive with `openvino.save_model``
object and loading it to memory. Now it can be: saved to a drive with ``openvino.save_model``
or further :doc:`optimized with NNCF <openvino_docs_model_optimization_guide>`
prior to saving.
@@ -233,21 +254,18 @@ Convert a Model in CLI: ``ovc``
``ovc`` is a command-line model converter, combining the ``openvino.convert_model``
and ``openvino.save_model`` functionalities, providing the exact same results, if the same set of
parameters is used for saving into OpenVINO IR. It converts files from one of the
:doc:`supported model formats <Supported_Model_Formats>` to :doc:`OpenVINO IR <openvino_ir>`, which can then be read, compiled,
parameters is used for saving into OpenVINO IR. It converts files from one of the supported formats to :doc:`OpenVINO IR <openvino_docs_OV_Converter_UG_prepare_model_convert_model_Convert_Model_IR>`, which can then be read, compiled,
and run by the final inference application.
.. note::
PyTorch models cannot be converted with ``ovc``, use ``openvino.convert_model`` instead.
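For illustration, a typical invocation might look like the following sketch (``model.onnx`` and the output name are placeholder paths, not files from this guide):

```shell
# Convert a supported framework file to OpenVINO IR from the command line.
# ovc combines openvino.convert_model and openvino.save_model in one step;
# weights are compressed to FP16 by default.
ovc model.onnx --output_model model.xml
```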
Additional Resources
####################
The following articles describe in detail how to obtain and prepare your model depending on the source model type:
* :doc:`Convert different model formats to the ov.Model format <Supported_Model_Formats>`.
* :doc:`Convert different model formats to the ov.Model format <openvino_docs_OV_Converter_UG_prepare_model_convert_model_Convert_Model_IR>`.
* :doc:`Review all available conversion parameters <openvino_docs_OV_Converter_UG_Conversion_Options>`.
To achieve the best model inference performance and more compact OpenVINO IR representation follow:


@@ -1,54 +1,28 @@
.. {#Supported_Model_Formats}
.. {#openvino_docs_OV_Converter_UG_prepare_model_convert_model_Convert_Model_IR}
Supported Model Formats
=======================
Convert to OpenVINO IR
=============================================
.. meta::
:description: Learn about supported model formats and the methods used to convert, read, and compile them in OpenVINO™.
| **OpenVINO IR (Intermediate Representation)**
| The proprietary format of OpenVINO™, benefiting from the full extent of its features.
It is obtained by :doc:`converting a model <openvino_docs_model_processing_introduction>`
from one of the remaining supported formats using the Python model conversion API or the
OpenVINO Converter.
| Consider storing your model in this format to minimize first-inference latency,
perform model optimizations, and save space on your drive, in some cases.
| **PyTorch, TensorFlow, TensorFlow Lite, ONNX, and PaddlePaddle**
| These supported model formats can be read, compiled, and converted to OpenVINO IR,
either automatically or explicitly.
In the Python API, these options are provided as three separate methods:
``read_model()``, ``compile_model()``, and ``convert_model()``.
The ``convert_model()`` method enables you to perform additional adjustments
to the model, such as setting shapes, changing model input types or layouts,
cutting parts of the model, freezing inputs, etc. For a detailed description
of the conversion process, see the
:doc:`model conversion guide <openvino_docs_MO_DG_Deep_Learning_Model_Optimizer_DevGuide>`.
Note that for PyTorch models, Python API
is the only conversion option.
TensorFlow may present additional considerations
:doc:`TensorFlow Frontend Capabilities and Limitations <openvino_docs_MO_DG_TensorFlow_Frontend>`.
:description: Convert models from the original framework to OpenVINO representation.
.. toctree::
:maxdepth: 1
:hidden:
Convert from PyTorch <openvino_docs_OV_Converter_UG_prepare_model_convert_model_Convert_Model_From_PyTorch>
Convert from TensorFlow <openvino_docs_OV_Converter_UG_prepare_model_convert_model_Convert_Model_From_TensorFlow>
Convert from ONNX <openvino_docs_OV_Converter_UG_prepare_model_convert_model_Convert_Model_From_ONNX>
Convert from TensorFlow_Lite <openvino_docs_OV_Converter_UG_prepare_model_convert_model_Convert_Model_From_TensorFlow_Lite>
Convert from PaddlePaddle <openvino_docs_OV_Converter_UG_prepare_model_convert_model_Convert_Model_From_Paddle>
:doc:`IR (Intermediate Representation) <openvino_ir>` is OpenVINO's own format, consisting of ``.xml`` and ``.bin`` files.
Convert the model into OpenVINO IR for `better performance <#ir-conversion-benefits>`__.
Convert Models
##############################################
Here are code examples of how to use these methods with different model formats:
@@ -557,49 +531,57 @@ Here are code examples of how to use these methods with different model formats:
:doc:`article <openvino_docs_MO_DG_prepare_model_convert_model_Convert_Model_From_Paddle>`.
* :doc:`How to convert PyTorch <openvino_docs_OV_Converter_UG_prepare_model_convert_model_Convert_Model_From_PyTorch>`
* :doc:`How to convert ONNX <openvino_docs_OV_Converter_UG_prepare_model_convert_model_Convert_Model_From_ONNX>`
* :doc:`How to convert TensorFlow <openvino_docs_OV_Converter_UG_prepare_model_convert_model_Convert_Model_From_TensorFlow>`
* :doc:`How to convert TensorFlow Lite <openvino_docs_OV_Converter_UG_prepare_model_convert_model_Convert_Model_From_TensorFlow_Lite>`
* :doc:`How to convert PaddlePaddle <openvino_docs_OV_Converter_UG_prepare_model_convert_model_Convert_Model_From_Paddle>`
To choose the best workflow for your application, read the :doc:`Model Preparation section <openvino_docs_model_processing_introduction>`
To choose the best workflow for your application, read the :doc:`Model Preparation section <openvino_docs_model_processing_introduction>`.
Refer to the list of all supported conversion options in :doc:`Conversion Parameters <openvino_docs_OV_Converter_UG_Conversion_Options>`
Refer to the list of all supported conversion options in :doc:`Conversion Parameters <openvino_docs_OV_Converter_UG_Conversion_Options>`.
IR Conversion Benefits
################################################
| **Saving to IR to improve first inference latency**
| When first-inference latency matters, it is better to convert the framework model once rather than every time it is loaded, which may take a while depending on model size. Save the model as an OpenVINO IR with ``save_model`` and then load it with ``read_model`` as needed. This improves the time it takes the model to make the first inference because the conversion step is skipped.
| **Saving to IR in FP16 to save space**
| Saving to IR reduces storage needs, even more so with FP16, which may cut model size by about 50%. This is especially useful for large models, like Llama2-7B.
| **Saving to IR to avoid large dependencies in inference code**
| Frameworks such as TensorFlow and PyTorch tend to be large dependencies (multiple gigabytes), and not all inference environments have enough space to hold them.
| Converting models to OpenVINO IR allows them to be used in an environment where OpenVINO is the only dependency, so much less disk space is needed.
| Loading and compiling with OpenVINO directly usually takes less runtime memory than loading the model in the source framework and then converting and compiling it.
The example below shows how to take advantage of OpenVINO IR: save a model to OpenVINO IR once, then reuse it many times:
.. code-block:: py
# Run once
import openvino as ov
import tensorflow as tf
# 1. Convert model created with TF code
model = tf.keras.applications.resnet50.ResNet50(weights="imagenet")
ov_model = ov.convert_model(model)
# 2. Save model as OpenVINO IR
ov.save_model(ov_model, 'model.xml', compress_to_fp16=True) # enabled by default
# Repeat as needed
import openvino as ov
# 3. Load model from file
core = ov.Core()
ov_model = core.read_model("model.xml")
# 4. Compile model from memory
compiled_model = core.compile_model(ov_model)
Additional Resources
####################


@@ -7,7 +7,7 @@ Conversion Parameters
.. meta::
:description: Model Conversion API provides several parameters to adjust model conversion.
This document describes all available parameters for ``openvino.convert_model``, ``ovc``, and ``openvino.save_model`` without focusing on a particular framework model format. Use this information for your reference as a common description of the conversion API capabilities in general. Part of the options can be not relevant to some specific frameworks. Use :doc:`Supported Model Formats <Supported_Model_Formats>` page for more dedicated framework-dependent tutorials.
This document describes all available parameters for ``openvino.convert_model``, ``ovc``, and ``openvino.save_model`` without focusing on a particular framework model format. Use this information for your reference as a common description of the conversion API capabilities in general. Part of the options can be not relevant to some specific frameworks. Use :doc:`Supported Model Formats <openvino_docs_model_processing_introduction>` page for more dedicated framework-dependent tutorials.
In most cases, when it is required to convert a model, the following simple syntax can be used:


@@ -0,0 +1,12 @@
.. {#pytorch_vision}
PyTorch Vision
=======================
.. meta::
:description: Learn about supported model formats and the methods used to convert, read, and compile them in OpenVINO™.
Image inputs to AI models often need to be preprocessed to have the proper dimensions or data type.
Instead of doing this with another library in an additional pipeline step, you can use OpenVINO's torchvision.transforms support.
It automatically translates a torchvision preprocessing pipeline to OpenVINO operators and embeds them into your OpenVINO model, reducing overall program complexity and allowing additional performance optimizations to take place.


@@ -136,3 +136,4 @@ For example, launch model conversion for the ONNX OCR model and specify a bounda
ovc ocr.onnx --input data[1..3,150,200,1],seq_len[1..3]
In practice, not every model is designed in a way that allows change of input shapes. An attempt to change the shape for such models may lead to an exception during model conversion, later in model inference, or even to wrong results of inference without explicit exception raised. A knowledge about model topology is required to set shapes appropriately.


@@ -0,0 +1,229 @@
.. {#openvino_docs_model_processing_introduction_draft}
Import TensorFlow and PyTorch Models
==============================================
In OpenVINO you can load a model in different formats.
The examples below show how to import TensorFlow and PyTorch models. The models are loaded, converted to OpenVINO format, and compiled for inference in just a few lines of code.
To learn more about how models can be imported in OpenVINO, refer to the :doc:`Model Preparation <openvino_docs_model_processing_introduction>` page.
.. tab-set::
.. tab-item:: Import TensorFlow model
.. code-block:: py
:force:
import openvino as ov
# 1. Compile model from file
core = ov.Core()
compiled_model = core.compile_model("model.pb")
.. tab-item:: Import PyTorch model
.. code-block:: py
import openvino as ov
import torch
# 1. Convert model loaded from PyTorch file
model = torch.load("model.pt")
model.eval()
ov_model = ov.convert_model(model)
# 2. Compile model from memory
core = ov.Core()
compiled_model = core.compile_model(ov_model)
While the above examples provide a simple and straightforward way to import models into OpenVINO, other options offer more customization and flexibility.
TensorFlow Import Options
##############################################
OpenVINO's direct support of TensorFlow allows developers to use their models in an OpenVINO inference pipeline without changes. However, as there are multiple ways of doing this, it may not be clear which approach is best for a given situation. The following diagram aims to simplify this decision, although some additional considerations should be taken into account depending on the use case.
.. image:: _static/images/import_tensorflow.svg
Method 1. Convert using ov.convert_model function (Python only)
---------------------------------------------------------------------
As seen above, if your starting point is a Python object in memory, for example a ``tf.keras.Model`` or ``tf.Module``, a direct way to get the model in OpenVINO is to use ``ov.convert_model``. This method produces an ``ov.Model`` (one of the three states) that can later be reshaped, saved to OpenVINO IR or compiled to do inference. In code it may look as follows:
.. code-block:: py
import openvino as ov
import tensorflow as tf
# 1a. Convert model created with TF code
model = tf.keras.applications.resnet50.ResNet50(weights="imagenet")
ov_model = ov.convert_model(model)
# 1b. Convert model from file
ov_model = ov.convert_model("model.pb")
# 2. Compile model from memory
core = ov.Core()
compiled_model = core.compile_model(ov_model)
Method 2. Convert from file using ov.compile_model function
---------------------------------------------------------------------
If you are starting with a file, you will need to decide whether the model needs to be customized, such as by applying quantization or reshaping its inputs.
If the model does not need to be customized, ``ov.Core.compile_model`` should be used, which reads, converts (if needed) and compiles the model, leaving it ready for inference all in one go. The code should look like this:
.. code-block:: py
import openvino as ov
# 1. Compile model from file
core = ov.Core()
compiled_model = core.compile_model("model.pb")
Method 3. Convert from file using ov.read_model function
---------------------------------------------------------------------
If the model does need to be customized, ``ov.read_model`` can be used as it just returns an ``ov.Model`` ready to be quantized or have its inputs reshaped. (Note: This method also works with the OpenVINO C++ API, so it is useful for developers working in a C++ environment.)
.. code-block:: py
import openvino as ov
# 1. Convert model from file
core = ov.Core()
ov_model = ov.read_model("model.pb")
# 2. Compile model from memory
compiled_model = core.compile_model(ov_model)
Method 4. Convert from file using OpenVINO Model Converter (ovc CLI)
---------------------------------------------------------------------
If the input reshaping is known in advance, or the model has multiple outputs of which only some are required, OpenVINO provides two equivalent ways of applying these changes while converting the model: the CLI command ``ovc`` and the previously mentioned ``ov.convert_model`` (Method 1).
The ``ovc`` tool is similar to ``ov.convert_model``, except it works using the command line rather than a Python environment. It will convert the model to OpenVINO IR format, apply any configurations you specify, and save the converted model to disk. It is useful if you are not working with your model in Python (e.g., if you are developing in a C++ environment) or if you prefer using the command line rather than a Python script.
The code below shows how to convert a model with ovc and then load it for inference:
.. code-block:: sh
# 1. Convert model from file
ovc model.pb
.. code-block:: py
import openvino as ov
# 2. Load model from file
core = ov.Core()
ov_model = core.read_model("model.xml")
# 3. Compile model from memory
compiled_model = core.compile_model(ov_model)
PyTorch Import Options
##############################################
OpenVINO's direct support of PyTorch allows developers to use their models in an OpenVINO inference pipeline without changes. OpenVINO provides multiple ways of using PyTorch models. The following diagram aims to simplify choosing between them, although some additional considerations should be taken into account depending on the use case.
.. image:: _static/images/import_pytorch.svg
PyTorch models can be imported into OpenVINO directly from a Python object. Saved PyTorch files can be used as well. To use a saved PyTorch file, it needs to be loaded in PyTorch first to convert it to a Python object.
Once the model is loaded as a PyTorch Python object, you can decide whether to start using the OpenVINO framework and its features directly or to remain within the PyTorch framework while leveraging optimizations.
Method 1. Convert using ov.convert_model function
---------------------------------------------------------------------
If OpenVINO is preferred, ``ov.convert_model`` is the method to use. It produces an ``ov.Model`` that can later be reshaped, saved to OpenVINO IR, or compiled for inference. In code it may look as follows:
.. code-block:: py
import openvino as ov
import torch
from torchvision.models import resnet50
# 1a. Convert model created with PyTorch code
model = resnet50(weights="DEFAULT")
model.eval()
ov_model = ov.convert_model(model, example_input=torch.rand(1, 3, 224, 224))
# 1b. Convert model loaded from PyTorch file
model = torch.load("model.pt")
model.eval()
ov_model = ov.convert_model(model)
# 2. Compile model from memory
core = ov.Core()
compiled_model = core.compile_model(ov_model)
Note that the need to set ``example_input`` depends on the model used. However, it is recommended to always provide it when available, as it usually leads to a better quality model. For more details, see :doc:`How to convert PyTorch <openvino_docs_OV_Converter_UG_prepare_model_convert_model_Convert_Model_From_PyTorch>`.
Method 2. Use OpenVINO backend in PyTorch
---------------------------------------------------------------------
In case PyTorch syntax is preferred, since PyTorch 2.0 and OpenVINO 2023.1, a PyTorch model can be optimized with OpenVINO by specifying it as a backend in ``torch.compile``.
.. code-block:: py
import openvino.torch
import torch
from torchvision.models import resnet50
# 1a. Compile model created with PyTorch code
model = resnet50(weights="DEFAULT")
model.eval()
compiled_model = torch.compile(model, backend="openvino")
# 1b. Compile model loaded from PyTorch file
model = torch.load("model.pt")
model.eval()
compiled_model = torch.compile(model, backend="openvino")
Method 3. Export model to ONNX and use one of OpenVINO methods
---------------------------------------------------------------------
If neither of these two methods converts the model successfully, there is a third one, which was once the main way of using PyTorch in OpenVINO but is now mainly considered a fallback.
This method consists of exporting a PyTorch model to ONNX and then loading it with the different methods available in OpenVINO. See ONNX, PaddlePaddle and TensorFlow Lite Import Options for more details.
.. code-block:: py
import torch
import openvino as ov
from torchvision.models import resnet50
# 1. Export PyTorch model to ONNX
model = resnet50(weights="DEFAULT")
model.eval()
dummy_input = torch.randn(1,3,224,224)
torch.onnx.export(model, dummy_input, "model.onnx")
# 2. Use an OpenVINO method to read and compile it, for example compile_model
core = ov.Core()
compiled_model = core.compile_model("model.onnx")
Supported Model Formats
---------------------------------------------------------------------
As PyTorch does not have a save format that contains everything needed to reproduce the model without using torch, OpenVINO only supports loading Python objects directly. The support is as follows:
* Python objects
* torch.nn.Module
* torch.jit.ScriptModule
* torch.jit.ScriptFunction
Jupyter Notebook Tutorials
################################################
OpenVINO also provides example notebooks for both frameworks, showing how to load a model and run inference:
* `Convert TensorFlow Models to OpenVINO <notebooks/101-tensorflow-classification-to-openvino-with-output.html>`__
* `Convert PyTorch Models to OpenVINO <notebooks/102-pytorch-onnx-to-openvino-with-output.html>`__


@@ -22,7 +22,7 @@ Running Inference with OpenVINO™
OpenVINO Runtime is a set of C++ libraries with C and Python bindings providing a common API
to deploy inference on the platform of your choice. You can run any of the
:doc:`supported model formats <Supported_Model_Formats>` directly or convert the model
:doc:`supported model formats <openvino_docs_model_processing_introduction>` directly or convert the model
and save it to the :doc:`OpenVINO IR <openvino_ir>` format, for maximum performance.
Why is OpenVINO IR inference faster? Even if you run a supported model directly, it is
@@ -37,7 +37,7 @@ OpenVINO IR provides by far the best first-inference latency scores.
.. note::
For more detailed information on how to convert, read, and compile supported model formats
see the :doc:`Supported Formats article <Supported_Model_Formats>`.
see the :doc:`Model Preparation article <openvino_docs_model_processing_introduction>`.
Note that TensorFlow models can be run using the
:doc:`torch.compile feature <pytorch_2_0_torch_compile>`, as well as the standard ways of


@@ -23,7 +23,7 @@ Performance Hints: Latency and Throughput
As discussed in the :doc:`Optimization Guide <openvino_docs_deployment_optimization_guide_dldt_optimization_guide>` there are a few different metrics associated with inference speed. Throughput and latency are some of the most widely used metrics that measure the overall performance of an application.
Therefore, in order to ease the configuration of the device, OpenVINO offers two dedicated hints, namely ``ov::hint::PerformanceMode::THROUGHPUT`` and ``ov::hint::PerformanceMode::LATENCY``. A special ``ov::hint::PerformanceMode::UNDEFINED`` hint acts the same as specifying no hint.
Therefore, in order to ease the configuration of the device, OpenVINO offers two dedicated hints, namely ``ov::hint::PerformanceMode::THROUGHPUT`` and ``ov::hint::PerformanceMode::LATENCY``.
For more information on conducting performance measurements with the ``benchmark_app``, refer to the last section in this document.


@@ -8,3 +8,4 @@ You can find all the custom actions and their source code [here](../../../../.gi
* Setup Python
* System Info Print
* Smart CI (see details: [feature documentation](./smart_ci.md))


@@ -14,6 +14,7 @@ Welcome to the OpenVINO Developer guide on the GitHub Actions infrastructure. Th
* [Docker images overview](#docker-images)
* [Caches overview](#caches)
* [How to add new tests](#adding-new-tests)
* [Optimizing workflow based on PR changes](#optimizing-workflow-based-on-pr-changes)
## Workflows
@@ -261,6 +262,11 @@ The jobs in the workflows utilize appropriate caches based on a job's needs. Rea
If you would like to add new tests, refer to [this document](./adding_tests.md).
## Optimizing workflow based on PR changes
To optimize the pre-commit workflow by running only the jobs that are actually required to validate the changes in a pull
request, you can use the Smart CI feature. Refer to [this document](./smart_ci.md) to learn more.
## See also
* [GitHub Actions official documentation](https://docs.github.com/en/actions)


@@ -18,7 +18,7 @@ Basic understanding of [GitHub Actions workflows](https://docs.github.com/en/act
## Implementation
Smart CI is implemented as a [custom GitHub Action](https://docs.github.com/en/actions/creating-actions/about-custom-actions)
stored in openvino repository: [.github/actions/smart-ci](../../../.github/actions/smart-ci). In GitHub Actions
stored in openvino repository: [.github/actions/smart-ci](../../../../.github/actions/smart-ci). In GitHub Actions
workflows this action is called as a first step in a separate job:
```yaml
jobs:
@@ -67,7 +67,7 @@ The way how we define product components and "smart" rules for them is described
Smart CI operates based on the set of rules described in two configuration files, stored in openvino repository.
### Product components definition: [.github/labeler.yml](../../../.github/labeler.yml)
### Product components definition: [.github/labeler.yml](../../../../.github/labeler.yml)
This file contains a mapping of source code paths to the corresponding component names. Essentially, this is a configuration
for [actions/labeler](https://github.com/marketplace/actions/labeler?version=v4.3.0) GitHub Action, which we use to
automatically assign labels to pull requests based on PR changeset. We reuse it for Smart CI purposes, so that each
@@ -83,7 +83,7 @@ If PR changes at least one file matching any of the [minimatch glob patterns](ht
above, label "category: CPU" will be assigned to this PR, and GitHub Actions workflows that use Smart CI feature will
consider component named "CPU" changed ("category:" prefix is omitted in component name).
### Definition of dependencies between components: [.github/components.yml](../../../.github/components.yml)
### Definition of dependencies between components: [.github/components.yml](../../../../.github/components.yml)
Some components are not entirely independent, and changes in them may affect other components as well. In this case,
in addition to the validation for the changed component itself (build + tests), validation for dependent components
is also required (either only build or both build and tests). This file describes these relationships between components,
@@ -126,11 +126,11 @@ any of the patterns in labeler.yml configuration.
### Adding a new component
1. Add a new record to [.github/labeler.yml](../../../.github/labeler.yml).
1. Add a new record to [.github/labeler.yml](../../../../.github/labeler.yml).
Root-level key is a component (label) name, and value is a set of globs to define which source code paths are related to
this component. See [labeler usage](https://github.com/marketplace/actions/labeler?version=v4.3.0) to get familiar with
globs syntax.
2. Add a new record to [.github/components.yml](../../../.github/components.yml).
2. Add a new record to [.github/components.yml](../../../../.github/components.yml).
Root-level key is a component name, which is the same as the label name defined in the previous step, but with prefix
"category:" omitted (if any). If there were spaces present in label name - replace them with underscores. Example:
`'category: LP transformations'` in labeler.yml -> `LP_transformations` in components.yml. To fill the value, review
@@ -169,9 +169,9 @@ respective components.
### Adding validation for a component
You may wish to add a new validation job to test your new component, or choose an existing one. For that, go to the
desired workflow in [.github/workflows](../../../.github/workflows) (the main ones are
[linux.yml](../../../.github/workflows/linux.yml), [windows.yml](../../../.github/workflows/windows.yml) and
[mac.yml](../../../.github/workflows/mac.yml)). If Smart CI is enabled for the pipeline, you will find Smart_CI job
desired workflow in [.github/workflows](../../../../.github/workflows) (the main ones are
[linux.yml](../../../../.github/workflows/linux.yml), [windows.yml](../../../../.github/workflows/windows.yml) and
[mac.yml](../../../../.github/workflows/mac.yml)). If Smart CI is enabled for the pipeline, you will find Smart_CI job
in the beginning of the workflow:
```yaml
jobs:
@@ -261,7 +261,7 @@ jobs:
repo_token: ${{ secrets.GITHUB_TOKEN }}
```
If needed, more parameters can be passed to "Get affected components" step, full list is available here:
[.github/actions/smart-ci/action.yml](../../../.github/actions/smart-ci/action.yml).
[.github/actions/smart-ci/action.yml](../../../../.github/actions/smart-ci/action.yml).
After that, you can refer to the outputs from Smart_CI in validation jobs, as described in
[Adding validation for a component](#adding-validation-for-a-component) section. To learn more about the syntax of
@@ -318,7 +318,7 @@ Some components (like NVIDIA plugin or ONNX Runtime) are stored in their own rep
defined via pattern matching on source code in openvino repository, while they still need to be validated together with
core OpenVINO. To add Smart CI rules for such components, skip the first step with modifying labeler configuration
in [Adding a new component](#adding-a-new-component) instruction and go directly to the next step:
1. Add a new record to [.github/components.yml](../../../.github/components.yml),
1. Add a new record to [.github/components.yml](../../../../.github/components.yml),
with empty values for `revalidate` and `build` keys, like that:
```yaml
NEW_EXTERNAL_COMPONENT:


@@ -19,8 +19,6 @@ This document provides description and default values for CMake options that can
* `ON` is default for x86 platforms; `OFF`, otherwise.
* `ENABLE_INTEL_GPU` enables Intel GPU plugin compilation:
* `ON` is default for x86 platforms; not available, otherwise.
* `ENABLE_INTEL_GNA` enables GNA plugin compilation:
* `ON` is default for x86 platforms; not available, otherwise.
* `ENABLE_HETERO` enables HETERO plugin build:
* `ON` is default.
* `ENABLE_MULTI` enables MULTI plugin build:
@@ -58,9 +56,6 @@ This document provides description and default values for CMake options that can
* `ON` if requirements are satisfied (auto-discovered by CMake).
* `ENABLE_TESTS` enables tests compilation:
* `OFF` is default.
* `ENABLE_IR_V7_READER` enables IR v7 reader:
* `ON` is default.
**Note:** must be turned `OFF` when building OpenVINO runtime as static
* `ENABLE_DOCS` enables building the OpenVINO documentation:
* `OFF` is on Debian (Ubuntu) OSes
* `OFF` is in other cases.


@@ -0,0 +1,2 @@
> **NOTE**: This version is pre-release software and has not undergone full release validation or qualification. No support is offered on pre-release software and APIs/behavior are subject to change. It should NOT be incorporated into any production software/solution and instead should be used only for early testing and integration while awaiting a final release version of this software.


@@ -1,6 +1,9 @@
# OpenVINO™ Development Tools
> **NOTE**: OpenVINO™ Development Tools package has been deprecated and will be discontinued with 2025.0 release. To learn more, refer to the [OpenVINO Legacy Features and Components page](https://docs.openvino.ai/2023.3/openvino_legacy_features.html).
<!--- The note below is intended for master branch only for pre-release purpose. Remove it for official releases. --->
> **NOTE**: This version is pre-release software and has not undergone full release validation or qualification. No support is offered on pre-release software and APIs/behavior are subject to change. It should NOT be incorporated into any production software/solution and instead should be used only for early testing and integration while awaiting a final release version of this software.
> **NOTE**: OpenVINO™ Development Tools package has been deprecated and will be discontinued with 2025.0 release. To learn more, refer to the [OpenVINO Legacy Features and Components page](https://docs.openvino.ai/2023.2/openvino_legacy_features.html).
Intel® Distribution of OpenVINO™ toolkit is an open-source toolkit for optimizing and deploying AI inference. It can be used to develop applications and solutions based on deep learning tasks, such as: emulation of human vision, automatic speech recognition, natural language processing, recommendation systems, etc. It provides high-performance and rich deployment options, from edge to cloud.
@@ -118,14 +121,12 @@ For example, to install and configure the components for working with TensorFlow
| Component | Console Script | Description |
|------------------|---------------------------------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| [Legacy Model conversion API](https://docs.openvino.ai/2023.3/openvino_docs_MO_DG_Deep_Learning_Model_Optimizer_DevGuide.html) | `mo` |**Model conversion API** imports, converts, and optimizes models that were trained in popular frameworks to a format usable by OpenVINO components. <br>Supported frameworks include Caffe\*, TensorFlow\*, MXNet\*, PaddlePaddle\*, and ONNX\*. | |
| [Accuracy Checker](https://docs.openvino.ai/2023.3/omz_tools_accuracy_checker.html) and <br> [Annotation Converter](https://docs.openvino.ai/2023.3/omz_tools_accuracy_checker_annotation_converters.html) | `accuracy_check` <br> `convert_annotation` |**Accuracy Checker** is a deep learning accuracy validation tool that allows you to collect accuracy metrics against popular datasets. The main advantages of the tool are the flexibility of configuration and a set of supported datasets, preprocessing, postprocessing, and metrics. <br> **Annotation Converter** is a utility that prepares datasets for evaluation with Accuracy Checker. |
| [Post-Training Optimization Tool](https://docs.openvino.ai/2023.3/pot_introduction.html)| `pot` |**Post-Training Optimization Tool** allows you to optimize trained models with advanced capabilities, such as quantization and low-precision optimizations, without the need to retrain or fine-tune models. |
| [Model Downloader and other Open Model Zoo tools](https://docs.openvino.ai/2023.3/omz_tools_downloader.html)| `omz_downloader` <br> `omz_converter` <br> `omz_quantizer` <br> `omz_info_dumper`| **Model Downloader** is a tool for getting access to the collection of high-quality and extremely fast pre-trained deep learning [public](@ref omz_models_group_public) and [Intel](@ref omz_models_group_intel)-trained models. These free pre-trained models can be used to speed up the development and production deployment process without training your own models. The tool downloads model files from online sources and, if necessary, patches them to make them more usable with model conversion API. A number of additional tools are also provided to automate the process of working with downloaded models:<br> **Model Converter** is a tool for converting Open Model Zoo models that are stored in an original deep learning framework format into the OpenVINO Intermediate Representation (IR) using model conversion API. <br> **Model Quantizer** is a tool for automatic quantization of full-precision models in the IR format into low-precision versions using the Post-Training Optimization Tool. <br> **Model Information Dumper** is a helper utility for dumping information about the models to a stable, machine-readable format. |
| [Legacy Model conversion API](https://docs.openvino.ai/nightly/openvino_docs_MO_DG_Deep_Learning_Model_Optimizer_DevGuide.html) | `mo` |**Model conversion API** imports, converts, and optimizes models that were trained in popular frameworks to a format usable by OpenVINO components. <br>Supported frameworks include Caffe\*, TensorFlow\*, MXNet\*, PaddlePaddle\*, and ONNX\*. | |
| [Model Downloader and other Open Model Zoo tools](https://docs.openvino.ai/nightly/omz_tools_downloader.html)| `omz_downloader` <br> `omz_converter` <br> `omz_quantizer` <br> `omz_info_dumper`| **Model Downloader** is a tool for getting access to the collection of high-quality and extremely fast pre-trained deep learning [public](@ref omz_models_group_public) and [Intel](@ref omz_models_group_intel)-trained models. These free pre-trained models can be used to speed up the development and production deployment process without training your own models. The tool downloads model files from online sources and, if necessary, patches them to make them more usable with model conversion API. A number of additional tools are also provided to automate the process of working with downloaded models:<br> **Model Converter** is a tool for converting Open Model Zoo models that are stored in an original deep learning framework format into the OpenVINO Intermediate Representation (IR) using model conversion API. <br> **Model Quantizer** is a tool for automatic quantization of full-precision models in the IR format into low-precision versions using the Post-Training Optimization Tool. <br> **Model Information Dumper** is a helper utility for dumping information about the models to a stable, machine-readable format. |
## Troubleshooting
For general troubleshooting steps and issues, see [Troubleshooting Guide for OpenVINO Installation](https://docs.openvino.ai/2023.3/openvino_docs_get_started_guide_troubleshooting.html). The following sections also provide explanations to several error messages.
For general troubleshooting steps and issues, see [Troubleshooting Guide for OpenVINO Installation](https://docs.openvino.ai/2023.2/openvino_docs_get_started_guide_troubleshooting.html). The following sections also provide explanations to several error messages.
### Errors with Installing via PIP for Users in China


@@ -1,8 +1,11 @@
# OpenVINO™
<!--- The note below is intended for master branch only for pre-release purpose. Remove it for official releases. --->
> **NOTE**: This version is pre-release software and has not undergone full release validation or qualification. No support is offered on pre-release software and APIs/behavior are subject to change. It should NOT be incorporated into any production software/solution and instead should be used only for early testing and integration while awaiting a final release version of this software.
Intel® Distribution of OpenVINO™ toolkit is an open-source toolkit for optimizing and deploying AI inference. It can be used to develop applications and solutions based on deep learning tasks, such as: emulation of human vision, automatic speech recognition, natural language processing, recommendation systems, etc. It provides high-performance and rich deployment options, from edge to cloud.
If you have already finished developing your models and converting them to the OpenVINO model format, you can install OpenVINO Runtime to deploy your applications on various devices. The [OpenVINO™](https://docs.openvino.ai/2023.3/openvino_docs_OV_UG_OV_Runtime_User_Guide.html) Python package includes a set of libraries for an easy inference integration with your products.
If you have already finished developing your models and converting them to the OpenVINO model format, you can install OpenVINO Runtime to deploy your applications on various devices. The [OpenVINO™](https://docs.openvino.ai/2023.2/openvino_docs_OV_UG_OV_Runtime_User_Guide.html) Python package includes a set of libraries for an easy inference integration with your products.
## System Requirements
@@ -72,13 +75,13 @@ If installation was successful, you will see the list of available devices.
| Component | Content | Description |
|------------------|---------------------------------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| [OpenVINO Runtime](https://docs.openvino.ai/2023.3/openvino_docs_OV_UG_OV_Runtime_User_Guide.html) | `openvino package` |**OpenVINO Runtime** is a set of C++ libraries with C and Python bindings providing a common API to deliver inference solutions on the platform of your choice. Use the OpenVINO Runtime API to read PyTorch\*, TensorFlow\*, TensorFlow Lite\*, ONNX\*, and PaddlePaddle\* models and execute them on preferred devices. OpenVINO Runtime uses a plugin architecture and includes the following plugins: [CPU](https://docs.openvino.ai/2023.3/openvino_docs_OV_UG_supported_plugins_CPU.html), [GPU](https://docs.openvino.ai/2023.3/openvino_docs_OV_UG_supported_plugins_GPU.html), [Auto Batch](https://docs.openvino.ai/2023.3/openvino_docs_OV_UG_Automatic_Batching.html), [Auto](https://docs.openvino.ai/2023.3/openvino_docs_OV_UG_supported_plugins_AUTO.html), [Hetero](https://docs.openvino.ai/2023.3/openvino_docs_OV_UG_Hetero_execution.html).
| [OpenVINO Model Converter (OVC)](https://docs.openvino.ai/2023.3/openvino_docs_model_processing_introduction.html#convert-a-model-in-cli-ovc) | `ovc` |**OpenVINO Model Converter** converts models that were trained in popular frameworks to a format usable by OpenVINO components. <br>Supported frameworks include ONNX\*, TensorFlow\*, TensorFlow Lite\*, and PaddlePaddle\*. |
| [Benchmark Tool](https://docs.openvino.ai/2023.3/openvino_inference_engine_tools_benchmark_tool_README.html)| `benchmark_app` | **Benchmark Application** allows you to estimate deep learning inference performance on supported devices for synchronous and asynchronous modes. |
| [OpenVINO Runtime](https://docs.openvino.ai/2023.2/openvino_docs_OV_UG_OV_Runtime_User_Guide.html) | `openvino package` |**OpenVINO Runtime** is a set of C++ libraries with C and Python bindings providing a common API to deliver inference solutions on the platform of your choice. Use the OpenVINO Runtime API to read PyTorch\*, TensorFlow\*, TensorFlow Lite\*, ONNX\*, and PaddlePaddle\* models and execute them on preferred devices. OpenVINO Runtime uses a plugin architecture and includes the following plugins: [CPU](https://docs.openvino.ai/2023.2/openvino_docs_OV_UG_supported_plugins_CPU.html), [GPU](https://docs.openvino.ai/2023.2/openvino_docs_OV_UG_supported_plugins_GPU.html), [Auto Batch](https://docs.openvino.ai/2023.2/openvino_docs_OV_UG_Automatic_Batching.html), [Auto](https://docs.openvino.ai/2023.2/openvino_docs_OV_UG_supported_plugins_AUTO.html), [Hetero](https://docs.openvino.ai/2023.2/openvino_docs_OV_UG_Hetero_execution.html).
| [OpenVINO Model Converter (OVC)](https://docs.openvino.ai/2023.2/openvino_docs_model_processing_introduction.html#convert-a-model-in-cli-ovc) | `ovc` |**OpenVINO Model Converter** converts models that were trained in popular frameworks to a format usable by OpenVINO components. <br>Supported frameworks include ONNX\*, TensorFlow\*, TensorFlow Lite\*, and PaddlePaddle\*. |
| [Benchmark Tool](https://docs.openvino.ai/2023.2/openvino_inference_engine_tools_benchmark_tool_README.html)| `benchmark_app` | **Benchmark Application** allows you to estimate deep learning inference performance on supported devices for synchronous and asynchronous modes. |
## Troubleshooting
For general troubleshooting steps and issues, see [Troubleshooting Guide for OpenVINO Installation](https://docs.openvino.ai/2023.3/openvino_docs_get_started_guide_troubleshooting.html). The following sections also provide explanations to several error messages.
For general troubleshooting steps and issues, see [Troubleshooting Guide for OpenVINO Installation](https://docs.openvino.ai/2023.2/openvino_docs_get_started_guide_troubleshooting.html). The following sections also provide explanations to several error messages.
### Errors with Installing via PIP for Users in China

View File

@@ -31,14 +31,12 @@ The default architecture of OpenVINO Runtime assumes that the following componen
* (Device) Inference backends (CPU, GPU, MULTI, HETERO, etc.)
* (Model) Frontends (IR, ONNX, PDPD, etc.)
* Preprocessing library (to perform preprocessing, e.g. resize and color space conversions)
* IR v7 reader (used in legacy tests only; if you are not going to run OpenVINO tests, set `-DENABLE_TESTS=OFF`, which disables the IR v7 reader)
With the static OpenVINO Runtime, all these modules should be linked into a final user application and **the list of modules/configuration must be known for the CMake configuration stage**. To minimize the total binary size, you can explicitly turn `OFF` unnecessary components. Use [[CMake Options for Custom Compilation|CMakeOptionsForCustomCompilation]] as a reference for OpenVINO CMake configuration.
For example, to enable only IR v11 reading and CPU inference capabilities, use:
```sh
cmake -DENABLE_INTEL_GPU=OFF \
-DENABLE_INTEL_GNA=OFF \
-DENABLE_TEMPLATE=OFF \
-DENABLE_HETERO=OFF \
-DENABLE_MULTI=OFF \
@@ -49,7 +47,6 @@ cmake -DENABLE_INTEL_GPU=OFF \
-DENABLE_OV_TF_FRONTEND=OFF \
-DENABLE_OV_TF_LITE_FRONTEND=OFF \
-DENABLE_OV_PYTORCH_FRONTEND=OFF \
-DENABLE_IR_V7_READER=OFF \
-DENABLE_GAPI_PREPROCESSING=OFF \
-DENABLE_INTEL_CPU=ON \
-DENABLE_OV_IR_FRONTEND=ON
@@ -135,7 +132,6 @@ cmake -DCMAKE_TOOLCHAIN_FILE=<openvino source dir>/cmake/toolchains/mt.runtime.w
* The enabled and tested capabilities of OpenVINO Runtime in a static build:
* OpenVINO common runtime - work with `ov::Model`, perform model loading on a particular device
* CPU and GNA inference plugins (**GPU is not enabled**)
* MULTI, HETERO, AUTO, and BATCH inference modes
* IR, ONNX, PDPD, and TF frontends to read `ov::Model`
* Static build support for building static libraries only for OpenVINO Runtime libraries. All other third-party prebuilt dependencies remain in the same format:

View File

@@ -61,40 +61,40 @@ OpenVINO 2023.2
See latest benchmark numbers for OpenVINO and OpenVINO Model Server
.. grid-item-card:: Flexible Workflow
:link: Supported_Model_Formats
.. grid-item-card:: Work with Multiple Model Formats
:link: openvino_docs_model_processing_introduction
:link-alt: Supported Model Formats
:link-type: doc
Load models directly (for TensorFlow, ONNX, PaddlePaddle) or convert to the OpenVINO format.
OpenVINO supports different model formats: PyTorch, TensorFlow, TensorFlow Lite, ONNX, and PaddlePaddle.
.. grid-item-card:: Deploy at Scale With OpenVINO Model Server
.. grid-item-card:: Deploy at Scale with OpenVINO Model Server
:link: ovms_what_is_openvino_model_server
:link-alt: model server
:link-type: doc
Cloud-ready deployments for microservice applications
.. grid-item-card:: Model Optimization
.. grid-item-card:: Optimize Models
:link: openvino_docs_model_optimization_guide
:link-alt: model optimization
:link-type: doc
Reach for performance with post-training and training-time compression with NNCF
Boost performance using quantization and compression with NNCF
.. grid-item-card:: PyTorch 2.0 - torch.compile() backend
.. grid-item-card:: Use OpenVINO with PyTorch Apps with torch.compile()
:link: pytorch_2_0_torch_compile
:link-alt: torch.compile
:link-type: doc
Optimize generation of the graph model with PyTorch 2.0 torch.compile() backend
.. grid-item-card:: Generative AI optimization and deployment
.. grid-item-card:: Optimize and Deploy Generative AI
:link: gen_ai_guide
:link-alt: gen ai
:link-type: doc
Generative AI optimization and deployment
Enhance the efficiency of Generative AI
Feature Overview

View File

@@ -246,7 +246,7 @@ for that property.
GPU_QUEUE_THROTTLE : Priority.MEDIUM
GPU_ENABLE_LOOP_UNROLLING : True
CACHE_DIR :
PERFORMANCE_HINT : PerformanceMode.UNDEFINED
PERFORMANCE_HINT : PerformanceMode.LATENCY
COMPILATION_NUM_THREADS : 20
NUM_STREAMS : 1
PERFORMANCE_HINT_NUM_REQUESTS : 0

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:7419b60d37a9bc058626c52fcbfec20c3a5d22c6d0875fb84ef0df7ec2a68671
size 142191

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:d5666c2ee7503bc2844a99f73c1b64afacd2c42dadef441ce115cc18b00922c7
size 224644

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:7419b60d37a9bc058626c52fcbfec20c3a5d22c6d0875fb84ef0df7ec2a68671
size 142191

View File

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:d5666c2ee7503bc2844a99f73c1b64afacd2c42dadef441ce115cc18b00922c7
size 224644

View File

@@ -125,43 +125,52 @@ void next_step(const std::string additional_info = "") {
<< (additional_info.empty() ? "" : " (" + additional_info + ")") << std::endl;
}
ov::hint::PerformanceMode get_performance_hint(const std::string& device, const ov::Core& core) {
OPENVINO_SUPPRESS_DEPRECATED_START
ov::hint::PerformanceMode ov_perf_hint = ov::hint::PerformanceMode::UNDEFINED;
OPENVINO_SUPPRESS_DEPRECATED_END
void handle_performance_hint(const std::string& device, const ov::Core& core, ov::AnyMap& config) {
ov::hint::PerformanceMode ov_perf_hint = ov::hint::PerformanceMode::THROUGHPUT;
auto supported_properties = core.get_property(device, ov::supported_properties);
if (std::find(supported_properties.begin(), supported_properties.end(), ov::hint::performance_mode) !=
supported_properties.end()) {
if (FLAGS_hint != "") {
// Use FLAGS_hint to decide performance mode:
//
// "throughput" or "tput": THROUGHPUT mode
// "cumulative_throughput" or "ctput": CUMULATIVE_THROUGHPUT mode
// "latency": LATENCY mode
// "none": not set ov::hint::performance_mode, let plugin use its default performance mode
// "" : use default THROUGHPUT mode, if FLAG_api="sync" then set LATENCY mode
if (FLAGS_hint != "" && FLAGS_hint != "none") {
if (FLAGS_hint == "throughput" || FLAGS_hint == "tput") {
ov_perf_hint = ov::hint::PerformanceMode::THROUGHPUT;
} else if (FLAGS_hint == "latency") {
ov_perf_hint = ov::hint::PerformanceMode::LATENCY;
} else if (FLAGS_hint == "cumulative_throughput" || FLAGS_hint == "ctput") {
ov_perf_hint = ov::hint::PerformanceMode::CUMULATIVE_THROUGHPUT;
} else if (FLAGS_hint == "none") {
OPENVINO_SUPPRESS_DEPRECATED_START
ov_perf_hint = ov::hint::PerformanceMode::UNDEFINED;
OPENVINO_SUPPRESS_DEPRECATED_END
} else {
throw std::logic_error(
"Incorrect performance hint. Please set -hint option to"
"`throughput`(tput), `latency', 'cumulative_throughput'(ctput) value or 'none'.");
}
} else {
ov_perf_hint =
FLAGS_api == "async" ? ov::hint::PerformanceMode::THROUGHPUT : ov::hint::PerformanceMode::LATENCY;
} else if (FLAGS_hint == "") {
ov_perf_hint = ov::hint::PerformanceMode::THROUGHPUT;
if (FLAGS_api == "sync") {
ov_perf_hint = ov::hint::PerformanceMode::LATENCY;
}
slog::warn << "Performance hint was not explicitly specified in command line. "
"Device("
<< device << ") performance hint will be set to " << ov_perf_hint << "." << slog::endl;
}
if (FLAGS_hint != "none") {
// apply command line hint setting and override if hint exists
config[ov::hint::performance_mode.name()] = ov_perf_hint;
} else {
config.erase(ov::hint::performance_mode.name());
}
} else {
if (FLAGS_hint != "") {
if (FLAGS_hint != "none" || FLAGS_hint != "") {
slog::warn << "Device(" << device << ") does not support performance hint property(-hint)." << slog::endl;
}
}
return ov_perf_hint;
return;
}
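Stripped of the OpenVINO types, the resolution order implemented above (an explicit `-hint` value wins, an empty hint falls back to THROUGHPUT or LATENCY depending on `-api`, and `none` leaves the plugin default untouched) can be sketched as a standalone function. The names below are illustrative only, not the actual benchmark_app code:

```cpp
#include <cassert>
#include <optional>
#include <stdexcept>
#include <string>

enum class Mode { THROUGHPUT, LATENCY, CUMULATIVE_THROUGHPUT };

// Illustrative re-implementation of the hint-resolution rules described in the
// comment block above. std::nullopt stands for "none": do not set
// ov::hint::performance_mode and keep the plugin's own default.
std::optional<Mode> resolve_hint(const std::string& hint, const std::string& api) {
    if (hint == "throughput" || hint == "tput")
        return Mode::THROUGHPUT;
    if (hint == "latency")
        return Mode::LATENCY;
    if (hint == "cumulative_throughput" || hint == "ctput")
        return Mode::CUMULATIVE_THROUGHPUT;
    if (hint == "none")
        return std::nullopt;
    if (hint.empty())  // default is THROUGHPUT, except for the synchronous API
        return api == "sync" ? Mode::LATENCY : Mode::THROUGHPUT;
    throw std::logic_error("Incorrect performance hint: " + hint);
}
```

With this shape, the caller (as in the refactored `handle_performance_hint`) either writes the resolved mode into the device config or erases the key when no mode is returned.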
void setDeviceProperty(ov::Core& core,
@@ -367,20 +376,7 @@ int main(int argc, char* argv[]) {
// Update config per device according to command line parameters
for (auto& device : devices) {
auto& device_config = config[device];
auto ov_perf_hint = get_performance_hint(device, core);
OPENVINO_SUPPRESS_DEPRECATED_START
if (isFlagSetInCommandLine("hint")) {
if (ov_perf_hint != ov::hint::PerformanceMode::UNDEFINED) {
// apply command line hint setting and override if hint exists
device_config[ov::hint::performance_mode.name()] = ov_perf_hint;
} else {
device_config.erase(ov::hint::performance_mode.name());
}
} else if (ov_perf_hint != ov::hint::PerformanceMode::UNDEFINED) {
// keep hint setting in the config if no hint setting from command line
device_config.emplace(ov::hint::performance_mode(ov_perf_hint));
}
OPENVINO_SUPPRESS_DEPRECATED_END
handle_performance_hint(device, core, device_config);
if (FLAGS_nireq != 0)
device_config[ov::hint::num_requests.name()] = unsigned(FLAGS_nireq);
@@ -443,8 +439,7 @@ int main(int argc, char* argv[]) {
"<dev1>:<nstreams1>,<dev2>:<nstreams2>" +
" or via configuration file.");
}
} else if (ov_perf_hint == ov::hint::PerformanceMode::UNDEFINED && !device_config.count(key) &&
(FLAGS_api == "async")) {
} else if (FLAGS_hint == "none" && !device_config.count(key) && (FLAGS_api == "async")) {
slog::warn << "-nstreams default value is determined automatically for " << device
<< " device. "
"Although the automatic selection usually provides a "

View File

@@ -1,38 +0,0 @@
# Copyright (C) 2018-2023 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
#
file (GLOB SRC ${CMAKE_CURRENT_SOURCE_DIR}/*.cpp)
file (GLOB HDR ${CMAKE_CURRENT_SOURCE_DIR}/*.hpp
               ${CMAKE_CURRENT_SOURCE_DIR}/*.h)
# Required zlib and cnpy dependencies
if(NOT TARGET ZLIB::ZLIB)
if(EXISTS "${Samples_SOURCE_DIR}/thirdparty/zlib")
# OpenVINO package puts thirdparty to samples dir
add_subdirectory("${Samples_SOURCE_DIR}/thirdparty/zlib"
"${Samples_BINARY_DIR}/thirdparty/zlib" EXCLUDE_FROM_ALL)
elseif(EXISTS "${Samples_SOURCE_DIR}/../../thirdparty/zlib")
# Allow running samples CMakeLists.txt as stand alone from openvino sources
add_subdirectory("${Samples_SOURCE_DIR}/../../thirdparty/zlib"
"${Samples_BINARY_DIR}/thirdparty/zlib" EXCLUDE_FROM_ALL)
endif()
endif()
if(EXISTS "${Samples_SOURCE_DIR}/thirdparty/cnpy")
# OpenVINO package puts thirdparty to samples dir
add_subdirectory("${Samples_SOURCE_DIR}/thirdparty/cnpy"
"${Samples_BINARY_DIR}/thirdparty/cnpy" EXCLUDE_FROM_ALL)
elseif(EXISTS "${Samples_SOURCE_DIR}/../../thirdparty/cnpy" AND NOT TARGET cnpy)
# Allow running samples CMakeLists.txt as stand alone from openvino sources
add_subdirectory("${Samples_SOURCE_DIR}/../../thirdparty/cnpy"
"${Samples_BINARY_DIR}/thirdparty/cnpy" EXCLUDE_FROM_ALL)
endif()
# add sample
ov_add_sample(NAME speech_sample
SOURCES ${SRC}
HEADERS ${HDR}
DEPENDENCIES ${GFLAGS_TARGET} cnpy ie_samples_utils)

View File

@@ -1,38 +0,0 @@
# Automatic Speech Recognition C++ Sample
> **NOTE**: This sample is being deprecated and will no longer be maintained after OpenVINO 2023.2 (LTS). The main reasons are the outdated state of the sample and its extensive usage of GNA, which will not be supported by OpenVINO beyond 2023.2.
This sample demonstrates how to execute asynchronous inference of an acoustic model based on Kaldi\* neural networks and speech feature vectors.
The sample works with Kaldi ARK or Numpy* uncompressed NPZ files, so it does not cover an end-to-end speech recognition (speech-to-text) scenario: additional preprocessing (feature extraction) is required to get a feature vector from a speech signal, as well as postprocessing (decoding) to produce text from scores.
For more detailed information on how this sample works, check the dedicated [article](https://docs.openvino.ai/2023.2/openvino_inference_engine_samples_speech_sample_README.html).
## Requirements
| Options | Values |
| ---------------------------| -----------------------------------------------------------------------------------------------------------------------------------------|
| Validated Models | Acoustic model based on Kaldi\* neural networks (see |
| | [Model Preparation](https://docs.openvino.ai/2023.2/openvino_inference_engine_samples_speech_sample_README.html) |
| | section) |
| Model Format | OpenVINO™ toolkit Intermediate Representation (*.xml + *.bin) |
| Supported devices | See [Execution Modes](https://docs.openvino.ai/2023.2/openvino_inference_engine_samples_speech_sample_README.html#execution-modes) |
| | section below and [List Supported Devices](https://docs.openvino.ai/2023.2/openvino_docs_OV_UG_supported_plugins_Supported_Devices.html) |
The following C++ API is used in the application:
| Feature | API | Description |
| -------------------------| ------------------------------------------------------------------------------|------------------------------------------------------------------------------|
| Available Devices | ``ov::Core::get_available_devices``, ``ov::Core::get_property`` | Get information of the devices for inference |
| Import/Export Model | ``ov::Core::import_model``, ``ov::CompiledModel::export_model`` | The GNA plugin supports loading and saving of the GNA-optimized model |
| Model Operations | ``ov::set_batch``, ``ov::Model::add_output``, ``ov::CompiledModel::inputs``, | |
| | ``ov::CompiledModel::outputs`` | Manage the model: configure batch_size, input and output tensors |
| Node Operations | ``ov::OutputVector::size``, ``ov::Output::get_shape`` | Get node shape |
| Asynchronous Infer | ``ov::InferRequest::start_async``, ``ov::InferRequest::wait`` | Run asynchronous inference and wait until the inference result becomes available |
| InferRequest Operations | ``ov::InferRequest::query_state``, ``ov::VariableState::reset`` | Gets and resets CompiledModel state control |
| Tensor Operations | ``ov::Tensor::get_size``, ``ov::Tensor::data``, | |
| | ``ov::InferRequest::get_tensor`` | Get a tensor, its size and data |
| Profiling | ``ov::InferRequest::get_profiling_info`` | Get infer request profiling info |
Basic OpenVINO™ Runtime API is covered by [Hello Classification C++ sample](https://docs.openvino.ai/2023.2/openvino_inference_engine_samples_hello_classification_README.html).

View File

@@ -1,178 +0,0 @@
// Copyright (C) 2018-2023 Intel Corporation
// SPDX-License-Identifier: Apache-2.0
//
#include "fileutils.hpp"
void ArkFile::get_file_info(const char* fileName,
uint32_t numArrayToFindSize,
uint32_t* ptrNumArrays,
uint32_t* ptrNumMemoryBytes) {
uint32_t numArrays = 0;
uint32_t numMemoryBytes = 0;
std::ifstream in_file(fileName, std::ios::binary);
if (in_file.good()) {
while (!in_file.eof()) {
std::string line;
uint32_t numRows = 0u, numCols = 0u, num_bytes = 0u;
std::getline(in_file, line, '\0'); // read variable length name followed by space and NUL
std::getline(in_file, line, '\4'); // read "BFM" followed by space and control-D
if (line.compare("BFM ") != 0) {
break;
}
in_file.read(reinterpret_cast<char*>(&numRows), sizeof(uint32_t)); // read number of rows
std::getline(in_file, line, '\4'); // read control-D
in_file.read(reinterpret_cast<char*>(&numCols), sizeof(uint32_t)); // read number of columns
num_bytes = numRows * numCols * sizeof(float);
in_file.seekg(num_bytes, in_file.cur); // read data
if (numArrays == numArrayToFindSize) {
numMemoryBytes += num_bytes;
}
numArrays++;
}
in_file.close();
} else {
throw std::runtime_error(std::string("Failed to open %s for reading in get_file_info()!\n") + fileName);
}
if (ptrNumArrays != NULL)
*ptrNumArrays = numArrays;
if (ptrNumMemoryBytes != NULL)
*ptrNumMemoryBytes = numMemoryBytes;
}
void ArkFile::load_file(const char* fileName,
uint32_t arrayIndex,
std::string& ptrName,
std::vector<uint8_t>& memory,
uint32_t* ptrNumRows,
uint32_t* ptrNumColumns,
uint32_t* ptrNumBytesPerElement) {
std::ifstream in_file(fileName, std::ios::binary);
if (in_file.good()) {
uint32_t i = 0;
while (i < arrayIndex) {
std::string line;
uint32_t numRows = 0u, numCols = 0u;
std::getline(in_file, line, '\0'); // read variable length name followed by space and NUL
std::getline(in_file, line, '\4'); // read "BFM" followed by space and control-D
if (line.compare("BFM ") != 0) {
break;
}
in_file.read(reinterpret_cast<char*>(&numRows), sizeof(uint32_t)); // read number of rows
std::getline(in_file, line, '\4'); // read control-D
in_file.read(reinterpret_cast<char*>(&numCols), sizeof(uint32_t)); // read number of columns
in_file.seekg(numRows * numCols * sizeof(float), in_file.cur); // read data
i++;
}
if (!in_file.eof()) {
std::string line;
std::getline(in_file, ptrName, '\0'); // read variable length name followed by space and NUL
std::getline(in_file, line, '\4'); // read "BFM" followed by space and control-D
if (line.compare("BFM ") != 0) {
throw std::runtime_error(std::string("Cannot find array specifier in file %s in load_file()!\n") +
fileName);
}
in_file.read(reinterpret_cast<char*>(ptrNumRows), sizeof(uint32_t)); // read number of rows
std::getline(in_file, line, '\4'); // read control-D
in_file.read(reinterpret_cast<char*>(ptrNumColumns), sizeof(uint32_t)); // read number of columns
in_file.read(reinterpret_cast<char*>(&memory.front()),
*ptrNumRows * *ptrNumColumns * sizeof(float)); // read array data
}
in_file.close();
} else {
throw std::runtime_error(std::string("Failed to open %s for reading in load_file()!\n") + fileName);
}
*ptrNumBytesPerElement = sizeof(float);
}
void ArkFile::save_file(const char* fileName,
bool shouldAppend,
std::string name,
void* ptrMemory,
uint32_t numRows,
uint32_t numColumns) {
std::ios_base::openmode mode = std::ios::binary;
if (shouldAppend) {
mode |= std::ios::app;
}
std::ofstream out_file(fileName, mode);
if (out_file.good()) {
out_file.write(name.c_str(), name.length()); // write name
out_file.write("\0", 1);
out_file.write("BFM ", 4);
out_file.write("\4", 1);
out_file.write(reinterpret_cast<char*>(&numRows), sizeof(uint32_t));
out_file.write("\4", 1);
out_file.write(reinterpret_cast<char*>(&numColumns), sizeof(uint32_t));
out_file.write(reinterpret_cast<char*>(ptrMemory), numRows * numColumns * sizeof(float));
out_file.close();
} else {
throw std::runtime_error(std::string("Failed to open %s for writing in save_file()!\n") + fileName);
}
}
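The on-disk ARK record layout that `get_file_info`, `load_file`, and `save_file` above walk through (NUL-terminated name, the literal `"BFM "` marker terminated by control-D, a `uint32` row count, control-D, a `uint32` column count, then raw float data) can be exercised in memory with `std::stringstream`. This is a simplified sketch of the same framing, not the sample's actual I/O code:

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// Write one ARK-style record: name '\0' "BFM " '\4' rows '\4' cols data.
void write_record(std::ostream& out, const std::string& name,
                  uint32_t rows, uint32_t cols, const std::vector<float>& data) {
    out.write(name.c_str(), name.size());
    out.write("\0", 1);
    out.write("BFM ", 4);
    out.write("\4", 1);
    out.write(reinterpret_cast<const char*>(&rows), sizeof(uint32_t));
    out.write("\4", 1);
    out.write(reinterpret_cast<const char*>(&cols), sizeof(uint32_t));
    out.write(reinterpret_cast<const char*>(data.data()), data.size() * sizeof(float));
}

// Read one record back, mirroring the getline('\0') / getline('\4') parsing above.
bool read_record(std::istream& in, std::string& name,
                 uint32_t& rows, uint32_t& cols, std::vector<float>& data) {
    std::string marker;
    if (!std::getline(in, name, '\0'))  // variable-length name up to NUL
        return false;
    std::getline(in, marker, '\4');     // expect "BFM " followed by control-D
    if (marker != "BFM ")
        return false;
    in.read(reinterpret_cast<char*>(&rows), sizeof(uint32_t));
    std::getline(in, marker, '\4');     // consume the control-D separator
    in.read(reinterpret_cast<char*>(&cols), sizeof(uint32_t));
    data.resize(static_cast<size_t>(rows) * cols);
    in.read(reinterpret_cast<char*>(data.data()), data.size() * sizeof(float));
    return static_cast<bool>(in);
}
```

Round-tripping a record through a `std::stringstream` reproduces the byte sequence that `ArkFile::save_file` emits and `ArkFile::load_file` consumes.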
void NumpyFile::get_file_info(const char* fileName,
uint32_t numArrayToFindSize,
uint32_t* ptrNumArrays,
uint32_t* ptrNumMemoryBytes) {
uint32_t numArrays = 0;
uint32_t numMemoryBytes = 0;
cnpy::npz_t my_npz1 = cnpy::npz_load(fileName);
auto it = my_npz1.begin();
std::advance(it, numArrayToFindSize);
if (it != my_npz1.end()) {
numArrays = my_npz1.size();
cnpy::NpyArray my_npy = it->second;
numMemoryBytes = my_npy.data_holder->size();
if (ptrNumArrays != NULL)
*ptrNumArrays = numArrays;
if (ptrNumMemoryBytes != NULL)
*ptrNumMemoryBytes = numMemoryBytes;
} else {
throw std::runtime_error(std::string("Failed to get info %s get_file_info()!\n") + fileName);
}
}
void NumpyFile::load_file(const char* fileName,
uint32_t arrayIndex,
std::string& ptrName,
std::vector<uint8_t>& memory,
uint32_t* ptrNumRows,
uint32_t* ptrNumColumns,
uint32_t* ptrNumBytesPerElement) {
cnpy::npz_t my_npz1 = cnpy::npz_load(fileName);
auto it = my_npz1.begin();
std::advance(it, arrayIndex);
if (it != my_npz1.end()) {
ptrName = it->first;
cnpy::NpyArray my_npy = it->second;
*ptrNumRows = my_npy.shape[0];
*ptrNumColumns = my_npy.shape[1];
for (size_t i = 0; i < my_npy.data_holder->size(); i++) {
memory.at(i) = my_npy.data_holder->at(i);
}
*ptrNumBytesPerElement = sizeof(float);
} else {
throw std::runtime_error(std::string("Failed to open %s for reading in load_file()!\n") + fileName);
}
}
void NumpyFile::save_file(const char* fileName,
bool shouldAppend,
std::string name,
void* ptrMemory,
uint32_t numRows,
uint32_t numColumns) {
std::string mode;
mode = shouldAppend ? "a" : "w";
std::vector<size_t> shape{numRows, numColumns};
cnpy::npz_save(fileName, name, reinterpret_cast<float*>(ptrMemory), shape, mode);
}

View File

@@ -1,139 +0,0 @@
// Copyright (C) 2018-2023 Intel Corporation
// SPDX-License-Identifier: Apache-2.0
//
#pragma once
#include <cnpy.h>
#include <samples/common.hpp>
#include <samples/slog.hpp>
/// @brief Interface for working with input and output files
class BaseFile {
public:
virtual void load_file(const char* fileName,
uint32_t arrayIndex,
std::string& ptrName,
std::vector<uint8_t>& memory,
uint32_t* ptrNumRows,
uint32_t* ptrNumColumns,
uint32_t* ptrNumBytesPerElement) = 0;
virtual void save_file(const char* fileName,
bool shouldAppend,
std::string name,
void* ptrMemory,
uint32_t numRows,
uint32_t numColumns) = 0;
virtual void get_file_info(const char* fileName,
uint32_t numArrayToFindSize,
uint32_t* ptrNumArrays,
uint32_t* ptrNumMemoryBytes) = 0;
};
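The sample's `main.cpp` (shown further below in this diff) picks the concrete implementation by file extension and then works only through the `BaseFile` pointer. That selection step can be sketched in isolation with simplified, illustrative types:

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Minimal stand-ins for BaseFile/ArkFile/NumpyFile, for illustration only.
struct BaseParser {
    virtual ~BaseParser() = default;
    virtual std::string kind() const = 0;
};
struct ArkParser : BaseParser {
    std::string kind() const override { return "ark"; }
};
struct NpzParser : BaseParser {
    std::string kind() const override { return "npz"; }
};

// Mirrors the extension dispatch in main.cpp: "ark" -> ArkFile, "npz" -> NumpyFile,
// anything else is rejected as an invalid input file.
BaseParser* pick_parser(const std::string& ext, ArkParser& ark, NpzParser& npz) {
    if (ext == "ark")
        return &ark;
    if (ext == "npz")
        return &npz;
    throw std::logic_error("Invalid input file");
}
```

Keeping the format-specific logic behind one abstract interface is what lets the rest of the sample stay format-agnostic.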
/// @brief Responsible for working with .ark files
class ArkFile : public BaseFile {
public:
/**
* @brief Get info from Kaldi ARK speech feature vector file
* @param fileName .ark file name
* @param numArrayToFindSize number of speech feature vectors in the file
* @param ptrNumArrays pointer to specific number array
* @param ptrNumMemoryBytes pointer to specific number of memory bytes
* @return none.
*/
void get_file_info(const char* fileName,
uint32_t numArrayToFindSize,
uint32_t* ptrNumArrays,
uint32_t* ptrNumMemoryBytes) override;
/**
* @brief Load Kaldi ARK speech feature vector file
* @param fileName .ark file name
* @param arrayIndex index of the speech feature vector in the file
* @param ptrName reference to variable length name
* @param memory reference to speech feature vector to save
* @param ptrNumRows pointer to number of rows to read
* @param ptrNumColumns pointer to number of columns to read
* @param ptrNumBytesPerElement pointer to number bytes per element (size of float by default)
* @return none.
*/
void load_file(const char* fileName,
uint32_t arrayIndex,
std::string& ptrName,
std::vector<uint8_t>& memory,
uint32_t* ptrNumRows,
uint32_t* ptrNumColumns,
uint32_t* ptrNumBytesPerElement) override;
/**
* @brief Save Kaldi ARK speech feature vector file
* @param fileName .ark file name
* @param shouldAppend bool flag to rewrite or add to the end of file
* @param name reference to variable length name
* @param ptrMemory pointer to speech feature vector to save
* @param numRows number of rows
* @param numColumns number of columns
* @return none.
*/
void save_file(const char* fileName,
bool shouldAppend,
std::string name,
void* ptrMemory,
uint32_t numRows,
uint32_t numColumns) override;
};
/// @brief Responsible for working with .npz files
class NumpyFile : public BaseFile {
public:
/**
* @brief Get info from Numpy* uncompressed NPZ speech feature vector file
* @param fileName .npz file name
* @param numArrayToFindSize number of speech feature vectors in the file
* @param ptrNumArrays pointer to specific number array
* @param ptrNumMemoryBytes pointer to specific number of memory bytes
* @return none.
*/
void get_file_info(const char* fileName,
uint32_t numArrayToFindSize,
uint32_t* ptrNumArrays,
uint32_t* ptrNumMemoryBytes) override;
/**
* @brief Load Numpy* uncompressed NPZ speech feature vector file
* @param fileName .npz file name
* @param arrayIndex index of the speech feature vector in the file
* @param ptrName reference to variable length name
* @param memory reference to speech feature vector to save
* @param ptrNumRows pointer to number of rows to read
* @param ptrNumColumns pointer to number of columns to read
* @param ptrNumBytesPerElement pointer to number bytes per element (size of float by default)
* @return none.
*/
void load_file(const char* fileName,
uint32_t arrayIndex,
std::string& ptrName,
std::vector<uint8_t>& memory,
uint32_t* ptrNumRows,
uint32_t* ptrNumColumns,
uint32_t* ptrNumBytesPerElement) override;
/**
* @brief Save Numpy* uncompressed NPZ speech feature vector file
* @param fileName .npz file name
* @param shouldAppend bool flag to rewrite or add to the end of file
* @param name reference to variable length name
* @param ptrMemory pointer to speech feature vector to save
* @param numRows number of rows
* @param numColumns number of columns
* @return none.
*/
void save_file(const char* fileName,
bool shouldAppend,
std::string name,
void* ptrMemory,
uint32_t numRows,
uint32_t numColumns) override;
};

View File

@@ -1,706 +0,0 @@
// Copyright (C) 2018-2023 Intel Corporation
// SPDX-License-Identifier: Apache-2.0
//
#include <time.h>
#include <chrono>
#include <fstream>
#include <functional>
#include <iomanip>
#include <iostream>
#include <limits>
#include <map>
#include <memory>
#include <random>
#include <string>
#include <thread>
#include <utility>
#include <vector>
// clang-format off
#include <openvino/openvino.hpp>
#include <openvino/runtime/intel_gna/properties.hpp>
#include <samples/args_helper.hpp>
#include <samples/slog.hpp>
#include "fileutils.hpp"
#include "speech_sample.hpp"
#include "utils.hpp"
// clang-format on
using namespace ov::preprocess;
/**
* @brief The entry point for OpenVINO Runtime automatic speech recognition sample
* @file speech_sample/main.cpp
* @example speech_sample/main.cpp
*/
int main(int argc, char* argv[]) {
try {
// ------------------------------ Get OpenVINO Runtime version ----------------------------------------------
slog::info << "OpenVINO runtime: " << ov::get_openvino_version() << slog::endl;
// ------------------------------ Parsing and validation of input arguments ---------------------------------
if (!parse_and_check_command_line(argc, argv)) {
return 0;
}
BaseFile* file;
BaseFile* fileOutput;
ArkFile arkFile;
NumpyFile numpyFile;
std::pair<std::string, std::vector<std::string>> input_data;
if (!FLAGS_i.empty())
input_data = parse_parameters(FLAGS_i);
auto extInputFile = fileExt(input_data.first);
if (extInputFile == "ark") {
file = &arkFile;
} else if (extInputFile == "npz") {
file = &numpyFile;
} else {
throw std::logic_error("Invalid input file");
}
std::vector<std::string> inputFiles;
std::vector<uint32_t> numBytesThisUtterance;
uint32_t numUtterances(0);
if (!input_data.first.empty()) {
std::string outStr;
std::istringstream stream(input_data.first);
uint32_t currentNumUtterances(0), currentNumBytesThisUtterance(0);
while (getline(stream, outStr, ',')) {
std::string filename(fileNameNoExt(outStr) + "." + extInputFile);
inputFiles.push_back(filename);
file->get_file_info(filename.c_str(), 0, &currentNumUtterances, &currentNumBytesThisUtterance);
if (numUtterances == 0) {
numUtterances = currentNumUtterances;
} else if (currentNumUtterances != numUtterances) {
throw std::logic_error(
"Incorrect input files. Number of utterance must be the same for all input files");
}
numBytesThisUtterance.push_back(currentNumBytesThisUtterance);
}
}
size_t numInputFiles(inputFiles.size());
// --------------------------- Step 1. Initialize OpenVINO Runtime core and read model
// -------------------------------------
ov::Core core;
try {
const auto& gnaLibraryVersion = core.get_property("GNA", ov::intel_gna::library_full_version);
slog::info << "Detected GNA Library: " << gnaLibraryVersion << slog::endl;
} catch (std::exception& e) {
slog::info << "Cannot detect GNA Library version, exception: " << e.what() << slog::endl;
}
slog::info << "Loading model files:" << slog::endl << FLAGS_m << slog::endl;
uint32_t batchSize = (FLAGS_cw_r > 0 || FLAGS_cw_l > 0 || !FLAGS_bs) ? 1 : (uint32_t)FLAGS_bs;
std::shared_ptr<ov::Model> model;
// --------------------------- Processing custom outputs ---------------------------------------------
const auto output_data = parse_parameters(FLAGS_o);
const auto reference_data = parse_parameters(FLAGS_r);
const auto outputs = get_first_non_empty(output_data.second, reference_data.second);
// ------------------------------ Preprocessing ------------------------------------------------------
// the preprocessing steps can be done only for loaded network and are not applicable for the imported network
// (already compiled)
if (!FLAGS_m.empty()) {
const auto outputs_with_ports = parse_to_extract_port(outputs);
model = core.read_model(FLAGS_m);
for (const auto& output_with_port : outputs_with_ports) {
auto output = model->add_output(output_with_port.first, output_with_port.second);
output.set_names({output_with_port.first + ":" + std::to_string(output_with_port.second)});
}
check_number_of_inputs(model->inputs().size(), numInputFiles);
ov::preprocess::PrePostProcessor proc(model);
const auto& inputs = model->inputs();
std::map<std::string, std::string> custom_layouts;
if (!FLAGS_layout.empty()) {
custom_layouts = parse_input_layouts(FLAGS_layout, inputs);
}
for (const auto& input : inputs) {
const auto& item_name = input.get_any_name();
auto& in = proc.input(item_name);
in.tensor().set_element_type(ov::element::f32);
// Explicitly set inputs layout
if (custom_layouts.count(item_name) > 0) {
in.model().set_layout(ov::Layout(custom_layouts.at(item_name)));
}
}
for (size_t i = 0; i < model->outputs().size(); i++) {
proc.output(i).tensor().set_element_type(ov::element::f32);
}
model = proc.build();
if (FLAGS_bs) {
if (FLAGS_layout.empty() &&
std::any_of(inputs.begin(), inputs.end(), [](const ov::Output<ov::Node>& i) {
return ov::layout::get_layout(i).empty();
})) {
throw std::logic_error(
"-bs option is set to " + std::to_string(FLAGS_bs) +
" but model does not contain layout information for any input. Please "
"specify it explicitly using -layout option. For example, input1[NCHW], input2[NC] or [NC]");
} else {
ov::set_batch(model, batchSize);
}
}
}
// ------------------------------ Get Available Devices ------------------------------------------------------
auto isFeature = [&](const std::string& xFeature) {
return FLAGS_d.find(xFeature) != std::string::npos;
};
bool useGna = isFeature("GNA");
bool useHetero = isFeature("HETERO");
std::string deviceStr = useHetero && useGna ? "HETERO:GNA,CPU" : FLAGS_d.substr(0, (FLAGS_d.find("_")));
// -----------------------------------------------------------------------------------------------------
// --------------------------- Set parameters and scale factors -------------------------------------
/** Setting parameter for per layer metrics **/
ov::AnyMap gnaPluginConfig;
ov::AnyMap genericPluginConfig;
if (useGna) {
std::string gnaDevice =
useHetero ? FLAGS_d.substr(FLAGS_d.find("GNA"), FLAGS_d.find(",") - FLAGS_d.find("GNA")) : FLAGS_d;
auto parse_gna_device = [&](const std::string& device) -> ov::intel_gna::ExecutionMode {
ov::intel_gna::ExecutionMode mode;
std::stringstream ss(device);
ss >> mode;
return mode;
};
gnaPluginConfig[ov::intel_gna::execution_mode.name()] = gnaDevice.find("_") == std::string::npos
? ov::intel_gna::ExecutionMode::AUTO
: parse_gna_device(gnaDevice);
}
if (FLAGS_pc) {
genericPluginConfig.emplace(ov::enable_profiling(true));
}
if (FLAGS_q.compare("user") == 0) {
if (!FLAGS_rg.empty()) {
std::string errMessage("Custom scale factor cannot be set for an imported GNA model: " + FLAGS_rg);
throw std::logic_error(errMessage);
} else {
auto scale_factors_per_input = parse_scale_factors(model->inputs(), FLAGS_sf);
if (numInputFiles != scale_factors_per_input.size()) {
std::string errMessage("Incorrect command line for multiple inputs: " +
std::to_string(scale_factors_per_input.size()) +
" scale factors provided for " + std::to_string(numInputFiles) +
" input files.");
throw std::logic_error(errMessage);
}
for (auto&& sf : scale_factors_per_input) {
slog::info << "For input " << sf.first << " using scale factor of " << sf.second << slog::endl;
}
gnaPluginConfig[ov::intel_gna::scale_factors_per_input.name()] = scale_factors_per_input;
}
} else {
// "static" quantization with calculated scale factor
if (!FLAGS_rg.empty()) {
slog::info << "Using scale factor from provided imported gna model: " << FLAGS_rg << slog::endl;
} else {
std::map<std::string, float> scale_factors_per_input;
for (size_t i = 0; i < numInputFiles; i++) {
auto inputFileName = inputFiles[i].c_str();
std::string name;
std::vector<uint8_t> ptrFeatures;
uint32_t numArrays(0), numBytes(0), numFrames(0), numFrameElements(0), numBytesPerElement(0);
file->get_file_info(inputFileName, 0, &numArrays, &numBytes);
ptrFeatures.resize(numBytes);
file->load_file(inputFileName,
0,
name,
ptrFeatures,
&numFrames,
&numFrameElements,
&numBytesPerElement);
auto floatScaleFactor = scale_factor_for_quantization(ptrFeatures.data(),
MAX_VAL_2B_FEAT,
numFrames * numFrameElements);
slog::info << "Using scale factor of " << floatScaleFactor << " calculated from first utterance."
<< slog::endl;
scale_factors_per_input[strip_name(model->input(i).get_any_name())] = floatScaleFactor;
}
gnaPluginConfig[ov::intel_gna::scale_factors_per_input.name()] = scale_factors_per_input;
}
}
gnaPluginConfig[ov::hint::inference_precision.name()] = (FLAGS_qb == 8) ? ov::element::i8 : ov::element::i16;
const std::unordered_map<std::string, ov::intel_gna::HWGeneration> StringHWGenerationMap{
{"GNA_TARGET_1_0", ov::intel_gna::HWGeneration::GNA_1_0},
{"GNA_TARGET_1_0_E", ov::intel_gna::HWGeneration::GNA_1_0_E},
{"GNA_TARGET_2_0", ov::intel_gna::HWGeneration::GNA_2_0},
{"GNA_TARGET_3_0", ov::intel_gna::HWGeneration::GNA_3_0},
{"GNA_TARGET_3_1", ov::intel_gna::HWGeneration::GNA_3_1},
{"GNA_TARGET_3_5", ov::intel_gna::HWGeneration::GNA_3_5},
{"GNA_TARGET_3_5_E", ov::intel_gna::HWGeneration::GNA_3_5_E},
{"GNA_TARGET_3_6", ov::intel_gna::HWGeneration::GNA_3_6},
{"GNA_TARGET_4_0", ov::intel_gna::HWGeneration::GNA_4_0}};
auto parse_target = [&](const std::string& target) -> ov::intel_gna::HWGeneration {
auto hw_target = ov::intel_gna::HWGeneration::UNDEFINED;
const auto key_iter = StringHWGenerationMap.find(target);
if (key_iter != StringHWGenerationMap.end()) {
hw_target = key_iter->second;
} else if (!target.empty()) {
slog::warn << "Unsupported target: " << target << slog::endl;
}
return hw_target;
};
gnaPluginConfig[ov::intel_gna::execution_target.name()] = parse_target(FLAGS_exec_target);
gnaPluginConfig[ov::intel_gna::compile_target.name()] = parse_target(FLAGS_compile_target);
gnaPluginConfig[ov::intel_gna::memory_reuse.name()] = !FLAGS_memory_reuse_off;
gnaPluginConfig[ov::intel_gna::pwl_max_error_percent.name()] = FLAGS_pwl_me;
gnaPluginConfig[ov::log::level.name()] = FLAGS_log;
// -----------------------------------------------------------------------------------------------------
// --------------------------- Write model to file --------------------------------------------------
// Embedded GNA model dumping (for Intel(R) Speech Enabling Developer Kit)
if (!FLAGS_we.empty()) {
gnaPluginConfig[ov::intel_gna::firmware_model_image_path.name()] = FLAGS_we;
}
// -----------------------------------------------------------------------------------------------------
// --------------------------- Step 2. Loading model to the device ------------------------------------------
if (useGna) {
if (useHetero) {
genericPluginConfig.insert(ov::device::properties("GNA", gnaPluginConfig));
} else {
genericPluginConfig.insert(std::begin(gnaPluginConfig), std::end(gnaPluginConfig));
}
}
auto t0 = Time::now();
ov::CompiledModel executableNet;
if (!FLAGS_m.empty()) {
slog::info << "Loading model to the device " << FLAGS_d << slog::endl;
executableNet = core.compile_model(model, deviceStr, genericPluginConfig);
} else {
slog::info << "Importing model to the device" << slog::endl;
std::ifstream streamrq(FLAGS_rg, std::ios_base::binary | std::ios_base::in);
if (!streamrq.is_open()) {
throw std::runtime_error("Cannot open model file " + FLAGS_rg);
}
executableNet = core.import_model(streamrq, deviceStr, genericPluginConfig);
// loading batch from exported model
const auto& imported_inputs = executableNet.inputs();
if (std::any_of(imported_inputs.begin(), imported_inputs.end(), [](const ov::Output<const ov::Node>& i) {
return ov::layout::get_layout(i).empty();
})) {
slog::warn << "No batch dimension was found at any input, assuming batch to be 1." << slog::endl;
batchSize = 1;
} else {
for (auto& info : imported_inputs) {
auto imported_layout = ov::layout::get_layout(info);
if (ov::layout::has_batch(imported_layout)) {
batchSize = (uint32_t)info.get_shape()[ov::layout::batch_idx(imported_layout)];
break;
}
}
}
}
// measure loading/import time only after the compiled model is ready
ms loadTime = std::chrono::duration_cast<ms>(Time::now() - t0);
slog::info << "Model loading time " << loadTime.count() << " ms" << slog::endl;
// --------------------------- Exporting gna model using OpenVINO API---------------------
if (!FLAGS_wg.empty()) {
slog::info << "Writing GNA Model to file " << FLAGS_wg << slog::endl;
t0 = Time::now();
std::ofstream streamwq(FLAGS_wg, std::ios_base::binary | std::ios::out);
executableNet.export_model(streamwq);
ms exportTime = std::chrono::duration_cast<ms>(Time::now() - t0);
slog::info << "Exporting time " << exportTime.count() << " ms" << slog::endl;
return 0;
}
if (!FLAGS_we.empty()) {
slog::info << "Exported GNA embedded model to file " << FLAGS_we << slog::endl;
if (!FLAGS_compile_target.empty()) {
slog::info << "GNA embedded model target: " << FLAGS_compile_target << slog::endl;
}
return 0;
}
// ---------------------------------------------------------------------------------------------------------
// --------------------------- Step 3. Create infer request
// --------------------------------------------------
std::vector<InferRequestStruct> inferRequests(1);
for (auto& inferRequest : inferRequests) {
inferRequest = {executableNet.create_infer_request(), -1, batchSize};
}
// --------------------------- Step 4. Configure input & output
// --------------------------------------------------
std::vector<ov::Tensor> ptrInputBlobs;
auto cInputInfo = executableNet.inputs();
check_number_of_inputs(cInputInfo.size(), numInputFiles);
if (!input_data.second.empty()) {
std::vector<std::string> inputNameBlobs = input_data.second;
if (inputNameBlobs.size() != cInputInfo.size()) {
std::string errMessage(std::string("Number of network inputs ( ") + std::to_string(cInputInfo.size()) +
" ) is not equal to the number of inputs entered in the -i argument ( " +
std::to_string(inputNameBlobs.size()) + " ).");
throw std::logic_error(errMessage);
}
for (const auto& input : inputNameBlobs) {
ov::Tensor blob = inferRequests.begin()->inferRequest.get_tensor(input);
if (!blob) {
std::string errMessage("No blob with name: " + input);
throw std::logic_error(errMessage);
}
ptrInputBlobs.push_back(blob);
}
} else {
for (const auto& input : cInputInfo) {
ptrInputBlobs.push_back(inferRequests.begin()->inferRequest.get_tensor(input));
}
}
std::vector<std::string> output_name_files;
std::vector<std::string> reference_name_files;
size_t count_file = 1;
if (!output_data.first.empty()) {
output_name_files = convert_str_to_vector(output_data.first);
if (output_name_files.size() != outputs.size() && outputs.size()) {
throw std::logic_error("The number of output files is not equal to the number of network outputs.");
}
count_file = output_name_files.size();
if (executableNet.outputs().size() > 1 && output_data.second.empty() && count_file == 1) {
throw std::logic_error("-o is ambiguous: the model has multiple outputs but only one file provided "
"without output name specification");
}
}
if (!reference_data.first.empty()) {
reference_name_files = convert_str_to_vector(reference_data.first);
if (reference_name_files.size() != outputs.size() && outputs.size()) {
throw std::logic_error("The number of reference files is not equal to the number of network outputs.");
}
count_file = reference_name_files.size();
if (executableNet.outputs().size() > 1 && reference_data.second.empty() && count_file == 1) {
throw std::logic_error("-r is ambiguous: the model has multiple outputs but only one file provided "
"without output name specification");
}
}
if (count_file > executableNet.outputs().size()) {
throw std::logic_error(
"The number of output/reference files is not equal to the number of network outputs.");
}
// -----------------------------------------------------------------------------------------------------
// --------------------------- Step 5. Do inference --------------------------------------------------------
std::vector<std::vector<uint8_t>> ptrUtterances;
const auto effective_outputs_size = outputs.size() ? outputs.size() : executableNet.outputs().size();
std::vector<std::vector<uint8_t>> vectorPtrScores(effective_outputs_size);
std::vector<uint16_t> numScoresPerOutput(effective_outputs_size);
std::vector<std::vector<uint8_t>> vectorPtrReferenceScores(reference_name_files.size());
std::vector<ScoreErrorT> vectorFrameError(reference_name_files.size()),
vectorTotalError(reference_name_files.size());
ptrUtterances.resize(inputFiles.size());
// initialize memory state before starting
for (auto&& state : inferRequests.begin()->inferRequest.query_state()) {
state.reset();
}
/** Work with each utterance **/
for (uint32_t utteranceIndex = 0; utteranceIndex < numUtterances; ++utteranceIndex) {
std::map<std::string, ov::ProfilingInfo> utterancePerfMap;
uint64_t totalNumberOfRunsOnHw = 0;
std::string uttName;
uint32_t numFrames(0), n(0);
std::vector<uint32_t> numFrameElementsInput;
std::vector<uint32_t> numFramesReference(reference_name_files.size()),
numFrameElementsReference(reference_name_files.size()),
numBytesPerElementReference(reference_name_files.size()),
numBytesReferenceScoreThisUtterance(reference_name_files.size());
/** Get information from input file for current utterance **/
numFrameElementsInput.resize(numInputFiles);
for (size_t i = 0; i < inputFiles.size(); i++) {
std::vector<uint8_t> ptrUtterance;
auto inputFilename = inputFiles[i].c_str();
uint32_t currentNumFrames(0), currentNumFrameElementsInput(0), currentNumBytesPerElementInput(0);
file->get_file_info(inputFilename, utteranceIndex, &n, &numBytesThisUtterance[i]);
ptrUtterance.resize(numBytesThisUtterance[i]);
file->load_file(inputFilename,
utteranceIndex,
uttName,
ptrUtterance,
&currentNumFrames,
&currentNumFrameElementsInput,
&currentNumBytesPerElementInput);
if (numFrames == 0) {
numFrames = currentNumFrames;
} else if (numFrames != currentNumFrames) {
std::string errMessage("Number of frames in input files is different: " +
std::to_string(numFrames) + " and " + std::to_string(currentNumFrames));
throw std::logic_error(errMessage);
}
ptrUtterances[i] = ptrUtterance;
numFrameElementsInput[i] = currentNumFrameElementsInput;
}
int i = 0;
for (auto& ptrInputBlob : ptrInputBlobs) {
if (ptrInputBlob.get_size() != numFrameElementsInput[i++] * batchSize) {
throw std::logic_error("network input size (" + std::to_string(ptrInputBlob.get_size()) +
") does not match input file size (" +
std::to_string(numFrameElementsInput[i - 1] * batchSize) + ")");
}
}
double totalTime = 0.0;
for (size_t errorIndex = 0; errorIndex < vectorFrameError.size(); errorIndex++) {
clear_score_error(&vectorTotalError[errorIndex]);
vectorTotalError[errorIndex].threshold = vectorFrameError[errorIndex].threshold = MAX_SCORE_DIFFERENCE;
}
std::vector<uint8_t*> inputFrame;
for (auto& ut : ptrUtterances) {
inputFrame.push_back(&ut.front());
}
std::map<std::string, ov::ProfilingInfo> callPerfMap;
size_t frameIndex = 0;
uint32_t numFramesFile = numFrames;
numFrames += FLAGS_cw_l + FLAGS_cw_r;
uint32_t numFramesThisBatch{batchSize};
auto t0 = Time::now();
auto t1 = t0;
BaseFile* fileReferenceScores = nullptr;
std::string refUtteranceName;
if (!reference_data.first.empty()) {
/** Read file with reference scores **/
auto exReferenceScoresFile = fileExt(reference_data.first);
if (exReferenceScoresFile == "ark") {
fileReferenceScores = &arkFile;
} else if (exReferenceScoresFile == "npz") {
fileReferenceScores = &numpyFile;
} else {
throw std::logic_error("Invalid Reference Scores file");
}
for (size_t next_output = 0; next_output < count_file; next_output++) {
if (fileReferenceScores != nullptr) {
fileReferenceScores->get_file_info(reference_name_files[next_output].c_str(),
utteranceIndex,
&n,
&numBytesReferenceScoreThisUtterance[next_output]);
vectorPtrReferenceScores[next_output].resize(numBytesReferenceScoreThisUtterance[next_output]);
fileReferenceScores->load_file(reference_name_files[next_output].c_str(),
utteranceIndex,
refUtteranceName,
vectorPtrReferenceScores[next_output],
&numFramesReference[next_output],
&numFrameElementsReference[next_output],
&numBytesPerElementReference[next_output]);
}
}
}
while (frameIndex <= numFrames) {
if (frameIndex == numFrames) {
if (std::find_if(inferRequests.begin(), inferRequests.end(), [&](const InferRequestStruct& x) {
return (x.frameIndex != -1);
}) == inferRequests.end()) {
break;
}
}
bool inferRequestFetched = false;
/** Start inference loop **/
for (auto& inferRequest : inferRequests) {
if (frameIndex == numFrames) {
numFramesThisBatch = 1;
} else {
numFramesThisBatch =
(numFrames - frameIndex < batchSize) ? (numFrames - frameIndex) : batchSize;
}
/* waits until inference result becomes available */
if (inferRequest.frameIndex != -1) {
inferRequest.inferRequest.wait();
if (inferRequest.frameIndex >= 0)
for (size_t next_output = 0; next_output < count_file; next_output++) {
const auto output_name = outputs.size() > next_output
? outputs[next_output]
: executableNet.output(next_output).get_any_name();
auto dims = executableNet.output(output_name).get_shape();
numScoresPerOutput[next_output] = std::accumulate(std::begin(dims),
std::end(dims),
size_t{1},
std::multiplies<size_t>());
vectorPtrScores[next_output].resize(numFramesFile * numScoresPerOutput[next_output] *
sizeof(float));
if (!FLAGS_o.empty()) {
/* Prepare output data to be saved to a file later */
auto outputFrame = &vectorPtrScores[next_output].front() +
numScoresPerOutput[next_output] * sizeof(float) *
(inferRequest.frameIndex) / batchSize;
ov::Tensor outputBlob =
inferRequest.inferRequest.get_tensor(executableNet.output(output_name));
// copy the scores for this batch out of the output tensor
auto byteSize = numScoresPerOutput[next_output] * sizeof(float);
std::memcpy(outputFrame, outputBlob.data<float>(), byteSize);
}
if (!FLAGS_r.empty()) {
/** Compare output data with reference scores **/
ov::Tensor outputBlob =
inferRequest.inferRequest.get_tensor(executableNet.output(output_name));
if (numScoresPerOutput[next_output] / numFrameElementsReference[next_output] ==
batchSize) {
compare_scores(
outputBlob.data<float>(),
&vectorPtrReferenceScores[next_output]
[inferRequest.frameIndex *
numFrameElementsReference[next_output] *
numBytesPerElementReference[next_output]],
&vectorFrameError[next_output],
inferRequest.numFramesThisBatch,
numFrameElementsReference[next_output]);
update_score_error(&vectorFrameError[next_output],
&vectorTotalError[next_output]);
} else {
throw std::logic_error("Number of output and reference frames does not match.");
}
}
if (FLAGS_pc) {
// retrieve new counters
get_performance_counters(inferRequest.inferRequest, callPerfMap);
// summarize retrieved counters with all previous
sum_performance_counters(callPerfMap, utterancePerfMap, totalNumberOfRunsOnHw);
}
}
// -----------------------------------------------------------------------------------------------------
}
if (frameIndex == numFrames) {
inferRequest.frameIndex = -1;
continue;
}
ptrInputBlobs.clear();
if (input_data.second.empty()) {
for (auto& input : cInputInfo) {
ptrInputBlobs.push_back(inferRequest.inferRequest.get_tensor(input));
}
} else {
std::vector<std::string> inputNameBlobs = input_data.second;
for (const auto& input : inputNameBlobs) {
ov::Tensor blob = inferRequest.inferRequest.get_tensor(input);
if (!blob) {
std::string errMessage("No blob with name: " + input);
throw std::logic_error(errMessage);
}
ptrInputBlobs.push_back(blob);
}
}
/** Iterate over all the input blobs **/
for (size_t i = 0; i < numInputFiles; ++i) {
ov::Tensor minput = ptrInputBlobs[i];
if (!minput) {
std::string errMessage("Expected ptrInputBlobs[" + std::to_string(i) +
"] to hold a valid tensor, but the returned tensor is empty");
throw std::logic_error(errMessage);
}
memcpy(minput.data(),
inputFrame[i],
numFramesThisBatch * numFrameElementsInput[i] * sizeof(float));
// Used to infer fewer frames than the batch size
if (batchSize != numFramesThisBatch) {
memset(minput.data<float>() + numFramesThisBatch * numFrameElementsInput[i],
0,
(batchSize - numFramesThisBatch) * numFrameElementsInput[i] * sizeof(float));
}
}
// -----------------------------------------------------------------------------------------------------
int index = static_cast<int>(frameIndex) - (FLAGS_cw_l + FLAGS_cw_r);
/* Starting inference in asynchronous mode*/
inferRequest.inferRequest.start_async();
inferRequest.frameIndex = index < 0 ? -2 : index;
inferRequest.numFramesThisBatch = numFramesThisBatch;
frameIndex += numFramesThisBatch;
for (size_t j = 0; j < inputFiles.size(); j++) {
if (FLAGS_cw_l > 0 || FLAGS_cw_r > 0) {
int idx = frameIndex - FLAGS_cw_l;
if (idx > 0 && idx < static_cast<int>(numFramesFile)) {
inputFrame[j] += sizeof(float) * numFrameElementsInput[j] * numFramesThisBatch;
} else if (idx >= static_cast<int>(numFramesFile)) {
inputFrame[j] = &ptrUtterances[j].front() + (numFramesFile - 1) * sizeof(float) *
numFrameElementsInput[j] *
numFramesThisBatch;
} else if (idx <= 0) {
inputFrame[j] = &ptrUtterances[j].front();
}
} else {
inputFrame[j] += sizeof(float) * numFrameElementsInput[j] * numFramesThisBatch;
}
}
inferRequestFetched |= true;
}
/** Inference was finished for current frame **/
if (!inferRequestFetched) {
std::this_thread::sleep_for(std::chrono::milliseconds(1));
continue;
}
}
t1 = Time::now();
fsec fs = t1 - t0;
ms d = std::chrono::duration_cast<ms>(fs);
totalTime += d.count();
// resetting state between utterances
for (auto&& state : inferRequests.begin()->inferRequest.query_state()) {
state.reset();
}
// -----------------------------------------------------------------------------------------------------
// --------------------------- Step 6. Process output
// -------------------------------------------------------
/** Show performance results **/
std::cout << "Utterance " << utteranceIndex << ": " << std::endl;
std::cout << "Total time in Infer (HW and SW):\t" << totalTime << " ms" << std::endl;
std::cout << "Frames in utterance:\t\t\t" << numFrames << " frames" << std::endl;
std::cout << "Average Infer time per frame:\t\t" << totalTime / static_cast<double>(numFrames) << " ms\n"
<< std::endl;
if (FLAGS_pc) {
// print performance results
print_performance_counters(utterancePerfMap,
frameIndex,
std::cout,
getFullDeviceName(core, FLAGS_d),
totalNumberOfRunsOnHw,
FLAGS_d);
}
for (size_t next_output = 0; next_output < count_file; next_output++) {
if (!FLAGS_o.empty()) {
auto exOutputScoresFile = fileExt(output_data.first);
if (exOutputScoresFile == "ark") {
fileOutput = &arkFile;
} else if (exOutputScoresFile == "npz") {
fileOutput = &numpyFile;
} else {
throw std::logic_error("Invalid Output Scores file");
}
/* Save output data to file */
bool shouldAppend = (utteranceIndex != 0);
fileOutput->save_file(output_name_files[next_output].c_str(),
shouldAppend,
uttName,
&vectorPtrScores[next_output].front(),
numFramesFile,
numScoresPerOutput[next_output] / batchSize);
}
if (!FLAGS_r.empty()) {
// print statistical score error
const auto output_name = outputs.size() > next_output
? outputs[next_output]
: executableNet.output(next_output).get_any_name();
std::cout << "Output name: " << output_name << std::endl;
std::cout << "Number scores per frame: " << numScoresPerOutput[next_output] / batchSize << std::endl
<< std::endl;
print_reference_compare_results(vectorTotalError[next_output], numFrames, std::cout);
}
}
}
} catch (const std::exception& error) {
slog::err << error.what() << slog::endl;
return 1;
} catch (...) {
slog::err << "Unknown/internal exception happened" << slog::endl;
return 1;
}
slog::info << "Execution successful" << slog::endl;
return 0;
}

// Copyright (C) 2018-2023 Intel Corporation
// SPDX-License-Identifier: Apache-2.0
//
#pragma once
#include <gflags/gflags.h>
#include <iostream>
#include <string>
#include <vector>
/// @brief message for help argument
static const char help_message[] = "Print a usage message.";
/// @brief message for input data argument
static const char input_message[] = "Required. Path(s) to input file(s). "
"Usage for a single file/layer: <input_file.ark> or <input_file.npz>. "
"Example of usage for several files/layers: "
"<layer1>:<port_num1>=<input_file1.ark>,<layer2>:<port_num2>=<input_file2.ark>.";
/// @brief message for model argument
static const char model_message[] = "Required. Path to an .xml file with a trained model (required if -rg is missing).";
/// @brief message for assigning calculation to device
static const char target_device_message[] =
"Optional. Specify a target device to infer on. CPU, GPU, NPU, GNA_AUTO, GNA_HW, "
"GNA_HW_WITH_SW_FBACK, GNA_SW_FP32, "
"GNA_SW_EXACT and HETERO with combination of GNA as the primary device and CPU"
" as a secondary (e.g. HETERO:GNA,CPU) are supported. "
"The sample will look for a suitable plugin for device specified.";
/// @brief message for execution target
static const char execution_target_message[] =
"Optional. Specify GNA execution target generation. "
"May be one of GNA_TARGET_1_0, GNA_TARGET_2_0, GNA_TARGET_3_0, GNA_TARGET_3_1, "
"GNA_TARGET_3_5, GNA_TARGET_3_6 or GNA_TARGET_4_0. "
"By default, generation corresponds to the GNA HW available in the system "
"or the latest fully supported generation by the software. "
"See the GNA Plugin's GNA_EXEC_TARGET config option description.";
/// @brief message for compile target
static const char compile_target_message[] = "Optional. Specify GNA compile target generation. "
"May be one of GNA_TARGET_1_0, GNA_TARGET_2_0, GNA_TARGET_3_0, GNA_TARGET_3_1, "
"GNA_TARGET_3_5, GNA_TARGET_3_6 or GNA_TARGET_4_0. "
"By default, generation corresponds to the GNA HW available in the system "
"or the latest fully supported generation by the software. "
"See the GNA Plugin's GNA_COMPILE_TARGET config option description.";
/// @brief message for enabling GNA log
static const char enable_log_message[] = "Optional. Enable GNA logging, which may give additional info "
"about potential issues found in network. "
"By default logging is disabled.";
/// @brief message for performance counters
static const char performance_counter_message[] = "Optional. Enables per-layer performance report.";
/// @brief message for disabling of compact (memory_reuse) mode
static const char memory_reuse_message[] = "Optional. Disables memory optimizations for compiled model.";
/// @brief message for user library argument
static const char custom_cpu_library_message[] = "Required for CPU plugin custom layers. "
"Absolute path to a shared library with the kernels implementations.";
/// @brief message for score output argument
static const char output_message[] = "Optional. Output file name(s) to save scores (inference results). "
"Usage for a single file/layer: <output_file.ark> or <output_file.npz>. "
"Example of usage for several files/layers: "
"<layer1>:<port_num1>=<output_file1.ark>,<layer2>:<port_num2>=<output_file2.ark>.";
/// @brief message for reference score file argument
static const char reference_score_message[] =
"Optional. Read reference score file(s) and compare inference results with reference scores. "
"Usage for a single file/layer: <reference_file.ark> or <reference_file.npz>. "
"Example of usage for several files/layers: "
"<layer1>:<port_num1>=<reference_file1.ark>,<layer2>:<port_num2>=<reference_file2.ark>.";
/// @brief message for read GNA model argument
static const char read_gna_model_message[] =
"Read GNA model from file using path/filename provided (required if -m is missing).";
/// @brief message for write GNA model argument
static const char write_gna_model_message[] = "Optional. Write GNA model to file using path/filename provided.";
/// @brief message for write GNA embedded model argument
static const char write_embedded_model_message[] =
"Optional. Write GNA embedded model to file using path/filename provided.";
/// @brief message for write GNA embedded model generation argument
static const char write_embedded_model_generation_message[] =
"Optional. GNA generation configuration string for embedded export. "
"Can be GNA1 (default) or GNA3.";
/// @brief message for quantization argument
static const char quantization_message[] =
"Optional. Input quantization mode for GNA: static (default) or user defined (use with -sf).";
/// @brief message for quantization bits argument
static const char quantization_bits_message[] =
"Optional. Weight resolution in bits for GNA quantization: 8 or 16 (default)";
/// @brief message for scale factor argument
static const char scale_factor_message[] =
"Optional. User-specified input scale factor for GNA quantization (use with -q user). "
"If the model contains multiple inputs, provide scale factors by separating them with commas. "
"For example: <layer1>:<sf1>,<layer2>:<sf2> or just <sf> to be applied to all inputs.";
/// @brief message for batch size argument
static const char batch_size_message[] = "Optional. Batch size 1-8 (default 1)";
/// @brief message for left context window argument
static const char context_window_message_l[] =
"Optional. Number of frames for left context windows (default is 0). "
"Works only with context window networks."
" If you use the cw_l or cw_r flag, then batch size argument is ignored.";
/// @brief message for right context window argument
static const char context_window_message_r[] =
"Optional. Number of frames for right context windows (default is 0). "
"Works only with context window networks."
" If you use the cw_r or cw_l flag, then batch size argument is ignored.";
/// @brief message for inputs layer names
static const char layout_message[] =
"Optional. Prompts how network layouts should be treated by application. "
"For example, \"input1[NCHW],input2[NC]\" or \"[NCHW]\" in case of one input size.";
/// @brief message for PWL max error percent
static const char pwl_max_error_percent_message[] = "Optional. The maximum percent of error for PWL function. "
"The value must be in <0, 100> range. The default value is 1.0.";
/// \brief Define flag for showing help message <br>
DEFINE_bool(h, false, help_message);
/// \brief Define flag for disabling compact (memory_reuse) mode <br>
DEFINE_bool(memory_reuse_off, false, memory_reuse_message);
/// \brief Define parameter for set image file <br>
/// It is a required parameter
DEFINE_string(i, "", input_message);
/// \brief Define parameter for set model file <br>
/// It is a required parameter
DEFINE_string(m, "", model_message);
/// \brief device the target device to infer on (default CPU) <br>
DEFINE_string(d, "CPU", target_device_message);
/// \brief GNA execution target <br>
DEFINE_string(exec_target, "", execution_target_message);
/// \brief GNA compile target <br>
DEFINE_string(compile_target, "", compile_target_message);
/// \brief GNA log level (default LOG_NONE) <br>
DEFINE_string(log, "LOG_NONE", enable_log_message);
/// \brief Enable per-layer performance report
DEFINE_bool(pc, false, performance_counter_message);
/// @brief Write output file to save ark scores
DEFINE_string(o, "", output_message);
/// @brief Read reference score file
DEFINE_string(r, "", reference_score_message);
/// @brief Read GNA model from file (model.bin)
DEFINE_string(rg, "", read_gna_model_message);
/// @brief Write GNA model to file (model.bin)
DEFINE_string(wg, "", write_gna_model_message);
/// @brief Write GNA embedded model to file (model.bin)
DEFINE_string(we, "", write_embedded_model_message);
/// @brief Input quantization mode (default static)
DEFINE_string(q, "static", quantization_message);
/// @brief Weight resolution in bits (default 16)
DEFINE_int32(qb, 16, quantization_bits_message);
/// @brief Scale factor for quantization
DEFINE_string(sf, "", scale_factor_message);
/// @brief Batch size (default 0)
DEFINE_int32(bs, 0, batch_size_message);
/// @brief Right context window size (default 0)
DEFINE_int32(cw_r, 0, context_window_message_r);
/// @brief Left context window size (default 0)
DEFINE_int32(cw_l, 0, context_window_message_l);
/// @brief Input layer name
DEFINE_string(layout, "", layout_message);
/// @brief PWL max error percent
DEFINE_double(pwl_me, 1.0, pwl_max_error_percent_message);
/**
* \brief This function show a help message
*/
static void show_usage() {
std::cout << std::endl;
std::cout << "speech_sample [OPTION]" << std::endl;
std::cout << "Options:" << std::endl;
std::cout << std::endl;
std::cout << " -h " << help_message << std::endl;
std::cout << " -i \"<path>\" " << input_message << std::endl;
std::cout << " -m \"<path>\" " << model_message << std::endl;
std::cout << " -o \"<path>\" " << output_message << std::endl;
std::cout << " -d \"<device>\" " << target_device_message << std::endl;
std::cout << " -pc " << performance_counter_message << std::endl;
std::cout << " -q \"<mode>\" " << quantization_message << std::endl;
std::cout << " -qb \"<integer>\" " << quantization_bits_message << std::endl;
std::cout << " -sf \"<double>\" " << scale_factor_message << std::endl;
std::cout << " -bs \"<integer>\" " << batch_size_message << std::endl;
std::cout << " -r \"<path>\" " << reference_score_message << std::endl;
std::cout << " -rg \"<path>\" " << read_gna_model_message << std::endl;
std::cout << " -wg \"<path>\" " << write_gna_model_message << std::endl;
std::cout << " -we \"<path>\" " << write_embedded_model_message << std::endl;
std::cout << " -cw_l \"<integer>\" " << context_window_message_l << std::endl;
std::cout << " -cw_r \"<integer>\" " << context_window_message_r << std::endl;
std::cout << " -layout \"<string>\" " << layout_message << std::endl;
std::cout << " -pwl_me \"<double>\" " << pwl_max_error_percent_message << std::endl;
std::cout << " -exec_target \"<string>\" " << execution_target_message << std::endl;
std::cout << " -compile_target \"<string>\" " << compile_target_message << std::endl;
std::cout << " -memory_reuse_off " << memory_reuse_message << std::endl;
}
/**
* @brief Checks input arguments
* @param argc number of args
* @param argv list of input arguments
 * @return true on success, false otherwise
*/
bool parse_and_check_command_line(int argc, char* argv[]) {
slog::info << "Parsing input parameters" << slog::endl;
gflags::ParseCommandLineNonHelpFlags(&argc, &argv, true);
if (FLAGS_h) {
show_usage();
showAvailableDevices();
return false;
}
bool isDumpMode = !FLAGS_wg.empty() || !FLAGS_we.empty();
// input not required only in dump mode and if external scale factor provided
if (FLAGS_i.empty() && (!isDumpMode || FLAGS_q.compare("user") != 0)) {
show_usage();
if (isDumpMode) {
throw std::logic_error("In model dump mode either an input file (-i) for static quantization or a user"
" scale factor must be provided. See the -q user option.");
}
throw std::logic_error("Input file not set. Please use -i.");
}
if (FLAGS_m.empty() && FLAGS_rg.empty()) {
show_usage();
throw std::logic_error("Either IR file (-m) or GNAModel file (-rg) need to be set.");
}
if ((!FLAGS_m.empty() && !FLAGS_rg.empty())) {
throw std::logic_error("Only one of -m and -rg is allowed.");
}
std::vector<std::string> supportedDevices = {"CPU",
"GPU",
"GNA_AUTO",
"GNA_HW",
"GNA_HW_WITH_SW_FBACK",
"GNA_SW_EXACT",
"GNA_SW_FP32",
"HETERO:GNA,CPU",
"HETERO:GNA_HW,CPU",
"HETERO:GNA_SW_EXACT,CPU",
"HETERO:GNA_SW_FP32,CPU",
"NPU"};
if (std::find(supportedDevices.begin(), supportedDevices.end(), FLAGS_d) == supportedDevices.end()) {
throw std::logic_error("Specified device is not supported.");
}
uint32_t batchSize = (uint32_t)FLAGS_bs;
if (batchSize && ((batchSize < 1) || (batchSize > 8))) {
throw std::logic_error("Batch size out of range (1..8).");
}
/** default is a static quantization **/
if ((FLAGS_q.compare("static") != 0) && (FLAGS_q.compare("user") != 0)) {
throw std::logic_error("Quantization mode not supported (static, user).");
}
if (FLAGS_qb != 16 && FLAGS_qb != 8) {
throw std::logic_error("Only 8 or 16 bits supported.");
}
if (FLAGS_cw_r < 0) {
throw std::logic_error("Invalid value for 'cw_r' argument. It must be greater than or equal to 0");
}
if (FLAGS_cw_l < 0) {
throw std::logic_error("Invalid value for 'cw_l' argument. It must be greater than or equal to 0");
}
if (FLAGS_pwl_me < 0.0 || FLAGS_pwl_me > 100.0) {
throw std::logic_error("Invalid value for 'pwl_me' argument. It must be in the range [0.0, 100.0]");
}
return true;
}


@@ -1,542 +0,0 @@
// Copyright (C) 2018-2023 Intel Corporation
// SPDX-License-Identifier: Apache-2.0
//
#pragma once
#include <cnpy.h>
#include <samples/common.hpp>
#define MAX_SCORE_DIFFERENCE 0.0001f // max score difference for frame error threshold
#define MAX_VAL_2B_FEAT 16384 // max to find scale factor
typedef std::chrono::high_resolution_clock Time;
typedef std::chrono::duration<double, std::ratio<1, 1000>> ms;
typedef std::chrono::duration<float> fsec;
/**
* @brief struct to store score error
*/
struct ScoreErrorT {
uint32_t numScores;
uint32_t numErrors;
float threshold;
float maxError;
float rmsError;
float sumError;
float sumRmsError;
float sumSquaredError;
float maxRelError;
float sumRelError;
float sumSquaredRelError;
float maxAbsRefScore;
float sumAbsRefScore;
};
/**
* @brief struct to store infer request data per frame
*/
struct InferRequestStruct {
ov::InferRequest inferRequest;
int frameIndex;
uint32_t numFramesThisBatch;
};
/**
* @brief Check number of input files and model network inputs
 * @param numInputs number of model inputs
* @param numInputFiles number of input files
* @return none.
*/
void check_number_of_inputs(size_t numInputs, size_t numInputFiles) {
if (numInputs != numInputFiles) {
throw std::logic_error("Number of network inputs (" + std::to_string(numInputs) +
")"
" is not equal to number of input files (" +
std::to_string(numInputFiles) + ")");
}
}
/**
* @brief Get scale factor for quantization
* @param ptrFloatMemory pointer to float memory with speech feature vector
* @param targetMax max scale factor
* @param numElements number of elements in speech feature vector
* @return scale factor
*/
float scale_factor_for_quantization(void* ptrFloatMemory, float targetMax, uint32_t numElements) {
float* ptrFloatFeat = reinterpret_cast<float*>(ptrFloatMemory);
float max = 0.0;
float scaleFactor;
for (uint32_t i = 0; i < numElements; i++) {
if (fabs(ptrFloatFeat[i]) > max) {
max = fabs(ptrFloatFeat[i]);
}
}
if (max == 0) {
scaleFactor = 1.0;
} else {
scaleFactor = targetMax / max;
}
return (scaleFactor);
}
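As a quick standalone check of the computation above (`scale_for` is an illustrative name, not part of the sample): the factor maps the largest absolute feature value onto `targetMax`, and falls back to 1.0 for an all-zero vector.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>

// Mirrors scale_factor_for_quantization(): targetMax / max|x|, or 1.0 for all-zero input.
inline float scale_for(const float* feat, float target_max, uint32_t n) {
    float max_abs = 0.0f;
    for (uint32_t i = 0; i < n; ++i) {
        max_abs = std::max(max_abs, std::fabs(feat[i]));
    }
    return max_abs == 0.0f ? 1.0f : target_max / max_abs;
}
```

With `targetMax` set to `MAX_VAL_2B_FEAT` (16384), a feature vector whose largest magnitude is 2.0 gets a scale factor of 8192.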
/**
* @brief Clean score error
* @param error pointer to score error struct
* @return none.
*/
void clear_score_error(ScoreErrorT* error) {
error->numScores = 0;
error->numErrors = 0;
error->maxError = 0.0;
error->rmsError = 0.0;
error->sumError = 0.0;
error->sumRmsError = 0.0;
error->sumSquaredError = 0.0;
error->maxRelError = 0.0;
error->sumRelError = 0.0;
error->sumSquaredRelError = 0.0;
error->maxAbsRefScore = 0.0;
error->sumAbsRefScore = 0.0;
}
/**
* @brief Update total score error
* @param error pointer to score error struct
* @param totalError pointer to total score error struct
* @return none.
*/
void update_score_error(ScoreErrorT* error, ScoreErrorT* totalError) {
totalError->numErrors += error->numErrors;
totalError->numScores += error->numScores;
totalError->sumRmsError += error->rmsError;
totalError->sumError += error->sumError;
totalError->sumAbsRefScore += error->sumAbsRefScore;
totalError->sumSquaredError += error->sumSquaredError;
if (error->maxError > totalError->maxError) {
totalError->maxError = error->maxError;
}
if (error->maxAbsRefScore > totalError->maxAbsRefScore) {
totalError->maxAbsRefScore = error->maxAbsRefScore;
}
totalError->sumRelError += error->sumRelError;
totalError->sumSquaredRelError += error->sumSquaredRelError;
if (error->maxRelError > totalError->maxRelError) {
totalError->maxRelError = error->maxRelError;
}
}
/**
 * @brief Compare score errors; arrays must have the same length
 * @param ptrScoreArray - pointer to the score array
 * @param ptrRefScoreArray - pointer to the reference score array to compare against
 * @param scoreError - pointer to a score error struct to store the result
 * @param numRows - number of rows in the score arrays
 * @param numColumns - number of columns in the score arrays
 * @return none.
*/
void compare_scores(float* ptrScoreArray,
void* ptrRefScoreArray,
ScoreErrorT* scoreError,
uint32_t numRows,
uint32_t numColumns) {
uint32_t numErrors = 0;
clear_score_error(scoreError);
float* A = ptrScoreArray;
float* B = reinterpret_cast<float*>(ptrRefScoreArray);
for (uint32_t i = 0; i < numRows; i++) {
for (uint32_t j = 0; j < numColumns; j++) {
float score = A[i * numColumns + j];
float refscore = B[i * numColumns + j];
float abs_refscore = fabs(refscore);
float error = fabs(refscore - score);
float rel_error = error / (static_cast<float>(abs_refscore) + 1e-20f);
float squared_error = error * error;
float squared_rel_error = rel_error * rel_error;
scoreError->numScores++;
scoreError->sumError += error;
scoreError->sumAbsRefScore += abs_refscore;
scoreError->sumSquaredError += squared_error;
if (abs_refscore > scoreError->maxAbsRefScore) {
scoreError->maxAbsRefScore = abs_refscore;
}
if (error > scoreError->maxError) {
scoreError->maxError = error;
}
scoreError->sumRelError += rel_error;
scoreError->sumSquaredRelError += squared_rel_error;
if (rel_error > scoreError->maxRelError) {
scoreError->maxRelError = rel_error;
}
if (error > scoreError->threshold) {
numErrors++;
}
}
}
scoreError->rmsError = sqrt(scoreError->sumSquaredError / (numRows * numColumns));
scoreError->sumRmsError += scoreError->rmsError;
scoreError->numErrors = numErrors;
}
/**
 * @brief Get total standard deviation of the error
 * @param error score error struct
 * @return standard deviation of the error
 */
float std_dev_error(ScoreErrorT error) {
return (sqrt(error.sumSquaredError / error.numScores -
(error.sumError / error.numScores) * (error.sumError / error.numScores)));
}
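The running-sum form used here is the usual variance identity Var = E[e²] − (E[e])², computed from the accumulated sums. A standalone sketch (`stdev_from_sums` is an illustrative name):

```cpp
#include <cassert>
#include <cmath>

// Standard deviation from running sums, as in std_dev_error():
// sqrt(sumSquared / n - (sum / n)^2).
inline float stdev_from_sums(float sum_squared, float sum, float n) {
    return std::sqrt(sum_squared / n - (sum / n) * (sum / n));
}
```

For errors {1, 1, 1} the spread is 0; for {0, 2} the mean is 1 and the deviation is 1.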
/**
* @brief Print a report on the statistical score error
* @param totalError reference to a total score error struct
* @param framesNum number of frames in utterance
* @param stream output stream
* @return none.
*/
void print_reference_compare_results(ScoreErrorT const& totalError, size_t framesNum, std::ostream& stream) {
stream << " max abs ref score: " << totalError.maxAbsRefScore << std::endl;
stream << " avg abs ref score: " << totalError.sumAbsRefScore / totalError.numScores << std::endl;
stream << " max error: " << totalError.maxError << std::endl;
stream << " avg error: " << totalError.sumError / totalError.numScores << std::endl;
stream << " avg rms error: " << totalError.sumRmsError / framesNum << std::endl;
stream << " stdev error: " << std_dev_error(totalError) << std::endl << std::endl;
stream << std::endl;
}
/**
* @brief Print a report on the performance counts
* @param utterancePerfMap reference to a map to store performance counters
* @param numberOfFrames number of frames
* @param stream output stream
* @param fullDeviceName full device name string
* @param numberOfFramesOnHw number of frames delivered to GNA HW
 * @param FLAGS_d device flag
* @return none.
*/
void print_performance_counters(std::map<std::string, ov::ProfilingInfo> const& utterancePerfMap,
size_t numberOfFrames,
std::ostream& stream,
std::string fullDeviceName,
const uint64_t numberOfFramesOnHw,
std::string FLAGS_d) {
#if !defined(__arm__) && !defined(_M_ARM) && !defined(__aarch64__) && !defined(_M_ARM64)
std::ios::fmtflags fmt(std::cout.flags());
stream << std::endl << "Performance counts:" << std::endl;
stream << std::setw(10) << std::right << ""
<< "Counter descriptions";
stream << std::setw(22) << "Utt scoring time";
stream << std::setw(18) << "Avg infer time";
stream << std::endl;
stream << std::setw(46) << "(ms)";
stream << std::setw(24) << "(us per call)";
stream << std::endl;
// if GNA HW counters
for (const auto& it : utterancePerfMap) {
std::string const& counter_name = it.first;
float current_units_us = static_cast<float>(it.second.real_time.count());
float call_units_us = 0;
if (numberOfFrames == 0) {
throw std::logic_error("Number of frames is 0, division by zero.");
} else {
call_units_us = current_units_us / numberOfFrames;
}
if (FLAGS_d.find("GNA") != std::string::npos) {
stream << std::setw(30) << std::left << counter_name.substr(4, counter_name.size() - 1);
} else {
stream << std::setw(30) << std::left << counter_name;
}
stream << std::setw(16) << std::right << current_units_us / 1000;
stream << std::setw(21) << std::right << call_units_us;
stream << std::endl;
}
stream << std::endl;
std::cout << std::endl;
std::cout << "Full device name: " << fullDeviceName << std::endl;
std::cout << std::endl;
stream << "Number of frames delivered to GNA HW: " << numberOfFramesOnHw;
stream << "/" << numberOfFrames;
stream << std::endl;
std::cout.flags(fmt);
#endif
}
/**
* @brief Get performance counts
* @param request reference to infer request
* @param perfCounters reference to a map to save performance counters
* @return none.
*/
void get_performance_counters(ov::InferRequest& request, std::map<std::string, ov::ProfilingInfo>& perfCounters) {
auto retPerfCounters = request.get_profiling_info();
for (const auto& element : retPerfCounters) {
perfCounters[element.node_name] = element;
}
}
/**
* @brief Summarize performance counts and total number of frames executed on the GNA HW device
* @param perfCounters reference to a map to get performance counters
* @param totalPerfCounters reference to a map to save total performance counters
* @param totalRunsOnHw reference to a total number of frames computed on GNA HW
* @return none.
*/
void sum_performance_counters(std::map<std::string, ov::ProfilingInfo> const& perfCounters,
std::map<std::string, ov::ProfilingInfo>& totalPerfCounters,
uint64_t& totalRunsOnHw) {
auto runOnHw = false;
for (const auto& pair : perfCounters) {
totalPerfCounters[pair.first].real_time += pair.second.real_time;
runOnHw |= pair.second.real_time > std::chrono::microseconds(0); // if realTime is above zero, that means that
// a primitive was executed on the device
}
totalRunsOnHw += runOnHw;
}
/**
 * @brief Split a string by a delimiter
 * @param s input string
 * @param delim delimiter
 * @return vector of chunks
 */
std::vector<std::string> split(const std::string& s, char delim) {
std::vector<std::string> result;
std::stringstream ss(s);
std::string item;
while (getline(ss, item, delim)) {
result.push_back(item);
}
return result;
}
/**
 * @brief Concatenate strings using a delimiter
 * @param chunks input chunks
 * @param delim delimiter
 * @return concatenated string
 */
std::string concat(const std::vector<std::string>& chunks, char delim) {
std::stringstream ss;
for (auto&& chunk : chunks) {
if (!ss.str().empty()) {
ss << delim;
}
ss << chunk;
}
return ss.str();
}
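`split` and `concat` are near-inverses: for inputs without leading empty tokens or a trailing delimiter, concatenating the chunks reproduces the original string. A standalone sketch mirroring the two helpers (`split_str`/`concat_str` are illustrative names):

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Mirror of split(): tokenize on a single-character delimiter.
inline std::vector<std::string> split_str(const std::string& s, char delim) {
    std::vector<std::string> result;
    std::stringstream ss(s);
    std::string item;
    while (std::getline(ss, item, delim)) {
        result.push_back(item);
    }
    return result;
}

// Mirror of concat(): join chunks, inserting the delimiter between them.
inline std::string concat_str(const std::vector<std::string>& chunks, char delim) {
    std::string out;
    for (const auto& chunk : chunks) {
        if (!out.empty()) {
            out += delim;
        }
        out += chunk;
    }
    return out;
}
```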
/**
 * @brief Check whether a name is present in the node vector
 * @param nodes nodes
 * @param node_name name to look for
 * @return true if the name is found; throws std::logic_error otherwise
 */
bool check_name(const ov::OutputVector& nodes, const std::string& node_name) {
std::vector<std::string> any_names;
bool count = false;
for (auto& node : nodes) {
any_names.push_back(node.get_any_name());
auto names = node.get_names();
count = std::count(names.begin(), names.end(), node_name);
if (count)
break;
}
if (!count) {
std::stringstream ss;
ss << "Incorrect node name '" + node_name << "'! ";
ss << "Try one of the following names: [ ";
for (auto&& name : any_names) {
ss << name << " ";
}
ss << "]";
throw std::logic_error(ss.str());
}
return count;
}
/**
 * @brief Strip the input name to exclude the ":port" suffix
 * @param name input name
 * @return stripped input name
 */
std::string strip_name(const std::string& name) {
return {name, 0, name.rfind(':')};
}
/**
* @brief Parse scale factors per input
* Format : <input_name1>=<sf1>,<input2>=<sf2> or just <sf>
* @param inputs model inputs
* @param values_string values_string input string
* @return map of scale factors per input
*/
std::map<std::string, float> parse_scale_factors(const ov::OutputVector& inputs, const std::string& values_string) {
auto get_sf = [&](const std::string& sf_string, const std::string& input_name = "") -> float {
float sf;
try {
sf = std::stof(sf_string);
} catch (...) {
throw std::logic_error("Can't get float scale factor from: " + sf_string);
}
if (sf <= 0.0f) {
throw std::logic_error("Scale factor for input '" + input_name +
"' is out of range (must be positive).");
}
return sf;
};
std::map<std::string, float> result;
auto scale_factor_strings = split(values_string, ',');
for (auto& scale_factor_string : scale_factor_strings) {
auto values = split(scale_factor_string, '=');
if (values.size() == 1) {
if (scale_factor_strings.size() != 1) {
throw std::logic_error("Unrecognized scale factor format! "
"Please specify <input_name1>=<sf1>,<input_name2>=<sf2> or "
"just <sf> to be applied to all inputs");
}
auto scale_factor = get_sf(values.at(0));
for (auto& input : inputs) {
result[input.get_any_name()] = scale_factor;
}
} else if (values.size() > 0) {
auto sf_string = values.back();
values.pop_back();
auto input_name = values.back();
check_name(inputs, input_name);
result[input_name] = get_sf(sf_string, input_name);
}
}
return result;
}
/**
 * @brief Parse a comma-separated string of file names into a vector
 * @param str file names separated by commas
 * @return vector of file names
 */
std::vector<std::string> convert_str_to_vector(std::string str) {
std::vector<std::string> blobName;
if (!str.empty()) {
size_t pos_last = 0;
size_t pos_next = 0;
while ((pos_next = str.find(",", pos_last)) != std::string::npos) {
blobName.push_back(str.substr(pos_last, pos_next - pos_last));
pos_last = pos_next + 1;
}
blobName.push_back(str.substr(pos_last));
}
return blobName;
}
/**
* @brief Parse layout string like "input0[value0],input1[value1]" or "[value]" (applied to all inputs)
* @param layout_string input names with layout values
* @param input_info reference to vector of inputs
* @return map of inputs with layout values
*/
std::map<std::string, std::string> parse_input_layouts(const std::string& layout_string,
const std::vector<ov::Output<ov::Node>>& input_info) {
// Parse parameter string like "input0[value0],input1[value1]" or "[value]" (applied to all
// inputs)
std::map<std::string, std::string> return_value;
std::string search_string = layout_string;
auto start_pos = search_string.find_first_of('[');
auto input_name = search_string.substr(0, start_pos);
while (start_pos != std::string::npos) {
auto end_pos = search_string.find_first_of(']');
if (end_pos == std::string::npos)
break;
if (start_pos)
input_name = search_string.substr(0, start_pos);
auto input_value = search_string.substr(start_pos + 1, end_pos - start_pos - 1);
if (!input_name.empty()) {
return_value[input_name] = input_value;
} else {
for (auto& item : input_info) {
return_value[item.get_any_name()] = input_value;
}
}
search_string = search_string.substr(end_pos + 1);
if (search_string.empty() || (search_string.front() != ',' && search_string.front() != '['))
break;
if (search_string.front() == ',')
search_string = search_string.substr(1);
start_pos = search_string.find_first_of('[');
}
if (!search_string.empty())
throw std::logic_error("Can't parse input parameter string: " + layout_string);
return return_value;
}
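A simplified standalone sketch of the named-input branch of the parser above (`parse_layouts` is an illustrative name; the broadcast "[value]" case, which needs the model's input list, is omitted):

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>

// "input0[NC],input1[NCHW]" -> {{"input0","NC"}, {"input1","NCHW"}}.
inline std::map<std::string, std::string> parse_layouts(std::string s) {
    std::map<std::string, std::string> result;
    while (!s.empty()) {
        auto open = s.find('[');
        auto close = s.find(']');
        if (open == std::string::npos || close == std::string::npos || close < open) {
            throw std::logic_error("Can't parse layout string");
        }
        result[s.substr(0, open)] = s.substr(open + 1, close - open - 1);
        s = s.substr(close + 1);
        if (!s.empty() && s.front() == ',') {
            s = s.substr(1);  // skip the separator between entries
        }
    }
    return result;
}
```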
/**
 * @brief Parse parameters for inputs/outputs/reference, such as "<name1>=<file1.ark/.npz>,<name2>=<file2.ark/.npz>" or
 * "<file.ark/.npz>" in the case of one input/output/reference.
 * @note Example result for the given data: {"<file1.ark/.npz>,<file2.ark/.npz>",{"<name1>","<name2>"}}
* @param file_paths_string input/output path
* @return pair of filename and vector of layers names
*/
std::pair<std::string, std::vector<std::string>> parse_parameters(const std::string& file_paths_string) {
auto search_string = file_paths_string;
char comma_delim = ',';
char equal_delim = '=';
std::string filename = "";
std::vector<std::string> layers_names;
std::vector<std::string> filenames;
if (!std::count(search_string.begin(), search_string.end(), comma_delim) &&
!std::count(search_string.begin(), search_string.end(), equal_delim)) {
return {search_string, layers_names};
}
search_string += comma_delim;
std::vector<std::string> splitted = split(search_string, comma_delim);
for (size_t j = 0; j < splitted.size(); j++) {
auto equal_delim_pos = splitted[j].find_first_of(equal_delim);
if (equal_delim_pos != std::string::npos) {
layers_names.push_back(splitted[j].substr(0, equal_delim_pos));
filenames.push_back(splitted[j].substr(equal_delim_pos + 1, std::string::npos));
}
}
for (std::vector<std::string>::const_iterator name = filenames.begin(); name != filenames.end(); ++name) {
filename += *name;
if (name != filenames.end() - 1)
filename += comma_delim;
}
return {filename, layers_names};
}
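A condensed standalone sketch of the same splitting logic (`parse_params` is an illustrative name; it covers the "<name>=<file>" list and bare-file cases described in the comment above):

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// "<name1>=<file1>,<name2>=<file2>" -> {"<file1>,<file2>", {"<name1>", "<name2>"}};
// a bare "<file>" passes through unchanged with no names.
inline std::pair<std::string, std::vector<std::string>> parse_params(const std::string& s) {
    if (s.find(',') == std::string::npos && s.find('=') == std::string::npos) {
        return {s, {}};
    }
    std::vector<std::string> names;
    std::string files;
    std::stringstream ss(s);
    std::string token;
    while (std::getline(ss, token, ',')) {
        auto eq = token.find('=');
        if (eq == std::string::npos) {
            continue;  // tokens without '=' are skipped, as in the original
        }
        names.push_back(token.substr(0, eq));
        if (!files.empty()) {
            files += ',';
        }
        files += token.substr(eq + 1);
    }
    return {files, names};
}
```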
/**
 * @brief Split "<name>:<port>" strings into (name, port) pairs
 * @param full_names output names with ports
 * @return vector of (name, port) pairs
 */
std::vector<std::pair<std::string, size_t>> parse_to_extract_port(const std::vector<std::string>& full_names) {
std::vector<std::pair<std::string, size_t>> result;
for (const auto& full_name : full_names) {
auto pos_layer = full_name.rfind(":");
if (pos_layer == std::string::npos) {
throw std::logic_error("Output " + full_name + " doesn't have a port");
}
const auto name = full_name.substr(0, pos_layer);
try {
const size_t port = std::stoul(full_name.substr(pos_layer + 1));
result.push_back({name, port});
} catch (const std::exception&) {
throw std::logic_error("Ports should have integer type");
}
}
return result;
}
/**
 * @brief Return the first vector if it is non-empty, otherwise the second
 * @param first first candidate
 * @param second second candidate
 * @return reference to the first non-empty vector
 */
const std::vector<std::string>& get_first_non_empty(const std::vector<std::string>& first,
const std::vector<std::string>& second) {
if (!first.empty())
return first;
return second;
}


@@ -1,43 +0,0 @@
# Automatic Speech Recognition Python Sample
> **NOTE**: This sample is deprecated and will no longer be maintained after OpenVINO 2023.2 (LTS), due to the outdated state of the sample and its extensive use of GNA, which will not be supported by OpenVINO beyond 2023.2.
This sample demonstrates how to do a Synchronous Inference of acoustic model based on Kaldi\* neural models and speech feature vectors.
The sample works with Kaldi ARK or NumPy* uncompressed NPZ files, so it does not cover an end-to-end speech recognition scenario (speech to text): additional preprocessing (feature extraction) is required to get a feature vector from a speech signal, and postprocessing (decoding) to produce text from scores.
For more detailed information on how this sample works, check the dedicated [article](https://docs.openvino.ai/2023.2/openvino_inference_engine_ie_bridges_python_sample_speech_sample_README.html)
## Requirements
| Options | Values |
| ----------------------------| --------------------------------------------------------------------------------------------------------------------------------------------------------------|
| Validated Models | Acoustic model based on Kaldi* neural models (see |
| | [Model Preparation](https://docs.openvino.ai/2023.2/openvino_inference_engine_ie_bridges_python_sample_speech_sample_README.html#model-preparation) section) |
| Model Format | OpenVINO™ toolkit Intermediate Representation (.xml + .bin) |
| Supported devices | See [Execution Modes](https://docs.openvino.ai/2023.2/openvino_inference_engine_ie_bridges_python_sample_speech_sample_README.html#execution-modes) |
| | section below and [List Supported Devices](https://docs.openvino.ai/2023.2/openvino_docs_OV_UG_supported_plugins_Supported_Devices.html) |
| Other language realization | [C++](https://docs.openvino.ai/2023.2/openvino_inference_engine_samples_speech_sample_README.html) |
The Automatic Speech Recognition Python sample application demonstrates how to use the following Python APIs:
| Feature | API | Description |
| -------------------------| ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-----------------------------------------------------------------------|
| Import/Export Model | [openvino.runtime.Core.import_model](https://docs.openvino.ai/2023.2/api/ie_python_api/_autosummary/openvino.runtime.Core.html#openvino.runtime.Core.import_model), | |
| | [openvino.runtime.CompiledModel.export_model](https://docs.openvino.ai/2023.2/api/ie_python_api/_autosummary/openvino.runtime.CompiledModel.html#openvino.runtime.CompiledModel.export_model) | The GNA plugin supports loading and saving of the GNA-optimized model |
| Model Operations | [openvino.runtime.Model.add_outputs](https://docs.openvino.ai/2023.2/api/ie_python_api/_autosummary/openvino.runtime.Model.html#openvino.runtime.Model.add_outputs) , | |
| | [openvino.runtime.set_batch](https://docs.openvino.ai/2023.2/api/ie_python_api/_autosummary/openvino.runtime.html#openvino.runtime.set_batch), | |
| | [openvino.runtime.CompiledModel.inputs](https://docs.openvino.ai/2023.2/api/ie_python_api/_autosummary/openvino.runtime.CompiledModel.html#openvino.runtime.CompiledModel.inputs), | |
| | [openvino.runtime.CompiledModel.outputs](https://docs.openvino.ai/2023.2/api/ie_python_api/_autosummary/openvino.runtime.CompiledModel.html#openvino.runtime.CompiledModel.outputs), | |
| | [openvino.runtime.ConstOutput.any_name](https://docs.openvino.ai/2023.2/api/ie_python_api/_autosummary/openvino.runtime.ConstOutput.html#openvino.runtime.ConstOutput.any_name) | Managing of model: configure batch_size, input and output tensors |
| Synchronous Infer | [openvino.runtime.CompiledModel.create_infer_request](https://docs.openvino.ai/2023.2/api/ie_python_api/_autosummary/openvino.runtime.CompiledModel.html#openvino.runtime.CompiledModel.create_infer_request), | |
| | [openvino.runtime.InferRequest.infer](https://docs.openvino.ai/2023.2/api/ie_python_api/_autosummary/openvino.runtime.InferRequest.html#openvino.runtime.InferRequest.infer) | Do synchronous inference |
| InferRequest Operations | [openvino.runtime.InferRequest.get_input_tensor](https://docs.openvino.ai/2023.2/api/ie_python_api/_autosummary/openvino.runtime.InferRequest.html#openvino.runtime.InferRequest.get_input_tensor), | |
| | [openvino.runtime.InferRequest.model_outputs](https://docs.openvino.ai/2023.2/api/ie_python_api/_autosummary/openvino.runtime.InferRequest.html#openvino.runtime.InferRequest.model_outputs), | |
| | [openvino.runtime.InferRequest.model_inputs](https://docs.openvino.ai/2023.2/api/ie_python_api/_autosummary/openvino.runtime.InferRequest.html#openvino.runtime.InferRequest.model_inputs), | Get info about model using infer request API |
| InferRequest Operations | [openvino.runtime.InferRequest.query_state](https://docs.openvino.ai/2023.2/api/ie_python_api/_autosummary/openvino.runtime.InferRequest.html#openvino.runtime.InferRequest.query_state), | |
| | [openvino.runtime.VariableState.reset](https://docs.openvino.ai/2023.2/api/ie_python_api/_autosummary/openvino.inference_engine.VariableState.html#openvino.inference_engine.VariableState.reset) | Gets and resets CompiledModel state control |
| Profiling | [openvino.runtime.InferRequest.profiling_info](https://docs.openvino.ai/2023.2/api/ie_python_api/_autosummary/openvino.runtime.InferRequest.html#openvino.runtime.InferRequest.profiling_info), | |
| | [openvino.runtime.ProfilingInfo.real_time](https://docs.openvino.ai/2023.2/api/ie_python_api/_autosummary/openvino.runtime.ProfilingInfo.html#openvino.runtime.ProfilingInfo.real_time) | Get infer request profiling info |
Basic OpenVINO™ Runtime API is covered by [Hello Classification Python* Sample](https://docs.openvino.ai/2023.2/openvino_inference_engine_ie_bridges_python_sample_hello_classification_README.html).


@@ -1,142 +0,0 @@
# -*- coding: utf-8 -*-
# Copyright (C) 2018-2023 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
import argparse
import re
from typing import List, Tuple, Union
def build_arg_parser() -> argparse.ArgumentParser:
    """Create and return argument parser."""
    parser = argparse.ArgumentParser(add_help=False)
    args = parser.add_argument_group('Options')
    model = parser.add_mutually_exclusive_group(required=True)
    model.add_argument('-m', '--model', type=str,
                       help='Path to an .xml file with a trained model (required if -rg is missing).')
    model.add_argument('-rg', '--import_gna_model', type=str,
                       help='Read GNA model from file using path/filename provided (required if -m is missing).')
    args.add_argument('-h', '--help', action='help', help='Show this help message and exit.')
    args.add_argument('-i', '--input', required=True, type=str,
                      help='Required. Path(s) to input file(s). '
                           'Usage for a single file/layer: <input_file.ark> or <input_file.npz>. '
                           'Example of usage for several files/layers: <layer1>:<port_num1>=<input_file1.ark>,<layer2>:<port_num2>=<input_file2.ark>.')
    args.add_argument('-o', '--output', type=str,
                      help='Optional. Output file name(s) to save scores (inference results). '
                           'Usage for a single file/layer: <output_file.ark> or <output_file.npz>. '
                           'Example of usage for several files/layers: <layer1>:<port_num1>=<output_file1.ark>,<layer2>:<port_num2>=<output_file2.ark>.')
    args.add_argument('-r', '--reference', type=str,
                      help='Optional. Read reference score file(s) and compare inference results with reference scores. '
                           'Usage for a single file/layer: <reference_file.ark> or <reference_file.npz>. '
                           'Example of usage for several files/layers: <layer1>:<port_num1>=<reference_file1.ark>,<layer2>:<port_num2>=<reference_file2.ark>.')
    args.add_argument('-d', '--device', default='CPU', type=str,
                      help='Optional. Specify a target device to infer on. '
                           'CPU, GPU, NPU, GNA_AUTO, GNA_HW, GNA_SW_FP32, GNA_SW_EXACT and HETERO with combination of GNA'
                           ' as the primary device and CPU as a secondary (e.g. HETERO:GNA,CPU) are supported. '
                           'The sample will look for a suitable plugin for device specified. Default value is CPU.')
    args.add_argument('-bs', '--batch_size', type=int, choices=range(1, 9), metavar='[1-8]',
                      help='Optional. Batch size 1-8.')
    args.add_argument('-layout', type=str,
                      help='Optional. Custom layout in format: "input0[value0],input1[value1]" or "[value]" (applied to all inputs)')
    args.add_argument('-qb', '--quantization_bits', default=16, type=int, choices=(8, 16), metavar='[8, 16]',
                      help='Optional. Weight resolution in bits for GNA quantization: 8 or 16 (default 16).')
    args.add_argument('-sf', '--scale_factor', type=str,
                      help='Optional. User-specified input scale factor for GNA quantization. '
                           'If the model contains multiple inputs, provide scale factors by separating them with commas. '
                           'For example: <layer1>:<sf1>,<layer2>:<sf2> or just <sf> to be applied to all inputs.')
    args.add_argument('-wg', '--export_gna_model', type=str,
                      help='Optional. Write GNA model to file using path/filename provided.')
    args.add_argument('-we', '--export_embedded_gna_model', type=str,
                      help='Optional. Write GNA embedded model to file using path/filename provided.')
    args.add_argument('-we_gen', '--embedded_gna_configuration', default='GNA1', type=str, metavar='[GNA1, GNA3]',
                      help='Optional. GNA generation configuration string for embedded export. '
                           'Can be GNA1 (default) or GNA3.')
    args.add_argument('--exec_target', default='', type=str, choices=('GNA_TARGET_2_0', 'GNA_TARGET_3_0'),
                      metavar='[GNA_TARGET_2_0, GNA_TARGET_3_0]',
                      help='Optional. Specify GNA execution target generation. '
                           'By default, generation corresponds to the GNA HW available in the system '
                           'or the latest fully supported generation by the software. '
                           "See the GNA Plugin's GNA_EXEC_TARGET config option description.")
    args.add_argument('-pc', '--performance_counter', action='store_true',
                      help='Optional. Enables performance report (specify -a to ensure arch accurate results).')
    args.add_argument('-a', '--arch', default='CORE', type=str.upper, choices=('CORE', 'ATOM'), metavar='[CORE, ATOM]',
                      help='Optional. Specify architecture. CORE, ATOM with the combination of -pc.')
    args.add_argument('-cw_l', '--context_window_left', type=int, default=0,
                      help='Optional. Number of frames for left context windows (default is 0). '
                           'Works only with context window models. '
                           'If you use the cw_l or cw_r flag, then batch size argument is ignored.')
    args.add_argument('-cw_r', '--context_window_right', type=int, default=0,
                      help='Optional. Number of frames for right context windows (default is 0). '
                           'Works only with context window models. '
                           'If you use the cw_l or cw_r flag, then batch size argument is ignored.')
    args.add_argument('-pwl_me', type=float, default=1.0,
                      help='Optional. The maximum percent of error for PWL function. '
                           'The value must be in <0, 100> range. The default value is 1.0.')
    return parser
def parse_arg_with_names(arg_string: Union[str, None], separator: str = '=') -> Tuple[List[str], List[str]]:
keys = []
values = []
if isinstance(arg_string, str):
for parameter in re.split(', |,', arg_string):
if separator in parameter:
key, value = parameter.split(separator)
keys.append(key)
values.append(value)
else:
values.append(parameter)
return keys, values
def check_arg_with_names(arg: Tuple[List[str], List[str]]) -> bool:
return len(arg[0]) == 0 and len(arg[1]) > 1
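As a standalone sketch of how these two helpers behave (the logic mirrors the code above; the input names are invented for illustration), parsing splits a comma-separated argument into parallel key/value lists:

```python
import re

def parse_arg_with_names(arg_string, separator='='):
    # Same splitting logic as above: "k=v,k2=v2" -> (['k', 'k2'], ['v', 'v2'])
    keys, values = [], []
    if isinstance(arg_string, str):
        for parameter in re.split(', |,', arg_string):
            if separator in parameter:
                key, value = parameter.split(separator)
                keys.append(key)
                values.append(value)
            else:
                values.append(parameter)
    return keys, values

keys, values = parse_arg_with_names('Input1=file1.ark,Input2=file2.ark')
# keys == ['Input1', 'Input2'], values == ['file1.ark', 'file2.ark']
bare = parse_arg_with_names('file.ark')
# bare == ([], ['file.ark']) -- no names given, one bare value
```

The "invalid" shape that `check_arg_with_names` rejects is exactly the bare form with more than one value, i.e. multiple files given without naming the inputs they belong to.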
def parse_args(separator: str = '=') -> argparse.Namespace:
"""Parse and validate command-line arguments."""
parser = build_arg_parser()
args = parser.parse_args()
if args.context_window_left < 0:
parser.error('Invalid value for argument -cw_l/--context_window_left: Must be an integer >= 0.')
if args.context_window_right < 0:
parser.error('Invalid value for argument -cw_r/--context_window_right: Must be an integer >= 0.')
if args.pwl_me < 0.0 or args.pwl_me > 100.0:
parser.error('Invalid value for -pwl_me argument. It must be in the range from 0.0 to 100.0.')
args.input = parse_arg_with_names(args.input, separator)
if check_arg_with_names(args.input):
parser.error(
'Invalid format for -i/--input argument. Please specify the parameter like this '
f'<input_name1>{separator}<file1.ark/.npz>,<input_name2>{separator}<file2.ark/.npz> or just <file.ark/.npz> in case of one input.',
)
args.scale_factor = parse_arg_with_names(args.scale_factor, separator)
if check_arg_with_names(args.scale_factor):
parser.error(
'Invalid format for -sf/--scale_factor argument. Please specify the parameter like this '
f'<input_name1>{separator}<sf1>,<input_name2>{separator}<sf2> or just <sf> to be applied to all inputs.',
)
args.output = parse_arg_with_names(args.output, separator)
if check_arg_with_names(args.output):
parser.error(
'Invalid format for -o/--output argument. Please specify the parameter like this '
f'<output_name1>{separator}<output1.ark/.npz>,<output_name2>{separator}<output2.ark/.npz> or just <output.ark/.npz> in case of one output.',
)
args.reference = parse_arg_with_names(args.reference, separator)
if check_arg_with_names(args.reference):
parser.error(
'Invalid format for -r/--reference argument. Please specify the parameter like this '
f'<output_name1>{separator}<reference1.ark/.npz>,<output_name2>{separator}<reference2.ark/.npz> or <reference.ark/.npz> in case of one output.',
)
return args


@@ -1,112 +0,0 @@
# -*- coding: utf-8 -*-
# Copyright (C) 2018-2023 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
import logging as log
import sys
from typing import IO, Any, List, NamedTuple
import numpy as np
class FileData(NamedTuple):
keys: List[str]
utterances: List[np.ndarray]
def read_ark_file(file_name: str) -> FileData:
"""Read utterance matrices from a .ark file."""
def read_key(input_file: IO[Any]) -> str:
"""Read a identifier of utterance matrix."""
key = ''
char = input_file.read(1).decode()
while char not in ('', ' '):
key += char
char = input_file.read(1).decode()
return key
def read_matrix(input_file: IO[Any]) -> np.ndarray:
"""Read a utterance matrix."""
header = input_file.read(5).decode()
if 'FM' in header:
num_of_bytes = 4
dtype = 'float32'
elif 'DM' in header:
num_of_bytes = 8
dtype = 'float64'
else:
log.error(f'The utterance header "{header}" does not contain information about a type of elements.')
sys.exit(-7)
_, rows, _, cols = np.frombuffer(input_file.read(10), 'int8, int32, int8, int32')[0]
buffer = input_file.read(rows * cols * num_of_bytes)
vector = np.frombuffer(buffer, dtype)
matrix = np.reshape(vector, (rows, cols))
return matrix
keys = []
utterances = []
with open(file_name, 'rb') as input_file:
key = read_key(input_file)
while key:
utterances.append(read_matrix(input_file))
keys.append(key)
key = read_key(input_file)
return FileData(keys, utterances)
def write_ark_file(file_name: str, keys: List[str], utterances: List[np.ndarray]):
"""Write utterance matrices to a .ark file."""
with open(file_name, 'wb') as output_file:
for key, matrix in zip(keys, utterances):
# write an utterance key
output_file.write(key.encode())
output_file.write(' '.encode())
output_file.write('\0B'.encode())
# write a matrix precision
if matrix.dtype == 'float32':
output_file.write('FM '.encode())
elif matrix.dtype == 'float64':
output_file.write('DM '.encode())
# write a matrix shape
output_file.write('\04'.encode())
output_file.write(matrix.shape[0].to_bytes(4, byteorder='little', signed=False))
output_file.write('\04'.encode())
output_file.write(matrix.shape[1].to_bytes(4, byteorder='little', signed=False))
# write a matrix data
output_file.write(matrix.tobytes())
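The byte layout produced above can be exercised with a small in-memory roundtrip. This is a hedged sketch using only `io` and `numpy`; the key and matrix values are invented, and the header constants mirror the code above:

```python
import io

import numpy as np

# Layout written by write_ark_file:
#   b'<key> ' + b'\0B' + b'FM ' + b'\x04' + rows(int32 LE) + b'\x04' + cols(int32 LE) + raw float32 data
matrix = np.arange(6, dtype=np.float32).reshape(2, 3)
buf = io.BytesIO()
buf.write(b'utt1 ')                             # utterance key + space separator
buf.write(b'\0B' + b'FM ')                      # binary marker + float32 matrix type
buf.write(b'\x04' + (2).to_bytes(4, 'little'))  # row count
buf.write(b'\x04' + (3).to_bytes(4, 'little'))  # column count
buf.write(matrix.tobytes())

# Parse it back with the same reads used by read_ark_file
buf.seek(0)
key = b''
char = buf.read(1)
while char not in (b'', b' '):
    key += char
    char = buf.read(1)
header = buf.read(5).decode()                   # contains 'FM' for float32
_, rows, _, cols = np.frombuffer(buf.read(10), 'int8, int32, int8, int32')[0]
data = np.frombuffer(buf.read(rows * cols * 4), np.float32).reshape(rows, cols)
# key == b'utt1' and data equals matrix
```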
def read_utterance_file(file_name: str) -> FileData:
"""Read utterance matrices from a file."""
file_extension = file_name.split('.')[-1]
if file_extension == 'ark':
return read_ark_file(file_name)
elif file_extension == 'npz':
data = dict(np.load(file_name))
return FileData(list(data.keys()), list(data.values()))
else:
log.error(f'The file {file_name} cannot be read. The sample supports only .ark and .npz files.')
sys.exit(-1)
def write_utterance_file(file_name: str, keys: List[str], utterances: List[np.ndarray]):
"""Write utterance matrices to a file."""
file_extension = file_name.split('.')[-1]
if file_extension == 'ark':
write_ark_file(file_name, keys, utterances)
elif file_extension == 'npz':
np.savez(file_name, **dict(zip(keys, utterances)))
else:
log.error(f'The file {file_name} cannot be written. The sample supports only .ark and .npz files.')
sys.exit(-2)


@@ -1,285 +0,0 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
# Copyright (C) 2018-2023 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
import sys
from io import BytesIO
from timeit import default_timer
from typing import Dict
import numpy as np
import openvino as ov
from arg_parser import parse_args
from file_options import read_utterance_file, write_utterance_file
from utils import (GNA_ATOM_FREQUENCY, GNA_CORE_FREQUENCY,
calculate_scale_factor, compare_with_reference,
get_input_layouts, get_sorted_scale_factors, log,
set_scale_factors)
def do_inference(data: Dict[str, np.ndarray], infer_request: ov.InferRequest, cw_l: int = 0, cw_r: int = 0) -> np.ndarray:
"""Do a synchronous matrix inference."""
frames_to_infer = {}
result = {}
batch_size = infer_request.model_inputs[0].shape[0]
num_of_frames = next(iter(data.values())).shape[0]
for output in infer_request.model_outputs:
result[output.any_name] = np.ndarray((num_of_frames, np.prod(tuple(output.shape)[1:])))
for i in range(-cw_l, num_of_frames + cw_r, batch_size):
if i < 0:
index = 0
elif i >= num_of_frames:
index = num_of_frames - 1
else:
index = i
for _input in infer_request.model_inputs:
frames_to_infer[_input.any_name] = data[_input.any_name][index:index + batch_size]
num_of_frames_to_infer = len(frames_to_infer[_input.any_name])
# Pad the 2D array with [batch_size - num_of_frames_to_infer] zero rows
# so that a chunk with fewer frames than the batch size can still be inferred
frames_to_infer[_input.any_name] = np.pad(
frames_to_infer[_input.any_name],
[(0, batch_size - num_of_frames_to_infer), (0, 0)],
)
frames_to_infer[_input.any_name] = frames_to_infer[_input.any_name].reshape(_input.tensor.shape)
frame_results = infer_request.infer(frames_to_infer)
if i - cw_r < 0:
continue
for output in frame_results.keys():
vector_result = frame_results[output].reshape((batch_size, result[output.any_name].shape[1]))
result[output.any_name][i - cw_r:i - cw_r + batch_size] = vector_result[:num_of_frames_to_infer]
return result
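The zero-padding step inside the loop above can be shown in isolation. This is a minimal sketch with made-up shapes (a partial last chunk of 5 frames and a batch size of 8):

```python
import numpy as np

batch_size = 8
chunk = np.ones((5, 16), dtype=np.float32)  # partial last chunk: 5 frames of 16 features

# Append (batch_size - 5) zero rows so the infer request always sees a full batch;
# the padded rows are discarded from the results after inference
padded = np.pad(chunk, [(0, batch_size - chunk.shape[0]), (0, 0)])
# padded.shape == (8, 16); rows 5..7 are all zeros
```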
def main():
args = parse_args()
# --------------------------- Step 1. Initialize OpenVINO Runtime Core ------------------------------------------------
log.info('Creating OpenVINO Runtime Core')
core = ov.Core()
# --------------------------- Step 2. Read a model --------------------------------------------------------------------
if args.model:
log.info(f'Reading the model: {args.model}')
# (.xml and .bin files) or (.onnx file)
model = core.read_model(args.model)
# --------------------------- Step 3. Apply preprocessing -------------------------------------------------------------
model.add_outputs(args.output[0] + args.reference[0])
if args.layout:
layouts = get_input_layouts(args.layout, model.inputs)
ppp = ov.preprocess.PrePostProcessor(model)
for i in range(len(model.inputs)):
ppp.input(i).tensor().set_element_type(ov.Type.f32)
input_name = model.input(i).get_any_name()
if args.layout and input_name in layouts.keys():
ppp.input(i).tensor().set_layout(ov.Layout(layouts[input_name]))
ppp.input(i).model().set_layout(ov.Layout(layouts[input_name]))
for i in range(len(model.outputs)):
ppp.output(i).tensor().set_element_type(ov.Type.f32)
model = ppp.build()
if args.batch_size:
batch_size = args.batch_size if args.context_window_left == args.context_window_right == 0 else 1
if any((not _input.node.layout.empty for _input in model.inputs)):
ov.set_batch(model, batch_size)
else:
log.warning('Layout is not set for any input, so custom batch size is not set')
# ---------------------------Step 4. Configure plugin ---------------------------------------------------------
devices = args.device.replace('HETERO:', '').split(',')
plugin_config = {}
if 'GNA' in args.device:
gna_device_mode = devices[0] if '_' in devices[0] else 'GNA_AUTO'
devices[0] = 'GNA'
plugin_config['GNA_DEVICE_MODE'] = gna_device_mode
plugin_config['GNA_PRECISION'] = f'I{args.quantization_bits}'
plugin_config['GNA_EXEC_TARGET'] = args.exec_target
plugin_config['GNA_PWL_MAX_ERROR_PERCENT'] = str(args.pwl_me)
# Set a GNA scale factor
if args.import_gna_model:
if args.scale_factor[1]:
log.error(f'A custom scale factor cannot be set for an imported GNA model: {args.import_gna_model}')
return 1
else:
log.info(f'Using the scale factor from the provided imported GNA model: {args.import_gna_model}')
else:
if args.scale_factor[1]:
scale_factors = get_sorted_scale_factors(args.scale_factor, model.inputs)
else:
scale_factors = []
for file_name in args.input[1]:
_, utterances = read_utterance_file(file_name)
scale_factor = calculate_scale_factor(utterances[0])
log.info('Using scale factor(s) calculated from first utterance')
scale_factors.append(str(scale_factor))
set_scale_factors(plugin_config, scale_factors, model.inputs)
if args.export_embedded_gna_model:
plugin_config['GNA_FIRMWARE_MODEL_IMAGE'] = args.export_embedded_gna_model
plugin_config['GNA_FIRMWARE_MODEL_IMAGE_GENERATION'] = args.embedded_gna_configuration
if args.performance_counter:
plugin_config['PERF_COUNT'] = 'YES'
device_str = f'HETERO:{",".join(devices)}' if 'HETERO' in args.device else devices[0]
# --------------------------- Step 5. Loading model to the device -----------------------------------------------------
log.info('Loading the model to the plugin')
if args.model:
compiled_model = core.compile_model(model, device_str, plugin_config)
else:
with open(args.import_gna_model, 'rb') as f:
buf = BytesIO(f.read())
compiled_model = core.import_model(buf, device_str, plugin_config)
# --------------------------- Exporting GNA model using InferenceEngine AOT API ---------------------------------------
if args.export_gna_model:
log.info(f'Writing GNA Model to {args.export_gna_model}')
user_stream = compiled_model.export_model()
with open(args.export_gna_model, 'wb') as f:
f.write(user_stream)
return 0
if args.export_embedded_gna_model:
log.info(f'Exported GNA embedded model to file {args.export_embedded_gna_model}')
log.info(f'GNA embedded model export done for GNA generation {args.embedded_gna_configuration}')
return 0
# --------------------------- Step 6. Set up input --------------------------------------------------------------------
input_layer_names = args.input[0] if args.input[0] else [_input.any_name for _input in compiled_model.inputs]
input_file_names = args.input[1]
if len(input_layer_names) != len(input_file_names):
log.error(f'Number of model inputs ({len(input_layer_names)}) is not equal '
f'to the number of input files ({len(input_file_names)})')
return 3
input_file_data = [read_utterance_file(file_name) for file_name in input_file_names]
infer_data = [
{
input_layer_names[j]: input_file_data[j].utterances[i]
for j in range(len(input_file_data))
}
for i in range(len(input_file_data[0].utterances))
]
output_layer_names = args.output[0] if args.output[0] else [compiled_model.outputs[0].any_name]
output_file_names = args.output[1]
reference_layer_names = args.reference[0] if args.reference[0] else [compiled_model.outputs[0].any_name]
reference_file_names = args.reference[1]
reference_file_data = [read_utterance_file(file_name) for file_name in reference_file_names]
references = [
{
reference_layer_names[j]: reference_file_data[j].utterances[i]
for j in range(len(reference_file_data))
}
for i in range(len(input_file_data[0].utterances))
]
# --------------------------- Step 7. Create infer request ------------------------------------------------------------
infer_request = compiled_model.create_infer_request()
# --------------------------- Step 8. Do inference --------------------------------------------------------------------
log.info('Starting inference in synchronous mode')
results = []
total_infer_time = 0
for i in range(len(infer_data)):
start_infer_time = default_timer()
# Reset states between utterance inferences to remove a memory impact
infer_request.reset_state()
results.append(do_inference(
infer_data[i],
infer_request,
args.context_window_left,
args.context_window_right,
))
infer_time = default_timer() - start_infer_time
total_infer_time += infer_time
num_of_frames = infer_data[i][input_layer_names[0]].shape[0]
avg_infer_time_per_frame = infer_time / num_of_frames
# --------------------------- Step 9. Process output ------------------------------------------------------------------
log.info('')
log.info(f'Utterance {i}:')
log.info(f'Total time in Infer (HW and SW): {infer_time * 1000:.2f}ms')
log.info(f'Frames in utterance: {num_of_frames}')
log.info(f'Average Infer time per frame: {avg_infer_time_per_frame * 1000:.2f}ms')
for name in set(reference_layer_names + output_layer_names):
log.info('')
log.info(f'Output layer name: {name}')
log.info(f'Number of scores per frame: {results[i][name].shape[1]}')
if name in references[i].keys():
log.info('')
compare_with_reference(results[i][name], references[i][name])
if args.performance_counter:
if 'GNA' in args.device:
total_cycles = infer_request.profiling_info[0].real_time.total_seconds()
stall_cycles = infer_request.profiling_info[1].real_time.total_seconds()
active_cycles = total_cycles - stall_cycles
frequency = 10**6
if args.arch == 'CORE':
frequency *= GNA_CORE_FREQUENCY
else:
frequency *= GNA_ATOM_FREQUENCY
total_inference_time = total_cycles / frequency
active_time = active_cycles / frequency
stall_time = stall_cycles / frequency
log.info('')
log.info('Performance Statistics of GNA Hardware')
log.info(f' Total Inference Time: {(total_inference_time * 1000):.4f} ms')
log.info(f' Active Time: {(active_time * 1000):.4f} ms')
log.info(f' Stall Time: {(stall_time * 1000):.4f} ms')
log.info('')
log.info(f'Total sample time: {total_infer_time * 1000:.2f}ms')
for i in range(len(output_file_names)):
log.info(f'Saving results from "{output_layer_names[i]}" layer to {output_file_names[i]}')
data = [results[j][output_layer_names[i]] for j in range(len(input_file_data[0].utterances))]
write_utterance_file(output_file_names[i], input_file_data[0].keys, data)
# ----------------------------------------------------------------------------------------------------------------------
log.info('This sample is an API example, '
'for any performance measurements please use the dedicated benchmark_app tool\n')
return 0
if __name__ == '__main__':
sys.exit(main())


@@ -1,74 +0,0 @@
# -*- coding: utf-8 -*-
# Copyright (C) 2018-2023 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
import logging as log
import sys
from typing import Dict, List, Tuple
import numpy as np
from openvino.runtime import Output
# Operating frequency of GNA HW devices for the Core and Atom architectures
GNA_CORE_FREQUENCY = 400
GNA_ATOM_FREQUENCY = 200
log.basicConfig(format='[ %(levelname)s ] %(message)s', level=log.INFO, stream=sys.stdout)
def compare_with_reference(result: np.ndarray, reference: np.ndarray):
"""Log error statistics between result and reference matrices."""
error_matrix = np.absolute(result - reference)
max_error = np.max(error_matrix)
sum_error = np.sum(error_matrix)
avg_error = sum_error / error_matrix.size
sum_square_error = np.sum(np.square(error_matrix))
avg_rms_error = np.sqrt(sum_square_error / error_matrix.size)
stdev_error = np.sqrt(sum_square_error / error_matrix.size - avg_error * avg_error)
log.info(f'max error: {max_error:.7f}')
log.info(f'avg error: {avg_error:.7f}')
log.info(f'avg rms error: {avg_rms_error:.7f}')
log.info(f'stdev error: {stdev_error:.7f}')
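On a toy pair of matrices the statistics above work out as follows (a hand-checked sketch; the values are invented):

```python
import numpy as np

result = np.array([[0.5, 1.0], [2.0, 3.5]])
reference = np.array([[0.0, 1.0], [2.5, 3.0]])

error_matrix = np.abs(result - reference)                 # [[0.5, 0.0], [0.5, 0.5]]
max_error = error_matrix.max()                            # 0.5
avg_error = error_matrix.mean()                           # 1.5 / 4 = 0.375
avg_rms_error = np.sqrt(np.square(error_matrix).mean())   # sqrt(0.1875) ~= 0.433
# stdev via E[x^2] - (E[x])^2, same formula as above
stdev_error = np.sqrt(np.square(error_matrix).mean() - avg_error ** 2)
```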
def calculate_scale_factor(matrix: np.ndarray) -> float:
"""Get scale factor for quantization using utterance matrix."""
# Max to find scale factor
target_max = 16384
max_val = np.max(matrix)
if max_val == 0:
return 1.0
else:
return target_max / max_val
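For instance, an utterance whose largest element is 2.0 maps to a scale factor of 16384 / 2.0 (a standalone sketch with an invented matrix; `target_max` is the same constant as above):

```python
import numpy as np

target_max = 16384  # quantization target for the largest value
utterance = np.array([[0.25, 2.0], [1.0, 0.5]], dtype=np.float32)

max_val = np.max(utterance)
scale_factor = target_max / max_val if max_val != 0 else 1.0
# scale_factor == 8192.0
```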
def set_scale_factors(plugin_config: Dict[str, str], scale_factors: List[str], inputs: List[Output]):
"""Set a scale factor provided for each input."""
for i in range(len(inputs)):
log.info(f'For input {inputs[i].get_any_name()} using scale factor of {scale_factors[i]}')
plugin_config[f'GNA_SCALE_FACTOR_{i}'] = scale_factors[i]
def get_input_layouts(layout_string: str, inputs: List[Output]) -> Dict[str, str]:
if layout_string[0] == '[':
return {_input.get_any_name(): layout_string[1:-1] for _input in inputs}
else:
sep = '],' if ',' in layout_string else ']'
tmp = [_input.split('[') for _input in layout_string[:-1].split(sep)]
return {_input[0]: _input[1] for _input in tmp}
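The two accepted layout-string shapes parse as follows. This is a standalone sketch of the same splitting logic, taking plain name strings instead of `Output` objects; the input names are invented:

```python
def parse_layouts(layout_string, input_names):
    # Bare form '[NC]' applies one layout to every input
    if layout_string[0] == '[':
        return {name: layout_string[1:-1] for name in input_names}
    # Named form 'a[NC],b[CN]' maps each input to its own layout
    sep = '],' if ',' in layout_string else ']'
    pairs = [item.split('[') for item in layout_string[:-1].split(sep)]
    return {name: layout for name, layout in pairs}

same = parse_layouts('[NC]', ['data_a', 'data_b'])
# {'data_a': 'NC', 'data_b': 'NC'}
named = parse_layouts('data_a[NC],data_b[CN]', ['data_a', 'data_b'])
# {'data_a': 'NC', 'data_b': 'CN'}
```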
def get_sorted_scale_factors(scale_factor_arg: Tuple[List[str], List[str]], inputs: List[Output]) -> List[str]:
if scale_factor_arg[0]:
res = [1 for _ in range(len(inputs))]
input_names = [_input.get_any_name() for _input in inputs]
for i in range(len(scale_factor_arg[0])):
input_index = input_names.index(scale_factor_arg[0][i])
res[input_index] = scale_factor_arg[1][i]
return res
else:
return [scale_factor_arg[1][0] for _ in range(len(inputs))]
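Reordering named scale factors to match the model's input order can be shown with plain lists (a sketch with invented input names; note that unspecified inputs fall back to the literal `1`, matching the code above):

```python
input_names = ['data_a', 'data_b', 'data_c']        # model input order
keys, values = ['data_c', 'data_a'], ['2048', '1024']  # parsed -sf argument

res = [1 for _ in input_names]                       # default for unnamed inputs
for key, value in zip(keys, values):
    res[input_names.index(key)] = value
# res == ['1024', 1, '2048'] -- data_b keeps the default 1
```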


@@ -23,11 +23,6 @@ To run OpenVINO Python API 2.0 tests:
pytest tests/
```
To run OpenVINO Python API 1.0 tests, use this command:
```
pytest tests_compatibility/
```
By default, tests are run on the CPU plugin. If you want to run them on a different plugin,
you need to specify this environment variable:
```
@@ -147,10 +142,6 @@ Notice that the test name is shared between cases. In a real-life pull request,
* ... or create reference values during runtime. Always use a good, trustworthy library for that!
* Re-use common parts of the code (like multiple lines that create a helper object) and move them out to make tests easier to read.
### Difference between *tests* and *tests_compatibility* directories
<!-- TO-DELETE when compatibility layer is no longer supported in the project -->
You may notice two similar folders, [`tests`](./../tests/) and [`tests_compatibility`](./../tests_compatibility/). The first one is the desired place for all upcoming features and tests. The compatibility layer is supported only in specific cases, and any updates to it must be explicitly approved by OpenVINO™ reviewers. Please do not duplicate tests in both directories unless necessary.
## See also
* [OpenVINO™ README](../../../../README.md)
* [OpenVINO™ bindings README](../../README.md)


@@ -19,7 +19,6 @@ passenv =
https_proxy
commands=
pytest tests -m "not template_extension" -v -k 'not _cuda' --ignore=tests/test_utils
pytest --backend={env:OV_BACKEND} tests_compatibility/test_ngraph -v -k 'not _cuda' --ignore=tests_compatibility/test_onnx/test_zoo_models.py
pytest --backend={env:OV_BACKEND} /openvino/src/frontends/onnx/tests -v --ignore=/openvino/src/frontends/onnx/tests/tests_python/test_zoo_models.py
[testenv:zoo_models]
@@ -68,7 +67,6 @@ docstring-convention = google
enable-extensions = G
per-file-ignores =
src/openvino/runtime/*/ops.py: VNE001,VNE003
tests_compatibility/test_ngraph/*: C101,C812,C815,C816,C819,CCE001,D212,E800,ECE001,N400,N802,N806,P101,P103,PT001,PT005,PT006,PT011,PT019,PT023,RST201,S001,VNE002
src/compatibility/ngraph/*: C101,C812,C819,CCE001,E800,N806,P101,RST201,RST202,RST203,RST206,VNE001,VNE003
src/openvino/preprocess/torchvision/*: N801, VNE001
*__init__.py: F401


@@ -83,8 +83,3 @@ install(DIRECTORY ${CMAKE_CURRENT_LIST_DIR}/../ngraph
COMPONENT ${OV_CPACK_COMP_PYTHON_OPENVINO}_${pyversion}
${OV_CPACK_COMP_PYTHON_OPENVINO_EXCLUDE_ALL}
USE_SOURCE_PERMISSIONS)
install(DIRECTORY ${OpenVINOPython_SOURCE_DIR}/tests_compatibility
DESTINATION tests/${PROJECT_NAME}
COMPONENT tests
EXCLUDE_FROM_ALL)


@@ -11,7 +11,6 @@ __version__ = get_version()
from openvino._pyopenvino._offline_transformations import apply_fused_names_cleanup
from openvino._pyopenvino._offline_transformations import apply_moc_transformations
from openvino._pyopenvino._offline_transformations import apply_moc_legacy_transformations
from openvino._pyopenvino._offline_transformations import apply_pot_transformations
from openvino._pyopenvino._offline_transformations import apply_low_latency_transformation
from openvino._pyopenvino._offline_transformations import apply_pruning_transformation
from openvino._pyopenvino._offline_transformations import apply_make_stateful_transformation


@@ -363,7 +363,7 @@ def extract_model_graph(argv):
if isinstance(model, tf.compat.v1.GraphDef):
graph = tf.Graph()
with graph.as_default():
tf.graph_util.import_graph_def(model)
tf.graph_util.import_graph_def(model, name='')
argv["input_model"] = graph
return True
if isinstance(model, tf.compat.v1.Session):


@@ -6,7 +6,7 @@
from openvino._pyopenvino.properties.hint import Priority
from openvino._pyopenvino.properties.hint import SchedulingCoreType
from openvino._pyopenvino.properties.hint import ExecutionMode
from openvino.runtime.properties.hint.overloads import PerformanceMode
from openvino._pyopenvino.properties.hint import PerformanceMode
# Properties
import openvino._pyopenvino.properties.hint as __hint


@@ -23,19 +23,6 @@ from openvino.runtime.utils.data_helpers import (
)
def _deprecated_memory_arg(shared_memory: bool, share_inputs: bool) -> bool:
if shared_memory is not None:
warnings.warn(
"`shared_memory` is deprecated and will be removed in 2024.0. "
"Value of `shared_memory` is going to override `share_inputs` value. "
"Please use only `share_inputs` explicitly.",
FutureWarning,
stacklevel=3,
)
return shared_memory
return share_inputs
class Model(ModelBase):
def __init__(self, *args: Any, **kwargs: Any) -> None:
if args and not kwargs:
@@ -70,8 +57,6 @@ class InferRequest(_InferRequestWrapper):
inputs: Any = None,
share_inputs: bool = False,
share_outputs: bool = False,
*,
shared_memory: Any = None,
) -> OVDict:
"""Infers specified input(s) in synchronous mode.
@@ -129,22 +114,14 @@ class InferRequest(_InferRequestWrapper):
Default value: False
:type share_outputs: bool, optional
:param shared_memory: Deprecated. Works like `share_inputs` mode.
If not specified, function uses `share_inputs` value.
Note: Will be removed in 2024.0 release!
Note: This is keyword-only argument.
Default value: None
:type shared_memory: bool, optional
:return: Dictionary of results from output tensors with port/int/str keys.
:rtype: OVDict
"""
return OVDict(super().infer(_data_dispatch(
self,
inputs,
is_shared=_deprecated_memory_arg(shared_memory, share_inputs),
is_shared=share_inputs,
), share_outputs=share_outputs))
def start_async(
@@ -152,8 +129,6 @@ class InferRequest(_InferRequestWrapper):
inputs: Any = None,
userdata: Any = None,
share_inputs: bool = False,
*,
shared_memory: Any = None,
) -> None:
"""Starts inference of specified input(s) in asynchronous mode.
@@ -202,21 +177,12 @@ class InferRequest(_InferRequestWrapper):
Default value: False
:type share_inputs: bool, optional
:param shared_memory: Deprecated. Works like `share_inputs` mode.
If not specified, function uses `share_inputs` value.
Note: Will be removed in 2024.0 release!
Note: This is keyword-only argument.
Default value: None
:type shared_memory: bool, optional
"""
super().start_async(
_data_dispatch(
self,
inputs,
is_shared=_deprecated_memory_arg(shared_memory, share_inputs),
is_shared=share_inputs,
),
userdata,
)
@@ -302,8 +268,6 @@ class CompiledModel(CompiledModelBase):
inputs: Any = None,
share_inputs: bool = True,
share_outputs: bool = False,
*,
shared_memory: Any = None,
) -> OVDict:
"""Callable infer wrapper for CompiledModel.
@@ -369,15 +333,7 @@ class CompiledModel(CompiledModelBase):
Default value: False
:type share_outputs: bool, optional
:param shared_memory: Deprecated. Works like `share_inputs` mode.
If not specified, function uses `share_inputs` value.
Note: Will be removed in 2024.0 release!
Note: This is keyword-only argument.
Default value: None
:type shared_memory: bool, optional
:return: Dictionary of results from output tensors with port/int/str as keys.
:rtype: OVDict
"""
@@ -386,7 +342,7 @@ class CompiledModel(CompiledModelBase):
return self._infer_request.infer(
inputs,
share_inputs=_deprecated_memory_arg(shared_memory, share_inputs),
share_inputs=share_inputs,
share_outputs=share_outputs,
)
@@ -430,8 +386,6 @@ class AsyncInferQueue(AsyncInferQueueBase):
inputs: Any = None,
userdata: Any = None,
share_inputs: bool = False,
*,
shared_memory: Any = None,
) -> None:
"""Run asynchronous inference using the next available InferRequest from the pool.
@@ -476,21 +430,12 @@ class AsyncInferQueue(AsyncInferQueueBase):
Default value: False
:type share_inputs: bool, optional
:param shared_memory: Deprecated. Works like `share_inputs` mode.
If not specified, function uses `share_inputs` value.
Note: Will be removed in 2024.0 release!
Note: This is keyword-only argument.
Default value: None
:type shared_memory: bool, optional
"""
super().start_async(
_data_dispatch(
self[self.get_idle_request_id()],
inputs,
is_shared=_deprecated_memory_arg(shared_memory, share_inputs),
is_shared=share_inputs,
),
userdata,
)


@@ -6,7 +6,7 @@
from openvino._pyopenvino.properties.hint import Priority
from openvino._pyopenvino.properties.hint import SchedulingCoreType
from openvino._pyopenvino.properties.hint import ExecutionMode
from openvino.runtime.properties.hint.overloads import PerformanceMode
from openvino._pyopenvino.properties.hint import PerformanceMode
# Properties
from openvino._pyopenvino.properties.hint import inference_precision


@@ -1,19 +0,0 @@
# -*- coding: utf-8 -*-
# Copyright (C) 2018-2023 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
from openvino.utils import deprecatedclassproperty
from openvino._pyopenvino.properties.hint import PerformanceMode as PerformanceModeBase
class PerformanceMode(PerformanceModeBase):
@deprecatedclassproperty(
name="PerformanceMode.UNDEFINED", # noqa: N802, N805
version="2024.0",
message="Please use actual value instead.",
stacklevel=2,
)
def UNDEFINED(cls) -> PerformanceModeBase: # noqa: N802, N805
return super().UNDEFINED


@@ -4,6 +4,4 @@
"""Generic utilities. Factor related functions out to separate files."""
from openvino._pyopenvino.util import numpy_to_c
from openvino._pyopenvino.util import get_constant_from_source, replace_node, replace_output_update_name
from openvino.runtime.utils.util import clone_model
from openvino._pyopenvino.util import numpy_to_c, replace_node, replace_output_update_name


@@ -1,17 +0,0 @@
# -*- coding: utf-8 -*-
# Copyright (C) 2018-2023 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
from openvino._pyopenvino.util import clone_model as clone_model_base
from openvino.utils import deprecated
from typing import TYPE_CHECKING
if TYPE_CHECKING:
from openvino.runtime import Model
@deprecated(version="2024.0")
def clone_model(model: "Model") -> "Model":
from openvino.runtime import Model
return Model(clone_model_base(model))


@@ -667,30 +667,6 @@ void regclass_InferRequest(py::module m) {
:rtype: List[openvino.runtime.ConstOutput]
)");
cls.def_property_readonly(
"inputs",
[](InferRequestWrapper& self) {
Common::utils::deprecation_warning("inputs", "2024.0", "Please use 'input_tensors' property instead.");
return self.get_input_tensors();
},
R"(
Gets all input tensors of this InferRequest.
:rtype: List[openvino.runtime.Tensor]
)");
cls.def_property_readonly(
"outputs",
[](InferRequestWrapper& self) {
Common::utils::deprecation_warning("outputs", "2024.0", "Please use 'output_tensors' property instead.");
return self.get_output_tensors();
},
R"(
Gets all output tensors of this InferRequest.
:rtype: List[openvino.runtime.Tensor]
)");
cls.def_property_readonly("input_tensors",
&InferRequestWrapper::get_input_tensors,
R"(


@@ -9,7 +9,6 @@
#include <compress_quantize_weights.hpp>
#include <openvino/pass/make_stateful.hpp>
#include <openvino/pass/serialize.hpp>
#include <pot_transformations.hpp>
#include <pruning.hpp>
#include <transformations/common_optimizations/compress_float_constants.hpp>
#include <transformations/common_optimizations/fused_names_cleanup.hpp>
@@ -55,16 +54,6 @@ void regmodule_offline_transformations(py::module m) {
py::arg("model"),
py::arg("params_with_custom_types"));
m_offline_transformations.def(
"apply_pot_transformations",
[](std::shared_ptr<ov::Model> model, std::string device) {
ov::pass::Manager manager;
manager.register_pass<ov::pass::POTTransformations>(std::move(device));
manager.run_passes(model);
},
py::arg("model"),
py::arg("device"));
m_offline_transformations.def(
"apply_low_latency_transformation",
[](std::shared_ptr<ov::Model> model, bool use_const_initializer = true) {


@@ -55,13 +55,10 @@ void regmodule_properties(py::module m) {
.value("HIGH", ov::hint::Priority::HIGH)
.value("DEFAULT", ov::hint::Priority::DEFAULT);
OPENVINO_SUPPRESS_DEPRECATED_START
py::enum_<ov::hint::PerformanceMode>(m_hint, "PerformanceMode", py::arithmetic())
.value("UNDEFINED", ov::hint::PerformanceMode::UNDEFINED)
.value("LATENCY", ov::hint::PerformanceMode::LATENCY)
.value("THROUGHPUT", ov::hint::PerformanceMode::THROUGHPUT)
.value("CUMULATIVE_THROUGHPUT", ov::hint::PerformanceMode::CUMULATIVE_THROUGHPUT);
OPENVINO_SUPPRESS_DEPRECATED_END
py::enum_<ov::hint::SchedulingCoreType>(m_hint, "SchedulingCoreType", py::arithmetic())
.value("ANY_CORE", ov::hint::SchedulingCoreType::ANY_CORE)


@@ -27,35 +27,6 @@ inline void* numpy_to_c(py::array a) {
void regmodule_graph_util(py::module m) {
py::module mod = m.def_submodule("util", "openvino.runtime.utils");
mod.def("numpy_to_c", &numpy_to_c);
OPENVINO_SUPPRESS_DEPRECATED_START
mod.def("get_constant_from_source",
&ov::get_constant_from_source,
py::arg("output"),
R"(
Runs an estimation of source tensor.
:param index: Output node.
:type index: openvino.runtime.Output
:return: If it succeeded to calculate both bounds and
they are the same, returns Constant operation
from the resulting bound, otherwise Null.
:rtype: openvino.runtime.op.Constant or openvino.runtime.Node
)");
OPENVINO_SUPPRESS_DEPRECATED_END
mod.def(
"clone_model",
[](ov::Model& model) {
return model.clone();
},
py::arg("model"),
R"(
Creates a copy of a model object.
:param model: Model to copy.
:type model: openvino.runtime.Model
:return: A copy of Model.
:rtype: openvino.runtime.Model
)");
mod.def("replace_output_update_name", &ov::replace_output_update_name, py::arg("output"), py::arg("target_output"));


@@ -6,23 +6,6 @@ import os
import pytest
def model_path(is_fp16=False):
base_path = os.path.dirname(__file__)
if is_fp16:
test_xml = os.path.join(base_path, "utils", "utils", "test_model_fp16.xml")
test_bin = os.path.join(base_path, "utils", "utils", "test_model_fp16.bin")
else:
test_xml = os.path.join(base_path, "utils", "utils", "test_model_fp32.xml")
test_bin = os.path.join(base_path, "utils", "utils", "test_model_fp32.bin")
return (test_xml, test_bin)
def model_onnx_path():
base_path = os.path.dirname(__file__)
test_onnx = os.path.join(base_path, "test_utils", "utils", "test_model.onnx")
return test_onnx
def pytest_configure(config):
# register additional markers


@@ -318,30 +318,22 @@ def test_clone_model():
assert isinstance(model_original, Model)
# Make copies of it
with pytest.deprecated_call():
model_copy1 = ov.utils.clone_model(model_original)
model_copy2 = model_original.clone()
model_copy3 = deepcopy(model_original)
assert isinstance(model_copy1, Model)
assert isinstance(model_copy2, Model)
assert isinstance(model_copy3, Model)
# Make changes to the copied models' inputs
model_copy1.reshape({"A": [3, 3], "B": [3, 3]})
model_copy2.reshape({"A": [3, 3], "B": [3, 3]})
model_copy3.reshape({"A": [3, 3], "B": [3, 3]})
original_model_shapes = [single_input.get_shape() for single_input in model_original.inputs]
model_copy1_shapes = [single_input.get_shape() for single_input in model_copy1.inputs]
model_copy2_shapes = [single_input.get_shape() for single_input in model_copy2.inputs]
model_copy3_shapes = [single_input.get_shape() for single_input in model_copy3.inputs]
assert original_model_shapes != model_copy1_shapes
assert original_model_shapes != model_copy2_shapes
assert original_model_shapes != model_copy3_shapes
assert model_copy1_shapes == model_copy2_shapes
assert model_copy1_shapes == model_copy3_shapes
assert model_copy2_shapes == model_copy3_shapes
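The test above copies a model three ways and checks that reshaping the copies leaves the original untouched. The deprecation pattern it exercises can be sketched with the standard library alone; `clone_via_deprecated_helper` below is a hypothetical stand-in for the removed `ov.utils.clone_model`, not OpenVINO code:

```python
import copy
import warnings

def clone_via_deprecated_helper(obj):
    # Hypothetical stand-in for the removed ov.utils.clone_model:
    # emit a DeprecationWarning, then delegate to a deep copy,
    # mirroring what Model.clone() provides going forward.
    warnings.warn("clone_model is deprecated, use Model.clone()",
                  DeprecationWarning, stacklevel=2)
    return copy.deepcopy(obj)

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    original = {"A": [3, 3], "B": [3, 3]}
    clone = clone_via_deprecated_helper(original)

# The copy is independent of the original, and the warning was raised.
assert clone == original and clone is not original
assert any(issubclass(w.category, DeprecationWarning) for w in caught)
```

This is the same shape as the `pytest.deprecated_call()` block in the test, which is just a context manager asserting that a `DeprecationWarning` was emitted inside it.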


@@ -2,30 +2,8 @@
# Copyright (C) 2018-2023 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
import openvino.runtime as ov
import pytest
from openvino._pyopenvino.util import deprecation_warning
from openvino import Shape
def test_get_constant_from_source_success():
input1 = ov.opset8.parameter(Shape([5, 5]), dtype=int, name="input_1")
input2 = ov.opset8.parameter(Shape([25]), dtype=int, name="input_2")
shape_of = ov.opset8.shape_of(input2, name="shape_of")
reshape = ov.opset8.reshape(input1, shape_of, special_zero=True)
folded_const = ov.utils.get_constant_from_source(reshape.input(1).get_source_output())
assert folded_const is not None
assert folded_const.get_vector() == [25]
def test_get_constant_from_source_failed():
input1 = ov.opset8.parameter(Shape([5, 5]), dtype=int, name="input_1")
input2 = ov.opset8.parameter(Shape([1]), dtype=int, name="input_2")
reshape = ov.opset8.reshape(input1, input2, special_zero=True)
folded_const = ov.utils.get_constant_from_source(reshape.input(1).get_source_output())
assert folded_const is None
def test_deprecation_warning():


@@ -296,28 +296,6 @@ def test_array_like_input_async_infer_queue(device, share_inputs):
infer_queue_list[i].get_output_tensor().data, np.abs(input_data))
@pytest.mark.parametrize("shared_flag", [True, False])
def test_shared_memory_deprecation(device, shared_flag):
compiled, request, _, input_data = abs_model_with_data(
device, Type.f32, np.float32)
with pytest.warns(FutureWarning, match="`shared_memory` is deprecated and will be removed in 2024.0"):
_ = compiled(input_data, shared_memory=shared_flag)
with pytest.warns(FutureWarning, match="`shared_memory` is deprecated and will be removed in 2024.0"):
_ = request.infer(input_data, shared_memory=shared_flag)
with pytest.warns(FutureWarning, match="`shared_memory` is deprecated and will be removed in 2024.0"):
request.start_async(input_data, shared_memory=shared_flag)
request.wait()
queue = AsyncInferQueue(compiled, jobs=1)
with pytest.warns(FutureWarning, match="`shared_memory` is deprecated and will be removed in 2024.0"):
queue.start_async(input_data, shared_memory=shared_flag)
queue.wait_all()
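The `shared_memory` → `share_inputs` migration tested above follows a common keyword-rename pattern: accept the old name, raise `FutureWarning`, and forward the value to the new name. A minimal stdlib sketch of that shim (the `infer` function here is hypothetical, not the OpenVINO API):

```python
import warnings

def infer(data, share_inputs=False, shared_memory=None):
    # Hypothetical shim mirroring the rename exercised by the test above:
    # accept the deprecated keyword, warn, and forward it to the new one.
    if shared_memory is not None:
        warnings.warn(
            "`shared_memory` is deprecated and will be removed in 2024.0",
            FutureWarning, stacklevel=2)
        share_inputs = shared_memory
    return {"share_inputs": share_inputs, "result": data}

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    out = infer([1, 2, 3], shared_memory=True)

# The old keyword still works, but only via the FutureWarning path.
assert out["share_inputs"] is True
assert any(issubclass(w.category, FutureWarning) for w in caught)
```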
@pytest.mark.skip(reason="Sporadically failed. Need further investigation. Ticket - 95967")
def test_cancel(device):
core = Core()


@@ -213,7 +213,7 @@ def test_direct_infer(device, shared_flag):
compiled_model, img = generate_model_and_image(device)
tensor = Tensor(img)
res = compiled_model({"data": tensor}, shared_memory=shared_flag)
res = compiled_model({"data": tensor}, share_inputs=shared_flag)
assert np.argmax(res[compiled_model.outputs[0]]) == 531
ref = compiled_model.infer_new_request({"data": tensor})
assert np.array_equal(ref[compiled_model.outputs[0]], res[compiled_model.outputs[0]])


@@ -37,13 +37,6 @@ def test_properties_rw_base():
assert "incompatible function arguments" in str(e.value)
def test_deprecation():
with pytest.warns(DeprecationWarning) as w:
_ = hints.PerformanceMode.UNDEFINED
assert issubclass(w[0].category, DeprecationWarning)
assert "PerformanceMode.UNDEFINED is deprecated and will be removed" in str(w[0].message)
###
# Enum-like values
###
@@ -71,7 +64,6 @@ def test_deprecation():
(
hints.PerformanceMode,
(
(hints.PerformanceMode.UNDEFINED, "PerformanceMode.UNDEFINED", -1),
(hints.PerformanceMode.LATENCY, "PerformanceMode.LATENCY", 1),
(hints.PerformanceMode.THROUGHPUT, "PerformanceMode.THROUGHPUT", 2),
(hints.PerformanceMode.CUMULATIVE_THROUGHPUT, "PerformanceMode.CUMULATIVE_THROUGHPUT", 3),
@@ -253,7 +245,7 @@ def test_properties_ro(ov_property_ro, expected_value):
(
hints.performance_mode,
"PERFORMANCE_HINT",
((hints.PerformanceMode.UNDEFINED, hints.PerformanceMode.UNDEFINED),),
((hints.PerformanceMode.LATENCY, hints.PerformanceMode.LATENCY),),
),
(
hints.enable_cpu_pinning,
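The parametrization above pairs each remaining `PerformanceMode` member with its string form and integer value (with `UNDEFINED`, value -1, removed). The same round-trip check can be sketched with a stdlib `Enum`; this `PerformanceMode` class is a local mock of the binding, with values taken from the diff:

```python
from enum import Enum

class PerformanceMode(Enum):
    # Mock of the pybind11 enum; UNDEFINED (-1) is deprecated/removed.
    LATENCY = 1
    THROUGHPUT = 2
    CUMULATIVE_THROUGHPUT = 3

# Round-trip each member through its str() form and integer value,
# as the parametrized test does.
for member, expected_name, expected_value in (
    (PerformanceMode.LATENCY, "PerformanceMode.LATENCY", 1),
    (PerformanceMode.THROUGHPUT, "PerformanceMode.THROUGHPUT", 2),
    (PerformanceMode.CUMULATIVE_THROUGHPUT,
     "PerformanceMode.CUMULATIVE_THROUGHPUT", 3),
):
    assert str(member) == expected_name
    assert member.value == expected_value
```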


@@ -7,7 +7,6 @@ import pytest
import numpy as np
from openvino._offline_transformations import (
apply_moc_transformations,
apply_pot_transformations,
apply_low_latency_transformation,
apply_pruning_transformation,
apply_make_stateful_transformation,
@@ -113,15 +112,6 @@ def test_moc_with_smart_reshape():
assert len(model.get_ops()) == 3
def test_pot_transformations():
model = get_relu_model()
apply_pot_transformations(model, "GNA")
assert model is not None
assert len(model.get_ops()) == 3
def test_low_latency_transformation():
model = get_relu_model()


@@ -1,467 +0,0 @@
<?xml version="1.0" ?>
<net name="test_model" version="10">
<layers>
<layer id="0" name="data" type="Parameter" version="opset1">
<data element_type="f16" shape="1,3,32,32"/>
<output>
<port id="0" precision="FP16">
<dim>1</dim>
<dim>3</dim>
<dim>32</dim>
<dim>32</dim>
</port>
</output>
</layer>
<layer id="1" name="20/mean/Fused_Mul_614616_const" type="Const" version="opset1">
<data element_type="f16" offset="0" shape="16,3,5,5" size="2400"/>
<output>
<port id="1" precision="FP16">
<dim>16</dim>
<dim>3</dim>
<dim>5</dim>
<dim>5</dim>
</port>
</output>
</layer>
<layer id="2" name="19/WithoutBiases" type="Convolution" version="opset1">
<data dilations="1,1" output_padding="0,0" pads_begin="2,2" pads_end="2,2" strides="1,1"/>
<input>
<port id="0">
<dim>1</dim>
<dim>3</dim>
<dim>32</dim>
<dim>32</dim>
</port>
<port id="1">
<dim>16</dim>
<dim>3</dim>
<dim>5</dim>
<dim>5</dim>
</port>
</input>
<output>
<port id="2" precision="FP16">
<dim>1</dim>
<dim>16</dim>
<dim>32</dim>
<dim>32</dim>
</port>
</output>
</layer>
<layer id="3" name="data_add_575/copy_const" type="Const" version="opset1">
<data element_type="f16" offset="2400" shape="1,16,1,1" size="32"/>
<output>
<port id="1" precision="FP16">
<dim>1</dim>
<dim>16</dim>
<dim>1</dim>
<dim>1</dim>
</port>
</output>
</layer>
<layer id="4" name="19/Fused_Add_" type="Add" version="opset1">
<input>
<port id="0">
<dim>1</dim>
<dim>16</dim>
<dim>32</dim>
<dim>32</dim>
</port>
<port id="1">
<dim>1</dim>
<dim>16</dim>
<dim>1</dim>
<dim>1</dim>
</port>
</input>
<output>
<port id="2" precision="FP16">
<dim>1</dim>
<dim>16</dim>
<dim>32</dim>
<dim>32</dim>
</port>
</output>
</layer>
<layer id="5" name="21" type="ReLU" version="opset1">
<input>
<port id="0">
<dim>1</dim>
<dim>16</dim>
<dim>32</dim>
<dim>32</dim>
</port>
</input>
<output>
<port id="1" precision="FP16">
<dim>1</dim>
<dim>16</dim>
<dim>32</dim>
<dim>32</dim>
</port>
</output>
</layer>
<layer id="6" name="22" type="MaxPool" version="opset1">
<data kernel="2,2" pads_begin="0,0" pads_end="0,0" rounding_type="floor" strides="2,2"/>
<input>
<port id="0">
<dim>1</dim>
<dim>16</dim>
<dim>32</dim>
<dim>32</dim>
</port>
</input>
<output>
<port id="1" precision="FP16">
<dim>1</dim>
<dim>16</dim>
<dim>16</dim>
<dim>16</dim>
</port>
</output>
</layer>
<layer id="7" name="onnx_initializer_node_8/Output_0/Data__const" type="Const" version="opset1">
<data element_type="f16" offset="2432" shape="32,16,5,5" size="25600"/>
<output>
<port id="1" precision="FP16">
<dim>32</dim>
<dim>16</dim>
<dim>5</dim>
<dim>5</dim>
</port>
</output>
</layer>
<layer id="8" name="23/WithoutBiases" type="Convolution" version="opset1">
<data dilations="1,1" output_padding="0,0" pads_begin="2,2" pads_end="2,2" strides="1,1"/>
<input>
<port id="0">
<dim>1</dim>
<dim>16</dim>
<dim>16</dim>
<dim>16</dim>
</port>
<port id="1">
<dim>32</dim>
<dim>16</dim>
<dim>5</dim>
<dim>5</dim>
</port>
</input>
<output>
<port id="2" precision="FP16">
<dim>1</dim>
<dim>32</dim>
<dim>16</dim>
<dim>16</dim>
</port>
</output>
</layer>
<layer id="9" name="23/Dims351/copy_const" type="Const" version="opset1">
<data element_type="f16" offset="28032" shape="1,32,1,1" size="64"/>
<output>
<port id="1" precision="FP16">
<dim>1</dim>
<dim>32</dim>
<dim>1</dim>
<dim>1</dim>
</port>
</output>
</layer>
<layer id="10" name="23" type="Add" version="opset1">
<input>
<port id="0">
<dim>1</dim>
<dim>32</dim>
<dim>16</dim>
<dim>16</dim>
</port>
<port id="1">
<dim>1</dim>
<dim>32</dim>
<dim>1</dim>
<dim>1</dim>
</port>
</input>
<output>
<port id="2" precision="FP16">
<dim>1</dim>
<dim>32</dim>
<dim>16</dim>
<dim>16</dim>
</port>
</output>
</layer>
<layer id="11" name="25/mean/Fused_Mul_618620_const" type="Const" version="opset1">
<data element_type="f16" offset="28096" shape="64,32,3,3" size="36864"/>
<output>
<port id="1" precision="FP16">
<dim>64</dim>
<dim>32</dim>
<dim>3</dim>
<dim>3</dim>
</port>
</output>
</layer>
<layer id="12" name="24/WithoutBiases" type="Convolution" version="opset1">
<data dilations="1,1" output_padding="0,0" pads_begin="2,2" pads_end="2,2" strides="1,1"/>
<input>
<port id="0">
<dim>1</dim>
<dim>32</dim>
<dim>16</dim>
<dim>16</dim>
</port>
<port id="1">
<dim>64</dim>
<dim>32</dim>
<dim>3</dim>
<dim>3</dim>
</port>
</input>
<output>
<port id="2" precision="FP16">
<dim>1</dim>
<dim>64</dim>
<dim>18</dim>
<dim>18</dim>
</port>
</output>
</layer>
<layer id="13" name="data_add_578583/copy_const" type="Const" version="opset1">
<data element_type="f16" offset="64960" shape="1,64,1,1" size="128"/>
<output>
<port id="1" precision="FP16">
<dim>1</dim>
<dim>64</dim>
<dim>1</dim>
<dim>1</dim>
</port>
</output>
</layer>
<layer id="14" name="24/Fused_Add_" type="Add" version="opset1">
<input>
<port id="0">
<dim>1</dim>
<dim>64</dim>
<dim>18</dim>
<dim>18</dim>
</port>
<port id="1">
<dim>1</dim>
<dim>64</dim>
<dim>1</dim>
<dim>1</dim>
</port>
</input>
<output>
<port id="2" precision="FP16">
<dim>1</dim>
<dim>64</dim>
<dim>18</dim>
<dim>18</dim>
</port>
</output>
</layer>
<layer id="15" name="26" type="ReLU" version="opset1">
<input>
<port id="0">
<dim>1</dim>
<dim>64</dim>
<dim>18</dim>
<dim>18</dim>
</port>
</input>
<output>
<port id="1" precision="FP16">
<dim>1</dim>
<dim>64</dim>
<dim>18</dim>
<dim>18</dim>
</port>
</output>
</layer>
<layer id="16" name="27" type="MaxPool" version="opset1">
<data kernel="2,2" pads_begin="0,0" pads_end="0,0" rounding_type="floor" strides="2,2"/>
<input>
<port id="0">
<dim>1</dim>
<dim>64</dim>
<dim>18</dim>
<dim>18</dim>
</port>
</input>
<output>
<port id="1" precision="FP16">
<dim>1</dim>
<dim>64</dim>
<dim>9</dim>
<dim>9</dim>
</port>
</output>
</layer>
<layer id="17" name="28/Reshape/Cast_1955_const" type="Const" version="opset1">
<data element_type="i64" offset="65088" shape="2" size="16"/>
<output>
<port id="1" precision="I64">
<dim>2</dim>
</port>
</output>
</layer>
<layer id="18" name="28/Reshape" type="Reshape" version="opset1">
<data special_zero="True"/>
<input>
<port id="0">
<dim>1</dim>
<dim>64</dim>
<dim>9</dim>
<dim>9</dim>
</port>
<port id="1">
<dim>2</dim>
</port>
</input>
<output>
<port id="2" precision="FP16">
<dim>1</dim>
<dim>5184</dim>
</port>
</output>
</layer>
<layer id="19" name="onnx_initializer_node_17/Output_0/Data__const" type="Const" version="opset1">
<data element_type="f16" offset="65104" shape="10,5184" size="103680"/>
<output>
<port id="1" precision="FP16">
<dim>10</dim>
<dim>5184</dim>
</port>
</output>
</layer>
<layer id="20" name="29/WithoutBiases" type="MatMul" version="opset1">
<data transpose_a="0" transpose_b="1"/>
<input>
<port id="0">
<dim>1</dim>
<dim>5184</dim>
</port>
<port id="1">
<dim>10</dim>
<dim>5184</dim>
</port>
</input>
<output>
<port id="2" precision="FP16">
<dim>1</dim>
<dim>10</dim>
</port>
</output>
</layer>
<layer id="21" name="onnx_initializer_node_18/Output_0/Data_/copy_const" type="Const" version="opset1">
<data element_type="f16" offset="168784" shape="1,10" size="20"/>
<output>
<port id="1" precision="FP16">
<dim>1</dim>
<dim>10</dim>
</port>
</output>
</layer>
<layer id="22" name="29" type="Add" version="opset1">
<input>
<port id="0">
<dim>1</dim>
<dim>10</dim>
</port>
<port id="1">
<dim>1</dim>
<dim>10</dim>
</port>
</input>
<output>
<port id="2" precision="FP16">
<dim>1</dim>
<dim>10</dim>
</port>
</output>
</layer>
<layer id="23" name="fc_out" type="SoftMax" version="opset1">
<data axis="1"/>
<input>
<port id="0">
<dim>1</dim>
<dim>10</dim>
</port>
</input>
<output>
<port id="1" precision="FP16">
<dim>1</dim>
<dim>10</dim>
</port>
</output>
</layer>
<layer id="24" name="fc_out/sink_port_0" type="Result" version="opset1">
<input>
<port id="0">
<dim>1</dim>
<dim>10</dim>
</port>
</input>
</layer>
</layers>
<edges>
<edge from-layer="0" from-port="0" to-layer="2" to-port="0"/>
<edge from-layer="1" from-port="1" to-layer="2" to-port="1"/>
<edge from-layer="2" from-port="2" to-layer="4" to-port="0"/>
<edge from-layer="3" from-port="1" to-layer="4" to-port="1"/>
<edge from-layer="4" from-port="2" to-layer="5" to-port="0"/>
<edge from-layer="5" from-port="1" to-layer="6" to-port="0"/>
<edge from-layer="6" from-port="1" to-layer="8" to-port="0"/>
<edge from-layer="7" from-port="1" to-layer="8" to-port="1"/>
<edge from-layer="8" from-port="2" to-layer="10" to-port="0"/>
<edge from-layer="9" from-port="1" to-layer="10" to-port="1"/>
<edge from-layer="10" from-port="2" to-layer="12" to-port="0"/>
<edge from-layer="11" from-port="1" to-layer="12" to-port="1"/>
<edge from-layer="12" from-port="2" to-layer="14" to-port="0"/>
<edge from-layer="13" from-port="1" to-layer="14" to-port="1"/>
<edge from-layer="14" from-port="2" to-layer="15" to-port="0"/>
<edge from-layer="15" from-port="1" to-layer="16" to-port="0"/>
<edge from-layer="16" from-port="1" to-layer="18" to-port="0"/>
<edge from-layer="17" from-port="1" to-layer="18" to-port="1"/>
<edge from-layer="18" from-port="2" to-layer="20" to-port="0"/>
<edge from-layer="19" from-port="1" to-layer="20" to-port="1"/>
<edge from-layer="20" from-port="2" to-layer="22" to-port="0"/>
<edge from-layer="21" from-port="1" to-layer="22" to-port="1"/>
<edge from-layer="22" from-port="2" to-layer="23" to-port="0"/>
<edge from-layer="23" from-port="1" to-layer="24" to-port="0"/>
</edges>
<meta_data>
<MO_version value="unknown version"/>
<cli_parameters>
<blobs_as_inputs value="True"/>
<data_type value="FP16"/>
<disable_resnet_optimization value="False"/>
<disable_weights_compression value="False"/>
<enable_concat_optimization value="False"/>
<extensions value="DIR"/>
<framework value="onnx"/>
<freeze_placeholder_with_value value="{}"/>
<generate_deprecated_IR_V2 value="False"/>
<generate_deprecated_IR_V7 value="False"/>
<generate_experimental_IR_V10 value="True"/>
<input_model value="DIR/test_model.onnx"/>
<keep_quantize_ops_in_IR value="True"/>
<keep_shape_ops value="False"/>
<log_level value="ERROR"/>
<mean_scale_values value="{}"/>
<mean_values value="()"/>
<model_name value="test_model"/>
<move_to_preprocess value="False"/>
<output_dir value="DIR"/>
<placeholder_data_types value="{}"/>
<progress value="False"/>
<reverse_input_channels value="False"/>
<scale_values value="()"/>
<silent value="False"/>
<stream_output value="False"/>
<unset unset_cli_parameters="batch, disable_fusing, disable_gfusing, finegrain_fusing, input, input_shape, output, placeholder_shapes, scale, transformations_config"/>
</cli_parameters>
</meta_data>
</net>
