Fix compile problem when open -Wnon-virtual-dtor compile flag (#10705 )

* Fix compile problem when open -Wnon-virtual-dtor compile flag
* update code style
* fix the code style
Fixed declaration of 'xxx' hides global declaration (#10733 )
2022-03-02 16:31:37 +03:00 · 2022-03-02 16:01:21 +03:00 · 2022-03-02 15:44:34 +03:00 · 2022-03-02 15:36:31 +03:00 · 2022-03-02 18:03:28 +08:00 · 2022-03-02 12:50:31 +03:00
1063 changed files with 19567 additions and 12327 deletions
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -34,7 +34,9 @@ endif()
 message (STATUS "PROJECT ............................... " ${PROJECT_NAME})
 message (STATUS "CMAKE_VERSION ......................... " ${CMAKE_VERSION})
 message (STATUS "CMAKE_BINARY_DIR ...................... " ${CMAKE_BINARY_DIR})
+message (STATUS "CMAKE_SOURCE_DIR ...................... " ${CMAKE_SOURCE_DIR})
 message (STATUS "OpenVINO_SOURCE_DIR ................... " ${OpenVINO_SOURCE_DIR})
+message (STATUS "OpenVINO_BINARY_DIR ................... " ${OpenVINO_BINARY_DIR})
 message (STATUS "CMAKE_GENERATOR ....................... " ${CMAKE_GENERATOR})
 message (STATUS "CMAKE_C_COMPILER_ID ................... " ${CMAKE_C_COMPILER_ID})
 message (STATUS "CMAKE_CXX_COMPILER_ID ................. " ${CMAKE_CXX_COMPILER_ID})
@@ -42,7 +44,7 @@ message (STATUS "CMAKE_BUILD_TYPE ...................... " ${CMAKE_BUILD_TYPE})
 message (STATUS "CMAKE_TOOLCHAIN_FILE .................. " ${CMAKE_TOOLCHAIN_FILE})

 # remove file with exported developer targets to force its regeneration
-file(REMOVE "${CMAKE_BINARY_DIR}/ngraph/ngraphTargets.cmake")
+file(REMOVE "${CMAKE_BINARY_DIR}/ngraphTargets.cmake")
 file(REMOVE "${CMAKE_BINARY_DIR}/InferenceEngineTargets.cmake")
 file(REMOVE "${CMAKE_BINARY_DIR}/OpenVINOTargets.cmake")
 foreach(component IN LISTS openvino_export_components)
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -0,0 +1,68 @@
+# How to contribute to the OpenVINO repository
+
+We assume that you are an enthusiastic coder who wants to contribute some code. For that purpose, the OpenVINO project has a repository on GitHub, to simplify everybody's life! All bug fixes, new functionality, new tutorials, etc. should be submitted via GitHub's pull request mechanism.
+
+If you are not familiar with the mechanism - do not worry, it's very simple. Keep reading.
+
+## Before you start contributing you should
+
+-   Make sure you agree to contribute your code under the  [OpenVINO (Apache 2.0)](https://github.com/openvinotoolkit/openvino/blob/master/LICENSE)  license.
+-   If you are submitting a new module, you should submit it to the  [openvino_contrib](https://github.com/openvinotoolkit/openvino_contrib)  repository by default.
+-   If you are going to fix a bug, check that it still exists. This can be done by building the latest  [releases/2020/3](https://github.com/openvinotoolkit/openvino/tree/releases/2020/3)  branch (LTS release) or the latest master branch, and making sure that the error is still reproducible there. We do not fix bugs that only affect older non-LTS releases, such as 2020.2 (more details about the  [branching strategy](https://github.com/openvinotoolkit/openvino/wiki/Branches)).
+-   Make sure that nobody beat you to fixing or reporting the issue by searching the  [GitHub OpenVINO issues](https://github.com/openvinotoolkit/openvino/issues)  page and checking that nobody is already working on it. If somebody is, you might provide support or suggestions in the issue or in the linked pull request.
+-   If you have a question about the software, then this is  **NOT**  the right place. You should open up a question at the  [OpenVINO forum](https://community.intel.com/t5/Intel-Distribution-of-OpenVINO/bd-p/distribution-openvino-toolkit). In order to post a decent question from the start, feel free to read the official forum guidelines.
+
+Before you open up anything on the OpenVINO GitHub page, be sure that you are at the right place with your problem.
+
+## "Fork & Pull Request model" for code contribution
+
+### The instruction in brief
+
+-   Register at GitHub. Create your fork of the OpenVINO repository  [https://github.com/openvinotoolkit/openvino](https://github.com/openvinotoolkit/openvino)  (see  [https://help.github.com/articles/fork-a-repo](https://help.github.com/articles/fork-a-repo)  for details).
+-   Install Git.
+    -   Set your user name and email address in your Git configuration to match your GitHub account (see  [https://git-scm.com/book/en/v2/Getting-Started-First-Time-Git-Setup](https://git-scm.com/book/en/v2/Getting-Started-First-Time-Git-Setup)  for details).
+-   Choose a task for yourself. It could be a bugfix or some new code.
+-   Choose a base branch for your work. More details about branches and policies are here:  [Branches](https://github.com/openvinotoolkit/openvino/wiki/Branches)
+-   Clone your fork to your computer.
+-   Create a new branch (with a meaningful name) from the base branch you chose.
+-   Modify / add the code following our  [Coding Style Guide](https://github.com/openvinotoolkit/openvino/wiki/CodingStyleGuideLines)  and  [Documentation guidelines](https://github.com/openvinotoolkit/openvino/wiki/CodingStyleGuideLinesDocumentation).
+-   If you want to add a new sample, please look at this  [Guide for contributing to C++/C/Python IE samples](https://github.com/openvinotoolkit/openvino/wiki/SampleContribute)
+-   Run the test suite locally:
+    -   execute each test binary from the artifacts directory, e.g.  `<source dir>/bin/intel64/Release/ieFuncTests`
+-   If you contribute to the documentation and want to add a new guide:
+    -   Create a new markdown file in an appropriate folder.
+    -   **REQUIRED:**  The document title must contain a document label in the form:  `{#openvino_docs_<name>}`. For example:  `Deep Learning Network Intermediate Representation and Operation Sets in OpenVINO™ {#openvino_docs_MO_DG_IR_and_opsets}`.
+    -   Add your file to the documentation structure. Open the documentation structure file  [`docs/doxygen/ie_docs.xml`](https://github.com/openvinotoolkit/openvino/blob/master/docs/doxygen/ie_docs.xml)  and add your file path to the appropriate section.
+-   When you are done, make sure that your branch is up to date with the latest state of the branch you want to contribute to (e.g.  `git fetch upstream && git merge upstream/master`), then push your branch to your GitHub fork and create a pull request from your branch to the base branch (see  [https://help.github.com/articles/using-pull-requests](https://help.github.com/articles/using-pull-requests)  for details).
+
+## Making a good pull request
+
+Following these guidelines will increase the likelihood of your pull request being accepted:
+
+-   Before pushing your PR to the repository, make sure that it builds perfectly fine on your local system.
+-   Add enough information, like a meaningful title, the reason why you made the commit and a link to the issue page if you opened one for this PR.
+-   Scope your PR to one issue. Before submitting, make sure the diff contains no unrelated changes. If you want to cover more than one issue, submit your changes for each as separate pull requests.
+-   If you have added new functionality, you should update/create the relevant documentation, as well as add tests for it to the testsuite.
+-   Try not to include "oops" commits - ones that just fix an error in the previous commit. If you have those, then before submitting  [squash](https://github.com/openvinotoolkit/openvino/wiki/Contribute#https://git-scm.com/book/en/v2/Git-Tools-Rewriting-History#Squashing-Commits)  those fixes directly into the commits where they belong.
+-   Make sure to choose the right base branch and to follow the  [Coding Style Guide](https://github.com/openvinotoolkit/openvino/wiki/CodingStyleGuideLines)  for your code or the  [Documentation guidelines](https://github.com/openvinotoolkit/openvino/wiki/CodingStyleGuideLinesDocumentation)  if you are changing documentation files.
+-   Make sure to add a test for new functionality, or a test that reproduces the fixed bug, together with related test data. Please do not add extra images or videos if some of the existing media files are suitable.
+
+## Testing and merging pull requests
+
+-   Your pull request will be automatically tested by OpenVINO's precommit (testing statuses are automatically reported as "green" or "red" circles in precommit steps on the PR's page). If any builders have failed, you should fix the issue. To rerun the automatic builds, just push changes to your branch on GitHub. There is no need to close the pull request and open a new one!
+-   Once all the builders are "green", one of the OpenVINO developers will review your code. The reviewer may ask you to modify your pull request. Please respond to reviewers in a timely manner (within weeks, not months), otherwise your submission could be postponed or even rejected.
+
+## PR review good practices
+
+-   Originator is responsible for driving the review of changes and should ping reviewers periodically.
+-   Originator should close comments from the Reviewer when they are resolved. The Reviewer may re-open a comment if they do not agree with the resolution.
+-   Originator should request re-review from the Reviewer when all comments are resolved by pushing the button in the “Reviewers” section.
+-   If your change is still WIP and you want to check CI test results early, use a  _Draft_  PR.
+-   Do  **NOT**  rewrite history (push -f) once you have converted a draft PR into a regular one; add new commits instead. Looking at diffs makes review easier.
+-   Write meaningful descriptions of commits resulting from review.  _"Addressing review comments"_  is  **NOT**  a good description! A quick look at good descriptions can tell you a lot about what is going on in a PR without the need to go through all of the resolved comments.
+
+## Merging PR
+
+As soon as the reviewer is fine with the pull request and Precommit likes your code and shows "green" status, the "Approved" review status is set, which signals to OpenVINO maintainers that they can merge your pull request.
+
+© Copyright 2018-2022, OpenVINO team
--- a/README.md
+++ b/README.md
@@ -43,7 +43,7 @@ Please report questions, issues and suggestions using:
 \* Other names and brands may be claimed as the property of others.

 [Open Model Zoo]:https://github.com/openvinotoolkit/open_model_zoo
-[OpenVINO™ Runtime]:https://docs.openvino.ai/latest/openvino_docs_IE_DG_Deep_Learning_Inference_Engine_DevGuide.html
+[OpenVINO™ Runtime]:https://docs.openvino.ai/latest/openvino_docs_OV_Runtime_User_Guide.html
 [Model Optimizer]:https://docs.openvino.ai/latest/openvino_docs_MO_DG_Deep_Learning_Model_Optimizer_DevGuide.html
 [Post-Training Optimization Tool]:https://docs.openvino.ai/latest/pot_README.html
 [tag on StackOverflow]:https://stackoverflow.com/search?q=%23openvino
--- a/cmake/dependencies.cmake
+++ b/cmake/dependencies.cmake
@@ -28,12 +28,12 @@ if(COMMAND get_linux_name)
 endif()

 if(CMAKE_CROSSCOMPILING AND CMAKE_HOST_SYSTEM_NAME MATCHES Linux AND CMAKE_HOST_SYSTEM_PROCESSOR MATCHES "amd64.*|x86_64.*|AMD64.*")
-    set(protoc_version "3.9.2")
+    set(protoc_version "3.18.2")

    RESOLVE_DEPENDENCY(SYSTEM_PROTOC_ROOT
        ARCHIVE_LIN "protoc-${protoc_version}-linux-x86_64.tar.gz"
        TARGET_PATH "${TEMP}/protoc-${protoc_version}-linux-x86_64"
-        SHA256 "1d6da1d97d0cbfcd333558afe24533eb3cb48dc1e0ab5e971aa1e50ede8bcf45"
+        SHA256 "42fde2b6044c1f74c7e86d4e03b43aac87128ddf57ac6ed8c4eab7a1e21bbf21"
    )
    debug_message(STATUS "host protoc-${protoc_version} root path = " ${SYSTEM_PROTOC_ROOT})

--- a/cmake/developer_package/IEDevScriptsConfig.cmake
+++ b/cmake/developer_package/IEDevScriptsConfig.cmake
@@ -158,16 +158,22 @@ else ()
 endif()
 add_definitions(-DIE_BUILD_POSTFIX=\"${IE_BUILD_POSTFIX}\")

+macro(ov_set_if_not_defined var value)
+    if(NOT DEFINED ${var})
+        set(${var} ${value})
+    endif()
+endmacro()
+
 if(NOT UNIX)
-    set(CMAKE_LIBRARY_OUTPUT_DIRECTORY ${OUTPUT_ROOT}/${BIN_FOLDER})
-    set(CMAKE_ARCHIVE_OUTPUT_DIRECTORY ${OUTPUT_ROOT}/${BIN_FOLDER})
+    ov_set_if_not_defined(CMAKE_LIBRARY_OUTPUT_DIRECTORY ${OUTPUT_ROOT}/${BIN_FOLDER})
+    ov_set_if_not_defined(CMAKE_ARCHIVE_OUTPUT_DIRECTORY ${OUTPUT_ROOT}/${BIN_FOLDER})
 else()
-    set(CMAKE_LIBRARY_OUTPUT_DIRECTORY ${OUTPUT_ROOT}/${BIN_FOLDER}/lib)
-    set(CMAKE_ARCHIVE_OUTPUT_DIRECTORY ${OUTPUT_ROOT}/${BIN_FOLDER}/lib)
+    ov_set_if_not_defined(CMAKE_LIBRARY_OUTPUT_DIRECTORY ${OUTPUT_ROOT}/${BIN_FOLDER}/lib)
+    ov_set_if_not_defined(CMAKE_ARCHIVE_OUTPUT_DIRECTORY ${OUTPUT_ROOT}/${BIN_FOLDER}/lib)
 endif()
-set(CMAKE_COMPILE_PDB_OUTPUT_DIRECTORY ${OUTPUT_ROOT}/${BIN_FOLDER})
-set(CMAKE_PDB_OUTPUT_DIRECTORY ${OUTPUT_ROOT}/${BIN_FOLDER})
-set(CMAKE_RUNTIME_OUTPUT_DIRECTORY ${OUTPUT_ROOT}/${BIN_FOLDER})
+ov_set_if_not_defined(CMAKE_COMPILE_PDB_OUTPUT_DIRECTORY ${OUTPUT_ROOT}/${BIN_FOLDER})
+ov_set_if_not_defined(CMAKE_PDB_OUTPUT_DIRECTORY ${OUTPUT_ROOT}/${BIN_FOLDER})
+ov_set_if_not_defined(CMAKE_RUNTIME_OUTPUT_DIRECTORY ${OUTPUT_ROOT}/${BIN_FOLDER})

 if(APPLE)
    set(CMAKE_MACOSX_RPATH ON)
@@ -206,6 +212,10 @@ endif()

 macro(ov_install_static_lib target comp)
    if(NOT BUILD_SHARED_LIBS)
+        get_target_property(target_type ${target} TYPE)
+        if(${target_type} STREQUAL "STATIC_LIBRARY")
+            set_target_properties(${target} PROPERTIES EXCLUDE_FROM_ALL FALSE)
+        endif()
        install(TARGETS ${target} EXPORT OpenVINOTargets
                ARCHIVE DESTINATION ${IE_CPACK_ARCHIVE_PATH} COMPONENT ${comp} ${ARGN})
    endif()
--- a/cmake/developer_package/download/download_and_extract.cmake
+++ b/cmake/developer_package/download/download_and_extract.cmake
@@ -146,8 +146,6 @@ function (DownloadOrExtractInternal URL archive_path unpacked_path folder fattal

 endfunction(DownloadOrExtractInternal)

-file(REMOVE ${CMAKE_BINARY_DIR}/dependencies_64.txt)
-
 function (CheckOrDownloadAndExtract component RELATIVE_URL archive_name unpacked_path result_path folder fattal resultExt use_alternatives sha256 files_to_extract)
  set (archive_path ${TEMP}/download/${archive_name})
  set (status "ON")
@@ -164,7 +162,6 @@ function (CheckOrDownloadAndExtract component RELATIVE_URL archive_name unpacked
  if (${use_alternatives})
    set(DEP_INFO "${component}=${URL}")
    debug_message (STATUS "DEPENDENCY_URL: ${DEP_INFO}")
-    file(APPEND ${CMAKE_BINARY_DIR}/dependencies_64.txt "${DEP_INFO}\n")
  endif()

  debug_message ("checking that unpacked directory exist: ${unpacked_path}")
--- a/cmake/developer_package/packaging.cmake
+++ b/cmake/developer_package/packaging.cmake
@@ -15,6 +15,10 @@ function(ie_cpack_set_library_dir)
        set(IE_CPACK_LIBRARY_PATH runtime/lib/${ARCH_FOLDER}/$<CONFIG> PARENT_SCOPE)
        set(IE_CPACK_RUNTIME_PATH runtime/bin/${ARCH_FOLDER}/$<CONFIG> PARENT_SCOPE)
        set(IE_CPACK_ARCHIVE_PATH runtime/lib/${ARCH_FOLDER}/$<CONFIG> PARENT_SCOPE)
+    elseif(APPLE)
+        set(IE_CPACK_LIBRARY_PATH runtime/lib/${ARCH_FOLDER}/$<CONFIG> PARENT_SCOPE)
+        set(IE_CPACK_RUNTIME_PATH runtime/lib/${ARCH_FOLDER}/$<CONFIG> PARENT_SCOPE)
+        set(IE_CPACK_ARCHIVE_PATH runtime/lib/${ARCH_FOLDER}/$<CONFIG> PARENT_SCOPE)
    else()
        set(IE_CPACK_LIBRARY_PATH runtime/lib/${ARCH_FOLDER} PARENT_SCOPE)
        set(IE_CPACK_RUNTIME_PATH runtime/lib/${ARCH_FOLDER} PARENT_SCOPE)
--- a/cmake/templates/InferenceEngineDeveloperPackageConfig.cmake.in
+++ b/cmake/templates/InferenceEngineDeveloperPackageConfig.cmake.in
@@ -44,7 +44,7 @@ find_dependency(InferenceEngine
                NO_DEFAULT_PATH)

 find_dependency(ngraph
-                PATHS "${CMAKE_CURRENT_LIST_DIR}/src/core"
+                PATHS "${CMAKE_CURRENT_LIST_DIR}"
                NO_CMAKE_FIND_ROOT_PATH
                NO_DEFAULT_PATH)

--- a/cmake/test_model_zoo.cmake
+++ b/cmake/test_model_zoo.cmake
@@ -86,11 +86,6 @@ ov_model_convert("${OpenVINO_SOURCE_DIR}/${rel_path}"
                 "${CMAKE_RUNTIME_OUTPUT_DIRECTORY}/test_model_zoo/onnx_import"
                 ie_onnx_import_out_files)

-set(rel_path "docs/onnx_custom_op")
-ov_model_convert("${OpenVINO_SOURCE_DIR}/${rel_path}"
-                 "${CMAKE_RUNTIME_OUTPUT_DIRECTORY}/test_model_zoo/docs/models"
-                 docs_onnx_out_files)
-
 if(ENABLE_TESTS)
    if(ENABLE_OV_ONNX_FRONTEND AND ENABLE_REQUIREMENTS_INSTALL)
        find_package(PythonInterp 3 REQUIRED)
--- a/cmake/toolchains/mt.runtime.win32.toolchain.cmake
+++ b/cmake/toolchains/mt.runtime.win32.toolchain.cmake
@@ -25,7 +25,7 @@ endif()
 if(use_static_runtime)
    foreach(lang C CXX)
        foreach(build_type "" "_DEBUG" "_MINSIZEREL" "_RELEASE" "_RELWITHDEBINFO")
-            set(flag_var "CMAKE_${lang}_FLAGS${build_type}")
+            set(flag_var "CMAKE_${lang}_FLAGS${build_type}_INIT")
            string(REPLACE "/MD" "/MT" ${flag_var} "${${flag_var}}")
        endforeach()
    endforeach()
--- a/cmake/toolchains/oecore.arm64.toolchain.cmake
+++ b/cmake/toolchains/oecore.arm64.toolchain.cmake
@@ -1,41 +0,0 @@
-# Copyright (C) 2018-2022 Intel Corporation
-# SPDX-License-Identifier: Apache-2.0
-#
-
-if(DEFINED OECORE_BASE_DIR)
-    # OECORE_BASE_DIR was passed via CMake command line, nothing to do
-elseif(DEFINED ENV{OECORE_BASE_DIR})
-    # User sets OECORE_BASE_DIR environment variable
-    set(OECORE_BASE_DIR $ENV{OECORE_BASE_DIR})
-elseif(DEFINED ENV{OECORE_NATIVE_SYSROOT})
-    # OECORE_NATIVE_SYSROOT is a default environment variable for the OECore toolchain
-    set(OECORE_BASE_DIR "$ENV{OECORE_NATIVE_SYSROOT}/../..")
-else()
-    # Use default value
-    set(OECORE_BASE_DIR "/usr/local/oecore-x86_64")
-endif()
-
-set(OECORE_TARGET_NAME              "aarch64-ese-linux")
-set(OECORE_TARGET_SYSROOT           "${OECORE_BASE_DIR}/sysroots/${OECORE_TARGET_NAME}")
-set(OECORE_HOST_SYSROOT             "${OECORE_BASE_DIR}/sysroots/x86_64-esesdk-linux")
-set(OECORE_HOST_COMPILER_BIN_DIR    "${OECORE_HOST_SYSROOT}/usr/bin/${OECORE_TARGET_NAME}")
-
-set(CMAKE_SYSTEM_NAME       "Linux")
-set(CMAKE_SYSTEM_PROCESSOR  "aarch64")
-
-set(CMAKE_SYSROOT "${OECORE_TARGET_SYSROOT}")
-
-set(CMAKE_C_COMPILER    "${OECORE_HOST_COMPILER_BIN_DIR}/aarch64-ese-linux-gcc")
-set(CMAKE_CXX_COMPILER  "${OECORE_HOST_COMPILER_BIN_DIR}/aarch64-ese-linux-g++")
-
-set(CMAKE_C_FLAGS_INIT      "-mcpu=cortex-a53 -mtune=cortex-a53 --sysroot=${OECORE_TARGET_SYSROOT}")
-set(CMAKE_CXX_FLAGS_INIT    "-mcpu=cortex-a53 -mtune=cortex-a53 --sysroot=${OECORE_TARGET_SYSROOT}")
-
-set(CMAKE_EXE_LINKER_FLAGS_INIT     "-Wl,-O1 -Wl,--hash-style=gnu -Wl,--as-needed --sysroot=${OECORE_TARGET_SYSROOT}")
-set(CMAKE_SHARED_LINKER_FLAGS_INIT  "-Wl,-O1 -Wl,--hash-style=gnu -Wl,--as-needed --sysroot=${OECORE_TARGET_SYSROOT}")
-set(CMAKE_MODULE_LINKER_FLAGS_INIT  "-Wl,-O1 -Wl,--hash-style=gnu -Wl,--as-needed --sysroot=${OECORE_TARGET_SYSROOT}")
-
-set(CMAKE_FIND_ROOT_PATH_MODE_PROGRAM NEVER)
-set(CMAKE_FIND_ROOT_PATH_MODE_LIBRARY ONLY)
-set(CMAKE_FIND_ROOT_PATH_MODE_INCLUDE ONLY)
-set(CMAKE_FIND_ROOT_PATH_MODE_PACKAGE ONLY)
--- a/cmake/toolchains/onecoreuap.toolchain.cmake
+++ b/cmake/toolchains/onecoreuap.toolchain.cmake
@@ -35,14 +35,14 @@ if(_onecoreuap_arch STREQUAL "x64")
    # Forcefull make VS search for C++ libraries in these folders prior to other c++ standard libraries localizations.
    add_link_options("/LIBPATH:\"\$\(VC_LibraryPath_VC_x64_OneCore\)\"")

-    set(CMAKE_C_STANDARD_LIBRARIES "\$\(UCRTContentRoot\)lib/\$\(TargetUniversalCRTVersion\)/um/\$\(Platform\)/OneCoreUap.lib" CACHE STRING "" FORCE)
-    set(CMAKE_CXX_STANDARD_LIBRARIES "\$\(UCRTContentRoot\)lib/\$\(TargetUniversalCRTVersion\)/um/\$\(Platform\)/OneCoreUap.lib" CACHE STRING "" FORCE)
+    set(CMAKE_C_STANDARD_LIBRARIES_INIT "\$\(UCRTContentRoot\)lib/\$\(TargetUniversalCRTVersion\)/um/\$\(Platform\)/OneCoreUap.lib" CACHE STRING "" FORCE)
+    set(CMAKE_CXX_STANDARD_LIBRARIES_INIT "\$\(UCRTContentRoot\)lib/\$\(TargetUniversalCRTVersion\)/um/\$\(Platform\)/OneCoreUap.lib" CACHE STRING "" FORCE)
 elseif(_onecoreuap_arch STREQUAL "X86")
    add_link_options("/LIBPATH:\"\$\(VCInstallDir\)lib/onecore\"")
    add_link_options("/LIBPATH:\"\$\(VC_LibraryPath_VC_x86_OneCore\)\"")

-    set(CMAKE_C_STANDARD_LIBRARIES "\$\(UCRTContentRoot\)lib/\$\(TargetUniversalCRTVersion\)/um/x86/OneCoreUap.lib" CACHE STRING "" FORCE)
-    set(CMAKE_CXX_STANDARD_LIBRARIES "\$\(UCRTContentRoot\)lib/\$\(TargetUniversalCRTVersion\)/um/x86/OneCoreUap.lib" CACHE STRING "" FORCE)
+    set(CMAKE_C_STANDARD_LIBRARIES_INIT "\$\(UCRTContentRoot\)lib/\$\(TargetUniversalCRTVersion\)/um/x86/OneCoreUap.lib" CACHE STRING "" FORCE)
+    set(CMAKE_CXX_STANDARD_LIBRARIES_INIT "\$\(UCRTContentRoot\)lib/\$\(TargetUniversalCRTVersion\)/um/x86/OneCoreUap.lib" CACHE STRING "" FORCE)
 else()
    message(FATAL_ERROR "Unsupported architecture ${_onecoreuap_arch}. Only X86 or X86_64 are supported")
 endif()
@@ -52,8 +52,8 @@ unset(_onecoreuap_arch)
 # compile flags

 set(includes "/I\"\$\(UniversalCRT_IncludePath\)\"")
-set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} ${includes}")
-set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} ${includes}")
+set(CMAKE_C_FLAGS_INIT "${CMAKE_C_FLAGS_INIT} ${includes}")
+set(CMAKE_CXX_FLAGS_INIT "${CMAKE_CXX_FLAGS_INIT} ${includes}")
 unset(includes)

 # linker flags
@@ -62,9 +62,9 @@ foreach(lib kernel32 user32 advapi32 ole32 mscoree combase)
    set(linker_flags "/NODEFAULTLIB:${lib}.lib ${linker_flags}")
 endforeach()

-set(CMAKE_SHARED_LINKER_FLAGS "${CMAKE_SHARED_LINKER_FLAGS} ${linker_flags}")
-set(CMAKE_MODULE_LINKER_FLAGS "${CMAKE_MODULE_LINKER_FLAGS} ${linker_flags}")
-set(CMAKE_EXE_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS} ${linker_flags}")
+set(CMAKE_SHARED_LINKER_FLAGS_INIT "${CMAKE_SHARED_LINKER_FLAGS_INIT} ${linker_flags}")
+set(CMAKE_MODULE_LINKER_FLAGS_INIT "${CMAKE_MODULE_LINKER_FLAGS_INIT} ${linker_flags}")
+set(CMAKE_EXE_LINKER_FLAGS_INIT "${CMAKE_EXE_LINKER_FLAGS_INIT} ${linker_flags}")
 unset(linker_flags)

 #
--- a/docs/CMakeLists.txt
+++ b/docs/CMakeLists.txt
@@ -17,9 +17,6 @@ if(NOT ENABLE_DOCKER)
        set(OpenVINO_DIR ${CMAKE_BINARY_DIR})
    endif()

-    if(ENABLE_OV_ONNX_FRONTEND)
-        add_subdirectory(onnx_custom_op)
-    endif()
    add_subdirectory(template_extension)

    set(all_docs_targets
--- a/docs/Doxyfile.config
+++ b/docs/Doxyfile.config
@@ -843,16 +843,6 @@ INPUT                  = "@MARKDOWN_INPUT@" \
                         "@OpenVINO_SOURCE_DIR@/src/common/transformations/include/" \
                         "@OpenVINO_SOURCE_DIR@/src/common/util/include/" \
                         "@OpenVINO_SOURCE_DIR@/src/core/include/" \
-                         "@OpenVINO_SOURCE_DIR@/src/core/include/ngraph/" \
-                         "@OpenVINO_SOURCE_DIR@/src/core/include/ngraph/descriptor" \
-                         "@OpenVINO_SOURCE_DIR@/src/core/include/ngraph/op/" \
-                         "@OpenVINO_SOURCE_DIR@/src/core/include/ngraph/op/util" \
-                         "@OpenVINO_SOURCE_DIR@/src/core/include/ngraph/opsets/" \
-                         "@OpenVINO_SOURCE_DIR@/src/core/include/ngraph/pass/" \
-                         "@OpenVINO_SOURCE_DIR@/src/core/include/ngraph/pattern/" \
-                         "@OpenVINO_SOURCE_DIR@/src/core/include/ngraph/pattern/op/" \
-                         "@OpenVINO_SOURCE_DIR@/src/core/include/ngraph/runtime/" \
-                         "@OpenVINO_SOURCE_DIR@/src/core/include/ngraph/type/" \
                         "@OpenVINO_SOURCE_DIR@/src/core/include/openvino/" \
                         "@OpenVINO_SOURCE_DIR@/src/core/include/openvino/core/" \
                         "@OpenVINO_SOURCE_DIR@/src/core/include/openvino/core/descriptor/" \
@@ -917,7 +907,9 @@ RECURSIVE              = YES
 # Note that relative paths are relative to the directory from which doxygen is
 # run.

-EXCLUDE                =
+EXCLUDE                = "@OpenVINO_SOURCE_DIR@/thirdparty" \
+                         "@OpenVINO_SOURCE_DIR@/temp" \
+                         "@OpenVINO_SOURCE_DIR@/bin"

 # The EXCLUDE_SYMLINKS tag can be used to select whether or not files or
 # directories that are symbolic links (a Unix file system feature) are excluded
@@ -936,7 +928,6 @@ EXCLUDE_SYMLINKS       = NO
 EXCLUDE_PATTERNS       = */temp/* \
                         */bin/* \
                         */tests/* \
-                         */openvx/* \
                         */thirdparty/* \
                         "@DOXYREST_OUT@" \
                         "@XML_OUTPUT@" \
@@ -1045,7 +1036,6 @@ EXCLUDE_SYMBOLS        = InferenceEngine::details \
 EXAMPLE_PATH           = "@OpenVINO_SOURCE_DIR@" \
                         "@OpenVINO_SOURCE_DIR@/docs/HOWTO/" \
                         "@OpenVINO_SOURCE_DIR@/docs/" \
-                         "@OpenVINO_SOURCE_DIR@/docs/onnx_custom_op/" \
                         "@OpenVINO_SOURCE_DIR@/docs/template_extension/" \
                         "@OpenVINO_SOURCE_DIR@/docs/template_extension/old/" \
                         "@OpenVINO_SOURCE_DIR@/docs/template_extension/new/" \
--- a/docs/Extensibility_UG/Intro.md
+++ b/docs/Extensibility_UG/Intro.md
@@ -0,0 +1,115 @@
+# OpenVINO Extensibility Mechanism {#openvino_docs_Extensibility_UG_Intro}
+
+@sphinxdirective
+
+.. toctree::
+   :maxdepth: 1
+   :hidden:
+
+   openvino_docs_Extensibility_UG_add_openvino_ops
+
+@endsphinxdirective
+
+The Intel® Distribution of OpenVINO™ toolkit supports neural network models trained with multiple frameworks including
+TensorFlow, Caffe, MXNet, Kaldi, PaddlePaddle, and ONNX. The list of supported operations (layers) is different for
+each of the supported frameworks. To see the operations supported by your framework, refer to
+[Supported Framework Operations](../MO_DG/prepare_model/Supported_Frameworks_Layers.md).
+
+Custom operations, that is those not included in the list, are not recognized by OpenVINO™ out-of-the-box. Therefore, creating Intermediate Representation (IR) for a model using them requires additional steps. This guide illustrates the workflow for running inference on topologies featuring custom operations, allowing you to plug in your own implementation for existing or completely new operations.
+
+If your model contains operations not normally supported by OpenVINO™, the OpenVINO™ Extensibility API lets you add support for those custom operations and use one implementation for Model Optimizer and OpenVINO™ Runtime.
+
+There are two steps to support inference of a model with custom operation(s):
+1. Add support for a [custom operation in the Model Optimizer](../MO_DG/prepare_model/customize_model_optimizer/Customize_Model_Optimizer.md) so
+the Model Optimizer can generate the IR with the operation.
+2. Create a custom operation as described in [Custom Operation](add_openvino_ops.md).
+
+## OpenVINO™ Extensions
+
+OpenVINO™ provides extensions for:
+
+ * [Custom OpenVINO™ Operation](add_openvino_ops.md):
+    - Enables the creation of unsupported operations
+    - Enables the use of `ov::Core::read_model` to read models with unsupported operations
+    - Provides a shape inference mechanism for custom operations
+    - Provides an `evaluate` method, which allows the operation to be supported on CPU or used for constant folding
+
+> **NOTE**: This documentation is written based on the [Template extension](https://github.com/openvinotoolkit/openvino/tree/master/docs/template_extension/new), which demonstrates extension development details. You can review the complete code, which is fully compilable and up-to-date, to see how it works.
+
+## Load extensions to OpenVINO™ Runtime
+
+To load the extensions to the `ov::Core` object, use the `ov::Core::add_extension` method. This method can load either a library with extensions or extensions defined in code.
+
+### Load extensions to core
+
+Extensions can be loaded from code with the `ov::Core::add_extension` method:
+
+@sphinxdirective
+
+.. tab:: C++
+
+    .. doxygensnippet:: docs/snippets/ov_extensions.cpp
+       :language: cpp
+       :fragment: add_extension
+
+.. tab:: Python
+
+    .. doxygensnippet:: docs/snippets/ov_extensions.py
+       :language: python
+       :fragment: add_extension
+
+@endsphinxdirective
+
+### Create library with extensions
+
+You need to create an extension library in the following cases:
+ - Load extensions to Model Optimizer
+ - Load extensions to Python application
+
+If you want to create an extension library, for example in order to load these extensions to the Model Optimizer, do the following:
+Create an entry point for the extension library. OpenVINO™ provides an `OPENVINO_CREATE_EXTENSIONS()` macro, which defines an entry point to a library with OpenVINO™ Extensions.
+This macro should have a vector of all OpenVINO™ Extensions as an argument.
+
+Based on that, the declaration of an extension class can look as follows:
+
+@snippet template_extension/new/ov_extension.cpp ov_extension:entry_point
+
+To configure the build of your extension library, use the following CMake script:
+
+@snippet template_extension/new/CMakeLists.txt cmake:extension
+
+This CMake script finds OpenVINO™ using the `find_package` CMake command.
+
+To build the extension library, run the commands below:
+
+```sh
+$ cd docs/template_extension/new
+$ mkdir build
+$ cd build
+$ cmake -DOpenVINO_DIR=<OpenVINO_DIR> ../
+$ cmake --build .
+```
+
+After the build, you can use the path to your extension library to load your extensions to OpenVINO™ Runtime:
+
+@sphinxdirective
+
+.. tab:: C++
+
+    .. doxygensnippet:: docs/snippets/ov_extensions.cpp
+       :language: cpp
+       :fragment: add_extension_lib
+
+.. tab:: Python
+
+    .. doxygensnippet:: docs/snippets/ov_extensions.py
+       :language: python
+       :fragment: add_extension_lib
+
+@endsphinxdirective
+
+## See Also
+
+* [OpenVINO Transformations](./ov_transformations.md)
+* [Using Inference Engine Samples](../OV_Runtime_UG/Samples_Overview.md)
+* [Hello Shape Infer SSD sample](../../samples/cpp/hello_reshape_ssd/README.md)
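
As a rough illustration of the entry-point idea described under "Create library with extensions" in the Intro.md diff above, the following self-contained C++ sketch mimics the pattern without OpenVINO headers. `Extension`, `IdentityExtension`, `DEMO_CREATE_EXTENSIONS`, and `demo_create_extensions` are hypothetical stand-ins invented for this example. They are not the real `OPENVINO_CREATE_EXTENSIONS()` macro or the OpenVINO types; the actual definitions are in the template extension sources referenced by the document.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for an extension base class (NOT the OpenVINO type).
struct Extension {
    using Ptr = std::shared_ptr<Extension>;
    virtual ~Extension() = default;  // virtual destructor: safe deletion through a base pointer
    virtual std::string name() const = 0;
};

// Hypothetical extension that the library wants to export.
struct IdentityExtension : Extension {
    std::string name() const override { return "Identity"; }
};

// Sketch of an entry-point macro: it defines a single C-linkage function
// that hands the caller the list of extensions provided by this library.
#define DEMO_CREATE_EXTENSIONS(extensions)                                     \
    extern "C" void demo_create_extensions(std::vector<Extension::Ptr>& out) { \
        out = extensions;                                                      \
    }

// The macro argument is a vector holding every extension the library provides.
DEMO_CREATE_EXTENSIONS(std::vector<Extension::Ptr>({std::make_shared<IdentityExtension>()}))
```

In a real shared library a loader would look the function up by name after opening the library; in this sketch it can simply be called directly to fill a vector.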
--- a/docs/Extensibility_UG/add_openvino_ops.md
+++ b/docs/Extensibility_UG/add_openvino_ops.md
@@ -0,0 +1,62 @@
+# Custom OpenVINO™ Operations {#openvino_docs_Extensibility_UG_add_openvino_ops}
+
+OpenVINO™ Extension API allows you to register custom operations to support models with operations which OpenVINO™ does not support out-of-the-box.
+
+## Operation Class
+
+To add your custom operation, create a new class that extends `ov::Op`, which is in turn derived from `ov::Node`, the base class for all graph operations in OpenVINO™. To use `ov::Op`, include the following file:
+
+@snippet template_extension/new/identity.hpp op:common_include
+
+Follow the steps below to add a custom operation:
+
+1. Add the `OPENVINO_OP` macro which defines a `NodeTypeInfo` object that identifies the type of the operation to the graph users and helps with dynamic type resolution. The type info of an operation currently consists of a string operation identifier and a string for operation version.
+
+2. Implement a default constructor and constructors that optionally take the operation inputs and attributes as parameters.
+
+3. Override the shape inference method `validate_and_infer_types`. This method is called multiple times during graph manipulations to determine the shapes and element types of the operation's outputs. To access the input shapes and input element types, use the `get_input_partial_shape()` and `get_input_element_type()` methods of `ov::Node`. Set the inferred shape and element type of the output using `set_output_type`.
+
+4. Override the `clone_with_new_inputs` method, which enables graph manipulation routines to create copies of this operation and connect it to different nodes during optimization.
+
+5. Override the `visit_attributes` method, which enables serialization and deserialization of operation attributes. An `AttributeVisitor` is passed to the method, and the implementation is expected to walk over all the attributes in the op using the type-aware `on_attribute` helper. Helpers are already implemented for standard C++ types like `int64_t`, `float`, `bool`, `vector`, and for existing OpenVINO defined types.
+
+6. Override `evaluate`, which is an optional method that enables fallback of some devices to this implementation and the application of constant folding if there is a custom operation on the constant branch. If your operation implements the `evaluate` method, you also need to override the `has_evaluate` method, which reports whether `evaluate` is available for the operation.
+
+7. Add the `OPENVINO_FRAMEWORK_MAP` macro if you want to map the custom operation to a framework operation with the same name. This optional macro can be used for one-to-one mapping. To use it, include the frontend-specific headers:
+   @snippet template_extension/new/identity.hpp op:frontend_include
+
+Based on that, the declaration of an operation class can look as follows:
+
+@snippet template_extension/new/identity.hpp op:header
+
+### Operation Constructors
+
+An OpenVINO™ operation has two constructors:
+* Default constructor, which enables you to create an operation without attributes
+* Constructor that creates and validates an operation with specified inputs and attributes
+
+@snippet template_extension/new/identity.cpp op:ctor
+
+### `validate_and_infer_types()`
+
+The `ov::Node::validate_and_infer_types` method validates operation attributes and calculates output shapes using attributes of the operation.
+
+@snippet template_extension/new/identity.cpp op:validate
+
+### `clone_with_new_inputs()`
+
+The `ov::Node::clone_with_new_inputs` method creates a copy of the operation with new inputs.
+
+@snippet template_extension/new/identity.cpp op:copy
+
+### `visit_attributes()`
+
+The `ov::Node::visit_attributes` method enables you to visit all operation attributes.
+
+@snippet template_extension/new/identity.cpp op:visit_attributes
+
+### `evaluate()` and `has_evaluate()`
+
+The `ov::Node::evaluate` method enables you to apply constant folding to an operation. The `has_evaluate` method reports whether `evaluate` is available for the operation.
+
+@snippet template_extension/new/identity.cpp op:evaluate
--- a/docs/Extensibility_UG/graph_rewrite_pass.md
+++ b/docs/Extensibility_UG/graph_rewrite_pass.md
@@ -0,0 +1,28 @@
+# OpenVINO Graph Rewrite Pass {#openvino_docs_Extensibility_UG_graph_rewrite_pass}
+
+`ov::pass::GraphRewrite` runs multiple matcher passes on `ov::Model` in a single graph traversal.
+Example:
+
+@snippet src/transformations/template_pattern_transformation.cpp matcher_pass:graph_rewrite
+
+In addition, GraphRewrite handles nodes that were registered by MatcherPasses during their execution. These nodes are added to the beginning of the sequence of nodes for pattern matching.
+
+> **NOTE**: When `ov::pass::Manager` is used, a temporary GraphRewrite is created to execute a single MatcherPass.
+
+GraphRewrite has two algorithms for executing MatcherPasses. The first algorithm is straightforward: it applies each MatcherPass, in registration order, to the current node.
+
+![graph_rewrite_execution]
+
+This approach is inefficient when many passes are registered. Therefore, GraphRewrite first checks that all MatcherPass patterns have a type-based root node (that is, the type of the root node is not hidden inside a predicate).
+If they do, it creates a map from root node types to the registered MatcherPasses. The map avoids the cost of applying every MatcherPass to every node.
+
+![graph_rewrite_efficient_search]
+
+> **NOTE**: GraphRewrite execution algorithm cannot be set manually and depends only on root nodes registered inside MatcherPasses.
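+
+The difference between the two algorithms can be sketched with a toy model that does not use the OpenVINO™ API. `ToyNode`, `ToyPass`, `run_straightforward`, and `run_with_type_map` are hypothetical names introduced only for this illustration, and the real GraphRewrite implementation may differ in detail:
+
+```cpp
+#include <cstddef>
+#include <functional>
+#include <string>
+#include <unordered_map>
+#include <vector>
+
+// Toy stand-ins for illustration only; these are not OpenVINO classes.
+struct ToyNode {
+    std::string type;
+};
+
+struct ToyPass {
+    std::string root_type;                         // type of the pattern root node
+    std::function<bool(const ToyNode&)> callback;  // returns true if the node was replaced
+};
+
+// First algorithm: try every pass on every node, in registration order.
+// Returns the number of matching attempts that were made.
+std::size_t run_straightforward(const std::vector<ToyNode>& nodes, const std::vector<ToyPass>& passes) {
+    std::size_t attempts = 0;
+    for (const auto& node : nodes) {
+        for (const auto& pass : passes) {
+            ++attempts;
+            if (pass.root_type == node.type && pass.callback(node))
+                break;  // the node was replaced, so no other pass is applied to it
+        }
+    }
+    return attempts;
+}
+
+// Second algorithm: group the passes by root type once, then try only
+// the passes whose root type equals the type of the current node.
+std::size_t run_with_type_map(const std::vector<ToyNode>& nodes, const std::vector<ToyPass>& passes) {
+    std::unordered_map<std::string, std::vector<const ToyPass*>> by_type;
+    for (const auto& pass : passes)
+        by_type[pass.root_type].push_back(&pass);
+
+    std::size_t attempts = 0;
+    for (const auto& node : nodes) {
+        const auto it = by_type.find(node.type);
+        if (it == by_type.end())
+            continue;  // no registered pass can match this node type
+        for (const ToyPass* pass : it->second) {
+            ++attempts;
+            if (pass->callback(node))
+                break;
+        }
+    }
+    return attempts;
+}
+```
+
+For four nodes of types Relu, Add, Multiply, Relu and three passes rooted at Relu, Add, and Convolution whose callbacks never replace anything, the first function makes 12 matching attempts and the second makes 3.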
+
+## See Also
+
+* [OpenVINO™ Transformations](./ov_transformations.md)
+
+[graph_rewrite_execution]: ./img/graph_rewrite_execution.png
+[graph_rewrite_efficient_search]: ./img/graph_rewrite_efficient_search.png
--- a/docs/Extensibility_UG/img/graph_rewrite_efficient_search.png
+++ b/docs/Extensibility_UG/img/graph_rewrite_efficient_search.png
--- a/docs/Extensibility_UG/img/graph_rewrite_execution.png
+++ b/docs/Extensibility_UG/img/graph_rewrite_execution.png
--- a/docs/Extensibility_UG/img/ngraph_insert_node.png
+++ b/docs/Extensibility_UG/img/ngraph_insert_node.png
--- a/docs/Extensibility_UG/img/ngraph_replace_node.png
+++ b/docs/Extensibility_UG/img/ngraph_replace_node.png
--- a/docs/Extensibility_UG/img/register_new_node.png
+++ b/docs/Extensibility_UG/img/register_new_node.png
--- a/docs/Extensibility_UG/img/transformations_structure.png
+++ b/docs/Extensibility_UG/img/transformations_structure.png
--- a/docs/Extensibility_UG/matcher_pass.md
+++ b/docs/Extensibility_UG/matcher_pass.md
@@ -0,0 +1,101 @@
+# OpenVINO Matcher Pass {#openvino_docs_Extensibility_UG_matcher_pass}
+
+`ov::pass::MatcherPass` is used for pattern-based transformations.
+
+Template for the MatcherPass transformation class:
+@snippet src/transformations/template_pattern_transformation.hpp graph_rewrite:template_transformation_hpp
+
+@snippet src/transformations/template_pattern_transformation.cpp graph_rewrite:template_transformation_cpp
+
+To use `ov::pass::MatcherPass`, you need to complete these steps:
+1. Create a pattern
+2. Implement a callback
+3. Register the pattern and Matcher
+4. Execute MatcherPass
+
+So let's go through each of these steps.
+
+## Create a pattern
+
+A pattern is a single-root `ov::Model`. The only difference is that you do not need to create a model object; you just create and connect opset or special pattern operations.
+Then take the last created operation and make it the root of the pattern. This root node is used as the root node in pattern matching.
+> **NOTE**: Any nodes in a pattern that have no consumers and are not registered as root will not be used in pattern matching.
+
+@snippet ov_model_snippets.cpp pattern:simple_example
+
+The `Parameter` operation in the example above has type and shape specified. These attributes are needed only to create Parameter operation class and will not be used in pattern matching.
+
+For more pattern examples, refer to the [pattern matching](#pattern_matching) section.
+
+## Implement callback
+
+A callback is an action applied to every occurrence of the pattern. In general, the callback is a lambda function that takes a Matcher object holding the detected subgraph.
+
+@snippet ov_model_snippets.cpp pattern:callback_example
+
+The example above shows the callback structure and how Matcher can be used to access nodes detected by the pattern.
+The callback returns `true` if the root node was replaced and another pattern cannot be applied to the same root node; otherwise, it returns `false`.
+> **NOTE**: It is not recommended to manipulate nodes that are under the root node. This may affect GraphRewrite execution, as it is expected that all nodes that come after the root node in topological order are valid and can be used in pattern matching.
+
+MatcherPass also provides functionality that allows reporting of the newly created nodes that can be used in additional pattern matching.
+If MatcherPass was registered in `ov::pass::Manager` or `ov::pass::GraphRewrite`, these registered nodes will be added for additional pattern matching.
+That means that matcher passes registered in `ov::pass::GraphRewrite` will be applied to these nodes.
+
+The example below shows how a single MatcherPass can fuse a sequence of operations using the `register_new_node` method.
+
+@snippet src/transformations/template_pattern_transformation.cpp matcher_pass:relu_fusion
+
+> **NOTE**: If you register multiple nodes, please add them in topological order. We do not topologically sort these nodes as it is a time-consuming operation.
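+
+The effect of registering new nodes can be sketched with a toy model, assuming a linear chain of operations represented only by their type names. `fuse_relu_chain` is a hypothetical helper that does not use the OpenVINO™ API; pushing to the front of the work queue stands in for `register_new_node`:
+
+```cpp
+#include <deque>
+#include <string>
+
+// Collapses runs of adjacent "Relu" entries in a linear chain of operation types.
+// Each fusion pushes the fused node to the front of the work queue, so the same
+// rule is applied to it again before the traversal moves on.
+std::deque<std::string> fuse_relu_chain(std::deque<std::string> pending) {
+    std::deque<std::string> done;
+    while (!pending.empty()) {
+        const std::string current = pending.front();
+        pending.pop_front();
+        if (current == "Relu" && !pending.empty() && pending.front() == "Relu") {
+            pending.pop_front();         // consume the second Relu of the matched pair
+            pending.push_front("Relu");  // "register" the fused node for another matching round
+            continue;
+        }
+        done.push_back(current);  // nothing matched, keep the node as is
+    }
+    return done;
+}
+```
+
+For the chain Parameter, Relu, Relu, Relu, Result, the function returns Parameter, Relu, Result: the node produced by the first fusion is matched again and fused with the third Relu.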
+
+## Register pattern and Matcher
+
+The last step is to register the Matcher and the callback inside the MatcherPass. To do this, call the `register_matcher` method.
+> **NOTE**: Only one matcher can be registered for a single MatcherPass class.
+
+```cpp
+// Register matcher and callback
+register_matcher(m, callback);
+```
+
+## Execute MatcherPass
+
+MatcherPass has multiple ways to be executed:
+* Run on a single node - it can be useful if you want to run MatcherPass inside another transformation.
+@snippet src/transformations/template_pattern_transformation.cpp matcher_pass:run_on_node
+* Run on `ov::Model` using GraphRewrite - this approach lets you run MatcherPass on the whole `ov::Model`. Moreover, multiple MatcherPass transformations can be registered in a single GraphRewrite to be executed in a single graph traversal.
+@snippet src/transformations/template_pattern_transformation.cpp matcher_pass:graph_rewrite
+* Run on `ov::Model` using `ov::pass::Manager` - this approach lets you register MatcherPass for execution on `ov::Model` like any other transformation type.
+@snippet src/transformations/template_pattern_transformation.cpp matcher_pass:manager
+
+## Pattern Matching <a name="pattern_matching"></a>
+
+Sometimes a pattern cannot be expressed via regular operations, or doing so is too complicated.
+For example, you may want to detect a **Convolution->Add** sub-graph without specifying a particular input type for the Convolution operation, or create a pattern where some operations can have different types.
+For these cases, OpenVINO™ provides additional helpers to construct patterns for GraphRewrite transformations.
+
+There are two main helpers:
+1. `ov::pass::pattern::any_input` - helps to express inputs if their types are undefined.
+2. `ov::pass::pattern::wrap_type<T>` - helps to express nodes of pattern without specifying node attributes.
+
+Let's go through some examples to get a better understanding of how they work:
+
+> **NOTE**: Node attributes do not participate in pattern matching and are needed only for operations creation. Only operation types participate in pattern matching.
+
+The example below shows basic usage of `ov::pass::pattern::any_input`.
+Here we construct a Multiply pattern with an arbitrary first input and a Constant as the second input.
+Also, as Multiply is a commutative operation, it does not matter in which order we set the inputs (any_input/Constant or Constant/any_input) because both cases will be matched.
+
+@snippet ov_model_snippets.cpp pattern:label_example
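+
+The order-independent matching of a commutative operation can be illustrated with a toy matcher that does not use the OpenVINO™ API. `GraphNode`, `InputPredicate`, and `match_binary` are hypothetical names, and the sketch only mimics the idea of trying both input orders:
+
+```cpp
+#include <functional>
+#include <string>
+#include <vector>
+
+// Toy stand-in for a graph node; this is not an OpenVINO class.
+struct GraphNode {
+    std::string type;
+    std::vector<const GraphNode*> inputs;
+};
+
+using InputPredicate = std::function<bool(const GraphNode&)>;
+
+// Matches a two-input node of the given type whose inputs satisfy `first` and `second`.
+// For a commutative operation, the swapped input order is tried as well.
+bool match_binary(const GraphNode& node,
+                  const std::string& type,
+                  const InputPredicate& first,
+                  const InputPredicate& second,
+                  bool commutative) {
+    if (node.type != type || node.inputs.size() != 2)
+        return false;
+    const GraphNode& in0 = *node.inputs[0];
+    const GraphNode& in1 = *node.inputs[1];
+    if (first(in0) && second(in1))
+        return true;
+    return commutative && first(in1) && second(in0);
+}
+```
+
+With a predicate that accepts any node as `first` and a predicate that accepts only Constant nodes as `second`, both Multiply(Parameter, Constant) and Multiply(Constant, Parameter) match when `commutative` is `true`, while only the first one matches when it is `false`.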
+
+This example shows how we can construct a pattern when an operation has an arbitrary number of inputs.
+
+@snippet ov_model_snippets.cpp pattern:concat_example
+
+This example shows how to use a predicate to construct a pattern. It also shows how to match a pattern manually on a given node.
+
+@snippet ov_model_snippets.cpp pattern:predicate_example
+
+> **NOTE**: Be careful with manual matching because the Matcher object holds matched nodes. To clear a match, use the `m->clear_state()` method.
+
+## See Also
+
+* [OpenVINO™ Transformations](./ov_transformations.md)
--- a/docs/Extensibility_UG/model_pass.md
+++ b/docs/Extensibility_UG/model_pass.md
@@ -0,0 +1,17 @@
+# OpenVINO Model Pass {#openvino_docs_Extensibility_UG_model_pass}
+
+`ov::pass::ModelPass` is used for transformations that take the entire `ov::Model` as an input and process it.
+
+Template for the ModelPass transformation class:
+
+@snippet src/transformations/template_model_transformation.hpp model_pass:template_transformation_hpp
+
+@snippet src/transformations/template_model_transformation.cpp model_pass:template_transformation_cpp
+
+When using `ov::pass::ModelPass`, you need to override the `run_on_model` method, which contains the transformation code.
+The return value is `true` if the original model was changed during the transformation (a new operation was added, operations were replaced, or node attributes were changed); otherwise, it is `false`.
+`ov::pass::ModelPass` based transformations can also be executed via `ov::pass::Manager`.
+
+## See Also
+
+* [OpenVINO™ Transformations](./ov_transformations.md)
--- a/docs/Extensibility_UG/ov_transformations.md
+++ b/docs/Extensibility_UG/ov_transformations.md
@@ -0,0 +1,172 @@
+# Overview of Transformations API {#openvino_docs_transformations}
+
+@sphinxdirective
+
+.. toctree::
+   :maxdepth: 1
+   :hidden:
+
+   openvino_docs_Extensibility_UG_model_pass
+   openvino_docs_Extensibility_UG_matcher_pass
+   openvino_docs_Extensibility_UG_graph_rewrite_pass
+
+@endsphinxdirective
+
+This guide contains all necessary information that you need to start implementing OpenVINO™ transformations.
+
+## Working with Model
+
+Before moving on to transformations, it is worth describing the functions that allow you to modify `ov::Model`.
+This chapter extends the [model representation guide](../OV_Runtime_UG/model_representation.md) and shows an API that allows you to manipulate `ov::Model`.
+
+### Working with node input and output ports
+
+First of all, let's talk about `ov::Node` input/output ports. Each OpenVINO™ operation has input and output ports, except for operations of the `Parameter` or `Constant` type, which have no input ports.
+
+Every port belongs to its node, so using a port we can access the parent node, get the shape and type of a particular input/output, get all consumers in the case of an output port, and get the producer node in the case of an input port.
+With an output port we can set inputs for newly created operations.
+
+Let's look at a code example.
+
+@snippet ov_model_snippets.cpp ov:ports_example
+
+### Node replacement
+
+OpenVINO™ provides two ways for node replacement: via OpenVINO™ helper function and directly via port methods. We are going to review both of them.
+
+Let's start with OpenVINO™ helper functions. The most popular function is `ov::replace_node(old_node, new_node)`.
+
+We will review a real replacement case where a Negative operation is replaced with Multiply.
+
+![ngraph_replace_node]
+
+@snippet ov_model_snippets.cpp ov:replace_node
+
+`ov::replace_node` has a constraint that the number of output ports must be the same for both operations; otherwise, it raises an exception.
+
+The alternative way to do the same replacement is the following:
+
+@snippet ov_model_snippets.cpp ov:manual_replace
+
+Another transformation example is insertion.
+
+![ngraph_insert_node]
+
+@snippet ov_model_snippets.cpp ov:insert_node
+
+An alternative way to insert an operation is to make a copy of the node and use `ov::replace_node()`:
+
+@snippet ov_model_snippets.cpp ov:insert_node_with_copy
+
+### Node elimination
+
+Another type of node replacement is elimination.
+
+To eliminate an operation, OpenVINO™ has a special method that considers all limitations related to OpenVINO™ Runtime.
+
+@snippet ov_model_snippets.cpp ov:eliminate_node
+
+In case of a successful replacement, `ov::replace_output_update_name()` automatically preserves the friendly name and runtime info.
+
+## Transformations types <a name="transformations_types"></a>
+
+OpenVINO™ Runtime has three main transformation types:
+
+* [Model pass](./model_pass.md) - straightforward way to work with `ov::Model` directly
+* [Matcher pass](./matcher_pass.md) - pattern-based transformation approach
+* [Graph rewrite pass](./graph_rewrite_pass.md) - container for matcher passes needed for efficient execution
+
+![transformations_structure]
+
+## Transformation conditional compilation
+
+The transformation library has two internal macros to support the conditional compilation feature.
+
+* `MATCHER_SCOPE(region)` - allows disabling the MatcherPass if the matcher is not used. The region name should be unique. This macro creates a local variable `matcher_name`, which you should use as the matcher name.
+* `RUN_ON_MODEL_SCOPE(region)` - allows disabling the `run_on_model` pass if it is not used. The region name should be unique.
+
+## Transformation writing essentials <a name="transformation_writing_essentials"></a>
+
+When developing a transformation, you need to follow these transformation rules:
+
+### 1. Friendly Names
+
+Each `ov::Node` has a unique name and a friendly name. In transformations we care only about the friendly name because it represents the name from the model.
+To avoid losing the friendly name when replacing a node with another node or a subgraph, set the original friendly name on the last node of the replacing subgraph. See the example below.
+
+@snippet ov_model_snippets.cpp ov:replace_friendly_name
+
+In more advanced cases, when the replaced operation has several outputs and we add additional consumers to its outputs, how to set the friendly name is decided on a case-by-case basis.
+
+### 2. Runtime Info
+
+Runtime info is a map `std::map<std::string, ov::Any>` located inside the `ov::Node` class. It represents additional attributes of `ov::Node`.
+These attributes can be set by users or by plugins. When executing a transformation that changes `ov::Model`, we need to preserve these attributes, as they are not propagated automatically.
+In most cases, transformations have the following types: 1:1 (replace a node with another node), 1:N (replace a node with a sub-graph), N:1 (fuse a sub-graph into a single node), N:M (any other transformation).
+Currently, there is no mechanism that automatically detects transformation types, so we need to propagate this runtime information manually. See the examples below.
+
+@snippet ov_model_snippets.cpp ov:copy_runtime_info
+
+When a transformation has multiple fusions or decompositions, `ov::copy_runtime_info` must be called multiple times, once for each case.
+
+> **NOTE**: `copy_runtime_info` removes `rt_info` from destination nodes. If you want to keep it, you need to specify the destination nodes among the source nodes like this: `copy_runtime_info({a, b, c}, {a, b})`
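+
+The behavior described in the note above can be sketched with a toy model that does not use the OpenVINO™ API. `AnnotatedNode` and `toy_copy_runtime_info` are hypothetical names. The sketch assumes that attributes are plain strings and that, for duplicated keys, the first source wins; the real merge rules may differ:
+
+```cpp
+#include <map>
+#include <string>
+#include <vector>
+
+using RtInfo = std::map<std::string, std::string>;
+
+// Toy stand-in for a node that carries runtime info; this is not an OpenVINO class.
+struct AnnotatedNode {
+    RtInfo rt_info;
+};
+
+// Merges the runtime info of all source nodes and assigns the result to every
+// destination node. The previous content of a destination node is discarded
+// unless that node is also listed among the sources.
+void toy_copy_runtime_info(const std::vector<AnnotatedNode*>& from, const std::vector<AnnotatedNode*>& to) {
+    RtInfo merged;
+    for (const AnnotatedNode* source : from)
+        merged.insert(source->rt_info.begin(), source->rt_info.end());  // keeps the first value of a duplicated key
+    for (AnnotatedNode* destination : to)
+        destination->rt_info = merged;
+}
+```
+
+With this model, `toy_copy_runtime_info({&c}, {&a})` leaves `a` with the attributes of `c` only, while `toy_copy_runtime_info({&a, &b, &c}, {&a, &b})` also keeps the original attributes of `a` and `b`.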
+
+### 3. Constant Folding
+
+If your transformation inserts constant sub-graphs that need to be folded, do not forget to use `ov::pass::ConstantFolding()` after your transformation or call constant folding directly for the operation.
+The example below shows how a constant subgraph can be constructed.
+
+@snippet ov_model_snippets.cpp ov:constant_subgraph
+
+Manual constant folding is preferable to `ov::pass::ConstantFolding()` because it is much faster.
+
+Below you can find an example of manual constant folding:
+
+@snippet src/transformations/template_pattern_transformation.cpp manual_constant_folding
+
+## Common mistakes in transformations <a name="common_mistakes"></a>
+
+In transformation development process:
+
+* Do not use deprecated OpenVINO™ API. Deprecated methods have the `OPENVINO_DEPRECATED` macro in their definition.
+* Do not pass `shared_ptr<Node>` as an input for another node if the type of the node is unknown or it has multiple outputs. Use an explicit output port.
+* If you replace a node with another node that produces a different shape, remember that the new shape will not be propagated until the first `validate_nodes_and_infer_types` call for `ov::Model`. If you are using `ov::pass::Manager`, it will automatically call this method after each transformation execution.
+* Do not forget to call the `ov::pass::ConstantFolding` pass if your transformation creates constant subgraphs.
+* Use the latest OpSet if you are not developing a downgrade transformation pass.
+* When developing a callback for `ov::pass::MatcherPass`, do not change nodes that come after the root node in topological order.
+
+## Using pass manager <a name="using_pass_manager"></a>
+
+`ov::pass::Manager` is a container class that can store a list of transformations and execute them. The main idea of this class is to have a high-level representation for a grouped list of transformations.
+It can register and apply any [transformation pass](#transformations_types) on a model.
+In addition, `ov::pass::Manager` has extended debug capabilities (find more information in the [how to debug transformations](#how_to_debug_transformations) section).
+
+The example below shows basic usage of `ov::pass::Manager`:
+
+@snippet src/transformations/template_pattern_transformation.cpp matcher_pass:manager3
+
+Another example shows how multiple matcher passes can be combined into a single GraphRewrite.
+
+@snippet src/transformations/template_pattern_transformation.cpp matcher_pass:manager2
+
+## How to debug transformations <a name="how_to_debug_transformations"></a>
+
+If you are using `ov::pass::Manager` to run a sequence of transformations, you can get additional debug capabilities by using the following environment variables:
+
+```
+OV_PROFILE_PASS_ENABLE=1 - enables performance measurement for each transformation and prints execution status
+OV_ENABLE_VISUALIZE_TRACING=1 - enables visualization after each transformation. By default, it saves dot and svg files.
+```
+
+> **NOTE**: Make sure that you have `dot` (part of Graphviz) installed on your machine; otherwise, only the dot file is saved, silently, without the svg file.
+
+## See Also
+
+* [OpenVINO™ Model Representation](../OV_Runtime_UG/model_representation.md)
+* [OpenVINO™ Extensions](./Intro.md)
+
+[ngraph_replace_node]: ./img/ngraph_replace_node.png
+[ngraph_insert_node]: ./img/ngraph_insert_node.png
+[transformations_structure]: ./img/transformations_structure.png
+[register_new_node]: ./img/register_new_node.png
--- a/docs/HOWTO/Custom_Layers_Guide.md
+++ b/docs/HOWTO/Custom_Layers_Guide.md
@@ -1,349 +0,0 @@
-# Custom Operations Guide {#openvino_docs_HOWTO_Custom_Layers_Guide}
-
-The Intel® Distribution of OpenVINO™ toolkit supports neural network models trained with multiple frameworks including
-TensorFlow*, Caffe*, MXNet*, Kaldi* and ONNX* file format. The list of supported operations (layers) is different for
-each of the supported frameworks. To see the operations supported by your framework, refer to
-[Supported Framework Layers](../MO_DG/prepare_model/Supported_Frameworks_Layers.md).
-
-Custom operations, that is those not included in the list, are not recognized by Model Optimizer out-of-the-box. Therefore, creating Intermediate Representation (IR) for a model using them requires additional steps. This guide illustrates the workflow for running inference on topologies featuring custom operations, allowing you to plug in your own implementation for existing or completely new operations.
-
-> **NOTE**: *Layer* is a legacy term for *operation* which came from Caffe\* framework. Currently it is not used.
-> Refer to the [Deep Learning Network Intermediate Representation and Operation Sets in OpenVINO™](../MO_DG/IR_and_opsets.md)
-> for more information on the topic.
-
-## Terms Used in This Guide
-
- *Intermediate Representation (IR)* — OpenVINO's Neural Network format used by Inference Engine. It abstracts different frameworks and describs model topology, operations parameters, and weights.
-
- *Operation* — an abstract concept of a math function selected for a specific purpose. Operations supported by
-  OpenVINO™ are listed in the supported operation set provided in the [Available Operations Sets](../ops/opset.md).
-  Examples of the operations are: [ReLU](../ops/activation/ReLU_1.md), [Convolution](../ops/convolution/Convolution_1.md),
-  [Add](../ops/arithmetic/Add_1.md), etc.
-
- *Kernel* — The implementation of an operation function in the OpenVINO™ plugin, in this case, the math programmed (in
-  C++ and OpenCL) to perform the operation for a target hardware (CPU or GPU).
-
- *Inference Engine Extension* — Device-specific module implementing custom operations (a set of kernels).
-
-## Custom Operation Support Overview
-
-There are three steps to support inference of a model with custom operation(s):
-1. Add support for a custom operation in the [Model Optimizer](../MO_DG/Deep_Learning_Model_Optimizer_DevGuide.md) so
-the Model Optimizer can generate the IR with the operation.
-2. Create an operation set and implement a custom nGraph operation in it as described in the
-[Custom nGraph Operation](../OV_Runtime_UG/Extensibility_DG/AddingNGraphOps.md).
-3. Implement a customer operation in one of the [Inference Engine](../OV_Runtime_UG/Deep_Learning_Inference_Engine_DevGuide.md)
-plugins to support inference of this operation using a particular target hardware (CPU, GPU or VPU).
-
-To see the operations that are supported by each device plugin for the Inference Engine, refer to the
-[Supported Devices](../OV_Runtime_UG/supported_plugins/Supported_Devices.md).
-
-> **NOTE**: If a device doesn't support a particular operation, an alternative to creating a new operation is to target
-> an additional device using the HETERO plugin. The [Heterogeneous Plugin](../OV_Runtime_UG/supported_plugins/HETERO.md) may be
-> used to run an inference model on multiple devices allowing the unsupported operations on one device to "fallback" to
-> run on another device (e.g., CPU) that does support those operations.
-
-### Custom Operation Support for the Model Optimizer
-
-Model Optimizer model conversion pipeline is described in detail in "Model Conversion Pipeline" section of [Model Optimizer Extensibility](../MO_DG/prepare_model/customize_model_optimizer/Customize_Model_Optimizer.md). It is best to read that article first for a better understanding of the following material.
-
-Model Optimizer provides an extensions mechanism to support new operations and implement custom model transformations to generate optimized IR. This mechanism is described in the "Model Optimizer Extensions" section of 
-[Model Optimizer Extensibility](../MO_DG/prepare_model/customize_model_optimizer/Customize_Model_Optimizer.md).
-
-Two types of Model Optimizer extensions should be implemented to support custom operations, at a minimum:
-1. Operation class for a new operation. This class stores information about the operation, its attributes, shape inference function, attributes to be saved to an IR and some others internally used attributes. Refer to the "Model Optimizer Operation" section of [Model Optimizer Extensibility](../MO_DG/prepare_model/customize_model_optimizer/Customize_Model_Optimizer.md) for detailed instructions on how to implement it.
-2. Operation attributes extractor. The extractor is responsible for parsing framework-specific representation of the
-operation and uses corresponding operation class to update graph node attributes with necessary attributes of the
-operation. Refer to the "Operation Extractor" section of
-[Model Optimizer Extensibility](../MO_DG/prepare_model/customize_model_optimizer/Customize_Model_Optimizer.md) for detailed instructions on how to implement it.
-
-> **NOTE**: In some cases you may need to implement some transformation to support the operation. This topic is covered in the "Graph Transformation Extensions" section of [Model Optimizer Extensibility](../MO_DG/prepare_model/customize_model_optimizer/Customize_Model_Optimizer.md).
-
-## Custom Operations Extensions for the Inference Engine
-
-Inference Engine provides an extension mechanism to support new operations. This mechanism is described in [Inference Engine Extensibility Mechanism](../OV_Runtime_UG/Extensibility_DG/Intro.md).
-
-Each device plugin includes a library of optimized implementations to execute known operations which must be extended to execute a custom operation. The custom operation extension is implemented according to the target device:
-
- Custom Operation CPU Extension
-   - A compiled shared library (`.so` or `.dll`) needed by the CPU Plugin for executing the custom operation
-   on a CPU. Refer to the [How to Implement Custom CPU Operations](../OV_Runtime_UG/Extensibility_DG/CPU_Kernel.md) for more
-   details.
- Custom Operation GPU Extension
-   - OpenCL source code (.cl) for the custom operation kernel that will be compiled to execute on the GPU along with an operation description file (.xml) needed by the GPU Plugin for the custom operation kernel. Refer to the [How to Implement Custom GPU Operations](../OV_Runtime_UG/Extensibility_DG/GPU_Kernel.md) for more details.
- Custom Operation VPU Extension
-   - OpenCL source code (.cl) for the custom operation kernel that will be compiled to execute on the VPU along with an  operation description file (.xml) needed by the VPU Plugin for the custom operation kernel. Refer to [How to Implement Custom Operations for VPU](../OV_Runtime_UG/Extensibility_DG/VPU_Kernel.md) for more details.
-
-Also, it is necessary to implement nGraph custom operation according to [Custom nGraph Operation](../OV_Runtime_UG/Extensibility_DG/AddingNGraphOps.md) so the Inference Engine can read an IR with this
-operation and correctly infer output tensor shape and type.
-
-## Enabling Magnetic Resonance Image Reconstruction Model
-This chapter provides step-by-step instructions on how to enable the magnetic resonance image reconstruction model implemented in the [repository](https://github.com/rmsouza01/Hybrid-CS-Model-MRI/) using a custom operation on CPU. The example is prepared for a model generated from the repository with hash `2ede2f96161ce70dcdc922371fe6b6b254aafcc8`.
-
-### Download and Convert the Model to a Frozen TensorFlow\* Model Format
-The original pre-trained model is provided in the hdf5 format which is not supported by OpenVINO directly and needs to be converted to TensorFlow\* frozen model format first.
-
-1. Download repository `https://github.com/rmsouza01/Hybrid-CS-Model-MRI`:<br>
-```bash
-    git clone https://github.com/rmsouza01/Hybrid-CS-Model-MRI
-    git checkout 2ede2f96161ce70dcdc922371fe6b6b254aafcc8
-```
-
-2. Convert pre-trained `.hdf5` to a frozen `.pb` graph using the following script (tested with TensorFlow==1.15.0 and
-Keras==2.2.4) which should be executed from the root of the cloned repository:<br>
-```py
-    import keras as K
-    import numpy as np
-    import Modules.frequency_spatial_network as fsnet
-    import tensorflow as tf
-
-    under_rate = '20'
-
-    stats = np.load("Data/stats_fs_unet_norm_" + under_rate + ".npy")
-    var_sampling_mask = np.load("Data/sampling_mask_" + under_rate + "perc.npy")
-
-    model = fsnet.wnet(stats[0], stats[1], stats[2], stats[3], kshape = (5,5), kshape2=(3,3))
-    model_name = "Models/wnet_" + under_rate + ".hdf5"
-    model.load_weights(model_name)
-
-    inp = np.random.standard_normal([1, 256, 256, 2]).astype(np.float32)
-    np.save('inp', inp)
-
-    sess = K.backend.get_session()
-    sess.as_default()
-    graph_def = sess.graph.as_graph_def()
-    graph_def = tf.graph_util.convert_variables_to_constants(sess, graph_def, ['conv2d_44/BiasAdd'])
-    with tf.gfile.FastGFile('wnet_20.pb', 'wb') as f:
-        f.write(graph_def.SerializeToString())    
-```
-   
-As a result the TensorFlow\* frozen model file "wnet_20.pb" is generated.
-
-### Convert the Frozen TensorFlow\* Model to Intermediate Representation
-
-Firstly, open the model in TensorBoard or other TensorFlow* model visualization tool. The model supports dynamic
-batch dimension because the value for the batch dimension is not hardcoded in the model. Model Optimizer need to set all
-dynamic dimensions to some specific value to create the IR, therefore specify the command line parameter `-b 1` to set
-the batch dimension equal to 1. The actual batch size dimension can be changed at runtime using the Inference Engine API
-described in the [Using Shape Inference](../OV_Runtime_UG/ShapeInference.md). Also refer to the General Conversion Parameters section in [Converting a Model to Intermediate Representation (IR)](../MO_DG/prepare_model/convert_model/Converting_Model.md) and [Convert Your TensorFlow* Model](../MO_DG/prepare_model/convert_model/Convert_Model_From_TensorFlow.md)
-for more details and command line parameters used for the model conversion.
-
-```sh
-mo --input_model <PATH_TO_MODEL>/wnet_20.pb -b 1
-```
-
-> **NOTE**: This conversion guide is applicable for the 2021.3 release of OpenVINO and that starting from 2021.4
-> the OpenVINO supports this model out of the box.
-
-Model Optimizer produces the following error:
-```bash
-[ ERROR ]  List of operations that cannot be converted to Inference Engine IR:
-[ ERROR ]      Complex (1)
-[ ERROR ]          lambda_2/Complex
-[ ERROR ]      IFFT2D (1)
-[ ERROR ]          lambda_2/IFFT2D
-[ ERROR ]      ComplexAbs (1)
-[ ERROR ]          lambda_2/Abs
-[ ERROR ]  Part of the nodes was not converted to IR. Stopped.
-```
-
-The error means that the Model Optimizer doesn't know how to handle 3 types of TensorFlow\* operations: "Complex",
-"IFFT2D" and "ComplexAbs". In order to see more details about the conversion process run the model conversion with
-additional parameter `--log_level DEBUG`. It is worth to mention the following lines from the detailed output:
-
-```bash
-[ INFO ]  Called "tf_native_tf_node_infer" for node "lambda_2/Complex"
-[ <TIMESTAMP> ] [ DEBUG ] [ tf:228 ]  Added placeholder with name 'lambda_2/lambda_3/strided_slice_port_0_ie_placeholder'
-[ <TIMESTAMP> ] [ DEBUG ] [ tf:228 ]  Added placeholder with name 'lambda_2/lambda_4/strided_slice_port_0_ie_placeholder'
-[ <TIMESTAMP> ] [ DEBUG ] [ tf:241 ]  update_input_in_pbs: replace input 'lambda_2/lambda_3/strided_slice' with input 'lambda_2/lambda_3/strided_slice_port_0_ie_placeholder'
-[ <TIMESTAMP> ] [ DEBUG ] [ tf:249 ]  Replacing input '0' of the node 'lambda_2/Complex' with placeholder 'lambda_2/lambda_3/strided_slice_port_0_ie_placeholder'
-[ <TIMESTAMP> ] [ DEBUG ] [ tf:241 ]  update_input_in_pbs: replace input 'lambda_2/lambda_4/strided_slice' with input 'lambda_2/lambda_4/strided_slice_port_0_ie_placeholder'
-[ <TIMESTAMP> ] [ DEBUG ] [ tf:249 ]  Replacing input '1' of the node 'lambda_2/Complex' with placeholder 'lambda_2/lambda_4/strided_slice_port_0_ie_placeholder'
-[ <TIMESTAMP> ] [ DEBUG ] [ tf:148 ]  Inferred shape of the output tensor with index '0' of the node 'lambda_2/Complex': '[  1 256 256]'
-[ <TIMESTAMP> ] [ DEBUG ] [ infer:145 ]  Outputs:
-[ <TIMESTAMP> ] [ DEBUG ] [ infer:32 ]  output[0]: shape = [  1 256 256], value = <UNKNOWN>
-[ <TIMESTAMP> ] [ DEBUG ] [ infer:129 ]  --------------------
-[ <TIMESTAMP> ] [ DEBUG ] [ infer:130 ]  Partial infer for lambda_2/IFFT2D
-[ <TIMESTAMP> ] [ DEBUG ] [ infer:131 ]  Op: IFFT2D
-[ <TIMESTAMP> ] [ DEBUG ] [ infer:132 ]  Inputs:
-[ <TIMESTAMP> ] [ DEBUG ] [ infer:32 ]  input[0]: shape = [  1 256 256], value = <UNKNOWN>
-```
-
-This is a part of the log of the partial inference phase of the model conversion. See the "Partial Inference" section of
-the [Model Optimizer Extensibility](../MO_DG/prepare_model/customize_model_optimizer/Customize_Model_Optimizer.md) guide for
-more information about this phase. Model Optimizer inferred the output shape for the unknown operation of type "Complex"
-using a "fallback" to TensorFlow\*. However, this is not enough to generate the IR because Model Optimizer does not know
-which attributes of the operation should be saved to the IR. So it is necessary to implement Model Optimizer extensions to
-support these operations.
-
-Before going into the extension development it is necessary to understand what these unsupported operations do according
-to the TensorFlow\* framework specification.
-
-* "Complex" - returns a tensor of complex type constructed from two real input tensors specifying the real and imaginary
-parts of a complex number.
-* "IFFT2D" - returns a tensor with the inverse 2-dimensional discrete Fourier transform over the inner-most 2 dimensions of
- an input.
-* "ComplexAbs" - returns a tensor with the absolute values of an input tensor of complex numbers.
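
The three operations above can be mimicked in NumPy to make their combined behavior concrete. This is only an illustrative sketch: the function name `unsupported_subgraph` and the random inputs are assumptions made for this example, and NumPy stands in for TensorFlow\* here.

```python
import numpy as np

def unsupported_subgraph(real, imag):
    """NumPy stand-in (hypothetical helper) for Complex -> IFFT2D -> ComplexAbs."""
    z = real + 1j * imag                # "Complex": complex tensor from two real tensors
    z = np.fft.ifft2(z, axes=(-2, -1))  # "IFFT2D": inverse 2D DFT over the two innermost dimensions
    return np.abs(z)                    # "ComplexAbs": element-wise absolute value, real output

rng = np.random.default_rng(0)
real = rng.standard_normal((1, 256, 256)).astype(np.float32)
imag = rng.standard_normal((1, 256, 256)).astype(np.float32)
out = unsupported_subgraph(real, imag)  # real-valued tensor of shape (1, 256, 256)
```

Only the intermediate values are complex; both the inputs and the final output are real tensors.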
-
-The part of the model with all three unsupported operations is depicted below:
-
-![Unsupported sub-graph](img/unsupported_subgraph.png)
-
-This model uses complex numbers during inference, but Inference Engine does not support tensors of this data type. So
-it is necessary to find a way to avoid using tensors of this type in the model. Fortunately, the complex tensor
-appears as a result of the "Complex" operation, is used as input to the "IFFT2D" operation, and is then passed to "ComplexAbs",
-which produces a real value tensor as output. So there are just 3 operations consuming/producing complex tensors in the
-model.
-
-Let's design an OpenVINO operation "FFT" which gets a single real number tensor describing the complex input and
-produces a single real number tensor describing the complex output. This way the fact that the model uses complex
-numbers is hidden inside the "FFT" operation implementation. The operation gets a tensor of shape `[N, H, W, 2]` and
-produces an output tensor with the same shape, where the innermost dimension contains pairs of real numbers describing
-a complex number (its real and imaginary parts). As shown further, this operation makes it possible to support the
-model. The implementation of the Model Optimizer operation should be saved to the `mo_extensions/ops/FFT.py` file:
-
-@snippet FFT.py fft:operation
-
-The attribute `inverse` is a flag specifying the type of the FFT to apply: forward or inverse.
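
To illustrate the intended semantics of such an operation, here is a minimal NumPy sketch of a transform that takes and returns real tensors of shape `[N, H, W, 2]`. The helper name `fft_real_encoded` is hypothetical; this is neither the content of `FFT.py` nor the CPU kernel, just a reference model of the behavior described above.

```python
import numpy as np

def fft_real_encoded(x, inverse=False):
    """Hypothetical reference: x has shape [N, H, W, 2], last axis holds (real, imag) pairs."""
    z = x[..., 0] + 1j * x[..., 1]                        # decode pairs into complex values (internal only)
    transform = np.fft.ifft2 if inverse else np.fft.fft2  # `inverse` selects the direction
    z = transform(z, axes=(1, 2))                         # 2D DFT over the H and W dimensions
    # Re-encode the complex result as (real, imag) pairs in the input data type.
    return np.stack([z.real, z.imag], axis=-1).astype(x.dtype)

x = np.random.default_rng(0).standard_normal((1, 8, 8, 2)).astype(np.float32)
y = fft_real_encoded(x, inverse=True)       # same shape and dtype as x
x_back = fft_real_encoded(y, inverse=False) # forward transform undoes the inverse one
```

Complex values exist only inside the function, so callers never see a complex tensor.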
-
-See the "Model Optimizer Operation" section of [Model Optimizer Extensibility](../MO_DG/prepare_model/customize_model_optimizer/Customize_Model_Optimizer.md) for detailed instructions on how to implement the operation.
-
-Now it is necessary to implement an extractor for the "IFFT2D" operation according to the
-"Operation Extractor" section of [Model Optimizer Extensibility](../MO_DG/prepare_model/customize_model_optimizer/Customize_Model_Optimizer.md). The
-following snippet provides two extractors: one for "IFFT2D" and another one for "FFT2D"; however, only one of them is used in this example. The implementation should be saved to the file `mo_extensions/front/tf/FFT_ext.py`.
-
-@snippet FFT_ext.py fft_ext:extractor
-
-> **NOTE**: The graph is in an inconsistent state after extracting node attributes because, according to the semantics of the original
-> "IFFT2D" operation, it should have an input consuming a tensor of complex numbers, but the extractor instantiated an
-> operation "FFT" which expects a real tensor with a specific layout. This inconsistency is resolved when
-> the front phase transformations discussed below are applied.
-
-The output shape of the operation "AddV2" from the picture above is `[N, H, W, 2]`, where the innermost dimension
-contains pairs of real numbers describing the complex number (its real and imaginary parts). The following "StridedSlice"
-operations split the input tensor into 2 parts to get a tensor of real parts and a tensor of imaginary parts, which are then
-consumed by the "Complex" operation to produce a tensor of complex numbers. These "StridedSlice" and "Complex"
-operations can be removed so the "FFT" operation will get a real value tensor encoding complex numbers. To achieve this,
-we implement a front phase transformation which searches for a pattern of two "StridedSlice" operations with specific
-attributes producing data for the "Complex" operation and removes it from the graph. Refer to the
-"Pattern-Defined Front Phase Transformations" section of [Model Optimizer Extensibility](../MO_DG/prepare_model/customize_model_optimizer/Customize_Model_Optimizer.md) for more
-information on how this type of transformation works. The code snippet should be saved to the file
-`mo_extensions/front/tf/Complex.py`.
-
-@snippet Complex.py complex:transformation
-
-> **NOTE**: The graph is still in an inconsistent state because the "ComplexAbs" operation consumes a complex value tensor but
->  "FFT" produces a real value tensor.
-
-Now let's implement a transformation which replaces a "ComplexAbs" operation with a sub-graph of primitive operations
-which calculate the result using the following formula: \f$module(z) = \sqrt{real(z) \cdot real(z) + imag(z) \cdot imag(z)}\f$.
-The original "IFFT2D" operation produces a tensor of complex values, but the "FFT" operation produces a real value tensor with
-the same format and shape as the input of the operation. So the input shape for "ComplexAbs" will be `[N, H, W, 2]`,
-with the innermost dimension containing a pair with the real and imaginary parts of a complex number. To calculate
-absolute values for the complex tensor, we do the following:
-1. Raise all elements to the power of 2.
-2. Calculate a reduced sum over the innermost dimension.
-3. Calculate a square root.
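
The three steps can be checked numerically with a short NumPy sketch. The function name `complex_abs_real_encoded` and the sample values are assumptions made for illustration; the actual transformation builds the equivalent sub-graph from Model Optimizer operations rather than calling NumPy.

```python
import numpy as np

def complex_abs_real_encoded(x):
    """Hypothetical helper: x has shape [N, H, W, 2] with (real, imag) pairs in the last axis."""
    squared = np.power(x, 2)           # 1. raise all elements to the power of 2
    summed = np.sum(squared, axis=-1)  # 2. reduced sum over the innermost dimension
    return np.sqrt(summed)             # 3. square root

# Two complex numbers, 3 + 4i and 0 - 2i, encoded as a [1, 1, 2, 2] real tensor.
x = np.array([[[[3.0, 4.0], [0.0, -2.0]]]])
result = complex_abs_real_encoded(x)

# Reference computed directly on complex numbers.
expected = np.abs(x[..., 0] + 1j * x[..., 1])
```

The result has shape `[N, H, W]`: the innermost dimension of size 2 is consumed by the reduction.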
-
-The implementation should be saved to the file `mo_extensions/front/tf/ComplexAbs.py` and is provided below:
-
-@snippet ComplexAbs.py complex_abs:transformation
-
-Now it is possible to convert the model using the following command line:
-```sh
-mo --input_model <PATH_TO_MODEL>/wnet_20.pb -b 1 --extensions mo_extensions/
-```
-
-The sub-graph corresponding to the originally non-supported one is depicted in the image below:
-
-![Converted sub-graph](img/converted_subgraph.png)
-
-> **NOTE**: Model Optimizer converted the model from NHWC to NCHW layout, which is why the dimension with
-> the value 2 moved to another position.
-
-### Inference Engine Extension Implementation
-Now it is necessary to implement the extension for the CPU plugin with the "FFT" operation introduced previously. The code
-below is based on the template extension described in [Inference Engine Extensibility Mechanism](../OV_Runtime_UG/Extensibility_DG/Intro.md).
-
-#### CMake Build File
-The first step is to create a CMake configuration file which builds the extension. The content of the "CMakeLists.txt"
-file is the following:
-
-@snippet template_extension/old/CMakeLists.txt cmake:extension
-
-The CPU FFT kernel implementation uses OpenCV to perform the FFT, which is why the extension library is linked with
-`opencv_core`, which comes with OpenVINO.
-
-#### Custom nGraph Operation "FFT" Implementation
-The next step is to create the nGraph operation FFT. The header file "fft_op.hpp" has the following content:
-
-@snippet template_extension/old/fft_op.hpp fft_op:header
-
-The operation has just one boolean attribute, `inverse`. The implementations of the necessary nGraph operation functions are
-in the `fft_op.cpp` file with the following content:
-
-@snippet template_extension/old/fft_op.cpp fft_op:implementation
-
-Refer to the [Custom nGraph Operation](../OV_Runtime_UG/Extensibility_DG/AddingNGraphOps.md) for more details.
-
-#### CPU FFT Kernel Implementation
-The operation implementation for CPU plugin uses OpenCV to perform the FFT. The header file "fft_kernel.hpp" has the
-following content:
-
-@snippet template_extension/old/fft_kernel.hpp fft_kernel:header
-
-The "fft_kernel.cpp" file with the implementation of the CPU kernel has the following content:
-
-@snippet template_extension/old/fft_kernel.cpp fft_kernel:implementation
-
-Refer to the [How to Implement Custom CPU Operations](../OV_Runtime_UG/Extensibility_DG/CPU_Kernel.md) for more details.
-
-#### Extension Library Implementation
-The last step is to create an extension library, "extension.cpp" and "extension.hpp", which will include the FFT
-operation for the CPU plugin. The code of the library is described in the [Extension Library](../OV_Runtime_UG/Extensibility_DG/Extension.md).
-
-### Building and Running the Custom Extension
-To build the extension, run the following:<br>
-```bash
-mkdir build && cd build
-source /opt/intel/openvino_2022/setupvars.sh
-cmake .. -DCMAKE_BUILD_TYPE=Release
-make --jobs=$(nproc)
-```
-
-The result of this command is a compiled shared library (`.so` or `.dll`). It should be loaded in the
-application using the `AddExtension` method of a `Core` class instance, like this:
-`core.AddExtension(std::make_shared<Extension>(compiled_library_file_name), "CPU");`.
-
-To test that the extension is implemented correctly, we can run the "mri_reconstruction_demo" script with the following content:
-
-@snippet mri_reconstruction_demo.py mri_demo:demo
-
-The script can be executed using the following command line:
-```bash
-python3 mri_reconstruction_demo.py \
-        -m <PATH_TO_IR>/wnet_20.xml \
-        -i <PATH_TO_SAMPLE_MRI_IMAGE>.npy \
-        -p <Hybrid-CS-Model-MRI_repo>/Data/sampling_mask_20perc.npy \
-        -l <PATH_TO_BUILD_DIR>/libtemplate_extension.so \
-        -d CPU
-```
-
-## Additional Resources
-
-- Intel® Distribution of OpenVINO™ toolkit home page: [https://software.intel.com/en-us/openvino-toolkit](https://software.intel.com/en-us/openvino-toolkit)
-- OpenVINO™ toolkit online documentation: [https://docs.openvino.ai](https://docs.openvino.ai)
-- [Model Optimizer Developer Guide](../MO_DG/Deep_Learning_Model_Optimizer_DevGuide.md)
-- [Model Optimizer Extensibility](../MO_DG/prepare_model/customize_model_optimizer/Customize_Model_Optimizer.md)
-- [Inference Engine Extensibility Mechanism](../OV_Runtime_UG/Extensibility_DG/Intro.md)
-- [OpenVINO™ Toolkit Samples Overview](../OV_Runtime_UG/Samples_Overview.md)
-- [Overview of OpenVINO™ Toolkit Pre-Trained Models](@ref omz_models_group_intel)
-- For IoT Libraries and Code Samples see the [Intel® IoT Developer Kit](https://github.com/intel-iot-devkit).
-
-## Converting Models:
-
-- [Convert Your Caffe* Model](../MO_DG/prepare_model/convert_model/Convert_Model_From_Caffe.md)
-- [Convert Your TensorFlow* Model](../MO_DG/prepare_model/convert_model/Convert_Model_From_TensorFlow.md)
-- [Convert Your MXNet* Model](../MO_DG/prepare_model/convert_model/Convert_Model_From_MxNet.md)
-- [Convert Your Kaldi* Model](../MO_DG/prepare_model/convert_model/Convert_Model_From_Kaldi.md)
-- [Convert Your ONNX* Model](../MO_DG/prepare_model/convert_model/Convert_Model_From_ONNX.md)
--- a/docs/IE_PLUGIN_DG/AsyncInferRequest.md
+++ b/docs/IE_PLUGIN_DG/AsyncInferRequest.md
@@ -1,7 +1,7 @@
 # Asynchronous Inference Request {#openvino_docs_ie_plugin_dg_async_infer_request}

 Asynchronous Inference Request runs an inference pipeline asynchronously in one or several task executors depending on a device pipeline structure.
-Inference Engine Plugin API provides the base InferenceEngine::AsyncInferRequestThreadSafeDefault class:
+OpenVINO Runtime Plugin API provides the base InferenceEngine::AsyncInferRequestThreadSafeDefault class:

 - The class has the `_pipeline` field of `std::vector<std::pair<ITaskExecutor::Ptr, Task> >`, which contains pairs of an executor and executed task.
 - All executors are passed as arguments to a class constructor and they are in the running state and ready to run tasks.
@@ -10,7 +10,7 @@ Inference Engine Plugin API provides the base InferenceEngine::AsyncInferRequest
 `AsyncInferRequest` Class
 ------------------------

-Inference Engine Plugin API provides the base InferenceEngine::AsyncInferRequestThreadSafeDefault class for a custom asynchronous inference request implementation:
+OpenVINO Runtime Plugin API provides the base InferenceEngine::AsyncInferRequestThreadSafeDefault class for a custom asynchronous inference request implementation:

@snippet src/template_async_infer_request.hpp async_infer_request:header

--- a/docs/IE_PLUGIN_DG/Intro.md
+++ b/docs/IE_PLUGIN_DG/Intro.md
@@ -56,7 +56,7 @@ Detailed guides
 * Plugin and its components [testing](@ref openvino_docs_ie_plugin_dg_plugin_testing)
 * [Quantized networks](@ref openvino_docs_ie_plugin_dg_quantized_networks)
 * [Low precision transformations](@ref openvino_docs_IE_DG_lpt) guide
-* [Writing nGraph transformations](@ref ngraph_transformation) guide
+* [Writing OpenVINO™ transformations](@ref openvino_docs_transformations) guide

 API References
 -----------------------
--- a/docs/IE_PLUGIN_DG/Plugin.md
+++ b/docs/IE_PLUGIN_DG/Plugin.md
@@ -30,7 +30,7 @@ Based on that, declaration of a plugin class can look as follows:

 The provided plugin class also has several fields:

-* `_backend` - a backend engine that is used to perform actual computations for network inference. For `Template` plugin `ngraph::runtime::Backend` is used which performs computations using ngraph reference implementations.
+* `_backend` - a backend engine that is used to perform actual computations for network inference. For `Template` plugin `ngraph::runtime::Backend` is used which performs computations using OpenVINO™ reference implementations.
 * `_waitExecutor` - a task executor that waits for a response from a device about device tasks completion.
 * `_cfg` of type `Configuration`:

@@ -67,7 +67,7 @@ which holds a backend-dependent compiled graph in an internal representation:
 Before a creation of an `ExecutableNetwork` instance via a constructor, a plugin may check if a provided 
 InferenceEngine::ICNNNetwork object is supported by a device. In the example above, the plugin checks precision information.

-The very important part before creation of `ExecutableNetwork` instance is to call `TransformNetwork` method which applies ngraph transformation passes.
+The very important part before creation of `ExecutableNetwork` instance is to call `TransformNetwork` method which applies OpenVINO™ transformation passes.

 Actual graph compilation is done in the `ExecutableNetwork` constructor. Refer to the [ExecutableNetwork Implementation Guide](@ref openvino_docs_ie_plugin_dg_executable_network) for details.

@@ -77,27 +77,27 @@ Actual graph compilation is done in the `ExecutableNetwork` constructor. Refer t

 ### `TransformNetwork()`

-The function accepts a const shared pointer to `ngraph::Function` object and performs the following steps:
+The function accepts a const shared pointer to an `ov::Model` object and performs the following steps:

 1. Deep copies a const object to a local object, which can later be modified.
-2. Applies common and plugin-specific transformations on a copied graph to make the graph more friendly to hardware operations. For details how to write custom plugin-specific transformation, please, refer to [Writing ngraph transformations](@ref ngraph_transformation) guide. See detailed topics about network representation:
+2. Applies common and plugin-specific transformations on a copied graph to make the graph more friendly to hardware operations. For details on how to write a custom plugin-specific transformation, refer to the [Writing OpenVINO™ transformations](@ref openvino_docs_transformations) guide. See detailed topics about network representation:
    * [Intermediate Representation and Operation Sets](../_docs_MO_DG_IR_and_opsets.html)
    * [Quantized networks](@ref openvino_docs_ie_plugin_dg_quantized_networks).

@snippet template_plugin/src/template_plugin.cpp plugin:transform_network

-> **NOTE**: After all these transformations, a `ngraph::Function` object contains operations which can be perfectly mapped to backend kernels. E.g. if backend has kernel computing `A + B` operations at once, the `TransformNetwork` function should contain a pass which fuses operations `A` and `B` into a single custom operation `A + B` which fits backend kernels set.
+> **NOTE**: After all these transformations, an `ov::Model` object contains operations which can be perfectly mapped to backend kernels. E.g. if the backend has a kernel computing `A + B` operations at once, the `TransformNetwork` function should contain a pass which fuses operations `A` and `B` into a single custom operation `A + B` which fits the backend kernel set.

 ### `QueryNetwork()`

 Use the method with the `HETERO` mode, which allows to distribute network execution between different 
-devices based on the `ngraph::Node::get_rt_info()` map, which can contain the `"affinity"` key.
+devices based on the `ov::Node::get_rt_info()` map, which can contain the `"affinity"` key.
 The `QueryNetwork` method analyzes operations of provided `network` and returns a list of supported
-operations via the InferenceEngine::QueryNetworkResult structure. The `QueryNetwork` firstly applies `TransformNetwork` passes to input `ngraph::Function` argument. After this, the transformed network in ideal case contains only operations are 1:1 mapped to kernels in computational backend. In this case, it's very easy to analyze which operations is supposed (`_backend` has a kernel for such operation or extensions for the operation is provided) and not supported (kernel is missed in `_backend`):
+operations via the InferenceEngine::QueryNetworkResult structure. `QueryNetwork` first applies `TransformNetwork` passes to the input `ov::Model` argument. After this, the transformed network ideally contains only operations that are mapped 1:1 to kernels in the computational backend. In this case, it is easy to determine which operations are supported (`_backend` has a kernel for the operation, or an extension for the operation is provided) and which are not supported (the kernel is missing in `_backend`):

-1. Store original names of all operations in input `ngraph::Function`
+1. Store original names of all operations in input `ov::Model`
 2. Apply `TransformNetwork` passes. Note, the names of operations in a transformed network can be different and we need to restore the mapping in the steps below.
-3. Construct `supported` and `unsupported` maps which contains names of original operations. Note, that since the inference is performed using ngraph reference backend, the decision whether the operation is supported or not depends on whether the latest OpenVINO opset contains such operation.
+3. Construct `supported` and `unsupported` maps which contain names of original operations. Note that since the inference is performed using the OpenVINO™ reference backend, the decision whether the operation is supported or not depends on whether the latest OpenVINO opset contains such an operation.
 4. `QueryNetworkResult.supportedLayersMap` contains only operations which are fully supported by `_backend`.

@snippet template_plugin/src/template_plugin.cpp plugin:query_network
--- a/docs/IE_PLUGIN_DG/PluginTesting.md
+++ b/docs/IE_PLUGIN_DG/PluginTesting.md
@@ -26,7 +26,7 @@ Engine concepts: plugin creation, multiple executable networks support, multiple
    @snippet single_layer_tests/convolution.cpp test_convolution:instantiate

 3. **Sub-graph tests** (`subgraph_tests` sub-folder). This group of tests is designed to test small patterns or combinations of layers. E.g. when a particular topology is being enabled in a plugin, e.g. TF ResNet-50, there is no need to add the whole topology to the tests. Instead, a particular repetitive subgraph or pattern can be extracted from `ResNet-50` and added to the tests. The instantiation of the sub-graph tests is done in the same way as for single layer tests.
-> **Note**, such sub-graphs or patterns for sub-graph tests should be added to `IE::ngraphFunctions` library first (this library is a pre-defined set of small `ngraph::Function`) and re-used in sub-graph tests after.
+> **Note**, such sub-graphs or patterns for sub-graph tests should be added to `IE::ngraphFunctions` library first (this library is a pre-defined set of small `ov::Model`) and re-used in sub-graph tests after.

 4. **HETERO tests** (`subgraph_tests` sub-folder) contains tests for `HETERO` scenario (manual or automatic affinities settings, tests for `QueryNetwork`).

@@ -41,18 +41,14 @@ To use these tests for your own plugin development, link the `IE::funcSharedTest
 To build test binaries together with other build artifacts, use the `make all` command. For details, see
 [Build Plugin Using CMake*](@ref openvino_docs_ie_plugin_dg_plugin_build).

-### Tests for plugin-specific ngraph transformations
-
-Please, refer to [Transformation testing](@ref ngraph_transformation) guide.
-
 ### How to Extend Inference Engine Plugin Tests

 Inference Engine Plugin tests are open for contribution.
 Add common test case definitions applicable for all plugins to the `IE::funcSharedTests` target within the DLDT repository. Then, any other plugin supporting corresponding functionality can instantiate the new test.

-All Inference Engine per-layer tests check test layers functionality. They are developed using nGraph functions
+All Inference Engine per-layer tests check the functionality of the tested layers. They are developed using `ov::Model` objects
 as input graphs used by tests. In this case, to test a new layer with layer tests, extend
-the `IE::ngraphFunctions` library, which is also included in the Inference Engine Developer package, with a new nGraph function
+the `IE::ngraphFunctions` library, which is also included in the Inference Engine Developer package, with a new model
 including the corresponding operation.

 > **NOTE**: When implementing a new subgraph test, add new single-layer tests for each operation of the subgraph if such test does not exist.
--- a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/lpt.md
+++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/lpt.md
@@ -220,17 +220,17 @@ Typical transformation pipeline described below.
 ### Step 1. Common optimizations
 This step is optional for LPT but is typically present in OpenVINO™ plugins. The step doesn't use any LPT transformation. First, the step disables constant folding of dequantization operations on the constant subgraph on weights to prevent the loss of dequantization info in subsequent plugin transformations. After that, it optimizes the nGraph function and converts operations to operation set 1. Typically, using this step is the simplest way to meet LPT requirements for the input quantized model. If the plugin can guarantee that LPT input requirements are met, then this step can be skipped.

-@snippet snippets/lpt_mkldnn_plugin.cpp lpt_common
+@snippet snippets/lpt_intel_cpu_plugin.cpp lpt_common

 ### Step 2. Low precision transformations execution  
 This step is mandatory. It configures and runs LPT transformations.

-@snippet snippets/lpt_mkldnn_plugin.cpp lpt_execution
+@snippet snippets/lpt_intel_cpu_plugin.cpp lpt_execution

 ### Step 3. Plugin-specific transformations  
 This step is optional. It modifies the nGraph function to a device-specific operation set.

-@snippet snippets/lpt_mkldnn_plugin.cpp lpt_device
+@snippet snippets/lpt_intel_cpu_plugin.cpp lpt_device

 ## Result model overview

@@ -298,14 +298,14 @@ Low Precision Transformations can be customizable. Build-in customization option
 ### Operation precision restrictions
 This option defines the precisions which are allowed for the operation input ports. The option value is passed as an input argument to the `LowPrecision` constructor. For example:

-@snippet snippets/lpt_mkldnn_plugin.cpp lpt_supported_precisions
+@snippet snippets/lpt_intel_cpu_plugin.cpp lpt_supported_precisions

 In the provided example, `Convolution` operation inputs in the result model must have specific precisions: `u8` (unsigned int8) precision on input 0 (on activations) and `i8` (signed int8) precision on input 1 (on weights).

 ### Operation per tensor quantization restrictions
 This option defines whether an operation supports per-tensor quantization only. The option value is passed as an input argument to the `LowPrecision` constructor. For example:

-@snippet snippets/lpt_mkldnn_plugin.cpp per_tensor_quantization
+@snippet snippets/lpt_intel_cpu_plugin.cpp per_tensor_quantization

 In the provided example, `Convolution` operations in the result model must have per-tensor quantization on input 0 (on activations).

@@ -316,4 +316,4 @@ This option defines if each LPT transformation updates precision or not. The opt

 Plugin-specific customization can be implemented via nGraph transformation callbacks. For example, asymmetric quantization support can be easily customized by using the `LayerTransformation::isAsymmetricQuantization` and `WeightableLayerTransformation::isAsymmetricOnWeights` methods in callbacks:

-@snippet snippets/lpt_mkldnn_plugin.cpp asymmetric_quantization
+@snippet snippets/lpt_intel_cpu_plugin.cpp asymmetric_quantization
--- a/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/pipeline/step2_markup.md
+++ b/docs/IE_PLUGIN_DG/plugin_transformation_pipeline/low_precision_transformations/pipeline/step2_markup.md
@@ -44,7 +44,7 @@ The original model key features:

 Transformations are run with the following parameters:

-@snippet snippets/lpt_mkldnn_plugin.cpp lpt_markup_pipeline
+@snippet snippets/lpt_intel_cpu_plugin.cpp lpt_markup_pipeline

 ## 1. MarkupCanBeQuantized
 The transformation marks operations that cannot be quantized. No attributes are required before the transformation.
--- a/docs/MO_DG/Deep_Learning_Model_Optimizer_DevGuide.md
+++ b/docs/MO_DG/Deep_Learning_Model_Optimizer_DevGuide.md
@@ -22,7 +22,7 @@

 Model Optimizer is a cross-platform command-line tool that facilitates the transition between the training and deployment environment, performs static model analysis, and adjusts deep learning models for optimal execution on end-point target devices.

-Model Optimizer process assumes you have a network model trained using supported deep learning frameworks: Caffe*, TensorFlow*, Kaldi*, MXNet* or converted to the ONNX* format. Model Optimizer produces an Intermediate Representation (IR) of the network, which can be inferred with the [Inference Engine](../OV_Runtime_UG/Deep_Learning_Inference_Engine_DevGuide.md).
+Model Optimizer process assumes you have a network model trained using supported deep learning frameworks: Caffe*, TensorFlow*, Kaldi*, MXNet* or converted to the ONNX* format. Model Optimizer produces an Intermediate Representation (IR) of the network, which can be inferred with the [OpenVINO™ Runtime](../OV_Runtime_UG/openvino_intro.md).

 > **NOTE**: Model Optimizer does not infer models. Model Optimizer is an offline tool that runs before the inference takes place.

--- a/docs/MO_DG/Known_Issues_Limitations.md
+++ b/docs/MO_DG/Known_Issues_Limitations.md
@@ -7,41 +7,3 @@ TensorFlow* provides only prebuilt binaries with AVX instructions enabled. When
 To run the Model Optimizer on this hardware, you should compile TensorFlow binaries from source as described at the [TensorFlow website](https://www.tensorflow.org/install/source). 

 Another option is to run the Model Optimizer to generate an IR on hardware that supports AVX and then perform inference on hardware without AVX.
-
-
-## Multiple OpenMP Loadings
-
-If the application uses the Inference Engine with third-party components that depend on Intel OpenMP, multiple loadings of the libiomp library may occur and cause OpenMP runtime initialization conflicts. This may happen, for example, if the application uses Intel® Math Kernel Library (Intel® MKL) through the “Single Dynamic Library” (<code>libmkl_rt.so</code>) mechanism and calls Intel MKL after loading the Inference Engine plugin.
-The error log looks as follows:
-```sh
-OMP: Error #15: Initializing libiomp5.so, but found libiomp5.so already initialized.
-OMP: Hint: This means that multiple copies of the OpenMP runtime have been linked into the program. That is dangerous, since it can degrade performance or cause incorrect results. The best thing to do is to ensure that only a single OpenMP runtime is linked into the process, e.g. by avoiding static linking of the OpenMP runtime in any library. As an unsafe, unsupported, undocumented workaround you can set the environment variable KMP_DUPLICATE_LIB_OK=TRUE to allow the program to continue to execute, but that may cause crashes or silently produce incorrect results. For more information, please see http://www.intel.com/software/products/support/.
-```
-
-Possible workarounds:
-
-*  Preload the OpenMP runtime using the <code>LD_PRELOAD</code> variable:
-   ```sh
-   LD_PRELOAD=<path_to_libiomp5.so> <path_to your_executable> ```
-   This eliminates multiple loadings of libiomp, and makes all the components use this specific version of OpenMP.
-
-*  Alternatively, you can set <code>KMP_DUPLICATE_LIB_OK=TRUE</code>. However, performance degradation or incorrect results may occur in this case.
-
-
-## Old proto compiler breaks protobuf library
-
-With python protobuf library version 3.5.1 the following incompatibility can happen.
-The known case is for Cent OS 7.4
-
-The error log looks as follows:
-
-```sh
-File "../lib64/python3.5/site-packages/google/protobuf/descriptor.py", line 829, in _new_
-return _message.default_pool.AddSerializedFile(serialized_pb)
-TypeError: expected bytes, str found
-```
-
-Possible workaround is to upgrade default protobuf compiler (libprotoc 2.5.0) to newer version, for example
-libprotoc 2.6.1.
-
-[protobuf_issue]: https://github.com/google/protobuf/issues/4272
--- a/docs/MO_DG/prepare_model/Additional_Optimizations.md
+++ b/docs/MO_DG/prepare_model/Additional_Optimizations.md
@@ -3,13 +3,13 @@
 Model Optimizer performs preprocessing to a model. It is possible to optimize this step and improve first inference time. To do that, follow the tips below:

 -	**Image mean/scale parameters**<br>
-	Make sure to use the input image mean/scale parameters (`--scale` and `–mean_values`) with the Model Optimizer when you need pre-processing. It allows the tool to bake the pre-processing into the IR to get accelerated by the Inference Engine.
+	Make sure to use the input image mean/scale parameters (`--scale` and `--mean_values`) with the Model Optimizer when you need pre-processing. It allows the tool to bake the pre-processing into the IR so that it is accelerated by the OpenVINO Runtime.

 -	**RGB vs. BGR inputs**<br>
 	If, for example, your network assumes the RGB inputs, the Model Optimizer can swap the channels in the first convolution using the `--reverse_input_channels` command line option, so you do not need to convert your inputs to RGB every time you get the BGR image, for example, from OpenCV*.

 -	**Larger batch size**<br>
-	Notice that devices like GPU perform better with a larger batch size. It is also possible to set the batch size at runtime using the Inference Engine [ShapeInference feature](../../OV_Runtime_UG/ShapeInference.md).
+	Notice that devices like GPU perform better with a larger batch size. It is also possible to set the batch size at runtime using the OpenVINO Runtime API [ShapeInference feature](../../OV_Runtime_UG/ShapeInference.md).

 -	**Resulting IR precision**<br>
 The resulting IR precision, for instance, `FP16` or `FP32`, directly affects performance. As CPU now supports `FP16` (while internally upscaling to `FP32` anyway) and because this is the best precision for a GPU target, you may want to always convert models to `FP16`. Notice that this is the only precision that Intel&reg; Movidius&trade; Myriad&trade; 2 and Intel&reg; Myriad&trade; X VPUs support.
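
As a rough illustration of the arithmetic that the mean/scale and channel-swap options above fold into the IR, the sketch below applies it to a single pixel in plain Python. The function names and sample values are hypothetical, and this is not Model Optimizer code; it assumes one scalar scale shared by all channels and does not model the order in which the tool composes the two steps.

```python
def normalize(pixel, mean_values, scale):
    """Apply (value - mean) / scale to each channel of one pixel."""
    return [(value - mean) / scale for value, mean in zip(pixel, mean_values)]


def reverse_channels(pixel):
    """Swap the channel order of one pixel, for example BGR -> RGB."""
    return list(reversed(pixel))


# Hypothetical sample values, chosen only to keep the arithmetic easy to follow.
bgr_pixel = [10.0, 20.0, 30.0]
rgb_pixel = reverse_channels(bgr_pixel)
normalized = normalize(rgb_pixel, mean_values=[20.0, 10.0, 0.0], scale=2.0)
print(rgb_pixel, normalized)
```

When the same computation is baked into the IR, the application no longer has to run it on every frame before inference.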
--- a/docs/MO_DG/prepare_model/Default_Model_Optimizer_Optimizations.md
+++ b/docs/MO_DG/prepare_model/Default_Model_Optimizer_Optimizations.md
@@ -8,4 +8,4 @@ The picture above shows Caffe\* Resnet269\* topology. The left model is the orig

 If you still see these operations, inspect the Model Optimizer output carefully while searching for warnings, such as on the tool being unable to fuse. For example, non-linear operations (like activations) in between convolutions and linear operations might prevent the fusing. If performance is of concern, try to change (and potentially re-train) the topology. Refer to the [Model Optimizer Guide](Model_Optimization_Techniques.md) for more optimizations.

-Notice that the activation (`_relu`) is not touched by the Model Optimizer, and while it can be merged into convolution as well, this is rather a device-specific optimization, covered by Inference Engine during the model loading time. You are encouraged to inspect performance counters from plugins that should indicate that these particular layers are not executed (“Optimized out”). For more information, refer to <a href="#performance-counters">Internal Inference Performance Counters</a>.
+Notice that the activation (`_relu`) is not touched by the Model Optimizer, and while it can be merged into convolution as well, this is rather a device-specific optimization, covered by OpenVINO Runtime during the model loading time. You are encouraged to inspect performance counters from plugins that should indicate that these particular layers are not executed (“Optimized out”). For more information, refer to <a href="#performance-counters">Internal Inference Performance Counters</a>.
--- a/docs/MO_DG/prepare_model/Getting_performance_numbers.md
+++ b/docs/MO_DG/prepare_model/Getting_performance_numbers.md
@@ -3,11 +3,11 @@

 ## Tip 1. Measure the Proper Set of Operations 

-When evaluating performance of your model with the Inference Engine, you must measure the proper set of operations. To do so, consider the following tips: 
+When evaluating performance of your model with the OpenVINO Runtime, you must measure the proper set of operations. To do so, consider the following tips: 

 - Avoid including one-time costs like model loading.

- Track separately the operations that happen outside the Inference Engine, like video decoding. 
+- Track separately the operations that happen outside the OpenVINO Runtime, like video decoding. 

 > **NOTE**: Some image pre-processing can be baked into the IR and accelerated. For more information, refer to [Model Optimizer Knobs Related to Performance](Additional_Optimizations.md)

@@ -18,7 +18,7 @@ You need to build your performance conclusions on reproducible data. Do the perf
 -	If the warm-up run does not help or execution time still varies, you can try running a large number of iterations and then average or find a mean of the results.
 -	 For time values that range too much, use geomean.

-Refer to the [Inference Engine Samples](../../OV_Runtime_UG/Samples_Overview.md) for code examples for the performance measurements. Almost every sample, except interactive demos, has a `-ni` option to specify the number of iterations.
+Refer to the [OpenVINO Samples](../../OV_Runtime_UG/Samples_Overview.md) for code examples for the performance measurements. Almost every sample, except interactive demos, has a `-ni` option to specify the number of iterations.
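
The warm-up, iteration, and averaging tips above can be sketched as a small timing harness. This is a minimal sketch rather than code from the samples: `benchmark`, `summarize`, and the stand-in workload are hypothetical names, and `infer` is assumed to be any zero-argument callable that runs exactly one inference, with model loading and pre-processing done outside it.

```python
import statistics
import time


def benchmark(infer, iterations=100, warmup=1):
    """Time only the inference call; return per-iteration latencies in ms."""
    for _ in range(warmup):
        infer()  # warm-up runs are executed but not recorded
    latencies_ms = []
    for _ in range(iterations):
        start = time.perf_counter()
        infer()
        latencies_ms.append((time.perf_counter() - start) * 1000.0)
    return latencies_ms


def summarize(latencies_ms):
    """Aggregate latencies; geomean is less sensitive to a few large outliers."""
    return {
        "mean_ms": statistics.mean(latencies_ms),
        "median_ms": statistics.median(latencies_ms),
        "geomean_ms": statistics.geometric_mean(latencies_ms),
    }


if __name__ == "__main__":
    # Stand-in workload; replace it with a call that runs one inference.
    def fake_infer():
        sum(i * i for i in range(10_000))

    print(summarize(benchmark(fake_infer, iterations=50)))
```

Keeping the timer strictly around the `infer()` call is what separates inference time from one-time costs and from work done outside the runtime.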

 ## Getting performance numbers using OpenVINO tool 

@@ -45,16 +45,16 @@ Instead, it is possible to keep a separate infer request per camera or another s

 ## Comparing Performance with Native/Framework Code 

-When comparing the Inference Engine performance with the framework or another reference code, make sure that both versions are as similar as possible:
+When comparing the OpenVINO Runtime performance with the framework or another reference code, make sure that both versions are as similar as possible:

-	Wrap exactly the inference execution (refer to the [Inference Engine Samples](../../OV_Runtime_UG/Samples_Overview.md) for examples).
+-	Wrap exactly the inference execution (refer to the [OpenVINO Samples](../../OV_Runtime_UG/Samples_Overview.md) for examples).
 -	Do not include model loading time.
-	Ensure the inputs are identical for the Inference Engine and the framework. For example, Caffe\* allows to auto-populate the input with random values. Notice that it might give different performance than on real images.
-	Similarly, for correct performance comparison, make sure the access pattern, for example, input layouts, is optimal for Inference Engine (currently, it is NCHW).
+-	Ensure the inputs are identical for the OpenVINO Runtime and the framework. For example, Caffe\* allows to auto-populate the input with random values. Notice that it might give different performance than on real images.
+-	Similarly, for correct performance comparison, make sure the access pattern, for example, input layouts, is optimal for OpenVINO Runtime (currently, it is NCHW).
 -	Any user-side pre-processing should be tracked separately.
-	Make sure to try the same environment settings that the framework developers recommend, for example, for TensorFlow*. In many cases, things that are more machine friendly, like respecting NUMA (see <a href="#cpu-checklist">CPU Checklist</a>), might work well for the Inference Engine as well.
-	If applicable, use batching with the Inference Engine.
-	If possible, demand the same accuracy. For example, TensorFlow allows `FP16` support, so when comparing to that, make sure to test the Inference Engine with the `FP16` as well.
+-	Make sure to try the same environment settings that the framework developers recommend, for example, for TensorFlow*. In many cases, things that are more machine friendly, like respecting NUMA (see <a href="#cpu-checklist">CPU Checklist</a>), might work well for the OpenVINO Runtime as well.
+-	If applicable, use batching.
+-	If possible, demand the same accuracy. For example, TensorFlow allows `FP16` support, so when comparing to that, make sure to test the OpenVINO Runtime with the `FP16` as well.

 ## Using Tools <a name="using-tools"></a>

@@ -64,7 +64,7 @@ Alternatively, you can gather the raw profiling data that samples report, the se

 ### Internal Inference Performance Counters <a name="performance-counters"></a>

-Almost every sample (inspect command-line options for a specific sample with `-h`) supports a `-pc` command that outputs internal execution breakdown. Refer to the [samples code](../../OV_Runtime_UG/Samples_Overview.md) for the actual Inference Engine API behind that.
+Almost every sample (inspect command-line options for a specific sample with `-h`) supports a `-pc` command that outputs internal execution breakdown. Refer to the [OpenVINO Samples](../../OV_Runtime_UG/Samples_Overview.md) for the actual OpenVINO Runtime API behind that.

 Below is an example of CPU plugin output for a network (since the device is CPU, the layers wall clock `realTime` and the `cpu` time are the same):

--- a/docs/MO_DG/prepare_model/Model_Optimizer_FAQ.md
+++ b/docs/MO_DG/prepare_model/Model_Optimizer_FAQ.md
@@ -158,7 +158,7 @@ However, if your model contains more than one input, the Model Optimizer is able

 #### 9. What does the message "Mean file for topologies with multiple inputs is not supported" mean? <a name="question-9"></a>

-Model Optimizer does not support mean file processing for topologies with more than one input. In this case, you need to perform preprocessing of the inputs for a generated Intermediate Representation in the Inference Engine to perform subtraction for every input of your multi-input model.
+Model Optimizer does not support mean file processing for topologies with more than one input. In this case, you need to perform preprocessing of the inputs for a generated Intermediate Representation in the OpenVINO Runtime to perform subtraction for every input of your multi-input model. See [Overview of Preprocessing](../../OV_Runtime_UG/preprocessing_overview.md) for details.

 #### 10. What does the message "Cannot load or process mean file: value error" mean? <a name="question-10"></a>

@@ -214,7 +214,7 @@ One of the layers in the specified topology might not have inputs or values. Ple

 #### 24. What does the message "Part of the nodes was not translated to IE. Stopped" mean? <a name="question-24"></a>

-Some of the layers are not supported by the Inference Engine and cannot be translated to an Intermediate Representation. You can extend the Model Optimizer by allowing generation of new types of layers and implement these layers in the dedicated Inference Engine plugins. For more information, refer to the [Custom Layers Guide](../../HOWTO/Custom_Layers_Guide.md) and [Inference Engine Extensibility Mechanism](../../OV_Runtime_UG/Extensibility_DG/Intro.md)
+Some of the operations are not supported by the OpenVINO Runtime and cannot be translated to an Intermediate Representation. You can extend the Model Optimizer by allowing generation of new types of operations and implement these operations in the dedicated OpenVINO plugins. For more information, refer to the [OpenVINO™ Extensibility Mechanism](../../Extensibility_UG/Intro.md)

 #### 25. What does the message "While creating an edge from .. to .. : node name is undefined in the graph. Check correctness of the input model" mean? <a name="question-25"></a>

@@ -268,7 +268,7 @@ Model Optimizer tried to write an event file in the specified directory but fail

 #### 37. What does the message "There is no registered 'infer' function for node  with op = .. . Please implement this function in the extensions" mean? <a name="question-37"></a>

-Most likely, you tried to extend Model Optimizer with a new primitive, but did not specify an infer function. For more information on extensions, see [Custom Layers Guide](../../HOWTO/Custom_Layers_Guide.md).
+Most likely, you tried to extend Model Optimizer with a new primitive, but did not specify an infer function. For more information on extensions, see [OpenVINO™ Extensibility Mechanism](../../Extensibility_UG/Intro.md).

 #### 38. What does the message "Stopped shape/value propagation at node" mean? <a name="question-38"></a>

@@ -300,7 +300,7 @@ Most likely, there is a problem with the specified file for model. The file exis

 #### 45. What does the message "Found custom layer. Model Optimizer does not support this layer. Please, register it in CustomLayersMapping.xml or implement extension" mean? <a name="question-45"></a>

-This means that the layer `{layer_name}` is not supported in the Model Optimizer. You can find a list of all unsupported layers in the corresponding section. You should implement the extensions for this layer ([Custom Layers Guide](../../HOWTO/Custom_Layers_Guide.md)).
+This means that the layer `{layer_name}` is not supported in the Model Optimizer. You can find a list of all unsupported layers in the corresponding section. You should implement the extensions for this layer ([OpenVINO™ Extensibility Mechanism](../../Extensibility_UG/Intro.md)).

 #### 46. What does the message "Custom replacement configuration file does not exist" mean? <a name="question-46"></a>

@@ -308,7 +308,7 @@ Path to the custom replacement configuration file was provided with the `--trans

 #### 47. What does the message "Extractors collection have case insensitive duplicates" mean? <a name="question-47"></a>

-When extending Model Optimizer with new primitives keep in mind that their names are case insensitive. Most likely, another operation with the same name is already defined. For more information, see [Custom Layers Guide](../../HOWTO/Custom_Layers_Guide.md).
+When extending Model Optimizer with new primitives keep in mind that their names are case insensitive. Most likely, another operation with the same name is already defined. For more information, see [OpenVINO™ Extensibility Mechanism](../../Extensibility_UG/Intro.md).

 #### 48. What does the message "Input model name is not in an expected format, cannot extract iteration number" mean? <a name="question-48"></a>

@@ -340,7 +340,7 @@ Please, make sure that inputs are defined and have correct shapes. You can use `

 #### 55. What does the message "Attempt to register of custom name for the second time as class. Note that custom names are case-insensitive" mean? <a name="question-55"></a>

-When extending Model Optimizer with new primitives keep in mind that their names are case insensitive. Most likely, another operation with the same name is already defined. For more information, see [Custom Layers Guide](../../HOWTO/Custom_Layers_Guide.md).
+When extending Model Optimizer with new primitives keep in mind that their names are case insensitive. Most likely, another operation with the same name is already defined. For more information, see [OpenVINO™ Extensibility Mechanism](../../Extensibility_UG/Intro.md).

 #### 56. What does the message "Both --input_shape and --batch were provided. Please, provide only one of them" mean? <a name="question-56"></a>

@@ -492,7 +492,7 @@ For more information, refer to [Converting a MXNet* Model](convert_model/Convert

 Model Optimizer tried to load the model that contains some unsupported operations. 
 If you want to convert model that contains unsupported operations you need to prepare extension for all such operations.
-For more information, refer to [Custom Layers Guide](../../HOWTO/Custom_Layers_Guide.md).
+For more information, refer to [OpenVINO™ Extensibility Mechanism](../../Extensibility_UG/Intro.md).

 #### 87. What does the message "Can not register Op ... Please, call function 'register_caffe_python_extractor' with parameter 'name'" mean? <a name="question-87"></a>

@@ -538,7 +538,7 @@ Note that the first call <code>register_caffe_python_extractor(ProposalPythonExa

 The second call prevents Model Optimizer from using this extension as if it is an extension for 
 a layer with type `Proposal`. Otherwise, this layer can be chosen as an implementation of extension that can lead to potential issues.
-For more information, refer to the [Custom Layers Guide](../../HOWTO/Custom_Layers_Guide.md).
+For more information, refer to the [OpenVINO™ Extensibility Mechanism](../../Extensibility_UG/Intro.md).

 #### 88. What does the message "Model Optimizer is unable to calculate output shape of Memory node .." mean? <a name="question-88"></a>

@@ -572,8 +572,8 @@ file is not available or does not exist. Also refer to FAQ [#90](#question-90).
 This message means that if you have model with custom layers and its json file has been generated with MXNet version
 lower than 1.0.0, Model Optimizer does not support such topologies. If you want to convert it you have to rebuild 
 MXNet with unsupported layers or generate new json with MXNet version 1.0.0 and higher. Also you need to implement 
-Inference Engine extension for used custom layers.
-For more information, refer to the [Custom Layers Guide](../../HOWTO/Custom_Layers_Guide.md).
+OpenVINO extension for used custom layers.
+For more information, refer to the [OpenVINO™ Extensibility Mechanism](../../Extensibility_UG/Intro.md).

 #### 97. What does the message "Graph contains a cycle. Can not proceed .." mean?  <a name="question-97"></a>

@@ -586,7 +586,7 @@ For Tensorflow:

 For all frameworks: 
 1. [Replace cycle containing Sub-graph in Model Optimizer](customize_model_optimizer/Subgraph_Replacement_Model_Optimizer.md)
-2. [Custom Layers Guide](../../HOWTO/Custom_Layers_Guide.md)
+2. [OpenVINO™ Extensibility Mechanism](../../Extensibility_UG/Intro.md)

 or
 * Edit network in original framework to exclude cycle.
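
Before editing the network in the original framework, it can help to know which nodes actually form the cycle. The sketch below is a plain depth-first search and assumes the graph has been dumped into an adjacency dictionary mapping each node name to its successors. That representation and the name `find_cycle` are hypothetical and are not Model Optimizer data structures or API.

```python
def find_cycle(graph):
    """Return one cycle as a list of nodes, or None if the graph is acyclic."""
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    color = {node: WHITE for node in graph}
    path = []

    def visit(node):
        color[node] = GRAY
        path.append(node)
        for successor in graph.get(node, ()):
            state = color.get(successor, WHITE)
            if state == GRAY:
                # The successor is already on the current path: cycle found.
                return path[path.index(successor):] + [successor]
            if state == WHITE:
                cycle = visit(successor)
                if cycle:
                    return cycle
        path.pop()
        color[node] = BLACK
        return None

    for node in list(graph):
        if color[node] == WHITE:
            cycle = visit(node)
            if cycle:
                return cycle
    return None


print(find_cycle({"a": ["b"], "b": ["c"], "c": ["a"]}))
```

The search is recursive, so very deep graphs may need an iterative variant or a higher recursion limit.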
--- a/docs/MO_DG/prepare_model/Supported_Frameworks_Layers.md
+++ b/docs/MO_DG/prepare_model/Supported_Frameworks_Layers.md
@@ -1,9 +1,9 @@
 # Supported Framework Layers {#openvino_docs_MO_DG_prepare_model_Supported_Frameworks_Layers}

-## Caffe\* Supported Layers
+## Caffe Supported Layers


-| Layer Name in Caffe\* | Limitations |
+| Layer Name in Caffe | Limitations |
 |:---------- | :----------|
 | Axpy |  |
 | BN |  |
@@ -47,10 +47,10 @@
 | Tile |  |


-## MXNet\* Supported Symbols
+## MXNet Supported Symbols


-| Symbol Name in MXNet\*| Limitations|
+| Symbol Name in MXNet| Limitations|
 | :----------| :----------|
 | _Plus |  |
 | _contrib_arange_like |  |
@@ -119,7 +119,7 @@
 | Concat |  |
 | Convolution |  |
 | Crop | "center_crop" = 1 is not supported |
-| Custom | [Custom Layers in the Model Optimizer](customize_model_optimizer/Customize_Model_Optimizer.md) |
+| Custom | [Custom Layers in Model Optimizer](customize_model_optimizer/Customize_Model_Optimizer.md) |
 | Deconvolution |  |
 | DeformableConvolution |  |
 | DeformablePSROIPooling |  |
@@ -149,12 +149,12 @@
 | zeros_like |  |


-## TensorFlow\* Supported Operations
+## TensorFlow Supported Operations

-Some TensorFlow\* operations do not match any Inference Engine layer, but are still supported by the Model Optimizer and can be used on the constant propagation path. These layers are labeled 'Constant propagation' in the table.
+Some TensorFlow operations do not match any OpenVINO operation, but are still supported by the Model Optimizer and can be used on the constant propagation path. These layers are labeled 'Constant propagation' in the table.


-| Operation Name in TensorFlow\* | Limitations|
+| Operation Name in TensorFlow | Limitations|
 | :----------| :----------|
 | Abs |  |
 | Acosh |  |
@@ -348,10 +348,10 @@ Some TensorFlow\* operations do not match to any Inference Engine layer, but are
 | ZerosLike |  |


-## TensorFlow 2 Keras\* Supported Operations
+## TensorFlow 2 Keras Supported Operations


-| Operation Name in TensorFlow 2 Keras\* | Limitations|
+| Operation Name in TensorFlow 2 Keras | Limitations|
 | :----------| :----------|
 | ActivityRegularization |  |
 | Add |  |
@@ -431,10 +431,10 @@ Some TensorFlow\* operations do not match to any Inference Engine layer, but are
 | ZeroPadding2D |  |
 | ZeroPadding3D |  |

-## Kaldi\* Supported Layers
+## Kaldi Supported Layers


-| Symbol Name in Kaldi\*| Limitations|
+| Symbol Name in Kaldi| Limitations|
 | :----------| :----------|
 | addshift |  |
 | affinecomponent |  |
@@ -478,10 +478,10 @@ Some TensorFlow\* operations do not match to any Inference Engine layer, but are
 | timeheightconvolutioncomponent |  |


-## ONNX\* Supported Operators
+## ONNX Supported Operators


-| Symbol Name in ONNX\*| Limitations|
+| Symbol Name in ONNX| Limitations|
 | :----------| :----------|
 | Abs |  |
 | Acos |  |
@@ -621,11 +621,11 @@ Some TensorFlow\* operations do not match to any Inference Engine layer, but are
 | Xor |  |


-## PaddlePaddle\* Supported Operators
+## PaddlePaddle Supported Operators

 paddlepaddle>=2.1

-| Operator Name in PaddlePaddle\*| Limitations|
+| Operator Name in PaddlePaddle| Limitations|
 | :----------| :----------|
 | adaptive_pool2d | 'NHWC' data_layout is not supported |
 | arg_max | 'int32' output data_type is not supported |
--- a/docs/MO_DG/prepare_model/convert_model/Convert_Model_From_Caffe.md
+++ b/docs/MO_DG/prepare_model/convert_model/Convert_Model_From_Caffe.md
@@ -10,8 +10,8 @@ A summary of the steps for optimizing and deploying a model that was trained wit

 1. [Configure the Model Optimizer](../../Deep_Learning_Model_Optimizer_DevGuide.md) for Caffe\*.
 2. [Convert a Caffe\* Model](#Convert_From_Caffe) to produce an optimized [Intermediate Representation (IR)](../../IR_and_opsets.md) of the model based on the trained network topology, weights, and biases values
-3. Test the model in the Intermediate Representation format using the [Inference Engine](../../../OV_Runtime_UG/Deep_Learning_Inference_Engine_DevGuide.md) in the target environment via provided Inference Engine [sample applications](../../../OV_Runtime_UG/Samples_Overview.md)
-4. [Integrate](../../../OV_Runtime_UG/Samples_Overview.md) the [Inference Engine](../../../OV_Runtime_UG/Deep_Learning_Inference_Engine_DevGuide.md) in your application to deploy the model in the target environment
+3. Test the model in the Intermediate Representation format using the [OpenVINO™ Runtime](../../../OV_Runtime_UG/openvino_intro.md) in the target environment via provided [OpenVINO samples](../../../OV_Runtime_UG/Samples_Overview.md)
+4. [Integrate](../../../OV_Runtime_UG/Samples_Overview.md) the [OpenVINO™ Runtime](../../../OV_Runtime_UG/openvino_intro.md) in your application to deploy the model in the target environment

 ## Supported Topologies

--- a/docs/MO_DG/prepare_model/convert_model/Convert_Model_From_Kaldi.md
+++ b/docs/MO_DG/prepare_model/convert_model/Convert_Model_From_Kaldi.md
@@ -16,8 +16,8 @@ A summary of the steps for optimizing and deploying a model that was trained wit

 1. [Configure the Model Optimizer](../../Deep_Learning_Model_Optimizer_DevGuide.md) for Kaldi\*.
 2. [Convert a Kaldi\* Model](#Convert_From_Kaldi) to produce an optimized [Intermediate Representation (IR)](../../IR_and_opsets.md) of the model based on the trained network topology, weights, and biases values.
-3. Test the model in the Intermediate Representation format using the [Inference Engine](../../../OV_Runtime_UG/Deep_Learning_Inference_Engine_DevGuide.md) in the target environment via provided Inference Engine [sample applications](../../../OV_Runtime_UG/Samples_Overview.md).
-4. [Integrate](../../../OV_Runtime_UG/Samples_Overview.md) the [Inference Engine](../../../OV_Runtime_UG/Deep_Learning_Inference_Engine_DevGuide.md) in your application to deploy the model in the target environment.
+3. Test the model in the Intermediate Representation format using the [OpenVINO™ Runtime](../../../OV_Runtime_UG/openvino_intro.md) in the target environment via provided [OpenVINO Samples](../../../OV_Runtime_UG/Samples_Overview.md).
+4. [Integrate](../../../OV_Runtime_UG/Samples_Overview.md) the [OpenVINO™ Runtime](../../../OV_Runtime_UG/openvino_intro.md) in your application to deploy the model in the target environment.

 > **NOTE**: The Model Optimizer supports the [nnet1](http://kaldi-asr.org/doc/dnn1.html) and [nnet2](http://kaldi-asr.org/doc/dnn2.html) formats of Kaldi models. Support of the [nnet3](http://kaldi-asr.org/doc/dnn3.html) format is limited.

@@ -100,7 +100,7 @@ The Model Optimizer finds the last layer of the topology and removes this layer

  > **NOTE**: Model Optimizer can remove SoftMax layer only if the topology has one output.
 
-  > **NOTE**: For sample inference of Kaldi models, you can use the Inference Engine Speech Recognition sample application. The sample supports models with one output. If your model has several outputs, specify the desired one with the `--output` option.    
+  > **NOTE**: For sample inference of Kaldi models, you can use the OpenVINO Speech Recognition sample application. The sample supports models with one output. If your model has several outputs, specify the desired one with the `--output` option.    
  
 If you want to convert a model for inference on Intel® Movidius™ Myriad™, use the `--remove_memory` option. 
 It removes Memory layers from the IR. Instead of it, additional inputs and outputs appear in the IR. 
--- a/docs/MO_DG/prepare_model/convert_model/Convert_Model_From_MxNet.md
+++ b/docs/MO_DG/prepare_model/convert_model/Convert_Model_From_MxNet.md
@@ -17,8 +17,8 @@ A summary of the steps for optimizing and deploying a model that was trained wit

 1. [Configure the Model Optimizer](../../Deep_Learning_Model_Optimizer_DevGuide.md) for MXNet* (MXNet was used to train your model)
 2. [Convert an MXNet model](#ConvertMxNet) to produce an optimized [Intermediate Representation (IR)](../../IR_and_opsets.md) of the model based on the trained network topology, weights, and biases values
-3. Test the model in the Intermediate Representation format using the [Inference Engine](../../../OV_Runtime_UG/Deep_Learning_Inference_Engine_DevGuide.md) in the target environment via provided Inference Engine [sample applications](../../../OV_Runtime_UG/Samples_Overview.md)
-4. [Integrate](../../../OV_Runtime_UG/Samples_Overview.md) the [Inference Engine](../../../OV_Runtime_UG/Deep_Learning_Inference_Engine_DevGuide.md) in your application to deploy the model in the target environment
+3. Test the model in the Intermediate Representation format using the [OpenVINO™ Runtime](../../../OV_Runtime_UG/openvino_intro.md) in the target environment via provided [OpenVINO Samples](../../../OV_Runtime_UG/Samples_Overview.md)
+4. [Integrate](../../../OV_Runtime_UG/Samples_Overview.md) the [OpenVINO™ Runtime](../../../OV_Runtime_UG/openvino_intro.md) in your application to deploy the model in the target environment

 ## Supported Topologies

--- a/docs/MO_DG/prepare_model/convert_model/Convert_Model_From_Paddle.md
+++ b/docs/MO_DG/prepare_model/convert_model/Convert_Model_From_Paddle.md
@@ -1,11 +1,11 @@
-# Converting a Paddle* Model {#openvino_docs_MO_DG_prepare_model_convert_model_Convert_Model_From_Paddle}
+# Converting a PaddlePaddle Model {#openvino_docs_MO_DG_prepare_model_convert_model_Convert_Model_From_Paddle}

-A summary of the steps for optimizing and deploying a model trained with Paddle\*:
+A summary of the steps for optimizing and deploying a model trained with PaddlePaddle:

-1. [Configure the Model Optimizer](../../Deep_Learning_Model_Optimizer_DevGuide.md) for Paddle\*.
-2. [Convert a Paddle\* Model](#Convert_From_Paddle) to produce an optimized [Intermediate Representation (IR)](../../IR_and_opsets.md) of the model based on the trained network topology, weights, and biases.
-3. Test the model in the Intermediate Representation format using the [Inference Engine](../../../OV_Runtime_UG/Deep_Learning_Inference_Engine_DevGuide.md) in the target environment via provided Inference Engine [sample applications](../../../OV_Runtime_UG/Samples_Overview.md).
-4. [Integrate](../../../OV_Runtime_UG/Samples_Overview.md) the [Inference Engine](../../../OV_Runtime_UG/Deep_Learning_Inference_Engine_DevGuide.md) in your application to deploy the model in the target environment.
+1. [Configure Model Optimizer](../../Deep_Learning_Model_Optimizer_DevGuide.md) for PaddlePaddle.
+2. [Convert a PaddlePaddle Model](#Convert_From_Paddle) to produce an optimized [Intermediate Representation (IR)](../../IR_and_opsets.md) of the model based on the trained network topology, weights, and biases.
+3. Test the model in the Intermediate Representation format using the [OpenVINO™ Runtime](../../../OV_Runtime_UG/openvino_intro.md) in the target environment via provided [OpenVINO Samples](../../../OV_Runtime_UG/Samples_Overview.md).
+4. [Integrate](../../../OV_Runtime_UG/Samples_Overview.md) the [OpenVINO™ Runtime](../../../OV_Runtime_UG/openvino_intro.md) in your application to deploy the model in the target environment.

 ## Supported Topologies

@@ -29,11 +29,11 @@ A summary of the steps for optimizing and deploying a model trained with Paddle\

 > **NOTE:** The verified models are exported from the repository of branch release/2.1.

-## Convert a Paddle* Model <a name="Convert_From_Paddle"></a>
+## Convert a PaddlePaddle Model <a name="Convert_From_Paddle"></a>

-To convert a Paddle\* model:
+To convert a PaddlePaddle model:

-1. Activate environment with installed OpenVINO if needed
+1. Activate environment with installed OpenVINO™ if needed
 2. Use the `mo` script to simply convert a model, specifying the framework, the path to the input model `.pdmodel` file and the path to an output directory with write permissions:
 ```sh
 mo --input_model <INPUT_MODEL>.pdmodel --output_dir <OUTPUT_MODEL_DIR> --framework=paddle
@@ -44,13 +44,13 @@ Parameters to convert your model:
 * [Framework-agnostic parameters](Converting_Model.md): These parameters are used to convert a model trained with any supported framework.
 > **NOTE:** `--scale`, `--scale_values`, `--mean_values` are not supported in the current version of mo_paddle.

-### Example of Converting a Paddle* Model
-Below is the example command to convert yolo v3 Paddle\* network to OpenVINO IR network with Model Optimizer.
+### Example of Converting a PaddlePaddle Model
+Below is the example command to convert yolo v3 PaddlePaddle network to OpenVINO IR network with Model Optimizer.
 ```sh
 mo --model_name yolov3_darknet53_270e_coco --output_dir <OUTPUT_MODEL_DIR> --framework=paddle --data_type=FP32 --reverse_input_channels --input_shape=[1,3,608,608],[1,2],[1,2] --input=image,im_shape,scale_factor --output=save_infer_model/scale_0.tmp_1,save_infer_model/scale_1.tmp_1 --input_model=yolov3.pdmodel
 ```

-## Supported Paddle\* Layers
+## Supported PaddlePaddle Layers
 Refer to [Supported Framework Layers](../Supported_Frameworks_Layers.md) for the list of supported standard layers.

 ## Frequently Asked Questions (FAQ)
--- a/docs/MO_DG/prepare_model/convert_model/Convert_Model_From_PyTorch.md
+++ b/docs/MO_DG/prepare_model/convert_model/Convert_Model_From_PyTorch.md
@@ -40,6 +40,8 @@ Here is the list of models that are tested and guaranteed to be supported. Howev
  instruction which is used instead of steps 2 and 3 of [regular instructions](#typical-pytorch).
 * [BERT_NER](https://github.com/kamalkraj/BERT-NER) topology can be converted using steps described in [Convert PyTorch* BERT-NER to the IR](pytorch_specific/Convert_Bert_ner.md)
  instruction which is used instead of steps 2 and 3 of [regular instructions](#typical-pytorch).
+* ResNeXt-101 from [facebookresearch/semi-supervised-ImageNet1K-models](https://github.com/facebookresearch/semi-supervised-ImageNet1K-models)
+  can be converted using [regular instructions](#typical-pytorch).

 ## Typical steps to convert PyTorch\* model <a name="typical-pytorch"></a>

@@ -48,8 +50,8 @@ PyTorch* framework is supported through export to ONNX\* format. A summary of th
 1. [Configure the Model Optimizer](../../Deep_Learning_Model_Optimizer_DevGuide.md) for ONNX\*.
 2. [Export PyTorch model to ONNX\*](#export-to-onnx).
 3. [Convert an ONNX\* model](Convert_Model_From_ONNX.md) to produce an optimized [Intermediate Representation (IR)](../../IR_and_opsets.md) of the model based on the trained network topology, weights, and biases values.
-4. Test the model in the Intermediate Representation format using the [Inference Engine](../../../OV_Runtime_UG/Deep_Learning_Inference_Engine_DevGuide.md) in the target environment via provided [sample applications](../../../OV_Runtime_UG/Samples_Overview.md).
-5. [Integrate](../../../OV_Runtime_UG/Samples_Overview.md) the Inference Engine in your application to deploy the model in the target environment.
+4. Test the model in the Intermediate Representation format using the [OpenVINO™ Runtime](../../../OV_Runtime_UG/openvino_intro.md) in the target environment via provided [sample applications](../../../OV_Runtime_UG/Samples_Overview.md).
+5. [Integrate OpenVINO Runtime](../../../OV_Runtime_UG/Samples_Overview.md) in your application to deploy the model in the target environment.

 ## Export PyTorch\* Model to ONNX\* Format <a name="export-to-onnx"></a>

--- a/docs/MO_DG/prepare_model/convert_model/Convert_Model_From_TensorFlow.md
+++ b/docs/MO_DG/prepare_model/convert_model/Convert_Model_From_TensorFlow.md
@@ -31,14 +31,14 @@ A summary of the steps for optimizing and deploying a model that was trained wit
 1. [Configure the Model Optimizer](../../Deep_Learning_Model_Optimizer_DevGuide.md) for TensorFlow\* (TensorFlow was used to train your model).
 2. [Freeze the TensorFlow model](#freeze-the-tensorflow-model) if your model is not already frozen or skip this step and use the [instruction](#loading-nonfrozen-models) to a convert a non-frozen model.
 3. [Convert a TensorFlow\* model](#Convert_From_TF) to produce an optimized [Intermediate Representation (IR)](../../IR_and_opsets.md) of the model based on the trained network topology, weights, and biases values.
-4. Test the model in the Intermediate Representation format using the [Inference Engine](../../../OV_Runtime_UG/Deep_Learning_Inference_Engine_DevGuide.md) in the target environment via provided [sample applications](../../../OV_Runtime_UG/Samples_Overview.md).
-5. [Integrate](../../../OV_Runtime_UG/Samples_Overview.md) the Inference Engine in your application to deploy the model in the target environment.
+4. Test the model in the Intermediate Representation format using the [OpenVINO™ Runtime](../../../OV_Runtime_UG/openvino_intro.md) in the target environment via provided [sample applications](../../../OV_Runtime_UG/Samples_Overview.md).
+5. [Integrate OpenVINO Runtime](../../../OV_Runtime_UG/Samples_Overview.md) in your application to deploy the model in the target environment.

 ## Supported Topologies

 **Supported Non-Frozen Topologies with Links to the Associated Slim Model Classification Download Files**

-Detailed information on how to convert models from the <a href="https://github.com/tensorflow/models/tree/master/research/slim/README.md">TensorFlow\*-Slim Image Classification Model Library</a> is available in the [Converting TensorFlow*-Slim Image Classification Model Library Models](tf_specific/Convert_Slim_Library_Models.md) chapter. The table below contains list of supported TensorFlow\*-Slim Image Classification Model Library models and required mean/scale values. The mean values are specified as if the input image is read in BGR channels order layout like Inference Engine classification sample does.
+Detailed information on how to convert models from the <a href="https://github.com/tensorflow/models/tree/master/research/slim/README.md">TensorFlow\*-Slim Image Classification Model Library</a> is available in the [Converting TensorFlow*-Slim Image Classification Model Library Models](tf_specific/Convert_Slim_Library_Models.md) chapter. The table below lists the supported TensorFlow\*-Slim Image Classification Model Library models and the required mean/scale values. The mean values are specified as if the input image is read in BGR channel order, as the OpenVINO classification sample does.

 | Model Name| Slim Model Checkpoint File| \-\-mean_values | \-\-scale|
 | ------------- | ------------ | ------------- | -----:|
@@ -354,7 +354,7 @@ TensorFlow*-specific parameters:
 mo --input_model inception_v1.pb -b 1 --tensorboard_logdir /tmp/log_dir --output_dir <OUTPUT_MODEL_DIR>
 ```

-* Launching the Model Optimizer for a model with custom TensorFlow operations (refer to the [TensorFlow* documentation](https://www.tensorflow.org/extend/adding_an_op)) implemented in C++ and compiled into the shared library `my_custom_op.so`. Model Optimizer falls back to TensorFlow to infer output shape of operations implemented in the library if a custom TensorFlow operation library is provided. If it is not provided, a custom operation with an inference function is needed. For more information about custom operations, refer to the [Custom Layers Guide](../../../HOWTO/Custom_Layers_Guide.md).
+* Launching the Model Optimizer for a model with custom TensorFlow operations (refer to the [TensorFlow* documentation](https://www.tensorflow.org/extend/adding_an_op)) implemented in C++ and compiled into the shared library `my_custom_op.so`. Model Optimizer falls back to TensorFlow to infer output shape of operations implemented in the library if a custom TensorFlow operation library is provided. If it is not provided, a custom operation with an inference function is needed. For more information about custom operations, refer to the [OpenVINO™ Extensibility Mechanism](../../../Extensibility_UG/Intro.md).
 ```sh
 mo --input_model custom_model.pb --tensorflow_custom_layer_libraries ./my_custom_op.so --output_dir <OUTPUT_MODEL_DIR>
 ```
--- a/docs/MO_DG/prepare_model/convert_model/Converting_Model.md
+++ b/docs/MO_DG/prepare_model/convert_model/Converting_Model.md
@@ -24,7 +24,7 @@
 To convert the model to the Intermediate Representation (IR), run Model Optimizer using the following command:

 ```sh
-mo --input_model INPUT_MODEL --output_dir <OUTPUT_MODEL_DIR>
+mo --input_model INPUT_MODEL
 ```

 The output directory must have write permissions, so you can run Model Optimizer from the output directory or specify an output path with the `--output_dir` option.
@@ -37,6 +37,7 @@ Framework-specific parameters for:
 * [TensorFlow](Convert_Model_From_TensorFlow.md)
 * [MXNet](Convert_Model_From_MxNet.md)
 * [ONNX](Convert_Model_From_ONNX.md)
+* [PaddlePaddle](Convert_Model_From_Paddle.md)
 * [Kaldi](Convert_Model_From_Kaldi.md)


@@ -70,12 +71,9 @@ Framework-agnostic parameters:
                        square brackets, for example [1,3,227,227] or
                        (1,227,227,3), where the order of dimensions depends
                        on the framework input layout of the model. For
-                        example, [N,C,H,W] is used for Caffe* models and
-                        [N,H,W,C] for TensorFlow* models. Model Optimizer
-                        performs necessary transformations to convert the
-                        shape to the layout required by Inference Engine
-                        (N,C,H,W). The shape should not contain undefined
-                        dimensions (? or -1) and should fit the dimensions
+                        example, [N,C,H,W] is used for ONNX* models and
+                        [N,H,W,C] for TensorFlow* models. The shape can contain 
+                        undefined dimensions (? or -1) and should fit the dimensions
                        defined in the input operation of the graph. Boundaries 
                        of undefined dimension can be specified with ellipsis, 
                        for example [1,1..10,128,128]. One boundary can be undefined, 
@@ -155,13 +153,12 @@ Framework-agnostic parameters:
                        original model is in FP32 and --data_type=FP16 is
                        specified, all model weights and biases are compressed
                        to FP16.
-  --disable_fusing      Turn off fusing of linear operations to Convolution
+  --disable_fusing      [DEPRECATED] Turn off fusing of linear operations to Convolution.
  --disable_resnet_optimization
-                        Turn off resnet optimization
+                        [DEPRECATED] Turn off ResNet optimization.
  --finegrain_fusing FINEGRAIN_FUSING
-                        Regex for layers/operations that won't be fused.
+                        [DEPRECATED] Regex for layers/operations that won't be fused.
                        Example: --finegrain_fusing Convolution1,.*Scale.*
-  --disable_gfusing     Turn off fusing of grouped convolutions
  --enable_concat_optimization
                        Turn on Concat optimization.
  --extensions EXTENSIONS
@@ -184,9 +181,9 @@ Framework-agnostic parameters:
  --static_shape        Enables IR generation for fixed input shape (folding
                        `ShapeOf` operations and shape-calculating sub-graphs
                        to `Constant`). Changing model input shape using
-                        the Inference Engine API in runtime may fail for such an IR.
+                        the OpenVINO Runtime API in runtime may fail for such an IR.
  --disable_weights_compression
-                        Disable compression and store weights with original
+                        [DEPRECATED] Disable compression and store weights with original
                        precision.
  --progress            Enable model conversion progress display.
  --stream_output       Switch model conversion progress display to a
@@ -194,8 +191,13 @@ Framework-agnostic parameters:
  --transformations_config TRANSFORMATIONS_CONFIG
                        Use the configuration file with transformations
                        description.
-  --use_new_frontend    Force the usage of new frontend API for model processing.
-  --use_legacy_frontend Force the usage of legacy API for model processing.
+  --use_new_frontend    Force the usage of new Frontend of Model Optimizer for model conversion into IR.
+                        The new Frontend is C++ based and is available for ONNX* and PaddlePaddle* models.
+                        Model optimizer uses new Frontend for ONNX* and PaddlePaddle* by default that means
+                        `--use_new_frontend` and `--use_legacy_frontend` options are not specified.
+  --use_legacy_frontend Force the usage of legacy Frontend of Model Optimizer for model conversion into IR.
+                        The legacy Frontend is Python based and is available for TensorFlow*, ONNX*, MXNet*,
+                        Caffe*, and Kaldi* models.
 ```

 The sections below provide details on using particular parameters and examples of CLI commands.
@@ -205,7 +207,7 @@ Usually neural network models are trained with the normalized input data. This m
 * The input pre-processing operations are a part of a topology. In this case, the application that uses the framework to infer the topology does not pre-process the input.
 * The input pre-processing operations are not a part of a topology and the pre-processing is performed within the application which feeds the model with an input data.
 
-In the first case, the Model Optimizer generates the IR with required pre-processing layers and Inference Engine samples may be used to infer the model. 
+In the first case, the Model Optimizer generates the IR with the required pre-processing operations, and OpenVINO samples may be used to infer the model.
 
 In the second case, information about mean/scale values should be provided to the Model Optimizer to embed it to the generated IR. Model Optimizer provides a number of command line parameters to specify them: `--mean`, `--scale`, `--scale_values`, `--mean_values`. 

@@ -217,67 +219,64 @@ There is no a universal recipe for determining the mean/scale values for a parti
 * Open the model in a visualization tool and check for layers performing subtraction or multiplication (like `Sub`, `Mul`, `ScaleShift`, `Eltwise` etc) of the input data. If such layers exist, pre-processing is probably part of the model.

 ## When to Specify Input Shapes <a name="when_to_specify_input_shapes"></a>
-There are situations when the input data shape for the model is not fixed, like for the fully-convolutional neural networks. In this case, for example, TensorFlow\* models contain `-1` values in the `shape` attribute of the `Placeholder` operation. Inference Engine does not support input layers with undefined size, so if the input shapes are not defined in the model, the Model Optimizer fails to convert the model. The solution is to provide the input shape(s) using the `--input` or `--input_shape` command line parameter for all input(s) of the model or provide the batch size using the `-b` command line parameter if the model contains just one input with undefined batch size only. In the latter case, the `Placeholder` shape for the TensorFlow\* model looks like this `[-1, 224, 224, 3]`. 
+There are situations when Model Optimizer is unable to deduce input shapes of the model, for example, in case of model cutting due to unsupported operations.
+The solution is to provide input shapes of a static rank explicitly.

 ## When to Reverse Input Channels <a name="when_to_reverse_input_channels"></a>
-Input data for your application can be of RGB or BRG color input order. For example, Inference Engine samples load input images in the BGR channels order. However, the model may be trained on images loaded with the opposite order (for example, most TensorFlow\* models are trained with images in RGB order). In this case, inference results using the Inference Engine samples may be incorrect. The solution is to provide `--reverse_input_channels` command line parameter. Taking this parameter, the Model Optimizer performs first convolution or other channel dependent operation weights modification so these operations output will be like the image is passed with RGB channels order.
+Input data for your application can be in RGB or BGR color order. For example, OpenVINO samples load input images in BGR channel order. However, the model may be trained on images loaded in the opposite order (for example, most TensorFlow\* models are trained with images in RGB order). In this case, inference results using the OpenVINO samples may be incorrect. The solution is to provide the `--reverse_input_channels` command line parameter. With this parameter, the Model Optimizer modifies the weights of the first convolution or other channel-dependent operation so that the output of these operations is the same as if the image were passed in RGB channel order.

 ## When to Specify `--static_shape` Command Line Parameter
 If the `--static_shape` command line parameter is specified the Model Optimizer evaluates shapes of all operations in the model (shape propagation) for a fixed input(s) shape(s). During the shape propagation the Model Optimizer evaluates operations *Shape* and removes them from the computation graph. With that approach, the initial model which can consume inputs of different shapes may be converted to IR working with the input of one fixed shape only. For example, consider the case when some blob is reshaped from 4D of a shape *[N, C, H, W]* to a shape *[N, C, H \* W]*. During the model conversion the Model Optimize calculates output shape as a constant 1D blob with values *[N, C, H \* W]*. So if the input shape changes to some other value *[N,C,H1,W1]* (it is possible scenario for a fully convolutional model) then the reshape layer becomes invalid.
-Resulting Intermediate Representation will not be resizable with the help of Inference Engine.
+The resulting Intermediate Representation cannot be reshaped using the OpenVINO Runtime API.

 ## Examples of CLI Commands

 Launch the Model Optimizer for the Caffe bvlc_alexnet model with debug log level:
 ```sh
-mo --input_model bvlc_alexnet.caffemodel --log_level DEBUG --output_dir <OUTPUT_MODEL_DIR>
+mo --input_model bvlc_alexnet.caffemodel --log_level DEBUG
 ```

 Launch the Model Optimizer for the Caffe bvlc_alexnet model with the output IR called `result.*` in the specified `output_dir`:
 ```sh
-mo --input_model bvlc_alexnet.caffemodel --model_name result --output_dir /../../models/
+mo --input_model bvlc_alexnet.caffemodel --model_name result --output_dir <OUTPUT_MODEL_DIR>
 ```

 Launch the Model Optimizer for the Caffe bvlc_alexnet model with one input with scale values:
 ```sh
-mo --input_model bvlc_alexnet.caffemodel --scale_values [59,59,59] --output_dir <OUTPUT_MODEL_DIR>
+mo --input_model bvlc_alexnet.caffemodel --scale_values [59,59,59]
 ```

 Launch the Model Optimizer for the Caffe bvlc_alexnet model with multiple inputs with scale values:
 ```sh
-mo --input_model bvlc_alexnet.caffemodel --input data,rois --scale_values [59,59,59],[5,5,5] --output_dir <OUTPUT_MODEL_DIR>
+mo --input_model bvlc_alexnet.caffemodel --input data,rois --scale_values [59,59,59],[5,5,5]
 ```

 Launch the Model Optimizer for the Caffe bvlc_alexnet model with multiple inputs with scale and mean values specified for the particular nodes:
 ```sh
-mo --input_model bvlc_alexnet.caffemodel --input data,rois --mean_values data[59,59,59] --scale_values rois[5,5,5] --output_dir <OUTPUT_MODEL_DIR>
+mo --input_model bvlc_alexnet.caffemodel --input data,rois --mean_values data[59,59,59] --scale_values rois[5,5,5]
 ```

 Launch the Model Optimizer for the Caffe bvlc_alexnet model with specified input layer, overridden input shape, scale 5, batch 8 and specified name of an output operation:
 ```sh
-mo --input_model bvlc_alexnet.caffemodel --input "data[1 3 224 224]" --output pool5 -s 5 -b 8 --output_dir <OUTPUT_MODEL_DIR>
-```
-Launch the Model Optimizer for the Caffe bvlc_alexnet model with disabled fusing for linear operations to Convolution and grouped convolutions:
-```sh
-mo --input_model bvlc_alexnet.caffemodel --disable_fusing --disable_gfusing --output_dir <OUTPUT_MODEL_DIR>
+mo --input_model bvlc_alexnet.caffemodel --input data --output pool5 -s 5 -b 8
 ```

 Launch the Model Optimizer for the Caffe bvlc_alexnet model with reversed input channels order between RGB and BGR, specified mean values to be used for the input image per channel and specified data type for input tensor values:
 ```sh
-mo --input_model bvlc_alexnet.caffemodel --reverse_input_channels --mean_values [255,255,255] --data_type FP16 --output_dir <OUTPUT_MODEL_DIR>
+mo --input_model bvlc_alexnet.caffemodel --reverse_input_channels --mean_values [255,255,255] --data_type FP16
 ```

 Launch the Model Optimizer for the Caffe bvlc_alexnet model with extensions listed in specified directories, specified mean_images binaryproto 
- file. For more information about extensions, please refer to  the [Custom Layers Guide](../../../HOWTO/Custom_Layers_Guide.md).
+ file. For more information about extensions, please refer to the [OpenVINO™ Extensibility Mechanism](../../../Extensibility_UG/Intro.md).
 ```sh
-mo --input_model bvlc_alexnet.caffemodel --extensions /home/,/some/other/path/ --mean_file /path/to/binaryproto --output_dir <OUTPUT_MODEL_DIR>
+mo --input_model bvlc_alexnet.caffemodel --extensions /home/,/some/other/path/ --mean_file /path/to/binaryproto
 ```

 Launch the Model Optimizer for TensorFlow* FaceNet* model with a placeholder freezing value. 
 It replaces the placeholder with a constant layer that contains the passed value.
 For more information about FaceNet conversion, please refer to [this](tf_specific/Convert_FaceNet_From_Tensorflow.md) page.
 ```sh
-mo --input_model FaceNet.pb --input "phase_train->False" --output_dir <OUTPUT_MODEL_DIR>
+mo --input_model FaceNet.pb --input "phase_train->False"
 ```
 Launch the Model Optimizer for any model with a placeholder freezing tensor of values. 
 It replaces the placeholder with a constant layer that contains the passed values.
@@ -286,7 +285,7 @@ Tensor here is represented in square brackets with each value separated from ano
 If data type is set in the model, this tensor will be reshaped to a placeholder shape and casted to placeholder data type.
 Otherwise, it will be casted to data type passed to `--data_type` parameter (by default, it is FP32).
 ```sh
-mo --input_model FaceNet.pb --input "placeholder_layer_name->[0.1 1.2 2.3]" --output_dir <OUTPUT_MODEL_DIR>
+mo --input_model FaceNet.pb --input "placeholder_layer_name->[0.1 1.2 2.3]"
 ```


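The arithmetic behind `--reverse_input_channels`, `--mean_values` and `--scale_values` in `Converting_Model.md` can be illustrated with a small sketch. This is not Model Optimizer code: `preprocess_pixel` is a hypothetical helper, and the order of steps (channel reversal first, then mean subtraction, then division by scale) is an assumption that should be checked against the `mo --help` output of your version. Model Optimizer embeds the equivalent operations into the IR rather than touching input data; the sketch only shows what the embedded operations would compute for one pixel.

```python
def preprocess_pixel(pixel, mean_values=None, scale_values=None, reverse_channels=False):
    """Illustrative only: reverse channels, subtract mean, divide by scale.

    `pixel` is a sequence of per-channel values. The step order is an
    assumption, not taken from Model Optimizer sources.
    """
    channels = list(pixel)
    if reverse_channels:
        # RGB <-> BGR swap only makes sense for 3-channel inputs.
        if len(channels) != 3:
            raise ValueError("channel reversal expects exactly 3 channels")
        channels.reverse()
    if mean_values is not None:
        # Per-channel mean subtraction.
        channels = [c - m for c, m in zip(channels, mean_values)]
    if scale_values is not None:
        # Per-channel division by the scale value.
        channels = [c / s for c, s in zip(channels, scale_values)]
    return channels


# A BGR pixel fed to a model trained on RGB data.
bgr_pixel = [10, 20, 30]
normalized = preprocess_pixel(
    bgr_pixel,
    mean_values=[25, 10, 5],
    scale_values=[5, 5, 5],
    reverse_channels=True,
)
print(normalized)  # [1.0, 2.0, 1.0]
```

Under this assumed order, the mean and scale values are given in the channel order of the original model, because the reversal has already happened by the time they are applied.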
--- a/docs/MO_DG/prepare_model/convert_model/Cutting_Model.md
+++ b/docs/MO_DG/prepare_model/convert_model/Cutting_Model.md
@@ -6,10 +6,10 @@ Sometimes some parts of a model must be removed while the Model Optimizer is con

 The following examples are the situations when model cutting is useful or even required:

-*   model has pre- or post-processing parts that cannot be translated to existing Inference Engine layers.
+*   model has pre- or post-processing parts that cannot be translated to existing OpenVINO operations.
 *   model has a training part that is convenient to be kept in the model, but not used during inference.
 *   model is too complex (contains lots of unsupported operations that cannot be easily implemented as custom layers), so the complete model cannot be converted in one shot.
-*   problem with model conversion in the Model Optimizer or inference in the Inference Engine occurred. To localize the issue, limit the scope for conversion by iteratively searching for problematic places in the model.
+*   problem with model conversion in the Model Optimizer or inference in the OpenVINO Runtime occurred. To localize the issue, limit the scope for conversion by iteratively searching for problematic places in the model.
 *   single custom layer or a combination of custom layers is isolated for debugging purposes.

 ## Command-Line Options
--- a/docs/MO_DG/prepare_model/convert_model/IR_suitable_for_INT8_inference.md
+++ b/docs/MO_DG/prepare_model/convert_model/IR_suitable_for_INT8_inference.md
@@ -2,7 +2,7 @@

 ## Introduction

-Inference Engine CPU and GPU plugin can infer models in the low precision. 
+OpenVINO Runtime CPU and GPU devices can infer models in low precision.
 For details, refer to [Low Precision Inference on the CPU](../../../OV_Runtime_UG/Int8Inference.md).

 Intermediate Representation (IR) should be specifically formed to be suitable for low precision inference. 
--- a/docs/MO_DG/prepare_model/convert_model/mxnet_specific/Convert_Style_Transfer_From_MXNet.md
+++ b/docs/MO_DG/prepare_model/convert_model/mxnet_specific/Convert_Style_Transfer_From_MXNet.md
@@ -114,4 +114,4 @@ cp models/13_decoder_auxs.nd nst_model
 ```sh
 mo --input_symbol <path/to/nst_model>/nst_vgg19-symbol.json --framework mxnet --output_dir <path/to/output_dir> --input_shape [1,3,224,224] --nd_prefix_name 13_decoder --pretrained_model <path/to/nst_model>/vgg19-0000.params
 ```
-4. The IR is generated (`.bin`, `.xml` and `.mapping` files) in the specified output directory and ready to be consumed by the Inference Engine. 
+4. The IR is generated (`.bin`, `.xml` and `.mapping` files) in the specified output directory and ready to be consumed by the OpenVINO Runtime. 
--- a/docs/MO_DG/prepare_model/convert_model/tf_specific/Convert_GNMT_From_Tensorflow.md
+++ b/docs/MO_DG/prepare_model/convert_model/tf_specific/Convert_GNMT_From_Tensorflow.md
@@ -244,7 +244,7 @@ python3 benchmark_app.py -m <path to the generated GNMT IR> -d CPU
 ```


-2. With Inference Engine Python API:
+2. With OpenVINO Runtime Python API:

 > **NOTE**: Before running the example, insert a path to your GNMT `.xml` and `.bin` files into `MODEL_PATH` and `WEIGHTS_PATH`, and fill `input_data_tensor` and `seq_lengths` tensors according to your input data.

@@ -274,4 +274,4 @@ exec_net = ie.load_network(network=net, device_name="CPU")
 result_ie = exec_net.infer(input_data)
 ```

-For more information about Python API, refer to [Inference Engine Python API](ie_python_api/api.html).
+For more information about Python API, refer to [OpenVINO Runtime Python API](ie_python_api/api.html).
--- a/docs/MO_DG/prepare_model/convert_model/tf_specific/Convert_Object_Detection_API_Models.md
+++ b/docs/MO_DG/prepare_model/convert_model/tf_specific/Convert_Object_Detection_API_Models.md
@@ -2,8 +2,8 @@

 > **NOTES**:
 > * Starting with the 2022.1 release, the Model Optimizer can convert the TensorFlow\* Object Detection API Faster and Mask RCNNs topologies differently. By default, the Model Optimizer adds operation "Proposal" to the generated IR. This operation needs an additional input to the model with name "image_info" which should be fed with several values describing the pre-processing applied to the input image (refer to the [Proposal](../../../../ops/detection/Proposal_4.md) operation specification for more information). However, this input is redundant for the models trained and inferred with equal size images. Model Optimizer can generate IR for such models and insert operation [DetectionOutput](../../../../ops/detection/DetectionOutput_1.md) instead of `Proposal`. The `DetectionOutput` operation does not require additional model input "image_info" and moreover, for some models the produced inference results are closer to the original TensorFlow\* model. In order to trigger new behaviour the attribute "operation_to_add" in the corresponding JSON transformation configuration file should be set to value "DetectionOutput" instead of default one "Proposal".
-> * Starting with the 2021.1 release, the Model Optimizer converts the TensorFlow\* Object Detection API SSDs, Faster and Mask RCNNs topologies keeping shape-calculating sub-graphs by default, so topologies can be re-shaped in the Inference Engine using dedicated reshape API. Refer to [Using Shape Inference](../../../../OV_Runtime_UG/ShapeInference.md) for more information on how to use this feature. It is possible to change the both spatial dimensions of the input image and batch size.
-> * To generate IRs for TF 1 SSD topologies, the Model Optimizer creates a number of `PriorBoxClustered` operations instead of a constant node with prior boxes calculated for the particular input image size. This change allows you to reshape the topology in the Inference Engine using dedicated Inference Engine API. The reshaping is supported for all SSD topologies except FPNs which contain hardcoded shapes for some operations preventing from changing topology input shape.
+> * Starting with the 2021.1 release, the Model Optimizer converts the TensorFlow\* Object Detection API SSDs, Faster and Mask RCNNs topologies keeping shape-calculating sub-graphs by default, so topologies can be reshaped in the OpenVINO Runtime using the dedicated reshape API. Refer to [Using Shape Inference](../../../../OV_Runtime_UG/ShapeInference.md) for more information on how to use this feature. It is possible to change both the spatial dimensions of the input image and the batch size.
+> * To generate IRs for TF 1 SSD topologies, the Model Optimizer creates a number of `PriorBoxClustered` operations instead of a constant node with prior boxes calculated for the particular input image size. This change allows you to reshape the topology in the OpenVINO Runtime using the dedicated API. Reshaping is supported for all SSD topologies except FPNs, which contain hardcoded shapes for some operations that prevent changing the topology input shape.

 ## How to Convert a Model

@@ -45,7 +45,7 @@ To convert a TensorFlow\* Object Detection API model, go to the `<INSTALL_DIR>/t
 * `--tensorflow_object_detection_api_pipeline_config <path_to_pipeline.config>` --- A special configuration file that describes the topology hyper-parameters and structure of the TensorFlow Object Detection API model. For the models downloaded from the TensorFlow\* Object Detection API zoo, the configuration file is named `pipeline.config`. If you plan to train a model yourself, you can find templates for these files in the [models repository](https://github.com/tensorflow/models/tree/master/research/object_detection/samples/configs).
 * `--input_shape` (optional) --- A custom input image shape. Refer to [Custom Input Shape](#tf_od_custom_input_shape) for more information how the `--input_shape` parameter is handled for the TensorFlow* Object Detection API models.

-> **NOTE**: The color channel order (RGB or BGR) of an input data should match the channel order of the model training dataset. If they are different, perform the `RGB<->BGR` conversion specifying the command-line parameter: `--reverse_input_channels`. Otherwise, inference results may be incorrect. If you convert a TensorFlow\* Object Detection API model to use with the Inference Engine sample applications, you must specify the `--reverse_input_channels` parameter. For more information about the parameter, refer to **When to Reverse Input Channels** section of [Converting a Model to Intermediate Representation (IR)](../Converting_Model.md).
+> **NOTE**: The color channel order (RGB or BGR) of the input data should match the channel order of the model training dataset. If they differ, perform the `RGB<->BGR` conversion by specifying the command-line parameter `--reverse_input_channels`. Otherwise, inference results may be incorrect. If you convert a TensorFlow\* Object Detection API model to use with the OpenVINO sample applications, you must specify the `--reverse_input_channels` parameter. For more information about the parameter, refer to the **When to Reverse Input Channels** section of [Converting a Model to Intermediate Representation (IR)](../Converting_Model.md).

 Additionally to the mandatory parameters listed above you can use optional conversion parameters if needed. A full list of parameters is available in the [Converting a TensorFlow* Model](../Convert_Model_From_TensorFlow.md) topic.

@@ -57,24 +57,24 @@ mo --input_model=/tmp/ssd_inception_v2_coco_2018_01_28/frozen_inference_graph.pb

 ## OpenVINO™ Toolkit Samples and Open Model Zoo Demos

-Inference Engine comes with a number of samples to demonstrate use of OpenVINO API, additionally,
+OpenVINO comes with a number of samples that demonstrate the use of the OpenVINO Runtime API. Additionally,
 Open Model Zoo provides set of demo applications to show implementation of close to real life applications
 based on deep learning in various tasks, including Image Classifiacton, Visual Object Detection, Text Recognition,
 Speech Recognition, Natural Language Processing and others. Refer to the links below for more details.


-* [Inference Engine Samples](../../../../OV_Runtime_UG/Samples_Overview.md)
+* [OpenVINO Samples](../../../../OV_Runtime_UG/Samples_Overview.md)
 * [Open Model Zoo Demos](@ref omz_demos)

 ## Important Notes About Feeding Input Images to the Samples

 There are several important notes about feeding input images to the samples:

-1. Inference Engine samples stretch input image to the size of the input operation without preserving aspect ratio. This behavior is usually correct for most topologies (including SSDs), but incorrect for other models like Faster R-CNN, Mask R-CNN and R-FCN. These models usually use keeps aspect ratio resizer. The type of pre-processing is defined in the pipeline configuration file in the section `image_resizer`. If keeping aspect ratio is used, then it is necessary to resize image before passing it to the sample and optionally pad the resized image with 0s (if the attribute "pad_to_max_dimension" in the pipeline.config is equal to "true").
+1. OpenVINO samples stretch the input image to the size of the input operation without preserving the aspect ratio. This behavior is usually correct for most topologies (including SSDs), but incorrect for other models like Faster R-CNN, Mask R-CNN and R-FCN. These models usually use a resizer that keeps the aspect ratio. The type of pre-processing is defined in the pipeline configuration file in the section `image_resizer`. If the aspect ratio is kept, it is necessary to resize the image before passing it to the sample and optionally pad the resized image with 0s (if the attribute "pad_to_max_dimension" in the pipeline.config is equal to "true").

-2. TensorFlow\* implementation of image resize may be different from the one implemented in the sample. Even reading input image from compressed format (like `.jpg`) could give different results in the sample and TensorFlow\*. So, if it is necessary to compare accuracy between the TensorFlow\* and the Inference Engine it is recommended to pass pre-resized input image in a non-compressed format (like `.bmp`).
+2. The TensorFlow\* implementation of image resize may differ from the one implemented in the sample. Even reading an input image from a compressed format (like `.jpg`) could give different results in the sample and in TensorFlow\*. So, if it is necessary to compare accuracy between TensorFlow\* and OpenVINO, it is recommended to pass a pre-resized input image in a non-compressed format (like `.bmp`).

-3. If you want to infer the model with the Inference Engine samples, convert the model specifying the `--reverse_input_channels` command line parameter. The samples load images in BGR channels order, while TensorFlow* models were trained with images in RGB order. When the `--reverse_input_channels` command line parameter is specified, the Model Optimizer performs first convolution or other channel dependent operation weights modification so the output will be like the image is passed with RGB channels order.
+3. If you want to infer the model with the OpenVINO samples, convert the model specifying the `--reverse_input_channels` command line parameter. The samples load images in BGR channel order, while TensorFlow* models were trained with images in RGB order. When the `--reverse_input_channels` command line parameter is specified, the Model Optimizer modifies the weights of the first convolution or other channel-dependent operation, so the output is the same as if the image were passed in RGB channel order.

 4. Read carefully messaged printed by the Model Optimizer during a model conversion. They contain important instructions on how to prepare input data before running the inference and how to interpret the output.
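
The resizing described in item 1 can be sketched in a few lines. This is a minimal illustration, assuming the keep-aspect-ratio resizer scales the shorter side to `min_dimension` unless the longer side would then exceed `max_dimension`, and that padding is appended at the bottom and right. The function names `keep_aspect_ratio_size` and `zero_padding` are hypothetical and do not come from OpenVINO or TensorFlow.

```python
def keep_aspect_ratio_size(height, width, min_dimension, max_dimension):
    """Return (new_height, new_width) with the aspect ratio preserved."""
    # First try to scale so that the shorter side equals min_dimension.
    scale = min_dimension / min(height, width)
    new_h, new_w = round(height * scale), round(width * scale)
    # If the longer side now exceeds max_dimension, scale to max_dimension instead.
    if max(new_h, new_w) > max_dimension:
        scale = max_dimension / max(height, width)
        new_h, new_w = round(height * scale), round(width * scale)
    return new_h, new_w


def zero_padding(new_h, new_w, max_dimension):
    """Rows and columns of zeros needed to reach a square max_dimension canvas.

    Assumption: padding is added at the bottom and right of the resized image.
    """
    return max_dimension - new_h, max_dimension - new_w


if __name__ == "__main__":
    size = keep_aspect_ratio_size(480, 640, min_dimension=600, max_dimension=1024)
    print(size, zero_padding(*size, max_dimension=1024))
```

The actual `min_dimension` and `max_dimension` values should be taken from the `image_resizer` section of the model's pipeline.config.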

--- a/docs/MO_DG/prepare_model/convert_model/tf_specific/Convert_Slim_Library_Models.md
+++ b/docs/MO_DG/prepare_model/convert_model/tf_specific/Convert_Slim_Library_Models.md
@@ -64,7 +64,7 @@ The `-b` command line parameter is required because the Model Optimizer cannot c
 Refer to the [Mean and Scale Values for TensorFlow\*-Slim Models](#tf_slim_mean_scale_values) for the information why `--mean_values` and `--scale` command line parameters are used.

 ## Mean and Scale Values for TensorFlow\*-Slim Models <a name="tf_slim_mean_scale_values"></a>
-The TensorFlow\*-Slim Models were trained with normalized input data. There are several different normalization algorithms used in the Slim library. Inference Engine classification sample does not perform image pre-processing except resizing to the input layer size. It is necessary to pass mean and scale values to the Model Optimizer so they are embedded into the generated IR in order to get correct classification results.
+The TensorFlow\*-Slim Models were trained with normalized input data. There are several different normalization algorithms used in the Slim library. The OpenVINO classification sample does not perform image pre-processing except resizing to the input layer size. It is necessary to pass mean and scale values to the Model Optimizer so they are embedded into the generated IR in order to get correct classification results.

 The file [preprocessing_factory.py](https://github.com/tensorflow/models/blob/master/research/slim/preprocessing/preprocessing_factory.py) contains a dictionary variable `preprocessing_fn_map` defining mapping between the model type and pre-processing function to be used. The function code should be analyzed to figure out the mean/scale values. 

@@ -83,7 +83,7 @@ The [inception_preprocessing.py](https://github.com/tensorflow/models/blob/maste

 Firstly, the `image` is converted to data type `tf.float32` and the values in the tensor are scaled to the `[0, 1]` range using the [tf.image.convert_image_dtype](https://www.tensorflow.org/api_docs/python/tf/image/convert_image_dtype) function. Then the `0.5` is subtracted from the image values and values multiplied by `2.0`. The final image range of values is `[-1, 1]`.

-Inference Engine classification sample reads an input image as a three-dimensional array of integer values from the range `[0, 255]`. In order to scale them to `[-1, 1]` range, the mean value `127.5` for each image channel should be specified as well as scale factor `127.5`.
+The OpenVINO classification sample reads an input image as a three-dimensional array of integer values from the range `[0, 255]`. To scale them to the `[-1, 1]` range, the mean value `127.5` for each image channel should be specified, as well as the scale factor `127.5`.

 Similarly, the mean/scale values can be determined for other Slim models.
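
The arithmetic behind the `127.5` values can be checked with a short sketch. It assumes the embedded pre-processing computes `(pixel - mean) / scale` and compares that with the Inception-style steps described above (scale to `[0, 1]`, subtract `0.5`, multiply by `2.0`). The function names are hypothetical and used for illustration only.

```python
MEAN = 127.5   # per-channel mean value passed to the Model Optimizer
SCALE = 127.5  # scale factor passed to the Model Optimizer


def normalize_embedded(pixel):
    """Pre-processing assumed to be embedded into the IR: (x - mean) / scale."""
    return (pixel - MEAN) / SCALE


def normalize_slim(pixel):
    """Inception-style pre-processing: [0, 255] -> [0, 1] -> [-1, 1]."""
    return (pixel / 255.0 - 0.5) * 2.0


if __name__ == "__main__":
    # Both formulas should agree on every possible 8-bit pixel value.
    worst = max(abs(normalize_embedded(p) - normalize_slim(p)) for p in range(256))
    print(normalize_embedded(0), normalize_embedded(255), worst)
```

The two formulas are algebraically identical, since `(x / 255 - 0.5) * 2 = (x - 127.5) / 127.5`. The same approach of reading the pre-processing function and solving for mean and scale applies to other Slim models.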

--- a/docs/MO_DG/prepare_model/customize_model_optimizer/Customize_Model_Optimizer.md
+++ b/docs/MO_DG/prepare_model/customize_model_optimizer/Customize_Model_Optimizer.md
@@ -177,9 +177,9 @@ defined as a mathematical expression using the [ShapeOf](../../../ops/shape/Shap
 Model Optimizer calculates output shapes for all operations in a model to write them to Intermediate Representation
 files.

-> **NOTE**: This is a legacy requirement because starting from IR version 10 Inference Engine needs to know shapes of
+> **NOTE**: This is a legacy requirement because starting from IR version 10 OpenVINO Runtime needs to know shapes of
 > the [Const](../../../ops/infrastructure/Constant_1.md) and the [Parameter](../../../ops/infrastructure/Parameter_1.md)
-> operations only. The nGraph component of the Inference Engine calculates output shapes for all operations in a model
+> operations only. The OpenVINO Runtime calculates output shapes for all operations in a model
 > using shapes of [Parameter](../../../ops/infrastructure/Parameter_1.md) and
 > [Const](../../../ops/infrastructure/Constant_1.md) operations defined with respective operation attributes.

@@ -257,11 +257,13 @@ More information on how to develop middle transformations and dedicated API desc
 [Middle Phase Transformations](#middle-phase-transformations).

 ### NHWC to NCHW Layout Change <a name="layout-change"></a>
-There are several middle transformations responsible for changing model layout from NHWC to NCHW. These transformations
-are triggered by default for TensorFlow\* models only because it is the only framework with Convolution operations in
-NHWC layout. This layout change is disabled if the model does not have operations that OpenVINO&trade needs to execute in
-NCHW layout, for example, Convolutions in NHWC layout. It is still possible to force Model Optimizer to do layout change
-using `--disable_nhwc_to_nchw` command-line parameter.
+
+There are several middle transformations responsible for changing model layout from NHWC to NCHW. These transformations are triggered by default for TensorFlow models as TensorFlow supports Convolution operations in the NHWC layout.
+
+This layout change is disabled automatically if the model does not have operations that OpenVINO&trade; needs to execute in the NCHW layout, for example, Convolutions in NHWC layout.
+
+It is still possible to force Model Optimizer to do the layout change using the `--disable_nhwc_to_nchw` command-line parameter, although it is not advised.
+

 The layout change is a complex problem and detailed explanation of it is out of this document scope. A very brief
 explanation of this process is provided below:
@@ -301,7 +303,7 @@ The last phase of a model conversion is the Intermediate Representation emitting
 steps:

 1. Iterates over all operation nodes in the graph and checks that all nodes have the `type` attribute set. This attribute
-defines the operation type and is used in the Inference Engine to instantiate proper operation from the
+defines the operation type and is used by OpenVINO to instantiate the proper operation from the
 [opset](@ref openvino_docs_ops_opset) specified in the `version` attribute of the node. If some node does not have
 attribute `type` or its values is equal to `None`, the Model Optimizer exits with an error.
 2. Performs type inference of graph operations similar to the shape inference. Inferred data types are saved to a port
@@ -741,8 +743,7 @@ sub-graph of the original graph isomorphic to the specified pattern.
 2. [Specific Operation Front Phase Transformations](#specific-operation-front-phase-transformations) triggered for the
 node with a specific `op` attribute value.
 3. [Generic Front Phase Transformations](#generic-front-phase-transformations).
-4. Manually enabled transformation defined with a JSON configuration file (for TensorFlow\*, ONNX\* and MXNet\* models
-only) specified using the `--transformations_config` command line parameter:
+4. Manually enabled transformation defined with a JSON configuration file (for TensorFlow, ONNX, MXNet, and PaddlePaddle models) specified using the `--transformations_config` command line parameter:
    1. [Node Name Pattern Front Phase Transformations](#node-name-pattern-front-phase-transformation).
    2. [Front Phase Transformations Using Start and End Points](#start-end-points-front-phase-transformations).
    3. [Generic Front Phase Transformations Enabled with Transformations Configuration File](#generic-transformations-config-front-phase-transformations).
@@ -1260,5 +1261,5 @@ Refer to the `extensions/back/GatherNormalizer.py` for the example of a such typ
 * [Deep Learning Network Intermediate Representation and Operation Sets in OpenVINO™](../../IR_and_opsets.md)
 * [Converting a Model to Intermediate Representation (IR)](../convert_model/Converting_Model.md)
 * [OpenVINO Model Representation](../../../OV_Runtime_UG/model_representation.md)
-* [Inference Engine Extensibility Mechanism](../../../OV_Runtime_UG/Extensibility_DG/Intro.md)
+* [OpenVINO™ Extensibility Mechanism](../../../Extensibility_UG/Intro.md)
 * [Extending the Model Optimizer with Caffe* Python Layers](Extending_Model_Optimizer_with_Caffe_Python_Layers.md)
--- a/docs/OV_Runtime_UG/API_Changes.md
+++ b/docs/OV_Runtime_UG/API_Changes.md
@@ -1,756 +1,9 @@
-# Inference Engine API Changes History {#openvino_docs_IE_DG_API_Changes}
+# OpenVINO™ Runtime API Changes History {#openvino_docs_OV_Runtime_API_Changes}

-The sections below contain detailed list of changes made to the Inference Engine API in recent releases.
+The sections below contain a detailed list of changes made to the OpenVINO™ Runtime API in recent releases.

-## 2021.4
+## 2022.1

 ### New API

-* InferenceEngine::Core::LoadNetwork(modelPath, deviceName, config) simplified API to read and load network in one call
-
-### Deprecated API
-
- **InferenceEngine::Parameter**
-
- * InferenceEngine::Parameter(const std::shared_ptr<ngraph::Variant>&)
- * InferenceEngine::Parameter(std::shared_ptr<ngraph::Variant>& var)
- * std::shared_ptr<ngraph::Variant> InferenceEngine::Parameter::asVariant() const
- * InferenceEngine::Parameter::operator std::shared_ptr<ngraph::Variant>() const
-
- **GPU plugin configuration keys**
- * KEY_CLDNN_NV12_TWO_INPUTS GPU plugin option. Use KEY_GPU_NV12_TWO_INPUTS instead
- * KEY_CLDNN_PLUGIN_PRIORITY GPU plugin option. Use KEY_GPU_PLUGIN_PRIORITY instead
- * KEY_CLDNN_PLUGIN_THROTTLE GPU plugin option. Use KEY_GPU_PLUGIN_THROTTLE instead
- * KEY_CLDNN_MEM_POOL GPU plugin option
- * KEY_CLDNN_GRAPH_DUMPS_DIR GPU plugin option
- * KEY_CLDNN_SOURCES_DUMPS_DIR GPU plugin option
- * KEY_DUMP_KERNELS GPU plugin option
- * KEY_TUNING_MODE GPU plugin option
- * KEY_TUNING_FILE GPU plugin option
-
- **InferenceEngine::IInferRequest**
- * IInferRequest interface is deprecated, use InferRequest wrapper:
-  * Constructor for InferRequest from IInferRequest:: Ptr is deprecated
-  * Cast operator for InferRequest to IInferRequest shared pointer is deprecated
-
- **InferenceEngine::ICNNNetwork**
- * ICNNNetwork interface is deprecated by means of deprecation of all its methods, use CNNNetwork wrapper
-  * CNNNetwork methods working with ICNNNetwork are deprecated:
-  * Cast to ICNNNetwork shared pointer
-  * Cast to reference to ICNNNetwork interface
-  * Constructor from ICNNNetwork shared pointer
-
- **InferenceEngine::IExecutableNetwork**
- * IExecutableNetwork is deprecated, use ExecutableNetwork wrappers:
-  * Constructor of ExecutableNetwork from IExecutableNetwork shared pointer is deprecated
- * The following ExecutableNetwork methods are deprecated:
-  * ExecutableNetwork::reset
-  * Cast operator to IExecutableNetwork shared pointer
-  * ExecutableNetwork::CreateInferRequestPtr - use ExecutableNetwork::CreateInferRequest instead
-
- **Extensions API**
- * InferenceEngine::make_so_pointer which is used to create Extensions library is replaced by std::make_shared<Extension>(..)
- * InferenceEngine::IExtension::Release is deprecated with no replacement
- * Use IE_DEFINE_EXTENSION_CREATE_FUNCTION helper macro instead of explicit declaration of CreateExtension function, which create extension.
-
- **Other changes**
- * Version::ApiVersion structure is deprecated, Inference Engine does not have API version anymore
- * LowLatency - use lowLatency2 instead
- * CONFIG_KEY(DUMP_EXEC_GRAPH_AS_DOT) - use InferenceEngine::ExecutableNetwork::GetExecGraphInfo::serialize() instead
- * Core::ImportNetwork with no device - pass device name explicitly.
- * details::InferenceEngineException - use InferenceEngine::Exception and its derivatives instead.
-
-## 2021.3
-
-### New API
-
- * InferenceEngine::InferRequest::Cancel to cancel inference request execution
- * InferenceEngine::Layout::HWC to support HWC layout for input or output blobs
- * InferenceEngine::Precision::F64 data precision for f64 data type
- * InferenceEngine::CNNNetwork::getOVNameForTensor to map frameworks tensor names to OpenVINO internal tensor names
-
-### Deprecated API
-
- * InferenceEngine::IVariableState interface is deprecated, use InferenceEngine::VariableState wrapper
-
-## 2021.2
-
-### New API
-
- **State API**
-
- * InferenceEngine::InferRequest::QueryState query state value of network on current infer request
- * InferenceEngine::IVariableState class instead of IMemoryState (rename)
- * InferenceEngine::IVariableState::GetState instead of IMemoryState::GetLastState (rename)
-
- **BatchedBlob** - represents a InferenceEngine::BatchedBlob containing other blobs - one per batch.
-
- **Transformations API** - added a new header `ie_transformations.hpp` which contains transformations for InferenceEngine::CNNNetwork object. Such transformations can be called prior to loading network for compilation for particular device:
-
- * InferenceEngine::LowLatency
-
-### Deprecated API
-
- **State API**
-
- * InferenceEngine::ExecutableNetwork::QueryState - use InferenceEngine::InferRequest::QueryState
- * InferenceEngine::IVariableState::GetLastState - use InferenceEngine::IVariableState::GetState
-
-## 2021.1
-
-### Deprecated API
-
- **Utility functions to convert Unicode paths**
-
- * InferenceEngine::stringToFileName - use OS-specific native conversion functions
- * InferenceEngine::fileNameToString - use OS-specific native conversion functions
-
-### Removed API
-
- **Plugin API:**
-
- * InferenceEngine::InferencePlugin C++ plugin wrapper class
- * InferenceEngine::IInferencePlugin plugin interface
- * InferenceEngine::PluginDispatcher class
- * InferenceEngine::InferenceEnginePluginPtr typedef
- * InferenceEngine::ICNNNetReader reader interface
- * InferenceEngine::CNNNetReader class
-
- **Extensibility API:**
-
- * InferenceEngine::ILayerImplFactory class
- * InferenceEngine::IShapeInferImpl class
- * InferenceEngine::IShapeInferExtension class
- * InferenceEngine::IExtension::getFactoryFor(ILayerImplFactory\*& factory, const CNNLayer\* cnnLayer, ResponseDesc\* resp) noexcept method
- * InferenceEngine::IExtension::getPrimitiveTypes(char\*\*& types, unsigned int& size, ResponseDesc\* resp) noexcept method
- * InferenceEngine::ShapeInferImpl class
- * InferenceEngine::Extension::getFactoryFor(ILayerImplFactory\*& factory, const CNNLayer\* cnnLayer, ResponseDesc\* resp) noexcept method
- * InferenceEngine::Extension::getPrimitiveTypes(char\*\*& types, unsigned int& size, ResponseDesc\* resp) noexcept method
-
- **Network API:**
-
- * InferenceEngine::details::CNNNetworkIterator class
- * InferenceEngine::CNNNetwork::getPrecision() const method
- * InferenceEngine::CNNNetwork::getLayerByName(const char\* layerName) const method
- * InferenceEngine::CNNNetwork::size() const method
- * InferenceEngine::CNNNetwork::begin() const method
- * InferenceEngine::CNNNetwork::end() const method
- * InferenceEngine::CNNNetwork::AddExtension(const IShapeInferExtensionPtr& extension) method
- * InferenceEngine::ICNNNetwork::getPrecision() const noexcept method
- * InferenceEngine::ICNNNetwork::getName(char\* pName, size_t len) const noexcept method
- * InferenceEngine::ICNNNetwork::getData(const char\* dname) noexcept method
- * InferenceEngine::ICNNNetwork::addLayer(const CNNLayerPtr& layer) noexcept method
- * InferenceEngine::ICNNNetwork::getLayerByName(const char\* layerName, CNNLayerPtr& out, ResponseDesc\* resp) const noexcept method
- * InferenceEngine::ICNNNetwork::AddExtension(const IShapeInferExtensionPtr& extension, ResponseDesc\* resp) noexcept method
- * InferenceEngine::ICNNNetwork::getStats(ICNNNetworkStats\*\* stats, ResponseDesc\* resp) const noexcept method
- * InferenceEngine::ICNNNetworkStats class
- * InferenceEngine::NetworkNodeStats class
- * InferenceEngine::Data::getCreatorLayer() method
- * InferenceEngine::Data::getInputTo() method
- * InferenceEngine::LayerParams class
-
- **Layer API:**
-
- * InferenceEngine::CNNLayer class
- * InferenceEngine::WeightableLayer class
- * InferenceEngine::BatchNormalizationLayer class
- * InferenceEngine::BatchToSpaceLayer class
- * InferenceEngine::BinaryConvolutionLayer class
- * InferenceEngine::BroadcastLayer class
- * InferenceEngine::BucketizeLayer class
- * InferenceEngine::ClampLayer class
- * InferenceEngine::ConcatLayer class
- * InferenceEngine::ConvolutionLayer class
- * InferenceEngine::CropLayer class
- * InferenceEngine::DeconvolutionLayer class
- * InferenceEngine::DeformableConvolutionLayer class
- * InferenceEngine::DepthToSpaceLayer class
- * InferenceEngine::EltwiseLayer class
- * InferenceEngine::ExperimentalDetectronPriorGridGenerator class
- * InferenceEngine::ExperimentalDetectronPriorGridGeneratorLayer class
- * InferenceEngine::ExperimentalSparseWeightedReduceLayer class
- * InferenceEngine::FillLayer class
- * InferenceEngine::FullyConnectedLayer class
- * InferenceEngine::GRNLayer class
- * InferenceEngine::GRUCell class
- * InferenceEngine::GatherLayer class
- * InferenceEngine::GemmLayer class
- * InferenceEngine::LSTMCell class
- * InferenceEngine::MVNLayer class
- * InferenceEngine::MathLayer class
- * InferenceEngine::NonMaxSuppression class
- * InferenceEngine::NormLayer class
- * InferenceEngine::OneHotLayer class
- * InferenceEngine::PReLULayer class
- * InferenceEngine::PadLayer class
- * InferenceEngine::PoolingLayer class
- * InferenceEngine::PowerLayer class
- * InferenceEngine::QuantizeLayer class
- * InferenceEngine::RNNCell class
- * InferenceEngine::RNNCellBase class
- * InferenceEngine::RNNSequenceLayer class
- * InferenceEngine::RangeLayer class
- * InferenceEngine::ReLU6Layer class
- * InferenceEngine::ReLULayer class
- * InferenceEngine::ReduceLayer class
- * InferenceEngine::ReshapeLayer class
- * InferenceEngine::ReverseSequenceLayer class
- * InferenceEngine::ScaleShiftLayer class
- * InferenceEngine::ScatterLayer class
- * InferenceEngine::SelectLayer class
- * InferenceEngine::ShuffleChannelsLayer class
- * InferenceEngine::SoftMaxLayer class
- * InferenceEngine::SpaceToBatchLayer class
- * InferenceEngine::SpaceToDepthLayer class
- * InferenceEngine::SparseFillEmptyRowsLayer class
- * InferenceEngine::SparseSegmentReduceLayer class
- * InferenceEngine::SparseToDenseLayer class
- * InferenceEngine::SplitLayer class
- * InferenceEngine::StridedSliceLayer class
- * InferenceEngine::TensorIterator class
- * InferenceEngine::TileLayer class
- * InferenceEngine::TopKLayer class
- * InferenceEngine::UniqueLayer class
-
-## 2020.4
-
-### New API
-
- **CPU Plugin API:**
-
- * InferenceEngine::PluginConfigParams::KEY_ENFORCE_BF16 config key
-
- **Metrics and values for Query API:**
-
- * METRIC_KEY(OPTIMIZATION_CAPABILITIES)
-	 * METRIC_VALUE(BF16)
-
-### Deprecated API
-
- **MYRIAD Plugin API:**
-
- * VPU_CONFIG_KEY(IGNORE_IR_STATISTIC)
-
-### Removed API
-
- **Inference Engine NN Builder API:**
-
- * InferenceEngine::Builder::EltwiseLayer
- * InferenceEngine::Builder::MemoryLayer
- * InferenceEngine::Builder::ROIPoolingLayer
- * InferenceEngine::Builder::DeconvolutionLayer
- * InferenceEngine::Builder::ReLULayer
- * InferenceEngine::Builder::TanHLayer
- * InferenceEngine::Builder::InputLayer
- * InferenceEngine::Builder::PoolingLayer
- * InferenceEngine::Builder::CropLayer
- * InferenceEngine::Builder::GRUSequenceLayer
- * InferenceEngine::Builder::NormLayer
- * InferenceEngine::Builder::LSTMSequenceLayer
- * InferenceEngine::Builder::ClampLayer
- * InferenceEngine::Builder::PSROIPoolingLayer
- * InferenceEngine::Builder::Layer
- * InferenceEngine::Builder::RNNSequenceLayer
- * InferenceEngine::Builder::ReorgYoloLayer
- * InferenceEngine::Builder::NormalizeLayer
- * InferenceEngine::Builder::PriorBoxClusteredLayer
- * InferenceEngine::Builder::MVNLayer
- * InferenceEngine::Builder::PermuteLayer
- * InferenceEngine::Builder::SimplerNMSLayer
- * InferenceEngine::Builder::ConstLayer
- * InferenceEngine::Builder::DeformableConvolutionLayer
- * InferenceEngine::Builder::FullyConnectedLayer
- * InferenceEngine::Builder::PriorBoxLayer
- * InferenceEngine::Builder::SoftMaxLayer
- * InferenceEngine::Builder::OutputLayer
- * InferenceEngine::Builder::TileLayer
- * InferenceEngine::Builder::SplitLayer
- * InferenceEngine::Builder::PReLULayer
- * InferenceEngine::Builder::RegionYoloLayer
- * InferenceEngine::Builder::ReshapeLayer
- * InferenceEngine::Builder::ConvolutionLayer
- * InferenceEngine::Builder::DetectionOutputLayer
- * InferenceEngine::Builder::ConcatLayer
- * InferenceEngine::Builder::ELULayer
- * InferenceEngine::Builder::GRNLayer
- * InferenceEngine::Builder::LRNLayer
- * InferenceEngine::Builder::ArgMaxLayer
- * InferenceEngine::Builder::ReLU6Layer
- * InferenceEngine::Builder::ScaleShiftLayer
- * InferenceEngine::Builder::ProposalLayer
- * InferenceEngine::Builder::SigmoidLayer
- * InferenceEngine::Builder::ResampleLayer
- * InferenceEngine::Builder::CTCGreedyDecoderLayer
- * InferenceEngine::Builder::BatchNormalizationLayer
- * InferenceEngine::Builder::LayerDecorator
- * InferenceEngine::Builder::PowerLayer
- * InferenceEngine::Builder::Network
- * InferenceEngine::Builder::PortInfo
- * InferenceEngine::Builder::Connection
- * InferenceEngine::Builder::PortData
- * InferenceEngine::Builder::Port
- * InferenceEngine::Builder::ILayer
- * InferenceEngine::Builder::INetworkIterator
- * InferenceEngine::Builder::INetwork
- * InferenceEngine::Builder::ILayer
-
-## 2020.2
-
-### New API
-
- **Extensibility API:**
-
- * InferenceEngine::IExtension::getImplTypes(const std::shared_ptr<ngraph::Node>& node) method
- * InferenceEngine::IExtension::getImplementation(const std::shared_ptr<ngraph::Node>& node, const std::string& implType) method
-
-### Deprecated API
-
- **Extensibility API:**
-
- * InferenceEngine::ILayerImplFactory class
- * InferenceEngine::IShapeInferImpl class
- * InferenceEngine::IShapeInferImpl class
- * InferenceEngine::IShapeInferExtension class
- * InferenceEngine::IExtension::getFactoryFor(ILayerImplFactory\*& factory, const CNNLayer\* cnnLayer, ResponseDesc\* resp) noexcept method
- * InferenceEngine::IExtension::getPrimitiveTypes(char\*\*& types, unsigned int& size, ResponseDesc\* resp) noexcept method
- * InferenceEngine::ShapeInferImpl class
- * InferenceEngine::Extension::getFactoryFor(ILayerImplFactory\*& factory, const CNNLayer\* cnnLayer, ResponseDesc\* resp) noexcept method
- * InferenceEngine::Extension::getPrimitiveTypes(char\*\*& types, unsigned int& size, ResponseDesc\* resp) noexcept method
-
- **Network API:**
-
- * InferenceEngine::details::CNNNetworkIterator class
- * InferenceEngine::CNNNetwork::getPrecision() const method
- * InferenceEngine::CNNNetwork::getLayerByName(const char\* layerName) const method
- * InferenceEngine::CNNNetwork::size() const method
- * InferenceEngine::CNNNetwork::begin() const method
- * InferenceEngine::CNNNetwork::end() const method
- * InferenceEngine::CNNNetwork::AddExtension(const IShapeInferExtensionPtr& extension) method
- * InferenceEngine::ICNNNetwork::getPrecision() const noexcept method
- * InferenceEngine::ICNNNetwork::getName(char\* pName, size_t len) const noexcept method
- * InferenceEngine::ICNNNetwork::getData(const char\* dname) noexcept method
- * InferenceEngine::ICNNNetwork::addLayer(const CNNLayerPtr& layer) noexcept method
- * InferenceEngine::ICNNNetwork::getLayerByName(const char\* layerName, CNNLayerPtr& out, ResponseDesc\* resp) const noexcept method
- * InferenceEngine::ICNNNetwork::AddExtension(const IShapeInferExtensionPtr& extension, ResponseDesc\* resp) noexcept method
- * InferenceEngine::ICNNNetwork::getStats(ICNNNetworkStats\*\* stats, ResponseDesc\* resp) const noexcept method
- * InferenceEngine::ICNNNetworkStats class
- * InferenceEngine::NetworkNodeStats class
- * InferenceEngine::Data::getCreatorLayer() method
- * InferenceEngine::Data::getInputTo() method
- * InferenceEngine::LayerParams class
-
- **Layer API:**
-
- * InferenceEngine::CNNLayer class
- * InferenceEngine::WeightableLayer class
- * InferenceEngine::BatchNormalizationLayer class
- * InferenceEngine::BatchToSpaceLayer class
- * InferenceEngine::BinaryConvolutionLayer class
- * InferenceEngine::BroadcastLayer class
- * InferenceEngine::BucketizeLayer class
- * InferenceEngine::ClampLayer class
- * InferenceEngine::ConcatLayer class
- * InferenceEngine::ConvolutionLayer class
- * InferenceEngine::CropLayer class
- * InferenceEngine::DeconvolutionLayer class
- * InferenceEngine::DeformableConvolutionLayer class
- * InferenceEngine::DepthToSpaceLayer class
- * InferenceEngine::EltwiseLayer class
- * InferenceEngine::ExperimentalDetectronPriorGridGenerator class
- * InferenceEngine::ExperimentalDetectronPriorGridGeneratorLayer class
- * InferenceEngine::ExperimentalSparseWeightedReduceLayer class
- * InferenceEngine::FillLayer class
- * InferenceEngine::FullyConnectedLayer class
- * InferenceEngine::GRNLayer class
- * InferenceEngine::GRUCell class
- * InferenceEngine::GatherLayer class
- * InferenceEngine::GemmLayer class
- * InferenceEngine::LSTMCell class
- * InferenceEngine::MVNLayer class
- * InferenceEngine::MathLayer class
- * InferenceEngine::NonMaxSuppression class
- * InferenceEngine::NormLayer class
- * InferenceEngine::OneHotLayer class
- * InferenceEngine::PReLULayer class
- * InferenceEngine::PadLayer class
- * InferenceEngine::PoolingLayer class
- * InferenceEngine::PowerLayer class
- * InferenceEngine::QuantizeLayer class
- * InferenceEngine::RNNCell class
- * InferenceEngine::RNNCellBase class
- * InferenceEngine::RNNSequenceLayer class
- * InferenceEngine::RangeLayer class
- * InferenceEngine::ReLU6Layer class
- * InferenceEngine::ReLULayer class
- * InferenceEngine::ReduceLayer class
- * InferenceEngine::ReshapeLayer class
- * InferenceEngine::ReverseSequenceLayer class
- * InferenceEngine::ScaleShiftLayer class
- * InferenceEngine::ScatterLayer class
- * InferenceEngine::SelectLayer class
- * InferenceEngine::ShuffleChannelsLayer class
- * InferenceEngine::SoftMaxLayer class
- * InferenceEngine::SpaceToBatchLayer class
- * InferenceEngine::SpaceToDepthLayer class
- * InferenceEngine::SparseFillEmptyRowsLayer class
- * InferenceEngine::SparseSegmentReduceLayer class
- * InferenceEngine::SparseToDenseLayer class
- * InferenceEngine::SplitLayer class
- * InferenceEngine::StridedSliceLayer class
- * InferenceEngine::TensorIterator class
- * InferenceEngine::TileLayer class
- * InferenceEngine::TopKLayer class
- * InferenceEngine::UniqueLayer class
-
-## 2020.1
-
-### New API
-
- **Integration with ngraph API:**
-
- * InferenceEngine::CNNNetwork(const std::shared_ptr<ngraph::Function>& network) ctor from ngraph::Function
- * InferenceEngine::CNNNetwork::getFunction() const noexcept method
- * InferenceEngine::ICNNNetwork::getFunction() const noexcept method
- * InferenceEngine::Parameter(const std::shared_ptr<ngraph::Variant>& var) ctor
- * InferenceEngine::Parameter::asVariant() const method
- * InferenceEngine::Parameter::operator std::shared_ptr<ngraph::Variant>() const operator
- * InferenceEngine::Core::ReadNetwork(const std::wstring& modelPath, const std::wstring& binPath) method
- * InferenceEngine::Core::ReadNetwork(const std::string& modelPath, const std::string& binPath = "") method
- * InferenceEngine::Core::ReadNetwork(const std::string& model, const Blob::CPtr& weights) method
- * InferenceEngine::Code::AddExtension(const IExtensionPtr& extension) method
- * InferenceEngine::IExtension::getOpSets() method
-
-
- **Offline compilation: import / export to std::stream:**
-
- * InferenceEngine::ExecutableNetwork::Export(std::ostream& networkModel) method
- * InferenceEngine::Core::ImportNetwork(std::istream& networkModel, const std::string& deviceName = {}, const std::map<std::string, std::string>& config = {}) method
- * InferenceEngine::IExecutableNetwork::Export(std::ostream& networkModel, ResponseDesc \*resp) noexcept method
-
-
- **RemoteBlob accelerator memory sharing API:**
-
- * InferenceEngine::RemoteContext class
- * InferenceEngine::RemoteBlob class
- * InferenceEngine::Core::CreateContext(const std::string& deviceName, const ParamMap& params) method
- * InferenceEngine::Core::GetDefaultContext(const std::string& deviceName) method
- * InferenceEngine::Core::LoadNetwork(CNNNetwork network, RemoteContext::Ptr context, const std::map<std::string, std::string>& config = std::map<std::string, std::string>()) method
-
-
- **GNA firmware model image generation:**
-
-  * GNA_CONFIG_KEY(FIRMWARE_MODEL_IMAGE_GENERATION) config key
-     * GNA_CONFIG_VALUE(GEN) value
-     * GNA_CONFIG_VALUE(GEN_EXACT) value
-     * GNA_CONFIG_VALUE(SSE) value
-     * GNA_CONFIG_VALUE(SSE_EXACT) value
-     * GNA_CONFIG_VALUE(AVX1) value
-     * GNA_CONFIG_VALUE(AVX1_EXACT) value
-     * GNA_CONFIG_VALUE(AVX2) value
-     * GNA_CONFIG_VALUE(AVX2_EXACT) value
-
- **MemoryBlob mapping of memory to the user space:**
-
-  * InferenceEngine::MemoryBlob::rwmap() noexcept method
-  * InferenceEngine::MemoryBlob::rmap() noexcept method
-  * InferenceEngine::MemoryBlob::wmap() noexcept method
-
- **Memory interoperability on acceleration devices. General classes and GPU helper functions**
-  * InferenceEngine::RemoteBlob class
-  * InferenceEngine::RemoteContext class
-  * InferenceEngine::Core::CreateContext(const std::string& deviceName, const ParamMap& params) method
-  * InferenceEngine::Core::GetDefaultContext(const std::string& deviceName) method
-  * InferenceEngine::make_shared_blob(const TensorDesc& desc, RemoteContext::Ptr ctx) function
-  * InferenceEngine::gpu::make_shared_blob_nv12(size_t height, size_t width, RemoteContext::Ptr ctx, VASurfaceID nv12_surf) function
-  * InferenceEngine::gpu::make_shared_context(Core& core, std::string deviceName, VADisplay device) function
-  * InferenceEngine::gpu::make_shared_blob(const TensorDesc& desc, RemoteContext::Ptr ctx, VASurfaceID surface, uint32_t plane = 0) function
-  * InferenceEngine::gpu::make_shared_blob_nv12(RemoteContext::Ptr ctx, cl::Image2D& nv12_image_plane_y, cl::Image2D& nv12_image_plane_uv) function
-  * InferenceEngine::gpu::make_shared_context(Core& core, std::string deviceName, cl_context ctx) function
-  * InferenceEngine::gpu::make_shared_blob(const TensorDesc& desc, ClContext::Ptr ctx) function
-  * InferenceEngine::gpu::make_shared_blob(const TensorDesc& desc, RemoteContext::Ptr ctx, cl::Buffer& buffer) function
-  * InferenceEngine::gpu::make_shared_blob(const TensorDesc& desc, RemoteContext::Ptr ctx, cl_mem buffer) function
-  * InferenceEngine::gpu::make_shared_blob(const TensorDesc& desc, RemoteContext::Ptr ctx, cl::Image2D& image) function
-
-### Deprecated API
-
- **Inference Engine NN Builder API:**
-
- * InferenceEngine::Builder::EltwiseLayer
- * InferenceEngine::Builder::MemoryLayer
- * InferenceEngine::Builder::ROIPoolingLayer
- * InferenceEngine::Builder::DeconvolutionLayer
- * InferenceEngine::Builder::ReLULayer
- * InferenceEngine::Builder::TanHLayer
- * InferenceEngine::Builder::InputLayer
- * InferenceEngine::Builder::PoolingLayer
- * InferenceEngine::Builder::CropLayer
- * InferenceEngine::Builder::GRUSequenceLayer
- * InferenceEngine::Builder::NormLayer
- * InferenceEngine::Builder::LSTMSequenceLayer
- * InferenceEngine::Builder::ClampLayer
- * InferenceEngine::Builder::PSROIPoolingLayer
- * InferenceEngine::Builder::Layer
- * InferenceEngine::Builder::RNNSequenceLayer
- * InferenceEngine::Builder::ReorgYoloLayer
- * InferenceEngine::Builder::NormalizeLayer
- * InferenceEngine::Builder::PriorBoxClusteredLayer
- * InferenceEngine::Builder::MVNLayer
- * InferenceEngine::Builder::PermuteLayer
- * InferenceEngine::Builder::SimplerNMSLayer
- * InferenceEngine::Builder::ConstLayer
- * InferenceEngine::Builder::DeformableConvolutionLayer
- * InferenceEngine::Builder::FullyConnectedLayer
- * InferenceEngine::Builder::PriorBoxLayer
- * InferenceEngine::Builder::SoftMaxLayer
- * InferenceEngine::Builder::OutputLayer
- * InferenceEngine::Builder::TileLayer
- * InferenceEngine::Builder::SplitLayer
- * InferenceEngine::Builder::PReLULayer
- * InferenceEngine::Builder::RegionYoloLayer
- * InferenceEngine::Builder::ReshapeLayer
- * InferenceEngine::Builder::ConvolutionLayer
- * InferenceEngine::Builder::DetectionOutputLayer
- * InferenceEngine::Builder::ConcatLayer
- * InferenceEngine::Builder::ELULayer
- * InferenceEngine::Builder::GRNLayer
- * InferenceEngine::Builder::LRNLayer
- * InferenceEngine::Builder::ArgMaxLayer
- * InferenceEngine::Builder::ReLU6Layer
- * InferenceEngine::Builder::ScaleShiftLayer
- * InferenceEngine::Builder::ProposalLayer
- * InferenceEngine::Builder::SigmoidLayer
- * InferenceEngine::Builder::ResampleLayer
- * InferenceEngine::Builder::CTCGreedyDecoderLayer
- * InferenceEngine::Builder::BatchNormalizationLayer
- * InferenceEngine::Builder::LayerDecorator
- * InferenceEngine::Builder::PowerLayer
- * InferenceEngine::Builder::Network
- * InferenceEngine::Builder::PortInfo
- * InferenceEngine::Builder::Connection
- * InferenceEngine::Builder::PortData
- * InferenceEngine::Builder::Port
- * InferenceEngine::Builder::ILayer
- * InferenceEngine::Builder::INetworkIterator
- * InferenceEngine::Builder::INetwork
- * InferenceEngine::Builder::ILayer
-
- **Plugin API:**
-
- * InferenceEngine::InferencePlugin C++ plugin wrapper class
- * InferenceEngine::IInferencePlugin plugin interface
- * InferenceEngine::PluginDispatcher class
- * InferenceEngine::InferenceEnginePluginPtr typedef
- * InferenceEngine::ICNNNetReader reader interface
- * InferenceEngine::CNNNetReader class
-
- **Blob API:**
-
-  * Blob::element_size() const noexcept method
-  * Blob::buffer() noexcept method
-  * Blob::cbuffer() noexcept method
-  * MemoryBlob::buffer() noexcept method
-  * MemoryBlob::cbuffer() noexcept method
-
-
-### Removed API
-
- Removed all [Inference Engine API that was deprecated in 2019 R2](https://docs.openvino.ai/2019_R3/_docs_IE_DG_API_Changes.html#deprecated_api)
-
-## 2019 R3
-
-### New API
-
- **New supported layers:**
-
- * InferenceEngine::SparseFillEmptyRowsLayer new class
- * InferenceEngine::UniqueLayer new class
- * InferenceEngine::NonMaxSuppressionLayer new class
- * InferenceEngine::ScatterLayer new class
-
- **FPGA plugin streaming support:**
-
- * DLIA_METRIC_VALUE(INPUT_STREAMING) value to METRIC_KEY(OPTIMIZATION_CAPABILITIES)
- * DLIA_CONFIG_KEY(ENABLE_STREAMING) config key
-
-### Removed API
-
- * InferenceEngine::EltwiseLayer::Select from InferenceEngine::EltwiseLayer::eOperation enumeration
-
-## 2019 R2
-
-### New API
-
- **Inference Engine Core API:**
-
- * Introduced InferenceEngine::Core high level class to manage devices
-
- **Query API extensions to InferenceEngine::ExecutableNetwork and InferenceEngine::IExecutableNetwork:**
-
- * InferenceEngine::ExecutableNetwork::SetConfig method
- * InferenceEngine::ExecutableNetwork::GetConfig method
- * InferenceEngine::ExecutableNetwork::GetMetric method
- * InferenceEngine::IExecutableNetwork::SetConfig method
- * InferenceEngine::IExecutableNetwork::GetConfig method
- * InferenceEngine::IExecutableNetwork::GetMetric method
-
- **Metrics and values for Query API:**
-
- * METRIC_KEY(AVAILABLE_DEVICES)
- * METRIC_KEY(SUPPORTED_METRICS)
- * METRIC_KEY(SUPPORTED_CONFIG_KEYS)
- * METRIC_KEY(FULL_DEVICE_NAME)
- * METRIC_KEY(OPTIMIZATION_CAPABILITIES)
-	 * METRIC_VALUE(FP32)
-	 * METRIC_VALUE(FP16)
-	 * METRIC_VALUE(INT8)
-	 * METRIC_VALUE(BIN)
-	 * METRIC_VALUE(WINOGRAD)
-	 * DLIA_METRIC_VALUE(FP11)
- * METRIC_KEY(RANGE_FOR_STREAMS)
- * METRIC_KEY(NUMBER_OF_WAITING_INFER_REQUESTS)
- * METRIC_KEY(NUMBER_OF_EXEC_INFER_REQUESTS)
- * METRIC_KEY(DEVICE_THERMAL)
- * METRIC_KEY(RANGE_FOR_ASYNC_INFER_REQUESTS)
- * EXEC_NETWORK_METRIC_KEY(NETWORK_NAME)
- * EXEC_NETWORK_METRIC_KEY(OPTIMAL_NUMBER_OF_INFER_REQUESTS)
-
- **Common API:**
-
- * CLDNN_CONFIG_KEY(INT8_ENABLED) config key
- 	* CONFIG_KEY(GPU_THROUGHPUT_AUTO)
-	* CONFIG_KEY(GPU_THROUGHPUT_STREAMS)
- * DLIA_CONFIG_KEY(IO_TRANSFORMATIONS_NATIVE) config key
- * DLIA_CONFIG_KEY(DUMP_SUPPORTED_LAYERS_INFORMATION) config key
- * GNA_CONFIG_VALUE(SW_FP32) config value for GNA_CONFIG_KEY(DEVICE_MODE) key
- * MULTI_CONFIG_KEY(DEVICE_PRIORITIES) config key for `MULTI` device
- * InferenceEngine::CNNNetReader::ReadNetwork(const std::wstring &filepath) new method
- * InferenceEngine::CNNNetReader::ReadWeights(const std::wstring &filepath) new method
- * InferenceEngine::ExecutableNetwork::ExecutableNetwork(IExecutableNetwork::Ptr actual, InferenceEnginePluginPtr plg) constructor with additional `plg` parameter
- * InferenceEngine::InferRequest::InferRequest(IInferRequest::Ptr request, InferenceEnginePluginPtr plg) constructor with additional `plg` parameter
- * InferenceEngine::Data::setName method
- * InferenceEngine::QueryNetworkResult::supportedLayersMap
- * InferenceEngine::Precision::I64 extension to InferenceEngine::Precision::ePrecision enumeration
-
- **New supported primitives:**
-
- * InferenceEngine::Builder::DeformableConvolutionLayer new class
- * InferenceEngine::DeformableConvolutionLayer new class
- * InferenceEngine::EltwiseLayer::Logical_NOT, InferenceEngine::EltwiseLayer::Mean, InferenceEngine::EltwiseLayer::Select extensions to InferenceEngine::EltwiseLayer::eOperation enumeration
- * InferenceEngine::OneHotLayer new class
- * InferenceEngine::SelectLayer new class
- * InferenceEngine::BroadcastLayer new class
- * InferenceEngine::MathLayer new class
- * InferenceEngine::ReduceLayer new class
- * InferenceEngine::TopKLayer new class
-
- **Extensions to Blob creation API:**
-
- * InferenceEngine::Blob::is method
- * InferenceEngine::Blob::is const method
- * InferenceEngine::Blob::as method
- * InferenceEngine::Blob::as const method
- * InferenceEngine::Blob::getAllocator abstract method
- * InferenceEngine::Blob::getHandle abstract method
- * InferenceEngine::MemoryBlob class
- * InferenceEngine::ColorFormat enumeration
- * InferenceEngine::PreProcessInfo::setColorFormat method
- * InferenceEngine::PreProcessInfo::getColorFormat method
- * InferenceEngine::CompoundBlob class to work with blobs consisting of several planes
- * InferenceEngine::NV12Blob class representing NV12 blob with two planes
-
-### Deprecated API
-
-The methods listed below are deprecated and will be removed in the 2019 R4 release:
-
- **Common API:**
-
- * InferenceEngine::InputInfo::getInputPrecision method
- * InferenceEngine::InputInfo::setInputPrecision method
- * InferenceEngine::InputInfo::getDims method
- * InferenceEngine::CNNLayer::GetParamsAsBool method
- * InferenceEngine::CNNNetwork::CNNNetwork(ICNNNetwork* actual) constructor
- * InferenceEngine::CNNNetwork::setTargetDevice method
- * HETERO_CONFIG_KEY(DUMP_DLA_MESSAGES) config key
- * InferenceEngine::ILayerImplFactory::getShapes method
- * InferenceEngine::IShapeInferImpl::inferShapes(const std::vector<SizeVector>&, const std::map<std::string, std::string>& , const std::map<std::string, Blob::Ptr>&, std::vector<SizeVector>&, ResponseDesc\*) method
- * InferenceEngine::Data::setBatchSize method
- * InferenceEngine::QueryNetworkResult::supportedLayers field
- * InferenceEngine::ICNNNetwork::setBatchSize(const size_t size) method
- * InferenceEngine::Blob::Resize method
- * InferenceEngine::Blob::Reshape method
- * InferenceEngine::TBlob::set method
-
- **InferenceEngine::IInferencePlugin and InferenceEngine::InferencePlugin obsolete methods:**
-
- * InferenceEngine::InferencePlugin::LoadNetwork(ICNNNetwork &network) method
- * InferenceEngine::InferencePlugin::Infer method
- * InferenceEngine::InferencePlugin::GetPerformanceCounts method
- * InferenceEngine::InferencePlugin::QueryNetwork(const ICNNNetwork &network, QueryNetworkResult &res) const method
- * InferenceEngine::IInferencePlugin::LoadNetwork(ICNNNetwork &network, ResponseDesc \*resp) method
- * InferenceEngine::IInferencePlugin::Infer(const Blob &input, Blob &result, ResponseDesc \*resp) method
- * InferenceEngine::IInferencePlugin::Infer(const BlobMap &input, BlobMap &result, ResponseDesc \*resp) method
- * InferenceEngine::IInferencePlugin::GetPerformanceCounts method
- * InferenceEngine::IInferencePlugin::QueryNetwork(const ICNNNetwork& network, QueryNetworkResult& res) const method
-
-
- **Fields in InferenceEngine::Data class are replaced with appropriate methods:**
-
- * InferenceEngine::Data::precision field
- * InferenceEngine::Data::layout field
- * InferenceEngine::Data::dims field
- * InferenceEngine::Data::creatorLayer field
- * InferenceEngine::Data::name field
- * InferenceEngine::Data::inputTo field
- * InferenceEngine::Data::userObject field
-
- **Heterogeneous plugin:**
-
- * InferenceEngine::IHeteroDeviceLoader class
- * InferenceEngine::IHeteroInferencePlugin class
- * InferenceEngine::HeteroPluginPtr class
- * operator InferenceEngine::InferencePlugin::HeteroPluginPtr operator
-
- **Blob creation API with dimensions in reverse order:**
-
- * InferenceEngine::Blob::Blob(Precision p) constructor
- * InferenceEngine::Blob::Blob(Precision p, Layout l) constructor
- * InferenceEngine::Blob::Blob(Precision p, const SizeVector &dims) constructor
- * InferenceEngine::Blob::Blob(Precision p, Layout l, const SizeVector &dims) constructor
- * InferenceEngine::TBlob::TBlob(Precision p, Layout l) constructor
- * InferenceEngine::TBlob::TBlob(Precision p, Layout l, const SizeVector& dims) constructor
- * InferenceEngine::TBlob::TBlob(Precision p, Layout l, const SizeVector& dims, T* ptr, size_t data_size) constructor
- * InferenceEngine::TBlob::TBlob(Precision p, Layout l, const SizeVector &dims, std::shared_ptr<IAllocator> alloc) constructor
- * InferenceEngine::Blob::type() method
- * InferenceEngine::Blob::precision() method
- * InferenceEngine::Blob::layout() method
- * InferenceEngine::Blob::dims() method
- * InferenceEngine::make_shared_blob(Precision p, Layout l, const SizeVector &dims) function
- * InferenceEngine::make_shared_blob(Precision p, const SizeVector &dims) function
- * InferenceEngine::make_shared_blob(Precision p, Layout l, const TArg &arg) function
- * InferenceEngine::make_shared_blob(Precision p, const TArg &arg) function
- * InferenceEngine::make_shared_blob(TBlob<TypeTo> &&arg) function
- * InferenceEngine::make_shared_blob(Precision p, Layout l) function
- * InferenceEngine::make_shared_blob(Precision p, Layout l, SizeVector dims, const std::vector<TypeTo> &arg) function
- * InferenceEngine::make_shared_blob(Precision p, Layout l, const std::vector<TypeTo> &arg) function
- * InferenceEngine::make_shared_blob(Precision p, const std::vector<TypeTo> &arg) function
- * InferenceEngine::make_shared_blob(Precision p, Layout l, const SizeVector &dims, TypeTo * ptr, size_t size) function
- * InferenceEngine::make_shared_blob(Precision p, const SizeVector &dims, TypeTo * ptr, size_t size) function
- * InferenceEngine::I_N variable
- * InferenceEngine::I_C variable
- * InferenceEngine::I_H variable
- * InferenceEngine::I_W variable
- * InferenceEngine::LayoutOffsetCounter class
- * InferenceEngine::ConvertLayout function
-
- **API working with device enumeration:**
-
- * InferenceEngine::TargetDevice enumeration
- * InferenceEngine::TargetDeviceInfo class
- * InferenceEngine::getDeviceName function
- * InferenceEngine::FindPluginRequest class
- * InferenceEngine::FindPluginResponse class
- * InferenceEngine::findPlugin(const FindPluginRequest &req, FindPluginResponse &result, ResponseDesc *resp) function
- * InferenceEngine::ICNNNetwork::setTargetDevice method
- * InferenceEngine::ICNNNetwork::getTargetDevice method
- * InferenceEngine::PluginDispatcher::getPluginByDevice method
- * InferenceEngine::PluginDispatcher::getSuitablePlugin method
+* The OpenVINO™ 2.0 API was introduced.
--- a/docs/OV_Runtime_UG/Bfloat16Inference.md
+++ b/docs/OV_Runtime_UG/Bfloat16Inference.md
@@ -26,7 +26,7 @@ There are two ways to check if CPU device can support bfloat16 computations for
 1. Query the instruction set using one of these system commands:
   * `lscpu | grep avx512_bf16`
   * `cat /proc/cpuinfo | grep avx512_bf16`
-2. Use the [Query API](InferenceEngine_QueryAPI.md) with `METRIC_KEY(OPTIMIZATION_CAPABILITIES)`, which should return `BF16` in the list of CPU optimization options:
+2. Query `METRIC_KEY(OPTIMIZATION_CAPABILITIES)` as described in [Configure devices](supported_plugins/config_properties.md); it should return `BF16` in the list of CPU optimization options:

@snippet snippets/Bfloat16Inference0.cpp part0

--- a/docs/OV_Runtime_UG/Deep_Learning_Inference_Engine_DevGuide.md
+++ b/docs/OV_Runtime_UG/Deep_Learning_Inference_Engine_DevGuide.md
@@ -1,56 +0,0 @@
-# OpenVINO™ Runtime User Guide {#openvino_docs_IE_DG_Deep_Learning_Inference_Engine_DevGuide}
-
-@sphinxdirective
-
-.. _deep learning inference engine:
-
-.. toctree::
-   :maxdepth: 1
-   :hidden:
-   
-   openvino_2_0_transition_guide
-   openvino_docs_IE_DG_Integrate_with_customer_application_new_API
-   openvino_docs_OV_Runtime_UG_Model_Representation
-   ngraph_transformation
-   openvino_docs_deployment_optimization_guide_dldt_optimization_guide
-   openvino_docs_IE_DG_Device_Plugins
-   Direct ONNX Format Support <openvino_docs_IE_DG_ONNX_Support>
-   openvino_docs_IE_DG_Paddle_Support
-   openvino_docs_IE_DG_Int8Inference
-   openvino_docs_IE_DG_Bfloat16Inference
-   openvino_docs_IE_DG_DynamicBatching
-   openvino_docs_IE_DG_ShapeInference
-   openvino_docs_IE_DG_Model_caching_overview
-   openvino_docs_IE_DG_Extensibility_DG_Intro
-   openvino_docs_IE_DG_Memory_primitives
-   openvino_docs_IE_DG_network_state_intro   
-   openvino_docs_IE_DG_API_Changes
-   openvino_docs_IE_DG_Known_Issues_Limitations
-   openvino_docs_IE_DG_Glossary
-      
-@endsphinxdirective
-
-## Introduction
-Inference Engine is a set of C++ libraries with C and Python bindings providing a common API to deliver inference solutions on the platform of your choice. Use the Inference Engine API to read a model in the Intermediate Representation (IR) or ONNX format and execute it on devices.
-
-Inference Engine uses a plugin architecture. An Inference Engine plugin is a software component that contains a complete implementation of inference on a certain Intel® hardware device: CPU, GPU, VPU, etc. Each plugin implements the unified API and provides additional hardware-specific APIs.
- 
-The scheme below illustrates the typical workflow for deploying a trained deep learning model: 
-
-![](img/BASIC_FLOW_IE_C.svg)
-
-
-## Video
-
-@sphinxdirective
-
-.. list-table::
-
-   * - .. raw:: html
-
-           <iframe allowfullscreen mozallowfullscreen msallowfullscreen oallowfullscreen webkitallowfullscreen height="315" width="100%"
-           src="https://www.youtube.com/embed/e6R13V8nbak">
-           </iframe>
-   * - **Inference Engine Concept**. Duration: 3:43
-     
-@endsphinxdirective
--- a/docs/OV_Runtime_UG/DynamicBatching.md
+++ b/docs/OV_Runtime_UG/DynamicBatching.md
@@ -1,4 +1,4 @@
-# Using Dynamic Batching {#openvino_docs_IE_DG_DynamicBatching}
+# Working with dynamic shapes {#openvino_docs_IE_DG_DynamicBatching}

 ## Using Dynamic Batching (C++)

--- a/docs/OV_Runtime_UG/Extensibility_DG/AddingNGraphOps.md
+++ b/docs/OV_Runtime_UG/Extensibility_DG/AddingNGraphOps.md
@@ -1,82 +0,0 @@
-# Custom nGraph Operations {#openvino_docs_IE_DG_Extensibility_DG_AddingNGraphOps}
-
-Inference Engine Extension API allows you to register operation sets (opsets) with custom nGraph operations to support models with operations which OpenVINO™ does not support out-of-the-box.
-
-Besides creating custom nGraph operations, to [support custom operations](../../HOWTO/Custom_Layers_Guide.md) in your model you must also create a Model Optimizer extension for the custom operations and an Inference Engine device plugin extension for the device you will use for inference.
-
-## Operation Class
-
-To add your custom nGraph operation, create a new class that extends `ngraph::Op`, which is in turn derived from `ngraph::Node`, the base class for all graph operations in nGraph. Follow the steps below to add a custom nGraph operation:
-
-1. Add the `NGRAPH_RTTI_DECLARATION` and `NGRAPH_RTTI_DEFINITION` macros which define a `NodeTypeInfo` object that identifies the type of the operation to the graph users and helps with dynamic type resolution. The type info of an nGraph operation currently consists of a string identifier and a version number, but this may change in the future.
-
-2. Implement constructors that optionally take the operation inputs and attributes as parameters. 
-
-3. Override the shape inference method `validate_and_infer_types`. This method is called multiple times during graph manipulations to determine the shapes and element types of the operation's outputs. To access the input shapes and input element types, use the `get_input_partial_shape()` and `get_input_element_type()` methods of `ngraph::Node`. Set the inferred shape and element type of the output using `set_output_type`.
-
-4. Override the `clone_with_new_inputs` method, which enables graph manipulation routines to create copies of this operation and connect it to different nodes during optimization.
-
-5. Override the `visit_attributes` method, which enables serialization and deserialization of operation attributes. An `AttributeVisitor` is passed to the method, and the implementation is expected to walk over all the attributes in the op using the type-aware `on_attribute` helper. Helpers are already implemented for standard C++ types like `int64_t`, `float`, `bool`, `vector`, and for existing nGraph defined types.
-
-6. Override `evaluate`, an optional method that enables constant folding when a custom operation is on a constant branch. If your operation implements the `evaluate` method, you also need to override the `has_evaluate` method, which reports whether `evaluate` is available for the operation.
-
-Based on that, declaration of an operation class can look as follows:
-
-@snippet template_extension/old/op.hpp op:header
-
-### Class Fields
-
-The provided implementation has several fields:
-
- * `add` of type `int64_t` is an attribute of a custom operation
- * `type_info` of type `ngraph::NodeTypeInfo` defines type and version of an operation
-
-### Operation Constructors
-
-An nGraph operation contains two constructors: 
-* Default constructor, which enables you to create an operation without attributes 
-* Constructor that creates and validates an operation with specified inputs and attributes
-
-@snippet template_extension/old/op.cpp op:ctor
-
-### `validate_and_infer_types()`
-
-`ngraph::Node::validate_and_infer_types` method validates operation attributes and calculates output shapes using attributes of the operation.
-
-@snippet template_extension/old/op.cpp op:validate
-
-### `clone_with_new_inputs()`
-
-`ngraph::Node::clone_with_new_inputs` method creates a copy of the nGraph operation with new inputs.
-
-@snippet template_extension/old/op.cpp op:copy
-
-### `visit_attributes()`
-
-`ngraph::Node::visit_attributes` method enables you to visit all operation attributes.
-
-@snippet template_extension/old/op.cpp op:visit_attributes
-
-### `evaluate()` and `has_evaluate()`
-
-`ngraph::Node::evaluate` method enables you to apply constant folding to an operation.
-
-@snippet template_extension/old/op.cpp op:evaluate
-
-## Register Custom Operations in Extension Class
-
-To add custom operations to the [Extension](Extension.md) class, create an operation set with custom operations and implement the `InferenceEngine::IExtension::getOpSets` method:
-
-@snippet template_extension/old/extension.cpp extension:getOpSets
-
-This method returns a map of opsets that exist in the [extension library](Extension.md). 
-nGraph provides an opset mechanism to group operations into clusters. Different opsets distinguish between different versions of one operation.
-
-When specifying opset names, follow the rules below:
-* Use unique opset names.
-* Do not use the following built-in opset names: `extension`, `experimental`, `opset1`, `opset2`, `opset3`, ... , `opsetN`.
-* [Make sure that the Model Optimizer](../../HOWTO/Custom_Layers_Guide.md) and your extension use the same opset names.
-* IR v10 operations have the mandatory `version` attribute specifying the opset.
-Operations from the default opset cannot be redefined.
-
-Use a custom opset to create a new operation or extend functionality of an existing operation from another opset.
--- a/docs/OV_Runtime_UG/Extensibility_DG/Building.md
+++ b/docs/OV_Runtime_UG/Extensibility_DG/Building.md
@@ -1,19 +0,0 @@
-# Build Extension Library Using CMake* {#openvino_docs_IE_DG_Extensibility_DG_Building}
-
-Inference Engine build infrastructure provides the Inference Engine Package for application development.
-
-To configure the build of your extension library, use the following CMake script:
-
-@snippet template_extension/old/CMakeLists.txt cmake:extension
-
-This CMake script finds the Inference Engine and nGraph using the `find_package` CMake command.
-
-To build the extension library, run the commands below:
-
-```sh
-$ cd template_extension/old
-$ mkdir build
-$ cd build
-$ cmake -DOpenVINO_DIR=[OpenVINO_DIR]  ../
-$ cmake --build .
-```
--- a/docs/OV_Runtime_UG/Extensibility_DG/CPU_Kernel.md
+++ b/docs/OV_Runtime_UG/Extensibility_DG/CPU_Kernel.md
@@ -1,71 +0,0 @@
-# CPU Kernel Custom Operations {#openvino_docs_IE_DG_Extensibility_DG_CPU_Kernel}
-
-To enable operations not supported by OpenVINO™ out of the box, you need a custom extension for Model Optimizer, a custom nGraph operation set, and a custom kernel for the device you will target. This page describes custom kernel support for the CPU device.
-
-The CPU codepath in the Inference Engine gets its performance primarily from the Intel® Math Kernel Library for Deep Neural Networks (Intel® MKL-DNN), and new CPU kernels extend the Inference Engine plugin for the Intel MKL-DNN. Implementing the InferenceEngine::ILayerExecImpl interface defines a general CPU-side extension. You do not need any Intel MKL-DNN specifics to implement a kernel.
-
-## Implementation Class
-
-All custom kernels for the CPU plugin should be inherited from the InferenceEngine::ILayerExecImpl interface.
-Based on that, declaration of a kernel implementation class can look as follows:
-
-@snippet template_extension/old/cpu_kernel.hpp cpu_implementation:header
-
-### Class Fields
-
-The provided implementation has several fields:
-
- * `add` of the type `int64_t` is an attribute of a custom operation.
- * `inShape` of the type `ngraph::Shape` is an input shape.
- * `outShape` of the type `ngraph::Shape` is an output shape.
- * `error` of the type `std::string` is a field to handle errors from a constructor.
-
-### Constructor of Implementation
-
-An implementation constructor checks parameters of an nGraph operation, stores required attributes, and stores an error message in case of an error.
-
-@snippet template_extension/old/cpu_kernel.cpp cpu_implementation:ctor
-
-### `getSupportedConfigurations`
-
-The InferenceEngine::ILayerExecImpl::getSupportedConfigurations method returns all supported configuration formats (input/output tensor layouts) for your implementation. To specify formats of data, use InferenceEngine::TensorDesc. Refer to the [Memory Primitives](../Memory_primitives.md) section for instructions.
-
-@snippet template_extension/old/cpu_kernel.cpp cpu_implementation:getSupportedConfigurations
-
-### `init`
-
-The InferenceEngine::ILayerExecImpl::init method gets a runtime-selected configuration from a vector that is populated from the `getSupportedConfigurations` method and checks the parameters:
-
-@snippet template_extension/old/cpu_kernel.cpp cpu_implementation:init
-
-### `execute`
-
-The InferenceEngine::ILayerExecImpl::execute method accepts and processes the actual tensors as input/output blobs:
-
-@snippet template_extension/old/cpu_kernel.cpp cpu_implementation:execute
-
-## Register Implementation in `Extension` Class
-
-To register custom kernel implementation in the [Extension](Extension.md) class, implement the following methods:
-
-* <a href="#getImpTypes">getImplTypes</a>
-* <a href="#getImplementation">getImplementation</a>
-
-### <a name="getImpTypes"><code>getImplTypes</code></a>
-
-InferenceEngine::IExtension::getImplTypes returns a vector of implementation types for an operation.
-
-@snippet template_extension/old/extension.cpp extension:getImplTypes
-
-### <a name="getImplementation"><code>getImplementation</code></a>
-
-InferenceEngine::IExtension::getImplementation returns the kernel implementation with a specified type for an operation.
-
-@snippet template_extension/old/extension.cpp extension:getImplementation
-
-
-## Load Extension with Executable Kernels to Plugin
-
-Use the `AddExtension` method of the general plugin interface to load your primitives:
-
-@snippet snippets/CPU_Kernel.cpp part0
--- a/docs/OV_Runtime_UG/Extensibility_DG/Custom_ONNX_Ops.md
+++ b/docs/OV_Runtime_UG/Extensibility_DG/Custom_ONNX_Ops.md
@@ -1,78 +0,0 @@
-# Custom ONNX* Operators {#openvino_docs_IE_DG_Extensibility_DG_Custom_ONNX_Ops}
-
-The ONNX\* importer provides a mechanism to register custom ONNX operators based on predefined or custom nGraph operations.
-The function responsible for registering a new operator is called `ngraph::onnx_import::register_operator` and defined in the `onnx_import/onnx_utils.hpp` file.
-
-## Register Custom ONNX Operator Based on Predefined nGraph Operations
-
-The steps below explain how to register a custom ONNX operator, for example, CustomRelu, in a domain called `com.example`.
-CustomRelu is defined as follows:
-```
-x >= 0 => f(x) = x * alpha
-x <  0 => f(x) = x * beta
-```
-where `alpha` and `beta` are float constants.
-
-1. Include headers:
-
-@snippet onnx_custom_op/onnx_custom_op.cpp onnx_custom_op:headers
-
-2. Register the CustomRelu operator in the ONNX importer:
-
-@snippet onnx_custom_op/onnx_custom_op.cpp onnx_custom_op:register_operator
-
-The `register_operator` function takes four arguments: op_type, opset version, domain, and a function object.
-The function object is a user-defined function that takes `ngraph::onnx_import::Node` as an input and based on that, returns a graph with nGraph operations.
-The `ngraph::onnx_import::Node` class represents a node in an ONNX model. It provides functions to fetch input node(s) using `get_ng_inputs`, attribute value using `get_attribute_value`, and many more. See the `onnx_import/core/node.hpp` file for the full class declaration.
-
-New operator registration must happen before an ONNX model is read. For example, if a model uses the `CustomRelu` operator, call `register_operator("CustomRelu", ...)` before InferenceEngine::Core::ReadNetwork.
-Reregistering ONNX operators within the same process is supported. If you register an existing operator, you get a warning.
-
-The example below shows a model that requires the previously created `CustomRelu` operator:
-```
-@include onnx_custom_op/custom_relu_model.prototxt
-```
-
-This model is in text format, so before it can be passed to Inference Engine, it has to be converted to binary using:
-```py
-from google.protobuf import text_format
-import onnx
-
-with open("custom_relu_model.prototxt") as in_file:
-    proto = onnx.ModelProto()
-    text_format.Parse(in_file.read(), proto, allow_field_number=True)
-    s = onnx._serialize(proto)
-    onnx._save_bytes(s, "custom_relu_model.onnx")
-```
-
-
-To create a graph with nGraph operations, visit [Custom nGraph Operations](AddingNGraphOps.md).
-For a complete list of predefined nGraph operators, visit [Available Operations Sets](../../ops/opset.md).
-
-If you do not need an operator anymore, unregister it by calling `unregister_operator`. The function takes three arguments: `op_type`, `version`, and `domain`.
-
-@snippet onnx_custom_op/onnx_custom_op.cpp onnx_custom_op:unregister_operator
-
-## Register Custom ONNX Operator Based on Custom nGraph Operations
-
-The same principles apply when registering a custom ONNX operator based on custom nGraph operations.
-This example shows how to register a custom ONNX operator based on `Operation` presented in [this tutorial](AddingNGraphOps.md), which is used in [TemplateExtension](Extension.md):
-
-@snippet template_extension/old/extension.cpp extension:ctor
-
-Here, the `register_operator` function is called in the constructor of Extension. The constructor makes sure that the function is called before InferenceEngine::Core::ReadNetwork, because InferenceEngine::Core::AddExtension must be called before a model with a custom operator is read.
-
-The example below demonstrates how to unregister an operator from the destructor of Extension:
-@snippet template_extension/old/extension.cpp extension:dtor
-
-> **REQUIRED**: It is mandatory to unregister a custom ONNX operator if it is defined in a dynamic shared library.
-
-## Requirements for Building with CMake
-
-A program that uses the `register_operator` functionality requires `openvino::core` and `openvino::frontend::onnx` libraries in addition to the OpenVINO Inference Runtime.
-The `openvino::frontend::onnx` library is a component of the `OpenVINO` package, so `find_package(OpenVINO REQUIRED COMPONENTS ONNX)` can find both.
-Those libraries need to be passed to the `target_link_libraries` command in the CMakeLists.txt file.
-
-See CMakeLists.txt below for reference:
-
-@snippet onnx_custom_op/CMakeLists.txt cmake:onnx_custom_op
--- a/docs/OV_Runtime_UG/Extensibility_DG/Extension.md
+++ b/docs/OV_Runtime_UG/Extensibility_DG/Extension.md
@@ -1,29 +0,0 @@
-# Extension Library {#openvino_docs_IE_DG_Extensibility_DG_Extension}
-
-Inference Engine provides an InferenceEngine::IExtension interface, which defines the interface for Inference Engine Extension libraries.
-Inherit all extension libraries from this interface. The example below contains an implementation of two operations: `Template`
-used as an example in this document and `FFT` used as a more complex example from the [Custom Operations Guide](../../HOWTO/Custom_Layers_Guide.md).
-
-> **NOTE**: `FFT` operation is implemented using the OpenCV library functions `cv::dft` and `cv::idft`.
-
-Based on that, the declaration of an extension class can look as follows:
-
-@snippet template_extension/old/extension.hpp extension:header
-
-The extension library should use the `IE_DEFINE_EXTENSION_CREATE_FUNCTION` macro to export a function that creates an `Extension` class:
-
-@snippet template_extension/old/extension.cpp extension:CreateExtension
-
-Also, an `Extension` object should implement the following methods:
-
-* InferenceEngine::IExtension::Release deletes an extension object.
-
-* InferenceEngine::IExtension::GetVersion returns information about the version of the library.
-
-@snippet template_extension/old/extension.cpp extension:GetVersion
-
-Implement the InferenceEngine::IExtension::getOpSets method if the extension contains custom layers. 
-Read [Custom nGraph Operation](AddingNGraphOps.md) for more information.
-
-To integrate execution kernels into the extension library, read [How to Implement Custom CPU Operations](CPU_Kernel.md).
-To register a custom ONNX\* operator in the extension library, read [Custom ONNX Operators](Custom_ONNX_Ops.md).
--- a/docs/OV_Runtime_UG/Extensibility_DG/GPU_Kernel.md
+++ b/docs/OV_Runtime_UG/Extensibility_DG/GPU_Kernel.md
@@ -1,233 +0,0 @@
-# How to Implement Custom GPU Operations {#openvino_docs_IE_DG_Extensibility_DG_GPU_Kernel}
-
-To enable operations not supported by OpenVINO™ out of the box, you need a custom extension for Model Optimizer, a custom nGraph operation set, and a custom kernel for the device you will target. This page describes custom kernel support for the GPU device.
-
-The GPU codepath abstracts many details about OpenCL\*. You need to provide the kernel code in OpenCL C and an XML configuration file that connects the kernel and its parameters to the parameters of the operation.
-
-There are two options for using the custom operation configuration file:
-
-* Include a section with your kernels into the global automatically-loaded `cldnn_global_custom_kernels/cldnn_global_custom_kernels.xml` file, which is hosted in the `<INSTALL_DIR>/runtime/bin` folder
-* Call the `InferenceEngine::Core::SetConfig()` method from your application with the `InferenceEngine::PluginConfigParams::KEY_CONFIG_FILE` key and the configuration file name as a value before loading the network that uses custom operations to the plugin:
-
-@snippet snippets/GPU_Kernel.cpp part0
-
-All Inference Engine samples, except the trivial `hello_classification`, and most Open Model Zoo demos 
-feature a dedicated command-line option `-c` to load custom kernels. For example, to load custom operations for the classification sample, run the command below:
-```sh
-$ ./classification_sample -m <path_to_model>/bvlc_alexnet_fp16.xml -i ./validation_set/daily/227x227/apron.bmp -d GPU \
- -c <absolute_path_to_config>/custom_layer_example.xml
-```
-
-## Configuration File Format <a name="config-file-format"></a>
-
-The configuration file is expected to follow the `.xml` file structure
-with a node of the type `CustomLayer` for every custom operation you provide.
-
-The definitions described in the sections below use the following notations:
-
-Notation | Description
---|---
-(0/1) | Can have zero or one instance of this node or attribute
-(1) | Must have exactly one instance of this node or attribute
-(0+) | Can have any number of instances of this node or attribute
-(1+) | Can have one or more instances of this node or attribute
-
-### CustomLayer Node and Sub-Node Structure
-
-`CustomLayer` node contains the entire configuration for a single custom operation.
-
-| Attribute Name   |\#    |  Description |
-|-----|-----|-----|
-| `name`           | (1)  | The name of the operation type to be used. This name should be identical to the type used in the IR.|
-| `type`           | (1)  | Must be `SimpleGPU`.                                                                                |
-| `version`        | (1)  | Must be `1`.                                                                                        |
-
-**Sub-nodes**: `Kernel` (1), `Buffers` (1), `CompilerOptions` (0+),
-`WorkSizes` (0/1)
-
-### Kernel Node and Sub-Node Structure
-
-`Kernel` node contains all kernel source code configuration. No kernel
-node structure exists.
-
-**Sub-nodes**: `Source` (1+), `Define` (0+)
-
-### Source Node and Sub-Node Structure
-
-`Source` node points to a single OpenCL source file.
-
-| Attribute Name | \#  |Description|
-|-----|-----|-----|
-| `filename`     | (1) | Name of the file containing OpenCL source code. Note that the path is relative to your executable. Multiple source nodes will have their sources concatenated in order. |
-
-**Sub-nodes**: None
-
-### Define Node and Sub-Node Structure
-
-`Define` node configures a single `#&zwj;define` instruction to be added to
-the sources during compilation (JIT).
-
-| Attribute Name | \#    | Description |
-|------|-------|------|
-| `name`         | (1)   | The name of the defined JIT. For static constants, this can include the value as well, which is taken as a string. |
-| `param`        | (0/1) | This parameter value is used as the value of this JIT definition.                                          |
-| `type`         | (0/1) | The parameter type. Accepted values: `int`, `float`, and `int[]`, `float[]` for arrays.                    |
-| `default`      | (0/1) | The default value to be used if the specified parameters are missing from the operation in the IR.          |
-
-**Sub-nodes:** None
-
-The resulting JIT has the following form:
-`#&zwj;define [name] [type] [value/default]`.
-
-### Buffers Node and Sub-Node Structure
-
-`Buffers` node configures all input/output buffers for the OpenCL entry
-function. No buffers node structure exists.
-
-**Sub-nodes:** `Data` (0+), `Tensor` (1+)
-
-### Data Node and Sub-Node Structure
-
-`Data` node configures a single input with static data, for example,
-weights or biases.
-
-| Attribute Name | \#  | Description |
-|----|-----|------|
-| `name`         | (1) | Name of a blob attached to an operation in the IR             |
-| `arg-index`    | (1) | 0-based index in the entry function arguments to be bound to |
-
-**Sub-nodes**: None
-
-### Tensor Node and Sub-Node Structure
-
-`Tensor` node configures a single input or output tensor.
-
-| Attribute Name | \#    | Description  |
-|------|-------|-------|
-| `arg-index`    | (1)   | 0-based index in the entry function arguments to be bound to.                                                                          |
-| `type`         | (1)   | `input` or `output`                                                                                                                    |
-| `port-index`   | (1)   | 0-based index in the operation input/output ports in the IR                                                                            |
-| `format`       | (0/1) | Data layout declaration for the tensor. Accepted values: `BFYX`, `BYXF`, `YXFB`, `FYXB`, and same values in all lowercase. Default value: `BFYX` |
-
-### CompilerOptions Node and Sub-Node Structure
-
-`CompilerOptions` node configures the compilation flags for the OpenCL
-sources.
-
-| Attribute Name | \#  | Description                                        |
-|--------|-----|------|
-| `options`      | (1) | Options string to be passed to the OpenCL compiler |
-
-**Sub-nodes**: None
-
-### WorkSizes Node and Sub-Node Structure
-
-`WorkSizes` node configures the global/local work sizes to be used when
-queuing an OpenCL program for execution.
-
-| Attribute Name      | \#             | Description                                                                 |
-|-----|------|-----|
-| `global`<br>`local` | (0/1)<br>(0/1) | An array of up to three integers or formulas for defining OpenCL work-sizes to be used during execution.<br> The formulas can use the values of the B,F,Y,X dimensions and contain the operators: +,-,/,\*,%. All operators are evaluated in integer arithmetic. <br>Default value: `global="B*F*Y*X" local=""` |
-| `dim`               | (0/1)          | A tensor to take the work-size from. Accepted values: `input N`, `output`, where `N` is an index of input tensor starting with 0. Default value: `output` |
-
-**Sub-nodes**: None
-
-## Example Configuration File
-
-The following code sample provides an example configuration file in XML 
-format. For information on the configuration file structure, see
-[Configuration File Format](#config-file-format).
-```xml
-<CustomLayer name="ReLU" type="SimpleGPU" version="1">
-  <Kernel entry="example_relu_kernel">
-    <Source filename="custom_layer_kernel.cl"/>
-    <Define name="neg_slope" type="float" param="negative_slope" default="0.0"/>
-  </Kernel>
-  <Buffers>
-    <Tensor arg-index="0" type="input" port-index="0" format="BFYX"/>
-    <Tensor arg-index="1" type="output" port-index="0" format="BFYX"/>
-  </Buffers>
-  <CompilerOptions options="-cl-mad-enable"/>
-  <WorkSizes global="X,Y,B*F"/>
-</CustomLayer>
-```
-
-## Built-In Definitions for Custom Layers
-
-The following table includes definitions that are attached before
-user sources, where `<TENSOR>` is the actual input and output, for
-example, `INPUT0` or `OUTPUT0`.
-
-For an example, see [Example Kernel](#example-kernel).
-
-| Name | Value  |
-|---|---|
-| `NUM_INPUTS` | Number of the input tensors bound to this kernel |
-| `GLOBAL_WORKSIZE`  | An array of global work sizes used to execute this kernel |
-| `GLOBAL_WORKSIZE_SIZE` | The size of the `GLOBAL_WORKSIZE` array |
-| `LOCAL_WORKSIZE`  | An array of local work sizes used to execute this kernel  |
-| `LOCAL_WORKSIZE_SIZE`   | The size of the `LOCAL_WORKSIZE` array |
-| `<TENSOR>_DIMS`| An array of the tensor dimension sizes. Always ordered as `BFYX` |
-| `<TENSOR>_DIMS_SIZE`| The size of the `<TENSOR>_DIMS` array.|
-| `<TENSOR>_TYPE`| The datatype of the tensor: `float`, `half`, or `char`|
-| `<TENSOR>_FORMAT_` | The format of the tensor: BFYX, BYXF, YXFB, FYXB, or ANY. The format is concatenated to the defined name. You can use the tensor format to define codepaths in your code with `#&zwj;ifdef/#&zwj;endif`. |
-| `<TENSOR>_LOWER_PADDING` | An array of padding elements used for the tensor dimensions before they start. Always ordered as BFYX.|
-| `<TENSOR>_LOWER_PADDING_SIZE` | The size of the `<TENSOR>_LOWER_PADDING` array  |
-| `<TENSOR>_UPPER_PADDING`   | An array of padding elements used for the tensor dimensions after they end. Always ordered as BFYX. |
-| `<TENSOR>_UPPER_PADDING_SIZE`  | The size of the `<TENSOR>_UPPER_PADDING` array |
-| `<TENSOR>_PITCHES` | The number of elements between adjacent elements in each dimension. Always ordered as BFYX.|
-| `<TENSOR>_PITCHES_SIZE`| The size of the `<TENSOR>_PITCHES` array   |
-| `<TENSOR>_OFFSET`| The number of elements from the start of the tensor to the first valid element, bypassing the lower padding.  |
-All `<TENSOR>` values are automatically defined for every tensor
-bound to this operation, such as `INPUT0`, `INPUT1`, and `OUTPUT0`, as shown
-in the following example:
-
-```c
-#define INPUT0_DIMS_SIZE 4
-#define INPUT0_DIMS (int []){ 1,96,55,55, }
-```
-
-## Example Kernel<a name="example-kernel"></a>
-
-```c
-#pragma OPENCL EXTENSION cl_khr_fp16 : enable
-__kernel void example_relu_kernel(
-    const __global INPUT0_TYPE*  input0,
-          __global OUTPUT0_TYPE* output)
-{
-    const uint idx  = get_global_id(0);
-    const uint idy  = get_global_id(1);
-    const uint idbf = get_global_id(2);//batches*features, as OpenCL supports 3D nd-ranges only
-    const uint feature = idbf%OUTPUT0_DIMS[1];
-    const uint batch   = idbf/OUTPUT0_DIMS[1];
-    //notice that pitches are in elements, not in bytes!
-    const uint in_id  = batch*INPUT0_PITCHES[0] + feature*INPUT0_PITCHES[1]   + idy*INPUT0_PITCHES[2]  + idx*INPUT0_PITCHES[3]  + INPUT0_OFFSET;
-    const uint out_id = batch*OUTPUT0_PITCHES[0] + feature*OUTPUT0_PITCHES[1]  + idy*OUTPUT0_PITCHES[2]  + idx*OUTPUT0_PITCHES[3]  + OUTPUT0_OFFSET;
-
-    INPUT0_TYPE value = input0[in_id];
-    //neg_slope (which is non-zero for leaky ReLU) is put automatically as #define, refer to the config xml
-    output[out_id] = value < 0 ? value * neg_slope : value;
-}
-```
-
-> **NOTE**: As described in the previous section, all items like
-> `INPUT0_TYPE` are actually defined as OpenCL (pre-)compiler inputs by
-> the Inference Engine for efficiency reasons. See [Debugging
-> Tips](#debugging-tips) for information on debugging the results.
-
-> **NOTE**: Several GPU-targeted kernels are also added to the binaries upon compilation of samples
-> so that the sample application can easily load them.
-> Refer to the `cldnn_global_custom_kernels` folder in the GPU plugin installation directory.
-
-## Debugging Tips<a name="debugging-tips"></a>
-
-* **Using `printf` in the OpenCL™ Kernels**.
-To debug specific values, you can use `printf` in your kernels.
-However, be careful not to output excessively, which
-could generate too much data. The `printf` output is buffered, so
-your output can be truncated to fit the buffer. Also, because of
-buffering, you get the entire buffer of output only when the
-execution ends.<br>
-
-For more information, refer to the [printf
-Function](https://www.khronos.org/registry/OpenCL/sdk/1.2/docs/man/xhtml/printfFunction.html).
--- a/docs/OV_Runtime_UG/Extensibility_DG/Intro.md
+++ b/docs/OV_Runtime_UG/Extensibility_DG/Intro.md
@@ -1,60 +0,0 @@
-# Inference Engine Extensibility Mechanism {#openvino_docs_IE_DG_Extensibility_DG_Intro}
-
-@sphinxdirective
-
-.. toctree::
-   :maxdepth: 1
-   :hidden:
-   
-   openvino_docs_IE_DG_Extensibility_DG_AddingNGraphOps
-   openvino_docs_IE_DG_Extensibility_DG_Custom_ONNX_Ops
-   CPU Kernels Extensibility <openvino_docs_IE_DG_Extensibility_DG_CPU_Kernel>
-   GPU Kernels Extensibility <openvino_docs_IE_DG_Extensibility_DG_GPU_Kernel>
-   VPU Kernels Extensibility <openvino_docs_IE_DG_Extensibility_DG_VPU_Kernel>
-   openvino_docs_IE_DG_Extensibility_DG_Extension
-   openvino_docs_IE_DG_Extensibility_DG_Building
-
-@endsphinxdirective
-
-If your model contains operations not normally supported by OpenVINO, the Inference Engine Extensibility API lets you add support for those custom operations in a library containing custom nGraph operation sets, corresponding extensions to the Model Optimizer, and a device plugin extension. See the overview in the [Custom Operations Guide](../../HOWTO/Custom_Layers_Guide.md) to learn how these work together.
-
-To load the Extensibility library to the `InferenceEngine::Core` object, use the `InferenceEngine::Core::AddExtension` method.
-
-## Inference Engine Extension Library
-
-An Inference Engine Extension dynamic library contains the following components:
-
- * [Extension Library](Extension.md):
-    - Contains custom operation sets
-    - Provides CPU implementations for custom operations
- * [Custom nGraph Operation](AddingNGraphOps.md):
-    - Enables the use of `InferenceEngine::Core::ReadNetwork` to read Intermediate Representation (IR) with unsupported
-    operations
-    - Enables the creation of `ngraph::Function` with unsupported operations
-    - Provides a shape inference mechanism for custom operations
-
-> **NOTE**: This documentation is written based on the [Template extension](https://github.com/openvinotoolkit/openvino/tree/master/docs/template_extension), which demonstrates extension development details. You can review the complete code, which is fully compilable and up-to-date, to see how it works.
-
-## Execution Kernels
-
-The Inference Engine workflow involves the creation of custom kernels and either custom or existing operations.
-
-An _operation_ is a network building block implemented in the training framework, for example, `Convolution` in Caffe*.
-A _kernel_ is defined as the corresponding implementation in the Inference Engine.
-
-Refer to the [Model Optimizer Extensibility](../../MO_DG/prepare_model/customize_model_optimizer/Customize_Model_Optimizer.md)
-for details on how a mapping between framework operations and Inference Engine kernels is registered.
-
-In short, you can plug your own kernel implementations into the Inference Engine and map them to the operations in the original framework.
-
-The following pages describe how to integrate custom _kernels_ into the Inference Engine:
-
- * [Introduction to development of custom CPU kernels](CPU_Kernel.md)
- * [Introduction to development of custom GPU kernels](GPU_Kernel.md)
- * [Introduction to development of custom VPU kernels](VPU_Kernel.md)
-
-## See Also
-
-* [Build an extension library using CMake*](Building.md)
-* [Using Inference Engine Samples](../Samples_Overview.md)
-* [Hello Shape Infer SSD sample](../../../samples/cpp/hello_reshape_ssd/README.md)
--- a/docs/OV_Runtime_UG/Extensibility_DG/VPU_Kernel.md
+++ b/docs/OV_Runtime_UG/Extensibility_DG/VPU_Kernel.md
@@ -1,682 +0,0 @@
-# How to Implement Custom Layers for VPU (Intel® Neural Compute Stick 2) {#openvino_docs_IE_DG_Extensibility_DG_VPU_Kernel}
-
-To enable operations not supported by OpenVINO™ out of the box, you need a custom extension for Model Optimizer, a custom nGraph operation set, and a custom kernel for the device you will target. This page describes custom kernel support for one of the VPUs, the Intel® Neural Compute Stick 2 device, which uses the MYRIAD device plugin.
-
-> **NOTES:** 
-> * OpenCL\* custom layer support is available in the preview mode.
-> * This section assumes you are familiar with developing kernels using OpenCL.
-
-To customize your topology with an OpenCL layer, carry out the tasks described on this page:
-
-1. Write and compile your OpenCL code with the standalone offline OpenCL compiler (`clc`).
-2. Write a configuration file to bind the OpenCL kernel to the topology file (`.xml`) of the model IR.
-3. Pass the configuration file to the Inference Engine with the model IR.
-
-## Compile OpenCL code for VPU (Intel® Neural Compute Stick 2)
-
-> **NOTE**: OpenCL compiler, targeting Intel® Neural Compute Stick 2 for the SHAVE* processor only, is redistributed with OpenVINO.
-OpenCL support is provided by ComputeAorta* and is distributed under a license agreement between Intel® and Codeplay* Software Ltd.
-
-The OpenCL toolchain for the Intel® Neural Compute Stick 2 supports offline compilation only, so first compile OpenCL C code using the standalone `clc` compiler. You can find the compiler binary at `<INSTALL_DIR>/tools/cl_compiler`.
-
-> **NOTE**: By design, custom OpenCL layers support any OpenCL kernels written assuming OpenCL version 1.2. It also supports half float extension and is optimized for this type, because it is a native type for Intel® Movidius™ VPUs.
-
-1. Prior to running a compilation, make sure that the following variables are set:
-   * `SHAVE_MA2X8XLIBS_DIR=<INSTALL_DIR>/tools/cl_compiler/lib/`
-   * `SHAVE_LDSCRIPT_DIR=<INSTALL_DIR>/tools/cl_compiler/ldscripts/`
-   * `SHAVE_MYRIAD_LD_DIR=<INSTALL_DIR>/tools/cl_compiler/bin/`
-   * `SHAVE_MOVIASM_DIR=<INSTALL_DIR>/tools/cl_compiler/bin/`
-2. Run the compilation with the command below. You should use `--strip-binary-header` to make an OpenCL runtime-agnostic binary runnable with the Inference Engine.
-   ```bash
-   cd <INSTALL_DIR>/tools/cl_compiler/bin
-   ./clc --strip-binary-header custom_layer.cl -o custom_layer.bin
-   ```
-
-## Write a Configuration File
-
-To tie the topology IR to the layer you customize, prepare a configuration file that lets the Inference Engine find the parameters for your kernel and that describes the execution work grid.
-For example, consider the following OpenCL kernel signature:
-```cpp
-__kernel void reorg_nhwc(__global const half *src, __global half *out, int w, int h, int c, int stride);
-```
-A configuration file for this kernel might be the following:
-```xml
-<CustomLayer name="ReorgYolo" type="MVCL" version="1">
-   <Kernel entry="reorg_nhwc">
-       <Source filename="reorg.bin"/>
-   </Kernel>
-   <Parameters>
-       <Tensor arg-name="src"    type="input"  port-index="0"                format="BYXF"/>
-       <Tensor arg-name="out"    type="output" port-index="0"                format="BYXF"/>
-       <Scalar arg-name="w"      type="int"    port-index="0" source="I.X"                />
-       <Scalar arg-name="h"      type="int"    port-index="0" source="I.Y"                />
-       <Scalar arg-name="c"      type="int"    port-index="0" source="I.F"                />
-       <Scalar arg-name="stride" type="int"                   source="stride"             />
-   </Parameters>
-   <WorkSizes dim="input,0" global="(Y+7)/8*8,1,1" local="8,1,1"/>
-</CustomLayer>
-```
-Each custom layer is described with the `CustomLayer` node. It has the following nodes and attributes:
-  - Root node `CustomLayer` contains the following attributes:
-    - `name` – (Required) The name of the Inference Engine layer to bind the kernel with.
-    - `type` and `version` – (Required) Reserved for future use. Set them to `MVCL` and `1` respectively.
-    - `max-shaves` – (Optional) The maximum number of SHAVE cores that should be dedicated for the layer. It is useful for debugging concurrency issues or for saving resources when a memory-bound kernel does not scale well with the number of cores, so that more resources are left for the rest of the topology.
-  - Sub-node `Kernel` must contain the following attributes:
-    - `entry` – The name of your kernel function as you defined it in a source file. In the example above, it is `reorg_nhwc`.
-    - Node `Source` must contain the following attributes:
-      - `filename` – The path to a compiled binary relative to the XML configuration file.
-  - Sub-node `Parameters` – Describes parameters bindings. For more information, see the description below.
-  - Sub-node `WorkSizes` – Describes local and global work group sizes and the source for dimension deduction as a pair `direction,port`. In the example above, the work group is described relative to the dimensions of the input tensor that comes through port 0 in the IR. `global` and `local` work group configurations support any simple math expressions with +,-,\*,/, and () from `B`(batch), `Y`(height), `X`(width) and `F`(channels).
-  - Sub-node `Where` – Lets you customize bindings with the `key="value"` attribute. For example, to substitute only 3x3 convolutions, write `<Where kernel="3,3"/>` in the binding xml.
-
-  A parameter description supports `Tensor` nodes (of type `input`, `output`, `input_buffer`, `output_buffer`, or `data`), `Scalar` nodes, and `Data` nodes, and has the following format:
-  - Each `Tensor` node of `input` or `output` type must contain the following attributes:
-    - `arg-name` – The name of a kernel parameter in the kernel signature.
-    - `type` – Node type: `input` or `output` as specified in the IR.
-    - `port-index` – The number of the input/output port as specified in the IR.
-    - `format` – The channel order in the tensor. Optional conversion layers are generated if the custom layer format is not compatible with formats of neighboring layers. `BFYX`, `BYXF`, and `ANY` formats are supported currently.
-  - Each `Tensor` node of `input_buffer` or `output_buffer` type must contain the following attributes:
-    - `arg-name` – The name of a kernel parameter in the kernel signature.
-    - `type` – Node type: `input_buffer` or `output_buffer`. Use the appropriate type to bind multiple kernels that correspond to different stages of the same layer.
-    - `port-index` – The unique identifier to bind by.
-    - `dim` – The dim source with the same `direction,port` format used for `WorkSizes` bindings.
-    - `size` – The number of bytes needed. The current expression syntax supports only expressions over the dimensions of the selected input/output tensor or constants, and might be extended in the future.
-
-    Here is an example of multi-stage MVN layer binding:
-  ```xml
-  <CustomLayer name="MVN" stage="0" type="MVCL" version="1">
-      <Kernel entry="reduction_mean">
-          <Source filename="mvn.bin"/>
-      </Kernel>
-      <Parameters>
-          <Tensor arg-name="src"                type="input"         port-index="0"               format="BFYX"/>
-          <Tensor arg-name="mean"               type="output_buffer" port-index="0" dim="output,0" size="Y*F*4"/>
-          <Tensor arg-name="variance"           type="output_buffer" port-index="1" dim="output,0" size="Y*F*4"/>
-          <!--other parameters  -->
-      </Parameters>
-      <WorkSizes dim="output,0" global="((Y+7)/8)*8,F,1" local="8,1,1"/>
-  </CustomLayer>
-  <CustomLayer name="MVN" stage="1" type="MVCL" version="1">
-      <Kernel entry="mvn_scale">
-          <Source filename="mvn_scale_changed_orded.bin"/>
-      </Kernel>
-      <Parameters>
-          <Tensor arg-name="src_data"           type="input"        port-index="0"               format="BFYX"/>
-          <Tensor arg-name="dst_data"           type="output"       port-index="0"               format="BFYX"/>
-          <Tensor arg-name="mean_part"          type="input_buffer" port-index="0" dim="output,0" size="Y*F*4"/>
-          <Tensor arg-name="power_mean"         type="input_buffer" port-index="1" dim="output,0" size="Y*F*4"/>
-          <!--other parameters  -->
-      </Parameters>
-      <WorkSizes dim="output,0" global="((Y+7)/8)*8,F,1" local="8,1,1"/>
-  </CustomLayer>
-  ```
-  - Each `Tensor` node that has the type `data` must contain the following attributes:
-   - `source` – The name of the blob as it appears in the IR. A typical example is `weights` for convolution.
-   - `format` – Specifies the channel order in the tensor. Optional conversion layers are generated if the custom layer format is not compatible with formats of neighboring layers.
-  ```xml
-  <CustomLayer name="BinaryConvolution" type="MVCL" version="1">
-    <Kernel entry="binary_convolution">
-        <Source filename="binary_layers.bin"/>
-    </Kernel>
-    <Parameters>
-        <Tensor arg-name="src_data"      type="input"   port-index="0"                      format="BFYX"/>
-        <Data   arg-name="weights_data"  type="data"                     source="weights"   format="ANY"/>
-        <Tensor arg-name="dst_data"      type="output"  port-index="0"                      format="BFYX"/>
-        <!--other parameters  -->
-    </Parameters>
-    <WorkSizes dim="output,0" global="X,Y,F" local="1,1,1"/>
-  </CustomLayer>
-  ```
-  - Each `Scalar` node must contain the following attributes:
-   - `arg-name` – The name of a kernel parameter in the kernel signature.
-   - `type` – `int` or `float` value. It is used for correct argument extraction from IR parameters.
-   - `source` – Contains the name of the parameter in the IR file or input/output (`I`/`O`, `In`/`On`, where `n` is a port number)
-   followed by dimension `B`(batch), `Y`(height), `X`(width), or `F`(channels).
-
-  - Each `Data` node must contain the following attributes:
-    - `arg-name` – The name of a kernel parameter in the kernel signature.
-    - `type` – Node type. Currently, `local_data` is the only supported value, which defines a buffer allocated in fast local on-chip memory. It is limited to 100KB for all `__local` and
-    `__private` arrays defined inside the kernel as well as all `__local` parameters passed to the kernel. Note that a manual-DMA extension requires double buffering.
-    If the custom layer is detected to run out of local memory, the inference fails.
-    - `dim` – The dim source with the same `direction,port` format used for `WorkSizes` bindings.
-    - `size` – The number of bytes needed. The current expression syntax supports only expressions over the dimensions of the selected input/output tensor or constants, and may be extended in the future.
-  The example binding below illustrates a kernel with two local buffers passed to the kernel.
-  ```xml
-  <CustomLayer name="GRN" type="MVCL" version="1">
-      <Kernel entry="grn_NCHW">
-          <Source filename="grn.bin"/>
-      </Kernel>
-      <Parameters>
-          <Tensor arg-name="src_data" type="input"         port-index="0"                  format="BFYX"/>
-          <Tensor arg-name="dst_data" type="output"        port-index="0"                  format="BFYX"/>
-          <Data   arg-name="src"      type="local_data"                      dim="input,0" size="X*F*2" />
-          <Data   arg-name="dst"      type="local_data"                      dim="input,0" size="X*F*2" />
-          <Scalar arg-name="C"        type="int"           port-index="0"    source="I.F"               />
-          <Scalar arg-name="bias"     type="float"                           source="bias"              />
-      </Parameters>
-      <WorkSizes dim="input,0" global="X,Y,1" local="X,1,1"/>
-  </CustomLayer>
-```
-
-## Pass Configuration File to Inference Runtime
-
-> **NOTE**: If both native and custom layer implementations are present, the custom kernel has a priority over the native one.
-
-Before loading the network that features the custom layers, provide a separate configuration file and load it using the InferenceEngine::Core::SetConfig() method with the PluginConfigParams::KEY_CONFIG_FILE key and the configuration file name as a value:
-```cpp
-InferenceEngine::Core core;
-// Load custom layers
-core.SetConfig({ { InferenceEngine::PluginConfigParams::KEY_CONFIG_FILE, "<path to the xml file>" } }, "MYRIAD");
-```
-Optionally, set the path to a custom layers description by passing the `VPU_CUSTOM_LAYERS` key with the `/path/to/your/customLayers.xml` value
-as a network configuration:
-```cpp
-InferenceEngine::Core core;
-std::map<std::string, std::string> networkConfig;
-networkConfig["VPU_CUSTOM_LAYERS"] = "/path/to/your/customLayers.xml";
-// Load custom layers in network config
-auto exeNetwork = core.LoadNetwork(cnnNetwork, "MYRIAD", networkConfig);
-```
-
-## Optimizing Kernels with OpenCL for VPU (Intel® Neural Compute Stick 2)
-
-This section provides optimization guidelines on writing custom layers with OpenCL for VPU devices. Knowledge of the general OpenCL
-programming model and the OpenCL kernel language is assumed and is not covered in this section. The OpenCL model mapping to VPU is described in the table below.
-
-| OpenCL Model  | VPU Mapping|
-|-----|----|
-| Device code | Executed on SHAVE cores    |
-| Private memory | Mapped to CMX internal memory, limited to 100KB per work group, valid only while the work group is executed |
-| Local memory   | Mapped to CMX internal memory, limited to 100KB per work group, valid only while the work group is executed |
-| Global memory  | Mapped to DDR, used to pass execution preserved parameters for inputs, outputs, and blobs                |
-| Work group     | Executed on a single SHAVE core iterating over multiple work items      |
-
-Note that the OpenCL specification does not define the work group execution order. This means that it is your
-responsibility to ensure that race conditions among work groups are not introduced. The custom layer runtime splits the work grid
-evenly among the available compute resources and executes the work groups in an arbitrary order. This static scheduling approach works best if the load is evenly spread out across work groups, which is a typical case for Deep Learning kernels. The following guidelines are recommended for work group partitioning:
-
-1. Split work evenly across work groups.
-2. Adjust work group granularity to maintain equal workload for all compute cores.
-3. Set the maximum number of cores using the `max-shaves` attribute for the `CustomLayer` node. This keeps more resources for the rest of the topology. It is also useful if the kernel scalability has reached its limits, which may happen while optimizing memory-bound kernels or kernels with poor parallelization.
-4. Try an alternate data layout (`BFYX`/`BYXF`) for the kernel if it improves work group partitioning or data access patterns.
-Consider not just the speedup of the specific layer, but the performance of the full topology, because data conversion layers are automatically inserted
-as appropriate.
-
-The offline OpenCL compiler (`clc`) features automatic vectorization over `get_global_id(0)` usage if uniform access is detected.
-For example, the kernel below could be automatically vectorized:
-```cpp
-__kernel void cvtf32f16(__global float* restrict inImage, __global half*  restrict outImage,
-                        float   scale, float   bias)
-{
-    int idx = get_global_id(0) + get_global_id(1) * get_global_size(0) + get_global_id(2) * get_global_size(0) * get_global_size(1);
-    outImage[idx] = convert_half(inImage[idx]*scale+bias);
-}
-```
-However, this work-group based vectorizer (WGV) conflicts with the default LLVM vectorizer based on superword level parallelism
-(SLP) for the current compiler version. Manual vectorization is recommended to provide the best performance for non-uniform code
-patterns. WGV works if and only if vector types are not used in the code.
-
-Here is a short list of optimization tips:
-
-1. Help the auto-vectorizer by marking kernel pointer parameters as non-aliasing with `restrict` where possible.
-  - This can give a performance boost, especially for kernels with unrolling, like `ocl_grn` from the example below.
-  - Place `restrict` markers for kernels with manually vectorized code. In the `ocl_grn` kernel below, the unrolled version without `restrict` is up to 20% slower than the optimal one, which combines unrolling and `restrict`.
-2. Put `#pragma unroll N` in your loop header. The compiler does not trigger unrolling by default, so it is your responsibility to
-annotate the code with pragmas as appropriate. The `ocl_grn` version with `#pragma unroll 4` is up to 50% faster, most of which comes from unrolling the first loop, because LLVM, in general, is better at scheduling 3-stage loops (load-compute-store), while the first loop
- `variance += (float)(src_data[c*H*W + y*W + x] * src_data[c*H*W + y*W + x]);` is only 2-stage (load-compute). Pay
-attention to unrolling such cases first. The unrolling factor is loop-dependent. Choose the smallest number that
-still improves performance as an optimum between the kernel size and execution speed. For this specific kernel, changing the unroll factor from `4` to `6` results in the same performance, so an unrolling factor of 4 is the optimum. For Intel® Neural Compute Stick 2, unrolling is combined with the automatic software pipelining for load, store, and compute stages:
-```cpp
-__kernel void ocl_grn(__global const half* restrict src_data, __global half* restrict dst_data, int C, float bias)
-{
-    int x = get_global_id(0);
-    int W = get_global_size(0);
-    int y = get_global_id(1);
-    int H = get_global_size(1);
-
-    float variance = bias + 1e-9f;
-
-    #pragma unroll 4
-    for (int c = 0; c < C; c++)
-        variance += (float)(src_data[c*H*W + y*W + x] * src_data[c*H*W + y*W + x]);
-
-    variance = 1.f / native_sqrt(variance);
-
-    #pragma unroll 4
-    for (int c = 0; c < C; c++)
-        dst_data[c*H*W + y*W + x] = (half)((float)src_data[c*H*W + y*W + x] * variance);
-}
-```
-To check the efficiency of WGV, you can compare performance of the kernel above with the kernel below, which is manually vectorized over width:
-```cpp
-__kernel void ocl_grn_line(__global const half* restrict src_data,  __global half* restrict dst_data, int C, int W, float bias)
-{
-    int y   = get_global_id(1);
-    int H   = get_global_size(1);
-
-    for (int x = 0; x < W/8; x++)
-    {
-        float8 variance = (float8)(bias+1e-9f);
-
-        #pragma unroll 4
-        for (int c = 0; c < C; c++)
-        {
-            __global const half8* restrict src_line = ((__global const half8 * restrict)(src_data + c*H*W + y*W));
-            half8 sh = src_line[x];
-            variance += convert_float8(sh*sh);
-        }
-
-        variance = 1.f/native_sqrt(variance);
-
-        #pragma unroll 4
-        for (int c = 0; c < C; c++)
-        {
-            __global const half8* restrict src_line = ((__global const half8 * restrict)(src_data + c*H*W + y*W));
-            __global       half8* restrict dst_line = ((__global       half8 * restrict)(dst_data + c*H*W + y*W));
-
-            dst_line[x] = convert_half8(convert_float8(src_line[x])*variance);
-        }
-    }
-    for (int x = W/8*8; x < W; x++)
-    {
-        float variance = bias+1e-9f;
-        #pragma unroll 4
-        for (int c = 0; c < C; c++)
-            variance += (float)(src_data[c*H*W + y*W + x]*src_data[c*H*W + y*W + x]);
-
-        variance = 1.f/native_sqrt(variance);
-
-        #pragma unroll 4
-        for (int c = 0; c < C; c++)
-            dst_data[c*H*W + y*W + x] = (float)src_data[c*H*W + y*W + x]*variance;
-    }
-}
-```
-Both versions perform the same, but the second one has more complex code.
-
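When comparing kernel variants such as `ocl_grn` and `ocl_grn_line`, a host-side reference can help confirm that they still compute the same values. The following is a minimal sketch, assuming a channel-major `C x H x W` tensor indexed as `c*H*W + y*W + x` (as in the kernels above) and using `float` in place of `half`. The name `grn_reference` is hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Host-side reference for the arithmetic of the `ocl_grn` kernel.
// Assumption: float storage instead of half, so results can differ from
// the device output within half precision.
inline std::vector<float> grn_reference(const std::vector<float>& src,
                                        int C, int H, int W, float bias) {
    assert(src.size() == static_cast<std::size_t>(C) * H * W);
    std::vector<float> dst(src.size());
    for (int y = 0; y < H; ++y) {
        for (int x = 0; x < W; ++x) {
            // First loop of the kernel: accumulate squares over channels.
            float variance = bias + 1e-9f;
            for (int c = 0; c < C; ++c) {
                const float v = src[c * H * W + y * W + x];
                variance += v * v;
            }
            // Second loop of the kernel: scale every channel value.
            const float scale = 1.0f / std::sqrt(variance);
            for (int c = 0; c < C; ++c)
                dst[c * H * W + y * W + x] = src[c * H * W + y * W + x] * scale;
        }
    }
    return dst;
}
```

For a single pixel with channel values 3 and 4 and a zero bias, the reference returns approximately 0.6 and 0.8.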
-3. If it is easy to predict the work group size, you can also use the `reqd_work_group_size` kernel attribute to ask the compiler
-to unroll the code up to the local size of the work group. Note that if the kernel is actually executed with a
-different work group configuration, the result is undefined.
-
-4. Prefer to use the `half` compute if it keeps reasonable accuracy. 16-bit float is a native type for Intel® Neural Compute Stick 2, and most of the `half_*` functions are mapped to a single hardware instruction.
-Use the standard `native_*` functions for the other types.
-
-5. Prefer to use the `convert_half` function over `vstore_half` if conversion from 32-bit float to 16-bit float is required. `convert_half` is mapped to a single hardware instruction. For the `cvtf32f16` kernel above, the line `outImage[idx] = convert_half(inImage[idx]*scale+bais);` is eight times faster than the code with `vstore_half`.
-
-6. Mind early exits. An early exit can be extremely costly for the current version of the `clc` compiler due to conflicts with the
-auto-vectorizer. The generic advice is to set the local size in the `x` dimension equal to the input and/or output width.
-If it is impossible to define a work grid that exactly matches the inputs and/or outputs to eliminate checks such as
-`if (get_global_id(0) >= width) return`, use a line-wise kernel variant with manual vectorization.
-The kernel example below demonstrates the impact of early exits on kernel performance.
-   ```cpp
-   // Initial version
-   __kernel void reorg(const __global half* restrict src, __global half* restrict out, int stride)
-   {
-     int w = get_global_id(0);
-     int W = get_global_size(0);
-
-     int h = get_global_id(1);
-     int H = get_global_size(1);
-
-     int c = get_global_id(2);
-     int C = get_global_size(2);
-
-     int C2 = C/(stride*stride);
-     int offset = c / C2;
-     int c2 = c - C2 * offset;
-
-     int H2 = H*stride;
-     int W2 = W*stride;
-
-     int h2 = h*stride + offset / stride;
-     int w2 = w*stride + offset - stride * (offset / stride);
-
-     out[W*H*c + W*h + w] = src[W2*H2*c2 + W2*h2 + w2];
-   }
-   ```
-This `reorg` kernel is auto-vectorizable, but the input for the YOLO v2 topology is `NCHW=<1,64,26,26>`, and its width is not a multiple of the vector width, which is `8` for the `half` data type. As a result, the Inference Engine does not select the auto-vectorized kernel.
-To compare the performance of the auto-vectorized and scalar versions of the kernel, change the input size to `NCHW=<1,64,26,32>`. This enables the auto-vectorized version to be selected by the Inference Engine and can give you about a 30% uplift.
-Since the auto-vectorized version is faster, it makes sense to enable it for the YOLO v2 topology input size by setting the local size to a multiple of the vector width, for example, 32, and adjusting the global sizes accordingly. As a result, the execution work grid exceeds the actual input dimensions, so out-of-bound checks should be inserted. See the updated kernel version below:
-   ```cpp
-   // Version with out-of-bound checks added
-   __kernel void reorg(const __global half* restrict src, __global half* restrict out, int W, int stride)
-   {
-     int w = get_global_id(0);
-     w = min(w, W-1);
-
-     int h = get_global_id(1);
-     int H = get_global_size(1);
-
-     int c = get_global_id(2);
-     int C = get_global_size(2);
-
-     int C2 = C/(stride*stride);
-     int offset = c / C2;
-     int c2 = c - C2 * offset;
-
-     int H2 = H*stride;
-     int W2 = W*stride;
-
-     int h2 = h*stride + offset / stride;
-     int w2 = w*stride + offset - stride * (offset / stride);
-
-     out[W*H*c + W*h + w] = src[W2*H2*c2 + W2*h2 + w2];
-   }
-   ```
-This code performs the same as the initial (scalar) kernel above due to branching overhead. If you replace the min/max expression `w = min(w, W-1);` with `if (w >= W) return;`, the runtime increases up to 2x compared to the code without branching (initial version).<br>
-If branching is inevitable for your element-based kernel, it is recommended to change the scheme to line-based. See the kernel variant below:
-```cpp
-// Line-wise version
-__kernel void reorg(const __global half* restrict src, __global half* restrict out, int H, int W, int stride)
-{
-    int h = min((int)get_global_id(0), H-1);
-
-    int c = get_global_id(1);
-    int C = get_global_size(1);
-    int C2 = C/(stride*stride);
-    int offset = c / C2;
-    int c2 = c - C2 * offset;
-
-    int H2 = H*stride;
-    int W2 = W*stride;
-
-    for (int w = 0; w < W; ++w)
-    {
-        int h2 = h*stride + offset / stride;
-        int w2 = w*stride + offset - stride * (offset / stride);
-
-        out[W*H*c + W*h + w] = src[W2*H2*c2 + W2*h2 + w2];
-    }
-}
-```
-This decreases the execution time by up to 40% compared to the best performing vectorized kernel without early exits (initial version).
-7. Reuse computations among work items by using line-based kernels or sharing values through `__local` memory.
-8. Improve data access locality. Most custom kernels are memory bound, while convolution and fully connected layers are hardware-implemented. The code below demonstrates a further optimized version of the `reorg` kernel unrolled by `stride`:
-   ```cpp
-   // Unrolled line-wise version
-   __kernel void reorg_unrolled_by_stride(const __global half* restrict src, __global half* restrict dst,
-                                          int H, int W, int stride)
-   {
-     int h = min((int)get_global_id(0), H-1);
-
-     int c2 = get_global_id(1);
-     int C2 = get_global_size(1);
-     int C = C2*stride*stride;
-
-     int H2 = H*stride;
-     int W2 = W*stride;
-
-     for (int stride_y = 0; stride_y < stride; stride_y++)
-       for (int stride_x = 0; stride_x < stride; stride_x++)
-         for (int w2 = 0, w = 0; w < W; w2 += stride, w++)
-           dst[W*H*C2*(stride_y*stride+stride_x) + W*H*c2 + W*h + w] = src[W2*H2*c2 + W2*h*stride + W2*stride_y + w2 + stride_x];
-   }
-   ```
-The `src` data is loaded only once in this case. As a result, the cycle count drops by up to 45% compared to the line-wise version.
-
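To confirm that the unrolled line-wise `reorg` variant still computes the same mapping as the initial element-wise kernel, the two index schemes can be compared on the host. The sketch below copies the index arithmetic from the kernels above into plain C++ loops. The function names are hypothetical, and `int` stands in for `half` so that values can be compared exactly.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Element-wise mapping, mirroring the initial `reorg` kernel.
// C, H, W are the output dimensions (the kernel's global sizes).
inline std::vector<int> reorg_elementwise(const std::vector<int>& src,
                                          int C, int H, int W, int stride) {
    assert(C % (stride * stride) == 0);
    assert(src.size() == static_cast<std::size_t>(C) * H * W);
    const int C2 = C / (stride * stride);
    const int H2 = H * stride, W2 = W * stride;
    std::vector<int> out(src.size());
    for (int c = 0; c < C; ++c)
        for (int h = 0; h < H; ++h)
            for (int w = 0; w < W; ++w) {
                const int offset = c / C2;
                const int c2 = c - C2 * offset;
                const int h2 = h * stride + offset / stride;
                const int w2 = w * stride + offset % stride;
                out[W * H * c + W * h + w] = src[W2 * H2 * c2 + W2 * h2 + w2];
            }
    return out;
}

// Line-wise mapping unrolled by stride, mirroring `reorg_unrolled_by_stride`.
inline std::vector<int> reorg_unrolled(const std::vector<int>& src,
                                       int C, int H, int W, int stride) {
    assert(C % (stride * stride) == 0);
    assert(src.size() == static_cast<std::size_t>(C) * H * W);
    const int C2 = C / (stride * stride);
    const int H2 = H * stride, W2 = W * stride;
    std::vector<int> dst(src.size());
    for (int h = 0; h < H; ++h)
        for (int c2 = 0; c2 < C2; ++c2)
            for (int sy = 0; sy < stride; ++sy)
                for (int sx = 0; sx < stride; ++sx)
                    for (int w2 = 0, w = 0; w < W; w2 += stride, ++w)
                        dst[W * H * C2 * (sy * stride + sx) + W * H * c2 + W * h + w] =
                            src[W2 * H2 * c2 + W2 * h * stride + W2 * sy + w2 + sx];
    return dst;
}
```

Filling `src` with distinct values (for example, its own indices) and checking that both functions return identical vectors verifies the index arithmetic only. It says nothing about device performance.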
-9. Copy data from `__global` to `__local` or `__private` memory if the data is accessed more than once. Access to
-`__global` memory is orders of magnitude slower than access to `__local`/`__private` memory due to the statically scheduled pipeline, which
-stalls completely on memory access without any prefetch. The same recommendation applies to scalar loads/stores
-from/to a `__global` pointer, since work-group copying can be done in a vector fashion.
-
-10. Use a manual DMA extension. Local (on-chip) memory throughput is up to 24x higher than DDR throughput. Starting from OpenVINO™ 2020.1, VPU OpenCL features a manual-DMA kernel extension to copy the sub-tensor used by a work group into local memory and perform the computation without DDR involved. Here is a simple GRN kernel implementation that runs over DDR. The local size is in the form (width of the input tensor, 1, 1) to define a work group large enough to get the code automatically vectorized and unrolled, while the global size is (width of the input tensor, height of the input tensor, 1):
-   ```cpp
-   __kernel void grn_NCHW(
-     __global const half* restrict src_data,
-     __global       half* restrict dst_data,
-     int C,
-     float bias)
-   {
-     float variance = bias + 1e-9f;
-
-     #pragma unroll 4
-     for (int c = 0; c < C; c++)
-     {
-       float val = (float) src_data[c*get_global_size(1)*get_global_size(0) + get_global_id(1)*get_global_size(0) + get_global_id(0)];
-       variance += val*val;
-     }
-
-     half hvariance = (half)(native_rsqrt((half)(variance/16.f))*0.25f);
-
-     #pragma unroll 4
-     for (int c = 0; c < C; c++)
-     {
-       dst_data[c*get_global_size(1)*get_global_size(0) + get_global_id(1)*get_global_size(0) + get_global_id(0)]
-       = src_data[c*get_global_size(1)*get_global_size(0) + get_global_id(1)*get_global_size(0) + get_global_id(0)] * hvariance;
-     }
-   }
-   ```
-   
-This kernel can be rewritten to introduce the special data binding intrinsics `__dma_preload` and `__dma_postwrite`. This means that instead of one kernel, a group of three kernels should be implemented: `kernelName`, `__dma_preload_kernelName`, and `__dma_postwrite_kernelName`. `__dma_preload_kernelName` for a particular work group `n` is guaranteed to be executed before the `n`-th work group itself, while `__dma_postwrite_kernelName` is guaranteed to be executed after the corresponding work group. You can define either of these functions, which are intended for copying data between `__global` and `__local` memory. The syntax requires an exact function signature match. The example below illustrates how to prepare your kernel for manual DMA.
-
-   ```cpp
-   __kernel void __dma_preload_grn_NCHW(
-     __global const half* restrict src,
-     __global       half* restrict dst,
-     __local        half* restrict local_src,
-     __local        half* restrict local_dst,
-     int C,
-     float bias)
-   {
-     // TODO: copy the required piece of the src tensor into local_src
-   }
-   
-   __kernel void __dma_postwrite_grn_NCHW(
-     __global const half* restrict src,
-     __global       half* restrict dst,
-     __local  const half* restrict local_src,
-     __local        half* restrict local_dst,
-     int C,
-     float bias)
-   {
-     // TODO: copy the computed piece of local_dst back into dst
-   }
-   
-   __kernel void grn_NCHW(
-     __global const half* restrict src_data,
-     __global       half* restrict dst_data,
-     __local        half* restrict src,
-     __local        half* restrict dst,
-     int C,
-     float bias)
-   {
-     // same as the example above
-   }
-   ``` 
-The GRN kernel operates on channel-major tensors to compute the average over the full channel range and then normalizes the input elements to produce the output.
-As a part of the manual DMA extension, a group of work group copy functions is introduced in addition to `async_work_group_copy`, which is also mapped to a DMA call.
-
-Here is the list of supported functions:
-```cpp
-// 2D sub-tensor copy
-event_t WorkGroupDmaCreateStrideTransaction(
-                const local T *src,
-                global T *dst,
-                size_t  src_width, // width of the line of source in bytes
-                size_t  dst_width, // width of the line of destination in bytes
-                size_t  src_stride, // stride between corresponding 2 consecutive lines of source in bytes
-                size_t  dst_stride, // stride between corresponding 2 consecutive lines of destination in bytes
-                size_t size, // total number of bytes loaded for all lines from source to destination
-                event_t  event) __OVERLOAD;
-
-
-event_t WorkGroupDmaCreateStrideTransaction(
-                const global T *src,
-                local T *dst,
-                size_t  src_width, // width of the line of source in bytes
-                size_t  dst_width, // width of the line of destination in bytes
-                size_t  src_stride, // stride between corresponding 2 consecutive lines of source in bytes
-                size_t  dst_stride, // stride between corresponding 2 consecutive lines of destination in bytes
-                size_t size, // total number of bytes loaded for all lines from source to destination
-                event_t  event) __OVERLOAD;
-
-// 3D sub-tensor copy
-event_t WorkGroupDmaCreate3DTransaction(
-                 const local T *src,
-                 global T *dst,
-                 size_t  src_width, // width of the line of source in bytes
-                 size_t  dst_width, // width of the line of destination in bytes
-                 size_t  src_stride, // stride between corresponding 2 consecutive lines of source in bytes
-                 size_t  dst_stride, // stride between corresponding 2 consecutive lines of destination in bytes
-                 size_t  num_planes, // number of planes to be copied
-                 size_t  src_plane_stride, // stride between corresponding 2 consecutive planes of source in bytes
-                 size_t  dst_plane_stride, // stride between corresponding 2 consecutive planes of destination in bytes
-                 size_t  size, // size of the loaded plane in bytes, analogous to the size in 2D case
-                 event_t  event) __OVERLOAD;
-
-event_t WorkGroupDmaCreate3DTransaction(
-                 const global T *src,
-                 local T *dst,
-                 size_t  src_width, // width of the line of source in bytes
-                 size_t  dst_width, // width of the line of destination in bytes
-                 size_t  src_stride, // stride between corresponding 2 consecutive lines of source in bytes
-                 size_t  dst_stride, // stride between corresponding 2 consecutive lines of destination in bytes
-                 size_t  num_planes, // number of planes to be copied
-                 size_t  src_plane_stride, // stride between corresponding 2 consecutive planes of source in bytes
-                 size_t  dst_plane_stride, // stride between corresponding 2 consecutive planes of destination in bytes
-                 size_t  size, // size of the loaded plane in bytes, analogous to the size in 2D case
-                 event_t  event) __OVERLOAD;
-```
-where `T` can be `uchar`, `char`, `short`, `ushort`, `int`, `uint`, `long`, `ulong`, `half` or `float`.
-
-A modified version of the GRN kernel could look as follows:
-```cpp
-__kernel void __dma_preload_grn_NCHW(
-    __global const half* restrict src,
-    __global       half* restrict dst,
-    __local        half* restrict local_src,
-    __local        half* restrict local_dst,
-    int C,
-    float bias)
-{
-    WorkGroupDmaCreate3DTransaction(
-        src + get_group_id(0)*get_local_size(0)
-            + get_group_id(1)*get_local_size(1)*get_global_size(0), // src
-        local_src, // dst
-        get_local_size(0) * sizeof(half), // src width
-        get_local_size(0) * sizeof(half), // dst width
-        get_global_size(0) * sizeof(half), // src stride
-        get_local_size(0) * sizeof(half), // dst stride
-        C, // num planes
-        get_global_size(0) * get_global_size(1) * sizeof(half), // src plane stride
-        get_local_size(0) * get_local_size(1) * sizeof(half), // dst plane stride
-        get_local_size(0) * get_local_size(1) * sizeof(half), // plane size
-        0);
-}
-
-__kernel void __dma_postwrite_grn_NCHW(
-    __global const half* restrict src,
-    __global       half* restrict dst,
-    __local  const half* restrict local_src,
-    __local        half* restrict local_dst,
-    int C,
-    float bias)
-{
-    WorkGroupDmaCreate3DTransaction(
-        local_dst, // src
-        dst + get_group_id(0)*get_local_size(0)
-            + get_group_id(1)*get_local_size(1)*get_global_size(0), // dst
-        get_local_size(0) * sizeof(half), // src width
-        get_local_size(0) * sizeof(half), // dst width
-        get_local_size(0) * sizeof(half), // src stride
-        get_global_size(0) * sizeof(half), // dst stride
-        C, // num planes
-        get_local_size(0) * get_local_size(1) * sizeof(half), // src plane stride
-        get_global_size(0) * get_global_size(1) * sizeof(half), // dst plane stride
-        get_local_size(0) * get_local_size(1) * sizeof(half), // plane size
-        0);
-}
-
-__kernel void grn_NCHW(
-    __global const half* restrict src_data,
-    __global       half* restrict dst_data,
-    __local        half* restrict src,
-    __local        half* restrict dst,
-    int C,
-    float bias)
-{
-    float variance = bias + 1e-9f;
-
-    #pragma unroll 8
-    for (int c = 0; c < C; c++)
-    {
-        float val = (float) src[c*get_local_size(1)*get_local_size(0) + get_local_id(1)*get_local_size(0) + get_local_id(0)];
-        variance += val*val;
-    }
-
-    half hvariance = (half)(native_rsqrt((half)(variance/16.f))*0.25f);
-
-    #pragma unroll 8
-    for (int c = 0; c < C; c++)
-    {
-        dst[c*get_local_size(1)*get_local_size(0) + get_local_id(1)*get_local_size(0) + get_local_id(0)]
-        = src[c*get_local_size(1)*get_local_size(0) + get_local_id(1)*get_local_size(0) + get_local_id(0)] * hvariance;
-    }
-}
-```
-
-Note the `get_local_size` and `get_local_id` usage inside the kernel. A 21x speedup is expected for this kernel on the enet-curbs setup because it was completely limited by memory usage.
-
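The byte arithmetic passed to `WorkGroupDmaCreate3DTransaction` in the preload kernel above can be checked on the host before running on the device. The sketch below is an illustration only: `Dma3DParams`, `preload_params`, and `emulate_dma_3d` are hypothetical names, and the copy semantics are an assumption inferred from the parameter comments in the function list, not taken from a specification.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical container for the arguments of a 3D DMA transaction.
// All values are in bytes, except num_planes.
struct Dma3DParams {
    std::size_t src_width, dst_width;
    std::size_t src_stride, dst_stride;
    std::size_t num_planes;
    std::size_t src_plane_stride, dst_plane_stride;
    std::size_t size;  // bytes copied per plane
};

// Same expressions as in `__dma_preload_grn_NCHW`, for a tile of
// local_w x local_h elements cut from a global_w x global_h plane.
inline Dma3DParams preload_params(std::size_t local_w, std::size_t local_h,
                                  std::size_t global_w, std::size_t global_h,
                                  std::size_t channels, std::size_t elem_size) {
    return Dma3DParams{
        local_w * elem_size,              // src width
        local_w * elem_size,              // dst width
        global_w * elem_size,             // src stride
        local_w * elem_size,              // dst stride
        channels,                         // num planes
        global_w * global_h * elem_size,  // src plane stride
        local_w * local_h * elem_size,    // dst plane stride
        local_w * local_h * elem_size};   // plane size
}

// ASSUMED semantics: for every plane, copy `size` bytes line by line, reading
// lines of src_width bytes spaced by src_stride and writing them spaced by
// dst_stride. This sketch handles equal source and destination widths only.
inline void emulate_dma_3d(const std::uint8_t* src, std::uint8_t* dst,
                           const Dma3DParams& p) {
    assert(p.src_width > 0 && p.src_width == p.dst_width);
    assert(p.size % p.src_width == 0);
    for (std::size_t plane = 0; plane < p.num_planes; ++plane) {
        std::size_t line = 0;
        for (std::size_t copied = 0; copied < p.size; copied += p.src_width, ++line)
            std::memcpy(dst + plane * p.dst_plane_stride + line * p.dst_stride,
                        src + plane * p.src_plane_stride + line * p.src_stride,
                        p.src_width);
    }
}
```

Under these assumptions, the local buffer ends up holding a densely packed `C x local_h x local_w` tile, which is the layout the rewritten `grn_NCHW` kernel indexes with `get_local_id` and `get_local_size`.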
-An alternative to the work group DMA functions is the work item copy extension. These functions are executed inside a kernel and require work groups that consist of a single work item.
-
-Here is the list of supported work item functions:
-```cpp
-item_dma_event_t WorkItemDmaCreateTransaction(
-            const global T *src,
-            private T *dst,
-            size_t  size,
-            item_dma_event_t  event) __OVERLOAD;
-
-item_dma_event_t WorkItemDmaCreateTransaction(
-            const private T *src,
-            global T *dst,
-            size_t  size,
-            item_dma_event_t  event) __OVERLOAD;
-
-item_dma_event_t WorkItemDmaCreateStrideTransaction(
-                const global T *src,
-                private T *dst,
-                size_t  src_width,
-                size_t  dst_width,
-                size_t  src_stride,
-                size_t  dst_stride,
-                size_t size,
-                item_dma_event_t  event) __OVERLOAD;
-
-item_dma_event_t WorkItemDmaCreateStrideTransaction(
-                const private T *src,
-                global T *dst,
-                size_t  src_width,
-                size_t  dst_width,
-                size_t  src_stride,
-                size_t  dst_stride,
-                size_t size,
-                item_dma_event_t  event) __OVERLOAD;
-
-item_dma_event_t WorkItemDmaCreate3DTransaction(
-                const global T *src,
-                private T *dst,
-                size_t  src_width,
-                size_t  dst_width,
-                size_t  src_stride,
-                size_t  dst_stride,
-                size_t  num_planes,
-                size_t  src_plane_stride,
-                size_t  dst_plane_stride,
-                size_t  size,
-                item_dma_event_t  event) __OVERLOAD;
-
-item_dma_event_t WorkItemDmaCreate3DTransaction(
-                const private T *src,
-                global T *dst,
-                size_t  src_width,
-                size_t  dst_width,
-                size_t  src_stride,
-                size_t  dst_stride,
-                size_t  num_planes,
-                size_t  src_plane_stride,
-                size_t  dst_plane_stride,
-                size_t  size,
-                item_dma_event_t  event) __OVERLOAD;
-```
-where `T` can be `uchar`, `char`, `short`, `ushort`, `int`, `uint`, `long`, `ulong`, `half` or `float`.
--- a/docs/OV_Runtime_UG/Glossary.md
+++ b/docs/OV_Runtime_UG/Glossary.md
@@ -1,85 +0,0 @@
-# Glossary {#openvino_docs_IE_DG_Glossary}
-
-## Acronyms and Abbreviations
-
-| Abbreviation      | Description     |
-| :---              | :--- |
-| API               | Application Programming Interface |
-| AVX               | Advanced Vector Extensions |
-| clDNN             | Compute Library for Deep Neural Networks |
-| CLI               | Command Line Interface |
-| CNN               | Convolutional Neural Network |
-| CPU               | Central Processing Unit |
-| CV                | Computer Vision |
-| DL                | Deep Learning |
-| DLDT              | Intel(R) Deep Learning Deployment Toolkit |
-| DLL               | Dynamic Link Library |
-| DNN               | Deep Neural Networks |
-| ELU               | Exponential Linear rectification Unit |
-| FCN               | Fully Convolutional Network |
-| FP                | Floating Point |
-| GCC               | GNU Compiler Collection |
-| GPU               | Graphics Processing Unit |
-| HD                | High Definition |
-| IE                | Inference Engine |
-| IR                | Intermediate Representation |
-| JIT               | Just In Time |
-| JTAG              | Joint Test Action Group |
-| LPR               | License-Plate Recognition |
-| LRN               | Local Response Normalization |
-| mAP               | Mean Average Precision |
-| Intel(R) MKL-DNN  | Intel(R) Math Kernel Library Deep Neural Networks |
-| MO                | Model Optimizer |
-| MVN               | Mean Variance Normalization |
-| NCDHW             | Number of images, Channels, Depth, Height, Width |
-| NCHW              | Number of images, Channels, Height, Width |
-| NHWC              | Number of images, Height, Width, Channels |
-| NMS               | Non-Maximum Suppression |
-| NN                | Neural Network |
-| NST               | Neural Style Transfer |
-| OD                | Object Detection |
-| OS                | Operating System |
-| PCI               | Peripheral Component Interconnect |
-| PReLU             | Parametric Rectified Linear Unit |
-| PSROI             | Position Sensitive Region Of Interest |
-| RCNN, R-CNN       | Region-based Convolutional Neural Network |
-| ReLU              | Rectified Linear Unit |
-| ROI               | Region Of Interest |
-| SDK               | Software Development Kit |
-| SSD               | Single Shot multibox Detector |
-| SSE               | Streaming SIMD Extensions |
-| USB               | Universal Serial Bus |
-| VGG               | Visual Geometry Group |
-| VOC               | Visual Object Classes |
-| WINAPI            | Windows Application Programming Interface |
-
-## Terms
-
-Glossary of terms used in the Inference Engine
-
-
-| Term                        | Description         |
-| :---                        | :---                |
-| Batch | Number of images to analyze during one call of infer. Maximum batch size is a property of the network and it is set before loading of the network to the plugin. In NHWC, NCHW and NCDHW image data layout representation, the N refers to the number of images in the batch |
-| Blob | Memory container used for storing inputs, outputs of the network, weights and biases of the layers |
-| Device (Affinity) | A preferred Intel(R) hardware device to run the inference (CPU, GPU, etc.) |
-| Extensibility mechanism, Custom layers | The mechanism that provides you with capabilities to extend the Inference Engine and Model Optimizer so that they can work with topologies containing layers that are not yet supported |
-| <code>CNNNetwork</code> | A class of the Convolutional Neural Network that Inference Engine reads from IR. Consists of topology, weights and biases |
-| <code>ExecutableNetwork</code> | An instance of the loaded network which allows the Inference Engine to request (several) infer requests and perform inference synchronously or asynchronously |
-| <code>InferRequest</code> | A class that represents the end point of inference on the model loaded to the plugin and represented by executable network. Inputs are set here, outputs should be requested from this interface as well |
-| <code>InferenceEngineProfileInfo</code> | Represents basic inference profiling information per layer |
-| Inference Engine | A C++ library with a set of classes that you can use in your application to infer input data (images) and get the result |
-| Inference Engine API | The basic default API for all supported devices, which allows you to load a model from Intermediate Representation, set input and output formats and execute the model on various devices |
-| Inference Engine <code>Core</code> | Inference Engine Core is a software component that manages inference on certain Intel(R) hardware devices: CPU, GPU, MYRIAD, GNA, etc. |
-| Layer catalog or Operations specification | A list of supported layers or operations and their parameters. Sets of supported layers are different for different plugins. Check the documentation on plugins to verify whether the Inference Engine supports a certain layer on the dedicated hardware |
-| <code>Layout</code> | Image data layout refers to the representation of an image batch. Layout shows a sequence of 4D or 5D tensor data in memory. A typical NCHW format stores pixels along the horizontal direction, rows along the vertical dimension, planes by channel, and images by batch |
-| <code>OutputsDataMap</code> | Structure which contains information about output precisions and layouts |
-| Precision | Represents data precision. For example, FP32 is 32-bit floating point, FP16 is 16-bit floating point. Precision can be changed before loading the network to the plugin |
-| <code>PreProcessInfo</code> | Class that represents input data for the network. It contains information about input precision, its layout, and pre-processing |
-| <code>ResponseDesc</code> | Represents debug information for an error |
-
-
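The difference between the NCHW and NHWC layouts described in the glossary comes down to which dimension varies fastest in memory. A minimal sketch of the two offset formulas for a densely packed 4D tensor follows. `Dims4`, `offset_nchw`, and `offset_nhwc` are hypothetical helper names, not Inference Engine types.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: extents of a 4D tensor.
struct Dims4 {
    std::size_t N, C, H, W;
};

// NCHW: width varies fastest, then height, then channel, then batch.
inline std::size_t offset_nchw(const Dims4& d, std::size_t n, std::size_t c,
                               std::size_t h, std::size_t w) {
    assert(n < d.N && c < d.C && h < d.H && w < d.W);
    return ((n * d.C + c) * d.H + h) * d.W + w;
}

// NHWC: channel varies fastest, then width, then height, then batch.
inline std::size_t offset_nhwc(const Dims4& d, std::size_t n, std::size_t c,
                               std::size_t h, std::size_t w) {
    assert(n < d.N && c < d.C && h < d.H && w < d.W);
    return ((n * d.H + h) * d.W + w) * d.C + c;
}
```

For a `1x3x2x2` tensor, the element at channel 1, row 0, column 0 sits at offset 4 in NCHW (after one full 2x2 plane) and at offset 1 in NHWC (right after channel 0 of the same pixel).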
-## See Also
-* [Deep Learning Model Optimizer IR Operations Catalog](../ops/opset.md)
-* [Inference Engine Memory primitives](Memory_primitives.md)
-* [Terminology](supported_plugins/Supported_Devices.md)
--- a/docs/OV_Runtime_UG/InferenceEngine_QueryAPI.md
+++ b/docs/OV_Runtime_UG/InferenceEngine_QueryAPI.md
@@ -1,235 +0,0 @@
-# Introduction to Inference Engine Device Query API {#openvino_docs_IE_DG_InferenceEngine_QueryAPI}
-
-## Inference Engine Query API (C++)
-
-@sphinxdirective
-.. raw:: html
-
-    <div id="switcher-cpp" class="switcher-anchor">C++</div>
-@endsphinxdirective
-
-The OpenVINO™ toolkit supports inferencing with several types of devices (processors or accelerators). 
-This section provides a high-level description of the process of querying different device properties and configuration values at runtime. Refer to the [Hello Query Device C++ Sample](../../samples/cpp/hello_query_device/README.md) sources and the [Multi-Device Plugin documentation](supported_plugins/MULTI.md) for examples of using the Inference Engine Query API in user applications.
-
-### Using the Inference Engine Query API in Your Code
-
-The `InferenceEngine::Core` class provides the following API to query device information, set or get different device configuration properties:
-
-* `InferenceEngine::Core::GetAvailableDevices` - Provides a list of available devices. If there is more than one instance of a specific device, the devices are enumerated with `.suffix`, where `suffix` is a unique string identifier. The device name can be passed to all methods of the `InferenceEngine::Core` class that work with devices, for example `InferenceEngine::Core::LoadNetwork`.
-* `InferenceEngine::Core::GetMetric` - Provides information about a specific device.
-* `InferenceEngine::Core::GetConfig` - Gets the current value of a specific configuration key.
-* `InferenceEngine::Core::SetConfig` - Sets a new value for the configuration key.
-
-The `InferenceEngine::ExecutableNetwork` class is also extended to support the Query API:
-
-* `InferenceEngine::ExecutableNetwork::GetMetric`
-* `InferenceEngine::ExecutableNetwork::GetConfig`
-* `InferenceEngine::ExecutableNetwork::SetConfig`
-
-### Query API in the Core Class
-
-#### GetAvailableDevices
-
-@snippet snippets/InferenceEngine_QueryAPI0.cpp part0
-
-The function returns a list of available devices, for example:
-
-```
-MYRIAD.1.2-ma2480
-MYRIAD.1.4-ma2480
-CPU
-GPU.0
-GPU.1
-```
-
-Each device name can then be passed to:
-
-* `InferenceEngine::Core::LoadNetwork` to load the network to a specific device.
-* `InferenceEngine::Core::GetMetric` to get common or device specific metrics.
-* All other methods of the `InferenceEngine::Core` class that accept `deviceName`.
-
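Since enumerated device names have the form `NAME.suffix`, an application that groups devices by type may want to separate the two parts. The helper below is a hypothetical sketch and not part of the Inference Engine API. It assumes that the suffix starts after the first `.` in the name, which matches the sample output above (`MYRIAD.1.2-ma2480`, `GPU.0`).

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical helper: split "GPU.1" into {"GPU", "1"}.
// A name without a '.' (for example "CPU") yields an empty suffix.
inline std::pair<std::string, std::string> split_device_name(const std::string& full_name) {
    assert(!full_name.empty());
    const std::string::size_type dot = full_name.find('.');
    if (dot == std::string::npos)
        return {full_name, std::string()};
    return {full_name.substr(0, dot), full_name.substr(dot + 1)};
}
```

The full, unsplit name is still the one to pass to methods that accept `deviceName`. The split parts are only useful for grouping or display.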
-#### GetConfig()
-
-The code below demonstrates how to check whether the `HETERO` device dumps GraphViz `.dot` files with split graphs during the split stage:
-
-@snippet snippets/InferenceEngine_QueryAPI1.cpp part1
-
-For documentation about common configuration keys, refer to `ie_plugin_config.hpp`. Device specific configuration keys can be found in corresponding plugin folders.
-
-#### GetMetric()
-
-* To extract device properties such as available device, device name, supported configuration keys, and others, use the `InferenceEngine::Core::GetMetric` method:
-
-@snippet snippets/InferenceEngine_QueryAPI2.cpp part2
-
-A returned value appears as follows: `Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz`.
-
-> **NOTE**: All metrics have a type, which is specified during metric instantiation. The list of common device-agnostic metrics can be found in `ie_plugin_config.hpp`. Device specific metrics (for example, for HDDL or MYRIAD devices) can be found in corresponding plugin folders.
-
-### Query API in the ExecutableNetwork Class
-
-#### GetMetric()
-
-The method is used to get an executable network specific metric such as `METRIC_KEY(OPTIMAL_NUMBER_OF_INFER_REQUESTS)`:
-
-@snippet snippets/InferenceEngine_QueryAPI3.cpp part3
-
-Or the current temperature of the `MYRIAD` device:
-
-@snippet snippets/InferenceEngine_QueryAPI4.cpp part4
-
-#### GetConfig()
-
-The method is used to get information about configuration values the executable network has been created with:
-
-@snippet snippets/InferenceEngine_QueryAPI5.cpp part5
-
-#### SetConfig()
-
-The only device that supports this method is [Multi-Device](supported_plugins/MULTI.md).
-
-## Inference Engine Query API (Python)
-
-@sphinxdirective
-.. raw:: html
-
-    <div id="switcher-python" class="switcher-anchor">Python</div>
-@endsphinxdirective
-
-This section provides a high-level description of how to query device properties and configuration values. Refer to the [Hello Query Device Python Sample](../../samples/python/hello_query_device/README.md) sources and the [Multi-Device Plugin documentation](supported_plugins/MULTI.md) for examples of using the Inference Engine Query API in user applications.
-
-### Using the Inference Engine Query API in Your Code
-
-The Inference Engine [Core](api/ie_python_api/_autosummary/openvino.inference_engine.IECore.html#openvino-inference-engine-iecore) class provides the following API to query device information, set or get different device configuration properties:
-
-* [ie_api.IECore.available_devices](api/ie_python_api/_autosummary/openvino.inference_engine.IECore.html#openvino.inference_engine.IECore.available_devices) - Provides a list of available devices. If there is more than one instance of a specific device, the devices are enumerated with a `.suffix`, where `suffix` is a unique string identifier. The device name can be passed to all methods of the IECore class that work with devices, for example [ie_api.IECore.load_network](api/ie_python_api/_autosummary/openvino.inference_engine.IECore.html#openvino.inference_engine.IECore.load_network).
-* [ie_api.IECore.get_metric](api/ie_python_api/_autosummary/openvino.inference_engine.IECore.html#openvino.inference_engine.IECore.get_metric) - Provides information about a specific device.
-* [ie_api.IECore.get_config](api/ie_python_api/_autosummary/openvino.inference_engine.IECore.html#openvino.inference_engine.IECore.get_config) - Gets the current value of a specific configuration key.
-* [ie_api.IECore.set_config](api/ie_python_api/_autosummary/openvino.inference_engine.IECore.html#openvino.inference_engine.IECore.set_config)  - Sets a new value for the configuration key.
-
-The [ie_api.ExecutableNetwork](api/ie_python_api/_autosummary/openvino.inference_engine.ExecutableNetwork.html) class is also extended to support the Query API:
-* [ie_api.ExecutableNetwork.get_metric](api/ie_python_api/_autosummary/openvino.inference_engine.ExecutableNetwork.html#openvino.inference_engine.ExecutableNetwork.get_metric)
-* [ie_api.ExecutableNetwork.get_config](latest/api/ie_python_api/_autosummary/openvino.inference_engine.ExecutableNetwork.html#openvino.inference_engine.ExecutableNetwork.get_config)
-* There is no method to call for set_config, but the equivalent action is described below.
-
-### Query API in the IECore Class
-
-#### Get Available Devices
-
-```python
-from openvino.inference_engine import IECore
-
-ie = IECore()
-print(ie.available_devices)
-```
-
-This code prints a list of available devices, for example:
-
-```
-MYRIAD.1.2-ma2480
-MYRIAD.1.4-ma2480
-FPGA.0
-FPGA.1
-CPU
-GPU.0
-GPU.1
-```
-
-Each device name can then be passed to:
-
-* `IECore.load_network` to load the network to a specific device.
-* `IECore.get_metric` to get common or device specific metrics.
-* All other methods of the `IECore` class that accept a device name.
-
-#### Get Metric
-
-To extract device properties such as available device, device name, supported configuration keys, and others, use the [IECore.get_metric](api/ie_python_api/_autosummary/openvino.inference_engine.IECore.html#openvino.inference_engine.IECore.get_metric) method:
-
-```python
-from openvino.inference_engine import IECore
-
-ie = IECore()
-ie.get_metric(device_name="CPU", metric_name="FULL_DEVICE_NAME")
-```
-
-A returned value appears as follows: `Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz`.
-
-To list all supported metrics for a device:
-
-```python
-from openvino.inference_engine import IECore
-
-ie = IECore()
-ie.get_metric(device_name="GPU", metric_name="SUPPORTED_METRICS")
-```
-
-#### Get Configuration
-
-The code below uses the [IECore.get_config](api/ie_python_api/_autosummary/openvino.inference_engine.IECore.html#openvino.inference_engine.IECore.get_config) method and demonstrates how to understand whether the HETERO device dumps .dot files with split graphs during the split stage:
-
-```python
-from openvino.inference_engine import IECore
-
-ie = IECore()
-ie.get_config(device_name="HETERO", config_name="HETERO_DUMP_GRAPH_DOT")
-```
-
-To list all supported configuration keys for a device:
-
-```python
-from openvino.inference_engine import IECore
-
-ie = IECore()
-ie.get_metric(device_name=device, metric_name="SUPPORTED_CONFIG_KEYS")
-```
-
-For documentation about common configuration keys, refer to `ie_plugin_config.hpp`. Device specific configuration keys can be found in corresponding plugin folders.
-
-
-### Query API in the ExecutableNetwork Class
-
-#### Get Metric
-
-To get the name of the loaded network:
-
-```python
-from openvino.inference_engine import IECore
-
-ie = IECore()
-net = ie.read_network(model=path_to_xml_file)
-exec_net = ie.load_network(network=net, device_name=device)
-exec_net.get_metric("NETWORK_NAME")
-```
-
-Use `exec_net.get_metric("SUPPORTED_METRICS")` to list all supported metrics for an ExecutableNetwork instance.
-
-
-#### Get Configuration
-
-The [ExecutableNetwork.get_config](api/ie_python_api/_autosummary/openvino.inference_engine.ExecutableNetwork.html#openvino.inference_engine.ExecutableNetwork.get_config) method is used to get information about configuration values the executable network has been created with:
-
-```python
-from openvino.inference_engine import IECore
-
-ie = IECore()
-net = ie.read_network(model=path_to_xml_file)
-exec_net = ie.load_network(network=net, device_name="CPU")
-exec_net.get_config("CPU_THREADS_NUM")
-```
-
-Device metrics, such as the current temperature of the MYRIAD device, are read with `get_metric` instead:
-
-```python
-from openvino.inference_engine import IECore
-
-ie = IECore()
-net = ie.read_network(model=path_to_xml_file)
-exec_net = ie.load_network(network=net, device_name="MYRIAD")
-exec_net.get_metric("DEVICE_THERMAL")
-```
-
-Use `exec_net.get_metric("SUPPORTED_CONFIG_KEYS")`  to list all supported configuration keys.
-
-#### Set Configuration
-
-The only device that supports this method in the ExecutableNetwork class is the [Multi-Device](supported_plugins/MULTI.md) plugin, where you can change the device priorities in real time: `exec_net.set_config({"MULTI_DEVICE_PRIORITIES": "GPU,CPU"})`. See the Multi-Device documentation for more details.
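The enumerated device names shown in the removed Query API text follow a `TYPE.suffix` pattern (`GPU.0`, `MYRIAD.1.2-ma2480`). A minimal sketch of splitting such names in plain Python, without OpenVINO installed; `split_device_name` is a hypothetical helper for illustration, not part of the Inference Engine API, and it assumes the suffix is everything after the first dot:

```python
from typing import Optional, Tuple


def split_device_name(name: str) -> Tuple[str, Optional[str]]:
    """Split 'GPU.1' into ('GPU', '1'); names without a suffix give (name, None)."""
    # Assumption: the device type never contains a dot, so the first dot
    # separates the type from the unique instance identifier.
    device_type, dot, suffix = name.partition(".")
    return device_type, (suffix if dot else None)


# Sample names copied from the example output in the removed documentation.
devices = ["MYRIAD.1.2-ma2480", "MYRIAD.1.4-ma2480", "CPU", "GPU.0", "GPU.1"]

# Group the instance identifiers by device type.
by_type = {}
for device in devices:
    device_type, suffix = split_device_name(device)
    by_type.setdefault(device_type, []).append(suffix)
```

Grouping by type like this is one way to decide whether a bare name such as `GPU` is ambiguous on a machine that exposes several instances.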
--- a/docs/OV_Runtime_UG/Integrate_with_customer_application_new_API.md
+++ b/docs/OV_Runtime_UG/Integrate_with_customer_application_new_API.md
@@ -80,8 +80,6 @@ Optionally, configure input and output of the model using the steps below:
         
         auto network = core.ReadNetwork("model.onnx");

-      You can find more information about the ONNX format support in the document `ONNX format support in the OpenVINO™ <https://docs.openvino.ai/latest/openvino_docs_IE_DG_ONNX_Support.html>`_   
-   
   .. tab:: nGraph
      
      .. code-block:: c
--- a/docs/OV_Runtime_UG/Known_Issues_Limitations.md
+++ b/docs/OV_Runtime_UG/Known_Issues_Limitations.md
@@ -1,58 +0,0 @@
-# Known Issues and Limitations {#openvino_docs_IE_DG_Known_Issues_Limitations}
-
-## Multiple OpenMP Loadings
-
-If the application uses the Inference Engine with third-party components that depend on Intel OpenMP, multiple loadings of the libiomp library may occur and cause OpenMP runtime initialization conflicts. This may happen, for example, if the application uses Intel® Math Kernel Library (Intel® MKL) through the “Single Dynamic Library” (<code>libmkl_rt.so</code>) mechanism and calls Intel MKL after loading the Inference Engine plugin.
-The error log looks like this:
-
-```sh
-OMP: Error #15: Initializing libiomp5.so, but found libiomp5.so already initialized.
-OMP: Hint: This means that multiple copies of the OpenMP runtime have been linked into the program. That is dangerous, since it can degrade performance or cause incorrect results. The best thing to do is to ensure that only a single OpenMP runtime is linked into the process, e.g. by avoiding static linking of the OpenMP runtime in any library. As an unsafe, unsupported, undocumented workaround you can set the environment variable KMP_DUPLICATE_LIB_OK=TRUE to allow the program to continue to execute, but that may cause crashes or silently produce incorrect results. For more information, please see http://www.intel.com/software/products/support/.
-```
-
-Possible workarounds:
-
-*  Preload the OpenMP runtime using the <code>LD_PRELOAD</code> variable:
-   ```sh
-   LD_PRELOAD=<path_to_libiomp5.so> <path_to_your_executable>
-   ```
-   This eliminates multiple loadings of libiomp, and makes all the components use this specific version of OpenMP.
-
-*  Alternatively, you can set <code>KMP_DUPLICATE_LIB_OK=TRUE</code>. However, performance degradation or incorrect results may occur in this case.
-
-
-## Old proto compiler breaks protobuf library
-
-With the Python protobuf library version 3.5.1, the following incompatibility can occur.
-The known case is CentOS 7.4.
-
-The error log looks like this:
-
-```sh
-File "../lib64/python3.5/site-packages/google/protobuf/descriptor.py", line 829, in _new_
-return _message.default_pool.AddSerializedFile(serialized_pb)
-TypeError: expected bytes, str found
-```
-
-A possible workaround is to upgrade the default protobuf compiler (libprotoc 2.5.0) to a newer version, for example libprotoc 2.6.1.
-
-[protobuf_issue]: https://github.com/google/protobuf/issues/4272
-
-## Dynamic batching
-Refer to the **Limitations** section of the [Dynamic batching page](DynamicBatching.md).
-
-## Static Shape Infer
-Refer to the **Limitations** section of the [Static Shape Infer page](ShapeInference.md).
-
-
-## Image Pre-Processing Performance Optimization Issue
-
-As described in [documentation for the new API](Integrate_with_customer_application_new_API.md), you can set an image blob of any size to an
-infer request using resizable input. Resize is executed during inference using the configured resize algorithm.
-
-Currently, however, the resize algorithms are not fully optimized, so expect performance degradation if resizable input is
-specified and an input blob (to be resized) is set using `SetBlob()`. The best performance is achieved with the
-[CPU](supported_plugins/CPU.md) plugin only (because OpenMP*, when enabled, provides parallelism).
-
-Another limitation is that currently, resize algorithms support NCHW layout only. So if you set NHWC layout for an input
-blob, NHWC is converted to NCHW before resize and back to NHWC after resize.
--- a/docs/OV_Runtime_UG/Memory_primitives.md
+++ b/docs/OV_Runtime_UG/Memory_primitives.md
@@ -1,60 +0,0 @@
-# Inference Engine Memory Primitives {#openvino_docs_IE_DG_Memory_primitives}
-
-## Inference Memory Primitives (C++)
-
-@sphinxdirective
-.. raw:: html
-
-    <div id="switcher-cpp" class="switcher-anchor">C++</div>
-@endsphinxdirective
-
-## Blobs
-
-<code>InferenceEngine::Blob</code> is the main class intended for working with memory.
-Using this class you can read and write memory, get information about the memory structure etc.
-
-The right way to create <code>Blob</code> objects with a specific layout is to use constructors with <code>InferenceEngine::TensorDesc</code>.
-<pre class="brush:cpp">
-InferenceEngine::TensorDesc tdesc(InferenceEngine::Precision::FP32, {1, 3, 227, 227}, InferenceEngine::Layout::NCHW);
-InferenceEngine::Blob::Ptr blob = InferenceEngine::make_shared_blob<float>(tdesc);
-</pre>
-
-## Layouts
-
-<code>InferenceEngine::TensorDesc</code> is a special class that provides layout format description.
-
-This class allows you to create planar layouts using the standard formats (such as <code>InferenceEngine::Layout::NCDHW</code>, <code>InferenceEngine::Layout::NCHW</code>, <code>InferenceEngine::Layout::NC</code>, <code>InferenceEngine::Layout::C</code>, and so on) as well as non-planar layouts using <code>InferenceEngine::BlockingDesc</code>.
-
-In order to create a complex layout you should use <code>InferenceEngine::BlockingDesc</code>, which allows you to define the blocked memory with offsets and strides.
-
-## Examples
-
-1. You can define a blob with dimensions {N: 1, C: 25, H: 20, W: 20} and the NHWC format using the following parameters:<br/>
-<pre class="brush:cpp">
-InferenceEngine::BlockingDesc({1, 20, 20, 25}, {0, 2, 3, 1}); // or
-InferenceEngine::BlockingDesc({1, 20, 20, 25}, InferenceEngine::Layout::NHWC);
-</pre>
-2. If you have memory with real dimensions {N: 1, C: 25, H: 20, W: 20} but with channels blocked by 8, you can define it using the following parameters:<br/>
-<pre class="brush:cpp">
-InferenceEngine::BlockingDesc({1, 4, 20, 20, 8}, {0, 1, 2, 3, 1});
-</pre>
-3. You can also set strides and offsets if the layout contains them.
-4. If you have a complex blob layout and do not want to calculate the real offset to the data, you can use the <code>InferenceEngine::TensorDesc::offset(size_t l)</code> or <code>InferenceEngine::TensorDesc::offset(SizeVector v)</code> methods.<br/>
-For example:
-<pre class="brush:cpp">
-InferenceEngine::BlockingDesc blk({1, 4, 20, 20, 8}, {0, 1, 2, 3, 1});
-InferenceEngine::TensorDesc tdesc(InferenceEngine::Precision::FP32, {1, 25, 20, 20}, blk);
-tdesc.offset(0); // = 0
-tdesc.offset(1); // = 8
-tdesc.offset({0, 0, 0, 2}); // = 16
-tdesc.offset({0, 1, 0, 2}); // = 17
-</pre>
-5. If you would like to create a TensorDesc with a planar format for a given number of dimensions (for example, 1, 2, or 4), you can use the <code>InferenceEngine::TensorDesc::getLayoutByDims</code> method.
-<pre class="brush:cpp">
-InferenceEngine::TensorDesc::getLayoutByDims({1}); // InferenceEngine::Layout::C
-InferenceEngine::TensorDesc::getLayoutByDims({1, 2}); // InferenceEngine::Layout::NC
-InferenceEngine::TensorDesc::getLayoutByDims({1, 2, 3, 4}); // InferenceEngine::Layout::NCHW
-InferenceEngine::TensorDesc::getLayoutByDims({1, 2, 3}); // InferenceEngine::Layout::BLOCKED
-InferenceEngine::TensorDesc::getLayoutByDims({1, 2, 3, 4, 5}); // InferenceEngine::Layout::NCDHW
-InferenceEngine::TensorDesc::getLayoutByDims({1, 2, 3, 4, 5, ...}); // InferenceEngine::Layout::BLOCKED
-</pre>
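The offsets listed in the removed blocked-layout example (`0`, `8`, `16`, `17`) can be reproduced by hand. A minimal sketch in plain Python, assuming dense strides and no padding offsets; `blocked_offset` and `unravel` are hypothetical helper names used only for illustration and are not part of the Inference Engine API:

```python
from math import prod


def unravel(linear, dims):
    """Convert a linear index over the logical dims into one coordinate per axis."""
    coords = []
    for size in reversed(dims):
        coords.append(linear % size)
        linear //= size
    return tuple(reversed(coords))


def blocked_offset(index, block_dims, order):
    """Map a logical coordinate to a flat offset in densely packed blocked memory.

    block_dims: sizes of the blocked dimensions, outermost first.
    order: for each blocked dimension, the logical axis it iterates over.
    """
    # Dense strides: each blocked dim steps over everything to its right.
    strides = [prod(block_dims[i + 1:]) for i in range(len(block_dims))]
    remaining = list(index)
    offset = 0
    # Walk from the innermost blocked dim outwards, so an inner block of an
    # axis consumes the low part of its coordinate before the outer part.
    for pos in reversed(range(len(order))):
        axis = order[pos]
        offset += (remaining[axis] % block_dims[pos]) * strides[pos]
        remaining[axis] //= block_dims[pos]
    return offset


dims = (1, 25, 20, 20)          # logical N, C, H, W
block_dims = (1, 4, 20, 20, 8)  # channels split into 4 blocks of 8
order = (0, 1, 2, 3, 1)         # logical axis behind each blocked dim

print(blocked_offset((0, 0, 0, 2), block_dims, order))      # 16
print(blocked_offset((0, 1, 0, 2), block_dims, order))      # 17
print(blocked_offset(unravel(1, dims), block_dims, order))  # 8
```

The same function covers the NHWC case from the first example: with `block_dims = (1, 20, 20, 25)` and `order = (0, 2, 3, 1)`, moving one step along the channel axis changes the offset by 1.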
--- a/docs/OV_Runtime_UG/Model_caching_overview.md
+++ b/docs/OV_Runtime_UG/Model_caching_overview.md
@@ -8,9 +8,9 @@
    <div id="switcher-cpp" class="switcher-anchor">C++</div>
@endsphinxdirective

-As described in the [Inference Engine Developer Guide](Deep_Learning_Inference_Engine_DevGuide.md), a common application flow consists of the following steps:
+As described in the [OpenVINO™ Runtime User Guide](openvino_intro.md), a common application flow consists of the following steps:

-1. **Create an Inference Engine Core object**: First step to manage available devices and read network objects
+1. **Create a Core object**: First step to manage available devices and read network objects

 2. **Read the Intermediate Representation**: Read an Intermediate Representation file into an object of the `InferenceEngine::CNNNetwork`

@@ -72,9 +72,9 @@ To check in advance if a particular device supports model caching, your applicat
    <div id="switcher-python" class="switcher-anchor">Python</div>
@endsphinxdirective

-As described in Inference Engine Developer Guide, a common application flow consists of the following steps:
+As described in the OpenVINO User Guide, a common application flow consists of the following steps:

-1. **Create an Inference Engine Core Object**
+1. **Create a Core Object**
 2. **Read the Intermediate Representation** - Read an Intermediate Representation file into an object of the [ie_api.IENetwork](api/ie_python_api/_autosummary/openvino.inference_engine.IENetwork.html)
 3. **Prepare inputs and outputs**
 4. **Set configuration** - Pass device-specific loading configurations to the device
--- a/docs/OV_Runtime_UG/ONNX_Support.md
+++ b/docs/OV_Runtime_UG/ONNX_Support.md
@@ -1,91 +0,0 @@
-# ONNX Format Support {#openvino_docs_IE_DG_ONNX_Support}
-
-## Introduction (C++)
-
-@sphinxdirective
-.. raw:: html
-
-    <div id="switcher-cpp" class="switcher-anchor">C++</div>
-@endsphinxdirective
-
-Starting with the 2020.4 release, OpenVINO™ supports reading native ONNX models. The `Core::ReadNetwork()` method provides a uniform way to read models from the IR or ONNX format and is the recommended approach to reading models. Example:
-
-```cpp
-InferenceEngine::Core core;
-auto network = core.ReadNetwork("model.onnx");
-```
-
-### Reshape Feature
-OpenVINO™ does not provide a mechanism to specify pre-processing (like mean values subtraction, reverse input channels) for the ONNX format. If an ONNX model contains dynamic shapes for input, please use the `CNNNetwork::reshape` method to reshape the model.
-
-### Weights Saved in External Files
-
-OpenVINO™ supports ONNX models that store weights in external files. It is especially useful for models larger than 2GB because of protobuf limitations. To read such models, use the `ReadNetwork` overload which takes `modelPath` as input parameter (both `std::string` and `std::wstring`). Note that the `binPath` argument of `ReadNetwork` should be empty in this case, because paths to external weights are saved directly in an ONNX model.
-Otherwise, a runtime exception is thrown. Reading models with external weights is not supported by the `ReadNetwork(const std::string& model, const Blob::CPtr& weights)` overload.
-
-Paths to external weight files are saved in an ONNX model; these paths are relative to the model's directory path.
-It means that if a model is located at `home/user/workspace/models/model.onnx` and a file that contains external weights is in `home/user/workspace/models/data/weights.bin`, then the path saved in the model should be:
-  `data/weights.bin`
-
-> **NOTE**: A single model can use many external weights files.
-
-> **NOTE**: Data of many tensors can be stored in a single external weights file (it is processed using offset and length values, which can be also saved in a model).
-
-The described mechanism is the only way to read weights from external files. The following input parameters of the `ReadNetwork` function overloads are NOT supported for ONNX models and should be passed as empty:
-* `const std::wstring& binPath`
-* `const std::string& binPath`
-* `const Blob::CPtr& weights`
-
-You can find more details about the external data mechanism in [ONNX documentation](https://github.com/onnx/onnx/blob/master/docs/ExternalData.md).
-To convert a model to use the external data feature, you can use [ONNX helper functions](https://github.com/onnx/onnx/blob/master/onnx/external_data_helper.py).
-
-Unsupported types of tensors:
-* string
-* complex64
-* complex128
-
-## Introduction (Python)
-
-@sphinxdirective
-.. raw:: html
-
-    <div id="switcher-python" class="switcher-anchor">Python</div>
-@endsphinxdirective
-
-Starting with the 2020.4 release, OpenVINO™ supports reading native ONNX models. The `IECore.read_network()` method provides a uniform way to read models from the IR or ONNX format and is the recommended approach to reading models. Example:
-
-```python
-from openvino.inference_engine import IECore
-
-ie = IECore()
-net = ie.read_network(model=path_to_onnx_file)
-```
-
-### Reshape Feature
-OpenVINO™ does not provide a mechanism to specify pre-processing (like mean values subtraction, reverse input channels) for the ONNX format. If an ONNX model contains dynamic shapes for input, please use the [IENetwork.reshape](api/ie_python_api/_autosummary/openvino.inference_engine.IENetwork.html#openvino.inference_engine.IENetwork.reshape) method to reshape the model.
-
-```python
-from openvino.inference_engine import IECore
-
-ie = IECore()
-net = ie.read_network(model=path_to_onnx_file)
-input_layer = next(iter(net.input_info))
-net.reshape({input_layer: new_shape})
-```
-
-### Weights Saved in External Files
-
-OpenVINO™ supports ONNX models that store weights in external files. It is especially useful for models larger than 2GB because of protobuf limitations. To read such models, use the `model` parameter in the `IECore.read_network(model=path_to_onnx_file)` method. Note that the parameter for the path to the binary weight file, `weights=` should be empty in this case, because paths to external weights are saved directly in an ONNX model. Otherwise, a runtime exception is thrown. Reading models with external weights is **NOT** supported by the `read_network(weights=path_to_bin_file)` parameter.
-
-Paths to external weight files are saved in an ONNX model; these paths are relative to the model’s directory path. This means that if a model is located at `$HOME/workspace/models/model.onnx` and the file that contains external weights is at `$HOME/workspace/models/data/weights.bin`, the path saved in the model should be `data/weights.bin`.
-
-**NOTE**: 
-* A single model can use many external weights files.
-* Data of many tensors can be stored in a single external weights file (it is processed using offset and length values, which can be also saved in a model).
-
-The described mechanism is the only possibility to read weights from external files. The `weights` input parameter of the [IECore.read_network](api/ie_python_api/_autosummary/openvino.inference_engine.IECore.html#openvino.inference_engine.IECore.read_network) function is NOT supported for ONNX models and should not be passed, or set as None.
-
-Unsupported types of tensors:
-* string
-* complex64
-* complex128
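The relative-path rule for external weights described in the removed ONNX text can be illustrated with the standard library alone. This is a sketch under stated assumptions: `resolve_external_weights` is a hypothetical helper, not an OpenVINO or ONNX function, and rejecting absolute locations is a choice made for this example rather than documented behavior:

```python
from pathlib import PurePosixPath


def resolve_external_weights(model_path: str, stored_location: str) -> PurePosixPath:
    """Resolve a weights location stored in an ONNX model against the model's directory."""
    location = PurePosixPath(stored_location)
    if location.is_absolute():
        # Assumption for this sketch: only relative locations are accepted.
        raise ValueError("external weights location must be relative to the model directory")
    return PurePosixPath(model_path).parent / location


# Paths taken from the example in the removed documentation.
resolved = resolve_external_weights(
    "home/user/workspace/models/model.onnx", "data/weights.bin"
)
print(resolved)  # home/user/workspace/models/data/weights.bin
```

Because the stored location is resolved against the model's directory, moving `model.onnx` together with its `data` folder keeps the model loadable without rewriting the stored path.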
--- a/docs/OV_Runtime_UG/Paddle_Support.md
+++ b/docs/OV_Runtime_UG/Paddle_Support.md
@@ -1,52 +0,0 @@
-# Paddle Support in OpenVINO™ {#openvino_docs_IE_DG_Paddle_Support}
-
-Starting from the 2022.1 release, OpenVINO™ supports reading native Paddle models.
-The `Core::ReadNetwork()` method provides a uniform way to read models from either the Paddle format or IR, which is the recommended approach.
-
-## Read Paddle Models from IR
-
-The Paddle Model can be read after it is [converted](../MO_DG/prepare_model/convert_model/Convert_Model_From_Paddle.md) to [Intermediate Representation (IR)](../MO_DG/IR_and_opsets.md).
-
-**C++ Example:**
-
-```cpp
-InferenceEngine::Core core;
-auto network = core.ReadNetwork("model.xml");
-```
-
-**Python Example:**
-
-```python
-from openvino.inference_engine import IECore
-ie = IECore()
-net = ie.read_network("model.xml")
-```
-
-## Read Paddle Models from The Paddle Format (Paddle `inference model` model type)
-
-**C++ Example:**
-
-```cpp
-InferenceEngine::Core core;
-auto network = core.ReadNetwork("model.pdmodel");
-```
-
-**Python Example:**
-
-```python
-from openvino.inference_engine import IECore
-ie = IECore()
-net = ie.read_network("model.pdmodel")
-```
-
-**The Reshape feature:**
-
-OpenVINO™ does not provide a mechanism to specify pre-processing, such as mean values subtraction or reverse input channels, for the Paddle format.
-If a Paddle model contains dynamic shapes for input, use the `CNNNetwork::reshape` method for shape specialization.
-
-## NOTES
-
-* The Paddle [`inference model`](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.1/doc/doc_en/inference_en.md) mainly contains two kinds of files `model.pdmodel`(model file) and `model.pdiparams`(params file), which are used for inference.
-* The list of supported Paddle models and a description of how to export them can be found in [Convert a Paddle Model](../MO_DG/prepare_model/convert_model/Convert_Model_From_Paddle.md). The following Paddle models are supported by Intel CPU only: `Fast-SCNN`, `Yolo v3`, `ppyolo`, `MobileNetv3-SSD`, `BERT`.
-* For `Normalize` Paddle Models, the input data should be in FP32 format.
-* When reading Paddle models from the Paddle format, make sure that `model.pdmodel` and `model.pdiparams` are in the same folder.
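The last note in the removed Paddle text, that `model.pdmodel` and `model.pdiparams` must sit in the same folder, can be checked before a model is read. A minimal sketch using only the standard library; `find_paddle_params` is a hypothetical helper, and it assumes the params file shares the model file's base name, as in the `model.pdmodel` / `model.pdiparams` pair named in the notes:

```python
from pathlib import Path


def find_paddle_params(model_file: str) -> Path:
    """Return the .pdiparams file next to a .pdmodel file, or raise if it is missing."""
    model_path = Path(model_file)
    if model_path.suffix != ".pdmodel":
        raise ValueError(f"expected a .pdmodel file, got {model_path.name}")
    # Assumption: the params file shares the model's base name and folder.
    params_path = model_path.with_suffix(".pdiparams")
    if not params_path.is_file():
        raise FileNotFoundError(
            f"{params_path.name} must be in the same folder as {model_path.name}"
        )
    return params_path
```

Calling this before the read call turns a missing params file into an explicit `FileNotFoundError` that names both files.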
--- a/docs/OV_Runtime_UG/PythonPackage_Overview.md
+++ b/docs/OV_Runtime_UG/PythonPackage_Overview.md
@@ -1,14 +0,0 @@
-# OpenVINO™ Python* Package
-
-OpenVINO™ Python\* package includes types to measure model and calibrate to low precision. 
-
-The OpenVINO™ Python\* package is available in the `<INSTALL_DIR>/python/python3.X` directory.
-
-The OpenVINO™ Python\* package includes the following sub-packages:
-
- - [openvino.inference_engine](../../src/bindings/python/docs/api_overview.md) - Python\* wrapper on OpenVINO™ Inference Engine.
- - `openvino.tools.accuracy_checker` - Measure accuracy.
- - `openvino.tools.benchmark` - Measure latency and throughput.
-
-## See Also
-* [Integrate with Customer Application New API](Integrate_with_customer_application_new_API.md)
--- a/docs/OV_Runtime_UG/Samples_Overview.md
+++ b/docs/OV_Runtime_UG/Samples_Overview.md
@@ -1,4 +1,4 @@
-# Inference Engine Samples {#openvino_docs_IE_DG_Samples_Overview}
+# OpenVINO Samples {#openvino_docs_IE_DG_Samples_Overview}

@sphinxdirective

@@ -19,8 +19,8 @@
   openvino_inference_engine_ie_bridges_c_samples_hello_nv12_input_classification_README
   openvino_inference_engine_samples_hello_query_device_README
   openvino_inference_engine_ie_bridges_python_sample_hello_query_device_README
-   openvino_inference_engine_samples_ngraph_function_creation_sample_README
-   openvino_inference_engine_ie_bridges_python_sample_ngraph_function_creation_sample_README
+   openvino_inference_engine_samples_model_creation_sample_README
+   openvino_inference_engine_ie_bridges_python_sample_model_creation_sample_README
   openvino_inference_engine_samples_speech_sample_README
   openvino_inference_engine_ie_bridges_python_sample_speech_sample_README
   openvino_inference_engine_samples_benchmark_app_README
@@ -28,14 +28,14 @@

@endsphinxdirective

-The Inference Engine sample applications are simple console applications that show how to utilize specific Inference Engine capabilities within an application, assist developers in executing specific tasks such as loading a model, running inference, querying specific device capabilities and etc.
+The OpenVINO sample applications are simple console applications that show how to use specific OpenVINO API capabilities within an application and help developers execute specific tasks such as loading a model, running inference, and querying specific device capabilities.

 After installation of Intel® Distribution of OpenVINO™ toolkit, C, C++ and Python* sample applications are available in the following directories, respectively:
 * `<INSTALL_DIR>/samples/c`
 * `<INSTALL_DIR>/samples/cpp`
 * `<INSTALL_DIR>/samples/python`

-Inference Engine sample applications include the following:
+OpenVINO sample applications include the following:

 - **Speech Sample** - Acoustic model inference based on Kaldi neural networks and speech feature vectors.
   - [Automatic Speech Recognition C++ Sample](../../samples/cpp/speech_sample/README.md)
@@ -50,7 +50,7 @@ Inference Engine sample applications include the following:
 - **Hello NV12 Input Classification Sample** – Input of any size and layout can be provided to an infer request. The sample transforms the input to the NV12 color format and pre-process it automatically during inference. The sample supports only images as inputs.
   - [Hello NV12 Input Classification C++ Sample](../../samples/cpp/hello_nv12_input_classification/README.md)
   - [Hello NV12 Input Classification C Sample](../../samples/c/hello_nv12_input_classification/README.md)
-- **Hello Query Device Sample** – Query of available Inference Engine devices and their metrics, configuration values.
+- **Hello Query Device Sample** – Query of available OpenVINO devices and their metrics, configuration values.
   - [Hello Query Device C++ Sample](../../samples/cpp/hello_query_device/README.md)
   - [Hello Query Device Python* Sample](../../samples/python/hello_query_device/README.md)
 - **Hello Reshape SSD Sample** – Inference of SSD networks resized by ShapeInfer API according to an input size.
@@ -59,10 +59,10 @@ Inference Engine sample applications include the following:
 - **Image Classification Sample Async** – Inference of image classification networks like AlexNet and GoogLeNet using Asynchronous Inference Request API (the sample supports only images as inputs).
   - [Image Classification Async C++ Sample](../../samples/cpp/classification_sample_async/README.md)
   - [Image Classification Async Python* Sample](../../samples/python/classification_sample_async/README.md)
-- **nGraph Function Creation Sample** – Construction of the LeNet network using the nGraph function creation sample.
-   - [nGraph Function Creation C++ Sample](../../samples/cpp/ngraph_function_creation_sample/README.md)
-   - [nGraph Function Creation Python Sample](../../samples/python/ngraph_function_creation_sample/README.md)
- 
+- **OpenVINO Model Creation Sample** – Construction of the LeNet model using the OpenVINO model creation sample.
+   - [OpenVINO Model Creation C++ Sample](../../samples/cpp/model_creation_sample/README.md)
+   - [OpenVINO Model Creation Python Sample](../../samples/python/model_creation_sample/README.md)
+
 > **NOTE**: All C++ samples support input paths containing only ASCII characters, except the Hello Classification Sample, that supports Unicode.

 ## Media Files Available for Samples
@@ -79,8 +79,8 @@ To run the sample, you can use [public](@ref omz_models_group_public) or [Intel'

 The officially supported Linux* build environment is the following:

-* Ubuntu* 18.04 LTS 64-bit or CentOS* 7 64-bit
-* GCC* 7.5.0 (for Ubuntu* 18.04) or GCC* 4.8.5 (for CentOS* 7.6)
+* Ubuntu* 18.04 LTS 64-bit or Ubuntu* 20.04 LTS 64-bit
+* GCC* 7.5.0 (for Ubuntu* 18.04) or GCC* 9.3.0 (for Ubuntu* 20.04)
 * CMake* version 3.10 or higher

 > **NOTE**: For building samples from the open-source version of OpenVINO™ toolkit, see the [build instructions on GitHub](https://github.com/openvinotoolkit/openvino/wiki/BuildingCode).
@@ -102,7 +102,7 @@ You can also build the sample applications manually:
 ```sh
 mkdir build
 ```
-> **NOTE**: If you ran the Image Classification verification script during the installation, the C++ samples build directory was already created in your home directory: `~/inference_engine_samples_build/`
+> **NOTE**: If you ran the Image Classification verification script during the installation, the C++ samples build directory was already created in your home directory: `~/inference_engine_cpp_samples_build/`

 2. Go to the created directory:
 ```sh
@@ -130,22 +130,17 @@ for the debug configuration — in `<path_to_build_directory>/intel64/Debug/`.

 The recommended Windows* build environment is the following:
 * Microsoft Windows* 10
-* Microsoft Visual Studio* 2017, or 2019
+* Microsoft Visual Studio* 2019
 * CMake* version 3.10 or higher

-> **NOTE**: If you want to use Microsoft Visual Studio 2019, you are required to install CMake 3.14.
+> **NOTE**: If you want to use Microsoft Visual Studio 2019, you are required to install CMake 3.14 or higher.

 To build the C or C++ sample applications on Windows, go to the `<INSTALL_DIR>\samples\c` or `<INSTALL_DIR>\samples\cpp` directory, respectively, and run the `build_samples_msvc.bat` batch file:
 ```sh
 build_samples_msvc.bat
 ```

-By default, the script automatically detects the highest Microsoft Visual Studio version installed on the machine and uses it to create and build
-a solution for a sample code. Optionally, you can also specify the preferred Microsoft Visual Studio version to be used by the script. Supported
-versions are `VS2017` and `VS2019`. For example, to build the C++ samples using the Microsoft Visual Studio 2017, use the following command:
-```sh
-<INSTALL_DIR>\samples\cpp\build_samples_msvc.bat VS2017
-```
+By default, the script automatically detects the highest Microsoft Visual Studio version installed on the machine and uses it to create and build a solution for the sample code.

 Once the build is completed, you can find sample binaries in the following folders:
 * C samples: `C:\Users\<user>\Documents\Intel\OpenVINO\inference_engine_c_samples_build\intel64\Release`
@@ -159,7 +154,7 @@ directory.

 The officially supported macOS* build environment is the following:

-* macOS* 10.15 64-bit
+* macOS* 10.15 64-bit or higher
 * Clang* compiler from Xcode* 10.1 or higher
 * CMake* version 3.13 or higher

@@ -180,7 +175,7 @@ You can also build the sample applications manually:

 > **NOTE**: Before proceeding, make sure you have OpenVINO™ environment set correctly. This can be done manually by
 ```sh
-cd <INSTALL_DIR>/bin
+cd <INSTALL_DIR>/
 source setupvars.sh
 ```

@@ -188,7 +183,7 @@ source setupvars.sh
 ```sh
 mkdir build
 ```
-> **NOTE**: If you ran the Image Classification verification script during the installation, the C++ samples build directory was already created in your home directory: `~/inference_engine_samples_build/`
+> **NOTE**: If you ran the Image Classification verification script during the installation, the C++ samples build directory was already created in your home directory: `~/inference_engine_cpp_samples_build/`

 2. Go to the created directory:
 ```sh
@@ -217,7 +212,7 @@ for the debug configuration — in `<path_to_build_directory>/intel64/Debug/`.
 ### Get Ready for Running the Sample Applications on Linux*

 Before running compiled binary files, make sure your application can find the
-Inference Engine and OpenCV libraries.
+OpenVINO Runtime libraries.
 Run the `setupvars` script to set all necessary environment variables:
 ```sh
 source <INSTALL_DIR>/setupvars.sh
@@ -246,7 +241,7 @@ list above.
 ### Get Ready for Running the Sample Applications on Windows*

 Before running compiled binary files, make sure your application can find the
-Inference Engine and OpenCV libraries.
+OpenVINO Runtime libraries.
 Use the `setupvars` script, which sets all necessary environment variables:
 ```sh
 <INSTALL_DIR>\setupvars.bat
@@ -255,13 +250,13 @@ Use the `setupvars` script, which sets all necessary environment variables:
 To debug or run the samples on Windows in Microsoft Visual Studio, make sure you
 have properly configured **Debugging** environment settings for the **Debug**
 and **Release** configurations. Set correct paths to the OpenCV libraries, and
-debug and release versions of the Inference Engine libraries.
+debug and release versions of the OpenVINO Runtime libraries.
 For example, for the **Debug** configuration, go to the project's
 **Configuration Properties** to the **Debugging** category and set the `PATH`
 variable in the **Environment** field to the following:

 ```sh
-PATH=<INSTALL_DIR>\runtime\bin;<INSTALL_DIR>\opencv\bin;%PATH%
+PATH=<INSTALL_DIR>\runtime\bin;%PATH%
 ```
 where `<INSTALL_DIR>` is the directory in which the OpenVINO toolkit is installed.

@@ -270,4 +265,4 @@ sample, read the sample documentation by clicking the sample name in the samples
 list above.

 ## See Also
-* [Inference Engine Developer Guide](Deep_Learning_Inference_Engine_DevGuide.md)
+* [OpenVINO™ Runtime User Guide](openvino_intro.md)
--- a/docs/OV_Runtime_UG/ShapeInference.md
+++ b/docs/OV_Runtime_UG/ShapeInference.md
@@ -1,4 +1,4 @@
-# Using the Reshape Inference Feature {#openvino_docs_IE_DG_ShapeInference}
+# Changing input shapes {#openvino_docs_IE_DG_ShapeInference}

 ## Introduction (C++)

@@ -43,8 +43,7 @@ If a model has a hard-coded batch dimension, use `InferenceEngine::CNNNetwork::s

 Inference Engine takes three kinds of a model description as an input, which are converted into an `InferenceEngine::CNNNetwork` object:
 1. [Intermediate Representation (IR)](../MO_DG/IR_and_opsets.md) through `InferenceEngine::Core::ReadNetwork`
-2. [ONNX model](../OV_Runtime_UG/ONNX_Support.md) through `InferenceEngine::Core::ReadNetwork`
-3. [OpenVINO Model](../OV_Runtime_UG/model_representation.md) through the constructor of `InferenceEngine::CNNNetwork`
+2. [OpenVINO Model](../OV_Runtime_UG/model_representation.md) through the constructor of `InferenceEngine::CNNNetwork`

 `InferenceEngine::CNNNetwork` keeps an `ngraph::Function` object with the model description internally.
 The object should have fully-defined input shapes to be successfully loaded to Inference Engine plugins.
@@ -113,7 +112,7 @@ To keep the model valid after the reshape, choose a new input shape that satisfi
 For details, refer to the <a href="_docs_MO_DG_prepare_model_convert_model_tf_specific_Convert_Object_Detection_API_Models.html#tf_od_custom_input_shape">Tensorflow Object Detection API models resizing techniques</a>.

 ### Extensibility
-The Inference Engine provides a special mechanism that allows adding support of shape inference for custom operations. This mechanism is described in the [Extensibility documentation](Extensibility_DG/Intro.md)
+The Inference Engine provides a special mechanism that allows adding support of shape inference for custom operations. This mechanism is described in the [Extensibility documentation](../Extensibility_UG/Intro.md)

 ## Introduction (Python)

@@ -167,7 +166,7 @@ To feed input data of a shape that is different from the model input shape, resh

 Once the input shape of IENetwork is set, call the `IECore.load_network` method to get an ExecutableNetwork object for inference with updated shapes.

-There are other approaches to reshape the model during the stage of IR generation or [nGraph function](https://docs.openvino.ai/latest/openvino_docs_nGraph_DG_PythonAPI.html#create_an_ngraph_function_from_a_graph) creation.
+There are other approaches to reshape the model during the stage of IR generation or [OpenVINO model](https://docs.openvino.ai/latest/openvino_docs_nGraph_DG_PythonAPI.html#create_an_ngraph_function_from_a_graph) creation.

 Practically, some models are not ready to be reshaped. In this case, a new input shape cannot be set with the Model Optimizer or the `IENetwork.reshape` method.

@@ -219,7 +218,7 @@ exec_net = ie.load_network(network=net, device_name="CPU")
 ```

 ### Extensibility
-The Inference Engine provides a special mechanism that allows adding support of shape inference for custom operations. This mechanism is described in the [Extensibility documentation](Extensibility_DG/Intro.md)
+The Inference Engine provides a special mechanism that allows adding support of shape inference for custom operations. This mechanism is described in the [Extensibility documentation](../Extensibility_UG/Intro.md)

 ### See Also:

--- a/docs/OV_Runtime_UG/supported_plugins/AUTO.md
+++ b/docs/OV_Runtime_UG/supported_plugins/AUTO.md
@@ -1,4 +1,4 @@
-# Auto-Device Plugin {#openvino_docs_IE_DG_supported_plugins_AUTO}
+# Automatic device selection {#openvino_docs_IE_DG_supported_plugins_AUTO}

 ## Auto-Device Plugin Execution (C++)

@@ -39,7 +39,7 @@ There are two ways to use Auto-device:

 Both methods allow limiting the list of device candidates for the AUTO plugin.

-> **NOTE**: The Inference Engine lets you use "GPU" as an alias for "GPU.0" in function calls. 
+> **NOTE**: The OpenVINO Runtime lets you use "GPU" as an alias for "GPU.0" in function calls. 

 The Auto-device plugin supports query device optimization capabilities in metric.

@@ -49,8 +49,8 @@ The Auto-device plugin supports query device optimization capabilities in metric

 ### Enumerating Devices and Selection Logic

-The Inference Engine now features a dedicated API to enumerate devices and their capabilities. 
-See [Hello Query Device C++ Sample](../../../samples/cpp/hello_query_device/README.md).
+The OpenVINO Runtime API now features dedicated methods to enumerate devices and their capabilities.
+See [Hello Query Device C++ Sample](../../samples/cpp/hello_query_device/README.md).
 This is the example output from the sample (truncated to device names only):

 ```sh
@@ -85,7 +85,7 @@ For example, CPU, dGPU and iGPU can support the following precision and optimiza

 In cases when loading the network to dGPU or iGPU fails, CPU is the fall-back choice.

-According to the Auto-device selection logic from the previous section, tell the Inference Engine 
+According to the Auto-device selection logic from the previous section, tell the OpenVINO Runtime 
 to use the most suitable device from available devices as follows:

@snippet snippets/AUTO2.cpp part2
@@ -208,7 +208,7 @@ The Auto-device plugin supports query device optimization capabilities in metric

 ### Enumerating Devices and Selection Logic

-The Inference Engine now features a dedicated API to enumerate devices and their capabilities. See the [Hello Query Device Python Sample](../../../inference_engine/ie_bridges/python/sample_hello_query_device_README.html) for code.
+The OpenVINO Runtime API now features dedicated methods to enumerate devices and their capabilities. See the [Hello Query Device Python Sample](../../samples/python/hello_query_device/README.md) for code.

 This is the example output from the sample (truncated to device names only):

--- a/docs/OV_Runtime_UG/hetero_execution.md
+++ b/docs/OV_Runtime_UG/hetero_execution.md
@@ -0,0 +1,157 @@
+# Heterogeneous execution {#openvino_docs_OV_UG_Hetero_execution}
+
+## Introducing the Heterogeneous execution
+
+The heterogeneous execution enables computing the inference of one model on several devices. The purposes of executing models in heterogeneous mode are to:
+
+* Utilize the power of accelerators to process the heaviest parts of the model and to execute unsupported operations on fallback devices like the CPU
+* Utilize all available hardware more efficiently during one inference
+
+The execution through heterogeneous mode can be divided into two independent steps:
+
+1. Setting of hardware affinity to operations (ov::Core::query_model is used internally by the Hetero device)
+2. Compiling a model to the Heterogeneous device, which assumes splitting the model into parts, compiling them on the specified devices (via ov::device::priorities), and executing them through the Heterogeneous mode. The model is split into subgraphs according to the affinities, where a set of connected operations with the same affinity becomes a dedicated subgraph. Each subgraph is compiled on a dedicated device, which results in multiple ov::CompiledModel objects connected via automatically allocated intermediate tensors.
+
+These steps are decoupled. The setting of affinities can be done automatically using the `automatic fallback` policy or in `manual` mode:
+
+- The automatic fallback policy causes "greedy" behavior and assigns all operations that can be executed on a certain device according to the priorities you specify (for example, `ov::device::priorities("GPU,CPU")`).
+Automatic policy does not take into account device peculiarities such as the inability to infer some operations without other special operations placed before or after that layer. The plugin is responsible for solving such cases. If the device plugin does not support the subgraph topology constructed by the HETERO device, then you should set affinity manually.
+- Manual policy assumes explicit setting of affinities for all operations in the model using the runtime information ov::Node::get_rt_info.
+
+### Defining and Configuring the Hetero Device
+
+Following the OpenVINO™ convention of labeling devices, the Hetero execution uses the name `"HETERO"`. Configuration options for the Hetero device:
+
+| Parameter name | C++ property | Parameter values | Default | Description |
+| -------------- | ---------------- | ---------------- | --- | --- |
+| "MULTI_DEVICE_PRIORITIES" | `ov::device::priorities` | comma-separated device names with no spaces | N/A | Prioritized list of devices |
+
+### Automatic and manual policies for assigning affinities
+
+The `Automatic fallback` policy decides which operation goes to which device automatically, according to the support in dedicated devices (`GPU`, `CPU`, `MYRIAD`, etc.). The query model step is called implicitly by the Hetero device during model compilation:
+
+@sphinxdirective
+
+.. tab:: C++
+
+    .. doxygensnippet:: docs/snippets/ov_hetero.cpp
+       :language: cpp
+       :fragment: [compile_model]
+
+.. tab:: Python
+
+    .. doxygensnippet:: docs/snippets/ov_hetero.py
+       :language: python
+       :fragment: [compile_model]
+
+@endsphinxdirective
+
+Another way to annotate a model is to set all affinities `manually` using ov::Node::get_rt_info with key `"affinity"`:
+
+@sphinxdirective
+
+.. tab:: C++
+
+    .. doxygensnippet:: docs/snippets/ov_hetero.cpp
+       :language: cpp
+       :fragment: [set_manual_affinities]
+
+.. tab:: Python
+
+    .. doxygensnippet:: docs/snippets/ov_hetero.py
+       :language: python
+       :fragment: [set_manual_affinities]
+
+@endsphinxdirective
+
+The fallback policy does not work if at least one operation has an initialized `"affinity"`. If you want to adjust automatically set affinities, get the automatic affinities first, then fix them (usually to minimize the total number of subgraphs and thereby optimize memory transfers):
+
+@sphinxdirective
+
+.. tab:: C++
+
+    .. doxygensnippet:: docs/snippets/ov_hetero.cpp
+       :language: cpp
+       :fragment: [fix_automatic_affinities]
+
+.. tab:: Python
+
+    .. doxygensnippet:: docs/snippets/ov_hetero.py
+       :language: python
+       :fragment: [fix_automatic_affinities]
+
+@endsphinxdirective
+
+> **NOTE**: ov::Core::query_model does not depend on affinities set by a user. Instead, it queries for an operation support based on device capabilities.
+
+### Configure fallback devices
+If you want different devices in Hetero execution to have different device-specific configuration options, you can use the special helper property ov::device::properties:
+
+@sphinxdirective
+
+.. tab:: C++
+
+    .. doxygensnippet:: docs/snippets/ov_hetero.cpp
+       :language: cpp
+       :fragment: [configure_fallback_devices]
+
+.. tab:: Python
+
+    .. doxygensnippet:: docs/snippets/ov_hetero.py
+       :language: python
+       :fragment: [configure_fallback_devices]
+
+@endsphinxdirective
+
+In the example above, the `CPU` device is configured to enable profiling data, while only the `GPU` device has the configuration property to perform inference in `f16` precision; the CPU keeps its default execution precision.
+
+### Handling Difficult Topologies
+
+Some topologies are not friendly to heterogeneous execution on some devices or cannot be executed at all on a given device.
+For example, models having activation operations that are not supported on the primary device are split by the Hetero device into multiple sets of subgraphs, which leads to suboptimal execution.
+If transmitting data from one subgraph of a whole model to another part in heterogeneous mode takes more time than in normal execution, it may not make sense to execute them heterogeneously.
+In this case, you can define the heaviest part manually and set the affinity to avoid sending data back and forth many times during one inference.
+
+### Analyzing Performance of Heterogeneous Execution
+After enabling the <code>OPENVINO_HETERO_VISUALIZE</code> environment variable, you can dump GraphViz* `.dot` files with annotations of operations per device.
+
+The Heterogeneous device can generate two files:
+
+* `hetero_affinity_<model name>.dot` - annotation of affinities per operation.
+* `hetero_subgraphs_<model name>.dot` - annotation of affinities per graph.
+
+You can use the GraphViz* utility or a file converter to view the images. On the Ubuntu* operating system, you can use xdot:
+
+* `sudo apt-get install xdot`
+* `xdot hetero_subgraphs.dot`
+
+You can use performance data (in sample applications, it is the option `-pc`) to get the performance data on each subgraph.
+
+Here is an example of the output for Googlenet v1 running on HDDL with fallback to CPU:
+
+```
+subgraph1: 1. input preprocessing (mean data/HDDL):EXECUTED layerType:          realTime: 129   cpu: 129  execType:
+subgraph1: 2. input transfer to DDR:EXECUTED                layerType:          realTime: 201   cpu: 0    execType:
+subgraph1: 3. HDDL execute time:EXECUTED                    layerType:          realTime: 3808  cpu: 0    execType:
+subgraph1: 4. output transfer from DDR:EXECUTED             layerType:          realTime: 55    cpu: 0    execType:
+subgraph1: 5. HDDL output postprocessing:EXECUTED           layerType:          realTime: 7     cpu: 7    execType:
+subgraph1: 6. copy to IE blob:EXECUTED                      layerType:          realTime: 2     cpu: 2    execType:
+subgraph2: out_prob:          NOT_RUN                       layerType: Output   realTime: 0     cpu: 0    execType: unknown
+subgraph2: prob:              EXECUTED                      layerType: SoftMax  realTime: 10    cpu: 10   execType: ref
+Total time: 4212 microseconds
+```
+### Sample Usage
+
+OpenVINO™ sample programs can use the Heterogeneous execution with the `-d` option:
+
+```sh
+./hello_classification <path_to_model>/squeezenet1.1.xml <path_to_pictures>/picture.jpg HETERO:GPU,CPU
+```
+where:
+- `HETERO` stands for the Heterogeneous execution
+- `GPU,CPU` points to fallback policy with priority on GPU and fallback to CPU
+
+You can specify more than two devices: `-d HETERO:MYRIAD,GPU,CPU`
+
+### See Also
+[Supported Devices](supported_plugins/Supported_Devices.md)
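
The "greedy" automatic fallback policy and the subgraph splitting described in `hetero_execution.md` can be pictured with a small, self-contained sketch. This is a toy model and not the Hetero device implementation: the operation list, the `supported` table, and the helper names `assign_affinities` and `split_into_subgraphs` are assumptions made up for illustration, and the model is treated as a simple linear chain of operations.

```python
from itertools import groupby
from typing import Dict, List, Set, Tuple

Op = Tuple[str, str]  # (operation name, operation type)


def assign_affinities(ops: List[Op], priorities: List[str],
                      supported: Dict[str, Set[str]]) -> Dict[str, str]:
    """Greedy fallback: each op goes to the first device, in priority order, that supports it."""
    affinities = {}
    for name, op_type in ops:
        for device in priorities:
            if op_type in supported[device]:
                affinities[name] = device
                break
        else:
            raise ValueError(f"no device in {priorities} supports {op_type}")
    return affinities


def split_into_subgraphs(ops: List[Op],
                         affinities: Dict[str, str]) -> List[Tuple[str, List[str]]]:
    """Group consecutive ops of a linear chain that share the same affinity."""
    return [(device, [name for name, _ in group])
            for device, group in groupby(ops, key=lambda op: affinities[op[0]])]


# Hypothetical model and device capabilities, for illustration only.
ops = [("conv1", "Convolution"), ("relu1", "Relu"),
       ("custom", "MyCustomOp"), ("prob", "SoftMax")]
supported = {
    "GPU": {"Convolution", "Relu", "SoftMax"},
    "CPU": {"Convolution", "Relu", "SoftMax", "MyCustomOp"},
}
priorities = "GPU,CPU".split(",")  # same comma-separated form as in HETERO:GPU,CPU

affinities = assign_affinities(ops, priorities, supported)
subgraphs = split_into_subgraphs(ops, affinities)
print(affinities)
print(subgraphs)
```

In this toy chain a single operation that the primary device cannot run splits the model into three subgraphs (GPU, CPU, GPU), which is the kind of back-and-forth data transfer that manual affinities are meant to avoid.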
--- a/docs/OV_Runtime_UG/img/preprocess_not_fit.png
+++ b/docs/OV_Runtime_UG/img/preprocess_not_fit.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:8fed5e153636e3e556e000e3e5fc48b9da8f5a1272490550066d647d306ec24f
+size 81575
--- a/docs/OV_Runtime_UG/layout_overview.md
+++ b/docs/OV_Runtime_UG/layout_overview.md
@@ -0,0 +1,154 @@
+# Layout API overview {#openvino_docs_OV_Runtime_UG_Layout_Overview}
+
+## Introduction
+
+In a few words, with the layout `NCHW` it is easier to understand what a model's shape `{8, 3, 224, 224}` means. Without a layout it is just a 4-dimensional tensor.
+
+
+The concept of layout helps you (and your application) understand what each particular dimension of an input/output tensor means. For example, if your input has the shape `{1, 3, 720, 1280}` and the layout "NCHW", it is clear that `N(batch) = 1`, `C(channels) = 3`, `H(height) = 720` and `W(width) = 1280`. Without layout information, `{1, 3, 720, 1280}` gives your application no idea what these numbers mean or how to resize an input image to fit the model's expectations.
+
+
+Reasons why you may want to care about input/output layout:
+ - Perform model modification:
+    - Apply [preprocessing](./preprocessing_overview.md) steps, like subtract means, divide by scales, resize image, convert RGB<->BGR
+    - Set/get batch for a model
+ - Same operations, used during model conversion phase, see [Model Optimizer model conversion](../MO_DG/prepare_model/convert_model/Converting_Model.md)
+ - Improve readability of a model's input and output
+
+## Layout syntax
+
+### Short
+The easiest way is to fully specify each dimension with one alphabetical letter:
+
+@sphinxdirective
+
+.. tab:: C++
+
+      .. doxygensnippet:: docs/snippets/ov_layout.cpp
+         :language: cpp
+         :fragment: [ov:layout:simple]
+
+.. tab:: Python
+
+      .. doxygensnippet:: docs/snippets/ov_layout.py
+         :language: python
+         :fragment: [ov:layout:simple]
+
+@endsphinxdirective
+
+This assigns 'N' to the first dimension, 'C' to the second, 'H' to the third and 'W' to the fourth.
+
+### Advanced
+Advanced syntax allows assigning a word to a dimension. To do this, wrap the layout in square brackets `[]` and specify each name separated by a comma `,`:
+
+@sphinxdirective
+
+.. tab:: C++
+
+      .. doxygensnippet:: docs/snippets/ov_layout.cpp
+         :language: cpp
+         :fragment: [ov:layout:complex]
+
+.. tab:: Python
+
+      .. doxygensnippet:: docs/snippets/ov_layout.py
+         :language: python
+         :fragment: [ov:layout:complex]
+
+@endsphinxdirective
+
+
+### Partially defined layout
+If some dimension is not important, its name can be set to `?`:
+
+@sphinxdirective
+
+.. tab:: C++
+
+      .. doxygensnippet:: docs/snippets/ov_layout.cpp
+         :language: cpp
+         :fragment: [ov:layout:partially_defined]
+
+.. tab:: Python
+
+      .. doxygensnippet:: docs/snippets/ov_layout.py
+         :language: python
+         :fragment: [ov:layout:partially_defined]
+
+@endsphinxdirective
+
+
+### Dynamic layout
+If the number of dimensions is not important, an ellipsis `...` can be used to specify a variadic number of dimensions.
+
+@sphinxdirective
+
+.. tab:: C++
+
+      .. doxygensnippet:: docs/snippets/ov_layout.cpp
+         :language: cpp
+         :fragment: [ov:layout:dynamic]
+
+.. tab:: Python
+
+      .. doxygensnippet:: docs/snippets/ov_layout.py
+         :language: python
+         :fragment: [ov:layout:dynamic]
+
+@endsphinxdirective
+
+### Predefined names
+
+Layout has some predefined dimension names that are widely used in computer vision:
+- N/Batch - batch size
+- C/Channels - channels dimension
+- D/Depth - depth
+- H/Height - height
+- W/Width - width
+
+These names are used in the [PreProcessing API](./preprocessing_overview.md), and there is a set of helper functions to get the appropriate dimension index from a layout:
+
+@sphinxdirective
+
+.. tab:: C++
+
+      .. doxygensnippet:: docs/snippets/ov_layout.cpp
+         :language: cpp
+         :fragment: [ov:layout:predefined]
+
+.. tab:: Python
+
+      .. doxygensnippet:: docs/snippets/ov_layout.py
+         :language: python
+         :fragment: [ov:layout:predefined]
+
+@endsphinxdirective
+
+
+### Equality
+
+Layout names are case-insensitive, which means that ```Layout("NCHW") == Layout("nChW") == Layout("[N,c,H,w]")```
+
+### Dump layout
+
+A layout can be converted to a string in the advanced syntax format. This can be useful for debugging and serialization purposes:
+
+@sphinxdirective
+
+.. tab:: C++
+
+      .. doxygensnippet:: docs/snippets/ov_layout.cpp
+         :language: cpp
+         :fragment: [ov:layout:dump]
+
+.. tab:: Python
+
+      .. doxygensnippet:: docs/snippets/ov_layout.py
+         :language: python
+         :fragment: [ov:layout:dump]
+
+@endsphinxdirective
+
+## See also
+
+* <code>ov::Layout</code> C++ class documentation
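
The short and advanced syntaxes, the case-insensitive comparison, and the lookup of a dimension index described in `layout_overview.md` can be illustrated with a minimal sketch. It is plain Python and does not use `ov::Layout`; the helpers `parse_layout` and `dim_index` are hypothetical names, and the sketch deliberately ignores `?`, `...` and the long aliases such as `Batch`.

```python
from typing import List


def parse_layout(text: str) -> List[str]:
    """Return lower-cased dimension names for short ("NCHW") or advanced ("[N,C,H,W]") syntax."""
    text = text.strip()
    if text.startswith("[") and text.endswith("]"):
        # Advanced syntax: comma-separated names inside square brackets.
        names = [part.strip() for part in text[1:-1].split(",")]
    else:
        # Short syntax: one letter per dimension.
        names = list(text)
    return [name.lower() for name in names]


def dim_index(layout: str, name: str) -> int:
    """Position of a named dimension within the layout."""
    return parse_layout(layout).index(name.lower())


shape = (1, 3, 720, 1280)
same = parse_layout("NCHW") == parse_layout("nChW") == parse_layout("[N,c,H,w]")
height = shape[dim_index("NCHW", "H")]
width = shape[dim_index("NCHW", "W")]
print(same, height, width)
```

With the layout known, the application can read the height and width from the shape instead of guessing which of the four numbers is which.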
--- a/docs/OV_Runtime_UG/migration_ov_2_0/common_inference_pipeline.md
+++ b/docs/OV_Runtime_UG/migration_ov_2_0/common_inference_pipeline.md
@@ -0,0 +1,200 @@
+# Inference Pipeline {#openvino_2_0_inference_pipeline}
+
+Usually, to infer a model with the OpenVINO™ Runtime, a user needs to perform the following steps in the application pipeline:
+- 1. Create Core object
+- 2. Read model from the disk
+ - 2.1. (Optional) Model preprocessing
+- 3. Load the model to the device
+- 4. Create an inference request
+- 5. Fill input tensors with data
+- 6. Start inference
+- 7. Process the inference results
+
+Code snippets below cover these steps and show how application code should be changed for migration to OpenVINO™ Runtime 2.0.
+
+## 1. Create Core
+
+Inference Engine API:
+
+@snippet docs/snippets/ie_common.cpp ie:create_core
+
+OpenVINO™ Runtime API 2.0:
+
+@snippet docs/snippets/ov_common.cpp ov_api_2_0:create_core
+
+## 2. Read model from the disk
+
+Inference Engine API:
+
+@snippet docs/snippets/ie_common.cpp ie:read_model
+
+OpenVINO™ Runtime API 2.0:
+
+@snippet docs/snippets/ov_common.cpp ov_api_2_0:read_model
+
+Read model has the same structure as in the example from [Model Creation](./graph_construction.md) migration guide.
+
+Note, you can combine read and compile model stages into a single call `ov::Core::compile_model(filename, devicename)`.
+
+### 2.1 (Optional) Model preprocessing
+
+When the application's input data doesn't perfectly match the model's input format, preprocessing steps may need to be added.
+See the detailed guide on [how to migrate preprocessing in OpenVINO Runtime API 2.0](./preprocessing.md).
+
+## 3. Load the Model to the Device
+
+Inference Engine API:
+
+@snippet docs/snippets/ie_common.cpp ie:compile_model
+
+OpenVINO™ Runtime API 2.0:
+
+@snippet docs/snippets/ov_common.cpp ov_api_2_0:compile_model
+
+If you need to configure OpenVINO Runtime devices with additional configuration parameters, refer to the [Configure devices](./configure_devices.md) migration guide.
+
+## 4. Create an Inference Request
+
+Inference Engine API:
+
+@snippet docs/snippets/ie_common.cpp ie:create_infer_request
+
+OpenVINO™ Runtime API 2.0:
+
+@snippet docs/snippets/ov_common.cpp ov_api_2_0:create_infer_request
+
+## 5. Fill input tensors
+
+Inference Engine API fills inputs as `I32` precision (**not** aligned with the original model):
+
+@sphinxdirective
+
+.. tab:: IR v10
+
+    .. doxygensnippet:: docs/snippets/ie_common.cpp
+       :language: cpp
+       :fragment: [ie:get_input_tensor]
+
+.. tab:: IR v11
+
+    .. doxygensnippet:: docs/snippets/ie_common.cpp
+       :language: cpp
+       :fragment: [ie:get_input_tensor]
+       
+.. tab:: ONNX
+
+    .. doxygensnippet:: docs/snippets/ie_common.cpp
+       :language: cpp
+       :fragment: [ie:get_input_tensor]
+       
+.. tab:: Model created in code
+
+    .. doxygensnippet:: docs/snippets/ie_common.cpp
+       :language: cpp
+       :fragment: [ie:get_input_tensor]
+
+@endsphinxdirective
+
+OpenVINO™ Runtime API 2.0 fills inputs as `I64` precision (aligned with the original model):
+
+@sphinxdirective
+
+.. tab:: IR v10
+
+    .. doxygensnippet:: docs/snippets/ov_common.cpp
+       :language: cpp
+       :fragment: [ov_api_2_0:get_input_tensor_v10]
+
+.. tab:: IR v11
+
+    .. doxygensnippet:: docs/snippets/ov_common.cpp
+       :language: cpp
+       :fragment: [ov_api_2_0:get_input_tensor_aligned]
+       
+.. tab:: ONNX
+
+    .. doxygensnippet:: docs/snippets/ov_common.cpp
+       :language: cpp
+       :fragment: [ov_api_2_0:get_input_tensor_aligned]
+       
+.. tab:: Model created in code
+
+    .. doxygensnippet:: docs/snippets/ov_common.cpp
+       :language: cpp
+       :fragment: [ov_api_2_0:get_input_tensor_aligned]
+
+@endsphinxdirective
+
+## 6. Start Inference
+
+Inference Engine API:
+
+@snippet docs/snippets/ie_common.cpp ie:inference
+
+OpenVINO™ Runtime API 2.0:
+
+@snippet docs/snippets/ov_common.cpp ov_api_2_0:inference
+
+## 7. Process the Inference Results
+
+Inference Engine API processes outputs as `I32` precision (**not** aligned with the original model):
+
+@sphinxdirective
+
+.. tab:: IR v10
+
+    .. doxygensnippet:: docs/snippets/ie_common.cpp
+       :language: cpp
+       :fragment: [ie:get_output_tensor]
+
+.. tab:: IR v11
+
+    .. doxygensnippet:: docs/snippets/ie_common.cpp
+       :language: cpp
+       :fragment: [ie:get_output_tensor]
+       
+.. tab:: ONNX
+
+    .. doxygensnippet:: docs/snippets/ie_common.cpp
+       :language: cpp
+       :fragment: [ie:get_output_tensor]
+       
+.. tab:: Model created in code
+
+    .. doxygensnippet:: docs/snippets/ie_common.cpp
+       :language: cpp
+       :fragment: [ie:get_output_tensor]
+
+@endsphinxdirective
+
+OpenVINO™ Runtime API 2.0 processes outputs:
+- For IR v10 as `I32` precision (**not** aligned with the original model) to match **old** behavior
+- For IR v11, ONNX, ov::Model, Paddle as `I64` precision (aligned with the original model) to match **new** behavior
+
+@sphinxdirective
+
+.. tab:: IR v10
+
+    .. doxygensnippet:: docs/snippets/ov_common.cpp
+       :language: cpp
+       :fragment: [ov_api_2_0:get_output_tensor_v10]
+
+.. tab:: IR v11
+
+    .. doxygensnippet:: docs/snippets/ov_common.cpp
+       :language: cpp
+       :fragment: [ov_api_2_0:get_output_tensor_aligned]
+       
+.. tab:: ONNX
+
+    .. doxygensnippet:: docs/snippets/ov_common.cpp
+       :language: cpp
+       :fragment: [ov_api_2_0:get_output_tensor_aligned]
+       
+.. tab:: Model created in code
+
+    .. doxygensnippet:: docs/snippets/ov_common.cpp
+       :language: cpp
+       :fragment: [ov_api_2_0:get_output_tensor_aligned]
+
+@endsphinxdirective
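
Steps 5 and 7 of `common_inference_pipeline.md` hinge on the element type used to fill and read tensors (`I32` versus `I64`). The sketch below uses only the Python standard library and does not call OpenVINO; the token values are made up. It shows, under the assumption of a little-endian buffer, why the application has to use the same element type as the tensor it touches.

```python
import struct

# Hypothetical input values, e.g. token ids for a language model.
token_ids = [101, 2023, 102]

# Buffer laid out as three little-endian 64-bit integers (the I64 case).
buf = struct.pack("<3q", *token_ids)

# Reading with the matching element type returns the original values.
as_i64 = struct.unpack("<3q", buf)

# Reading the same bytes as 32-bit integers (the I32 case) yields twice
# as many elements, with the original values interleaved with zeros.
as_i32 = struct.unpack("<6i", buf)

print(as_i64)
print(as_i32)
```

The mismatch does not raise an error by itself, which is why the migration guide calls out the precision expected for each model format explicitly.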
--- a/docs/OV_Runtime_UG/migration_ov_2_0/configure_devices.md
+++ b/docs/OV_Runtime_UG/migration_ov_2_0/configure_devices.md
@@ -0,0 +1,129 @@
+# Configure devices {#openvino_2_0_configure_devices}
+
+### Introduction
+
+Inference Engine API provides an [ability to configure devices](https://docs.openvino.ai/2021.4/openvino_docs_IE_DG_InferenceEngine_QueryAPI.html) via configuration keys and [get device specific metrics](https://docs.openvino.ai/2021.4/openvino_docs_IE_DG_InferenceEngine_QueryAPI.html#getmetric). The values taken from `InferenceEngine::Core::GetConfig` are requested by their string name, while the return type is `InferenceEngine::Parameter`, so users don't know what actual type is stored in this parameter.
+
+OpenVINO Runtime API 2.0 solves these issues by introducing [properties](../supported_plugins/config_properties.md), which unify the metric and configuration key concepts. The main advantage of properties is that they have a C++ type:
+
+```
+static constexpr Property<std::string> full_name{"FULL_DEVICE_NAME"};
+```
+
+And the property can be requested from an inference device as:
+
+@snippet ov_properties_migration.cpp core_get_ro_property
+
+The snippets below show how to migrate from Inference Engine device configuration to OpenVINO Runtime API 2.0.
+
+### Set configuration values
+
+Inference Engine API:
+
+@sphinxdirective
+
+.. tab:: Devices
+
+    .. doxygensnippet:: docs/snippets/ov_properties_migration.cpp
+       :language: cpp
+       :fragment: [core_set_config]
+
+.. tab:: Model Loading
+
+    .. doxygensnippet:: docs/snippets/ov_properties_migration.cpp
+       :language: cpp
+       :fragment: [core_load_network]
+
+.. tab:: Execution
+
+    .. doxygensnippet:: docs/snippets/ov_properties_migration.cpp
+       :language: cpp
+       :fragment: [executable_network_set_config]
+
+@endsphinxdirective
+
+OpenVINO Runtime API 2.0:
+
+@sphinxdirective
+
+.. tab:: Devices
+
+    .. doxygensnippet:: docs/snippets/ov_properties_migration.cpp
+       :language: cpp
+       :fragment: [core_set_property]
+
+.. tab:: Model Loading
+
+    .. doxygensnippet:: docs/snippets/ov_properties_migration.cpp
+       :language: cpp
+       :fragment: [core_compile_model]
+
+.. tab:: Execution
+
+    .. doxygensnippet:: docs/snippets/ov_properties_migration.cpp
+       :language: cpp
+       :fragment: [compiled_model_set_property]
+
+@endsphinxdirective
+
+### Get information
+
+Inference Engine API:
+
+@sphinxdirective
+
+.. tab:: Device configuration
+
+    .. doxygensnippet:: docs/snippets/ov_properties_migration.cpp
+       :language: cpp
+       :fragment: [core_get_config]
+
+.. tab:: Device metrics
+
+    .. doxygensnippet:: docs/snippets/ov_properties_migration.cpp
+       :language: cpp
+       :fragment: [core_get_metric]
+
+.. tab:: Execution config
+
+    .. doxygensnippet:: docs/snippets/ov_properties_migration.cpp
+       :language: cpp
+       :fragment: [executable_network_get_config]
+
+.. tab:: Execution metrics
+
+    .. doxygensnippet:: docs/snippets/ov_properties_migration.cpp
+       :language: cpp
+       :fragment: [executable_network_get_metric]
+
+@endsphinxdirective
+
+OpenVINO Runtime API 2.0:
+
+@sphinxdirective
+
+.. tab:: Device configuration
+
+    .. doxygensnippet:: docs/snippets/ov_properties_migration.cpp
+       :language: cpp
+       :fragment: [core_get_rw_property]
+
+.. tab:: Device metrics
+
+    .. doxygensnippet:: docs/snippets/ov_properties_migration.cpp
+       :language: cpp
+       :fragment: [core_get_ro_property]
+
+.. tab:: Execution config
+
+    .. doxygensnippet:: docs/snippets/ov_properties_migration.cpp
+       :language: cpp
+       :fragment: [compiled_model_get_rw_property]
+
+.. tab:: Execution metrics
+
+    .. doxygensnippet:: docs/snippets/ov_properties_migration.cpp
+       :language: cpp
+       :fragment: [compiled_model_get_ro_property]
+
+@endsphinxdirective
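
The idea behind typed properties in `configure_devices.md` (a string key paired with a known value type, instead of an untyped parameter) can be sketched in plain Python. This is an analogy only, not the OpenVINO API: the `Property` class, `get_property`, the `TOY_STREAMS` key and the `device_config` dictionary are all hypothetical; only the `"FULL_DEVICE_NAME"` key comes from the text above.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Generic, TypeVar

T = TypeVar("T")


@dataclass(frozen=True)
class Property(Generic[T]):
    """A string key bundled with the conversion to its value type."""
    name: str
    convert: Callable[[Any], T]


# Key taken from the text above; the value type is str.
FULL_NAME: Property[str] = Property("FULL_DEVICE_NAME", str)
# Hypothetical integer-valued key, for illustration only.
TOY_STREAMS: Property[int] = Property("TOY_STREAMS", int)


def get_property(config: Dict[str, Any], prop: Property[T]) -> T:
    """Look the value up by key and return it with the property's type."""
    return prop.convert(config[prop.name])


# Stand-in for a device's configuration store.
device_config = {"FULL_DEVICE_NAME": "Toy CPU", "TOY_STREAMS": "4"}

full_name = get_property(device_config, FULL_NAME)
streams = get_property(device_config, TOY_STREAMS)
print(full_name, streams + 1)
```

Because the type travels with the key, the caller gets an `int` or a `str` back directly and does not need to know how the value is stored.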
--- a/docs/OV_Runtime_UG/migration_ov_2_0/graph_construction.md
+++ b/docs/OV_Runtime_UG/migration_ov_2_0/graph_construction.md
@@ -0,0 +1,16 @@
+# Model creation in runtime {#openvino_2_0_model_creation}
+
+OpenVINO™ Runtime API 2.0 includes the nGraph engine as a common part. The `ngraph` namespace was changed to `ov`; the rest of the nGraph API is preserved as is.
+The code snippets below show how to change application code when migrating to OpenVINO™ Runtime API 2.0.
+
+### nGraph API
+
+@snippet snippets/ngraph.cpp ngraph:graph
+
+### OpenVINO™ Runtime API 2.0
+
+@snippet snippets/ov_graph.cpp ov:graph
+
+**See also:**
+- [Hello Model Creation C++ Sample](../../../samples/cpp/model_creation_sample/README.md)
+- [Hello Model Creation Python Sample](../../../samples/python/model_creation_sample/README.md)
--- a/docs/OV_Runtime_UG/migration_ov_2_0/intro.md
+++ b/docs/OV_Runtime_UG/migration_ov_2_0/intro.md
@@ -0,0 +1,81 @@
+# OpenVINO™ 2.0 Transition Guide {#openvino_2_0_transition_guide}
+
+@sphinxdirective
+
+.. toctree::
+   :maxdepth: 1
+   :hidden:
+   
+   openvino_2_0_inference_pipeline
+   openvino_2_0_configure_devices
+   openvino_2_0_preprocessing
+   openvino_2_0_model_creation
+      
+@endsphinxdirective
+
+### Introduction
+
+Older versions of OpenVINO (prior to 2022.1) required changes to application logic when a user migrated from frameworks such as TensorFlow, ONNX Runtime, PyTorch, or PaddlePaddle. These changes were needed for the following reasons:
+
+- Model Optimizer changed input precisions for some inputs. For example, natural language processing models with `I64` inputs were converted to have the `I32` input element type.
+- Model Optimizer changed layouts for TensorFlow models (see [Layouts in OpenVINO](../layout_overview.md)). As a result, users unexpectedly had to supply input data in a layout different from the one used in the framework:
+![tf_openvino]
+- Inference Engine API (`InferenceEngine::CNNNetwork`) also applied some conversion rules for input and output precisions because of device plugin limitations.
+- Users needed to specify input shapes during model conversion in Model Optimizer and work with static shapes in the application.
+
+OpenVINO Runtime API 2.0 is introduced to align the logic of working with a model with how it is done in the frameworks: layouts and precisions are not changed, and inputs and outputs are addressed by tensor names and indices. OpenVINO Runtime is composed of the Inference Engine API, used for inference, and the nGraph API, targeted at working with models and operations. The OpenVINO API 2.0 has a common structure, naming convention style, and namespaces, and removes duplicated structures. See [How to migrate to OpenVINO 2.0 API](./common_inference_pipeline.md) for details.
+
+> **NOTE**: Most importantly, your existing application can continue working with OpenVINO Runtime 2.0 as before, but we recommend migrating to the new API to unlock additional features such as [Preprocessing](../preprocessing_overview.md) and [Dynamic shapes support](../DynamicBatching.md).
+
+### Introduce IR v11
+
+To support these features, OpenVINO introduced IR v11, which Model Optimizer generates by default since 2022.1. A model represented in IR v11 fully matches the original model in the original framework format in terms of inputs and outputs. Also, a user does not have to specify input shapes during conversion, so the resulting IR v11 contains `-1` to denote undefined dimensions (see [Working with dynamic shapes](../DynamicBatching.md) to fully utilize this feature, or [Changing input shapes](../ShapeInference.md) to reshape to static shapes in the application).
+
+It is also important to mention that IR v11 is fully compatible with old applications written with the Inference Engine API from older versions of OpenVINO. This is achieved by adding runtime information to the IR v11 that is responsible for backward-compatible behavior. So, once an IR v11 is read by an old Inference Engine based application, it is internally converted to IR v10 to provide backward-compatible behavior.
+
+IR v11 is supported by all OpenVINO Development Tools, including the Post-Training Optimization Tool, Benchmark app, etc.
+
+### IR v10 compatibility
+
+OpenVINO Runtime API 2.0 also supports models in IR v10 for backward compatibility. So, if a user has an IR v10 model, it can be fed to OpenVINO Runtime as well (see [migration steps](./common_inference_pipeline.md)).
+
+Some OpenVINO Development Tools support both IR v10 and IR v11 as an input:
+- Accuracy Checker supports IR v10, but requires an additional option to denote which API is used underneath.
+- [Compile tool](../../../tools/compile_tool/README.md) compiles the model to be used in OpenVINO 2.0 API by default. If a user wants to use the resulting compiled blob in Inference Engine API, the additional `ov_api_1_0` option should be passed.
+
+The following OpenVINO tools do not support IR v10 as an input; they require regenerating an IR v11 from the original model with the latest Model Optimizer:
+- Post-Training Optimization Tool
+- Deep Learning Workbench
+
+### Differences between Inference Engine and OpenVINO Runtime 2.0
+
+Inference Engine and nGraph APIs are not deprecated; they are fully functional and can be used in applications. However, OpenVINO recommends migrating to the new OpenVINO Runtime API 2.0, because it already has additional features and this list will be extended later. The new API supports the following additional features:
+- [Working with dynamic shapes](../DynamicBatching.md). The feature is useful for getting the best performance from NLP (Natural Language Processing) models, super resolution models, and other models that accept dynamic input shapes.
+- [Preprocessing of the model](../preprocessing_overview.md) to add preprocessing operations to the inference models, fully occupy the accelerator, and free CPU resources.
+
+To define the difference on the API level between Inference Engine and OpenVINO Runtime API 2.0, let's define two types of behaviors:
+- **Old behavior** of OpenVINO assumes:
+  - Model Optimizer can change input element types and the order of dimensions (layouts) compared to the model from the original framework.
+  - Inference Engine can override input and output element types.
+  - Inference Engine API operates with operation names to address inputs and outputs (e.g. `InferenceEngine::InferRequest::GetBlob`).
+  - Compiling models with dynamic input shapes is not supported.
+- **New behavior** assumes full model alignment with the framework and is implemented in OpenVINO 2.0:
+  - Model Optimizer preserves the input element types and the order of dimensions (layouts), and stores tensor names from the original models.
+  - OpenVINO Runtime 2.0 reads models in any format (IR v10, IR v11, ONNX, PaddlePaddle, etc.) as is.
+  - OpenVINO Runtime API 2.0 operates with tensor names. The difference between tensor names and operation names is that if a single operation has several output tensors, such tensors cannot be identified uniquely by the operation name, so tensor names are used for addressing, as is usually done in the frameworks.
+  - OpenVINO Runtime API 2.0 can also address input and output tensors by index. Some model formats like ONNX are sensitive to the order of inputs and outputs, which is preserved by OpenVINO Runtime 2.0.
+
+The table below shows which behavior, **old** or **new**, is used depending on the model source and the API used.
+
+|               API             | IR v10  | IR v11  | ONNX file | Model created in code |
+|-------------------------------|---------|---------|-----------|-----------------------|
+|Inference Engine / ngraph APIs |     Old |     Old |       Old |                   Old |
+|OpenVINO Runtime API 2.0       |     Old |     New |       New |                   New |
+
+See the following transition guides to understand how to migrate an Inference Engine-based application to OpenVINO™ Runtime API 2.0:
+ - [OpenVINO™ Common Inference pipeline](common_inference_pipeline.md)
+ - [Preprocess your model](./preprocessing.md)
+ - [Configure device](./configure_devices.md)
+ - [OpenVINO™ Model Creation](graph_construction.md)
+
+[tf_openvino]: ../../img/tf_openvino.png
--- a/docs/OV_Runtime_UG/migration_ov_2_0/preprocessing.md
+++ b/docs/OV_Runtime_UG/migration_ov_2_0/preprocessing.md
@@ -0,0 +1,64 @@
+# Preprocessing {#openvino_2_0_preprocessing}
+
+### Introduction
+
+Inference Engine API has preprocessing capabilities in the `InferenceEngine::CNNNetwork` class. Such preprocessing information is not a part of the main inference graph executed by the [OpenVINO devices](../supported_plugins/Device_Plugins.md), so it is stored and executed separately before the inference stage:
+- Preprocessing operations are executed on the CPU for most of the OpenVINO inference plugins. So, instead of only the accelerators being occupied, the CPU is also busy with computational tasks.
+- Preprocessing information stored in `InferenceEngine::CNNNetwork` is lost when the model is saved back to the IR file format.
+
+OpenVINO Runtime API 2.0 introduces a [new way of adding preprocessing operations to the model](../preprocessing_overview.md): each preprocessing or postprocessing operation is integrated directly into the model and compiled together with the inference graph:
+- First, add preprocessing operations using `ov::preprocess::PrePostProcessor`.
+- Then, compile the model on the target using `ov::Core::compile_model`.
+
+Having preprocessing operations as a part of the OpenVINO opset allows reading and serializing a preprocessed model in the IR file format.
+
+It is also important to mention that since OpenVINO 2.0, the Runtime API does not assume any default layouts as Inference Engine did. For example, Inference Engine assumed that both `{ 1, 224, 224, 3 }` and `{ 1, 3, 224, 224 }` shapes had `NCHW` layout, while only the latter shape actually has `NCHW`. So, some preprocessing capabilities in OpenVINO Runtime API 2.0 require layouts to be set explicitly; see [Layout overview](../layout_overview.md) for how to do it. For example, to perform image scaling by partial dimensions `H` and `W`, preprocessing needs to know which dimensions are `H` and `W`.
+
+> **NOTE**: Use Model Optimizer preprocessing capabilities to insert and optimize preprocessing operations in the model. In this case, you don't need to read the model in the runtime application and set preprocessing there; you can use the [model caching feature](../Model_caching_overview.md) to improve time to the inference stage.
+
+The steps below demonstrate how to migrate preprocessing scenarios from Inference Engine API to OpenVINO Runtime API 2.0.
+The snippets assume we need to preprocess a model input with the tensor name `tensor_name`; in Inference Engine API, which uses operation names to address the data, it is called `operation_name`.
+
+### Mean and scale values
+
+Inference Engine API:
+
+@snippet docs/snippets/ov_preprocessing_migration.cpp mean_scale
+
+OpenVINO Runtime API 2.0:
+
+@snippet docs/snippets/ov_preprocessing_migration.cpp ov_mean_scale
+
+### Precision and layout conversions
+
+Inference Engine API:
+
+@snippet docs/snippets/ov_preprocessing_migration.cpp conversions
+
+OpenVINO Runtime API 2.0:
+
+@snippet docs/snippets/ov_preprocessing_migration.cpp ov_conversions
+
+### Image scaling
+
+Inference Engine API:
+
+@snippet docs/snippets/ov_preprocessing_migration.cpp image_scale
+
+OpenVINO Runtime API 2.0:
+
+@snippet docs/snippets/ov_preprocessing_migration.cpp ov_image_scale
+
+### Color space conversions
+
+Inference Engine API:
+
+@snippet docs/snippets/ov_preprocessing_migration.cpp color_space
+
+OpenVINO Runtime API 2.0:
+
+@snippet docs/snippets/ov_preprocessing_migration.cpp ov_color_space
+
+**See also:**
+- [Preprocessing details](../preprocessing_details.md)
+- [NV12 classification sample](../../../samples/cpp/hello_nv12_input_classification/README.md)
--- a/docs/OV_Runtime_UG/model_representation.md
+++ b/docs/OV_Runtime_UG/model_representation.md
@@ -14,7 +14,7 @@ For details on how to build a model in OpenVINO™ Runtime, see the [Build a Mod

 ## Operations

-The `ov::Op` class represents any abstract operation in the model representation. Use this class to create [custom operations](../OV_Runtime_UG/Extensibility_DG/AddingNGraphOps.md).
+The `ov::Op` class represents any abstract operation in the model representation. Use this class to create [custom operations](../Extensibility_UG/add_openvino_ops).

 ## Operation Sets

@@ -39,7 +39,7 @@ Operation set `opsetX` integrates a list of pre-compiled operations that work

 For a complete list of operation sets supported in OpenVINO™ toolkit, see [Available Operations Sets](../ops/opset.md).

-To add suport of custom operations, see the [Add Custom OpenVINO Operations](../OV_Runtime_UG/Extensibility_DG/Intro.md) document.
+To add support of custom operations, see the [Add Custom OpenVINO Operations](../Extensibility_UG/Intro.md) document.

 To build an `ov::Model` instance from `opset8` operations, include the following files:

@@ -83,9 +83,9 @@ The following code creates a model with several outputs:
    @snippet example_ngraph_utils.cpp ov:serialize

 ### How can I develop my own transformation pass?
-   See the [Transformations Developer Guide](./nGraphTransformation.md).
+   See the [Transformations Developer Guide](./../Extensibility_UG/ov_transformations.md).

 ## See Also

 * [Available Operation Sets](../ops/opset.md)
-* [OpenVINO™ Runtime Extensibility Developer Guide](../OV_Runtime_UG/Extensibility_DG/Intro.md)
+* [OpenVINO™ Runtime Extensibility Developer Guide](../Extensibility_UG/Intro.md)
--- a/docs/OV_Runtime_UG/supported_plugins/MULTI.md
+++ b/docs/OV_Runtime_UG/supported_plugins/MULTI.md
@@ -1,4 +1,4 @@
-# Multi-Device Plugin {#openvino_docs_IE_DG_supported_plugins_MULTI}
+# Running on multiple devices simultaneously {#openvino_docs_OV_UG_Running_on_multiple_devices}

 ## Introducing the Multi-Device Plugin (C++)

@@ -32,7 +32,7 @@ Following the OpenVINO™ convention of labeling devices, the Multi-Device plugi
 | "MULTI_DEVICE_PRIORITIES" | comma-separated device names with no spaces | N/A | Prioritized list of devices |

 You can set the configuration directly as a string, or use the metric key `MultiDeviceConfigParams::KEY_MULTI_DEVICE_PRIORITIES` from the `multi/multi_device_config.hpp` file, which defines the same string.
- 
+
 Basically, there are three ways to specify the devices to be used by the "MULTI":

@snippet snippets/MULTI0.cpp part0
@@ -44,7 +44,7 @@ Notice that the priorities of the devices can be changed in real time for the ex
 Finally, there is a way to specify number of requests that the Multi-Device will internally keep for each device. Suppose your original app was running 4 cameras with 4 inference requests. You would probably want to share these 4 requests between 2 devices used in MULTI. The easiest way is to specify a number of requests for each device using parentheses: "MULTI:CPU(2),GPU(2)" and use the same 4 requests in your app. However, such an explicit configuration is not performance-portable and hence not recommended. Instead, the better way is to configure the individual devices and query the resulting number of requests to be used at the application level (see [Configuring the Individual Devices and Creating the Multi-Device On Top](#configuring-the-individual-devices-and-creating-the-multi-device-on-top)).

 ### Enumerating Available Devices
-The Inference Engine features a dedicated API to enumerate devices and their capabilities. See the [Hello Query Device C++ Sample](../../../samples/cpp/hello_query_device/README.md). This is example output from the sample (truncated to device names only):
+The OpenVINO Runtime API features dedicated methods to enumerate devices and their capabilities. See the [Hello Query Device C++ Sample](../../samples/cpp/hello_query_device/README.md). This is example output from the sample (truncated to device names only):

 ```sh
  ./hello_query_device
@@ -86,13 +86,13 @@ Note that while the performance of accelerators combines really well with Multi-
 See the [Using the Multi-Device with OpenVINO samples and benchmarking the performance](#using-the-multi-device-with-openvino-samples-and-benchmarking-the-performance) section below.

 ### Querying the Optimal Number of Inference Requests
-You can use the new GetMetric API to query the optimal number of requests. Similarly, when using the Multi-Device you don't need to sum over included devices yourself, you can query metric directly:
+You can use the [configure devices](supported_plugins/config_properties.md) API to query the optimal number of requests. Similarly, when using the Multi-Device, you don't need to sum over the included devices yourself; you can query the property directly:

@snippet snippets/MULTI5.cpp part5

 ### Using the Multi-Device with OpenVINO Samples and Benchmarking the Performance

-Every OpenVINO sample that supports the `-d` (which stands for "device") command-line option transparently accepts Multi-Device. The [Benchmark Application](../../../samples/cpp/benchmark_app/README.md) is the best reference for the optimal usage of Multi-Device. As discussed earlier, you do not need to set up the number of requests, CPU streams or threads because the application provides optimal performance out of the box. Below is an example command to evaluate HDDL+GPU performance with that:
+Every OpenVINO sample that supports the `-d` (which stands for "device") command-line option transparently accepts Multi-Device. The [Benchmark Application](../../samples/cpp/benchmark_app/README.md) is the best reference for the optimal usage of Multi-Device. As discussed earlier, you do not need to set up the number of requests, CPU streams or threads because the application provides optimal performance out of the box. Below is an example command to evaluate HDDL+GPU performance with that:

 ```sh
 ./benchmark_app -d MULTI:HDDL,GPU -m <model> -i <input> -niter 1000
@@ -110,7 +110,7 @@ The Multi-Device plugin supports FP16 IR files. The CPU plugin automatically upc
@endsphinxdirective

 ### See Also
-[Supported Devices](Supported_Devices.md)
+[Supported Devices](supported_plugins/Supported_Devices.md)

 ## Introducing the Multi-Device Plugin (Python)

@@ -182,7 +182,7 @@ You can set the configuration directly as a string, or use the metric key `MULTI


 ### Enumerating Available Devices
-The Inference Engine features a dedicated API to enumerate devices and their capabilities. See the [Hello Query Device Python Sample](../../../samples/python/hello_query_device/README.md). This is example output from the sample (truncated to device names only):
+The OpenVINO Runtime API features dedicated methods to enumerate devices and their capabilities. See the [Hello Query Device Python Sample](../../samples/python/hello_query_device/README.md). This is example output from the sample (truncated to device names only):

 ```sh
  ./hello_query_device
@@ -268,7 +268,7 @@ Note that while the performance of accelerators works well with Multi-Device, th

 ### Using the Multi-Device with OpenVINO Samples and Benchmarking the Performance

-Every OpenVINO sample that supports the `-d` (which stands for "device") command-line option transparently accepts Multi-Device. The [Benchmark application](../../../tools/benchmark_tool/README.md) is the best reference for the optimal usage of Multi-Device. As discussed earlier, you do not need to set up the number of requests, CPU streams or threads because the application provides optimal performance out of the box. Below is an example command to evaluate CPU+GPU performance with the Benchmark application:
+Every OpenVINO sample that supports the `-d` (which stands for "device") command-line option transparently accepts Multi-Device. The [Benchmark application](../../tools/benchmark_tool/README.md) is the best reference for the optimal usage of Multi-Device. As discussed earlier, you do not need to set up the number of requests, CPU streams or threads because the application provides optimal performance out of the box. Below is an example command to evaluate CPU+GPU performance with the Benchmark application:

 ```sh
 ./benchmark_app.py -d MULTI:CPU,GPU -m <model>
@@ -289,4 +289,4 @@ The Multi-Device plugin supports FP16 IR files. The CPU plugin automatically upc
@endsphinxdirective

 ### See Also
-[Supported Devices](Supported_Devices.md)
+[Supported Devices](supported_plugins/Supported_Devices.md)
--- a/docs/OV_Runtime_UG/nGraphTransformation.md
+++ b/docs/OV_Runtime_UG/nGraphTransformation.md
@@ -1,449 +0,0 @@
-# Overview of Transformations API {#ngraph_transformation}
-
-This guide contains all necessary information that you need to start implementing nGraph transformations.
-
-## Prerequisites
-Before creating a transformation, do the following:
-
-* Make sure that there is no transformation with the same functionality in the [Transformation Library](group__ie__transformation__api.html)
-* Learn how the [Transformation Library](group__ie__transformation__api.html) is structured and how transformations are organized
-* Understand where to put your transformation code
-
-### Transformation Library Structure
-OpenVINO transformations are located in the `src/common/transformations` directory.
-
-Transformations root directory contains two folders:
-* `ngraph_ops` - Contains internal opset operations that are common for plugins.
-* `transformations` - Includes all transformations, utils, runtime info attributes, and pass managers.
-
-All internal operations and transformations located inside the [Transformation Library](group__ie__transformation__api.html) can be used inside plugins.
-All legacy operations and transformations were moved to a legacy library and are not recommended to be used.
-
-### Transformation Flow Layers
-Transformation flow in the transformation library has several layers:
-
-1. Pass managers - Execute any type of transformations and provide additional debug capabilities.
-2. Transformations - Perform a particular transformation algorithm on `ngraph::Function`.
-3. Low-level functions - Take a set of nodes and perform some transformation action.
-They are not mandatory and all transformation code can be located inside the transformation.
-But if some transformation parts can potentially be reused in other transformations, we suggest keeping them as separate functions.
-
-### Location for Your Transformation Code
-To decide where to store your transformation code, please follow these rules:
-
-1. If it is a plugin-specific transformation and cannot be reused by other plugins, keep source code inside plugin.
-2. If this transformation relates to opset operation conversion or optimization, keep sources inside the transformation library.
-
-After you decide where to store your transformation code, you can start developing your own nGraph transformation.
-
-## ngraph::Function and graph representation <a name="ngraph_function"></a>
-
-nGraph function is a very simple thing: it stores shared pointers to `ngraph::op::Parameter`, `ngraph::op::Result` and  `ngraph::op::Sink` operations that are inputs, outputs and sinks of the graph.
-Sinks of the graph have no consumers and not included into results vector. All other operations hold each other via shared pointers: child operation holds its parent (hard link). If operation has no consumers and it's not Result or Sink operation
-(shared pointer counter is zero) then it will be destructed and won't be accessible anymore. Each operation in `ngraph::Function` has a `std::shared_ptr<ngraph::Node>` type.
-
-For examples of how to build an nGraph function, see the [Build nGraph Function](./model_representation.md) page.
-
-## Transformations types <a name="transformations_types"></a>
-
-nGraph has three main transformation types:
-
-* `ngraph::pass::FunctionPass` - straightforward way to work with `ngraph::Function` directly
-* `ngraph::pass::MatcherPass` - pattern-based transformation approach
-* `ngraph::pass::GraphRewrite` - container for matcher passes needed for efficient execution
-
-![transformations_structure]
-
-### ngraph::pass::FunctionPass <a name="function_pass"></a>
-
-`ngraph::pass::FunctionPass` is used for transformations that take entire `ngraph::Function` as an input and process it.
-
-Template for FunctionPass transformation class
-
-@snippet src/transformations/template_function_transformation.hpp function_pass:template_transformation_hpp
-
-@snippet src/transformations/template_function_transformation.cpp function_pass:template_transformation_cpp
-
-Using `ngraph::FunctionPass`, you need to override the `run_on_function` method where you will write the transformation code.
-Return value is `true` if the original function has changed during transformation (new operation was added, or operations replacement was made, or node attributes were changed); otherwise, it is `false`.
-For transformation API, please follow the [working with ngraph::Function](#working_with_ngraph_function) section.
-Also `ngraph::FunctionPass` based transformations can be executed via `pass::Manager`. See the examples in the [Using pass manager](#using_pass_manager) section.
-
-### ngraph::pass::MatcherPass <a name="matcher_pass"></a>
-
-`ngraph::pass::MatcherPass` is used for pattern-based transformations.
-
-Template for MatcherPass transformation class
-@snippet src/transformations/template_pattern_transformation.hpp graph_rewrite:template_transformation_hpp
-
-@snippet src/transformations/template_pattern_transformation.cpp graph_rewrite:template_transformation_cpp
-
-To use `ngraph::pass::MatcherPass`, you need to complete these steps:
-1. Create a pattern
-2. Implement a callback
-3. Register the pattern and Matcher
-4. Execute MatcherPass
-
-So let's go through each of these steps.
-
-### Create a pattern
-Pattern is a single root `ngraph::Function`. But the only difference is that you do not need to create a function object, you just need to create and connect opset or special pattern operations.
-Then you need to take the last created operation and put it as a root of the pattern. This root node will be used as a root node in pattern matching.
-> **NOTE**: Any nodes in a pattern that have no consumers and are not registered as root will not be used in pattern matching.
-
-@snippet example_ngraph_utils.cpp pattern:simple_example
-
-The `Parameter` operation in the example above has type and shape specified. These attributes are needed only to create Parameter operation class and will not be used in pattern matching.
-
-For more pattern examples, refer to the [pattern matching](#pattern_matching) section.
-
-### Implement callback
-Callback is an action applied to every pattern entrance. In general, callback is the lambda function that takes Matcher object with detected subgraph.
-
-@snippet example_ngraph_utils.cpp pattern:callback_example
-
-The example above shows the callback structure and how Matcher can be used for accessing nodes detected by pattern.
-Callback return value is `true` if root node was replaced and another pattern cannot be applied to the same root node; otherwise, it is `false`.
-> **NOTE**: It is not recommended to manipulate with nodes that are under root node. This may affect GraphRewrite execution as it is expected that all nodes that come after root node in topological order are valid and can be used in pattern matching.
-
-MatcherPass also provides functionality that allows reporting of the newly created nodes that can be used in additional pattern matching.
-If MatcherPass was registered in `pass::Manager` or `pass::GraphRewrite`, these registered nodes will be added for additional pattern matching.
-That means that matcher passes registered in `pass::GraphRewrite` will be applied to these nodes.
-
-The example below shows how single MatcherPass can fuse sequence of operations using the `register_new_node` method.
-
-@snippet src/transformations/template_pattern_transformation.cpp matcher_pass:relu_fusion
-
-> **NOTE**: If you register multiple nodes, please add them in topological order. We do not topologically sort these nodes as it is a time-consuming operation.
-
-### Register pattern and Matcher
-The last step is to register Matcher and callback inside the MatcherPass pass. To do this, call the `register_matcher` method.
-> **NOTE**: Only one matcher can be registered for a single MatcherPass class.
-
-```cpp
-// Register matcher and callback
-register_matcher(m, callback);
-```
-### Execute MatcherPass
-MatcherPass has multiple ways to be executed:
-* Run on a single node - it can be useful if you want to run MatcherPass inside another transformation.
-@snippet src/transformations/template_pattern_transformation.cpp matcher_pass:run_on_node
-* Run on `ngraph::Function` using GraphRewrite - this approach gives the ability to run MatcherPass on the whole `ngraph::Function`. Moreover, multiple MatcherPass transformations can be registered in a single GraphRewrite to be executed in a single graph traversal.
-@snippet src/transformations/template_pattern_transformation.cpp matcher_pass:graph_rewrite
-* Run on `ngraph::Function` using `pass::Manager` - this approach helps you to register MatcherPass for execution on `ngraph::Function` as another transformation types.
-@snippet src/transformations/template_pattern_transformation.cpp matcher_pass:manager
-
-
-### ngraph::pass::GraphRewrite <a name="graph_rewrite_pass"></a>
-
-GraphRewrite pass serves for running multiple matcher passes on `ngraph::Function` in a single graph traversal.
-Example:
-
-@snippet src/transformations/template_pattern_transformation.cpp matcher_pass:graph_rewrite
-
-In addition, GraphRewrite handles nodes that were registered by MatcherPasses during their execution. These nodes will be added to the beginning of the sequence with nodes for pattern matching.
-
-> **NOTE**: when using `pass::Manager` temporary GraphRewrite is used to execute single MatcherPass.
-
-GraphRewrite has two algorithms for MatcherPasses execution. First algorithm is straightforward. It applies each MatcherPass in registration order to current node.
-
-![graph_rewrite_execution]
-
-But it is not really efficient when you have a lot of registered passes. So first of all GraphRewrite checks that all MatcherPass patterns have a type-based root node (it means that the type of this node is not hidden in a predicate).
-And then creates map from registered MatcherPasses. That helps to avoid additional cost of applying each MatcherPass for each node.
-
-![graph_rewrite_efficient_search]
-
-> **NOTE**: GraphRewrite execution algorithm cannot be set manually and depends only on root nodes registered inside MatcherPasses.
-
-## Pattern Matching <a name="pattern_matching"></a>
-
-Sometimes patterns cannot be expressed via regular nGraph operations or it is too complicated.
-For example, if you want to detect Convolution->Add sub-graph without specifying particular input type for Convolution operation or you want to create a pattern where some of operations can have different types.
-And for these cases nGraph provides additional helpers to construct patterns for GraphRewrite transformations.
-
-There are two main helpers:
-1. `ngraph::pattern::any_input` - helps to express inputs if their types are undefined.
-2. `ngraph::pattern::wrap_type<T>` - helps to express nodes of pattern without specifying node attributes.
-
-Let's go through the example to have better understanding of how it works:
-
-> **NOTE**: Node attributes do not participate in pattern matching and are needed only for operations creation. Only operation types participate in pattern matching.
-
-The example below shows basic usage of `pattern::any_input`.
-Here we construct Multiply pattern with arbitrary first input and Constant as a second input.
-Also as Multiply is commutative operation, it does not matter in which order we set inputs (any_input/Constant or Constant/any_input) because both cases will be matched.
-
-@snippet example_ngraph_utils.cpp pattern:label_example
-
-This example shows how we can construct a pattern when operation has arbitrary number of inputs.
-
-@snippet example_ngraph_utils.cpp pattern:concat_example
-
-This example shows how to use predicate to construct a pattern. Also it shows how to match pattern manually on given node.
-
-@snippet example_ngraph_utils.cpp pattern:predicate_example
-
-> **NOTE**: Be careful with manual matching because Matcher object holds matched nodes. To clear a match, use the m->clear_state() method.
-
-## Working with ngraph::Function <a name="working_with_ngraph_function"></a>
-
-In this chapter we will review the nGraph API that allows us to manipulate `ngraph::Function`.
-
-### ngraph::Node input and output ports
-
-First of all let's talk about `ngraph::Node` input/output ports. Each nGraph operation has input and output ports except cases when operation has `Result`, `Parameter`, or `Constant` type.
-
-Every port belongs to its node, so using a port we can access parent node, get shape and type for particular input/output, get all consumers in case of output port, and get producer node in case of input port.
-With output port we can set inputs for newly created operations.
-
-Let's look at the code example.
-
-@snippet example_ngraph_utils.cpp ngraph:ports_example
-
-You may notice that we usually construct operations in this way:
-```cpp
-std::shared_ptr<Node> neg_const = opset1::Constant::create(sub->get_input_element_type(1), Shape{1}, {-1});
-Output<Node> data = node->input_value(0);
-auto neg = std::make_shared<ngraph::opset1::Multiply>(data, neg_const);
-```
-In this example, the `opset1::Multiply` operation takes `Output<Node>` and `std::shared_ptr<Node>` as inputs. But the constructor takes both as `Output<Node>`.
-In this case, `std::shared_ptr<Node>` will be automatically converted to `Output<Node>` if node has exactly one output port; otherwise, conversion raises an exception.
-
-### ngraph::Node replacement
-
-nGraph provides two ways for node replacement: via nGraph helper function and directly via port methods. We are going to review both of them.
-
-Let's start with nGraph helper functions. The most popular function is `ngraph::replace_node(old_node, new_node)`.
-
-We will review real replacement case where Negative operation is replaced with Multiply.
-
-![ngraph_replace_node]
-
-@snippet example_ngraph_utils.cpp ngraph:replace_node
-
-`ngraph::replace_node` has a constraint that number of output ports for both of ops must be the same; otherwise, it raises an exception.
-
-
-The alternative way to do the same replacement is the following:
-```cpp
-// All neg->output(0) consumers will be moved to mul->output(0) port
-neg->output(0).replace(mul->output(0));
-```
-
-Another transformation example is insertion.
-
-![ngraph_insert_node]
-
-@snippet example_ngraph_utils.cpp ngraph:insert_node
-
-The alternative way to the insert operation is to make a node copy and use `replace_node`:
-
-@snippet example_ngraph_utils.cpp ngraph:insert_node_with_copy
-
-### ngraph::Node elimination
-
-Another type of node replacement is its elimination.
-
-To eliminate operation, nGraph has special method that considers all limitations related to InferenceEngine.
-
-@snippet example_ngraph_utils.cpp ngraph:eliminate_node
-
-In case of successful replacement, `replace_output_update_name` automatically preserves friendly name and runtime info.
-
-
-## Transformation conditional compilation
-
-Transformation library has two internal macros to support conditional compilation feature.
-
-* `MATCHER_SCOPE(region)` - allows to disable the MatcherPass if matcher isn't used. The region name should be unique. This macro creates a local variable `matcher_name` which you should use as a matcher name.
-* `RUN_ON_FUNCTION_SCOPE(region)` - allows to disable run_on_function pass if it isn't used. The region name should be unique.
-
-## Transformation writing essentials <a name="transformation_writing_essentials"></a>
-
-When developing a transformation, you need to follow these transformation rules:
-
-### 1. Operation Set (OpSet)
-
-Use the latest version of OpSet in your transformation. An exception is op_conversion transformations, where different opsets can be used.
-
-@snippet example_ngraph_utils.cpp ov:include
-
-### 2. Dynamic Shape and Rank
-
-nGraph has two types for shape representation:
-`ngraph::Shape` - represents static shape.
-`ngraph::PartialShape` - represents dynamic shape. It means that rank or some of dimensions are dynamic (undefined).
-`ngraph::PartialShape` can be converted to `ngraph::Shape` using the `get_shape()` method if all dimensions are static; otherwise, conversion raises an exception.
-
-@snippet example_ngraph_utils.cpp ngraph:shape
-
-But in most cases before getting static shape using `get_shape()` method, you need to check that shape is static.
-
-Also if your transformation requires only input shape rank or particular dimension value, please do not use the `get_shape()` method. See the example below demonstrating how to avoid using `get_shape()`
-
-@snippet example_ngraph_utils.cpp ngraph:shape_check
-
-Not using `get_shape()` method makes your transformation more flexible and applicable for more cases.
-
-### 3. Friendly Names
-
-Each `ngraph::Node` has a unique name (used for nGraph internals) and a friendly name. In transformations we care only about friendly name because it represents the name from intermediate representation (IR).
-Also friendly name is used as output tensor name (until we have another way to represent output tensor name) and by user code that requests intermediate outputs based on these names.
-To avoid losing friendly name when replacing node with other node or subgraph, set the original friendly name to the latest node in replacing subgraph. See the example below.
-
-```cpp
-// Replace Div operation with Power and Multiply sub-graph and set original friendly name to Multiply operation
-auto pow = std::make_shared<ngraph::opset1::Power>(div->input(1).get_source_output(),
-                                                           op::Constant::create(div->get_input_element_type(1), Shape{1}, {-1}));
-auto mul = std::make_shared<ngraph::opset1::Multiply>(div->input(0).get_source_output(), pow);
-mul->set_friendly_name(div->get_friendly_name());
-ngraph::replace_node(div, mul);
-```
-
-In more advanced cases, when replaced operation has several outputs and we add additional consumers to its outputs, we make a decision how to set friendly name by arrangement.
-
-### 4. Runtime Info
-
-Runtime info is a map `std::map<std::string, ov::Any>` located inside `ngraph::Node` class. It represents additional attributes in `ngraph::Node`.
-These attributes can be set by users or by plugins and when executing transformation that changes `ngraph::Function` we need to preserve these attributes as they will not be automatically propagated.
-In most cases, transformations have the following types: 1:1 (replace node with another node), 1:N (replace node with a sub-graph), N:1 (fuse sub-graph into a single node), N:M (any other transformation).
-Currently, there is no mechanism that automatically detects transformation types, so we need to propagate this runtime information manually. See the examples below.
-
-```cpp
-// Replace Transpose with Reshape operation (1:1)
-ngraph::copy_runtime_info(transpose, reshape);
-```
-
-```cpp
-// Replace Div operation with Power and Multiply sub-graph (1:N)
-ngraph::copy_runtime_info(div, {pow, mul});
-```
-
-```cpp
-// Fuse Convolution with Add operation (N:1)
-ngraph::copy_runtime_info({conv, bias}, {conv_ie});
-```
-
-```cpp
-// Any other transformation that replaces one sub-graph with another sub-graph (N:M)
-ngraph::copy_runtime_info({a, b, c}, {e, f});
-```
-
-When transformation has multiple fusions or decompositions, `ngraph::copy_runtime_info` must be called multiple times for each case.
-
-> **Note**: copy_runtime_info removes rt_info from destination nodes. If you want to keep it, you need to specify them in source nodes like this: copy_runtime_info({a, b, c}, {a, b})
-
-### 5. Constant Folding
-
-If your transformation inserts constant sub-graphs that need to be folded, do not forget to use `ngraph::pass::ConstantFolding()` after your transformation or call constant folding directly for operation.
-The example below shows how constant subgraph can be constructed.
-
-```cpp
-// After ConstantFolding pass Power will be replaced with Constant
-auto pow = std::make_shared<ngraph::opset3::Power>(
-                    opset3::Constant::create(element::f32, Shape{1}, {2}),
-                    opset3::Constant::create(element::f32, Shape{1}, {3}));
-auto mul = std::make_shared<ngraph::opset3::Multiply>(input /* not constant input */, pow);
-```
-
-Manual constant folding is more preferable than `ngraph::pass::ConstantFolding()` because it is much faster.
-
-Below you can find an example of manual constant folding:
-
-@snippet src/transformations/template_pattern_transformation.cpp manual_constant_folding
-
-## Common mistakes in transformations <a name="common_mistakes"></a>
-
-In transformation development process:
-
-* Do not use deprecated nGraph API. Deprecated methods have the `NGRAPH_DEPRECATED` macro in their definitions.
-* Do not pass `shared_ptr<Node>` as an input for other node if type of node is unknown or it has multiple outputs. Use explicit output port.
-* If you replace node with another node that produces different shape, remember that new shape will not be propagated until the first `validate_nodes_and_infer_types` call for `ngraph::Function`. If you are using `pass::Manager`, it will automatically call this method after each transformation execution.
-* Do not forget to call the `ngraph::pass::ConstantFolding` pass if your transformation creates constant subgraphs.
-* Use latest OpSet if you are not developing downgrade transformation pass.
-* When developing a callback for `ngraph::pass::MatcherPass`,  do not change nodes that come after the root node in topological order.
-
-## Using pass manager <a name="using_pass_manager"></a>
-
-`ngraph::pass::Manager` is a container class that can store the list of transformations and execute them. The main idea of this class is to have high-level representation for grouped list of transformations.
-It can register and apply any [transformation types](#transformations_types) on function.
-In addition, `ngraph::pass::Manager` has extended debug capabilities (find more information in the [how to debug transformations](#how_to_debug_transformations) section).
-
-The example below shows basic usage of `ngraph::pass::Manager`
-
-@snippet src/transformations/template_pattern_transformation.cpp matcher_pass:manager3
-
-Another example shows how multiple matcher passes can be united into single GraphRewrite.
-
-@snippet src/transformations/template_pattern_transformation.cpp matcher_pass:manager2
-
-> **NOTE**: nGraph used to have the `pass::PassConfig` class for transformation pipeline manipulation.
-This mechanism is now obsolete and the `pass::PassConfig` class will be removed in future release.
-
-## How to debug transformations <a name="how_to_debug_transformations"></a>
-
-The most popular tool for transformations debugging is the `ngraph::pass::VisualizeTree` transformation, which visualizes ngraph::Function.
-
-Usage example:
-
-@snippet example_ngraph_utils.cpp ov:visualize
-
-`ngraph::pass::VisualizeTree` can be parametrized via environment variables:
-
-```
-OV_VISUALIZE_TREE_OUTPUT_SHAPES=1       - visualize shapes
-OV_VISUALIZE_TREE_OUTPUT_TYPES=1        - visualize types
-OV_VISUALIZE_TREE_MIN_MAX_DENORMAL=1    - pretty denormal values
-OV_VISUALIZE_TREE_RUNTIME_INFO=1        - print runtime information
-OV_VISUALIZE_TREE_IO=1                  - print I/O ports
-OV_VISUALIZE_TREE_MEMBERS_NAME=1        - print member names
-```
-
-> **Note**: The current `VisualizeTree` does not have a user-friendly interface and it will be changed in the near future. The intention is to move visualization abilities inside transformations.
-
-If you are using `ngraph::pass::Manager` to run sequence of transformations, you can get additional debug capabilities by using the following environment variables:
-
-```
-OV_PROFILE_PASS_ENABLE=1 - enables performance measurement for each transformation and prints execution status
-OV_ENABLE_VISUALIZE_TRACING=1 -  enables visualization after each transformation. By default, it saves dot and svg files.
-```
-
-> **Note**: Make sure that you have dot installed on your machine; otherwise, it will silently save only dot file without svg file.
-
-## Disabling/Enabling specific transformations for plugin X <a name="disabling_transformation"></a>
-
-In transformation library, we provide plugins transformations like CommonOptimizations, which contains predefined sequence of transformations.
-We also provide a tool that helps to disable or partially disable particular transformations in a transformation pipeline.
-For example, if a plugin uses the CommonOptimization transformation and needs to disable the ConvertGELU transformation, then inside the plugin we have to take the PassConfig instance
-from pass::Manager and call the disable method.
-
-@snippet example_ngraph_utils.cpp ngraph:disable_gelu
-
-In some cases, we need to disable transformation for some condition:
-
-@snippet example_ngraph_utils.cpp ngraph:disable_callback
-
-In some cases, pass::Manager pipelines inside transformations may have transformations disabled by default but enabled inside plugins.
-
-@snippet example_ngraph_utils.cpp ngraph:disabled_by_default
-
-PassConfig instance taken from pass::Manager is shared across all registered transformations including nested transformations. So it does not matter where we work with this object (before passes registration or after).
-
-## Transformations testing <a name="transformations_testing"></a>
-
-If you are developing new transformation inside plugin, you need to add test into the `template_plugin/tests/functional/transformations` folder.
-We have two types of tests: nGraph reader tests located in `src/tests/functional/inference_engine/ngraph_reader` and transformation tests located in `src/tests/functional/inference_engine/transformations`.
-Reader tests are IR based and test end-to-end conversion from IR to CNNNetwork. Transformation tests test single ngraph transformations or low-level functions that are used inside transformations.
-
-The basic transformation test looks like this:
-
-@snippet tests/functional/transformations/template_transformations_test.cpp transformation:test
-
-
-[ngraph_replace_node]: ./img/ngraph_replace_node.png
-[ngraph_insert_node]: ./img/ngraph_insert_node.png
-[transformations_structure]: ./img/transformations_structure.png
-[register_new_node]: ./img/register_new_node.png
-[graph_rewrite_execution]: ./img/graph_rewrite_execution.png
-[graph_rewrite_efficient_search]: ./img/graph_rewrite_efficient_search.png
--- a/docs/OV_Runtime_UG/network_state_intro.md
+++ b/docs/OV_Runtime_UG/network_state_intro.md
@@ -15,7 +15,7 @@ The section additionally provides small examples of stateful network and code to
 between data portions should be addressed. For that, networks save some data between inferences - state. When one dependent sequence is over,
 state should be reset to initial value and new sequence can be started.
 
- Several frameworks have special API for states in networks. For example, Keras have special option for RNNs `stateful` that turns on saving state 
+ Several frameworks have special API for states in networks. For example, Keras has special option for RNNs `stateful` that turns on saving state 
 between inferences. Kaldi contains special specifier `Offset` to define time offset in a network. 
 
 OpenVINO also contains special API to simplify work with networks with states. State is automatically saved between inferences, 
@@ -196,9 +196,7 @@ sink from `ngraph::Function` after deleting the node from graph with the `delete

 Let's take an IR from the previous section example. The example below demonstrates inference of two independent sequences of data. State should be reset between these sequences.

-One infer request and one thread 
-will be used in this example. Using several threads is possible if you have several independent sequences. Then each sequence can be processed in its own infer 
-request. Inference of one sequence in several infer requests is not recommended. In one infer request state will be saved automatically between inferences, but 
+One infer request and one thread will be used in this example. Using several threads is possible if you have several independent sequences. Then each sequence can be processed in its own infer request. Inference of one sequence in several infer requests is not recommended. In one infer request state will be saved automatically between inferences, but 
 if the first step is done in one infer request and the second in another, state should be set in new infer request manually (using `IVariableState::SetState` method).

@snippet openvino/docs/snippets/InferenceEngine_network_with_state_infer.cpp part1
@@ -213,7 +211,7 @@ Decsriptions can be found in [Samples Overview](./Samples_Overview.md)

 If the original framework does not have a special API for working with states, after importing the model, OpenVINO representation will not contain Assign/ReadValue layers. For example, if the original ONNX model contains RNN operations, IR will contain TensorIterator operations and the values will be obtained only after execution of the whole TensorIterator primitive. Intermediate values from each iteration will not be available. To enable you to work with these intermediate values of each iteration and receive them with a low latency after each infer request, special LowLatency and LowLatency2 transformations were introduced.

-### How to get TensorIterator/Loop operaions from different frameworks via ModelOptimizer.
+### How to get TensorIterator/Loop operations from different frameworks via ModelOptimizer.

 **ONNX and frameworks supported via ONNX format:** *LSTM, RNN, GRU* original layers are converted to the TensorIterator operation. TensorIterator body contains LSTM/RNN/GRU Cell. Peepholes, InputForget modifications are not supported, sequence_lengths optional input is supported.
 *ONNX Loop* layer is converted to the OpenVINO Loop operation.
--- a/docs/OV_Runtime_UG/openvino_intro.md
+++ b/docs/OV_Runtime_UG/openvino_intro.md
@@ -0,0 +1,52 @@
+# OpenVINO™ Runtime User Guide {#openvino_docs_OV_Runtime_User_Guide}
+
+@sphinxdirective
+
+.. _deep learning inference engine:
+
+.. toctree::
+   :maxdepth: 1
+   :hidden:
+
+   openvino_docs_IE_DG_Integrate_with_customer_application_new_API
+   <!-- should be a part of Integrate OV in user application -->
+   openvino_docs_OV_Runtime_UG_Model_Representation
+   openvino_docs_IE_DG_ShapeInference
+   openvino_docs_OV_UG_Working_with_devices
+   openvino_docs_OV_Runtime_UG_Preprocessing_Overview
+   openvino_docs_IE_DG_DynamicBatching
+   openvino_docs_IE_DG_supported_plugins_AUTO
+   openvino_docs_OV_UG_Running_on_multiple_devices
+   openvino_docs_OV_UG_Hetero_execution
+   openvino_docs_IE_DG_network_state_intro
+   openvino_2_0_transition_guide
+   openvino_docs_OV_Should_be_in_performance
+   openvino_docs_OV_Runtime_API_Changes
+
+@endsphinxdirective
+
+## Introduction
+OpenVINO Runtime is a set of C++ libraries with C and Python bindings providing a common API to deliver inference solutions on the platform of your choice. Use the OpenVINO Runtime API to read an Intermediate Representation (IR), ONNX, or PaddlePaddle model and execute it on preferred devices.
+
+OpenVINO Runtime uses a plugin architecture. Its plugins are software components that contain complete implementation for inference on a particular Intel® hardware device: CPU, GPU, VPU, etc. Each plugin implements the unified API and provides additional hardware-specific APIs, for configuring devices, or API interoperability between OpenVINO Runtime and underlying plugin backend.
+ 
+The scheme below illustrates the typical workflow for deploying a trained deep learning model: 
+
+<!-- TODO: need to update the picture below with PDPD files -->
+![](img/BASIC_FLOW_IE_C.svg)
+
+
+## Video
+
+@sphinxdirective
+
+.. list-table::
+
+   * - .. raw:: html
+
+           <iframe allowfullscreen mozallowfullscreen msallowfullscreen oallowfullscreen webkitallowfullscreen height="315" width="100%"
+           src="https://www.youtube.com/embed/e6R13V8nbak">
+           </iframe>
+   * - **Inference Engine Concept**. Duration: 3:43
+     
+@endsphinxdirective
--- a/docs/OV_Runtime_UG/openvino_temporary.md
+++ b/docs/OV_Runtime_UG/openvino_temporary.md
@@ -0,0 +1,18 @@
+# Should be moved to performance / extensibility {#openvino_docs_OV_Should_be_in_performance}
+
+@sphinxdirective
+
+.. _deep learning inference engine:
+
+.. toctree::
+   :maxdepth: 1
+   :hidden:
+
+   openvino_docs_deployment_optimization_guide_dldt_optimization_guide
+   openvino_docs_IE_DG_Model_caching_overview
+   openvino_docs_IE_DG_Int8Inference
+   openvino_docs_IE_DG_Bfloat16Inference
+
+@endsphinxdirective
+
+## TEMP: should be moved to performance / extensibility guides
--- a/docs/OV_Runtime_UG/preprocessing_details.md
+++ b/docs/OV_Runtime_UG/preprocessing_details.md
@@ -0,0 +1,346 @@
+# Preprocessing API - details {#openvino_docs_OV_Runtime_UG_Preprocessing_Details}
+
+## Preprocessing capabilities
+
+### Addressing particular input/output
+
+If your model has only one input, then a simple <code>ov::preprocess::PrePostProcessor::input()</code> call returns a reference to the preprocessing builder for this input (tensor, steps, model):
+
+@sphinxdirective
+
+.. tab:: C++
+
+      .. doxygensnippet:: docs/snippets/ov_preprocessing.cpp
+         :language: cpp
+         :fragment: [ov:preprocess:input_1]
+
+.. tab:: Python
+
+      .. doxygensnippet:: docs/snippets/ov_preprocessing.py
+         :language: python
+         :fragment: [ov:preprocess:input_1]
+
+@endsphinxdirective
+
+In general, when a model has multiple inputs/outputs, each one can be addressed by its tensor name:
+
+@sphinxdirective
+
+.. tab:: C++
+
+      .. doxygensnippet:: docs/snippets/ov_preprocessing.cpp
+         :language: cpp
+         :fragment: [ov:preprocess:input_name]
+
+.. tab:: Python
+
+      .. doxygensnippet:: docs/snippets/ov_preprocessing.py
+         :language: python
+         :fragment: [ov:preprocess:input_name]
+
+@endsphinxdirective
+
+
+Or by its index:
+
+@sphinxdirective
+
+.. tab:: C++
+
+      .. doxygensnippet:: docs/snippets/ov_preprocessing.cpp
+         :language: cpp
+         :fragment: [ov:preprocess:input_index]
+
+.. tab:: Python
+
+      .. doxygensnippet:: docs/snippets/ov_preprocessing.py
+         :language: python
+         :fragment: [ov:preprocess:input_index]
+
+@endsphinxdirective
+
+C++ references:
+  * <code>ov::preprocess::InputTensorInfo</code>
+  * <code>ov::preprocess::OutputTensorInfo</code>
+  * <code>ov::preprocess::PrePostProcessor</code>
+
+
+### Supported preprocessing operations
+
+C++ references:
+* <code>ov::preprocess::PreProcessSteps</code>
+
+#### Mean/Scale normalization
+
+Typical data normalization includes 2 operations for each data item: subtract the mean value and divide by the standard deviation. This can be done with the following code:
+
+@sphinxdirective
+
+.. tab:: C++
+
+      .. doxygensnippet:: docs/snippets/ov_preprocessing.cpp
+         :language: cpp
+         :fragment: [ov:preprocess:mean_scale]
+
+.. tab:: Python
+
+      .. doxygensnippet:: docs/snippets/ov_preprocessing.py
+         :language: python
+         :fragment: [ov:preprocess:mean_scale]
+
+@endsphinxdirective
+
+
+In the computer vision area, normalization is usually done separately for R, G, B values. To do this, a [layout with 'C' dimension](./layout_overview.md) must be defined. Example:
+
+@sphinxdirective
+
+.. tab:: C++
+
+      .. doxygensnippet:: docs/snippets/ov_preprocessing.cpp
+         :language: cpp
+         :fragment: [ov:preprocess:mean_scale_array]
+
+.. tab:: Python
+
+      .. doxygensnippet:: docs/snippets/ov_preprocessing.py
+         :language: python
+         :fragment: [ov:preprocess:mean_scale_array]
+
+@endsphinxdirective
+
+C++ references:
+* <code>ov::preprocess::PreProcessSteps::mean()</code>
+* <code>ov::preprocess::PreProcessSteps::scale()</code>
+
+
+#### Convert precision
+
+In computer vision, an image is represented by an array of unsigned 8-bit integer values (for each color), but the model accepts floating point tensors.
+
+To integrate precision conversion into execution graph as a preprocessing step, just do:
+
+@sphinxdirective
+
+.. tab:: C++
+
+      .. doxygensnippet:: docs/snippets/ov_preprocessing.cpp
+         :language: cpp
+         :fragment: [ov:preprocess:convert_element_type]
+
+.. tab:: Python
+
+      .. doxygensnippet:: docs/snippets/ov_preprocessing.py
+         :language: python
+         :fragment: [ov:preprocess:convert_element_type]
+
+@endsphinxdirective
+
+C++ references:
+  * <code>ov::preprocess::InputTensorInfo::set_element_type()</code>
+  * <code>ov::preprocess::PreProcessSteps::convert_element_type()</code>
+
+
+#### Convert layout (transpose)
+
+Transposing matrices/tensors is a typical operation in Deep Learning: you may have a 640x480 BMP image, which is an array of `{480, 640, 3}` elements, but the Deep Learning model can require input with shape `{1, 3, 480, 640}`.
+
+Using the [layout](./layout_overview.md) of the user's tensor and the layout of the original model, conversion can be done implicitly:
+
+@sphinxdirective
+.. tab:: C++
+
+      .. doxygensnippet:: docs/snippets/ov_preprocessing.cpp
+         :language: cpp
+         :fragment: [ov:preprocess:convert_layout]
+
+.. tab:: Python
+
+      .. doxygensnippet:: docs/snippets/ov_preprocessing.py
+         :language: python
+         :fragment: [ov:preprocess:convert_layout]
+
+@endsphinxdirective
+
+
+Or if you prefer manual transpose of axes without usage of [layout](./layout_overview.md) in your code, just do:
+
+@sphinxdirective
+
+.. tab:: C++
+
+      .. doxygensnippet:: docs/snippets/ov_preprocessing.cpp
+         :language: cpp
+         :fragment: [ov:preprocess:convert_layout_2]
+
+.. tab:: Python
+
+      .. doxygensnippet:: docs/snippets/ov_preprocessing.py
+         :language: python
+         :fragment: [ov:preprocess:convert_layout_2]
+
+@endsphinxdirective
+
+It performs the same transpose, but we believe that the approach using source and destination layouts can be easier to read and understand.
+
+C++ references:
+  * <code>ov::preprocess::PreProcessSteps::convert_layout()</code>
+  * <code>ov::preprocess::InputTensorInfo::set_layout()</code>
+  * <code>ov::preprocess::InputModelInfo::set_layout()</code>
+  * <code>ov::Layout</code>
+
+#### Resize image
+
+Resizing an image is a typical preprocessing step for computer vision tasks. With the preprocessing API, this step can also be integrated into the execution graph and performed on the target device.
+
+To resize the input image, you need to define the `H` and `W` dimensions of the [layout](./layout_overview.md):
+
+@sphinxdirective
+
+.. tab:: C++
+
+      .. doxygensnippet:: docs/snippets/ov_preprocessing.cpp
+         :language: cpp
+         :fragment: [ov:preprocess:resize_1]
+
+.. tab:: Python
+
+      .. doxygensnippet:: docs/snippets/ov_preprocessing.py
+         :language: python
+         :fragment: [ov:preprocess:resize_1]
+
+@endsphinxdirective
+
+Or, if the original model has known spatial dimensions (width and height), the target width/height can be omitted:
+
+@sphinxdirective
+
+.. tab:: C++
+
+      .. doxygensnippet:: docs/snippets/ov_preprocessing.cpp
+         :language: cpp
+         :fragment: [ov:preprocess:resize_2]
+
+.. tab:: Python
+
+      .. doxygensnippet:: docs/snippets/ov_preprocessing.py
+         :language: python
+         :fragment: [ov:preprocess:resize_2]
+
+@endsphinxdirective
+
+C++ references:
+* <code>ov::preprocess::PreProcessSteps::resize()</code>
+* <code>ov::preprocess::ResizeAlgorithm</code>
+
+
+#### Color conversion
+
+A typical use case is to reverse color channels from RGB to BGR and vice versa. To do this, specify the source color format in the `tensor` section and perform the `convert_color` preprocessing operation. In the example below, the user has a `BGR` image and needs to convert it to `RGB` as required for the model's input:
+
+@sphinxdirective
+
+.. tab:: C++
+
+      .. doxygensnippet:: docs/snippets/ov_preprocessing.cpp
+         :language: cpp
+         :fragment: [ov:preprocess:convert_color_1]
+
+.. tab:: Python
+
+      .. doxygensnippet:: docs/snippets/ov_preprocessing.py
+         :language: python
+         :fragment: [ov:preprocess:convert_color_1]
+
+@endsphinxdirective
+
+#### Color conversion - NV12/I420
+Preprocessing also supports YUV-family source color formats, i.e. NV12 and I420.
+In advanced cases, such YUV images can be split into separate planes, e.g. for NV12 images the Y-component may come from one source and the UV-component from another. Concatenating such components manually in the user's application is not a perfect solution from performance and device utilization perspectives, so the Preprocessing API provides a way to handle this. For such cases there are the `NV12_TWO_PLANES` and `I420_THREE_PLANES` source color formats, which split the original `input` into 2 or 3 inputs:
+
+@sphinxdirective
+
+.. tab:: C++
+
+      .. doxygensnippet:: docs/snippets/ov_preprocessing.cpp
+         :language: cpp
+         :fragment: [ov:preprocess:convert_color_2]
+
+.. tab:: Python
+
+      .. doxygensnippet:: docs/snippets/ov_preprocessing.py
+         :language: python
+         :fragment: [ov:preprocess:convert_color_2]
+
+@endsphinxdirective
+
+In this example, the original `input` is split into the `input/y` and `input/uv` inputs. You can fill `input/y` from one source and `input/uv` from another. Color conversion to `RGB` is performed using these sources, which is more efficient because no additional copies of the NV12 buffers are made.
+
+C++ references:
+* <code>ov::preprocess::ColorFormat</code>
+* <code>ov::preprocess::PreProcessSteps::convert_color</code>
+
+
+### Custom operations
+
+The Preprocessing API also allows adding custom preprocessing steps into the execution graph. A custom step is a function that accepts the current 'input' node and returns a new node after adding the preprocessing step.
+
+> **Note:** A custom preprocessing function shall only insert node(s) after the input; this is done during model compilation. The function will NOT be called during the execution phase. This may look non-trivial and requires some knowledge of [OpenVINO™ operations](../ops/opset.md).
+
+If you need to insert some additional operations into the execution graph right after the input, such as specific crops and/or resizes, the Preprocessing API can be a good choice:
+
+@sphinxdirective
+
+.. tab:: C++
+
+      .. doxygensnippet:: docs/snippets/ov_preprocessing.cpp
+         :language: cpp
+         :fragment: [ov:preprocess:custom]
+
+.. tab:: Python
+
+      .. doxygensnippet:: docs/snippets/ov_preprocessing.py
+         :language: python
+         :fragment: [ov:preprocess:custom]
+
+@endsphinxdirective
+
+C++ references:
+* <code>ov::preprocess::PreProcessSteps::custom()</code>
+* [Available Operations Sets](../ops/opset.md)
+
+## Postprocessing
+
+Postprocessing steps can be added to model outputs. As with preprocessing, these steps are also integrated into the graph and executed on the selected device.
+
+Preprocessing uses the flow **User tensor** -> **Steps** -> **Model input**.
+
+Postprocessing is the reverse: **Model output** -> **Steps** -> **User tensor**.
+
+Compared to preprocessing, fewer operations are needed at the postprocessing stage, so currently only the following postprocessing operations are supported:
+ - Convert [layout](./layout_overview.md)
+ - Convert element type
+ - Custom operations
+
+Usage of these operations is similar to preprocessing. An example is shown below:
+
+@sphinxdirective
+
+.. tab:: C++
+
+      .. doxygensnippet:: docs/snippets/ov_preprocessing.cpp
+         :language: cpp
+         :fragment: [ov:preprocess:postprocess]
+
+.. tab:: Python
+
+      .. doxygensnippet:: docs/snippets/ov_preprocessing.py
+         :language: python
+         :fragment: [ov:preprocess:postprocess]
+
+@endsphinxdirective
+
+C++ references:
+* <code>ov::preprocess::PostProcessSteps</code>
+* <code>ov::preprocess::OutputModelInfo</code>
+* <code>ov::preprocess::OutputTensorInfo</code>
--- a/docs/OV_Runtime_UG/preprocessing_overview.md
+++ b/docs/OV_Runtime_UG/preprocessing_overview.md
@@ -0,0 +1,169 @@
+# Overview of Preprocessing API {#openvino_docs_OV_Runtime_UG_Preprocessing_Overview}
+
+@sphinxdirective
+
+.. toctree::
+   :maxdepth: 1
+   :hidden:
+
+   openvino_docs_OV_Runtime_UG_Preprocessing_Details
+   openvino_docs_OV_Runtime_UG_Layout_Overview
+
+@endsphinxdirective
+
+## Introduction
+
+When your input data does not perfectly fit the neural network model's input tensor, additional operations/steps are needed to transform the data into the format expected by the model. These operations are known as "preprocessing".
+
+### Example
+Consider the following standard example: a deep learning model expects input with shape `{1, 3, 224, 224}`, `FP32` precision, `RGB` color channel order, and requires data normalization (subtract mean and divide by scale factor). But you have just a `640x480` `BGR` image (data is `{480, 640, 3}`). This means that we need operations which will:
+ - Convert U8 buffer to FP32
+ - Transform to `planar` format: from `{1, 480, 640, 3}` to `{1, 3, 480, 640}`
+ - Resize image from 640x480 to 224x224
+ - Convert `BGR` to `RGB`, as the model expects `RGB`
+ - For each pixel, subtract mean values and divide by scale factor
+
+
+![](img/preprocess_not_fit.png)
+
+
+Even though all these steps can be implemented manually with relative ease in the application's code before actual inference, it is also possible to do them with the Preprocessing API. Reasons to use this API are:
+ - Preprocessing API is easy to use
+ - Preprocessing steps will be integrated into the execution graph and performed on the selected device (CPU/GPU/VPU/etc.) rather than always being executed on the CPU. This improves utilization of the selected device.
+
+## Preprocessing API
+
+Intuitively, the Preprocessing API consists of the following parts:
+ 1. 	**Tensor:** Declare the user's data format, such as shape, [layout](./layout_overview.md), precision, and color format of the actual user's data
+ 2. 	**Steps:** Describe the sequence of preprocessing steps that need to be applied to the user's data
+ 3. 	**Model:** Specify the model's data format. Usually, precision and shape are already known for the model, so only additional information, like [layout](./layout_overview.md), can be specified
+
+> **Note:** All modifications of the model's graph must be performed after the model is read from disk and **before** it is loaded onto the actual device.
+
+### PrePostProcessor object
+
+The `ov::preprocess::PrePostProcessor` class allows specifying preprocessing and postprocessing steps for a model read from disk.
+
+@sphinxdirective
+
+.. tab:: C++
+
+      .. doxygensnippet:: docs/snippets/ov_preprocessing.cpp
+         :language: cpp
+         :fragment: [ov:preprocess:create]
+
+.. tab:: Python
+
+      .. doxygensnippet:: docs/snippets/ov_preprocessing.py
+         :language: python
+         :fragment: [ov:preprocess:create]
+
+@endsphinxdirective
+
+### Declare user's data format
+
+To address a particular input of the model/preprocessor, use the `ov::preprocess::PrePostProcessor::input(input_name)` method.
+
+@sphinxdirective
+
+.. tab:: C++
+
+      .. doxygensnippet:: docs/snippets/ov_preprocessing.cpp
+         :language: cpp
+         :fragment: [ov:preprocess:tensor]
+
+.. tab:: Python
+
+      .. doxygensnippet:: docs/snippets/ov_preprocessing.py
+         :language: python
+         :fragment: [ov:preprocess:tensor]
+
+@endsphinxdirective
+
+
+Here we've specified all information about the user's input:
+ - Precision is U8 (unsigned 8-bit integer)
+ - Data represents a tensor with `{1,480,640,3}` shape
+ - [Layout](./layout_overview.md) is "NHWC". It means that `height=480`, `width=640`, `channels=3`
+ - Color format is `BGR`
+
+### Declare model's layout
+
+The model's input already has information about precision and shape. The Preprocessing API is not intended to modify this. The only thing that may be specified is the input data [layout](./layout_overview.md).
+
+@sphinxdirective
+
+.. tab:: C++
+
+      .. doxygensnippet:: docs/snippets/ov_preprocessing.cpp
+         :language: cpp
+         :fragment: [ov:preprocess:model]
+
+.. tab:: Python
+
+      .. doxygensnippet:: docs/snippets/ov_preprocessing.py
+         :language: python
+         :fragment: [ov:preprocess:model]
+
+@endsphinxdirective
+
+
+Now, if the model's input has `{1,3,224,224}` shape, preprocessing will be able to identify that the model's `height=224`, `width=224`, `channels=3`. Height/width information is necessary for `resize`, and `channels` is needed for mean/scale normalization.
+
+### Preprocessing steps
+
+Now we can define the sequence of preprocessing steps:
+
+@sphinxdirective
+
+.. tab:: C++
+
+      .. doxygensnippet:: docs/snippets/ov_preprocessing.cpp
+         :language: cpp
+         :fragment: [ov:preprocess:steps]
+
+.. tab:: Python
+
+      .. doxygensnippet:: docs/snippets/ov_preprocessing.py
+         :language: python
+         :fragment: [ov:preprocess:steps]
+
+@endsphinxdirective
+
+Here:
+ - Convert U8 to FP32 precision
+ - Convert current color format (BGR) to RGB
+ - Resize to the model's height/width. **Note** that if the model accepts dynamic size, e.g. `{?, 3, ?, ?}`, `resize` will not know how to resize the picture, so in this case you should specify the target height/width in this step. See also <code>ov::preprocess::PreProcessSteps::resize()</code>
+ - Subtract mean from each channel. At this step, the color format is already RGB, so `100.5` will be subtracted from each `Red` component, and `101.5` will be subtracted from each `Blue` one.
+ - Divide each pixel value by the appropriate scale value. In this example, each `Red` component will be divided by 50, `Green` by 51, and `Blue` by 52
+ - **Note:** the last `convert_layout` step is commented out because it is not necessary to specify the final layout conversion. `PrePostProcessor` will do this conversion automatically
+
+### Integrate steps into model
+
+We've finished declaring the preprocessing steps; now it is time to build them. For debugging purposes, it is possible to print the `PrePostProcessor` configuration to the screen:
+
+@sphinxdirective
+
+.. tab:: C++
+
+      .. doxygensnippet:: docs/snippets/ov_preprocessing.cpp
+         :language: cpp
+         :fragment: [ov:preprocess:build]
+
+.. tab:: Python
+
+      .. doxygensnippet:: docs/snippets/ov_preprocessing.py
+         :language: python
+         :fragment: [ov:preprocess:build]
+
+@endsphinxdirective
+
+
+After this, `model` will accept U8 input with `{1, 480, 640, 3}` shape and `BGR` channel order. All conversion steps will be integrated into the execution graph. Now you can load the model on the device and pass your image to the model as is, without any data manipulation on the application's side.
+
+
+## See Also
+
+* [Preprocessing Details](./preprocessing_details.md)
+* [Layout API overview](./layout_overview.md)
+* <code>ov::preprocess::PrePostProcessor</code> C++ class documentation
--- a/docs/OV_Runtime_UG/protecting_model_guide.md
+++ b/docs/OV_Runtime_UG/protecting_model_guide.md
@@ -16,22 +16,22 @@ This guide demonstrates how to use OpenVINO securely with protected models.

 After a model is optimized by the OpenVINO Model Optimizer, it's deployed
 to target devices in the Intermediate Representation (IR) format. An optimized
-model is stored on an edge device and executed by the Inference Engine. 
-(ONNX and nGraph models can also be read natively by the Inference Engine.)
+model is stored on an edge device and executed by the OpenVINO Runtime. 
+(ONNX and PDPD models can also be read natively by the OpenVINO Runtime.)

 To protect deep-learning models, you can encrypt an optimized model before
 deploying it to the edge device. The edge device should keep the stored model
 protected at all times and have the model decrypted **in runtime only** for use
-by the Inference Engine.
+by the OpenVINO Runtime.

 ![deploy_encrypted_model](img/deploy_encrypted_model.png)

 ## Loading Encrypted Models

-The OpenVINO Inference Engine requires model decryption before loading. Allocate
+The OpenVINO Runtime requires model decryption before loading. Allocate
 a temporary memory block for model decryption and use the 
-`InferenceEngine::Core::ReadNetwork` method to load the model from a memory buffer.
-For more information, see the `InferenceEngine::Core` Class Reference Documentation.
+`ov::Core::read_model` method to load the model from a memory buffer.
+For more information, see the `ov::Core` Class Reference Documentation.

@snippet snippets/protecting_model_guide.cpp part0

@@ -40,12 +40,12 @@ Hardware-based protection such as Intel&reg; Software Guard Extensions
 bind them to a device. For more information, go to [Intel&reg; Software Guard
 Extensions](https://software.intel.com/en-us/sgx).

-Use `InferenceEngine::Core::ReadNetwork()` to set model representations and
+Use `ov::Core::read_model` to set model representations and
 weights respectively.

 Currently there is no way to read external weights from memory for ONNX models.
-The `ReadNetwork(const std::string& model, const Blob::CPtr& weights)` function
-should be called with `weights` passed as an empty `Blob`.
+The `ov::Core::read_model(const std::string& model, const Tensor& weights)` method
+should be called with `weights` passed as an empty `ov::Tensor`.

@snippet snippets/protecting_model_guide.cpp part1

@@ -54,7 +54,7 @@ should be called with `weights` passed as an empty `Blob`.
 - Intel® Distribution of OpenVINO™ toolkit home page: [https://software.intel.com/en-us/openvino-toolkit](https://software.intel.com/en-us/openvino-toolkit)
 - OpenVINO™ toolkit online documentation: [https://docs.openvino.ai](https://docs.openvino.ai)
 - Model Optimizer Developer Guide: [Model Optimizer Developer Guide](../MO_DG/Deep_Learning_Model_Optimizer_DevGuide.md)
- Inference Engine Developer Guide: [Inference Engine Developer Guide](Deep_Learning_Inference_Engine_DevGuide.md)
- For more information on Sample Applications, see the [Inference Engine Samples Overview](Samples_Overview.md)
+- [OpenVINO™ Runtime User Guide](openvino_intro.md)
+- For more information on Sample Applications, see the [OpenVINO Samples Overview](Samples_Overview.md)
 - For information on a set of pre-trained models, see the [Overview of OpenVINO™ Toolkit Pre-Trained Models](@ref omz_models_group_intel)
 - For IoT Libraries and Code Samples see the [Intel® IoT Developer Kit](https://github.com/intel-iot-devkit).
--- a/docs/OV_Runtime_UG/supported_plugins/CPU.md
+++ b/docs/OV_Runtime_UG/supported_plugins/CPU.md
@@ -1,4 +1,4 @@
-# CPU Plugin {#openvino_docs_IE_DG_supported_plugins_CPU}
+# CPU device {#openvino_docs_OV_UG_supported_plugins_CPU}


 ## Introducing the CPU Plugin
@@ -6,7 +6,7 @@ The CPU plugin was developed to achieve high performance of neural networks on C

 Currently, the CPU plugin uses Intel® Threading Building Blocks (Intel® TBB) in order to parallelize calculations. Please refer to the [Optimization Guide](../../optimization_guide/dldt_optimization_guide.md) for associated performance considerations.

-The set of supported layers can be expanded with [the Extensibility mechanism](../Extensibility_DG/Intro.md).
+The set of supported layers can be expanded with [the Extensibility mechanism](../../Extensibility_UG/Intro.md).

 ## Supported Platforms

--- a/docs/OV_Runtime_UG/supported_plugins/Device_Plugins.md
+++ b/docs/OV_Runtime_UG/supported_plugins/Device_Plugins.md
@@ -1,4 +1,4 @@
-# Device Plugin Support {#openvino_docs_IE_DG_Device_Plugins}
+# Working with devices {#openvino_docs_OV_UG_Working_with_devices}

@sphinxdirective

@@ -6,30 +6,30 @@
   :maxdepth: 1
   :hidden:

-   openvino_docs_IE_DG_InferenceEngine_QueryAPI
-   openvino_docs_IE_DG_supported_plugins_CPU
-   openvino_docs_IE_DG_supported_plugins_GPU
+   openvino_docs_OV_UG_query_api
+   openvino_docs_OV_UG_supported_plugins_CPU
+   openvino_docs_OV_UG_supported_plugins_GPU
   openvino_docs_IE_DG_supported_plugins_VPU
-   openvino_docs_IE_DG_supported_plugins_GNA
-   openvino_docs_IE_DG_supported_plugins_AUTO
-   openvino_docs_IE_DG_supported_plugins_HETERO
-   openvino_docs_IE_DG_supported_plugins_MULTI
-         
+   openvino_docs_OV_UG_supported_plugins_GNA
+
@endsphinxdirective

-Inference Engine uses a plugin architecture. Inference Engine plugin is a software component that contains complete implementation for inference on a certain Intel® hardware device: CPU, GPU, VPU, GNA, etc. Each plugin implements the unified API and provides additional hardware-specific APIs.
-
-The Inference Engine provides capabilities to infer deep learning models on the following device types with corresponding plugins:
+The OpenVINO Runtime provides capabilities to infer deep learning models on the following device types with corresponding plugins:

 | Plugin                                   | Device types                                                                                                                                                |
 |------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------|
-|[GPU plugin](GPU.md)            |Intel&reg; Processor Graphics, including Intel&reg; HD Graphics and Intel&reg; Iris&reg; Graphics                                                            |
 |[CPU plugin](CPU.md)              |Intel&reg; Xeon&reg; with Intel® Advanced Vector Extensions 2 (Intel® AVX2), Intel® Advanced Vector Extensions 512 (Intel® AVX-512), and AVX512_BF16, Intel&reg; Core&trade; Processors with Intel&reg; AVX2, Intel&reg; Atom&reg; Processors with Intel® Streaming SIMD Extensions (Intel® SSE) |
+|[GPU plugin](GPU.md)            |Intel&reg; Processor Graphics, including Intel&reg; HD Graphics and Intel&reg; Iris&reg; Graphics                                                            |
 |[VPU plugins](VPU.md) (available in the Intel® Distribution of OpenVINO™ toolkit)            |Intel® Neural Compute Stick 2 powered by the Intel® Movidius™ Myriad™ X, Intel® Vision Accelerator Design with Intel® Movidius™ VPUs                                                                                           |
 |[GNA plugin](GNA.md) (available in the Intel® Distribution of OpenVINO™ toolkit)              |Intel&reg; Speech Enabling Developer Kit, Amazon Alexa* Premium Far-Field Developer Kit, Intel&reg; Pentium&reg; Silver J5005 Processor, Intel&reg; Pentium&reg; Silver N5000 Processor, Intel&reg; Celeron&reg; J4005 Processor, Intel&reg; Celeron&reg; J4105 Processor, Intel&reg; Celeron&reg; Processor N4100, Intel&reg; Celeron&reg; Processor N4000, Intel&reg; Core&trade; i3-8121U Processor, Intel&reg; Core&trade; i7-1065G7 Processor, Intel&reg; Core&trade; i7-1060G7 Processor, Intel&reg; Core&trade; i5-1035G4 Processor, Intel&reg; Core&trade; i5-1035G7 Processor, Intel&reg; Core&trade; i5-1035G1 Processor, Intel&reg; Core&trade; i5-1030G7 Processor, Intel&reg; Core&trade; i5-1030G4 Processor, Intel&reg; Core&trade; i3-1005G1 Processor, Intel&reg; Core&trade; i3-1000G1 Processor, Intel&reg; Core&trade; i3-1000G4 Processor|
-|[Multi-Device plugin](MULTI.md) |Multi-Device plugin enables simultaneous inference of the same network on several Intel&reg; devices in parallel    |   
-|[Auto-Device plugin](AUTO.md) |Auto-Device plugin enables selecting Intel&reg; device for inference automatically |   
-|[Heterogeneous plugin](HETERO.md) |Heterogeneous plugin enables automatic inference splitting between several Intel&reg; devices (for example if a device doesn't [support certain layers](#supported-layers)).                                                           |
+
+OpenVINO Runtime also has several execution capabilities which work on top of other devices:
+
+| Capability                                   | Description                                                                                                                                                |
+|------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------|
+|[Multi-Device execution](../multi_device.md) |Multi-Device enables simultaneous inference of the same model on several devices in parallel    |
+|[Auto-Device selection](../auto_device_selection.md) |Auto-Device selection enables selecting an Intel&reg; device for inference automatically |
+|[Heterogeneous execution](../hetero_execution.md) |Heterogeneous execution enables automatic inference splitting between several devices (for example if a device doesn't [support certain operations](#supported-layers)).                                                           |

 Devices similar to the ones we have used for benchmarking can be accessed using [Intel® DevCloud for the Edge](https://devcloud.intel.com/edge/), a remote development environment with access to Intel® hardware and the latest versions of the Intel® Distribution of the OpenVINO™ Toolkit. [Learn more](https://devcloud.intel.com/edge/get_started/devcloud/) or [Register here](https://inteliot.force.com/DevcloudForEdge/s/).

--- a/docs/OV_Runtime_UG/supported_plugins/GNA.md
+++ b/docs/OV_Runtime_UG/supported_plugins/GNA.md
@@ -1,4 +1,4 @@
-# GNA Plugin {#openvino_docs_IE_DG_supported_plugins_GNA}
+# GNA device {#openvino_docs_OV_UG_supported_plugins_GNA}
 ## Introducing the GNA Plugin

 The Intel® Gaussian & Neural Accelerator is a low-power neural coprocessor for continuous inference at the edge.
@@ -225,10 +225,10 @@ For the list of supported layers, see the **GNA** column of the **Supported Laye

 Limitations include:

- Only 1D convolutions are natively supported.
+- Only 1D convolutions are natively supported on hardware prior to GNA 3.0; 2D convolutions have specific limitations (see the table above).
 - The number of output channels for convolutions must be a multiple of 4.
 - The maximum number of filters is 65532 for GNA 2.0 and 8192 for GNA 3.0.
- Permute layer support is limited to the cases where no data reordering is needed or when reordering is happening for two dimensions, at least one of which is not greater than 8.
+- Transpose layer support is limited to the cases where no data reordering is needed or when reordering is happening for two dimensions, at least one of which is not greater than 8.
 - Splits and concatenations are supported for continuous portions of memory (e.g., split of 1,2,3,4 to 1,1,3,4 and 1,1,3,4 or concats of 1,2,3,4 and 1,2,3,5 to 2,2,3,4).
 - For Multiply, Add and Subtract layers, auto broadcasting is only supported for constant inputs.

@@ -236,13 +236,13 @@ Limitations include:

 The Intel® GNA 1.0 and 2.0 hardware natively supports only 1D convolutions.

-However, 2D convolutions can be mapped to 1D when a convolution kernel moves in a single direction. GNA Plugin performs such a transformation for Kaldi `nnet1` convolution. From this perspective, the Intel® GNA hardware convolution operation accepts an `NHWC` input and produces an `NHWC` output. Because OpenVINO™ only supports the `NCHW` layout, you may need to insert `Permute` layers before or after convolutions.
+However, 2D convolutions can be mapped to 1D when a convolution kernel moves in a single direction. GNA Plugin performs such a transformation for Kaldi `nnet1` convolution. From this perspective, the Intel® GNA hardware convolution operation accepts an `NHWC` input and produces an `NHWC` output. Because OpenVINO™ only supports the `NCHW` layout, you may need to insert `Transpose` layers before or after convolutions.

-For example, the Kaldi model optimizer inserts such a permute after convolution for the [rm_cnn4a network](https://storage.openvinotoolkit.org/models_contrib/speech/2021.2/rm_cnn4a_smbr/). This `Permute` layer is automatically removed by the GNA Plugin, because the Intel® GNA hardware convolution layer already produces the required `NHWC` result.
+For example, the Kaldi model optimizer inserts such a transpose after convolution for the [rm_cnn4a network](https://storage.openvinotoolkit.org/models_contrib/speech/2021.2/rm_cnn4a_smbr/). This `Transpose` layer is automatically removed by the GNA Plugin, because the Intel® GNA hardware convolution layer already produces the required `NHWC` result.

 ## Operation Precision

-Intel® GNA essentially operates in the low-precision mode, which represents a mix of 8-bit (`I8`), 16-bit (`I16`), and 32-bit (`I32`) integer computations. Outputs calculated using a reduced integer precision are different from the scores calculated using the floating point format, for example, `FP32` outputs calculated on CPU using the Inference Engine [CPU Plugin](CPU.md).
+Intel® GNA essentially operates in the low-precision mode, which represents a mix of 8-bit (`I8`), 16-bit (`I16`), and 32-bit (`I32`) integer computations. Outputs calculated using a reduced integer precision are different from the scores calculated using the floating point format, for example, `FP32` outputs calculated on CPU using the OpenVINO [CPU device](CPU.md).

 Unlike other plugins supporting low-precision execution, the GNA plugin can calculate quantization factors at the model loading time, so you can run a model without calibration using the [Post-Training Optimization Tool](@ref pot_README).
 However, this mode may not provide satisfactory accuracy because the internal quantization algorithm is based on heuristics which may or may not be efficient, depending on the model and dynamic range of input data.
--- a/docs/OV_Runtime_UG/supported_plugins/GPU.md
+++ b/docs/OV_Runtime_UG/supported_plugins/GPU.md
@@ -1,4 +1,4 @@
-# GPU Plugin {#openvino_docs_IE_DG_supported_plugins_GPU}
+# GPU device {#openvino_docs_OV_UG_supported_plugins_GPU}

@sphinxdirective

@@ -6,14 +6,14 @@
   :maxdepth: 1
   :hidden:

-   openvino_docs_IE_DG_supported_plugins_GPU_RemoteBlob_API
+   openvino_docs_OV_UG_supported_plugins_GPU_RemoteBlob_API


@endsphinxdirective

 The GPU plugin uses the Intel® Compute Library for Deep Neural Networks (clDNN) to infer deep neural networks.
 clDNN is an open source performance library for Deep Learning (DL) applications intended for acceleration of Deep Learning Inference on Intel® Processor Graphics including Intel® HD Graphics, Intel® Iris® Graphics, Intel® Iris® Xe Graphics, and Intel® Iris® Xe MAX graphics.
-For an in-depth description of clDNN, see [Inference Engine source files](https://github.com/openvinotoolkit/openvino/tree/master/src/plugins/intel_gpu/) and [Accelerate Deep Learning Inference with Intel® Processor Graphics](https://software.intel.com/en-us/articles/accelerating-deep-learning-inference-with-intel-processor-graphics).
+For an in-depth description of clDNN, see [OpenVINO Runtime GPU plugin source files](https://github.com/openvinotoolkit/openvino/tree/master/src/plugins/intel_gpu/) and [Accelerate Deep Learning Inference with Intel® Processor Graphics](https://software.intel.com/en-us/articles/accelerating-deep-learning-inference-with-intel-processor-graphics).

 ## Device Naming Convention
 * Devices are enumerated as "GPU.X" where `X={0, 1, 2,...}`. Only Intel® GPU devices are considered.
--- a/docs/OV_Runtime_UG/supported_plugins/GPU_RemoteBlob_API.md
+++ b/docs/OV_Runtime_UG/supported_plugins/GPU_RemoteBlob_API.md
@@ -1,4 +1,4 @@
-Remote Blob API of GPU Plugin {#openvino_docs_IE_DG_supported_plugins_GPU_RemoteBlob_API}
+Remote Blob API of GPU Plugin {#openvino_docs_OV_UG_supported_plugins_GPU_RemoteBlob_API}
 ================================

 The GPU plugin implementation of the `RemoteContext` and `RemoteBlob` interfaces supports GPU