diff --git a/README.md b/README.md index 7d54e9e8f9c..30c314e76de 100644 --- a/README.md +++ b/README.md @@ -46,3 +46,5 @@ Please report questions, issues and suggestions using: [Inference Engine]:https://software.intel.com/en-us/articles/OpenVINO-InferEngine [Model Optimizer]:https://software.intel.com/en-us/articles/OpenVINO-ModelOptimizer [nGraph]:https://docs.openvinotoolkit.org/latest/openvino_docs_nGraph_DG_DevGuide.html +[tag on StackOverflow]:https://stackoverflow.com/search?q=%23openvino + diff --git a/docs/HOWTO/Custom_Layers_Guide.md b/docs/HOWTO/Custom_Layers_Guide.md index 5816311af8c..1de91356304 100644 --- a/docs/HOWTO/Custom_Layers_Guide.md +++ b/docs/HOWTO/Custom_Layers_Guide.md @@ -337,7 +337,7 @@ operation for the CPU plugin. The code of the library is described in the [Exte In order to build the extension run the following:
```bash mkdir build && cd build -source /opt/intel/openvino/bin/setupvars.sh +source /opt/intel/openvino_2021/bin/setupvars.sh cmake .. -DCMAKE_BUILD_TYPE=Release make --jobs=$(nproc) ``` @@ -368,7 +368,7 @@ python3 mri_reconstruction_demo.py \ - [Model Optimizer Extensibility](../MO_DG/prepare_model/customize_model_optimizer/Customize_Model_Optimizer.md) - [Inference Engine Extensibility Mechanism](../IE_DG/Extensibility_DG/Intro.md) - [Inference Engine Samples Overview](../IE_DG/Samples_Overview.md) -- [Overview of OpenVINO™ Toolkit Pre-Trained Models](@ref omz_models_intel_index) +- [Overview of OpenVINO™ Toolkit Pre-Trained Models](@ref omz_models_group_intel) - For IoT Libraries and Code Samples see the [Intel® IoT Developer Kit](https://github.com/intel-iot-devkit). ## Converting Models: diff --git a/docs/IE_DG/Deep_Learning_Inference_Engine_DevGuide.md b/docs/IE_DG/Deep_Learning_Inference_Engine_DevGuide.md index 25c14035144..89997e0f0ce 100644 --- a/docs/IE_DG/Deep_Learning_Inference_Engine_DevGuide.md +++ b/docs/IE_DG/Deep_Learning_Inference_Engine_DevGuide.md @@ -1,88 +1,122 @@ # Inference Engine Developer Guide {#openvino_docs_IE_DG_Deep_Learning_Inference_Engine_DevGuide} -## Introduction to the OpenVINO™ Toolkit +> **NOTE:** [Intel® System Studio](https://software.intel.com/en-us/system-studio) is an all-in-one, cross-platform tool suite, purpose-built to simplify system bring-up and improve system and IoT device application performance on Intel® platforms. If you are using the Intel® Distribution of OpenVINO™ with Intel® System Studio, go to [Get Started with Intel® System Studio](https://software.intel.com/en-us/articles/get-started-with-openvino-and-intel-system-studio-2019). -The OpenVINO™ toolkit is a comprehensive toolkit that you can use to develop and deploy vision-oriented solutions on -Intel® platforms. Vision-oriented means the solutions use images or videos to perform specific tasks. -A few of the solutions use cases include autonomous navigation, digital surveillance cameras, robotics, -and mixed-reality headsets. - -The OpenVINO™ toolkit: - -* Enables CNN-based deep learning inference on the edge -* Supports heterogeneous execution across an Intel® CPU, Intel® Integrated Graphics, Intel® Neural Compute Stick 2 -* Speeds time-to-market via an easy-to-use library of computer vision functions and pre-optimized kernels -* Includes optimized calls for computer vision standards including OpenCV\*, OpenCL™, and OpenVX\* - -The OpenVINO™ toolkit includes the following components: - -* Intel® Deep Learning Deployment Toolkit (Intel® DLDT) - - [Deep Learning Model Optimizer](../MO_DG/Deep_Learning_Model_Optimizer_DevGuide.md) — A cross-platform command-line tool for importing models and - preparing them for optimal execution with the Deep Learning Inference Engine. The Model Optimizer supports converting Caffe*, - TensorFlow*, MXNet*, Kaldi*, ONNX* models. - - [Deep Learning Inference Engine](inference_engine_intro.md) — A unified API to allow high performance inference on many hardware types - including Intel® CPU, Intel® Processor Graphics, Intel® FPGA, Intel® Neural Compute Stick 2. - - [nGraph](../nGraph_DG/nGraph_dg.md) — graph representation and manipulation engine which is used to represent a model inside Inference Engine and allows the run-time model construction without using Model Optimizer. -* [OpenCV](https://docs.opencv.org/) — OpenCV* community version compiled for Intel® hardware. -Includes PVL libraries for computer vision. 
-* Drivers and runtimes for OpenCL™ version 2.1 -* [Intel® Media SDK](https://software.intel.com/en-us/media-sdk) -* [OpenVX*](https://software.intel.com/en-us/cvsdk-ovx-guide) — Intel's implementation of OpenVX* -optimized for running on Intel® hardware (CPU, GPU, IPU). -* [Demos and samples](Samples_Overview.md). - - -This Guide provides overview of the Inference Engine describing the typical workflow for performing +This Guide provides an overview of the Inference Engine describing the typical workflow for performing inference of a pre-trained and optimized deep learning model and a set of sample applications. -> **NOTES:** > - Before you perform inference with the Inference Engine, your models should be converted to the Inference Engine format using the Model Optimizer or built directly in run-time using nGraph API. To learn about how to use Model Optimizer, refer to the [Model Optimizer Developer Guide](../MO_DG/Deep_Learning_Model_Optimizer_DevGuide.md). To learn about the pre-trained and optimized models delivered with the OpenVINO™ toolkit, refer to [Pre-Trained Models](@ref omz_models_intel_index). > - [Intel® System Studio](https://software.intel.com/en-us/system-studio) is an all-in-one, cross-platform tool suite, purpose-built to simplify system bring-up and improve system and IoT device application performance on Intel® platforms. If you are using the Intel® Distribution of OpenVINO™ with Intel® System Studio, go to [Get Started with Intel® System Studio](https://software.intel.com/en-us/articles/get-started-with-openvino-and-intel-system-studio-2019). +> **NOTE:** Before you perform inference with the Inference Engine, your models should be converted to the Inference Engine format using the Model Optimizer or built directly in run-time using nGraph API. To learn about how to use Model Optimizer, refer to the [Model Optimizer Developer Guide](../MO_DG/Deep_Learning_Model_Optimizer_DevGuide.md). To learn about the pre-trained and optimized models delivered with the OpenVINO™ toolkit, refer to [Pre-Trained Models](@ref omz_models_group_intel). +After you have used the Model Optimizer to create an Intermediate Representation (IR), use the Inference Engine to infer the result for a given input data. -## Table of Contents +Inference Engine is a set of C++ libraries providing a common API to deliver inference solutions on the platform of your choice: CPU, GPU, or VPU. Use the Inference Engine API to read the Intermediate Representation, set the input and output formats, and execute the model on devices. While the C++ libraries are the primary implementation, C libraries and Python bindings are also available. -* [Inference Engine API Changes History](API_Changes.md) +For the Intel® Distribution of OpenVINO™ toolkit, Inference Engine binaries are delivered within release packages. -* [Introduction to Inference Engine](inference_engine_intro.md) +The open source version is available in the [OpenVINO™ toolkit GitHub repository](https://github.com/openvinotoolkit/openvino) and can be built for supported platforms using the Inference Engine Build Instructions. -* [Understanding Inference Engine Memory Primitives](Memory_primitives.md) +To learn about how to use the Inference Engine API for your application, see the [Integrating Inference Engine in Your Application](Integrate_with_customer_application_new_API.md) documentation.
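+
+As a quick orientation, the read, load, and infer steps described above map onto a short C++ sequence. The following sketch is illustrative only: the `model.xml` path, the `CPU` device name, and the omitted input-filling code are placeholders rather than part of this guide, and precision/layout setup is skipped for brevity.
+
+```cpp
+#include <inference_engine.hpp>
+#include <string>
+
+int main() {
+    // Create the Core object; it manages all device plugins internally
+    InferenceEngine::Core ie;
+
+    // Read an Intermediate Representation ("model.xml" is a placeholder; the .bin file is found next to it)
+    InferenceEngine::CNNNetwork network = ie.ReadNetwork("model.xml");
+
+    // Compile and load the network on a device, for example CPU
+    InferenceEngine::ExecutableNetwork exec_network = ie.LoadNetwork(network, "CPU");
+
+    // Create an inference request, fill the input blob, and run inference synchronously
+    InferenceEngine::InferRequest infer_request = exec_network.CreateInferRequest();
+    std::string input_name = network.getInputsInfo().begin()->first;
+    InferenceEngine::Blob::Ptr input = infer_request.GetBlob(input_name);
+    // ... fill the input blob with your data here ...
+    infer_request.Infer();
+
+    // Read back the result
+    std::string output_name = network.getOutputsInfo().begin()->first;
+    InferenceEngine::Blob::Ptr output = infer_request.GetBlob(output_name);
+    return 0;
+}
+```
+
+In a real application, input and output precision and layout would also be configured through `InferenceEngine::CNNNetwork::getInputsInfo()` and `InferenceEngine::CNNNetwork::getOutputsInfo()`, as described in the Common Workflow section below.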
-* [Introduction to Inference Engine Device Query API](InferenceEngine_QueryAPI.md) +For complete API Reference, see the [Inference Engine API References](./api_references.html) section. -* [Adding Your Own Layers to the Inference Engine](Extensibility_DG/Intro.md) +Inference Engine uses a plugin architecture. Inference Engine plugin is a software component that contains complete implementation for inference on a certain Intel® hardware device: CPU, GPU, VPU, etc. Each plugin implements the unified API and provides additional hardware-specific APIs. -* [Integrating Inference Engine in Your Application](Integrate_with_customer_application_new_API.md) +## Modules in the Inference Engine component +### Core Inference Engine Libraries ### -* [[DEPRECATED] Migration from Inference Engine Plugin API to Core API](Migration_CoreAPI.md) +Your application must link to the core Inference Engine libraries: +* Linux* OS: + - `libinference_engine.so`, which depends on `libinference_engine_transformations.so`, `libtbb.so`, `libtbbmalloc.so` and `libngraph.so` +* Windows* OS: + - `inference_engine.dll`, which depends on `inference_engine_transformations.dll`, `tbb.dll`, `tbbmalloc.dll` and `ngraph.dll` +* macOS*: + - `libinference_engine.dylib`, which depends on `libinference_engine_transformations.dylib`, `libtbb.dylib`, `libtbbmalloc.dylib` and `libngraph.dylib` -* [Introduction to Performance Topics](Intro_to_Performance.md) +The required C++ header files are located in the `include` directory. -* [Inference Engine Python API Overview](../../inference-engine/ie_bridges/python/docs/api_overview.md) +This library contains the classes to: +* Create Inference Engine Core object to work with devices and read network (InferenceEngine::Core) +* Manipulate network information (InferenceEngine::CNNNetwork) +* Execute and pass inputs and outputs (InferenceEngine::ExecutableNetwork and InferenceEngine::InferRequest) -* [Using Dynamic Batching feature](DynamicBatching.md) +### Plugin Libraries to Read a Network Object ### -* [Using Static Shape Infer feature](ShapeInference.md) +Starting from 2020.4 release, Inference Engine introduced a concept of `CNNNetwork` reader plugins. Such plugins can be automatically dynamically loaded by Inference Engine in runtime depending on file format: +* Linux* OS: + - `libinference_engine_ir_reader.so` to read a network from IR + - `libinference_engine_onnx_reader.so` to read a network from ONNX model format +* Windows* OS: + - `inference_engine_ir_reader.dll` to read a network from IR + - `inference_engine_onnx_reader.dll` to read a network from ONNX model format -* [Using Low-Precision 8-bit Integer Inference](Int8Inference.md) +### Device-Specific Plugin Libraries ### -* [Using Bfloat16 Inference](Bfloat16Inference.md) +For each supported target device, Inference Engine provides a plugin — a DLL/shared library that contains complete implementation for inference on this particular device. 
The following plugins are available: -* Utilities to Validate Your Converted Model - * [Using Cross Check Tool for Per-Layer Comparison Between Plugins](../../inference-engine/tools/cross_check_tool/README.md) +| Plugin | Device Type | +| ------- | ----------------------------- | +|CPU | Intel® Xeon® with Intel® AVX2 and AVX512, Intel® Core™ Processors with Intel® AVX2, Intel® Atom® Processors with Intel® SSE | +|GPU | Intel® Processor Graphics, including Intel® HD Graphics and Intel® Iris® Graphics | +|MYRIAD | Intel® Neural Compute Stick 2 powered by the Intel® Movidius™ Myriad™ X | +|GNA | Intel® Speech Enabling Developer Kit, Amazon Alexa* Premium Far-Field Developer Kit, Intel® Pentium® Silver J5005 Processor, Intel® Pentium® Silver N5000 Processor, Intel® Celeron® J4005 Processor, Intel® Celeron® J4105 Processor, Intel® Celeron® Processor N4100, Intel® Celeron® Processor N4000, Intel® Core™ i3-8121U Processor, Intel® Core™ i7-1065G7 Processor, Intel® Core™ i7-1060G7 Processor, Intel® Core™ i5-1035G4 Processor, Intel® Core™ i5-1035G7 Processor, Intel® Core™ i5-1035G1 Processor, Intel® Core™ i5-1030G7 Processor, Intel® Core™ i5-1030G4 Processor, Intel® Core™ i3-1005G1 Processor, Intel® Core™ i3-1000G1 Processor, Intel® Core™ i3-1000G4 Processor | +|HETERO | Automatic splitting of a network inference between several devices (for example if a device doesn't support certain layers| +|MULTI | Simultaneous inference of the same network on several devices in parallel| -* [Supported Devices](supported_plugins/Supported_Devices.md) - * [GPU](supported_plugins/CL_DNN.md) - * [CPU](supported_plugins/CPU.md) - * [VPU](supported_plugins/VPU.md) - * [MYRIAD](supported_plugins/MYRIAD.md) - * [HDDL](supported_plugins/HDDL.md) - * [Heterogeneous execution](supported_plugins/HETERO.md) - * [GNA](supported_plugins/GNA.md) - * [MULTI](supported_plugins/MULTI.md) +The table below shows the plugin libraries and additional dependencies for Linux, Windows and macOS platforms. 
-* [Pre-Trained Models](@ref omz_models_intel_index) +| Plugin | Library name for Linux | Dependency libraries for Linux | Library name for Windows | Dependency libraries for Windows | Library name for macOS | Dependency libraries for macOS | +|--------|-----------------------------|-------------------------------------------------------------|--------------------------|--------------------------------------------------------------------------------------------------------|------------------------------|---------------------------------------------| +| CPU | `libMKLDNNPlugin.so` | `libinference_engine_lp_transformations.so` | `MKLDNNPlugin.dll` | `inference_engine_lp_transformations.dll` | `libMKLDNNPlugin.so` | `inference_engine_lp_transformations.dylib` | +| GPU | `libclDNNPlugin.so` | `libinference_engine_lp_transformations.so`, `libOpenCL.so` | `clDNNPlugin.dll` | `OpenCL.dll`, `inference_engine_lp_transformations.dll` | Is not supported | - | +| MYRIAD | `libmyriadPlugin.so` | `libusb.so`, | `myriadPlugin.dll` | `usb.dll` | `libmyriadPlugin.so` | `libusb.dylib` | +| HDDL | `libHDDLPlugin.so` | `libbsl.so`, `libhddlapi.so`, `libmvnc-hddl.so` | `HDDLPlugin.dll` | `bsl.dll`, `hddlapi.dll`, `json-c.dll`, `libcrypto-1_1-x64.dll`, `libssl-1_1-x64.dll`, `mvnc-hddl.dll` | Is not supported | - | +| GNA | `libGNAPlugin.so` | `libgna.so`, | `GNAPlugin.dll` | `gna.dll` | Is not supported | - | +| HETERO | `libHeteroPlugin.so` | Same as for selected plugins | `HeteroPlugin.dll` | Same as for selected plugins | `libHeteroPlugin.so` | Same as for selected plugins | +| MULTI | `libMultiDevicePlugin.so` | Same as for selected plugins | `MultiDevicePlugin.dll` | Same as for selected plugins | `libMultiDevicePlugin.so` | Same as for selected plugins | -* [Known Issues](Known_Issues_Limitations.md) +> **NOTE**: All plugin libraries also depend on core Inference Engine libraries. -**Typical Next Step:** [Introduction to Inference Engine](inference_engine_intro.md) +Make sure those libraries are in your computer's path or in the place you pointed to in the plugin loader. Make sure each plugin's related dependencies are in the: + +* Linux: `LD_LIBRARY_PATH` +* Windows: `PATH` +* macOS: `DYLD_LIBRARY_PATH` + +On Linux and macOS, use the script `bin/setupvars.sh` to set the environment variables. + +On Windows, run the `bin\setupvars.bat` batch file to set the environment variables. + +To learn more about supported devices and corresponding plugins, see the [Supported Devices](supported_plugins/Supported_Devices.md) chapter. + +## Common Workflow for Using the Inference Engine API + +The common workflow contains the following steps: + +1. **Create Inference Engine Core object** - Create an `InferenceEngine::Core` object to work with different devices, all device plugins are managed internally by the `Core` object. Register extensions with custom nGraph operations (`InferenceEngine::Core::AddExtension`). + +2. **Read the Intermediate Representation** - Using the `InferenceEngine::Core` class, read an Intermediate Representation file into an object of the `InferenceEngine::CNNNetwork` class. This class represents the network in the host memory. + +3. **Prepare inputs and outputs format** - After loading the network, specify input and output precision and the layout on the network. For these specification, use the `InferenceEngine::CNNNetwork::getInputsInfo()` and `InferenceEngine::CNNNetwork::getOutputsInfo()`. + +4. 
Pass per device loading configurations specific to this device (`InferenceEngine::Core::SetConfig`), and register extensions to this device (`InferenceEngine::Core::AddExtension`). + +5. **Compile and Load Network to device** - Use the `InferenceEngine::Core::LoadNetwork()` method with specific device (e.g. `CPU`, `GPU`, etc.) to compile and load the network on the device. Pass in the per-target load configuration for this compilation and load operation. + +6. **Set input data** - With the network loaded, you have an `InferenceEngine::ExecutableNetwork` object. Use this object to create an `InferenceEngine::InferRequest` in which you signal the input buffers to use for input and output. Specify a device-allocated memory and copy it into the device memory directly, or tell the device to use your application memory to save a copy. + +7. **Execute** - With the input and output memory now defined, choose your execution mode: + + * Synchronously - `InferenceEngine::InferRequest::Infer()` method. Blocks until inference is completed. + * Asynchronously - `InferenceEngine::InferRequest::StartAsync()` method. Check status with the `InferenceEngine::InferRequest::Wait()` method (0 timeout), wait, or specify a completion callback. + +8. **Get the output** - After inference is completed, get the output memory or read the memory you provided earlier. Do this with the `InferenceEngine::IInferRequest::GetBlob()` method. + +## Video: Inference Engine Concept +[![](https://img.youtube.com/vi/e6R13V8nbak/0.jpg)](https://www.youtube.com/watch?v=e6R13V8nbak) +\htmlonly + +\endhtmlonly + +## Further Reading + +For more details on the Inference Engine API, refer to the [Integrating Inference Engine in Your Application](Integrate_with_customer_application_new_API.md) documentation. diff --git a/docs/IE_DG/Int8Inference.md b/docs/IE_DG/Int8Inference.md index 1443800f307..1f580bbd4e2 100644 --- a/docs/IE_DG/Int8Inference.md +++ b/docs/IE_DG/Int8Inference.md @@ -2,61 +2,18 @@ ## Disclaimer -Inference Engine with low-precision 8-bit integer inference requires the following prerequisites to be satisfied: -- Inference Engine [CPU Plugin](supported_plugins/CPU.md) must be built with the Intel® Math Kernel Library (Intel® MKL) dependency. In the Intel® Distribution of OpenVINO™ it is - satisfied by default, this is mostly the requirement if you are using OpenVINO™ available in open source, because [open source version of OpenVINO™](https://github.com/openvinotoolkit/openvino) can be built with OpenBLAS* that is unacceptable if you want to use 8-bit integer inference. -- Intel® platforms that support at least one extension to x86 instruction set from the following list: +Low-precision 8-bit inference is optimized for: +- Intel® architecture processors with the following instruction set architecture extensions: + - Intel® Advanced Vector Extensions 512 Vector Neural Network Instructions (Intel® AVX-512 VNNI) - Intel® Advanced Vector Extensions 512 (Intel® AVX-512) - Intel® Advanced Vector Extensions 2.0 (Intel® AVX2) - Intel® Streaming SIMD Extensions 4.2 (Intel® SSE4.2) -- A model must be quantized. To quantize the model, you can use the [Post-Training Optimization Tool](@ref pot_README) delivered with the Intel® Distribution of OpenVINO™ toolkit release package. 
- -The 8-bit inference feature was validated on the following topologies: -* **Classification models:** - * Caffe\* DenseNet-121, DenseNet-161, DenseNet-169, DenseNet-201 - * Caffe Inception v1, Inception v2, Inception v3, Inception v4 - * Caffe YOLO v1 tiny, YOLO v3 - * Caffe ResNet-50 v1, ResNet-101 v1, ResNet-152 v1, ResNet-269 v1 - * Caffe ResNet-18 - * Caffe MobileNet, MobileNet v2 - * Caffe SE ResNeXt-50 - * Caffe SqueezeNet v1.0, SqueezeNet v1.1 - * Caffe VGG16, VGG19 - * TensorFlow\* DenseNet-121, DenseNet-169 - * TensorFlow Inception v1, Inception v2, Inception v3, Inception v4, Inception ResNet v2 - * TensorFlow Lite Inception v1, Inception v2, Inception v3, Inception v4, Inception ResNet v2 - * TensorFlow Lite MobileNet v1, MobileNet v2 - * TensorFlow MobileNet v1, MobileNet v2 - * TensorFlow ResNet-50 v1.5, ResNet-50 v1, ResNet-101 v1, ResNet-152 v1, ResNet-50 v2, ResNet-101 v2, ResNet-152 v2 - * TensorFlow VGG16, VGG19 - * TensorFlow YOLO v3 - * MXNet\* CaffeNet - * MXNet DenseNet-121, DenseNet-161, DenseNet-169, DenseNet-201 - * MXNet Inception v3, inception_v4 - * MXNet Mobilenet, Mobilenet v2 - * MXNet ResNet-101 v1, ResNet-152 v1, ResNet-101 v2, ResNet-152 v2 - * MXNet ResNeXt-101 - * MXNet SqueezeNet v1.1 - * MXNet VGG16, VGG19 - - -* **Object detection models:** - * Caffe SSD GoogLeNet - * Caffe SSD MobileNet - * Caffe SSD SqueezeNet - * Caffe SSD VGG16 300, SSD VGG16 512 - * TensorFlow SSD MobileNet v1, SSD MobileNet v2 - * MXNet SSD Inception v3 512 - * MXNet SSD MobileNet 512 - * MXNet SSD ResNet-50 512 - * MXNet SSD VGG16 300 - * ONNX\* SSD ResNet 34 - -* **Semantic segmentation models:** - * Unet2D - -* **Recommendation system models:** - * NCF +- Intel® processor graphics: + - Intel® Iris® Xe Graphics + - Intel® Iris® Xe MAX Graphics +- A model must be quantized. You can use a quantized model from [OpenVINO™ Toolkit Intel's Pre-Trained Models](@ref omz_models_group_intel) or quantize a model yourself. For quantization, you can use the: + - [Post-Training Optimization Tool](@ref pot_README) delivered with the Intel® Distribution of OpenVINO™ toolkit release package. + - [Neural Network Compression Framework](https://www.intel.com/content/www/us/en/artificial-intelligence/posts/openvino-nncf.html) available on GitHub: https://github.com/openvinotoolkit/nncf ## Introduction @@ -65,63 +22,62 @@ A lot of investigation was made in the field of deep learning with the idea of u 8-bit computations (referred to as `int8`) offer better performance compared to the results of inference in higher precision (for example, `fp32`), because they allow loading more data into a single processor instruction. Usually the cost for significant boost is a reduced accuracy. However, it is proved that an accuracy drop can be negligible and depends on task requirements, so that the application engineer can set up the maximum accuracy drop that is acceptable. -Current Inference Engine solution for low-precision inference uses Intel MKL-DNN and supports inference of the following layers in 8-bit integer computation mode: -* Convolution -* FullyConnected -* ReLU -* ReLU6 -* Reshape -* Permute -* Pooling -* Squeeze -* Eltwise -* Concat -* Resample -* MVN -This means that 8-bit inference can only be performed with the CPU plugin on the layers listed above. All other layers are executed in the format supported by the CPU plugin: 32-bit floating point format (`fp32`). 
+Let's explore the quantized [TensorFlow* implementation of ResNet-50](https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/public/resnet-50-tf) model. Use the [Model Downloader](@ref omz_tools_downloader) tool to download the `fp16` model from [OpenVINO™ Toolkit - Open Model Zoo repository](https://github.com/openvinotoolkit/open_model_zoo): +```sh +./downloader.py --name resnet-50-tf --precisions FP16-INT8 +``` +After that, quantize the model with the [Model Quantizer](@ref omz_tools_downloader) tool: +```sh +./quantizer.py --model_dir public/resnet-50-tf --dataset_dir <DATASET_DIR> --precisions=FP16-INT8 +``` +The simplest way to infer the model and collect performance counters is the [C++ Benchmark Application](../../inference-engine/samples/benchmark_app/README.md): +```sh +./benchmark_app -m resnet-50-tf.xml -d CPU -niter 1 -api sync -report_type average_counters -report_folder pc_report_dir +``` +If you infer the model with the OpenVINO™ CPU plugin and collect performance counters, all operations (except the last SoftMax, which is not quantized) are executed in INT8 precision. ## Low-Precision 8-bit Integer Inference Workflow -For 8-bit integer computations, a model must be quantized. If the model is not quantized then you can use the [Post-Training Optimization Tool](@ref pot_README) to quantize the model. The quantization process adds `FakeQuantize` layers on activations and weights for most layers. Read more about mathematical computations under the hood in the [white paper](https://intel.github.io/mkl-dnn/ex_int8_simplenet.html). +For 8-bit integer computations, a model must be quantized. Quantized models can be downloaded from [Overview of OpenVINO™ Toolkit Intel's Pre-Trained Models](@ref omz_models_group_intel). If the model is not quantized, you can use the [Post-Training Optimization Tool](@ref pot_README) to quantize the model. The quantization process adds [FakeQuantize](../ops/quantization/FakeQuantize_1.md) layers on activations and weights for most layers. Read more about mathematical computations in the [Uniform Quantization with Fine-Tuning](https://github.com/openvinotoolkit/nncf/blob/develop/docs/compression_algorithms/Quantization.md). 8-bit inference pipeline includes two stages (also refer to the figure below): -1. *Offline stage*, or *model quantization*. During this stage, `FakeQuantize` layers are added before most layers to have quantized tensors before layers in a way that low-precision accuracy drop for 8-bit integer inference satisfies the specified threshold. The output of this stage is a quantized model. Quantized model precision is not changed, quantized tensors are in original precision range (`fp32`). `FakeQuantize` layer has `Quantization Levels` attribute which defines quants count. Quants count defines precision which is used during inference. For `int8` range `Quantization Levels` attribute value has to be 255 or 256. +1. *Offline stage*, or *model quantization*. During this stage, [FakeQuantize](../ops/quantization/FakeQuantize_1.md) layers are added before most layers to have quantized tensors before layers in a way that low-precision accuracy drop for 8-bit integer inference satisfies the specified threshold. The output of this stage is a quantized model. Quantized model precision is not changed, quantized tensors are in original precision range (`fp32`). `FakeQuantize` layer has `levels` attribute which defines quants count. Quants count defines precision which is used during inference. For `int8` range `levels` attribute value has to be 255 or 256.
To quantize the model, you can use the [Post-Training Optimization Tool](@ref pot_README) delivered with the Intel® Distribution of OpenVINO™ toolkit release package. + + When you pass the quantized IR to the OpenVINO™ plugin, the plugin automatically recognizes it as a quantized model and performs 8-bit inference. Note, if you pass a quantized model to another plugin that does not support 8-bit inference but supports all operations from the model, the model is inferred in precision that this plugin supports. + +2. *Run-time stage*. This stage is an internal procedure of the OpenVINO™ plugin. During this stage, the quantized model is loaded to the plugin. The plugin uses `Low Precision Transformation` component to update the model to infer it in low precision: + - Update `FakeQuantize` layers to have quantized output tensors in low precision range and add dequantization layers to compensate the update. Dequantization layers are pushed through as many layers as possible to have more layers in low precision. After that, most layers have quantized input tensors in low precision range and can be inferred in low precision. Ideally, dequantization layers should be fused in the next `FakeQuantize` layer. + - Weights are quantized and stored in `Constant` layers. -2. *Run-time stage*. This stage is an internal procedure of the [CPU Plugin](supported_plugins/CPU.md). During this stage, the quantized model is loaded to the plugin. The plugin updates each `FakeQuantize` layer on activations and weights to have `FakeQuantize` output tensor values in low precision range. ![int8_flow] -### Offline Stage: Model Quantization - -To infer a layer in low precision and get maximum performance, the input tensor for the layer has to be quantized and each value has to be in the target low precision range. For this purpose, `FakeQuantize` layer is used in the OpenVINO™ intermediate representation file (IR). To quantize the model, you can use the [Post-Training Optimization Tool](@ref pot_README) delivered with the Intel® Distribution of OpenVINO™ toolkit release package. - -When you pass the calibrated IR to the [CPU plugin](supported_plugins/CPU.md), the plugin automatically recognizes it as a quantized model and performs 8-bit inference. Note, if you pass a quantized model to another plugin that does not support 8-bit inference, the model is inferred in precision that this plugin supports. - -### Run-Time Stage: Quantization - -This is the second stage of the 8-bit integer inference. After you load the quantized model IR to a plugin, the pluing uses the `Low Precision Transformation` component to update the model to infer it in low precision: -* Updates `FakeQuantize` layers to have quantized output tensors in low precision range and add dequantization layers to compensate the update. Dequantization layers are pushed through as many layers as possible to have more layers in low precision. After that, most layers have quantized input tensors in low precision range and can be inferred in low precision. Ideally, dequantization layers should be fused in next `FakeQuantize` or `ScaleShift` layers. -* Weights are quantized and stored in `Const` layers. -* Biases are updated to avoid shifts in dequantization layers. - ## Performance Counters Information about layer precision is stored in the performance counters that are -available from the Inference Engine API. 
The layers have the following marks: -* Suffix `I8` for layers that had 8-bit data type input and were computed in 8-bit precision -* Suffix `FP32` for layers computed in 32-bit precision +available from the Inference Engine API. For example, the part of performance counters table for quantized [TensorFlow* implementation of ResNet-50](https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/public/resnet-50-tf) model inference on [CPU Plugin](supported_plugins/CPU.md) looks as follows: -For example, the performance counters table for the Inception model can look as follows: -``` -inception_5b/5x5_reduce EXECUTED layerType: Convolution realTime: 417 cpu: 417 execType: gemm_blas_I8 -inception_5b/output EXECUTED layerType: Concat realTime: 34 cpu: 34 execType: ref_I8 -inception_5b/output_U8_nhw... EXECUTED layerType: Reorder realTime: 33092 cpu: 33092 execType: reorder_I8 -inception_5b/output_oScale... EXECUTED layerType: ScaleShift realTime: 1390 cpu: 1390 execType: jit_avx2_FP32 -inception_5b/output_oScale... EXECUTED layerType: Reorder realTime: 143 cpu: 143 execType: reorder_FP32 -inception_5b/pool EXECUTED layerType: Pooling realTime: 59301 cpu: 59301 execType: ref_any_I8 -``` +| layerName | execStatus | layerType | execType | realTime (ms) | cpuTime (ms) | +| --------------------------------------------------------- | ---------- | ------------ | -------------------- | ------------- | ------------ | +| resnet\_model/batch\_normalization\_15/FusedBatchNorm/Add | EXECUTED | Convolution | jit\_avx512\_1x1\_I8 | 0.377 | 0.377 | +| resnet\_model/conv2d\_16/Conv2D/fq\_input\_0 | NOT\_RUN | FakeQuantize | undef | 0 | 0 | +| resnet\_model/batch\_normalization\_16/FusedBatchNorm/Add | EXECUTED | Convolution | jit\_avx512\_I8 | 0.499 | 0.499 | +| resnet\_model/conv2d\_17/Conv2D/fq\_input\_0 | NOT\_RUN | FakeQuantize | undef | 0 | 0 | +| resnet\_model/batch\_normalization\_17/FusedBatchNorm/Add | EXECUTED | Convolution | jit\_avx512\_1x1\_I8 | 0.399 | 0.399 | +| resnet\_model/add\_4/fq\_input\_0 | NOT\_RUN | FakeQuantize | undef | 0 | 0 | +| resnet\_model/add\_4 | NOT\_RUN | Eltwise | undef | 0 | 0 | +| resnet\_model/add\_5/fq\_input\_1 | NOT\_RUN | FakeQuantize | undef | 0 | 0 | -The `execType` column of the table includes inference primitives with specific suffixes. -[int8_flow]: img/cpu_int8_flow.png \ No newline at end of file +> The `exeStatus` column of the table includes possible values: +> - `EXECUTED` - layer was executed by standalone primitive, +> - `NOT_RUN` - layer was not executed by standalone primitive or was fused with another operation and executed in another layer primitive. +> +> The `execType` column of the table includes inference primitives with specific suffixes. The layers have the following marks: +> * Suffix `I8` for layers that had 8-bit data type input and were computed in 8-bit precision +> * Suffix `FP32` for layers computed in 32-bit precision + +All `Convolution` layers are executed in int8 precision. Rest layers are fused into Convolutions using post operations optimization technique, which is described in [Internal CPU Plugin Optimizations](supported_plugins/CPU.md). 
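+
+The same per-layer information can also be retrieved programmatically through the Inference Engine API. The following C++ sketch is illustrative only: the `model.xml` path is a placeholder and the input blobs are left at their default contents to keep the example short.
+
+```cpp
+#include <inference_engine.hpp>
+#include <iostream>
+#include <map>
+#include <string>
+
+int main() {
+    InferenceEngine::Core ie;
+    InferenceEngine::CNNNetwork network = ie.ReadNetwork("model.xml");  // placeholder IR path
+
+    // Ask the CPU plugin to collect per-layer performance counters
+    std::map<std::string, std::string> config = {
+        {InferenceEngine::PluginConfigParams::KEY_PERF_COUNT, InferenceEngine::PluginConfigParams::YES}};
+    InferenceEngine::ExecutableNetwork exec_network = ie.LoadNetwork(network, "CPU", config);
+
+    InferenceEngine::InferRequest infer_request = exec_network.CreateInferRequest();
+    infer_request.Infer();  // input blobs keep their default contents for brevity
+
+    // Each entry reports the layer name, layer type, and execType with its I8/FP32 suffix
+    for (const auto &counter : infer_request.GetPerformanceCounts()) {
+        std::cout << counter.first << "\t" << counter.second.layer_type << "\t"
+                  << counter.second.exec_type << std::endl;
+    }
+    return 0;
+}
+```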
+ +[int8_flow]: img/cpu_int8_flow.png diff --git a/docs/IE_DG/Intro_to_Performance.md b/docs/IE_DG/Intro_to_Performance.md index 12913c5811c..6dbdd35cef4 100644 --- a/docs/IE_DG/Intro_to_Performance.md +++ b/docs/IE_DG/Intro_to_Performance.md @@ -29,7 +29,7 @@ Refer to the [Benchmark App](../../inference-engine/samples/benchmark_app/README ## Using Async API To gain better performance on accelerators, such as VPU, the Inference Engine uses the asynchronous approach (see [Integrating Inference Engine in Your Application (current API)](Integrate_with_customer_application_new_API.md)). -The point is amortizing the costs of data transfers, by pipe-lining, see [Async API explained](@ref omz_demos_object_detection_demo_ssd_async_README). +The point is amortizing the costs of data transfers, by pipe-lining, see [Async API explained](@ref omz_demos_object_detection_demo_cpp). Since the pipe-lining relies on the availability of the parallel slack, running multiple inference requests in parallel is essential. Refer to the [Benchmark App](../../inference-engine/samples/benchmark_app/README.md) sample, which enables running a number of inference requests in parallel. Specifying different number of request produces different throughput measurements. diff --git a/docs/IE_DG/Samples_Overview.md b/docs/IE_DG/Samples_Overview.md index b59d5a576ae..8243fc7f7d6 100644 --- a/docs/IE_DG/Samples_Overview.md +++ b/docs/IE_DG/Samples_Overview.md @@ -47,7 +47,7 @@ To run the sample applications, you can use images and videos from the media fil ## Samples that Support Pre-Trained Models -To run the sample, you can use [public](@ref omz_models_public_index) or [Intel's](@ref omz_models_intel_index) pre-trained models from the Open Model Zoo. The models can be downloaded using the [Model Downloader](@ref omz_tools_downloader_README). +To run the sample, you can use [public](@ref omz_models_group_public) or [Intel's](@ref omz_models_group_intel) pre-trained models from the Open Model Zoo. The models can be downloaded using the [Model Downloader](@ref omz_tools_downloader). ## Build the Sample Applications @@ -209,7 +209,7 @@ vi /.bashrc 2. Add this line to the end of the file: ```sh -source /opt/intel/openvino/bin/setupvars.sh +source /opt/intel/openvino_2021/bin/setupvars.sh ``` 3. Save and close the file: press the **Esc** key, type `:wq` and press the **Enter** key. @@ -246,4 +246,4 @@ sample, read the sample documentation by clicking the sample name in the samples list above. ## See Also -* [Introduction to Inference Engine](inference_engine_intro.md) +* [Inference Engine Developer Guide](Deep_Learning_Inference_Engine_DevGuide.md) diff --git a/docs/IE_DG/ShapeInference.md b/docs/IE_DG/ShapeInference.md index ea86911ff39..93b27c621b5 100644 --- a/docs/IE_DG/ShapeInference.md +++ b/docs/IE_DG/ShapeInference.md @@ -66,8 +66,8 @@ Shape collision during shape propagation may be a sign that a new shape does not Changing the model input shape may result in intermediate operations shape collision. 
Examples of such operations: -- [`Reshape` operation](../ops/shape/Reshape_1.md) with a hard-coded output shape value -- [`MatMul` operation](../ops/matrix/MatMul_1.md) with the `Const` second input cannot be resized by spatial dimensions due to operation semantics +- [Reshape](../ops/shape/Reshape_1.md) operation with a hard-coded output shape value +- [MatMul](../ops/matrix/MatMul_1.md) operation with the `Const` second input cannot be resized by spatial dimensions due to operation semantics Model structure and logic should not change significantly after model reshaping. - The Global Pooling operation is commonly used to reduce output feature map of classification models output. @@ -100,7 +100,7 @@ Here is a code example: @snippet snippets/ShapeInference.cpp part0 -Shape Inference feature is used in [Smart classroom sample](@ref omz_demos_smart_classroom_demo_README). +Shape Inference feature is used in [Smart Classroom Demo](@ref omz_demos_smart_classroom_demo_cpp). ## Extensibility diff --git a/docs/IE_DG/Tools_Overview.md b/docs/IE_DG/Tools_Overview.md index 6600554785b..f0741105387 100644 --- a/docs/IE_DG/Tools_Overview.md +++ b/docs/IE_DG/Tools_Overview.md @@ -6,9 +6,9 @@ The OpenVINO™ toolkit installation includes the following tools: |Tool | Location in the Installation Directory| |-----------------------------------------------------------------------------|---------------------------------------| -|[Accuracy Checker Tool](@ref omz_tools_accuracy_checker_README) | `/deployment_tools/tools/open_model_zoo/tools/accuracy_checker`| +|[Accuracy Checker Tool](@ref omz_tools_accuracy_checker) | `/deployment_tools/tools/open_model_zoo/tools/accuracy_checker`| |[Post-Training Optimization Tool](@ref pot_README) | `/deployment_tools/tools/post_training_optimization_toolkit`| -|[Model Downloader](@ref omz_tools_downloader_README) | `/deployment_tools/tools/model_downloader`| +|[Model Downloader](@ref omz_tools_downloader) | `/deployment_tools/tools/model_downloader`| |[Cross Check Tool](../../inference-engine/tools/cross_check_tool/README.md) | `/deployment_tools/tools/cross_check_tool`| |[Compile Tool](../../inference-engine/tools/compile_tool/README.md) | `/deployment_tools/inference_engine/lib/intel64/`| diff --git a/docs/IE_DG/img/cpu_int8_flow.png b/docs/IE_DG/img/cpu_int8_flow.png index 130e54ceafa..794430126b2 100644 --- a/docs/IE_DG/img/cpu_int8_flow.png +++ b/docs/IE_DG/img/cpu_int8_flow.png @@ -1,3 +1,3 @@ version https://git-lfs.github.com/spec/v1 -oid sha256:3965f4830c45518ee1dc169c2b1760cae83f8a8819023770a28893c6cef558c2 -size 68441 +oid sha256:83bcd7888d3843ddfd9a601288627e98f5874290c00b9988bf1beac9209f2e8d +size 79741 diff --git a/docs/IE_DG/inference_engine_intro.md b/docs/IE_DG/inference_engine_intro.md index d23f168ba51..c262436d101 100644 --- a/docs/IE_DG/inference_engine_intro.md +++ b/docs/IE_DG/inference_engine_intro.md @@ -1,5 +1,11 @@ -Introduction to Inference Engine {#openvino_docs_IE_DG_inference_engine_intro} -================================ +# Introduction to Inference Engine {#openvino_docs_IE_DG_inference_engine_intro} + +> **NOTE:** [Intel® System Studio](https://software.intel.com/en-us/system-studio) is an all-in-one, cross-platform tool suite, purpose-built to simplify system bring-up and improve system and IoT device application performance on Intel® platforms. 
If you are using the Intel® Distribution of OpenVINO™ with Intel® System Studio, go to [Get Started with Intel® System Studio](https://software.intel.com/en-us/articles/get-started-with-openvino-and-intel-system-studio-2019). + +This Guide provides an overview of the Inference Engine describing the typical workflow for performing +inference of a pre-trained and optimized deep learning model and a set of sample applications. + +> **NOTE:** Before you perform inference with the Inference Engine, your models should be converted to the Inference Engine format using the Model Optimizer or built directly in run-time using nGraph API. To learn about how to use Model Optimizer, refer to the [Model Optimizer Developer Guide](../MO_DG/Deep_Learning_Model_Optimizer_DevGuide.md). To learn about the pre-trained and optimized models delivered with the OpenVINO™ toolkit, refer to [Pre-Trained Models](@ref omz_models_group_intel). After you have used the Model Optimizer to create an Intermediate Representation (IR), use the Inference Engine to infer the result for a given input data. diff --git a/docs/IE_DG/protecting_model_guide.md b/docs/IE_DG/protecting_model_guide.md index 2074d223014..78fe08b3318 100644 --- a/docs/IE_DG/protecting_model_guide.md +++ b/docs/IE_DG/protecting_model_guide.md @@ -58,5 +58,5 @@ should be called with `weights` passed as an empty `Blob`. - Model Optimizer Developer Guide: [Model Optimizer Developer Guide](../MO_DG/Deep_Learning_Model_Optimizer_DevGuide.md) - Inference Engine Developer Guide: [Inference Engine Developer Guide](Deep_Learning_Inference_Engine_DevGuide.md) - For more information on Sample Applications, see the [Inference Engine Samples Overview](Samples_Overview.md) -- For information on a set of pre-trained models, see the [Overview of OpenVINO™ Toolkit Pre-Trained Models](@ref omz_models_intel_index) +- For information on a set of pre-trained models, see the [Overview of OpenVINO™ Toolkit Pre-Trained Models](@ref omz_models_group_intel) - For IoT Libraries and Code Samples see the [Intel® IoT Developer Kit](https://github.com/intel-iot-devkit). diff --git a/docs/IE_DG/supported_plugins/MULTI.md b/docs/IE_DG/supported_plugins/MULTI.md index a6b4aaefc9f..f20443ca4c2 100644 --- a/docs/IE_DG/supported_plugins/MULTI.md +++ b/docs/IE_DG/supported_plugins/MULTI.md @@ -92,11 +92,20 @@ Notice that until R2 you had to calculate number of requests in your application Notice that every OpenVINO sample that supports "-d" (which stays for "device") command-line option transparently accepts the multi-device. The [Benchmark Application](../../../inference-engine/samples/benchmark_app/README.md) is the best reference to the optimal usage of the multi-device. As discussed multiple times earlier, you don't need to setup number of requests, CPU streams or threads as the application provides optimal out of the box performance. Below is example command-line to evaluate HDDL+GPU performance with that: -```bash -$ ./benchmark_app –d MULTI:HDDL,GPU –m -i -niter 1000 + +```sh +./benchmark_app -d MULTI:HDDL,GPU -m <path_to_model> -i <path_to_input> -niter 1000 ``` Notice that you can use the FP16 IR to work with multi-device (as CPU automatically upconverts it to the fp32) and rest of devices support it naturally. Also notice that no demos are (yet) fully optimized for the multi-device, by means of supporting the OPTIMAL_NUMBER_OF_INFER_REQUESTS metric, using the GPU streams/throttling, and so on.
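+
+For reference, the same multi-device setup can be requested directly from application code. The following C++ sketch is illustrative only: the `HDDL,GPU` priority list and the `model.xml` path are example values, and input handling is omitted.
+
+```cpp
+#include <inference_engine.hpp>
+#include <vector>
+
+int main() {
+    InferenceEngine::Core ie;
+    InferenceEngine::CNNNetwork network = ie.ReadNetwork("model.xml");  // placeholder IR path
+
+    // "MULTI:" followed by a priority-ordered device list selects the Multi-Device plugin
+    InferenceEngine::ExecutableNetwork exec_network = ie.LoadNetwork(network, "MULTI:HDDL,GPU");
+
+    // Query how many parallel infer requests this configuration considers optimal
+    auto optimal = exec_network.GetMetric(METRIC_KEY(OPTIMAL_NUMBER_OF_INFER_REQUESTS)).as<unsigned int>();
+
+    // Create that many requests and run them asynchronously to keep all devices busy
+    std::vector<InferenceEngine::InferRequest> requests;
+    for (unsigned int i = 0; i < optimal; ++i) {
+        requests.push_back(exec_network.CreateInferRequest());
+        requests.back().StartAsync();
+    }
+    for (auto &request : requests) {
+        request.Wait(InferenceEngine::InferRequest::WaitMode::RESULT_READY);
+    }
+    return 0;
+}
+```
+
+As with the command line above, the device list after `MULTI:` is ordered by priority, and querying the OPTIMAL_NUMBER_OF_INFER_REQUESTS metric replaces manual tuning of the request count.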
+## Video: MULTI Plugin +[![](https://img.youtube.com/vi/xbORYFEmrqU/0.jpg)](https://www.youtube.com/watch?v=xbORYFEmrqU) +\htmlonly + +\endhtmlonly + ## See Also * [Supported Devices](Supported_Devices.md) + + diff --git a/docs/IE_DG/supported_plugins/Supported_Devices.md b/docs/IE_DG/supported_plugins/Supported_Devices.md index c687e4ae602..514b4bd58a7 100644 --- a/docs/IE_DG/supported_plugins/Supported_Devices.md +++ b/docs/IE_DG/supported_plugins/Supported_Devices.md @@ -16,6 +16,8 @@ The Inference Engine provides unique capabilities to infer deep learning models |[Multi-Device plugin](MULTI.md) |Multi-Device plugin enables simultaneous inference of the same network on several Intel® devices in parallel | |[Heterogeneous plugin](HETERO.md) |Heterogeneous plugin enables automatic inference splitting between several Intel® devices (for example if a device doesn't [support certain layers](#supported-layers)). | +Devices similar to the ones we have used for benchmarking can be accessed using [Intel® DevCloud for the Edge](https://devcloud.intel.com/edge/), a remote development environment with access to Intel® hardware and the latest versions of the Intel® Distribution of the OpenVINO™ Toolkit. [Learn more](https://devcloud.intel.com/edge/get_started/devcloud/) or [Register here](https://inteliot.force.com/DevcloudForEdge/s/). + ## Supported Configurations The Inference Engine can inference models in different formats with various input and output formats. diff --git a/docs/MO_DG/Deep_Learning_Model_Optimizer_DevGuide.md b/docs/MO_DG/Deep_Learning_Model_Optimizer_DevGuide.md index 3b657f52a35..d21ab41bd5e 100644 --- a/docs/MO_DG/Deep_Learning_Model_Optimizer_DevGuide.md +++ b/docs/MO_DG/Deep_Learning_Model_Optimizer_DevGuide.md @@ -115,3 +115,22 @@ Model Optimizer produces an Intermediate Representation (IR) of the network, whi * [Known Issues](Known_Issues_Limitations.md) **Typical Next Step:** [Preparing and Optimizing your Trained Model with Model Optimizer](prepare_model/Prepare_Trained_Model.md) + +## Video: Model Optimizer Concept + +[![](https://img.youtube.com/vi/Kl1ptVb7aI8/0.jpg)](https://www.youtube.com/watch?v=Kl1ptVb7aI8) +\htmlonly + +\endhtmlonly + +## Video: Model Optimizer Basic Operation +[![](https://img.youtube.com/vi/BBt1rseDcy0/0.jpg)](https://www.youtube.com/watch?v=BBt1rseDcy0) +\htmlonly + +\endhtmlonly + +## Video: Choosing the Right Precision +[![](https://img.youtube.com/vi/RF8ypHyiKrY/0.jpg)](https://www.youtube.com/watch?v=RF8ypHyiKrY) +\htmlonly + +\endhtmlonly diff --git a/docs/MO_DG/prepare_model/convert_model/Convert_Model_From_TensorFlow.md b/docs/MO_DG/prepare_model/convert_model/Convert_Model_From_TensorFlow.md index 97f801bd06b..275b8e786d0 100644 --- a/docs/MO_DG/prepare_model/convert_model/Convert_Model_From_TensorFlow.md +++ b/docs/MO_DG/prepare_model/convert_model/Convert_Model_From_TensorFlow.md @@ -404,6 +404,11 @@ Refer to [Supported Framework Layers ](../Supported_Frameworks_Layers.md) for th The Model Optimizer provides explanatory messages if it is unable to run to completion due to issues like typographical errors, incorrectly used options, or other issues. The message describes the potential cause of the problem and gives a link to the [Model Optimizer FAQ](../Model_Optimizer_FAQ.md). The FAQ has instructions on how to resolve most issues. The FAQ also includes links to relevant sections in the Model Optimizer Developer Guide to help you understand what went wrong. 
+## Video: Converting a TensorFlow Model +[![](https://img.youtube.com/vi/QW6532LtiTc/0.jpg)](https://www.youtube.com/watch?v=QW6532LtiTc) +\htmlonly + +\endhtmlonly ## Summary In this document, you learned: diff --git a/docs/MO_DG/prepare_model/convert_model/tf_specific/Convert_Object_Detection_API_Models.md b/docs/MO_DG/prepare_model/convert_model/tf_specific/Convert_Object_Detection_API_Models.md index e2886e272e0..6683d6b9b8a 100644 --- a/docs/MO_DG/prepare_model/convert_model/tf_specific/Convert_Object_Detection_API_Models.md +++ b/docs/MO_DG/prepare_model/convert_model/tf_specific/Convert_Object_Detection_API_Models.md @@ -106,7 +106,7 @@ Models with `keep_aspect_ratio_resizer` were trained to recognize object in real Inference Engine comes with a number of samples that use Object Detection API models including: * [Object Detection for SSD Sample](../../../../../inference-engine/samples/object_detection_sample_ssd/README.md) --- for RFCN, SSD and Faster R-CNNs -* [Mask R-CNN Sample for TensorFlow* Object Detection API Models](@ref omz_demos_mask_rcnn_demo_README) --- for Mask R-CNNs +* [Mask R-CNN Sample for TensorFlow* Object Detection API Models](@ref omz_demos_mask_rcnn_demo_cpp) --- for Mask R-CNNs There are a number of important notes about feeding input images to the samples: @@ -1047,4 +1047,4 @@ The Mask R-CNN models are cut at the end with the sub-graph replacer `ObjectDete ```SecondStageBoxPredictor_1/Conv_3/BiasAdd|SecondStageBoxPredictor_1/Conv_1/BiasAdd``` -One of these two nodes produces output mask tensors. The child nodes of these nodes are related to post-processing which is implemented in the [Mask R-CNN demo](@ref omz_demos_mask_rcnn_demo_README) and should be cut off. +One of these two nodes produces output mask tensors. The child nodes of these nodes are related to post-processing which is implemented in the [Mask R-CNN demo](@ref omz_demos_mask_rcnn_demo_cpp) and should be cut off. diff --git a/docs/MO_DG/prepare_model/convert_model/tf_specific/Convert_YOLO_From_Tensorflow.md b/docs/MO_DG/prepare_model/convert_model/tf_specific/Convert_YOLO_From_Tensorflow.md index 99748b7b18f..109714dcea6 100644 --- a/docs/MO_DG/prepare_model/convert_model/tf_specific/Convert_YOLO_From_Tensorflow.md +++ b/docs/MO_DG/prepare_model/convert_model/tf_specific/Convert_YOLO_From_Tensorflow.md @@ -110,7 +110,7 @@ where: > **NOTE:** The color channel order (RGB or BGR) of an input data should match the channel order of the model training dataset. If they are different, perform the `RGB<->BGR` conversion specifying the command-line parameter: `--reverse_input_channels`. Otherwise, inference results may be incorrect. For more information about the parameter, refer to **When to Reverse Input Channels** section of [Converting a Model Using General Conversion Parameters](../Converting_Model_General.md). -OpenVINO™ toolkit provides a demo that uses YOLOv3 model. For more information, refer to [Object Detection C++ Demo](@ref omz_demos_object_detection_demo_ssd_async_README). +OpenVINO™ toolkit provides a demo that uses YOLOv3 model. For more information, refer to [Object Detection C++ Demo](@ref omz_demos_object_detection_demo_cpp). 
## Convert YOLOv1 and YOLOv2 Models to the IR diff --git a/docs/MO_DG/prepare_model/customize_model_optimizer/Customize_Model_Optimizer.md b/docs/MO_DG/prepare_model/customize_model_optimizer/Customize_Model_Optimizer.md index 73e439d83fe..99b4cd703c1 100644 --- a/docs/MO_DG/prepare_model/customize_model_optimizer/Customize_Model_Optimizer.md +++ b/docs/MO_DG/prepare_model/customize_model_optimizer/Customize_Model_Optimizer.md @@ -1,53 +1,53 @@ # Model Optimizer Extensibility {#openvino_docs_MO_DG_prepare_model_customize_model_optimizer_Customize_Model_Optimizer} -* [Model Representation in Memory](#model-representation-in-memory) -* [Model Conversion Pipeline](#model-conversion-pipeline) - * [Model Loading](#model-loading) - * [Operations Attributes Extracting](#operations-attributes-extracting) - * [Front Phase](#front-phase) - * [Partial Inference](#partial-inference) - * [Middle Phase](#middle-phase) - * [NHWC to NCHW Layout Change](#layout-change) - * [Back Phase](#back-phase) - * [Intermediate Representation Emitting](#ir-emitting) -* [Graph Traversal and Modification Using `Port`s and `Connection`s](#graph-ports-and-conneсtions) - * [Ports](#intro-ports) - * [Connections](#intro-connections) -* [Model Optimizer Extensions](#extensions) - * [Model Optimizer Operation](#extension-operation) - * [Operation Extractor](#operation-extractor) - * [Graph Transformation Extensions](#graph-transformations) - * [Front Phase Transformations](#front-phase-transformations) - * [Pattern-Defined Front Phase Transformations](#pattern-defined-front-phase-transformations) - * [Specific Operation Front Phase Transformations](#specific-operation-front-phase-transformations) - * [Generic Front Phase Transformations](#generic-front-phase-transformations) - * [Node Name Pattern Front Phase Transformations](#node-name-pattern-front-phase-transformations) - * [Front Phase Transformations Using Start and End Points](#start-end-points-front-phase-transformations) - * [Generic Front Phase Transformations Enabled with Transformations Configuration File](#generic-transformations-config-front-phase-transformations) - * [Middle Phase Transformations](#middle-phase-transformations) - * [Pattern-Defined Middle Phase Transformations](#pattern-defined-middle-phase-transformations) - * [Generic Middle Phase Transformations](#generic-middle-phase-transformations) - * [Back Phase Transformations](#back-phase-transformations) - * [Pattern-Defined Back Phase Transformations](#pattern-defined-back-phase-transformations) - * [Generic Back Phase Transformations](#generic-back-phase-transformations) +- Model Representation in Memory +- Model Conversion Pipeline + - Model Loading + - Operations Attributes Extracting + - Front Phase + - Partial Inference + - Middle Phase + - NHWC to NCHW Layout Change + - Back Phase + - Intermediate Representation Emitting +- Graph Traversal and Modification Using Ports and Connections + - Ports + - Connections +- Model Optimizer Extensions + - Model Optimizer Operation + - Operation Extractor + - Graph Transformation Extensions + - Front Phase Transformations + - Pattern-Defined Front Phase Transformations + - Specific Operation Front Phase Transformations + - Generic Front Phase Transformations + - Node Name Pattern Front Phase Transformations + - Front Phase Transformations Using Start and End Points + - Generic Front Phase Transformations Enabled with Transformations Configuration File + - Middle Phase Transformations + - Pattern-Defined Middle Phase Transformations + - Generic Middle Phase 
Transformations + - Back Phase Transformations + - Pattern-Defined Back Phase Transformations + - Generic Back Phase Transformations +- See Also -Model Optimizer extensibility mechanism allows to support new operations and custom transformations to generate the -optimized Intermediate Representation (IR) as described in the +Model Optimizer extensibility mechanism enables support of new operations and custom transformations to generate the +optimized intermediate representation (IR) as described in the [Deep Learning Network Intermediate Representation and Operation Sets in OpenVINO™](../../IR_and_opsets.md). This -mechanism is a core part of the Model Optimizer and the Model Optimizer uses it under the hood, so the Model Optimizer -itself is a huge set of examples on how to add custom logic to support your model. +mechanism is a core part of the Model Optimizer. The Model Optimizer itself uses it under the hood, being a huge set of examples on how to add custom logic to support your model. There are several cases when the customization is needed: * A model contains operation(s) not known for the Model Optimizer, but these operation(s) could be expressed as a -combination of supported operations. In this case a custom transformation should be implemented to replace unsupported +combination of supported operations. In this case, a custom transformation should be implemented to replace unsupported operation(s) with supported ones. -* A model contains sub-graph of operations which can be replaced with a smaller number of operations to get the better +* A model contains sub-graph of operations that can be replaced with a smaller number of operations to get the better performance. This example corresponds to so called fusing transformations. For example, replace a sub-graph performing the following calculation \f$x / (1.0 + e^{-(beta * x)})\f$ with a single operation of type [Swish](../../../ops/activation/Swish_4.md). -* A model contains a custom framework operation (the operation which is not a part of an official operation set of the -framework) which was developed using the framework extensibility mechanism. In this case the Model Optimizer should know +* A model contains a custom framework operation (the operation that is not a part of an official operation set of the +framework) that was developed using the framework extensibility mechanism. In this case, the Model Optimizer should know how to handle the operation and generate a corresponding section in an IR for it. It is necessary to figure out how the Model Optimizer represents a model in a memory and converts it to an IR before @@ -61,14 +61,13 @@ The model can be represented as a directed graph where nodes are operations and producer operation (node) to a consumer operation (node). Model Optimizer uses Python class `mo.graph.graph.Graph` instance to represent the computation graph in memory during -the model conversion. This class is inherited from `networkx.MultiDiGraph` class of the standard `networkx` Python +the model conversion. This class is inherited from the `networkx.MultiDiGraph` class of the standard `networkx` Python library and provides many convenient methods to traverse and modify the graph. Refer to the `mo/graph/graph.py` file for the examples. -Model Optimizer keeps all necessary information about the operation in a node attributes. 
Model Optimizer uses class -`mo.graph.graph.Node` defined in the `mo/graph/graph.py` file which is a wrapper on top of a `networkx` node attributes -dictionary and provides many convenient methods to work with the node. For example, the node `my_node` attribute with a -name `'my_attr'` can be retrieved from the node with the following code `my_node.my_attr` which is equivalent to obtaining +Model Optimizer keeps all necessary information about the operation in node attributes. Model Optimizer uses the `mo.graph.graph.Node` class defined in the `mo/graph/graph.py` file, which is a wrapper on top of a `networkx` node attributes +dictionary, and provides many convenient methods to work with the node. For example, the node `my_node` attribute with a +name `'my_attr'` can be retrieved from the node with the following code `my_node.my_attr`, which is equivalent to obtaining attribute with name `'my_attr'` in the `graph.node['my_node']` dictionary. Refer to the `mo/graph/graph.py` for the class implementation details. @@ -76,12 +75,12 @@ An operation may have several inputs and outputs. For example, operation [Split] two inputs: data to split and axis to split along, and variable number of outputs depending on a value of attribute `num_splits`. Each input data to the operation is passed to a specific operation **input port**. An operation produces an output data from an **output port**. Input and output ports are numbered from 0 independently. Model Optimizer uses -classes `mo.graph.port.Port` and `mo.graph.connection.Connection` which are useful abstraction to perform graph -modifications like nodes connecting/re-connecting and a graph traversing. These classes are widely used in the Model +classes `mo.graph.port.Port` and `mo.graph.connection.Connection`, which are useful abstraction to perform graph +modifications like nodes connecting/re-connecting and graph traversing. These classes are widely used in the Model Optimizer code so it is easy to find a lot of usage examples. There is no dedicated class corresponding to an edge, so low-level graph manipulation is needed to get access to -edge attributes if needed. Meanwhile most manipulations with nodes connections should be done with help of +edge attributes if needed. Meanwhile, most manipulations with nodes connections should be done with help of the `mo.graph.connection.Connection` and `mo.graph.port.Port` classes. Thus, low-level graph manipulation is error prone and is strongly not recommended. @@ -94,19 +93,19 @@ A model conversion pipeline can be represented with the following diagram: ![Model Conversion pipeline](../../../img/MO_conversion_pipeline.png) -Lets review each conversion step in details. +Each conversion step is reviewed in details below. ### Model Loading -Model Optimizer gets as input a trained model file. The model loader component of the Model Optimizer reads a model file +Model Optimizer gets a trained model file as an input. The model loader component of the Model Optimizer reads a model file using Python bindings provided with the framework and builds an in-memory representation of a computation graph. There is a separate loader for each supported framework. These loaders are implemented in the `extensions/load//loader.py` files of the Model Optimizer. -> **NOTE**: Model Optimizer uses a special parser for Caffe\* models built on top of `caffe.proto` file. In case of a +> **NOTE**: Model Optimizer uses a special parser for Caffe\* models built on top of the `caffe.proto` file. 
In case of a > model loading failure, the Model Optimizer throws an error and requests to prepare the parser that can read the model. > For more information on how to prepare the custom Caffe\* parser, refer to the [Model Optimizer Frequently Asked Questions #1](../Model_Optimizer_FAQ.md). -The result of a model loading step is a `Graph` object which can be depicted like in the following example: +The result of a model loading step is a `Graph` object, which can be depicted like in the following example: ![Graph After Load](../../../img/MO_graph_after_loader.png) @@ -114,16 +113,16 @@ Model Optimizer loader saves an operation instance framework description (usuall attribute usually with a name `pb` for each operation of an input model. It is important that this is a **framework-specific** description of an operation. This means that an operation, for example, [Convolution](../../../ops/convolution/Convolution_1.md) may be represented differently in, for example, Caffe\* and -TensorFlow\* frameworks but perform exactly the same calculations from a mathematical point of view. +TensorFlow\* frameworks but performs the same calculations from a mathematical point of view. -In the example above the "Operation 2" has one input and two outputs. The tensor produced from the output port 0 is +In the example above, the "Operation 2" has one input and two outputs. The tensor produced from the output port 0 is consumed with the "Operation 5" (the input port 0) and "Operation 3" (the input port 1). The tensor produced from the output port 1 is consumed with the "Operation 4" (the input port 0). Each edge has two attributes `in` and `out` containing the input port number of the consumer node and the output port -number of the producer node. These attribute describe the fact that nodes are operations consuming some input tensors +number of the producer node. These attributes describe the fact that nodes are operations consuming some input tensors and producing some output tensors. But nodes themselves are "black boxes" from the Model Optimizer perspective because -they don't contain required information about the operation they perform. +they do not contain required information about the operation they perform. ### Operations Attributes Extracting The next step is to parse framework-dependent operation representation saved in a node attribute and update the node @@ -159,22 +158,22 @@ document). Detailed list of common node attributes and their values is provided [Model Optimizer Operation](#extension-operation). ### Front Phase -Due to legacy reasons an user must specify shapes for all not fully-defined inputs of the model. In contrast, other -machine learning frameworks like TensorFlow\* let user create a model with undefined or partially defined input shapes. +For legacy reasons, you must specify shapes for all not fully-defined inputs of the model. In contrast, other +machine learning frameworks like TensorFlow\* let you create a model with undefined or partially defined input shapes. As an example, undefined dimension is marked with an integer value `-1` in a TensorFlow\* model or has some string name in an ONNX\* model. -During the front phase the Model Optimizer knows shape of the model inputs and constants only and does not know shapes +During the front phase, the Model Optimizer knows shape of the model inputs and constants only and does not know shapes (and even ranks) of the intermediate tensors. But information about shapes may not be needed to implement particular transformation. 
For example, the transformation `extensions/front/TopKNormalize.py` removes an attribute `k` from a `TopK` node and adds an input constant with the value `k`. The transformation is needed to convert a `TopK` operation -which comes from frameworks where a number of output elements is defined as an attribute of the operation to the -OpenVINO™ [TopK](../../../ops/sort/TopK_3.md) operation semantic which requires this value to be a separate input. +that comes from frameworks where a number of output elements is defined as an attribute of the operation to the +OpenVINO™ [TopK](../../../ops/sort/TopK_3.md) operation semantic, which requires this value to be a separate input. -It is important to mention that sometimes it seems like a transformation cannot be implemented during the front phase +It is important to mention that sometimes it seems like transformation cannot be implemented during the front phase because the actual values of inputs or shapes are needed. But in fact shapes or values manipulations can be implemented -using operations which are added to the graph. Consider the transformation -`extensions/front/onnx/flattenONNX_to_reshape.py` which replaces an ONNX\* operation +using operations that are added to the graph. Consider the +`extensions/front/onnx/flattenONNX_to_reshape.py` transformation, which replaces an ONNX\* operation [Flatten](https://github.com/onnx/onnx/blob/master/docs/Operators.md#Flatten) with a sub-graph of operations performing the following (for the case when `axis` is not equal to 0 and 1): @@ -185,14 +184,14 @@ the following (for the case when `axis` is not equal to 0 and 1): [Reshape](../../../ops/shape/Reshape_1.md) specification for an explanation of this value). 4. Use the concatenated value as the second input to the `Reshape` operation. -It is highly recommended to write shape-agnostic transformations to avoid model reshape-ability issues. Refer to +It is highly recommended that you write shape-agnostic transformations to avoid model reshape-ability issues. Refer to [Using Shape Inference](../../../IE_DG/ShapeInference.md) for more information related to the reshaping of a model. More information on how to develop front phase transformations and dedicated API description is provided in the [Front Phase Transformations](#front-phase-transformations). ### Partial Inference -Model Optimizer performs a partial inference of a model during a model conversion. This procedure includes output shapes +Model Optimizer performs a partial inference of a model during model conversion. This procedure includes output shapes calculation of all operations in a model and constant folding (value calculation for constant sub-graphs). The constant folding is needed for the shape inference because in some cases evaluation of constant sub-graph is needed to calculate output shapes. For example, the output shape for the [Reshape](../../../ops/shape/Reshape_1.md) operation may be @@ -213,22 +212,22 @@ files. > [Const](../../../ops/infrastructure/Constant_1.md) operations defined with respective operation attributes. Model Optimizer inserts "data" nodes to the computation graph before starting the partial inference phase. The data node -corresponds to the specific tensor produced with the operation. Each data node contains two attributes: `shape` -containing the shape of the tensor and `value` which may contain the actual value of the tensor. The value for a `value` +corresponds to the specific tensor produced with the operation. 
Each data node contains two attributes: `shape`, +containing the shape of the tensor, and `value`, which may contain the actual value of the tensor. The value for a `value` attribute is equal to `None` if this tensor value cannot be calculated. This happens in two cases: when a tensor value depends on a values passed to the [Parameter](../../../ops/infrastructure/Parameter_1.md) operation of a model or the Model Optimizer does not have value propagation implementation for the operation. -The graph before running the partial inference can be depicted like in the following example: +Before running partial inference, the graph can be depicted like in the following example: ![Graph Before Partial Inference](../../../img/MO_graph_before_partial_inference.png) The difference in a graph structure with a graph during the front phase is not only in the data nodes, but also in the -edge attributes. Note, that an `out` attribute is specified for edges **from operation** nodes only, while an `in` +edge attributes. Note that an `out` attribute is specified for edges **from operation** nodes only, while an `in` attribute is specified for edges **from data** nodes only. This corresponds to the fact that a tensor (data node) is produced from a specific output port of an operation and is consumed with a specific input port of an operation. Also, a unique data node is created for each output port of an operation and may be used as an input node for several -operation nodes, like the data node "data2_0" which is consumed with the input port 1 of the operation "Operation 3" and +operation nodes, like the data node "data2_0", which is consumed with the input port 1 of the operation "Operation 3" and input port 0 of the operation "Operation 5". Now consider how the Model Optimizer performs shape and value propagation. Model Optimizer performs graph nodes @@ -236,13 +235,13 @@ topological sort. An error message is thrown if a graph contains a cycle. Then s each node in the graph according to the topological order. Each node of the graph must have an attribute called `infer` with a shape inference function, which is a function with one parameter – an instance of the `Node` class. The `infer` attribute is usually set in the operation extractor or when a node is added in some transformation using the Model -Optimizer operation class inherited from `mo.pos.Op` class. Refer to the [Model Optimizer Operation](#extension-operation) +Optimizer operation class inherited from the `mo.pos.Op` class. Refer to the [Model Optimizer Operation](#extension-operation) and [Operation Extractor](#operation-extractor) for more information on how to specify a shape inference function. A shape inference function should calculate an operation (node) output shape(s) based on input shape(s) and operation (node) attribute(s) and update `shape` and optionally `value` attributes of the corresponding data node(s). A simplified example of the shape infer function for the [Reshape](../../../ops/shape/Reshape_1.md) operation (the full version is -available in the file `mo/ops/reshape.py`): +available in the `mo/ops/reshape.py` file): ```py @staticmethod @@ -273,12 +272,12 @@ them. > **NOTE**: There is a legacy approach to read data node attribute like `input_shape = op_node.in_node(0).shape` and > modify data nodes attributes like `op_node.out_node(0).shape = some_value`. This approach is still used in the Model -> Optimizer code but is not recommended. Instead use approach described in the [Ports](#intro-ports). 
+> Optimizer code but is not recommended. Instead, use the approach described in the [Ports](#intro-ports). ### Middle Phase -The middle phase starts after the partial inference. At this phase a graph contains data nodes and output shapes of all -operations in the graph have been calculated. Any transformation implemented at this stage must update `shape` -attribute for all newly added operations. It is highly recommended to use API desribed in the +The middle phase starts after partial inference. At this phase, a graph contains data nodes and output shapes of all +operations in the graph have been calculated. Any transformation implemented at this stage must update the `shape` +attribute for all newly added operations. It is highly recommended to use API described in the [Graph Traversal and Modification Using `Port`s and `Connection`s](#graph-ports-and-conneсtions) because modification of a graph using this API causes automatic re-inference of affected nodes as well as necessary data nodes creation. @@ -290,10 +289,10 @@ There are several middle transformations responsible for changing model layout f are triggered by default for TensorFlow\* models only because it is the only framework with Convolution operations in NHWC layout. -> **NOTE**: If a TensorFlow\* model is in NCHW layout then an user should specify `--disable_nhwc_to_nchw` command line +> **NOTE**: If a TensorFlow\* model is in NCHW layout, you should specify the `--disable_nhwc_to_nchw` command line > parameter to disable these transformations. -The layout change is a complex problem and detailed explanation of it is out of scope of this document. A very brief +The layout change is a complex problem and detailed explanation of it is out of this document scope. A very brief explanation of this process is provided below: 1. Model Optimizer changes output shapes of most of operations producing 4D and 5D (four dimensional and five @@ -313,11 +312,11 @@ Refer to the source code of these transformations for more details on how the la ### Back Phase The back phase starts after the layout change to NCHW. This phase contains mostly the following transformations: -1. Transformations which should be working with a graph in the NCHW layout and thus cannot be implemented in the middle +1. Transformations that should work with a graph in the NCHW layout and thus cannot be implemented in the middle phase. -2. Transformations which replace nodes corresponding to internal Model Optimizer operations with nodes corresponding to +2. Transformations that replace nodes corresponding to internal Model Optimizer operations with nodes corresponding to the [opset](@ref openvino_docs_ops_opset) operations. -3. Transformations which normalize operations inputs according to the specification. +3. Transformations that normalize operations inputs according to the specification. 4. Final optimization transformations. A graph structure during the back phase is the same as during the middle phase. There is no difference in writing middle @@ -330,30 +329,30 @@ More information on how to develop back transformations and dedicated API descri The last phase of a model conversion is the Intermediate Representation emitting. Model Optimizer performs the following steps: -1. Iterates over all operation nodes in the graph and checks that all nodes have attribute `type` set. This attribute -defines the operation type and used in the Inference Engine to instantiate proper operation from the +1. 
Iterates over all operation nodes in the graph and checks that all nodes have the `type` attribute set. This attribute +defines the operation type and is used in the Inference Engine to instantiate proper operation from the [opset](@ref openvino_docs_ops_opset) specified in the `version` attribute of the node. If some node does not have -attribute `type` or its values is equal to `None` then the Model Optimizer exits with an error. +attribute `type` or its values is equal to `None`, the Model Optimizer exits with an error. 2. Performs type inference of graph operations similar to the shape inference. Inferred data types are saved to a port attributes in the IR. 3. Performs topological sort of the graph and changes `id` attribute of all operation nodes to be sequential integer values starting from 0. 4. Saves all Constants values to the `.bin` file. Constants with the same value are shared among different operations. -5. Generates `.xml` file defining a graph structure. The information about operation inputs and outputs are prepared -uniformly for all operations regardless of their type. A list of attributes to be saved to an `.xml` file is defined +5. Generates an `.xml` file defining a graph structure. The information about operation inputs and outputs are prepared +uniformly for all operations regardless of their type. A list of attributes to be saved to the `.xml` file is defined with the `backend_attrs()` or `supported_attrs()` of the `Op` class used for a graph node instantiation. For more -information on how the operation attributes are saved to XML refer to the function `prepare_emit_ir()` in +information on how the operation attributes are saved to XML, refer to the function `prepare_emit_ir()` in the `mo/pipeline/common.py` file and [Model Optimizer Operation](#extension-operation). ## Graph Traversal and Modification Using `Port`s and `Connection`s There are three APIs for a graph traversal and transformation used in the Model Optimizer: -1. The API provided with the `networkx` Python library for the `networkx.MultiDiGraph` class which is the base class for +1. The API provided with the `networkx` Python library for the `networkx.MultiDiGraph` class, which is the base class for the `mo.graph.graph.Graph` object. Refer to the [Model Representation in Memory](#model-representation-in-memory) for more details. For example, the following methods belong to this API level: `graph.add_edges_from([list])`, `graph.add_node(x, attrs)`, `graph.out_edges(node_id)` etc where `graph` is a an instance of the `networkx.MultiDiGraph` class. **This is the lowest-level API and its usage should be avoided in the Model Optimizer transformations**. 2. The API built around the `mo.graph.graph.Node` class. The `Node` class is the primary class to work with graph nodes -and their attributes. **There are some `Node` class methods not recommended to use and some functions defined in the +and their attributes. **There are some `Node` class methods not recommended for use and some functions defined in the `mo.graph.graph` have been deprecated**. Examples of such methods and functions are: `node.in_node(y)`, `node.out_node(x)`, `node.get_outputs()`, `node.insert_node_after(n1, y)`, `create_edge(n1, n2)` etc. Refer to the `mo/graph/graph.py` for more details. @@ -364,24 +363,24 @@ Refer to the `mo/graph/graph.py` for more details. transformations and operations implementation**. 
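To make the difference between these API levels concrete, here is a small, hypothetical sketch that reads the shape of the tensor feeding input port 0 of a node at each level. It assumes a `Graph` object that has already gone through partial inference (so data nodes exist) and a node id `'my_node'` chosen for illustration; the `Node(graph, node_id)` constructor and the port call follow the classes referenced in this document.

```py
from mo.graph.graph import Graph, Node


def read_input_shape(graph: Graph, node_id: str = 'my_node'):
    # Assumption: "graph" went through partial inference, so the data node
    # feeding input port 0 of "node_id" already exists.
    node = Node(graph, node_id)

    # 1. networkx-level API (lowest level, avoid in transformations):
    #    raw access to the edges outgoing from the node.
    raw_out_edges = graph.out_edges(node_id)

    # 2. Node-level API (legacy, not recommended):
    #    read the shape directly from the data node producing the input tensor.
    legacy_shape = node.in_node(0).shape

    # 3. Graph API with ports (recommended, described in the next sections):
    #    the same information through the input port abstraction.
    return node.in_port(0).data.get_shape()
```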
The main benefit of using Model Optimizer Graph API is that it hides some internal implementation details (the fact that -the graph contains data nodes), provides API to perform safe and predictable graph manipulations and adds operation +the graph contains data nodes), provides API to perform safe and predictable graph manipulations, and adds operation semantic to the graph. This is achieved with introduction of concepts of ports and connections. This chapter is dedicated to the Model Optimizer Graph API and does not cover other two non-recommended APIs. ### Ports -An operation semantic describes how many inputs and outputs the operation have. For example, operations +An operation semantic describes how many inputs and outputs the operation has. For example, operations [Parameter](../../../ops/infrastructure/Parameter_1.md) and [Const](../../../ops/infrastructure/Constant_1.md) have no inputs and have one output, operation [ReLU](../../../ops/activation/ReLU_1.md) has one input and one output, operation [Split](../../../ops/movement/Split_1.md) has 2 inputs and variable number of outputs depending on the value of the attribute `num_splits`. Each operation node in the graph (an instance of the `Node` class) has 0 or more input and output ports (instances of -the `mo.graph.port.Port` class). `Port` object has several attributes: +the `mo.graph.port.Port` class). The `Port` object has several attributes: * `node` - the instance of the `Node` object the port belongs to. -* `idx` - the port number. Input and output ports are numbered independently starting from `0`. Thus operation +* `idx` - the port number. Input and output ports are numbered independently starting from `0`. Thus, operation [ReLU](../../../ops/activation/ReLU_1.md) has one input port (with index `0`) and one output port (with index `0`). * `type` - the type of the port. Could be equal to either `"in"` or `"out"`. -* `data` - the object which should be used to get attributes of the corresponding data node. This object has methods +* `data` - the object that should be used to get attributes of the corresponding data node. This object has methods `get_shape()` / `set_shape()` and `get_value()` / `set_value()` to get/set shape/value of the corresponding data node. For example, `in_port.data.get_shape()` returns an input shape of a tensor connected to input port `in_port` (`in_port.type == 'in'`), `out_port.data.get_value()` returns a value of a tensor produced from output port `out_port` @@ -398,42 +397,42 @@ input/output port. Attributes `in_ports_count` and `out_ports_count` of the `Op` class instance define default number of input and output ports to be created for the `Node` . However, additional input/output ports can be added using methods -`add_input_port()` and `add_output_port()`. Port also can be removed using `delete_input_port()` and +`add_input_port()` and `add_output_port()`. Port also can be removed using the `delete_input_port()` and `delete_output_port()` methods. -The `Port` class is just an abstraction which works with edges incoming/outgoing to/from a specific `Node` instance. For +The `Port` class is just an abstraction that works with edges incoming/outgoing to/from a specific `Node` instance. For example, output port with `idx = 1` corresponds to the outgoing edge of a node with an attribute `out = 1`, the input port with `idx = 2` corresponds to the incoming edge of a node with an attribute `in = 2`. 
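As a small illustration of these attributes, the hypothetical helper below inspects and updates tensors through ports. It is a sketch only: it assumes partial inference has already run (so the `data` objects are populated), that output port 1 does not exist yet, and that input port 2 exists, and it uses only the `Port` and `Node` methods listed above.

```py
from mo.graph.graph import Node


def inspect_and_update(node: Node):
    # Shape of the tensor coming into input port 0 of the node.
    input_shape = node.in_port(0).data.get_shape()

    # Value of the tensor produced from output port 0; None if the value
    # could not be computed during constant folding.
    output_value = node.out_port(0).data.get_value()

    # Propagate the input shape to the output tensor, as a shape inference
    # function would do for a shape-preserving operation.
    node.out_port(0).data.set_shape(input_shape)

    # Ports beyond "in_ports_count"/"out_ports_count" can be added or removed
    # (illustrative calls, see the assumptions above).
    node.add_output_port(1)
    node.delete_input_port(2)

    return output_value
```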
-Consider an example of a graph part with 4 operation nodes "Op1", "Op2", "Op3" and "Op4" and a number of data nodes +Consider the example of a graph part with 4 operation nodes "Op1", "Op2", "Op3", and "Op4" and a number of data nodes depicted with light green boxes. ![Ports example 1](../../../img/MO_ports_example_1.png) Operation nodes have input ports (yellow squares) and output ports (light purple squares). Input port may not be connected. For example, the input port 2 of node "Op1" does not have incoming edge, while output port always has an -associated data node (after the partial inference when the data nodes are added to the graph) which may have no +associated data node (after the partial inference when the data nodes are added to the graph), which may have no consumers. Ports can be used to traverse a graph. The method `get_source()` of an input port returns an output port producing the -tensor the input port consumes. It is important that the method works the same during front, middle and back phases of a -model conversion even though the graph structure changes (there is no data nodes in the graph during the front phase). +tensor consumed by the input port. It is important that the method works the same during front, middle and back phases of a +model conversion even though the graph structure changes (there are no data nodes in the graph during the front phase). -Let's assume that there are 4 instances of `Node` object `op1, op2, op3` and `op4` corresponding to nodes "Op1", "Op2", -"Op3" and "Op4" correspondingly. The result of `op2.in_port(0).get_source()` and `op4.in_port(1).get_source()` is the +Let's assume that there are 4 instances of `Node` object `op1, op2, op3`, and `op4` corresponding to nodes "Op1", "Op2", +"Op3", and "Op4", respectively. The result of `op2.in_port(0).get_source()` and `op4.in_port(1).get_source()` is the same object `op1.out_port(1)` of type `Port`. The method `get_destination()` of an output port returns the input port of the node consuming this tensor. If there are -multiple consumers of this tensor then the error is raised. The method `get_destinations()` of an output port returns a +multiple consumers of this tensor, the error is raised. The method `get_destinations()` of an output port returns a list of input ports consuming the tensor. The method `disconnect()` removes a node incoming edge corresponding to the specific input port. The method removes several edges if it is applied during the front phase for a node output port connected with multiple nodes. The method `port.connect(another_port)` connects output port `port` and input port `another_port`. The method handles -situations when the graph contains data nodes (middle and back phases) and not just creates an edge between two nodes -but also automatically creates data node or re-uses existing data node. If the method is used during the front phase and -data nodes do not exist the method creates edge and properly sets `in` and `out` edge attributes. +situations when the graph contains data nodes (middle and back phases) and does not create an edge between two nodes +but also automatically creates data node or reuses existing data node. If the method is used during the front phase and +data nodes do not exist, the method creates edge and properly sets `in` and `out` edge attributes. For example, applying the following two methods to the graph above will result in the graph depicted below: @@ -454,16 +453,16 @@ and source output port producing data. 
So each port is connected with one or mor Model Optimizer uses the `mo.graph.connection.Connection` class to represent a connection. There is only one method `get_connection()` of the `Port` class to get the instance of the corresponding `Connection` -object. If the port is not connected then the returned value is `None`. +object. If the port is not connected, the returned value is `None`. -For example, the method `op3.out_port(0).get_connection()` returns a `Connection` object encapsulating edges from node +For example, the `op3.out_port(0).get_connection()` method returns a `Connection` object encapsulating edges from node "Op3" to data node "data_3_0" and two edges from data node "data_3_0" to two ports of the node "Op4". The `Connection` class provides methods to get source and destination(s) ports the connection corresponds to: * `connection.get_source()` - returns an output `Port` object producing the tensor. * `connection.get_destinations()` - returns a list of input `Port`s consuming the data. -* `connection.get_destination()` - returns a single input `Port` consuming the data. If there are multiple consumers -then the exception is raised. +* `connection.get_destination()` - returns a single input `Port` consuming the data. If there are multiple consumers, +the exception is raised. The `Connection` class provides methods to modify a graph by changing a source or destination(s) of a connection. For example, the function call `op3.out_port(0).get_connection().set_source(op1.out_port(0))` changes source port of edges @@ -472,22 +471,22 @@ below: ![Connection example 1](../../../img/MO_connection_example_1.png) -Another example is the method `connection.set_destination(dest_port)`. It disconnects `dest_port` and all input ports -the connection is currently connected to and connects the connection source port to the `dest_port`. +Another example is the method `connection.set_destination(dest_port)`. It disconnects `dest_port` and all input ports to which +the connection is currently connected and connects the connection source port to `dest_port`. -Note that connection work seamlessly during front, middle and back phases and hides the fact that the graph structure is +Note that connection works seamlessly during front, middle, and back phases and hides the fact that the graph structure is different. > **NOTE**: Refer to the `Connection` class implementation in the `mo/graph/connection.py` for a full list of available methods. ## Model Optimizer Extensions -Model Optimizer extensions allow to inject some logic to the model conversion pipeline without changing the Model +Model Optimizer extensions enable you to inject some logic to the model conversion pipeline without changing the Model Optimizer core code. There are three types of the Model Optimizer extensions: 1. Model Optimizer operation. 2. A framework operation extractor. -3. A model transformation which can be executed during front, middle or back phase of the model conversion. +3. A model transformation, which can be executed during front, middle or back phase of the model conversion. An extension is just a plain text file with a Python code. The file should contain a class (or classes) inherited from one of extension base classes. Extension files should be saved to a directory with the following structure: @@ -509,11 +508,11 @@ Model Optimizer uses the same layout internally to keep built-in extensions. 
The > **NOTE**: The name of a root directory with extensions should not be equal to "extensions" because it will result in a > name collision with the built-in Model Optimizer extensions. -> **NOTE**: Model Optimizer itself is built using these extensions so there are huge number of examples on how to use +> **NOTE**: Model Optimizer itself is built using these extensions so there is a huge number of examples on how to use > them in the Model Optimizer code. ### Model Optimizer Operation -Model Optimizer defines a class `mo.ops.Op` (`Op` will be used later in the document to be short) which is a base class +Model Optimizer defines a class `mo.ops.Op` (`Op` will be used later in the document to be short), which is a base class for an operation used in the Model Optimizer. The instance of the `Op` class serves several purposes: 1. Stores the operation attributes. @@ -525,7 +524,7 @@ graph. It is important to mention that there is no connection between the instance of the `Op` class and the `Node` object created from it. The `Op` class is just an attributes container describing the operation. Model Optimizer uses the `Op` -class during a model conversion to create node of the graph with attributes copied from the `Op` class instance. Graph +class during a model conversion to create a node of the graph with attributes copied from the `Op` class instance. Graph manipulations are performed with graph `Node`s and their attributes and does not involve `Op`s. There are a number of common attributes used in the operations. Here is the list of these attributes with description. @@ -536,19 +535,19 @@ There are a number of common attributes used in the operations. Here is the list * `type` — type of the operation according to the [opset specification](@ref openvino_docs_ops_opset). For the internal Model Optimizer operations this attribute should be set to `None`. The model conversion fails if an operation with `type` equal to `None` comes to the IR emitting phase. **Mandatory**. -* `version` — the operation set (opset) name the operation belongs to. If not specified then the Model Optimizer sets it +* `version` — the operation set (opset) name the operation belongs to. If not specified, the Model Optimizer sets it equal to `experimental`. Refer to [nGraph Basic Concepts](@ref openvino_docs_nGraph_DG_basic_concepts) for more information about operation sets. **Mandatory**. -* `op` — Model Optimizer type of the operation. In many cases the value of `type` is equal to the value of `op`. But -when the Model Optimizer cannot instantiate opset operation during model loading it creates an instance of an internal +* `op` — Model Optimizer type of the operation. In many cases, the value of `type` is equal to the value of `op`. But +when the Model Optimizer cannot instantiate the opset operation during model loading, it creates an instance of an internal operation and the attribute `op` is used as a type of this internal operation. Later in the pipeline the node created from an internal operation will be replaced during front, middle or back phase with node(s) created from the opset. * `infer` — the attribute defines a function calculating output tensor(s) shape and optionally value(s). The attribute may be set to `None` for internal Model Optimizer operations used during the front phase only. Refer to the [Partial Inference](#partial-inference) for more information about the shape inference function. * `type_infer` — the attribute defines a function calculating output tensor(s) data type. 
If the attribute is not -defined then the default function is used. The function checks if the node attribute `data_type` is set and then -propagates this type to the output tensor from the port 0, otherwise it propagates the data type of the tensor coming +defined, the default function is used. The function checks if the node attribute `data_type` is set and then +propagates this type to the output tensor from the port 0; otherwise, it propagates the data type of the tensor coming into the input port 0 to the output tensor from the port 0. * `in_ports_count` — default number of input ports to be created for the operation. Additional ports can be created or redundant ports can be removed using dedicated `Node` class API methods. @@ -556,7 +555,7 @@ redundant ports can be removed using dedicated `Node` class API methods. redundant ports can be removed using dedicated `Node` class API methods. Here is an example of the Model Optimizer class for the operation [SoftMax](../../../ops/activation/SoftMax_1.md) from -the file `mo/ops/softmax.py` with the in code comments. +the `mo/ops/softmax.py` file with the comments in code. ```py class Softmax(Op): @@ -564,7 +563,7 @@ class Softmax(Op): # "Op.get_op_class_by_name()" static method op = 'SoftMax' - # the operation works as an extractor by default. This is a legacy behaviour not recommended for using currently, + # the operation works as an extractor by default. This is a legacy behavior not recommended for use currently, # thus "enabled" class attribute is set to False. The recommended approach is to use dedicated extractor extension enabled = False @@ -611,14 +610,14 @@ example from the `mo/ops/pooling.py` file: ``` The `backend_attrs()` function returns a list of records. A record can be of one of the following formats: -1. A string defining the attribute to be saved to the IR. If the value of the attribute is `None` then the attribute is -not saved. Example of this case are `rounding_type` and `auto_pad`. +1. A string defining the attribute to be saved to the IR. If the value of the attribute is `None`, the attribute is +not saved. Examples of this case are `rounding_type` and `auto_pad`. 2. A tuple where the first element is a string defining the name of the attribute as it will appear in the IR and the second element is a function to produce the value for this attribute. The function gets an instance of the `Node` as the -only parameter and returns a string with the value to be saved to the IR. Example of this case are `strides`, `kernel`, +only parameter and returns a string with the value to be saved to the IR. Examples of this case are `strides`, `kernel`, `pads_begin` and `pads_end`. 3. A tuple where the first element is a string defining the name of the attribute as it will appear in the IR and the -second element is the name of tha `Node` attribute to get the value from. Example of this case are `pool-method` and +second element is the name of the `Node` attribute to get the value from. Examples of this case are `pool-method` and `exclude-pad`. ### Operation Extractor @@ -626,7 +625,7 @@ Model Optimizer runs specific extractor for each operation in the model during t [operations-attributes-extracting](#operations-attributes-extracting) for more information about this process. There are several types of Model Optimizer extractor extensions: -1. The generic one which is described in this section. +1. The generic one, which is described in this section. 2. The special extractor for Caffe\* models with Python layers. 
This kind of extractor is described in the [Extending the Model Optimizer with Caffe* Python Layers](Extending_Model_Optimizer_with_Caffe_Python_Layers.md). 3. The special extractor for MXNet\* models with custom operations. This kind of extractor is described in the @@ -634,9 +633,9 @@ There are several types of Model Optimizer extractor extensions: 4. The special extractor and fallback to Caffe\* for shape inference is described in the [Legacy Mode for Caffe* Custom Layers](Legacy_Mode_for_Caffe_Custom_Layers.md). -This chapter is focused on the option #1 which provides a generic mechanism for the operation extractor applicable for -all frameworks. Model Optimizer provides class `mo.front.extractor.FrontExtractorOp` as a base class to implement the -extractor. It has a class method `extract` which gets the only parameter `Node` which corresponds to the graph node to +This chapter is focused on the option #1, which provides a generic mechanism for the operation extractor applicable for +all frameworks. Model Optimizer provides the `mo.front.extractor.FrontExtractorOp` class as a base class to implement the +extractor. It has a class method `extract`, which gets the only parameter `Node`, which corresponds to the graph node to extract data from. The operation description in the original framework format is stored in the attribute `pb` of the node. The extractor goal is to parse this attribute and save necessary attributes to the corresponding node of the graph. Consider the extractor for the TensorFlow\* operation `Const` (refer to the file @@ -716,7 +715,7 @@ used to parse operation attributes encoded with a framework-specific representat A common practice is to use `update_node_stat()` method of the dedicated `Op` class to update the node attributes. This method does the following: -1. Sets values for common attributes like `op`, `type`, `infer`, `in_ports_count`, `out_ports_count`, `version` etc to +1. Sets values for common attributes like `op`, `type`, `infer`, `in_ports_count`, `out_ports_count`, `version` to values specific to the dedicated operation (`Const` operation in this case). 2. Uses methods `supported_attrs()` and `backend_attrs()` defined in the `Op` class to update specific node attribute `IE`. The IR emitter uses the value stored in the `IE` attribute to pre-process attribute values and save them to IR. @@ -728,11 +727,11 @@ these attributes are parsed from the particular instance of the operation. ### Graph Transformation Extensions Model Optimizer provides various base classes to implement [Front Phase Transformations](#front-phase-transformations), -[Middle Phase Transformations](#middle-phase-transformations) and [Back Phase Transformations](#back-phase-transformations). +[Middle Phase Transformations](#middle-phase-transformations), and [Back Phase Transformations](#back-phase-transformations). All classes have the following common class attributes and methods: 1. Attribute `enabled` specifies whether the transformation is enabled or not. The value can be changed during runtime to enable or disable execution of the transformation during a model conversion. Default value is `True`. -2. Attribute `id` specifies a unique transformation string identifier. This transformation identified can be used to +2. Attribute `id` specifies a unique transformation string identifier. This transformation identifier can be used to enable (disable) the transformation by setting environment variable `MO_ENABLED_TRANSFORMS` (`MO_DISABLED_TRANSFORMS`) with a comma separated list of `id`s. 
The environment variables override the value of the `enabled` attribute of the transformation. Instead of using `id` attribute value you can add fully defined class name to `MO_ENABLED_TRANSFORMS` @@ -747,21 +746,21 @@ graph cleanup removes nodes of the graph not reachable from the model inputs. De input(s) were changed during the transformation or developer can set this attribute manually in the transformation for the specific nodes. Default value is `False`. 5. Attribute `graph_condition` specifies a list of functions with one parameter -- `Graph` object. The transformation -is executed if and only if all functions return `True`. If the attribute is not set then no check is performed. -7. Method `run_before()` returns a list of transformation classes which this transformation should be executed before. -8. Method `run_after()` returns a list of transformation classes which this transformation should be executed after. +is executed if and only if all functions return `True`. If the attribute is not set, no check is performed. +1. Method `run_before()` returns a list of transformation classes which this transformation should be executed before. +2. Method `run_after()` returns a list of transformation classes which this transformation should be executed after. -> **NOTE**: Some of the transformation types have specific class attributes and methods which are explained in the +> **NOTE**: Some of the transformation types have specific class attributes and methods, which are explained in the > corresponding sections of this document. Model Optimizer builds a graph of dependencies between registered transformations and executes them in the topological -order. In order to execute the transformation during a proper model conversion phase the Model Optimizer defines several -anchor transformations which does nothing. All transformations are ordered with respect to these anchor transformations. +order. To execute the transformation during a proper model conversion phase, the Model Optimizer defines several +anchor transformations that do nothing. All transformations are ordered with respect to these anchor transformations. The diagram below shows anchor transformations, some of built-in transformations and dependencies between them: ![Transformations Graph](../../../img/MO_transformations_graph.png) -User defined transformations are executed after corresponding `Start` and before corresponding `Finish` anchor +User-defined transformations are executed after the corresponding `Start` and before the corresponding `Finish` anchor transformations by default (if `run_before()` and `run_after()` methods have not been overridden). > **NOTE**: The `PreMiddleStart` and `PostMiddleStart` anchors were introduced due to historical reasons to refactor @@ -801,10 +800,10 @@ works differently: The sub-graph pattern is defined in the `pattern()` function. This function should return a dictionary with two keys: `nodes` and `edges`: * The value for the `nodes` key is a list of tuples with two elements. - * The first element is an alias name for a node which will be used to define edges between nodes and in the + * The first element is an alias name for a node that will be used to define edges between nodes and in the transformation function. - * The second element is a dictionary with attributes. The key is a name of an attribute which should exist in the - node. 
The value for the attribute can be some specific value to match or a function which gets a single parameter - + * The second element is a dictionary with attributes. The key is a name of an attribute that should exist in the + node. The value for the attribute can be some specific value to match or a function that gets a single parameter - the attribute value from the node. The function should return the result of attribute comparison with a dedicated value. * The value for the `edges` key is a list of tuples with two or three elements. @@ -871,7 +870,7 @@ class MishFusion(FrontReplacementSubgraph): This type of transformation is implemented using `mo.front.common.replacement.FrontReplacementOp` as base class and works the following way. 1. Developer defines an operation type to trigger the transformation. -2. Model Optimizer search for all nodes in the graph with the attribute `op` equal to the specified value. +2. Model Optimizer searches for all nodes in the graph with the attribute `op` equal to the specified value. 3. Model Optimizer executes developer-defined function performing graph transformation for each instance of a matched node. Developer can override different functions in the base transformation class and the Model Optimizer works differently: @@ -921,7 +920,7 @@ class Pack(FrontReplacementOp): ``` ##### Generic Front Phase Transformations -Model Optimizer provides mechanism to implement generic front phase transformation. This type of transformation is +Model Optimizer provides a mechanism to implement generic front phase transformation. This type of transformation is implemented using `mo.front.common.replacement.FrontReplacementSubgraph` or `mo.front.common.replacement.FrontReplacementPattern` as base classes. The only condition to execute the transformation is to check that it is enabled. Then the Model Optimizer executes the method `find_and_replace_pattern(self, graph)` and @@ -968,7 +967,7 @@ class SqueezeNormalize(FrontReplacementPattern): 'attribute'.format(squeeze_node.soft_get('name'))) ``` -Refer to the `mo/front/common/replacement.py` for the implementation details on how these front phase transformations +Refer to `mo/front/common/replacement.py` for the implementation details on how these front phase transformations work. ##### Node Name Pattern Front Phase Transformations @@ -1104,10 +1103,10 @@ for more examples of this type of transformation. ##### Front Phase Transformations Using Start and End Points This type of transformation is implemented using `mo.front.tf.replacement.FrontReplacementFromConfigFileSubGraph` as a base class and works the following way. -1. Developer prepares a JSON configuration file which defines the sub-graph to match using two lists of node names: +1. Developer prepares a JSON configuration file that defines the sub-graph to match using two lists of node names: "start" and "end" nodes. -2. Model Optimizer executes developer-defined transformation **only** when an user specifies the path to the -configuration file using the command line parameter `--transformations_config`.Model Optimizer performs the following +2. Model Optimizer executes developer-defined transformation **only** when a user specifies the path to the +configuration file using the command line parameter `--transformations_config`. Model Optimizer performs the following steps to match the sub-graph: 1. Starts a graph traversal from every start node following the direction of the graph edges. The search stops in an end node or in case of a node without consumers. 
All visited nodes are added to the matched sub-graph. @@ -1115,9 +1114,9 @@ steps to match the sub-graph: "start" list. In this step the edges are traversed in the opposite edge direction. All newly visited nodes are added to the matched sub-graph. This step is needed to add nodes required for calculation values of internal nodes of the matched sub-graph. - 3. Checks that all "end" nodes were reached from "start" nodes. If no then exit with error. + 3. Checks that all "end" nodes were reached from "start" nodes. If no, exits with an error. 4. Check that there are no [Parameter](../../../ops/infrastructure/Parameter_1.md) operations among added nodes. If - they exist then the sub-graph depends on the inputs of the model. Such configuration is considered incorrect so the + they exist, the sub-graph depends on the inputs of the model. Such configuration is considered incorrect so the Model Optimizer exits with an error. This algorithm finds all nodes "between" start and end nodes and nodes needed for calculation of non-input nodes of the @@ -1160,7 +1159,7 @@ The example of a JSON configuration file for a transformation with start and end The format of the file is similar to the one provided as an example in the [Node Name Pattern Front Phase Transformations](#node-name-pattern-front-phase-transformations). There difference is in -the value of the `match_kind` parameter which should be equal to `points` and the format of the `instances` parameter +the value of the `match_kind` parameter, which should be equal to `points` and the format of the `instances` parameter which should be a dictionary with two keys `start_points` and `end_points` defining start and end node names correspondingly. @@ -1168,7 +1167,7 @@ correspondingly. > always equal to `true`. > **NOTE**: This sub-graph match algorithm has a limitation that each start node must have only one input. Therefore, it -> is not possible to specify, for example, [Convolution](../../../ops/convolution/Convolution_1.md) node as input +> is not possible to specify, for example, the [Convolution](../../../ops/convolution/Convolution_1.md) node as input > because it has two inputs: data tensor and tensor with weights. For other examples of transformations with points, please refer to the @@ -1259,7 +1258,7 @@ graph structure changes. Refer to the `extensions/middle/L2NormToNorm.py` for the example of a pattern-defined middle transformation. ##### Generic Middle Phase Transformations -Model Optimizer provides mechanism to implement generic middle phase transformations. This type of transformation is +Model Optimizer provides a mechanism to implement generic middle phase transformations. This type of transformation is implemented using `mo.middle.replacement.MiddleReplacementPattern` as a base class and works similarly to the [Generic Front Phase Transformations](#generic-front-phase-transformations). The only difference is that the transformation entry function name is `find_and_replace_pattern(self, graph: Graph)`. @@ -1290,7 +1289,7 @@ implemented using `mo.back.replacement.BackReplacementPattern` as a base class a Refer to the `extensions/back/GatherNormalizer.py` for the example of a such type of transformation. 
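For orientation, a minimal, hypothetical generic back phase transformation could look like the sketch below. The operation name `MyInternalOp` and the attribute being normalized are invented for illustration, and the `Graph.get_op_nodes()` and `Node.has_valid()` helpers are the ones used by the built-in transformations mentioned in this document; treat it as a skeleton rather than a drop-in extension.

```py
from mo.back.replacement import BackReplacementPattern
from mo.graph.graph import Graph


class HypotheticalAttributeNormalizer(BackReplacementPattern):
    # A generic back phase transformation is executed when "enabled" is True
    # and receives the whole graph, already converted to the NCHW layout.
    enabled = True

    def find_and_replace_pattern(self, graph: Graph):
        # "MyInternalOp" is a placeholder operation type used for illustration.
        for node in graph.get_op_nodes(op='MyInternalOp'):
            # Normalize the node attribute name to the one expected by the
            # operation specification before the IR is emitted.
            if node.has_valid('legacy_axis') and not node.has_valid('axis'):
                node['axis'] = node.legacy_axis
```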
-## See Also +## See Also * [Deep Learning Network Intermediate Representation and Operation Sets in OpenVINO™](../../IR_and_opsets.md) * [Converting a Model to Intermediate Representation (IR)](../convert_model/Converting_Model.md) * [nGraph Basic Concepts](@ref openvino_docs_nGraph_DG_basic_concepts) diff --git a/docs/benchmarks/performance_benchmarks.md b/docs/benchmarks/performance_benchmarks.md index 169c83c9bea..7969b2929ff 100644 --- a/docs/benchmarks/performance_benchmarks.md +++ b/docs/benchmarks/performance_benchmarks.md @@ -1,261 +1,12 @@ -# Get a Deep Learning Model Performance Boost with Intel® Platforms {#openvino_docs_performance_benchmarks} +# Performance Benchmarks {#openvino_docs_performance_benchmarks} -## Increase Performance for Deep Learning Inference +The [Intel® Distribution of OpenVINO™ toolkit](https://software.intel.com/content/www/us/en/develop/tools/openvino-toolkit.html) helps accelerate deep learning inference across a variety of Intel® processors and accelerators. -The [Intel® Distribution of OpenVINO™ toolkit](https://software.intel.com/en-us/openvino-toolkit) helps accelerate deep learning inference across a variety of Intel® processors and accelerators. Rather than a one-size-fits-all solution, Intel offers a powerful portfolio of scalable hardware and software solutions, powered by the Intel® Distribution of OpenVINO™ toolkit, to meet the various performance, power, and price requirements of any use case. The benchmarks below demonstrate high performance gains on several public neural networks for a streamlined, quick deployment on **Intel® CPU and VPU** platforms. Use this data to help you decide which hardware is best for your applications and solutions, or to plan your AI workload on the Intel computing already included in your solutions. +The benchmarks below demonstrate high performance gains on several public neural networks on multiple Intel® CPUs, GPUs and VPUs covering a broad performance range. Use this data to help you decide which hardware is best for your applications and solutions, or to plan your AI workload on the Intel computing already included in your solutions. -Measuring inference performance involves many variables and is extremely use-case and application dependent. We use the below four parameters for measurements, which are key elements to consider for a successful deep learning inference application: +Use the links below to review the benchmarking results for each alternative: -1. **Throughput** - Measures the number of inferences delivered within a latency threshold. (for example, number of Frames Per Second - FPS). When deploying a system with deep learning inference, select the throughput that delivers the best trade-off between latency and power for the price and performance that meets your requirements. -2. **Value** - While throughput is important, what is more critical in edge AI deployments is the performance efficiency or performance-per-cost. Application performance in throughput per dollar of system cost is the best measure of value. -3. **Efficiency** - System power is a key consideration from the edge to the data center. When selecting deep learning solutions, power efficiency (throughput/watt) is a critical factor to consider. Intel designs provide excellent power efficiency for running deep learning workloads. -4. **Latency** - This measures the synchronous execution of inference requests and is reported in milliseconds. 
Each inference request (for example: preprocess, infer, postprocess) is allowed to complete before the next is started. This performance metric is relevant in usage scenarios where a single image input needs to be acted upon as soon as possible. An example would be the healthcare sector where medical personnel only request analysis of a single ultra sound scanning image or in real-time or near real-time applications for example an industrial robot's response to actions in its environment or obstacle avoidance for autonomous vehicles. +* [Intel® Distribution of OpenVINO™ toolkit Benchmark Results](performance_benchmarks_openvino.md) +* [OpenVINO™ Model Server Benchmark Results](performance_benchmarks_ovms.md) -\htmlonly - - - - - - - - - - -\endhtmlonly - - -\htmlonly - -\endhtmlonly - -\htmlonly - -\endhtmlonly - -\htmlonly - -\endhtmlonly - -\htmlonly - -\endhtmlonly - -\htmlonly - -\endhtmlonly - -\htmlonly - -\endhtmlonly - -\htmlonly - -\endhtmlonly - -\htmlonly - -\endhtmlonly - -\htmlonly - -\endhtmlonly - -\htmlonly - -\endhtmlonly - -\htmlonly - -\endhtmlonly - - -\htmlonly - -\endhtmlonly - -\htmlonly - -\endhtmlonly - - -\htmlonly - -\endhtmlonly - -\htmlonly - -\endhtmlonly - - -## Platform Configurations - -Intel® Distribution of OpenVINO™ toolkit performance benchmark numbers are based on release 2021.2. - -Intel technologies’ features and benefits depend on system configuration and may require enabled hardware, software or service activation. Learn more at intel.com, or from the OEM or retailer. Performance results are based on testing as of December 9, 2020 and may not reflect all publicly available updates. See configuration disclosure for details. No product can be absolutely secure. - -Performance varies by use, configuration and other factors. Learn more at [www.intel.com/PerformanceIndex](https://www.intel.com/PerformanceIndex). - -Your costs and results may vary. - -© Intel Corporation. Intel, the Intel logo, and other Intel marks are trademarks of Intel Corporation or its subsidiaries. Other names and brands may be claimed as the property of others. - -Intel optimizations, for Intel compilers or other products, may not optimize to the same degree for non-Intel products. - -Testing by Intel done on: see test date for each HW platform below. - -**CPU Inference Engines** - -| | Intel® Xeon® E-2124G | Intel® Xeon® W1290P | Intel® Xeon® Silver 4216R | -| ------------------------------- | ---------------------- | --------------------------- | ---------------------------- | -| Motherboard | ASUS* WS C246 PRO | ASUS* WS W480-ACE | Intel® Server Board S2600STB | -| CPU | Intel® Xeon® E-2124G CPU @ 3.40GHz | Intel® Xeon® W-1290P CPU @ 3.70GHz | Intel® Xeon® Silver 4216R CPU @ 2.20GHz | -| Hyper Threading | OFF | ON | ON | -| Turbo Setting | ON | ON | ON | -| Memory | 2 x 16 GB DDR4 2666MHz | 4 x 16 GB DDR4 @ 2666MHz |12 x 32 GB DDR4 2666MHz | -| Operating System | Ubuntu* 18.04 LTS | Ubuntu* 18.04 LTS | Ubuntu* 18.04 LTS | -| Kernel Version | 5.3.0-24-generic | 5.3.0-24-generic | 5.3.0-24-generic | -| BIOS Vendor | American Megatrends Inc.* | American Megatrends Inc. | Intel Corporation | -| BIOS Version | 0904 | 607 | SE5C620.86B.02.01.
0009.092820190230 | -| BIOS Release | April 12, 2019 | May 29, 2020 | September 28, 2019 | -| BIOS Settings | Select optimized default settings,
save & exit | Select optimized default settings,<br>save & exit | Select optimized default settings,<br>change power policy<br>to "performance",
save & exit | -| Batch size | 1 | 1 | 1 -| Precision | INT8 | INT8 | INT8 -| Number of concurrent inference requests | 4 | 5 | 32 -| Test Date | December 9, 2020 | December 9, 2020 | December 9, 2020 -| Power dissipation, TDP in Watt | [71](https://ark.intel.com/content/www/us/en/ark/products/134854/intel-xeon-e-2124g-processor-8m-cache-up-to-4-50-ghz.html#tab-blade-1-0-1) | [125](https://ark.intel.com/content/www/us/en/ark/products/199336/intel-xeon-w-1290p-processor-20m-cache-3-70-ghz.html) | [125](https://ark.intel.com/content/www/us/en/ark/products/193394/intel-xeon-silver-4216-processor-22m-cache-2-10-ghz.html#tab-blade-1-0-1) | -| CPU Price on September 29, 2020, USD
Prices may vary | [213](https://ark.intel.com/content/www/us/en/ark/products/134854/intel-xeon-e-2124g-processor-8m-cache-up-to-4-50-ghz.html) | [539](https://ark.intel.com/content/www/us/en/ark/products/199336/intel-xeon-w-1290p-processor-20m-cache-3-70-ghz.html) |[1,002](https://ark.intel.com/content/www/us/en/ark/products/193394/intel-xeon-silver-4216-processor-22m-cache-2-10-ghz.html) | - -**CPU Inference Engines (continue)** - -| | Intel® Xeon® Gold 5218T | Intel® Xeon® Platinum 8270 | -| ------------------------------- | ---------------------------- | ---------------------------- | -| Motherboard | Intel® Server Board S2600STB | Intel® Server Board S2600STB | -| CPU | Intel® Xeon® Gold 5218T CPU @ 2.10GHz | Intel® Xeon® Platinum 8270 CPU @ 2.70GHz | -| Hyper Threading | ON | ON | -| Turbo Setting | ON | ON | -| Memory | 12 x 32 GB DDR4 2666MHz | 12 x 32 GB DDR4 2933MHz | -| Operating System | Ubuntu* 18.04 LTS | Ubuntu* 18.04 LTS | -| Kernel Version | 5.3.0-24-generic | 5.3.0-24-generic | -| BIOS Vendor | Intel Corporation | Intel Corporation | -| BIOS Version | SE5C620.86B.02.01.
0009.092820190230 | SE5C620.86B.02.01.
0009.092820190230 | -| BIOS Release | September 28, 2019 | September 28, 2019 | -| BIOS Settings | Select optimized default settings,
change power policy to "performance",
save & exit | Select optimized default settings,
change power policy to "performance",
save & exit | -| Batch size | 1 | 1 | -| Precision | INT8 | INT8 | -| Number of concurrent inference requests |32 | 52 | -| Test Date | December 9, 2020 | December 9, 2020 | -| Power dissipation, TDP in Watt | [105](https://ark.intel.com/content/www/us/en/ark/products/193953/intel-xeon-gold-5218t-processor-22m-cache-2-10-ghz.html#tab-blade-1-0-1) | [205](https://ark.intel.com/content/www/us/en/ark/products/192482/intel-xeon-platinum-8270-processor-35-75m-cache-2-70-ghz.html#tab-blade-1-0-1) | -| CPU Price on September 29, 2020, USD
Prices may vary | [1,349](https://ark.intel.com/content/www/us/en/ark/products/193953/intel-xeon-gold-5218t-processor-22m-cache-2-10-ghz.html) | [7,405](https://ark.intel.com/content/www/us/en/ark/products/192482/intel-xeon-platinum-8270-processor-35-75m-cache-2-70-ghz.html) | - - -**CPU Inference Engines (continue)** - -| | Intel® Core™ i7-8700T | Intel® Core™ i9-10920X | Intel® Core™ i9-10900TE
(iEi Flex BX210AI)| 11th Gen Intel® Core™ i7-1185G7 | -| -------------------- | ----------------------------------- |--------------------------------------| ---------------------------------------------|---------------------------------| -| Motherboard | GIGABYTE* Z370M DS3H-CF | ASUS* PRIME X299-A II | iEi / B595 | Intel Corporation
internal/Reference
Validation Platform | -| CPU | Intel® Core™ i7-8700T CPU @ 2.40GHz | Intel® Core™ i9-10920X CPU @ 3.50GHz | Intel® Core™ i9-10900TE CPU @ 1.80GHz | 11th Gen Intel® Core™ i7-1185G7 @ 3.00GHz | -| Hyper Threading | ON | ON | ON | ON | -| Turbo Setting | ON | ON | ON | ON | -| Memory | 4 x 16 GB DDR4 2400MHz | 4 x 16 GB DDR4 2666MHz | 2 x 8 GB DDR4 @ 2400MHz | 2 x 8 GB DDR4 3200MHz | -| Operating System | Ubuntu* 18.04 LTS | Ubuntu* 18.04 LTS | Ubuntu* 18.04 LTS | Ubuntu* 18.04 LTS | -| Kernel Version | 5.3.0-24-generic | 5.3.0-24-generic | 5.8.0-05-generic | 5.8.0-05-generic | -| BIOS Vendor | American Megatrends Inc.* | American Megatrends Inc.* | American Megatrends Inc.* | Intel Corporation | -| BIOS Version | F11 | 505 | Z667AR10 | TGLSFWI1.R00.3425.
A00.2010162309 | -| BIOS Release | March 13, 2019 | December 17, 2019 | July 15, 2020 | October 16, 2020 | -| BIOS Settings | Select optimized default settings,
set OS type to "other",
save & exit | Default Settings | Default Settings | Default Settings | -| Batch size | 1 | 1 | 1 | 1 | -| Precision | INT8 | INT8 | INT8 | INT8 | -| Number of concurrent inference requests |4 | 24 | 5 | 4 | -| Test Date | December 9, 2020 | December 9, 2020 | December 9, 2020 | December 9, 2020 | -| Power dissipation, TDP in Watt | [35](https://ark.intel.com/content/www/us/en/ark/products/129948/intel-core-i7-8700t-processor-12m-cache-up-to-4-00-ghz.html#tab-blade-1-0-1) | [165](https://ark.intel.com/content/www/us/en/ark/products/198012/intel-core-i9-10920x-x-series-processor-19-25m-cache-3-50-ghz.html) | [35](https://ark.intel.com/content/www/us/en/ark/products/203901/intel-core-i9-10900te-processor-20m-cache-up-to-4-60-ghz.html) | [28](https://ark.intel.com/content/www/us/en/ark/products/208664/intel-core-i7-1185g7-processor-12m-cache-up-to-4-80-ghz-with-ipu.html#tab-blade-1-0-1) | -| CPU Price on September 29, 2020, USD
Prices may vary | [303](https://ark.intel.com/content/www/us/en/ark/products/129948/intel-core-i7-8700t-processor-12m-cache-up-to-4-00-ghz.html) | [700](https://ark.intel.com/content/www/us/en/ark/products/198012/intel-core-i9-10920x-x-series-processor-19-25m-cache-3-50-ghz.html) | [444](https://ark.intel.com/content/www/us/en/ark/products/203901/intel-core-i9-10900te-processor-20m-cache-up-to-4-60-ghz.html) | [426](https://ark.intel.com/content/www/us/en/ark/products/208664/intel-core-i7-1185g7-processor-12m-cache-up-to-4-80-ghz-with-ipu.html#tab-blade-1-0-0) | - - -**CPU Inference Engines (continue)** - -| | Intel® Core™ i5-8500 | Intel® Core™ i5-10500TE | Intel® Core™ i5-10500TE
(iEi Flex-BX210AI)| -| -------------------- | ---------------------------------- | ----------------------------------- |-------------------------------------- | -| Motherboard | ASUS* PRIME Z370-A | GIGABYTE* Z490 AORUS PRO AX | iEi / B595 | -| CPU | Intel® Core™ i5-8500 CPU @ 3.00GHz | Intel® Core™ i5-10500TE CPU @ 2.30GHz | Intel® Core™ i5-10500TE CPU @ 2.30GHz | -| Hyper Threading | OFF | ON | ON | -| Turbo Setting | ON | ON | ON | -| Memory | 2 x 16 GB DDR4 2666MHz | 2 x 16 GB DDR4 @ 2666MHz | 1 x 8 GB DDR4 @ 2400MHz | -| Operating System | Ubuntu* 18.04 LTS | Ubuntu* 18.04 LTS | Ubuntu* 18.04 LTS | -| Kernel Version | 5.3.0-24-generic | 5.3.0-24-generic | 5.3.0-24-generic | -| BIOS Vendor | American Megatrends Inc.* | American Megatrends Inc.* | American Megatrends Inc.* | -| BIOS Version | 2401 | F3 | Z667AR10 | -| BIOS Release | July 12, 2019 | March 25, 2020 | July 17, 2020 | -| BIOS Settings | Select optimized default settings,
save & exit | Select optimized default settings,
set OS type to "other",
save & exit | Default Settings | -| Batch size | 1 | 1 | 1 | -| Precision | INT8 | INT8 | INT8 | -| Number of concurrent inference requests | 3 | 4 | 4 | -| Test Date | December 9, 2020 | December 9, 2020 | December 9, 2020 | -| Power dissipation, TDP in Watt | [65](https://ark.intel.com/content/www/us/en/ark/products/129939/intel-core-i5-8500-processor-9m-cache-up-to-4-10-ghz.html#tab-blade-1-0-1)| [35](https://ark.intel.com/content/www/us/en/ark/products/203891/intel-core-i5-10500te-processor-12m-cache-up-to-3-70-ghz.html) | [35](https://ark.intel.com/content/www/us/en/ark/products/203891/intel-core-i5-10500te-processor-12m-cache-up-to-3-70-ghz.html) | -| CPU Price on September 29, 2020, USD
Prices may vary | [192](https://ark.intel.com/content/www/us/en/ark/products/129939/intel-core-i5-8500-processor-9m-cache-up-to-4-10-ghz.html) | [195](https://ark.intel.com/content/www/us/en/ark/products/203891/intel-core-i5-10500te-processor-12m-cache-up-to-3-70-ghz.html) | [195](https://ark.intel.com/content/www/us/en/ark/products/203891/intel-core-i5-10500te-processor-12m-cache-up-to-3-70-ghz.html) | - - -**CPU Inference Engines (continue)** - -| | Intel Atom® x5-E3940 | Intel® Core™ i3-8100 | -| -------------------- | ---------------------------------- |----------------------------------- | -| Motherboard | | GIGABYTE* Z390 UD | -| CPU | Intel Atom® Processor E3940 @ 1.60GHz | Intel® Core™ i3-8100 CPU @ 3.60GHz | -| Hyper Threading | OFF | OFF | -| Turbo Setting | ON | OFF | -| Memory | 1 x 8 GB DDR3 1600MHz | 4 x 8 GB DDR4 2400MHz | -| Operating System | Ubuntu* 18.04 LTS | Ubuntu* 18.04 LTS | -| Kernel Version | 5.3.0-24-generic | 5.3.0-24-generic | -| BIOS Vendor | American Megatrends Inc.* | American Megatrends Inc.* | -| BIOS Version | 5.12 | F8 | -| BIOS Release | September 6, 2017 | May 24, 2019 | -| BIOS Settings | Default settings | Select optimized default settings,
set OS type to "other",
save & exit | -| Batch size | 1 | 1 | -| Precision | INT8 | INT8 | -| Number of concurrent inference requests | 4 | 4 | -| Test Date | December 9, 2020 | December 9, 2020 | -| Power dissipation, TDP in Watt | [9.5](https://ark.intel.com/content/www/us/en/ark/products/96485/intel-atom-x5-e3940-processor-2m-cache-up-to-1-80-ghz.html) | [65](https://ark.intel.com/content/www/us/en/ark/products/126688/intel-core-i3-8100-processor-6m-cache-3-60-ghz.html#tab-blade-1-0-1)| -| CPU Price on September 29, 2020, USD
Prices may vary | [34](https://ark.intel.com/content/www/us/en/ark/products/96485/intel-atom-x5-e3940-processor-2m-cache-up-to-1-80-ghz.html) | [117](https://ark.intel.com/content/www/us/en/ark/products/126688/intel-core-i3-8100-processor-6m-cache-3-60-ghz.html) | - - - -**Accelerator Inference Engines** - -| | Intel® Neural Compute Stick 2 | Intel® Vision Accelerator Design
with Intel® Movidius™ VPUs (Mustang-V100-MX8) | -| --------------------------------------- | ------------------------------------- | ------------------------------------- | -| VPU | 1 X Intel® Movidius™ Myriad™ X MA2485 | 8 X Intel® Movidius™ Myriad™ X MA2485 | -| Connection | USB 2.0/3.0 | PCIe X4 | -| Batch size | 1 | 1 | -| Precision | FP16 | FP16 | -| Number of concurrent inference requests | 4 | 32 | -| Power dissipation, TDP in Watt | 2.5 | [30](https://www.mouser.com/ProductDetail/IEI/MUSTANG-V100-MX8-R10?qs=u16ybLDytRaZtiUUvsd36w%3D%3D) | -| CPU Price, USD
Prices may vary | [69](https://ark.intel.com/content/www/us/en/ark/products/140109/intel-neural-compute-stick-2.html) (from December 9, 2020) | [214](https://www.arrow.com/en/products/mustang-v100-mx8-r10/iei-technology?gclid=Cj0KCQiA5bz-BRD-ARIsABjT4ng1v1apmxz3BVCPA-tdIsOwbEjTtqnmp_rQJGMfJ6Q2xTq6ADtf9OYaAhMUEALw_wcB) (from December 9, 2020) | -| Host Computer | Intel® Core™ i7 | Intel® Core™ i5 | -| Motherboard | ASUS* Z370-A II | Uzelinfo* / US-E1300 | -| CPU | Intel® Core™ i7-8700 CPU @ 3.20GHz | Intel® Core™ i5-6600 CPU @ 3.30GHz | -| Hyper Threading | ON | OFF | -| Turbo Setting | ON | ON | -| Memory | 4 x 16 GB DDR4 2666MHz | 2 x 16 GB DDR4 2400MHz | -| Operating System | Ubuntu* 18.04 LTS | Ubuntu* 18.04 LTS | -| Kernel Version | 5.0.0-23-generic | 5.0.0-23-generic | -| BIOS Vendor | American Megatrends Inc.* | American Megatrends Inc.* | -| BIOS Version | 411 | 5.12 | -| BIOS Release | September 21, 2018 | September 21, 2018 | -| Test Date | December 9, 2020 | December 9, 2020 | - -Please follow this link for more detailed configuration descriptions: [Configuration Details](https://docs.openvinotoolkit.org/resources/benchmark_files/system_configurations_2021.2.html) - -\htmlonly - -
-

-\endhtmlonly -Results may vary. For workloads and configurations visit: [www.intel.com/PerformanceIndex](https://www.intel.com/PerformanceIndex) and [Legal Information](../Legal_Information.md). -\htmlonly -

-
-\endhtmlonly +Performance for a particular application can also be evaluated virtually using [Intel® DevCloud for the Edge](https://devcloud.intel.com/edge/), a remote development environment with access to Intel® hardware and the latest versions of the Intel® Distribution of the OpenVINO™ Toolkit. [Learn more](https://devcloud.intel.com/edge/get_started/devcloud/) or [Register here](https://inteliot.force.com/DevcloudForEdge/s/). diff --git a/docs/benchmarks/performance_benchmarks_faq.md b/docs/benchmarks/performance_benchmarks_faq.md index 9b26dd57366..f48c3cf38fd 100644 --- a/docs/benchmarks/performance_benchmarks_faq.md +++ b/docs/benchmarks/performance_benchmarks_faq.md @@ -39,8 +39,10 @@ The image size used in the inference depends on the network being benchmarked. T | [squeezenet1.1-CF](https://github.com/opencv/open_model_zoo/tree/master/models/public/squeezenet1.1) | SqueezeNet_v1.1_ILSVRC-2012_Caffe | classification | 227x227 | | [ssd300-CF](https://github.com/opencv/open_model_zoo/tree/master/models/public/ssd300) | SSD (VGG-16)_VOC-2007_Caffe | object detection | 300x300 | | [yolo_v3-TF](https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/public/yolo-v3-tf) | TF Keras YOLO v3 Modelset | object detection | 300x300 | +| [yolo_v4-TF](https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/public/yolo-v4-tf) | Yolo-V4 TF | object detection | 608x608 | | [ssd_mobilenet_v1_coco-TF](https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/public/ssd_mobilenet_v1_coco) | ssd_mobilenet_v1_coco | object detection | 300x300 | | [ssdlite_mobilenet_v2-TF](https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/public/ssdlite_mobilenet_v2) | ssd_mobilenet_v2 | object detection | 300x300 | +| [unet-camvid-onnx-0001](https://github.com/openvinotoolkit/open_model_zoo/blob/master/models/intel/unet-camvid-onnx-0001/description/unet-camvid-onnx-0001.md) | U-Net | semantic segmentation | 368x480 | #### 7. Where can I purchase the specific hardware used in the benchmarking? Intel partners with various vendors all over the world. Visit the [Intel® AI: In Production Partners & Solutions Catalog](https://www.intel.com/content/www/us/en/internet-of-things/ai-in-production/partners-solutions-catalog.html) for a list of Equipment Makers and the [Supported Devices](../IE_DG/supported_plugins/Supported_Devices.md) documentation. You can also remotely test and run models before purchasing any hardware by using [Intel® DevCloud for the Edge](http://devcloud.intel.com/edge/). diff --git a/docs/benchmarks/performance_benchmarks_openvino.md b/docs/benchmarks/performance_benchmarks_openvino.md new file mode 100644 index 00000000000..456f593db14 --- /dev/null +++ b/docs/benchmarks/performance_benchmarks_openvino.md @@ -0,0 +1,272 @@ +# Intel® Distribution of OpenVINO™ toolkit Benchmark Results {#openvino_docs_performance_benchmarks_openvino} + +This benchmark setup includes a single machine on which both the benchmark application and the OpenVINO™ installation reside. + +The benchmark application loads the Inference Engine (SW) at run time and executes inferences on the specified hardware inference engine, (CPU, GPU or VPU). The benchmark application measures the time spent on actual inferencing (excluding any pre or post processing) and then reports on the inferences per second (or Frames Per Second). For more information on the benchmark application, please also refer to the entry 5 of the [FAQ section](performance_benchmarks_faq.md). 
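As a rough illustration of how such a measurement is launched, the benchmark application can be run from the command line as sketched below. The environment script path matches a default 2021 installation layout, while the model file name, target device, and request/batch/duration values are placeholders rather than the exact settings used for the published numbers.

```bash
# Illustrative only: measure throughput and latency for one model on one device.
# -api async together with -nireq sets the number of concurrent inference requests,
# -b is the batch size and -t is the measurement duration in seconds.
source /opt/intel/openvino_2021/bin/setupvars.sh
./benchmark_app -m <path_to_model>/resnet-50-tf.xml -d CPU -api async -nireq 4 -b 1 -t 60
```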
+ +Devices similar to the ones we have used for benchmarking can be accessed using [Intel® DevCloud for the Edge](https://devcloud.intel.com/edge/), a remote development environment with access to Intel® hardware and the latest versions of the Intel® Distribution of the OpenVINO™ Toolkit. [Learn more](https://devcloud.intel.com/edge/get_started/devcloud/) or [Register here](https://inteliot.force.com/DevcloudForEdge/s/). + +Measuring inference performance involves many variables and is extremely use-case and application dependent. We use the below four parameters for measurements, which are key elements to consider for a successful deep learning inference application: + +- **Throughput** - Measures the number of inferences delivered within a latency threshold. (for example, number of Frames Per Second - FPS). When deploying a system with deep learning inference, select the throughput that delivers the best trade-off between latency and power for the price and performance that meets your requirements. +- **Value** - While throughput is important, what is more critical in edge AI deployments is the performance efficiency or performance-per-cost. Application performance in throughput per dollar of system cost is the best measure of value. +- **Efficiency** - System power is a key consideration from the edge to the data center. When selecting deep learning solutions, power efficiency (throughput/watt) is a critical factor to consider. Intel designs provide excellent power efficiency for running deep learning workloads. +- **Latency** - This measures the synchronous execution of inference requests and is reported in milliseconds. Each inference request (for example: preprocess, infer, postprocess) is allowed to complete before the next is started. This performance metric is relevant in usage scenarios where a single image input needs to be acted upon as soon as possible. An example would be the healthcare sector where medical personnel only request analysis of a single ultra sound scanning image or in real-time or near real-time applications for example an industrial robot's response to actions in its environment or obstacle avoidance for autonomous vehicles. + + +\htmlonly + + + + + + + + + + +\endhtmlonly + + +\htmlonly + +\endhtmlonly + +\htmlonly + +\endhtmlonly + +\htmlonly + +\endhtmlonly + +\htmlonly + +\endhtmlonly + +\htmlonly + +\endhtmlonly + +\htmlonly + +\endhtmlonly + +\htmlonly + +\endhtmlonly + +\htmlonly + +\endhtmlonly + +\htmlonly + +\endhtmlonly + +\htmlonly + +\endhtmlonly + +\htmlonly + +\endhtmlonly + + +\htmlonly + +\endhtmlonly + +\htmlonly + +\endhtmlonly + + +\htmlonly + +\endhtmlonly + +\htmlonly + +\endhtmlonly + +\htmlonly + +\endhtmlonly + +\htmlonly + +\endhtmlonly + + +## Platform Configurations + +Intel® Distribution of OpenVINO™ toolkit performance benchmark numbers are based on release 2021.3. + +Intel technologies’ features and benefits depend on system configuration and may require enabled hardware, software or service activation. Learn more at intel.com, or from the OEM or retailer. Performance results are based on testing as of March 15, 2021 and may not reflect all publicly available updates. See configuration disclosure for details. No product can be absolutely secure. + +Performance varies by use, configuration and other factors. Learn more at [www.intel.com/PerformanceIndex](https://www.intel.com/PerformanceIndex). + +Your costs and results may vary. + +© Intel Corporation. 
Intel, the Intel logo, and other Intel marks are trademarks of Intel Corporation or its subsidiaries. Other names and brands may be claimed as the property of others. + +Intel optimizations, for Intel compilers or other products, may not optimize to the same degree for non-Intel products. + +Testing by Intel done on: see test date for each HW platform below. + +**CPU Inference Engines** + +| | Intel® Xeon® E-2124G | Intel® Xeon® W1290P | Intel® Xeon® Silver 4216R | +| ------------------------------- | ---------------------- | --------------------------- | ---------------------------- | +| Motherboard | ASUS* WS C246 PRO | ASUS* WS W480-ACE | Intel® Server Board S2600STB | +| CPU | Intel® Xeon® E-2124G CPU @ 3.40GHz | Intel® Xeon® W-1290P CPU @ 3.70GHz | Intel® Xeon® Silver 4216R CPU @ 2.20GHz | +| Hyper Threading | OFF | ON | ON | +| Turbo Setting | ON | ON | ON | +| Memory | 2 x 16 GB DDR4 2666MHz | 4 x 16 GB DDR4 @ 2666MHz |12 x 32 GB DDR4 2666MHz | +| Operating System | Ubuntu* 18.04 LTS | Ubuntu* 18.04 LTS | Ubuntu* 18.04 LTS | +| Kernel Version | 5.3.0-24-generic | 5.3.0-24-generic | 5.3.0-24-generic | +| BIOS Vendor | American Megatrends Inc.* | American Megatrends Inc. | Intel Corporation | +| BIOS Version | 0904 | 607 | SE5C620.86B.02.01.
0009.092820190230 | +| BIOS Release | April 12, 2019 | May 29, 2020 | September 28, 2019 | +| BIOS Settings | Select optimized default settings,
save & exit | Select optimized default settings,
save & exit | Select optimized default settings,
change power policy
to "performance",
save & exit | +| Batch size | 1 | 1 | 1 | +| Precision | INT8 | INT8 | INT8 | +| Number of concurrent inference requests | 4 | 5 | 32 | +| Test Date | March 15, 2021 | March 15, 2021 | March 15, 2021 | +| Power dissipation, TDP in Watt | [71](https://ark.intel.com/content/www/us/en/ark/products/134854/intel-xeon-e-2124g-processor-8m-cache-up-to-4-50-ghz.html#tab-blade-1-0-1) | [125](https://ark.intel.com/content/www/us/en/ark/products/199336/intel-xeon-w-1290p-processor-20m-cache-3-70-ghz.html) | [125](https://ark.intel.com/content/www/us/en/ark/products/193394/intel-xeon-silver-4216-processor-22m-cache-2-10-ghz.html#tab-blade-1-0-1) | +| CPU Price on March 15th, 2021, USD
Prices may vary | [213](https://ark.intel.com/content/www/us/en/ark/products/134854/intel-xeon-e-2124g-processor-8m-cache-up-to-4-50-ghz.html) | [539](https://ark.intel.com/content/www/us/en/ark/products/199336/intel-xeon-w-1290p-processor-20m-cache-3-70-ghz.html) |[1,002](https://ark.intel.com/content/www/us/en/ark/products/193394/intel-xeon-silver-4216-processor-22m-cache-2-10-ghz.html) | + +**CPU Inference Engines (continue)** + +| | Intel® Xeon® Gold 5218T | Intel® Xeon® Platinum 8270 | Intel® Xeon® Platinum 8380 | +| ------------------------------- | ---------------------------- | ---------------------------- | -----------------------------------------| +| Motherboard | Intel® Server Board S2600STB | Intel® Server Board S2600STB | Intel Corporation / WilsonCity | +| CPU | Intel® Xeon® Gold 5218T CPU @ 2.10GHz | Intel® Xeon® Platinum 8270 CPU @ 2.70GHz | Intel® Xeon® Platinum 8380 CPU @ 2.30GHz | +| Hyper Threading | ON | ON | ON | +| Turbo Setting | ON | ON | ON | +| Memory | 12 x 32 GB DDR4 2666MHz | 12 x 32 GB DDR4 2933MHz | 16 x 16 GB DDR4 3200MHz | +| Operating System | Ubuntu* 18.04 LTS | Ubuntu* 18.04 LTS | Ubuntu* 18.04 LTS | +| Kernel Version | 5.3.0-24-generic | 5.3.0-24-generic | 5.3.0-24-generic | +| BIOS Vendor | Intel Corporation | Intel Corporation | Intel Corporation | +| BIOS Version | SE5C620.86B.02.01.
0009.092820190230 | SE5C620.86B.02.01.
0009.092820190230 | WLYDCRB1.SYS.0020.
P86.2103050636 | +| BIOS Release | September 28, 2019 | September 28, 2019 | March 5, 2021 | +| BIOS Settings | Select optimized default settings,
change power policy to "performance",
save & exit | Select optimized default settings,
change power policy to "performance",
save & exit | Select optimized default settings,
change power policy to "performance",
save & exit | +| Batch size | 1 | 1 | 1 | +| Precision | INT8 | INT8 | INT8 | +| Number of concurrent inference requests |32 | 52 | 80 | +| Test Date | March 15, 2021 | March 15, 2021 | March 22, 2021 | +| Power dissipation, TDP in Watt | [105](https://ark.intel.com/content/www/us/en/ark/products/193953/intel-xeon-gold-5218t-processor-22m-cache-2-10-ghz.html#tab-blade-1-0-1) | [205](https://ark.intel.com/content/www/us/en/ark/products/192482/intel-xeon-platinum-8270-processor-35-75m-cache-2-70-ghz.html#tab-blade-1-0-1) | [270](https://ark.intel.com/content/www/us/en/ark/products/212287/intel-xeon-platinum-8380-processor-60m-cache-2-30-ghz.html) | +| CPU Price, USD
Prices may vary | [1,349](https://ark.intel.com/content/www/us/en/ark/products/193953/intel-xeon-gold-5218t-processor-22m-cache-2-10-ghz.html) (on March 15th, 2021) | [7,405](https://ark.intel.com/content/www/us/en/ark/products/192482/intel-xeon-platinum-8270-processor-35-75m-cache-2-70-ghz.html) (on March 15th, 2021) | [8,099](https://ark.intel.com/content/www/us/en/ark/products/212287/intel-xeon-platinum-8380-processor-60m-cache-2-30-ghz.html) (on March 26th, 2021) | + + +**CPU Inference Engines (continue)** + +| | Intel® Core™ i7-8700T | Intel® Core™ i9-10920X | 11th Gen Intel® Core™ i7-1185G7 | +| -------------------- | ----------------------------------- |--------------------------------------| --------------------------------| +| Motherboard | GIGABYTE* Z370M DS3H-CF | ASUS* PRIME X299-A II | Intel Corporation
internal/Reference
Validation Platform | +| CPU | Intel® Core™ i7-8700T CPU @ 2.40GHz | Intel® Core™ i9-10920X CPU @ 3.50GHz | 11th Gen Intel® Core™ i7-1185G7 @ 3.00GHz | +| Hyper Threading | ON | ON | ON | +| Turbo Setting | ON | ON | ON | +| Memory | 4 x 16 GB DDR4 2400MHz | 4 x 16 GB DDR4 2666MHz | 2 x 8 GB DDR4 3200MHz | +| Operating System | Ubuntu* 18.04 LTS | Ubuntu* 18.04 LTS | Ubuntu* 18.04 LTS | +| Kernel Version | 5.3.0-24-generic | 5.3.0-24-generic | 5.8.0-05-generic | +| BIOS Vendor | American Megatrends Inc.* | American Megatrends Inc.* | Intel Corporation | +| BIOS Version | F11 | 505 | TGLSFWI1.R00.3425.
A00.2010162309 | +| BIOS Release | March 13, 2019 | December 17, 2019 | October 16, 2020 | +| BIOS Settings | Select optimized default settings,
set OS type to "other",
save & exit | Default Settings | Default Settings | +| Batch size | 1 | 1 | 1 | +| Precision | INT8 | INT8 | INT8 | +| Number of concurrent inference requests | 4 | 24 | 4 | +| Test Date | March 15, 2021 | March 15, 2021 | March 15, 2021 | +| Power dissipation, TDP in Watt | [35](https://ark.intel.com/content/www/us/en/ark/products/129948/intel-core-i7-8700t-processor-12m-cache-up-to-4-00-ghz.html#tab-blade-1-0-1) | [165](https://ark.intel.com/content/www/us/en/ark/products/198012/intel-core-i9-10920x-x-series-processor-19-25m-cache-3-50-ghz.html) | [28](https://ark.intel.com/content/www/us/en/ark/products/208664/intel-core-i7-1185g7-processor-12m-cache-up-to-4-80-ghz-with-ipu.html#tab-blade-1-0-1) | +| CPU Price on March 15th, 2021, USD
Prices may vary | [303](https://ark.intel.com/content/www/us/en/ark/products/129948/intel-core-i7-8700t-processor-12m-cache-up-to-4-00-ghz.html) | [700](https://ark.intel.com/content/www/us/en/ark/products/198012/intel-core-i9-10920x-x-series-processor-19-25m-cache-3-50-ghz.html) | [426](https://ark.intel.com/content/www/us/en/ark/products/208664/intel-core-i7-1185g7-processor-12m-cache-up-to-4-80-ghz-with-ipu.html#tab-blade-1-0-0) | + + +**CPU Inference Engines (continue)** + +| | Intel® Core™ i5-8500 | Intel® Core™ i5-10500TE | +| -------------------- | ---------------------------------- | ----------------------------------- | +| Motherboard | ASUS* PRIME Z370-A | GIGABYTE* Z490 AORUS PRO AX | +| CPU | Intel® Core™ i5-8500 CPU @ 3.00GHz | Intel® Core™ i5-10500TE CPU @ 2.30GHz | +| Hyper Threading | OFF | ON | +| Turbo Setting | ON | ON | +| Memory | 2 x 16 GB DDR4 2666MHz | 2 x 16 GB DDR4 @ 2666MHz | +| Operating System | Ubuntu* 18.04 LTS | Ubuntu* 18.04 LTS | +| Kernel Version | 5.3.0-24-generic | 5.3.0-24-generic | +| BIOS Vendor | American Megatrends Inc.* | American Megatrends Inc.* | +| BIOS Version | 2401 | F3 | +| BIOS Release | July 12, 2019 | March 25, 2020 | +| BIOS Settings | Select optimized default settings,
save & exit | Select optimized default settings,
set OS type to "other",
save & exit | +| Batch size | 1 | 1 | +| Precision | INT8 | INT8 | +| Number of concurrent inference requests | 3 | 4 | +| Test Date | March 15, 2021 | March 15, 2021 | +| Power dissipation, TDP in Watt | [65](https://ark.intel.com/content/www/us/en/ark/products/129939/intel-core-i5-8500-processor-9m-cache-up-to-4-10-ghz.html#tab-blade-1-0-1) | [35](https://ark.intel.com/content/www/us/en/ark/products/203891/intel-core-i5-10500te-processor-12m-cache-up-to-3-70-ghz.html) | +| CPU Price on March 15th, 2021, USD
Prices may vary | [192](https://ark.intel.com/content/www/us/en/ark/products/129939/intel-core-i5-8500-processor-9m-cache-up-to-4-10-ghz.html) | [195](https://ark.intel.com/content/www/us/en/ark/products/203891/intel-core-i5-10500te-processor-12m-cache-up-to-3-70-ghz.html) | + + +**CPU Inference Engines (continue)** + +| | Intel Atom® x5-E3940 | Intel Atom® x6425RE | Intel® Core™ i3-8100 | +| -------------------- | --------------------------------------|------------------------------- |----------------------------------- | +| Motherboard | | Intel Corporation /
ElkhartLake LPDDR4x T3 CRB | GIGABYTE* Z390 UD | +| CPU | Intel Atom® Processor E3940 @ 1.60GHz | Intel Atom® x6425RE
Processor @ 1.90GHz | Intel® Core™ i3-8100 CPU @ 3.60GHz | +| Hyper Threading | OFF | OFF | OFF | +| Turbo Setting | ON | ON | OFF | +| Memory | 1 x 8 GB DDR3 1600MHz | 2 x 4GB DDR4 3200 MHz | 4 x 8 GB DDR4 2400MHz | +| Operating System | Ubuntu* 18.04 LTS | Ubuntu* 18.04 LTS | Ubuntu* 18.04 LTS | +| Kernel Version | 5.3.0-24-generic | 5.8.0-050800-generic | 5.3.0-24-generic | +| BIOS Vendor | American Megatrends Inc.* | Intel Corporation | American Megatrends Inc.* | +| BIOS Version | 5.12 | EHLSFWI1.R00.2463.
A03.2011200425 | F8 | +| BIOS Release | September 6, 2017 | November 22, 2020 | May 24, 2019 | +| BIOS Settings | Default settings | Default settings | Select optimized default settings,
set OS type to "other",
save & exit | +| Batch size | 1 | 1 | 1 | +| Precision | INT8 | INT8 | INT8 | +| Number of concurrent inference requests | 4 | 4 | 4 | +| Test Date | March 15, 2021 | March 15, 2021 | March 15, 2021 | +| Power dissipation, TDP in Watt | [9.5](https://ark.intel.com/content/www/us/en/ark/products/96485/intel-atom-x5-e3940-processor-2m-cache-up-to-1-80-ghz.html) | [12](https://ark.intel.com/content/www/us/en/ark/products/207899/intel-atom-x6425re-processor-1-5m-cache-1-90-ghz.html) | [65](https://ark.intel.com/content/www/us/en/ark/products/126688/intel-core-i3-8100-processor-6m-cache-3-60-ghz.html#tab-blade-1-0-1)| +| CPU Price, USD
Prices may vary | [34](https://ark.intel.com/content/www/us/en/ark/products/96485/intel-atom-x5-e3940-processor-2m-cache-up-to-1-80-ghz.html) (on March 15th, 2021) | [59](https://ark.intel.com/content/www/us/en/ark/products/207899/intel-atom-x6425re-processor-1-5m-cache-1-90-ghz.html) (on March 26th, 2021) | [117](https://ark.intel.com/content/www/us/en/ark/products/126688/intel-core-i3-8100-processor-6m-cache-3-60-ghz.html) (on March 15th, 2021) | + + + +**Accelerator Inference Engines** + +| | Intel® Neural Compute Stick 2 | Intel® Vision Accelerator Design
with Intel® Movidius™ VPUs (Mustang-V100-MX8) | +| --------------------------------------- | ------------------------------------- | ------------------------------------- | +| VPU | 1 X Intel® Movidius™ Myriad™ X MA2485 | 8 X Intel® Movidius™ Myriad™ X MA2485 | +| Connection | USB 2.0/3.0 | PCIe X4 | +| Batch size | 1 | 1 | +| Precision | FP16 | FP16 | +| Number of concurrent inference requests | 4 | 32 | +| Power dissipation, TDP in Watt | 2.5 | [30](https://www.arrow.com/en/products/mustang-v100-mx8-r10/iei-technology?gclid=Cj0KCQiA5bz-BRD-ARIsABjT4ng1v1apmxz3BVCPA-tdIsOwbEjTtqnmp_rQJGMfJ6Q2xTq6ADtf9OYaAhMUEALw_wcB) | +| CPU Price, USD
Prices may vary | [69](https://ark.intel.com/content/www/us/en/ark/products/140109/intel-neural-compute-stick-2.html) (from March 15, 2021) | [1180](https://www.arrow.com/en/products/mustang-v100-mx8-r10/iei-technology?gclid=Cj0KCQiA5bz-BRD-ARIsABjT4ng1v1apmxz3BVCPA-tdIsOwbEjTtqnmp_rQJGMfJ6Q2xTq6ADtf9OYaAhMUEALw_wcB) (from March 15, 2021) | +| Host Computer | Intel® Core™ i7 | Intel® Core™ i5 | +| Motherboard | ASUS* Z370-A II | Uzelinfo* / US-E1300 | +| CPU | Intel® Core™ i7-8700 CPU @ 3.20GHz | Intel® Core™ i5-6600 CPU @ 3.30GHz | +| Hyper Threading | ON | OFF | +| Turbo Setting | ON | ON | +| Memory | 4 x 16 GB DDR4 2666MHz | 2 x 16 GB DDR4 2400MHz | +| Operating System | Ubuntu* 18.04 LTS | Ubuntu* 18.04 LTS | +| Kernel Version | 5.0.0-23-generic | 5.0.0-23-generic | +| BIOS Vendor | American Megatrends Inc.* | American Megatrends Inc.* | +| BIOS Version | 411 | 5.12 | +| BIOS Release | September 21, 2018 | September 21, 2018 | +| Test Date | March 15, 2021 | March 15, 2021 | + +Please follow this link for more detailed configuration descriptions: [Configuration Details](https://docs.openvinotoolkit.org/resources/benchmark_files/system_configurations_2021.3.html) + +\htmlonly + +
+

+\endhtmlonly +Results may vary. For workloads and configurations visit: [www.intel.com/PerformanceIndex](https://www.intel.com/PerformanceIndex) and [Legal Information](../Legal_Information.md). +\htmlonly +

+
+\endhtmlonly diff --git a/docs/benchmarks/performance_benchmarks_ovms.md b/docs/benchmarks/performance_benchmarks_ovms.md new file mode 100644 index 00000000000..604a68438ed --- /dev/null +++ b/docs/benchmarks/performance_benchmarks_ovms.md @@ -0,0 +1,376 @@ +# OpenVINO™ Model Server Benchmark Results {#openvino_docs_performance_benchmarks_ovms} + +OpenVINO™ Model Server is an open-source, production-grade inference platform that exposes a set of models via a convenient inference API over gRPC or HTTP/REST. It employs the Inference Engine libraries from the Intel® Distribution of OpenVINO™ toolkit to extend workloads across Intel® hardware, including CPU, GPU and others. + +![OpenVINO™ Model Server](../img/performance_benchmarks_ovms_01.png) + +## Measurement Methodology + +OpenVINO™ Model Server is measured in a multiple-client, single-server configuration using two hardware platforms connected by an Ethernet network. The network bandwidth depends on the platforms as well as the models under investigation, and it is set so that it is not a bottleneck for workload intensity. This connection is dedicated only to the performance measurements. The benchmark setup consists of four main parts: + +![OVMS Benchmark Setup Diagram](../img/performance_benchmarks_ovms_02.png) + +* **OpenVINO™ Model Server** is launched as a Docker container on the server platform and listens for (and answers) requests from clients. OpenVINO™ Model Server runs on the same machine that is used for the corresponding OpenVINO™ toolkit benchmark application measurements. Models served by OpenVINO™ Model Server are located in a local file system mounted into the Docker container. The OpenVINO™ Model Server instance communicates with other components via ports over a dedicated Docker network. + +* **Clients** run on a separate physical machine referred to as the client platform. The clients are implemented in Python 3 on top of the TensorFlow* API and work as parallel processes. Each client waits for a response from OpenVINO™ Model Server before it sends the next request. The clients also verify the responses they receive. + +* **Load balancer** works on the client platform in a Docker container. HAProxy is used for this purpose. Its main role is to count the requests forwarded from clients to OpenVINO™ Model Server, estimate their latency, and expose this information through a Prometheus service. The load balancer is placed on the client side to simulate a real-life scenario in which the physical network affects the reported metrics. + +* **Execution Controller** is launched on the client platform. It is responsible for synchronizing the whole measurement process, downloading metrics from the load balancer, and presenting the final report of the execution. + +## 3D U-Net (FP32) +![](../img/throughput_ovms_3dunet.png) +## resnet-50-TF (INT8) +![](../img/throughput_ovms_resnet50_int8.png) +## resnet-50-TF (FP32) +![](../img/throughput_ovms_resnet50_fp32.png) +## bert-large-uncased-whole-word-masking-squad-int8-0001 (INT8) +![](../img/throughput_ovms_bertlarge_int8.png) + +## bert-large-uncased-whole-word-masking-squad-0001 (FP32) +![](../img/throughput_ovms_bertlarge_fp32.png) +## Platform Configurations + +OpenVINO™ Model Server performance benchmark numbers are based on release 2021.3. Performance results are based on testing as of March 15, 2021 and may not reflect all publicly available updates.
+
+**Platform with Intel® Xeon® Gold 6252**
+
+<table>
+  <tr><th></th><th>Server Platform</th><th>Client Platform</th></tr>
+  <tr><td>Motherboard</td><td>Intel® Server Board S2600WF H48104-872</td><td>Inspur YZMB-00882-104 NF5280M5</td></tr>
+  <tr><td>Memory</td><td>Hynix 16 x 16GB @ 2666 MT/s DDR4</td><td>Samsung 16 x 16GB @ 2666 MT/s DDR4</td></tr>
+  <tr><td>CPU</td><td>Intel® Xeon® Gold 6252 CPU @ 2.10GHz</td><td>Intel® Xeon® Platinum 8260M CPU @ 2.40GHz</td></tr>
+  <tr><td>Selected CPU Flags</td><td>Hyper Threading, Turbo Boost, DL Boost</td><td>Hyper Threading, Turbo Boost, DL Boost</td></tr>
+  <tr><td>CPU Thermal Design Power</td><td>150 W</td><td>162 W</td></tr>
+  <tr><td>Operating System</td><td>Ubuntu 20.04.2 LTS</td><td>Ubuntu 20.04.2 LTS</td></tr>
+  <tr><td>Kernel Version</td><td>5.4.0-65-generic</td><td>5.4.0-54-generic</td></tr>
+  <tr><td>BIOS Vendor</td><td>Intel® Corporation</td><td>American Megatrends Inc.</td></tr>
+  <tr><td>BIOS Version and Release Date</td><td>SE5C620.86B.02.01, date: 03/26/2020</td><td>4.1.16, date: 06/23/2020</td></tr>
+  <tr><td>Docker Version</td><td>20.10.3</td><td>20.10.3</td></tr>
+  <tr><td>Network Speed</td><td colspan="2">40 Gb/s</td></tr>
+</table>
+
+**Platform with Intel® Core™ i9-10920X**
+
+<table>
+  <tr><th></th><th>Server Platform</th><th>Client Platform</th></tr>
+  <tr><td>Motherboard</td><td>ASUSTeK COMPUTER INC. PRIME X299-A II</td><td>ASUSTeK COMPUTER INC. PRIME Z370-P</td></tr>
+  <tr><td>Memory</td><td>Corsair 4 x 16GB @ 2666 MT/s DDR4</td><td>Corsair 4 x 16GB @ 2133 MT/s DDR4</td></tr>
+  <tr><td>CPU</td><td>Intel® Core™ i9-10920X CPU @ 3.50GHz</td><td>Intel® Core™ i7-8700T CPU @ 2.40GHz</td></tr>
+  <tr><td>Selected CPU Flags</td><td>Hyper Threading, Turbo Boost, DL Boost</td><td>Hyper Threading, Turbo Boost</td></tr>
+  <tr><td>CPU Thermal Design Power</td><td>165 W</td><td>35 W</td></tr>
+  <tr><td>Operating System</td><td>Ubuntu 20.04.1 LTS</td><td>Ubuntu 20.04.1 LTS</td></tr>
+  <tr><td>Kernel Version</td><td>5.4.0-52-generic</td><td>5.4.0-56-generic</td></tr>
+  <tr><td>BIOS Vendor</td><td>American Megatrends Inc.</td><td>American Megatrends Inc.</td></tr>
+  <tr><td>BIOS Version and Release Date</td><td>0603, date: 03/05/2020</td><td>2401, date: 07/15/2019</td></tr>
+  <tr><td>Docker Version</td><td>19.03.13</td><td>19.03.14</td></tr>
+  <tr><td>Network Speed</td><td colspan="2">10 Gb/s</td></tr>
+</table>
+
+**Platform with Intel® Core™ i7-8700T**
+
+<table>
+  <tr><th></th><th>Server Platform</th><th>Client Platform</th></tr>
+  <tr><td>Motherboard</td><td>ASUSTeK COMPUTER INC. PRIME Z370-P</td><td>ASUSTeK COMPUTER INC. PRIME X299-A II</td></tr>
+  <tr><td>Memory</td><td>Corsair 4 x 16GB @ 2133 MT/s DDR4</td><td>Corsair 4 x 16GB @ 2666 MT/s DDR4</td></tr>
+  <tr><td>CPU</td><td>Intel® Core™ i7-8700T CPU @ 2.40GHz</td><td>Intel® Core™ i9-10920X CPU @ 3.50GHz</td></tr>
+  <tr><td>Selected CPU Flags</td><td>Hyper Threading, Turbo Boost</td><td>Hyper Threading, Turbo Boost, DL Boost</td></tr>
+  <tr><td>CPU Thermal Design Power</td><td>35 W</td><td>165 W</td></tr>
+  <tr><td>Operating System</td><td>Ubuntu 20.04.1 LTS</td><td>Ubuntu 20.04.1 LTS</td></tr>
+  <tr><td>Kernel Version</td><td>5.4.0-56-generic</td><td>5.4.0-52-generic</td></tr>
+  <tr><td>BIOS Vendor</td><td>American Megatrends Inc.</td><td>American Megatrends Inc.</td></tr>
+  <tr><td>BIOS Version and Release Date</td><td>2401, date: 07/15/2019</td><td>0603, date: 03/05/2020</td></tr>
+  <tr><td>Docker Version</td><td>19.03.14</td><td>19.03.13</td></tr>
+  <tr><td>Network Speed</td><td colspan="2">10 Gb/s</td></tr>
+</table>
+
+**Platform with Intel® Core™ i5-8500**
+
+<table>
+  <tr><th></th><th>Server Platform</th><th>Client Platform</th></tr>
+  <tr><td>Motherboard</td><td>ASUSTeK COMPUTER INC. PRIME Z370-A</td><td>Gigabyte Technology Co., Ltd. Z390 UD</td></tr>
+  <tr><td>Memory</td><td>Corsair 2 x 16GB @ 2133 MT/s DDR4</td><td>029E 4 x 8GB @ 2400 MT/s DDR4</td></tr>
+  <tr><td>CPU</td><td>Intel® Core™ i5-8500 CPU @ 3.00GHz</td><td>Intel® Core™ i3-8100 CPU @ 3.60GHz</td></tr>
+  <tr><td>Selected CPU Flags</td><td>Turbo Boost</td><td>-</td></tr>
+  <tr><td>CPU Thermal Design Power</td><td>65 W</td><td>65 W</td></tr>
+  <tr><td>Operating System</td><td>Ubuntu 20.04.1 LTS</td><td>Ubuntu 20.04.1 LTS</td></tr>
+  <tr><td>Kernel Version</td><td>5.4.0-52-generic</td><td>5.4.0-52-generic</td></tr>
+  <tr><td>BIOS Vendor</td><td>American Megatrends Inc.</td><td>American Megatrends Inc.</td></tr>
+  <tr><td>BIOS Version and Release Date</td><td>2401, date: 07/12/2019</td><td>F10j, date: 09/16/2020</td></tr>
+  <tr><td>Docker Version</td><td>19.03.13</td><td>20.10.0</td></tr>
+  <tr><td>Network Speed</td><td colspan="2">40 Gb/s</td></tr>
+</table>
+
+**Platform with Intel® Core™ i3-8100**
+
+<table>
+  <tr><th></th><th>Server Platform</th><th>Client Platform</th></tr>
+  <tr><td>Motherboard</td><td>Gigabyte Technology Co., Ltd. Z390 UD</td><td>ASUSTeK COMPUTER INC. PRIME Z370-A</td></tr>
+  <tr><td>Memory</td><td>029E 4 x 8GB @ 2400 MT/s DDR4</td><td>Corsair 2 x 16GB @ 2133 MT/s DDR4</td></tr>
+  <tr><td>CPU</td><td>Intel® Core™ i3-8100 CPU @ 3.60GHz</td><td>Intel® Core™ i5-8500 CPU @ 3.00GHz</td></tr>
+  <tr><td>Selected CPU Flags</td><td>-</td><td>Turbo Boost</td></tr>
+  <tr><td>CPU Thermal Design Power</td><td>65 W</td><td>65 W</td></tr>
+  <tr><td>Operating System</td><td>Ubuntu 20.04.1 LTS</td><td>Ubuntu 20.04.1 LTS</td></tr>
+  <tr><td>Kernel Version</td><td>5.4.0-52-generic</td><td>5.4.0-52-generic</td></tr>
+  <tr><td>BIOS Vendor</td><td>American Megatrends Inc.</td><td>American Megatrends Inc.</td></tr>
+  <tr><td>BIOS Version and Release Date</td><td>F10j, date: 09/16/2020</td><td>2401, date: 07/12/2019</td></tr>
+  <tr><td>Docker Version</td><td>20.10.0</td><td>19.03.13</td></tr>
+  <tr><td>Network Speed</td><td colspan="2">40 Gb/s</td></tr>
+</table>
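To relate the configurations above to the Measurement Methodology section, the sketch below shows how an OpenVINO™ Model Server instance of this kind is typically started on the server platform; the image tag, model name, host paths, and port are illustrative assumptions, not the exact values used in these measurements.

```bash
# Minimal sketch: serve one model over gRPC from a mounted model repository.
# The repository layout is <model_path>/<version>/model.{xml,bin}.
docker run -d --rm --name ovms \
  -v /opt/models/resnet50:/models/resnet50:ro \
  -p 9000:9000 \
  openvino/model_server:latest \
  --model_name resnet50 --model_path /models/resnet50 --port 9000
```

Clients then address the served model by its name (here `resnet50`) through the load balancer described earlier.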
+ + +\htmlonly + +
+

+\endhtmlonly +Results may vary. For workloads and configurations visit: [www.intel.com/PerformanceIndex](https://www.intel.com/PerformanceIndex) and [Legal Information](../Legal_Information.md). +\htmlonly +

+
+\endhtmlonly + diff --git a/docs/benchmarks/performance_int8_vs_fp32.md b/docs/benchmarks/performance_int8_vs_fp32.md index 42dd26b9cce..35be3673e1a 100644 --- a/docs/benchmarks/performance_int8_vs_fp32.md +++ b/docs/benchmarks/performance_int8_vs_fp32.md @@ -7,9 +7,9 @@ The table below illustrates the speed-up factor for the performance gain by swit Intel® Core™
i7-8700T - Intel® Xeon®
Gold
5218T - Intel® Xeon®
Platinum
8270 Intel® Core™
i7-1185G7 + Intel® Xeon®
W-1290P + Intel® Xeon®
Platinum
8270 OpenVINO
benchmark
model name @@ -20,161 +20,177 @@ The table below illustrates the speed-up factor for the performance gain by swit bert-large-
uncased-whole-word-
masking-squad-0001 SQuAD 1.6 - 2.7 - 2.0 - 2.6 + 3.0 + 1.6 + 2.3 brain-tumor-
segmentation-
0001-MXNET BraTS - 1.5 + 1.6 1.9 1.7 - 1.8 + 1.7 deeplabv3-TF VOC 2012
Segmentation - 1.5 - 2.4 - 2.8 + 2.1 3.1 + 3.1 + 3.0 densenet-121-TF ImageNet - 1.6 - 3.2 - 3.2 - 3.2 + 1.8 + 3.5 + 1.9 + 3.8 facenet-
20180408-
102900-TF LFW 2.0 3.6 - 3.5 - 3.4 + 2.2 + 3.7 faster_rcnn_
resnet50_coco-TF MS COCO - 1.7 - 3.4 - 3.4 - 3.4 + 1.9 + 3.8 + 2.0 + 3.5 googlenet-v1-TF ImageNet 1.8 3.6 - 3.7 - 3.5 + 2.0 + 3.9 inception-v3-TF ImageNet - 1.8 + 1.9 3.8 + 2.0 4.0 - 3.5 mobilenet-
ssd-CF VOC2012 - 1.5 + 1.7 3.1 + 1.8 3.6 - 3.1 mobilenet-v1-1.0-
224-TF ImageNet - 1.5 - 3.2 - 4.1 + 1.7 3.1 + 1.8 + 4.1 mobilenet-v2-1.0-
224-TF ImageNet - 1.3 - 2.7 - 4.3 - 2.5 + 1.5 + 2.4 + 1.8 + 3.9 mobilenet-v2-
pytorch ImageNet - 1.4 - 2.8 - 4.6 + 1.6 2.4 + 1.9 + 3.9 resnet-18-
pytorch ImageNet 1.9 3.7 - 3.8 - 3.6 + 2.1 + 4.2 resnet-50-
pytorch ImageNet - 1.8 - 3.6 + 1.9 + 3.7 + 2.0 3.9 - 3.4 resnet-50-
TF ImageNet - 1.8 + 1.9 3.6 + 2.0 3.9 - 3.4 squeezenet1.1-
CF ImageNet - 1.6 - 2.9 - 3.4 + 1.7 3.2 + 1.8 + 3.4 ssd_mobilenet_
v1_coco-tf VOC2012 - 1.6 - 3.1 - 3.7 + 1.7 3.0 + 1.9 + 3.6 ssd300-CF MS COCO 1.8 - 3.7 - 3.7 - 3.8 + 4.4 + 1.9 + 3.9 ssdlite_
mobilenet_
v2-TF MS COCO - 1.4 - 2.3 - 3.9 + 1.7 2.5 + 2.2 + 3.4 yolo_v3-TF MS COCO 1.8 - 3.8 + 4.0 + 1.9 3.9 - 3.6 + + + yolo_v4-TF + MS COCO + 1.7 + 3.4 + 1.7 + 2.8 + + + unet-camvid-onnx-0001 + MS COCO + 1.6 + 3.8 + 1.6 + 3.7 @@ -187,7 +203,7 @@ The following table shows the absolute accuracy drop that is calculated as the d Intel® Core™
i9-10920X CPU
@ 3.50GHZ (VNNI) Intel® Core™
i9-9820X CPU
@ 3.30GHz (AVX512) - Intel® Core™
i7-6700 CPU
@ 4.0GHz (AVX2) + Intel® Core™
i7-6700K CPU
@ 4.0GHz (AVX2) Intel® Core™
i7-1185G7 CPU
@ 4.0GHz (TGL VNNI) @@ -196,176 +212,203 @@ The following table shows the absolute accuracy drop that is calculated as the d Metric Name Absolute Accuracy Drop, % + + bert-large-uncased-whole-word-masking-squad-0001 + SQuAD + F1 + 0.62 + 0.88 + 0.52 + 0.62 + brain-tumor-
segmentation-
0001-MXNET BraTS Dice-index@
Mean@
Overall Tumor - 0.08 - 0.08 - 0.08 - 0.08 + 0.09 + 0.10 + 0.11 + 0.09 deeplabv3-TF VOC 2012
Segmentation mean_iou - 0.73 - 1.10 - 1.10 - 0.73 + 0.09 + 0.41 + 0.41 + 0.09 densenet-121-TF ImageNet acc@top-1 - 0.73 - 0.72 - 0.72 - 0.73 + 0.54 + 0.57 + 0.57 + 0.54 facenet-
20180408-
102900-TF LFW pairwise_
accuracy
_subsets - 0.02 - 0.02 - 0.02 - 0.47 + 0.05 + 0.12 + 0.12 + 0.05 faster_rcnn_
resnet50_coco-TF MS COCO coco_
precision - 0.21 - 0.20 - 0.20 - 0.21 + 0.04 + 0.04 + 0.04 + 0.04 googlenet-v1-TF ImageNet acc@top-1 - 0.03 0.01 + 0.00 + 0.00 0.01 - 0.03 inception-v3-TF ImageNet acc@top-1 - 0.03 - 0.01 - 0.01 - 0.03 + 0.04 + 0.00 + 0.00 + 0.04 mobilenet-
ssd-CF VOC2012 mAP - 0.35 - 0.34 - 0.34 - 0.35 + 0.77 + 0.77 + 0.77 + 0.77 mobilenet-v1-1.0-
224-TF ImageNet acc@top-1 - 0.27 - 0.20 - 0.20 - 0.27 + 0.26 + 0.28 + 0.28 + 0.26 mobilenet-v2-1.0-
224-TF ImageNet acc@top-1 - 0.44 - 0.92 - 0.92 - 0.44 + 0.40 + 0.76 + 0.76 + 0.40 mobilenet-v2-
PYTORCH ImageNet acc@top-1 - 0.25 - 7.42 - 7.42 - 0.25 + 0.36 + 0.52 + 0.52 + 0.36 resnet-18-
pytorch ImageNet acc@top-1 - 0.26 0.25 0.25 - 0.26 + 0.25 + 0.25 resnet-50-
PYTORCH ImageNet acc@top-1 - 0.18 0.19 + 0.21 + 0.21 0.19 - 0.18 resnet-50-
TF ImageNet acc@top-1 - 0.15 - 0.11 - 0.11 - 0.15 + 0.10 + 0.08 + 0.08 + 0.10 squeezenet1.1-
CF ImageNet acc@top-1 + 0.63 0.66 - 0.64 - 0.64 0.66 + 0.63 ssd_mobilenet_
v1_coco-tf VOC2012 COCO mAp - 0.24 - 3.07 - 3.07 - 0.24 + 0.18 + 3.06 + 3.06 + 0.18 ssd300-CF MS COCO COCO mAp - 0.06 0.05 0.05 - 0.06 + 0.05 + 0.05 ssdlite_
mobilenet_
v2-TF MS COCO COCO mAp - 0.14 + 0.11 0.43 0.43 - 0.14 + 0.11 yolo_v3-TF MS COCO COCO mAp - 0.12 - 0.35 - 0.35 - 0.12 + 0.11 + 0.24 + 0.24 + 0.11 + + + yolo_v4-TF + MS COCO + COCO mAp + 0.01 + 0.09 + 0.09 + 0.01 + + + unet-camvid-
onnx-0001 + MS COCO + COCO mAp + 0.31 + 0.31 + 0.31 + 0.31 diff --git a/docs/doxygen/ie_docs.xml b/docs/doxygen/ie_docs.xml index 010d7cac724..8ca6ff2588e 100644 --- a/docs/doxygen/ie_docs.xml +++ b/docs/doxygen/ie_docs.xml @@ -270,7 +270,6 @@ limitations under the License. - diff --git a/docs/doxygen/openvino_docs.xml b/docs/doxygen/openvino_docs.xml index 0ca1c093271..92238645a05 100644 --- a/docs/doxygen/openvino_docs.xml +++ b/docs/doxygen/openvino_docs.xml @@ -100,11 +100,14 @@ limitations under the License. - - - - - + + + + + + + + @@ -166,6 +169,7 @@ limitations under the License. + @@ -181,7 +185,15 @@ limitations under the License. - + + + + + + + + + diff --git a/docs/gapi/face_beautification.md b/docs/gapi/face_beautification.md index 539b0ca9b7e..25619ae8e0b 100644 --- a/docs/gapi/face_beautification.md +++ b/docs/gapi/face_beautification.md @@ -12,11 +12,11 @@ This sample requires: * PC with GNU/Linux* or Microsoft Windows* (Apple macOS* is supported but was not tested) * OpenCV 4.2 or higher built with [Intel® Distribution of OpenVINO™ Toolkit](https://software.intel.com/content/www/us/en/develop/tools/openvino-toolkit.html) (building with [Intel® TBB](https://www.threadingbuildingblocks.org/intel-tbb-tutorial) is a plus) -* The following pre-trained models from the [Open Model Zoo](@ref omz_models_intel_index) - * [face-detection-adas-0001](@ref omz_models_intel_face_detection_adas_0001_description_face_detection_adas_0001) - * [facial-landmarks-35-adas-0002](@ref omz_models_intel_facial_landmarks_35_adas_0002_description_facial_landmarks_35_adas_0002) +* The following pre-trained models from the [Open Model Zoo](@ref omz_models_group_intel) + * [face-detection-adas-0001](@ref omz_models_model_face_detection_adas_0001) + * [facial-landmarks-35-adas-0002](@ref omz_models_model_facial_landmarks_35_adas_0002) -To download the models from the Open Model Zoo, use the [Model Downloader](@ref omz_tools_downloader_README) tool. +To download the models from the Open Model Zoo, use the [Model Downloader](@ref omz_tools_downloader) tool. ## Face Beautification Algorithm We will implement a simple face beautification algorithm using a combination of modern Deep Learning techniques and traditional Computer Vision. The general idea behind the algorithm is to make face skin smoother while preserving face features like eyes or a mouth contrast. 
The algorithm identifies parts of the face using a DNN inference, applies different filters to the parts found, and then combines it into the final result using basic image arithmetics: diff --git a/docs/gapi/gapi_face_analytics_pipeline.md b/docs/gapi/gapi_face_analytics_pipeline.md index 83dcf4594ca..6b544485668 100644 --- a/docs/gapi/gapi_face_analytics_pipeline.md +++ b/docs/gapi/gapi_face_analytics_pipeline.md @@ -11,12 +11,12 @@ This sample requires: * PC with GNU/Linux* or Microsoft Windows* (Apple macOS* is supported but was not tested) * OpenCV 4.2 or higher built with [Intel® Distribution of OpenVINO™ Toolkit](https://software.intel.com/content/www/us/en/develop/tools/openvino-toolkit.html) (building with [Intel® TBB](https://www.threadingbuildingblocks.org/intel-tbb-tutorial) -* The following pre-trained models from the [Open Model Zoo](@ref omz_models_intel_index): - * [face-detection-adas-0001](@ref omz_models_intel_face_detection_adas_0001_description_face_detection_adas_0001) - * [age-gender-recognition-retail-0013](@ref omz_models_intel_age_gender_recognition_retail_0013_description_age_gender_recognition_retail_0013) - * [emotions-recognition-retail-0003](@ref omz_models_intel_emotions_recognition_retail_0003_description_emotions_recognition_retail_0003) +* The following pre-trained models from the [Open Model Zoo](@ref omz_models_group_intel): + * [face-detection-adas-0001](@ref omz_models_model_face_detection_adas_0001) + * [age-gender-recognition-retail-0013](@ref omz_models_model_age_gender_recognition_retail_0013) + * [emotions-recognition-retail-0003](@ref omz_models_model_emotions_recognition_retail_0003) -To download the models from the Open Model Zoo, use the [Model Downloader](@ref omz_tools_downloader_README) tool. +To download the models from the Open Model Zoo, use the [Model Downloader](@ref omz_tools_downloader) tool. ## Introduction: Why G-API Many computer vision algorithms run on a video stream rather than on individual images. Stream processing usually consists of multiple steps – like decode, preprocessing, detection, tracking, classification (on detected objects), and visualization – forming a *video processing pipeline*. Moreover, many these steps of such pipeline can run in parallel – modern platforms have different hardware blocks on the same chip like decoders and GPUs, and extra accelerators can be plugged in as extensions, like Intel® Movidius™ Neural Compute Stick for deep learning offload. @@ -26,7 +26,7 @@ Given all this manifold of options and a variety in video analytics algorithms, Starting with version 4.2, OpenCV offers a solution to this problem. OpenCV G-API now can manage Deep Learning inference (a cornerstone of any modern analytics pipeline) with a traditional Computer Vision as well as video capturing/decoding, all in a single pipeline. G-API takes care of pipelining itself – so if the algorithm or platform changes, the execution model adapts to it automatically. ## Pipeline Overview -Our sample application is based on [Interactive Face Detection](omz_demos_interactive_face_detection_demo_README) demo from Open Model Zoo. A simplified pipeline consists of the following steps: +Our sample application is based on [Interactive Face Detection](@ref omz_demos_interactive_face_detection_demo_cpp) demo from Open Model Zoo. A simplified pipeline consists of the following steps: 1. Image acquisition and decode 2. 
Detection with preprocessing diff --git a/docs/get_started/get_started_dl_workbench.md b/docs/get_started/get_started_dl_workbench.md index 52f36c5b80a..795767f3c73 100644 --- a/docs/get_started/get_started_dl_workbench.md +++ b/docs/get_started/get_started_dl_workbench.md @@ -9,13 +9,13 @@ In this guide, you will: [DL Workbench](@ref workbench_docs_Workbench_DG_Introduction) is a web-based graphical environment that enables you to easily use various sophisticated OpenVINO™ toolkit components: -* [Model Downloader](@ref omz_tools_downloader_README) to download models from the [Intel® Open Model Zoo](@ref omz_models_intel_index) +* [Model Downloader](@ref omz_tools_downloader) to download models from the [Intel® Open Model Zoo](@ref omz_models_group_intel) with pretrained models for a range of different tasks * [Model Optimizer](../MO_DG/Deep_Learning_Model_Optimizer_DevGuide.md) to transform models into the Intermediate Representation (IR) format * [Post-Training Optimization toolkit](@ref pot_README) to calibrate a model and then execute it in the INT8 precision -* [Accuracy Checker](@ref omz_tools_accuracy_checker_README) to determine the accuracy of a model +* [Accuracy Checker](@ref omz_tools_accuracy_checker) to determine the accuracy of a model * [Benchmark Tool](@ref openvino_inference_engine_samples_benchmark_app_README) to estimate inference performance on supported devices ![](./dl_workbench_img/DL_Workbench.jpg) @@ -70,10 +70,10 @@ The simplified OpenVINO™ DL Workbench workflow is: ## Run Baseline Inference -This section illustrates a sample use case of how to infer a pretrained model from the [Intel® Open Model Zoo](@ref omz_models_intel_index) with an autogenerated noise dataset on a CPU device. - +This section illustrates a sample use case of how to infer a pretrained model from the [Intel® Open Model Zoo](@ref omz_models_group_intel) with an autogenerated noise dataset on a CPU device. +\htmlonly - +\endhtmlonly Once you log in to the DL Workbench, create a project, which is a combination of a model, a dataset, and a target device. Follow the steps below: diff --git a/docs/get_started/get_started_linux.md b/docs/get_started/get_started_linux.md index a01d5a11c67..3aa945a05a1 100644 --- a/docs/get_started/get_started_linux.md +++ b/docs/get_started/get_started_linux.md @@ -18,7 +18,7 @@ In addition, demo scripts, code samples and demo applications are provided to he * **[Code Samples](../IE_DG/Samples_Overview.md)** - Small console applications that show you how to: * Utilize specific OpenVINO capabilities in an application * Perform specific tasks, such as loading a model, running inference, querying specific device capabilities, and more. -* **[Demo Applications](@ref omz_demos_README)** - Console applications that provide robust application templates to help you implement specific deep learning scenarios. These applications involve increasingly complex processing pipelines that gather analysis data from several models that run inference simultaneously, such as detecting a person in a video stream along with detecting the person's physical attributes, such as age, gender, and emotional state. +* **[Demo Applications](@ref omz_demos)** - Console applications that provide robust application templates to help you implement specific deep learning scenarios. 
These applications involve increasingly complex processing pipelines that gather analysis data from several models that run inference simultaneously, such as detecting a person in a video stream along with detecting the person's physical attributes, such as age, gender, and emotional state. ## Intel® Distribution of OpenVINO™ toolkit Installation and Deployment Tools Directory Structure This guide assumes you completed all Intel® Distribution of OpenVINO™ toolkit installation and configuration steps. If you have not yet installed and configured the toolkit, see [Install Intel® Distribution of OpenVINO™ toolkit for Linux*](../install_guides/installing-openvino-linux.md). @@ -46,9 +46,9 @@ The primary tools for deploying your models and applications are installed to th |       `samples/` | Inference Engine samples. Contains source code for C++ and Python* samples and build scripts. See the [Inference Engine Samples Overview](../IE_DG/Samples_Overview.md). | |       `src/` | Source files for CPU extensions.| | `model_optimizer/` | Model Optimizer directory. Contains configuration scripts, scripts to run the Model Optimizer and other files. See the [Model Optimizer Developer Guide](../MO_DG/Deep_Learning_Model_Optimizer_DevGuide.md). -| `open_model_zoo/` | Open Model Zoo directory. Includes the Model Downloader tool to download [pre-trained OpenVINO](@ref omz_models_intel_index) and public models, OpenVINO models documentation, demo applications and the Accuracy Checker tool to evaluate model accuracy.| +| `open_model_zoo/` | Open Model Zoo directory. Includes the Model Downloader tool to download [pre-trained OpenVINO](@ref omz_models_group_intel) and public models, OpenVINO models documentation, demo applications and the Accuracy Checker tool to evaluate model accuracy.| |       `demos/` | Demo applications for inference scenarios. Also includes documentation and build scripts.| -|       `intel_models/` | Pre-trained OpenVINO models and associated documentation. See the [Overview of OpenVINO™ Toolkit Pre-Trained Models](@ref omz_models_intel_index).| +|       `intel_models/` | Pre-trained OpenVINO models and associated documentation. See the [Overview of OpenVINO™ Toolkit Pre-Trained Models](@ref omz_models_group_intel).| |       `tools/` | Model Downloader and Accuracy Checker tools. | | `tools/` | Contains a symbolic link to the Model Downloader folder and auxiliary tools to work with your models: Calibration tool, Benchmark and Collect Statistics tools.| @@ -197,7 +197,7 @@ Each demo and code sample is a separate application, but they use the same behav * [Code Samples](../IE_DG/Samples_Overview.md) - Small console applications that show how to utilize specific OpenVINO capabilities within an application and execute specific tasks such as loading a model, running inference, querying specific device capabilities, and more. -* [Demo Applications](@ref omz_demos_README) - Console applications that provide robust application templates to support developers in implementing specific deep learning scenarios. They may also involve more complex processing pipelines that gather analysis from several models that run inference simultaneously. For example concurrently detecting a person in a video stream and detecting attributes such as age, gender and/or emotions. +* [Demo Applications](@ref omz_demos) - Console applications that provide robust application templates to support developers in implementing specific deep learning scenarios. 
They may also involve more complex processing pipelines that gather analysis from several models that run inference simultaneously. For example concurrently detecting a person in a video stream and detecting attributes such as age, gender and/or emotions. Inputs you'll need to specify: - **A compiled OpenVINO™ code sample or demo application** that runs inferencing against a model that has been run through the Model Optimizer, resulting in an IR, using the other inputs you provide. @@ -209,7 +209,7 @@ Inputs you'll need to specify: To perform sample inference, run the Image Classification code sample and Security Barrier Camera demo application that were automatically compiled when you ran the Image Classification and Inference Pipeline demo scripts. The binary files are in the `~/inference_engine_cpp_samples_build/intel64/Release` and `~/inference_engine_demos_build/intel64/Release` directories, respectively. -To run other sample code or demo applications, build them from the source files delivered as part of the OpenVINO toolkit. To learn how to build these, see the [Inference Engine Code Samples Overview](../IE_DG/Samples_Overview.md) and [Demo Applications Overview](@ref omz_demos_README) sections. +To run other sample code or demo applications, build them from the source files delivered as part of the OpenVINO toolkit. To learn how to build these, see the [Inference Engine Code Samples Overview](../IE_DG/Samples_Overview.md) and [Demo Applications Overview](@ref omz_demos) sections. ### Step 1: Download the Models @@ -219,7 +219,7 @@ You must have a model that is specific for you inference task. Example model typ - Custom (Often based on SSD) Options to find a model suitable for the OpenVINO™ toolkit are: -- Download public and Intel's pre-trained models from the [Open Model Zoo](https://github.com/opencv/open_model_zoo) using [Model Downloader tool](@ref omz_tools_downloader_README). +- Download public and Intel's pre-trained models from the [Open Model Zoo](https://github.com/opencv/open_model_zoo) using [Model Downloader tool](@ref omz_tools_downloader). - Download from GitHub*, Caffe* Zoo, TensorFlow* Zoo, etc. - Train your own model. @@ -449,7 +449,7 @@ Throughput: 375.3339402 FPS ### Step 5: Run the Security Barrier Camera Demo Application -> **NOTE**: The Security Barrier Camera Demo Application is automatically compiled when you ran the Inference Pipeline demo scripts. If you want to build it manually, see the [Demo Applications Overview](@ref omz_demos_README) section. +> **NOTE**: The Security Barrier Camera Demo Application is automatically compiled when you ran the Inference Pipeline demo scripts. If you want to build it manually, see the [Demo Applications Overview](@ref omz_demos) section. To run the **Security Barrier Camera Demo Application** using an input image on the prepared IRs: diff --git a/docs/get_started/get_started_macos.md b/docs/get_started/get_started_macos.md index 14456171d60..980b02d0be2 100644 --- a/docs/get_started/get_started_macos.md +++ b/docs/get_started/get_started_macos.md @@ -18,7 +18,7 @@ In addition, demo scripts, code samples and demo applications are provided to he * **[Code Samples](../IE_DG/Samples_Overview.md)** - Small console applications that show you how to: * Utilize specific OpenVINO capabilities in an application. * Perform specific tasks, such as loading a model, running inference, querying specific device capabilities, and more. 
-* **[Demo Applications](@ref omz_demos_README)** - Console applications that provide robust application templates to help you implement specific deep learning scenarios. These applications involve increasingly complex processing pipelines that gather analysis data from several models that run inference simultaneously, such as detecting a person in a video stream along with detecting the person's physical attributes, such as age, gender, and emotional state. +* **[Demo Applications](@ref omz_demos)** - Console applications that provide robust application templates to help you implement specific deep learning scenarios. These applications involve increasingly complex processing pipelines that gather analysis data from several models that run inference simultaneously, such as detecting a person in a video stream along with detecting the person's physical attributes, such as age, gender, and emotional state. ## Intel® Distribution of OpenVINO™ toolkit Installation and Deployment Tools Directory Structure This guide assumes you completed all Intel® Distribution of OpenVINO™ toolkit installation and configuration steps. If you have not yet installed and configured the toolkit, see [Install Intel® Distribution of OpenVINO™ toolkit for macOS*](../install_guides/installing-openvino-macos.md). @@ -48,9 +48,9 @@ The primary tools for deploying your models and applications are installed to th | `~intel_models/` | Symbolic link to the `intel_models` subfolder of the `open_model_zoo` folder.| | `model_optimizer/` | Model Optimizer directory. Contains configuration scripts, scripts to run the Model Optimizer and other files. See the [Model Optimizer Developer Guide](../MO_DG/Deep_Learning_Model_Optimizer_DevGuide.md).| | `ngraph/` | nGraph directory. Includes the nGraph header and library files. | -| `open_model_zoo/` | Open Model Zoo directory. Includes the Model Downloader tool to download [pre-trained OpenVINO](@ref omz_models_intel_index) and public models, OpenVINO models documentation, demo applications and the Accuracy Checker tool to evaluate model accuracy.| +| `open_model_zoo/` | Open Model Zoo directory. Includes the Model Downloader tool to download [pre-trained OpenVINO](@ref omz_models_group_intel) and public models, OpenVINO models documentation, demo applications and the Accuracy Checker tool to evaluate model accuracy.| |       `demos/` | Demo applications for inference scenarios. Also includes documentation and build scripts.| -|       `intel_models/` | Pre-trained OpenVINO models and associated documentation. See the [Overview of OpenVINO™ Toolkit Pre-Trained Models](@ref omz_models_intel_index).| +|       `intel_models/` | Pre-trained OpenVINO models and associated documentation. See the [Overview of OpenVINO™ Toolkit Pre-Trained Models](@ref omz_models_group_intel).| |       `models` | Intel's trained and public models that can be obtained with Model Downloader.| |       `tools/` | Model Downloader and Accuracy Checker tools. | | `tools/` | Contains a symbolic link to the Model Downloader folder and auxiliary tools to work with your models: Calibration tool, Benchmark and Collect Statistics tools.| @@ -200,7 +200,7 @@ Inputs you need to specify when using a code sample or demo application: To perform sample inference, run the Image Classification code sample and Security Barrier Camera demo application that are automatically compiled when you run the Image Classification and Inference Pipeline demo scripts. 
The binary files are in the `~/inference_engine_samples_build/intel64/Release` and `~/inference_engine_demos_build/intel64/Release` directories, respectively. -You can also build all available sample code and demo applications from the source files delivered with the OpenVINO toolkit. To learn how to do this, see the instructions in the [Inference Engine Code Samples Overview](../IE_DG/Samples_Overview.md) and [Demo Applications Overview](@ref omz_demos_README) sections. +You can also build all available sample code and demo applications from the source files delivered with the OpenVINO toolkit. To learn how to do this, see the instructions in the [Inference Engine Code Samples Overview](../IE_DG/Samples_Overview.md) and [Demo Applications Overview](@ref omz_demos) sections. ### Step 1: Download the Models @@ -210,7 +210,7 @@ You must have a model that is specific for you inference task. Example model typ - Custom (Often based on SSD) Options to find a model suitable for the OpenVINO™ toolkit are: -- Download public and Intel's pre-trained models from the [Open Model Zoo](https://github.com/opencv/open_model_zoo) using [Model Downloader tool](@ref omz_tools_downloader_README). +- Download public and Intel's pre-trained models from the [Open Model Zoo](https://github.com/opencv/open_model_zoo) using [Model Downloader tool](@ref omz_tools_downloader). - Download from GitHub*, Caffe* Zoo, TensorFlow* Zoo, and other resources. - Train your own model. @@ -422,7 +422,7 @@ classid probability label ### Step 5: Run the Security Barrier Camera Demo Application -> **NOTE**: The Security Barrier Camera Demo Application is automatically compiled when you run the Inference Pipeline demo scripts. If you want to build it manually, see the instructions in the [Demo Applications Overview](@ref omz_demos_README) section. +> **NOTE**: The Security Barrier Camera Demo Application is automatically compiled when you run the Inference Pipeline demo scripts. If you want to build it manually, see the instructions in the [Demo Applications Overview](@ref omz_demos) section. To run the **Security Barrier Camera Demo Application** using an input image on the prepared IRs: diff --git a/docs/get_started/get_started_raspbian.md b/docs/get_started/get_started_raspbian.md index afb821debec..5f3baf87d2f 100644 --- a/docs/get_started/get_started_raspbian.md +++ b/docs/get_started/get_started_raspbian.md @@ -43,8 +43,8 @@ The primary tools for deploying your models and applications are installed to th The OpenVINO™ workflow on Raspbian* OS is as follows: 1. **Get a pre-trained model** for your inference task. If you want to use your model for inference, the model must be converted to the `.bin` and `.xml` Intermediate Representation (IR) files, which are used as input by Inference Engine. On Raspberry PI, OpenVINO™ toolkit includes only the Inference Engine module. The Model Optimizer is not supported on this platform. To get the optimized models you can use one of the following options: - * Download public and Intel's pre-trained models from the [Open Model Zoo](https://github.com/opencv/open_model_zoo) using [Model Downloader tool](@ref omz_tools_downloader_README). -
For more information on pre-trained models, see [Pre-Trained Models Documentation](@ref omz_models_intel_index) + * Download public and Intel's pre-trained models from the [Open Model Zoo](https://github.com/opencv/open_model_zoo) using [Model Downloader tool](@ref omz_tools_downloader). +
For more information on pre-trained models, see [Pre-Trained Models Documentation](@ref omz_models_group_intel) * Convert a model using the Model Optimizer from a full installation of Intel® Distribution of OpenVINO™ toolkit on one of the supported platforms. Installation instructions are available: * [Installation Guide for macOS*](../install_guides/installing-openvino-macos.md) @@ -62,10 +62,10 @@ Follow the steps below to run pre-trained Face Detection network using Inference ``` 2. Build the Object Detection Sample with the following command: ```sh - cmake -DCMAKE_BUILD_TYPE=Release -DCMAKE_CXX_FLAGS="-march=armv7-a" /opt/intel/openvino/deployment_tools/inference_engine/samples/cpp + cmake -DCMAKE_BUILD_TYPE=Release -DCMAKE_CXX_FLAGS="-march=armv7-a" /opt/intel/openvino_2021/deployment_tools/inference_engine/samples/cpp make -j2 object_detection_sample_ssd ``` -3. Download the pre-trained Face Detection model with the [Model Downloader tool](@ref omz_tools_downloader_README): +3. Download the pre-trained Face Detection model with the [Model Downloader tool](@ref omz_tools_downloader): ```sh git clone --depth 1 https://github.com/openvinotoolkit/open_model_zoo cd open_model_zoo/tools/downloader diff --git a/docs/get_started/get_started_windows.md b/docs/get_started/get_started_windows.md index 0255a1bb396..c8c7ee23d1f 100644 --- a/docs/get_started/get_started_windows.md +++ b/docs/get_started/get_started_windows.md @@ -19,7 +19,7 @@ In addition, demo scripts, code samples and demo applications are provided to he * **[Code Samples](../IE_DG/Samples_Overview.md)** - Small console applications that show you how to: * Utilize specific OpenVINO capabilities in an application. * Perform specific tasks, such as loading a model, running inference, querying specific device capabilities, and more. -* **[Demo Applications](@ref omz_demos_README)** - Console applications that provide robust application templates to help you implement specific deep learning scenarios. These applications involve increasingly complex processing pipelines that gather analysis data from several models that run inference simultaneously, such as detecting a person in a video stream along with detecting the person's physical attributes, such as age, gender, and emotional state. +* **[Demo Applications](@ref omz_demos)** - Console applications that provide robust application templates to help you implement specific deep learning scenarios. These applications involve increasingly complex processing pipelines that gather analysis data from several models that run inference simultaneously, such as detecting a person in a video stream along with detecting the person's physical attributes, such as age, gender, and emotional state. ## Intel® Distribution of OpenVINO™ toolkit Installation and Deployment Tools Directory Structure This guide assumes you completed all Intel® Distribution of OpenVINO™ toolkit installation and configuration steps. If you have not yet installed and configured the toolkit, see [Install Intel® Distribution of OpenVINO™ toolkit for Windows*](../install_guides/installing-openvino-windows.md). @@ -45,9 +45,9 @@ The primary tools for deploying your models and applications are installed to th | `~intel_models\` | Symbolic link to the `intel_models` subfolder of the `open_model_zoo` folder. | | `model_optimizer\` | Model Optimizer directory. Contains configuration scripts, scripts to run the Model Optimizer and other files. 
See the [Model Optimizer Developer Guide](../MO_DG/Deep_Learning_Model_Optimizer_DevGuide.md). | | `ngraph\` | nGraph directory. Includes the nGraph header and library files. | -| `open_model_zoo\` | Open Model Zoo directory. Includes the Model Downloader tool to download [pre-trained OpenVINO](@ref omz_models_intel_index) and public models, OpenVINO models documentation, demo applications and the Accuracy Checker tool to evaluate model accuracy.| +| `open_model_zoo\` | Open Model Zoo directory. Includes the Model Downloader tool to download [pre-trained OpenVINO](@ref omz_models_group_intel) and public models, OpenVINO models documentation, demo applications and the Accuracy Checker tool to evaluate model accuracy.| |       `demos\` | Demo applications for inference scenarios. Also includes documentation and build scripts.| -|       `intel_models\` | Pre-trained OpenVINO models and associated documentation. See the [Overview of OpenVINO™ Toolkit Pre-Trained Models](@ref omz_models_intel_index).| +|       `intel_models\` | Pre-trained OpenVINO models and associated documentation. See the [Overview of OpenVINO™ Toolkit Pre-Trained Models](@ref omz_models_group_intel).| |       `models` | Intel's trained and public models that can be obtained with Model Downloader.| |       `tools\` | Model Downloader and Accuracy Checker tools. | | `tools\` | Contains a symbolic link to the Model Downloader folder and auxiliary tools to work with your models: Calibration tool, Benchmark and Collect Statistics tools.| @@ -199,7 +199,7 @@ Inputs you need to specify when using a code sample or demo application: To perform sample inference, run the Image Classification code sample and Security Barrier Camera demo application that are automatically compiled when you run the Image Classification and Inference Pipeline demo scripts. The binary files are in the `C:\Users\\Intel\OpenVINO\inference_engine_cpp_samples_build\intel64\Release` and `C:\Users\\Intel\OpenVINO\inference_engine_demos_build\intel64\Release` directories, respectively. -You can also build all available sample code and demo applications from the source files delivered with the OpenVINO™ toolkit. To learn how to do this, see the instruction in the [Inference Engine Code Samples Overview](../IE_DG/Samples_Overview.md) and [Demo Applications Overview](@ref omz_demos_README) sections. +You can also build all available sample code and demo applications from the source files delivered with the OpenVINO™ toolkit. To learn how to do this, see the instruction in the [Inference Engine Code Samples Overview](../IE_DG/Samples_Overview.md) and [Demo Applications Overview](@ref omz_demos) sections. ### Step 1: Download the Models @@ -209,7 +209,7 @@ You must have a model that is specific for you inference task. Example model typ - Custom (Often based on SSD) Options to find a model suitable for the OpenVINO™ toolkit are: -- Download public and Intel's pre-trained models from the [Open Model Zoo](https://github.com/opencv/open_model_zoo) using the [Model Downloader tool](@ref omz_tools_downloader_README). +- Download public and Intel's pre-trained models from the [Open Model Zoo](https://github.com/opencv/open_model_zoo) using the [Model Downloader tool](@ref omz_tools_downloader). - Download from GitHub*, Caffe* Zoo, TensorFlow* Zoo, and other resources. - Train your own model. 
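For the Model Downloader option above, a minimal invocation could look like the sketch below. The model name `squeezenet1.1` and the output directory are placeholders, Linux-style commands are shown for brevity, and flag names can differ between Open Model Zoo releases, so treat this as an illustration rather than the documented procedure:

```sh
# Fetch Open Model Zoo and install the downloader's Python dependencies
git clone --depth 1 https://github.com/openvinotoolkit/open_model_zoo
cd open_model_zoo/tools/downloader
python3 -m pip install -r requirements.in

# List the available models, then download one into a local folder
python3 downloader.py --print_all
python3 downloader.py --name squeezenet1.1 --output_dir ~/models
```

Intel pre-trained models download directly as IR files; public models downloaded this way still need a Model Optimizer pass before the Inference Engine can load them.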
@@ -425,7 +425,7 @@ classid probability label ### Step 5: Run the Security Barrier Camera Demo Application -> **NOTE**: The Security Barrier Camera Demo Application is automatically compiled when you run the Inference Pipeline demo scripts. If you want to build it manually, see the instructions in the [Demo Applications Overview](@ref omz_demos_README) section. +> **NOTE**: The Security Barrier Camera Demo Application is automatically compiled when you run the Inference Pipeline demo scripts. If you want to build it manually, see the instructions in the [Demo Applications Overview](@ref omz_demos) section. To run the **Security Barrier Camera Demo Application** using an input image on the prepared IRs: diff --git a/docs/how_tos/how-to-links.md b/docs/how_tos/how-to-links.md index 2f1840690ba..f263f22b5d2 100644 --- a/docs/how_tos/how-to-links.md +++ b/docs/how_tos/how-to-links.md @@ -44,7 +44,6 @@ To learn about what is *custom operation* and how to work with them in the Deep [![](https://img.youtube.com/vi/Kl1ptVb7aI8/0.jpg)](https://www.youtube.com/watch?v=Kl1ptVb7aI8) - ## Computer Vision with Intel [![](https://img.youtube.com/vi/FZZD4FCvO9c/0.jpg)](https://www.youtube.com/watch?v=FZZD4FCvO9c) diff --git a/docs/img/int8vsfp32.png b/docs/img/int8vsfp32.png index b4889ea2252..9ecbdc8be7b 100644 --- a/docs/img/int8vsfp32.png +++ b/docs/img/int8vsfp32.png @@ -1,3 +1,3 @@ version https://git-lfs.github.com/spec/v1 -oid sha256:0109b9cbc2908f786f6593de335c725f8ce5c800f37a7d79369408cc47eb8471 -size 25725 +oid sha256:e14f77f61f12c96ccf302667d51348a1e03579679155199910e3ebdf7d6adf06 +size 37915 diff --git a/docs/img/performance_benchmarks_ovms_01.png b/docs/img/performance_benchmarks_ovms_01.png new file mode 100644 index 00000000000..54473efc5b1 --- /dev/null +++ b/docs/img/performance_benchmarks_ovms_01.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:d86125db1e295334c04e92d0645c773f679d21bf52e25dce7c887fdf972b7a28 +size 19154 diff --git a/docs/img/performance_benchmarks_ovms_02.png b/docs/img/performance_benchmarks_ovms_02.png new file mode 100644 index 00000000000..1a39e7fbff6 --- /dev/null +++ b/docs/img/performance_benchmarks_ovms_02.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:bf8b156026d35b023e57c5cb3ea9136c93a819c1e2aa77be57d1619db4151065 +size 373890 diff --git a/docs/img/throughput_ovms_3dunet.png b/docs/img/throughput_ovms_3dunet.png new file mode 100644 index 00000000000..261310190a5 --- /dev/null +++ b/docs/img/throughput_ovms_3dunet.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:e5a472a62de53998194bc1471539139807e00cbb75fd9edc605e7ed99b5630af +size 18336 diff --git a/docs/img/throughput_ovms_bertlarge_fp32.png b/docs/img/throughput_ovms_bertlarge_fp32.png new file mode 100644 index 00000000000..8fb4e484e17 --- /dev/null +++ b/docs/img/throughput_ovms_bertlarge_fp32.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:2f7c58da93fc7966e154bdade48d408401b097f4b0306b7c85aa4256ad72b59d +size 18118 diff --git a/docs/img/throughput_ovms_bertlarge_int8.png b/docs/img/throughput_ovms_bertlarge_int8.png new file mode 100644 index 00000000000..90e6e3a9426 --- /dev/null +++ b/docs/img/throughput_ovms_bertlarge_int8.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:104d8cd5eac2d1714db85df9cba5c2cfcc113ec54d428cd6e979e75e10473be6 +size 17924 diff --git a/docs/img/throughput_ovms_resnet50_fp32.png b/docs/img/throughput_ovms_resnet50_fp32.png new file mode 100644 
index 00000000000..324acaf22ec --- /dev/null +++ b/docs/img/throughput_ovms_resnet50_fp32.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:3ad19ace847da73176f20f21052f9dd23fd65779f4e1027b2debdaf8fc772c00 +size 18735 diff --git a/docs/img/throughput_ovms_resnet50_int8.png b/docs/img/throughput_ovms_resnet50_int8.png new file mode 100644 index 00000000000..fdd92852fa9 --- /dev/null +++ b/docs/img/throughput_ovms_resnet50_int8.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:32116d6d1acc20d8cb2fa10e290e052e3146ba1290f1c5e4aaf16a85388b6ec6 +size 19387 diff --git a/docs/index.md b/docs/index.md index 17fe2451d3f..ee0739a1e1e 100644 --- a/docs/index.md +++ b/docs/index.md @@ -19,7 +19,7 @@ The following diagram illustrates the typical OpenVINO™ workflow (click to see ### Model Preparation, Conversion and Optimization You can use your framework of choice to prepare and train a Deep Learning model or just download a pretrained model from the Open Model Zoo. The Open Model Zoo includes Deep Learning solutions to a variety of vision problems, including object recognition, face recognition, pose estimation, text detection, and action recognition, at a range of measured complexities. -Several of these pretrained models are used also in the [code samples](IE_DG/Samples_Overview.md) and [application demos](@ref omz_demos_README). To download models from the Open Model Zoo, the [Model Downloader](@ref omz_tools_downloader_README) tool is used. +Several of these pretrained models are used also in the [code samples](IE_DG/Samples_Overview.md) and [application demos](@ref omz_demos). To download models from the Open Model Zoo, the [Model Downloader](@ref omz_tools_downloader) tool is used. One of the core component of the OpenVINO™ toolkit is the [Model Optimizer](MO_DG/Deep_Learning_Model_Optimizer_DevGuide.md) a cross-platform command-line tool that converts a trained neural network from its source framework to an open-source, nGraph-compatible [Intermediate Representation (IR)](MO_DG/IR_and_opsets.md) for use in inference operations. The Model Optimizer imports models trained in popular frameworks such as Caffe*, TensorFlow*, MXNet*, Kaldi*, and ONNX* and performs a few optimizations to remove excess layers and group operations when possible into simpler, faster graphs. @@ -27,16 +27,17 @@ tool that converts a trained neural network from its source framework to an open If your neural network model contains layers that are not in the list of known layers for supported frameworks, you can adjust the conversion and optimization process through use of [Custom Layers](HOWTO/Custom_Layers_Guide.md). -Run the [Accuracy Checker utility](@ref omz_tools_accuracy_checker_README) either against source topologies or against the output representation to evaluate the accuracy of inference. The Accuracy Checker is also part of the [Deep Learning Workbench](@ref workbench_docs_Workbench_DG_Introduction), an integrated web-based performance analysis studio. +Run the [Accuracy Checker utility](@ref omz_tools_accuracy_checker) either against source topologies or against the output representation to evaluate the accuracy of inference. The Accuracy Checker is also part of the [Deep Learning Workbench](@ref workbench_docs_Workbench_DG_Introduction), an integrated web-based performance analysis studio. 
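To make the conversion step concrete, a frozen TensorFlow model might be turned into an IR roughly as shown below. The install path, model file, input shape, and precision are placeholder assumptions; check the Model Optimizer Developer Guide for the exact flags supported by your release:

```sh
# Set up the OpenVINO environment, then convert a frozen TensorFlow graph to IR
source /opt/intel/openvino_2021/bin/setupvars.sh
python3 /opt/intel/openvino_2021/deployment_tools/model_optimizer/mo.py \
    --input_model frozen_model.pb \
    --input_shape "[1,224,224,3]" \
    --data_type FP16 \
    --output_dir ./ir_models
```

The output is a pair of `.xml` (topology) and `.bin` (weights) files that the Inference Engine loads directly.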
Useful documents for model optimization: * [Model Optimizer Developer Guide](MO_DG/Deep_Learning_Model_Optimizer_DevGuide.md) * [Intermediate Representation and Opsets](MO_DG/IR_and_opsets.md) * [Custom Layers Guide](HOWTO/Custom_Layers_Guide.md) -* [Accuracy Checker utility](@ref omz_tools_accuracy_checker_README) +* [Accuracy Checker utility](@ref omz_tools_accuracy_checker) * [Deep Learning Workbench](@ref workbench_docs_Workbench_DG_Introduction) -* [Model Downloader](@ref omz_tools_downloader_README) utility -* [Pretrained Models (Open Model Zoo)](@ref omz_models_public_index) +* [Model Downloader](@ref omz_tools_downloader) utility +* [Intel's Pretrained Models (Open Model Zoo)](@ref omz_models_group_intel) +* [Public Pretrained Models (Open Model Zoo)](@ref omz_models_group_public) ### Running and Tuning Inference The other core component of OpenVINO™ is the [Inference Engine](IE_DG/Deep_Learning_Inference_Engine_DevGuide.md), which manages the loading and compiling of the optimized neural network model, runs inference operations on input data, and outputs the results. Inference Engine can execute synchronously or asynchronously, and its plugin architecture manages the appropriate compilations for execution on multiple Intel® devices, including both workhorse CPUs and specialized graphics and video processing platforms (see below, Packaging and Deployment). @@ -46,7 +47,7 @@ You can use OpenVINO™ Tuning Utilities with the Inference Engine to trial and For a full browser-based studio integrating these other key tuning utilities, try the [Deep Learning Workbench](@ref workbench_docs_Workbench_DG_Introduction). ![](img/OV-diagram-step3.png) -OpenVINO™ toolkit includes a set of [inference code samples](IE_DG/Samples_Overview.md) and [application demos](@ref omz_demos_README) showing how inference is run and output processed for use in retail environments, classrooms, smart camera applications, and other solutions. +OpenVINO™ toolkit includes a set of [inference code samples](IE_DG/Samples_Overview.md) and [application demos](@ref omz_demos) showing how inference is run and output processed for use in retail environments, classrooms, smart camera applications, and other solutions. OpenVINO also makes use of open-Source and Intel™ tools for traditional graphics processing and performance management. Intel® Media SDK supports accelerated rich-media processing, including transcoding. OpenVINO™ optimizes calls to the rich OpenCV and OpenVX libraries for processing computer vision workloads. And the new DL Streamer integration further accelerates video pipelining and performance. 
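As a quick way to start the tuning loop described above, the Benchmark App built with the samples can be pointed at an IR. The build directory, model path, and flags below reflect a default Linux setup and are assumptions, not a definitive recipe:

```sh
# Measure asynchronous throughput of an IR on the CPU device
source /opt/intel/openvino_2021/bin/setupvars.sh
cd ~/inference_engine_cpp_samples_build/intel64/Release
./benchmark_app -m ~/ir_models/frozen_model.xml -d CPU -api async -niter 100
```

Changing `-d CPU` to another plugin name is typically all that is needed to compare devices, which is exactly what the plugin architecture is meant to make easy.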
@@ -54,7 +55,7 @@ Useful documents for inference tuning: * [Inference Engine Developer Guide](IE_DG/Deep_Learning_Inference_Engine_DevGuide.md) * [Inference Engine API References](./api_references.html) * [Inference Code Samples](IE_DG/Samples_Overview.md) -* [Application Demos](@ref omz_demos_README) +* [Application Demos](@ref omz_demos) * [Post-Training Optimization Tool Guide](@ref pot_README) * [Deep Learning Workbench Guide](@ref workbench_docs_Workbench_DG_Introduction) * [Intel Media SDK](https://github.com/Intel-Media-SDK/MediaSDK) @@ -82,15 +83,15 @@ The Inference Engine's plug-in architecture can be extended to meet other specia Intel® Distribution of OpenVINO™ toolkit includes the following components: - [Deep Learning Model Optimizer](MO_DG/Deep_Learning_Model_Optimizer_DevGuide.md) - A cross-platform command-line tool for importing models and preparing them for optimal execution with the Inference Engine. The Model Optimizer imports, converts, and optimizes models, which were trained in popular frameworks, such as Caffe*, TensorFlow*, MXNet*, Kaldi*, and ONNX*. -- [Deep Learning Inference Engine](IE_DG/inference_engine_intro.md) - A unified API to allow high performance inference on many hardware types including Intel® CPU, Intel® Integrated Graphics, Intel® Neural Compute Stick 2, Intel® Vision Accelerator Design with Intel® Movidius™ vision processing unit (VPU). +- [Deep Learning Inference Engine](IE_DG/Deep_Learning_Inference_Engine_DevGuide.md) - A unified API to allow high performance inference on many hardware types including Intel® CPU, Intel® Integrated Graphics, Intel® Neural Compute Stick 2, Intel® Vision Accelerator Design with Intel® Movidius™ vision processing unit (VPU). - [Inference Engine Samples](IE_DG/Samples_Overview.md) - A set of simple console applications demonstrating how to use the Inference Engine in your applications. - [Deep Learning Workbench](@ref workbench_docs_Workbench_DG_Introduction) - A web-based graphical environment that allows you to easily use various sophisticated OpenVINO™ toolkit components. - [Post-Training Optimization tool](@ref pot_README) - A tool to calibrate a model and then execute it in the INT8 precision. - Additional Tools - A set of tools to work with your models including [Benchmark App](../inference-engine/tools/benchmark_tool/README.md), [Cross Check Tool](../inference-engine/tools/cross_check_tool/README.md), [Compile tool](../inference-engine/tools/compile_tool/README.md). -- [Open Model Zoo](@ref omz_models_intel_index) - - [Demos](@ref omz_demos_README) - Console applications that provide robust application templates to help you implement specific deep learning scenarios. - - Additional Tools - A set of tools to work with your models including [Accuracy Checker Utility](@ref omz_tools_accuracy_checker_README) and [Model Downloader](@ref omz_tools_downloader_README). - - [Documentation for Pretrained Models](@ref omz_models_intel_index) - Documentation for pretrained models that are available in the [Open Model Zoo repository](https://github.com/opencv/open_model_zoo). +- [Open Model Zoo](@ref omz_models_group_intel) + - [Demos](@ref omz_demos) - Console applications that provide robust application templates to help you implement specific deep learning scenarios. + - Additional Tools - A set of tools to work with your models including [Accuracy Checker Utility](@ref omz_tools_accuracy_checker) and [Model Downloader](@ref omz_tools_downloader). 
+ - [Documentation for Pretrained Models](@ref omz_models_group_intel) - Documentation for pretrained models that are available in the [Open Model Zoo repository](https://github.com/opencv/open_model_zoo). - Deep Learning Streamer (DL Streamer) – Streaming analytics framework, based on GStreamer, for constructing graphs of media analytics components. DL Streamer can be installed by the Intel® Distribution of OpenVINO™ toolkit installer. Its open source version is available on [GitHub](https://github.com/opencv/gst-video-analytics). For the DL Streamer documentation, see: - [DL Streamer Samples](@ref gst_samples_README) - [API Reference](https://openvinotoolkit.github.io/dlstreamer_gst/) diff --git a/docs/install_guides/installing-openvino-apt.md b/docs/install_guides/installing-openvino-apt.md index 812c6195f2c..66518696991 100644 --- a/docs/install_guides/installing-openvino-apt.md +++ b/docs/install_guides/installing-openvino-apt.md @@ -6,6 +6,31 @@ This guide provides installation steps for Intel® Distribution of OpenVINO™ t > **NOTE**: Intel® Graphics Compute Runtime for OpenCL™ is not a part of OpenVINO™ APT distribution. You can install it from the [Intel® Graphics Compute Runtime for OpenCL™ GitHub repo](https://github.com/intel/compute-runtime). +## Included with Runtime Package + +The following components are installed with the OpenVINO runtime package: + +| Component | Description| +|-----------|------------| +| [Inference Engine](../IE_DG/Deep_Learning_Inference_Engine_DevGuide.md)| The engine that runs a deep learning model. It includes a set of libraries for an easy inference integration into your applications. | +| [OpenCV*](https://docs.opencv.org/master/) | OpenCV* community version compiled for Intel® hardware. | +| Deep Learning Streamer (DL Streamer) | Streaming analytics framework, based on GStreamer, for constructing graphs of media analytics components. For the DL Streamer documentation, see [DL Streamer Samples](@ref gst_samples_README), [API Reference](https://openvinotoolkit.github.io/dlstreamer_gst/), [Elements](https://github.com/opencv/gst-video-analytics/wiki/Elements), [Tutorial](https://github.com/opencv/gst-video-analytics/wiki/DL%20Streamer%20Tutorial). | + +## Included with Developer Package + +The following components are installed with the OpenVINO developer package: + +| Component | Description| +|-----------|------------| +| [Model Optimizer](../MO_DG/Deep_Learning_Model_Optimizer_DevGuide.md) | This tool imports, converts, and optimizes models that were trained in popular frameworks to a format usable by Intel tools, especially the Inference Engine. 
Popular frameworks include Caffe\*, TensorFlow\*, MXNet\*, and ONNX\*. | +| [Inference Engine](../IE_DG/Deep_Learning_Inference_Engine_DevGuide.md) | The engine that runs a deep learning model. It includes a set of libraries for an easy inference integration into your applications.| +| [OpenCV*](https://docs.opencv.org/master/) | OpenCV\* community version compiled for Intel® hardware | +| [Sample Applications](../IE_DG/Samples_Overview.md) | A set of simple console applications demonstrating how to use the Inference Engine in your applications. | +| [Demo Applications](@ref omz_demos) | A set of console applications that demonstrate how you can use the Inference Engine in your applications to solve specific use cases. | +| Additional Tools | A set of tools to work with your models including [Accuracy Checker utility](@ref omz_tools_accuracy_checker), [Post-Training Optimization Tool Guide](@ref pot_README), [Model Downloader](@ref omz_tools_downloader) and other | +| [Documentation for Pre-Trained Models ](@ref omz_models_group_intel) | Documentation for the pre-trained models available in the [Open Model Zoo repo](https://github.com/opencv/open_model_zoo). | +| Deep Learning Streamer (DL Streamer) | Streaming analytics framework, based on GStreamer\*, for constructing graphs of media analytics components. For the DL Streamer documentation, see [DL Streamer Samples](@ref gst_samples_README), [API Reference](https://openvinotoolkit.github.io/dlstreamer_gst/), [Elements](https://github.com/opencv/gst-video-analytics/wiki/Elements), [Tutorial](https://github.com/opencv/gst-video-analytics/wiki/DL%20Streamer%20Tutorial). | + ## Set up the Repository ### Install the GPG key for the repository @@ -76,7 +101,7 @@ apt-cache search openvino ## Install the runtime or developer packages using the APT Package Manager Intel® OpenVINO will be installed in: `/opt/intel/openvino_..` -A symlink will be created: `/opt/intel/openvino` +A symlink will be created: `/opt/intel/openvino_` --- ### To Install a specific version diff --git a/docs/install_guides/installing-openvino-docker-linux.md b/docs/install_guides/installing-openvino-docker-linux.md index 7f301c5f795..12eeb0c2831 100644 --- a/docs/install_guides/installing-openvino-docker-linux.md +++ b/docs/install_guides/installing-openvino-docker-linux.md @@ -10,8 +10,8 @@ This guide provides the steps for creating a Docker* image with Intel® Distribu - Ubuntu\* 18.04 long-term support (LTS), 64-bit - Ubuntu\* 20.04 long-term support (LTS), 64-bit -- CentOS\* 7 -- RHEL\* 8 +- CentOS\* 7.6 +- Red Hat* Enterprise Linux* 8.2 (64 bit) **Host Operating Systems** @@ -144,7 +144,7 @@ RUN /bin/mkdir -p '/usr/local/lib' && \ WORKDIR /opt/libusb-1.0.22/ RUN /usr/bin/install -c -m 644 libusb-1.0.pc '/usr/local/lib/pkgconfig' && \ - cp /opt/intel/openvino/deployment_tools/inference_engine/external/97-myriad-usbboot.rules /etc/udev/rules.d/ && \ + cp /opt/intel/openvino_2021/deployment_tools/inference_engine/external/97-myriad-usbboot.rules /etc/udev/rules.d/ && \ ldconfig ``` - **CentOS 7**: @@ -175,11 +175,11 @@ RUN /bin/mkdir -p '/usr/local/lib' && \ /bin/mkdir -p '/usr/local/include/libusb-1.0' && \ /usr/bin/install -c -m 644 libusb.h '/usr/local/include/libusb-1.0' && \ /bin/mkdir -p '/usr/local/lib/pkgconfig' && \ - printf "\nexport LD_LIBRARY_PATH=\${LD_LIBRARY_PATH}:/usr/local/lib\n" >> /opt/intel/openvino/bin/setupvars.sh + printf "\nexport LD_LIBRARY_PATH=\${LD_LIBRARY_PATH}:/usr/local/lib\n" >> /opt/intel/openvino_2021/bin/setupvars.sh WORKDIR 
/opt/libusb-1.0.22/ RUN /usr/bin/install -c -m 644 libusb-1.0.pc '/usr/local/lib/pkgconfig' && \ - cp /opt/intel/openvino/deployment_tools/inference_engine/external/97-myriad-usbboot.rules /etc/udev/rules.d/ && \ + cp /opt/intel/openvino_2021/deployment_tools/inference_engine/external/97-myriad-usbboot.rules /etc/udev/rules.d/ && \ ldconfig ``` 2. Run the Docker* image: diff --git a/docs/install_guides/installing-openvino-linux-ivad-vpu.md b/docs/install_guides/installing-openvino-linux-ivad-vpu.md index ab2962542d8..cd86804307c 100644 --- a/docs/install_guides/installing-openvino-linux-ivad-vpu.md +++ b/docs/install_guides/installing-openvino-linux-ivad-vpu.md @@ -11,9 +11,9 @@ For Intel® Vision Accelerator Design with Intel® Movidius™ VPUs, the followi 1. Set the environment variables: ```sh -source /opt/intel/openvino/bin/setupvars.sh +source /opt/intel/openvino_2021/bin/setupvars.sh ``` -> **NOTE**: The `HDDL_INSTALL_DIR` variable is set to `/deployment_tools/inference_engine/external/hddl`. If you installed the Intel® Distribution of OpenVINO™ to the default install directory, the `HDDL_INSTALL_DIR` was set to `/opt/intel/openvino//deployment_tools/inference_engine/external/hddl`. +> **NOTE**: The `HDDL_INSTALL_DIR` variable is set to `/deployment_tools/inference_engine/external/hddl`. If you installed the Intel® Distribution of OpenVINO™ to the default install directory, the `HDDL_INSTALL_DIR` was set to `/opt/intel/openvino_2021//deployment_tools/inference_engine/external/hddl`. 2. Install dependencies: ```sh @@ -52,7 +52,7 @@ E: [ncAPI] [ 965618] [MainThread] ncDeviceOpen:677 Failed to find a device, ```sh kill -9 $(pidof hddldaemon autoboot) pidof hddldaemon autoboot # Make sure none of them is alive -source /opt/intel/openvino/bin/setupvars.sh +source /opt/intel/openvino_2021/bin/setupvars.sh ${HDDL_INSTALL_DIR}/bin/bsl_reset ``` diff --git a/docs/install_guides/installing-openvino-linux.md b/docs/install_guides/installing-openvino-linux.md index df4c0413152..955a50a0bae 100644 --- a/docs/install_guides/installing-openvino-linux.md +++ b/docs/install_guides/installing-openvino-linux.md @@ -22,24 +22,24 @@ The Intel® Distribution of OpenVINO™ toolkit for Linux\*: | Component | Description | |-----------------------------------------------------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| | [Model Optimizer](../MO_DG/Deep_Learning_Model_Optimizer_DevGuide.md) | This tool imports, converts, and optimizes models that were trained in popular frameworks to a format usable by Intel tools, especially the Inference Engine. 
Popular frameworks include Caffe\*, TensorFlow\*, MXNet\*, and ONNX\*. | -| [Inference Engine](../IE_DG/inference_engine_intro.md) | This is the engine that runs the deep learning model. It includes a set of libraries for an easy inference integration into your applications. | +| [Inference Engine](../IE_DG/Deep_Learning_Inference_Engine_DevGuide.md) | This is the engine that runs the deep learning model. It includes a set of libraries for an easy inference integration into your applications. | | Intel® Media SDK | Offers access to hardware accelerated video codecs and frame processing | | [OpenCV](https://docs.opencv.org/master/) | OpenCV\* community version compiled for Intel® hardware | | [Inference Engine Code Samples](../IE_DG/Samples_Overview.md) | A set of simple console applications demonstrating how to utilize specific OpenVINO capabilities in an application and how to perform specific tasks, such as loading a model, running inference, querying specific device capabilities, and more. | -| [Demo Applications](@ref omz_demos_README) | A set of simple console applications that provide robust application templates to help you implement specific deep learning scenarios. | -| Additional Tools | A set of tools to work with your models including [Accuracy Checker utility](@ref omz_tools_accuracy_checker_README), [Post-Training Optimization Tool Guide](@ref pot_README), [Model Downloader](@ref omz_tools_downloader_README) and other | -| [Documentation for Pre-Trained Models ](@ref omz_models_intel_index) | Documentation for the pre-trained models available in the [Open Model Zoo repo](https://github.com/opencv/open_model_zoo). | +| [Demo Applications](@ref omz_demos) | A set of simple console applications that provide robust application templates to help you implement specific deep learning scenarios. | +| Additional Tools | A set of tools to work with your models including [Accuracy Checker utility](@ref omz_tools_accuracy_checker), [Post-Training Optimization Tool Guide](@ref pot_README), [Model Downloader](@ref omz_tools_downloader) and other | +| [Documentation for Pre-Trained Models ](@ref omz_models_group_intel) | Documentation for the pre-trained models available in the [Open Model Zoo repo](https://github.com/opencv/open_model_zoo). | | Deep Learning Streamer (DL Streamer) | Streaming analytics framework, based on GStreamer, for constructing graphs of media analytics components. For the DL Streamer documentation, see [DL Streamer Samples](@ref gst_samples_README), [API Reference](https://openvinotoolkit.github.io/dlstreamer_gst/), [Elements](https://github.com/opencv/gst-video-analytics/wiki/Elements), [Tutorial](https://github.com/opencv/gst-video-analytics/wiki/DL%20Streamer%20Tutorial). | **Could Be Optionally Installed** [Deep Learning Workbench](@ref workbench_docs_Workbench_DG_Introduction) (DL Workbench) is a platform built upon OpenVINO™ and provides a web-based graphical environment that enables you to optimize, fine-tune, analyze, visualize, and compare performance of deep learning models on various Intel® architecture configurations. 
In the DL Workbench, you can use most of OpenVINO™ toolkit components: -* [Model Downloader](@ref omz_tools_downloader_README) -* [Intel® Open Model Zoo](@ref omz_models_intel_index) +* [Model Downloader](@ref omz_tools_downloader) +* [Intel® Open Model Zoo](@ref omz_models_group_intel) * [Model Optimizer](../MO_DG/Deep_Learning_Model_Optimizer_DevGuide.md) * [Post-training Optimization Tool](@ref pot_README) -* [Accuracy Checker](@ref omz_tools_accuracy_checker_README) +* [Accuracy Checker](@ref omz_tools_accuracy_checker) * [Benchmark Tool](../../inference-engine/samples/benchmark_app/README.md) Proceed to an [easy installation from Docker](@ref workbench_docs_Workbench_DG_Install_from_Docker_Hub) to get started. @@ -49,7 +49,6 @@ Proceed to an [easy installation from Docker](@ref workbench_docs_Workbench_DG_I **Hardware** * 6th to 11th generation Intel® Core™ processors and Intel® Xeon® processors -* Intel® Xeon® processor E family (formerly code named Sandy Bridge, Ivy Bridge, Haswell, and Broadwell) * 3rd generation Intel® Xeon® Scalable processor (formerly code named Cooper Lake) * Intel® Xeon® Scalable processor (formerly Skylake and Cascade Lake) * Intel Atom® processor with support for Intel® Streaming SIMD Extensions 4.1 (Intel® SSE4.1) @@ -67,6 +66,7 @@ Proceed to an [easy installation from Docker](@ref workbench_docs_Workbench_DG_I **Operating Systems** - Ubuntu 18.04.x long-term support (LTS), 64-bit +- Ubuntu 20.04.0 long-term support (LTS), 64-bit - CentOS 7.6, 64-bit (for target only) - Yocto Project v3.0, 64-bit (for target only and requires modifications) @@ -415,7 +415,7 @@ trusted-host = mirrors.aliyun.com - [Model Optimizer Developer Guide](../MO_DG/Deep_Learning_Model_Optimizer_DevGuide.md). - [Inference Engine Developer Guide](../IE_DG/Deep_Learning_Inference_Engine_DevGuide.md). - For more information on Sample Applications, see the [Inference Engine Samples Overview](../IE_DG/Samples_Overview.md). -- For information on a set of pre-trained models, see the [Overview of OpenVINO™ Toolkit Pre-Trained Models](@ref omz_models_intel_index) +- For information on a set of pre-trained models, see the [Overview of OpenVINO™ Toolkit Pre-Trained Models](@ref omz_models_group_intel) - For IoT Libraries and Code Samples see the [Intel® IoT Developer Kit](https://github.com/intel-iot-devkit). To learn more about converting models, go to: diff --git a/docs/install_guides/installing-openvino-macos.md b/docs/install_guides/installing-openvino-macos.md index 9489d3a3732..0797d625ca8 100644 --- a/docs/install_guides/installing-openvino-macos.md +++ b/docs/install_guides/installing-openvino-macos.md @@ -24,22 +24,22 @@ The following components are installed by default: | Component | Description | | :-------------------------------------------------------------------------------------------------- | :----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | [Model Optimizer](../MO_DG/Deep_Learning_Model_Optimizer_DevGuide.md) | This tool imports, converts, and optimizes models, which were trained in popular frameworks, to a format usable by Intel tools, especially the Inference Engine.
Popular frameworks include Caffe*, TensorFlow*, MXNet\*, and ONNX\*. | -| [Inference Engine](../IE_DG/inference_engine_intro.md) | This is the engine that runs a deep learning model. It includes a set of libraries for an easy inference integration into your applications. | +| [Inference Engine](../IE_DG/Deep_Learning_Inference_Engine_DevGuide.md) | This is the engine that runs a deep learning model. It includes a set of libraries for an easy inference integration into your applications. | | [OpenCV\*](https://docs.opencv.org/master/) | OpenCV\* community version compiled for Intel® hardware | | [Sample Applications](../IE_DG/Samples_Overview.md) | A set of simple console applications demonstrating how to use the Inference Engine in your applications. | -| [Demos](@ref omz_demos_README) | A set of console applications that demonstrate how you can use the Inference Engine in your applications to solve specific use-cases | -| Additional Tools | A set of tools to work with your models including [Accuracy Checker utility](@ref omz_tools_accuracy_checker_README), [Post-Training Optimization Tool Guide](@ref pot_README), [Model Downloader](@ref omz_tools_downloader_README) and other | -| [Documentation for Pre-Trained Models ](@ref omz_models_intel_index) | Documentation for the pre-trained models available in the [Open Model Zoo repo](https://github.com/opencv/open_model_zoo) | +| [Demos](@ref omz_demos) | A set of console applications that demonstrate how you can use the Inference Engine in your applications to solve specific use-cases | +| Additional Tools | A set of tools to work with your models including [Accuracy Checker utility](@ref omz_tools_accuracy_checker), [Post-Training Optimization Tool Guide](@ref pot_README), [Model Downloader](@ref omz_tools_downloader) and other | +| [Documentation for Pre-Trained Models ](@ref omz_models_group_intel) | Documentation for the pre-trained models available in the [Open Model Zoo repo](https://github.com/opencv/open_model_zoo) | **Could Be Optionally Installed** [Deep Learning Workbench](@ref workbench_docs_Workbench_DG_Introduction) (DL Workbench) is a platform built upon OpenVINO™ and provides a web-based graphical environment that enables you to optimize, fine-tune, analyze, visualize, and compare performance of deep learning models on various Intel® architecture configurations. In the DL Workbench, you can use most of OpenVINO™ toolkit components: -* [Model Downloader](@ref omz_tools_downloader_README) -* [Intel® Open Model Zoo](@ref omz_models_intel_index) +* [Model Downloader](@ref omz_tools_downloader) +* [Intel® Open Model Zoo](@ref omz_models_group_intel) * [Model Optimizer](../MO_DG/Deep_Learning_Model_Optimizer_DevGuide.md) * [Post-training Optimization Tool](@ref pot_README) -* [Accuracy Checker](@ref omz_tools_accuracy_checker_README) +* [Accuracy Checker](@ref omz_tools_accuracy_checker) * [Benchmark Tool](../../inference-engine/samples/benchmark_app/README.md) Proceed to an [easy installation from Docker](@ref workbench_docs_Workbench_DG_Install_from_Docker_Hub) to get started. @@ -53,7 +53,6 @@ The development and target platforms have the same requirements, but you can sel > **NOTE**: The current version of the Intel® Distribution of OpenVINO™ toolkit for macOS* supports inference on Intel CPUs and Intel® Neural Compute Sticks 2 only. 
* 6th to 11th generation Intel® Core™ processors and Intel® Xeon® processors -* Intel® Xeon® processor E family (formerly code named Sandy Bridge, Ivy Bridge, Haswell, and Broadwell) * 3rd generation Intel® Xeon® Scalable processor (formerly code named Cooper Lake) * Intel® Xeon® Scalable processor (formerly Skylake and Cascade Lake) * Intel® Neural Compute Stick 2 @@ -280,7 +279,7 @@ Follow the steps below to uninstall the Intel® Distribution of OpenVINO™ Tool - To learn more about the verification applications, see `README.txt` in `/opt/intel/openvino_2021/deployment_tools/demo/`. -- For detailed description of the pre-trained models, go to the [Overview of OpenVINO toolkit Pre-Trained Models](@ref omz_models_intel_index) page. +- For detailed description of the pre-trained models, go to the [Overview of OpenVINO toolkit Pre-Trained Models](@ref omz_models_group_intel) page. - More information on [sample applications](../IE_DG/Samples_Overview.md). diff --git a/docs/install_guides/installing-openvino-raspbian.md b/docs/install_guides/installing-openvino-raspbian.md index eade02a472d..0695ef9e772 100644 --- a/docs/install_guides/installing-openvino-raspbian.md +++ b/docs/install_guides/installing-openvino-raspbian.md @@ -18,7 +18,7 @@ The OpenVINO toolkit for Raspbian OS is an archive with pre-installed header fil | Component | Description | | :-------------------------------------------------------------------------------------------------- | :----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| [Inference Engine](../IE_DG/inference_engine_intro.md) | This is the engine that runs the deep learning model. It includes a set of libraries for an easy inference integration into your applications. | +| [Inference Engine](../IE_DG/Deep_Learning_Inference_Engine_DevGuide.md) | This is the engine that runs the deep learning model. It includes a set of libraries for an easy inference integration into your applications. | | [OpenCV\*](https://docs.opencv.org/master/) | OpenCV\* community version compiled for Intel® hardware. | | [Sample Applications](../IE_DG/Samples_Overview.md) | A set of simple console applications demonstrating how to use Intel's Deep Learning Inference Engine in your applications. | @@ -94,12 +94,12 @@ CMake is installed. Continue to the next section to set the environment variable You must update several environment variables before you can compile and run OpenVINO toolkit applications. Run the following script to temporarily set the environment variables: ```sh -source /opt/intel/openvino/bin/setupvars.sh +source /opt/intel/openvino_2021/bin/setupvars.sh ``` **(Optional)** The OpenVINO environment variables are removed when you close the shell. As an option, you can permanently set the environment variables as follows: ```sh -echo "source /opt/intel/openvino/bin/setupvars.sh" >> ~/.bashrc +echo "source /opt/intel/openvino_2021/bin/setupvars.sh" >> ~/.bashrc ``` To test your change, open a new terminal. You will see the following: @@ -118,11 +118,11 @@ Continue to the next section to add USB rules for Intel® Neural Compute Stick 2 Log out and log in for it to take effect. 2. 
If you didn't modify `.bashrc` to permanently set the environment variables, run `setupvars.sh` again after logging in: ```sh - source /opt/intel/openvino/bin/setupvars.sh + source /opt/intel/openvino_2021/bin/setupvars.sh ``` 3. To perform inference on the Intel® Neural Compute Stick 2, install the USB rules running the `install_NCS_udev_rules.sh` script: ```sh - sh /opt/intel/openvino/install_dependencies/install_NCS_udev_rules.sh + sh /opt/intel/openvino_2021/install_dependencies/install_NCS_udev_rules.sh ``` 4. Plug in your Intel® Neural Compute Stick 2. @@ -138,14 +138,13 @@ Follow the next steps to run pre-trained Face Detection network using Inference ``` 2. Build the Object Detection Sample: ```sh - cmake -DCMAKE_BUILD_TYPE=Release -DCMAKE_CXX_FLAGS="-march=armv7-a" /opt/intel/openvino/deployment_tools/inference_engine/samples/cpp + cmake -DCMAKE_BUILD_TYPE=Release -DCMAKE_CXX_FLAGS="-march=armv7-a" /opt/intel/openvino_2021/deployment_tools/inference_engine/samples/cpp ``` - ```sh make -j2 object_detection_sample_ssd ``` 3. Download the pre-trained Face Detection model with the Model Downloader or copy it from the host machine: - ```sh + ```sh git clone --depth 1 https://github.com/openvinotoolkit/open_model_zoo cd open_model_zoo/tools/downloader python3 -m pip install -r requirements.in @@ -165,9 +164,9 @@ Read the next topic if you want to learn more about OpenVINO workflow for Raspbe If you want to use your model for inference, the model must be converted to the .bin and .xml Intermediate Representation (IR) files that are used as input by Inference Engine. OpenVINO™ toolkit support on Raspberry Pi only includes the Inference Engine module of the Intel® Distribution of OpenVINO™ toolkit. The Model Optimizer is not supported on this platform. To get the optimized models you can use one of the following options: -* Download public and Intel's pre-trained models from the [Open Model Zoo](https://github.com/opencv/open_model_zoo) using [Model Downloader tool](@ref omz_tools_downloader_README). +* Download public and Intel's pre-trained models from the [Open Model Zoo](https://github.com/opencv/open_model_zoo) using [Model Downloader tool](@ref omz_tools_downloader). - For more information on pre-trained models, see [Pre-Trained Models Documentation](@ref omz_models_intel_index) + For more information on pre-trained models, see [Pre-Trained Models Documentation](@ref omz_models_group_intel) * Convert the model using the Model Optimizer from a full installation of Intel® Distribution of OpenVINO™ toolkit on one of the supported platforms. Installation instructions are available: diff --git a/docs/install_guides/installing-openvino-windows.md b/docs/install_guides/installing-openvino-windows.md index 8de98761d15..56e963d1ea4 100644 --- a/docs/install_guides/installing-openvino-windows.md +++ b/docs/install_guides/installing-openvino-windows.md @@ -16,11 +16,10 @@ Your installation is complete when these are all completed: 2. Install the dependencies: - - [Microsoft Visual Studio* with C++ **2019 or 2017** with MSBuild](http://visualstudio.microsoft.com/downloads/) - - [CMake **3.10 or higher** 64-bit](https://cmake.org/download/) - > **NOTE**: If you want to use Microsoft Visual Studio 2019, you are required to install CMake 3.14. 
+ - [Microsoft Visual Studio* 2019 with MSBuild](http://visualstudio.microsoft.com/downloads/) + - [CMake 3.14 or higher 64-bit](https://cmake.org/download/) - [Python **3.6** - **3.8** 64-bit](https://www.python.org/downloads/windows/) - > **IMPORTANT**: As part of this installation, make sure you click the option to add the application to your `PATH` environment variable. + > **IMPORTANT**: As part of this installation, make sure you click the option **[Add Python 3.x to PATH](https://docs.python.org/3/using/windows.html#installation-steps)** to add Python to your `PATH` environment variable. 3. Set Environment Variables @@ -58,22 +57,22 @@ The following components are installed by default: | Component | Description | |:---------------------------------------------------------------------------------------------------|:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| |[Model Optimizer](../MO_DG/Deep_Learning_Model_Optimizer_DevGuide.md) |This tool imports, converts, and optimizes models that were trained in popular frameworks to a format usable by Intel tools, especially the Inference Engine.
NOTE: Popular frameworks include such frameworks as Caffe\*, TensorFlow\*, MXNet\*, and ONNX\*. | -|[Inference Engine](../IE_DG/inference_engine_intro.md) |This is the engine that runs the deep learning model. It includes a set of libraries for an easy inference integration into your applications. | +|[Inference Engine](../IE_DG/Deep_Learning_Inference_Engine_DevGuide.md) |This is the engine that runs the deep learning model. It includes a set of libraries for an easy inference integration into your applications. | |[OpenCV\*](https://docs.opencv.org/master/) |OpenCV* community version compiled for Intel® hardware | |[Inference Engine Samples](../IE_DG/Samples_Overview.md) |A set of simple console applications demonstrating how to use Intel's Deep Learning Inference Engine in your applications. | -| [Demos](@ref omz_demos_README) | A set of console applications that demonstrate how you can use the Inference Engine in your applications to solve specific use-cases | -| Additional Tools | A set of tools to work with your models including [Accuracy Checker utility](@ref omz_tools_accuracy_checker_README), [Post-Training Optimization Tool Guide](@ref pot_README), [Model Downloader](@ref omz_tools_downloader_README) and other | -| [Documentation for Pre-Trained Models ](@ref omz_models_intel_index) | Documentation for the pre-trained models available in the [Open Model Zoo repo](https://github.com/opencv/open_model_zoo) | +| [Demos](@ref omz_demos) | A set of console applications that demonstrate how you can use the Inference Engine in your applications to solve specific use-cases | +| Additional Tools | A set of tools to work with your models including [Accuracy Checker utility](@ref omz_tools_accuracy_checker), [Post-Training Optimization Tool Guide](@ref pot_README), [Model Downloader](@ref omz_tools_downloader) and other | +| [Documentation for Pre-Trained Models ](@ref omz_models_group_intel) | Documentation for the pre-trained models available in the [Open Model Zoo repo](https://github.com/opencv/open_model_zoo) | **Could Be Optionally Installed** [Deep Learning Workbench](@ref workbench_docs_Workbench_DG_Introduction) (DL Workbench) is a platform built upon OpenVINO™ and provides a web-based graphical environment that enables you to optimize, fine-tune, analyze, visualize, and compare performance of deep learning models on various Intel® architecture configurations. In the DL Workbench, you can use most of OpenVINO™ toolkit components: -* [Model Downloader](@ref omz_tools_downloader_README) -* [Intel® Open Model Zoo](@ref omz_models_intel_index) +* [Model Downloader](@ref omz_tools_downloader) +* [Intel® Open Model Zoo](@ref omz_models_group_intel) * [Model Optimizer](../MO_DG/Deep_Learning_Model_Optimizer_DevGuide.md) * [Post-training Optimization Tool](@ref pot_README) -* [Accuracy Checker](@ref omz_tools_accuracy_checker_README) +* [Accuracy Checker](@ref omz_tools_accuracy_checker) * [Benchmark Tool](../../inference-engine/samples/benchmark_app/README.md) Proceed to an [easy installation from Docker](@ref workbench_docs_Workbench_DG_Install_from_Docker_Hub) to get started. 
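For reference, the Docker route linked above usually amounts to pulling and starting a single container; the image name, tag, and port shown here are assumptions, so follow the linked installation guide for the authoritative command:

```sh
# Pull the DL Workbench image, start it, then open http://127.0.0.1:5665 in a browser
docker pull openvino/workbench:latest
docker run -p 127.0.0.1:5665:5665 --name workbench -it openvino/workbench:latest
```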
@@ -83,7 +82,6 @@ Proceed to an [easy installation from Docker](@ref workbench_docs_Workbench_DG_I **Hardware** * 6th to 11th generation Intel® Core™ processors and Intel® Xeon® processors -* Intel® Xeon® processor E family (formerly code named Sandy Bridge, Ivy Bridge, Haswell, and Broadwell) * 3rd generation Intel® Xeon® Scalable processor (formerly code named Cooper Lake) * Intel® Xeon® Scalable processor (formerly Skylake and Cascade Lake) * Intel Atom® processor with support for Intel® Streaming SIMD Extensions 4.1 (Intel® SSE4.1) @@ -134,12 +132,9 @@ The screen example below indicates you are missing two dependencies: You must update several environment variables before you can compile and run OpenVINO™ applications. Open the Command Prompt, and run the `setupvars.bat` batch file to temporarily set your environment variables: ```sh -cd C:\Program Files (x86)\Intel\openvino_2021\bin\ -``` - -```sh -setupvars.bat +"C:\Program Files (x86)\Intel\openvino_2021\bin\setupvars.bat" ``` +> **IMPORTANT**: Windows PowerShell* is not recommended to run the configuration commands, please use the Command Prompt instead. (Optional): OpenVINO toolkit environment variables are removed when you close the Command Prompt window. As an option, you can permanently set the environment variables manually. @@ -314,7 +309,7 @@ Use these steps to update your Windows `PATH` if a command you execute returns a 5. If you need to add CMake to the `PATH`, browse to the directory in which you installed CMake. The default directory is `C:\Program Files\CMake`. -6. If you need to add Python to the `PATH`, browse to the directory in which you installed Python. The default directory is `C:\Users\\AppData\Local\Programs\Python\Python36\Python`. +6. If you need to add Python to the `PATH`, browse to the directory in which you installed Python. The default directory is `C:\Users\\AppData\Local\Programs\Python\Python36\Python`. Note that the `AppData` folder is hidden by default. To view hidden files and folders, see the [Windows 10 instructions](https://support.microsoft.com/en-us/windows/view-hidden-files-and-folders-in-windows-10-97fbc472-c603-9d90-91d0-1166d1d9f4b5). 7. Click **OK** repeatedly to close each screen. 
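Once the `PATH` edits are saved, opening a new Command Prompt and querying the tools directly is a quick sanity check that the steps above took effect; these are standard Windows commands and the output will vary per machine:

```sh
:: Confirm CMake and Python now resolve from PATH
where cmake
where python
cmake --version
python --version
```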
@@ -350,11 +345,11 @@ To learn more about converting deep learning models, go to: - [Intel Distribution of OpenVINO Toolkit home page](https://software.intel.com/en-us/openvino-toolkit) - [OpenVINO™ Release Notes](https://software.intel.com/en-us/articles/OpenVINO-RelNotes) -- [Introduction to Inference Engine](../IE_DG/inference_engine_intro.md) +- [Introduction to Inference Engine](../IE_DG/Deep_Learning_Inference_Engine_DevGuide.md) - [Inference Engine Developer Guide](../IE_DG/Deep_Learning_Inference_Engine_DevGuide.md) - [Model Optimizer Developer Guide](../MO_DG/Deep_Learning_Model_Optimizer_DevGuide.md) - [Inference Engine Samples Overview](../IE_DG/Samples_Overview.md) -- [Overview of OpenVINO™ Toolkit Pre-Trained Models](@ref omz_models_intel_index) +- [Overview of OpenVINO™ Toolkit Pre-Trained Models](@ref omz_models_group_intel) - [Intel® Neural Compute Stick 2 Get Started](https://software.intel.com/en-us/neural-compute-stick/get-started) diff --git a/docs/install_guides/installing-openvino-yum.md b/docs/install_guides/installing-openvino-yum.md index 5fc6143ae51..27e464d1b84 100644 --- a/docs/install_guides/installing-openvino-yum.md +++ b/docs/install_guides/installing-openvino-yum.md @@ -6,6 +6,18 @@ This guide provides installation steps for the Intel® Distribution of OpenVINO > **NOTE**: Intel® Graphics Compute Runtime for OpenCL™ is not a part of OpenVINO™ YUM distribution. You can install it from the [Intel® Graphics Compute Runtime for OpenCL™ GitHub repo](https://github.com/intel/compute-runtime). +> **NOTE**: Only runtime packages are available via the YUM repository. + +## Included with Runtime Package + +The following components are installed with the OpenVINO runtime package: + +| Component | Description| +|-----------|------------| +| [Inference Engine](../IE_DG/Deep_Learning_Inference_Engine_DevGuide.md)| The engine that runs a deep learning model. It includes a set of libraries for an easy inference integration into your applications. | +| [OpenCV*](https://docs.opencv.org/master/) | OpenCV* community version compiled for Intel® hardware. | +| Deep Learning Streamer (DL Streamer) | Streaming analytics framework, based on GStreamer, for constructing graphs of media analytics components. For the DL Streamer documentation, see [DL Streamer Samples](@ref gst_samples_README), [API Reference](https://openvinotoolkit.github.io/dlstreamer_gst/), [Elements](https://github.com/opencv/gst-video-analytics/wiki/Elements), [Tutorial](https://github.com/opencv/gst-video-analytics/wiki/DL%20Streamer%20Tutorial). | + ## Set up the Repository > **NOTE:** You must be logged in as root to set up and install the repository. @@ -61,7 +73,7 @@ Results: intel-openvino-2021 Intel(R) Distribution of OpenVINO 2021 ``` -### To list the available OpenVINO packages +### To list available OpenVINO packages Use the following command: ```sh yum list intel-openvino* @@ -69,11 +81,11 @@ yum list intel-openvino* --- -## Install the runtime packages Using the YUM Package Manager +## Install Runtime Packages Using the YUM Package Manager Intel® OpenVINO will be installed in: `/opt/intel/openvino_..`
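As a sketch of the listing-and-install flow described in this section: the runtime package name below is a placeholder, so substitute one of the names returned by the `yum list` command:

```sh
# List the packages published in the repository, then install one runtime package from that list
yum list intel-openvino*
sudo yum install intel-openvino-runtime-centos7-<VERSION>   # package name is a placeholder
```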
-A symlink will be created: `/opt/intel/openvino` +A symlink will be created: `/opt/intel/openvino_` --- diff --git a/docs/install_guides/movidius-setup-guide.md b/docs/install_guides/movidius-setup-guide.md index 421dfbab402..c26ebbda38d 100644 --- a/docs/install_guides/movidius-setup-guide.md +++ b/docs/install_guides/movidius-setup-guide.md @@ -46,7 +46,7 @@ The `hddldaemon` is a system service, a binary executable that is run to manage `` refers to the following default OpenVINO™ Inference Engine directories: - **Linux:** ``` - /opt/intel/openvino/inference_engine + /opt/intel/openvino_2021/inference_engine ``` - **Windows:** ``` diff --git a/docs/install_guides/pypi-openvino-dev.md b/docs/install_guides/pypi-openvino-dev.md index 3da7e3c1088..b8c3dcc3e52 100644 --- a/docs/install_guides/pypi-openvino-dev.md +++ b/docs/install_guides/pypi-openvino-dev.md @@ -13,7 +13,7 @@ OpenVINO™ toolkit is a comprehensive toolkit for quickly developing applicatio | Component | Description | |-----------------------------------------------------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| | [Model Optimizer](https://docs.openvinotoolkit.org/latest/openvino_docs_MO_DG_Deep_Learning_Model_Optimizer_DevGuide.html) | This tool imports, converts, and optimizes models that were trained in popular frameworks to a format usable by Intel tools, especially the Inference Engine. 
Popular frameworks include Caffe\*, TensorFlow\*, MXNet\*, and ONNX\*. | -| Additional Tools | A set of tools to work with your models including [Accuracy Checker utility](https://docs.openvinotoolkit.org/latest/omz_tools_accuracy_checker_README.html), [Post-Training Optimization Tool](https://docs.openvinotoolkit.org/latest/pot_README.html) | +| Additional Tools | A set of tools to work with your models including [Accuracy Checker utility](https://docs.openvinotoolkit.org/latest/omz_tools_accuracy_checker.html), [Post-Training Optimization Tool](https://docs.openvinotoolkit.org/latest/pot_README.html) | **The Runtime Package Includes the Following Components Installed by Dependency:** diff --git a/docs/ops/detection/ExperimentalDetectronDetectionOutput_6.md b/docs/ops/detection/ExperimentalDetectronDetectionOutput_6.md index 69411e3f31f..48450817c5b 100644 --- a/docs/ops/detection/ExperimentalDetectronDetectionOutput_6.md +++ b/docs/ops/detection/ExperimentalDetectronDetectionOutput_6.md @@ -97,7 +97,7 @@ tensor elements. * *class_agnostic_box_regression* - * **Description**: *class_agnostic_box_regression* attribute ia a flag specifies whether to delete background + * **Description**: *class_agnostic_box_regression* attribute is a flag that specifies whether to delete background classes or not. * **Range of values**: * `true` means background classes should be deleted diff --git a/docs/ops/detection/ExperimentalDetectronROIFeatureExtractor_6.md b/docs/ops/detection/ExperimentalDetectronROIFeatureExtractor_6.md index 407c4301dc4..2eb40fd6978 100644 --- a/docs/ops/detection/ExperimentalDetectronROIFeatureExtractor_6.md +++ b/docs/ops/detection/ExperimentalDetectronROIFeatureExtractor_6.md @@ -136,4 +136,4 @@ must be the same as for 1 input: `[number_of_ROIs, 4]`. -``` \ No newline at end of file +``` diff --git a/docs/optimization_guide/dldt_optimization_guide.md b/docs/optimization_guide/dldt_optimization_guide.md index 2c13d91d206..87fb3d26b4d 100644 --- a/docs/optimization_guide/dldt_optimization_guide.md +++ b/docs/optimization_guide/dldt_optimization_guide.md @@ -445,7 +445,7 @@ There are important performance caveats though: for example, the tasks that run Also, if the inference is performed on the graphics processing unit (GPU), it can take little gain to do the encoding, for instance, of the resulting video, on the same GPU in parallel, because the device is already busy. -Refer to the [Object Detection SSD Demo](@ref omz_demos_object_detection_demo_ssd_async_README) (latency-oriented Async API showcase) and [Benchmark App Sample](../../inference-engine/samples/benchmark_app/README.md) (which has both latency and throughput-oriented modes) for complete examples of the Async API in action. +Refer to the [Object Detection SSD Demo](@ref omz_demos_object_detection_demo_cpp) (latency-oriented Async API showcase) and [Benchmark App Sample](../../inference-engine/samples/benchmark_app/README.md) (which has both latency and throughput-oriented modes) for complete examples of the Async API in action. ## Using Tools diff --git a/docs/ovsa/ovsa_get_started.md b/docs/ovsa/ovsa_get_started.md index e9062dc7670..19678297eb7 100644 --- a/docs/ovsa/ovsa_get_started.md +++ b/docs/ovsa/ovsa_get_started.md @@ -20,7 +20,7 @@ The OpenVINO™ Security Add-on consists of three components that run in Kernel- - The Model Developer generates a access controlled model from the OpenVINO™ toolkit output. 
The access controlled model uses the model's Intermediate Representation (IR) files to create a access controlled output file archive that are distributed to Model Users. The Developer can also put the archive file in long-term storage or back it up without additional security. -- The Model Developer uses the OpenVINO™ Security Add-on Tool(`ovsatool`) to generate and manage cryptographic keys and related collateral for the access controlled models. Cryptographic material is only available in a virtual machine (VM) environment. The OpenVINO™ Security Add-on key management system lets the Model Developer to get external Certificate Authorities to generate certificates to add to a key-store. +- The Model Developer uses the OpenVINO™ Security Add-on Tool (ovsatool) to generate and manage cryptographic keys and related collateral for the access controlled models. Cryptographic material is only available in a virtual machine (VM) environment. The OpenVINO™ Security Add-on key management system lets the Model Developer get external Certificate Authorities to generate certificates to add to a key-store. - The Model Developer generates user-specific licenses in a JSON format file for the access controlled model. The Model Developer can define global or user-specific licenses and attach licensing policies to the licenses. For example, the Model Developer can add a time limit for a model or limit the number of times a user can run a model. @@ -31,7 +31,7 @@ The OpenVINO™ Security Add-on consists of three components that run in Kernel- - The Independent Software Vendor hosts the OpenVINO™ Security Add-on License Service, which responds to license validation requests when a user attempts to load a access controlled model in a model server. The licenses are registered with the OpenVINO™ Security Add-on License Service. -- When a user loads the model, the OpenVINO™ Security Add-on Runtime contacts the License Service to make sure the license is valid and within the parameters that the Model Developer defined with the OpenVINO™ Security Add-on Tool(`ovsatool`). The user must be able to reach the Independent Software Vendor's License Service over the Internet. +- When a user loads the model, the OpenVINO™ Security Add-on Runtime contacts the License Service to make sure the license is valid and within the parameters that the Model Developer defined with the OpenVINO™ Security Add-on Tool (ovsatool). The user must be able to reach the Independent Software Vendor's License Service over the Internet. @@ -51,6 +51,8 @@ After the license is successfully validated, the OpenVINO™ Model Server loads ![Security Add-on Diagram](ovsa_diagram.png) +The binding between SWTPM (vTPM used in guest VM) and HW TPM (TPM on the host) is explained in [this document](https://github.com/openvinotoolkit/security_addon/blob/release_2021_3/docs/fingerprint-changes.md). + ## About the Installation The Model Developer, Independent Software Vendor, and User each must prepare one physical hardware machine and one Kernel-based Virtual Machine (KVM). In addition, each person must prepare a Guest Virtual Machine (Guest VM) for each role that person plays. @@ -248,8 +250,12 @@ See the QEMU documentation for more information about the QEMU network configura Networking is set up on the Host Machine. Continue to the Step 3 to prepare a Guest VM for the combined role of Model Developer and Independent Software Vendor.
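Before moving on to Step 3, it can be worth confirming on the Host Machine that KVM acceleration is actually available to QEMU. A rough pre-flight check is sketched below; the `cpu-checker`/`kvm-ok` package is Ubuntu-specific and is an assumption here:

```sh
# Non-zero output means the CPU exposes VT-x/AMD-V
egrep -c '(vmx|svm)' /proc/cpuinfo
# kvm_intel or kvm_amd should be loaded
lsmod | grep kvm
# Ubuntu-only convenience check (assumption)
sudo apt-get install -y cpu-checker && sudo kvm-ok
```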
- -### Step 3: Set Up one Guest VM for the combined roles of Model Developer and Independent Software Vendor +### Step 3: Clone the OpenVINO™ Security Add-on + +Download the [OpenVINO™ Security Add-on](https://github.com/openvinotoolkit/security_addon). + + +### Step 4: Set Up one Guest VM for the combined roles of Model Developer and Independent Software Vendor. For each separate role you play, you must prepare a virtual machine, called a Guest VM. Because in this release, the Model Developer and Independent Software Vendor roles are combined, these instructions guide you to set up one Guest VM, named `ovsa_isv`. @@ -299,15 +305,28 @@ As an option, you can use `virsh` and the virtual machine manager to create and Installation information is at https://github.com/tpm2-software/tpm2-tools/blob/master/INSTALL.md 4. Install the [Docker packages](https://docs.docker.com/engine/install/ubuntu/) 5. Shut down the Guest VM.
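For the clone in Step 3 above, one possible way to fetch the sources that the later build steps `cd` into is shown below; the `release_2021_3` branch name is taken from the fingerprint-changes link earlier in this guide and may not match your release:

```sh
git clone https://github.com/openvinotoolkit/security_addon.git
cd security_addon
git checkout release_2021_3   # optional; pick the branch that matches your OpenVINO release
```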
-9. On the host, create a directory to support the virtual TPM device. Only `root` should have read/write permission to this directory: +9. On the host, create a directory to support the virtual TPM device and provision its certificates. Only `root` should have read/write permission to this directory: ```sh sudo mkdir -p /var/OVSA/ sudo mkdir /var/OVSA/vtpm sudo mkdir /var/OVSA/vtpm/vtpm_isv_dev + + export XDG_CONFIG_HOME=~/.config + /usr/share/swtpm/swtpm-create-user-config-files + swtpm_setup --tpmstate /var/OVSA/vtpm/vtpm_isv_dev --create-ek-cert --create-platform-cert --overwrite --tpm2 --pcr-banks - ``` **NOTE**: For steps 10 and 11, you can copy and edit the script named `start_ovsa_isv_dev_vm.sh` in the `Scripts/reference` directory in the OpenVINO™ Security Add-on repository instead of manually running the commands. If using the script, select the script with `isv` in the file name regardless of whether you are playing the role of the Model Developer or the role of the Independent Software Vendor. Edit the script to point to the correct directory locations and increment `vnc` for each Guest VM. -10. Start the vTPM on Host: +10. Start the vTPM on Host, write the HW TPM data into its NVRAM and restart the vTPM for QEMU: ```sh + sudo swtpm socket --tpm2 --server port=8280 \ + --ctrl type=tcp,port=8281 \ + --flags not-need-init --tpmstate dir=/var/OVSA/vtpm/vtpm_isv_dev & + + sudo tpm2_startup --clear -T swtpm:port=8280 + sudo tpm2_startup -T swtpm:port=8280 + python3 /Scripts/host/OVSA_write_hwquote_swtpm_nvram.py 8280 + sudo pkill -f vtpm_isv_dev + swtpm socket --tpmstate dir=/var/OVSA/vtpm/vtpm_isv_dev \ --tpm2 \ --ctrl type=unixio,path=/var/OVSA/vtpm/vtpm_isv_dev/swtpm-sock \ @@ -335,9 +354,9 @@ As an option, you can use `virsh` and the virtual machine manager to create and 12. Use a VNC client to log on to the Guest VM at `:1` -### Step 4: Set Up one Guest VM for the User role +### Step 5: Set Up one Guest VM for the User role -1. Choose ONE of these options to create a Guest VM for the User role:
+1. Choose **ONE** of these options to create a Guest VM for the User role:
**Option 1: Copy and Rename the `ovsa_isv_dev_vm_disk.qcow2` disk image** 1. Copy the `ovsa_isv_dev_vm_disk.qcow2` disk image to a new image named `ovsa_runtime_vm_disk.qcow2`. You created the `ovsa_isv_dev_vm_disk.qcow2` disk image in Step 3. 2. Boot the new image. @@ -383,7 +402,7 @@ As an option, you can use `virsh` and the virtual machine manager to create and -netdev tap,id=hostnet1,script=/virbr0-qemu-ifup, downscript=/virbr0-qemu-ifdown \ -vnc :2 ``` - 7. Choose ONE of these options to install additional required software: + 7. Choose **ONE** of these options to install additional required software: **Option 1: Use a script to install additional software** 1. Copy the script `install_guest_deps.sh` from the `Scripts/reference` directory of the OVSA repository to the Guest VM @@ -400,19 +419,32 @@ As an option, you can use `virsh` and the virtual machine manager to create and 4. Install the [Docker packages](https://docs.docker.com/engine/install/ubuntu/) 5. Shut down the Guest VM.
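Before shutting down the Guest VM in the last step, a quick check that the prerequisites actually landed can save a debugging round later. This is only a sketch and assumes a systemd-based guest image:

```sh
command -v tpm2_getcap && echo "tpm2-tools installed"
docker --version
systemctl is-active docker   # assumes systemd manages the Docker service
```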

-2. Create a directory to support the virtual TPM device. Only `root` should have read/write permission to this directory: +2. Create a directory to support the virtual TPM device and provision its certificates. Only `root` should have read/write permission to this directory: ```sh sudo mkdir /var/OVSA/vtpm/vtpm_runtime + + export XDG_CONFIG_HOME=~/.config + /usr/share/swtpm/swtpm-create-user-config-files + swtpm_setup --tpmstate /var/OVSA/vtpm/vtpm_runtime --create-ek-cert --create-platform-cert --overwrite --tpm2 --pcr-banks - ``` - **NOTE**: For steps 3 and 4, you can copy and edit the script named `start_ovsa_runtime_vm.sh` in the scripts directory in the OpenVINO™ Security Add-on repository instead of manually running the commands. Edit the script to point to the correct directory locations and increment `vnc` for each Guest VM. This means that if you are creating a third Guest VM on the same Host Machine, change `-vnc :2` to `-vnc :3` -3. Start the vTPM: + **NOTE**: For steps 3 and 4, you can copy and edit the script named `start_ovsa_runtime_vm.sh` in the `Scripts/reference` directory in the OpenVINO™ Security Add-on repository instead of manually running the commands. Edit the script to point to the correct directory locations and increment `vnc` for each Guest VM. This means that if you are creating a third Guest VM on the same Host Machine, change `-vnc :2` to `-vnc :3` +3. Start the vTPM, write the HW TPM data into its NVRAM and restart the vTPM for QEMU: ```sh + sudo swtpm socket --tpm2 --server port=8380 \ + --ctrl type=tcp,port=8381 \ + --flags not-need-init --tpmstate dir=/var/OVSA/vtpm/vtpm_runtime & + + sudo tpm2_startup --clear -T swtpm:port=8380 + sudo tpm2_startup -T swtpm:port=8380 + python3 /Scripts/host/OVSA_write_hwquote_swtpm_nvram.py 8380 + sudo pkill -f vtpm_runtime + swtpm socket --tpmstate dir=/var/OVSA/vtpm/vtpm_runtime \ --tpm2 \ --ctrl type=unixio,path=/var/OVSA/vtpm/vtpm_runtime/swtpm-sock \ --log level=20 ``` -4. Start the Guest VM in a new terminal. To do so, either copy and edit the script named `start_ovsa_runtime_vm.sh` in the scripts directory in the OpenVINO™ Security Add-on repository or manually run the command: +4. Start the Guest VM in a new terminal: ```sh sudo qemu-system-x86_64 \ -cpu host \ @@ -450,13 +482,11 @@ Building OpenVINO™ Security Add-on depends on OpenVINO™ Model Server docker This step is for the combined role of Model Developer and Independent Software Vendor, and the User -1. Download the [OpenVINO™ Security Add-on](https://github.com/openvinotoolkit/security_addon) - -2. Go to the top-level OpenVINO™ Security Add-on source directory. ```sh cd security_addon ``` -3. Build the OpenVINO™ Security Add-on: +2. Build the OpenVINO™ Security Add-on: ```sh make clean all sudo make package @@ -559,7 +589,7 @@ The Model Hosting components install the OpenVINO™ Security Add-on Runtime Doc This section requires interactions between the Model Developer/Independent Software vendor and the User. All roles must complete all applicable set up steps and installation steps before beginning this section. -This document uses the [face-detection-retail-0004](@ref omz_models_intel_face_detection_retail_0004_description_face_detection_retail_0004) model as an example. +This document uses the [face-detection-retail-0004](@ref omz_models_model_face_detection_retail_0004) model as an example. 
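As a sketch of how the example model's IR pair might be fetched ahead of time (the guide itself walks through a `curl` download; the exact `download.01.org` path below is an assumption and tends to change between releases):

```sh
# Hypothetical fetch of the face-detection-retail-0004 IR files (URL path is an assumption)
mkdir -p model && cd model
curl -O https://download.01.org/opencv/2021/openvinotoolkit/2021.1/open_model_zoo/models_bin/3/face-detection-retail-0004/FP32/face-detection-retail-0004.xml
curl -O https://download.01.org/opencv/2021/openvinotoolkit/2021.1/open_model_zoo/models_bin/3/face-detection-retail-0004/FP32/face-detection-retail-0004.bin
```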
The following figure describes the interactions between the Model Developer, Independent Software Vendor, and User. @@ -577,7 +607,7 @@ The Model Developer creates model, defines access control and creates the user l ```sh sudo -s cd //OVSA/artefacts - export OVSA_RUNTIME_ARTEFACTS=$PWD + export OVSA_DEV_ARTEFACTS=$PWD source /opt/ovsa/scripts/setupvars.sh ``` 2. Create files to request a certificate:
@@ -622,7 +652,7 @@ This example uses `curl` to download the `face-detection-retail-004` model from ``` 3. Define and enable the model access control and master license: ```sh - /opt/ovsa/bin/ovsatool protect -i model/face-detection-retail-0004.xml model/face-detection-retail-0004.bin -n "face detection" -d "face detection retail" -v 0004 -p face_detection_model.dat -m face_detection_model.masterlic -k isv_keystore -g + /opt/ovsa/bin/ovsatool controlAccess -i model/face-detection-retail-0004.xml model/face-detection-retail-0004.bin -n "face detection" -d "face detection retail" -v 0004 -p face_detection_model.dat -m face_detection_model.masterlic -k isv_keystore -g ``` The Intermediate Representation files for the `face-detection-retail-0004` model are encrypted as `face_detection_model.dat` and a master license is generated as `face_detection_model.masterlic`. @@ -703,6 +733,7 @@ This example uses scp to share data between the ovsa_runtime and ovsa_dev Guest cd $OVSA_RUNTIME_ARTEFACTS scp custkeystore.csr.crt username@://OVSA/artefacts ``` + #### Step 3: Receive and load the access controlled model into the OpenVINO™ Model Server 1. Receive the model as files named * `face_detection_model.dat` @@ -736,14 +767,15 @@ This example uses scp to share data between the ovsa_runtime and ovsa_dev Guest "model_config_list":[ { "config":{ - "name":"protected-model", + "name":"controlled-access-model", "base_path":"/sampleloader/model/fd", - "custom_loader_options": {"loader_name": "ovsa", "keystore": "custkeystore", "protected_file": "face_detection_model"} + "custom_loader_options": {"loader_name": "ovsa", "keystore": "custkeystore", "controlled_access_file": "face_detection_model"} } } ] } ``` + #### Step 4: Start the NGINX Model Server The NGINX Model Server publishes the access controlled model. 
```sh @@ -773,11 +805,12 @@ For information about the NGINX interface, see https://github.com/openvinotoolki ```sh curl --create-dirs https://raw.githubusercontent.com/openvinotoolkit/model_server/master/example_client/images/people/people1.jpeg -o images/people1.jpeg ``` + #### Step 6: Run Inference Run the `face_detection.py` script: ```sh -python3 face_detection.py --grpc_port 3335 --batch_size 1 --width 300 --height 300 --input_images_dir images --output_dir results --tls --server_cert server.pem --client_cert client.pem --client_key client.key --model_name protected-model +python3 face_detection.py --grpc_port 3335 --batch_size 1 --width 300 --height 300 --input_images_dir images --output_dir results --tls --server_cert server.pem --client_cert client.pem --client_key client.key --model_name controlled-access-model ``` ## Summary diff --git a/docs/resources/introduction.md b/docs/resources/introduction.md index 6a3c4ccfaa4..4a62ebef562 100644 --- a/docs/resources/introduction.md +++ b/docs/resources/introduction.md @@ -8,14 +8,14 @@ ## Demos -- [Demos](@ref omz_demos_README) +- [Demos](@ref omz_demos) ## Additional Tools -- A set of tools to work with your models including [Accuracy Checker utility](@ref omz_tools_accuracy_checker_README), [Post-Training Optimization Tool Guide](@ref pot_README), [Model Downloader](@ref omz_tools_downloader_README) and other +- A set of tools to work with your models including [Accuracy Checker utility](@ref omz_tools_accuracy_checker), [Post-Training Optimization Tool Guide](@ref pot_README), [Model Downloader](@ref omz_tools_downloader) and other ## Pre-Trained Models -- [Intel's Pre-trained Models from Open Model Zoo](@ref omz_models_intel_index) -- [Public Pre-trained Models Available with OpenVINO™ from Open Model Zoo](@ref omz_models_public_index) \ No newline at end of file +- [Intel's Pre-trained Models from Open Model Zoo](@ref omz_models_group_intel) +- [Public Pre-trained Models Available with OpenVINO™ from Open Model Zoo](@ref omz_models_group_public) \ No newline at end of file diff --git a/inference-engine/ie_bridges/python/sample/hello_classification/README.md b/inference-engine/ie_bridges/python/sample/hello_classification/README.md index 730d6a2f0e7..c02725ebd7d 100644 --- a/inference-engine/ie_bridges/python/sample/hello_classification/README.md +++ b/inference-engine/ie_bridges/python/sample/hello_classification/README.md @@ -118,4 +118,4 @@ The sample application logs each step in a standard output stream and outputs to [DataPtr.precision]:https://docs.openvinotoolkit.org/latest/ie_python_api/classie__api_1_1DataPtr.html#data_fields [IECore.load_network]:https://docs.openvinotoolkit.org/latest/ie_python_api/classie__api_1_1IECore.html#ac9a2e043d14ccfa9c6bbf626cfd69fcc [InputInfoPtr.input_data.shape]:https://docs.openvinotoolkit.org/latest/ie_python_api/classie__api_1_1InputInfoPtr.html#data_fields -[ExecutableNetwork.infer]:https://docs.openvinotoolkit.org/latest/ie_python_api/classie__api_1_1ExecutableNetwork.html#aea96e8e534c8e23d8b257bad11063519 \ No newline at end of file +[ExecutableNetwork.infer]:https://docs.openvinotoolkit.org/latest/ie_python_api/classie__api_1_1ExecutableNetwork.html#aea96e8e534c8e23d8b257bad11063519 diff --git a/inference-engine/ie_bridges/python/sample/hello_reshape_ssd/README.md b/inference-engine/ie_bridges/python/sample/hello_reshape_ssd/README.md index 4845f031079..2c5dac57b23 100644 --- a/inference-engine/ie_bridges/python/sample/hello_reshape_ssd/README.md +++ 
b/inference-engine/ie_bridges/python/sample/hello_reshape_ssd/README.md @@ -117,4 +117,4 @@ The sample application logs each step in a standard output stream and creates an [DataPtr.precision]:https://docs.openvinotoolkit.org/latest/ie_python_api/classie__api_1_1DataPtr.html#data_fields [IECore.load_network]:https://docs.openvinotoolkit.org/latest/ie_python_api/classie__api_1_1IECore.html#ac9a2e043d14ccfa9c6bbf626cfd69fcc [IENetwork.reshape]:https://docs.openvinotoolkit.org/latest/ie_python_api/classie__api_1_1IENetwork.html#a6683f0291db25f908f8d6720ab2f221a -[ExecutableNetwork.infer]:https://docs.openvinotoolkit.org/latest/ie_python_api/classie__api_1_1ExecutableNetwork.html#aea96e8e534c8e23d8b257bad11063519 \ No newline at end of file +[ExecutableNetwork.infer]:https://docs.openvinotoolkit.org/latest/ie_python_api/classie__api_1_1ExecutableNetwork.html#aea96e8e534c8e23d8b257bad11063519 diff --git a/inference-engine/samples/benchmark_app/README.md b/inference-engine/samples/benchmark_app/README.md index d3aa8b5e489..49154897462 100644 --- a/inference-engine/samples/benchmark_app/README.md +++ b/inference-engine/samples/benchmark_app/README.md @@ -128,7 +128,7 @@ If a model has only image input(s), please provide a folder with images or a pat If a model has some specific input(s) (not images), please prepare a binary file(s) that is filled with data of appropriate precision and provide a path to them as input. If a model has mixed input types, input folder should contain all required files. Image inputs are filled with image files one by one. Binary inputs are filled with binary inputs one by one. -To run the tool, you can use [public](@ref omz_models_public_index) or [Intel's](@ref omz_models_intel_index) pre-trained models from the Open Model Zoo. The models can be downloaded using the [Model Downloader](@ref omz_tools_downloader_README). +To run the tool, you can use [public](@ref omz_models_group_public) or [Intel's](@ref omz_models_group_intel) pre-trained models from the Open Model Zoo. The models can be downloaded using the [Model Downloader](@ref omz_tools_downloader). > **NOTE**: Before running the tool with a trained model, make sure the model is converted to the Inference Engine format (\*.xml + \*.bin) using the [Model Optimizer tool](../../../docs/MO_DG/Deep_Learning_Model_Optimizer_DevGuide.md). > @@ -200,4 +200,4 @@ Below are fragments of sample output for CPU and FPGA devices: ## See Also * [Using Inference Engine Samples](../../../docs/IE_DG/Samples_Overview.md) * [Model Optimizer](../../../docs/MO_DG/Deep_Learning_Model_Optimizer_DevGuide.md) -* [Model Downloader](@ref omz_tools_downloader_README) +* [Model Downloader](@ref omz_tools_downloader) diff --git a/inference-engine/tools/benchmark_tool/README.md b/inference-engine/tools/benchmark_tool/README.md index 1c213f67f1f..1eacb8f56ad 100644 --- a/inference-engine/tools/benchmark_tool/README.md +++ b/inference-engine/tools/benchmark_tool/README.md @@ -145,7 +145,7 @@ If a model has only image input(s), please a provide folder with images or a pat If a model has some specific input(s) (not images), please prepare a binary file(s), which is filled with data of appropriate precision and provide a path to them as input. If a model has mixed input types, input folder should contain all required files. Image inputs are filled with image files one by one. Binary inputs are filled with binary inputs one by one. 
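A minimal throughput run of the C++ `benchmark_app` once a model has been converted to IR; the model path is a placeholder:

```sh
./benchmark_app -m <path_to_model>/model.xml -d CPU -niter 100
```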
-To run the tool, you can use [public](@ref omz_models_public_index) or [Intel's](@ref omz_models_intel_index) pre-trained models from the Open Model Zoo. The models can be downloaded using the [Model Downloader](@ref omz_tools_downloader_README). +To run the tool, you can use [public](@ref omz_models_group_public) or [Intel's](@ref omz_models_group_intel) pre-trained models from the Open Model Zoo. The models can be downloaded using the [Model Downloader](@ref omz_tools_downloader). > **NOTE**: Before running the tool with a trained model, make sure the model is converted to the Inference Engine format (\*.xml + \*.bin) using the [Model Optimizer tool](../../../docs/MO_DG/Deep_Learning_Model_Optimizer_DevGuide.md). @@ -213,4 +213,4 @@ Below are fragments of sample output for CPU and FPGA devices: ## See Also * [Using Inference Engine Samples](../../../docs/IE_DG/Samples_Overview.md) * [Model Optimizer](../../../docs/MO_DG/Deep_Learning_Model_Optimizer_DevGuide.md) -* [Model Downloader](@ref omz_tools_downloader_README) +* [Model Downloader](@ref omz_tools_downloader) diff --git a/tools/benchmark/README.md b/tools/benchmark/README.md index 215d16bb47a..280b7a0ef53 100644 --- a/tools/benchmark/README.md +++ b/tools/benchmark/README.md @@ -151,7 +151,7 @@ If a model has only image input(s), please a provide folder with images or a pat If a model has some specific input(s) (not images), please prepare a binary file(s), which is filled with data of appropriate precision and provide a path to them as input. If a model has mixed input types, input folder should contain all required files. Image inputs are filled with image files one by one. Binary inputs are filled with binary inputs one by one. -To run the tool, you can use [public](@ref omz_models_public_index) or [Intel's](@ref omz_models_intel_index) pre-trained models from the Open Model Zoo. The models can be downloaded using the [Model Downloader](@ref omz_tools_downloader_README). +To run the tool, you can use [public](@ref omz_models_group_public) or [Intel's](@ref omz_models_group_intel) pre-trained models from the Open Model Zoo. The models can be downloaded using the [Model Downloader](@ref omz_tools_downloader). > **NOTE**: Before running the demo with a trained model, make sure the model is converted to the Inference Engine format (\*.xml + \*.bin) using the [Model Optimizer tool](./docs/MO_DG/Deep_Learning_Model_Optimizer_DevGuide.md).
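Putting the download and conversion steps above together, a typical preparation flow might look like the sketch below. `googlenet-v1` is only an example model, and the script locations depend on where the Model Downloader and Model Optimizer live in your installation:

```sh
# Download a public model from the Open Model Zoo, then convert it to IR before benchmarking
python3 downloader.py --name googlenet-v1 -o ./models
# The matching .prototxt is expected next to the .caffemodel with the same base name
python3 mo.py --input_model ./models/public/googlenet-v1/googlenet-v1.caffemodel --output_dir ./ir
```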