diff --git a/docs/notebooks/001-hello-world-with-output.rst b/docs/notebooks/001-hello-world-with-output.rst
index 1b8752b43cf..b5fc484fc04 100644
--- a/docs/notebooks/001-hello-world-with-output.rst
+++ b/docs/notebooks/001-hello-world-with-output.rst
@@ -1,7 +1,7 @@
Hello Image Classification
==========================
-.. _top:
+
This basic introduction to OpenVINO™ shows how to do inference with an
image classification model.
@@ -15,6 +15,10 @@ created, refer to the `TensorFlow to
OpenVINO <101-tensorflow-classification-to-openvino-with-output.html>`__
tutorial.
+
+
+.. _top:
+
**Table of contents**:
- `Imports <#imports>`__
diff --git a/docs/notebooks/003-hello-segmentation-with-output.rst b/docs/notebooks/003-hello-segmentation-with-output.rst
index 664854ae8d3..de8e1d16974 100644
--- a/docs/notebooks/003-hello-segmentation-with-output.rst
+++ b/docs/notebooks/003-hello-segmentation-with-output.rst
@@ -1,7 +1,7 @@
Hello Image Segmentation
========================
-.. _top:
+
A very basic introduction to using segmentation models with OpenVINO™.
@@ -12,6 +12,10 @@ Zoo `__ is used.
ADAS stands for Advanced Driver Assistance Services. The model
recognizes four classes: background, road, curb and mark.
+
+
+.. _top:
+
**Table of contents**:
- `Imports <#imports>`__
diff --git a/docs/notebooks/004-hello-detection-with-output.rst b/docs/notebooks/004-hello-detection-with-output.rst
index 35d47be09d1..8a96d8e68f0 100644
--- a/docs/notebooks/004-hello-detection-with-output.rst
+++ b/docs/notebooks/004-hello-detection-with-output.rst
@@ -1,7 +1,7 @@
Hello Object Detection
======================
-.. _top:
+
A very basic introduction to using object detection models with
OpenVINO™.
@@ -18,6 +18,10 @@ corner, ``(x_max, y_max)`` are the coordinates of the bottom right
bounding box corner and ``conf`` is the confidence for the predicted
class.
+
+
+.. _top:
+
**Table of contents**:
- `Imports <#imports>`__
diff --git a/docs/notebooks/101-tensorflow-classification-to-openvino-with-output.rst b/docs/notebooks/101-tensorflow-classification-to-openvino-with-output.rst
index 8e6b721fdbd..6971a252d7d 100644
--- a/docs/notebooks/101-tensorflow-classification-to-openvino-with-output.rst
+++ b/docs/notebooks/101-tensorflow-classification-to-openvino-with-output.rst
@@ -1,7 +1,7 @@
Convert a TensorFlow Model to OpenVINO™
=======================================
-.. _top:
+
| This short tutorial shows how to convert a TensorFlow
`MobileNetV3 `__
@@ -13,7 +13,11 @@ Convert a TensorFlow Model to OpenVINO™
Runtime `__
and do inference with a sample image.
-| **Table of contents**:
+
+
+.. _top:
+
+**Table of contents**:
- `Imports <#imports>`__
- `Settings <#settings>`__
diff --git a/docs/notebooks/102-pytorch-onnx-to-openvino-with-output.rst b/docs/notebooks/102-pytorch-onnx-to-openvino-with-output.rst
index c310e83f56d..4ff0c24ecd7 100644
--- a/docs/notebooks/102-pytorch-onnx-to-openvino-with-output.rst
+++ b/docs/notebooks/102-pytorch-onnx-to-openvino-with-output.rst
@@ -1,7 +1,7 @@
Convert a PyTorch Model to ONNX and OpenVINO™ IR
================================================
-.. _top:
+
This tutorial demonstrates step-by-step instructions on how to do
inference on a PyTorch semantic segmentation model, using OpenVINO
@@ -35,6 +35,10 @@ plant, sheep, sofa, train, tv monitor**
More information about the model is available in the `torchvision
documentation `__
+
+
+.. _top:
+
**Table of contents**:
- `Preparation <#preparation>`__
diff --git a/docs/notebooks/102-pytorch-to-openvino-with-output.rst b/docs/notebooks/102-pytorch-to-openvino-with-output.rst
index cf6e83887ca..be0a9038b08 100644
--- a/docs/notebooks/102-pytorch-to-openvino-with-output.rst
+++ b/docs/notebooks/102-pytorch-to-openvino-with-output.rst
@@ -1,7 +1,7 @@
Convert a PyTorch Model to OpenVINO™ IR
=======================================
-.. _top:
+
This tutorial demonstrates step-by-step instructions on how to do
inference on a PyTorch classification model using OpenVINO Runtime.
@@ -31,6 +31,10 @@ but elevated to the design space level. The RegNet design space provides
simple and fast networks that work well across a wide range of flop
regimes.
+
+
+.. _top:
+
**Table of contents**:
- `Prerequisites <#prerequisites>`__
diff --git a/docs/notebooks/103-paddle-to-openvino-classification-with-output.rst b/docs/notebooks/103-paddle-to-openvino-classification-with-output.rst
index 082be1d6643..94f284cf674 100644
--- a/docs/notebooks/103-paddle-to-openvino-classification-with-output.rst
+++ b/docs/notebooks/103-paddle-to-openvino-classification-with-output.rst
@@ -1,7 +1,7 @@
Convert a PaddlePaddle Model to OpenVINO™ IR
============================================
-.. _top:
+
This notebook shows how to convert a MobileNetV3 model from
`PaddleHub `__, pre-trained
@@ -16,6 +16,10 @@ IR model.
Source of the
`model `__.
+
+
+.. _top:
+
**Table of contents**:
- `Preparation <#preparation>`__
diff --git a/docs/notebooks/104-model-tools-with-output.rst b/docs/notebooks/104-model-tools-with-output.rst
index 62dcd3132ea..441028017b4 100644
--- a/docs/notebooks/104-model-tools-with-output.rst
+++ b/docs/notebooks/104-model-tools-with-output.rst
@@ -1,13 +1,15 @@
Working with Open Model Zoo Models
==================================
-.. _top:
+
This tutorial shows how to download a model from `Open Model
Zoo `__, convert it
to OpenVINO™ IR format, show information about the model, and benchmark
the model.
+.. _top:
+
**Table of contents**:
- `OpenVINO and Open Model Zoo Tools <#openvino-and-open-model-zoo-tools>`__
diff --git a/docs/notebooks/105-language-quantize-bert-with-output.rst b/docs/notebooks/105-language-quantize-bert-with-output.rst
index cbd1ec2b557..c7cdfb21086 100644
--- a/docs/notebooks/105-language-quantize-bert-with-output.rst
+++ b/docs/notebooks/105-language-quantize-bert-with-output.rst
@@ -1,7 +1,7 @@
Quantize NLP models with Post-Training Quantization in NNCF
============================================================
-.. _top:
+
This tutorial demonstrates how to apply ``INT8`` quantization to the
Natural Language Processing model known as
@@ -24,6 +24,10 @@ and datasets. It consists of the following steps:
- Compare the performance of the original, converted and quantized
models.
+
+
+.. _top:
+
**Table of contents**:
- `Imports <#imports>`__
diff --git a/docs/notebooks/106-auto-device-with-output.rst b/docs/notebooks/106-auto-device-with-output.rst
index 3e51a92ee2e..98166495d23 100644
--- a/docs/notebooks/106-auto-device-with-output.rst
+++ b/docs/notebooks/106-auto-device-with-output.rst
@@ -1,8 +1,6 @@
Automatic Device Selection with OpenVINO™
=========================================
-.. _top:
-
The `Auto
device `__
(or AUTO in short) selects the most suitable device for inference by
@@ -32,6 +30,10 @@ first inference.
auto
+
+
+.. _top:
+
**Table of contents**:
- `Import modules and create Core <#import-modules-and-create-core>`__
diff --git a/docs/notebooks/107-speech-recognition-quantization-data2vec-with-output.rst b/docs/notebooks/107-speech-recognition-quantization-data2vec-with-output.rst
index 8b1b221b0aa..39cf07b4452 100644
--- a/docs/notebooks/107-speech-recognition-quantization-data2vec-with-output.rst
+++ b/docs/notebooks/107-speech-recognition-quantization-data2vec-with-output.rst
@@ -1,8 +1,6 @@
Quantize Speech Recognition Models using NNCF PTQ API
=====================================================
-.. _top:
-
This tutorial demonstrates how to use the NNCF (Neural Network
Compression Framework) 8-bit quantization in post-training mode (without
the fine-tuning pipeline) to optimize the speech recognition model,
@@ -21,6 +19,10 @@ steps:
- Compare performance of the original and quantized models.
- Compare Accuracy of the Original and Quantized Models.
+
+
+.. _top:
+
**Table of contents**:
- `Download and prepare model <#download-and-prepare-model>`__
diff --git a/docs/notebooks/108-gpu-device-with-output.rst b/docs/notebooks/108-gpu-device-with-output.rst
index 78eec1cf09b..9d7f69faec7 100644
--- a/docs/notebooks/108-gpu-device-with-output.rst
+++ b/docs/notebooks/108-gpu-device-with-output.rst
@@ -1,6 +1,8 @@
Working with GPUs in OpenVINO™
==============================
+
+
.. _top:
**Table of contents**:
diff --git a/docs/notebooks/109-latency-tricks-with-output.rst b/docs/notebooks/109-latency-tricks-with-output.rst
index f939f5e5d4a..5d2d14fa85d 100644
--- a/docs/notebooks/109-latency-tricks-with-output.rst
+++ b/docs/notebooks/109-latency-tricks-with-output.rst
@@ -1,8 +1,6 @@
Performance tricks in OpenVINO for latency mode
===============================================
-.. _top:
-
The goal of this notebook is to provide a step-by-step tutorial for
improving performance for inferencing in a latency mode. Low latency is
especially desired in real-time applications when the results are needed
@@ -51,6 +49,10 @@ optimize performance on OpenVINO IR files in
A similar notebook focused on the throughput mode is available
`here <109-throughput-tricks-with-output.html>`__.
+
+
+.. _top:
+
**Table of contents**:
- `Data <#data>`__
diff --git a/docs/notebooks/109-throughput-tricks-with-output.rst b/docs/notebooks/109-throughput-tricks-with-output.rst
index d01b7d3f3dc..c5e7a2c9646 100644
--- a/docs/notebooks/109-throughput-tricks-with-output.rst
+++ b/docs/notebooks/109-throughput-tricks-with-output.rst
@@ -1,7 +1,7 @@
Performance tricks in OpenVINO for throughput mode
==================================================
-.. _top:
+
The goal of this notebook is to provide a step-by-step tutorial for
improving performance for inferencing in a throughput mode. High
@@ -46,6 +46,10 @@ optimize performance on OpenVINO IR files in
A similar notebook focused on the latency mode is available
`here <109-latency-tricks-with-output.html>`__.
+
+
+.. _top:
+
**Table of contents**:
- `Data <#data>`__
diff --git a/docs/notebooks/110-ct-scan-live-inference-with-output.rst b/docs/notebooks/110-ct-scan-live-inference-with-output.rst
index 7d543aa06d8..0f3e10cca74 100644
--- a/docs/notebooks/110-ct-scan-live-inference-with-output.rst
+++ b/docs/notebooks/110-ct-scan-live-inference-with-output.rst
@@ -1,8 +1,6 @@
Live Inference and Benchmark CT-scan Data with OpenVINO™
========================================================
-.. _top:
-
Kidney Segmentation with PyTorch Lightning and OpenVINO™ - Part 4
-----------------------------------------------------------------
@@ -30,6 +28,10 @@ notebook.
For demonstration purposes, this tutorial will download one converted CT
scan to use for inference.
+
+
+.. _top:
+
**Table of contents**:
- `Imports <#imports>`__
diff --git a/docs/notebooks/110-ct-segmentation-quantize-nncf-with-output.rst b/docs/notebooks/110-ct-segmentation-quantize-nncf-with-output.rst
index 2ff15e5eed4..b7089acadd6 100644
--- a/docs/notebooks/110-ct-segmentation-quantize-nncf-with-output.rst
+++ b/docs/notebooks/110-ct-segmentation-quantize-nncf-with-output.rst
@@ -1,8 +1,6 @@
Quantize a Segmentation Model and Show Live Inference
=====================================================
-.. _top:
-
Kidney Segmentation with PyTorch Lightning and OpenVINO™ - Part 3
-----------------------------------------------------------------
@@ -55,6 +53,10 @@ demonstration purposes, this tutorial will download one converted CT
scan and use that scan for quantization and inference. For production
purposes, use a representative dataset for quantizing the model.
+
+
+.. _top:
+
**Table of contents**:
- `Imports <#imports>`__
diff --git a/docs/notebooks/111-yolov5-quantization-migration-with-output.rst b/docs/notebooks/111-yolov5-quantization-migration-with-output.rst
index 230ace7db8c..6181e22d000 100644
--- a/docs/notebooks/111-yolov5-quantization-migration-with-output.rst
+++ b/docs/notebooks/111-yolov5-quantization-migration-with-output.rst
@@ -1,8 +1,6 @@
Migrate quantization from POT API to NNCF API
=============================================
-.. _top:
-
This tutorial demonstrates how to migrate quantization pipeline written
using the OpenVINO `Post-Training Optimization Tool (POT) `__ to
`NNCF Post-Training Quantization API `__.
@@ -23,6 +21,9 @@ The tutorial consists from the following parts:
7. Compare performance FP32 and INT8 models
+
+.. _top:
+
**Table of contents**:
- `Preparation <#preparation>`__
diff --git a/docs/notebooks/112-pytorch-post-training-quantization-nncf-with-output.rst b/docs/notebooks/112-pytorch-post-training-quantization-nncf-with-output.rst
index 16c64286c2b..69d0e04db13 100644
--- a/docs/notebooks/112-pytorch-post-training-quantization-nncf-with-output.rst
+++ b/docs/notebooks/112-pytorch-post-training-quantization-nncf-with-output.rst
@@ -1,8 +1,6 @@
Post-Training Quantization of PyTorch models with NNCF
======================================================
-.. _top:
-
The goal of this tutorial is to demonstrate how to use the NNCF (Neural
Network Compression Framework) 8-bit quantization in post-training mode
(without the fine-tuning pipeline) to optimize a PyTorch model for the
@@ -27,6 +25,9 @@ quantization, not demanding the fine-tuning of the model.
notebook.
+
+.. _top:
+
**Table of contents**:
- `Preparations <#preparations>`__
diff --git a/docs/notebooks/113-image-classification-quantization-with-output.rst b/docs/notebooks/113-image-classification-quantization-with-output.rst
index d72f5e3e4c0..15e6e52b6f5 100644
--- a/docs/notebooks/113-image-classification-quantization-with-output.rst
+++ b/docs/notebooks/113-image-classification-quantization-with-output.rst
@@ -1,7 +1,7 @@
Quantization of Image Classification Models
===========================================
-.. _top:
+
This tutorial demonstrates how to apply ``INT8`` quantization to Image
Classification model using
@@ -21,6 +21,8 @@ This tutorial consists of the following steps:
- Compare performance of the original and quantized models.
- Compare results on one picture.
+.. _top:
+
**Table of contents**:
- `Prepare the Model <#prepare-the-model>`__
diff --git a/docs/notebooks/115-async-api-with-output.rst b/docs/notebooks/115-async-api-with-output.rst
index 9f59cbc78b2..bec3bc9e219 100644
--- a/docs/notebooks/115-async-api-with-output.rst
+++ b/docs/notebooks/115-async-api-with-output.rst
@@ -1,7 +1,7 @@
Asynchronous Inference with OpenVINO™
=====================================
-.. _top:
+
This notebook demonstrates how to use the `Async
API `__
@@ -14,6 +14,8 @@ in parallel (for example, populating inputs or scheduling other
requests) rather than wait for the current inference to complete first.
+.. _top:
+
**Table of contents**:
- `Imports <#imports>`__
diff --git a/docs/notebooks/116-sparsity-optimization-with-output.rst b/docs/notebooks/116-sparsity-optimization-with-output.rst
index aa321a6b57e..532094888de 100644
--- a/docs/notebooks/116-sparsity-optimization-with-output.rst
+++ b/docs/notebooks/116-sparsity-optimization-with-output.rst
@@ -1,7 +1,7 @@
Accelerate Inference of Sparse Transformer Models with OpenVINO™ and 4th Gen Intel® Xeon® Scalable Processors
=============================================================================================================
-.. _top:
+
This tutorial demonstrates how to improve performance of sparse
Transformer models with `OpenVINO `__ on 4th
@@ -21,6 +21,8 @@ consists of the following steps:
integration with Hugging Face Optimum.
- Compare sparse 8-bit vs. dense 8-bit inference performance.
+.. _top:
+
**Table of contents**:
- `Prerequisites <#prerequisites>`__
diff --git a/docs/notebooks/117-model-server-with-output.rst b/docs/notebooks/117-model-server-with-output.rst
index 54989d2a0e7..7cf130e876b 100644
--- a/docs/notebooks/117-model-server-with-output.rst
+++ b/docs/notebooks/117-model-server-with-output.rst
@@ -1,7 +1,7 @@
Hello Model Server
==================
-.. _top:
+
Introduction to OpenVINO™ Model Server (OVMS).
@@ -33,6 +33,8 @@ deployment:
|ovms_diagram|
+.. _top:
+
**Table of contents**:
- `Serving with OpenVINO Model Server <#serving-with-openvino-model-server1>`__
diff --git a/docs/notebooks/118-optimize-preprocessing-with-output.rst b/docs/notebooks/118-optimize-preprocessing-with-output.rst
index c76a8986137..e9f19e107c9 100644
--- a/docs/notebooks/118-optimize-preprocessing-with-output.rst
+++ b/docs/notebooks/118-optimize-preprocessing-with-output.rst
@@ -1,7 +1,7 @@
Optimize Preprocessing
======================
-.. _top:
+
When input data does not fit the model input tensor perfectly,
additional operations/steps are needed to transform the data to the
@@ -27,6 +27,8 @@ This tutorial include following steps:
- Comparing results on one picture.
- Comparing performance.
+.. _top:
+
**Table of contents**:
- `Settings <#settings>`__
diff --git a/docs/notebooks/119-tflite-to-openvino-with-output.rst b/docs/notebooks/119-tflite-to-openvino-with-output.rst
index aa0bc8713a3..6bf4b8924cc 100644
--- a/docs/notebooks/119-tflite-to-openvino-with-output.rst
+++ b/docs/notebooks/119-tflite-to-openvino-with-output.rst
@@ -1,7 +1,7 @@
Convert a Tensorflow Lite Model to OpenVINO™
============================================
-.. _top:
+
`TensorFlow Lite `__, often
referred to as TFLite, is an open source library developed for deploying
@@ -17,6 +17,8 @@ After creating the OpenVINO IR, load the model in `OpenVINO
Runtime `__
and do inference with a sample image.
+.. _top:
+
**Table of contents**:
- `Preparation <#preparation>`__
diff --git a/docs/notebooks/120-tensorflow-object-detection-to-openvino-with-output.rst b/docs/notebooks/120-tensorflow-object-detection-to-openvino-with-output.rst
index 39fcef5cec8..9e2ee531349 100644
--- a/docs/notebooks/120-tensorflow-object-detection-to-openvino-with-output.rst
+++ b/docs/notebooks/120-tensorflow-object-detection-to-openvino-with-output.rst
@@ -1,7 +1,7 @@
Convert a TensorFlow Object Detection Model to OpenVINO™
========================================================
-.. _top:
+
`TensorFlow `__, or TF for short, is an
open-source framework for machine learning.
@@ -26,6 +26,8 @@ After creating the OpenVINO IR, load the model in `OpenVINO
Runtime `__
and do inference with a sample image.
+.. _top:
+
**Table of contents**:
- `Prerequisites <#prerequisites>`__
diff --git a/docs/notebooks/121-convert-to-openvino-with-output.rst b/docs/notebooks/121-convert-to-openvino-with-output.rst
index 5da2d317e3a..cf93b94ac74 100644
--- a/docs/notebooks/121-convert-to-openvino-with-output.rst
+++ b/docs/notebooks/121-convert-to-openvino-with-output.rst
@@ -4,6 +4,8 @@ OpenVINO™ model conversion API
This notebook shows how to convert a model from original framework
format to OpenVINO Intermediate Representation (IR).
+.. _top:
+
**Table of contents**:
- `OpenVINO IR format <#openvino-ir-format>`__
diff --git a/docs/notebooks/122-speech-recognition-quantization-wav2vec2-with-output.rst b/docs/notebooks/122-speech-recognition-quantization-wav2vec2-with-output.rst
new file mode 100644
index 00000000000..4db1ac32fe9
--- /dev/null
+++ b/docs/notebooks/122-speech-recognition-quantization-wav2vec2-with-output.rst
@@ -0,0 +1,309 @@
+Quantize Speech Recognition Models with accuracy control using NNCF PTQ API
+===========================================================================
+
+
+
+This tutorial demonstrates how to apply ``INT8`` quantization with
+accuracy control to the speech recognition model, known as
+`Wav2Vec2 `__,
+using the NNCF (Neural Network Compression Framework) 8-bit quantization
+with accuracy control in post-training mode (without the fine-tuning
+pipeline). This notebook uses a fine-tuned
+`Wav2Vec2-Base-960h `__
+`PyTorch `__ model trained on the `LibriSpeech ASR
+corpus `__. The tutorial is designed to be
+extendable to custom models and datasets. It consists of the following
+steps:
+
+- Download and prepare the Wav2Vec2 model and LibriSpeech dataset.
+- Define data loading and accuracy validation functionality.
+- Quantize the model with accuracy control.
+- Compare accuracy of the original PyTorch model and the quantized
+  OpenVINO ``INT8`` model.
+- Compare performance of the original and quantized models.
+
+The advanced quantization flow allows applying 8-bit quantization to
+the model while controlling the accuracy metric. This is achieved by
+keeping the most impactful operations within the model in the original
+precision. The flow is based on the `Basic 8-bit
+quantization `__
+and has the following differences:
+
+- Besides the calibration dataset, a validation dataset is required to
+  compute the accuracy metric. Both datasets can refer to the same data
+  in the simplest case.
+- A validation function, used to compute the accuracy metric, is
+  required. It can be a function that is already available in the
+  source framework or a custom function.
+- Since accuracy validation is run several times during the
+  quantization process, quantization with accuracy control can take
+  more time than the Basic 8-bit quantization flow.
+- The resulting model can provide a smaller performance improvement
+  than the Basic 8-bit quantization flow, because some of the
+  operations are kept in the original precision.
+
+.. note::
+
+ Currently, 8-bit quantization with accuracy control in NNCF
+ is available only for models in OpenVINO representation.
+
+The steps for the quantization with accuracy control are described
+below.
+
+
+
+.. _top:
+
+**Table of contents**:
+
+- `Imports <#imports>`__
+- `Prepare the Model <#prepare-the-model>`__
+- `Prepare LibriSpeech Dataset <#prepare-librispeech-dataset>`__
+- `Prepare calibration and validation datasets <#prepare-calibration-and-validation-datasets>`__
+- `Prepare validation function <#prepare-validation-function>`__
+- `Run quantization with accuracy control <#run-quantization-with-accuracy-control>`__
+- `Model Usage Example <#model-usage-example>`__
+- `Compare Accuracy of the Original and Quantized Models <#compare-accuracy-of-the-original-and-quantized-models>`__
+
+
+.. code:: ipython2
+
+ # !pip install -q "openvino-dev>=2023.1.0" "nncf>=2.6.0"
+ !pip install -q "openvino==2023.1.0.dev20230811"
+ !pip install git+https://github.com/openvinotoolkit/nncf.git@develop
+ !pip install -q soundfile librosa transformers torch datasets torchmetrics
+
+Imports `⇑ <#top>`__
+###############################################################################################################################
+
+.. code:: ipython2
+
+ import numpy as np
+ import torch
+
+ from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor
+
+Prepare the Model `⇑ <#top>`__
+###############################################################################################################################
+
+To instantiate the PyTorch model class,
+we use the ``Wav2Vec2ForCTC.from_pretrained`` method, providing the
+model ID for downloading from the HuggingFace hub. Model weights and
+configuration files are downloaded automatically on first use. Keep in
+mind that downloading the files can take several minutes and depends on
+your internet connection.
+
+Additionally, we create a processor class, which is responsible for
+model-specific pre- and post-processing steps.
+
+.. code:: ipython2
+
+ BATCH_SIZE = 1
+ MAX_SEQ_LENGTH = 30480
+
+
+ torch_model = Wav2Vec2ForCTC.from_pretrained("facebook/wav2vec2-base-960h", ctc_loss_reduction="mean")
+ processor = Wav2Vec2Processor.from_pretrained("facebook/wav2vec2-base-960h")
+
+Convert the model to OpenVINO Intermediate Representation (OpenVINO IR).
+
+.. code:: ipython2
+
+ import openvino
+
+
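+    # Example input for tracing: a dummy waveform of MAX_SEQ_LENGTH samples.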
+ default_input = torch.zeros([1, MAX_SEQ_LENGTH], dtype=torch.float)
+ ov_model = openvino.convert_model(torch_model, example_input=default_input)
+
+Prepare LibriSpeech Dataset `⇑ <#top>`__
+###############################################################################################################################
+
+For demonstration purposes, we will use a short dummy version of the
+LibriSpeech dataset, ``patrickvonplaten/librispeech_asr_dummy``, to
+speed up model evaluation. Model accuracy can differ from the accuracy
+reported in the paper. To reproduce the original accuracy, use the
+``librispeech_asr`` dataset.
+
+.. code:: ipython2
+
+ from datasets import load_dataset
+
+
+ dataset = load_dataset("patrickvonplaten/librispeech_asr_dummy", "clean", split="validation")
+ test_sample = dataset[0]["audio"]
+
+
+ # define preprocessing function for converting audio to input values for model
+ def map_to_input(batch):
+ preprocessed_signal = processor(batch["audio"]["array"], return_tensors="pt", padding="longest", sampling_rate=batch['audio']['sampling_rate'])
+ input_values = preprocessed_signal.input_values
+ batch['input_values'] = input_values
+ return batch
+
+
+ # apply preprocessing function to dataset and remove audio column, to save memory as we do not need it anymore
+ dataset = dataset.map(map_to_input, batched=False, remove_columns=["audio"])
+
+Prepare calibration and validation datasets `⇑ <#top>`__
+###############################################################################################################################
+
+.. code:: ipython2
+
+ import nncf
+
+
+ def transform_fn(data_item):
+ """
+ Extract the model's input from the data item.
+ The data item here is the data item that is returned from the data source per iteration.
+ This function should be passed when the data item cannot be used as model's input.
+ """
+ return np.array(data_item["input_values"])
+
+
+ calibration_dataset = nncf.Dataset(dataset, transform_fn)
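+
+The same dataset will also serve as the validation dataset during
+quantization with accuracy control: it is passed for both roles in the
+``nncf.quantize_with_accuracy_control`` call below.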
+
+Prepare validation function `⇑ <#top>`__
+###############################################################################################################################
+
+Define the validation function. Since Word Error Rate (WER) is an error
+metric where lower is better, the function returns ``1 - WER`` so that a
+higher value corresponds to better accuracy.
+
+.. code:: ipython2
+
+ from torchmetrics import WordErrorRate
+ from tqdm.notebook import tqdm
+
+
+ def validation_fn(model, dataset):
+ """
+        Calculate and return a metric for the model.
+ """
+ wer = WordErrorRate()
+ for sample in tqdm(dataset):
+ # run infer function on sample
+ output = model.output(0)
+ logits = model(np.array(sample['input_values']))[output]
+ predicted_ids = np.argmax(logits, axis=-1)
+ transcription = processor.batch_decode(torch.from_numpy(predicted_ids))
+
+ # update metric on sample result
+ wer.update(transcription, [sample['text']])
+
+ result = wer.compute()
+
+ return 1 - result
+
+Run quantization with accuracy control `⇑ <#top>`__
+###############################################################################################################################
+
+You should provide the calibration dataset and the validation dataset.
+They can be the same dataset.
+
+- The ``max_drop`` parameter defines the accuracy drop threshold. The
+  quantization process stops when the degradation of the accuracy
+  metric on the validation dataset is less than ``max_drop``. The
+  default value is 0.01. NNCF will stop the quantization and report an
+  error if the ``max_drop`` value cannot be reached.
+- ``drop_type`` defines how the accuracy drop is calculated:
+  ``ABSOLUTE`` (used by default) or ``RELATIVE``.
+- ``ranking_subset_size`` is the size of the subset that is used to
+  rank layers by their contribution to the accuracy drop. The default
+  value is 300; more samples potentially give a better ranking. Here we
+  use the value 25 to speed up the execution.
+
+.. note::
+
+ Execution can take tens of minutes and requires up to 10 GB
+   of free memory.
+
+
+.. code:: ipython2
+
+ from nncf.quantization.advanced_parameters import AdvancedAccuracyRestorerParameters
+ from nncf.parameters import ModelType
+
+ quantized_model = nncf.quantize_with_accuracy_control(
+ ov_model,
+ calibration_dataset=calibration_dataset,
+ validation_dataset=calibration_dataset,
+ validation_fn=validation_fn,
+ max_drop=0.01,
+ drop_type=nncf.DropType.ABSOLUTE,
+ model_type=ModelType.TRANSFORMER,
+ advanced_accuracy_restorer_parameters=AdvancedAccuracyRestorerParameters(
+ ranking_subset_size=25
+ ),
+ )
+
+Model Usage Example `⇑ <#top>`__
+###############################################################################################################################
+
+.. code:: ipython2
+
+ import IPython.display as ipd
+
+
+ ipd.Audio(test_sample["array"], rate=16000)
+
+.. code:: ipython2
+
+ core = openvino.Core()
+
+ compiled_quantized_model = core.compile_model(model=quantized_model, device_name='CPU')
+
+ input_data = np.expand_dims(test_sample["array"], axis=0)
+
+Next, make a prediction.
+
+.. code:: ipython2
+
+ predictions = compiled_quantized_model([input_data])[0]
+ predicted_ids = np.argmax(predictions, axis=-1)
+ transcription = processor.batch_decode(torch.from_numpy(predicted_ids))
+ transcription
+
+Compare Accuracy of the Original and Quantized Models `⇑ <#top>`__
+###############################################################################################################################
+
+- Define a dataloader for the test dataset.
+- Define inference functions for the PyTorch and OpenVINO models.
+- Define a function to compute the Word Error Rate.
+
+.. code:: ipython2
+
+ # inference function for pytorch
+ def torch_infer(model, sample):
+ logits = model(torch.Tensor(sample['input_values'])).logits
+ # take argmax and decode
+ predicted_ids = torch.argmax(logits, dim=-1)
+ transcription = processor.batch_decode(predicted_ids)
+ return transcription
+
+
+ # inference function for openvino
+ def ov_infer(model, sample):
+ output = model.output(0)
+ logits = model(np.array(sample['input_values']))[output]
+ predicted_ids = np.argmax(logits, axis=-1)
+ transcription = processor.batch_decode(torch.from_numpy(predicted_ids))
+ return transcription
+
+
+ def compute_wer(dataset, model, infer_fn):
+ wer = WordErrorRate()
+ for sample in tqdm(dataset):
+ # run infer function on sample
+ transcription = infer_fn(model, sample)
+ # update metric on sample result
+ wer.update(transcription, [sample['text']])
+ # finalize metric calculation
+ result = wer.compute()
+ return result
+
+Now, compute WER for the original PyTorch model and quantized model.
+
+.. code:: ipython2
+
+ pt_result = compute_wer(dataset, torch_model, torch_infer)
+ quantized_result = compute_wer(dataset, compiled_quantized_model, ov_infer)
+
+ print(f'[PyTorch] Word Error Rate: {pt_result:.4f}')
+    print(f'[Quantized OpenVINO] Word Error Rate: {quantized_result:.4f}')
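+
+Finally, compare performance of the original and quantized models. A
+minimal sketch is shown below: it saves both models to disk and
+measures them with the ``benchmark_app`` command-line tool. The file
+names are illustrative, and the benchmark input shape is fixed to
+``MAX_SEQ_LENGTH`` samples.
+
+.. code:: ipython2
+
+    from pathlib import Path
+
+    # Set model directory
+    MODEL_DIR = Path("model")
+    MODEL_DIR.mkdir(exist_ok=True)
+
+    ir_model_path = MODEL_DIR / 'ir_model.xml'
+    quantized_model_path = MODEL_DIR / 'quantized_model.xml'
+
+    # Save models to use them in the command-line benchmark app
+    openvino.save_model(ov_model, ir_model_path)
+    openvino.save_model(quantized_model, quantized_model_path)
+
+.. code:: ipython2
+
+    # Benchmark the original model (OpenVINO IR)
+    ! benchmark_app -m $ir_model_path -shape "[1,30480]" -d CPU -api async
+
+.. code:: ipython2
+
+    # Benchmark the quantized model (OpenVINO IR)
+    ! benchmark_app -m $quantized_model_path -shape "[1,30480]" -d CPU -api async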
diff --git a/docs/notebooks/122-yolov8-quantization-with-accuracy-control-with-output.rst b/docs/notebooks/122-yolov8-quantization-with-accuracy-control-with-output.rst
new file mode 100644
index 00000000000..7bba4ef46f0
--- /dev/null
+++ b/docs/notebooks/122-yolov8-quantization-with-accuracy-control-with-output.rst
@@ -0,0 +1,306 @@
+Convert and Optimize YOLOv8 with OpenVINO™
+==========================================
+
+
+
+The YOLOv8 algorithm developed by Ultralytics is a cutting-edge,
+state-of-the-art (SOTA) model that is designed to be fast, accurate, and
+easy to use, making it an excellent choice for a wide range of object
+detection, image segmentation, and image classification tasks. More
+details about its realization can be found in the original model
+`repository `__.
+
+This tutorial demonstrates step-by-step instructions on how to apply
+quantization with accuracy control to the PyTorch YOLOv8 model. The
+advanced quantization flow allows applying 8-bit quantization to the
+model while controlling the accuracy metric. This is achieved by
+keeping the most impactful operations within the model in the original
+precision. The flow is based on the `Basic 8-bit
+quantization `__
+and has the following differences:
+
+- Besides the calibration dataset, a validation dataset is required to
+  compute the accuracy metric. Both datasets can refer to the same data
+  in the simplest case.
+- A validation function, used to compute the accuracy metric, is
+  required. It can be a function that is already available in the
+  source framework or a custom function.
+- Since accuracy validation is run several times during the
+  quantization process, quantization with accuracy control can take
+  more time than the Basic 8-bit quantization flow.
+- The resulting model can provide a smaller performance improvement
+  than the Basic 8-bit quantization flow, because some of the
+  operations are kept in the original precision.
+
+.. note::
+
+ Currently, 8-bit quantization with accuracy control in NNCF
+ is available only for models in OpenVINO representation.
+
+The steps for the quantization with accuracy control are described
+below.
+
+The tutorial consists of the following steps:
+
+.. _top:
+
+- `Prerequisites <#prerequisites>`__
+- `Get PyTorch model and OpenVINO IR model <#get-pytorch-model-and-openvino-ir-model>`__
+- `Define validator and data loader <#define-validator-and-data-loader>`__
+- `Prepare calibration and validation datasets <#prepare-calibration-and-validation-datasets>`__
+- `Prepare validation function <#prepare-validation-function>`__
+- `Run quantization with accuracy control <#run-quantization-with-accuracy-control>`__
+- `Compare Accuracy and Performance of the Original and Quantized Models <#compare-accuracy-and-performance-of-the-original-and-quantized-models>`__
+
+Prerequisites `⇑ <#top>`__
+###############################################################################################################################
+
+
+Install necessary packages.
+
+.. code:: ipython2
+
+ !pip install -q "openvino==2023.1.0.dev20230811"
+ !pip install git+https://github.com/openvinotoolkit/nncf.git@develop
+ !pip install -q "ultralytics==8.0.43"
+
+Get PyTorch model and OpenVINO IR model `⇑ <#top>`__
+###############################################################################################################################
+
+Generally, PyTorch models represent an instance of the
+`torch.nn.Module `__
+class, initialized by a state dictionary with model weights. We will use
+the YOLOv8 nano segmentation model (also known as ``yolov8n-seg``),
+pre-trained on the COCO dataset, which is available in this
+`repo `__. Similar steps are
+also applicable to other YOLOv8 models. Typical steps to obtain a
+pre-trained model are:
+
+1. Create an instance of a model class.
+2. Load a checkpoint state dict, which contains the pre-trained model
+ weights.
+
+In this case, the creators of the model provide an API that enables
+converting the YOLOv8 model to ONNX and then to OpenVINO IR. Therefore,
+we do not need to do these steps manually.
+
+.. code:: ipython2
+
+ import os
+ from pathlib import Path
+
+ from ultralytics import YOLO
+ from ultralytics.yolo.cfg import get_cfg
+ from ultralytics.yolo.data.utils import check_det_dataset
+ from ultralytics.yolo.engine.validator import BaseValidator as Validator
+ from ultralytics.yolo.utils import DATASETS_DIR
+ from ultralytics.yolo.utils import DEFAULT_CFG
+ from ultralytics.yolo.utils import ops
+ from ultralytics.yolo.utils.metrics import ConfusionMatrix
+
+ ROOT = os.path.abspath('')
+
+ MODEL_NAME = "yolov8n-seg"
+
+ model = YOLO(f"{ROOT}/{MODEL_NAME}.pt")
+ args = get_cfg(cfg=DEFAULT_CFG)
+ args.data = "coco128-seg.yaml"
+
+Convert the model to OpenVINO IR format and load it.
+
+.. code:: ipython2
+
+ import openvino
+
+
+ model_path = Path(f"{ROOT}/{MODEL_NAME}_openvino_model/{MODEL_NAME}.xml")
+ if not model_path.exists():
+ model.export(format="openvino", dynamic=True, half=False)
+
+ ov_model = openvino.Core().read_model(model_path)
+
+Define validator and data loader `⇑ <#top>`__
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+
+The original model
+repository uses a ``Validator`` wrapper, which represents the accuracy
+validation pipeline. It creates a dataloader and evaluation metrics, and
+updates the metrics on each data batch produced by the dataloader.
+Besides that, it is responsible for data preprocessing and results
+postprocessing. For class initialization, the configuration should be
+provided. We will use the default setup, but it can be replaced with
+parameter overrides to test on custom data. The model provides the
+``ValidatorClass`` attribute, which is used to create a validator class
+instance.
+
+.. code:: ipython2
+
+ validator = model.ValidatorClass(args)
+ validator.data = check_det_dataset(args.data)
+ data_loader = validator.get_dataloader(f"{DATASETS_DIR}/coco128-seg", 1)
+
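+    # Configure the validator for COCO-style instance segmentation evaluation:
+    # map the 80 COCO class IDs to the original 91-class IDs, set the number
+    # of mask coefficients (nm=32), and use mask post-processing.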
+ validator.is_coco = True
+ validator.class_map = ops.coco80_to_coco91_class()
+ validator.names = model.model.names
+ validator.metrics.names = validator.names
+ validator.nc = model.model.model[-1].nc
+ validator.nm = 32
+ validator.process = ops.process_mask
+ validator.plot_masks = []
+
+Prepare calibration and validation datasets `⇑ <#top>`__
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+
+We can use a single dataset for both calibration and validation. Name it
+``quantization_dataset``.
+
+.. code:: ipython2
+
+ from typing import Dict
+
+ import nncf
+
+
+ def transform_fn(data_item: Dict):
+ input_tensor = validator.preprocess(data_item)["img"].numpy()
+ return input_tensor
+
+
+ quantization_dataset = nncf.Dataset(data_loader, transform_fn)
+
+Prepare validation function `⇑ <#top>`__
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+
+.. code:: ipython2
+
+ from functools import partial
+
+ import torch
+ from nncf.quantization.advanced_parameters import AdvancedAccuracyRestorerParameters
+
+
+ def validation_ac(
+ compiled_model: openvino.CompiledModel,
+ validation_loader: torch.utils.data.DataLoader,
+ validator: Validator,
+ num_samples: int = None,
+ ) -> float:
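+        # Reset the validator state before a new validation run.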
+ validator.seen = 0
+ validator.jdict = []
+ validator.stats = []
+ validator.batch_i = 1
+ validator.confusion_matrix = ConfusionMatrix(nc=validator.nc)
+ num_outputs = len(compiled_model.outputs)
+
+ counter = 0
+ for batch_i, batch in enumerate(validation_loader):
+ if num_samples is not None and batch_i == num_samples:
+ break
+ batch = validator.preprocess(batch)
+ results = compiled_model(batch["img"])
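+            # Detection models produce a single output; segmentation models
+            # produce two (box predictions and mask prototypes).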
+ if num_outputs == 1:
+ preds = torch.from_numpy(results[compiled_model.output(0)])
+ else:
+ preds = [
+ torch.from_numpy(results[compiled_model.output(0)]),
+ torch.from_numpy(results[compiled_model.output(1)]),
+ ]
+ preds = validator.postprocess(preds)
+ validator.update_metrics(preds, batch)
+ counter += 1
+ stats = validator.get_stats()
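+        # Use box mAP for detection models and mask mAP for segmentation models.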
+ if num_outputs == 1:
+ stats_metrics = stats["metrics/mAP50-95(B)"]
+ else:
+ stats_metrics = stats["metrics/mAP50-95(M)"]
+ print(f"Validate: dataset length = {counter}, metric value = {stats_metrics:.3f}")
+
+ return stats_metrics
+
+
+ validation_fn = partial(validation_ac, validator=validator)
+
+Run quantization with accuracy control `⇑ <#top>`__
+###############################################################################################################################
+
+You should provide the calibration dataset and the validation dataset.
+They can be the same dataset.
+
+- The ``max_drop`` parameter defines the accuracy drop threshold. The
+  quantization process stops when the degradation of the accuracy
+  metric on the validation dataset is less than ``max_drop``. The
+  default value is 0.01. NNCF will stop the quantization and report an
+  error if the ``max_drop`` value cannot be reached.
+- ``drop_type`` defines how the accuracy drop is calculated:
+  ``ABSOLUTE`` (used by default) or ``RELATIVE``.
+- ``ranking_subset_size`` is the size of the subset that is used to
+  rank layers by their contribution to the accuracy drop. The default
+  value is 300; more samples potentially give a better ranking. Here we
+  use the value 25 to speed up the execution.
+
+.. note::
+
+ Execution can take tens of minutes and requires up to 15 GB
+   of free memory.
+
+.. code:: ipython2
+
+ quantized_model = nncf.quantize_with_accuracy_control(
+ ov_model,
+ quantization_dataset,
+ quantization_dataset,
+ validation_fn=validation_fn,
+ max_drop=0.01,
+ preset=nncf.QuantizationPreset.MIXED,
+ advanced_accuracy_restorer_parameters=AdvancedAccuracyRestorerParameters(
+ ranking_subset_size=25,
+ num_ranking_processes=1
+ ),
+ )
+
+Compare Accuracy and Performance of the Original and Quantized Models `⇑ <#top>`__
+###############################################################################################################################
+
+
+Now we can compare the metrics of the original non-quantized
+OpenVINO IR model and the quantized OpenVINO IR model to make sure that
+``max_drop`` is not exceeded.
+
+.. code:: ipython2
+
+ import openvino
+
+ core = openvino.Core()
+ quantized_compiled_model = core.compile_model(model=quantized_model, device_name='CPU')
+ compiled_ov_model = core.compile_model(model=ov_model, device_name='CPU')
+
+ pt_result = validation_ac(compiled_ov_model, data_loader, validator)
+ quantized_result = validation_ac(quantized_compiled_model, data_loader, validator)
+
+
+    print(f'[Original OpenVINO]: {pt_result:.4f}')
+    print(f'[Quantized OpenVINO]: {quantized_result:.4f}')
+
+And compare performance.
+
+.. code:: ipython2
+
+ from pathlib import Path
+ # Set model directory
+ MODEL_DIR = Path("model")
+ MODEL_DIR.mkdir(exist_ok=True)
+
+ ir_model_path = MODEL_DIR / 'ir_model.xml'
+ quantized_model_path = MODEL_DIR / 'quantized_model.xml'
+
+    # Save models to use them in the command-line benchmark app
+ openvino.save_model(ov_model, ir_model_path, compress_to_fp16=False)
+ openvino.save_model(quantized_model, quantized_model_path, compress_to_fp16=False)
+
+.. code:: ipython2
+
+    # Benchmark the original model (OpenVINO IR)
+ ! benchmark_app -m $ir_model_path -shape "[1,3,640,640]" -d CPU -api async
+
+.. code:: ipython2
+
+    # Benchmark the quantized model (OpenVINO IR)
+ ! benchmark_app -m $quantized_model_path -shape "[1,3,640,640]" -d CPU -api async
diff --git a/docs/notebooks/201-vision-monodepth-with-output.rst b/docs/notebooks/201-vision-monodepth-with-output.rst
index 06ec0e5cd77..e98e4c37d8f 100644
--- a/docs/notebooks/201-vision-monodepth-with-output.rst
+++ b/docs/notebooks/201-vision-monodepth-with-output.rst
@@ -1,7 +1,7 @@
Monodepth Estimation with OpenVINO
==================================
-.. _top:
+
This tutorial demonstrates Monocular Depth Estimation with MidasNet in
OpenVINO. Model information can be found
@@ -30,6 +30,8 @@ Transfer,” `__ in IEEE
Transactions on Pattern Analysis and Machine Intelligence, doi:
``10.1109/TPAMI.2020.3019967``.
+.. _top:
+
**Table of contents**:
- `Preparation <#preparation>`__
diff --git a/docs/notebooks/202-vision-superresolution-image-with-output.rst b/docs/notebooks/202-vision-superresolution-image-with-output.rst
index 18ea80db89d..2a9c26e5342 100644
--- a/docs/notebooks/202-vision-superresolution-image-with-output.rst
+++ b/docs/notebooks/202-vision-superresolution-image-with-output.rst
@@ -1,7 +1,7 @@
Single Image Super Resolution with OpenVINO™
============================================
-.. _top:
+
Super Resolution is the process of enhancing the quality of an image by
increasing the pixel count using deep learning. This notebook shows the
@@ -16,6 +16,8 @@ Resolution,” `__ 2018 24th
International Conference on Pattern Recognition (ICPR), 2018,
pp. 2777-2784, doi: 10.1109/ICPR.2018.8545760.
+.. _top:
+
**Table of contents**:
- `Preparation <#preparation>`__
diff --git a/docs/notebooks/202-vision-superresolution-video-with-output.rst b/docs/notebooks/202-vision-superresolution-video-with-output.rst
index 840d31c84ee..7b48a8c64ed 100644
--- a/docs/notebooks/202-vision-superresolution-video-with-output.rst
+++ b/docs/notebooks/202-vision-superresolution-video-with-output.rst
@@ -1,7 +1,7 @@
Video Super Resolution with OpenVINO™
=====================================
-.. _top:
+
Super Resolution is the process of enhancing the quality of an image by
increasing the pixel count using deep learning. This notebook applies
@@ -23,6 +23,8 @@ pp. 2777-2784, doi: 10.1109/ICPR.2018.8545760.
video.
+.. _top:
+
**Table of contents**:
- `Preparation <#preparation>`__
diff --git a/docs/notebooks/203-meter-reader-with-output.rst b/docs/notebooks/203-meter-reader-with-output.rst
index e45a6d9973c..eeec4746977 100644
--- a/docs/notebooks/203-meter-reader-with-output.rst
+++ b/docs/notebooks/203-meter-reader-with-output.rst
@@ -1,7 +1,7 @@
Industrial Meter Reader
=======================
-.. _top:
+
This notebook shows how to create a industrial meter reader with
OpenVINO Runtime. We use the pre-trained
@@ -21,6 +21,8 @@ to build up a multiple inference task pipeline:
workflow
+.. _top:
+
**Table of contents**:
- `Import <#import>`__
diff --git a/docs/notebooks/204-segmenter-semantic-segmentation-with-output.rst b/docs/notebooks/204-segmenter-semantic-segmentation-with-output.rst
index c516000c84c..29f412a4194 100644
--- a/docs/notebooks/204-segmenter-semantic-segmentation-with-output.rst
+++ b/docs/notebooks/204-segmenter-semantic-segmentation-with-output.rst
@@ -1,7 +1,7 @@
Semantic Segmentation with OpenVINO™ using Segmenter
====================================================
-.. _top:
+
Semantic segmentation is a difficult computer vision problem with many
applications such as autonomous driving, robotics, augmented reality,
@@ -28,6 +28,8 @@ paper: `Segmenter: Transformer for Semantic
Segmentation `__ or in the
`repository `__.
+.. _top:
+
**Table of contents**:
- `Get and prepare PyTorch model <#get-and-prepare-pytorch-model>`__
diff --git a/docs/notebooks/205-vision-background-removal-with-output.rst b/docs/notebooks/205-vision-background-removal-with-output.rst
index cd53815c483..1c4ae2d1696 100644
--- a/docs/notebooks/205-vision-background-removal-with-output.rst
+++ b/docs/notebooks/205-vision-background-removal-with-output.rst
@@ -1,7 +1,7 @@
Image Background Removal with U^2-Net and OpenVINO™
===================================================
-.. _top:
+
This notebook demonstrates background removal in images using
U\ :math:`^2`-Net and OpenVINO.
@@ -17,6 +17,8 @@ The model source is available
`here `__.
+.. _top:
+
**Table of contents**:
- `Preparation <#preparation>`__
diff --git a/docs/notebooks/206-vision-paddlegan-anime-with-output.rst b/docs/notebooks/206-vision-paddlegan-anime-with-output.rst
index 7974ce25de1..32cafa0c20c 100644
--- a/docs/notebooks/206-vision-paddlegan-anime-with-output.rst
+++ b/docs/notebooks/206-vision-paddlegan-anime-with-output.rst
@@ -1,7 +1,7 @@
Photos to Anime with PaddleGAN and OpenVINO
===========================================
-.. _top:
+
This tutorial demonstrates converting a
`PaddlePaddle/PaddleGAN `__
@@ -16,6 +16,8 @@ documentation `__
diff --git a/docs/notebooks/207-vision-paddlegan-superresolution-with-output.rst b/docs/notebooks/207-vision-paddlegan-superresolution-with-output.rst
index 5967a0bf7b1..b19bfc982c6 100644
--- a/docs/notebooks/207-vision-paddlegan-superresolution-with-output.rst
+++ b/docs/notebooks/207-vision-paddlegan-superresolution-with-output.rst
@@ -1,7 +1,7 @@
Super Resolution with PaddleGAN and OpenVINO™
=============================================
-.. _top:
+
This notebook demonstrates converting the RealSR (real-world
super-resolution) model from
@@ -18,6 +18,8 @@ from CVPR 2020.
This notebook works best with small images (up to 800x600 resolution).
+.. _top:
+
**Table of contents**:
- `Imports <#imports>`__
diff --git a/docs/notebooks/208-optical-character-recognition-with-output.rst b/docs/notebooks/208-optical-character-recognition-with-output.rst
index 0815ae2d3cd..871f7110dd1 100644
--- a/docs/notebooks/208-optical-character-recognition-with-output.rst
+++ b/docs/notebooks/208-optical-character-recognition-with-output.rst
@@ -1,7 +1,7 @@
Optical Character Recognition (OCR) with OpenVINO™
==================================================
-.. _top:
+
This tutorial demonstrates how to perform optical character recognition
(OCR) with OpenVINO models. It is a continuation of the
@@ -21,6 +21,8 @@ Zoo `__. For more
information, refer to the
`104-model-tools <104-model-tools-with-output.html>`__ tutorial.
+.. _top:
+
**Table of contents**:
- `Imports <#imports>`__
diff --git a/docs/notebooks/209-handwritten-ocr-with-output.rst b/docs/notebooks/209-handwritten-ocr-with-output.rst
index e0f5913988f..8aa26383d21 100644
--- a/docs/notebooks/209-handwritten-ocr-with-output.rst
+++ b/docs/notebooks/209-handwritten-ocr-with-output.rst
@@ -1,7 +1,7 @@
Handwritten Chinese and Japanese OCR with OpenVINO™
===================================================
-.. _top:
+
In this tutorial, we perform optical character recognition (OCR) for
handwritten Chinese (simplified) and Japanese. An OCR tutorial using the
@@ -19,6 +19,8 @@ and
`scut_ept `__
charlists are used. Both models are available on `Open Model Zoo `__.
+.. _top:
+
**Table of contents**:
- `Imports <#imports>`__
diff --git a/docs/notebooks/210-slowfast-video-recognition-with-output.rst b/docs/notebooks/210-slowfast-video-recognition-with-output.rst
index e795d99a6ef..c2bcfa25c5d 100644
--- a/docs/notebooks/210-slowfast-video-recognition-with-output.rst
+++ b/docs/notebooks/210-slowfast-video-recognition-with-output.rst
@@ -1,7 +1,7 @@
Video Recognition using SlowFast and OpenVINO™
==============================================
-.. _top:
+
Teaching machines to detect, understand and analyze the contents of
images has been one of the more well-known and well-studied problems in
@@ -40,6 +40,8 @@ This tutorial consists of the following steps
.. |image0| image:: https://user-images.githubusercontent.com/34324155/143044111-94676f64-7ba8-4081-9011-f8054bed7030.png
+.. _top:
+
**Table of contents**:
- `Prepare PyTorch Model <#prepare-pytorch-model>`__
diff --git a/docs/notebooks/211-speech-to-text-with-output.rst b/docs/notebooks/211-speech-to-text-with-output.rst
index 080d8b092c9..95d919eb6d6 100644
--- a/docs/notebooks/211-speech-to-text-with-output.rst
+++ b/docs/notebooks/211-speech-to-text-with-output.rst
@@ -1,7 +1,7 @@
Speech to Text with OpenVINO™
=============================
-.. _top:
+
This tutorial demonstrates speech-to-text recognition with OpenVINO.
@@ -13,6 +13,8 @@ with Connectionist Temporal Classification (CTC) loss. The model is
available from `Open Model
Zoo `__.
+.. _top:
+
**Table of contents**:
- `Imports <#imports>`__
diff --git a/docs/notebooks/212-pyannote-speaker-diarization-with-output.rst b/docs/notebooks/212-pyannote-speaker-diarization-with-output.rst
index 8fabfbf8b90..2e8af021276 100644
--- a/docs/notebooks/212-pyannote-speaker-diarization-with-output.rst
+++ b/docs/notebooks/212-pyannote-speaker-diarization-with-output.rst
@@ -1,7 +1,7 @@
Speaker diarization
===================
-.. _top:
+
Speaker diarization is the process of partitioning an audio stream
containing human speech into homogeneous segments according to the
@@ -39,6 +39,8 @@ card `__,
`repo `__ and
`paper `__.
+.. _top:
+
**Table of contents**:
- `Prerequisites <#prerequisites>`__
diff --git a/docs/notebooks/213-question-answering-with-output.rst b/docs/notebooks/213-question-answering-with-output.rst
index e3fc0ee6c8d..9b1be824b7a 100644
--- a/docs/notebooks/213-question-answering-with-output.rst
+++ b/docs/notebooks/213-question-answering-with-output.rst
@@ -1,7 +1,7 @@
Interactive question answering with OpenVINO™
=============================================
-.. _top:
+
This demo shows interactive question answering with OpenVINO, using
`small BERT-large-like
@@ -11,6 +11,8 @@ larger BERT-large model. The model comes from `Open Model
Zoo `__. Final part
of this notebook provides live inference results from your inputs.
+.. _top:
+
**Table of contents**:
- `Imports <#imports>`__
diff --git a/docs/notebooks/214-grammar-correction-with-output.rst b/docs/notebooks/214-grammar-correction-with-output.rst
index eaff3b6e620..434aabbacd3 100644
--- a/docs/notebooks/214-grammar-correction-with-output.rst
+++ b/docs/notebooks/214-grammar-correction-with-output.rst
@@ -1,7 +1,7 @@
Grammatical Error Correction with OpenVINO
==========================================
-.. _top:
+
AI-based auto-correction products are becoming increasingly popular due
to their ease of use, editing speed, and affordability. These products
@@ -43,6 +43,8 @@ It consists of the following steps:
Optimum `__.
- Create an inference pipeline for grammatical error checking
+.. _top:
+
**Table of contents**:
- `How does it work? <#how-does-it-work>`__
diff --git a/docs/notebooks/215-image-inpainting-with-output.rst b/docs/notebooks/215-image-inpainting-with-output.rst
index 85f762359ab..f9ecfbafeeb 100644
--- a/docs/notebooks/215-image-inpainting-with-output.rst
+++ b/docs/notebooks/215-image-inpainting-with-output.rst
@@ -1,7 +1,7 @@
Image In-painting with OpenVINO™
--------------------------------
-.. _top:
+
This notebook demonstrates how to use an image in-painting model with
OpenVINO, using `GMCNN
@@ -11,6 +11,8 @@ given a tampered image, is able to create something very similar to the
original image. The Following pipeline will be used in this notebook.
|pipeline|
+.. _top:
+
**Table of contents**:
- `Download the Model <#download-the-model>`__
diff --git a/docs/notebooks/216-attention-center-with-output.rst b/docs/notebooks/216-attention-center-with-output.rst
index 07e5c69eedb..2a5dcfc7c8a 100644
--- a/docs/notebooks/216-attention-center-with-output.rst
+++ b/docs/notebooks/216-attention-center-with-output.rst
@@ -1,7 +1,7 @@
The attention center model with OpenVINO™
=========================================
-.. _top:
+
This notebook demonstrates how to use the `attention center
model `__ with
@@ -51,6 +51,8 @@ The attention center model has been trained with images from the `COCO
dataset `__ annotated with saliency from
the `SALICON dataset `__.
+.. _top:
+
**Table of contents**:
- `Imports <#imports>`__
diff --git a/docs/notebooks/217-vision-deblur-with-output.rst b/docs/notebooks/217-vision-deblur-with-output.rst
index 3686de8db5f..6e0f7067823 100644
--- a/docs/notebooks/217-vision-deblur-with-output.rst
+++ b/docs/notebooks/217-vision-deblur-with-output.rst
@@ -1,6 +1,8 @@
Deblur Photos with DeblurGAN-v2 and OpenVINO™
=============================================
+
+
.. _top:
**Table of contents**:
diff --git a/docs/notebooks/218-vehicle-detection-and-recognition-with-output.rst b/docs/notebooks/218-vehicle-detection-and-recognition-with-output.rst
index c5237117f8a..2bc8a6cd2e9 100644
--- a/docs/notebooks/218-vehicle-detection-and-recognition-with-output.rst
+++ b/docs/notebooks/218-vehicle-detection-and-recognition-with-output.rst
@@ -1,7 +1,7 @@
Vehicle Detection And Recognition with OpenVINO™
================================================
-.. _top:
+
This tutorial demonstrates how to use two pre-trained models from `Open
Model Zoo `__:
@@ -19,6 +19,8 @@ As a result, you can get:
result
+.. _top:
+
**Table of contents**:
- `Imports <#imports>`__
diff --git a/docs/notebooks/219-knowledge-graphs-conve-with-output.rst b/docs/notebooks/219-knowledge-graphs-conve-with-output.rst
index c623c3cfd00..07fd9413bca 100644
--- a/docs/notebooks/219-knowledge-graphs-conve-with-output.rst
+++ b/docs/notebooks/219-knowledge-graphs-conve-with-output.rst
@@ -1,7 +1,7 @@
OpenVINO optimizations for Knowledge graphs
===========================================
-.. _top:
+
The goal of this notebook is to showcase performance optimizations for
the ConvE knowledge graph embeddings model using the Intel® Distribution
@@ -18,6 +18,8 @@ The ConvE model is an implementation of the paper -
sample dataset can be downloaded from:
https://github.com/TimDettmers/ConvE/tree/master/countries/countries_S1
+.. _top:
+
**Table of contents**:
- `Windows specific settings <#windows-specific-settings>`__
diff --git a/docs/notebooks/220-cross-lingual-books-alignment-with-output.rst b/docs/notebooks/220-cross-lingual-books-alignment-with-output.rst
index cd34355ccf9..88d0874160f 100644
--- a/docs/notebooks/220-cross-lingual-books-alignment-with-output.rst
+++ b/docs/notebooks/220-cross-lingual-books-alignment-with-output.rst
@@ -1,7 +1,7 @@
Cross-lingual Books Alignment with Transformers and OpenVINO™
=============================================================
-.. _top:
+
Cross-lingual text alignment is the task of matching sentences in a pair
of texts that are translations of each other. In this notebook, you’ll
@@ -39,6 +39,8 @@ Prerequisites
- ``seaborn`` - for alignment matrix visualization
- ``ipywidgets`` - for displaying HTML and JS output in the notebook
+.. _top:
+
**Table of contents**:
- `Get Books <#get-books>`__
diff --git a/docs/notebooks/221-machine-translation-with-output.rst b/docs/notebooks/221-machine-translation-with-output.rst
index f8c36d8b482..b4103a43f25 100644
--- a/docs/notebooks/221-machine-translation-with-output.rst
+++ b/docs/notebooks/221-machine-translation-with-output.rst
@@ -1,7 +1,7 @@
Machine translation demo
========================
-.. _top:
+
This demo utilizes Intel’s pre-trained model that translates from
English to German. More information about the model can be found
@@ -18,6 +18,8 @@ following structure: ```` + *tokenized sentence* + ```` +
**Output** After the inference, we have a sequence of up to 200 tokens.
The structure is the same as the one for the input.
+.. _top:
+
**Table of contents**:
- `Downloading model <#downloading-model>`__
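
The fixed input layout described in this hunk (start token, tokenized sentence, end token, padding up to the 200-token window) is easy to illustrate. A minimal sketch; the token ids and helper name are illustrative, not the notebook's real vocabulary:

.. code:: python

    # Token ids here are illustrative, not the model's real vocabulary.
    MAX_TOKENS = 200

    def prepare_input(sentence_ids, bos_id=1, eos_id=2, pad_id=0):
        """Wrap a tokenized sentence with start/end tokens, pad to the window."""
        ids = [bos_id] + list(sentence_ids) + [eos_id]
        ids += [pad_id] * (MAX_TOKENS - len(ids))
        return ids[:MAX_TOKENS]

    print(prepare_input([17, 42, 99])[:6])   # [1, 17, 42, 99, 2, 0]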
diff --git a/docs/notebooks/222-vision-image-colorization-with-output.rst b/docs/notebooks/222-vision-image-colorization-with-output.rst
index 5985afd3fed..5d3d32c0655 100644
--- a/docs/notebooks/222-vision-image-colorization-with-output.rst
+++ b/docs/notebooks/222-vision-image-colorization-with-output.rst
@@ -1,7 +1,7 @@
Image Colorization with OpenVINO
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-.. _top:
+
This notebook demonstrates how to colorize images with OpenVINO using
the Colorization model
@@ -44,6 +44,8 @@ About Colorization-siggraph
See the `colorization `__
repository for more details.
+.. _top:
+
**Table of contents**:
- `Imports <#imports>`__
diff --git a/docs/notebooks/223-text-prediction-with-output.rst b/docs/notebooks/223-text-prediction-with-output.rst
index ef77dd1d3e0..eeb9f79f0f2 100644
--- a/docs/notebooks/223-text-prediction-with-output.rst
+++ b/docs/notebooks/223-text-prediction-with-output.rst
@@ -1,7 +1,7 @@
Text Prediction with OpenVINO™
==============================
-.. _top:
+
This notebook shows text prediction with OpenVINO. It can
work in two different modes, Text Generation and Conversation, which the
@@ -73,6 +73,8 @@ above. The Generated response is added to the history with the
and the sequence is passed back into the model.
+.. _top:
+
**Table of contents**:
- `Model Selection <#model-selection>`__
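
The conversation mode described in this hunk, where each generated response joins the history and the whole sequence is fed back in, can be sketched generically. A hedged outline; `generate`, `encode`, and `decode` stand in for the notebook's real model call and tokenizer:

.. code:: python

    # Illustrative only: the growing history is fed back into the model
    # each turn, as described above.
    def run_conversation(generate, encode, decode, turns):
        history = []
        for turn in turns:
            history += encode(turn)          # user input joins the context
            response = generate(history)     # model emits new token ids
            history += response              # response becomes future context
            yield decode(response)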
diff --git a/docs/notebooks/224-3D-segmentation-point-clouds-with-output.rst b/docs/notebooks/224-3D-segmentation-point-clouds-with-output.rst
index fef333d4d1c..8934a54ec2e 100644
--- a/docs/notebooks/224-3D-segmentation-point-clouds-with-output.rst
+++ b/docs/notebooks/224-3D-segmentation-point-clouds-with-output.rst
@@ -1,7 +1,7 @@
Part Segmentation of 3D Point Clouds with OpenVINO™
===================================================
-.. _top:
+
This notebook demonstrates how to process `point
cloud `__ data and run 3D
@@ -24,6 +24,8 @@ segmentation, to scene semantic parsing. It is highly efficient and
effective, showing strong performance on par or even better than state
of the art.
+.. _top:
+
**Table of contents**:
- `Imports <#imports>`__
diff --git a/docs/notebooks/225-stable-diffusion-text-to-image-with-output.rst b/docs/notebooks/225-stable-diffusion-text-to-image-with-output.rst
index 90f6243f6c3..255e3b6b2a5 100644
--- a/docs/notebooks/225-stable-diffusion-text-to-image-with-output.rst
+++ b/docs/notebooks/225-stable-diffusion-text-to-image-with-output.rst
@@ -1,7 +1,7 @@
Text-to-Image Generation with Stable Diffusion and OpenVINO™
============================================================
-.. _top:
+
Stable Diffusion is a text-to-image latent diffusion model created by
the researchers and engineers from
@@ -41,6 +41,8 @@ Notebook contains the following steps:
API.
3. Run Stable Diffusion pipeline with OpenVINO.
+.. _top:
+
**Table of contents**:
- `Prerequisites <#prerequisites>`__
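
The convert-then-run steps listed above follow a common OpenVINO pattern. A minimal sketch, assuming OpenVINO 2023.1+ and a toy PyTorch module standing in for the real pipeline parts (the notebook converts the text encoder, UNet, and VAE the same way):

.. code:: python

    # A toy stand-in for one pipeline part; assumes OpenVINO 2023.1+.
    import torch
    import openvino as ov

    class TinyNet(torch.nn.Module):
        def forward(self, x):
            return torch.relu(x)

    example = torch.zeros(1, 3, 64, 64)
    ov_model = ov.convert_model(TinyNet(), example_input=example)  # convert
    ov.save_model(ov_model, "tiny.xml")

    compiled = ov.Core().compile_model(ov_model, "AUTO")           # run
    result = compiled(example.numpy())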
diff --git a/docs/notebooks/226-yolov7-optimization-with-output.rst b/docs/notebooks/226-yolov7-optimization-with-output.rst
index 330d988cab3..5867c26429a 100644
--- a/docs/notebooks/226-yolov7-optimization-with-output.rst
+++ b/docs/notebooks/226-yolov7-optimization-with-output.rst
@@ -1,7 +1,7 @@
Convert and Optimize YOLOv7 with OpenVINO™
==========================================
-.. _top:
+
The YOLOv7 algorithm is making big waves in the computer vision and
machine learning communities. It is a real-time object detection
@@ -40,6 +40,8 @@ The tutorial consists of the following steps:
- Compare accuracy of the FP32 and quantized models.
- Compare performance of the FP32 and quantized models.
+.. _top:
+
**Table of contents**:
- `Get Pytorch model <#get-pytorch-model>`__
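
The quantization step listed above uses NNCF post-training quantization. A hedged sketch of that API; the IR path is a placeholder and the toy calibration items stand in for the ~300 representative images a real run would feed through the same `nncf.Dataset` wrapper:

.. code:: python

    # Placeholder path and toy calibration items.
    import numpy as np
    import nncf
    import openvino as ov

    model = ov.Core().read_model("model.xml")   # FP32 IR from the conversion step

    calibration_items = [np.random.rand(1, 3, 640, 640).astype(np.float32)
                         for _ in range(10)]
    calibration_dataset = nncf.Dataset(calibration_items, lambda item: item)

    quantized_model = nncf.quantize(model, calibration_dataset)
    ov.save_model(quantized_model, "model_int8.xml")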
diff --git a/docs/notebooks/227-whisper-subtitles-generation-with-output.rst b/docs/notebooks/227-whisper-subtitles-generation-with-output.rst
index 05b04c2fec8..39d210defad 100644
--- a/docs/notebooks/227-whisper-subtitles-generation-with-output.rst
+++ b/docs/notebooks/227-whisper-subtitles-generation-with-output.rst
@@ -1,7 +1,7 @@
Video Subtitle Generation using Whisper and OpenVINO™
=====================================================
-.. _top:
+
`Whisper `__ is an automatic speech
recognition (ASR) system trained on 680,000 hours of multilingual and
@@ -26,6 +26,8 @@ Download the model. 2. Instantiate the PyTorch model pipeline. 3. Export
the ONNX model and convert it to OpenVINO IR, using model conversion
API. 4. Run the Whisper pipeline with OpenVINO models.
+.. _top:
+
**Table of contents**:
- `Prerequisites <#prerequisites>`__
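
A small sketch of the PyTorch side of the pipeline above (steps 1-2), using the openai-whisper package; the model size and media path are placeholders:

.. code:: python

    # Model size and media path are placeholders.
    import whisper

    model = whisper.load_model("base")
    result = model.transcribe("downloaded_video.mp4")
    for seg in result["segments"]:
        print(f"{seg['start']:7.1f}s  {seg['text']}")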
diff --git a/docs/notebooks/228-clip-zero-shot-convert-with-output.rst b/docs/notebooks/228-clip-zero-shot-convert-with-output.rst
index 913817a8a4e..63f70768c20 100644
--- a/docs/notebooks/228-clip-zero-shot-convert-with-output.rst
+++ b/docs/notebooks/228-clip-zero-shot-convert-with-output.rst
@@ -1,7 +1,7 @@
Zero-shot Image Classification with OpenAI CLIP and OpenVINO™
=============================================================
-.. _top:
+
Zero-shot image classification is a computer vision task to classify
images into one of several classes without any prior training or
@@ -30,6 +30,8 @@ image classification. The notebook contains the following steps:
conversion API.
4. Run CLIP with OpenVINO.
+.. _top:
+
**Table of contents**:
- `Instantiate model <#instantiate-model>`__
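
Zero-shot classification with CLIP reduces to comparing image and text embeddings. A hedged sketch with the Hugging Face CLIP classes; the checkpoint, image path, and labels are illustrative:

.. code:: python

    # Checkpoint, image, and labels are illustrative.
    from PIL import Image
    from transformers import CLIPModel, CLIPProcessor

    ckpt = "openai/clip-vit-base-patch16"
    model = CLIPModel.from_pretrained(ckpt)
    processor = CLIPProcessor.from_pretrained(ckpt)

    image = Image.open("sample.jpg")
    labels = ["a cat", "a dog", "a bird"]
    inputs = processor(text=labels, images=image, return_tensors="pt", padding=True)
    probs = model(**inputs).logits_per_image.softmax(dim=-1)
    print(dict(zip(labels, probs[0].tolist())))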
diff --git a/docs/notebooks/228-clip-zero-shot-quantize-with-output.rst b/docs/notebooks/228-clip-zero-shot-quantize-with-output.rst
index f6c2d4fb2f0..1e335a73b2f 100644
--- a/docs/notebooks/228-clip-zero-shot-quantize-with-output.rst
+++ b/docs/notebooks/228-clip-zero-shot-quantize-with-output.rst
@@ -1,7 +1,7 @@
Post-Training Quantization of OpenAI CLIP model with NNCF
=========================================================
-.. _top:
+
The goal of this tutorial is to demonstrate how to speed up the model by
applying 8-bit post-training quantization from
@@ -23,6 +23,8 @@ The optimization process contains the following steps:
notebook first to generate the OpenVINO IR model that is used for
quantization.
+.. _top:
+
**Table of contents**:
- `Prerequisites <#prerequisites>`__
diff --git a/docs/notebooks/229-distilbert-sequence-classification-with-output.rst b/docs/notebooks/229-distilbert-sequence-classification-with-output.rst
index 514d49925a5..018993b6f03 100644
--- a/docs/notebooks/229-distilbert-sequence-classification-with-output.rst
+++ b/docs/notebooks/229-distilbert-sequence-classification-with-output.rst
@@ -1,7 +1,7 @@
Sentiment Analysis with OpenVINO™
=================================
-.. _top:
+
**Sentiment analysis** is the use of natural language processing, text
analysis, computational linguistics, and biometrics to systematically
@@ -9,6 +9,8 @@ identify, extract, quantify, and study affective states and subjective
information. This notebook demonstrates how to convert and run a
sequence classification model using OpenVINO.
+.. _top:
+
**Table of contents**:
- `Imports <#imports>`__
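
A minimal sketch of sequence classification as described above; the DistilBERT sentiment checkpoint named here is an assumption, not necessarily the notebook's:

.. code:: python

    # The checkpoint is an assumption.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    ckpt = "distilbert-base-uncased-finetuned-sst-2-english"
    tokenizer = AutoTokenizer.from_pretrained(ckpt)
    model = AutoModelForSequenceClassification.from_pretrained(ckpt)

    inputs = tokenizer("OpenVINO makes this model fast!", return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    print(model.config.id2label[int(logits.argmax())])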
diff --git a/docs/notebooks/230-yolov8-optimization-with-output.rst b/docs/notebooks/230-yolov8-optimization-with-output.rst
index 28d3b14a051..f3083e063aa 100644
--- a/docs/notebooks/230-yolov8-optimization-with-output.rst
+++ b/docs/notebooks/230-yolov8-optimization-with-output.rst
@@ -1,7 +1,7 @@
Convert and Optimize YOLOv8 with OpenVINO™
==========================================
-.. _top:
+
The YOLOv8 algorithm developed by Ultralytics is a cutting-edge,
state-of-the-art (SOTA) model that is designed to be fast, accurate, and
@@ -39,6 +39,8 @@ The tutorial consists of the following steps:
- Compare performance of the FP32 and quantized models.
- Compare accuracy of the FP32 and quantized models.
+.. _top:
+
**Table of contents**:
- `Get Pytorch model <#get-pytorch-model>`__
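
One way to obtain the PyTorch model and an OpenVINO IR from it is the ultralytics export API; whether the notebook uses this exact path is an assumption:

.. code:: python

    # Whether the notebook uses this exact export path is an assumption.
    from ultralytics import YOLO

    model = YOLO("yolov8n.pt")           # downloads the PyTorch checkpoint
    model.export(format="openvino")      # writes an OpenVINO IR next to it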
diff --git a/docs/notebooks/231-instruct-pix2pix-image-editing-with-output.rst b/docs/notebooks/231-instruct-pix2pix-image-editing-with-output.rst
index bf63a422e49..308a358d1c5 100644
--- a/docs/notebooks/231-instruct-pix2pix-image-editing-with-output.rst
+++ b/docs/notebooks/231-instruct-pix2pix-image-editing-with-output.rst
@@ -1,7 +1,7 @@
Image Editing with InstructPix2Pix and OpenVINO
===============================================
-.. _top:
+
InstructPix2Pix is a conditional diffusion model that edits images
based on written instructions provided by the user. Generative image
@@ -31,6 +31,8 @@ Notebook contains the following steps:
3. Run InstructPix2Pix pipeline with OpenVINO.
+.. _top:
+
**Table of contents**:
- `Prerequisites <#prerequisites>`__
diff --git a/docs/notebooks/233-blip-visual-language-processing-with-output.rst b/docs/notebooks/233-blip-visual-language-processing-with-output.rst
index 2637f314bf1..8468422b451 100644
--- a/docs/notebooks/233-blip-visual-language-processing-with-output.rst
+++ b/docs/notebooks/233-blip-visual-language-processing-with-output.rst
@@ -1,7 +1,7 @@
Visual Question Answering and Image Captioning using BLIP and OpenVINO
======================================================================
-.. _top:
+
Humans perceive the world through vision and language. A longtime goal
of AI is to build intelligent agents that can understand the world
@@ -24,6 +24,8 @@ The tutorial consists of the following parts:
2. Convert the BLIP model to OpenVINO IR.
3. Run visual question answering and image captioning with OpenVINO.
+.. _top:
+
**Table of contents**:
- `Background <#background>`__
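
A hedged sketch of visual question answering with the Hugging Face BLIP classes; the checkpoint, image, and question are illustrative:

.. code:: python

    # Checkpoint, image, and question are illustrative.
    from PIL import Image
    from transformers import BlipForQuestionAnswering, BlipProcessor

    ckpt = "Salesforce/blip-vqa-base"
    processor = BlipProcessor.from_pretrained(ckpt)
    model = BlipForQuestionAnswering.from_pretrained(ckpt)

    image = Image.open("demo.jpg")
    inputs = processor(image, "How many dogs are in the picture?", return_tensors="pt")
    answer_ids = model.generate(**inputs)
    print(processor.decode(answer_ids[0], skip_special_tokens=True))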
diff --git a/docs/notebooks/234-encodec-audio-compression-with-output.rst b/docs/notebooks/234-encodec-audio-compression-with-output.rst
index 309214879cd..7e98b009f94 100644
--- a/docs/notebooks/234-encodec-audio-compression-with-output.rst
+++ b/docs/notebooks/234-encodec-audio-compression-with-output.rst
@@ -1,7 +1,7 @@
Audio compression with EnCodec and OpenVINO
===========================================
-.. _top:
+
Compression is an important part of the Internet today because it
enables people to easily share high-quality photos, listen to audio
@@ -28,6 +28,8 @@ and original `repo `__.
image.png
+.. _top:
+
**Table of contents**:
- `Prerequisites <#prerequisites>`__
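
A small sketch of round-tripping audio through EnCodec with the encodec package; one second of silence keeps the example self-contained:

.. code:: python

    # One second of silence keeps the example self-contained.
    import torch
    from encodec import EncodecModel

    model = EncodecModel.encodec_model_24khz()
    model.set_target_bandwidth(6.0)      # kbps

    wav = torch.zeros(1, model.channels, model.sample_rate)
    with torch.no_grad():
        frames = model.encode(wav)       # compact discrete codes
        restored = model.decode(frames)  # waveform reconstructed from codes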
diff --git a/docs/notebooks/235-controlnet-stable-diffusion-with-output.rst b/docs/notebooks/235-controlnet-stable-diffusion-with-output.rst
index 1ce9e215d76..3ab1065358f 100644
--- a/docs/notebooks/235-controlnet-stable-diffusion-with-output.rst
+++ b/docs/notebooks/235-controlnet-stable-diffusion-with-output.rst
@@ -1,7 +1,7 @@
Text-to-Image Generation with ControlNet Conditioning
=====================================================
-.. _top:
+
Diffusion models have sparked a revolution in AI-generated art. This technology
enables the creation of high-quality images simply by writing a text prompt.
@@ -141,6 +141,8 @@ of the target in the image:
This tutorial focuses mainly on conditioning by pose. However, the
discussed steps are also applicable to other annotation modes.
+.. _top:
+
**Table of contents**:
- `Prerequisites <#prerequisites>`__
diff --git a/docs/notebooks/236-stable-diffusion-v2-infinite-zoom-with-output.rst b/docs/notebooks/236-stable-diffusion-v2-infinite-zoom-with-output.rst
index 6916ae2fd5f..4a1e447144f 100644
--- a/docs/notebooks/236-stable-diffusion-v2-infinite-zoom-with-output.rst
+++ b/docs/notebooks/236-stable-diffusion-v2-infinite-zoom-with-output.rst
@@ -1,7 +1,7 @@
Infinite Zoom Stable Diffusion v2 and OpenVINO™
===============================================
-.. _top:
+
Stable Diffusion v2 is the next generation of the Stable Diffusion model, a
text-to-image latent diffusion model created by the researchers and
@@ -74,6 +74,8 @@ Notebook contains the following steps:
3. Run the Stable Diffusion v2 inpainting pipeline to generate an infinite
zoom video
+.. _top:
+
**Table of contents**:
- `Stable Diffusion v2 Infinite Zoom Showcase <#stable-diffusion-v2-infinite-zoom-showcase>`__
diff --git a/docs/notebooks/236-stable-diffusion-v2-optimum-demo-comparison-with-output.rst b/docs/notebooks/236-stable-diffusion-v2-optimum-demo-comparison-with-output.rst
index 59df2505a79..ff8f9a9350f 100644
--- a/docs/notebooks/236-stable-diffusion-v2-optimum-demo-comparison-with-output.rst
+++ b/docs/notebooks/236-stable-diffusion-v2-optimum-demo-comparison-with-output.rst
@@ -1,10 +1,12 @@
Stable Diffusion v2.1 using Optimum-Intel OpenVINO and multiple Intel Hardware
==============================================================================
-.. _top:
+
|image0|
+.. _top:
+
**Table of contents**:
- `Showing Info Available Devices <#showing-info-available-devices>`__
diff --git a/docs/notebooks/236-stable-diffusion-v2-optimum-demo-with-output.rst b/docs/notebooks/236-stable-diffusion-v2-optimum-demo-with-output.rst
index 59641538c13..f44eda207c3 100644
--- a/docs/notebooks/236-stable-diffusion-v2-optimum-demo-with-output.rst
+++ b/docs/notebooks/236-stable-diffusion-v2-optimum-demo-with-output.rst
@@ -1,10 +1,12 @@
Stable Diffusion v2.1 using Optimum-Intel OpenVINO
==================================================
-.. _top:
+
|image0|
+.. _top:
+
**Table of contents**:
- `Showing Info Available Devices <#showing-info-available-devices>`__
diff --git a/docs/notebooks/236-stable-diffusion-v2-text-to-image-demo-with-output.rst b/docs/notebooks/236-stable-diffusion-v2-text-to-image-demo-with-output.rst
index fc046861222..7cd65143c0b 100644
--- a/docs/notebooks/236-stable-diffusion-v2-text-to-image-demo-with-output.rst
+++ b/docs/notebooks/236-stable-diffusion-v2-text-to-image-demo-with-output.rst
@@ -1,7 +1,7 @@
Stable Diffusion Text-to-Image Demo
===================================
-.. _top:
+
Stable Diffusion is an innovative generative AI technique that allows us
to generate and manipulate images in interesting ways, including
@@ -26,6 +26,8 @@ promising results for selecting a wide range of input text prompts!
`236-stable-diffusion-v2-text-to-image `__.
+.. _top:
+
**Table of contents**:
- `Step 0: Install and import prerequisites <#step-0-install-and-import-prerequisites>`__
diff --git a/docs/notebooks/236-stable-diffusion-v2-text-to-image-with-output.rst b/docs/notebooks/236-stable-diffusion-v2-text-to-image-with-output.rst
index 826dc04d7ee..f8cb417e3cf 100644
--- a/docs/notebooks/236-stable-diffusion-v2-text-to-image-with-output.rst
+++ b/docs/notebooks/236-stable-diffusion-v2-text-to-image-with-output.rst
@@ -1,7 +1,7 @@
Text-to-Image Generation with Stable Diffusion v2 and OpenVINO™
===============================================================
-.. _top:
+
Stable Diffusion v2 is the next generation of the Stable Diffusion model, a
text-to-image latent diffusion model created by the researchers and
@@ -81,6 +81,8 @@ Notebook contains the following steps:
notebook `__.
+.. _top:
+
**Table of contents**:
- `Prerequisites <#prerequisites>`__
diff --git a/docs/notebooks/237-segment-anything-with-output.rst b/docs/notebooks/237-segment-anything-with-output.rst
index 454adae0660..25969d47260 100644
--- a/docs/notebooks/237-segment-anything-with-output.rst
+++ b/docs/notebooks/237-segment-anything-with-output.rst
@@ -1,6 +1,8 @@
Object masks from prompts with SAM and OpenVINO
===============================================
+
+
.. _top:
**Table of contents**:
diff --git a/docs/notebooks/238-deep-floyd-if-with-output.rst b/docs/notebooks/238-deep-floyd-if-with-output.rst
index 7585c074bad..5701933a9ef 100644
--- a/docs/notebooks/238-deep-floyd-if-with-output.rst
+++ b/docs/notebooks/238-deep-floyd-if-with-output.rst
@@ -1,8 +1,6 @@
Image generation with DeepFloyd IF and OpenVINO™
================================================
-.. _top:
-
DeepFloyd IF is an advanced open-source text-to-image model that
delivers remarkable photorealism and language comprehension. DeepFloyd
IF consists of a frozen text encoder and three cascaded pixel diffusion
@@ -78,6 +76,10 @@ vector in embedded space.
conventional Super Resolution network to get hi-res results.
+
+
+.. _top:
+
**Table of contents**:
- `Prerequisites <#prerequisites>`__
diff --git a/docs/notebooks/239-image-bind-convert-with-output.rst b/docs/notebooks/239-image-bind-convert-with-output.rst
index bc4a983a5a2..ffd69a13191 100644
--- a/docs/notebooks/239-image-bind-convert-with-output.rst
+++ b/docs/notebooks/239-image-bind-convert-with-output.rst
@@ -1,7 +1,7 @@
Binding multimodal data using ImageBind and OpenVINO
====================================================
-.. _top:
+
Exploring the surrounding world, people get information using multiple
senses, for example, seeing a busy street and hearing the sounds of car
@@ -69,6 +69,8 @@ represented on the image below:
In this tutorial, we consider how to use ImageBind for multimodal
zero-shot classification.
+.. _top:
+
**Table of contents**:
- `Prerequisites <#prerequisites>`__
diff --git a/docs/notebooks/240-dolly-2-instruction-following-with-output.rst b/docs/notebooks/240-dolly-2-instruction-following-with-output.rst
index bbc1e240159..9b450eb9902 100644
--- a/docs/notebooks/240-dolly-2-instruction-following-with-output.rst
+++ b/docs/notebooks/240-dolly-2-instruction-following-with-output.rst
@@ -1,7 +1,7 @@
Instruction following using Databricks Dolly 2.0 and OpenVINO
=============================================================
-.. _top:
+
Instruction following is one of the cornerstones of the current
generation of large language models (LLMs). Reinforcement learning with
@@ -82,6 +82,8 @@ post `__
+.. _top:
+
**Table of contents**:
- `Prerequisites <#prerequisites>`__
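
A hedged sketch of instruction following with Dolly 2.0 through the Transformers pipeline, following the model card's suggested usage; the prompt and dtype are illustrative:

.. code:: python

    # Follows the model card's suggested usage; prompt is illustrative.
    import torch
    from transformers import pipeline

    generate = pipeline(
        model="databricks/dolly-v2-3b",
        torch_dtype=torch.bfloat16,
        trust_remote_code=True,          # pulls in Dolly's instruction pipeline
    )
    print(generate("Explain in one sentence what OpenVINO does."))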
diff --git a/docs/notebooks/241-riffusion-text-to-music-with-output.rst b/docs/notebooks/241-riffusion-text-to-music-with-output.rst
index cae9b6e81d1..d8eb9cb1462 100644
--- a/docs/notebooks/241-riffusion-text-to-music-with-output.rst
+++ b/docs/notebooks/241-riffusion-text-to-music-with-output.rst
@@ -1,7 +1,7 @@
Text-to-Music generation using Riffusion and OpenVINO
=====================================================
-.. _top:
+
`Riffusion `__ is a
latent text-to-image diffusion model capable of generating spectrogram
@@ -76,6 +76,8 @@ The STFT is invertible, so the original audio can be reconstructed from
a spectrogram. This idea is behind the approach of using Riffusion for
audio generation.
+.. _top:
+
**Table of contents**:
- `Prerequisites <#prerequisites>`__
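
As a quick illustration of that invertibility (not the notebook's code), librosa's Griffin-Lim can recover a waveform from a magnitude-only spectrogram; the bundled example clip is just for demonstration:

.. code:: python

    # Not the notebook's code; just demonstrates the invertibility claim.
    import librosa
    import numpy as np

    y, sr = librosa.load(librosa.ex("trumpet"))
    magnitude = np.abs(librosa.stft(y))         # phase is discarded here
    y_restored = librosa.griffinlim(magnitude)  # phase re-estimated iteratively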
diff --git a/docs/notebooks/242-freevc-voice-conversion-with-output.rst b/docs/notebooks/242-freevc-voice-conversion-with-output.rst
index 5fcb41ebaf5..1c39257b4a7 100644
--- a/docs/notebooks/242-freevc-voice-conversion-with-output.rst
+++ b/docs/notebooks/242-freevc-voice-conversion-with-output.rst
@@ -1,7 +1,7 @@
High-Quality Text-Free One-Shot Voice Conversion with FreeVC and OpenVINO™
==========================================================================
-.. _top:
+
`FreeVC `__ allows altering the voice of
a source speaker to a target style, while keeping the linguistic content
@@ -30,6 +30,8 @@ devices. It consists of the following steps:
- Convert models to OpenVINO Intermediate Representation.
- Inference using only OpenVINO’s IR models.
+.. _top:
+
**Table of contents**:
- `Prerequisites <#prerequisites>`__
diff --git a/docs/notebooks/243-tflite-selfie-segmentation-with-output.rst b/docs/notebooks/243-tflite-selfie-segmentation-with-output.rst
index 69a2c1eecdd..c709cd516e9 100644
--- a/docs/notebooks/243-tflite-selfie-segmentation-with-output.rst
+++ b/docs/notebooks/243-tflite-selfie-segmentation-with-output.rst
@@ -1,7 +1,7 @@
Selfie Segmentation using TFLite and OpenVINO
=============================================
-.. _top:
+
The Selfie segmentation pipeline allows developers to easily separate
the background from users within a scene and focus on what matters.
@@ -36,6 +36,8 @@ The tutorial consists of the following steps:
2. Run inference on the image.
3. Run interactive background blurring demo on video.
+.. _top:
+
**Table of contents**:
- `Prerequisites <#prerequisites>`__
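
A hedged sketch of the background-blurring step listed above: composite the sharp foreground over a blurred frame using the segmentation mask. Here `mask` is assumed to be the model's per-pixel foreground probability, already resized to the frame:

.. code:: python

    # `mask` is assumed to be the model's foreground probability map,
    # already resized to the frame's height and width.
    import cv2
    import numpy as np

    def blur_background(frame: np.ndarray, mask: np.ndarray) -> np.ndarray:
        blurred = cv2.GaussianBlur(frame, (55, 55), 0)
        keep = mask[..., np.newaxis] > 0.5       # broadcast over channels
        return np.where(keep, frame, blurred)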
diff --git a/docs/notebooks/244-named-entity-recognition-with-output.rst b/docs/notebooks/244-named-entity-recognition-with-output.rst
index 40dcb1455d7..dd6af58fd7b 100644
--- a/docs/notebooks/244-named-entity-recognition-with-output.rst
+++ b/docs/notebooks/244-named-entity-recognition-with-output.rst
@@ -1,7 +1,7 @@
Named entity recognition with OpenVINO™
=======================================
-.. _top:
+
Named entity recognition (NER) is a natural language processing
method that involves detecting key information in the
@@ -27,6 +27,8 @@ To simplify the user experience, the `Hugging Face
Optimum `__ library is used to
convert the model to OpenVINO™ IR format and quantize it.
+.. _top:
+
**Table of contents**:
- `Prerequisites <#prerequisites>`__
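
A hedged sketch of the Optimum-based conversion mentioned above; the checkpoint is illustrative, and older optimum-intel releases spell ``export=True`` as ``from_transformers=True``:

.. code:: python

    # Checkpoint is illustrative; older optimum-intel releases use
    # from_transformers=True instead of export=True.
    from optimum.intel.openvino import OVModelForTokenClassification
    from transformers import AutoTokenizer, pipeline

    ckpt = "dslim/bert-base-NER"
    model = OVModelForTokenClassification.from_pretrained(ckpt, export=True)
    tokenizer = AutoTokenizer.from_pretrained(ckpt)

    ner = pipeline("token-classification", model=model, tokenizer=tokenizer)
    print(ner("OpenVINO is developed by Intel."))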
diff --git a/docs/notebooks/248-stable-diffusion-xl-with-output.rst b/docs/notebooks/248-stable-diffusion-xl-with-output.rst
index 457c66ce539..594fb4f1a7b 100644
--- a/docs/notebooks/248-stable-diffusion-xl-with-output.rst
+++ b/docs/notebooks/248-stable-diffusion-xl-with-output.rst
@@ -1,7 +1,7 @@
Image generation with Stable Diffusion XL and OpenVINO
======================================================
-.. _top:
+
Stable Diffusion XL or SDXL is the latest image generation model that is
tailored towards more photorealistic outputs with more detailed imagery
@@ -67,6 +67,8 @@ The tutorial consists of the following steps:
Some demonstrated models can require at least 64GB RAM for
conversion and running.
+.. _top:
+
**Table of contents**:
- `Install Prerequisites <#install-prerequisites>`__
diff --git a/docs/notebooks/250-music-generation-with-output.rst b/docs/notebooks/250-music-generation-with-output.rst
index 1339c538e7e..733e303c35f 100644
--- a/docs/notebooks/250-music-generation-with-output.rst
+++ b/docs/notebooks/250-music-generation-with-output.rst
@@ -1,7 +1,7 @@
Controllable Music Generation with MusicGen and OpenVINO
========================================================
-.. _top:
+
MusicGen is a single-stage auto-regressive Transformer model capable of
generating high-quality music samples conditioned on text descriptions
@@ -32,6 +32,8 @@ We will use a model implementation from the `Hugging Face
Transformers `__
library.
+.. _top:
+
**Table of contents**:
- `Requirements and Imports <#prerequisites>`__
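
A hedged sketch of text-conditioned generation with the Transformers MusicGen classes; the checkpoint, prompt, and token budget are illustrative:

.. code:: python

    # Checkpoint, prompt, and token budget are illustrative.
    from transformers import AutoProcessor, MusicgenForConditionalGeneration

    ckpt = "facebook/musicgen-small"
    processor = AutoProcessor.from_pretrained(ckpt)
    model = MusicgenForConditionalGeneration.from_pretrained(ckpt)

    inputs = processor(text=["80s synth-pop with a driving bass line"],
                       padding=True, return_tensors="pt")
    audio = model.generate(**inputs, max_new_tokens=256)   # a few seconds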
diff --git a/docs/notebooks/251-tiny-sd-image-generation-with-output.rst b/docs/notebooks/251-tiny-sd-image-generation-with-output.rst
index f8043dfe552..b2afd5f5c58 100644
--- a/docs/notebooks/251-tiny-sd-image-generation-with-output.rst
+++ b/docs/notebooks/251-tiny-sd-image-generation-with-output.rst
@@ -1,7 +1,7 @@
Image Generation with Tiny-SD and OpenVINO™
===========================================
-.. _top:
+
In recent times, the AI community has witnessed a remarkable surge in
the development of larger and more performant language models, such as
@@ -41,7 +41,9 @@ The notebook contains the following steps:
3. Run Inference pipeline with OpenVINO.
4. Run Interactive demo for Tiny-SD model
-**Table of content**:
+.. _top:
+
+**Table of contents**:
- `Prerequisites <#prerequisites>`__
- `Create PyTorch Models pipeline <#create-pytorch-models-pipeline>`__
diff --git a/docs/notebooks/252-fastcomposer-image-generation-with-output.rst b/docs/notebooks/252-fastcomposer-image-generation-with-output.rst
index 891e1dd3646..d0c9a479aa0 100644
--- a/docs/notebooks/252-fastcomposer-image-generation-with-output.rst
+++ b/docs/notebooks/252-fastcomposer-image-generation-with-output.rst
@@ -1,7 +1,7 @@
`FastComposer: Tuning-Free Multi-Subject Image Generation with Localized Attention `__
=====================================================================================================================
-.. _top:
+
FastComposer uses subject embeddings extracted by an image encoder to
augment the generic text conditioning in diffusion models, enabling
@@ -32,6 +32,8 @@ different styles, actions, and contexts.
drivers in the system - changes to have compatibility with
transformers >= 4.30.1 (due to security vulnerability)
+.. _top:
+
**Table of contents**:
- `Install Prerequisites <#install-prerequisites>`__
diff --git a/docs/notebooks/253-zeroscope-text2video-with-output.rst b/docs/notebooks/253-zeroscope-text2video-with-output.rst
index 4a538a6a8fc..549a1ce04e5 100644
--- a/docs/notebooks/253-zeroscope-text2video-with-output.rst
+++ b/docs/notebooks/253-zeroscope-text2video-with-output.rst
@@ -1,7 +1,7 @@
Video generation with ZeroScope and OpenVINO
============================================
-.. _top:
+
The ZeroScope model is a free and open-source text-to-video model that
can generate realistic and engaging videos from text descriptions. It is
@@ -34,6 +34,8 @@ Both versions of the ZeroScope model are available on Hugging Face:
We will use the first one.
+.. _top:
+
**Table of contents**:
- `Install and import required packages <#install-and-import-required-packages>`__
diff --git a/docs/notebooks/301-tensorflow-training-openvino-nncf-with-output.rst b/docs/notebooks/301-tensorflow-training-openvino-nncf-with-output.rst
index 353297f1805..6054fb8ae8c 100644
--- a/docs/notebooks/301-tensorflow-training-openvino-nncf-with-output.rst
+++ b/docs/notebooks/301-tensorflow-training-openvino-nncf-with-output.rst
@@ -11,6 +11,8 @@ A custom dataloader and metric will be defined, and accuracy and
performance will be computed for the original IR model and the quantized
model.
+.. _top:
+
**Table of contents**:
- `Preparation <#preparation>`__
diff --git a/docs/notebooks/301-tensorflow-training-openvino-with-output.rst b/docs/notebooks/301-tensorflow-training-openvino-with-output.rst
index 53b511021f8..0b02ba0ee4f 100644
--- a/docs/notebooks/301-tensorflow-training-openvino-with-output.rst
+++ b/docs/notebooks/301-tensorflow-training-openvino-with-output.rst
@@ -1,6 +1,8 @@
From Training to Deployment with TensorFlow and OpenVINO™
=========================================================
+
+
.. _top:
**Table of contents**:
diff --git a/docs/notebooks/302-pytorch-quantization-aware-training-with-output.rst b/docs/notebooks/302-pytorch-quantization-aware-training-with-output.rst
index 766537b933d..3cc99a837ea 100644
--- a/docs/notebooks/302-pytorch-quantization-aware-training-with-output.rst
+++ b/docs/notebooks/302-pytorch-quantization-aware-training-with-output.rst
@@ -1,7 +1,7 @@
Quantization Aware Training with NNCF, using PyTorch framework
==============================================================
-.. _top:
+
This notebook is based on `ImageNet training in
PyTorch `__.
@@ -34,6 +34,8 @@ hub `__.
This notebook requires a C++ compiler.
+.. _top:
+
**Table of contents**:
- `Imports and Settings <#imports-and-settings>`__
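
A hedged sketch of wrapping a PyTorch model for quantization-aware training with NNCF; the config values are illustrative, and the returned model is fine-tuned with an ordinary training loop afterwards:

.. code:: python

    # Config values are illustrative; the wrapped model is then fine-tuned
    # with an ordinary PyTorch training loop.
    import torchvision
    from nncf import NNCFConfig
    from nncf.torch import create_compressed_model

    model = torchvision.models.resnet18(num_classes=10)
    nncf_config = NNCFConfig.from_dict({
        "input_info": {"sample_size": [1, 3, 224, 224]},
        "compression": {"algorithm": "quantization"},
    })
    ctrl, quantized_model = create_compressed_model(model, nncf_config)
    # ...train quantized_model, then ctrl.export_model("model_int8.onnx")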
diff --git a/docs/notebooks/305-tensorflow-quantization-aware-training-with-output.rst b/docs/notebooks/305-tensorflow-quantization-aware-training-with-output.rst
index b4673eb4c3e..8f0ad9a7f72 100644
--- a/docs/notebooks/305-tensorflow-quantization-aware-training-with-output.rst
+++ b/docs/notebooks/305-tensorflow-quantization-aware-training-with-output.rst
@@ -1,7 +1,7 @@
Quantization Aware Training with NNCF, using TensorFlow Framework
=================================================================
-.. _top:
+
The goal of this notebook is to demonstrate how to use the Neural Network
Compression Framework `NNCF `__
@@ -23,6 +23,8 @@ Imagenette is a subset of 10 easily classified classes from the ImageNet
dataset. Using the smaller model and dataset will speed up training and
download time.
+.. _top:
+
**Table of contents**:
- `Imports and Settings <#imports-and-settings>`__
diff --git a/docs/notebooks/401-object-detection-with-output.rst b/docs/notebooks/401-object-detection-with-output.rst
index bc83f4a2af3..45ee50e220e 100644
--- a/docs/notebooks/401-object-detection-with-output.rst
+++ b/docs/notebooks/401-object-detection-with-output.rst
@@ -1,7 +1,7 @@
Live Object Detection with OpenVINO™
====================================
-.. _top:
+
This notebook demonstrates live object detection with OpenVINO, using
the `SSDLite
@@ -17,6 +17,8 @@ Additionally, you can also upload a video file.
with a webcam. If you run the notebook on a server, the webcam will not work.
However, you can still do inference on a video.
+.. _top:
+
**Table of contents**:
- `Preparation <#preparation>`__
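
A hedged sketch of the capture-infer-display loop such live demos typically use; ``compiled_model``, the input size, and the layout handling are assumptions, not this notebook's exact code:

.. code:: python

    # `compiled_model`, the input size, and the layout handling are
    # assumptions; real pre/postprocessing depends on the detector.
    import cv2

    def run_live_detection(compiled_model, source=0, size=(300, 300)):
        cap = cv2.VideoCapture(source)   # 0 = webcam; pass a path on a server
        while cap.isOpened():
            ok, frame = cap.read()
            if not ok:
                break
            blob = cv2.resize(frame, size)[None].astype("float32")
            detections = compiled_model(blob)   # postprocess + draw boxes here
            cv2.imshow("detections", frame)
            if cv2.waitKey(1) == 27:            # Esc quits
                break
        cap.release()
        cv2.destroyAllWindows()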
diff --git a/docs/notebooks/402-pose-estimation-with-output.rst b/docs/notebooks/402-pose-estimation-with-output.rst
index fbee0c5e470..efe0ffcdd55 100644
--- a/docs/notebooks/402-pose-estimation-with-output.rst
+++ b/docs/notebooks/402-pose-estimation-with-output.rst
@@ -1,7 +1,7 @@
Live Human Pose Estimation with OpenVINO™
=========================================
-.. _top:
+
This notebook demonstrates live pose estimation with OpenVINO, using the
OpenPose
@@ -18,6 +18,8 @@ Additionally, you can also upload a video file.
work. However, you can still do inference on a video in the final
step.
+.. _top:
+
**Table of contents**:
- `Imports <#imports>`__
diff --git a/docs/notebooks/403-action-recognition-webcam-with-output.rst b/docs/notebooks/403-action-recognition-webcam-with-output.rst
index d0cb4b74b57..d6755518701 100644
--- a/docs/notebooks/403-action-recognition-webcam-with-output.rst
+++ b/docs/notebooks/403-action-recognition-webcam-with-output.rst
@@ -1,7 +1,7 @@
Human Action Recognition with OpenVINO™
=======================================
-.. _top:
+
This notebook demonstrates live human action recognition with OpenVINO,
using the `Action Recognition
@@ -39,6 +39,8 @@ Transformer
and
`ResNet34 `__.
+.. _top:
+
**Table of contents**:
- `Imports <#imports>`__
diff --git a/docs/notebooks/404-style-transfer-with-output.rst b/docs/notebooks/404-style-transfer-with-output.rst
index 7c5d9c10228..630aca385b8 100644
--- a/docs/notebooks/404-style-transfer-with-output.rst
+++ b/docs/notebooks/404-style-transfer-with-output.rst
@@ -1,7 +1,7 @@
Style Transfer with OpenVINO™
=============================
-.. _top:
+
This notebook demonstrates style transfer with OpenVINO, using the Style
Transfer Models from `ONNX Model
@@ -32,6 +32,8 @@ Additionally, you can also upload a video file.
but you can run inference, using a video file.
+.. _top:
+
**Table of contents**:
- `Preparation <#preparation>`__
diff --git a/docs/notebooks/405-paddle-ocr-webcam-with-output.rst b/docs/notebooks/405-paddle-ocr-webcam-with-output.rst
index 608a9d4ab58..8f11e078ae9 100644
--- a/docs/notebooks/405-paddle-ocr-webcam-with-output.rst
+++ b/docs/notebooks/405-paddle-ocr-webcam-with-output.rst
@@ -1,7 +1,7 @@
PaddleOCR with OpenVINO™
========================
-.. _top:
+
This demo shows how to run the PP-OCR model on OpenVINO natively. Instead of
exporting the PaddlePaddle model to ONNX and then converting to the
@@ -25,6 +25,8 @@ the PaddleOCR is as follows:
with a webcam. If you run the notebook on a server, the webcam will not work.
You can still do inference on a video file.
+.. _top:
+
**Table of contents**:
- `Imports <#imports>`__
diff --git a/docs/notebooks/406-3D-pose-estimation-with-output.rst b/docs/notebooks/406-3D-pose-estimation-with-output.rst
index 9038ce30981..121a5d44326 100644
--- a/docs/notebooks/406-3D-pose-estimation-with-output.rst
+++ b/docs/notebooks/406-3D-pose-estimation-with-output.rst
@@ -1,7 +1,7 @@
Live 3D Human Pose Estimation with OpenVINO
===========================================
-.. _top:
+
This notebook demonstrates live 3D Human Pose Estimation with OpenVINO
via a webcam. We utilize the model
@@ -30,6 +30,8 @@ To ensure that the results are displayed correctly, run the code in a
recommended browser on one of the following operating systems: Ubuntu,
Windows: Chrome, macOS: Safari.
+.. _top:
+
**Table of contents**:
- `Prerequisites <#prerequisites>`__
diff --git a/docs/notebooks/407-person-tracking-with-output.rst b/docs/notebooks/407-person-tracking-with-output.rst
index abc808bb273..b267e6bd9ec 100644
--- a/docs/notebooks/407-person-tracking-with-output.rst
+++ b/docs/notebooks/407-person-tracking-with-output.rst
@@ -1,7 +1,7 @@
Person Tracking with OpenVINO™
==============================
-.. _top:
+
This notebook demonstrates live person tracking with OpenVINO: it reads
frames from an input video sequence, detects people in the frames,
@@ -95,6 +95,8 @@ realtime tracking,” in ICIP, 2016, pp. 3464–3468.
.. |deepsort| image:: https://user-images.githubusercontent.com/91237924/221744683-0042eff8-2c41-43b8-b3ad-b5929bafb60b.png
+.. _top:
+
**Table of contents**:
- `Imports <#imports>`__
diff --git a/docs/tutorials.md b/docs/tutorials.md
index a4fa0ed98cb..c21005bab47 100644
--- a/docs/tutorials.md
+++ b/docs/tutorials.md
@@ -131,6 +131,15 @@ Tutorials that explain how to optimize and quantize models with OpenVINO tools.
+----------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------+
| `120-tensorflow-object-detection-to-openvino `__ |br| |n120| |br| |c120| | Convert TensorFlow Object Detection models to OpenVINO IR |
+----------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------+
+ | `122-speech-recognition-quantization-wav2vec2 `__ | Quantize Speech Recognition Models with accuracy control using NNCF PTQ API. |
+ +----------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------+
+ | `122-yolov8-quantization-with-accuracy-control `__ | Quantize YOLOv8 with accuracy control using NNCF PTQ API. |
+ +----------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------+
+
+
+
+
+
Model Demos