diff --git a/docs/notebooks/001-hello-world-with-output.rst b/docs/notebooks/001-hello-world-with-output.rst
index 1b8752b43cf..b5fc484fc04 100644
--- a/docs/notebooks/001-hello-world-with-output.rst
+++ b/docs/notebooks/001-hello-world-with-output.rst
@@ -1,7 +1,7 @@
 Hello Image Classification
 ==========================
 
-.. _top:
+
 
 This basic introduction to OpenVINO™ shows how to do inference with an
 image classification model.
@@ -15,6 +15,10 @@ created, refer to the `TensorFlow to OpenVINO
 <101-tensorflow-classification-to-openvino-with-output.html>`__
 tutorial.
 
+
+
+.. _top:
+
 **Table of contents**:
 
 - `Imports <#imports>`__
diff --git a/docs/notebooks/003-hello-segmentation-with-output.rst b/docs/notebooks/003-hello-segmentation-with-output.rst
index 664854ae8d3..de8e1d16974 100644
--- a/docs/notebooks/003-hello-segmentation-with-output.rst
+++ b/docs/notebooks/003-hello-segmentation-with-output.rst
@@ -1,7 +1,7 @@
 Hello Image Segmentation
 ========================
 
-.. _top:
+
 
 A very basic introduction to using segmentation models with OpenVINO™.
 
@@ -12,6 +12,10 @@ Zoo `__ is used. ADAS
 stands for Advanced Driver Assistance Services. The model recognizes
 four classes: background, road, curb and mark.
 
+
+
+.. _top:
+
 **Table of contents**:
 
 - `Imports <#imports>`__
diff --git a/docs/notebooks/004-hello-detection-with-output.rst b/docs/notebooks/004-hello-detection-with-output.rst
index 35d47be09d1..8a96d8e68f0 100644
--- a/docs/notebooks/004-hello-detection-with-output.rst
+++ b/docs/notebooks/004-hello-detection-with-output.rst
@@ -1,7 +1,7 @@
 Hello Object Detection
 ======================
 
-.. _top:
+
 
 A very basic introduction to using object detection models with
 OpenVINO™.
 
@@ -18,6 +18,10 @@ corner, ``(x_max, y_max)`` are the coordinates of the bottom right
 bounding box corner and ``conf`` is the confidence for the predicted
 class.
 
+
+
+.. _top:
+
 **Table of contents**:
 
 - `Imports <#imports>`__
diff --git a/docs/notebooks/101-tensorflow-classification-to-openvino-with-output.rst b/docs/notebooks/101-tensorflow-classification-to-openvino-with-output.rst
index 8e6b721fdbd..6971a252d7d 100644
--- a/docs/notebooks/101-tensorflow-classification-to-openvino-with-output.rst
+++ b/docs/notebooks/101-tensorflow-classification-to-openvino-with-output.rst
@@ -1,7 +1,7 @@
 Convert a TensorFlow Model to OpenVINO™
 =======================================
 
-.. _top:
+
 
 | This short tutorial shows how to convert a TensorFlow
   `MobileNetV3 `__
@@ -13,7 +13,11 @@ Convert a TensorFlow Model to OpenVINO™
 Runtime `__ and do
 inference with a sample image.
 
-| **Table of contents**:
+
+
+.. _top:
+
+**Table of contents**:
 
 - `Imports <#imports>`__
 - `Settings <#settings>`__
diff --git a/docs/notebooks/102-pytorch-onnx-to-openvino-with-output.rst b/docs/notebooks/102-pytorch-onnx-to-openvino-with-output.rst
index c310e83f56d..4ff0c24ecd7 100644
--- a/docs/notebooks/102-pytorch-onnx-to-openvino-with-output.rst
+++ b/docs/notebooks/102-pytorch-onnx-to-openvino-with-output.rst
@@ -1,7 +1,7 @@
 Convert a PyTorch Model to ONNX and OpenVINO™ IR
 ================================================
 
-.. _top:
+
 
 This tutorial demonstrates step-by-step instructions on how to do
 inference on a PyTorch semantic segmentation model, using OpenVINO
@@ -35,6 +35,10 @@ plant, sheep, sofa, train, tv monitor**
 
 More information about the model is available in the `torchvision
 documentation `__
 
+
+
+.. _top:
+
 
 **Table of contents**:
 
 - `Preparation <#preparation>`__
diff --git a/docs/notebooks/102-pytorch-to-openvino-with-output.rst b/docs/notebooks/102-pytorch-to-openvino-with-output.rst
index cf6e83887ca..be0a9038b08 100644
--- a/docs/notebooks/102-pytorch-to-openvino-with-output.rst
+++ b/docs/notebooks/102-pytorch-to-openvino-with-output.rst
@@ -1,7 +1,7 @@
 Convert a PyTorch Model to OpenVINO™ IR
 =======================================
 
-.. _top:
+
 
 This tutorial demonstrates step-by-step instructions on how to do
 inference on a PyTorch classification model using OpenVINO Runtime.
 
@@ -31,6 +31,10 @@ but elevated to the design space level. The RegNet design space
 provides simple and fast networks that work well across a wide range of
 flop regimes.
 
+
+
+.. _top:
+
 **Table of contents**:
 
 - `Prerequisites <#prerequisites>`__
diff --git a/docs/notebooks/103-paddle-to-openvino-classification-with-output.rst b/docs/notebooks/103-paddle-to-openvino-classification-with-output.rst
index 082be1d6643..94f284cf674 100644
--- a/docs/notebooks/103-paddle-to-openvino-classification-with-output.rst
+++ b/docs/notebooks/103-paddle-to-openvino-classification-with-output.rst
@@ -1,7 +1,7 @@
 Convert a PaddlePaddle Model to OpenVINO™ IR
 ============================================
 
-.. _top:
+
 
 This notebook shows how to convert a MobileNetV3 model from
 `PaddleHub `__, pre-trained
 
@@ -16,6 +16,10 @@ IR model.
 
 Source of the `model `__.
 
+
+
+.. _top:
+
 **Table of contents**:
 
 - `Preparation <#preparation>`__
diff --git a/docs/notebooks/104-model-tools-with-output.rst b/docs/notebooks/104-model-tools-with-output.rst
index 62dcd3132ea..441028017b4 100644
--- a/docs/notebooks/104-model-tools-with-output.rst
+++ b/docs/notebooks/104-model-tools-with-output.rst
@@ -1,13 +1,15 @@
 Working with Open Model Zoo Models
 ==================================
 
-.. _top:
+
 
 This tutorial shows how to download a model from `Open Model
 Zoo `__, convert it to OpenVINO™ IR format, show
 information about the model, and benchmark the model.
 
+.. _top:
+
 **Table of contents**:
 
 - `OpenVINO and Open Model Zoo Tools <#openvino-and-open-model-zoo-tools>`__
diff --git a/docs/notebooks/105-language-quantize-bert-with-output.rst b/docs/notebooks/105-language-quantize-bert-with-output.rst
index cbd1ec2b557..c7cdfb21086 100644
--- a/docs/notebooks/105-language-quantize-bert-with-output.rst
+++ b/docs/notebooks/105-language-quantize-bert-with-output.rst
@@ -1,7 +1,7 @@
 Quantize NLP models with Post-Training Quantization in NNCF
 ============================================================
 
-.. _top:
+
 
 This tutorial demonstrates how to apply ``INT8`` quantization to the
 Natural Language Processing model known as
 
@@ -24,6 +24,10 @@ and datasets. It consists of the following steps:
 
 - Compare the performance of the original, converted and quantized
   models.
 
+
+
+.. _top:
+
 **Table of contents**:
 
 - `Imports <#imports>`__
diff --git a/docs/notebooks/106-auto-device-with-output.rst b/docs/notebooks/106-auto-device-with-output.rst
index 3e51a92ee2e..98166495d23 100644
--- a/docs/notebooks/106-auto-device-with-output.rst
+++ b/docs/notebooks/106-auto-device-with-output.rst
@@ -1,8 +1,6 @@
 Automatic Device Selection with OpenVINO™
 =========================================
 
-.. _top:
-
 The `Auto device `__
 (or AUTO in short) selects the most suitable device for inference by
 
@@ -32,6 +30,10 @@ first inference.
 
    auto
 
+
+
+.. _top:
+
 **Table of contents**:
 
 - `Import modules and create Core <#import-modules-and-create-core>`__
diff --git a/docs/notebooks/107-speech-recognition-quantization-data2vec-with-output.rst b/docs/notebooks/107-speech-recognition-quantization-data2vec-with-output.rst
index 8b1b221b0aa..39cf07b4452 100644
--- a/docs/notebooks/107-speech-recognition-quantization-data2vec-with-output.rst
+++ b/docs/notebooks/107-speech-recognition-quantization-data2vec-with-output.rst
@@ -1,8 +1,6 @@
 Quantize Speech Recognition Models using NNCF PTQ API
 =====================================================
 
-.. _top:
-
 This tutorial demonstrates how to use the NNCF (Neural Network
 Compression Framework) 8-bit quantization in post-training mode (without
 the fine-tuning pipeline) to optimize the speech recognition model,
 
@@ -21,6 +19,10 @@ steps:
 
 - Compare performance of the original and quantized models.
 - Compare Accuracy of the Original and Quantized Models.
 
+
+
+.. _top:
+
 **Table of contents**:
 
 - `Download and prepare model <#download-and-prepare-model>`__
diff --git a/docs/notebooks/108-gpu-device-with-output.rst b/docs/notebooks/108-gpu-device-with-output.rst
index 78eec1cf09b..9d7f69faec7 100644
--- a/docs/notebooks/108-gpu-device-with-output.rst
+++ b/docs/notebooks/108-gpu-device-with-output.rst
@@ -1,6 +1,8 @@
 Working with GPUs in OpenVINO™
 ==============================
 
+
+
 .. _top:
 
 **Table of contents**:
diff --git a/docs/notebooks/109-latency-tricks-with-output.rst b/docs/notebooks/109-latency-tricks-with-output.rst
index f939f5e5d4a..5d2d14fa85d 100644
--- a/docs/notebooks/109-latency-tricks-with-output.rst
+++ b/docs/notebooks/109-latency-tricks-with-output.rst
@@ -1,8 +1,6 @@
 Performance tricks in OpenVINO for latency mode
 ===============================================
 
-.. _top:
-
 The goal of this notebook is to provide a step-by-step tutorial for
 improving performance for inferencing in a latency mode. Low latency is
 especially desired in real-time applications when the results are needed
 
@@ -51,6 +49,10 @@ optimize performance on OpenVINO IR files in
 
 A similar notebook focused on the throughput mode is available
 `here <109-throughput-tricks-with-output.html>`__.
 
+
+
+.. _top:
+
 **Table of contents**:
 
 - `Data <#data>`__
diff --git a/docs/notebooks/109-throughput-tricks-with-output.rst b/docs/notebooks/109-throughput-tricks-with-output.rst
index d01b7d3f3dc..c5e7a2c9646 100644
--- a/docs/notebooks/109-throughput-tricks-with-output.rst
+++ b/docs/notebooks/109-throughput-tricks-with-output.rst
@@ -1,7 +1,7 @@
 Performance tricks in OpenVINO for throughput mode
 ==================================================
 
-.. _top:
+
 
 The goal of this notebook is to provide a step-by-step tutorial for
 improving performance for inferencing in a throughput mode. High
 
@@ -46,6 +46,10 @@ optimize performance on OpenVINO IR files in
 
 A similar notebook focused on the latency mode is available
 `here <109-latency-tricks-with-output.html>`__.
 
+
+
+.. _top:
+
 **Table of contents**:
 
 - `Data <#data>`__
diff --git a/docs/notebooks/110-ct-scan-live-inference-with-output.rst b/docs/notebooks/110-ct-scan-live-inference-with-output.rst
index 7d543aa06d8..0f3e10cca74 100644
--- a/docs/notebooks/110-ct-scan-live-inference-with-output.rst
+++ b/docs/notebooks/110-ct-scan-live-inference-with-output.rst
@@ -1,8 +1,6 @@
 Live Inference and Benchmark CT-scan Data with OpenVINO™
 ========================================================
 
-.. _top:
-
 Kidney Segmentation with PyTorch Lightning and OpenVINO™ - Part 4
 -----------------------------------------------------------------
 
@@ -30,6 +28,10 @@ notebook.
 
 For demonstration purposes, this tutorial will download one converted CT
 scan to use for inference.
 
+
+
+.. _top:
+
 **Table of contents**:
 
 - `Imports <#imports>`__
diff --git a/docs/notebooks/110-ct-segmentation-quantize-nncf-with-output.rst b/docs/notebooks/110-ct-segmentation-quantize-nncf-with-output.rst
index 2ff15e5eed4..b7089acadd6 100644
--- a/docs/notebooks/110-ct-segmentation-quantize-nncf-with-output.rst
+++ b/docs/notebooks/110-ct-segmentation-quantize-nncf-with-output.rst
@@ -1,8 +1,6 @@
 Quantize a Segmentation Model and Show Live Inference
 =====================================================
 
-.. _top:
-
 Kidney Segmentation with PyTorch Lightning and OpenVINO™ - Part 3
 -----------------------------------------------------------------
 
@@ -55,6 +53,10 @@ demonstration purposes, this tutorial will download one converted CT
 scan and use that scan for quantization and inference. For production
 purposes, use a representative dataset for quantizing the model.
 
+
+
+.. _top:
+
 **Table of contents**:
 
 - `Imports <#imports>`__
diff --git a/docs/notebooks/111-yolov5-quantization-migration-with-output.rst b/docs/notebooks/111-yolov5-quantization-migration-with-output.rst
index 230ace7db8c..6181e22d000 100644
--- a/docs/notebooks/111-yolov5-quantization-migration-with-output.rst
+++ b/docs/notebooks/111-yolov5-quantization-migration-with-output.rst
@@ -1,8 +1,6 @@
 Migrate quantization from POT API to NNCF API
 =============================================
 
-.. _top:
-
 This tutorial demonstrates how to migrate quantization pipeline written
 using the OpenVINO `Post-Training Optimization Tool (POT) `__ to
 `NNCF Post-Training Quantization API `__.
 
@@ -23,6 +21,9 @@ The tutorial consists from the following parts:
 
 7. Compare performance FP32 and INT8 models
 
+
+.. _top:
+
 **Table of contents**:
 
 - `Preparation <#preparation>`__
diff --git a/docs/notebooks/112-pytorch-post-training-quantization-nncf-with-output.rst b/docs/notebooks/112-pytorch-post-training-quantization-nncf-with-output.rst
index 16c64286c2b..69d0e04db13 100644
--- a/docs/notebooks/112-pytorch-post-training-quantization-nncf-with-output.rst
+++ b/docs/notebooks/112-pytorch-post-training-quantization-nncf-with-output.rst
@@ -1,8 +1,6 @@
 Post-Training Quantization of PyTorch models with NNCF
 ======================================================
 
-.. _top:
-
 The goal of this tutorial is to demonstrate how to use the NNCF (Neural
 Network Compression Framework) 8-bit quantization in post-training mode
 (without the fine-tuning pipeline) to optimize a PyTorch model for the
 
@@ -27,6 +25,9 @@ quantization, not demanding the fine-tuning of the model.
 
    notebook.
 
+
+.. _top:
+
 **Table of contents**:
 
 - `Preparations <#preparations>`__
diff --git a/docs/notebooks/113-image-classification-quantization-with-output.rst b/docs/notebooks/113-image-classification-quantization-with-output.rst
index d72f5e3e4c0..15e6e52b6f5 100644
--- a/docs/notebooks/113-image-classification-quantization-with-output.rst
+++ b/docs/notebooks/113-image-classification-quantization-with-output.rst
@@ -1,7 +1,7 @@
 Quantization of Image Classification Models
 ===========================================
 
-.. _top:
+
 
 This tutorial demonstrates how to apply ``INT8`` quantization to Image
 Classification model using
 
@@ -21,6 +21,8 @@ This tutorial consists of the following steps:
 
 - Compare performance of the original and quantized models.
 - Compare results on one picture.
 
+.. _top:
+
 **Table of contents**:
 
 - `Prepare the Model <#prepare-the-model>`__
diff --git a/docs/notebooks/115-async-api-with-output.rst b/docs/notebooks/115-async-api-with-output.rst
index 9f59cbc78b2..bec3bc9e219 100644
--- a/docs/notebooks/115-async-api-with-output.rst
+++ b/docs/notebooks/115-async-api-with-output.rst
@@ -1,7 +1,7 @@
 Asynchronous Inference with OpenVINO™
 =====================================
 
-.. _top:
+
 
 This notebook demonstrates how to use the `Async
 API `__
 
@@ -14,6 +14,8 @@ in parallel (for example, populating inputs or scheduling other
 requests) rather than wait for the current inference to complete
 first.
 
+.. _top:
+
 **Table of contents**:
 
 - `Imports <#imports>`__
diff --git a/docs/notebooks/116-sparsity-optimization-with-output.rst b/docs/notebooks/116-sparsity-optimization-with-output.rst
index aa321a6b57e..532094888de 100644
--- a/docs/notebooks/116-sparsity-optimization-with-output.rst
+++ b/docs/notebooks/116-sparsity-optimization-with-output.rst
@@ -1,7 +1,7 @@
 Accelerate Inference of Sparse Transformer Models with OpenVINO™ and 4th Gen Intel® Xeon® Scalable Processors
 =============================================================================================================
 
-.. _top:
+
 
 This tutorial demonstrates how to improve performance of sparse
 Transformer models with `OpenVINO `__ on 4th
 
@@ -21,6 +21,8 @@ consists of the following steps:
 
   integration with Hugging Face Optimum.
 - Compare sparse 8-bit vs. dense 8-bit inference performance.
 
+.. _top:
+
 **Table of contents**:
 
 - `Prerequisites <#prerequisites>`__
diff --git a/docs/notebooks/117-model-server-with-output.rst b/docs/notebooks/117-model-server-with-output.rst
index 54989d2a0e7..7cf130e876b 100644
--- a/docs/notebooks/117-model-server-with-output.rst
+++ b/docs/notebooks/117-model-server-with-output.rst
@@ -1,7 +1,7 @@
 Hello Model Server
 ==================
 
-.. _top:
+
 
 Introduction to OpenVINO™ Model Server (OVMS).
 
@@ -33,6 +33,8 @@ deployment:
 
 |ovms_diagram|
 
+.. _top:
+
 **Table of contents**:
 
 - `Serving with OpenVINO Model Server <#serving-with-openvino-model-server1>`__
diff --git a/docs/notebooks/118-optimize-preprocessing-with-output.rst b/docs/notebooks/118-optimize-preprocessing-with-output.rst
index c76a8986137..e9f19e107c9 100644
--- a/docs/notebooks/118-optimize-preprocessing-with-output.rst
+++ b/docs/notebooks/118-optimize-preprocessing-with-output.rst
@@ -1,7 +1,7 @@
 Optimize Preprocessing
 ======================
 
-.. _top:
+
 
 When input data does not fit the model input tensor perfectly,
 additional operations/steps are needed to transform the data to the
 
@@ -27,6 +27,8 @@ This tutorial include following steps:
 
 - Comparing results on one picture.
 - Comparing performance.
 
+.. _top:
+
 **Table of contents**:
 
 - `Settings <#settings>`__
diff --git a/docs/notebooks/119-tflite-to-openvino-with-output.rst b/docs/notebooks/119-tflite-to-openvino-with-output.rst
index aa0bc8713a3..6bf4b8924cc 100644
--- a/docs/notebooks/119-tflite-to-openvino-with-output.rst
+++ b/docs/notebooks/119-tflite-to-openvino-with-output.rst
@@ -1,7 +1,7 @@
 Convert a Tensorflow Lite Model to OpenVINO™
 ============================================
 
-.. _top:
+
 
 `TensorFlow Lite `__, often referred to as TFLite, is an open
 source library developed for deploying
 
@@ -17,6 +17,8 @@ After creating the OpenVINO IR, load the model in `OpenVINO
 Runtime `__ and do inference with a sample image.
 
+.. _top:
+
 **Table of contents**:
 
 - `Preparation <#preparation>`__
diff --git a/docs/notebooks/120-tensorflow-object-detection-to-openvino-with-output.rst b/docs/notebooks/120-tensorflow-object-detection-to-openvino-with-output.rst
index 39fcef5cec8..9e2ee531349 100644
--- a/docs/notebooks/120-tensorflow-object-detection-to-openvino-with-output.rst
+++ b/docs/notebooks/120-tensorflow-object-detection-to-openvino-with-output.rst
@@ -1,7 +1,7 @@
 Convert a TensorFlow Object Detection Model to OpenVINO™
 ========================================================
 
-.. _top:
+
 
 `TensorFlow `__, or TF for short, is an open-source
 framework for machine learning.
 
@@ -26,6 +26,8 @@ After creating the OpenVINO IR, load the model in `OpenVINO
 Runtime `__ and do inference with a sample image.
 
+.. _top:
+
 **Table of contents**:
 
 - `Prerequisites <#prerequisites>`__
diff --git a/docs/notebooks/121-convert-to-openvino-with-output.rst b/docs/notebooks/121-convert-to-openvino-with-output.rst
index 5da2d317e3a..cf93b94ac74 100644
--- a/docs/notebooks/121-convert-to-openvino-with-output.rst
+++ b/docs/notebooks/121-convert-to-openvino-with-output.rst
@@ -4,6 +4,8 @@ OpenVINO™ model conversion API
 This notebook shows how to convert a model from original framework
 format to OpenVINO Intermediate Representation (IR).
 
+.. _top:
+
 **Table of contents**:
 
 - `OpenVINO IR format <#openvino-ir-format>`__
diff --git a/docs/notebooks/122-speech-recognition-quantization-wav2vec2-with-output.rst b/docs/notebooks/122-speech-recognition-quantization-wav2vec2-with-output.rst
new file mode 100644
index 00000000000..4db1ac32fe9
--- /dev/null
+++ b/docs/notebooks/122-speech-recognition-quantization-wav2vec2-with-output.rst
@@ -0,0 +1,309 @@
+Quantize Speech Recognition Models with accuracy control using NNCF PTQ API
+===========================================================================
+
+
+
+This tutorial demonstrates how to apply ``INT8`` quantization with
+accuracy control to the speech recognition model, known as
+`Wav2Vec2 `__,
+using the NNCF (Neural Network Compression Framework) 8-bit quantization
+with accuracy control in post-training mode (without the fine-tuning
+pipeline). This notebook uses a fine-tuned
+`Wav2Vec2-Base-960h `__
+`PyTorch `__ model trained on the `LibriSpeech ASR
+corpus `__. The tutorial is designed to be
+extendable to custom models and datasets. It consists of the following
+steps:
+
+- Download and prepare the Wav2Vec2 model and LibriSpeech dataset.
+- Define data loading and accuracy validation functionality.
+- Model quantization with accuracy control.
+- Compare accuracy of the original PyTorch model, OpenVINO FP16 and
+  INT8 models.
+- Compare performance of the original and quantized models.
+
+The advanced quantization flow allows applying 8-bit quantization to the
+model while controlling the accuracy metric. This is achieved by keeping
+the most impactful operations within the model in the original
+precision. The flow is based on the `Basic 8-bit
+quantization `__
+and has the following differences:
+
+- Besides the calibration dataset, a validation dataset is required to
+  compute the accuracy metric. Both datasets can refer to the same data
+  in the simplest case.
+- A validation function, used to compute the accuracy metric, is
+  required. It can be a function that is already available in the
+  source framework or a custom function.
+- Since accuracy validation is run several times during the
+  quantization process, quantization with accuracy control can take
+  more time than the Basic 8-bit quantization flow.
+- The resulting model can provide a smaller performance improvement
+  than the Basic 8-bit quantization flow because some of the operations
+  are kept in the original precision.
+
+.. note::
+
+   Currently, 8-bit quantization with accuracy control in NNCF
+   is available only for models in OpenVINO representation.
+
+The steps for the quantization with accuracy control are described
+below.
+
+
+
+.. _top:
+
+**Table of contents**:
+
+- `Imports <#imports>`__
+- `Prepare the Model <#prepare-the-model>`__
+- `Prepare LibriSpeech Dataset <#prepare-librispeech-dataset>`__
+- `Prepare calibration and validation datasets <#prepare-calibration-and-validation-datasets>`__
+- `Prepare validation function <#prepare-validation-function>`__
+- `Run quantization with accuracy control <#run-quantization-with-accuracy-control>`__
+- `Model Usage Example <#model-usage-example>`__
+- `Compare Accuracy of the Original and Quantized Models <#compare-accuracy-of-the-original-and-quantized-models>`__
+
+
+.. code:: ipython2
+
+    # !pip install -q "openvino-dev>=2023.1.0" "nncf>=2.6.0"
+    !pip install -q "openvino==2023.1.0.dev20230811"
+    !pip install git+https://github.com/openvinotoolkit/nncf.git@develop
+    !pip install -q soundfile librosa transformers torch datasets torchmetrics
+
+Imports `⇑ <#top>`__
+###############################################################################################################################
+
+.. code:: ipython2
+
+    import numpy as np
+    import torch
+
+    from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor
+
+Prepare the Model `⇑ <#top>`__
+###############################################################################################################################
+
+To instantiate the PyTorch model class, use the
+``Wav2Vec2ForCTC.from_pretrained`` method, providing the model ID to
+download from the Hugging Face hub. Model weights and configuration
+files are downloaded automatically on first use. Keep in mind that
+downloading the files can take several minutes and depends on your
+internet connection.
+
+Additionally, we create a processor class, which is responsible for
+model-specific pre- and post-processing steps.
+
+.. code:: ipython2
+
+    BATCH_SIZE = 1
+    MAX_SEQ_LENGTH = 30480
+
+
+    torch_model = Wav2Vec2ForCTC.from_pretrained("facebook/wav2vec2-base-960h", ctc_loss_reduction="mean")
+    processor = Wav2Vec2Processor.from_pretrained("facebook/wav2vec2-base-960h")
+
+Convert it to the OpenVINO Intermediate Representation (OpenVINO IR).
+
+.. code:: ipython2
+
+    import openvino
+
+
+    default_input = torch.zeros([1, MAX_SEQ_LENGTH], dtype=torch.float)
+    ov_model = openvino.convert_model(torch_model, example_input=default_input)
+
+Prepare LibriSpeech Dataset `⇑ <#top>`__
+###############################################################################################################################
+
+For demonstration purposes, we will use a short dummy version of the
+LibriSpeech dataset, ``patrickvonplaten/librispeech_asr_dummy``, to
+speed up model evaluation. Model accuracy can differ from that reported
+in the paper. To reproduce the original accuracy, use the
+``librispeech_asr`` dataset.
+
+.. code:: ipython2
+
+    from datasets import load_dataset
+
+
+    dataset = load_dataset("patrickvonplaten/librispeech_asr_dummy", "clean", split="validation")
+    test_sample = dataset[0]["audio"]
+
+
+    # define preprocessing function for converting audio to input values for model
+    def map_to_input(batch):
+        preprocessed_signal = processor(batch["audio"]["array"], return_tensors="pt", padding="longest", sampling_rate=batch['audio']['sampling_rate'])
+        input_values = preprocessed_signal.input_values
+        batch['input_values'] = input_values
+        return batch
+
+
+    # apply preprocessing function to dataset and remove audio column, to save memory as we do not need it anymore
+    dataset = dataset.map(map_to_input, batched=False, remove_columns=["audio"])
+
+Prepare calibration and validation datasets `⇑ <#top>`__
+###############################################################################################################################
+
+.. code:: ipython2
+
+    import nncf
+
+
+    def transform_fn(data_item):
+        """
+        Extract the model's input from the data item.
+        The data item here is the data item that is returned from the data source per iteration.
+        This function should be passed when the data item cannot be used as model's input.
+        """
+        return np.array(data_item["input_values"])
+
+
+    calibration_dataset = nncf.Dataset(dataset, transform_fn)
+
+Prepare validation function `⇑ <#top>`__
+###############################################################################################################################
+
+Define the validation function.
+
+.. code:: ipython2
+
+    from torchmetrics import WordErrorRate
+    from tqdm.notebook import tqdm
+
+
+    def validation_fn(model, dataset):
+        """
+        Calculate and return a metric for the model.
+        """
+        wer = WordErrorRate()
+        for sample in tqdm(dataset):
+            # run infer function on sample
+            output = model.output(0)
+            logits = model(np.array(sample['input_values']))[output]
+            predicted_ids = np.argmax(logits, axis=-1)
+            transcription = processor.batch_decode(torch.from_numpy(predicted_ids))
+
+            # update metric on sample result
+            wer.update(transcription, [sample['text']])
+
+        result = wer.compute()
+
+        return 1 - result
+
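+Note that ``validation_fn`` returns ``1 - WER``, so the metric grows as
+transcription quality improves, which is the orientation that
+``nncf.quantize_with_accuracy_control`` expects. As a quick sanity
+check before the much longer quantization run, you can score the
+non-quantized model once (a sketch, not part of the original notebook;
+the device name is an assumption):
+
+.. code:: ipython2
+
+    # Hypothetical smoke test: compile the non-quantized model and score it once
+    _compiled_fp = openvino.Core().compile_model(ov_model, "CPU")
+    print(f"1 - WER of the original model: {validation_fn(_compiled_fp, dataset):.4f}")
+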
+Run quantization with accuracy control `⇑ <#top>`__
+###############################################################################################################################
+
+You should provide the calibration dataset and the validation dataset.
+They can be the same dataset.
+
+- Parameter ``max_drop`` defines the accuracy drop threshold. The
+  quantization process stops when the degradation of the accuracy
+  metric on the validation dataset is less than the ``max_drop``. The
+  default value is 0.01. NNCF will stop the quantization and report an
+  error if the ``max_drop`` value can't be reached.
+- ``drop_type`` defines how the accuracy drop is calculated: ABSOLUTE
+  (used by default) or RELATIVE.
+- ``ranking_subset_size`` is the size of a subset that is used to rank
+  layers by their contribution to the accuracy drop. The default value
+  is 300; the more samples it has, the better the ranking, potentially.
+  Here we use the value 25 to speed up the execution.
+
+.. note::
+
+   Execution can take tens of minutes and requires up to 10 GB
+   of free memory.
+
+
+.. code:: ipython2
+
+    from nncf.quantization.advanced_parameters import AdvancedAccuracyRestorerParameters
+    from nncf.parameters import ModelType
+
+    quantized_model = nncf.quantize_with_accuracy_control(
+        ov_model,
+        calibration_dataset=calibration_dataset,
+        validation_dataset=calibration_dataset,
+        validation_fn=validation_fn,
+        max_drop=0.01,
+        drop_type=nncf.DropType.ABSOLUTE,
+        model_type=ModelType.TRANSFORMER,
+        advanced_accuracy_restorer_parameters=AdvancedAccuracyRestorerParameters(
+            ranking_subset_size=25
+        ),
+    )
+
+Model Usage Example `⇑ <#top>`__
+###############################################################################################################################
+
+.. code:: ipython2
+
+    import IPython.display as ipd
+
+
+    ipd.Audio(test_sample["array"], rate=16000)
+
+.. code:: ipython2
+
+    core = openvino.Core()
+
+    compiled_quantized_model = core.compile_model(model=quantized_model, device_name='CPU')
+
+    input_data = np.expand_dims(test_sample["array"], axis=0)
+
+Next, make a prediction.
+
+.. code:: ipython2
+
+    predictions = compiled_quantized_model([input_data])[0]
+    predicted_ids = np.argmax(predictions, axis=-1)
+    transcription = processor.batch_decode(torch.from_numpy(predicted_ids))
+    transcription
+
+Compare Accuracy of the Original and Quantized Models `⇑ <#top>`__
+###############################################################################################################################
+
+- Define a dataloader for the test dataset.
+- Define functions to get inference for PyTorch and OpenVINO models.
+- Define functions to compute Word Error Rate.
+
+.. code:: ipython2
+
+    # inference function for pytorch
+    def torch_infer(model, sample):
+        logits = model(torch.Tensor(sample['input_values'])).logits
+        # take argmax and decode
+        predicted_ids = torch.argmax(logits, dim=-1)
+        transcription = processor.batch_decode(predicted_ids)
+        return transcription
+
+
+    # inference function for openvino
+    def ov_infer(model, sample):
+        output = model.output(0)
+        logits = model(np.array(sample['input_values']))[output]
+        predicted_ids = np.argmax(logits, axis=-1)
+        transcription = processor.batch_decode(torch.from_numpy(predicted_ids))
+        return transcription
+
+
+    def compute_wer(dataset, model, infer_fn):
+        wer = WordErrorRate()
+        for sample in tqdm(dataset):
+            # run infer function on sample
+            transcription = infer_fn(model, sample)
+            # update metric on sample result
+            wer.update(transcription, [sample['text']])
+        # finalize metric calculation
+        result = wer.compute()
+        return result
+
+Now, compute WER for the original PyTorch model and the quantized model.
+
+.. code:: ipython2
+
+    pt_result = compute_wer(dataset, torch_model, torch_infer)
+    quantized_result = compute_wer(dataset, compiled_quantized_model, ov_infer)
+
+    print(f'[PyTorch] Word Error Rate: {pt_result:.4f}')
+    print(f'[Quantized OpenVINO] Word Error Rate: {quantized_result:.4f}')
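+
+The introduction also lists a performance comparison of the original
+and quantized models. A minimal sketch with ``benchmark_app`` (not part
+of the original notebook; the file names and the fixed input shape are
+assumptions) could look like this:
+
+.. code:: ipython2
+
+    # Save both models so the command-line benchmark tool can load them
+    openvino.save_model(ov_model, "wav2vec2.xml")
+    openvino.save_model(quantized_model, "wav2vec2_int8.xml")
+
+    # Benchmark the original and the quantized model on CPU
+    ! benchmark_app -m wav2vec2.xml -shape "[1,30480]" -d CPU -api async -t 15
+    ! benchmark_app -m wav2vec2_int8.xml -shape "[1,30480]" -d CPU -api async -t 15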
diff --git a/docs/notebooks/122-yolov8-quantization-with-accuracy-control-with-output.rst b/docs/notebooks/122-yolov8-quantization-with-accuracy-control-with-output.rst
new file mode 100644
index 00000000000..7bba4ef46f0
--- /dev/null
+++ b/docs/notebooks/122-yolov8-quantization-with-accuracy-control-with-output.rst
@@ -0,0 +1,306 @@
+Convert and Optimize YOLOv8 with OpenVINO™
+==========================================
+
+
+
+The YOLOv8 algorithm developed by Ultralytics is a cutting-edge,
+state-of-the-art (SOTA) model that is designed to be fast, accurate,
+and easy to use, making it an excellent choice for a wide range of
+object detection, image segmentation, and image classification tasks.
+More details about its realization can be found in the original model
+`repository `__.
+
+This tutorial demonstrates step-by-step instructions on how to apply
+quantization with accuracy control to a PyTorch YOLOv8 model. The
+advanced quantization flow allows applying 8-bit quantization to the
+model while controlling the accuracy metric. This is achieved by
+keeping the most impactful operations within the model in the original
+precision. The flow is based on the `Basic 8-bit
+quantization `__
+and has the following differences:
+
+- Besides the calibration dataset, a validation dataset is required to
+  compute the accuracy metric. Both datasets can refer to the same data
+  in the simplest case.
+- A validation function, used to compute the accuracy metric, is
+  required. It can be a function that is already available in the
+  source framework or a custom function.
+- Since accuracy validation is run several times during the
+  quantization process, quantization with accuracy control can take
+  more time than the Basic 8-bit quantization flow.
+- The resulting model can provide a smaller performance improvement
+  than the Basic 8-bit quantization flow because some of the operations
+  are kept in the original precision.
+
+.. note::
+
+   Currently, 8-bit quantization with accuracy control in NNCF
+   is available only for models in OpenVINO representation.
+
+The steps for the quantization with accuracy control are described
+below.
+
+The tutorial consists of the following steps:
+
+.. _top:
+
+- `Prerequisites <#prerequisites>`__
+- `Get PyTorch model and OpenVINO IR model <#get-pytorch-model-and-openvino-ir-model>`__
+- `Define validator and data loader <#define-validator-and-data-loader>`__
+- `Prepare calibration and validation datasets <#prepare-calibration-and-validation-datasets>`__
+- `Prepare validation function <#prepare-validation-function>`__
+- `Run quantization with accuracy control <#run-quantization-with-accuracy-control>`__
+- `Compare Accuracy and Performance of the Original and Quantized Models <#compare-accuracy-and-performance-of-the-original-and-quantized-models>`__
+
+Prerequisites `⇑ <#top>`__
+###############################################################################################################################
+
+
+Install necessary packages.
+
+.. code:: ipython2
+
+    !pip install -q "openvino==2023.1.0.dev20230811"
+    !pip install git+https://github.com/openvinotoolkit/nncf.git@develop
+    !pip install -q "ultralytics==8.0.43"
+
+Get PyTorch model and OpenVINO IR model `⇑ <#top>`__
+###############################################################################################################################
+
+Generally, a PyTorch model represents an instance of the
+`torch.nn.Module `__
+class, initialized by a state dictionary with model weights. We will
+use the YOLOv8 nano model (also known as ``yolov8n``) pre-trained on a
+COCO dataset, which is available in this
+`repo `__. Similar steps are
+also applicable to other YOLOv8 models. Typical steps to obtain a
+pre-trained model:
+
+1. Create an instance of a model class.
+2. Load a checkpoint state dict, which contains the pre-trained model
+   weights.
+
+In this case, the creators of the model provide an API that enables
+converting the YOLOv8 model to ONNX and then to OpenVINO IR. Therefore,
+we do not need to do these steps manually.
+
+.. code:: ipython2
+
+    import os
+    from pathlib import Path
+
+    from ultralytics import YOLO
+    from ultralytics.yolo.cfg import get_cfg
+    from ultralytics.yolo.data.utils import check_det_dataset
+    from ultralytics.yolo.engine.validator import BaseValidator as Validator
+    from ultralytics.yolo.utils import DATASETS_DIR
+    from ultralytics.yolo.utils import DEFAULT_CFG
+    from ultralytics.yolo.utils import ops
+    from ultralytics.yolo.utils.metrics import ConfusionMatrix
+
+    ROOT = os.path.abspath('')
+
+    MODEL_NAME = "yolov8n-seg"
+
+    model = YOLO(f"{ROOT}/{MODEL_NAME}.pt")
+    args = get_cfg(cfg=DEFAULT_CFG)
+    args.data = "coco128-seg.yaml"
+
+Load the model.
+
+.. code:: ipython2
+
+    import openvino
+
+
+    model_path = Path(f"{ROOT}/{MODEL_NAME}_openvino_model/{MODEL_NAME}.xml")
+    if not model_path.exists():
+        model.export(format="openvino", dynamic=True, half=False)
+
+    ov_model = openvino.Core().read_model(model_path)
+
+Define validator and data loader `⇑ <#top>`__
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+
+The original model repository uses a ``Validator`` wrapper, which
+represents the accuracy validation pipeline. It creates a dataloader
+and evaluation metrics and updates the metrics on each data batch
+produced by the dataloader. Besides that, it is responsible for data
+preprocessing and results postprocessing. For class initialization, the
+configuration should be provided. We will use the default setup, but it
+can be replaced with overridden parameters to test on custom data. The
+model provides the ``ValidatorClass`` attribute, which creates a
+validator class instance.
+
+.. code:: ipython2
+
+    validator = model.ValidatorClass(args)
+    validator.data = check_det_dataset(args.data)
+    data_loader = validator.get_dataloader(f"{DATASETS_DIR}/coco128-seg", 1)
+
+    validator.is_coco = True
+    validator.class_map = ops.coco80_to_coco91_class()
+    validator.names = model.model.names
+    validator.metrics.names = validator.names
+    validator.nc = model.model.model[-1].nc
+    validator.nm = 32
+    validator.process = ops.process_mask
+    validator.plot_masks = []
+
+Prepare calibration and validation datasets `⇑ <#top>`__
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+
+We can use one dataset for both calibration and validation. Name it
+``quantization_dataset``.
+
+.. code:: ipython2
+
+    from typing import Dict
+
+    import nncf
+
+
+    def transform_fn(data_item: Dict):
+        input_tensor = validator.preprocess(data_item)["img"].numpy()
+        return input_tensor
+
+
+    quantization_dataset = nncf.Dataset(data_loader, transform_fn)
+
+Prepare validation function `⇑ <#top>`__
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+
+.. code:: ipython2
+
+    from functools import partial
+
+    import torch
+    from nncf.quantization.advanced_parameters import AdvancedAccuracyRestorerParameters
+
+
+    def validation_ac(
+        compiled_model: openvino.CompiledModel,
+        validation_loader: torch.utils.data.DataLoader,
+        validator: Validator,
+        num_samples: int = None,
+    ) -> float:
+        validator.seen = 0
+        validator.jdict = []
+        validator.stats = []
+        validator.batch_i = 1
+        validator.confusion_matrix = ConfusionMatrix(nc=validator.nc)
+        num_outputs = len(compiled_model.outputs)
+
+        counter = 0
+        for batch_i, batch in enumerate(validation_loader):
+            if num_samples is not None and batch_i == num_samples:
+                break
+            batch = validator.preprocess(batch)
+            results = compiled_model(batch["img"])
+            if num_outputs == 1:
+                preds = torch.from_numpy(results[compiled_model.output(0)])
+            else:
+                preds = [
+                    torch.from_numpy(results[compiled_model.output(0)]),
+                    torch.from_numpy(results[compiled_model.output(1)]),
+                ]
+            preds = validator.postprocess(preds)
+            validator.update_metrics(preds, batch)
+            counter += 1
+        stats = validator.get_stats()
+        if num_outputs == 1:
+            stats_metrics = stats["metrics/mAP50-95(B)"]
+        else:
+            stats_metrics = stats["metrics/mAP50-95(M)"]
+        print(f"Validate: dataset length = {counter}, metric value = {stats_metrics:.3f}")
+
+        return stats_metrics
+
+
+    validation_fn = partial(validation_ac, validator=validator)
+
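+Because accuracy validation runs many times during quantization, a
+cheap smoke test is worthwhile first. The ``num_samples`` argument
+limits evaluation to a few batches; a quick check on the non-quantized
+model (a sketch, not part of the original notebook; the device name is
+an assumption) might look like this:
+
+.. code:: ipython2
+
+    # Hypothetical smoke test: score the non-quantized model on 5 batches only
+    smoke_model = openvino.Core().compile_model(ov_model, "CPU")
+    validation_ac(smoke_model, data_loader, validator, num_samples=5)
+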
+Run quantization with accuracy control `⇑ <#top>`__
+###############################################################################################################################
+
+You should provide the calibration dataset and the validation dataset.
+They can be the same dataset.
+
+- Parameter ``max_drop`` defines the accuracy drop threshold. The
+  quantization process stops when the degradation of the accuracy
+  metric on the validation dataset is less than the ``max_drop``. The
+  default value is 0.01. NNCF will stop the quantization and report an
+  error if the ``max_drop`` value can't be reached.
+- ``drop_type`` defines how the accuracy drop is calculated: ABSOLUTE
+  (used by default) or RELATIVE.
+- ``ranking_subset_size`` is the size of a subset that is used to rank
+  layers by their contribution to the accuracy drop. The default value
+  is 300; the more samples it has, the better the ranking, potentially.
+  Here we use the value 25 to speed up the execution.
+
+.. note::
+
+   Execution can take tens of minutes and requires up to 15 GB
+   of free memory.
+
+.. code:: ipython2
+
+    quantized_model = nncf.quantize_with_accuracy_control(
+        ov_model,
+        quantization_dataset,
+        quantization_dataset,
+        validation_fn=validation_fn,
+        max_drop=0.01,
+        preset=nncf.QuantizationPreset.MIXED,
+        advanced_accuracy_restorer_parameters=AdvancedAccuracyRestorerParameters(
+            ranking_subset_size=25,
+            num_ranking_processes=1
+        ),
+    )
+
+Compare Accuracy and Performance of the Original and Quantized Models `⇑ <#top>`__
+###############################################################################################################################
+
+
+Now we can compare metrics of the original, non-quantized OpenVINO IR
+model and the quantized OpenVINO IR model to make sure that the
+``max_drop`` is not exceeded.
+
+.. code:: ipython2
+
+    import openvino
+
+    core = openvino.Core()
+    quantized_compiled_model = core.compile_model(model=quantized_model, device_name='CPU')
+    compiled_ov_model = core.compile_model(model=ov_model, device_name='CPU')
+
+    pt_result = validation_ac(compiled_ov_model, data_loader, validator)
+    quantized_result = validation_ac(quantized_compiled_model, data_loader, validator)
+
+
+    print(f'[Original OpenVINO]: {pt_result:.4f}')
+    print(f'[Quantized OpenVINO]: {quantized_result:.4f}')
+
+And compare performance.
+
+.. code:: ipython2
+
+    from pathlib import Path
+    # Set model directory
+    MODEL_DIR = Path("model")
+    MODEL_DIR.mkdir(exist_ok=True)
+
+    ir_model_path = MODEL_DIR / 'ir_model.xml'
+    quantized_model_path = MODEL_DIR / 'quantized_model.xml'
+
+    # Save models to use them in the command-line benchmark app
+    openvino.save_model(ov_model, ir_model_path, compress_to_fp16=False)
+    openvino.save_model(quantized_model, quantized_model_path, compress_to_fp16=False)
+
+.. code:: ipython2
+
+    # Inference Original model (OpenVINO IR)
+    ! benchmark_app -m $ir_model_path -shape "[1,3,640,640]" -d CPU -api async
+
+.. code:: ipython2
+
+    # Inference Quantized model (OpenVINO IR)
+    ! benchmark_app -m $quantized_model_path -shape "[1,3,640,640]" -d CPU -api async
diff --git a/docs/notebooks/201-vision-monodepth-with-output.rst b/docs/notebooks/201-vision-monodepth-with-output.rst
index 06ec0e5cd77..e98e4c37d8f 100644
--- a/docs/notebooks/201-vision-monodepth-with-output.rst
+++ b/docs/notebooks/201-vision-monodepth-with-output.rst
@@ -1,7 +1,7 @@
 Monodepth Estimation with OpenVINO
 ==================================
 
-.. _top:
+
 
 This tutorial demonstrates Monocular Depth Estimation with MidasNet in
 OpenVINO. Model information can be found
 
@@ -30,6 +30,8 @@ Transfer," `__ in IEEE Transactions on Pattern Analysis
 and Machine Intelligence, doi: ``10.1109/TPAMI.2020.3019967``.
 
+.. _top:
+
 **Table of contents**:
 
 - `Preparation <#preparation>`__
diff --git a/docs/notebooks/202-vision-superresolution-image-with-output.rst b/docs/notebooks/202-vision-superresolution-image-with-output.rst
index 18ea80db89d..2a9c26e5342 100644
--- a/docs/notebooks/202-vision-superresolution-image-with-output.rst
+++ b/docs/notebooks/202-vision-superresolution-image-with-output.rst
@@ -1,7 +1,7 @@
 Single Image Super Resolution with OpenVINO™
 ============================================
 
-.. _top:
+
 
 Super Resolution is the process of enhancing the quality of an image by
 increasing the pixel count using deep learning. This notebook shows the
 
@@ -16,6 +16,8 @@ Resolution," `__ 2018 24th International
 Conference on Pattern Recognition (ICPR), 2018, pp. 2777-2784, doi:
 10.1109/ICPR.2018.8545760.
 
+.. _top:
+
 **Table of contents**:
 
 - `Preparation <#preparation>`__
diff --git a/docs/notebooks/202-vision-superresolution-video-with-output.rst b/docs/notebooks/202-vision-superresolution-video-with-output.rst
index 840d31c84ee..7b48a8c64ed 100644
--- a/docs/notebooks/202-vision-superresolution-video-with-output.rst
+++ b/docs/notebooks/202-vision-superresolution-video-with-output.rst
@@ -1,7 +1,7 @@
 Video Super Resolution with OpenVINO™
 =====================================
 
-.. _top:
+
 
 Super Resolution is the process of enhancing the quality of an image by
 increasing the pixel count using deep learning. This notebook applies
 
@@ -23,6 +23,8 @@ pp. 2777-2784, doi: 10.1109/ICPR.2018.8545760.
 
    video.
 
+.. _top:
+
 **Table of contents**:
 
 - `Preparation <#preparation>`__
diff --git a/docs/notebooks/203-meter-reader-with-output.rst b/docs/notebooks/203-meter-reader-with-output.rst
index e45a6d9973c..eeec4746977 100644
--- a/docs/notebooks/203-meter-reader-with-output.rst
+++ b/docs/notebooks/203-meter-reader-with-output.rst
@@ -1,7 +1,7 @@
 Industrial Meter Reader
 =======================
 
-.. _top:
+
 
 This notebook shows how to create a industrial meter reader with
 OpenVINO Runtime. We use the pre-trained
 
@@ -21,6 +21,8 @@ to build up a multiple inference task pipeline:
 
    workflow
 
+.. _top:
+
 **Table of contents**:
 
 - `Import <#import>`__
diff --git a/docs/notebooks/204-segmenter-semantic-segmentation-with-output.rst b/docs/notebooks/204-segmenter-semantic-segmentation-with-output.rst
index c516000c84c..29f412a4194 100644
--- a/docs/notebooks/204-segmenter-semantic-segmentation-with-output.rst
+++ b/docs/notebooks/204-segmenter-semantic-segmentation-with-output.rst
@@ -1,7 +1,7 @@
 Semantic Segmentation with OpenVINO™ using Segmenter
 ====================================================
 
-.. _top:
+
 
 Semantic segmentation is a difficult computer vision problem with many
 applications such as autonomous driving, robotics, augmented reality,
 
@@ -28,6 +28,8 @@ paper: `Segmenter: Transformer for Semantic
 Segmentation `__ or in the
 `repository `__.
 
+.. _top:
+
 **Table of contents**:
 
 - `Get and prepare PyTorch model <#get-and-prepare-pytorch-model>`__
diff --git a/docs/notebooks/205-vision-background-removal-with-output.rst b/docs/notebooks/205-vision-background-removal-with-output.rst
index cd53815c483..1c4ae2d1696 100644
--- a/docs/notebooks/205-vision-background-removal-with-output.rst
+++ b/docs/notebooks/205-vision-background-removal-with-output.rst
@@ -1,7 +1,7 @@
 Image Background Removal with U^2-Net and OpenVINO™
 ===================================================
 
-.. _top:
+
 
 This notebook demonstrates background removal in images using
 U\ :math:`^2`-Net and OpenVINO.
 
@@ -17,6 +17,8 @@ The model source is available
 `here `__.
 
+.. _top:
+
 **Table of contents**:
 
 - `Preparation <#preparation>`__
diff --git a/docs/notebooks/206-vision-paddlegan-anime-with-output.rst b/docs/notebooks/206-vision-paddlegan-anime-with-output.rst
index 7974ce25de1..32cafa0c20c 100644
--- a/docs/notebooks/206-vision-paddlegan-anime-with-output.rst
+++ b/docs/notebooks/206-vision-paddlegan-anime-with-output.rst
@@ -1,7 +1,7 @@
 Photos to Anime with PaddleGAN and OpenVINO
 ===========================================
 
-.. _top:
+
 
 This tutorial demonstrates converting a
 `PaddlePaddle/PaddleGAN `__
 
@@ -16,6 +16,8 @@ documentation `__
 
+.. _top:
+
 **Table of contents**:
 
 - `Preparation <#preparation>`__
diff --git a/docs/notebooks/207-vision-paddlegan-superresolution-with-output.rst b/docs/notebooks/207-vision-paddlegan-superresolution-with-output.rst
index 5967a0bf7b1..b19bfc982c6 100644
--- a/docs/notebooks/207-vision-paddlegan-superresolution-with-output.rst
+++ b/docs/notebooks/207-vision-paddlegan-superresolution-with-output.rst
@@ -1,7 +1,7 @@
 Super Resolution with PaddleGAN and OpenVINO™
 =============================================
 
-.. _top:
+
 
 This notebook demonstrates converting the RealSR (real-world
 super-resolution) model from
 
@@ -18,6 +18,8 @@ from CVPR 2020.
 
 This notebook works best with small images (up to 800x600 resolution).
 
+.. _top:
+
 **Table of contents**:
 
 - `Imports <#imports>`__
diff --git a/docs/notebooks/208-optical-character-recognition-with-output.rst b/docs/notebooks/208-optical-character-recognition-with-output.rst
index 0815ae2d3cd..871f7110dd1 100644
--- a/docs/notebooks/208-optical-character-recognition-with-output.rst
+++ b/docs/notebooks/208-optical-character-recognition-with-output.rst
@@ -1,7 +1,7 @@
 Optical Character Recognition (OCR) with OpenVINO™
 ==================================================
 
-.. _top:
+
 
 This tutorial demonstrates how to perform optical character recognition
 (OCR) with OpenVINO models. It is a continuation of the
 
@@ -21,6 +21,8 @@ Zoo `__. For more
 information, refer to the
 `104-model-tools <104-model-tools-with-output.html>`__ tutorial.
 
+.. _top:
+
 **Table of contents**:
 
 - `Imports <#imports>`__
diff --git a/docs/notebooks/209-handwritten-ocr-with-output.rst b/docs/notebooks/209-handwritten-ocr-with-output.rst
index e0f5913988f..8aa26383d21 100644
--- a/docs/notebooks/209-handwritten-ocr-with-output.rst
+++ b/docs/notebooks/209-handwritten-ocr-with-output.rst
@@ -1,7 +1,7 @@
 Handwritten Chinese and Japanese OCR with OpenVINO™
 ===================================================
 
-.. _top:
+
 
 In this tutorial, we perform optical character recognition (OCR) for
 handwritten Chinese (simplified) and Japanese. An OCR tutorial using the
 
@@ -19,6 +19,8 @@ and `scut_ept `__ charlists are used. Both models are
 available on `Open Model
 Zoo `__.
 
+.. _top:
+
 **Table of contents**:
 
 - `Imports <#imports>`__
diff --git a/docs/notebooks/210-slowfast-video-recognition-with-output.rst b/docs/notebooks/210-slowfast-video-recognition-with-output.rst
index e795d99a6ef..c2bcfa25c5d 100644
--- a/docs/notebooks/210-slowfast-video-recognition-with-output.rst
+++ b/docs/notebooks/210-slowfast-video-recognition-with-output.rst
@@ -1,7 +1,7 @@
 Video Recognition using SlowFast and OpenVINO™
 ==============================================
 
-.. _top:
+
 
 Teaching machines to detect, understand and analyze the contents of
 images has been one of the more well-known and well-studied problems in
 
@@ -40,6 +40,8 @@ This tutorial consists of the following steps
 
 .. |image0| image:: https://user-images.githubusercontent.com/34324155/143044111-94676f64-7ba8-4081-9011-f8054bed7030.png
 
+.. _top:
+
 **Table of contents**:
 
 - `Prepare PyTorch Model <#prepare-pytorch-model>`__
diff --git a/docs/notebooks/211-speech-to-text-with-output.rst b/docs/notebooks/211-speech-to-text-with-output.rst
index 080d8b092c9..95d919eb6d6 100644
--- a/docs/notebooks/211-speech-to-text-with-output.rst
+++ b/docs/notebooks/211-speech-to-text-with-output.rst
@@ -1,7 +1,7 @@
 Speech to Text with OpenVINO™
 =============================
 
-.. _top:
+
 
 This tutorial demonstrates speech-to-text recognition with OpenVINO.
 
@@ -13,6 +13,8 @@ with Connectionist Temporal Classification (CTC) loss. The model is
 available from `Open Model
 Zoo `__.
 
+.. _top:
+
 **Table of contents**:
 
 - `Imports <#imports>`__
diff --git a/docs/notebooks/212-pyannote-speaker-diarization-with-output.rst b/docs/notebooks/212-pyannote-speaker-diarization-with-output.rst
index 8fabfbf8b90..2e8af021276 100644
--- a/docs/notebooks/212-pyannote-speaker-diarization-with-output.rst
+++ b/docs/notebooks/212-pyannote-speaker-diarization-with-output.rst
@@ -1,7 +1,7 @@
 Speaker diarization
 ===================
 
-.. _top:
+
 
 Speaker diarization is the process of partitioning an audio stream
 containing human speech into homogeneous segments according to the
 
@@ -39,6 +39,8 @@ card `__,
 `repo `__
 and `paper `__.
 
+.. _top:
+
 **Table of contents**:
 
 - `Prerequisites <#prerequisites>`__
diff --git a/docs/notebooks/213-question-answering-with-output.rst b/docs/notebooks/213-question-answering-with-output.rst
index e3fc0ee6c8d..9b1be824b7a 100644
--- a/docs/notebooks/213-question-answering-with-output.rst
+++ b/docs/notebooks/213-question-answering-with-output.rst
@@ -1,7 +1,7 @@
 Interactive question answering with OpenVINO™
 =============================================
 
-.. _top:
+
 
 This demo shows interactive question answering with OpenVINO, using
 `small BERT-large-like
 
@@ -11,6 +11,8 @@ larger BERT-large model. The model comes from `Open Model
 Zoo `__. Final part of this notebook provides live
 inference results from your inputs.
 
+.. _top:
+
 **Table of contents**:
 
 - `Imports <#imports>`__
diff --git a/docs/notebooks/214-grammar-correction-with-output.rst b/docs/notebooks/214-grammar-correction-with-output.rst
index eaff3b6e620..434aabbacd3 100644
--- a/docs/notebooks/214-grammar-correction-with-output.rst
+++ b/docs/notebooks/214-grammar-correction-with-output.rst
@@ -1,7 +1,7 @@
 Grammatical Error Correction with OpenVINO
 ==========================================
 
-.. _top:
+
 
 AI-based auto-correction products are becoming increasingly popular due
 to their ease of use, editing speed, and affordability. These products
 
@@ -43,6 +43,8 @@ It consists of the following steps:
 
   Optimum `__.
 - Create an inference pipeline for grammatical error checking
 
+.. _top:
+
 **Table of contents**:
 
 - `How does it work? <#how-does-it-work>`__
diff --git a/docs/notebooks/215-image-inpainting-with-output.rst b/docs/notebooks/215-image-inpainting-with-output.rst
index 85f762359ab..f9ecfbafeeb 100644
--- a/docs/notebooks/215-image-inpainting-with-output.rst
+++ b/docs/notebooks/215-image-inpainting-with-output.rst
@@ -1,7 +1,7 @@
 Image In-painting with OpenVINO™
 --------------------------------
 
-.. _top:
+
 
 This notebook demonstrates how to use an image in-painting model with
 OpenVINO, using `GMCNN
 
@@ -11,6 +11,8 @@ given a tampered image, is able to create something very similar to the
 original image. The Following pipeline will be used in this notebook.
 |pipeline|
 
+.. _top:
+
 **Table of contents**:
 
 - `Download the Model <#download-the-model>`__
diff --git a/docs/notebooks/216-attention-center-with-output.rst b/docs/notebooks/216-attention-center-with-output.rst
index 07e5c69eedb..2a5dcfc7c8a 100644
--- a/docs/notebooks/216-attention-center-with-output.rst
+++ b/docs/notebooks/216-attention-center-with-output.rst
@@ -1,7 +1,7 @@
 The attention center model with OpenVINO™
 =========================================
 
-.. _top:
_top: + This notebook demonstrates how to use the `attention center model `__ with @@ -51,6 +51,8 @@ The attention center model has been trained with images from the `COCO dataset `__ annotated with saliency from the `SALICON dataset `__. +.. _top: + **Table of contents**: - `Imports <#imports>`__ diff --git a/docs/notebooks/217-vision-deblur-with-output.rst b/docs/notebooks/217-vision-deblur-with-output.rst index 3686de8db5f..6e0f7067823 100644 --- a/docs/notebooks/217-vision-deblur-with-output.rst +++ b/docs/notebooks/217-vision-deblur-with-output.rst @@ -1,6 +1,8 @@ Deblur Photos with DeblurGAN-v2 and OpenVINO™ ============================================= + + .. _top: **Table of contents**: diff --git a/docs/notebooks/218-vehicle-detection-and-recognition-with-output.rst b/docs/notebooks/218-vehicle-detection-and-recognition-with-output.rst index c5237117f8a..2bc8a6cd2e9 100644 --- a/docs/notebooks/218-vehicle-detection-and-recognition-with-output.rst +++ b/docs/notebooks/218-vehicle-detection-and-recognition-with-output.rst @@ -1,7 +1,7 @@ Vehicle Detection And Recognition with OpenVINO™ ================================================ -.. _top: + This tutorial demonstrates how to use two pre-trained models from `Open Model Zoo `__: @@ -19,6 +19,8 @@ As a result, you can get: result +.. _top: + **Table of contents**: - `Imports <#imports>`__ diff --git a/docs/notebooks/219-knowledge-graphs-conve-with-output.rst b/docs/notebooks/219-knowledge-graphs-conve-with-output.rst index c623c3cfd00..07fd9413bca 100644 --- a/docs/notebooks/219-knowledge-graphs-conve-with-output.rst +++ b/docs/notebooks/219-knowledge-graphs-conve-with-output.rst @@ -1,7 +1,7 @@ OpenVINO optimizations for Knowledge graphs =========================================== -.. _top: + The goal of this notebook is to showcase performance optimizations for the ConvE knowledge graph embeddings model using the Intel® Distribution @@ -18,6 +18,8 @@ The ConvE model is an implementation of the paper - sample dataset can be downloaded from: https://github.com/TimDettmers/ConvE/tree/master/countries/countries_S1 +.. _top: + **Table of contents**: - `Windows specific settings <#windows-specific-settings>`__ diff --git a/docs/notebooks/220-cross-lingual-books-alignment-with-output.rst b/docs/notebooks/220-cross-lingual-books-alignment-with-output.rst index cd34355ccf9..88d0874160f 100644 --- a/docs/notebooks/220-cross-lingual-books-alignment-with-output.rst +++ b/docs/notebooks/220-cross-lingual-books-alignment-with-output.rst @@ -1,7 +1,7 @@ Cross-lingual Books Alignment with Transformers and OpenVINO™ ============================================================= -.. _top: + Cross-lingual text alignment is the task of matching sentences in a pair of texts that are translations of each other. In this notebook, you’ll @@ -39,6 +39,8 @@ Prerequisites - ``seaborn`` - for alignment matrix visualization - ``ipywidgets`` - for displaying HTML and JS output in the notebook +.. _top: + **Table of contents**: - `Get Books <#get-books>`__ diff --git a/docs/notebooks/221-machine-translation-with-output.rst b/docs/notebooks/221-machine-translation-with-output.rst index f8c36d8b482..b4103a43f25 100644 --- a/docs/notebooks/221-machine-translation-with-output.rst +++ b/docs/notebooks/221-machine-translation-with-output.rst @@ -1,7 +1,7 @@ Machine translation demo ======================== -.. _top: + This demo utilizes Intel’s pre-trained model that translates from English to German. 
More information about the model can be found @@ -18,6 +18,8 @@ following structure: ```` + *tokenized sentence* + ```` + **Output** After the inference, we have a sequence of up to 200 tokens. The structure is the same as the one for the input. +.. _top: + **Table of contents**: - `Downloading model <#downloading-model>`__ diff --git a/docs/notebooks/222-vision-image-colorization-with-output.rst b/docs/notebooks/222-vision-image-colorization-with-output.rst index 5985afd3fed..5d3d32c0655 100644 --- a/docs/notebooks/222-vision-image-colorization-with-output.rst +++ b/docs/notebooks/222-vision-image-colorization-with-output.rst @@ -1,7 +1,7 @@ Image Colorization with OpenVINO ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -.. _top: + This notebook demonstrates how to colorize images with OpenVINO using the Colorization model @@ -44,6 +44,8 @@ About Colorization-siggraph See the `colorization `__ repository for more details. +.. _top: + **Table of contents**: - `Imports <#imports>`__ diff --git a/docs/notebooks/223-text-prediction-with-output.rst b/docs/notebooks/223-text-prediction-with-output.rst index ef77dd1d3e0..eeb9f79f0f2 100644 --- a/docs/notebooks/223-text-prediction-with-output.rst +++ b/docs/notebooks/223-text-prediction-with-output.rst @@ -1,7 +1,7 @@ Text Prediction with OpenVINO™ ============================== -.. _top: + This notebook shows text prediction with OpenVINO. This notebook can work in two different modes, Text Generation and Conversation, which the @@ -73,6 +73,8 @@ above. The Generated response is added to the history with the and the sequence is passed back into the model. +.. _top: + **Table of contents**: - `Model Selection <#model-selection>`__ diff --git a/docs/notebooks/224-3D-segmentation-point-clouds-with-output.rst b/docs/notebooks/224-3D-segmentation-point-clouds-with-output.rst index fef333d4d1c..8934a54ec2e 100644 --- a/docs/notebooks/224-3D-segmentation-point-clouds-with-output.rst +++ b/docs/notebooks/224-3D-segmentation-point-clouds-with-output.rst @@ -1,7 +1,7 @@ Part Segmentation of 3D Point Clouds with OpenVINO™ =================================================== -.. _top: + This notebook demonstrates how to process `point cloud `__ data and run 3D @@ -24,6 +24,8 @@ segmentation, to scene semantic parsing. It is highly efficient and effective, showing strong performance on par or even better than state of the art. +.. _top: + **Table of contents**: - `Imports <#imports>`__ diff --git a/docs/notebooks/225-stable-diffusion-text-to-image-with-output.rst b/docs/notebooks/225-stable-diffusion-text-to-image-with-output.rst index 90f6243f6c3..255e3b6b2a5 100644 --- a/docs/notebooks/225-stable-diffusion-text-to-image-with-output.rst +++ b/docs/notebooks/225-stable-diffusion-text-to-image-with-output.rst @@ -1,7 +1,7 @@ Text-to-Image Generation with Stable Diffusion and OpenVINO™ ============================================================ -.. _top: + Stable Diffusion is a text-to-image latent diffusion model created by the researchers and engineers from @@ -41,6 +41,8 @@ Notebook contains the following steps: API. 3. Run Stable Diffusion pipeline with OpenVINO. +.. 
+.. _top:
+
**Table of contents**:

- `Prerequisites <#prerequisites>`__

diff --git a/docs/notebooks/226-yolov7-optimization-with-output.rst b/docs/notebooks/226-yolov7-optimization-with-output.rst
index 330d988cab3..5867c26429a 100644
--- a/docs/notebooks/226-yolov7-optimization-with-output.rst
+++ b/docs/notebooks/226-yolov7-optimization-with-output.rst
@@ -1,7 +1,7 @@
Convert and Optimize YOLOv7 with OpenVINO™
==========================================

-.. _top:
+

The YOLOv7 algorithm is making big waves in the computer vision and
machine learning communities. It is a real-time object detection
@@ -40,6 +40,8 @@ The tutorial consists of the following steps:
- Compare accuracy of the FP32 and quantized models.
- Compare performance of the FP32 and quantized models.

+.. _top:
+
**Table of contents**:

- `Get Pytorch model <#get-pytorch-model>`__

diff --git a/docs/notebooks/227-whisper-subtitles-generation-with-output.rst b/docs/notebooks/227-whisper-subtitles-generation-with-output.rst
index 05b04c2fec8..39d210defad 100644
--- a/docs/notebooks/227-whisper-subtitles-generation-with-output.rst
+++ b/docs/notebooks/227-whisper-subtitles-generation-with-output.rst
@@ -1,7 +1,7 @@
Video Subtitle Generation using Whisper and OpenVINO™
=====================================================

-.. _top:
+

`Whisper `__ is an automatic speech recognition (ASR) system trained on
680,000 hours of multilingual and
@@ -26,6 +26,8 @@ Download the model.
2. Instantiate the PyTorch model pipeline.
3. Export the ONNX model and convert it to OpenVINO IR, using model conversion API.
4. Run the Whisper pipeline with OpenVINO models.

+.. _top:
+
**Table of contents**:

- `Prerequisites <#prerequisites>`__

diff --git a/docs/notebooks/228-clip-zero-shot-convert-with-output.rst b/docs/notebooks/228-clip-zero-shot-convert-with-output.rst
index 913817a8a4e..63f70768c20 100644
--- a/docs/notebooks/228-clip-zero-shot-convert-with-output.rst
+++ b/docs/notebooks/228-clip-zero-shot-convert-with-output.rst
@@ -1,7 +1,7 @@
Zero-shot Image Classification with OpenAI CLIP and OpenVINO™
=============================================================

-.. _top:
+

Zero-shot image classification is a computer vision task to classify
images into one of several classes without any prior training or
@@ -30,6 +30,8 @@ image classification. The notebook contains the following steps:
   conversion API.
4. Run CLIP with OpenVINO.

+.. _top:
+
**Table of contents**:

- `Instantiate model <#instantiate-model>`__

diff --git a/docs/notebooks/228-clip-zero-shot-quantize-with-output.rst b/docs/notebooks/228-clip-zero-shot-quantize-with-output.rst
index f6c2d4fb2f0..1e335a73b2f 100644
--- a/docs/notebooks/228-clip-zero-shot-quantize-with-output.rst
+++ b/docs/notebooks/228-clip-zero-shot-quantize-with-output.rst
@@ -1,7 +1,7 @@
Post-Training Quantization of OpenAI CLIP model with NNCF
=========================================================

-.. _top:
+

The goal of this tutorial is to demonstrate how to speed up the model by
applying 8-bit post-training quantization from
@@ -23,6 +23,8 @@ The optimization process contains the following steps:
   notebook first to generate OpenVINO IR model that is used for
   quantization.
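Once a calibration set is available, 8-bit post-training quantization with
NNCF reduces to a single call. A schematic sketch follows; the random arrays
stand in for the real preprocessed calibration data, and ``model.xml`` is a
hypothetical IR file produced by the conversion notebook.

.. code-block:: python

   import nncf
   import numpy as np
   import openvino as ov

   core = ov.Core()
   model = core.read_model("model.xml")  # hypothetical IR from the previous notebook

   # Random stand-in data; the notebook uses real preprocessed samples.
   calibration_data = [np.random.rand(1, 3, 224, 224).astype(np.float32) for _ in range(300)]
   calibration_dataset = nncf.Dataset(calibration_data)

   quantized_model = nncf.quantize(model, calibration_dataset)
   ov.save_model(quantized_model, "model_int8.xml")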
+.. _top:
+
**Table of contents**:

- `Prerequisites <#prerequisites>`__

diff --git a/docs/notebooks/229-distilbert-sequence-classification-with-output.rst b/docs/notebooks/229-distilbert-sequence-classification-with-output.rst
index 514d49925a5..018993b6f03 100644
--- a/docs/notebooks/229-distilbert-sequence-classification-with-output.rst
+++ b/docs/notebooks/229-distilbert-sequence-classification-with-output.rst
@@ -1,7 +1,7 @@
Sentiment Analysis with OpenVINO™
=================================

-.. _top:
+

**Sentiment analysis** is the use of natural language processing, text
analysis, computational linguistics, and biometrics to systematically
@@ -9,6 +9,8 @@ identify, extract, quantify, and study affective states and subjective
information. This notebook demonstrates how to convert and run a
sequence classification model using OpenVINO.

+.. _top:
+
**Table of contents**:

- `Imports <#imports>`__

diff --git a/docs/notebooks/230-yolov8-optimization-with-output.rst b/docs/notebooks/230-yolov8-optimization-with-output.rst
index 28d3b14a051..f3083e063aa 100644
--- a/docs/notebooks/230-yolov8-optimization-with-output.rst
+++ b/docs/notebooks/230-yolov8-optimization-with-output.rst
@@ -1,7 +1,7 @@
Convert and Optimize YOLOv8 with OpenVINO™
==========================================

-.. _top:
+

The YOLOv8 algorithm developed by Ultralytics is a cutting-edge,
state-of-the-art (SOTA) model that is designed to be fast, accurate, and
@@ -39,6 +39,8 @@ The tutorial consists of the following steps:
- Compare performance of the FP32 and quantized models.
- Compare accuracy of the FP32 and quantized models.

+.. _top:
+
**Table of contents**:

- `Get Pytorch model <#get-pytorch-model>`__

diff --git a/docs/notebooks/231-instruct-pix2pix-image-editing-with-output.rst b/docs/notebooks/231-instruct-pix2pix-image-editing-with-output.rst
index bf63a422e49..308a358d1c5 100644
--- a/docs/notebooks/231-instruct-pix2pix-image-editing-with-output.rst
+++ b/docs/notebooks/231-instruct-pix2pix-image-editing-with-output.rst
@@ -1,7 +1,7 @@
Image Editing with InstructPix2Pix and OpenVINO
===============================================

-.. _top:
+

InstructPix2Pix is a conditional diffusion model that edits images
based on written instructions provided by the user. Generative image
@@ -31,6 +31,8 @@ Notebook contains the following steps:

3. Run InstructPix2Pix pipeline with OpenVINO.

+.. _top:
+
**Table of contents**:

- `Prerequisites <#prerequisites>`__

diff --git a/docs/notebooks/233-blip-visual-language-processing-with-output.rst b/docs/notebooks/233-blip-visual-language-processing-with-output.rst
index 2637f314bf1..8468422b451 100644
--- a/docs/notebooks/233-blip-visual-language-processing-with-output.rst
+++ b/docs/notebooks/233-blip-visual-language-processing-with-output.rst
@@ -1,7 +1,7 @@
Visual Question Answering and Image Captioning using BLIP and OpenVINO
======================================================================

-.. _top:
+

Humans perceive the world through vision and language. A longtime goal
of AI is to build intelligent agents that can understand the world
@@ -24,6 +24,8 @@ The tutorial consists of the following parts:

2. Convert the BLIP model to OpenVINO IR.
3. Run visual question answering and image captioning with OpenVINO.
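As a reference point for part 1, BLIP can be exercised in plain PyTorch
before any conversion. A short sketch, assuming the public
``Salesforce/blip-vqa-base`` checkpoint and a sample COCO image; both are
assumptions for illustration rather than the notebook's exact inputs.

.. code-block:: python

   import requests
   from PIL import Image
   from transformers import BlipForQuestionAnswering, BlipProcessor

   processor = BlipProcessor.from_pretrained("Salesforce/blip-vqa-base")
   model = BlipForQuestionAnswering.from_pretrained("Salesforce/blip-vqa-base")

   # A widely used COCO sample image of two cats on a couch.
   url = "http://images.cocodataset.org/val2017/000000039769.jpg"
   image = Image.open(requests.get(url, stream=True).raw)

   inputs = processor(image, "How many cats are in the picture?", return_tensors="pt")
   answer_ids = model.generate(**inputs)
   print(processor.decode(answer_ids[0], skip_special_tokens=True))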
+.. _top:
+
**Table of contents**:

- `Background <#background>`__

diff --git a/docs/notebooks/234-encodec-audio-compression-with-output.rst b/docs/notebooks/234-encodec-audio-compression-with-output.rst
index 309214879cd..7e98b009f94 100644
--- a/docs/notebooks/234-encodec-audio-compression-with-output.rst
+++ b/docs/notebooks/234-encodec-audio-compression-with-output.rst
@@ -1,7 +1,7 @@
Audio compression with EnCodec and OpenVINO
===========================================

-.. _top:
+

Compression is an important part of the Internet today because it
enables people to easily share high-quality photos, listen to audio
@@ -28,6 +28,8 @@ and original `repo `__.

   image.png

+.. _top:
+
**Table of contents**:

- `Prerequisites <#prerequisites>`__

diff --git a/docs/notebooks/235-controlnet-stable-diffusion-with-output.rst b/docs/notebooks/235-controlnet-stable-diffusion-with-output.rst
index 1ce9e215d76..3ab1065358f 100644
--- a/docs/notebooks/235-controlnet-stable-diffusion-with-output.rst
+++ b/docs/notebooks/235-controlnet-stable-diffusion-with-output.rst
@@ -1,7 +1,7 @@
Text-to-Image Generation with ControlNet Conditioning
=====================================================

-.. _top:
+

Diffusion models have sparked a revolution in AI-generated art. This
technology enables the creation of high-quality images simply by writing
a text prompt.
@@ -141,6 +141,8 @@ of the target in the image:

This tutorial focuses mainly on conditioning by pose. However, the
discussed steps are also applicable to other annotation modes.

+.. _top:
+
**Table of contents**:

- `Prerequisites <#prerequisites>`__

diff --git a/docs/notebooks/236-stable-diffusion-v2-infinite-zoom-with-output.rst b/docs/notebooks/236-stable-diffusion-v2-infinite-zoom-with-output.rst
index 6916ae2fd5f..4a1e447144f 100644
--- a/docs/notebooks/236-stable-diffusion-v2-infinite-zoom-with-output.rst
+++ b/docs/notebooks/236-stable-diffusion-v2-infinite-zoom-with-output.rst
@@ -1,7 +1,7 @@
Infinite Zoom Stable Diffusion v2 and OpenVINO™
===============================================

-.. _top:
+

Stable Diffusion v2 is the next generation of the Stable Diffusion
model, a text-to-image latent diffusion model created by the researchers
and
@@ -74,6 +74,8 @@ Notebook contains the following steps:

3. Run the Stable Diffusion v2 inpainting pipeline to generate an
   infinite zoom video

+.. _top:
+
**Table of contents**:

- `Stable Diffusion v2 Infinite Zoom Showcase <#stable-diffusion-v2-infinite-zoom-showcase>`__
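The inpainting trick behind the zoom effect can be sketched in a few lines:
shrink the previous frame into the centre of a canvas and let an inpainting
model repaint the masked border. The checkpoint name, the 512×512 frame size,
and the geometry below are illustrative assumptions, not the notebook's exact
code.

.. code-block:: python

   from PIL import Image
   from diffusers import StableDiffusionInpaintPipeline

   pipe = StableDiffusionInpaintPipeline.from_pretrained(
       "stabilityai/stable-diffusion-2-inpainting"
   )

   def zoom_step(frame: Image.Image, prompt: str) -> Image.Image:
       # Paste a half-size copy of the frame into the centre of an empty canvas.
       canvas = Image.new("RGB", frame.size)
       small = frame.resize((frame.width // 2, frame.height // 2))
       offset = (frame.width // 4, frame.height // 4)
       canvas.paste(small, offset)
       # White regions of the mask are repainted; the centre is kept.
       mask = Image.new("L", frame.size, 255)
       mask.paste(Image.new("L", small.size, 0), offset)
       return pipe(prompt=prompt, image=canvas, mask_image=mask).images[0]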
diff --git a/docs/notebooks/236-stable-diffusion-v2-optimum-demo-comparison-with-output.rst b/docs/notebooks/236-stable-diffusion-v2-optimum-demo-comparison-with-output.rst
index 59df2505a79..ff8f9a9350f 100644
--- a/docs/notebooks/236-stable-diffusion-v2-optimum-demo-comparison-with-output.rst
+++ b/docs/notebooks/236-stable-diffusion-v2-optimum-demo-comparison-with-output.rst
@@ -1,10 +1,12 @@
Stable Diffusion v2.1 using Optimum-Intel OpenVINO and multiple Intel Hardware
==============================================================================

-.. _top:
+

|image0|

+.. _top:
+
**Table of contents**:

- `Showing Info Available Devices <#showing-info-available-devices>`__

diff --git a/docs/notebooks/236-stable-diffusion-v2-optimum-demo-with-output.rst b/docs/notebooks/236-stable-diffusion-v2-optimum-demo-with-output.rst
index 59641538c13..f44eda207c3 100644
--- a/docs/notebooks/236-stable-diffusion-v2-optimum-demo-with-output.rst
+++ b/docs/notebooks/236-stable-diffusion-v2-optimum-demo-with-output.rst
@@ -1,10 +1,12 @@
Stable Diffusion v2.1 using Optimum-Intel OpenVINO
==================================================

-.. _top:
+

|image0|

+.. _top:
+
**Table of contents**:

- `Showing Info Available Devices <#showing-info-available-devices>`__

diff --git a/docs/notebooks/236-stable-diffusion-v2-text-to-image-demo-with-output.rst b/docs/notebooks/236-stable-diffusion-v2-text-to-image-demo-with-output.rst
index fc046861222..7cd65143c0b 100644
--- a/docs/notebooks/236-stable-diffusion-v2-text-to-image-demo-with-output.rst
+++ b/docs/notebooks/236-stable-diffusion-v2-text-to-image-demo-with-output.rst
@@ -1,7 +1,7 @@
Stable Diffusion Text-to-Image Demo
===================================

-.. _top:
+

Stable Diffusion is an innovative generative AI technique that allows us
to generate and manipulate images in interesting ways, including
@@ -26,6 +26,8 @@ promising results for selecting a wide range of input text prompts!

`236-stable-diffusion-v2-text-to-image `__.

+.. _top:
+
**Table of contents**:

- `Step 0: Install and import prerequisites <#step-0-install-and-import-prerequisites>`__

diff --git a/docs/notebooks/236-stable-diffusion-v2-text-to-image-with-output.rst b/docs/notebooks/236-stable-diffusion-v2-text-to-image-with-output.rst
index 826dc04d7ee..f8cb417e3cf 100644
--- a/docs/notebooks/236-stable-diffusion-v2-text-to-image-with-output.rst
+++ b/docs/notebooks/236-stable-diffusion-v2-text-to-image-with-output.rst
@@ -1,7 +1,7 @@
Text-to-Image Generation with Stable Diffusion v2 and OpenVINO™
===============================================================

-.. _top:
+

Stable Diffusion v2 is the next generation of the Stable Diffusion
model, a text-to-image latent diffusion model created by the researchers
and
@@ -81,6 +81,8 @@ Notebook contains the following steps:

notebook `__.

+.. _top:
+
**Table of contents**:

- `Prerequisites <#prerequisites>`__
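For the Optimum-Intel demos, the whole pipeline collapses into a few lines,
because ``optimum-intel`` hides the conversion and the OpenVINO runtime behind
the familiar diffusers interface. A hedged sketch, assuming the
``stabilityai/stable-diffusion-2-1`` checkpoint:

.. code-block:: python

   from optimum.intel.openvino import OVStableDiffusionPipeline

   # export=True converts the PyTorch checkpoint to OpenVINO IR on the fly;
   # inference then runs on the default OpenVINO device (CPU).
   pipe = OVStableDiffusionPipeline.from_pretrained(
       "stabilityai/stable-diffusion-2-1", export=True
   )
   image = pipe("a photograph of an astronaut riding a horse").images[0]
   image.save("astronaut.png")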
diff --git a/docs/notebooks/237-segment-anything-with-output.rst b/docs/notebooks/237-segment-anything-with-output.rst
index 454adae0660..25969d47260 100644
--- a/docs/notebooks/237-segment-anything-with-output.rst
+++ b/docs/notebooks/237-segment-anything-with-output.rst
@@ -1,6 +1,8 @@
Object masks from prompts with SAM and OpenVINO
===============================================

+
+
.. _top:

**Table of contents**:

diff --git a/docs/notebooks/238-deep-floyd-if-with-output.rst b/docs/notebooks/238-deep-floyd-if-with-output.rst
index 7585c074bad..5701933a9ef 100644
--- a/docs/notebooks/238-deep-floyd-if-with-output.rst
+++ b/docs/notebooks/238-deep-floyd-if-with-output.rst
@@ -1,8 +1,6 @@
Image generation with DeepFloyd IF and OpenVINO™
================================================

-.. _top:
-
DeepFloyd IF is an advanced open-source text-to-image model that
delivers remarkable photorealism and language comprehension. DeepFloyd
IF consists of a frozen text encoder and three cascaded pixel diffusion
@@ -78,6 +76,10 @@ vector in embedded space.

   conventional Super Resolution network to get hi-res results.

+
+
+.. _top:
+
**Table of contents**:

- `Prerequisites <#prerequisites>`__

diff --git a/docs/notebooks/239-image-bind-convert-with-output.rst b/docs/notebooks/239-image-bind-convert-with-output.rst
index bc4a983a5a2..ffd69a13191 100644
--- a/docs/notebooks/239-image-bind-convert-with-output.rst
+++ b/docs/notebooks/239-image-bind-convert-with-output.rst
@@ -1,7 +1,7 @@
Binding multimodal data using ImageBind and OpenVINO
====================================================

-.. _top:
+

Exploring the surrounding world, people get information using multiple
senses, for example, seeing a busy street and hearing the sounds of car
@@ -69,6 +69,8 @@ represented on the image below:

In this tutorial, we consider how to use ImageBind for multimodal
zero-shot classification.

+.. _top:
+
**Table of contents**:

- `Prerequisites <#prerequisites>`__

diff --git a/docs/notebooks/240-dolly-2-instruction-following-with-output.rst b/docs/notebooks/240-dolly-2-instruction-following-with-output.rst
index bbc1e240159..9b450eb9902 100644
--- a/docs/notebooks/240-dolly-2-instruction-following-with-output.rst
+++ b/docs/notebooks/240-dolly-2-instruction-following-with-output.rst
@@ -1,7 +1,7 @@
Instruction following using Databricks Dolly 2.0 and OpenVINO
=============================================================

-.. _top:
+

Instruction following is one of the cornerstones of the current
generation of large language models (LLMs). Reinforcement learning with
@@ -82,6 +82,8 @@ post `__

+.. _top:
+
**Table of contents**:

- `Prerequisites <#prerequisites>`__
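Before the OpenVINO conversion, the reference PyTorch flow follows the public
model card. A sketch along those lines; the ``databricks/dolly-v2-3b``
checkpoint and the generation settings come from that model card, not from
this notebook.

.. code-block:: python

   import torch
   from transformers import pipeline

   generate = pipeline(
       model="databricks/dolly-v2-3b",
       torch_dtype=torch.bfloat16,
       trust_remote_code=True,  # the model card ships a custom instruction pipeline
       device_map="auto",
   )
   print(generate("Explain what instruction following means for an LLM."))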
diff --git a/docs/notebooks/241-riffusion-text-to-music-with-output.rst b/docs/notebooks/241-riffusion-text-to-music-with-output.rst
index cae9b6e81d1..d8eb9cb1462 100644
--- a/docs/notebooks/241-riffusion-text-to-music-with-output.rst
+++ b/docs/notebooks/241-riffusion-text-to-music-with-output.rst
@@ -1,7 +1,7 @@
Text-to-Music generation using Riffusion and OpenVINO
=====================================================

-.. _top:
+

`Riffusion `__ is a latent text-to-image
diffusion model capable of generating spectrogram
@@ -76,6 +76,8 @@ The STFT is invertible, so the original audio can be reconstructed from
a spectrogram. This idea is behind the approach of using Riffusion for
audio generation.

+.. _top:
+
**Table of contents**:

- `Prerequisites <#prerequisites>`__

diff --git a/docs/notebooks/242-freevc-voice-conversion-with-output.rst b/docs/notebooks/242-freevc-voice-conversion-with-output.rst
index 5fcb41ebaf5..1c39257b4a7 100644
--- a/docs/notebooks/242-freevc-voice-conversion-with-output.rst
+++ b/docs/notebooks/242-freevc-voice-conversion-with-output.rst
@@ -1,7 +1,7 @@
High-Quality Text-Free One-Shot Voice Conversion with FreeVC and OpenVINO™
==========================================================================

-.. _top:
+

`FreeVC `__ allows altering the voice of a source
speaker to a target style, while keeping the linguistic content
@@ -30,6 +30,8 @@ devices. It consists of the following steps:

- Convert models to OpenVINO Intermediate Representation.
- Inference using only OpenVINO’s IR models.

+.. _top:
+
**Table of contents**:

- `Prerequisites <#prerequisites>`__

diff --git a/docs/notebooks/243-tflite-selfie-segmentation-with-output.rst b/docs/notebooks/243-tflite-selfie-segmentation-with-output.rst
index 69a2c1eecdd..c709cd516e9 100644
--- a/docs/notebooks/243-tflite-selfie-segmentation-with-output.rst
+++ b/docs/notebooks/243-tflite-selfie-segmentation-with-output.rst
@@ -1,7 +1,7 @@
Selfie Segmentation using TFLite and OpenVINO
=============================================

-.. _top:
+

The Selfie segmentation pipeline allows developers to easily separate
the background from users within a scene and focus on what matters.
@@ -36,6 +36,8 @@ The tutorial consists of the following steps:

2. Run inference on the image.
3. Run interactive background blurring demo on video.

+.. _top:
+
**Table of contents**:

- `Prerequisites <#prerequisites>`__

diff --git a/docs/notebooks/244-named-entity-recognition-with-output.rst b/docs/notebooks/244-named-entity-recognition-with-output.rst
index 40dcb1455d7..dd6af58fd7b 100644
--- a/docs/notebooks/244-named-entity-recognition-with-output.rst
+++ b/docs/notebooks/244-named-entity-recognition-with-output.rst
@@ -1,7 +1,7 @@
Named entity recognition with OpenVINO™
=======================================

-.. _top:
+

Named entity recognition (NER) is a natural language processing method
that involves detecting key information in the
@@ -27,6 +27,8 @@ To simplify the user experience, the `Hugging Face Optimum
`__ library is used to convert the model to OpenVINO™ IR format and
quantize it.

+.. _top:
+
**Table of contents**:

- `Prerequisites <#prerequisites>`__

diff --git a/docs/notebooks/248-stable-diffusion-xl-with-output.rst b/docs/notebooks/248-stable-diffusion-xl-with-output.rst
index 457c66ce539..594fb4f1a7b 100644
--- a/docs/notebooks/248-stable-diffusion-xl-with-output.rst
+++ b/docs/notebooks/248-stable-diffusion-xl-with-output.rst
@@ -1,7 +1,7 @@
Image generation with Stable Diffusion XL and OpenVINO
======================================================

-.. _top:
+

Stable Diffusion XL or SDXL is the latest image generation model that is
tailored towards more photorealistic outputs with more detailed imagery
@@ -67,6 +67,8 @@ The tutorial consists of the following steps:

Some demonstrated models can require at least 64GB RAM for conversion
and running.

+.. _top:
+
**Table of contents**:

- `Install Prerequisites <#install-prerequisites>`__

diff --git a/docs/notebooks/250-music-generation-with-output.rst b/docs/notebooks/250-music-generation-with-output.rst
index 1339c538e7e..733e303c35f 100644
--- a/docs/notebooks/250-music-generation-with-output.rst
+++ b/docs/notebooks/250-music-generation-with-output.rst
@@ -1,7 +1,7 @@
Controllable Music Generation with MusicGen and OpenVINO
========================================================

-.. _top:
+

MusicGen is a single-stage auto-regressive Transformer model capable of
generating high-quality music samples conditioned on text descriptions
@@ -32,6 +32,8 @@ We will use a model implementation from the `Hugging Face
Transformers `__ library.
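The PyTorch-side baseline mirrors the Transformers documentation. A compact
sketch, assuming the public ``facebook/musicgen-small`` checkpoint:

.. code-block:: python

   import scipy.io.wavfile
   from transformers import AutoProcessor, MusicgenForConditionalGeneration

   processor = AutoProcessor.from_pretrained("facebook/musicgen-small")
   model = MusicgenForConditionalGeneration.from_pretrained("facebook/musicgen-small")

   inputs = processor(text=["90s rock song with loud guitars"], padding=True, return_tensors="pt")
   audio = model.generate(**inputs, max_new_tokens=256)  # roughly five seconds of audio

   # Write the generated waveform at the model's native sampling rate.
   rate = model.config.audio_encoder.sampling_rate
   scipy.io.wavfile.write("sample.wav", rate=rate, data=audio[0, 0].numpy())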
+.. _top:
+
**Table of contents**:

- `Requirements and Imports <#prerequisites>`__

diff --git a/docs/notebooks/251-tiny-sd-image-generation-with-output.rst b/docs/notebooks/251-tiny-sd-image-generation-with-output.rst
index f8043dfe552..b2afd5f5c58 100644
--- a/docs/notebooks/251-tiny-sd-image-generation-with-output.rst
+++ b/docs/notebooks/251-tiny-sd-image-generation-with-output.rst
@@ -1,7 +1,7 @@
Image Generation with Tiny-SD and OpenVINO™
===========================================

-.. _top:
+

In recent times, the AI community has witnessed a remarkable surge in
the development of larger and more performant language models, such as
@@ -41,7 +41,9 @@ The notebook contains the following steps:
3. Run Inference pipeline with OpenVINO.
4. Run Interactive demo for Tiny-SD model

-**Table of content**:
+.. _toc:
+
+**Table of contents**:

- `Prerequisites <#prerequisites>`__
- `Create PyTorch Models pipeline <#create-pytorch-models-pipeline>`__

diff --git a/docs/notebooks/252-fastcomposer-image-generation-with-output.rst b/docs/notebooks/252-fastcomposer-image-generation-with-output.rst
index 891e1dd3646..d0c9a479aa0 100644
--- a/docs/notebooks/252-fastcomposer-image-generation-with-output.rst
+++ b/docs/notebooks/252-fastcomposer-image-generation-with-output.rst
@@ -1,7 +1,7 @@
`FastComposer: Tuning-Free Multi-Subject Image Generation with Localized Attention `__
=====================================================================================================================

-.. _top:
+

FastComposer uses subject embeddings extracted by an image encoder to
augment the generic text conditioning in diffusion models, enabling
@@ -32,6 +32,8 @@ different styles, actions, and contexts.
   drivers in the system
- changes to have compatibility with transformers >= 4.30.1 (due to
  security vulnerability)

+.. _top:
+
**Table of contents**:

- `Install Prerequisites <#install-prerequisites>`__

diff --git a/docs/notebooks/253-zeroscope-text2video-with-output.rst b/docs/notebooks/253-zeroscope-text2video-with-output.rst
index 4a538a6a8fc..549a1ce04e5 100644
--- a/docs/notebooks/253-zeroscope-text2video-with-output.rst
+++ b/docs/notebooks/253-zeroscope-text2video-with-output.rst
@@ -1,7 +1,7 @@
Video generation with ZeroScope and OpenVINO
============================================

-.. _top:
+

The ZeroScope model is a free and open-source text-to-video model that
can generate realistic and engaging videos from text descriptions. It is
@@ -34,6 +34,8 @@ Both versions of the ZeroScope model are available on Hugging Face:

We will use the first one.
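Generation itself goes through the standard diffusers text-to-video pipeline.
A brief sketch, where the ``cerspense/zeroscope_v2_576w`` checkpoint and the
prompt are assumptions for illustration:

.. code-block:: python

   from diffusers import DiffusionPipeline
   from diffusers.utils import export_to_video

   # Hypothetical checkpoint choice; both ZeroScope variants load the same way.
   pipe = DiffusionPipeline.from_pretrained("cerspense/zeroscope_v2_576w")
   frames = pipe("a panda surfing a wave", num_frames=24).frames
   video_path = export_to_video(frames)  # writes an .mp4 file and returns its path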
+.. _top:
+
**Table of contents**:

- `Install and import required packages <#install-and-import-required-packages>`__

diff --git a/docs/notebooks/301-tensorflow-training-openvino-nncf-with-output.rst b/docs/notebooks/301-tensorflow-training-openvino-nncf-with-output.rst
index 353297f1805..6054fb8ae8c 100644
--- a/docs/notebooks/301-tensorflow-training-openvino-nncf-with-output.rst
+++ b/docs/notebooks/301-tensorflow-training-openvino-nncf-with-output.rst
@@ -11,6 +11,8 @@ A custom dataloader and metric will be defined, and accuracy and
performance will be computed for the original IR model and the quantized
model.

+.. _top:
+
**Table of contents**:

- `Preparation <#preparation>`__

diff --git a/docs/notebooks/301-tensorflow-training-openvino-with-output.rst b/docs/notebooks/301-tensorflow-training-openvino-with-output.rst
index 53b511021f8..0b02ba0ee4f 100644
--- a/docs/notebooks/301-tensorflow-training-openvino-with-output.rst
+++ b/docs/notebooks/301-tensorflow-training-openvino-with-output.rst
@@ -1,6 +1,8 @@
From Training to Deployment with TensorFlow and OpenVINO™
=========================================================

+
+
.. _top:

**Table of contents**:

diff --git a/docs/notebooks/302-pytorch-quantization-aware-training-with-output.rst b/docs/notebooks/302-pytorch-quantization-aware-training-with-output.rst
index 766537b933d..3cc99a837ea 100644
--- a/docs/notebooks/302-pytorch-quantization-aware-training-with-output.rst
+++ b/docs/notebooks/302-pytorch-quantization-aware-training-with-output.rst
@@ -1,7 +1,7 @@
Quantization Aware Training with NNCF, using PyTorch framework
==============================================================

-.. _top:
+

This notebook is based on `ImageNet training in
PyTorch `__.
@@ -34,6 +34,8 @@ hub `__.

This notebook requires a C++ compiler.

+.. _top:
+
**Table of contents**:

- `Imports and Settings <#imports-and-settings>`__

diff --git a/docs/notebooks/305-tensorflow-quantization-aware-training-with-output.rst b/docs/notebooks/305-tensorflow-quantization-aware-training-with-output.rst
index b4673eb4c3e..8f0ad9a7f72 100644
--- a/docs/notebooks/305-tensorflow-quantization-aware-training-with-output.rst
+++ b/docs/notebooks/305-tensorflow-quantization-aware-training-with-output.rst
@@ -1,7 +1,7 @@
Quantization Aware Training with NNCF, using TensorFlow Framework
=================================================================

-.. _top:
+

The goal of this notebook is to demonstrate how to use the Neural
Network Compression Framework `NNCF `__
@@ -23,6 +23,8 @@ Imagenette is a subset of 10 easily classified classes from the ImageNet
dataset. Using the smaller model and dataset will speed up training and
download time.

+.. _top:
+
**Table of contents**:

- `Imports and Settings <#imports-and-settings>`__
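In both QAT notebooks the pattern is the same: wrap the floating-point model
so that quantization is simulated during training, fine-tune briefly, then
export. A skeletal TensorFlow version follows, with a toy random dataset and
placeholder config values standing in for the notebook's real ones.

.. code-block:: python

   import tensorflow as tf
   from nncf import NNCFConfig
   from nncf.tensorflow import create_compressed_model, register_default_init_args

   model = tf.keras.applications.ResNet50(weights=None, classes=10)

   # Toy stand-in for the Imagenette input pipeline used in the notebook.
   train_dataset = tf.data.Dataset.from_tensor_slices(
       (tf.random.uniform([8, 224, 224, 3]), tf.zeros([8], tf.int64))
   ).batch(4)

   nncf_config = NNCFConfig.from_dict({
       "input_info": {"sample_size": [1, 224, 224, 3]},
       "compression": {"algorithm": "quantization"},
   })
   nncf_config = register_default_init_args(nncf_config, train_dataset, batch_size=4)

   # Insert fake-quantize operations, then fine-tune the wrapped model as usual.
   compression_ctrl, quantized_model = create_compressed_model(model, nncf_config)
   quantized_model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
   quantized_model.fit(train_dataset, epochs=1)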
diff --git a/docs/notebooks/401-object-detection-with-output.rst b/docs/notebooks/401-object-detection-with-output.rst
index bc83f4a2af3..45ee50e220e 100644
--- a/docs/notebooks/401-object-detection-with-output.rst
+++ b/docs/notebooks/401-object-detection-with-output.rst
@@ -1,7 +1,7 @@
Live Object Detection with OpenVINO™
====================================

-.. _top:
+

This notebook demonstrates live object detection with OpenVINO, using
the `SSDLite
@@ -17,6 +17,8 @@ Additionally, you can also upload a video file.

with a webcam. If you run the notebook on a server, the webcam will not
work. However, you can still do inference on a video.

+.. _top:
+
**Table of contents**:

- `Preparation <#preparation>`__

diff --git a/docs/notebooks/402-pose-estimation-with-output.rst b/docs/notebooks/402-pose-estimation-with-output.rst
index fbee0c5e470..efe0ffcdd55 100644
--- a/docs/notebooks/402-pose-estimation-with-output.rst
+++ b/docs/notebooks/402-pose-estimation-with-output.rst
@@ -1,7 +1,7 @@
Live Human Pose Estimation with OpenVINO™
=========================================

-.. _top:
+

This notebook demonstrates live pose estimation with OpenVINO, using
the OpenPose
@@ -18,6 +18,8 @@ Additionally, you can also upload a video file.

work. However, you can still do inference on a video in the final
step.

+.. _top:
+
**Table of contents**:

- `Imports <#imports>`__

diff --git a/docs/notebooks/403-action-recognition-webcam-with-output.rst b/docs/notebooks/403-action-recognition-webcam-with-output.rst
index d0cb4b74b57..d6755518701 100644
--- a/docs/notebooks/403-action-recognition-webcam-with-output.rst
+++ b/docs/notebooks/403-action-recognition-webcam-with-output.rst
@@ -1,7 +1,7 @@
Human Action Recognition with OpenVINO™
=======================================

-.. _top:
+

This notebook demonstrates live human action recognition with OpenVINO,
using the `Action Recognition
@@ -39,6 +39,8 @@ Transformer and `ResNet34
`__.

+.. _top:
+
**Table of contents**:

- `Imports <#imports>`__

diff --git a/docs/notebooks/404-style-transfer-with-output.rst b/docs/notebooks/404-style-transfer-with-output.rst
index 7c5d9c10228..630aca385b8 100644
--- a/docs/notebooks/404-style-transfer-with-output.rst
+++ b/docs/notebooks/404-style-transfer-with-output.rst
@@ -1,7 +1,7 @@
Style Transfer with OpenVINO™
=============================

-.. _top:
+

This notebook demonstrates style transfer with OpenVINO, using the
Style Transfer Models from `ONNX Model
@@ -32,6 +32,8 @@ Additionally, you can also upload a video file.

but you can run inference, using a video file.

+.. _top:
+
**Table of contents**:

- `Preparation <#preparation>`__

diff --git a/docs/notebooks/405-paddle-ocr-webcam-with-output.rst b/docs/notebooks/405-paddle-ocr-webcam-with-output.rst
index 608a9d4ab58..8f11e078ae9 100644
--- a/docs/notebooks/405-paddle-ocr-webcam-with-output.rst
+++ b/docs/notebooks/405-paddle-ocr-webcam-with-output.rst
@@ -1,7 +1,7 @@
PaddleOCR with OpenVINO™
========================

-.. _top:
+

This demo shows how to run the PP-OCR model on OpenVINO natively.
Instead of exporting the PaddlePaddle model to ONNX and then converting
to the
@@ -25,6 +25,8 @@ the PaddleOCR is as follows:

with a webcam. If you run the notebook on a server, the webcam will not
work. You can still do inference on a video file.

+.. _top:
+
**Table of contents**:

- `Imports <#imports>`__

diff --git a/docs/notebooks/406-3D-pose-estimation-with-output.rst b/docs/notebooks/406-3D-pose-estimation-with-output.rst
index 9038ce30981..121a5d44326 100644
--- a/docs/notebooks/406-3D-pose-estimation-with-output.rst
+++ b/docs/notebooks/406-3D-pose-estimation-with-output.rst
@@ -1,7 +1,7 @@
Live 3D Human Pose Estimation with OpenVINO
===========================================

-.. _top:
+

This notebook demonstrates live 3D Human Pose Estimation with OpenVINO
via a webcam. We utilize the model
@@ -30,6 +30,8 @@ To ensure that the results are displayed correctly, run the code in a
recommended browser on one of the following operating systems: Ubuntu,
Windows: Chrome, macOS: Safari.

+.. _top:
+
**Table of contents**:

- `Prerequisites <#prerequisites>`__

diff --git a/docs/notebooks/407-person-tracking-with-output.rst b/docs/notebooks/407-person-tracking-with-output.rst
index abc808bb273..b267e6bd9ec 100644
--- a/docs/notebooks/407-person-tracking-with-output.rst
+++ b/docs/notebooks/407-person-tracking-with-output.rst
@@ -1,7 +1,7 @@
Person Tracking with OpenVINO™
==============================

-.. _top:
+

This notebook demonstrates live person tracking with OpenVINO: it reads
frames from an input video sequence, detects people in the frames,
@@ -95,6 +95,8 @@ realtime tracking,” in ICIP, 2016, pp. 3464–3468.

.. |deepsort| image:: https://user-images.githubusercontent.com/91237924/221744683-0042eff8-2c41-43b8-b3ad-b5929bafb60b.png

+.. _top:
+
**Table of contents**:

- `Imports <#imports>`__

diff --git a/docs/tutorials.md b/docs/tutorials.md
index a4fa0ed98cb..c21005bab47 100644
--- a/docs/tutorials.md
+++ b/docs/tutorials.md
@@ -131,6 +131,15 @@ Tutorials that explain how to optimize and quantize models with OpenVINO tools.
 +----------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------+
 | `120-tensorflow-object-detection-to-openvino `__ |br| |n120| |br| |c120| | Convert TensorFlow Object Detection models to OpenVINO IR |
 +----------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------+
+| `122-speech-recognition-quantization-wav2vec2 `__ | Quantize Speech Recognition Models with accuracy control using NNCF PTQ API. |
++----------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------+
+| `122-yolov8-quantization-with-accuracy-control `__ | Convert and Optimize YOLOv8 with OpenVINO™. |
++----------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------+
+
+
+
+

Model Demos