add-253 (#19500)
This commit is contained in:
parent 23cad1770e
commit 8aec490128

896 docs/notebooks/253-zeroscope-text2video-with-output.rst (new file)

@@ -0,0 +1,896 @@
Video generation with ZeroScope and OpenVINO
============================================

.. _top:

The ZeroScope model is a free and open-source text-to-video model that
can generate realistic and engaging videos from text descriptions. It is
based on the
`Modelscope <https://modelscope.cn/models/damo/text-to-video-synthesis/summary>`__
model, but it has been improved to produce higher-quality videos with a
16:9 aspect ratio and no Shutterstock watermark. The ZeroScope model is
available in two versions: ZeroScope_v2 576w, which is optimized for
rapid content creation at a resolution of 576x320 pixels, and
ZeroScope_v2 XL, which upscales videos to a high-definition resolution
of 1024x576.

The ZeroScope model is trained on a dataset of over 9,000 videos and
29,000 tagged frames. It uses a diffusion model to generate videos,
which means that it starts with a random noise image and gradually adds
detail to it until it matches the text description. The ZeroScope model
is still under development, but it has already been used to create some
impressive videos. For example, it has been used to create videos of
people dancing, playing sports, and even driving cars.

The ZeroScope model is a powerful tool that can be used to create
various videos, from simple animations to complex scenes. It is still
under development, but it has the potential to revolutionize the way we
create and consume video content.

Both versions of the ZeroScope model are available on Hugging Face:

- `ZeroScope_v2 576w <https://huggingface.co/cerspense/zeroscope_v2_576w>`__
- `ZeroScope_v2 XL <https://huggingface.co/cerspense/zeroscope_v2_XL>`__

We will use the first one.
**Table of contents**:

- `Install and import required packages <#install-and-import-required-packages>`__
- `Load the model <#load-the-model>`__
- `Convert the model <#convert-the-model>`__

  - `Define the conversion function <#define-the-conversion-function>`__
  - `UNet <#unet>`__
  - `VAE <#vae>`__
  - `Text encoder <#text-encoder>`__

- `Build a pipeline <#build-a-pipeline>`__
- `Inference with OpenVINO <#inference-with-openvino>`__

  - `Select inference device <#select-inference-device>`__
  - `Define a prompt <#define-a-prompt>`__
  - `Video generation <#video-generation>`__

- `Interactive demo <#interactive-demo>`__

.. important::

   This tutorial requires at least 24GB of free memory to generate a video with
   a frame size of 432x240 and 16 frames. Increasing either of these values will
   require more memory and take more time.

Install and import required packages `⇑ <#top>`__
###############################################################################################################################

To work with the text-to-video synthesis model, we will use Hugging Face's
`Diffusers <https://github.com/huggingface/diffusers>`__ library. It
provides an already pretrained model from ``cerspense``.

.. code:: ipython3

    !pip install -q "diffusers[torch]>=0.15.0" transformers "openvino==2023.1.0.dev20230811" numpy gradio

.. code:: ipython3

    import gc
    from pathlib import Path
    from typing import Optional, Union, List, Callable
    import base64
    import tempfile
    import warnings

    import diffusers
    import transformers
    import numpy as np
    import IPython
    import ipywidgets as widgets
    import torch
    import PIL
    import gradio as gr

    import openvino as ov


.. parsed-literal::

    2023-08-16 21:15:40.145184: I tensorflow/core/util/port.cc:110] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
    2023-08-16 21:15:40.146998: I tensorflow/tsl/cuda/cudart_stub.cc:28] Could not find cuda drivers on your machine, GPU will not be used.
    2023-08-16 21:15:40.179214: I tensorflow/tsl/cuda/cudart_stub.cc:28] Could not find cuda drivers on your machine, GPU will not be used.
    2023-08-16 21:15:40.180050: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
    To enable the following instructions: AVX2 AVX512F AVX512_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
    2023-08-16 21:15:40.750499: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT

Original 576x320 inference requires a lot of RAM (>100GB), so let's run
our example on a smaller frame size, keeping the same aspect ratio. Try
reducing the values below to reduce memory consumption.

.. code:: ipython3

    WIDTH = 432  # must be divisible by 8
    HEIGHT = 240  # must be divisible by 8
    NUM_FRAMES = 16

Load the model `⇑ <#top>`__
###############################################################################################################################

The model is loaded from Hugging Face using the ``.from_pretrained`` method
of ``diffusers.DiffusionPipeline``.

.. code:: ipython3

    pipe = diffusers.DiffusionPipeline.from_pretrained('cerspense/zeroscope_v2_576w')


.. parsed-literal::

    vae/diffusion_pytorch_model.safetensors not found


.. parsed-literal::

    Loading pipeline components...:   0%|          | 0/5 [00:00<?, ?it/s]


.. code:: ipython3

    unet = pipe.unet
    unet.eval()
    vae = pipe.vae
    vae.eval()
    text_encoder = pipe.text_encoder
    text_encoder.eval()
    tokenizer = pipe.tokenizer
    scheduler = pipe.scheduler
    vae_scale_factor = pipe.vae_scale_factor
    unet_in_channels = pipe.unet.config.in_channels
    sample_width = WIDTH // vae_scale_factor
    sample_height = HEIGHT // vae_scale_factor
    del pipe
    gc.collect();

Convert the model `⇑ <#top>`__
###############################################################################################################################

The architecture for generating videos from text comprises three
distinct sub-networks: one for extracting text features, another for
translating text features into the video latent space using a diffusion
model, and a final one for mapping the video latent space to the visual
space. The collective parameters of the entire model amount to
approximately 1.7 billion. It is capable of processing English input. The
diffusion model is built upon the Unet3D model and achieves video
generation by iteratively denoising a starting point of pure Gaussian
noise video.

.. image:: 253-zeroscope-text2video-with-output_files/253-zeroscope-text2video-with-output_01_02.png

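As an illustrative sketch only (it is not executed as part of this tutorial),
the denoising idea can be written with the PyTorch ``unet``, ``scheduler``,
``tokenizer`` and ``text_encoder`` objects loaded above; classifier-free
guidance is omitted for brevity, and the full OpenVINO-based implementation
is in the `Build a pipeline <#build-a-pipeline>`__ section.

.. code:: ipython3

    # Illustrative sketch of the diffusion process only - the complete pipeline
    # (with classifier-free guidance) is implemented in OVTextToVideoSDPipeline below.
    with torch.no_grad():
        ids = tokenizer("A panda eating bamboo on a rock.", padding="max_length",
                        max_length=tokenizer.model_max_length, truncation=True,
                        return_tensors="pt").input_ids
        cond = text_encoder(ids)[0]                       # text features, shape (1, 77, 1024)
        latents = torch.randn(1, unet_in_channels, NUM_FRAMES,
                              sample_height, sample_width)  # pure Gaussian noise "video"
        scheduler.set_timesteps(25)
        latents = latents * scheduler.init_noise_sigma
        for t in scheduler.timesteps:
            model_input = scheduler.scale_model_input(latents, t)
            noise_pred = unet(model_input, t, encoder_hidden_states=cond).sample
            latents = scheduler.step(noise_pred, t, latents).prev_sample
        # decoding `latents` with the VAE decoder maps the video latent space to pixels
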
Define the conversion function `⇑ <#top>`__
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Model components are PyTorch modules that can be converted with the
``ov.convert_model`` function directly. We also use the ``ov.save_model``
function to serialize the result of the conversion.

.. code:: ipython3

    warnings.filterwarnings("ignore", category=torch.jit.TracerWarning)

.. code:: ipython3

    def convert(model: torch.nn.Module, xml_path: str, **convert_kwargs) -> Path:
        xml_path = Path(xml_path)
        if not xml_path.exists():
            xml_path.parent.mkdir(parents=True, exist_ok=True)
            with torch.no_grad():
                converted_model = ov.convert_model(model, **convert_kwargs)
            ov.save_model(converted_model, xml_path)
            del converted_model
            gc.collect()
            torch._C._jit_clear_class_registry()
            torch.jit._recursive.concrete_type_store = torch.jit._recursive.ConcreteTypeStore()
            torch.jit._state._clear_class_state()
        return xml_path

UNet `⇑ <#top>`__
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

The main component of the text-to-video generation pipeline is a conditional
3D UNet model that takes a noisy sample, conditional state, and a timestep,
and returns a sample-shaped output.

.. code:: ipython3

    unet_xml_path = convert(
        unet,
        "models/unet.xml",
        example_input={
            "sample": torch.randn(2, 4, 2, 32, 32),
            "timestep": torch.tensor(1),
            "encoder_hidden_states": torch.randn(2, 77, 1024),
        },
        input=[
            ("sample", (2, 4, NUM_FRAMES, sample_height, sample_width)),
            ("timestep", ()),
            ("encoder_hidden_states", (2, 77, 1024)),
        ],
    )
    del unet
    gc.collect();


.. parsed-literal::

    WARNING:tensorflow:Please fix your imports. Module tensorflow.python.training.tracking.base has been moved to tensorflow.python.trackable.base. The old module will be deleted in version 2.11.


.. parsed-literal::

    [ WARNING ] Please fix your imports. Module %s has been moved to %s. The old module will be deleted in version %s.

VAE `⇑ <#top>`__
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

The variational autoencoder (VAE) uses the UNet output to decode latents into
visual representations. Our VAE model has a KL loss for encoding images
into latents and decoding latent representations back into images. For
inference, we need only the decoder part.

.. code:: ipython3

    class VaeDecoderWrapper(torch.nn.Module):
        def __init__(self, vae):
            super().__init__()
            self.vae = vae

        def forward(self, z: torch.FloatTensor):
            return self.vae.decode(z)

.. code:: ipython3

    vae_decoder_xml_path = convert(
        VaeDecoderWrapper(vae),
        "models/vae.xml",
        example_input=torch.randn(2, 4, 32, 32),
        input=((NUM_FRAMES, 4, sample_height, sample_width)),
    )
    del vae
    gc.collect();

Text encoder `⇑ <#top>`__
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

The text encoder is used to encode the input prompt into a tensor. The default
tensor length is 77.
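The 77 comes from the CLIP tokenizer's ``model_max_length``: prompts are
padded or truncated to exactly 77 token IDs, which matches the static
``(1, 77)`` shape and ``i64`` type declared for conversion below. For
illustration:

.. code:: ipython3

    # Illustration: any prompt becomes a (1, 77) int64 tensor of token ids.
    ids = tokenizer(
        "A panda eating bamboo on a rock.",
        padding="max_length",
        max_length=tokenizer.model_max_length,  # 77 for the CLIP tokenizer
        truncation=True,
        return_tensors="pt",
    ).input_ids
    print(ids.shape, ids.dtype)  # torch.Size([1, 77]) torch.int64
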
.. code:: ipython3

    text_encoder_xml = convert(
        text_encoder,
        "models/text_encoder.xml",
        example_input=torch.ones(1, 77, dtype=torch.int64),
        input=((1, 77), (ov.Type.i64,)),
    )
    del text_encoder
    gc.collect();

Build a pipeline `⇑ <#top>`__
###############################################################################################################################

.. code:: ipython3

    def tensor2vid(video: torch.Tensor, mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5]) -> List[np.ndarray]:
        # This code is copied from https://github.com/modelscope/modelscope/blob/1509fdb973e5871f37148a4b5e5964cafd43e64d/modelscope/pipelines/multi_modal/text_to_video_synthesis_pipeline.py#L78
        # reshape to ncfhw
        mean = torch.tensor(mean, device=video.device).reshape(1, -1, 1, 1, 1)
        std = torch.tensor(std, device=video.device).reshape(1, -1, 1, 1, 1)
        # unnormalize back to [0,1]
        video = video.mul_(std).add_(mean)
        video.clamp_(0, 1)
        # prepare the final outputs
        i, c, f, h, w = video.shape
        images = video.permute(2, 3, 0, 4, 1).reshape(
            f, h, i * w, c
        )  # 1st (frames, h, batch_size, w, c) 2nd (frames, h, batch_size * w, c)
        images = images.unbind(dim=0)  # prepare a list of individual (consecutive frames)
        images = [(image.cpu().numpy() * 255).astype("uint8") for image in images]  # f h w c
        return images

.. code:: ipython3

    class OVTextToVideoSDPipeline(diffusers.DiffusionPipeline):
        def __init__(
            self,
            vae_decoder: ov.CompiledModel,
            text_encoder: ov.CompiledModel,
            tokenizer: transformers.CLIPTokenizer,
            unet: ov.CompiledModel,
            scheduler: diffusers.schedulers.DDIMScheduler,
        ):
            super().__init__()

            self.vae_decoder = vae_decoder
            self.text_encoder = text_encoder
            self.tokenizer = tokenizer
            self.unet = unet
            self.scheduler = scheduler
            self.vae_scale_factor = vae_scale_factor
            self.unet_in_channels = unet_in_channels
            self.width = WIDTH
            self.height = HEIGHT
            self.num_frames = NUM_FRAMES

        def __call__(
            self,
            prompt: Union[str, List[str]] = None,
            num_inference_steps: int = 50,
            guidance_scale: float = 9.0,
            negative_prompt: Optional[Union[str, List[str]]] = None,
            eta: float = 0.0,
            generator: Optional[Union[torch.Generator, List[torch.Generator]]] = None,
            latents: Optional[torch.FloatTensor] = None,
            prompt_embeds: Optional[torch.FloatTensor] = None,
            negative_prompt_embeds: Optional[torch.FloatTensor] = None,
            output_type: Optional[str] = "np",
            return_dict: bool = True,
            callback: Optional[Callable[[int, int, torch.FloatTensor], None]] = None,
            callback_steps: int = 1,
        ):
            r"""
            Function invoked when calling the pipeline for generation.

            Args:
                prompt (`str` or `List[str]`, *optional*):
                    The prompt or prompts to guide the video generation. If not defined, one has to pass `prompt_embeds`
                    instead.
                num_inference_steps (`int`, *optional*, defaults to 50):
                    The number of denoising steps. More denoising steps usually lead to higher quality videos at the
                    expense of slower inference.
                guidance_scale (`float`, *optional*, defaults to 9.0):
                    Guidance scale as defined in [Classifier-Free Diffusion Guidance](https://arxiv.org/abs/2207.12598).
                    `guidance_scale` is defined as `w` of equation 2. of [Imagen
                    Paper](https://arxiv.org/pdf/2205.11487.pdf). Guidance scale is enabled by setting `guidance_scale >
                    1`. Higher guidance scale encourages generating videos that are closely linked to the text `prompt`,
                    usually at the expense of lower video quality.
                negative_prompt (`str` or `List[str]`, *optional*):
                    The prompt or prompts not to guide the video generation. If not defined, one has to pass
                    `negative_prompt_embeds` instead. Ignored when not using guidance (i.e., ignored if `guidance_scale` is
                    less than `1`).
                eta (`float`, *optional*, defaults to 0.0):
                    Corresponds to parameter eta (η) in the DDIM paper: https://arxiv.org/abs/2010.02502. Only applies to
                    [`schedulers.DDIMScheduler`], will be ignored for others.
                generator (`torch.Generator` or `List[torch.Generator]`, *optional*):
                    One or a list of [torch generator(s)](https://pytorch.org/docs/stable/generated/torch.Generator.html)
                    to make generation deterministic.
                latents (`torch.FloatTensor`, *optional*):
                    Pre-generated noisy latents, sampled from a Gaussian distribution, to be used as inputs for video
                    generation. Can be used to tweak the same generation with different prompts. If not provided, a latents
                    tensor will be generated by sampling using the supplied random `generator`. Latents should be of shape
                    `(batch_size, num_channel, num_frames, height, width)`.
                prompt_embeds (`torch.FloatTensor`, *optional*):
                    Pre-generated text embeddings. Can be used to easily tweak text inputs, *e.g.* prompt weighting. If not
                    provided, text embeddings will be generated from `prompt` input argument.
                negative_prompt_embeds (`torch.FloatTensor`, *optional*):
                    Pre-generated negative text embeddings. Can be used to easily tweak text inputs, *e.g.* prompt
                    weighting. If not provided, negative_prompt_embeds will be generated from `negative_prompt` input
                    argument.
                output_type (`str`, *optional*, defaults to `"np"`):
                    The output format of the generated video. Choose between `torch.FloatTensor` or `np.array`.
                return_dict (`bool`, *optional*, defaults to `True`):
                    Whether or not to return a [`~pipelines.stable_diffusion.TextToVideoSDPipelineOutput`] instead of a
                    plain tuple.
                callback (`Callable`, *optional*):
                    A function that will be called every `callback_steps` steps during inference. The function will be
                    called with the following arguments: `callback(step: int, timestep: int, latents: torch.FloatTensor)`.
                callback_steps (`int`, *optional*, defaults to 1):
                    The frequency at which the `callback` function will be called. If not specified, the callback will be
                    called at every step.

            Returns:
                `List[np.ndarray]`: generated video frames
            """

            num_images_per_prompt = 1

            # 1. Check inputs. Raise error if not correct
            self.check_inputs(
                prompt,
                callback_steps,
                negative_prompt,
                prompt_embeds,
                negative_prompt_embeds,
            )

            # 2. Define call parameters
            if prompt is not None and isinstance(prompt, str):
                batch_size = 1
            elif prompt is not None and isinstance(prompt, list):
                batch_size = len(prompt)
            else:
                batch_size = prompt_embeds.shape[0]

            # here `guidance_scale` is defined analog to the guidance weight `w` of equation (2)
            # of the Imagen paper: https://arxiv.org/pdf/2205.11487.pdf . `guidance_scale = 1`
            # corresponds to doing no classifier free guidance.
            do_classifier_free_guidance = guidance_scale > 1.0

            # 3. Encode input prompt
            prompt_embeds = self._encode_prompt(
                prompt,
                num_images_per_prompt,
                do_classifier_free_guidance,
                negative_prompt,
                prompt_embeds=prompt_embeds,
                negative_prompt_embeds=negative_prompt_embeds,
            )

            # 4. Prepare timesteps
            self.scheduler.set_timesteps(num_inference_steps)
            timesteps = self.scheduler.timesteps

            # 5. Prepare latent variables
            num_channels_latents = self.unet_in_channels
            latents = self.prepare_latents(
                batch_size * num_images_per_prompt,
                num_channels_latents,
                prompt_embeds.dtype,
                generator,
                latents,
            )

            # 6. Prepare extra step kwargs. TODO: Logic should ideally just be moved out of the pipeline
            extra_step_kwargs = {"generator": generator, "eta": eta}

            # 7. Denoising loop
            num_warmup_steps = len(timesteps) - num_inference_steps * self.scheduler.order
            with self.progress_bar(total=num_inference_steps) as progress_bar:
                for i, t in enumerate(timesteps):
                    # expand the latents if we are doing classifier free guidance
                    latent_model_input = (
                        torch.cat([latents] * 2) if do_classifier_free_guidance else latents
                    )
                    latent_model_input = self.scheduler.scale_model_input(latent_model_input, t)

                    # predict the noise residual
                    noise_pred = self.unet(
                        {
                            "sample": latent_model_input,
                            "timestep": t,
                            "encoder_hidden_states": prompt_embeds,
                        }
                    )[0]
                    noise_pred = torch.tensor(noise_pred)

                    # perform guidance
                    if do_classifier_free_guidance:
                        noise_pred_uncond, noise_pred_text = noise_pred.chunk(2)
                        noise_pred = noise_pred_uncond + guidance_scale * (
                            noise_pred_text - noise_pred_uncond
                        )

                    # reshape latents
                    bsz, channel, frames, width, height = latents.shape
                    latents = latents.permute(0, 2, 1, 3, 4).reshape(
                        bsz * frames, channel, width, height
                    )
                    noise_pred = noise_pred.permute(0, 2, 1, 3, 4).reshape(
                        bsz * frames, channel, width, height
                    )

                    # compute the previous noisy sample x_t -> x_t-1
                    latents = self.scheduler.step(
                        noise_pred, t, latents, **extra_step_kwargs
                    ).prev_sample

                    # reshape latents back
                    latents = (
                        latents[None, :]
                        .reshape(bsz, frames, channel, width, height)
                        .permute(0, 2, 1, 3, 4)
                    )

                    # call the callback, if provided
                    if i == len(timesteps) - 1 or (
                        (i + 1) > num_warmup_steps and (i + 1) % self.scheduler.order == 0
                    ):
                        progress_bar.update()
                        if callback is not None and i % callback_steps == 0:
                            callback(i, t, latents)

            video_tensor = self.decode_latents(latents)

            if output_type == "pt":
                video = video_tensor
            else:
                video = tensor2vid(video_tensor)

            if not return_dict:
                return (video,)

            return {"frames": video}

        # Copied from diffusers.pipelines.stable_diffusion.pipeline_stable_diffusion.StableDiffusionPipeline._encode_prompt
        def _encode_prompt(
            self,
            prompt,
            num_images_per_prompt,
            do_classifier_free_guidance,
            negative_prompt=None,
            prompt_embeds: Optional[torch.FloatTensor] = None,
            negative_prompt_embeds: Optional[torch.FloatTensor] = None,
        ):
            r"""
            Encodes the prompt into text encoder hidden states.

            Args:
                prompt (`str` or `List[str]`, *optional*):
                    prompt to be encoded
                num_images_per_prompt (`int`):
                    number of images that should be generated per prompt
                do_classifier_free_guidance (`bool`):
                    whether to use classifier free guidance or not
                negative_prompt (`str` or `List[str]`, *optional*):
                    The prompt or prompts not to guide the image generation. If not defined, one has to pass
                    `negative_prompt_embeds` instead. Ignored when not using guidance (i.e., ignored if `guidance_scale` is
                    less than `1`).
                prompt_embeds (`torch.FloatTensor`, *optional*):
                    Pre-generated text embeddings. Can be used to easily tweak text inputs, *e.g.* prompt weighting. If not
                    provided, text embeddings will be generated from `prompt` input argument.
                negative_prompt_embeds (`torch.FloatTensor`, *optional*):
                    Pre-generated negative text embeddings. Can be used to easily tweak text inputs, *e.g.* prompt
                    weighting. If not provided, negative_prompt_embeds will be generated from `negative_prompt` input
                    argument.
            """
            if prompt is not None and isinstance(prompt, str):
                batch_size = 1
            elif prompt is not None and isinstance(prompt, list):
                batch_size = len(prompt)
            else:
                batch_size = prompt_embeds.shape[0]

            if prompt_embeds is None:
                text_inputs = self.tokenizer(
                    prompt,
                    padding="max_length",
                    max_length=self.tokenizer.model_max_length,
                    truncation=True,
                    return_tensors="pt",
                )
                text_input_ids = text_inputs.input_ids
                untruncated_ids = self.tokenizer(
                    prompt, padding="longest", return_tensors="pt"
                ).input_ids

                if untruncated_ids.shape[-1] >= text_input_ids.shape[-1] and not torch.equal(
                    text_input_ids, untruncated_ids
                ):
                    removed_text = self.tokenizer.batch_decode(
                        untruncated_ids[:, self.tokenizer.model_max_length - 1 : -1]
                    )
                    print(
                        "The following part of your input was truncated because CLIP can only handle sequences up to"
                        f" {self.tokenizer.model_max_length} tokens: {removed_text}"
                    )

                prompt_embeds = self.text_encoder(text_input_ids)
                prompt_embeds = prompt_embeds[0]
                prompt_embeds = torch.tensor(prompt_embeds)

            bs_embed, seq_len, _ = prompt_embeds.shape
            # duplicate text embeddings for each generation per prompt, using mps friendly method
            prompt_embeds = prompt_embeds.repeat(1, num_images_per_prompt, 1)
            prompt_embeds = prompt_embeds.view(bs_embed * num_images_per_prompt, seq_len, -1)

            # get unconditional embeddings for classifier free guidance
            if do_classifier_free_guidance and negative_prompt_embeds is None:
                uncond_tokens: List[str]
                if negative_prompt is None:
                    uncond_tokens = [""] * batch_size
                elif type(prompt) is not type(negative_prompt):
                    raise TypeError(
                        f"`negative_prompt` should be the same type as `prompt`, but got {type(negative_prompt)} !="
                        f" {type(prompt)}."
                    )
                elif isinstance(negative_prompt, str):
                    uncond_tokens = [negative_prompt]
                elif batch_size != len(negative_prompt):
                    raise ValueError(
                        f"`negative_prompt`: {negative_prompt} has batch size {len(negative_prompt)}, but `prompt`:"
                        f" {prompt} has batch size {batch_size}. Please make sure that passed `negative_prompt` matches"
                        " the batch size of `prompt`."
                    )
                else:
                    uncond_tokens = negative_prompt

                max_length = prompt_embeds.shape[1]
                uncond_input = self.tokenizer(
                    uncond_tokens,
                    padding="max_length",
                    max_length=max_length,
                    truncation=True,
                    return_tensors="pt",
                )

                negative_prompt_embeds = self.text_encoder(uncond_input.input_ids)
                negative_prompt_embeds = negative_prompt_embeds[0]
                negative_prompt_embeds = torch.tensor(negative_prompt_embeds)

            if do_classifier_free_guidance:
                # duplicate unconditional embeddings for each generation per prompt, using mps friendly method
                seq_len = negative_prompt_embeds.shape[1]

                negative_prompt_embeds = negative_prompt_embeds.repeat(1, num_images_per_prompt, 1)
                negative_prompt_embeds = negative_prompt_embeds.view(
                    batch_size * num_images_per_prompt, seq_len, -1
                )

                # For classifier free guidance, we need to do two forward passes.
                # Here we concatenate the unconditional and text embeddings into a single batch
                # to avoid doing two forward passes
                prompt_embeds = torch.cat([negative_prompt_embeds, prompt_embeds])

            return prompt_embeds

        def prepare_latents(
            self,
            batch_size,
            num_channels_latents,
            dtype,
            generator,
            latents=None,
        ):
            shape = (
                batch_size,
                num_channels_latents,
                self.num_frames,
                self.height // self.vae_scale_factor,
                self.width // self.vae_scale_factor,
            )
            if isinstance(generator, list) and len(generator) != batch_size:
                raise ValueError(
                    f"You have passed a list of generators of length {len(generator)}, but requested an effective batch"
                    f" size of {batch_size}. Make sure the batch size matches the length of the generators."
                )

            if latents is None:
                latents = diffusers.utils.randn_tensor(shape, generator=generator, dtype=dtype)

            # scale the initial noise by the standard deviation required by the scheduler
            latents = latents * self.scheduler.init_noise_sigma
            return latents

        def check_inputs(
            self,
            prompt,
            callback_steps,
            negative_prompt=None,
            prompt_embeds=None,
            negative_prompt_embeds=None,
        ):
            if self.height % 8 != 0 or self.width % 8 != 0:
                raise ValueError(
                    f"`height` and `width` have to be divisible by 8 but are {self.height} and {self.width}."
                )

            if (callback_steps is None) or (
                callback_steps is not None
                and (not isinstance(callback_steps, int) or callback_steps <= 0)
            ):
                raise ValueError(
                    f"`callback_steps` has to be a positive integer but is {callback_steps} of type"
                    f" {type(callback_steps)}."
                )

            if prompt is not None and prompt_embeds is not None:
                raise ValueError(
                    f"Cannot forward both `prompt`: {prompt} and `prompt_embeds`: {prompt_embeds}. Please make sure to"
                    " only forward one of the two."
                )
            elif prompt is None and prompt_embeds is None:
                raise ValueError(
                    "Provide either `prompt` or `prompt_embeds`. Cannot leave both `prompt` and `prompt_embeds` undefined."
                )
            elif prompt is not None and (not isinstance(prompt, str) and not isinstance(prompt, list)):
                raise ValueError(f"`prompt` has to be of type `str` or `list` but is {type(prompt)}")

            if negative_prompt is not None and negative_prompt_embeds is not None:
                raise ValueError(
                    f"Cannot forward both `negative_prompt`: {negative_prompt} and `negative_prompt_embeds`:"
                    f" {negative_prompt_embeds}. Please make sure to only forward one of the two."
                )

            if prompt_embeds is not None and negative_prompt_embeds is not None:
                if prompt_embeds.shape != negative_prompt_embeds.shape:
                    raise ValueError(
                        "`prompt_embeds` and `negative_prompt_embeds` must have the same shape when passed directly, but"
                        f" got: `prompt_embeds` {prompt_embeds.shape} != `negative_prompt_embeds`"
                        f" {negative_prompt_embeds.shape}."
                    )

        def decode_latents(self, latents):
            scale_factor = 0.18215
            latents = 1 / scale_factor * latents

            batch_size, channels, num_frames, height, width = latents.shape
            latents = latents.permute(0, 2, 1, 3, 4).reshape(
                batch_size * num_frames, channels, height, width
            )
            image = self.vae_decoder(latents)[0]
            image = torch.tensor(image)
            video = (
                image[None, :]
                .reshape(
                    (
                        batch_size,
                        num_frames,
                        -1,
                    )
                    + image.shape[2:]
                )
                .permute(0, 2, 1, 3, 4)
            )
            # we always cast to float32 as this does not cause significant overhead and is compatible with bfloat16
            video = video.float()
            return video

Inference with OpenVINO `⇑ <#top>`__
###############################################################################################################################

.. code:: ipython3

    core = ov.Core()

Select inference device `⇑ <#top>`__
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Select the device from the dropdown list for running inference using OpenVINO.

.. code:: ipython3

    device = widgets.Dropdown(
        options=core.available_devices + ["AUTO"],
        value='AUTO',
        description='Device:',
        disabled=False,
    )

    device


.. parsed-literal::

    Dropdown(description='Device:', index=4, options=('CPU', 'GPU.0', 'GPU.1', 'GPU.2', 'AUTO'), value='AUTO')


.. code:: ipython3

    %%time
    ov_unet = core.compile_model(unet_xml_path, device_name=device.value)


.. parsed-literal::

    CPU times: user 14.1 s, sys: 5.62 s, total: 19.7 s
    Wall time: 10.6 s


.. code:: ipython3

    %%time
    ov_vae_decoder = core.compile_model(vae_decoder_xml_path, device_name=device.value)


.. parsed-literal::

    CPU times: user 456 ms, sys: 320 ms, total: 776 ms
    Wall time: 328 ms


.. code:: ipython3

    %%time
    ov_text_encoder = core.compile_model(text_encoder_xml, device_name=device.value)


.. parsed-literal::

    CPU times: user 1.78 s, sys: 1.44 s, total: 3.22 s
    Wall time: 1.13 s


Here we replace the pipeline parts with versions converted to OpenVINO
IR and compiled for the selected device. Note that we use the original
pipeline tokenizer and scheduler.

.. code:: ipython3

    ov_pipe = OVTextToVideoSDPipeline(ov_vae_decoder, ov_text_encoder, tokenizer, ov_unet, scheduler)
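Note that a compiled OpenVINO model is called like a function and returns
NumPy arrays (indexable by output number), which is why the pipeline above
wraps the UNet and text encoder outputs back into ``torch.tensor``. A quick
illustration with the compiled text encoder:

.. code:: ipython3

    # Illustration: compiled models accept array-like inputs and return NumPy outputs.
    dummy_ids = np.ones((1, 77), dtype=np.int64)   # any (1, 77) int64 token ids
    embeddings = ov_text_encoder(dummy_ids)[0]     # first model output as a NumPy array
    print(type(embeddings), embeddings.shape)      # <class 'numpy.ndarray'> (1, 77, 1024)
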
Define a prompt `⇑ <#top>`__
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

.. code:: ipython3

    prompt = "A panda eating bamboo on a rock."

Let's generate a video for our prompt. For the full list of arguments, see
the ``__call__`` function definition of the ``OVTextToVideoSDPipeline`` class in
the `Build a pipeline <#build-a-pipeline>`__ section.

Video generation `⇑ <#top>`__
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

.. code:: ipython3

    frames = ov_pipe(prompt, num_inference_steps=25)['frames']


.. parsed-literal::

      0%|          | 0/25 [00:00<?, ?it/s]


.. code:: ipython3

    images = [PIL.Image.fromarray(frame) for frame in frames]
    images[0].save("output.gif", save_all=True, append_images=images[1:], duration=125, loop=0)
    with open("output.gif", "rb") as gif_file:
        b64 = f'data:image/gif;base64,{base64.b64encode(gif_file.read()).decode()}'
    IPython.display.HTML(f"<img src=\"{b64}\" />")


.. image:: 253-zeroscope-text2video-with-output_files/253-zeroscope-text2video-with-output_01_03.gif

Interactive demo `⇑ <#top>`__
###############################################################################################################################

.. code:: ipython3

    def generate(
        prompt, seed, num_inference_steps, _=gr.Progress(track_tqdm=True)
    ):
        generator = torch.Generator().manual_seed(seed)
        frames = ov_pipe(
            prompt,
            num_inference_steps=num_inference_steps,
            generator=generator,
        )["frames"]
        out_file = tempfile.NamedTemporaryFile(suffix=".gif", delete=False)
        images = [PIL.Image.fromarray(frame) for frame in frames]
        images[0].save(
            out_file, save_all=True, append_images=images[1:], duration=125, loop=0
        )
        return out_file.name


    demo = gr.Interface(
        generate,
        [
            gr.Textbox(label="Prompt"),
            gr.Slider(0, 1000000, value=42, label="Seed", step=1),
            gr.Slider(10, 50, value=25, label="Number of inference steps", step=1),
        ],
        gr.Image(label="Result"),
        examples=[
            ["An astronaut riding a horse.", 0, 25],
            ["A panda eating bamboo on a rock.", 0, 25],
            ["Spiderman is surfing.", 0, 25],
        ],
        allow_flagging="never"
    )

    try:
        demo.queue().launch(debug=True)
    except Exception:
        demo.queue().launch(share=True, debug=True)

    # if you are launching remotely, specify server_name and server_port
    # demo.launch(server_name='your server name', server_port='server port in int')
    # Read more in the docs: https://gradio.app/docs/
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:f9b3abdf1818a885d159961285a1ef96a2c0c0c99d26eac96435b7813e28198d
size 41341

@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:c0786f897470a25d935d1f5e096132f086c7f96f42d441102f598828d6d39452
size 1366066
@@ -154,115 +154,117 @@ Demos that demonstrate inference on a particular model.

.. dropdown:: Explore more notebooks below.

+-------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------+
| Notebook | Description | Preview |
+===============================================================================================================================+============================================================================================================================================+====================================================+
| `201-vision-monodepth <notebooks/201-vision-monodepth-with-output.html>`__ |br| |n201| |br| |c201| | Monocular depth estimation with images and video. | |n201-img1| |
+-------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------+
| `202-vision-superresolution-image <notebooks/202-vision-superresolution-image-with-output.html>`__ |br| |n202i| |br| |c202i| | Upscale raw images with a super resolution model. | |n202i-img1| → |n202i-img2| |
+-------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------+
| `202-vision-superresolution-video <notebooks/202-vision-superresolution-video-with-output.html>`__ |br| |n202v| |br| |c202v| | Turn 360p into 1080p video using a super resolution model. | |n202v-img1| → |n202v-img2| |
+-------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------+
| `203-meter-reader <notebooks/203-meter-reader-with-output.html>`__ |br| |n203| | PaddlePaddle pre-trained models to read industrial meter's value. | |n203-img1| |
+-------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------+
| `204-segmenter-semantic-segmentation <notebooks/204-segmenter-semantic-segmentation-with-output.html>`__ |br| |c204| | Semantic segmentation with OpenVINO™ using Segmenter. | |n204-img1| |
+-------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------+
| `206-vision-paddlegan-anime <notebooks/206-vision-paddlegan-anime-with-output.html>`__ | Turn an image into anime using a GAN. | |n206-img1| → |n206-img2| |
+-------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------+
| `207-vision-paddlegan-superresolution <notebooks/207-vision-paddlegan-superresolution-with-output.html>`__ | Upscale small images with superresolution using a PaddleGAN model. | |n207-img1| → |n207-img2| |
+-------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------+
| `208-optical-character-recognition <notebooks/208-optical-character-recognition-with-output.html>`__ | Annotate text on images using text recognition resnet. | |n208-img1| |
+-------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------+
| `212-pyannote-speaker-diarization <notebooks/212-pyannote-speaker-diarization-with-output.html>`__ | Run inference on speaker diarization pipeline. | |n212-img1| |
+-------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------+
| `210-slowfast-video-recognition <notebooks/210-slowfast-video-recognition-with-output.html>`__ |br| |n210| | Video Recognition using SlowFast and OpenVINO™ | |n210-img1| |
+-------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------+
| `213-question-answering <notebooks/213-question-answering-with-output.html>`__ |br| |n213| | Answer your questions basing on a context. | |n213-img1| |
+-------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------+
| `214-grammar-correction <notebooks/214-grammar-correction-with-output.html>`__ | Grammatical error correction with OpenVINO. | |
+-------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------+
| `216-attention-center <notebooks/216-attention-center-with-output.html>`__ | The attention center model with OpenVINO™ | |
+-------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------+
| `217-vision-deblur <notebooks/217-vision-deblur-with-output.html>`__ |br| |n217| | Deblur images with DeblurGAN-v2. | |n217-img1| |
+-------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------+
| `219-knowledge-graphs-conve <notebooks/219-knowledge-graphs-conve-with-output.html>`__ |br| |n219| | Optimize the knowledge graph embeddings model (ConvE) with OpenVINO. | |
+-------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------+
| `220-cross-lingual-books-alignment <notebooks/220-cross-lingual-books-alignment-with-output.html>`__ |br| |n220| |br| |c220| | Cross-lingual Books Alignment With Transformers and OpenVINO™ | |n220-img1| |
|
||||||
+-------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------+
|
+-------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------+
|
||||||
| `221-machine-translation <notebooks/221-machine-translation-with-output.html>`__ |br| |n221| |br| |c221| | Real-time translation from English to German. | |
|
| `221-machine-translation <notebooks/221-machine-translation-with-output.html>`__ |br| |n221| |br| |c221| | Real-time translation from English to German. | |
|
||||||
+-------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------+
|
+-------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------+
|
||||||
| `222-vision-image-colorization <notebooks/222-vision-image-colorization-with-output.html>`__ |br| |n222| | Use pre-trained models to colorize black & white images using OpenVINO. | |n222-img1| |
|
| `222-vision-image-colorization <notebooks/222-vision-image-colorization-with-output.html>`__ |br| |n222| | Use pre-trained models to colorize black & white images using OpenVINO. | |n222-img1| |
|
||||||
+-------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------+
|
+-------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------+
|
||||||
| `223-text-prediction <notebooks/223-text-prediction-with-output.html>`__ |br| |c223| | Use pre-trained models to perform text prediction on an input sequence. | |n223-img1| |
|
| `223-text-prediction <notebooks/223-text-prediction-with-output.html>`__ |br| |c223| | Use pre-trained models to perform text prediction on an input sequence. | |n223-img1| |
|
||||||
+-------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------+
|
+-------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------+
|
||||||
| `224-3D-segmentation-point-clouds <notebooks/224-3D-segmentation-point-clouds-with-output.html>`__ | Process point cloud data and run 3D Part Segmentation with OpenVINO. | |n224-img1| |
|
| `224-3D-segmentation-point-clouds <notebooks/224-3D-segmentation-point-clouds-with-output.html>`__ | Process point cloud data and run 3D Part Segmentation with OpenVINO. | |n224-img1| |
|
||||||
+-------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------+
|
+-------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------+
|
||||||
| `225-stable-diffusion-text-to-image <notebooks/225-stable-diffusion-text-to-image-with-output.html>`__ | Text-to-image generation with Stable Diffusion method. | |n225-img1| |
|
| `225-stable-diffusion-text-to-image <notebooks/225-stable-diffusion-text-to-image-with-output.html>`__ | Text-to-image generation with Stable Diffusion method. | |n225-img1| |
|
||||||
+-------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------+
|
+-------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------+
|
||||||
| `226-yolov7-optimization <notebooks/226-yolov7-optimization-with-output.html>`__ | Optimize YOLOv7, using NNCF PTQ API. | |n226-img1| |
|
| `226-yolov7-optimization <notebooks/226-yolov7-optimization-with-output.html>`__ | Optimize YOLOv7, using NNCF PTQ API. | |n226-img1| |
|
||||||
+-------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------+
|
+-------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------+
|
||||||
| `227-whisper-subtitles-generation <notebooks/227-whisper-subtitles-generation-with-output.html>`__ |br| |c227| | Generate subtitles for video with OpenAI Whisper and OpenVINO. | |n227-img1| |
|
| `227-whisper-subtitles-generation <notebooks/227-whisper-subtitles-generation-with-output.html>`__ |br| |c227| | Generate subtitles for video with OpenAI Whisper and OpenVINO. | |n227-img1| |
|
||||||
+-------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------+
|
+-------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------+
|
||||||
| `228-clip-zero-shot-convert <notebooks/228-clip-zero-shot-convert-with-output.html>`__ | Zero-shot Image Classification with OpenAI CLIP and OpenVINO™ | |n228-img1| |
|
| `228-clip-zero-shot-convert <notebooks/228-clip-zero-shot-convert-with-output.html>`__ | Zero-shot Image Classification with OpenAI CLIP and OpenVINO™ | |n228-img1| |
|
||||||
+-------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------+
|
+-------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------+
|
||||||
| `228-clip-zero-shot-quantize <notebooks/228-clip-zero-shot-quantize-with-output.html>`__ | Post-Training Quantization of OpenAI CLIP model with NNCF | |n228-img2| |
|
| `228-clip-zero-shot-quantize <notebooks/228-clip-zero-shot-quantize-with-output.html>`__ | Post-Training Quantization of OpenAI CLIP model with NNCF | |n228-img2| |
|
||||||
+-------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------+
|
+-------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------+
|
||||||
| `229-distilbert-sequence-classification <notebooks/229-distilbert-sequence-classification-with-output.html>`__ |br| |n229| | Sequence classification with OpenVINO. | |n229-img1| |
|
| `229-distilbert-sequence-classification <notebooks/229-distilbert-sequence-classification-with-output.html>`__ |br| |n229| | Sequence classification with OpenVINO. | |n229-img1| |
|
||||||
+-------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------+
|
+-------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------+
|
||||||
| `230-yolov8-optimization <notebooks/230-yolov8-optimization-with-output.html>`__ |br| |c230| | Optimize YOLOv8, using NNCF PTQ API. | |n230-img1| |
|
| `230-yolov8-optimization <notebooks/230-yolov8-optimization-with-output.html>`__ |br| |c230| | Optimize YOLOv8, using NNCF PTQ API. | |n230-img1| |
|
||||||
+-------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------+
|
+-------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------+
|
||||||
| `231-instruct-pix2pix-image-editing <notebooks/231-instruct-pix2pix-image-editing-with-output.html>`__ | Image editing with InstructPix2Pix. | |n231-img1| |
|
| `231-instruct-pix2pix-image-editing <notebooks/231-instruct-pix2pix-image-editing-with-output.html>`__ | Image editing with InstructPix2Pix. | |n231-img1| |
|
||||||
+-------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------+
|
+-------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------+
|
||||||
| `232-clip-language-saliency-map <notebooks/232-clip-language-saliency-map-with-output.html>`__ |br| |c232| | Language-visual saliency with CLIP and OpenVINO™. | |n232-img1| |
|
| `232-clip-language-saliency-map <notebooks/232-clip-language-saliency-map-with-output.html>`__ |br| |c232| | Language-visual saliency with CLIP and OpenVINO™. | |n232-img1| |
|
||||||
+-------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------+
|
+-------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------+
|
||||||
| `233-blip-visual-language-processing <notebooks/233-blip-visual-language-processing-with-output.html>`__ | Visual question answering and image captioning using BLIP and OpenVINO™. | |n233-img1| |
|
| `233-blip-visual-language-processing <notebooks/233-blip-visual-language-processing-with-output.html>`__ | Visual question answering and image captioning using BLIP and OpenVINO™. | |n233-img1| |
|
||||||
+-------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------+
|
+-------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------+
|
||||||
| `234-encodec-audio-compression <notebooks/234-encodec-audio-compression-with-output.html>`__ | Audio compression with EnCodec and OpenVINO™. | |n234-img1| |
|
| `234-encodec-audio-compression <notebooks/234-encodec-audio-compression-with-output.html>`__ | Audio compression with EnCodec and OpenVINO™. | |n234-img1| |
|
||||||
+-------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------+
|
+-------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------+
|
||||||
| `235-controlnet-stable-diffusion <notebooks/235-controlnet-stable-diffusion-with-output.html>`__ | A text-to-image generation with ControlNet Conditioning and OpenVINO™. | |n235-img1| |
|
| `235-controlnet-stable-diffusion <notebooks/235-controlnet-stable-diffusion-with-output.html>`__ | A text-to-image generation with ControlNet Conditioning and OpenVINO™. | |n235-img1| |
|
||||||
+-------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------+
|
+-------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------+
|
||||||
| `236-stable-diffusion-v2 <notebooks/236-stable-diffusion-v2-infinite-zoom-with-output.html>`__ | Text-to-image generation and Infinite Zoom with Stable Diffusion v2 and OpenVINO™. | |n236-img1| |
|
| `236-stable-diffusion-v2 <notebooks/236-stable-diffusion-v2-infinite-zoom-with-output.html>`__ | Text-to-image generation and Infinite Zoom with Stable Diffusion v2 and OpenVINO™. | |n236-img1| |
|
||||||
+-------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------+
|
+-------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------+
|
||||||
| `236-stable-diffusion-v2 <notebooks/236-stable-diffusion-v2-optimum-demo-comparison-with-output.html>`__ | Stable Diffusion v2.1 using Optimum-Intel OpenVINO and multiple Intel Hardware. | |n236-img4| |
|
| `236-stable-diffusion-v2 <notebooks/236-stable-diffusion-v2-optimum-demo-comparison-with-output.html>`__ | Stable Diffusion v2.1 using Optimum-Intel OpenVINO and multiple Intel Hardware. | |n236-img4| |
|
||||||
+-------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------+
|
+-------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------+
|
||||||
| `236-stable-diffusion-v2 <notebooks/236-stable-diffusion-v2-optimum-demo-with-output.html>`__ | Stable Diffusion v2.1 using Optimum-Intel OpenVINO. | |n236-img4| |
|
| `236-stable-diffusion-v2 <notebooks/236-stable-diffusion-v2-optimum-demo-with-output.html>`__ | Stable Diffusion v2.1 using Optimum-Intel OpenVINO. | |n236-img4| |
|
||||||
+-------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------+
|
+-------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------+
|
||||||
| `236-stable-diffusion-v2 <notebooks/236-stable-diffusion-v2-text-to-image-demo-with-output.html>`__ | Stable Diffusion Text-to-Image Demo. | |n236-img4| |
|
| `236-stable-diffusion-v2 <notebooks/236-stable-diffusion-v2-text-to-image-demo-with-output.html>`__ | Stable Diffusion Text-to-Image Demo. | |n236-img4| |
|
||||||
+-------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------+
|
+-------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------+
|
||||||
| `236-stable-diffusion-v2 <notebooks/236-stable-diffusion-v2-text-to-image-with-output.html>`__ | Text-to-image generation with Stable Diffusion v2 and OpenVINO™. | |n236-img1| |
|
| `236-stable-diffusion-v2 <notebooks/236-stable-diffusion-v2-text-to-image-with-output.html>`__ | Text-to-image generation with Stable Diffusion v2 and OpenVINO™. | |n236-img1| |
|
||||||
+-------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------+
|
+-------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------+
|
||||||
| `237-segment-anything <notebooks/237-segment-anything-with-output.html>`__ | Prompt based object segmentation mask generation, using Segment Anything and OpenVINO™. | |n237-img1| |
|
| `237-segment-anything <notebooks/237-segment-anything-with-output.html>`__ | Prompt based object segmentation mask generation, using Segment Anything and OpenVINO™. | |n237-img1| |
|
||||||
+-------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------+
|
+-------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------+
|
||||||
| `238-deep-floyd-if <notebooks/238-deep-floyd-if-with-output.html>`__ | Text-to-image generation with DeepFloyd IF and OpenVINO™. | |n238-img1| |
|
| `238-deep-floyd-if <notebooks/238-deep-floyd-if-with-output.html>`__ | Text-to-image generation with DeepFloyd IF and OpenVINO™. | |n238-img1| |
|
||||||
+-------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------+
|
+-------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------+
|
||||||
| `239-image-bind <notebooks/239-image-bind-convert-with-output.html>`__ | Binding multimodal data, using ImageBind and OpenVINO™. | |n239-img1| |
|
| `239-image-bind <notebooks/239-image-bind-convert-with-output.html>`__ | Binding multimodal data, using ImageBind and OpenVINO™. | |n239-img1| |
|
||||||
+-------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------+
|
+-------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------+
|
||||||
| `240-dolly-2-instruction-following <notebooks/240-dolly-2-instruction-following-with-output.html>`__ | Instruction following using Databricks Dolly 2.0 and OpenVINO™. | |n240-img1| |
|
| `240-dolly-2-instruction-following <notebooks/240-dolly-2-instruction-following-with-output.html>`__ | Instruction following using Databricks Dolly 2.0 and OpenVINO™. | |n240-img1| |
|
||||||
+-------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------+
|
+-------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------+
|
||||||
| `241-riffusion-text-to-music <notebooks/241-riffusion-text-to-music-with-output.html>`__ | Text-to-Music generation using Riffusion and OpenVINO™. | |n241-img1| |
|
| `241-riffusion-text-to-music <notebooks/241-riffusion-text-to-music-with-output.html>`__ | Text-to-Music generation using Riffusion and OpenVINO™. | |n241-img1| |
|
||||||
+-------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------+
|
+-------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------+
|
||||||
| `242-freevc-voice-conversion <notebooks/242-freevc-voice-conversion-with-output.html>`__ | High-Quality Text-Free One-Shot Voice Conversion with FreeVC and OpenVINO™ | |
|
| `242-freevc-voice-conversion <notebooks/242-freevc-voice-conversion-with-output.html>`__ | High-Quality Text-Free One-Shot Voice Conversion with FreeVC and OpenVINO™ | |
|
||||||
+-------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------+
|
+-------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------+
|
||||||
| `243-tflite-selfie-segmentation <notebooks/243-tflite-selfie-segmentation-with-output.html>`__ |br| |n243| |br| |c243| | Selfie Segmentation using TFLite and OpenVINO™. | |n243-img1| |
|
| `243-tflite-selfie-segmentation <notebooks/243-tflite-selfie-segmentation-with-output.html>`__ |br| |n243| |br| |c243| | Selfie Segmentation using TFLite and OpenVINO™. | |n243-img1| |
|
||||||
+-------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------+
|
+-------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------+
|
||||||
| `244-named-entity-recognition <notebooks/244-named-entity-recognition-with-output.html>`__ |br| |c244| | Named entity recognition with OpenVINO™. | |
|
| `244-named-entity-recognition <notebooks/244-named-entity-recognition-with-output.html>`__ |br| |c244| | Named entity recognition with OpenVINO™. | |
|
||||||
+-------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------+
|
+-------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------+
|
||||||
| `245-typo-detector <notebooks/245-typo-detector-with-output.html>`__ | English Typo Detection in sentences with OpenVINO™. | |n245-img1| |
|
| `245-typo-detector <notebooks/245-typo-detector-with-output.html>`__ | English Typo Detection in sentences with OpenVINO™. | |n245-img1| |
|
||||||
+-------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------+
|
+-------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------+
|
||||||
| `246-depth-estimation-videpth <notebooks/246-depth-estimation-videpth-with-output.html>`__ | Monocular Visual-Inertial Depth Estimation with OpenVINO™. | |n246-img1| |
|
| `246-depth-estimation-videpth <notebooks/246-depth-estimation-videpth-with-output.html>`__ | Monocular Visual-Inertial Depth Estimation with OpenVINO™. | |n246-img1| |
|
||||||
+-------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------+
|
+-------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------+
|
||||||
| `247-code-language-id <notebooks/247-code-language-id-with-output.html>`__ |br| |n247| | Identify the programming language used in an arbitrary code snippet. | |
|
| `247-code-language-id <notebooks/247-code-language-id-with-output.html>`__ |br| |n247| | Identify the programming language used in an arbitrary code snippet. | |
|
||||||
+-------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------+
|
+-------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------+
|
||||||
| `248-stable-diffusion-xl <notebooks/248-stable-diffusion-xl-with-output.html>`__ | Image generation with Stable Diffusion XL and OpenVINO™. | |n248-img1| |
|
| `248-stable-diffusion-xl <notebooks/248-stable-diffusion-xl-with-output.html>`__ | Image generation with Stable Diffusion XL and OpenVINO™. | |n248-img1| |
|
||||||
+-------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------+
|
+-------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------+
|
||||||
| `249-oneformer-segmentation <notebooks/249-oneformer-segmentation-with-output.html>`__ | Universal segmentation with OneFormer and OpenVINO™. | |n249-img1| |
|
| `249-oneformer-segmentation <notebooks/249-oneformer-segmentation-with-output.html>`__ | Universal segmentation with OneFormer and OpenVINO™. | |n249-img1| |
|
||||||
+-------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------+
|
+-------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------+
|
||||||
| `250-music-generation <notebooks/250-music-generation-with-output.html>`__ |br| |n250| |br| |c250| | Controllable Music Generation with MusicGen and OpenVINO™. | |n250-img1| |
|
| `250-music-generation <notebooks/250-music-generation-with-output.html>`__ |br| |n250| |br| |c250| | Controllable Music Generation with MusicGen and OpenVINO™. | |n250-img1| |
|
||||||
+-------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------+
|
+-------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------+
|
||||||
| `251-tiny-sd-image-generation <notebooks/251-tiny-sd-image-generation-with-output.html>`__ |br| |c251| | Image Generation with Tiny-SD and OpenVINO™. | |n251-img1| |
|
| `251-tiny-sd-image-generation <notebooks/251-tiny-sd-image-generation-with-output.html>`__ |br| |c251| | Image Generation with Tiny-SD and OpenVINO™. | |n251-img1| |
|
||||||
+-------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------+
|
+-------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------+
|
||||||
| `252-fastcomposer-image-generation <notebooks/252-fastcomposer-image-generation-with-output.html>`__ | Image generation with FastComposer and OpenVINO™. | |
|
| `252-fastcomposer-image-generation <notebooks/252-fastcomposer-image-generation-with-output.html>`__ | Image generation with FastComposer and OpenVINO™. | |
|
||||||
+-------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------+
|
+-------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------+
|
| `253-zeroscope-text2video <notebooks/253-zeroscope-text2video-with-output.html>`__                                             | Text-to-video synthesis with ZeroScope and OpenVINO™.                                                                                      | A panda eating bamboo on a rock. |br| |n253-img1|  |
+-------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------+

Model Training
   :target: https://user-images.githubusercontent.com/76463150/260439306-81c81c8d-1f9c-41d0-b881-9491766def8e.png

.. |n251-img1| image:: https://user-images.githubusercontent.com/29454499/260904650-274fc2f9-24d2-46a3-ac3d-d660ec3c9a19.png
   :target: https://user-images.githubusercontent.com/29454499/260904650-274fc2f9-24d2-46a3-ac3d-d660ec3c9a19.png

.. |n253-img1| image:: https://user-images.githubusercontent.com/76161256/261102399-500956d5-4aac-4710-a77c-4df34bcda3be.gif
   :target: https://user-images.githubusercontent.com/76161256/261102399-500956d5-4aac-4710-a77c-4df34bcda3be.gif

.. |n301-img1| image:: https://user-images.githubusercontent.com/15709723/127779607-8fa34947-1c35-4260-8d04-981c41a2a2cc.png
   :target: https://user-images.githubusercontent.com/15709723/127779607-8fa34947-1c35-4260-8d04-981c41a2a2cc.png

.. |n401-img1| image:: https://user-images.githubusercontent.com/4547501/141471665-82b28c86-cf64-4bfe-98b3-c314658f2d96.gif