# BatchNormInference

@sphinxdirective

**Versioned name**: *BatchNormInference-5*

**Category**: *Normalization*

**Short description**: *BatchNormInference* performs the Batch Normalization operation described in the `Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift <https://arxiv.org/abs/1502.03167v2>`__ article.

**Detailed description**

*BatchNormInference* performs the following operations on a given batch input tensor ``data``:

* Normalizes each activation :math:`x^{(k)}` by the mean and variance.

  .. math::

     \hat{x}^{(k)}=\frac{x^{(k)} - E[x^{(k)}]}{\sqrt{Var(x^{(k)}) + \epsilon}}

  where :math:`E[x^{(k)}]` and :math:`Var(x^{(k)})` are the mean and variance, calculated per channel axis of the ``data`` input, and correspond to the ``mean`` and ``variance`` inputs, respectively. Additionally, :math:`\epsilon` is a value added to the variance for numerical stability and corresponds to the ``epsilon`` attribute.

* Performs a linear transformation of each normalized activation based on the ``gamma`` and ``beta`` inputs, representing the scaling factor and shift, respectively.

  .. math::

     \hat{y}^{(k)}=\gamma^{(k)}\hat{x}^{(k)} + \beta^{(k)}

  where :math:`\gamma^{(k)}` and :math:`\beta^{(k)}` are learnable parameters, calculated per channel axis, and correspond to the ``gamma`` and ``beta`` inputs. Both steps are illustrated in the sketch after this list.
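
The following is a minimal NumPy sketch of these two steps, assuming precomputed per-channel statistics; the helper name, the random input, and the epsilon value are illustrative only and not part of the specification.

.. code-block:: python

    import numpy as np

    def batch_norm_inference(data, gamma, beta, mean, variance, epsilon):
        # Reshape the 1D per-channel inputs so they broadcast along the
        # channel axis (axis 1) of data; all remaining axes get size 1.
        shape = [1, -1] + [1] * (data.ndim - 2)
        mean = mean.reshape(shape)
        variance = variance.reshape(shape)
        gamma = gamma.reshape(shape)
        beta = beta.reshape(shape)

        # Step 1: normalize each activation by the provided mean and variance.
        x_hat = (data - mean) / np.sqrt(variance + epsilon)
        # Step 2: linear transformation with per-channel scale and shift.
        return gamma * x_hat + beta

    # 4D input of shape [1, 3, 224, 224] with per-channel statistics of span 3,
    # matching the example shapes used later in this document.
    data = np.random.rand(1, 3, 224, 224).astype(np.float32)
    out = batch_norm_inference(data,
                               gamma=np.ones(3, dtype=np.float32),
                               beta=np.zeros(3, dtype=np.float32),
                               mean=data.mean(axis=(0, 2, 3)),
                               variance=data.var(axis=(0, 2, 3)),
                               epsilon=1e-5)
    print(out.shape)  # (1, 3, 224, 224)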

**Mathematical Formulation**

Let :math:`x` be a *d*-dimensional input, :math:`x=(x_{1}\dotsc x_{d})`. Since normalization is applied independently to each activation :math:`x^{(k)}`, you can focus on a particular activation and omit :math:`k`.

For a particular activation, consider a mini-batch :math:`\mathcal{B}` of :math:`m` values. *BatchNormInference* performs the Batch Normalization algorithm as follows (a small numeric check of these steps is shown after the list):

* **Input**: Values of :math:`x` over a mini-batch:

  .. math::

     \mathcal{B} = \{ x_{1...m} \}

* **Parameters to learn**: :math:`\gamma, \beta`

* **Output**:

  .. math::

     \{ o_{i} = BN_{\gamma, \beta} ( b_{i} ) \}

* **Mini-batch mean**:

  .. math::

     \mu_{\mathcal{B}} \leftarrow \frac{1}{m}\sum_{i=1}^{m}b_{i}

* **Mini-batch variance**:

  .. math::

     \sigma_{\mathcal{B}}^{2}\leftarrow \frac{1}{m}\sum_{i=1}^{m} ( b_{i} - \mu_{\mathcal{B}})^{2}

* **Normalize**:

  .. math::

     \hat{b_{i}} \leftarrow \frac{b_{i} - \mu_{\mathcal{B}}}{\sqrt{\sigma_{\mathcal{B}}^{2} + \epsilon }}

* **Scale and shift**:

  .. math::

     o_{i} \leftarrow \gamma\hat{b_{i}} + \beta = BN_{\gamma ,\beta } ( b_{i} )
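
The steps above can be verified on a tiny mini-batch; the values below are purely illustrative.

.. code-block:: python

    import numpy as np

    # One activation over a mini-batch of m = 4 values (illustrative numbers).
    b = np.array([1.0, 2.0, 3.0, 6.0])
    gamma, beta, epsilon = 2.0, 0.5, 1e-5

    mu = b.mean()                                  # mini-batch mean: 3.0
    sigma2 = ((b - mu) ** 2).mean()                # mini-batch variance: 3.5
    b_hat = (b - mu) / np.sqrt(sigma2 + epsilon)   # normalize
    o = gamma * b_hat + beta                       # scale and shift
    print(o)

Note that *BatchNormInference* itself does not compute the mini-batch statistics: at inference time :math:`\mu_{\mathcal{B}}` and :math:`\sigma_{\mathcal{B}}^{2}` are supplied through the ``mean`` and ``variance`` inputs.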

**Attributes**:

* *epsilon*

  * **Description**: *epsilon* is a constant added to the variance for numerical stability.
  * **Range of values**: a floating-point number greater than or equal to zero
  * **Type**: ``float``
  * **Required**: *yes*

**Inputs**

* **1**: ``data`` - A tensor of type *T* and at least rank 2. The second dimension represents the channel axis and must have a span of at least 1. **Required.**
* **2**: ``gamma`` - Scaling factor for the normalized value. A 1D tensor of type *T* with the same span as the ``data`` channel axis. **Required.**
* **3**: ``beta`` - Bias added to the scaled normalized value. A 1D tensor of type *T* with the same span as the ``data`` channel axis. **Required.**
* **4**: ``mean`` - Value for mean normalization. A 1D tensor of type *T* with the same span as the ``data`` channel axis. **Required.**
* **5**: ``variance`` - Value for variance normalization. A 1D tensor of type *T* with the same span as the ``data`` channel axis. **Required.**
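
The shape requirements listed above can be checked with a small helper; the function below is a hypothetical sketch, not part of the OpenVINO API.

.. code-block:: python

    import numpy as np

    def check_batch_norm_inputs(data, gamma, beta, mean, variance):
        # data must have rank of at least 2; axis 1 is the channel axis.
        if data.ndim < 2:
            raise ValueError("data must have rank of at least 2")
        channels = data.shape[1]
        if channels < 1:
            raise ValueError("channel axis must have a span of at least 1")
        # gamma, beta, mean, and variance must be 1D with the channel span.
        for name, t in (("gamma", gamma), ("beta", beta),
                        ("mean", mean), ("variance", variance)):
            if t.ndim != 1 or t.shape[0] != channels:
                raise ValueError(f"{name} must be a 1D tensor of span {channels}")

    # Matches the 2D example below: data [10, 128], per-channel inputs of span 128.
    check_batch_norm_inputs(np.zeros((10, 128)), np.ones(128), np.zeros(128),
                            np.zeros(128), np.ones(128))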

**Outputs**

* **1**: The result of the element-wise Batch Normalization operation applied to the input tensor ``data``. A tensor of type *T* and the same shape as the ``data`` input tensor.

**Types**

* *T*: any supported floating-point type.

**Examples**

*Example: 2D input tensor* ``data``

.. code-block:: xml

    <layer ... type="BatchNormInference" ...>
        <input>
            <port id="0">  <!-- input -->
                <dim>10</dim>
                <dim>128</dim>
            </port>
            <port id="1">  <!-- gamma -->
                <dim>128</dim>
            </port>
            <port id="2">  <!-- beta -->
                <dim>128</dim>
            </port>
            <port id="3">  <!-- mean -->
                <dim>128</dim>
            </port>
            <port id="4">  <!-- variance -->
                <dim>128</dim>
            </port>
        </input>
        <output>
            <port id="5">
                <dim>10</dim>
                <dim>128</dim>
            </port>
        </output>
    </layer>

*Example: 4D input tensor* ``data``

.. code-block:: xml

    <layer ... type="BatchNormInference" ...>
        <input>
            <port id="0">  <!-- input -->
                <dim>1</dim>
                <dim>3</dim>
                <dim>224</dim>
                <dim>224</dim>
            </port>
            <port id="1">  <!-- gamma -->
                <dim>3</dim>
            </port>
            <port id="2">  <!-- beta -->
                <dim>3</dim>
            </port>
            <port id="3">  <!-- mean -->
                <dim>3</dim>
            </port>
            <port id="4">  <!-- variance -->
                <dim>3</dim>
            </port>
        </input>
        <output>
            <port id="5">
                <dim>1</dim>
                <dim>3</dim>
                <dim>224</dim>
                <dim>224</dim>
            </port>
        </output>
    </layer>

@endsphinxdirective