# BatchNormInference

@sphinxdirective

**Versioned name**: *BatchNormInference-5*

**Category**: *Normalization*

**Short description**: *BatchNormInference* performs the Batch Normalization operation described in the `Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift <https://arxiv.org/abs/1502.03167v2>`__ article.
**Detailed Description**

*BatchNormInference* performs the following operations on a given ``data`` batch input tensor:

* Normalizes each activation :math:`x^{(k)}` by the mean and variance:

  .. math::

     \hat{x}^{(k)} = \frac{x^{(k)} - E[x^{(k)}]}{\sqrt{Var(x^{(k)}) + \epsilon}}

  where :math:`E[x^{(k)}]` and :math:`Var(x^{(k)})` are the mean and variance, calculated per channel axis of the ``data`` input, and correspond to the ``mean`` and ``variance`` inputs, respectively. Additionally, :math:`\epsilon` is a value added to the variance for numerical stability and corresponds to the ``epsilon`` attribute.

* Performs a linear transformation of each normalized activation based on the ``gamma`` and ``beta`` inputs, representing the scaling factor and shift, respectively:

  .. math::

     \hat{y}^{(k)} = \gamma^{(k)} \hat{x}^{(k)} + \beta^{(k)}

  where :math:`\gamma^{(k)}` and :math:`\beta^{(k)}` are learnable parameters, calculated per channel axis, and correspond to the ``gamma`` and ``beta`` inputs.
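Putting both steps together, the whole inference-time computation reduces to a single per-channel affine expression. Below is a minimal NumPy sketch of that computation; the function name and variable names are illustrative and not part of any OpenVINO API.

.. code-block:: python

    import numpy as np

    def batch_norm_inference(data, gamma, beta, mean, variance, epsilon):
        # Reshape the 1D per-channel tensors so they broadcast along the
        # channel axis (axis 1) of `data`, e.g. (C,) -> (1, C, 1, 1) for 4D input.
        shape = [1] * data.ndim
        shape[1] = -1
        gamma, beta, mean, variance = (
            np.reshape(t, shape) for t in (gamma, beta, mean, variance)
        )
        x_hat = (data - mean) / np.sqrt(variance + epsilon)  # normalize
        return gamma * x_hat + beta                          # scale and shift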
**Mathematical Formulation**

Let ``x`` be a *d*-dimensional input, :math:`x = (x_{1}, \dotsc, x_{d})`. Since normalization is applied independently to each activation :math:`x^{(k)}`, you can focus on a particular activation and omit :math:`k`.

For a particular activation, consider a mini-batch :math:`\mathcal{B}` of :math:`m` values. *BatchNormInference* performs the Batch Normalization algorithm as follows:
* **Input**: Values of :math:`x` over a mini-batch:

  .. math::

     \mathcal{B} = \{x_{1 \dots m}\}

* **Parameters to learn**: :math:`\gamma, \beta`

* **Output**:

  .. math::

     \{o_{i} = BN_{\gamma, \beta}(b_{i})\}

* **Mini-batch mean**:

  .. math::

     \mu_{\mathcal{B}} \leftarrow \frac{1}{m} \sum_{i=1}^{m} b_{i}

* **Mini-batch variance**:

  .. math::

     \sigma_{\mathcal{B}}^{2} \leftarrow \frac{1}{m} \sum_{i=1}^{m} (b_{i} - \mu_{\mathcal{B}})^{2}

* **Normalize**:

  .. math::

     \hat{b_{i}} \leftarrow \frac{b_{i} - \mu_{\mathcal{B}}}{\sqrt{\sigma_{\mathcal{B}}^{2} + \epsilon}}

* **Scale and shift**:

  .. math::

     o_{i} \leftarrow \gamma \hat{b_{i}} + \beta = BN_{\gamma, \beta}(b_{i})
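As a concrete illustration of the steps above, here is a hypothetical worked example on a mini-batch of :math:`m = 4` scalar activations (pure NumPy; all values are made up for illustration):

.. code-block:: python

    import numpy as np

    b = np.array([1.0, 2.0, 3.0, 4.0])             # mini-batch B, m = 4
    gamma, beta, epsilon = 0.5, 1.0, 1e-5

    mu = b.mean()                                  # mini-batch mean: 2.5
    sigma2 = ((b - mu) ** 2).mean()                # mini-batch variance: 1.25
    b_hat = (b - mu) / np.sqrt(sigma2 + epsilon)   # normalize
    o = gamma * b_hat + beta                       # scale and shift
    # o is approximately [0.329, 0.776, 1.224, 1.671]

Note that at inference time the statistics come from the ``mean`` and ``variance`` inputs rather than being computed from the current batch; the example computes them inline only to illustrate the formulas.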
**Attributes**:

* *epsilon*

  * **Description**: *epsilon* is a constant added to the variance for numerical stability.
  * **Range of values**: a floating-point number greater than or equal to zero
  * **Type**: ``float``
  * **Required**: *yes*
**Inputs**

* **1**: ``data`` - A tensor of type *T* and at least rank 2. The second dimension represents the channel axis and must have a span of at least 1. **Required.**
* **2**: ``gamma`` - Scaling factor for the normalized value. A 1D tensor of type *T* with the same span as the ``data`` channel axis. **Required.**
* **3**: ``beta`` - Bias added to the scaled normalized value. A 1D tensor of type *T* with the same span as the ``data`` channel axis. **Required.**
* **4**: ``mean`` - Value for mean normalization. A 1D tensor of type *T* with the same span as the ``data`` channel axis. **Required.**
* **5**: ``variance`` - Value for variance normalization. A 1D tensor of type *T* with the same span as the ``data`` channel axis. **Required.**
**Outputs**

* **1**: The result of the element-wise Batch Normalization operation applied to the input tensor ``data``. A tensor of type *T* and the same shape as the ``data`` input tensor.
**Types**

* *T*: any supported floating-point type.
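To make the per-channel shape requirements concrete, the sketch below (reusing the hypothetical ``batch_norm_inference`` helper from the Detailed Description section) runs a 4D NCHW tensor through the computation and checks that the output shape matches ``data``:

.. code-block:: python

    import numpy as np

    data = np.random.rand(1, 3, 224, 224).astype(np.float32)  # N, C, H, W
    gamma = np.ones(3, dtype=np.float32)     # one scale value per channel
    beta = np.zeros(3, dtype=np.float32)     # one shift value per channel
    mean = data.mean(axis=(0, 2, 3))         # per-channel mean, shape (3,)
    variance = data.var(axis=(0, 2, 3))      # per-channel variance, shape (3,)

    out = batch_norm_inference(data, gamma, beta, mean, variance, 1e-5)
    assert out.shape == data.shape           # output shape equals `data` shape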
**Examples**

*Example: 2D input tensor* ``data``

.. code-block:: cpp

    <layer ... type="BatchNormInference" ...>
        <input>
            <port id="0">  <!-- input -->
                <dim>10</dim>
                <dim>128</dim>
            </port>
            <port id="1">  <!-- gamma -->
                <dim>128</dim>
            </port>
            <port id="2">  <!-- beta -->
                <dim>128</dim>
            </port>
            <port id="3">  <!-- mean -->
                <dim>128</dim>
            </port>
            <port id="4">  <!-- variance -->
                <dim>128</dim>
            </port>
        </input>
        <output>
            <port id="5">
                <dim>10</dim>
                <dim>128</dim>
            </port>
        </output>
    </layer>
*Example: 4D input tensor* ``data``

.. code-block:: cpp

    <layer ... type="BatchNormInference" ...>
        <input>
            <port id="0">  <!-- input -->
                <dim>1</dim>
                <dim>3</dim>
                <dim>224</dim>
                <dim>224</dim>
            </port>
            <port id="1">  <!-- gamma -->
                <dim>3</dim>
            </port>
            <port id="2">  <!-- beta -->
                <dim>3</dim>
            </port>
            <port id="3">  <!-- mean -->
                <dim>3</dim>
            </port>
            <port id="4">  <!-- variance -->
                <dim>3</dim>
            </port>
        </input>
        <output>
            <port id="5">
                <dim>1</dim>
                <dim>3</dim>
                <dim>224</dim>
                <dim>224</dim>
            </port>
        </output>
    </layer>
@endsphinxdirective