diff --git a/docs/ops/normalization/NormalizeL2_1.md b/docs/ops/normalization/NormalizeL2_1.md
index 56fd13092ad..4668519030f 100644
--- a/docs/ops/normalization/NormalizeL2_1.md
+++ b/docs/ops/normalization/NormalizeL2_1.md
@@ -4,47 +4,57 @@
 
 **Category**: *Normalization*
 
-**Short description**: *NormalizeL2* operation performs L2 normalization of the 1st input tensor in slices specified by the 2nd input.
+**Short description**: *NormalizeL2* operation performs L2 normalization on a given input `data` along the dimensions specified by the `axes` input.
+
+**Detailed Description**
+
+Each element in the output is the result of dividing the corresponding element of the `data` input by the result of L2 reduction along the dimensions specified by the `axes` input:
+
+    output[i0, i1, ..., iN] = x[i0, i1, ..., iN] / sqrt(eps_mode(sum[j0,..., jN](x[j0, ..., jN]**2), eps))
+
+Where `x` denotes the `data` input, indices `i0, ..., iN` run through all valid indices of `data`, and the summation `sum[j0, ..., jN]` has `jk = ik` for those dimensions `k` that are not in the set of indices specified by the `axes` input of the operation.
+`eps_mode` in the formula stands for the function that combines the sum of squares with `eps`: either `max` (the larger of the two) or `add` (their sum), as selected by the *eps_mode* attribute.
+
+Particular cases:
+
+1. If `axes` is an empty list, each input element is divided by itself, resulting in the value `1` for all non-zero elements.
+2. If `axes` contains all dimensions of input `data`, a single L2 reduction value is calculated for the entire input tensor and each input element is divided by that value.
+
 
 **Attributes**
 
 * *eps*
 
-  * **Description**: *eps* is the number to be added/maximized to/with the variance to avoid division by zero when normalizing the value. For example, *eps* equal to 0.001 means that 0.001 is used if all the values in normalization are equal to zero.
+  * **Description**: *eps* is the number that the *eps_mode* function combines with the sum of squares to avoid division by zero when normalizing the value.
   * **Range of values**: a positive floating-point number
   * **Type**: `float`
-  * **Default value**: None
   * **Required**: *yes*
 
 * *eps_mode*
 
-  * **Description**: Specifies how *eps* is combined with L2 value calculated before division.
-  * **Range of values**: `add`, `max`
+  * **Description**: Specifies how *eps* is combined with the sum of squares to avoid division by zero.
+  * **Range of values**: `add` or `max`
   * **Type**: `string`
-  * **Default value**: None
   * **Required**: *yes*
 
 **Inputs**
 
-* **1**: `data` - input tensor to be normalized. Type of elements is any floating point type. Required.
+* **1**: `data` - A tensor of type *T* and arbitrary shape. **Required.**
 
-* **2**: `axes` - scalar or 1D tensor with axis indices for the `data` input along which L2 reduction is calculated. Required.
+* **2**: `axes` - Axis indices of the `data` input tensor along which L2 reduction is calculated. A scalar or 1D tensor of unique elements and type *T_IND*. The range of elements is `[-r, r-1]`, where `r` is the rank of the `data` input tensor. **Required.**
 
 **Outputs**
 
-* **1**: Tensor of the same shape and type as the `data` input and normalized slices defined by `axes` input.
+* **1**: The result of *NormalizeL2* function applied to the `data` input tensor. Normalized tensor of the same type and shape as the `data` input.
 
-**Detailed Description**
+**Types**
 
-Each element in the output is the result of division of corresponding element from the `data` input tensor by the result of L2 reduction along dimensions specified by the `axes` input:
+* *T*: any supported floating-point type.
+* *T_IND*: any supported integer type.
 
-    output[i0, i1, ..., iN] = x[i0, i1, ..., iN] / sqrt(eps_mode(sum[j0,..., jN](x[j0, ..., jN]**2), eps))
+**Examples**
 
-Where indices `i0, ..., iN` run through all valid indices for the 1st input and summation `sum[j0, ..., jN]` have `jk = ik` for those dimensions `k` that are not in the set of indices specified by the `axes` input of the operation. One of the corner cases is when `axes` is an empty list, then we divide each input element by itself resulting value 1 for all non-zero elements. Another corner case is where `axes` input contains all dimensions from `data` tensor, which means that a single L2 reduction value is calculated for entire input tensor and each input element is divided by that value.
-
-`eps_mode` selects how the reduction value and `eps` are combined. It can be `max` or `add` depending on `eps_mode` attribute value.
-
-**Example**
+*Example: Normalization over channel dimension for `NCHW` layout*
 
 ```xml
 <layer id="1" type="NormalizeL2" ...>
@@ -57,7 +67,7 @@ Where indices `i0, ..., iN` run through all valid indices for the 1st input and
             <dim>24</dim>
         </port>
         <port id="1">
-            <dim>2</dim>         <!-- value is [2, 3] that means independent normalization in each channel -->
+            <dim>1</dim>         <!-- axes list [1] means normalization over channel dimension -->
         </port>
     </input>
     <output>
@@ -69,4 +79,31 @@ Where indices `i0, ..., iN` run through all valid indices for the 1st input and
         </port>
     </output>
 </layer>
-```
\ No newline at end of file
+```
+
+*Example: Normalization over channel and spatial dimensions for `NCHW` layout*
+
+```xml
+<layer id="1" type="NormalizeL2" ...>
+    <data eps="1e-8" eps_mode="add"/>
+    <input>
+        <port id="0">
+            <dim>6</dim>
+            <dim>12</dim>
+            <dim>10</dim>
+            <dim>24</dim>
+        </port>
+        <port id="1">
+            <dim>3</dim>         <!-- axes list [1, 2, 3] means normalization over channel and spatial dimensions -->
+        </port>
+    </input>
+    <output>
+        <port id="2">
+            <dim>6</dim>
+            <dim>12</dim>
+            <dim>10</dim>
+            <dim>24</dim>
+        </port>
+    </output>
+</layer>
+```
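
The formula in the **Detailed Description** added by this patch can be checked against a small NumPy sketch. This is an illustration written from the documented formula only, not the actual implementation of the operation. The function name `normalize_l2` and the use of `float32` are assumptions made for the example. The `eps` and `eps_mode` values are taken from the second XML example, and the input shape matches both examples.

```python
import numpy as np


def normalize_l2(data, axes, eps, eps_mode):
    """Sketch of the documented NormalizeL2 formula (hypothetical helper)."""
    data = np.asarray(data, dtype=np.float32)
    # `axes` may be a scalar or a 1D list; negative values count from the end.
    axes = tuple(int(a) for a in np.atleast_1d(axes))
    # Sum of squares over the reduced axes. keepdims=True lets the result
    # broadcast back against `data`. An empty tuple reduces over no axes.
    sum_sq = np.sum(data * data, axis=axes, keepdims=True)
    if eps_mode == "add":
        guarded = sum_sq + eps
    elif eps_mode == "max":
        guarded = np.maximum(sum_sq, eps)
    else:
        raise ValueError(f"eps_mode must be 'add' or 'max', got {eps_mode!r}")
    return data / np.sqrt(guarded)


# `data` shape from the XML examples: [6, 12, 10, 24] in NCHW layout.
rng = np.random.default_rng(0)
x = rng.standard_normal((6, 12, 10, 24)).astype(np.float32)

# axes = [1]: each vector along the channel dimension gets unit L2 norm.
y_channel = normalize_l2(x, axes=[1], eps=1e-8, eps_mode="add")

# axes = [1, 2, 3]: each batch item is normalized as a whole.
y_chw = normalize_l2(x, axes=[1, 2, 3], eps=1e-8, eps_mode="add")
```

In both calls the output keeps the shape of `data`, and the squared values sum to approximately 1 over the reduced axes. Read literally, the formula with an empty `axes` divides each element by `sqrt(x**2)`, so in this sketch a negative element maps to about `-1` rather than `1`. Whether the operation itself behaves that way is not confirmed by the patch.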