DOCS shift to rst - Opsets C (#17112)

2023-04-21 13:30:07 +02:00
parent 304991f88b
commit c4b155edc2
14 changed files with 757 additions and 694 deletions
--- a/docs/ops/convolution/ConvolutionBackpropData_1.md
+++ b/docs/ops/convolution/ConvolutionBackpropData_1.md
@@ -1,5 +1,7 @@
 # ConvolutionBackpropData {#openvino_docs_ops_convolution_ConvolutionBackpropData_1}

+@sphinxdirective
+
 **Versioned name**: *ConvolutionBackpropData-1*

 **Category**: *Convolution*
@@ -8,33 +10,33 @@

 **Detailed description**:

-ConvolutionBackpropData takes the input tensor, weights tensor and output shape and computes the output tensor of a given shape. The shape of the output can be specified as an input 1D integer tensor explicitly or determined by other attributes implicitly. If output shape is specified as an explicit input, shape of the output exactly matches the specified size and required amount of padding is computed. More thorough explanation can be found in [Transposed Convolutions](https://arxiv.org/abs/1603.07285).
+ConvolutionBackpropData takes the input tensor, weights tensor and output shape and computes the output tensor of a given shape. The shape of the output can be specified as an input 1D integer tensor explicitly or determined by other attributes implicitly. If output shape is specified as an explicit input, shape of the output exactly matches the specified size and required amount of padding is computed. More thorough explanation can be found in `Transposed Convolutions <https://arxiv.org/abs/1603.07285>`__.

-ConvolutionBackpropData accepts the same set of attributes as a regular Convolution operation and additionally `output_padding` attribute, but they are interpreted in a "backward way", so they are applied to the output of ConvolutionBackpropData, but not to the input. Refer to a regular [Convolution](Convolution_1.md) operation for detailed description of each Convolution attribute.
+ConvolutionBackpropData accepts the same set of attributes as a regular Convolution operation and additionally ``output_padding`` attribute, but they are interpreted in a "backward way", so they are applied to the output of ConvolutionBackpropData, but not to the input. Refer to a regular :doc:`Convolution <openvino_docs_ops_convolution_Convolution_1>` operation for detailed description of each Convolution attribute.

-When output shape is specified as an input tensor `output_shape` then it specifies only spatial dimensions. No batch or channel dimension should be passed along with spatial dimensions. If `output_shape` is omitted, then `pads_begin`, `pads_end` or `auto_pad` are used to determine output spatial shape `[O_z, O_y, O_x]` by input spatial shape `[I_z, I_y, I_x]` in the following way:
+When output shape is specified as an input tensor ``output_shape`` then it specifies only spatial dimensions. No batch or channel dimension should be passed along with spatial dimensions. If ``output_shape`` is omitted, then ``pads_begin``, ``pads_end`` or ``auto_pad`` are used to determine output spatial shape ``[O_z, O_y, O_x]`` by input spatial shape ``[I_z, I_y, I_x]`` in the following way:

-```
-if auto_pads != None:
-    pads_begin[i] = 0
-    pads_end[i] = 0
+.. code-block:: cpp
+   
+   if auto_pads != None:
+       pads_begin[i] = 0
+       pads_end[i] = 0
+   
+   Y_i = stride[i] * (X_i - 1) + ((K_i - 1) * dilations[i] + 1) - pads_begin[i] - pads_end[i] + output_padding[i]

-Y_i = stride[i] * (X_i - 1) + ((K_i - 1) * dilations[i] + 1) - pads_begin[i] - pads_end[i] + output_padding[i]
-```
+where ``K_i`` is the filter kernel dimension along spatial axis ``i``.

-where `K_i` filter kernel dimension along spatial axis `i`.
+If ``output_shape`` is specified, ``pads_begin`` and ``pads_end`` are ignored, and ``auto_pad`` defines how to distribute padding amount around the tensor. In this case pads are determined based on the next formulas to correctly align input and output tensors:

- If `output_shape` is specified, `pads_begin` and `pads_end` are ignored, and `auto_pad` defines how to distribute padding amount around the tensor. In this case pads are determined based on the next formulas to correctly align input and output tensors:
-
-```
-total_padding[i] = stride[i] * (X_i - 1) + ((K_i - 1) * dilations[i] + 1) - output_shape[i] + output_padding[i]
-if auto_pads != SAME_UPPER:
-    pads_begin[i] = total_padding[i] // 2
-    pads_end[i] = total_padding[i] - pads_begin[i]
-else:
-    pads_end[i] = total_padding[i] // 2
-    pads_begin[i] = total_padding[i] - pads_end[i]
-```
+.. code-block:: cpp
+   
+   total_padding[i] = stride[i] * (X_i - 1) + ((K_i - 1) * dilations[i] + 1) - output_shape[i] + output_padding[i]
+   if auto_pads != SAME_UPPER:
+       pads_begin[i] = total_padding[i] // 2
+       pads_end[i] = total_padding[i] - pads_begin[i]
+   else:
+       pads_end[i] = total_padding[i] // 2
+       pads_begin[i] = total_padding[i] - pads_end[i]

 **Attributes**

@@ -42,14 +44,14 @@ else:

  * **Description**: *strides* has the same definition as *strides* for a regular Convolution but applied in the backward way, for the output tensor.
  * **Range of values**: positive integers
-  * **Type**: `int[]`
+  * **Type**: ``int[]``
  * **Required**: *yes*

 * *pads_begin*

  * **Description**: *pads_begin* has the same definition as *pads_begin* for a regular Convolution but applied in the backward way, for the output tensor. May be omitted, in which case pads are calculated automatically.
  * **Range of values**: non-negative integers
-  * **Type**: `int[]`
+  * **Type**: ``int[]``
  * **Required**: *yes*
  * **Note**: the attribute is ignored when *auto_pad* attribute is specified.

@@ -57,7 +59,7 @@ else:

  * **Description**: *pads_end* has the same definition as *pads_end* for a regular Convolution but applied in the backward way, for the output tensor. May be omitted, in which case pads are calculated automatically.
  * **Range of values**: non-negative integers
-  * **Type**: `int[]`
+  * **Type**: ``int[]``
  * **Required**: *yes*
  * **Note**: the attribute is ignored when *auto_pad* attribute is specified.

@@ -65,44 +67,44 @@ else:

  * **Description**: *dilations* has the same definition as *dilations* for a regular Convolution but applied in the backward way, for the output tensor.
  * **Range of values**: positive integers
-  * **Type**: `int[]`
+  * **Type**: ``int[]``
  * **Required**: *yes*

 * *auto_pad*

  * **Description**: *auto_pad* has the same definition as *auto_pad* for a regular Convolution but applied in the backward way, for the output tensor.
-    * *explicit*: use explicit padding values from `pads_begin` and `pads_end`.
+    
+    * *explicit*: use explicit padding values from ``pads_begin`` and ``pads_end``.
    * *same_upper* - the input is padded to match the output size. In case of odd padding value an extra padding is added at the end.
    * *same_lower* - the input is padded to match the output size. In case of odd padding value an extra padding is added at the beginning.
    * *valid* - do not use padding.
-  * **Type**: `string`
+  * **Type**: ``string``
  * **Default value**: None
  * **Required**: *no*
  * **Note**: *pads_begin* and *pads_end* attributes are ignored when *auto_pad* is specified.

 * *output_padding*

-  * **Description**: *output_padding* adds additional amount of paddings per each spatial axis in the `output` tensor. It unlocks more elements in the output allowing them to be computed. Elements are added at the higher coordinate indices for the spatial dimensions. Number of elements in *output_padding* list matches the number of spatial dimensions in `data` and `output` tensors.
+  * **Description**: *output_padding* adds additional amount of paddings per each spatial axis in the ``output`` tensor. It unlocks more elements in the output allowing them to be computed. Elements are added at the higher coordinate indices for the spatial dimensions. Number of elements in *output_padding* list matches the number of spatial dimensions in ``data`` and ``output`` tensors.
  * **Range of values**: non-negative integer values
-  * **Type**: `int[]`
+  * **Type**: ``int[]``
  * **Default value**: all zeros
  * **Required**: *no*

 **Inputs**:

-*   **1**: Input tensor of type *T1* and rank 3, 4 or 5. Layout is `[N, C_INPUT, Z, Y, X]` (number of batches, number of input channels, spatial axes Z, Y, X). **Required.**
-
-*   **2**: Convolution kernel tensor of type *T1* and rank 3, 4 or 5. Layout is `[C_INPUT, C_OUTPUT, Z, Y, X]` (number of input channels, number of output channels, spatial axes Z, Y, X). Spatial size of the kernel is derived from the shape of this input and aren't specified by any attribute. **Required.**
-
-*   **3**: `output_shape` is 1D tensor of type *T2* that specifies spatial shape of the output. If specified, *padding amount* is deduced from relation of input and output spatial shapes according to formulas in the description. If not specified, *output shape* is calculated based on the `pads_begin` and `pads_end` or completely according to `auto_pad`. **Optional.**
-*   **Note**: Type of the convolution (1D, 2D or 3D) is derived from the rank of the input tensors and not specified by any attribute:
-      * 1D convolution (input tensors rank 3) means that there is only one spatial axis X,
-      * 2D convolution (input tensors rank 4) means that there are two spatial axes Y, X,
-      * 3D convolution (input tensors rank 5) means that there are three spatial axes Z, Y, X.
+* **1**: Input tensor of type *T1* and rank 3, 4 or 5. Layout is ``[N, C_INPUT, Z, Y, X]`` (number of batches, number of input channels, spatial axes Z, Y, X). **Required.**
+* **2**: Convolution kernel tensor of type *T1* and rank 3, 4 or 5. Layout is ``[C_INPUT, C_OUTPUT, Z, Y, X]`` (number of input channels, number of output channels, spatial axes Z, Y, X). Spatial size of the kernel is derived from the shape of this input and isn't specified by any attribute. **Required.**
+* **3**: ``output_shape`` is 1D tensor of type *T2* that specifies spatial shape of the output. If specified, *padding amount* is deduced from relation of input and output spatial shapes according to formulas in the description. If not specified, *output shape* is calculated based on the ``pads_begin`` and ``pads_end`` or completely according to ``auto_pad``. **Optional.**
+* **Note**: Type of the convolution (1D, 2D or 3D) is derived from the rank of the input tensors and not specified by any attribute:
+  
+  * 1D convolution (input tensors rank 3) means that there is only one spatial axis X,
+  * 2D convolution (input tensors rank 4) means that there are two spatial axes Y, X,
+  * 3D convolution (input tensors rank 5) means that there are three spatial axes Z, Y, X.

 **Outputs**:

-*   **1**: Output tensor of type *T1* and rank 3, 4 or 5. Layout is `[N, C_OUTPUT, Z, Y, X]` (number of batches, number of kernel output channels, spatial axes Z, Y, X).
+*   **1**: Output tensor of type *T1* and rank 3, 4 or 5. Layout is ``[N, C_OUTPUT, Z, Y, X]`` (number of batches, number of kernel output channels, spatial axes Z, Y, X).

 **Types**:

@@ -113,93 +115,96 @@ else:

 *Example 1: 2D ConvolutionBackpropData*

-```xml
-<layer id="5" name="upsampling_node" type="ConvolutionBackpropData">
-    <data dilations="1,1" pads_begin="1,1" pads_end="1,1" strides="2,2" output_padding="0,0" auto_pad="explicit"/>
-    <input>
-        <port id="0">
-            <dim>1</dim>
-            <dim>20</dim>
-            <dim>224</dim>
-            <dim>224</dim>
-        </port>
-        <port id="1">
-            <dim>20</dim>
-            <dim>10</dim>
-            <dim>3</dim>
-            <dim>3</dim>
-        </port>
-    </input>
-    <output>
-        <port id="0" precision="FP32">
-            <dim>1</dim>
-            <dim>10</dim>
-            <dim>447</dim>
-            <dim>447</dim>
-        </port>
-    </output>
-</layer>
-```
+.. code-block:: xml
+   
+   <layer id="5" name="upsampling_node" type="ConvolutionBackpropData">
+       <data dilations="1,1" pads_begin="1,1" pads_end="1,1" strides="2,2" output_padding="0,0" auto_pad="explicit"/>
+       <input>
+           <port id="0">
+               <dim>1</dim>
+               <dim>20</dim>
+               <dim>224</dim>
+               <dim>224</dim>
+           </port>
+           <port id="1">
+               <dim>20</dim>
+               <dim>10</dim>
+               <dim>3</dim>
+               <dim>3</dim>
+           </port>
+       </input>
+       <output>
+           <port id="0" precision="FP32">
+               <dim>1</dim>
+               <dim>10</dim>
+               <dim>447</dim>
+               <dim>447</dim>
+           </port>
+       </output>
+   </layer>

 *Example 2: 2D ConvolutionBackpropData with output_padding*

-```xml
-<layer id="5" name="upsampling_node" type="ConvolutionBackpropData">
-    <data dilations="1,1" pads_begin="0,0" pads_end="0,0" strides="3,3" output_padding="2,2" auto_pad="explicit"/>
-    <input>
-        <port id="0">
-            <dim>1</dim>
-            <dim>20</dim>
-            <dim>2</dim>
-            <dim>2</dim>
-        </port>
-        <port id="1">
-            <dim>20</dim>
-            <dim>10</dim>
-            <dim>3</dim>
-            <dim>3</dim>
-        </port>
-    </input>
-    <output>
-        <port id="0" precision="FP32">
-            <dim>1</dim>
-            <dim>10</dim>
-            <dim>8</dim>
-            <dim>8</dim>
-        </port>
-    </output>
-</layer>
-```
+.. code-block:: xml
+   
+   <layer id="5" name="upsampling_node" type="ConvolutionBackpropData">
+       <data dilations="1,1" pads_begin="0,0" pads_end="0,0" strides="3,3" output_padding="2,2" auto_pad="explicit"/>
+       <input>
+           <port id="0">
+               <dim>1</dim>
+               <dim>20</dim>
+               <dim>2</dim>
+               <dim>2</dim>
+           </port>
+           <port id="1">
+               <dim>20</dim>
+               <dim>10</dim>
+               <dim>3</dim>
+               <dim>3</dim>
+           </port>
+       </input>
+       <output>
+           <port id="0" precision="FP32">
+               <dim>1</dim>
+               <dim>10</dim>
+               <dim>8</dim>
+               <dim>8</dim>
+           </port>
+       </output>
+   </layer>

 *Example 3: 2D ConvolutionBackpropData with output_shape input*

-```xml
-<layer id="5" name="upsampling_node" type="ConvolutionBackpropData">
-    <data dilations="1,1" pads_begin="1,1" pads_end="1,1" strides="1,1" output_padding="0,0" auto_pad="valid"/>
-    <input>
-        <port id="0">
-            <dim>1</dim>
-            <dim>20</dim>
-            <dim>224</dim>
-            <dim>224</dim>
-        </port>
-        <port id="1">
-            <dim>20</dim>
-            <dim>10</dim>
-            <dim>3</dim>
-            <dim>3</dim>
-        </port>
-        <port id="2">
-            <dim>2</dim> <!-- output_shape value is: [450, 450]-->
-        </port>
-    </input>
-    <output>
-        <port id="0" precision="FP32">
-            <dim>1</dim>
-            <dim>10</dim>
-            <dim>450</dim>
-            <dim>450</dim>
-        </port>
-    </output>
-</layer>
-```
+.. code-block:: xml
+   
+   <layer id="5" name="upsampling_node" type="ConvolutionBackpropData">
+       <data dilations="1,1" pads_begin="1,1" pads_end="1,1" strides="1,1" output_padding="0,0" auto_pad="valid"/>
+       <input>
+           <port id="0">
+               <dim>1</dim>
+               <dim>20</dim>
+               <dim>224</dim>
+               <dim>224</dim>
+           </port>
+           <port id="1">
+               <dim>20</dim>
+               <dim>10</dim>
+               <dim>3</dim>
+               <dim>3</dim>
+           </port>
+           <port id="2">
+               <dim>2</dim> <!-- output_shape value is: [450, 450]-->
+           </port>
+       </input>
+       <output>
+           <port id="0" precision="FP32">
+               <dim>1</dim>
+               <dim>10</dim>
+               <dim>450</dim>
+               <dim>450</dim>
+           </port>
+       </output>
+   </layer>
+
+@endsphinxdirective
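
The shape and padding formulas in the ConvolutionBackpropData description above can be checked against Examples 1 and 2 with a short script. This is a minimal sketch in Python, not part of the commit or of OpenVINO: the function names `deconv_output_shape` and `deconv_pads` are hypothetical, and the code only transcribes the pseudocode from the specification as written.

```python
from typing import List, Tuple


def deconv_output_shape(in_spatial: List[int], kernel_spatial: List[int],
                        strides: List[int], dilations: List[int],
                        pads_begin: List[int], pads_end: List[int],
                        output_padding: List[int]) -> List[int]:
    """Spatial output shape when no output_shape input is given.

    Hypothetical helper: applies, per spatial axis i,
    Y_i = stride[i] * (X_i - 1) + ((K_i - 1) * dilations[i] + 1)
          - pads_begin[i] - pads_end[i] + output_padding[i]
    """
    return [
        s * (x - 1) + ((k - 1) * d + 1) - pb - pe + op
        for x, k, s, d, pb, pe, op in zip(in_spatial, kernel_spatial, strides,
                                          dilations, pads_begin, pads_end,
                                          output_padding)
    ]


def deconv_pads(in_spatial: List[int], kernel_spatial: List[int],
                strides: List[int], dilations: List[int],
                output_shape: List[int], output_padding: List[int],
                auto_pad: str = "same_lower") -> Tuple[List[int], List[int]]:
    """Pads derived from an explicit output_shape (hypothetical helper).

    Follows the second pseudocode block of the description literally,
    including the way the odd element of total_padding is distributed.
    """
    pads_begin, pads_end = [], []
    for x, k, s, d, o, op in zip(in_spatial, kernel_spatial, strides,
                                 dilations, output_shape, output_padding):
        total = s * (x - 1) + ((k - 1) * d + 1) - o + op
        if auto_pad != "same_upper":
            begin = total // 2
            end = total - begin
        else:
            end = total // 2
            begin = total - end
        pads_begin.append(begin)
        pads_end.append(end)
    return pads_begin, pads_end


# Example 1 from the specification: 224x224 input, 3x3 kernel, stride 2, pads 1.
example_1 = deconv_output_shape([224, 224], [3, 3], [2, 2], [1, 1],
                                [1, 1], [1, 1], [0, 0])
# Example 2 from the specification: 2x2 input, 3x3 kernel, stride 3, output_padding 2.
example_2 = deconv_output_shape([2, 2], [3, 3], [3, 3], [1, 1],
                                [0, 0], [0, 0], [2, 2])
# Inverse direction: requesting a 447x447 output for the Example 1 geometry.
pads = deconv_pads([224, 224], [3, 3], [2, 2], [1, 1], [447, 447], [0, 0])
print(example_1, example_2, pads)
```

Running the two helpers in opposite directions on the Example 1 geometry is a quick consistency check: the pads recovered from an explicit 447x447 output shape are the same `1,1` pads that the layer declares.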
+
--- a/docs/ops/convolution/Convolution_1.md
+++ b/docs/ops/convolution/Convolution_1.md
@@ -1,92 +1,105 @@
 # Convolution {#openvino_docs_ops_convolution_Convolution_1}

+@sphinxdirective
+
 **Versioned name**: *Convolution-1*

 **Category**: *Convolution*

 **Short description**: Computes 1D, 2D or 3D convolution (cross-correlation to be precise) of input and kernel tensors.

-**Detailed description**: Basic building block of convolution is a dot product of input patch and kernel. Whole operation consist of multiple such computations over multiple input patches and kernels. More thorough explanation can be found in [Convolutional Neural Networks](http://cs231n.github.io/convolutional-networks/#conv) and [Convolution operation](https://medium.com/apache-mxnet/convolutions-explained-with-ms-excel-465d6649831c).
+**Detailed description**: The basic building block of convolution is a dot product of an input patch and a kernel. The whole operation consists of multiple such computations over multiple input patches and kernels. A more thorough explanation can be found in `Convolutional Neural Networks <http://cs231n.github.io/convolutional-networks/#conv>`__ and `Convolution operation <https://medium.com/apache-mxnet/convolutions-explained-with-ms-excel-465d6649831c>`__.

 For the convolutional layer, the number of output features in each dimension is calculated using the formula:
-\f[
-n_{out} = \left ( \frac{n_{in} + 2p - k}{s} \right ) + 1
-\f]
+
+.. math::
+   
+   n_{out} = \left ( \frac{n_{in} + 2p - k}{s} \right ) + 1

 The receptive field in each layer is calculated using the formulas:
-*   Jump in the output feature map:
-  \f[
-  j_{out} = j_{in} \cdot s
-  \f]
-*   Size of the receptive field of output feature:
-  \f[
-  r_{out} = r_{in} + ( k - 1 ) \cdot j_{in}
-  \f]
-*   Center position of the receptive field of the first output feature:
-  \f[
-  start_{out} = start_{in} + ( \frac{k - 1}{2} - p ) \cdot j_{in}
-  \f]
-*   Output is calculated using the following formula:
-  \f[
-  out = \sum_{i = 0}^{n}w_{i}x_{i} + b
-  \f]
+
+* Jump in the output feature map:
+  
+  .. math::
+     
+     j_{out} = j_{in} \cdot s
+  
+* Size of the receptive field of output feature:
+  
+  .. math::
+     
+     r_{out} = r_{in} + ( k - 1 ) \cdot j_{in}
+  
+* Center position of the receptive field of the first output feature:
+  
+  .. math::
+     
+     start_{out} = start_{in} + ( \frac{k - 1}{2} - p ) \cdot j_{in}
+  
+* Output is calculated using the following formula:
+  
+  .. math::
+     
+     out = \sum_{i = 0}^{n}w_{i}x_{i} + b

 **Attributes**:

 * *strides*

-  * **Description**: *strides* is a distance (in pixels) to slide the filter on the feature map over the `(z, y, x)` axes for 3D convolutions and `(y, x)` axes for 2D convolutions. For example, *strides* equal `4,2,1` means sliding the filter 4 pixel at a time over depth dimension, 2 over height dimension and 1 over width dimension.
+  * **Description**: *strides* is a distance (in pixels) to slide the filter on the feature map over the ``(z, y, x)`` axes for 3D convolutions and ``(y, x)`` axes for 2D convolutions. For example, *strides* equal ``4,2,1`` means sliding the filter 4 pixels at a time over the depth dimension, 2 over the height dimension and 1 over the width dimension.
  * **Range of values**: integer values starting from 0
-  * **Type**: `int[]`
+  * **Type**: ``int[]``
  * **Required**: *yes*

 * *pads_begin*

-  * **Description**: *pads_begin* is a number of pixels to add to the beginning along each axis. For example, *pads_begin* equal `1,2` means adding 1 pixel to the top of the input and 2 to the left of the input.
+  * **Description**: *pads_begin* is a number of pixels to add to the beginning along each axis. For example, *pads_begin* equal ``1,2`` means adding 1 pixel to the top of the input and 2 to the left of the input.
  * **Range of values**: integer values starting from 0
-  * **Type**: `int[]`
+  * **Type**: ``int[]``
  * **Required**: *yes*
  * **Note**: the attribute is ignored when *auto_pad* attribute is specified.

 * *pads_end*

-  * **Description**: *pads_end* is a number of pixels to add to the ending along each axis. For example, *pads_end* equal `1,2` means adding 1 pixel to the bottom of the input and 2 to the right of the input.
+  * **Description**: *pads_end* is a number of pixels to add to the ending along each axis. For example, *pads_end* equal ``1,2`` means adding 1 pixel to the bottom of the input and 2 to the right of the input.
  * **Range of values**: integer values starting from 0
-  * **Type**: `int[]`
+  * **Type**: ``int[]``
  * **Required**: *yes*
  * **Note**: the attribute is ignored when *auto_pad* attribute is specified.

 * *dilations*

-  * **Description**: *dilations* denotes the distance in width and height between elements (weights) in the filter. For example, *dilation* equal `1,1` means that all the elements in the filter are neighbors, so it is the same as for the usual convolution. *dilation* equal `2,2` means that all the elements in the filter are matched not to adjacent elements in the input matrix, but to those that are adjacent with distance 1.
+  * **Description**: *dilations* denotes the distance in width and height between elements (weights) in the filter. For example, *dilation* equal ``1,1`` means that all the elements in the filter are neighbors, so it is the same as for the usual convolution. *dilation* equal ``2,2`` means that all the elements in the filter are matched not to adjacent elements in the input matrix, but to those that are adjacent with distance 1.
  * **Range of values**: integer value starting from 0
-  * **Type**: `int[]`
+  * **Type**: ``int[]``
  * **Required**: *yes*

 * *auto_pad*

  * **Description**: *auto_pad* how the padding is calculated. Possible values:
+    
    * *explicit* - use explicit padding values from *pads_begin* and *pads_end*.
    * *same_upper* - the input is padded to match the output size. In case of odd padding value an extra padding is added at the end.
    * *same_lower* - the input is padded to match the output size. In case of odd padding value an extra padding is added at the beginning.
    * *valid* - do not use padding.
-  * **Type**: `string`
+  * **Type**: ``string``
  * **Default value**: explicit
  * **Required**: *no*
  * **Note**: *pads_begin* and *pads_end* attributes are ignored when *auto_pad* is specified.

 **Inputs**:

-*   **1**: Input tensor of type *T* and rank 3, 4 or 5. Layout is `[N, C_IN, Z, Y, X]` (number of batches, number of channels, spatial axes Z, Y, X). **Required.**
-*   **2**: Kernel tensor of type *T* and rank 3, 4 or 5. Layout is `[C_OUT, C_IN, Z, Y, X]` (number of output channels, number of input channels, spatial axes Z, Y, X). **Required.**
-*   **Note**: Type of the convolution (1D, 2D or 3D) is derived from the rank of the input tensors and not specified by any attribute:
-      * 1D convolution (input tensors rank 3) means that there is only one spatial axis X
-      * 2D convolution (input tensors rank 4) means that there are two spatial axes Y, X
-      * 3D convolution (input tensors rank 5) means that there are three spatial axes Z, Y, X
+* **1**: Input tensor of type *T* and rank 3, 4 or 5. Layout is ``[N, C_IN, Z, Y, X]`` (number of batches, number of channels, spatial axes Z, Y, X). **Required.**
+* **2**: Kernel tensor of type *T* and rank 3, 4 or 5. Layout is ``[C_OUT, C_IN, Z, Y, X]`` (number of output channels, number of input channels, spatial axes Z, Y, X). **Required.**
+* **Note**: Type of the convolution (1D, 2D or 3D) is derived from the rank of the input tensors and not specified by any attribute:
+  
+  * 1D convolution (input tensors rank 3) means that there is only one spatial axis X
+  * 2D convolution (input tensors rank 4) means that there are two spatial axes Y, X
+  * 3D convolution (input tensors rank 5) means that there are three spatial axes Z, Y, X

 **Outputs**:

-*   **1**: Output tensor of type *T* and rank 3, 4 or 5. Layout is `[N, C_OUT, Z, Y, X]` (number of batches, number of kernel output channels, spatial axes Z, Y, X).
+* **1**: Output tensor of type *T* and rank 3, 4 or 5. Layout is ``[N, C_OUT, Z, Y, X]`` (number of batches, number of kernel output channels, spatial axes Z, Y, X).

 **Types**:

@@ -95,87 +108,96 @@ The receptive field in each layer is calculated using the formulas:
 **Example**:

 1D Convolution
-```xml
-<layer type="Convolution" ...>
-    <data dilations="1" pads_begin="0" pads_end="0" strides="2" auto_pad="valid"/>
-    <input>
-        <port id="0">
-            <dim>1</dim>
-            <dim>5</dim>
-            <dim>128</dim>
-        </port>
-        <port id="1">
-            <dim>16</dim>
-            <dim>5</dim>
-            <dim>4</dim>
-        </port>
-    </input>
-    <output>
-        <port id="2" precision="FP32">
-            <dim>1</dim>
-            <dim>16</dim>
-            <dim>63</dim>
-        </port>
-    </output>
-</layer>
-```
+
+.. code-block:: xml
+   
+   <layer type="Convolution" ...>
+       <data dilations="1" pads_begin="0" pads_end="0" strides="2" auto_pad="valid"/>
+       <input>
+           <port id="0">
+               <dim>1</dim>
+               <dim>5</dim>
+               <dim>128</dim>
+           </port>
+           <port id="1">
+               <dim>16</dim>
+               <dim>5</dim>
+               <dim>4</dim>
+           </port>
+       </input>
+       <output>
+           <port id="2" precision="FP32">
+               <dim>1</dim>
+               <dim>16</dim>
+               <dim>63</dim>
+           </port>
+       </output>
+   </layer>
+
+
 2D Convolution
-```xml
-<layer type="Convolution" ...>
-    <data dilations="1,1" pads_begin="2,2" pads_end="2,2" strides="1,1" auto_pad="explicit"/>
-    <input>
-        <port id="0">
-            <dim>1</dim>
-            <dim>3</dim>
-            <dim>224</dim>
-            <dim>224</dim>
-        </port>
-        <port id="1">
-            <dim>64</dim>
-            <dim>3</dim>
-            <dim>5</dim>
-            <dim>5</dim>
-        </port>
-    </input>
-    <output>
-        <port id="2" precision="FP32">
-            <dim>1</dim>
-            <dim>64</dim>
-            <dim>224</dim>
-            <dim>224</dim>
-        </port>
-    </output>
-</layer>
-```
+
+.. code-block:: xml
+   
+   <layer type="Convolution" ...>
+       <data dilations="1,1" pads_begin="2,2" pads_end="2,2" strides="1,1" auto_pad="explicit"/>
+       <input>
+           <port id="0">
+               <dim>1</dim>
+               <dim>3</dim>
+               <dim>224</dim>
+               <dim>224</dim>
+           </port>
+           <port id="1">
+               <dim>64</dim>
+               <dim>3</dim>
+               <dim>5</dim>
+               <dim>5</dim>
+           </port>
+       </input>
+       <output>
+           <port id="2" precision="FP32">
+               <dim>1</dim>
+               <dim>64</dim>
+               <dim>224</dim>
+               <dim>224</dim>
+           </port>
+       </output>
+   </layer>

 3D Convolution
-```xml
-<layer type="Convolution" ...>
-    <data dilations="2,2,2" pads_begin="0,0,0" pads_end="0,0,0" strides="3,3,3" auto_pad="explicit"/>
-    <input>
-        <port id="0">
-            <dim>1</dim>
-            <dim>7</dim>
-            <dim>320</dim>
-            <dim>320</dim>
-            <dim>320</dim>
-        </port>
-        <port id="1">
-            <dim>32</dim>
-            <dim>7</dim>
-            <dim>3</dim>
-            <dim>3</dim>
-            <dim>3</dim>
-        </port>
-    </input>
-    <output>
-        <port id="2" precision="FP32">
-            <dim>1</dim>
-            <dim>32</dim>
-            <dim>106</dim>
-            <dim>106</dim>
-            <dim>106</dim>
-        </port>
-    </output>
-</layer>
-```
+
+.. code-block:: xml
+   
+   <layer type="Convolution" ...>
+       <data dilations="2,2,2" pads_begin="0,0,0" pads_end="0,0,0" strides="3,3,3" auto_pad="explicit"/>
+       <input>
+           <port id="0">
+               <dim>1</dim>
+               <dim>7</dim>
+               <dim>320</dim>
+               <dim>320</dim>
+               <dim>320</dim>
+           </port>
+           <port id="1">
+               <dim>32</dim>
+               <dim>7</dim>
+               <dim>3</dim>
+               <dim>3</dim>
+               <dim>3</dim>
+           </port>
+       </input>
+       <output>
+           <port id="2" precision="FP32">
+               <dim>1</dim>
+               <dim>32</dim>
+               <dim>106</dim>
+               <dim>106</dim>
+               <dim>106</dim>
+           </port>
+       </output>
+   </layer>
+
+
+@endsphinxdirective
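
The output-size formula in the Convolution description can be checked against the 1D, 2D and 3D examples above. The sketch below is illustrative only and makes three assumptions that the formula in the text does not spell out: the division is rounded down, the `2p` term is generalized to `pads_begin + pads_end`, and the kernel size `k` is replaced by the dilated kernel extent `(k - 1) * dilation + 1`. The function name `conv_output_size` is hypothetical.

```python
def conv_output_size(n_in: int, kernel: int, stride: int,
                     pad_begin: int = 0, pad_end: int = 0,
                     dilation: int = 1) -> int:
    """Output size along one spatial axis of a regular convolution.

    Hypothetical helper based on n_out = (n_in + 2p - k) / s + 1, with
    floor division, separate begin/end pads and a dilated kernel extent
    (all three are assumptions layered on top of the formula in the text).
    """
    effective_kernel = (kernel - 1) * dilation + 1
    return (n_in + pad_begin + pad_end - effective_kernel) // stride + 1


# 1D example: 128 input, kernel 4, stride 2, no padding.
out_1d = conv_output_size(128, kernel=4, stride=2)
# 2D example: 224 input, kernel 5, stride 1, pads 2 on both sides.
out_2d = conv_output_size(224, kernel=5, stride=1, pad_begin=2, pad_end=2)
# 3D example: 320 input, kernel 3, stride 3, dilation 2, no padding.
out_3d = conv_output_size(320, kernel=3, stride=3, dilation=2)
print(out_1d, out_2d, out_3d)
```

The 3D case is the one that needs the dilation term: without it, the formula would give 106 only by coincidence of rounding, so the dilated extent is kept explicit here.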
+