openvino/GroupConvolution_1.md at 4795391b73381660b69b4cd3986c7a0bf902e868

Nikolay Tyukaev ef45b5da8d Doc Migration (master) (#1377 )

Co-authored-by: domi2000 <domi2000@users.noreply.github.com>

2020-07-20 17:36:08 +03:00

GroupConvolution

Versioned name: GroupConvolution-1

Category: Convolution

Short description: Reference

Detailed description: Reference

Attributes

strides
- Description: strides is the distance (in pixels) that the filter slides over the feature map along the (z, y, x) axes for 3D convolutions and the (y, x) axes for 2D convolutions. For example, strides equal to 4,2,1 means sliding the filter 4 pixels at a time over the depth dimension, 2 over the height dimension, and 1 over the width dimension.
- Range of values: positive integer numbers
- Type: int[]
- Default value: None
- Required: yes
pads_begin
- Description: pads_begin is the number of pixels to add at the beginning of each axis. For example, pads_begin equal to 1,2 means adding 1 pixel to the top of the input and 2 to the left of the input.
- Range of values: non-negative integer numbers
- Type: int[]
- Default value: None
- Required: yes
- Note: the attribute is ignored when auto_pad attribute is specified.
pads_end
- Description: pads_end is the number of pixels to add at the end of each axis. For example, pads_end equal to 1,2 means adding 1 pixel to the bottom of the input and 2 to the right of the input.
- Range of values: non-negative integer numbers
- Type: int[]
- Default value: None
- Required: yes
- Note: the attribute is ignored when auto_pad attribute is specified.
dilations
- Description: dilations denotes the distance in width and height between elements (weights) in the filter. For example, dilations equal to 1,1 means that all the elements in the filter are neighbors, so it is the same as the usual convolution. dilations equal to 2,2 means that the elements in the filter are matched not to adjacent elements in the input, but to elements that have one element between them (distance 1).
- Range of values: positive integer numbers
- Type: int[]
- Default value: None
- Required: yes
auto_pad
- Description: auto_pad specifies how the padding is calculated. Possible values:
  - None (not specified): use explicit padding values.
  - same_upper (same_lower): the input is padded to match the output size. If the total padding value is odd, the extra padding is added at the end (at the beginning).
  - valid: do not use padding.
- Type: string
- Default value: None
- Required: no
- Note: pads_begin and pads_end attributes are ignored when auto_pad is specified.
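
The same_upper and same_lower rules can be illustrated with a short sketch. It assumes the common convention that the padded output size equals ceil(n_in / stride); the text above does not state this explicitly, so treat it as an assumption. The helper names `effective_kernel` and `same_padding` are hypothetical and not part of any OpenVINO API.

```python
def effective_kernel(k: int, d: int) -> int:
    """Span of a kernel of size k whose taps are spaced d elements apart."""
    return (k - 1) * d + 1


def same_padding(n_in: int, k: int, s: int, d: int = 1, mode: str = "same_upper"):
    """Return (pad_begin, pad_end) for one spatial axis.

    Assumption: the target output size is ceil(n_in / s).
    """
    if mode not in ("same_upper", "same_lower"):
        raise ValueError("mode must be 'same_upper' or 'same_lower'")
    n_out = -(-n_in // s)  # ceiling division
    total = max((n_out - 1) * s + effective_kernel(k, d) - n_in, 0)
    small, large = total // 2, total - total // 2
    # same_upper puts the extra pixel at the end, same_lower at the beginning
    return (small, large) if mode == "same_upper" else (large, small)


if __name__ == "__main__":
    print(same_padding(224, 5, 1))                     # (2, 2)
    print(same_padding(224, 3, 2))                     # (0, 1)
    print(same_padding(224, 3, 2, mode="same_lower"))  # (1, 0)
```

With a 5x5 kernel and stride 1 the total padding is even, so both modes give 2 pixels on each side. The two modes differ only when the total is odd.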

Inputs:

1: 4D or 5D input tensor. Required.
2: Convolution kernel tensor. Weights layout is GOIYX (GOIZYX for 3D convolution), which means that X changes the fastest, then Y, then Input, Output, and Group. The kernel size and the number of groups are derived from the shape of this input and are not specified by any attribute. Required.
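
As a rough illustration of how the kernel size and the number of groups can be read off the weights shape, the sketch below unpacks a GOIYX or GOIZYX shape and checks it against the input shape. The function name `describe_weights` and the returned dictionary are hypothetical. The check that input channels equal G * I follows from how group convolution splits channels, and is stated here as an assumption rather than quoted from the specification.

```python
def describe_weights(input_shape, weights_shape):
    """Derive groups, kernel size and output channels from the tensor shapes.

    input_shape:   (N, C, Y, X) or (N, C, Z, Y, X)
    weights_shape: (G, O, I, Y, X) or (G, O, I, Z, Y, X)
    """
    if len(weights_shape) != len(input_shape) + 1:
        raise ValueError("weights must have one more dimension than the input")
    groups, out_per_group, in_per_group, *kernel = weights_shape
    _batch, in_channels, *_spatial = input_shape
    # Assumption: each group consumes its own slice of I input channels.
    if in_channels != groups * in_per_group:
        raise ValueError("input channels must equal G * I")
    return {
        "groups": groups,
        "kernel": tuple(kernel),
        "out_channels": groups * out_per_group,
    }


if __name__ == "__main__":
    # 12 input channels, 4 groups of 3 channels each, 5x5 kernel
    print(describe_weights((1, 12, 224, 224), (4, 1, 3, 5, 5)))
```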

Mathematical Formulation

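For the convolutional layer, the number of output features in each dimension is calculated using the formula below, where \f$n_{in}\f$ is the input size, \f$p\f$ is the padding, \f$k\f$ is the kernel size, and \f$s\f$ is the stride: \f[ n_{out} = \left \lfloor \frac{n_{in} + 2p - k}{s} \right \rfloor + 1 \f]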
The receptive field in each layer is calculated using the formulas:
- Jump in the output feature map: \f[ j_{out} = j_{in} * s \f]
- Size of the receptive field of output feature: \f[ r_{out} = r_{in} + ( k - 1 ) * j_{in} \f]
- Center position of the receptive field of the first output feature: \f[ start_{out} = start_{in} + ( \frac{k - 1}{2} - p ) * j_{in} \f]

The output is calculated using the following formula: \f[ out = \sum_{i = 0}^{n}w_{i}x_{i} + b \f]
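
The size and receptive-field formulas above can be checked numerically with a small sketch. The function names are hypothetical. Two things go beyond the formulas as written and are assumptions: `conv_output_size` uses integer (floor) division for strides that do not divide evenly, and it accepts separate begin and end padding plus a dilation factor. With p_begin = p_end = p and d = 1 it reduces to the formula for n_out.

```python
def conv_output_size(n_in, k, s, p_begin, p_end, d=1):
    """Output size along one axis; floor division is an assumption."""
    k_eff = (k - 1) * d + 1  # kernel span once dilation is applied
    return (n_in + p_begin + p_end - k_eff) // s + 1


def receptive_field_step(j_in, r_in, start_in, k, s, p):
    """Apply the jump, size and start formulas for one layer."""
    j_out = j_in * s
    r_out = r_in + (k - 1) * j_in
    start_out = start_in + ((k - 1) / 2 - p) * j_in
    return j_out, r_out, start_out


if __name__ == "__main__":
    # 5x5 kernel, stride 1, padding 2 keeps a 224-pixel axis at 224
    print(conv_output_size(224, 5, 1, 2, 2))           # 224
    # Starting from the raw input: jump 1, field 1, first center at 0.5
    print(receptive_field_step(1, 1, 0.5, 5, 1, 2))    # (1, 5, 0.5)
```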

Example

<layer type="GroupConvolution" ...>
    <data dilations="1,1" pads_begin="2,2" pads_end="2,2" strides="1,1"/>
    <input>
        <port id="0">
            <dim>1</dim>
            <dim>12</dim>
            <dim>224</dim>
            <dim>224</dim>
        </port>
        <port id="1">
            <dim>4</dim>
            <dim>1</dim>
            <dim>3</dim>
            <dim>5</dim>
            <dim>5</dim>
        </port>
    </input>
    <output>
        <port id="2" precision="FP32">
            <dim>1</dim>
            <dim>4</dim>
            <dim>224</dim>
            <dim>224</dim>
        </port>
    </output>
</layer>
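
To make the semantics of the example concrete, here is a naive pure-Python reference for a 2D group convolution on a single image. It is a sketch for illustration only, not the OpenVINO implementation. The function name `group_conv2d` and the nested-list data layout are assumptions. The bias term b from the formula is omitted because the operation lists no bias input.

```python
def group_conv2d(x, w, strides=(1, 1), pads_begin=(0, 0), pads_end=(0, 0),
                 dilations=(1, 1)):
    """x: [C][H][W] nested lists; w: [G][O][I][KY][KX] nested lists."""
    groups, o_per_g, i_per_g = len(w), len(w[0]), len(w[0][0])
    ky, kx = len(w[0][0][0]), len(w[0][0][0][0])
    c_in, h, wd = len(x), len(x[0]), len(x[0][0])
    if c_in != groups * i_per_g:
        raise ValueError("input channels must equal G * I")
    span_y = (ky - 1) * dilations[0] + 1
    span_x = (kx - 1) * dilations[1] + 1
    out_h = (h + pads_begin[0] + pads_end[0] - span_y) // strides[0] + 1
    out_w = (wd + pads_begin[1] + pads_end[1] - span_x) // strides[1] + 1
    out = [[[0.0] * out_w for _ in range(out_h)]
           for _ in range(groups * o_per_g)]
    for g in range(groups):
        for o in range(o_per_g):
            oc = g * o_per_g + o
            for oy in range(out_h):
                for ox in range(out_w):
                    acc = 0.0
                    # Each group only reads its own slice of input channels.
                    for i in range(i_per_g):
                        ic = g * i_per_g + i
                        for fy in range(ky):
                            iy = oy * strides[0] - pads_begin[0] + fy * dilations[0]
                            if not 0 <= iy < h:
                                continue  # position falls in the zero padding
                            for fx in range(kx):
                                ix = ox * strides[1] - pads_begin[1] + fx * dilations[1]
                                if 0 <= ix < wd:
                                    acc += w[g][o][i][fy][fx] * x[ic][iy][ix]
                    out[oc][oy][ox] = acc
    return out


if __name__ == "__main__":
    # Same channel and kernel layout as the example, on a small 8x8 image.
    x = [[[1.0] * 8 for _ in range(8)] for _ in range(12)]
    w = [[[[[1.0] * 5 for _ in range(5)] for _ in range(3)]] for _ in range(4)]
    y = group_conv2d(x, w, pads_begin=(2, 2), pads_end=(2, 2))
    print(len(y), len(y[0]), len(y[0][0]))  # 4 8 8
```

With all-ones data, an interior output equals 3 * 5 * 5 = 75, since each group sums over 3 input channels and a full 5x5 window. A corner output equals 27, since only a 3x3 part of the window overlaps the unpadded input.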
