* Doc Migration from Gitlab (#1289) * doc migration * fix * Update FakeQuantize_1.md * Update performance_benchmarks.md * Updates graphs for FPGA * Update performance_benchmarks.md * Change DL Workbench structure (#1) * Changed DL Workbench structure * Fixed tags * fixes * Update ie_docs.xml * Update performance_benchmarks_faq.md * Fixes in DL Workbench layout * Fixes for CVS-31290 * [DL Workbench] Minor correction * Fix for CVS-30955 * Added nGraph deprecation notice as requested by Zoe * fix broken links in api doxy layouts * CVS-31131 fixes * Additional fixes * Fixed POT TOC * Update PAC_Configure.md PAC DCP 1.2.1 install guide. * Update inference_engine_intro.md * fix broken link * Update opset.md * fix * added opset4 to layout * added new opsets to layout, set labels for them * Update VisionAcceleratorFPGA_Configure.md Updated from 2020.3 to 2020.4 Co-authored-by: domi2000 <domi2000@users.noreply.github.com>
3.7 KiB
ROIAlign
Versioned name: ROIAlign-3
Category: Object detection
Short description: ROIAlign is a pooling layer used over feature maps of non-uniform input sizes and outputs a feature map of a fixed size.
Detailed description: Reference.
ROIAlign performs the following for each Region of Interest (ROI) for each input feature map:
- Multiply box coordinates with spatial_scale to produce box coordinates relative to the input feature map size.
- Divide the box into bins according to the sampling_ratio attribute.
- Apply bilinear interpolation with 4 points in each bin and apply maximum or average pooling based on mode attribute to produce output feature map element.
Attributes
-
pooled_h
- Description: pooled_h is the height of the ROI output feature map.
- Range of values: a positive integer
- Type:
int - Default value: None
- Required: yes
-
pooled_w
- Description: pooled_w is the width of the ROI output feature map.
- Range of values: a positive integer
- Type:
int - Default value: None
- Required: yes
-
sampling_ratio
- Description: sampling_ratio is the number of bins over height and width to use to calculate each output feature map element. If the value
is equal to 0 then use adaptive number of elements over height and width:
ceil(roi_height / pooled_h)andceil(roi_width / pooled_w)respectively. - Range of values: a non-negative integer
- Type:
int - Default value: None
- Required: yes
- Description: sampling_ratio is the number of bins over height and width to use to calculate each output feature map element. If the value
is equal to 0 then use adaptive number of elements over height and width:
-
spatial_scale
- Description: spatial_scale is a multiplicative spatial scale factor to translate ROI coordinates from their input spatial scale to the scale used when pooling.
- Range of values: a positive floating-point number
- Type:
float - Default value: None
- Required: yes
-
mode
- Description: mode specifies a method to perform pooling to produce output feature map elements.
- Range of values:
- max - maximum pooling
- avg - average pooling
- Type: string
- Default value: None
- Required: yes
Inputs:
-
1: 4D input tensor of shape
[N, C, H, W]with feature maps of type T. Required. -
2: 2D input tensor of shape
[NUM_ROIS, 4]describing box consisting of 4 element tuples:[x_1, y_1, x_2, y_2]in relative coordinates of type T. The box height and width are calculated the following way:roi_width = max(spatial_scale * (x_2 - x_1), 1.0),roi_height = max(spatial_scale * (y_2 - y_1), 1.0), so the malformed boxes are expressed as a box of size1 x 1. Required. -
3: 1D input tensor of shape
[NUM_ROIS]with batch indices of type IND_T. Required.
Outputs:
- 1: 4D output tensor of shape
[NUM_ROIS, C, pooled_h, pooled_w]with feature maps of type T.
Types
-
T: any supported floating point type.
-
IND_T: any supported integer type.
Example
<layer ... type="ROIAlign" ... >
<data pooled_h="6" pooled_w="6" spatial_scale="16.0" sampling_ratio="2" mode="avg"/>
<input>
<port id="0">
<dim>7</dim>
<dim>256</dim>
<dim>200</dim>
<dim>200</dim>
</port>
<port id="1">
<dim>1000</dim>
<dim>4</dim>
</port>
<port id="2">
<dim>1000</dim>
</port>
</input>
<output>
<port id="3" precision="FP32">
<dim>1000</dim>
<dim>256</dim>
<dim>6</dim>
<dim>6</dim>
</port>
</output>
</layer>