Specify ROIAlign-9 (#11067)
This commit is contained in:
parent
e68613a2fc
commit
d560cf19a3
@ -161,6 +161,7 @@
|
||||
RNNCell-3 <openvino_docs_ops_sequence_RNNCell_3>
|
||||
RNNSequence-5 <openvino_docs_ops_sequence_RNNSequence_5>
|
||||
ROIAlign-3 <openvino_docs_ops_detection_ROIAlign_3>
|
||||
ROIAlign-9 <openvino_docs_ops_detection_ROIAlign_9>
|
||||
ROIPooling-1 <openvino_docs_ops_detection_ROIPooling_1>
|
||||
Roll-7 <openvino_docs_ops_movement_Roll_7>
|
||||
Round-5 <openvino_docs_ops_arithmetic_Round_5>
|
||||
|
119
docs/ops/detection/ROIAlign_9.md
Normal file
119
docs/ops/detection/ROIAlign_9.md
Normal file
@ -0,0 +1,119 @@
|
||||
# ROIAlign {#openvino_docs_ops_detection_ROIAlign_9}
|
||||
|
||||
**Versioned name**: *ROIAlign-9*
|
||||
|
||||
**Category**: *Object detection*
|
||||
|
||||
**Short description**: *ROIAlign* is a *pooling layer* used over feature maps of non-uniform input sizes and outputs a feature map of a fixed size.
|
||||
|
||||
**Detailed description**: [Reference](https://arxiv.org/abs/1703.06870).
|
||||
|
||||
*ROIAlign* performs the following for each Region of Interest (ROI) for each input feature map:
|
||||
1. Multiply box coordinates with *spatial_scale* to produce box coordinates relative to the input feature map size based on *aligned_mode* attribute.
|
||||
2. Divide the box into bins according to the *sampling_ratio* attribute.
|
||||
3. Apply bilinear interpolation with 4 points in each bin and apply maximum or average pooling based on *mode* attribute to produce output feature map element.
|
||||
|
||||
**Attributes**
|
||||
|
||||
* *pooled_h*
|
||||
|
||||
* **Description**: *pooled_h* is the height of the ROI output feature map.
|
||||
* **Range of values**: a positive integer
|
||||
* **Type**: `int`
|
||||
* **Required**: *yes*
|
||||
|
||||
* *pooled_w*
|
||||
|
||||
* **Description**: *pooled_w* is the width of the ROI output feature map.
|
||||
* **Range of values**: a positive integer
|
||||
* **Type**: `int`
|
||||
* **Required**: *yes*
|
||||
|
||||
* *sampling_ratio*
|
||||
|
||||
* **Description**: *sampling_ratio* is the number of bins over height and width to use to calculate each output feature map element. If the value
|
||||
is equal to 0 then use adaptive number of elements over height and width: `ceil(roi_height / pooled_h)` and `ceil(roi_width / pooled_w)` respectively.
|
||||
* **Range of values**: a non-negative integer
|
||||
* **Type**: `int`
|
||||
* **Required**: *yes*
|
||||
|
||||
* *spatial_scale*
|
||||
|
||||
* **Description**: *spatial_scale* is a multiplicative spatial scale factor to translate ROI coordinates from their input spatial scale to the scale used when pooling.
|
||||
* **Range of values**: a positive floating-point number
|
||||
* **Type**: `float`
|
||||
* **Required**: *yes*
|
||||
|
||||
* *mode*
|
||||
|
||||
* **Description**: *mode* specifies a method to perform pooling to produce output feature map elements.
|
||||
* **Range of values**:
|
||||
* *max* - maximum pooling
|
||||
* *avg* - average pooling
|
||||
* **Type**: string
|
||||
* **Required**: *yes*
|
||||
|
||||
* *aligned_mode*
|
||||
|
||||
* **Description**: *aligned_mode* specifies how to transform the coordinate in original tensor to the resized tensor.
|
||||
* **Range of values**: name of the transformation mode in string format (here spatial_scale is resized_shape[x] / original_shape[x], resized_shape[x] is the shape of resized tensor in axis x, original_shape[x] is the shape of original tensor in axis x and x_original is a coordinate in axis x, for any axis x from the input axes):
|
||||
* *asymmetric* - the coordinate in the resized tensor axis x is calculated according to the formula x_original * spatial_scale
|
||||
* *tf_half_pixel_for_nn* - the coordinate in the resized tensor axis x is x_original * spatial_scale - 0.5
|
||||
* *half_pixel* - the coordinate in the resized tensor axis x is calculated as ((x_original + 0.5) * spatial_scale) - 0.5
|
||||
* **Type**: string
|
||||
* **Default value**: asymmetric
|
||||
* **Required**: *no*
|
||||
|
||||
**Inputs**:
|
||||
|
||||
* **1**: 4D input tensor of shape `[N, C, H, W]` with feature maps of type *T*. **Required.**
|
||||
|
||||
* **2**: 2D input tensor of shape `[NUM_ROIS, 4]` describing box consisting of 4 element tuples: `[x_1, y_1, x_2, y_2]` in relative coordinates of type *T*.
|
||||
The box height and width are calculated the following way:
|
||||
* If *aligned_mode* equals *asymmetric*: `roi_width = max(spatial_scale * (x_2 - x_1), 1.0)`, `roi_height = max(spatial_scale * (y_2 - y_1), 1.0)`, so the malformed boxes are expressed as a box of size `1 x 1`.
|
||||
* else: `roi_width = spatial_scale * (x_2 - x_1)`, `roi_height = spatial_scale * (y_2 - y_1)`.
|
||||
* **Required.**
|
||||
|
||||
* **3**: 1D input tensor of shape `[NUM_ROIS]` with batch indices of type *IND_T*. **Required.**
|
||||
|
||||
**Outputs**:
|
||||
|
||||
* **1**: 4D output tensor of shape `[NUM_ROIS, C, pooled_h, pooled_w]` with feature maps of type *T*.
|
||||
|
||||
**Types**
|
||||
|
||||
* *T*: any supported floating-point type.
|
||||
|
||||
* *IND_T*: any supported integer type.
|
||||
|
||||
|
||||
**Example**
|
||||
|
||||
```xml
|
||||
<layer ... type="ROIAlign" ... >
|
||||
<data pooled_h="6" pooled_w="6" spatial_scale="16.0" sampling_ratio="2" mode="avg" aligned_mode="half_pixel"/>
|
||||
<input>
|
||||
<port id="0">
|
||||
<dim>7</dim>
|
||||
<dim>256</dim>
|
||||
<dim>200</dim>
|
||||
<dim>200</dim>
|
||||
</port>
|
||||
<port id="1">
|
||||
<dim>1000</dim>
|
||||
<dim>4</dim>
|
||||
</port>
|
||||
<port id="2">
|
||||
<dim>1000</dim>
|
||||
</port>
|
||||
</input>
|
||||
<output>
|
||||
<port id="3" precision="FP32">
|
||||
<dim>1000</dim>
|
||||
<dim>256</dim>
|
||||
<dim>6</dim>
|
||||
<dim>6</dim>
|
||||
</port>
|
||||
</output>
|
||||
</layer>
|
||||
```
|
@ -144,7 +144,7 @@ declared in `namespace opset9`.
|
||||
* [ReverseSequence](movement/ReverseSequence_1.md)
|
||||
* [RNNCell](sequence/RNNCell_3.md)
|
||||
* [RNNSequence](sequence/RNNSequence_5.md)
|
||||
* [ROIAlign](detection/ROIAlign_3.md)
|
||||
* [ROIAlign](detection/ROIAlign_9.md)
|
||||
* [ROIPooling](detection/ROIPooling_1.md)
|
||||
* [Roll](movement/Roll_7.md)
|
||||
* [Round](arithmetic/Round_5.md)
|
||||
|
Loading…
Reference in New Issue
Block a user