Specify GenerateProposals-9 (#11004)

mei, yang 2022-05-05 16:28:18 +08:00 committed by GitHub
parent d560cf19a3
commit 2d0ffd8fe5
3 changed files with 149 additions and 0 deletions


@ -76,6 +76,7 @@
GatherND-8 <openvino_docs_ops_movement_GatherND_8>
GELU-2 <openvino_docs_ops_activation_GELU_2>
GELU-7 <openvino_docs_ops_activation_GELU_7>
GenerateProposals-9 <openvino_docs_ops_detection_GenerateProposals_9>
GreaterEqual-1 <openvino_docs_ops_comparison_GreaterEqual_1>
Greater-1 <openvino_docs_ops_comparison_Greater_1>
GroupConvolutionBackpropData-1 <openvino_docs_ops_convolution_GroupConvolutionBackpropData_1>


@ -0,0 +1,147 @@
# GenerateProposals {#openvino_docs_ops_detection_GenerateProposals_9}
**Versioned name**: *GenerateProposals-9*
**Category**: *Object detection*
**Short description**: The *GenerateProposals* operation proposes ROIs and their scores
based on input data for each image in the batch.
**Detailed description**: The operation performs the following steps for each image:
1. Transposes and reshapes predicted bounding box deltas and scores to get them into the same dimension order as the
anchors.
2. Transforms anchors and deltas into proposal bboxes and clips proposal bboxes to an image. The attribute *normalized*
indicates whether the proposal bboxes are normalized or not.
3. Sorts all `(proposal, score)` pairs by score from highest to lowest; order of pairs with equal scores is undefined.
4. Takes the top *pre_nms_count* proposals; if the total number of proposals is less than *pre_nms_count*, takes all of them.
5. Removes predicted boxes with either height or width < *min_size*.
6. Applies non-maximum suppression with *adaptive_nms_threshold*. The initial value of *adaptive_nms_threshold* is
*nms_threshold*. If `nms_eta < 1` and `adaptive_nms_threshold > 0.5`, updates `adaptive_nms_threshold *= nms_eta`.
7. Takes and returns the top proposals after the NMS operation. The number of returned proposals for each image is dynamic and is reported by the third output, `rpnroisnum`. The maximum number of proposals per image is specified by the attribute *post_nms_count*.
All proposals of the whole batch are concatenated image by image, and the proposals of each image can be distinguished using the `rpnroisnum` output.
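Steps 3 to 7 can be sketched for a single image as follows. This is a minimal illustration in plain Python, not the reference implementation: the function names `iou` and `select_proposals` are hypothetical, the IoU is computed without any pixel offset (so the *normalized* attribute is ignored), and updating the threshold right after a box is kept is an assumption about when the adaptive update happens.

```python
def iou(a, b):
    """Intersection over union of two [xmin, ymin, xmax, ymax] boxes."""
    iw = min(a[2], b[2]) - max(a[0], b[0])
    ih = min(a[3], b[3]) - max(a[1], b[1])
    if iw <= 0 or ih <= 0:
        return 0.0
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def select_proposals(boxes, scores, min_size, nms_threshold,
                     pre_nms_count, post_nms_count, nms_eta=1.0):
    """Hypothetical helper covering steps 3-7 for one image."""
    # Step 3: sort indices by score, highest first.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    # Step 4: keep at most pre_nms_count candidates.
    order = order[:pre_nms_count]
    # Step 5: drop boxes whose width or height is below min_size.
    order = [i for i in order
             if boxes[i][2] - boxes[i][0] >= min_size
             and boxes[i][3] - boxes[i][1] >= min_size]
    # Step 6: greedy NMS with an adaptive threshold.
    adaptive_nms_threshold = nms_threshold
    kept = []
    for i in order:
        # Step 7: never return more than post_nms_count proposals.
        if len(kept) >= post_nms_count:
            break
        if all(iou(boxes[i], boxes[k]) <= adaptive_nms_threshold for k in kept):
            kept.append(i)
            # Assumption: the threshold decays each time a box is kept.
            if nms_eta < 1.0 and adaptive_nms_threshold > 0.5:
                adaptive_nms_threshold *= nms_eta
    return [boxes[i] for i in kept], [scores[i] for i in kept]
```

With `nms_eta` left at its default of `1.0`, the loop reduces to ordinary greedy NMS with a fixed threshold.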
**Attributes**:
* *min_size*
* **Description**: The *min_size* attribute specifies minimum box width and height.
* **Range of values**: non-negative floating-point number
* **Type**: float
* **Required**: *yes*
* *nms_threshold*
* **Description**: The *nms_threshold* attribute specifies threshold to be used in the NMS stage.
* **Range of values**: non-negative floating-point number
* **Type**: float
* **Required**: *yes*
* *pre_nms_count*
* **Description**: The *pre_nms_count* attribute specifies number of top-n proposals before NMS.
* **Range of values**: non-negative integer number
* **Type**: int
* **Required**: *yes*
* *post_nms_count*
* **Description**: The *post_nms_count* attribute specifies number of top-n proposals after NMS.
* **Range of values**: non-negative integer number
* **Type**: int
* **Required**: *yes*
* *normalized*
* **Description**: *normalized* is a flag that indicates whether proposal bboxes are normalized or not.
* **Range of values**: true or false
* *true* - the bbox coordinates are normalized.
* *false* - the bbox coordinates are not normalized.
* **Type**: boolean
* **Default value**: True
* **Required**: *no*
* *nms_eta*
* **Description**: The *nms_eta* attribute specifies the eta parameter for adaptive NMS.
* **Range of values**: a floating-point number in the closed range `[0, 1.0]`.
* **Type**: float
* **Default value**: `1.0`
* **Required**: *no*
* *roi_num_type*
* **Description**: The *roi_num_type* attribute specifies the element type of the third output, `rpnroisnum`.
* **Range of values**: i32, i64
* **Type**: string
* **Default value**: `i64`
* **Required**: *no*
**Inputs**
* **1**: `im_info` - tensor of type *T* and shape `[num_batches, 3]` or `[num_batches, 4]` providing input image info. The image info is laid out as `[image_height, image_width, scale_height_and_width]` or as `[image_height, image_width, scale_height, scale_width]`. **Required.**
* **2**: `anchors` - tensor of type *T* with shape `[height, width, number_of_anchors, 4]` providing anchors. Each anchor is laid out as `[xmin, ymin, xmax, ymax]`. **Required.**
* **3**: `boxesdeltas` - tensor of type *T* with shape `[num_batches, number_of_anchors * 4, height, width]` providing deltas for anchors. Each delta is a 4-element tuple laid out as `[dx, dy, log(dw), log(dh)]`. **Required.**
* **4**: `scores` - tensor of type *T* with shape `[num_batches, number_of_anchors, height, width]` providing proposal scores. **Required.**
The `height` and `width` from inputs `anchors`, `boxesdeltas` and `scores` are the height and width of feature maps.
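The input layouts above can be illustrated with a small sketch that reads one delta out of the `boxesdeltas` layout and applies it to one anchor. The specification does not spell out the decoding formula, so everything here is an assumption: the helper names `delta_at` and `decode_box` are hypothetical, the channel index is assumed to be `anchor_index * 4 + coordinate`, the decoding follows the common Faster R-CNN style parameterization (centre shifted by `dx * width`, size scaled by `exp(log(dw))`), and clipping is done to `[0, image_width]` and `[0, image_height]`.

```python
import math


def delta_at(deltas_chw, a, h, w):
    """Read the 4-element delta of anchor `a` at feature-map cell (h, w).

    `deltas_chw` is a nested list for one image with shape
    [number_of_anchors * 4][height][width]. The channel ordering
    a * 4 + c is an assumption, not taken from the specification.
    """
    return [deltas_chw[a * 4 + c][h][w] for c in range(4)]


def decode_box(anchor, delta, image_height, image_width):
    """Apply one [dx, dy, log(dw), log(dh)] delta to one anchor, then clip."""
    xmin, ymin, xmax, ymax = anchor
    dx, dy, log_dw, log_dh = delta
    width, height = xmax - xmin, ymax - ymin
    cx, cy = xmin + 0.5 * width, ymin + 0.5 * height
    # Shift the centre and rescale the size (assumed parameterization).
    new_cx, new_cy = cx + dx * width, cy + dy * height
    new_w, new_h = math.exp(log_dw) * width, math.exp(log_dh) * height
    box = [new_cx - 0.5 * new_w, new_cy - 0.5 * new_h,
           new_cx + 0.5 * new_w, new_cy + 0.5 * new_h]
    # Clip the proposal to the image (assumed bounds).
    limits = [float(image_width), float(image_height)] * 2
    return [min(max(v, 0.0), lim) for v, lim in zip(box, limits)]
```

A zero delta leaves the anchor unchanged apart from clipping, which is a convenient sanity check when experimenting with this sketch.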
**Outputs**
* **1**: `rpnrois` - tensor of type *T* with shape `[num_rois, 4]` providing proposed ROIs. The proposals are laid out as `[xmin, ymin, xmax, ymax]`. `num_rois` is the total number of proposals across all images in the batch and is a dynamic dimension.
* **2**: `rpnscores` - tensor of type *T* with shape `[num_rois]` providing the scores of the proposed ROIs.
* **3**: `rpnroisnum` - tensor of type *roi_num_type* with shape `[num_batches]` providing the number of proposed ROIs in each image.
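
Because `rpnrois` and `rpnscores` hold the proposals of all images back to back, a consumer needs `rpnroisnum` to recover the per-image groups. A minimal sketch in plain Python, using a hypothetical helper named `split_by_image` and ordinary lists in place of tensors:

```python
def split_by_image(rpnrois, rpnscores, rpnroisnum):
    """Split the flat outputs into one (rois, scores) pair per image."""
    per_image = []
    start = 0
    for count in rpnroisnum:
        end = start + count
        per_image.append((rpnrois[start:end], rpnscores[start:end]))
        start = end
    return per_image
```

For example, with `rpnroisnum = [2, 0, 1]` the first two rows of `rpnrois` belong to image 0, image 1 has no proposals, and the third row belongs to image 2.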
**Types**
* *T*: any supported floating-point type.
**Example**
```xml
<layer ... type="GenerateProposals" version="opset9">
<data min_size="0.0" nms_threshold="0.699999988079071" post_nms_count="1000" pre_nms_count="1000" roi_num_type="i32"/>
<input>
<port id="0">
<dim>8</dim>
<dim>3</dim>
</port>
<port id="1">
<dim>50</dim>
<dim>84</dim>
<dim>3</dim>
<dim>4</dim>
</port>
<port id="2">
<dim>8</dim>
<dim>12</dim>
<dim>50</dim>
<dim>84</dim>
</port>
<port id="3">
<dim>8</dim>
<dim>3</dim>
<dim>50</dim>
<dim>84</dim>
</port>
</input>
<output>
<port id="4" precision="FP32">
<dim>-1</dim>
<dim>4</dim>
</port>
<port id="5" precision="FP32">
<dim>-1</dim>
</port>
<port id="6" precision="I32">
<dim>8</dim>
</port>
</output>
</layer>
```


@ -69,6 +69,7 @@ declared in `namespace opset9`.
* [GatherND](movement/GatherND_8.md)
* [GatherTree](movement/GatherTree_1.md)
* [Gelu](activation/GELU_7.md)
* [GenerateProposals](detection/GenerateProposals_9.md)
* [Greater](comparison/Greater_1.md)
* [GreaterEqual](comparison/GreaterEqual_1.md)
* [GRN](normalization/GRN_1.md)