Specify MulticlassNonMaxSuppression-9 operation (#11083)

cecilia peng 2022-05-05 16:27:47 +08:00 committed by GitHub
parent 68ef1555bc
commit e68613a2fc
5 changed files with 211 additions and 2 deletions

View File

@@ -113,6 +113,7 @@
Mish-4 <openvino_docs_ops_activation_Mish_4>
Mod-1 <openvino_docs_ops_arithmetic_Mod_1>
MulticlassNonMaxSuppression-8 <openvino_docs_ops_sort_MulticlassNonMaxSuppression_8>
MulticlassNonMaxSuppression-9 <openvino_docs_ops_sort_MulticlassNonMaxSuppression_9>
Multiply-1 <openvino_docs_ops_arithmetic_Multiply_1>
Negative-1 <openvino_docs_ops_arithmetic_Negative_1>
NonMaxSuppression-1 <openvino_docs_ops_sort_NonMaxSuppression_1>

View File

@@ -103,7 +103,7 @@ declared in `namespace opset8`.
* [Mish](activation/Mish_4.md)
* [Mod](arithmetic/Mod_1.md)
* [MVN](normalization/MVN_6.md)
-* [MulticlassNMS](sort/MulticlassNMS_8.md)
+* [MulticlassNMS](sort/MulticlassNonMaxSuppression_8.md)
* [Multiply](arithmetic/Multiply_1.md)
* [Negative](arithmetic/Negative_1.md)
* [NonMaxSuppression](sort/NonMaxSuppression_5.md)

View File

@@ -105,7 +105,7 @@ declared in `namespace opset9`.
* [Mish](activation/Mish_4.md)
* [Mod](arithmetic/Mod_1.md)
* [MVN](normalization/MVN_6.md)
-* [MulticlassNMS](sort/MulticlassNMS_8.md)
+* [MulticlassNMS](sort/MulticlassNonMaxSuppression_9.md)
* [Multiply](arithmetic/Multiply_1.md)
* [Negative](arithmetic/Negative_1.md)
* [NonMaxSuppression](sort/NonMaxSuppression_5.md)

View File

@@ -0,0 +1,208 @@
## MulticlassNonMaxSuppression<a name="MulticlassNonMaxSuppression"></a> {#openvino_docs_ops_sort_MulticlassNonMaxSuppression_9}
**Versioned name**: *MulticlassNonMaxSuppression-9*
**Category**: *Sorting and maximization*
**Short description**: *MulticlassNonMaxSuppression* performs multi-class non-maximum suppression of the boxes with predicted scores.
**Detailed description**: *MulticlassNonMaxSuppression* is a multi-phase operation. It implements the non-maximum suppression algorithm described below:
1. Let `B = [b_0,...,b_n]` be the list of initial detection boxes and `S = [s_0,...,s_n]` be the list of corresponding scores.
2. Let `D = []` be an initial collection of resulting boxes. Let `adaptive_threshold = iou_threshold`.
3. If `B` is empty, go to step 9.
4. Take the box with the highest score. Suppose that it is the box `b` with the score `s`.
5. Delete `b` from `B`.
6. If the score `s` is greater than or equal to `score_threshold`, add `b` to `D`, else go to step 9.
7. If `nms_eta < 1` and `adaptive_threshold > 0.5`, update `adaptive_threshold *= nms_eta`.
8. For each input box `b_i` from `B` and the corresponding score `s_i`, set `s_i = 0` when `iou(b, b_i) > adaptive_threshold`, and go to step 3.
9. Return `D`, a collection of the corresponding scores `S`, and the number of elements in `D`.
This algorithm is applied independently to each class of each batch element. The operation feeds at most `nms_top_k` top-scoring candidate boxes to this algorithm.
The total number of output boxes of each batch element must not exceed `keep_top_k`.
Boxes of `background_class` are skipped and thus eliminated.
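
The steps above can be sketched in Python as follows. This is a minimal illustration for one class of one batch element, not the OpenVINO implementation: the names `iou` and `nms_single_class` are hypothetical, and the IoU helper uses plain box areas without any handling of the `normalized` attribute. Because step 8 sets suppressed scores to `0` rather than removing boxes, this literal reading only drops suppressed boxes when `score_threshold` is positive.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two boxes (plain areas; an assumption)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0.0 else 0.0


def nms_single_class(
    boxes: Sequence[Box],
    scores: Sequence[float],
    iou_threshold: float = 0.0,
    score_threshold: float = 0.0,
    nms_eta: float = 1.0,
) -> List[int]:
    """Return the indices of the kept boxes in selection order."""
    remaining = list(range(len(boxes)))  # B, stored as indices into `boxes`
    s = list(scores)                     # S, copied so it can be modified
    kept: List[int] = []                 # D
    adaptive_threshold = iou_threshold   # step 2
    while remaining:                                    # step 3
        best = max(remaining, key=lambda i: s[i])       # step 4
        remaining.remove(best)                          # step 5
        if s[best] < score_threshold:                   # step 6
            break
        kept.append(best)
        if nms_eta < 1.0 and adaptive_threshold > 0.5:  # step 7
            adaptive_threshold *= nms_eta
        for i in remaining:                             # step 8
            if iou(boxes[best], boxes[i]) > adaptive_threshold:
                s[i] = 0.0
    return kept                                         # step 9
```

With three boxes where the first two overlap heavily, `iou_threshold=0.5` and `score_threshold=0.1` keep the first and third box; raising `iou_threshold` to `0.8` keeps all three unless `nms_eta=0.5` shrinks the threshold after the first selection.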
**Attributes**:
* *sort_result*
  * **Description**: *sort_result* specifies the order of output elements.
  * **Range of values**: `class`, `score`, `none`
    * *class* - sort selected boxes by class id (ascending).
    * *score* - sort selected boxes by score (descending).
    * *none* - do not guarantee the order.
  * **Type**: `string`
  * **Default value**: `none`
  * **Required**: *no*
* *sort_result_across_batch*
  * **Description**: *sort_result_across_batch* is a flag that specifies whether selected boxes are sorted across batches.
  * **Range of values**: true or false
    * *true* - sort selected boxes across batches.
    * *false* - do not sort selected boxes across batches (boxes are sorted per batch element).
  * **Type**: boolean
  * **Default value**: false
  * **Required**: *no*
* *output_type*
  * **Description**: the tensor type of outputs `selected_indices` and `selected_num`.
  * **Range of values**: `i64` or `i32`
  * **Type**: `string`
  * **Default value**: `i64`
  * **Required**: *no*
* *iou_threshold*
  * **Description**: intersection over union threshold.
  * **Range of values**: a floating-point number
  * **Type**: `float`
  * **Default value**: `0`
  * **Required**: *no*
* *score_threshold*
  * **Description**: minimum score for a box to be considered for processing.
  * **Range of values**: a floating-point number
  * **Type**: `float`
  * **Default value**: `0`
  * **Required**: *no*
* *nms_top_k*
  * **Description**: maximum number of boxes to be selected per class.
  * **Range of values**: an integer
  * **Type**: `int`
  * **Default value**: `-1` meaning to keep all boxes
  * **Required**: *no*
* *keep_top_k*
  * **Description**: maximum number of boxes to be selected per batch element.
  * **Range of values**: an integer
  * **Type**: `int`
  * **Default value**: `-1` meaning to keep all boxes
  * **Required**: *no*
* *background_class*
  * **Description**: the background class id.
  * **Range of values**: an integer
  * **Type**: `int`
  * **Default value**: `-1` meaning to keep all classes.
  * **Required**: *no*
* *normalized*
  * **Description**: *normalized* is a flag that indicates whether `boxes` are normalized or not.
  * **Range of values**: true or false
    * *true* - the box coordinates are normalized.
    * *false* - the box coordinates are not normalized.
  * **Type**: boolean
  * **Default value**: true
  * **Required**: *no*
* *nms_eta*
  * **Description**: eta parameter for adaptive NMS.
  * **Range of values**: a floating-point number in the closed range `[0, 1.0]`.
  * **Type**: `float`
  * **Default value**: `1.0`
  * **Required**: *no*
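
The attributes `nms_top_k`, `keep_top_k`, `background_class`, and `sort_result` act around the per-class suppression step. The sketch below shows one plausible way they could combine for a single batch element. The function names `top_candidates` and `finalize` are hypothetical, and two behaviors are assumptions rather than statements of the specification: when `keep_top_k` truncates the result, the highest-scoring detections are retained, and within one class detections are ordered by descending score.

```python
from typing import Dict, List, Tuple

Detection = Tuple[int, float, int]  # (class_id, score, box_index)


def top_candidates(scores: List[float], nms_top_k: int = -1) -> List[int]:
    """Indices of the boxes handed to NMS for one class, best score first."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    # -1 keeps every box, as described for the nms_top_k default.
    return order if nms_top_k < 0 else order[:nms_top_k]


def finalize(
    per_class: Dict[int, List[Tuple[float, int]]],
    keep_top_k: int = -1,
    background_class: int = -1,
    sort_result: str = "none",
) -> List[Detection]:
    """Merge per-class NMS results (score, box_index) for one batch element."""
    # Boxes of the background class are skipped entirely.
    dets: List[Detection] = [
        (class_id, score, box_index)
        for class_id, kept in sorted(per_class.items())
        if class_id != background_class
        for score, box_index in kept
    ]
    # Assumption: keep_top_k retains the highest-scoring detections.
    if keep_top_k >= 0 and len(dets) > keep_top_k:
        dets = sorted(dets, key=lambda d: d[1], reverse=True)[:keep_top_k]
    if sort_result == "score":
        dets.sort(key=lambda d: d[1], reverse=True)
    elif sort_result == "class":
        # Assumption: descending score breaks ties within a class.
        dets.sort(key=lambda d: (d[0], -d[1]))
    return dets
```

For `sort_result="none"` the sketch returns whatever order the earlier steps produced, which matches the attribute's statement that no order is guaranteed.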
**Inputs**:
There are two input formats. The first uses two inputs, and the boxes are shared by all classes.
* **1**: `boxes` - tensor of type *T* and shape `[num_batches, num_boxes, 4]` with box coordinates. The box coordinates are laid out as `[xmin, ymin, xmax, ymax]`. **Required.**
* **2**: `scores` - tensor of type *T* and shape `[num_batches, num_classes, num_boxes]` with box scores. The tensor type must be the same as that of `boxes`. **Required.**
The second format uses three inputs. Each class has its own boxes, which are not shared.
* **1**: `boxes` - tensor of type *T* and shape `[num_classes, num_boxes, 4]` with box coordinates. The box coordinates are laid out as `[xmin, ymin, xmax, ymax]`. **Required.**
* **2**: `scores` - tensor of type *T* and shape `[num_classes, num_boxes]` with box scores. The tensor type must be the same as that of `boxes`. **Required.**
* **3**: `roisnum` - tensor of type *T_IND* and shape `[num_batches]` with the number of boxes in each image. `num_batches` is the number of images. Each element of this tensor is the number of boxes for the corresponding image. The sum of all elements is `num_boxes`. **Required.**
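
In the three-input format, `roisnum` is what ties positions along the `num_boxes` axis to images. A small sketch of that mapping follows. It assumes that the boxes of each image are stored contiguously and in image order, which the text above implies but does not state outright. The helper name `image_slices` is hypothetical.

```python
from itertools import accumulate
from typing import List, Sequence


def image_slices(roisnum: Sequence[int]) -> List[slice]:
    """Slice of the num_boxes axis that belongs to each image."""
    ends = list(accumulate(roisnum))  # running total of boxes per image
    starts = [0] + ends[:-1]          # each image starts where the last ended
    return [slice(start, end) for start, end in zip(starts, ends)]


# Example: three images holding 2, 3, and 1 boxes (num_boxes = 6).
slices = image_slices([2, 3, 1])
scores_for_one_class = [0.9, 0.1, 0.4, 0.8, 0.3, 0.7]
second_image_scores = scores_for_one_class[slices[1]]
```

Here `second_image_scores` holds the three scores at positions 2 to 4, so suppression for the second image only sees its own boxes.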
**Outputs**:
* **1**: `selected_outputs` - tensor of type *T*, which must be the same as the type of `boxes`, and shape `[number of selected boxes, 6]` containing the selected boxes with score and class as tuples `[class_id, box_score, xmin, ymin, xmax, ymax]`.
* **2**: `selected_indices` - tensor of type *T_IND* and shape `[number of selected boxes, 1]` containing the indices of the selected boxes in the flattened `boxes`. The indices are absolute across batches, so the valid values are in the range `[0, num_batches * num_boxes - 1]`.
* **3**: `selected_num` - 1D tensor of type *T_IND* and shape `[num_batches]` representing the number of selected boxes for each batch element.
When there is no box selected, `selected_num` is filled with `0`. `selected_outputs` is an empty tensor of shape `[0, 6]`, and `selected_indices` is an empty tensor of shape `[0, 1]`.
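
One reading of "indices in the flattened `boxes`" for the two-input format is a row-major flattening of the first two axes of the `[num_batches, num_boxes, 4]` tensor. The sketch below illustrates that reading. It is an assumption consistent with the stated range `[0, num_batches * num_boxes - 1]`, and the helper names are hypothetical.

```python
from typing import Tuple


def flat_index(batch_index: int, box_index: int, num_boxes: int) -> int:
    """Absolute index of a box when the batch and box axes are flattened."""
    return batch_index * num_boxes + box_index


def unflatten(index: int, num_boxes: int) -> Tuple[int, int]:
    """Recover (batch_index, box_index) from an absolute index."""
    return divmod(index, num_boxes)


# With num_batches = 3 and num_boxes = 100, the last box of the last
# batch element maps to the upper bound of the valid range.
last = flat_index(2, 99, 100)
```

Under this reading, `last` equals `3 * 100 - 1`, the largest valid value of `selected_indices`.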
**Types**
* *T*: floating-point type.
* *T_IND*: `int64` or `int32`.
**Example**
```xml
<layer ... type="MulticlassNonMaxSuppression" ... >
<data sort_result="score" output_type="i64" sort_result_across_batch="false" iou_threshold="0.2" score_threshold="0.5" nms_top_k="-1" keep_top_k="-1" background_class="-1" normalized="false" nms_eta="0.0"/>
<input>
<port id="0">
<dim>3</dim>
<dim>100</dim>
<dim>4</dim>
</port>
<port id="1">
<dim>3</dim>
<dim>5</dim>
<dim>100</dim>
</port>
</input>
<output>
<port id="5" precision="FP32">
<dim>-1</dim> <!-- "-1" means an undefined dimension calculated during the model inference -->
<dim>6</dim>
</port>
<port id="6" precision="I64">
<dim>-1</dim>
<dim>1</dim>
</port>
<port id="7" precision="I64">
<dim>3</dim>
</port>
</output>
</layer>
```
An example with three inputs:
```xml
<layer ... type="MulticlassNonMaxSuppression" ... >
<data sort_result="score" output_type="i64" sort_result_across_batch="false" iou_threshold="0.2" score_threshold="0.5" nms_top_k="-1" keep_top_k="-1" background_class="-1" normalized="false" nms_eta="0.0"/>
<input>
<port id="0">
<dim>3</dim>
<dim>100</dim>
<dim>4</dim>
</port>
<port id="1">
<dim>3</dim>
<dim>100</dim>
</port>
<port id="2">
<dim>10</dim>
</port>
</input>
<output>
<port id="5" precision="FP32">
<dim>-1</dim> <!-- "-1" means an undefined dimension calculated during the model inference -->
<dim>6</dim>
</port>
<port id="6" precision="I64">
<dim>-1</dim>
<dim>1</dim>
</port>
<port id="7" precision="I64">
<dim>10</dim>
</port>
</output>
</layer>
```