Specify MulticlassNonMaxSuppression-9 operation (#11083)
This commit is contained in:
parent
68ef1555bc
commit
e68613a2fc
@@ -113,6 +113,7 @@
|
||||
Mish-4 <openvino_docs_ops_activation_Mish_4>
|
||||
Mod-1 <openvino_docs_ops_arithmetic_Mod_1>
|
||||
MulticlassNonMaxSuppression-8 <openvino_docs_ops_sort_MulticlassNonMaxSuppression_8>
|
||||
MulticlassNonMaxSuppression-9 <openvino_docs_ops_sort_MulticlassNonMaxSuppression_9>
|
||||
Multiply-1 <openvino_docs_ops_arithmetic_Multiply_1>
|
||||
Negative-1 <openvino_docs_ops_arithmetic_Negative_1>
|
||||
NonMaxSuppression-1 <openvino_docs_ops_sort_NonMaxSuppression_1>
|
||||
|
@@ -103,7 +103,7 @@ declared in `namespace opset8`.
|
||||
* [Mish](activation/Mish_4.md)
|
||||
* [Mod](arithmetic/Mod_1.md)
|
||||
* [MVN](normalization/MVN_6.md)
|
||||
* [MulticlassNMS](sort/MulticlassNMS_8.md)
|
||||
* [MulticlassNMS](sort/MulticlassNonMaxSuppression_8.md)
|
||||
* [Multiply](arithmetic/Multiply_1.md)
|
||||
* [Negative](arithmetic/Negative_1.md)
|
||||
* [NonMaxSuppression](sort/NonMaxSuppression_5.md)
|
||||
|
@@ -105,7 +105,7 @@ declared in `namespace opset9`.
|
||||
* [Mish](activation/Mish_4.md)
|
||||
* [Mod](arithmetic/Mod_1.md)
|
||||
* [MVN](normalization/MVN_6.md)
|
||||
* [MulticlassNMS](sort/MulticlassNMS_8.md)
|
||||
* [MulticlassNMS](sort/MulticlassNonMaxSuppression_9.md)
|
||||
* [Multiply](arithmetic/Multiply_1.md)
|
||||
* [Negative](arithmetic/Negative_1.md)
|
||||
* [NonMaxSuppression](sort/NonMaxSuppression_5.md)
|
||||
|
208
docs/ops/sort/MulticlassNonMaxSuppression_9.md
Normal file
@@ -0,0 +1,208 @@
|
||||
## MulticlassNonMaxSuppression<a name="MulticlassNonMaxSuppression"></a> {#openvino_docs_ops_sort_MulticlassNonMaxSuppression_9}
|
||||
|
||||
**Versioned name**: *MulticlassNonMaxSuppression-9*
|
||||
|
||||
**Category**: *Sorting and maximization*
|
||||
|
||||
**Short description**: *MulticlassNonMaxSuppression* performs multi-class non-maximum suppression of the boxes with predicted scores.
|
||||
|
||||
**Detailed description**: *MulticlassNonMaxSuppression* is a multi-phase operation. It implements the non-maximum suppression algorithm as described below:
|
||||
|
||||
1. Let `B = [b_0,...,b_n]` be the list of initial detection boxes, `S = [s_0,...,s_n]` be the list of corresponding scores.
|
||||
2. Let `D = []` be an initial collection of resulting boxes. Let `adaptive_threshold = iou_threshold`.
|
||||
3. If `B` is empty, go to step 9.
|
||||
4. Take the box with the highest score. Suppose that it is the box `b` with the score `s`.
|
||||
5. Delete `b` from `B`.
|
||||
6. If the score `s` is greater than or equal to `score_threshold`, add `b` to `D`, else go to step 9.
|
||||
7. If `nms_eta < 1` and `adaptive_threshold > 0.5`, update `adaptive_threshold *= nms_eta`.
|
||||
8. For each input box `b_i` from `B` and the corresponding score `s_i`, set `s_i = 0` when `iou(b, b_i) > adaptive_threshold`, and go to step 3.
|
||||
9. Return `D`, a collection of the corresponding scores `S`, and the number of elements in `D`.
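
The steps above can be sketched in Python for a single class of one batch element. This is a minimal illustration under stated assumptions, not the reference implementation: the names `iou` and `nms_single_class` are hypothetical, boxes are `(xmin, ymin, xmax, ymax)` tuples, and IoU is computed from plain areas.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two boxes, using plain areas (assumption)."""
    iw = min(a[2], b[2]) - max(a[0], b[0])
    ih = min(a[3], b[3]) - max(a[1], b[1])
    if iw <= 0 or ih <= 0:
        return 0.0
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms_single_class(
    boxes: Sequence[Box],
    scores: Sequence[float],
    iou_threshold: float,
    score_threshold: float,
    nms_eta: float = 1.0,
) -> Tuple[List[int], List[float]]:
    """Return indices of the kept boxes (D) and their scores."""
    remaining = list(range(len(boxes)))            # step 1: B, stored as indices
    s = list(scores)                               # step 1: S, mutable copy
    selected: List[int] = []                       # step 2: D
    adaptive_threshold = iou_threshold             # step 2
    while remaining:                               # step 3
        best = max(remaining, key=lambda i: s[i])  # step 4
        remaining.remove(best)                     # step 5
        if s[best] < score_threshold:              # step 6
            break
        selected.append(best)
        if nms_eta < 1 and adaptive_threshold > 0.5:  # step 7
            adaptive_threshold *= nms_eta
        for i in remaining:                        # step 8: zero overlapping scores
            if iou(boxes[best], boxes[i]) > adaptive_threshold:
                s[i] = 0.0
    return selected, [s[i] for i in selected]      # step 9
```

Because step 8 zeroes scores instead of deleting boxes, a suppressed box in this literal sketch is rejected at step 6 only when `score_threshold` is greater than `0`.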
|
||||
|
||||
This algorithm is applied independently to each class of each batch element. The operation feeds at most `nms_top_k` highest-scoring candidate boxes to this algorithm.
|
||||
The total number of output boxes of each batch element must not exceed `keep_top_k`.
|
||||
Boxes of `background_class` are skipped and thus eliminated.
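
The candidate selection and the per-batch-element limit described above can be sketched as follows. This is an illustration only: the helper names `candidates_per_class` and `limit_per_image` are hypothetical, and it assumes that when `keep_top_k` truncates the result, the highest-scoring detections are the ones retained, which the text above does not state explicitly.

```python
from typing import Dict, List, Sequence, Tuple


def candidates_per_class(
    scores_by_class: Sequence[Sequence[float]],
    nms_top_k: int = -1,
    background_class: int = -1,
) -> Dict[int, List[int]]:
    """Pick, for every non-background class, at most nms_top_k box indices
    ordered by descending score. A negative nms_top_k keeps all boxes."""
    result: Dict[int, List[int]] = {}
    for class_id, scores in enumerate(scores_by_class):
        if class_id == background_class:
            continue  # boxes of the background class are skipped
        order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
        result[class_id] = order if nms_top_k < 0 else order[:nms_top_k]
    return result


def limit_per_image(
    detections: Sequence[Tuple[float, int, int]],
    keep_top_k: int = -1,
) -> List[Tuple[float, int, int]]:
    """detections holds (score, class_id, box_index) for one batch element.
    Keep at most keep_top_k of them, best score first (assumption)."""
    ordered = sorted(detections, key=lambda d: d[0], reverse=True)
    return ordered if keep_top_k < 0 else ordered[:keep_top_k]
```

The indices returned for each class would then be passed to the per-class suppression loop, and its results merged with `limit_per_image`.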
|
||||
|
||||
**Attributes**:
|
||||
|
||||
* *sort_result*
|
||||
|
||||
* **Description**: *sort_result* specifies the order of output elements.
|
||||
* **Range of values**: `class`, `score`, `none`
|
||||
* *class* - sort selected boxes by class id (ascending).
|
||||
* *score* - sort selected boxes by score (descending).
|
||||
* *none* - do not guarantee the order.
|
||||
* **Type**: `string`
|
||||
* **Default value**: `none`
|
||||
* **Required**: *no*
|
||||
|
||||
* *sort_result_across_batch*
|
||||
|
||||
* **Description**: *sort_result_across_batch* is a flag that specifies whether selected boxes are sorted across batches.
|
||||
* **Range of values**: true or false
|
||||
* *true* - sort selected boxes across batches.
|
||||
* *false* - do not sort selected boxes across batches (boxes are sorted per batch element).
|
||||
* **Type**: boolean
|
||||
* **Default value**: false
|
||||
* **Required**: *no*
|
||||
|
||||
* *output_type*
|
||||
|
||||
* **Description**: the tensor type of outputs `selected_indices` and `selected_num`.
|
||||
* **Range of values**: `i64` or `i32`
|
||||
* **Type**: `string`
|
||||
* **Default value**: `i64`
|
||||
* **Required**: *no*
|
||||
|
||||
* *iou_threshold*
|
||||
|
||||
* **Description**: intersection over union threshold.
|
||||
* **Range of values**: a floating-point number
|
||||
* **Type**: `float`
|
||||
* **Default value**: `0`
|
||||
* **Required**: *no*
|
||||
|
||||
* *score_threshold*
|
||||
|
||||
* **Description**: minimum score for a box to be considered for processing.
|
||||
* **Range of values**: a floating-point number
|
||||
* **Type**: `float`
|
||||
* **Default value**: `0`
|
||||
* **Required**: *no*
|
||||
|
||||
* *nms_top_k*
|
||||
|
||||
* **Description**: maximum number of boxes to be selected per class.
|
||||
* **Range of values**: an integer
|
||||
* **Type**: `int`
|
||||
* **Default value**: `-1` meaning to keep all boxes
|
||||
* **Required**: *no*
|
||||
|
||||
* *keep_top_k*
|
||||
|
||||
* **Description**: maximum number of boxes to be selected per batch element.
|
||||
* **Range of values**: an integer
|
||||
* **Type**: `int`
|
||||
* **Default value**: `-1` meaning to keep all boxes
|
||||
* **Required**: *no*
|
||||
|
||||
* *background_class*
|
||||
|
||||
* **Description**: the background class id.
|
||||
* **Range of values**: an integer
|
||||
* **Type**: `int`
|
||||
* **Default value**: `-1` meaning to keep all classes.
|
||||
* **Required**: *no*
|
||||
|
||||
* *normalized*
|
||||
|
||||
* **Description**: *normalized* is a flag that indicates whether `boxes` are normalized or not.
|
||||
* **Range of values**: true or false
|
||||
* *true* - the box coordinates are normalized.
|
||||
* *false* - the box coordinates are not normalized.
|
||||
* **Type**: boolean
|
||||
* **Default value**: true
|
||||
* **Required**: *no*
|
||||
|
||||
* *nms_eta*
|
||||
|
||||
* **Description**: eta parameter for adaptive NMS.
|
||||
* **Range of values**: a floating-point number in the closed range `[0, 1.0]`.
|
||||
* **Type**: `float`
|
||||
* **Default value**: `1.0`
|
||||
* **Required**: *no*
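
The specification above does not say how *normalized* changes the overlap computation. The sketch below shows one common convention, stated here purely as an assumption: when coordinates are not normalized they are treated as inclusive pixel indices, so `1` is added to each width and height. The function names `box_area` and `iou` are hypothetical.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


def box_area(box: Box, normalized: bool = True) -> float:
    """Area of a box; adds 1 to each side when coordinates are pixel indices."""
    offset = 0.0 if normalized else 1.0  # assumption, not taken from the spec
    w = box[2] - box[0] + offset
    h = box[3] - box[1] + offset
    return w * h if w > 0 and h > 0 else 0.0


def iou(a: Box, b: Box, normalized: bool = True) -> float:
    """Intersection over union under the offset convention described above."""
    offset = 0.0 if normalized else 1.0
    iw = min(a[2], b[2]) - max(a[0], b[0]) + offset
    ih = min(a[3], b[3]) - max(a[1], b[1]) + offset
    if iw <= 0 or ih <= 0:
        return 0.0
    inter = iw * ih
    union = box_area(a, normalized) + box_area(b, normalized) - inter
    return inter / union if union > 0 else 0.0
```

Under this convention the same pair of boxes yields a slightly larger IoU when `normalized` is false, which can change which boxes exceed `iou_threshold`.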
|
||||
|
||||
**Inputs**:
|
||||
|
||||
The operation supports two input formats. The first format uses two inputs, and the boxes are shared by all classes.
|
||||
* **1**: `boxes` - tensor of type *T* and shape `[num_batches, num_boxes, 4]` with box coordinates. The box coordinates are laid out as `[xmin, ymin, xmax, ymax]`. **Required.**
|
||||
|
||||
* **2**: `scores` - tensor of type *T* and shape `[num_batches, num_classes, num_boxes]` with box scores. The tensor type must be the same as that of `boxes`. **Required.**
|
||||
|
||||
The second format uses three inputs. Each class has its own boxes, which are not shared.
|
||||
* **1**: `boxes` - tensor of type *T* and shape `[num_classes, num_boxes, 4]` with box coordinates. The box coordinates are laid out as `[xmin, ymin, xmax, ymax]`. **Required.**
|
||||
|
||||
* **2**: `scores` - tensor of type *T* and shape `[num_classes, num_boxes]` with box scores. The tensor type must be the same as that of `boxes`. **Required.**
|
||||
|
||||
* **3**: `roisnum` - tensor of type *T_IND* and shape `[num_batches]` with the number of boxes in each image. `num_batches` is the number of images. Each element of this tensor is the number of boxes for the corresponding image. The sum of all elements is `num_boxes`. **Required.**
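
In the three-input format, `roisnum` determines which boxes belong to which image. A minimal sketch of that partitioning follows, assuming the boxes of image 0 come first along the `num_boxes` axis, followed by those of image 1, and so on. The contiguous ordering and the helper name `image_ranges` are assumptions for illustration.

```python
from itertools import accumulate
from typing import List, Sequence, Tuple


def image_ranges(roisnum: Sequence[int], num_boxes: int) -> List[Tuple[int, int]]:
    """Return a half-open [start, end) box index range for every image."""
    if any(n < 0 for n in roisnum):
        raise ValueError("roisnum entries must be non-negative")
    if sum(roisnum) != num_boxes:
        raise ValueError("the sum of roisnum must equal num_boxes")
    ends = list(accumulate(roisnum))   # running totals give the range ends
    starts = [0] + ends[:-1]           # each range starts where the previous ended
    return list(zip(starts, ends))
```

For example, `image_ranges([2, 0, 3], 5)` assigns boxes 0 and 1 to the first image, no boxes to the second, and boxes 2 to 4 to the third.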
|
||||
|
||||
**Outputs**:
|
||||
|
||||
* **1**: `selected_outputs` - tensor of type *T*, the same as `boxes`, and shape `[number of selected boxes, 6]` containing the selected boxes with score and class as tuples `[class_id, box_score, xmin, ymin, xmax, ymax]`.
|
||||
|
||||
* **2**: `selected_indices` - tensor of type *T_IND* and shape `[number of selected boxes, 1]` containing the indices of the selected boxes in the flattened `boxes`. The indices are absolute across batches, so valid values are in the range `[0, num_batches * num_boxes - 1]`.
|
||||
|
||||
* **3**: `selected_num` - 1D tensor of type *T_IND* and shape `[num_batches]` representing the number of selected boxes for each batch element.
|
||||
|
||||
When there is no box selected, `selected_num` is filled with `0`. `selected_outputs` is an empty tensor of shape `[0, 6]`, and `selected_indices` is an empty tensor of shape `[0, 1]`.
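
The three outputs can be assembled from a list of selected detections roughly as follows. This sketch targets the two-input format and assumes the flattened index is `batch_idx * num_boxes + box_idx`, which is consistent with the stated range but not spelled out above. The function name `pack_outputs` and the detection tuple layout are hypothetical.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]                 # (xmin, ymin, xmax, ymax)
Detection = Tuple[int, int, float, int, Box]            # (batch_idx, class_id, score, box_idx, box)


def pack_outputs(
    detections: Sequence[Detection],
    num_batches: int,
    num_boxes: int,
) -> Tuple[List[List[float]], List[List[int]], List[int]]:
    """Build selected_outputs, selected_indices and selected_num as nested lists."""
    selected_outputs: List[List[float]] = []   # rows of 6 values
    selected_indices: List[List[int]] = []     # rows of 1 value
    selected_num = [0] * num_batches           # stays all zeros when nothing is selected
    for batch_idx, class_id, score, box_idx, box in detections:
        selected_outputs.append([float(class_id), score, *box])
        # Index into `boxes` flattened over the batch axis (assumption).
        selected_indices.append([batch_idx * num_boxes + box_idx])
        selected_num[batch_idx] += 1
    return selected_outputs, selected_indices, selected_num
```

With no detections, the function returns two empty lists and a `selected_num` of zeros, matching the empty-result case described above.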
|
||||
|
||||
**Types**
|
||||
|
||||
* *T*: floating-point type.
|
||||
|
||||
* *T_IND*: `int64` or `int32`.
|
||||
|
||||
**Example**
|
||||
|
||||
```xml
|
||||
<layer ... type="MulticlassNonMaxSuppression" ... >
|
||||
<data sort_result="score" output_type="i64" sort_result_across_batch="false" iou_threshold="0.2" score_threshold="0.5" nms_top_k="-1" keep_top_k="-1" background_class="-1" normalized="false" nms_eta="0.0"/>
|
||||
<input>
|
||||
<port id="0">
|
||||
<dim>3</dim>
|
||||
<dim>100</dim>
|
||||
<dim>4</dim>
|
||||
</port>
|
||||
<port id="1">
|
||||
<dim>3</dim>
|
||||
<dim>5</dim>
|
||||
<dim>100</dim>
|
||||
</port>
|
||||
</input>
|
||||
<output>
|
||||
<port id="5" precision="FP32">
|
||||
<dim>-1</dim> <!-- "-1" means an undefined dimension calculated during the model inference -->
|
||||
<dim>6</dim>
|
||||
</port>
|
||||
<port id="6" precision="I64">
|
||||
<dim>-1</dim>
|
||||
<dim>1</dim>
|
||||
</port>
|
||||
<port id="7" precision="I64">
|
||||
<dim>3</dim>
|
||||
</port>
|
||||
</output>
|
||||
</layer>
|
||||
```
|
||||
An example with three inputs:
|
||||
```xml
|
||||
<layer ... type="MulticlassNonMaxSuppression" ... >
|
||||
<data sort_result="score" output_type="i64" sort_result_across_batch="false" iou_threshold="0.2" score_threshold="0.5" nms_top_k="-1" keep_top_k="-1" background_class="-1" normalized="false" nms_eta="0.0"/>
|
||||
<input>
|
||||
<port id="0">
|
||||
<dim>3</dim>
|
||||
<dim>100</dim>
|
||||
<dim>4</dim>
|
||||
</port>
|
||||
<port id="1">
|
||||
<dim>3</dim>
|
||||
<dim>100</dim>
|
||||
</port>
|
||||
<port id="2">
|
||||
<dim>10</dim>
|
||||
</port>
|
||||
</input>
|
||||
<output>
|
||||
<port id="5" precision="FP32">
|
||||
<dim>-1</dim> <!-- "-1" means an undefined dimension calculated during the model inference -->
|
||||
<dim>6</dim>
|
||||
</port>
|
||||
<port id="6" precision="I64">
|
||||
<dim>-1</dim>
|
||||
<dim>1</dim>
|
||||
</port>
|
||||
<port id="7" precision="I64">
|
||||
<dim>10</dim>
|
||||
</port>
|
||||
</output>
|
||||
</layer>
|
||||
```
|