Specify MulticlassNonMaxSuppression-9 operation (#11083)

2022-05-05 16:27:47 +08:00 · 2022-05-05 16:27:47 +08:00 · e68613a2fc
commit e68613a2fc
parent 68ef1555bc
5 changed files with 211 additions and 2 deletions
--- a/docs/OV_Runtime_UG/Operations_specifications.md
+++ b/docs/OV_Runtime_UG/Operations_specifications.md
@ -113,6 +113,7 @@
   Mish-4 <openvino_docs_ops_activation_Mish_4>
   Mod-1 <openvino_docs_ops_arithmetic_Mod_1>
   MulticlassNonMaxSuppression-8 <openvino_docs_ops_sort_MulticlassNonMaxSuppression_8>
+   MulticlassNonMaxSuppression-9 <openvino_docs_ops_sort_MulticlassNonMaxSuppression_9>
   Multiply-1 <openvino_docs_ops_arithmetic_Multiply_1>
   Negative-1 <openvino_docs_ops_arithmetic_Negative_1>
   NonMaxSuppression-1 <openvino_docs_ops_sort_NonMaxSuppression_1>
--- a/docs/ops/opset8.md
+++ b/docs/ops/opset8.md
@ -103,7 +103,7 @@ declared in `namespace opset8`.
 * [Mish](activation/Mish_4.md)
 * [Mod](arithmetic/Mod_1.md)
 * [MVN](normalization/MVN_6.md)
-* [MulticlassNMS](sort/MulticlassNMS_8.md)
+* [MulticlassNMS](sort/MulticlassNonMaxSuppression_8.md)
 * [Multiply](arithmetic/Multiply_1.md)
 * [Negative](arithmetic/Negative_1.md)
 * [NonMaxSuppression](sort/NonMaxSuppression_5.md)
--- a/docs/ops/opset9.md
+++ b/docs/ops/opset9.md
@ -105,7 +105,7 @@ declared in `namespace opset9`.
 * [Mish](activation/Mish_4.md)
 * [Mod](arithmetic/Mod_1.md)
 * [MVN](normalization/MVN_6.md)
-* [MulticlassNMS](sort/MulticlassNMS_8.md)
+* [MulticlassNMS](sort/MulticlassNonMaxSuppression_9.md)
 * [Multiply](arithmetic/Multiply_1.md)
 * [Negative](arithmetic/Negative_1.md)
 * [NonMaxSuppression](sort/NonMaxSuppression_5.md)
--- a/docs/ops/sort/MulticlassNonMaxSuppression_8.md
+++ b/docs/ops/sort/MulticlassNonMaxSuppression_8.md
--- a/docs/ops/sort/MulticlassNonMaxSuppression_9.md
+++ b/docs/ops/sort/MulticlassNonMaxSuppression_9.md
@ -0,0 +1,208 @@
+##  MulticlassNonMaxSuppression<a name="MulticlassNonMaxSuppression"></a> {#openvino_docs_ops_sort_MulticlassNonMaxSuppression_9}
+
+**Versioned name**: *MulticlassNonMaxSuppression-9*
+
+**Category**: *Sorting and maximization*
+
+**Short description**: *MulticlassNonMaxSuppression* performs multi-class non-maximum suppression of the boxes with predicted scores.
+
+**Detailed description**: *MulticlassNonMaxSuppression* is a multi-phase operation. It implements non-maximum suppression algorithm as described below:
+
+1.  Let `B = [b_0,...,b_n]` be the list of initial detection boxes, `S = [s_0,...,s_N]` be  the list of corresponding scores.
+2.  Let `D = []` be an initial collection of resulting boxes. Let `adaptive_threshold = iou_threshold`.
+3.  If `B` is empty, go to step 9.
+4.  Take the box with highest score. Suppose that it is the box `b` with the score `s`.
+5.  Delete `b` from `B`.
+6.  If the score `s` is greater than or equal to `score_threshold`,  add `b` to `D`, else go to step 9.
+7.  If `nms_eta < 1` and `adaptive_threshold > 0.5`, update `adaptive_threshold *= nms_eta`.
+8.  For each input box `b_i` from `B` and the corresponding score `s_i`, set `s_i = 0` when `iou(b, b_i) > adaptive_threshold`, and go to step 3.
+9.  Return `D`, a collection of the corresponding scores `S`, and the number of elements in `D`.
+
+This algorithm is applied independently to each class of each batch element. The operation feeds at most `nms_top_k` scoring candidate boxes to this algorithm.
+The total number of output boxes of each batch element must not exceed `keep_top_k`.
+Boxes of `background_class` are skipped and thus eliminated.
+
+**Attributes**:
+
+* *sort_result*
+
+  * **Description**: *sort_result* specifies the order of output elements.
+  * **Range of values**: `class`, `score`, `none`
+    * *class* - sort selected boxes by class id (ascending).
+    * *score* - sort selected boxes by score (descending).
+    * *none* - do not guarantee the order.
+  * **Type**: `string`
+  * **Default value**: `none`
+  * **Required**: *no*
+
+* *sort_result_across_batch*
+
+  * **Description**: *sort_result_across_batch* is a flag that specifies whenever it is necessary to sort selected boxes across batches or not.
+  * **Range of values**: true or false
+    * *true* - sort selected boxes across batches.
+    * *false* - do not sort selected boxes across batches (boxes are sorted per batch element).
+  * **Type**: boolean
+  * **Default value**: false
+  * **Required**: *no*
+
+* *output_type*
+
+  * **Description**: the tensor type of outputs `selected_indices` and `valid_outputs`.
+  * **Range of values**: `i64` or `i32`
+  * **Type**: `string`
+  * **Default value**: `i64`
+  * **Required**: *no*
+
+* *iou_threshold*
+
+  * **Description**: intersection over union threshold.
+  * **Range of values**: a floating-point number
+  * **Type**: `float`
+  * **Default value**: `0`
+  * **Required**: *no*
+
+* *score_threshold*
+
+  * **Description**: minimum score to consider box for the processing.
+  * **Range of values**: a floating-point number
+  * **Type**: `float`
+  * **Default value**: `0`
+  * **Required**: *no*
+
+* *nms_top_k*
+
+  * **Description**: maximum number of boxes to be selected per class.
+  * **Range of values**: an integer
+  * **Type**: `int`
+  * **Default value**: `-1` meaning to keep all boxes
+  * **Required**: *no*
+
+* *keep_top_k*
+
+  * **Description**: maximum number of boxes to be selected per batch element.
+  * **Range of values**: an integer
+  * **Type**: `int`
+  * **Default value**: `-1` meaning to keep all boxes
+  * **Required**: *no*
+
+* *background_class*
+
+  * **Description**: the background class id.
+  * **Range of values**: an integer
+  * **Type**: `int`
+  * **Default value**: `-1` meaning to keep all classes.
+  * **Required**: *no*
+
+* *normalized*
+
+  * **Description**: *normalized* is a flag that indicates whether `boxes` are normalized or not.
+  * **Range of values**: true or false
+    * *true* - the box coordinates are normalized.
+    * *false* - the box coordinates are not normalized.
+  * **Type**: boolean
+  * **Default value**: True
+  * **Required**: *no*
+
+* *nms_eta*
+
+  * **Description**: eta parameter for adaptive NMS.
+  * **Range of values**: a floating-point number in close range `[0, 1.0]`.
+  * **Type**: `float`
+  * **Default value**: `1.0`
+  * **Required**: *no*
+
+**Inputs**:
+
+There are 2 kinds of input formats. The first one is of two inputs. The boxes are shared by all classes.
+*   **1**: `boxes` - tensor of type *T* and shape `[num_batches, num_boxes, 4]` with box coordinates. The box coordinates are layout as `[xmin, ymin, xmax, ymax]`. **Required.**
+
+*   **2**: `scores` - tensor of type *T* and shape `[num_batches, num_classes, num_boxes]` with box scores. The tensor type should be same with `boxes`. **Required.**
+
+The second format is of three inputs. Each class has its own boxes that are not shared.
+*   **1**: `boxes` - tensor of type *T* and shape `[num_classes, num_boxes, 4]` with box coordinates. The box coordinates are layout as `[xmin, ymin, xmax, ymax]`. **Required.**
+
+*   **2**: `scores` - tensor of type *T* and shape `[num_classes, num_boxes]` with box scores. The tensor type should be same with `boxes`. **Required.**
+
+*   **3**: `roisnum` - tensor of type *T_IND* and shape `[num_batches]` with box numbers in each image. `num_batches` is the number of images. Each element in this tensor is the number of boxes for corresponding image. The sum of all elements is `num_boxes`. **Required.**
+
+**Outputs**:
+
+*   **1**: `selected_outputs` - tensor of type *T* which should be same with `boxes` and shape `[number of selected boxes, 6]` containing the selected boxes with score and class as tuples `[class_id, box_score, xmin, ymin, xmax, ymax]`.
+
+*   **2**: `selected_indices` - tensor of type *T_IND* and shape `[number of selected boxes, 1]` the selected indices in the flattened `boxes`, which are absolute values cross batches. Therefore possible valid values are in the range `[0, num_batches * num_boxes - 1]`.
+
+*   **3**: `selected_num` - 1D tensor of type *T_IND* and shape `[num_batches]` representing the number of selected boxes for each batch element.
+
+When there is no box selected, `selected_num` is filled with `0`. `selected_outputs` is an empty tensor of shape `[0, 6]`, and `selected_indices` is an empty tensor of shape `[0, 1]`.
+
+**Types**
+
+* *T*: floating-point type.
+
+* *T_IND*: `int64` or `int32`.
+
+**Example**
+
+```xml
+<layer ... type="MulticlassNonMaxSuppression" ... >
+    <data sort_result="score" output_type="i64" sort_result_across_batch="false" iou_threshold="0.2" score_threshold="0.5" nms_top_k="-1" keep_top_k="-1" background_class="-1" normalized="false" nms_eta="0.0"/>
+    <input>
+        <port id="0">
+            <dim>3</dim>
+            <dim>100</dim>
+            <dim>4</dim>
+        </port>
+        <port id="1">
+            <dim>3</dim>
+            <dim>5</dim>
+            <dim>100</dim>
+        </port>
+    </input>
+    <output>
+        <port id="5" precision="FP32">
+            <dim>-1</dim> <!-- "-1" means a undefined dimension calculated during the model inference -->
+            <dim>6</dim>
+        </port>
+        <port id="6" precision="I64">
+            <dim>-1</dim>
+            <dim>1</dim>
+        </port>
+        <port id="7" precision="I64">
+            <dim>3</dim>
+        </port>
+    </output>
+</layer>
+```
+Another possible example with 3 inputs could be like:
+```xml
+<layer ... type="MulticlassNonMaxSuppression" ... >
+    <data sort_result="score" output_type="i64" sort_result_across_batch="false" iou_threshold="0.2" score_threshold="0.5" nms_top_k="-1" keep_top_k="-1" background_class="-1" normalized="false" nms_eta="0.0"/>
+    <input>
+        <port id="0">
+            <dim>3</dim>
+            <dim>100</dim>
+            <dim>4</dim>
+        </port>
+        <port id="1">
+            <dim>3</dim>
+            <dim>100</dim>
+        </port>
+        <port id="2">
+            <dim>10</dim>
+	</port>
+    </input>
+    <output>
+        <port id="5" precision="FP32">
+            <dim>-1</dim> <!-- "-1" means a undefined dimension calculated during the model inference -->
+            <dim>6</dim>
+        </port>
+        <port id="6" precision="I64">
+            <dim>-1</dim>
+            <dim>1</dim>
+        </port>
+        <port id="7" precision="I64">
+            <dim>3</dim>
+        </port>
+    </output>
+</layer>
+```