Revise DetectionOutput reference implementation (#3448)

* Revise DetectionOutput reference implementation

Ticket: 37433

* fix test_create_op

* fix test_dyn_attributes

* apply code format

* fix crash on Windows when variance_encoded_in_target == 1

* add more checks for DetectionOutput inputs

* Fix single layer tests

* apply code format

* fix ssd_vgg16_300 inference with batch size > 1

* update types in docs

* fix crash on windows

* apply code style

* fix python tests

* fix setting output type

* change False to false and True to true in docs

* Allow prior boxes to have different type than box logits

Some models work that way

* simplify output shape calculation

* fixes to docs
Mateusz Tabaka 2020-12-16 09:02:12 +01:00 committed by GitHub
parent 602f8f2e08
commit 47485646bb
16 changed files with 2082 additions and 144 deletions


@@ -6,7 +6,7 @@
 **Short description**: *DetectionOutput* performs non-maximum suppression to generate the detection output using information on location and confidence predictions.
 
-**Detailed description**: [Reference](https://arxiv.org/pdf/1512.02325.pdf). The layer has 3 mandatory inputs: tensor with box logits, tensor with confidence predictions and tensor with box coordinates (proposals). It can have 2 additional inputs with additional confidence predictions and box coordinates described in the [article](https://arxiv.org/pdf/1711.06897.pdf). The 5-input version of the layer is supported with Myriad plugin only. The output tensor contains information about filtered detections described with 7 element tuples: *[batch_id, class_id, confidence, x_1, y_1, x_2, y_2]*. The first tuple with *batch_id* equal to *-1* means end of output.
+**Detailed description**: [Reference](https://arxiv.org/pdf/1512.02325.pdf). The layer has 3 mandatory inputs: tensor with box logits, tensor with confidence predictions and tensor with box coordinates (proposals). It can have 2 additional inputs with additional confidence predictions and box coordinates described in the [article](https://arxiv.org/pdf/1711.06897.pdf). The output tensor contains information about filtered detections described with 7 element tuples: `[batch_id, class_id, confidence, x_1, y_1, x_2, y_2]`. The first tuple with `batch_id` equal to `-1` means end of output.
 
 At each feature map cell, *DetectionOutput* predicts the offsets relative to the default box shapes in the cell, as well as the per-class scores that indicate the presence of a class instance in each of those boxes. Specifically, for each box out of k at a given location, *DetectionOutput* computes class scores and the four offsets relative to the original default box shape. This results in a total of \f$(c + 4)k\f$ filters that are applied around each location in the feature map, yielding \f$(c + 4)kmn\f$ outputs for a *m \* n* feature map.
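As a quick sanity check of the \f$(c + 4)kmn\f$ formula above, a minimal sketch; the SSD300-style numbers in the usage are illustrative assumptions, not values taken from this PR:

```cpp
#include <cassert>

// Filters applied at each feature-map location: c class scores plus 4 box
// offsets for each of the k default boxes.
inline int detection_filters(int c, int k) { return (c + 4) * k; }

// Total outputs produced over an m x n feature map.
inline int detection_outputs(int c, int k, int m, int n)
{
    return detection_filters(c, k) * m * n;
}
```

For example, with c = 21 classes and k = 4 default boxes on a 38x38 map, `detection_filters` gives 100 and `detection_outputs` gives 144400.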
@@ -63,9 +63,9 @@ At each feature map cell, *DetectionOutput* predicts the offsets relative to the
 * *share_location*
 * **Description**: *share_location* is a flag that denotes if bounding boxes are shared among different classes.
-* **Range of values**: 0 or 1
-* **Type**: int
-* **Default value**: 1
+* **Range of values**: false or true
+* **Type**: boolean
+* **Default value**: true
 * **Required**: *no*
 * *nms_threshold*
@@ -87,35 +87,35 @@ At each feature map cell, *DetectionOutput* predicts the offsets relative to the
 * *clip_after_nms*
 * **Description**: *clip_after_nms* flag that denotes whether to perform clip bounding boxes after non-maximum suppression or not.
-* **Range of values**: 0 or 1
-* **Type**: int
-* **Default value**: 0
+* **Range of values**: false or true
+* **Type**: boolean
+* **Default value**: false
 * **Required**: *no*
 * *clip_before_nms*
 * **Description**: *clip_before_nms* flag that denotes whether to perform clip bounding boxes before non-maximum suppression or not.
-* **Range of values**: 0 or 1
-* **Type**: int
-* **Default value**: 0
+* **Range of values**: false or true
+* **Type**: boolean
+* **Default value**: false
 * **Required**: *no*
 * *decrease_label_id*
 * **Description**: *decrease_label_id* flag that denotes how to perform NMS.
 * **Range of values**:
-    * 0 - perform NMS like in Caffe\*.
-    * 1 - perform NMS like in MxNet\*.
-* **Type**: int
-* **Default value**: 0
+    * false - perform NMS like in Caffe\*.
+    * true - perform NMS like in MxNet\*.
+* **Type**: boolean
+* **Default value**: false
 * **Required**: *no*
 * *normalized*
-* **Description**: *normalized* flag that denotes whether input tensors with boxes are normalized. If tensors are not normalized then *input_height* and *input_width* attributes are used to normalize box coordinates.
-* **Range of values**: 0 or 1
-* **Type**: int
-* **Default value**: 0
+* **Description**: *normalized* flag that denotes whether input tensor with proposal boxes is normalized. If tensor is not normalized then *input_height* and *input_width* attributes are used to normalize box coordinates.
+* **Range of values**: false or true
+* **Type**: boolean
+* **Default value**: false
 * **Required**: *no*
 * *input_height (input_width)*
@@ -133,21 +133,52 @@ At each feature map cell, *DetectionOutput* predicts the offsets relative to the
 * **Type**: float
 * **Default value**: 0
 * **Required**: *no*
 
 **Inputs**
 
-* **1**: 2D input tensor with box logits. Required.
-* **2**: 2D input tensor with class predictions. Required.
-* **3**: 3D input tensor with proposals. Required.
-* **4**: 2D input tensor with additional class predictions information described in the [article](https://arxiv.org/pdf/1711.06897.pdf). Optional.
-* **5**: 2D input tensor with additional box predictions information described in the [article](https://arxiv.org/pdf/1711.06897.pdf). Optional.
+* **1**: 2D input tensor with box logits with shape `[N, num_prior_boxes * num_loc_classes * 4]` and type *T*. `num_loc_classes` is equal to `num_classes` when `share_location` is 0, or it's equal to 1 otherwise. Required.
+* **2**: 2D input tensor with class predictions with shape `[N, num_prior_boxes * num_classes]` and type *T*. Required.
+* **3**: 3D input tensor with proposals with shape `[priors_batch_size, 1, num_prior_boxes * prior_box_size]` or `[priors_batch_size, 2, num_prior_boxes * prior_box_size]`. `priors_batch_size` is either 1 or `N`. Size of the second dimension depends on `variance_encoded_in_target`. If `variance_encoded_in_target` is equal to 0, the second dimension equals 2 and variance values are provided for each box coordinate. If `variance_encoded_in_target` is equal to 1, the second dimension equals 1 and this tensor contains proposal boxes only. `prior_box_size` is equal to 4 when `normalized` is set to 1, or it's equal to 5 otherwise. Required.
+* **4**: 2D input tensor with additional class predictions information described in the [article](https://arxiv.org/pdf/1711.06897.pdf). Its shape must be equal to `[N, num_prior_boxes * 2]`. Optional.
+* **5**: 2D input tensor with additional box predictions information described in the [article](https://arxiv.org/pdf/1711.06897.pdf). Its shape must be equal to the first input tensor shape. Optional.
+
+**Outputs**
+
+* **1**: 4D output tensor with type *T*. Its shape depends on `keep_top_k` or `top_k` being set. If `keep_top_k[0]` is greater than zero, then the shape is `[1, 1, N * keep_top_k[0], 7]`. If `keep_top_k[0]` is set to -1 and `top_k` is greater than zero, then the shape is `[1, 1, N * top_k * num_classes, 7]`. Otherwise, the output shape is equal to `[1, 1, N * num_classes * num_prior_boxes, 7]`.
+
+**Types**
+
+* *T*: any supported floating point type.
 
 **Example**
 
 ```xml
 <layer ... type="DetectionOutput" ... >
-    <data num_classes="21" share_location="1" background_label_id="0" nms_threshold="0.450000" top_k="400" input_height="1" input_width="1" code_type="caffe.PriorBoxParameter.CENTER_SIZE" variance_encoded_in_target="0" keep_top_k="200" confidence_threshold="0.010000"/>
-    <input> ... </input>
-    <output> ... </output>
+    <data background_label_id="1" code_type="caffe.PriorBoxParameter.CENTER_SIZE" confidence_threshold="0.019999999552965164" eta="1.0" height="0" height_scale="0" input_height="1" input_width="1" interp_mode="" keep_top_k="200" nms_threshold="0.44999998807907104" normalized="1" num_classes="2" pad_mode="" pad_value="" prob="0" resize_mode="" share_location="1" top_k="200" variance_encoded_in_target="0" visualize_threshold="0.6" width="0" width_scale="0"/>
+    <input>
+        <port id="0">
+            <dim>1</dim>
+            <dim>5376</dim>
+        </port>
+        <port id="1">
+            <dim>1</dim>
+            <dim>2688</dim>
+        </port>
+        <port id="2">
+            <dim>1</dim>
+            <dim>2</dim>
+            <dim>5376</dim>
+        </port>
+    </input>
+    <output>
+        <port id="3" precision="FP32">
+            <dim>1</dim>
+            <dim>1</dim>
+            <dim>200</dim>
+            <dim>7</dim>
+        </port>
+    </output>
 </layer>
 ```
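The documented output-shape rules (`keep_top_k[0]` wins when positive, then `top_k * num_classes`, otherwise `num_classes * num_prior_boxes`) can be sketched as a small helper; the function name is illustrative, not part of the operation's API:

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <vector>

// Mirrors the documented DetectionOutput output-shape selection for a
// batch of n images.
inline std::array<int64_t, 4> detection_output_shape(int64_t n,
                                                     const std::vector<int>& keep_top_k,
                                                     int top_k,
                                                     int num_classes,
                                                     int64_t num_prior_boxes)
{
    if (!keep_top_k.empty() && keep_top_k[0] > 0)
        return {1, 1, n * keep_top_k[0], 7};
    if (top_k > 0)
        return {1, 1, n * top_k * num_classes, 7};
    return {1, 1, n * num_classes * num_prior_boxes, 7};
}
```

With the example above (`N` = 1, `keep_top_k` = 200) this reproduces the `[1, 1, 200, 7]` output port shape.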


@@ -931,11 +931,12 @@ void DetectionOutputValidator::parseParams(CNNLayer* layer) {
     if (_nms_threshold < 0) {
         THROW_IE_EXCEPTION << "nms_threshold parameter of DetectionOutput layer can't be less then zero";
     }
-    int _keep_top_k = layer->GetParamAsUInt("keep_top_k", -1);
+    int _keep_top_k = layer->GetParamAsInt("keep_top_k", -1);
     if (layer->CheckParamPresence("background_label_id"))
-        int _background_label_id = layer->GetParamAsUInt("background_label_id", -1);
-    if (layer->CheckParamPresence("top_k")) int _top_k = layer->GetParamAsUInt("top_k", -1);
+        int _background_label_id = layer->GetParamAsInt("background_label_id", -1);
+    if (layer->CheckParamPresence("top_k"))
+        int _top_k = layer->GetParamAsInt("top_k", -1);
     if (layer->CheckParamPresence("variance_encoded_in_target"))
         bool _variance_encoded_in_target = static_cast<bool>(layer->GetParamAsUInt("variance_encoded_in_target", 0));
     if (layer->CheckParamPresence("num_orient_classes"))
@@ -947,7 +948,7 @@ void DetectionOutputValidator::parseParams(CNNLayer* layer) {
     if (layer->CheckParamPresence("confidence_threshold")) {
         float _confidence_threshold = layer->GetParamAsFloat("confidence_threshold");
         if (_confidence_threshold < 0) {
-            THROW_IE_EXCEPTION << "_nms_threshold parameter of DetectionOutput layer can't be less then zero";
+            THROW_IE_EXCEPTION << "_confidence_threshold parameter of DetectionOutput layer can't be less then zero";
         }
     }


@@ -278,17 +278,23 @@ public:
             }
         }
 
+        const int num_results = outputs[0]->getTensorDesc().getDims()[2];
         const int DETECTION_SIZE = outputs[0]->getTensorDesc().getDims()[3];
         if (DETECTION_SIZE != 7) {
             return NOT_IMPLEMENTED;
         }
 
-        auto dst_data_size = N * _keep_top_k * DETECTION_SIZE * sizeof(float);
+        int dst_data_size = 0;
+        if (_keep_top_k > 0)
+            dst_data_size = N * _keep_top_k * DETECTION_SIZE * sizeof(float);
+        else if (_top_k > 0)
+            dst_data_size = N * _top_k * _num_classes * DETECTION_SIZE * sizeof(float);
+        else
+            dst_data_size = N * _num_classes * _num_priors * DETECTION_SIZE * sizeof(float);
+
         if (dst_data_size > outputs[0]->byteSize()) {
             return OUT_OF_BOUNDS;
         }
 
         memset(dst_data, 0, dst_data_size);
 
         int count = 0;
@@ -331,7 +337,7 @@ public:
             }
         }
 
-        if (count < N*_keep_top_k) {
+        if (count < num_results) {
             // marker at end of boxes list
             dst_data[count * DETECTION_SIZE + 0] = -1;
         }


@@ -28,11 +28,11 @@ namespace ngraph
         int background_label_id = 0;
         int top_k = -1;
         bool variance_encoded_in_target = false;
-        std::vector<int> keep_top_k = {1};
+        std::vector<int> keep_top_k;
         std::string code_type = std::string{"caffe.PriorBoxParameter.CORNER"};
         bool share_location = true;
         float nms_threshold;
-        float confidence_threshold = std::numeric_limits<float>::min();
+        float confidence_threshold = 0;
         bool clip_after_nms = false;
         bool clip_before_nms = false;
         bool decrease_label_id = false;


@@ -9,6 +9,7 @@
 #include <string>
 #include <vector>
 
+#include "ngraph/op/detection_output.hpp"
 #include "ngraph/shape.hpp"
 
 namespace ngraph
@@ -37,7 +38,6 @@ namespace ngraph
                 dataType ymin = dataType(0);
                 dataType xmax = dataType(0);
                 dataType ymax = dataType(0);
-                dataType size = dataType(0);
             };
 
             using LabelBBox = std::map<int, std::vector<NormalizedBBox>>;
@@ -45,8 +45,11 @@ namespace ngraph
                 size_t numImages;
                 size_t priorSize;
                 size_t numPriors;
+                size_t priorsBatchSize;
                 size_t numLocClasses;
                 size_t offset;
+                size_t numResults;
+                size_t outTotalSize;
 
                 void GetLocPredictions(const dataType* locData, std::vector<LabelBBox>& locations)
                 {
@@ -145,13 +148,12 @@ namespace ngraph
                                     std::vector<std::vector<NormalizedBBox>>& priorBboxes,
                                     std::vector<std::vector<std::vector<dataType>>>& priorVariances)
                 {
-                    priorBboxes.resize(numImages);
-                    priorVariances.resize(numImages);
-                    for (int n = 0; n < numImages; n++)
+                    priorBboxes.resize(priorsBatchSize);
+                    priorVariances.resize(priorsBatchSize);
+                    int off = attrs.variance_encoded_in_target ? (numPriors * priorSize)
+                                                               : (2 * numPriors * priorSize);
+                    for (int n = 0; n < priorsBatchSize; n++)
                     {
-                        priorData += attrs.variance_encoded_in_target
-                                         ? n * numPriors * priorSize
-                                         : 2 * n * numPriors * priorSize;
                         std::vector<NormalizedBBox>& currPrBbox = priorBboxes[n];
                         std::vector<std::vector<dataType>>& currPrVar = priorVariances[n];
                         for (int i = 0; i < numPriors; ++i)
@@ -162,8 +164,6 @@ namespace ngraph
                             bbox.ymin = priorData[start_idx + 1 + offset];
                             bbox.xmax = priorData[start_idx + 2 + offset];
                             bbox.ymax = priorData[start_idx + 3 + offset];
-                            dataType bbox_size = BBoxSize(bbox);
-                            bbox.size = bbox_size;
                             currPrBbox.push_back(bbox);
                         }
                         if (!attrs.variance_encoded_in_target)
@@ -172,14 +172,15 @@ namespace ngraph
                             for (int i = 0; i < numPriors; ++i)
                             {
                                 int start_idx = i * 4;
-                                std::vector<dataType> var;
+                                std::vector<dataType> var(4);
                                 for (int j = 0; j < 4; ++j)
                                 {
-                                    var.push_back(priorVar[start_idx + j]);
+                                    var[j] = (priorVar[start_idx + j]);
                                 }
                                 currPrVar.push_back(var);
                             }
                         }
+                        priorData += off;
                     }
                 }
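The batch stride applied by the new `priorData += off;` line can be isolated as a tiny helper (names illustrative): each batch carries one row of `numPriors * priorSize` box values, plus a second row of variances unless `variance_encoded_in_target` is set.

```cpp
#include <cassert>
#include <cstddef>

// Stride (in elements) between consecutive batches of the priors tensor.
inline std::size_t prior_batch_stride(bool variance_encoded_in_target,
                                      std::size_t num_priors,
                                      std::size_t prior_box_size)
{
    return variance_encoded_in_target ? num_priors * prior_box_size
                                      : 2 * num_priors * prior_box_size;
}
```

For the example above (1344 normalized priors, `prior_box_size` = 4), the stride is 5376 elements with encoded variances and 10752 without.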
@@ -200,22 +201,13 @@ namespace ngraph
                         priorXmax /= attrs.input_width;
                         priorYmax /= attrs.input_height;
                     }
 
                     if (attrs.code_type == "caffe.PriorBoxParameter.CORNER")
                     {
-                        if (attrs.variance_encoded_in_target)
-                        {
-                            decodeBbox.xmin = priorXmin + bbox.xmin;
-                            decodeBbox.ymin = priorYmin + bbox.ymin;
-                            decodeBbox.xmax = priorXmax + bbox.xmax;
-                            decodeBbox.ymax = priorYmax + bbox.ymax;
-                        }
-                        else
-                        {
-                            decodeBbox.xmin = priorXmin + priorVariances[0] * bbox.xmin;
-                            decodeBbox.ymin = priorYmin + priorVariances[1] * bbox.ymin;
-                            decodeBbox.xmax = priorXmax + priorVariances[2] * bbox.xmax;
-                            decodeBbox.ymax = priorYmax + priorVariances[3] * bbox.ymax;
-                        }
+                        decodeBbox.xmin = priorXmin + priorVariances[0] * bbox.xmin;
+                        decodeBbox.ymin = priorYmin + priorVariances[1] * bbox.ymin;
+                        decodeBbox.xmax = priorXmax + priorVariances[2] * bbox.xmax;
+                        decodeBbox.ymax = priorYmax + priorVariances[3] * bbox.ymax;
                     }
                     else if (attrs.code_type == "caffe.PriorBoxParameter.CENTER_SIZE")
                     {
@@ -225,41 +217,60 @@ namespace ngraph
                         dataType priorCenterY = (priorYmin + priorYmax) / 2;
                         dataType decodeBboxCenterX, decodeBboxCenterY;
                         dataType decodeBboxWidth, decodeBboxHeight;
-                        if (attrs.variance_encoded_in_target)
-                        {
-                            decodeBboxCenterX = bbox.xmin * priorWidth + priorCenterX;
-                            decodeBboxCenterY = bbox.ymin * priorHeight + priorCenterY;
-                            decodeBboxWidth = std::exp(bbox.xmax) * priorWidth;
-                            decodeBboxHeight = std::exp(bbox.ymax) * priorHeight;
-                        }
-                        else
-                        {
-                            decodeBboxCenterX =
-                                priorVariances[0] * bbox.xmin * priorWidth + priorCenterX;
-                            decodeBboxCenterY =
-                                priorVariances[1] * bbox.ymin * priorHeight + priorCenterY;
-                            decodeBboxWidth = std::exp(priorVariances[2] * bbox.xmax) * priorWidth;
-                            decodeBboxHeight =
-                                std::exp(priorVariances[3] * bbox.ymax) * priorHeight;
-                        }
+                        decodeBboxCenterX =
+                            priorVariances[0] * bbox.xmin * priorWidth + priorCenterX;
+                        decodeBboxCenterY =
+                            priorVariances[1] * bbox.ymin * priorHeight + priorCenterY;
+                        decodeBboxWidth = std::exp(priorVariances[2] * bbox.xmax) * priorWidth;
+                        decodeBboxHeight = std::exp(priorVariances[3] * bbox.ymax) * priorHeight;
                         decodeBbox.xmin = decodeBboxCenterX - decodeBboxWidth / 2;
                         decodeBbox.ymin = decodeBboxCenterY - decodeBboxHeight / 2;
                         decodeBbox.xmax = decodeBboxCenterX + decodeBboxWidth / 2;
                         decodeBbox.ymax = decodeBboxCenterY + decodeBboxHeight / 2;
                     }
-                    if (attrs.clip_before_nms)
-                    {
-                        decodeBbox.xmin =
-                            std::max<dataType>(0, std::min<dataType>(1, decodeBbox.xmin));
-                        decodeBbox.ymin =
-                            std::max<dataType>(0, std::min<dataType>(1, decodeBbox.ymin));
-                        decodeBbox.xmax =
-                            std::max<dataType>(0, std::min<dataType>(1, decodeBbox.xmax));
-                        decodeBbox.ymax =
-                            std::max<dataType>(0, std::min<dataType>(1, decodeBbox.ymax));
-                    }
-                    dataType bboxSize = BBoxSize(decodeBbox);
-                    decodeBbox.size = bboxSize;
+                }
+
+                void DecodeBBox(const NormalizedBBox& priorBboxes,
+                                const NormalizedBBox& bbox,
+                                NormalizedBBox& decodeBbox)
+                {
+                    dataType priorXmin = priorBboxes.xmin;
+                    dataType priorYmin = priorBboxes.ymin;
+                    dataType priorXmax = priorBboxes.xmax;
+                    dataType priorYmax = priorBboxes.ymax;
+
+                    if (!attrs.normalized)
+                    {
+                        priorXmin /= attrs.input_width;
+                        priorYmin /= attrs.input_height;
+                        priorXmax /= attrs.input_width;
+                        priorYmax /= attrs.input_height;
+                    }
+
+                    if (attrs.code_type == "caffe.PriorBoxParameter.CORNER")
+                    {
+                        decodeBbox.xmin = priorXmin + bbox.xmin;
+                        decodeBbox.ymin = priorYmin + bbox.ymin;
+                        decodeBbox.xmax = priorXmax + bbox.xmax;
+                        decodeBbox.ymax = priorYmax + bbox.ymax;
+                    }
+                    else if (attrs.code_type == "caffe.PriorBoxParameter.CENTER_SIZE")
+                    {
+                        dataType priorWidth = priorXmax - priorXmin;
+                        dataType priorHeight = priorYmax - priorYmin;
+                        dataType priorCenterX = (priorXmin + priorXmax) / 2;
+                        dataType priorCenterY = (priorYmin + priorYmax) / 2;
+                        dataType decodeBboxCenterX, decodeBboxCenterY;
+                        dataType decodeBboxWidth, decodeBboxHeight;
+                        decodeBboxCenterX = bbox.xmin * priorWidth + priorCenterX;
+                        decodeBboxCenterY = bbox.ymin * priorHeight + priorCenterY;
+                        decodeBboxWidth = std::exp(bbox.xmax) * priorWidth;
+                        decodeBboxHeight = std::exp(bbox.ymax) * priorHeight;
+                        decodeBbox.xmin = decodeBboxCenterX - decodeBboxWidth / 2;
+                        decodeBbox.ymin = decodeBboxCenterY - decodeBboxHeight / 2;
+                        decodeBbox.xmax = decodeBboxCenterX + decodeBboxWidth / 2;
+                        decodeBbox.ymax = decodeBboxCenterY + decodeBboxHeight / 2;
+                    }
                 }
                void DecodeBBoxes(const std::vector<NormalizedBBox>& priorBboxes,
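The CENTER_SIZE decode path with per-coordinate variances (the `variance_encoded_in_target == 0` branch) can be sketched stand-alone; the `Box` struct and function name here are illustrative, not the reference implementation's types:

```cpp
#include <cassert>
#include <cmath>

struct Box
{
    float xmin, ymin, xmax, ymax;
};

// Decode a CENTER_SIZE-encoded delta against a prior box, scaling each
// component by its variance before applying it.
inline Box decode_center_size(const Box& prior, const float var[4], const Box& delta)
{
    const float pw = prior.xmax - prior.xmin;
    const float ph = prior.ymax - prior.ymin;
    const float pcx = (prior.xmin + prior.xmax) / 2;
    const float pcy = (prior.ymin + prior.ymax) / 2;
    const float cx = var[0] * delta.xmin * pw + pcx;
    const float cy = var[1] * delta.ymin * ph + pcy;
    const float w = std::exp(var[2] * delta.xmax) * pw;
    const float h = std::exp(var[3] * delta.ymax) * ph;
    return {cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2};
}
```

A zero delta decodes back to the prior box itself, which is a handy invariant for testing either branch.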
@@ -271,7 +282,27 @@ namespace ngraph
                     for (int i = 0; i < numBboxes; ++i)
                     {
                         NormalizedBBox decodeBbox;
-                        DecodeBBox(priorBboxes[i], priorVariances[i], labelLocPreds[i], decodeBbox);
+
+                        if (attrs.variance_encoded_in_target)
+                        {
+                            DecodeBBox(priorBboxes[i], labelLocPreds[i], decodeBbox);
+                        }
+                        else
+                        {
+                            DecodeBBox(
+                                priorBboxes[i], priorVariances[i], labelLocPreds[i], decodeBbox);
+                        }
+
+                        if (attrs.clip_before_nms)
+                        {
+                            decodeBbox.xmin =
+                                std::max<dataType>(0, std::min<dataType>(1, decodeBbox.xmin));
+                            decodeBbox.ymin =
+                                std::max<dataType>(0, std::min<dataType>(1, decodeBbox.ymin));
+                            decodeBbox.xmax =
+                                std::max<dataType>(0, std::min<dataType>(1, decodeBbox.xmax));
+                            decodeBbox.ymax =
+                                std::max<dataType>(0, std::min<dataType>(1, decodeBbox.ymax));
+                        }
+
                         decodeBboxes.push_back(decodeBbox);
                     }
                 }
@@ -286,12 +317,19 @@ namespace ngraph
                     for (int i = 0; i < numImages; ++i)
                     {
                         LabelBBox& decodeBboxesImage = decodeBboxes[i];
-                        const std::vector<NormalizedBBox>& currPrBbox = priorBboxes[i];
-                        const std::vector<std::vector<dataType>>& currPrVar = priorVariances[i];
+                        int pboxIdx = i;
+                        if (priorBboxes.size() == 1)
+                        {
+                            pboxIdx = 0;
+                        }
+                        const std::vector<NormalizedBBox>& currPrBbox = priorBboxes[pboxIdx];
+                        const std::vector<std::vector<dataType>>& currPrVar =
+                            priorVariances[pboxIdx];
                         for (int c = 0; c < numLocClasses; ++c)
                         {
                             int label = attrs.share_location ? -1 : c;
-                            if (label == attrs.background_label_id)
+                            if (attrs.background_label_id > -1 &&
+                                label == attrs.background_label_id)
                             {
                                 continue;
                             }
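The `pboxIdx` selection above amounts to broadcasting a single-batch priors input over all images. A minimal sketch, with an illustrative name:

```cpp
#include <cassert>
#include <cstddef>

// Priors with batch size 1 are shared by every image in the batch;
// otherwise each image indexes its own row of priors.
inline std::size_t prior_batch_index(std::size_t image_index,
                                     std::size_t priors_batch_size)
{
    return priors_batch_size == 1 ? 0 : image_index;
}
```

This is what makes ssd_vgg16_300-style models with batch size > 1 work when the priors tensor keeps batch size 1.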
@@ -319,7 +357,8 @@ namespace ngraph
                     for (int c = 0; c < numLocClasses; ++c)
                     {
                         int label = attrs.share_location ? -1 : c;
-                        if (label == attrs.background_label_id)
+                        if (attrs.background_label_id > -1 &&
+                            label == attrs.background_label_id)
                         {
                             continue;
                         }
@@ -360,6 +399,7 @@ namespace ngraph
                     std::stable_sort(
                         scoreIndexVec.begin(), scoreIndexVec.end(), SortScorePairDescend<int>);
+
                     if (topK > -1 && topK < scoreIndexVec.size())
                     {
                         scoreIndexVec.resize(topK);
@@ -391,6 +431,7 @@ namespace ngraph
                     {
                         NormalizedBBox intersectBbox;
                         IntersectBBox(bbox1, bbox2, intersectBbox);
+
                         dataType intersectWidth, intersectHeight;
                         intersectWidth = intersectBbox.xmax - intersectBbox.xmin;
                         intersectHeight = intersectBbox.ymax - intersectBbox.ymin;
@@ -399,7 +440,6 @@ namespace ngraph
                         dataType intersect_size = intersectWidth * intersectHeight;
                         dataType bbox1_size = BBoxSize(bbox1);
                         dataType bbox2_size = BBoxSize(bbox2);
-
                         return intersect_size / (bbox1_size + bbox2_size - intersect_size);
                     }
                     else
@@ -423,6 +463,7 @@ namespace ngraph
                         {
                             const int kept_idx = indices[k];
                             dataType overlap = JaccardOverlap(bboxes[idx], bboxes[kept_idx]);
+
                             if (overlap > attrs.nms_threshold)
                             {
                                 keep = false;
@@ -448,6 +489,8 @@ namespace ngraph
                         int id = 0;
                         for (int c = 1; c < attrs.num_classes; c++)
                         {
+                            if (attrs.background_label_id > -1 && c == attrs.background_label_id)
+                                continue;
                             dataType temp = confScores.at(c)[p];
                             if (temp > conf)
                             {
@@ -497,15 +540,19 @@ namespace ngraph
            public:
                referenceDetectionOutput(const ngraph::op::DetectionOutputAttrs& _attrs,
                                         const ngraph::Shape& locShape,
-                                        const ngraph::Shape& priorsShape)
+                                        const ngraph::Shape& priorsShape,
+                                        const ngraph::Shape& outShape)
                    : attrs(_attrs)
                {
                    numImages = locShape[0];
                    priorSize = _attrs.normalized ? 4 : 5;
                    offset = _attrs.normalized ? 0 : 1;
                    numPriors = priorsShape[2] / priorSize;
+                    priorsBatchSize = priorsShape[0];
                    numLocClasses =
                        _attrs.share_location ? 1 : static_cast<size_t>(_attrs.num_classes);
+                    numResults = outShape[2];
+                    outTotalSize = shape_size(outShape);
                }
 
                void run(const dataType* _location,
@@ -515,6 +562,7 @@ namespace ngraph
                         const dataType* _armLocation,
                         dataType* result)
                {
+                    std::memset(result, 0, outTotalSize * sizeof(dataType));
                    bool withAddBoxPred = _armConfidence != nullptr && _armLocation != nullptr;
                    std::vector<LabelBBox> armLocPreds;
                    if (withAddBoxPred)
@@ -566,6 +614,7 @@ namespace ngraph
                        if (confScores.find(c) == confScores.end())
                            continue;
                        const std::vector<dataType>& scores = confScores.find(c)->second;
+
                        int label = attrs.share_location ? -1 : c;
                        if (decodeBboxesImage.find(label) == decodeBboxesImage.end())
                            continue;
@@ -666,7 +715,7 @@ namespace ngraph
                        }
                    }
                }
-                if (count < numImages * attrs.keep_top_k[0])
+                if (count < numResults)
                {
                    result[count * 7 + 0] = -1;
                }


@@ -45,16 +45,223 @@ op::DetectionOutput::DetectionOutput(const Output<Node>& box_logits,
 
 void op::DetectionOutput::validate_and_infer_types()
 {
-    if (get_input_partial_shape(0).is_static())
+    NODE_VALIDATION_CHECK(
+        this, m_attrs.num_classes > 0, "Number of classes must be greater than zero");
+
+    NODE_VALIDATION_CHECK(
+        this, m_attrs.keep_top_k.size() > 0, "keep_top_k attribute must be provided");
+
+    NODE_VALIDATION_CHECK(this,
+                          m_attrs.code_type == "caffe.PriorBoxParameter.CORNER" ||
+                              m_attrs.code_type == "caffe.PriorBoxParameter.CENTER_SIZE",
+                          "code_type must be either \"caffe.PriorBoxParameter.CORNER\" or "
+                          "\"caffe.PriorBoxParameter.CENTER_SIZE\"");
+
+    auto box_logits_et = get_input_element_type(0);
+    NODE_VALIDATION_CHECK(this,
+                          box_logits_et.is_real(),
+                          "Box logits' data type must be floating point. Got " +
+                              box_logits_et.get_type_name());
+
+    auto class_preds_et = get_input_element_type(1);
+    NODE_VALIDATION_CHECK(this,
+                          class_preds_et == box_logits_et,
+                          "Class predictions' data type must be the same as box logits type (" +
+                              box_logits_et.get_type_name() + "). Got " +
+                              class_preds_et.get_type_name());
+
+    auto proposals_et = get_input_element_type(2);
+    NODE_VALIDATION_CHECK(this,
+                          proposals_et.is_real(),
+                          "Proposals' data type must be floating point. Got " +
+                              proposals_et.get_type_name());
+
+    const PartialShape& box_logits_pshape = get_input_partial_shape(0);
+    const PartialShape& class_preds_pshape = get_input_partial_shape(1);
+    const PartialShape& proposals_pshape = get_input_partial_shape(2);
+
+    int num_loc_classes = m_attrs.share_location ? 1 : m_attrs.num_classes;
+    int prior_box_size = m_attrs.normalized ? 4 : 5;
+
+    Dimension num_images = Dimension::dynamic();
+    Dimension num_prior_boxes = Dimension::dynamic();
+    if (box_logits_pshape.rank().is_static())
     {
-        auto box_logits_shape = get_input_partial_shape(0).to_shape();
-        set_output_type(
-            0, element::f32, Shape{1, 1, m_attrs.keep_top_k[0] * box_logits_shape[0], 7});
+        NODE_VALIDATION_CHECK(this,
+                              box_logits_pshape.rank().get_length() == 2,
+                              "Box logits rank must be 2. Got " +
+                                  std::to_string(box_logits_pshape.rank().get_length()));
+        num_images = box_logits_pshape[0];
+        if (box_logits_pshape[1].is_static())
+        {
+            NODE_VALIDATION_CHECK(
+                this,
+                (box_logits_pshape[1].get_length() % (num_loc_classes * 4)) == 0,
+                "Box logits' second dimension must be a multiply of num_loc_classes * 4 (" +
+                    std::to_string(num_loc_classes * 4) + "). Current value is: ",
+                box_logits_pshape[1].get_length(),
+                ".");
+            num_prior_boxes = box_logits_pshape[1].get_length() / (num_loc_classes * 4);
+        }
+    }
+    if (class_preds_pshape.rank().is_static())
+    {
+        NODE_VALIDATION_CHECK(this,
+                              class_preds_pshape.rank().get_length() == 2,
+                              "Class predictions rank must be 2. Got " +
+                                  std::to_string(class_preds_pshape.rank().get_length()));
+        if (num_images.is_dynamic() && class_preds_pshape[0].is_static())
+        {
+            num_images = class_preds_pshape[0];
+        }
+        else
+        {
+            NODE_VALIDATION_CHECK(
+                this,
+                class_preds_pshape[0].compatible(num_images),
+                "Class predictions' first dimension is not compatible with batch size.");
+        }
+        if (class_preds_pshape[1].is_static())
+        {
+            if (num_prior_boxes.is_dynamic())
+            {
+                NODE_VALIDATION_CHECK(
+                    this,
+                    class_preds_pshape[1].get_length() % m_attrs.num_classes == 0,
+                    "Class predictions' second dimension must be a multiply of num_classes (" +
+                        std::to_string(m_attrs.num_classes) + "). Current value is: ",
+                    class_preds_pshape[1].get_length(),
+                    ".");
+                num_prior_boxes = class_preds_pshape[1].get_length() / m_attrs.num_classes;
+            }
+            else
+            {
+                int num_prior_boxes_val = num_prior_boxes.get_length();
+                NODE_VALIDATION_CHECK(
+                    this,
+                    class_preds_pshape[1].get_length() ==
+                        num_prior_boxes_val * m_attrs.num_classes,
+                    "Class predictions' second dimension must be equal to num_prior_boxes * "
+                    "num_classes (" +
+                        std::to_string(num_prior_boxes_val * m_attrs.num_classes) +
+                        "). Current value is: ",
+                    class_preds_pshape[1].get_length(),
+                    ".");
+            }
+        }
+    }
+    if (proposals_pshape.rank().is_static())
+    {
+        NODE_VALIDATION_CHECK(this,
+                              proposals_pshape.rank().get_length() == 3,
+                              "Proposals rank must be 3. Got " +
+                                  std::to_string(proposals_pshape.rank().get_length()));
+        if (num_images.is_static() && proposals_pshape[0].is_static())
+        {
+            int64_t proposals_1st_dim = proposals_pshape[0].get_length();
+            int64_t num_images_val = num_images.get_length();
+            NODE_VALIDATION_CHECK(
+                this,
+                proposals_1st_dim == 1 || proposals_1st_dim == num_images_val,
+                "Proposals' first dimension is must be equal to either batch size (" +
+                    std::to_string(num_images_val) + ") or 1. Got: " +
+                    std::to_string(proposals_1st_dim) + ".");
+        }
+        if (proposals_pshape[1].is_static())
+        {
+            size_t proposals_expected_2nd_dim = m_attrs.variance_encoded_in_target ? 1 : 2;
+            NODE_VALIDATION_CHECK(this,
+                                  proposals_pshape[1].compatible(proposals_expected_2nd_dim),
+                                  "Proposals' second dimension is mismatched. Current value is: ",
+                                  proposals_pshape[1].get_length(),
+                                  ", expected: ",
+                                  proposals_expected_2nd_dim,
+                                  ".");
+        }
+        if (proposals_pshape[2].is_static())
+        {
+            if (num_prior_boxes.is_dynamic())
+            {
+                NODE_VALIDATION_CHECK(
+                    this,
+                    proposals_pshape[2].get_length() % prior_box_size == 0,
+                    "Proposals' third dimension must be a multiply of prior_box_size (" +
+                        std::to_string(prior_box_size) + "). Current value is: ",
+                    proposals_pshape[2].get_length(),
+                    ".");
+                num_prior_boxes = proposals_pshape[2].get_length() / prior_box_size;
+            }
+            else
+            {
+                int num_prior_boxes_val = num_prior_boxes.get_length();
+                NODE_VALIDATION_CHECK(this,
+                                      proposals_pshape[2].get_length() ==
+                                          num_prior_boxes_val * prior_box_size,
+                                      "Proposals' third dimension must be equal to num_prior_boxes 
"* prior_box_size (" +
std::to_string(num_prior_boxes_val * prior_box_size) +
"). Current value is: ",
proposals_pshape[2].get_length(),
".");
}
}
}
if (get_input_size() > 3)
{
auto aux_class_preds_et = get_input_element_type(3);
NODE_VALIDATION_CHECK(this,
aux_class_preds_et == class_preds_et,
"Additional class predictions' data type must be the same as class "
"predictions data type (" +
class_preds_et.get_type_name() + "). Got " +
aux_class_preds_et.get_type_name());
auto aux_box_preds_et = get_input_element_type(4);
NODE_VALIDATION_CHECK(
this,
aux_box_preds_et == box_logits_et,
"Additional box predictions' data type must be the same as box logits data type (" +
box_logits_et.get_type_name() + "). Got " + aux_box_preds_et.get_type_name());
const PartialShape& aux_class_preds_pshape = get_input_partial_shape(3);
const PartialShape& aux_box_preds_pshape = get_input_partial_shape(4);
if (aux_class_preds_pshape.rank().is_static())
{
NODE_VALIDATION_CHECK(this,
aux_class_preds_pshape[0].compatible(num_images),
"Additional class predictions' first dimension must be "
"compatible with batch size.");
if (num_prior_boxes.is_static())
{
int num_prior_boxes_val = num_prior_boxes.get_length();
NODE_VALIDATION_CHECK(
this,
aux_class_preds_pshape[1].get_length() == num_prior_boxes_val * 2,
"Additional class predictions' second dimension must be equal to "
"num_prior_boxes * 2 (" +
std::to_string(num_prior_boxes_val * 2) + "). Got " +
std::to_string(aux_class_preds_pshape[1].get_length()) + ".");
}
}
NODE_VALIDATION_CHECK(
this,
aux_box_preds_pshape.compatible(box_logits_pshape),
"Additional box predictions' shape must be compatible with box logits shape.");
}
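For reference, the input-shape consistency rules that the checks above enforce for the three mandatory inputs can be sketched in plain Python (the helper name and plain-tuple shapes are illustrative only, not part of the ngraph API):

```python
# Sketch of the shape rules validate_and_infer_types enforces for the three
# mandatory DetectionOutput inputs (static shapes only, for illustration).
def check_detection_output_shapes(box_logits, class_preds, proposals,
                                  num_classes, share_location=True,
                                  normalized=True,
                                  variance_encoded_in_target=False):
    num_loc_classes = 1 if share_location else num_classes
    prior_box_size = 4 if normalized else 5

    # box logits: [N, num_prior_boxes * num_loc_classes * 4]
    num_images = box_logits[0]
    assert box_logits[1] % (num_loc_classes * 4) == 0
    num_prior_boxes = box_logits[1] // (num_loc_classes * 4)

    # class predictions: [N, num_prior_boxes * num_classes]
    assert class_preds[0] == num_images
    assert class_preds[1] == num_prior_boxes * num_classes

    # proposals: [N or 1, 1 or 2, num_prior_boxes * prior_box_size]
    assert proposals[0] in (1, num_images)
    assert proposals[1] == (1 if variance_encoded_in_target else 2)
    assert proposals[2] == num_prior_boxes * prior_box_size
    return num_images, num_prior_boxes

# e.g. 4 images, 5 prior boxes, 2 classes, shared location predictions:
check_detection_output_shapes((4, 20), (4, 10), (4, 2, 20), num_classes=2)
```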
std::vector<Dimension> output_shape{1, 1};
if (m_attrs.keep_top_k[0] > 0)
{
output_shape.push_back(num_images * m_attrs.keep_top_k[0]);
}
else if (m_attrs.top_k > 0)
{
output_shape.push_back(num_images * m_attrs.top_k * m_attrs.num_classes);
 }
 else
 {
-    set_output_type(0, element::f32, PartialShape::dynamic());
+    output_shape.push_back(num_images * num_prior_boxes * m_attrs.num_classes);
 }
+output_shape.push_back(7);
+set_output_type(0, box_logits_et, output_shape);
 }
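The simplified output-shape calculation always yields a `[1, 1, N, 7]` tensor, with `N` chosen from `keep_top_k`, `top_k`, or the total box count. A minimal sketch (assumed helper name, static dimensions only):

```python
# Sketch of DetectionOutput's simplified output-shape rule.
def detection_output_shape(num_images, num_prior_boxes, num_classes,
                           keep_top_k, top_k):
    if keep_top_k[0] > 0:
        n = num_images * keep_top_k[0]
    elif top_k > 0:
        n = num_images * top_k * num_classes
    else:
        # keep every box for every class
        n = num_images * num_prior_boxes * num_classes
    return (1, 1, n, 7)

print(detection_output_shape(4, 5, 2, keep_top_k=[200], top_k=-1))  # (1, 1, 800, 7)
print(detection_output_shape(4, 5, 2, keep_top_k=[-1], top_k=7))    # (1, 1, 56, 7)
print(detection_output_shape(4, 5, 2, keep_top_k=[-1], top_k=-1))   # (1, 1, 40, 7)
```

These three cases match the expected shapes in the `type_prop_layers` tests for this op.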
shared_ptr<Node> op::DetectionOutput::clone_with_new_inputs(const OutputVector& new_args) const


@@ -932,11 +932,11 @@ def test_detection_output(int_dtype, fp_dtype):
         "nms_threshold": fp_dtype(0.645),
     }
-    box_logits = ng.parameter([4, 1, 5, 5], fp_dtype, "box_logits")
-    class_preds = ng.parameter([2, 1, 4, 5], fp_dtype, "class_preds")
-    proposals = ng.parameter([2, 1, 4, 5], fp_dtype, "proposals")
-    aux_class_preds = ng.parameter([2, 1, 4, 5], fp_dtype, "aux_class_preds")
-    aux_box_preds = ng.parameter([2, 1, 4, 5], fp_dtype, "aux_box_preds")
+    box_logits = ng.parameter([4, 8], fp_dtype, "box_logits")
+    class_preds = ng.parameter([4, 170], fp_dtype, "class_preds")
+    proposals = ng.parameter([4, 2, 10], fp_dtype, "proposals")
+    aux_class_preds = ng.parameter([4, 4], fp_dtype, "aux_class_preds")
+    aux_box_preds = ng.parameter([4, 8], fp_dtype, "aux_box_preds")
     node = ng.detection_output(box_logits, class_preds, proposals, attributes, aux_class_preds, aux_box_preds)


@@ -71,7 +71,7 @@ def test_dynamic_get_attribute_value(int_dtype, fp_dtype):
         "top_k": int_dtype(16),
         "variance_encoded_in_target": True,
         "keep_top_k": np.array([64, 32, 16, 8], dtype=int_dtype),
-        "code_type": "pytorch.some_parameter_name",
+        "code_type": "caffe.PriorBoxParameter.CENTER_SIZE",
         "share_location": False,
         "nms_threshold": fp_dtype(0.645),
         "confidence_threshold": fp_dtype(0.111),
@@ -84,11 +84,11 @@ def test_dynamic_get_attribute_value(int_dtype, fp_dtype):
         "objectness_score": fp_dtype(0.77),
     }
-    box_logits = ng.parameter([4, 1, 5, 5], fp_dtype, "box_logits")
-    class_preds = ng.parameter([2, 1, 4, 5], fp_dtype, "class_preds")
-    proposals = ng.parameter([2, 1, 4, 5], fp_dtype, "proposals")
-    aux_class_preds = ng.parameter([2, 1, 4, 5], fp_dtype, "aux_class_preds")
-    aux_box_preds = ng.parameter([2, 1, 4, 5], fp_dtype, "aux_box_preds")
+    box_logits = ng.parameter([4, 680], fp_dtype, "box_logits")
+    class_preds = ng.parameter([4, 170], fp_dtype, "class_preds")
+    proposals = ng.parameter([4, 1, 8], fp_dtype, "proposals")
+    aux_class_preds = ng.parameter([4, 4], fp_dtype, "aux_class_preds")
+    aux_box_preds = ng.parameter([4, 680], fp_dtype, "aux_box_preds")
     node = ng.detection_output(box_logits, class_preds, proposals, attributes, aux_class_preds, aux_box_preds)
@@ -97,7 +97,7 @@ def test_dynamic_get_attribute_value(int_dtype, fp_dtype):
     assert node.get_top_k() == int_dtype(16)
     assert node.get_variance_encoded_in_target()
     assert np.all(np.equal(node.get_keep_top_k(), np.array([64, 32, 16, 8], dtype=int_dtype)))
-    assert node.get_code_type() == "pytorch.some_parameter_name"
+    assert node.get_code_type() == "caffe.PriorBoxParameter.CENTER_SIZE"
     assert not node.get_share_location()
     assert np.isclose(node.get_nms_threshold(), fp_dtype(0.645))
    assert np.isclose(node.get_confidence_threshold(), fp_dtype(0.111))


@@ -117,6 +117,7 @@ set(SRC
     type_prop/ctc_loss.cpp
     type_prop/deformable_convolution.cpp
     type_prop/deformable_psroi_pooling.cpp
+    type_prop/detection_output.cpp
     type_prop/depth_to_space.cpp
     type_prop/dyn_reshape.cpp
     type_prop/strided_slice.cpp
@@ -280,6 +281,7 @@ set(MULTI_TEST_SRC
     backend/cosh.in.cpp
     backend/ctc_greedy_decoder.in.cpp
     backend/cum_sum.in.cpp
+    backend/detection_output.in.cpp
     backend/divide.in.cpp
     backend/dyn_reshape.in.cpp
     backend/strided_slice.in.cpp


@@ -1473,11 +1473,11 @@ TEST(attributes, interpolate_op)
 TEST(attributes, detection_output_op)
 {
     FactoryRegistry<Node>::get().register_factory<opset1::DetectionOutput>();
-    const auto box_logits = make_shared<op::Parameter>(element::f32, Shape{1, 3, 32, 32});
-    const auto class_preds = make_shared<op::Parameter>(element::f32, Shape{32});
-    const auto proposals = make_shared<op::Parameter>(element::f32, Shape{128, 2});
-    const auto aux_class_preds = make_shared<op::Parameter>(element::f32, Shape{16});
-    const auto aux_box_pred = make_shared<op::Parameter>(element::f32, Shape{32, 2});
+    const auto box_logits = make_shared<op::Parameter>(element::f32, Shape{1, 2 * 1 * 4});
+    const auto class_preds = make_shared<op::Parameter>(element::f32, Shape{1, 2 * 32});
+    const auto proposals = make_shared<op::Parameter>(element::f32, Shape{1, 2, 2 * 4});
+    const auto aux_class_preds = make_shared<op::Parameter>(element::f32, Shape{1, 2 * 2});
+    const auto aux_box_pred = make_shared<op::Parameter>(element::f32, Shape{1, 2 * 1 * 4});
     op::DetectionOutputAttrs attrs;
     attrs.num_classes = 32;


@@ -0,0 +1,872 @@
//*****************************************************************************
// Copyright 2020 Intel Corporation
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
//*****************************************************************************
// clang-format off
#ifdef ${BACKEND_NAME}_FLOAT_TOLERANCE_BITS
#define DEFAULT_FLOAT_TOLERANCE_BITS ${BACKEND_NAME}_FLOAT_TOLERANCE_BITS
#endif
#ifdef ${BACKEND_NAME}_DOUBLE_TOLERANCE_BITS
#define DEFAULT_DOUBLE_TOLERANCE_BITS ${BACKEND_NAME}_DOUBLE_TOLERANCE_BITS
#endif
// clang-format on
#include "gtest/gtest.h"
#include "ngraph/ngraph.hpp"
#include "util/engine/test_engines.hpp"
#include "util/test_case.hpp"
#include "util/test_control.hpp"
using namespace std;
using namespace ngraph;
static string s_manifest = "${MANIFEST}";
using TestEngine = test::ENGINE_CLASS_NAME(${BACKEND_NAME});
NGRAPH_TEST(${BACKEND_NAME}, detection_output_3_inputs)
{
op::DetectionOutputAttrs attrs;
attrs.num_classes = 3;
attrs.background_label_id = -1;
attrs.top_k = -1;
attrs.variance_encoded_in_target = true;
attrs.keep_top_k = {2};
attrs.code_type = "caffe.PriorBoxParameter.CORNER";
attrs.share_location = false;
attrs.nms_threshold = 0.5;
attrs.confidence_threshold = 0.3;
attrs.clip_after_nms = false;
attrs.clip_before_nms = true;
attrs.decrease_label_id = false;
attrs.normalized = true;
attrs.input_height = 0;
attrs.input_width = 0;
attrs.objectness_score = 0;
size_t num_prior_boxes = 2;
size_t num_loc_classes = attrs.share_location ? 1 : attrs.num_classes;
size_t prior_box_size = attrs.normalized ? 4 : 5;
size_t num_images = 2;
Shape loc_shape{num_images, num_prior_boxes * num_loc_classes * prior_box_size};
Shape conf_shape{num_images, num_prior_boxes * attrs.num_classes};
Shape prior_boxes_shape{
1, attrs.variance_encoded_in_target ? 1UL : 2UL, num_prior_boxes * prior_box_size};
auto loc = make_shared<op::Parameter>(element::f32, loc_shape);
auto conf = make_shared<op::Parameter>(element::f32, conf_shape);
auto prior_boxes = make_shared<op::Parameter>(element::f32, prior_boxes_shape);
auto f = make_shared<Function>(make_shared<op::DetectionOutput>(loc, conf, prior_boxes, attrs),
ParameterVector{loc, conf, prior_boxes});
auto test_case = test::TestCase<TestEngine>(f);
// locations
    test_case.add_input<float>({
        // batch 0, class 0
        0.1, 0.1, 0.2, 0.2, 0.0, 0.1, 0.2, 0.15,
        // batch 0, class 1
        0.3, 0.2, 0.5, 0.3, 0.2, 0.1, 0.42, 0.66,
        // batch 0, class 2
        0.05, 0.1, 0.2, 0.3, 0.2, 0.1, 0.33, 0.44,
        // batch 1, class 0
        0.2, 0.1, 0.4, 0.2, 0.1, 0.05, 0.2, 0.25,
        // batch 1, class 1
        0.1, 0.2, 0.5, 0.3, 0.1, 0.1, 0.12, 0.34,
        // batch 1, class 2
        0.25, 0.11, 0.4, 0.32, 0.2, 0.12, 0.38, 0.24,
    });
    test_case.add_input<float>({
        // batch 0
        0.1, 0.9, 0.4, 0.7, 0, 0.2,
        // batch 1
        0.7, 0.8, 0.42, 0.33, 0.81, 0.2,
    });
    test_case.add_input<float>({
        // prior box 0
        0.0, 0.5, 0.1, 0.2,
        // prior box 1
        0.0, 0.3, 0.1, 0.35,
    });
Shape output_shape{1, 1, num_images * static_cast<size_t>(attrs.keep_top_k[0]), 7};
test_case.add_expected_output<float>(
output_shape, {0, 0, 0.7, 0.2, 0.4, 0.52, 1, 0, 1, 0.9, 0, 0.6, 0.3, 0.35,
1, 1, 0.81, 0.25, 0.41, 0.5, 0.67, 1, 1, 0.8, 0.1, 0.55, 0.3, 0.45});
test_case.run();
}
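Each row of the expected output above is a 7-element tuple `[image_id, class_id, confidence, x_1, y_1, x_2, y_2]`, and a row whose first element is `-1` marks the end of valid detections. A small sketch of decoding such a flat buffer (illustrative helper, not part of the test framework):

```python
# Decode a flat DetectionOutput buffer into a list of detections (sketch).
def parse_detections(flat):
    dets = []
    for i in range(0, len(flat), 7):
        image_id, label, conf, xmin, ymin, xmax, ymax = flat[i:i + 7]
        if image_id == -1:  # end-of-output marker
            break
        dets.append((int(image_id), int(label), conf, (xmin, ymin, xmax, ymax)))
    return dets

rows = [0, 0, 0.7, 0.2, 0.4, 0.52, 1,
        0, 1, 0.9, 0.0, 0.6, 0.3, 0.35,
        -1, 0, 0, 0, 0, 0, 0]
parse_detections(rows)  # two valid detections, both for image 0
```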
NGRAPH_TEST(${BACKEND_NAME}, detection_output_3_inputs_share_location)
{
op::DetectionOutputAttrs attrs;
attrs.num_classes = 3;
attrs.background_label_id = -1;
attrs.top_k = -1;
attrs.variance_encoded_in_target = true;
attrs.keep_top_k = {2};
attrs.code_type = "caffe.PriorBoxParameter.CORNER";
attrs.share_location = true;
attrs.nms_threshold = 0.5;
attrs.confidence_threshold = 0.3;
attrs.clip_after_nms = false;
attrs.clip_before_nms = true;
attrs.decrease_label_id = false;
attrs.normalized = true;
attrs.input_height = 0;
attrs.input_width = 0;
attrs.objectness_score = 0;
size_t num_prior_boxes = 2;
size_t num_loc_classes = attrs.share_location ? 1 : attrs.num_classes;
size_t prior_box_size = attrs.normalized ? 4 : 5;
size_t num_images = 2;
Shape loc_shape{num_images, num_prior_boxes * num_loc_classes * prior_box_size};
Shape conf_shape{num_images, num_prior_boxes * attrs.num_classes};
Shape prior_boxes_shape{
num_images, attrs.variance_encoded_in_target ? 1UL : 2UL, num_prior_boxes * prior_box_size};
auto loc = make_shared<op::Parameter>(element::f32, loc_shape);
auto conf = make_shared<op::Parameter>(element::f32, conf_shape);
auto prior_boxes = make_shared<op::Parameter>(element::f32, prior_boxes_shape);
auto f = make_shared<Function>(make_shared<op::DetectionOutput>(loc, conf, prior_boxes, attrs),
ParameterVector{loc, conf, prior_boxes});
auto test_case = test::TestCase<TestEngine>(f);
// locations
    test_case.add_input<float>({
        // batch 0
        0.1, 0.1, 0.2, 0.2, 0.0, 0.1, 0.2, 0.15,
        // batch 1
        0.2, 0.1, 0.4, 0.2, 0.1, 0.05, 0.2, 0.25,
    });
    test_case.add_input<float>({
        // batch 0
        0.1, 0.9, 0.4, 0.7, 0, 0.2,
        // batch 1
        0.7, 0.8, 0.42, 0.33, 0.81, 0.2,
    });
    test_case.add_input<float>({
        // batch 0
        0.0, 0.5, 0.1, 0.2, 0.0, 0.3, 0.1, 0.35,
        // batch 1
        0.33, 0.2, 0.52, 0.37, 0.22, 0.1, 0.32, 0.36,
    });
Shape output_shape{1, 1, num_images * static_cast<size_t>(attrs.keep_top_k[0]), 7};
test_case.add_expected_output<float>(output_shape,
{
0, 0, 0.7, 0, 0.4, 0.3, 0.5, 0, 1, 0.9,
0.1, 0.6, 0.3, 0.4, 1, 1, 0.81, 0.32, 0.15, 0.52,
0.61, 1, 1, 0.8, 0.53, 0.3, 0.92, 0.57,
});
test_case.run();
}
NGRAPH_TEST(${BACKEND_NAME}, detection_output_3_inputs_normalized)
{
op::DetectionOutputAttrs attrs;
attrs.num_classes = 3;
attrs.background_label_id = -1;
attrs.top_k = -1;
attrs.variance_encoded_in_target = true;
attrs.keep_top_k = {2};
attrs.code_type = "caffe.PriorBoxParameter.CORNER";
attrs.share_location = true;
attrs.nms_threshold = 0.5;
attrs.confidence_threshold = 0.3;
attrs.clip_after_nms = false;
attrs.clip_before_nms = true;
attrs.decrease_label_id = false;
attrs.normalized = true;
attrs.input_height = 0;
attrs.input_width = 0;
attrs.objectness_score = 0;
size_t num_prior_boxes = 2;
size_t num_loc_classes = attrs.share_location ? 1 : attrs.num_classes;
size_t prior_box_size = attrs.normalized ? 4 : 5;
size_t num_images = 2;
Shape loc_shape{num_images, num_prior_boxes * num_loc_classes * prior_box_size};
Shape conf_shape{num_images, num_prior_boxes * attrs.num_classes};
Shape prior_boxes_shape{
num_images, attrs.variance_encoded_in_target ? 1UL : 2UL, num_prior_boxes * prior_box_size};
auto loc = make_shared<op::Parameter>(element::f32, loc_shape);
auto conf = make_shared<op::Parameter>(element::f32, conf_shape);
auto prior_boxes = make_shared<op::Parameter>(element::f32, prior_boxes_shape);
auto f = make_shared<Function>(make_shared<op::DetectionOutput>(loc, conf, prior_boxes, attrs),
ParameterVector{loc, conf, prior_boxes});
auto test_case = test::TestCase<TestEngine>(f);
// locations
    test_case.add_input<float>({
        // batch 0
        0.1, 0.1, 0.2, 0.2, 0.0, 0.1, 0.2, 0.15,
        // batch 1
        0.2, 0.1, 0.4, 0.2, 0.1, 0.05, 0.2, 0.25,
    });
    test_case.add_input<float>({
        // batch 0
        0.1, 0.9, 0.4, 0.7, 0, 0.2,
        // batch 1
        0.7, 0.8, 0.42, 0.33, 0.81, 0.2,
    });
    test_case.add_input<float>({
        // batch 0
        0.0, 0.5, 0.1, 0.2, 0.0, 0.3, 0.1, 0.35,
        // batch 1
        0.33, 0.2, 0.52, 0.37, 0.22, 0.1, 0.32, 0.36,
    });
Shape output_shape{1, 1, num_images * static_cast<size_t>(attrs.keep_top_k[0]), 7};
test_case.add_expected_output<float>(output_shape,
{
0, 0, 0.7, 0, 0.4, 0.3, 0.5, 0, 1, 0.9,
0.1, 0.6, 0.3, 0.4, 1, 1, 0.81, 0.32, 0.15, 0.52,
0.61, 1, 1, 0.8, 0.53, 0.3, 0.92, 0.57,
});
test_case.run();
}
NGRAPH_TEST(${BACKEND_NAME}, detection_output_3_inputs_keep_all_bboxes)
{
op::DetectionOutputAttrs attrs;
attrs.num_classes = 2;
attrs.background_label_id = -1;
attrs.top_k = -1;
attrs.variance_encoded_in_target = false;
attrs.keep_top_k = {-1};
attrs.code_type = "caffe.PriorBoxParameter.CORNER";
attrs.share_location = false;
attrs.nms_threshold = 0.5;
attrs.confidence_threshold = 0.3;
attrs.clip_after_nms = false;
attrs.clip_before_nms = true;
attrs.decrease_label_id = false;
attrs.normalized = true;
attrs.input_height = 0;
attrs.input_width = 0;
attrs.objectness_score = 0;
size_t num_prior_boxes = 2;
size_t num_loc_classes = attrs.share_location ? 1 : attrs.num_classes;
size_t prior_box_size = attrs.normalized ? 4 : 5;
size_t num_images = 3;
Shape loc_shape{num_images, num_prior_boxes * num_loc_classes * prior_box_size};
Shape conf_shape{num_images, num_prior_boxes * attrs.num_classes};
Shape prior_boxes_shape{
num_images, attrs.variance_encoded_in_target ? 1UL : 2UL, num_prior_boxes * prior_box_size};
auto loc = make_shared<op::Parameter>(element::f32, loc_shape);
auto conf = make_shared<op::Parameter>(element::f32, conf_shape);
auto prior_boxes = make_shared<op::Parameter>(element::f32, prior_boxes_shape);
auto f = make_shared<Function>(make_shared<op::DetectionOutput>(loc, conf, prior_boxes, attrs),
ParameterVector{loc, conf, prior_boxes});
auto test_case = test::TestCase<TestEngine>(f);
// locations
    test_case.add_input<float>({
        // batch 0, class 0
        0.1, 0.1, 0.2, 0.2, 0.0, 0.1, 0.2, 0.15,
        // batch 0, class 1
        0.3, 0.2, 0.5, 0.3, 0.2, 0.1, 0.42, 0.66,
        // batch 1, class 0
        0.05, 0.1, 0.2, 0.3, 0.2, 0.1, 0.33, 0.44,
        // batch 1, class 1
        0.2, 0.1, 0.4, 0.2, 0.1, 0.05, 0.2, 0.25,
        // batch 2, class 0
        0.1, 0.2, 0.5, 0.3, 0.1, 0.1, 0.12, 0.34,
        // batch 2, class 1
        0.25, 0.11, 0.4, 0.32, 0.2, 0.12, 0.38, 0.24,
    });
    test_case.add_input<float>({
        // batch 0
        0.1, 0.9, 0.4, 0.7,
        // batch 1
        0.7, 0.8, 0.42, 0.33,
        // batch 2
        0.1, 0.2, 0.32, 0.43,
    });
    test_case.add_input<float>({
        // batch 0 priors
        0.0, 0.5, 0.1, 0.2, 0.0, 0.3, 0.1, 0.35,
        // batch 0 variances
        0.12, 0.11, 0.32, 0.02, 0.02, 0.20, 0.09, 0.71,
        // batch 1 priors
        0.33, 0.2, 0.52, 0.37, 0.22, 0.1, 0.32, 0.36,
        // batch 1 variances
        0.01, 0.07, 0.12, 0.13, 0.41, 0.33, 0.2, 0.1,
        // batch 2 priors
        0.0, 0.3, 0.1, 0.35, 0.22, 0.1, 0.32, 0.36,
        // batch 2 variances
        0.32, 0.02, 0.13, 0.41, 0.33, 0.2, 0.02, 0.20,
    });
Shape output_shape{1, 1, num_images * attrs.num_classes * num_prior_boxes, 7};
test_case.add_expected_output<float>(
output_shape,
{
0, 0, 0.4, 0.006, 0.34, 0.145, 0.563, 0, 1, 0.9, 0, 0.511, 0.164, 0.203,
0, 1, 0.7, 0.004, 0.32, 0.1378, 0.8186, 1, 0, 0.7, 0.3305, 0.207, 0.544, 0.409,
1, 0, 0.42, 0.302, 0.133, 0.4, 0.38, 1, 1, 0.8, 0.332, 0.207, 0.5596, 0.4272,
1, 1, 0.33, 0.261, 0.1165, 0.36, 0.385, 2, 0, 0.32, 0.3025, 0.122, 0.328, 0.424,
2, 1, 0.43, 0.286, 0.124, 0.3276, 0.408, -1, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
});
test_case.run();
}
NGRAPH_TEST(${BACKEND_NAME}, detection_output_3_inputs_center_size)
{
op::DetectionOutputAttrs attrs;
attrs.num_classes = 3;
attrs.background_label_id = -1;
attrs.top_k = -1;
attrs.variance_encoded_in_target = true;
attrs.keep_top_k = {2};
attrs.code_type = "caffe.PriorBoxParameter.CENTER_SIZE";
attrs.share_location = false;
attrs.nms_threshold = 0.5;
attrs.confidence_threshold = 0.3;
attrs.clip_after_nms = false;
attrs.clip_before_nms = true;
attrs.decrease_label_id = false;
attrs.normalized = true;
attrs.input_height = 0;
attrs.input_width = 0;
attrs.objectness_score = 0;
size_t num_prior_boxes = 2;
size_t num_loc_classes = attrs.share_location ? 1 : attrs.num_classes;
size_t prior_box_size = attrs.normalized ? 4 : 5;
size_t num_images = 2;
Shape loc_shape{num_images, num_prior_boxes * num_loc_classes * prior_box_size};
Shape conf_shape{num_images, num_prior_boxes * attrs.num_classes};
Shape prior_boxes_shape{
num_images, attrs.variance_encoded_in_target ? 1UL : 2UL, num_prior_boxes * prior_box_size};
auto loc = make_shared<op::Parameter>(element::f32, loc_shape);
auto conf = make_shared<op::Parameter>(element::f32, conf_shape);
auto prior_boxes = make_shared<op::Parameter>(element::f32, prior_boxes_shape);
auto f = make_shared<Function>(make_shared<op::DetectionOutput>(loc, conf, prior_boxes, attrs),
ParameterVector{loc, conf, prior_boxes});
auto test_case = test::TestCase<TestEngine>(f);
// locations
    test_case.add_input<float>({
        // batch 0, class 0
        0.1, 0.1, 0.2, 0.2, 0.0, 0.1, 0.2, 0.15,
        // batch 0, class 1
        0.3, 0.2, 0.5, 0.3, 0.2, 0.1, 0.42, 0.66,
        // batch 0, class 2
        0.05, 0.1, 0.2, 0.3, 0.2, 0.1, 0.33, 0.44,
        // batch 1, class 0
        0.2, 0.1, 0.4, 0.2, 0.1, 0.05, 0.2, 0.25,
        // batch 1, class 1
        0.1, 0.2, 0.5, 0.3, 0.1, 0.1, 0.12, 0.34,
        // batch 1, class 2
        0.25, 0.11, 0.4, 0.32, 0.2, 0.12, 0.38, 0.24,
    });
    test_case.add_input<float>({
        // batch 0
        0.1, 0.9, 0.4, 0.7, 0, 0.2,
        // batch 1
        0.7, 0.8, 0.42, 0.33, 0.81, 0.2,
    });
    test_case.add_input<float>({
        // batch 0
        0.0, 0.5, 0.1, 0.2, 0.0, 0.3, 0.1, 0.35,
        // batch 1
        0.33, 0.2, 0.52, 0.37, 0.22, 0.1, 0.32, 0.36,
    });
Shape output_shape{1, 1, num_images * static_cast<size_t>(attrs.keep_top_k[0]), 7};
test_case.add_expected_output<float>(
output_shape,
{
0, 0, 0.7, 0, 0.28163019, 0.14609808, 0.37836978,
0, 1, 0.9, 0, 0.49427515, 0.11107014, 0.14572485,
1, 1, 0.81, 0.22040875, 0.079573378, 0.36959124, 0.4376266,
1, 1, 0.8, 0.32796675, 0.18435785, 0.56003326, 0.40264216,
});
test_case.run();
}
NGRAPH_TEST(${BACKEND_NAME}, detection_output_5_inputs)
{
op::DetectionOutputAttrs attrs;
attrs.num_classes = 2;
attrs.background_label_id = -1;
attrs.top_k = -1;
attrs.variance_encoded_in_target = true;
attrs.keep_top_k = {2};
attrs.code_type = "caffe.PriorBoxParameter.CORNER";
attrs.share_location = false;
attrs.nms_threshold = 0.5;
attrs.confidence_threshold = 0.3;
attrs.clip_after_nms = false;
attrs.clip_before_nms = true;
attrs.decrease_label_id = false;
attrs.normalized = true;
attrs.input_height = 0;
attrs.input_width = 0;
attrs.objectness_score = 0.6;
size_t num_prior_boxes = 2;
size_t num_loc_classes = attrs.share_location ? 1 : attrs.num_classes;
size_t prior_box_size = attrs.normalized ? 4 : 5;
size_t num_images = 2;
Shape loc_shape{num_images, num_prior_boxes * num_loc_classes * prior_box_size};
Shape conf_shape{num_images, num_prior_boxes * attrs.num_classes};
Shape prior_boxes_shape{
num_images, attrs.variance_encoded_in_target ? 1UL : 2UL, num_prior_boxes * prior_box_size};
auto loc = make_shared<op::Parameter>(element::f32, loc_shape);
auto conf = make_shared<op::Parameter>(element::f32, conf_shape);
auto prior_boxes = make_shared<op::Parameter>(element::f32, prior_boxes_shape);
auto aux_loc = make_shared<op::Parameter>(element::f32, loc_shape);
auto aux_conf = make_shared<op::Parameter>(element::f32, conf_shape);
auto f = make_shared<Function>(
make_shared<op::DetectionOutput>(loc, conf, prior_boxes, aux_conf, aux_loc, attrs),
ParameterVector{loc, conf, prior_boxes, aux_conf, aux_loc});
auto test_case = test::TestCase<TestEngine>(f);
// locations
    test_case.add_input<float>({
        // batch 0, class 0
        0.1, 0.1, 0.2, 0.2, 0.0, 0.1, 0.2, 0.15,
        // batch 0, class 1
        0.3, 0.2, 0.5, 0.3, 0.2, 0.1, 0.42, 0.66,
        // batch 1, class 0
        0.2, 0.1, 0.4, 0.2, 0.1, 0.05, 0.2, 0.25,
        // batch 1, class 1
        0.1, 0.2, 0.5, 0.3, 0.1, 0.1, 0.12, 0.34,
    });
    // confidence
    test_case.add_input<float>({
        // batch 0
        0.1, 0.9, 0.4, 0.7,
        // batch 1
        0.42, 0.33, 0.81, 0.2,
    });
    // prior boxes
    test_case.add_input<float>({
        // batch 0
        0.0, 0.5, 0.1, 0.2, 0.0, 0.3, 0.1, 0.35,
        // batch 1
        0.33, 0.2, 0.52, 0.37, 0.22, 0.1, 0.32, 0.36,
    });
    // aux conf
    test_case.add_input<float>({
        // batch 0
        0.1, 0.3, 0.5, 0.8,
        // batch 1
        0.5, 0.8, 0.01, 0.1,
    });
    // aux locations
    test_case.add_input<float>({
        // batch 0, class 0
        0.1, 0.2, 0.5, 0.3, 0.1, 0.1, 0.12, 0.34,
        // batch 0, class 1
        0.25, 0.11, 0.4, 0.32, 0.2, 0.12, 0.38, 0.24,
        // batch 1, class 0
        0.3, 0.2, 0.5, 0.3, 0.2, 0.1, 0.42, 0.66,
        // batch 1, class 1
        0.05, 0.1, 0.2, 0.3, 0.2, 0.1, 0.33, 0.44,
    });
Shape output_shape{1, 1, num_images * static_cast<size_t>(attrs.keep_top_k[0]), 7};
test_case.add_expected_output<float>(
output_shape,
{
0, 0, 0.4, 0.55, 0.61, 1, 0.97, 0, 1, 0.7, 0.4, 0.52, 0.9, 1,
1, 0, 0.42, 0.83, 0.5, 1, 0.87, 1, 1, 0.33, 0.63, 0.35, 1, 1,
});
test_case.run();
}


@@ -106,7 +106,7 @@ graph {
         dim_value: 2
       }
       dim {
-        dim_value: 15
+        dim_value: 12
       }
     }
   }


@@ -3082,12 +3082,12 @@ NGRAPH_TEST(${BACKEND_NAME}, onnx_detection_output)
     std::vector<float> logits = gen_vector(12, -2, 2);
     std::vector<float> class_preds = gen_vector(9, 0, 1);
-    std::vector<float> proposals = gen_vector(15 * 2, 0, 1);
-    std::vector<float> output = {0, 1, 0.777778, 0.241012, 0.260378, 0.418248, 0.499622,
-                                 0, 1, 0.444444, 0.10963, 0.146239, 0.176296, 0.228576,
-                                 0, 2, 0.888889, 0.241012, 0.260378, 0.418248, 0.499622,
-                                 0, 2, 0.555556, 0.10963, 0.146239, 0.176296, 0.228576,
-                                 0, 2, 0.222222, -0.0378917, -0.00169918, -0.00210832, 0.0387362};
+    std::vector<float> proposals = gen_vector(12 * 2, 0, 1);
+    std::vector<float> output = {0, 1, 0.777778, 0.279849, 0.283779, 0.562743, 0.695387,
+                                 0, 1, 0.444444, 0.12963, 0.176075, 0.212963, 0.284573,
+                                 0, 2, 0.888889, 0.279849, 0.283779, 0.562743, 0.695387,
+                                 0, 2, 0.555556, 0.12963, 0.176075, 0.212963, 0.284573,
+                                 0, 2, 0.222222, -0.0608094, -0.0142007, -0.0225239, 0.0304044};
     test_case.add_input<float>(logits);
     test_case.add_input<float>(class_preds);
     test_case.add_input<float>(proposals);


@@ -577,8 +577,10 @@ namespace
                           const HostTensorVector& inputs)
 {
     using T = typename element_type_traits<ET>::value_type;
-    runtime::reference::referenceDetectionOutput<T> refDetOut(
-        op->get_attrs(), op->get_input_shape(0), op->get_input_shape(2));
+    runtime::reference::referenceDetectionOutput<T> refDetOut(op->get_attrs(),
+                                                              op->get_input_shape(0),
+                                                              op->get_input_shape(2),
+                                                              op->get_output_shape(0));
     if (op->get_input_size() == 3)
     {
         refDetOut.run(inputs[0]->get_data_ptr<const T>(),


@@ -0,0 +1,783 @@
//*****************************************************************************
// Copyright 2020 Intel Corporation
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
//*****************************************************************************
#include "gtest/gtest.h"
#include "ngraph/ngraph.hpp"
#include "ngraph/op/detection_output.hpp"
#include "util/type_prop.hpp"
#include <memory>
using namespace std;
using namespace ngraph;
std::shared_ptr<op::DetectionOutput>
create_detection_output(const PartialShape& box_logits_shape,
const PartialShape& class_preds_shape,
const PartialShape& proposals_shape,
const PartialShape& aux_class_preds_shape,
const PartialShape& aux_box_preds_shape,
const op::DetectionOutputAttrs& attrs,
element::Type input_type,
element::Type proposals_type)
{
auto box_logits = make_shared<op::Parameter>(input_type, box_logits_shape);
auto class_preds = make_shared<op::Parameter>(input_type, class_preds_shape);
auto proposals = make_shared<op::Parameter>(proposals_type, proposals_shape);
auto aux_class_preds = make_shared<op::Parameter>(input_type, aux_class_preds_shape);
auto aux_box_preds = make_shared<op::Parameter>(input_type, aux_box_preds_shape);
return make_shared<op::DetectionOutput>(
box_logits, class_preds, proposals, aux_class_preds, aux_box_preds, attrs);
}
TEST(type_prop_layers, detection_output)
{
op::DetectionOutputAttrs attrs;
attrs.keep_top_k = {200};
attrs.num_classes = 2;
attrs.normalized = true;
auto op = create_detection_output(Shape{4, 20},
Shape{4, 10},
Shape{4, 2, 20},
Shape{4, 10},
Shape{4, 20},
attrs,
element::f32,
element::f32);
ASSERT_EQ(op->get_shape(), (Shape{1, 1, 800, 7}));
ASSERT_EQ(op->get_element_type(), element::f32);
}
TEST(type_prop_layers, detection_output_f16)
{
op::DetectionOutputAttrs attrs;
attrs.keep_top_k = {200};
attrs.num_classes = 2;
attrs.normalized = true;
auto op = create_detection_output(Shape{4, 20},
Shape{4, 10},
Shape{4, 2, 20},
Shape{4, 10},
Shape{4, 20},
attrs,
element::f16,
element::f16);
ASSERT_EQ(op->get_shape(), (Shape{1, 1, 800, 7}));
ASSERT_EQ(op->get_element_type(), element::f16);
}
TEST(type_prop_layers, detection_f16_with_proposals_f32)
{
op::DetectionOutputAttrs attrs;
attrs.keep_top_k = {200};
attrs.num_classes = 2;
attrs.normalized = true;
auto op = create_detection_output(Shape{4, 20},
Shape{4, 10},
Shape{4, 2, 20},
Shape{4, 10},
Shape{4, 20},
attrs,
element::f16,
element::f32);
ASSERT_EQ(op->get_shape(), (Shape{1, 1, 800, 7}));
ASSERT_EQ(op->get_element_type(), element::f16);
}
TEST(type_prop_layers, detection_output_not_normalized)
{
op::DetectionOutputAttrs attrs;
attrs.keep_top_k = {200};
attrs.num_classes = 2;
attrs.normalized = false;
auto op = create_detection_output(Shape{4, 20},
Shape{4, 10},
Shape{4, 2, 25},
Shape{4, 10},
Shape{4, 20},
attrs,
element::f32,
element::f32);
ASSERT_EQ(op->get_shape(), (Shape{1, 1, 800, 7}));
ASSERT_EQ(op->get_element_type(), element::f32);
}
TEST(type_prop_layers, detection_output_negative_keep_top_k)
{
op::DetectionOutputAttrs attrs;
attrs.keep_top_k = {-1};
attrs.top_k = -1;
attrs.normalized = true;
attrs.num_classes = 2;
auto op = create_detection_output(Shape{4, 20},
Shape{4, 10},
Shape{4, 2, 20},
Shape{4, 10},
Shape{4, 20},
attrs,
element::f32,
element::f32);
ASSERT_EQ(op->get_shape(), (Shape{1, 1, 40, 7}));
ASSERT_EQ(op->get_element_type(), element::f32);
}
TEST(type_prop_layers, detection_output_no_share_location)
{
op::DetectionOutputAttrs attrs;
attrs.keep_top_k = {-1};
attrs.top_k = -1;
attrs.normalized = true;
attrs.num_classes = 2;
attrs.share_location = false;
auto op = create_detection_output(Shape{4, 40},
Shape{4, 10},
Shape{4, 2, 20},
Shape{4, 10},
Shape{4, 40},
attrs,
element::f32,
element::f32);
ASSERT_EQ(op->get_shape(), (Shape{1, 1, 40, 7}));
ASSERT_EQ(op->get_element_type(), element::f32);
}
TEST(type_prop_layers, detection_output_top_k)
{
op::DetectionOutputAttrs attrs;
attrs.keep_top_k = {-1};
attrs.top_k = 7;
attrs.normalized = true;
attrs.num_classes = 2;
auto op = create_detection_output(Shape{4, 20},
Shape{4, 10},
Shape{4, 2, 20},
Shape{4, 10},
Shape{4, 20},
attrs,
element::f32,
element::f32);
ASSERT_EQ(op->get_shape(), (Shape{1, 1, 56, 7}));
ASSERT_EQ(op->get_element_type(), element::f32);
}
TEST(type_prop_layers, detection_output_all_dynamic_shapes)
{
PartialShape dyn_shape = PartialShape::dynamic();
op::DetectionOutputAttrs attrs;
attrs.keep_top_k = {-1};
attrs.num_classes = 1;
auto op = create_detection_output(
dyn_shape, dyn_shape, dyn_shape, dyn_shape, dyn_shape, attrs, element::f32, element::f32);
ASSERT_EQ(op->get_output_partial_shape(0), (PartialShape{1, 1, Dimension::dynamic(), 7}));
ASSERT_EQ(op->get_element_type(), element::f32);
}
TEST(type_prop_layers, detection_output_dynamic_batch)
{
op::DetectionOutputAttrs attrs;
attrs.keep_top_k = {200};
attrs.num_classes = 2;
attrs.normalized = true;
auto op = create_detection_output(PartialShape{Dimension::dynamic(), 20},
PartialShape{Dimension::dynamic(), 10},
PartialShape{Dimension::dynamic(), 2, 20},
PartialShape{Dimension::dynamic(), 10},
PartialShape{Dimension::dynamic(), 20},
attrs,
element::f32,
element::f32);
ASSERT_EQ(op->get_output_partial_shape(0), (PartialShape{1, 1, Dimension::dynamic(), 7}));
ASSERT_EQ(op->get_element_type(), element::f32);
}
void detection_output_invalid_data_type_test(element::Type box_logits_et,
element::Type class_preds_et,
element::Type proposals_et,
element::Type aux_class_preds_et,
element::Type aux_box_preds_et,
const std::string& expected_msg)
{
try
{
auto box_logits = make_shared<op::Parameter>(box_logits_et, Shape{4, 20});
auto class_preds = make_shared<op::Parameter>(class_preds_et, Shape{4, 10});
auto proposals = make_shared<op::Parameter>(proposals_et, Shape{4, 2, 20});
auto aux_class_preds = make_shared<op::Parameter>(aux_class_preds_et, Shape{4, 10});
auto aux_box_preds = make_shared<op::Parameter>(aux_box_preds_et, Shape{4, 20});
op::DetectionOutputAttrs attrs;
attrs.keep_top_k = {200};
attrs.num_classes = 2;
attrs.normalized = true;
auto op = make_shared<op::DetectionOutput>(
box_logits, class_preds, proposals, aux_class_preds, aux_box_preds, attrs);
FAIL() << "Exception expected";
}
catch (const NodeValidationFailure& error)
{
EXPECT_HAS_SUBSTRING(error.what(), expected_msg);
}
catch (...)
{
FAIL() << "Unknown exception was thrown";
}
}
TEST(type_prop_layers, detection_output_invalid_data_type)
{
detection_output_invalid_data_type_test(
element::i32,
element::f32,
element::f32,
element::f32,
element::f32,
"Box logits' data type must be floating point. Got i32");
detection_output_invalid_data_type_test(
element::f32,
element::i32,
element::f32,
element::f32,
element::f32,
"Class predictions' data type must be the same as box logits type (f32). Got i32");
detection_output_invalid_data_type_test(element::f32,
element::f32,
element::i32,
element::f32,
element::f32,
"Proposals' data type must be floating point. Got i32");
detection_output_invalid_data_type_test(element::f32,
element::f32,
element::f32,
element::i32,
element::f32,
"Additional class predictions' data type must be the "
"same as class predictions data type (f32). Got i32");
detection_output_invalid_data_type_test(element::f32,
element::f32,
element::f32,
element::f32,
element::i32,
"Additional box predictions' data type must be the "
"same as box logits data type (f32). Got i32");
}
TEST(type_prop_layers, detection_output_mismatched_batch_size)
{
{
try
{
op::DetectionOutputAttrs attrs;
attrs.keep_top_k = {200};
attrs.num_classes = 2;
attrs.normalized = true;
auto op = create_detection_output(Shape{4, 20},
Shape{5, 10},
Shape{4, 2, 20},
Shape{4, 10},
Shape{4, 20},
attrs,
element::f32,
element::f32);
FAIL() << "Exception expected";
}
catch (const NodeValidationFailure& error)
{
EXPECT_HAS_SUBSTRING(
error.what(),
std::string(
"Class predictions' first dimension is not compatible with batch size."));
}
catch (...)
{
FAIL() << "Unknown exception was thrown";
}
}
{
try
{
op::DetectionOutputAttrs attrs;
attrs.keep_top_k = {200};
attrs.num_classes = 2;
attrs.normalized = true;
auto op = create_detection_output(Shape{4, 20},
Shape{4, 10},
Shape{5, 2, 20},
Shape{4, 10},
Shape{4, 20},
attrs,
element::f32,
element::f32);
FAIL() << "Exception expected";
}
catch (const NodeValidationFailure& error)
{
EXPECT_HAS_SUBSTRING(error.what(),
std::string("Proposals' first dimension is must be equal to "
"either batch size (4) or 1. Got: 5."));
}
catch (...)
{
FAIL() << "Unknown exception was thrown";
}
}
}
TEST(type_prop_layers, detection_output_invalid_ranks)
{
{
try
{
op::DetectionOutputAttrs attrs;
attrs.keep_top_k = {200};
attrs.num_classes = 2;
attrs.normalized = true;
auto op = create_detection_output(Shape{4, 20, 1},
Shape{4, 10},
Shape{4, 2, 20},
Shape{4, 10},
Shape{4, 20},
attrs,
element::f32,
element::f32);
FAIL() << "Exception expected";
}
catch (const NodeValidationFailure& error)
{
EXPECT_HAS_SUBSTRING(error.what(), std::string("Box logits rank must be 2. Got 3"));
}
catch (...)
{
FAIL() << "Unknown exception was thrown";
}
}
{
try
{
op::DetectionOutputAttrs attrs;
attrs.keep_top_k = {200};
attrs.num_classes = 2;
attrs.normalized = true;
auto op = create_detection_output(Shape{4, 20},
Shape{4, 10, 1},
Shape{4, 2, 20},
Shape{4, 10},
Shape{4, 20},
attrs,
element::f32,
element::f32);
FAIL() << "Exception expected";
}
catch (const NodeValidationFailure& error)
{
EXPECT_HAS_SUBSTRING(error.what(),
std::string("Class predictions rank must be 2. Got 3"));
}
catch (...)
{
FAIL() << "Unknown exception was thrown";
}
}
{
try
{
op::DetectionOutputAttrs attrs;
attrs.keep_top_k = {200};
attrs.num_classes = 2;
attrs.normalized = true;
auto op = create_detection_output(Shape{4, 20},
Shape{4, 10},
Shape{4, 2},
Shape{4, 10},
Shape{4, 20},
attrs,
element::f32,
element::f32);
FAIL() << "Exception expected";
}
catch (const NodeValidationFailure& error)
{
EXPECT_HAS_SUBSTRING(error.what(), std::string("Proposals rank must be 3. Got 2"));
}
catch (...)
{
FAIL() << "Unknown exception was thrown";
}
}
}
TEST(type_prop_layers, detection_output_invalid_box_logits_shape)
{
// share_location = true
{
try
{
op::DetectionOutputAttrs attrs;
attrs.keep_top_k = {-1};
attrs.num_classes = 3;
attrs.share_location = true;
attrs.variance_encoded_in_target = false;
attrs.normalized = true;
auto op = create_detection_output(Shape{4, 13},
Shape{4, 9},
Shape{4, 2, 12},
Shape{4, 6},
Shape{4, 12},
attrs,
element::f32,
element::f32);
FAIL() << "Exception expected";
}
catch (const NodeValidationFailure& error)
{
EXPECT_HAS_SUBSTRING(
error.what(),
std::string(
"Box logits' second dimension must be a multiply of num_loc_classes * 4 (4)"));
}
catch (...)
{
FAIL() << "Unknown exception was thrown";
}
}
// share_location = false
{
try
{
op::DetectionOutputAttrs attrs;
attrs.keep_top_k = {-1};
attrs.num_classes = 3;
attrs.share_location = false;
attrs.variance_encoded_in_target = false;
attrs.normalized = true;
auto op = create_detection_output(Shape{4, 37},
Shape{4, 9},
Shape{4, 2, 12},
Shape{4, 6},
Shape{4, 12},
attrs,
element::f32,
element::f32);
FAIL() << "Exception expected";
}
catch (const NodeValidationFailure& error)
{
EXPECT_HAS_SUBSTRING(
error.what(),
std::string(
"Box logits' second dimension must be a multiply of num_loc_classes * 4 (12)"));
}
catch (...)
{
FAIL() << "Unknown exception was thrown";
}
}
}
TEST(type_prop_layers, detection_output_invalid_class_preds_shape)
{
try
{
op::DetectionOutputAttrs attrs;
attrs.keep_top_k = {-1};
attrs.num_classes = 3;
auto op = create_detection_output(Shape{4, 12},
Shape{4, 10},
Shape{4, 2, 12},
Shape{4, 6},
Shape{4, 12},
attrs,
element::f32,
element::f32);
FAIL() << "Exception expected";
}
catch (const NodeValidationFailure& error)
{
EXPECT_HAS_SUBSTRING(
error.what(),
std::string("Class predictions' second dimension must be equal to "
"num_prior_boxes * num_classes (9). Current value is: 10."));
}
catch (...)
{
FAIL() << "Unknown exception was thrown";
}
}
TEST(type_prop_layers, detection_output_invalid_proposals_shape)
{
// variance_encoded_in_target = false
{
try
{
op::DetectionOutputAttrs attrs;
attrs.keep_top_k = {-1};
attrs.num_classes = 3;
attrs.share_location = true;
attrs.variance_encoded_in_target = false;
attrs.normalized = true;
auto op = create_detection_output(Shape{4, 12},
Shape{4, 9},
Shape{4, 1, 12},
Shape{4, 6},
Shape{4, 12},
attrs,
element::f32,
element::f32);
FAIL() << "Exception expected";
}
catch (const NodeValidationFailure& error)
{
EXPECT_HAS_SUBSTRING(
error.what(),
std::string(
"Proposals' second dimension is mismatched. Current value is: 1, expected: 2"));
}
catch (...)
{
FAIL() << "Unknown exception was thrown";
}
}
// variance_encoded_in_target = true
{
try
{
op::DetectionOutputAttrs attrs;
attrs.keep_top_k = {-1};
attrs.num_classes = 3;
attrs.share_location = true;
attrs.variance_encoded_in_target = true;
attrs.normalized = true;
auto op = create_detection_output(Shape{4, 12},
Shape{4, 9},
Shape{4, 2, 12},
Shape{4, 6},
Shape{4, 12},
attrs,
element::f32,
element::f32);
FAIL() << "Exception expected";
}
catch (const NodeValidationFailure& error)
{
EXPECT_HAS_SUBSTRING(
error.what(),
std::string(
"Proposals' second dimension is mismatched. Current value is: 2, expected: 1"));
}
catch (...)
{
FAIL() << "Unknown exception was thrown";
}
}
// normalized = false
{
try
{
op::DetectionOutputAttrs attrs;
attrs.keep_top_k = {-1};
attrs.num_classes = 3;
attrs.share_location = true;
attrs.variance_encoded_in_target = false;
attrs.normalized = false;
auto op = create_detection_output(Shape{4, 12},
Shape{4, 9},
Shape{4, 2, 16},
Shape{4, 6},
Shape{4, 12},
attrs,
element::f32,
element::f32);
FAIL() << "Exception expected";
}
catch (const NodeValidationFailure& error)
{
EXPECT_HAS_SUBSTRING(
error.what(),
std::string("Proposals' third dimension must be equal to num_prior_boxes * "
"prior_box_size (15). Current value is: 16."));
}
catch (...)
{
FAIL() << "Unknown exception was thrown";
}
}
// normalized = true
{
try
{
op::DetectionOutputAttrs attrs;
attrs.keep_top_k = {-1};
attrs.num_classes = 3;
attrs.share_location = true;
attrs.variance_encoded_in_target = false;
attrs.normalized = true;
auto op = create_detection_output(Shape{4, 12},
Shape{4, 9},
Shape{4, 2, 13},
Shape{4, 6},
Shape{4, 12},
attrs,
element::f32,
element::f32);
FAIL() << "Exception expected";
}
catch (const NodeValidationFailure& error)
{
EXPECT_HAS_SUBSTRING(
error.what(),
std::string("Proposals' third dimension must be equal to num_prior_boxes * "
"prior_box_size (12). Current value is: 13."));
}
catch (...)
{
FAIL() << "Unknown exception was thrown";
}
}
}
TEST(type_prop_layers, detection_output_invalid_aux_class_preds)
{
// invalid batch size
{
try
{
op::DetectionOutputAttrs attrs;
attrs.keep_top_k = {-1};
attrs.num_classes = 3;
attrs.share_location = true;
attrs.variance_encoded_in_target = false;
attrs.normalized = true;
auto op = create_detection_output(Shape{4, 12},
Shape{4, 9},
Shape{4, 2, 12},
Shape{5, 6},
Shape{4, 12},
attrs,
element::f32,
element::f32);
FAIL() << "Exception expected";
}
catch (const NodeValidationFailure& error)
{
EXPECT_HAS_SUBSTRING(error.what(),
std::string("Additional class predictions' first dimension must "
"be compatible with batch size."));
}
catch (...)
{
FAIL() << "Unknown exception was thrown";
}
}
// invalid 2nd dimension
{
try
{
op::DetectionOutputAttrs attrs;
attrs.keep_top_k = {-1};
attrs.num_classes = 3;
attrs.share_location = true;
attrs.variance_encoded_in_target = false;
attrs.normalized = true;
auto op = create_detection_output(Shape{4, 12},
Shape{4, 9},
Shape{4, 2, 12},
Shape{4, 7},
Shape{4, 12},
attrs,
element::f32,
element::f32);
FAIL() << "Exception expected";
}
catch (const NodeValidationFailure& error)
{
EXPECT_HAS_SUBSTRING(error.what(),
std::string("Additional class predictions' second dimension must "
"be equal to num_prior_boxes * 2 (6). Got 7."));
}
catch (...)
{
FAIL() << "Unknown exception was thrown";
}
}
}
TEST(type_prop_layers, detection_output_invalid_aux_box_preds)
{
// invalid batch size
{
try
{
op::DetectionOutputAttrs attrs;
attrs.keep_top_k = {-1};
attrs.num_classes = 3;
attrs.share_location = true;
attrs.variance_encoded_in_target = false;
attrs.normalized = true;
auto op = create_detection_output(Shape{4, 12},
Shape{4, 9},
Shape{4, 2, 12},
Shape{4, 6},
Shape{5, 12},
attrs,
element::f32,
element::f32);
FAIL() << "Exception expected";
}
catch (const NodeValidationFailure& error)
{
EXPECT_HAS_SUBSTRING(
error.what(),
std::string(
"Additional box predictions' shape must be compatible with box logits shape."));
}
catch (...)
{
FAIL() << "Unknown exception was thrown";
}
}
// invalid 2nd dimension
{
try
{
op::DetectionOutputAttrs attrs;
attrs.keep_top_k = {-1};
attrs.num_classes = 3;
attrs.share_location = true;
attrs.variance_encoded_in_target = false;
attrs.normalized = true;
auto op = create_detection_output(Shape{4, 12},
Shape{4, 9},
Shape{4, 2, 12},
Shape{4, 6},
Shape{4, 22},
attrs,
element::f32,
element::f32);
FAIL() << "Exception expected";
}
catch (const NodeValidationFailure& error)
{
EXPECT_HAS_SUBSTRING(
error.what(),
std::string(
"Additional box predictions' shape must be compatible with box logits shape."));
}
catch (...)
{
FAIL() << "Unknown exception was thrown";
}
}
}

@@ -18,7 +18,6 @@
 #include "ngraph/ngraph.hpp"
 #include "ngraph/op/ctc_greedy_decoder.hpp"
-#include "ngraph/op/detection_output.hpp"
 #include "ngraph/op/interpolate.hpp"
 #include "ngraph/op/prior_box.hpp"
 #include "ngraph/op/prior_box_clustered.hpp"
@@ -38,20 +37,6 @@ TEST(type_prop_layers, ctc_greedy_decoder)
     ASSERT_EQ(op->get_shape(), (Shape{2, 88, 1, 1}));
 }
-
-TEST(type_prop_layers, detection_output)
-{
-    auto box_logits = make_shared<op::Parameter>(element::f32, Shape{4, 1, 5, 5});
-    auto class_preds = make_shared<op::Parameter>(element::f32, Shape{2, 1, 4, 5});
-    auto proposals = make_shared<op::Parameter>(element::f32, Shape{2, 1, 4, 5});
-    auto aux_class_preds = make_shared<op::Parameter>(element::f32, Shape{2, 1, 4, 5});
-    auto aux_box_preds = make_shared<op::Parameter>(element::f32, Shape{2, 1, 4, 5});
-    op::DetectionOutputAttrs attrs;
-    attrs.keep_top_k = {200};
-    auto op = make_shared<op::DetectionOutput>(
-        box_logits, class_preds, proposals, aux_class_preds, aux_box_preds, attrs);
-    ASSERT_EQ(op->get_shape(), (Shape{1, 1, 800, 7}));
-}
-
 TEST(type_prop_layers, interpolate)
 {
     auto image = make_shared<op::Parameter>(element::f32, Shape{2, 2, 33, 65});