Compare commits

55 commits, releases/v… 2020.4

| Author | SHA1 | Date |
|---|---|---|
| | 023e7c2c3f | |
| | 34ddb70f7d | |
| | 21e092122f | |
| | 92c1333653 | |
| | c26ec8b312 | |
| | 32054ff180 | |
| | 7cff005ada | |
| | 06707cc53f | |
| | fff93d8f05 | |
| | 637ddd5dfb | |
| | fa4c5e8e38 | |
| | c9fc6f0531 | |
| | c9eb6ae62b | |
| | eef56ca80c | |
| | 36f1c00e02 | |
| | 5c43765011 | |
| | bbfc9bbc14 | |
| | 9c607528ef | |
| | ae9e0510f0 | |
| | 76af547c17 | |
| | 5e97a3123f | |
| | 532dec140b | |
| | c41c6294f9 | |
| | 3bbe88e659 | |
| | 2f3d5f68cd | |
| | 843f81a1cc | |
| | c596707a09 | |
| | cf60baf2f0 | |
| | aeb70036d7 | |
| | dea04dae8c | |
| | 14b44803ba | |
| | 06286f2aae | |
| | 97e5fc4bae | |
| | 47218284b2 | |
| | 6079a35b81 | |
| | 4f4352f301 | |
| | a67d74c41f | |
| | 26c563132d | |
| | dc1ca195dd | |
| | f5ad3e6f89 | |
| | 6c736ce001 | |
| | 30ab6534e1 | |
| | 259a4c25ce | |
| | 347930008c | |
| | 4fa251483a | |
| | 30f8af70fc | |
| | 3fc6d8a188 | |
| | 66c8df6a87 | |
| | e53eb86334 | |
| | 2df99d4263 | |
| | deab4d38b0 | |
| | 412428f1dd | |
| | 167c96a8af | |
| | b7363ba711 | |
| | 5cef9f3734 | |
@@ -1,5 +1,5 @@
 # [OpenVINO™ Toolkit](https://01.org/openvinotoolkit) - Deep Learning Deployment Toolkit repository
-[](https://github.com/openvinotoolkit/openvino/releases/tag/2020.3.0)
+[](https://github.com/openvinotoolkit/openvino/releases/tag/2020.4.0)
 [](LICENSE)

 This toolkit allows developers to deploy pre-trained deep learning models

@@ -52,14 +52,15 @@ as a part of [Intel® Distribution of OpenVINO™].
 ## Build on Linux\* Systems

 The software was validated on:
+- Ubuntu\* 18.04 (64-bit) with default GCC\* 7.5.0
 - Ubuntu\* 16.04 (64-bit) with default GCC\* 5.4.0
 - CentOS\* 7.4 (64-bit) with default GCC\* 4.8.5

 ### Software Requirements
 - [CMake]\* 3.11 or higher
 - GCC\* 4.8 or higher to build the Inference Engine
-- Python 2.7 or higher for Inference Engine Python API wrapper
-- (Optional) [Install Intel® Graphics Compute Runtime for OpenCL™ Driver package 19.41.14441].
+- Python 3.5 or higher for Inference Engine Python API wrapper
+- (Optional) [Install Intel® Graphics Compute Runtime for OpenCL™ Driver package 20.13.16352].

 ### Build Steps
 1. Clone submodules:

@@ -77,7 +78,7 @@ The software was validated on:
    ```
 3. By default, the build enables the Inference Engine GPU plugin to infer models
    on your Intel® Processor Graphics. This requires you to
-   [Install Intel® Graphics Compute Runtime for OpenCL™ Driver package 19.41.14441]
+   [Install Intel® Graphics Compute Runtime for OpenCL™ Driver package 20.13.16352]
    before running the build. If you don't want to use the GPU plugin, use the
    `-DENABLE_CLDNN=OFF` CMake build option and skip the installation of the
    Intel® Graphics Compute Runtime for OpenCL™ Driver.

@@ -202,7 +203,7 @@ Native compilation of the Inference Engine is the most straightforward solution.

 This compilation was tested on the following configuration:

-* Host: Ubuntu\* 16.04 (64-bit, Intel® Core™ i7-6700K CPU @ 4.00GHz × 8)
+* Host: Ubuntu\* 18.04 (64-bit, Intel® Core™ i7-6700K CPU @ 4.00GHz × 8)
 * Target: Raspbian\* Stretch (32-bit, ARMv7, Raspberry Pi\* 3)

 1. Install Docker\*:

@@ -337,7 +338,7 @@ The software was validated on:
 - [CMake]\* 3.11 or higher
 - Microsoft\* Visual Studio 2017, 2019 or [Intel® C++ Compiler] 18.0
 - (Optional) Intel® Graphics Driver for Windows* (26.20) [driver package].
-- Python 3.4 or higher for Inference Engine Python API wrapper
+- Python 3.5 or higher for Inference Engine Python API wrapper

 ### Build Steps

@@ -454,7 +455,7 @@ The software was validated on:

 - [CMake]\* 3.11 or higher
 - Clang\* compiler from Xcode\* 10.1 or higher
-- Python\* 3.4 or higher for the Inference Engine Python API wrapper
+- Python\* 3.5 or higher for the Inference Engine Python API wrapper

 ### Build Steps

@@ -574,8 +575,7 @@ This section describes how to build Inference Engine for Android x86 (64-bit) op

 ## Use Custom OpenCV Builds for Inference Engine

-> **NOTE**: The recommended and tested version of OpenCV is 4.3. The minimum
-supported version is 3.4.0.
+> **NOTE**: The recommended and tested version of OpenCV is 4.4.0.

 Required versions of OpenCV packages are downloaded automatically during the
 building Inference Engine library. If the build script can not find and download

@@ -691,7 +691,7 @@ This target collects all dependencies, prepares the nGraph package and copies it

 [Intel® Distribution of OpenVINO™]:https://software.intel.com/en-us/openvino-toolkit
 [CMake]:https://cmake.org/download/
-[Install Intel® Graphics Compute Runtime for OpenCL™ Driver package 19.41.14441]:https://github.com/intel/compute-runtime/releases/tag/19.41.14441
+[Install Intel® Graphics Compute Runtime for OpenCL™ Driver package 20.13.16352]:https://github.com/intel/compute-runtime/releases/tag/20.13.16352
 [MKL-DNN repository]:https://github.com/intel/mkl-dnn/releases/download/v0.19/mklml_lnx_2019.0.5.20190502.tgz
 [MKL-DNN repository for Windows]:(https://github.com/intel/mkl-dnn/releases/download/v0.19/mklml_win_2019.0.5.20190502.zip)
 [OpenBLAS]:https://sourceforge.net/projects/openblas/files/v0.2.14/OpenBLAS-v0.2.14-Win64-int64.zip/download
@@ -27,8 +27,14 @@ endif()

 if (ENABLE_THREAD_SANITIZER)
     set(SANITIZER_COMPILER_FLAGS "-g -fsanitize=thread -fno-omit-frame-pointer")
-    set(SANITIZER_LINKER_FLAGS "-fsanitize=thread -static-libsan")
+    set(SANITIZER_LINKER_FLAGS "-fsanitize=thread")
+    if(CMAKE_CXX_COMPILER_ID MATCHES "^(Apple)?Clang$" AND NOT WIN32)
+        if(CMAKE_CXX_COMPILER_VERSION VERSION_GREATER_EQUAL 8.0)
+            set(SANITIZER_LINKER_FLAGS "${SANITIZER_LINKER_FLAGS} -fuse-ld=lld")
+        else()
+            set(SANITIZER_LINKER_FLAGS "${SANITIZER_LINKER_FLAGS} -static-libsan")
+        endif()
+    endif()

     set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} ${SANITIZER_COMPILER_FLAGS}")
     set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} ${SANITIZER_COMPILER_FLAGS}")
     set(CMAKE_SHARED_LINKER_FLAGS "${CMAKE_SHARED_LINKER_FLAGS} ${SANITIZER_LINKER_FLAGS}")

@@ -79,7 +79,7 @@ function(ie_build_samples)
                                    MINGW64 CMAKE_BUILD_TYPE CMAKE_MACOSX_RPATH)
         unset(${var})
     endforeach()
+    include(sanitizer)
     add_subdirectory(samples)
 endfunction()
@@ -19,7 +19,7 @@ set(VPU_SUPPORTED_FIRMWARES usb-ma2450 usb-ma2x8x pcie-ma248x)
 #
 # Default packages
 #

-set(FIRMWARE_PACKAGE_VERSION 1216)
+set(FIRMWARE_PACKAGE_VERSION 1223)
 set(VPU_CLC_MA2X8X_VERSION "movi-cltools-20.02.0")

 #
@@ -1,2 +1,2 @@
-numpy
-cython>=0.29
+numpy==1.13.3
+cython==0.29.17

@@ -1,2 +1,2 @@
-opencv-python==3.4.4
-numpy==1.18.1
+opencv-python==3.4.4.19
+numpy==1.13.3
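The requirements hunks above replace open-ended specifiers (a bare `numpy`, `cython>=0.29`) with exact `==` pins, which makes builds reproducible. A small sketch of the distinction, using a simplified requirement parser (the regex here is illustrative, not pip's full grammar):

```python
import re

def parse_requirement(line):
    """Split a pip requirement line into (name, operator, version)."""
    m = re.match(r"^([A-Za-z0-9_.-]+)\s*(==|>=|<=|~=|>|<)\s*([\w.]+)$", line.strip())
    if not m:
        return (line.strip(), None, None)  # unpinned, e.g. bare "numpy"
    return m.groups()

# Exact pins (==) fix the resolved version; bare names and >= ranges do not.
assert parse_requirement("numpy") == ("numpy", None, None)
assert parse_requirement("cython>=0.29") == ("cython", ">=", "0.29")
assert parse_requirement("cython==0.29.17") == ("cython", "==", "0.29.17")
```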
@@ -814,8 +814,8 @@ cdef class ExecutableNetwork:
         current_request = self.requests[0]
         current_request.infer(inputs)
         res = {}
-        for out in current_request._outputs_list:
-            res[out] = deepcopy(current_request.output_blobs[out].buffer)
+        for name, value in current_request.output_blobs.items():
+            res[name] = deepcopy(value.buffer)
         return res
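Both sides of the hunk above `deepcopy` each output buffer before returning, because the request object reuses its buffers on the next `infer` call. A minimal Python sketch of why the copy matters, using a hypothetical `FakeRequest` stand-in (not the real API):

```python
from copy import deepcopy

class FakeRequest:
    """Hypothetical stand-in for an inference request that reuses one output buffer."""
    def __init__(self):
        self.output_blobs = {"prob": [0.0]}

    def infer(self, value):
        self.output_blobs["prob"][0] = value  # overwrites the shared buffer in place

req = FakeRequest()
req.infer(1.0)
kept_reference = req.output_blobs["prob"]       # aliases the internal buffer
kept_copy = deepcopy(req.output_blobs["prob"])  # snapshot, like the hunk above

req.infer(2.0)                  # the next inference reuses the same buffer
assert kept_reference == [2.0]  # the alias silently changed
assert kept_copy == [1.0]       # the deepcopy preserved the first result
```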
@@ -229,12 +229,14 @@ void InferenceEnginePython::IENetwork::serialize(const std::string &path_to_xml,

 const std::vector <InferenceEngine::CNNLayerPtr>
 InferenceEnginePython::IENetwork::getLayers() {
+    IE_SUPPRESS_DEPRECATED_START
     std::vector<InferenceEngine::CNNLayerPtr> result;
     std::vector<InferenceEngine::CNNLayerPtr> sorted_layers = InferenceEngine::details::CNNNetSortTopologically(*actual);
     for (const auto &layer : sorted_layers) {
         result.emplace_back(layer);
     }
     return result;
+    IE_SUPPRESS_DEPRECATED_END
 }

 PyObject* InferenceEnginePython::IENetwork::getFunction() {
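`getLayers` above relies on `CNNNetSortTopologically` to return layers in dependency order. A minimal sketch of topological sorting with Kahn's algorithm on a toy layer graph (layer names are made up for illustration):

```python
from collections import deque

def sort_topologically(edges, nodes):
    """Kahn's algorithm: repeatedly emit nodes whose inputs have all been emitted."""
    indegree = {n: 0 for n in nodes}
    for _, dst in edges:
        indegree[dst] += 1
    ready = deque(n for n in nodes if indegree[n] == 0)
    order = []
    while ready:
        n = ready.popleft()
        order.append(n)
        for src, dst in edges:
            if src == n:
                indegree[dst] -= 1
                if indegree[dst] == 0:
                    ready.append(dst)
    return order

# toy layer graph: input -> conv -> relu -> output
edges = [("input", "conv"), ("conv", "relu"), ("relu", "output")]
order = sort_topologically(edges, ["relu", "output", "input", "conv"])
assert order == ["input", "conv", "relu", "output"]
```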
@@ -1,4 +1,4 @@
 cython==0.29.17
 opencv-python==3.4.4.19
 pytest==4.0.1
 attrs==19.1.0
 pytest-html==1.19.0
@@ -22,12 +22,12 @@
 namespace InferenceEngine {

 /**
- * @deprecated Use InferenceEngine::Core instead. Will be removed in 2020.3
+ * @deprecated Use InferenceEngine::Core instead. Will be removed in 2021.1
  * @brief This class is a C++ API wrapper for IInferencePlugin.
  *
  * It can throw exceptions safely for the application, where it is properly handled.
  */
-class INFERENCE_ENGINE_DEPRECATED("Use InferenceEngine::Core instead. Will be removed in 2020.3") InferencePlugin {
+class INFERENCE_ENGINE_DEPRECATED("Use InferenceEngine::Core instead. Will be removed in 2021.1") InferencePlugin {
     IE_SUPPRESS_DEPRECATED_START
     InferenceEnginePluginPtr actual;
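The hunk above only pushes the removal date of the deprecated `InferencePlugin` class from 2020.3 to 2021.1; the `INFERENCE_ENGINE_DEPRECATED` macro makes the compiler warn at every use site. A rough Python analogue of the same pattern, using the standard `warnings` module (the class here is illustrative, not part of any real API):

```python
import warnings

class InferencePluginLike:
    """Hypothetical class that warns on construction, mirroring a deprecation macro."""
    def __init__(self):
        warnings.warn(
            "Use the Core-style API instead. Will be removed in a future release.",
            DeprecationWarning, stacklevel=2)

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    InferencePluginLike()

assert len(caught) == 1
assert issubclass(caught[0].category, DeprecationWarning)
```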
@@ -21,10 +21,10 @@ namespace InferenceEngine {
 namespace details {

 /**
- * @deprecated Migrate to IR v10 and work with ngraph::Function directly. The method will be removed in 2020.3
+ * @deprecated Migrate to IR v10 and work with ngraph::Function directly. The method will be removed in 2021.1
  * @brief This class enables range loops for CNNNetwork objects
  */
-class INFERENCE_ENGINE_INTERNAL("Migrate to IR v10 and work with ngraph::Function directly. The method will be removed in 2020.3")
+class INFERENCE_ENGINE_INTERNAL("Migrate to IR v10 and work with ngraph::Function directly. The method will be removed in 2021.1")
 CNNNetworkIterator {
     IE_SUPPRESS_DEPRECATED_START
@@ -16,6 +16,7 @@
 namespace InferenceEngine {
 namespace details {

+INFERENCE_ENGINE_INTERNAL("Migrate to IR v10 and work with ngraph::Function directly. The method will be removed in 2021.1")
 INFERENCE_ENGINE_API_CPP(std::vector<CNNLayerPtr>) CNNNetSortTopologically(const ICNNNetwork& network);

 } // namespace details
@@ -126,7 +126,7 @@ public:
     const SizeVector& getDims() const;

     /**
-     * @deprecated Migrate to IR v10 and work with ngraph::Function directly. The method will be removed in 2020.3
+     * @deprecated Migrate to IR v10 and work with ngraph::Function directly. The method will be removed in 2021.1
      * @brief Returns an owner of this data layer, parent layer in di-graph
      * @return A weak pointer to CNNLayer that creates this data
      */

@@ -147,7 +147,7 @@ public:
     void setName(const std::string& newName);

     /**
-     * @deprecated Migrate to IR v10 and work with ngraph::Function directly. The method will be removed in 2020.3
+     * @deprecated Migrate to IR v10 and work with ngraph::Function directly. The method will be removed in 2021.1
      * @brief Privates child layers in di-graph
      * @return A map of child layers
      */
@@ -2049,7 +2049,7 @@ public:
 };

 /**
- * @deprecated Migrate to IR v10 and work with ngraph::Function directly. The method will be removed in 2020.3
+ * @deprecated Migrate to IR v10 and work with ngraph::Function directly. The method will be removed in 2021.1
  * @brief This class represents a standard ScatterUpdate layer
  */
 class INFERENCE_ENGINE_INTERNAL_CNNLAYER_CLASS(ScatterUpdateLayer): public CNNLayer {

@@ -2063,7 +2063,7 @@ public:
 };

 /**
- * @deprecated Migrate to IR v10 and work with ngraph::Function directly. The method will be removed in 2020.3
+ * @deprecated Migrate to IR v10 and work with ngraph::Function directly. The method will be removed in 2021.1
  * @brief This class represents a standard ScatterElementsUpdate layer
  */
 class INFERENCE_ENGINE_INTERNAL_CNNLAYER_CLASS(ScatterElementsUpdateLayer): public CNNLayer {

@@ -2077,7 +2077,7 @@ public:
 };

 /**
- * @deprecated Migrate to IR v10 and work with ngraph::Function directly. The method will be removed in 2020.3
+ * @deprecated Migrate to IR v10 and work with ngraph::Function directly. The method will be removed in 2021.1
  * @brief This class represents an onnx ExperimentalDetectronPriorGridGenerator Layer
  */
 class INFERENCE_ENGINE_INTERNAL_CNNLAYER_CLASS(ExperimentalDetectronPriorGridGeneratorLayer): public CNNLayer {
@@ -123,11 +123,13 @@ DECLARE_VPU_CONFIG_VALUE(NDHWC);
 DECLARE_VPU_CONFIG_KEY(CUSTOM_LAYERS);

 /**
+ * @deprecated IR statistic is not available in IR v10. The option will be removed in 2021.1
  * @brief Ignore statistic in IR by plugin.
  * Plugin could use statistic present in IR in order to try to improve calculations precision.
  * If you don't want statistic to be used enable this option.
  * This option should be used with values: CONFIG_VALUE(YES) or CONFIG_VALUE(NO) (default)
  */
+INFERENCE_ENGINE_DEPRECATED("IR statistic is not available in IR v10. The option will be removed in 2021.1")
 DECLARE_VPU_CONFIG_KEY(IGNORE_IR_STATISTIC);

 /**
@@ -382,6 +382,9 @@ int main(int argc, char* argv[]) {
                 trim(strLine);
                 labels.push_back(strLine);
             }
+            inputFile.close();
+        } else {
+            throw std::logic_error("Cannot read label file");
         }

         ClassificationResult classificationResult(outputBlob, images, batchSize, FLAGS_nt, labels);
@@ -71,8 +71,8 @@ cldnn::device_info clDNNEngine::GetDeviceInfo(const std::map<std::string, std::s
 }

 InferenceEngine::ICNNNetwork::Ptr clDNNEngine::CloneNetwork(const InferenceEngine::ICNNNetwork& network) const {
-    std::shared_ptr<ICNNNetwork> clonedNetwork(nullptr);
-    if (network.getFunction()) {
+    std::shared_ptr<ICNNNetwork> clonedNetwork = cloneNetwork(network);
+    if (clonedNetwork->getFunction()) {
         const auto transformations_callback = [](const std::shared_ptr<const ::ngraph::Node> &node) -> bool {
             // DepthToSpace node implementation supports only equal input/output tensors with rank <= 5
             // Reshape->Permute->Reshape pattern in theory can change output rank, so this check is added to be sure

@@ -84,8 +84,7 @@ InferenceEngine::ICNNNetwork::Ptr clDNNEngine::CloneNetwork(const InferenceEngin
             return std::dynamic_pointer_cast<const ::ngraph::opset2::Gelu>(node) ||
                    std::dynamic_pointer_cast<const ::ngraph::opset3::ShuffleChannels>(node);
         };
-        CNNNetwork net(network.getFunction());
-        auto nGraphFunc = net.getFunction();
+        auto nGraphFunc = clonedNetwork->getFunction();
         // Disable shape inference (WA for generic operations)
         ::ngraph::op::GenericIE::DisableReshape noReshape(nGraphFunc);

@@ -94,9 +93,7 @@ InferenceEngine::ICNNNetwork::Ptr clDNNEngine::CloneNetwork(const InferenceEngin
         ngraph::pass::ConvertOpSet3ToOpSet2(transformations_callback).run_on_function(nGraphFunc);
         ngraph::pass::ConvertOpSet2ToOpSet1(transformations_callback).run_on_function(nGraphFunc);
         ngraph::pass::ConvertOpSet1ToLegacy(transformations_callback).run_on_function(nGraphFunc);
-        clonedNetwork = InferenceEngine::details::convertFunctionToICNNNetwork(nGraphFunc, network);
-    } else {
-        clonedNetwork = cloneNet(network);
+        clonedNetwork = InferenceEngine::details::convertFunctionToICNNNetwork(nGraphFunc, *clonedNetwork);
     }

     auto implNetwork = std::dynamic_pointer_cast<InferenceEngine::details::CNNNetworkImpl>(clonedNetwork);
@@ -3518,10 +3518,29 @@ void Program::AddConstantBlobInput(cldnn::topology& topology, InferenceEngine::C
         return false;
     };

+    // WA to inconsistency between input and const 1d tensors
+    // For Concat along batch we go with batch interpretation
+    // For Gather input we go with batch interpretation
+    bool needsBatchInterpretation = false;
+    if (constDims.size() == 1) {
+        for (auto next : GetNextLayers(layer->outData[0])) {
+            if (LayerTypeFromStr(next->type) == Concatenate) {
+                auto nextConcat = as<InferenceEngine::ConcatLayer*>(next);
+                if (nextConcat->_axis == cldnn::concatenation::concatenation_axis::along_b) {
+                    needsBatchInterpretation = true;
+                    break;
+                }
+            } else if (LayerTypeFromStr(next->type) == Gather) {
+                needsBatchInterpretation = true;
+                break;
+            }
+        }
+    }
+
     // If quantize on weights has per-channel ranges, we have to swap channel and batch dimensions, because
     // quantization should be applied per output channel of weights
     // TODO: Check if it's still needed once LowPrecisionTransformations ready
-    if (inputToConstQuantize(layer)) {
+    if (inputToConstQuantize(layer) || needsBatchInterpretation) {
         constTensor.batch[0] = constTensor.count();
         constTensor.feature[0] = 1;
     }
@@ -3862,11 +3881,13 @@ void Program::CreateStridedSlicePrimitive(cldnn::topology& topology, InferenceEn
     tmp = stridedSliceLayer->GetParamAsUInts("shrink_axis_mask");
     std::vector<uint8_t> shrink_axis_mask(tmp.begin(), tmp.end());

+    auto out_size = CldnnTensorFromIEDims(stridedSliceLayer->outData[0]->getTensorDesc().getDims());
+
     std::string stridedSliceLayerName = layer_type_name_ID(layer);
     auto stridedSlicePrim = cldnn::strided_slice(
             stridedSliceLayerName,
             inputPrimitives[0], inputPrimitives[1], inputPrimitives[2], inputPrimitives[3],
-            begin_mask, end_mask, new_axis_mask, shrink_axis_mask);
+            begin_mask, end_mask, new_axis_mask, shrink_axis_mask, out_size);

     topology.add(stridedSlicePrim);
     AddPrimitiveToProfiler(stridedSliceLayerName, layer);
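The hunk above starts passing the precomputed output size to the strided-slice primitive. That size is fully determined by the begin/end/stride inputs and the masks, so it can be computed up front. A pure-Python 1-D sketch of the mask semantics (modeled loosely on the usual TF/IE convention where a set mask bit means "ignore this bound and use the tensor edge"; an assumption for illustration):

```python
def strided_slice_1d(data, begin, end, stride, begin_mask=0, end_mask=0):
    """1-D strided slice: a mask bit of 1 means 'ignore the bound, use the edge'."""
    b = 0 if begin_mask & 1 else begin
    e = len(data) if end_mask & 1 else end
    return data[b:e:stride]

data = list(range(10))
assert strided_slice_1d(data, 2, 8, 2) == [2, 4, 6]
assert strided_slice_1d(data, 2, 8, 2, begin_mask=1) == [0, 2, 4, 6]
assert strided_slice_1d(data, 2, 8, 2, end_mask=1) == [2, 4, 6, 8]
```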
@@ -359,7 +359,7 @@ void GNAPluginNS::backend::AMIntelDNN::InitDeinterleaveComponentPrivate(intel_dn
     comp.operation = kDnnDeinterleaveOp;
     comp.macro_operation = kDnnMacroOpNone;
     comp.orientation_in = kDnnInterleavedOrientation;
-    comp.orientation_out = kDnnNonInterleavedOrientation;
+    comp.orientation_out = kDnnInterleavedOrientation;
     comp.output_scale_factor = output_scale_factor;
     comp.input_scale_factor = output_scale_factor;
     if (!postInitMem) {

@@ -1524,6 +1524,7 @@ void GNAPluginNS::backend::AMIntelDNN::InitGNAStruct(intel_nnet_type_t *ptr_nnet
             THROW_GNA_EXCEPTION << "Encountered activation component before pooling component at." << i;
         } else {
             const auto poolMode = reinterpret_cast<Gna2PoolingMode*>(gnaUserAllocator(sizeof(Gna2PoolingMode)));
+            IE_ASSERT(poolMode != nullptr);
             *poolMode = (comp.op.maxpool.do_sum_not_max) ? Gna2PoolingModeSum : Gna2PoolingModeMax;
             const auto poolWindow = create_shape1D_parameter(comp.op.maxpool.num_inputs);
             const auto poolStride = create_shape1D_parameter(comp.op.maxpool.num_inputs_step);

@@ -1583,6 +1584,7 @@ void GNAPluginNS::backend::AMIntelDNN::InitGNAStruct(intel_nnet_type_t *ptr_nnet
         case kDnnPiecewiselinearOp:
 #if GNA_LIB_VER == 2
         {
+            IE_ASSERT(gnaOperation->Operands != nullptr);
             auto& outputTensor = const_cast<Gna2Tensor&>(*gnaOperation->Operands[OutOpIdx]);
             outputTensor.Data = comp.ptr_outputs;
             outputTensor.Type = Gna2DataTypeFromBytes(comp.num_bytes_per_output);
@@ -80,7 +80,7 @@ static const char *intel_dnn_softmax_name[kSoftmaxNumType] = {
 };

 typedef enum {
-    kDnnUnknownOrientation,
+    kDnnUnknownOrientation = 100,
     kDnnInterleavedOrientation,
     kDnnNonInterleavedOrientation,
     kDnnNumOrientation
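The hunk above gives `kDnnUnknownOrientation` the explicit value 100 instead of the default 0, so a zero-initialized field no longer silently looks like a valid "unknown" enumerator. The same idea in Python with `enum.IntEnum` (values mirror the C enum's auto-increment from 100; the class name is illustrative):

```python
from enum import IntEnum

class Orientation(IntEnum):
    # An explicit sentinel keeps "unknown" clearly outside the dense range
    # starting at 0, so an accidental zero-initialization is detectable.
    UNKNOWN = 100
    INTERLEAVED = 101       # auto-increments from the sentinel in C
    NON_INTERLEAVED = 102

assert Orientation.UNKNOWN == 100
assert Orientation.INTERLEAVED == 101
assert 0 not in {int(o) for o in Orientation}
```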
@@ -199,9 +199,17 @@ class ScaleFactorPerLayer<InferenceEngine::CNNLayer *> {

         if (cnnLayer->type == "Const") {
             auto blob = cnnLayer->blobs["custom"];
-            if (blob->getTensorDesc().getPrecision() == InferenceEngine::Precision::FP16) {
+            auto blob_precision = blob->getTensorDesc().getPrecision();
+
+            if (blob_precision != InferenceEngine::Precision::FP32 && blob_precision != InferenceEngine::Precision::FP16) {
+                quant->_dst_quant.scale = 1.0f;
+                return true;
+            }
+
+            if (blob_precision == InferenceEngine::Precision::FP16) {
                 blob = make_fp32_blob(blob);
             }

             auto max_val = std::numeric_limits<float>::min();
             auto min_val = std::numeric_limits<float>::max();
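The context lines above seed a min/max scan with `std::numeric_limits<float>::min()` and `::max()`. Note that in C++, `numeric_limits<float>::min()` is the smallest *positive* normal value, so seeding `max_val` with it only works when at least one element is non-negative; `lowest()` or negative infinity is the safe seed. A Python sketch using infinities, which handles all-negative data correctly:

```python
import math

def blob_range(values):
    """Scan for min/max; seeding from +/- infinity makes negative-only data work."""
    max_val = -math.inf  # NOT the smallest positive float
    min_val = math.inf
    for v in values:
        max_val = max(max_val, v)
        min_val = min(min_val, v)
    return min_val, max_val

assert blob_range([-3.0, -1.0, -2.0]) == (-3.0, -1.0)
assert blob_range([0.5, 2.5]) == (0.5, 2.5)
```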
@@ -9,6 +9,7 @@
 #if GNA_LIB_VER == 2
 #include "gna2_model_debug_log.hpp"
 #include "gna2-model-api.h"
+#include <details/ie_exception.hpp>

 #include <cstdint>
 #include <fstream>

@@ -52,6 +53,7 @@ template <class T>
 bool NextElement(T & elementIndex, const Gna2Shape& total) {
     if (total.NumberOfDimensions == 0) return false;
     auto idx = total.NumberOfDimensions - 1;
+    IE_ASSERT(idx < GNA2_SHAPE_MAXIMUM_NUMBER_OF_DIMENSIONS);
     while (elementIndex[idx] + 1 >= total.Dimensions[idx] && idx > 0) {
         idx--;
     }
@@ -60,6 +60,7 @@ Gna2Tensor HelperGna2TensorInit3D(uint32_t x, uint32_t y, uint32_t z, Gna2DataTy

 Gna2Tensor * createGna2Tensor1D(uint32_t x, uint32_t byteSize, void* data) {
     const auto input = reinterpret_cast<Gna2Tensor*>(gnaUserAllocator(sizeof(Gna2Tensor)));
+    IE_ASSERT(input != nullptr);
     *input = HelperGna2TensorInit1D(x, Gna2DataTypeFromBytes(byteSize), data);
     return input;
 }

@@ -74,6 +75,7 @@ Gna2Tensor * createGna2TensorPwl(uint32_t x, void* data) {

 Gna2Tensor * createGna2BiasTensor1D(uint32_t x, uint32_t byteSize, void* data) {
     const auto input = reinterpret_cast<Gna2Tensor*>(gnaUserAllocator(sizeof(Gna2Tensor)));
+    IE_ASSERT(input != nullptr);
     if (byteSize == 8) {
         *input = HelperGna2TensorInit1D(x, Gna2DataTypeCompoundBias, data);
     } else {

@@ -84,24 +86,28 @@ Gna2Tensor * createGna2BiasTensor1D(uint32_t x, uint32_t byteSize, void* data) {

 Gna2Tensor * createGna2Tensor2D(uint32_t x, uint32_t y, uint32_t byteSize, void* data) {
     const auto input = reinterpret_cast<Gna2Tensor*>(gnaUserAllocator(sizeof(Gna2Tensor)));
+    IE_ASSERT(input != nullptr);
     *input = HelperGna2TensorInit2D(x, y, Gna2DataTypeFromBytes(byteSize), data);
     return input;
 }

 Gna2Tensor * createGna2Tensor3D(uint32_t x, uint32_t y, uint32_t z, uint32_t byteSize, void* data) {
     const auto input = reinterpret_cast<Gna2Tensor*>(gnaUserAllocator(sizeof(Gna2Tensor)));
+    IE_ASSERT(input != nullptr);
     *input = HelperGna2TensorInit3D(x, y, z, Gna2DataTypeFromBytes(byteSize), data);
     return input;
 }

 uint32_t* create_uint32_parameter(uint32_t value) {
     const auto param = reinterpret_cast<uint32_t*>(gnaUserAllocator(sizeof(uint32_t)));
+    IE_ASSERT(param != nullptr);
     *param = value;
     return param;
 }

 Gna2Shape* create_shape1D_parameter(uint32_t x) {
     const auto shp = reinterpret_cast<Gna2Shape*>(gnaUserAllocator(sizeof(Gna2Shape)));
+    IE_ASSERT(shp != nullptr);
     shp->NumberOfDimensions = 1;
     shp->Dimensions[0] = x;
     return shp;
@@ -25,7 +25,7 @@
 #include "gna_plugin_log.hpp"

 uint8_t* GNADeviceHelper::alloc(uint32_t size_requested, uint32_t *size_granted) {
-    void * memPtr;
+    void * memPtr = nullptr;
 #if GNA_LIB_VER == 1
     memPtr = GNAAlloc(nGNAHandle, size_requested, size_granted);
 #else
@@ -337,6 +337,7 @@ void GNAGraphCompiler::ConvolutionPrimitive(InferenceEngine::CNNLayerPtr layer)
 void GNAGraphCompiler::PowerPrimitive(InferenceEngine::CNNLayerPtr layer) {
     auto& power = dynamic_cast<PowerLayer&>(*layer.get());
     auto quantized = InferenceEngine::getInjectedData<QuantizedLayerParams>(layer);
     IE_ASSERT(gnaFlags->sw_fp32 ? (quantized == nullptr) : (quantized != nullptr));

     if (power.power != 1.0) {
         THROW_IE_EXCEPTION << "[GNA plugin] unsupported power factor, expected 1 but was " << power.power;

@@ -386,29 +387,14 @@ void GNAGraphCompiler::PowerPrimitive(InferenceEngine::CNNLayerPtr layer) {

     if (gnaFlags->sw_fp32) {
         gnamem->readonly().push_value(ptr_weights, power.scale, num_rows_out, 64);
-        gnamem->readonly().push_value(ptr_biases, power.scale, num_rows_out, 64);
+        gnamem->readonly().push_value(ptr_biases, power.offset, num_rows_out, 64);
     } else {
-        auto weightsScaledIdentity = power.scale;
-        auto biasesScaledIdentity = power.scale;
-        if (quantized != nullptr) {
-            weightsScaledIdentity = quantized->_weights_quant.scale * weightsScaledIdentity;
-            biasesScaledIdentity = quantized->_bias_quant.scale * biasesScaledIdentity;
-        }
-
-        auto weightQuantizedIdentity = FLOAT_TO_INT16(std::min(weightsScaledIdentity, static_cast<float>(INT16_MAX)));
-        auto biasesQuantizedIdentity = FLOAT_TO_INT16(std::min(biasesScaledIdentity, static_cast<float>(INT16_MAX)));
-        gnamem->readonly().push_value<int16_t>(ptr_weights, weightQuantizedIdentity, num_rows_out, 64);
-        gnamem->readonly().push_value<int32_t>(ptr_biases, biasesQuantizedIdentity, num_rows_out, 64);
-    }
-
-    if (power.offset != 0.0f) {
-        if (quantized == nullptr) {
-            gnamem->readonly().push_value(ptr_biases, 0.0f, num_rows_out, 64);
-        } else {
-            gnamem->readonly().push_value<int32_t>(ptr_biases, 0, num_rows_out, 64);
-        }
-    } else {
-        gnamem->readonly().push_value(ptr_biases, 0.0f, num_rows_out, 64);
+        auto quantizedScale = FLOAT_TO_INT16(std::min(quantized->_weights_quant.scale * power.scale,
+                static_cast<float>(INT16_MAX)));
+        auto quantizedOffset = FLOAT_TO_INT32(std::min(quantized->_dst_quant.scale * power.offset,
+                static_cast<float>(INT32_MAX)));
+        gnamem->readonly().push_value<int16_t>(ptr_weights, quantizedScale, num_rows_out, 64);
+        gnamem->readonly().push_value<int32_t>(ptr_biases, quantizedOffset, num_rows_out, 64);
     }
 }
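The rewritten `PowerPrimitive` above clamps the scaled weight and offset to `INT16_MAX` / `INT32_MAX` before converting, so an oversized scale factor saturates instead of overflowing the integer type. A Python sketch of that saturate-then-round step (only the upper bound is clamped here, mirroring the `std::min` in the hunk; the function name is illustrative):

```python
INT16_MAX = 32767

def float_to_int16_saturated(x):
    """Scale-to-int conversion with upper-bound saturation, like the FLOAT_TO_INT16 clamp above."""
    return int(round(min(x, float(INT16_MAX))))

assert float_to_int16_saturated(100.4) == 100
assert float_to_int16_saturated(1e9) == INT16_MAX  # saturates instead of overflowing
```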
@@ -1417,6 +1403,7 @@ void GNAGraphCompiler::PermutePrimitive(InferenceEngine::CNNLayerPtr layer) {
     }
     auto layerOrder = layer->GetParamAsInts("order");
+    auto quantized = InferenceEngine::getInjectedData<QuantizedLayerParams>(layer);
     IE_ASSERT(!layer->insData.empty());
     auto inputs = layer->insData.begin()->lock();
     auto inputsOrder = inputs->getTensorDesc().getDims();
     auto outputs = layer->outData.front();
@@ -176,6 +176,63 @@ inline std::pair<InferenceEngine::CNNLayerPtr, int> CNNNetCheckNextLayerSkipCer
     return CNNNetCheckNextLayerSkipCertain(outLayer->second, 0, 0, bOnlyCheck, shouldSkip);
 }

+/**
+ * @brief return all layers reachable from given one
+ * @param layer
+ * @param oDataIdx - -1 means iterate over all odata indexes
+ * @param shouldSkip
+ * @return
+ */
+template <class Layer>
+inline std::vector<CNNLayerPtr> CNNNetGetAllNextLayersSkipCertain(Layer layer, int oDataIdx, const std::function<bool(CNNLayerPtr)> &shouldSkip) {
+    // TODO: need to have generic function that creates slice of the graph : starting from given layer
+    // and skipped all non functional - ending up into functional one
+
+    std::list<CNNLayerPtr> currentSet;
+    std::vector<CNNLayerPtr> resultSet;
+
+    std::vector<std::map<std::string, CNNLayerPtr>> start;
+    if (oDataIdx == -1) {
+        for (int i = 0; i != layer->outData.size(); i++) {
+            start.push_back(layer->outData[i]->getInputTo());
+        }
+    } else {
+        start.push_back(layer->outData[oDataIdx]->getInputTo());
+    }
+
+    auto separate_layers = [&currentSet, &resultSet, &shouldSkip](std::map<std::string, CNNLayerPtr>& inputTo) {
+        for (auto &&bfsLayer : inputTo) {
+            if (shouldSkip(bfsLayer.second)) {
+                currentSet.push_back(bfsLayer.second);
+                continue;
+            }
+            resultSet.push_back(bfsLayer.second);
+        }
+    };
+
+    int startIdx, endIdx;
+    if (oDataIdx == -1) {
+        startIdx = 0;
+        endIdx = layer->outData.size();
+    } else {
+        startIdx = oDataIdx;
+        endIdx = oDataIdx + 1;
+    }
+
+    for (int i = startIdx; i != endIdx; i++) {
+        separate_layers(layer->outData[i]->getInputTo());
+    }
+
+    while (!currentSet.empty()) {
+        auto currentLayer = currentSet.front();
+        currentSet.pop_front();
+        for (auto && oData : currentLayer->outData) {
+            separate_layers(oData->getInputTo());
+        }
+    }
+    return resultSet;
+}
+
 /// @brief alias for strict checkNextLayer (false)
 template <class Layer>
 inline std::pair<InferenceEngine::CNNLayerPtr, int> CNNNetGetNextLayerSkipCertain(Layer layer, int oidx, int iidx,
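The new `CNNNetGetAllNextLayersSkipCertain` above walks a layer's successors breadth-first, and any node the `shouldSkip` predicate accepts is "looked through" to its own successors rather than returned. The same traversal in a compact Python sketch (the graph and layer names are made up for illustration):

```python
from collections import deque

def next_layers_skip(graph, start, should_skip):
    """Collect successors of `start`, looking through any node `should_skip` accepts."""
    pending = deque(graph.get(start, []))
    result = []
    while pending:
        node = pending.popleft()
        if should_skip(node):
            pending.extend(graph.get(node, []))  # skip it: enqueue its successors
        else:
            result.append(node)
    return result

# conv -> reshape (non-functional) -> relu ; conv -> pool
graph = {"conv": ["reshape", "pool"], "reshape": ["relu"]}
found = next_layers_skip(graph, "conv", lambda n: n == "reshape")
assert sorted(found) == ["pool", "relu"]  # reshape itself is looked through
```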
@@ -474,7 +531,31 @@ inline void CNNNetworkInsertLayer(CNNLayerPtr after,
 }

 /**
- * @brief remove givven layer from topology, currently only layers with one input data and one output data supported
+ * @brief returns previous layers and outData index for it
+ * @tparam T
+ * @param origin
+ * @param acceptanceCriteria
+ * @param idx
  */
+template <class T>
+std::vector<std::pair<CNNLayerPtr, int> > CNNNetGetPrevLayersSkip(CNNLayerPtr origin, const T &acceptanceCriteria, int idx = -1) {
+    std::vector<std::pair<CNNLayerPtr, int> > prevLayers;
+    for (int i = idx == -1 ? 0 : idx; CNNNetHasPrevLayer(origin.get(), i) && (idx == -1 || i == idx); i++) {
+        auto prevLayer = CNNNetPrevLayer(origin, i);
+        if (acceptanceCriteria(prevLayer)) {
+            prevLayers.push_back({prevLayer, CNNLayerFindOutDataIdx(origin, i)});
+        } else {
+            // if for some input we need to look in upper layers - original index not used here intentionally
+            auto prevPrevLayers = CNNNetGetPrevLayersSkip(prevLayer, acceptanceCriteria);
+            prevLayers.insert(prevLayers.end(), prevPrevLayers.begin(), prevPrevLayers.end());
+        }
+    }
+
+    return prevLayers;
+}
+
+/**
+ * @brief remove given layer from topology, currently only layers with one input data and one output data supported
+ */
 inline void CNNNetworkRemoveLayer(CNNLayerPtr layer) {
     if (!layer) {
@@ -8,6 +8,9 @@
 #include <ios>
 #include <iomanip>
 #include <map>
 #include <ie_algorithm.hpp>
 #include <ie_common.h>
 #include <ie_precision.hpp>

 #if defined __INTEL_COMPILER || defined _MSC_VER
 #include <malloc.h>
@@ -119,15 +122,26 @@ const std::map<Gna2OperationType, std::vector<uint32_t>> GnaParamSize{
         sizeof(Gna2Shape),
         sizeof(Gna2Shape)}},
     {Gna2OperationTypeCopy, {sizeof(Gna2Shape)}},
+    {Gna2OperationTypeTransposition, {sizeof(Gna2Shape)}},
 };

-void GNAModelSerial::Import(void *basePointer, size_t gnaGraphSize, std::istream & is) {
+void GNAModelSerial::Import(void *basePointer,
+        size_t gnaGraphSize,
+        std::istream & is,
+        std::shared_ptr<GNAPluginNS::InputDesc> inputsDesc,
+        std::vector<GNAPluginNS::OutputDesc> &desc,
+        InferenceEngine::InputsDataMap& inputsDataMap,
+        InferenceEngine::OutputsDataMap& outputsDataMap) {
     is.exceptions(std::istream::failbit);

+    ImportInputs(is, basePointer, inputsDesc, inputsDataMap);
+    ImportOutputs(is, basePointer, desc, outputsDataMap);
+
     for (auto operation = gna2Model->Operations; operation != gna2Model->Operations + gna2Model->NumberOfOperations; ++operation) {
         readNBits<32>(operation->Type, is);
         readBits(operation->NumberOfOperands, is);
         operation->Operands = static_cast<Gna2Tensor const **>(gnaUserAllocator(sizeof(Gna2Tensor*) * operation->NumberOfOperands));
+        IE_ASSERT(operation->Operands != nullptr);
         for (uint32_t i = 0; i < operation->NumberOfOperands; i++) {
             Gna2Tensor t{};
             readBits(t, is);

@@ -145,11 +159,10 @@ void GNAModelSerial::Import(void *basePointer, size_t gnaGraphSize, std::istream
             case Gna2OperationTypeFullyConnectedAffine:
             case Gna2OperationTypeConvolution:
             case Gna2OperationTypeCopy:
+            case Gna2OperationTypeTransposition:
                 break;
             case Gna2OperationTypeRecurrent:
                 THROW_GNA_EXCEPTION << "Importing of recurrent operation not supported";
-            case Gna2OperationTypeTransposition:
-                THROW_GNA_EXCEPTION << "Importing of transposition operation not supported";
             default:
                 THROW_GNA_EXCEPTION << "Importing of unknown GNA operation type(" << operation->Type << ") not supported";
             }

@@ -158,8 +171,9 @@ void GNAModelSerial::Import(void *basePointer, size_t gnaGraphSize, std::istream
         else
             operation->Parameters = nullptr;
         for (uint32_t i = 0; i < operation->NumberOfParameters; i++) {
-            uint32_t paramSize;
+            uint32_t paramSize = 0;
             readBits(paramSize, is);
             IE_ASSERT(operation->Parameters != nullptr);
             if (paramSize == 0) {
                 operation->Parameters[i] = nullptr;
|
||||
continue;
|
||||
@@ -235,11 +249,12 @@ void GNAModelSerial::Export(void * basePointer, size_t gnaGraphSize, std::ostrea
|
||||
};
|
||||
|
||||
auto convert_to_serial = [getOffsetFromBase](const GNAModelSerial::RuntimeEndPoint& ep) {
|
||||
ModelHeader::EndPoint out;
|
||||
RuntimeEndPoint out;
|
||||
out.elements_count = ep.elements_count;
|
||||
out.descriptor_offset = offsetFromBase(ep.descriptor_ptr);
|
||||
out.scaleFactor = ep.scaleFactor;
|
||||
out.element_size = ep.element_size;
|
||||
out.orientation = ep.orientation;
|
||||
return out;
|
||||
};
|
||||
/**
|
||||
@@ -256,15 +271,21 @@ void GNAModelSerial::Export(void * basePointer, size_t gnaGraphSize, std::ostrea
|
||||
header.gnaMemSize = gnaGraphSize;
|
||||
header.layersCount = layers.size();
|
||||
header.nGroup = guessGrouping(*gna2Model);
|
||||
header.input = convert_to_serial(input);
|
||||
header.output = convert_to_serial(output);
|
||||
|
||||
header.nInputs = inputs.size();
|
||||
header.nOutputs = outputs.size();
|
||||
header.nRotateRows = nRotateRows;
|
||||
header.nRotateColumns = nRotateColumns;
|
||||
|
||||
|
||||
writeBits(header, os);
|
||||
|
||||
for (const auto &input : inputs) {
|
||||
writeBits(convert_to_serial(input), os);
|
||||
}
|
||||
for (const auto &output : outputs) {
|
||||
writeBits(convert_to_serial(output), os);
|
||||
}
|
||||
|
||||
for (const auto & layer : layers) {
|
||||
writeBits(static_cast<uint32_t>(layer.Type), os);
|
||||
writeBits(layer.NumberOfOperands, os);
|
||||
@@ -284,11 +305,10 @@ void GNAModelSerial::Export(void * basePointer, size_t gnaGraphSize, std::ostrea
|
||||
case Gna2OperationTypeFullyConnectedAffine:
|
||||
case Gna2OperationTypeConvolution:
|
||||
case Gna2OperationTypeCopy:
|
||||
case Gna2OperationTypeTransposition:
|
||||
break;
|
||||
case Gna2OperationTypeRecurrent:
|
||||
THROW_GNA_EXCEPTION << "Exporting of recurrent operation not supported";
|
||||
case Gna2OperationTypeTransposition:
|
||||
THROW_GNA_EXCEPTION << "Exporting of interleave operation not supported";
|
||||
default:
|
||||
THROW_GNA_EXCEPTION << "Exporting of unknown GNA operation type(" << layer.Type << ") not supported";
|
||||
}
|
||||
@@ -314,9 +334,18 @@ void GNAModelSerial::Export(void * basePointer, size_t gnaGraphSize, std::ostrea
|
||||
}
|
||||
#else
|
||||
|
||||
void GNAModelSerial::Import(void *basePointer, size_t gnaGraphSize, std::istream & is) {
|
||||
void GNAModelSerial::Import(void *basePointer,
|
||||
size_t gnaGraphSize,
|
||||
std::istream & is,
|
||||
std::shared_ptr<GNAPluginNS::InputDesc> inputsDesc,
|
||||
std::vector<GNAPluginNS::OutputDesc> &desc,
|
||||
InferenceEngine::InputsDataMap& inputsDataMap,
|
||||
InferenceEngine::OutputsDataMap& outputsDataMap) {
|
||||
is.exceptions(std::istream::failbit);
|
||||
|
||||
ImportInputs(is, basePointer, inputsDesc, inputsDataMap);
|
||||
ImportOutputs(is, basePointer, desc, outputsDataMap);
|
||||
|
||||
auto readPwl = [&is, basePointer](intel_pwl_func_t & value) {
|
||||
readBits(value.nSegments, is);
|
||||
if (value.nSegments != 0) {
|
||||
@@ -466,11 +495,12 @@ void GNAModelSerial::Export(void * basePointer, size_t gnaGraphSize, std::ostrea
|
||||
};
|
||||
|
||||
auto convert_to_serial = [getOffsetFromBase](const GNAModelSerial::RuntimeEndPoint& ep){
|
||||
ModelHeader::EndPoint out;
|
||||
RuntimeEndPoint out;
|
||||
out.elements_count = ep.elements_count;
|
||||
out.element_size = ep.element_size;
|
||||
out.descriptor_offset = offsetFromBase(ep.descriptor_ptr);
|
||||
out.scaleFactor = ep.scaleFactor;
|
||||
out.orientation = ep.orientation;
|
||||
return out;
|
||||
};
|
||||
/**
|
||||
@@ -486,14 +516,16 @@ void GNAModelSerial::Export(void * basePointer, size_t gnaGraphSize, std::ostrea
|
||||
header.gnaMemSize = gnaGraphSize;
|
||||
header.layersCount = layers.size();
|
||||
header.nGroup = ptr_nnet->nGroup;
|
||||
header.input = convert_to_serial(input);
|
||||
header.output = convert_to_serial(output);
|
||||
header.nInputs = 1;
|
||||
header.nOutputs = 1;
|
||||
header.headerSize = sizeof(ModelHeader);
|
||||
header.nRotateRows = nRotateRows;
|
||||
header.nRotateColumns = nRotateColumns;
|
||||
|
||||
|
||||
writeBits(header, os);
|
||||
writeBits(convert_to_serial(inputs[0]), os);
|
||||
writeBits(convert_to_serial(outputs[0]), os);
|
||||
|
||||
for (auto & layer : layers) {
|
||||
writeBits(layer.nInputColumns, os);
|
||||
@@ -572,3 +604,108 @@ void GNAModelSerial::Export(void * basePointer, size_t gnaGraphSize, std::ostrea
|
||||
}
|
||||
|
||||
#endif
|
||||
|
||||
std::vector<GNAModelSerial::RuntimeEndPoint> GNAModelSerial::serializeOutputs(const InferenceEngine::OutputsDataMap& outputsDataMap,
        const std::vector<GNAPluginNS::OutputDesc>& outputsDesc) {
    std::vector<GNAModelSerial::RuntimeEndPoint> endPoints;
    std::size_t outputIndex = 0;
    for (auto const &output : outputsDataMap) {
        auto outputName = output.first;
        auto inputDims = output.second->getTensorDesc().getDims();
        uint32_t elementsCount = static_cast<uint32_t>(InferenceEngine::details::product(inputDims.begin(), inputDims.end()));

        GNAModelSerial::RuntimeEndPoint endPoint(outputsDesc[outputIndex].scale_factor,
                outputsDesc[outputIndex].ptrs[0],
                outputsDesc[outputIndex].num_bytes_per_element,
                elementsCount,
                outputsDesc[outputIndex].orientation);
        endPoints.push_back(endPoint);
        outputIndex++;
    }
    return endPoints;
}

std::vector<GNAModelSerial::RuntimeEndPoint> GNAModelSerial::serializeInputs(const InferenceEngine::InputsDataMap& inputsDataMap,
        std::shared_ptr<GNAPluginNS::InputDesc> inputDesc) {
    std::vector<GNAModelSerial::RuntimeEndPoint> endPoints;

    std::size_t inputIndex = 0;
    for (auto const& input : inputsDataMap) {
        auto inputName = input.first;
        auto inputDims = input.second->getTensorDesc().getDims();

        double scaleFactor = inputDesc->getScaleFactor(inputIndex);
        std::vector<void *> descriptor_ptr = inputDesc->getPtrInputsGlobal(inputName);
        IE_ASSERT(descriptor_ptr.size() > 0);
        uint32_t element_size = 2u;
        uint32_t elementsCount = static_cast<uint32_t>(InferenceEngine::details::product(inputDims.begin(), inputDims.end()));
        intel_dnn_orientation_t orientation = inputDesc->getOrientation(inputName);

        GNAModelSerial::RuntimeEndPoint endPoint(scaleFactor,
                descriptor_ptr[0],
                element_size,
                elementsCount,
                orientation);
        endPoints.push_back(endPoint);
        inputIndex++;
    }
    return endPoints;
}

void GNAModelSerial::ImportInputs(std::istream &is,
        void* basePtr,
        std::shared_ptr<GNAPluginNS::InputDesc> inputsDesc,
        InferenceEngine::InputsDataMap& dataMap) {
    dataMap.clear();

    for (auto inputIndex = 0; inputIndex < modelHeader.nInputs; inputIndex++) {
        std::string name = "input" + std::to_string(inputIndex);
        RuntimeEndPoint input;
        is.read(reinterpret_cast<char *>(&input), sizeof(input));
        inputsDesc->getPtrInputsGlobal(name).push_back(reinterpret_cast<float*>(reinterpret_cast<uint8_t *> (basePtr) + input.descriptor_offset));
        inputsDesc->orientation_in[name] = input.orientation;

        auto inputDims = InferenceEngine::SizeVector({modelHeader.nGroup, input.elements_count / modelHeader.nGroup});

        dataMap[name] = std::make_shared<InferenceEngine::InputInfo>();
        dataMap[name]->setInputData(std::make_shared<InferenceEngine::Data>(name,
                InferenceEngine::TensorDesc(
                        InferenceEngine::Precision::FP32,
                        inputDims,
                        InferenceEngine::Layout::NC)));
        inputsDesc->inputScaleFactors.push_back(input.scaleFactor);
    }
}

void GNAModelSerial::ImportOutputs(std::istream &is,
        void* basePtr,
        std::vector<GNAPluginNS::OutputDesc> &desc,
        InferenceEngine::OutputsDataMap& dataMap) {
    desc.clear();
    dataMap.clear();
    desc.resize(modelHeader.nOutputs);

    for (auto outputIndex = 0; outputIndex < modelHeader.nOutputs; outputIndex++) {
        std::string name = "output" + std::to_string(outputIndex);
        RuntimeEndPoint output;
        is.read(reinterpret_cast<char *>(&output), sizeof(output));
        GNAPluginNS::OutputDesc description;
        description.ptrs.push_back(reinterpret_cast<float*>(reinterpret_cast<uint8_t *> (basePtr) + output.descriptor_offset));
        description.orientation = kDnnInterleavedOrientation;
        description.orientation = output.orientation;
        description.num_bytes_per_element = output.element_size;
        description.scale_factor = output.scaleFactor;

        auto outputDims = InferenceEngine::SizeVector({modelHeader.nGroup, output.elements_count / modelHeader.nGroup});
        dataMap[name] = std::make_shared<InferenceEngine::Data>(name,
                InferenceEngine::TensorDesc(
                        InferenceEngine::Precision::FP32,
                        outputDims,
                        InferenceEngine::Layout::NC));
        desc.at(outputIndex) = description;
    }
}

void GNAModelSerial::setHeader(ModelHeader header) {
    modelHeader = header;
}
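Both the import and export paths above rely on the same idea: pointers into the GNA memory region are never serialized directly; they are stored as byte offsets from a base pointer (`descriptor_offset`) and rebased against the importer's own allocation. A minimal sketch of that offset round trip, with the helper names `toOffset`/`fromOffset` being illustrative rather than from the codebase:

```cpp
#include <cstdint>

// Serialize a pointer inside a memory region as a byte offset from its base.
uint64_t toOffset(const void* base, const void* ptr) {
    return static_cast<uint64_t>(reinterpret_cast<const uint8_t*>(ptr) -
                                 reinterpret_cast<const uint8_t*>(base));
}

// Rebase a stored offset against a (possibly different) base allocation.
void* fromOffset(void* base, uint64_t offset) {
    return reinterpret_cast<uint8_t*>(base) + offset;
}
```

This is why the importer can place the graph at any address: only `base + offset` matters, not the original pointer value.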
@@ -7,7 +7,10 @@
#include <istream>
#include <vector>
#include <utility>
#include "gna-api.h"

#include <gna-api.h>
#include "descriptions/gna_input_desc.hpp"
#include "descriptions/gna_output_desc.hpp"
#include "gna_plugin_log.hpp"
#if GNA_LIB_VER == 2
#include "gna2-model-api.h"

@@ -20,18 +23,19 @@
 * 1.0 - basic support
 * 1.1 - added memory information
 * 2.0 - for use with GNA2 library
 * 2.1 - multiple i/o support
 */
#if GNA_LIB_VER == 2
#define HEADER_MAJOR 2
#define HEADER_MINOR 0
#define HEADER_MINOR 1
#else
#define HEADER_MAJOR 1
#define HEADER_MINOR 1
#define HEADER_MINOR 2
#endif
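The hunk above bumps only the minor version (2.0 to 2.1 for GNA2, 1.1 to 1.2 otherwise) when multiple-i/o support is added. A common convention for such major/minor schemes, sketched below under the assumption that minor bumps only append fields (the source does not spell out its exact compatibility policy, so `canImport` is an illustrative rule, not the plugin's actual check):

```cpp
#include <cstdint>

// Hypothetical file-format version pair (fields suffixed with '_' to avoid
// the legacy major()/minor() macros from <sys/sysmacros.h>).
struct Version {
    uint16_t major_;
    uint16_t minor_;
};

// A reader accepts a file when the major version matches exactly and the
// file's minor version is not newer than the reader's own.
bool canImport(Version file, Version reader) {
    return file.major_ == reader.major_ && file.minor_ <= reader.minor_;
}
```

Under this rule, a 2.1-aware reader still imports 2.0 files, while 1.x files and future 2.2 files are rejected.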
/**
 * @brief Header version 1.0
 * @brief Header version 2.1
 */
struct ModelHeader {
    /**
@@ -74,27 +78,8 @@ struct ModelHeader {
    uint32_t nRotateRows = 0u;
    uint32_t nRotateColumns = 0u;

    struct EndPoint {
        /**
         * if the scale factor differs from the one passed to inference, the network might need to be requantized
         */
        float scaleFactor = 0.f;
        /**
         * Offset in bytes of pointer descriptor
         */
        uint64_t descriptor_offset = 0ull;
        /**
         * Endpoint resolution in bytes.
         */
        uint32_t element_size = 0u;
        /**
         * Number of elements
         */
        uint32_t elements_count = 0u;
    };
    EndPoint input;
    EndPoint output;
    uint32_t nInputs = 0u;
    uint32_t nOutputs = 0u;

    /**
     * Reserved Data might be here
@@ -127,15 +112,23 @@ class GNAModelSerial {
         * Number of elements
         */
        uint32_t elements_count = 0;
        /**
         * Offset in bytes of pointer descriptor
         */
        uint64_t descriptor_offset = 0ull;

        intel_dnn_orientation_t orientation = kDnnUnknownOrientation;

        RuntimeEndPoint() = default;
        RuntimeEndPoint(double scaleFactor,
                void* descriptor_ptr,
                uint32_t element_size,
                uint32_t elements_count) : scaleFactor(scaleFactor),
                uint32_t elements_count,
                intel_dnn_orientation_t orientation) : scaleFactor(scaleFactor),
                descriptor_ptr(descriptor_ptr),
                element_size(element_size),
                elements_count(elements_count) {
                elements_count(elements_count),
                orientation(orientation) {
        }
    };
    using MemoryType = std::vector<std::pair<void*, uint32_t>>;

@@ -146,11 +139,23 @@ private:
#else
    intel_nnet_type_t *ptr_nnet;
#endif
    RuntimeEndPoint input, output;
    std::vector<RuntimeEndPoint> inputs;
    std::vector<RuntimeEndPoint> outputs;
    uint32_t nRotateRows = 0;
    uint32_t nRotateColumns = 0;

    MemoryType states, *pstates = nullptr;
    ModelHeader modelHeader;

    void ImportInputs(std::istream &is,
            void* basePtr,
            std::shared_ptr<GNAPluginNS::InputDesc> inputsDesc,
            InferenceEngine::InputsDataMap& dataMap);

    void ImportOutputs(std::istream &is,
            void* basePtr,
            std::vector<GNAPluginNS::OutputDesc> &desc,
            InferenceEngine::OutputsDataMap& dataMap);

public:
#if GNA_LIB_VER == 2
@@ -160,8 +165,12 @@ private:

    GNAModelSerial(
            Gna2Model * model,
            RuntimeEndPoint input,
            RuntimeEndPoint output) : gna2Model(model), input(input), output(output) {
            const std::shared_ptr<GNAPluginNS::InputDesc> inputDesc,
            const std::vector<GNAPluginNS::OutputDesc>& outputsDesc,
            const InferenceEngine::InputsDataMap& inputsDataMap,
            const InferenceEngine::OutputsDataMap& outputsDataMap) : gna2Model(model),
            inputs(serializeInputs(inputsDataMap, inputDesc)),
            outputs(serializeOutputs(outputsDataMap, outputsDesc)) {
    }

#else
@@ -183,8 +192,12 @@ private:
     */
    GNAModelSerial(
            intel_nnet_type_t *ptr_nnet,
            RuntimeEndPoint input,
            RuntimeEndPoint output) : ptr_nnet(ptr_nnet), input(input), output(output) {
            const std::shared_ptr<GNAPluginNS::InputDesc> inputDesc,
            const std::vector<GNAPluginNS::OutputDesc>& outputsDesc,
            const InferenceEngine::InputsDataMap& inputsDataMap,
            const InferenceEngine::OutputsDataMap& outputsDataMap) : ptr_nnet(ptr_nnet),
            inputs(serializeInputs(inputsDataMap, inputDesc)),
            outputs(serializeOutputs(outputsDataMap, outputsDesc)) {
    }
#endif

@@ -219,7 +232,13 @@ private:
     * @param basePointer
     * @param is - stream without header structure - TBD: a header might be needed
     */
    void Import(void *basePointer, size_t gnaGraphSize, std::istream &is);
    void Import(void *basePointer,
            size_t gnaGraphSize,
            std::istream & is,
            std::shared_ptr<GNAPluginNS::InputDesc> inputsDesc,
            std::vector<GNAPluginNS::OutputDesc> &desc,
            InferenceEngine::InputsDataMap& inputsDataMap,
            InferenceEngine::OutputsDataMap& outputsDataMap);

    /**
     * save gna graph to an output stream
@@ -231,4 +250,13 @@ private:
    void Export(void *basePtr,
            size_t gnaGraphSize,
            std::ostream &os) const;

    static std::vector<GNAModelSerial::RuntimeEndPoint> serializeOutputs(const InferenceEngine::OutputsDataMap& outputsDataMap,
            const std::vector<GNAPluginNS::OutputDesc>& outputsDesc);

    static std::vector<GNAModelSerial::RuntimeEndPoint> serializeInputs(const InferenceEngine::InputsDataMap& inputsDataMap,
            const std::shared_ptr<GNAPluginNS::InputDesc>);

    void setHeader(ModelHeader header);
};
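The `writeBits`/`readBits` helpers used throughout the serializer copy a trivially-copyable value's raw bytes to and from a stream. A minimal sketch of that pattern (named `writeRaw`/`readRaw` here to make clear these are hypothetical stand-ins, not the plugin's actual helpers, and assuming both sides share endianness and struct layout):

```cpp
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>

// Write the raw bytes of a trivially-copyable value to a stream.
template <class T>
void writeRaw(const T& obj, std::ostream& os) {
    os.write(reinterpret_cast<const char*>(&obj), sizeof(T));
}

// Read the raw bytes of a trivially-copyable value back from a stream.
template <class T>
void readRaw(T& obj, std::istream& is) {
    is.read(reinterpret_cast<char*>(&obj), sizeof(T));
}
```

Round-tripping an endpoint-like struct through an in-memory stream recovers every field, which is exactly why the header layout (field order, sizes, padding) is versioned with `HEADER_MAJOR`/`HEADER_MINOR`: any layout change silently breaks old files.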
@@ -373,6 +373,7 @@ void GNAPlugin::LoadNetwork(ICNNNetwork &network) {
    passes->registerPass<InsertDiagonalLayerPass>();
    passes->registerPass<HandleMultipleActivationsForTheLayerPass>();
    passes->registerPass<SubstituteScaleShiftBroadCastPass>();
    passes->registerPass<FuseMultipleIdentitiesPass>();
    passIdx = passes->run(passIdx);
};

@@ -1140,13 +1141,15 @@ InferenceEngine::IExecutableNetwork::Ptr GNAPlugin::ImportNetwork(const std::str
#else
    auto serial = GNAModelSerial(&std::get<0>(nnets.back())->obj, mt);
#endif
    serial.Import(basePtr, header.gnaMemSize, inputStream);

    inputsDesc->getPtrInputsGlobal("input").push_back(reinterpret_cast<float*>(reinterpret_cast<uint8_t *> (basePtr) + header.input.descriptor_offset));
    // TODO: import of multioutput network not supported
    outputsDesc.resize(1);
    auto &outputDesc = outputsDesc.front();
    outputDesc.ptrs.push_back(reinterpret_cast<float*>(reinterpret_cast<uint8_t *> (basePtr) + header.output.descriptor_offset));
    serial.setHeader(header);
    serial.Import(basePtr,
            header.gnaMemSize,
            inputStream,
            inputsDesc,
            outputsDesc,
            inputsDataMap,
            outputsDataMap);

#if GNA_LIB_VER == 2
    auto getOrientation = [](Gna2Operation & gnaOperation) {
@@ -1160,32 +1163,10 @@ InferenceEngine::IExecutableNetwork::Ptr GNAPlugin::ImportNetwork(const std::str
    };
#endif

#if GNA_LIB_VER == 2
    inputsDesc->orientation_in["input"] = getOrientation(std::get<0>(gnaModels.back())->obj.Operations[0]);
    outputDesc.orientation = getOrientation(std::get<0>(gnaModels.back())->obj.Operations[std::get<0>(gnaModels.back())->obj.NumberOfOperations - 1]);
#else
#if GNA_LIB_VER == 1
    inputsDesc->orientation_in["input"] = getOrientation(std::get<0>(nnets.back())->obj.pLayers[0]);
    outputDesc.orientation = getOrientation(std::get<0>(nnets.back())->obj.pLayers[std::get<0>(nnets.back())->obj.nLayers - 1]);
    outputsDesc[0].orientation = getOrientation(std::get<0>(nnets.back())->obj.pLayers[std::get<0>(nnets.back())->obj.nLayers - 1]);
#endif
    outputDesc.num_bytes_per_element = header.output.element_size;

    auto outputDims = SizeVector({header.nGroup, header.output.elements_count / header.nGroup});
    auto inputDims = SizeVector({header.nGroup, header.input.elements_count / header.nGroup});

    inputsDataMap["input"] = std::make_shared<InputInfo>();
    inputsDataMap["input"]->setInputData(make_shared<Data>("input",
            TensorDesc(
                    Precision::FP32,
                    inputDims,
                    Layout::NC)));
    outputsDataMap["output"] = make_shared<Data>("output",
            TensorDesc(
                    Precision::FP32,
                    outputDims,
                    Layout::NC));

    outputDesc.scale_factor = header.output.scaleFactor;
    inputsDesc->inputScaleFactors.push_back(header.input.scaleFactor);

    num_rotate_rows = header.nRotateRows;
    num_rotate_columns = header.nRotateColumns;

@@ -1214,9 +1195,11 @@ void GNAPlugin::Export(const std::string &fileName) {
        THROW_GNA_EXCEPTION << " network not loaded";
    }

#if GNA_LIB_VER == 1
    if (inputsDesc->ptr_inputs_global_id.size() != 1) {
        THROW_GNA_EXCEPTION << " exporting network with multiple inputs not supported";
    }
#endif

    std::fstream outStream(fileName, ios_base::out | ios_base::binary);

@@ -1229,19 +1212,16 @@ void GNAPlugin::Export(const std::string &fileName) {
#endif
}
#if GNA_LIB_VER == 2
    auto serial = GNAModelSerial(&std::get<0>(gnaModels.front())->obj,
    Gna2Model* modelToSerial = &std::get<0>(gnaModels.front())->obj;
#else
    auto serial = GNAModelSerial(&std::get<0>(nnets.front())->obj,
    intel_nnet_type_t* modelToSerial = &std::get<0>(nnets.front())->obj;
#endif
            {inputsDesc->inputScaleFactors.front(),
             inputsDesc->ptr_inputs_global_storage.front()[0],
             2,
             static_cast<uint32_t>(InferenceEngine::details::product(inputsDataMap.begin()->second->getTensorDesc().getDims()))},
            {outputsDesc.front().scale_factor,
             outputsDesc.front().ptrs.front(),
             outputsDesc.front().num_bytes_per_element,
             static_cast<uint32_t>(InferenceEngine::details::product(outputsDataMap.begin()->second->getTensorDesc().getDims()))})
            .SetInputRotation(dnn->num_rotate_rows, dnn->num_rotate_columns);
    auto serial = GNAModelSerial(modelToSerial,
            inputsDesc,
            outputsDesc,
            inputsDataMap,
            outputsDataMap)
            .SetInputRotation(dnn->num_rotate_rows, dnn->num_rotate_columns);

    for (auto && memoryConnection : graphCompiler.memory_connection) {
        serial.AddState(memoryConnection.second.gna_ptr, memoryConnection.second.reserved_size);

@@ -71,7 +71,7 @@ void Config::UpdateFromMap(const std::map<std::string, std::string>& config) {
    key.erase(0, 1);
    try {
        input_index = std::stoi(key);
        if (input_index < 0 | input_index > 99) {
        if (input_index > 99) {
            throw std::out_of_range("");
        }
    } catch (std::invalid_argument&) {
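The Config hunk above is worth a closer look: the removed line used bitwise `|` where logical `||` was meant, and the replacement keeps only the upper-bound check. A self-contained sketch of the conventional form with logical `||`, where `parseInputIndex` and the error message are hypothetical, not the plugin's actual code:

```cpp
#include <stdexcept>
#include <string>

// Parse a per-input index from a config key suffix and range-check it.
int parseInputIndex(const std::string& key) {
    int input_index = std::stoi(key);  // throws std::invalid_argument for non-numeric keys
    if (input_index < 0 || input_index > 99) {  // logical ||, short-circuiting
        throw std::out_of_range("input index must be in [0, 99]");
    }
    return input_index;
}
```

With integer operands the bitwise `|` happens to give the same truth value here, but it does not short-circuit and reads as a bug; `||` states the intent.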
@@ -107,6 +107,9 @@ class LayerInfo {
    bool isConcatAlignFilter() const noexcept {
        return isOfType("ConcatAlignFilter");
    }
    bool isLink() const noexcept {
        return isOfType("Link");
    }
    bool isAffineFilter() const noexcept {
        return isOfType("AffineFilter");
    }
@@ -204,6 +207,7 @@ class LayerInfo {
    if (layerOrder == std::vector<int>({ 0, 3, 2, 1 })) {
        return true; // supported case
    }
    IE_ASSERT(!layer->insData.empty());
    auto inputs = layer->insData.begin()->lock();
    auto inputsOrder = inputs->getTensorDesc().getDims();

@@ -40,7 +40,6 @@ public:

    // length of current cycle
    std::list<cnt_type> permuteCycles;
    int seqId = 0;
    bool newSeq = false;

    for (int i = 0; i != orderVec.size();) {

@@ -609,31 +609,6 @@ void InsertIdentityLayerPass::run() {
    }
}

/**
 * @brief returns previous layers and the insData index for each of them
 * @tparam T
 * @param origin
 * @param acceptanceCriteria
 * @param idx
 */
// gives previous layers while skipping certain layers according to the expression
template <class T>
std::vector<std::pair<CNNLayerPtr, int> > CNNNetGetPrevLayersSkip(CNNLayerPtr origin, const T &acceptanceCriteria, int idx = -1) {
    std::vector<std::pair<CNNLayerPtr, int> > prevLayers;
    for (int i = idx == -1 ? 0 : idx; CNNNetHasPrevLayer(origin.get(), i) && (idx == -1 || i == idx); i++) {
        auto prevLayer = CNNNetPrevLayer(origin, i);
        if (acceptanceCriteria(prevLayer)) {
            prevLayers.push_back({prevLayer, CNNLayerFindOutDataIdx(origin, i)});
        } else {
            // if we need to look further up for some input, the original index is intentionally not used here
            auto prevPrevLayers = CNNNetGetPrevLayersSkip(prevLayer, acceptanceCriteria);
            prevLayers.insert(prevLayers.end(), prevPrevLayers.begin(), prevPrevLayers.end());
        }
    }

    return prevLayers;
}

void InsertCopyLayerPass::run() {
    for (auto & l : *pLayers) {
        if (l->insData.empty()) continue;

@@ -1084,6 +1059,78 @@ void RemoveConstPass::run() {
    transformer.fullTrim();
}

void FuseMultipleIdentitiesPass::run() {
    for (auto &l : *pLayers) {
        if (l->insData.empty()) continue;

        auto isNonFunctional = [](CNNLayerPtr ptr) {
            return LayerInfo(ptr).isNonFunctional();
        };
        auto eltwise = dynamic_cast<InferenceEngine::EltwiseLayer *>(l.get());
        auto concat = dynamic_cast<InferenceEngine::ConcatLayer *>(l.get());

        if (LayerInfo(l).isNonFunctional() || LayerInfo(l).has32BInput())
            continue;
        gnalog() << "CNNNetPrevLayer skip non functional from :: " << l->name;
        auto prevLayersReached = CNNNetGetPrevLayersSkip(l, [](CNNLayerPtr ptr) {
            return !LayerInfo(ptr).isNonFunctional();
        });
        prevLayersReached.erase(std::remove_if(prevLayersReached.begin(),
                prevLayersReached.end(),
                [] (const std::pair<CNNLayerPtr, int> & candidate) {
                    return LayerInfo(candidate.first).isLink();
                }), prevLayersReached.end());

        if (prevLayersReached.size() != 1 && eltwise == nullptr && concat == nullptr) {
            std::stringstream layers;
            for (auto && prevLayer : prevLayersReached) {
                layers << prevLayer.first->name;
                layers << ", ";
            }
            THROW_GNA_LAYER_EXCEPTION(l) << "unsupported case: connected to "
                    << (prevLayersReached.empty() ? "zero" : "multiple") << " outputs : " << layers.str();
        }
        auto prevLayer = prevLayersReached.front().first;
        auto outDataIdx = prevLayersReached.front().second;
        gnalog() << ", reached " << prevLayer->name << " at " << outDataIdx << std::endl;

        if (!LayerInfo(prevLayer).has32BOutput())
            continue;

        std::vector<CNNLayerPtr> resultSet = CNNNetGetAllNextLayersSkipCertain(prevLayer, outDataIdx, isNonFunctional);

        // now the result set should have all needed layers
        // check whether the result set already contains an identity layer
        CNNLayerPtr alreadyIdentity;
        for (auto &&res : resultSet) {
            if (LayerInfo(res).isIdentity()) {
                alreadyIdentity = res;
                break;
            }
        }
        if (!alreadyIdentity) {
            continue;
        } else {
            // just figure out how to connect to that "already identity"
            // 1st stage - disconnect the given layer from its previous layer
            auto directPrev = l->insData.front().lock()->getCreatorLayer().lock();
            auto oDataIdx = CNNLayerFindOutDataIdx(directPrev, 0);
            auto &inputTo = directPrev->outData[oDataIdx]->getInputTo();
            for (auto inIterator = inputTo.begin(); inIterator != inputTo.end(); inIterator++) {
                if (inIterator->second == l) {
                    inputTo.erase(inIterator);
                    break;
                }
            }
            l->insData.clear();

            // 2nd stage - set up the new connection
            l->insData.push_back(alreadyIdentity->outData.front());
            alreadyIdentity->outData.front()->getInputTo()[l->name] = l;
        }
    }
}
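The pass above filters "Link" layers out of `prevLayersReached` with the erase-remove idiom: `std::remove_if` compacts the elements to keep at the front of the vector and returns an iterator to the leftover tail, which a single `erase` then trims. A minimal sketch on plain strings (`dropLinks` is an illustrative helper, not from the codebase):

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Remove every "Link" entry from a list of layer names using erase-remove.
std::vector<std::string> dropLinks(std::vector<std::string> layers) {
    layers.erase(std::remove_if(layers.begin(), layers.end(),
                                [](const std::string& n) { return n == "Link"; }),
                 layers.end());
    return layers;
}
```

Calling `erase` once on the `remove_if` result is linear in the vector size; erasing matches one by one inside a loop would be quadratic and invalidate iterators along the way.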
int PassManager::run(int index) {
// #define PLOT
#ifdef PLOT

@@ -149,6 +149,11 @@ DECL_PASS_BEFORE_COPY(UnrollTI);
 */
DECL_PASS_BEFORE_COPY(RemoveConst);

/**
 * @brief removes extra identity layers for multi-output cases
 */
DECL_PASS(FuseMultipleIdentities);

struct PassManagerSettings {
    Policy policy;
    /// @brief whether to run passes before copy
@@ -139,7 +139,7 @@ private:

    friend INFERENCE_ENGINE_API_CPP(std::shared_ptr<CNNNetworkImpl>)
    convertFunctionToICNNNetwork(const std::shared_ptr<const ::ngraph::Function>& graph,
                                 const ICNNNetwork& nGraphImpl);
                                 const ICNNNetwork& nGraphImpl, bool keep_constant_inputs);

    /**
     * @brief Reshape on the same shape

@@ -63,9 +63,9 @@ ngraph::op::GenericIE::GenericIE(const ngraph::NodeVector& inputs,
    : GenericIE(as_output_vector(inputs), params, type, outputs) {}

ngraph::op::GenericIE::GenericIE(const ngraph::OutputVector& inputs,
        const std::map<std::string, InferenceEngine::Parameter>& params,
        const std::string type, const std::vector<PortIE>& outputs)
    : Op(inputs), params(params), outputs(outputs), type(type), initialized(0) {
        const std::map<std::string, InferenceEngine::Parameter>& params_,
        const std::string type_, const std::vector<PortIE>& outputs_)
    : Op(inputs), params(params_), outputs(outputs_), type(type_), initialized(0) {
    constructor_validate_and_infer_types();
}

@@ -179,7 +179,9 @@ CNNNetwork details::ReadNetwork(const std::string& modelPath, const std::string&
        THROW_IE_EXCEPTION << "Weights file " << bPath << " cannot be opened!";

    // read model with weights
    return reader->read(modelStream, binStream, exts);
    auto network = reader->read(modelStream, binStream, exts);
    modelStream.close();
    return network;
}
// read model without weights
return reader->read(modelStream, exts);
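The `ReadNetwork` change above stores the result, closes the stream explicitly, and only then returns, instead of returning with the stream still open. With `std::ifstream` the destructor would close the file anyway, but closing explicitly releases the OS handle at a well-defined point rather than whenever the stream goes out of scope. A small sketch of the same pattern (`readAll` is an illustrative helper, not from the codebase):

```cpp
#include <fstream>
#include <sstream>
#include <string>

// Read a whole file into a string, releasing the handle before returning.
std::string readAll(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    std::ostringstream buf;
    buf << in.rdbuf();
    in.close();  // explicit close; the destructor would also do this
    return buf.str();
}
```

The explicit close matters most on platforms where an open handle blocks subsequent operations (renaming or deleting the file) that may happen before the stream's destructor runs.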
||||
@@ -15,7 +15,8 @@ namespace InferenceEngine {
|
||||
namespace details {
|
||||
|
||||
INFERENCE_ENGINE_API_CPP(std::shared_ptr<CNNNetworkImpl>)
|
||||
convertFunctionToICNNNetwork(const std::shared_ptr<const ::ngraph::Function>& graph, const ICNNNetwork &network);
|
||||
convertFunctionToICNNNetwork(const std::shared_ptr<const ::ngraph::Function>& graph,
|
||||
const ICNNNetwork &network, bool keep_constant_inputs = false);
|
||||
|
||||
} // namespace details
|
||||
} // namespace InferenceEngine
|
||||
|
||||
@@ -24,6 +24,8 @@
|
||||
#include "ngraph_ops/pad_ie.hpp"
|
||||
#include "ngraph_ops/onehot_ie.hpp"
|
||||
#include "ngraph_ops/power.hpp"
|
||||
#include "ngraph_ops/prior_box_clustered_ie.hpp"
|
||||
#include "ngraph_ops/prior_box_ie.hpp"
|
||||
#include "ngraph_ops/proposal_ie.hpp"
|
||||
#include "ngraph_ops/relu_ie.hpp"
|
||||
#include "ngraph_ops/scaleshift.hpp"
|
||||
@@ -472,20 +474,6 @@ InferenceEngine::details::CNNLayerCreator::CNNLayerCreator(const std::shared_ptr
        return res;
    });

    addSpecificCreator({"PriorBox"}, [](const std::shared_ptr<::ngraph::Node>& node,
                                        const std::map<std::string, std::string> params) -> CNNLayerPtr {
        THROW_IE_EXCEPTION << "PriorBox operation has a form that is not supported." << node->get_friendly_name()
                           << " should be replaced by constant during constant folding.";
        return nullptr;
    });

    addSpecificCreator({"PriorBoxClustered"}, [](const std::shared_ptr<::ngraph::Node>& node,
                                                 const std::map<std::string, std::string> params) -> CNNLayerPtr {
        THROW_IE_EXCEPTION << "PriorBoxClustered operation has a form that is not supported." << node->get_friendly_name()
                           << " should be replaced by constant during constant folding.";
        return nullptr;
    });
}

CNNLayerPtr InferenceEngine::details::CNNLayerCreator::create() {
@@ -499,7 +487,9 @@ CNNLayerPtr InferenceEngine::details::CNNLayerCreator::create() {
    return res;
}

std::shared_ptr<CNNNetworkImpl> convertFunctionToICNNNetwork(const std::shared_ptr<const ::ngraph::Function>& graph, const ICNNNetwork &network) {
std::shared_ptr<CNNNetworkImpl> convertFunctionToICNNNetwork(const std::shared_ptr<const ::ngraph::Function> &graph,
                                                             const ICNNNetwork &network,
                                                             bool keep_constant_inputs) {
    IE_PROFILING_AUTO_SCOPE(convertFunctionToICNNNetwork)
    const auto createCNNLayer = [](const std::shared_ptr<::ngraph::Node> &node) -> CNNLayerPtr {
        class NGraphCNNLayer: public CNNLayer {
@@ -565,6 +555,10 @@ std::shared_ptr<CNNNetworkImpl> convertFunctionToICNNNetwork(const std::shared_p
        std::make_shared<Builder::NodeConverter<::ngraph::op::PadIE>>(),
        std::make_shared<Builder::NodeConverter<::ngraph::op::v1::Power>>(),
        std::make_shared<Builder::NodeConverter<::ngraph::op::PowerIE>>(),
        std::make_shared<Builder::NodeConverter<::ngraph::op::PriorBox>>(),
        std::make_shared<Builder::NodeConverter<::ngraph::op::PriorBoxClustered>>(),
        std::make_shared<Builder::NodeConverter<::ngraph::op::PriorBoxClusteredIE>>(),
        std::make_shared<Builder::NodeConverter<::ngraph::op::PriorBoxIE>>(),
        std::make_shared<Builder::NodeConverter<::ngraph::op::Proposal>>(),
        std::make_shared<Builder::NodeConverter<::ngraph::op::ProposalIE>>(),
        std::make_shared<Builder::NodeConverter<::ngraph::op::Relu>>(),
@@ -715,7 +709,7 @@ std::shared_ptr<CNNNetworkImpl> convertFunctionToICNNNetwork(const std::shared_p
    for (const auto &layer : nodes)
        op_names.insert(layer->get_name());

    bool keep_constants = ::ngraph::op::util::has_op_with_type<::ngraph::op::FakeQuantize>(graph);
    bool keep_constants = keep_constant_inputs || ::ngraph::op::util::has_op_with_type<::ngraph::op::FakeQuantize>(graph);

    // Create layers and output data
    for (const auto &layer : nodes) {
@@ -766,6 +760,20 @@ std::shared_ptr<CNNNetworkImpl> convertFunctionToICNNNetwork(const std::shared_p
        cnnLayer->insData.resize(inputCount);

        for (size_t i = 0; i < layer->get_output_size(); i++) {
            // Memory node with index = 1 has no inputs according to the specification.
            // For proper conversion, we must cut off all the layers and data nodes above ReadValue,
            // if they are connected only with this layer.
            // Now MO generates only constants or constant sub-graphs as input to ReadValue op.
            if (std::dynamic_pointer_cast<::ngraph::op::Constant>(layer)) {
                bool all_to_read_value = !layer->output(i).get_target_inputs().empty();
                for (const auto &output_input : layer->output(i).get_target_inputs()) {
                    all_to_read_value
                        &= dynamic_cast<ngraph::op::ReadValue *>(output_input.get_node()) != nullptr;
                }
                if (all_to_read_value)
                    continue;
            }

            if (cnnLayer->type == "Memory" && cnnLayer->params["index"] == "0") {
                cnnLayer->outData.clear();
                continue;
@@ -773,7 +781,6 @@ std::shared_ptr<CNNNetworkImpl> convertFunctionToICNNNetwork(const std::shared_p
            std::string outName = layer->get_friendly_name();
            if (layer->get_output_size() != 1) outName += "." + std::to_string(i);
            DataPtr &ptr = cnnNetworkImpl->getData(outName.c_str());

            SizeVector dims;
            dims = layer->get_output_shape(i);
            for (const auto &dim : dims) {
@@ -889,6 +896,7 @@ std::shared_ptr<CNNNetworkImpl> convertFunctionToICNNNetwork(const std::shared_p
    for (const auto &ext : ::ngraph::op::GenericIE::getExtensions(graph)) {
        cnnNetworkImpl->AddExtension(ext, nullptr);
    }

    return cnnNetworkImpl;
}
}  // namespace details

@@ -232,7 +232,8 @@ std::vector<CNNLayerPtr> ConstTransformer::foldConstSubgraphsInternal(const std:
    static std::vector<std::string> skipConstInfer = {
        "FakeQuantize",
        "Quantize",
        "CumSum"  // Const inference function for CumSum is not implemented!
        "CumSum",  // Const inference function for CumSum is not implemented
        "Convolution"  // Const inference function for Convolution is not implemented
    };

const std::map<std::string, bool> ConstTransformer::getConstLayers(const std::vector<CNNLayerPtr>& sortedLayers) {

@@ -34,6 +34,8 @@
#include "ngraph_ops/onehot_ie.hpp"
#include "ngraph_ops/pad_ie.hpp"
#include "ngraph_ops/power.hpp"
#include "ngraph_ops/prior_box_clustered_ie.hpp"
#include "ngraph_ops/prior_box_ie.hpp"
#include "ngraph_ops/proposal_ie.hpp"
#include "ngraph_ops/relu_ie.hpp"
#include "ngraph_ops/selu_ie.hpp"
@@ -1473,6 +1475,136 @@ CNNLayer::Ptr NodeConverter<ngraph::op::ProposalIE>::createLayer(const std::shar
    return res;
}

template <>
CNNLayer::Ptr NodeConverter<ngraph::op::PriorBoxClusteredIE>::createLayer(
    const std::shared_ptr<ngraph::Node>& layer) const {
    LayerParams params = {layer->get_friendly_name(), "PriorBoxClustered",
                          details::convertPrecision(layer->get_output_element_type(0))};
    auto res = std::make_shared<InferenceEngine::CNNLayer>(params);
    auto castedLayer = ngraph::as_type_ptr<ngraph::op::PriorBoxClusteredIE>(layer);
    if (castedLayer == nullptr) THROW_IE_EXCEPTION << "Cannot get " << params.type << " layer " << params.name;

    auto attr = castedLayer->get_attrs();
    std::string param;
    for (const auto& val : attr.widths) {
        if (!param.empty()) param += ",";
        param += asString(val);
    }
    res->params["width"] = param;

    param.clear();
    for (const auto& val : attr.heights) {
        if (!param.empty()) param += ",";
        param += asString(val);
    }
    res->params["height"] = param;

    param.clear();
    for (const auto& val : attr.variances) {
        if (!param.empty()) param += ",";
        param += asString(val);
    }
    res->params["variance"] = param;

    if (std::abs(attr.step_heights - attr.step_widths) < 1e-5) {
        res->params["step"] = asString(attr.step_widths);
    } else {
        res->params["step_w"] = asString(attr.step_widths);
        res->params["step_h"] = asString(attr.step_heights);
    }
    res->params["offset"] = asString(attr.offset);
    res->params["clip"] = asString(attr.clip ? 1 : 0);
    res->params["flip"] = "1";

    return res;
}

template <>
CNNLayer::Ptr NodeConverter<ngraph::op::PriorBoxClustered>::createLayer(
    const std::shared_ptr<ngraph::Node>& layer) const {
    THROW_IE_EXCEPTION << "PriorBoxClustered operation must be converted to PriorBoxClusteredIE operation.";
}
template <>
CNNLayer::Ptr NodeConverter<ngraph::op::PriorBoxIE>::createLayer(const std::shared_ptr<ngraph::Node>& layer) const {
    LayerParams params = {layer->get_friendly_name(), "PriorBox",
                          details::convertPrecision(layer->get_output_element_type(0))};
    auto res = std::make_shared<InferenceEngine::CNNLayer>(params);
    auto castedLayer = ngraph::as_type_ptr<ngraph::op::PriorBoxIE>(layer);
    auto layer_info = params.type + " layer " + params.name;

    if (castedLayer == nullptr) THROW_IE_EXCEPTION << "Cannot get " << layer_info;

    auto attr = castedLayer->get_attrs();
    std::string param;

    auto data_pshape = castedLayer->get_input_partial_shape(0);
    if (data_pshape.is_dynamic()) THROW_IE_EXCEPTION << "Dynamic 0-port input of " << layer_info << " is not supported";
    auto data_shape = data_pshape.to_shape();
    if (data_shape.size() != 4) THROW_IE_EXCEPTION << layer_info << " has " << data_shape.size() << " items in 0-port input, 4 expected";

    auto img_pshape = castedLayer->get_input_partial_shape(1);
    if (img_pshape.is_dynamic()) THROW_IE_EXCEPTION << "Dynamic 1-port input of " << layer_info << " is not supported";
    auto img_shape = img_pshape.to_shape();
    if (img_shape.size() != 4) THROW_IE_EXCEPTION << layer_info << " has " << img_shape.size() << " items in 1-port input, 4 expected";

    if (!attr.scale_all_sizes) {
        // mxnet-like PriorBox
        auto img_H = img_shape[2];
        auto data_H = data_shape[2];
        if (attr.step == -1)
            attr.step = 1. * img_H / data_H;
        else
            attr.step *= img_H;
        for (auto& size : attr.min_size)
            size *= img_H;
    }

    for (const auto& val : attr.max_size) {
        if (!param.empty()) param += ",";
        param += asString(val);
    }
    res->params["max_size"] = param;

    param.clear();
    for (const auto& val : attr.min_size) {
        if (!param.empty()) param += ",";
        param += asString(val);
    }
    res->params["min_size"] = param;

    param.clear();
    for (const auto& val : attr.aspect_ratio) {
        if (!param.empty()) param += ",";
        param += asString(val);
    }
    res->params["aspect_ratio"] = param;

    param.clear();
    for (const auto& val : attr.variance) {
        if (!param.empty()) param += ",";
        param += asString(val);
    }
    res->params["variance"] = param;

    res->params["step"] = asString(attr.step);
    res->params["offset"] = asString(attr.offset);
    res->params["clip"] = asString(attr.clip ? 1 : 0);
    res->params["flip"] = asString(attr.flip ? 1 : 0);
    res->params["scale_all_sizes"] = asString(attr.scale_all_sizes ? 1 : 0);

    res->params["density"] = asString(attr.density);
    res->params["fixed_size"] = asString(attr.fixed_size);
    res->params["fixed_ratio"] = asString(attr.fixed_ratio);

    return res;
}

template <>
CNNLayer::Ptr NodeConverter<ngraph::op::PriorBox>::createLayer(const std::shared_ptr<ngraph::Node>& layer) const {
    THROW_IE_EXCEPTION << "PriorBox operation must be converted to PriorBoxIE operation.";
}

template <>
CNNLayer::Ptr NodeConverter<ngraph::op::PowerIE>::createLayer(const std::shared_ptr<ngraph::Node>& layer) const {
    LayerParams params = {layer->get_friendly_name(), "Power",
@@ -272,6 +272,48 @@ void CombineData(DataPtr& master, DataPtr& slave) {
    }
}

/**
 * Preserve output data name and update output data map of the network
 *
 * @param in_data name to update
 * @param out_data name to preserve
 * @param net output data map to update with in_data
 */
template <typename NET>
void SaveOutputDataName(InferenceEngine::DataPtr in_data, InferenceEngine::DataPtr out_data, NET &net) {
    // TODO: update outputs of the network if out_data was output
    if (out_data->getInputTo().empty()) {
        auto data_name = out_data->getName();
        in_data->setName(data_name);
    }
}

/**
 * void SaveOutputDataName(InferenceEngine::DataPtr in_data, InferenceEngine::DataPtr out_data, NET &net), where
 * NET = ICNNNetwork
 */
void SaveOutputDataName(InferenceEngine::DataPtr in_data, InferenceEngine::DataPtr out_data, ICNNNetwork& net) {
    if (out_data->getInputTo().empty()) {
        InferenceEngine::OutputsDataMap outputs_data_map;
        net.getOutputsInfo(outputs_data_map);
        auto out_data_name = out_data->getName();
        in_data->setName(out_data_name);
        if (outputs_data_map.count(out_data_name)) {
            auto parent_layer_ptr = in_data->getCreatorLayer().lock();
            IE_ASSERT(parent_layer_ptr != nullptr);
            auto parent_layer_name = parent_layer_ptr->name;
            size_t in_data_out_index = 0;
            for (size_t ind = 0; ind < parent_layer_ptr->outData.size(); ++ind) {
                if (parent_layer_ptr->outData[ind] == in_data) {
                    in_data_out_index = ind;
                }
            }
            net.addOutput(parent_layer_name, in_data_out_index);
        }
    }
}
/**
 * Remove layer from graph
 * May be applied only for inplace layer. One input, one output,
@@ -279,7 +321,8 @@ void CombineData(DataPtr& master, DataPtr& slave) {
 *
 * @param layer to remove from graph
 */
void RemoveLayer(CNNLayerPtr& layer) {
template <typename NET>
void RemoveLayer(CNNLayerPtr& layer, NET &net) {
    IE_ASSERT(layer->insData.size() == 1);
    IE_ASSERT(layer->outData.size() == 1);

@@ -299,10 +342,8 @@ void RemoveLayer(CNNLayerPtr& layer) {
    // transfer output connections into parent data
    CombineData(in_data, out_data);

    // Save name for output data
    if (out_data->getInputTo().empty()) {
        in_data->setName(out_data->getName());
    }
    // save name for output data and update network output
    SaveOutputDataName(in_data, out_data, net);
}

/************************************************************/
@@ -1371,7 +1412,7 @@ void fixConvertLayers(NET &net) {
        }
    }
    for (auto &layer : to_remove) {
        RemoveLayer(layer);
        RemoveLayer(layer, net);
    }
}
@@ -21,6 +21,8 @@ public:
    ~GemmTransformation() override {};
    bool canBeTransformed(const TransformationContext& context, const CNNLayer& layer) const override;
    void transform(TransformationContext& context, CNNLayer& layer) const override;

    bool isQuantized(const CNNLayer& layer) const noexcept override;
};

IE_SUPPRESS_DEPRECATED_END

@@ -83,6 +83,8 @@ protected:
        const std::vector<float>& originalWeightsDequantizationShifts,
        std::vector<float>& dequantizationScales,
        std::vector<float>& dequantizationShifts) const;

    static bool getDequantizationDimIsSupported(const CNNLayer& weightableLayer);
};

typedef std::shared_ptr<WeightableLayerTransformation> WeightableLayerTransformationPtr;

@@ -135,7 +135,6 @@ void ConcatTransformation::transform(TransformationContext& context, CNNLayer& c

    dequantizationScale = maxOutputInterval / (dataPrecision.max - dataPrecision.min);
    const float max = maxOutputInterval / ((dataPrecision.max - dataPrecision.min) / dataPrecision.max);
    const float min = maxOutputInterval / ((dataPrecision.max - dataPrecision.min) / dataPrecision.min);
    dequantizationShift = outputLowValue - min;
@@ -25,15 +25,6 @@
using namespace InferenceEngine;
using namespace InferenceEngine::details;

bool getDequantizationValuesAreBroadcasted(const CNNLayer& fullyConnected) {
    const DataPtr inputData = fullyConnected.insData[0].lock();
    if (inputData == nullptr) {
        THROW_IE_LPT_EXCEPTION(fullyConnected) << "input data is absent";
    }

    return inputData->getDims().size() == 3ul;
}

bool FullyConnectedTransformation::canBeTransformed(const TransformationContext& context, const CNNLayer& fullyConnected) const {
    if (!WeightableLayerTransformation::canBeTransformed(context, fullyConnected)) {
        return false;
@@ -72,7 +63,12 @@ bool FullyConnectedTransformation::canBeTransformed(const TransformationContext&
    std::vector<float> dequantizationShifts;
    fillFromDequantizationLayer(*scaleShift, dequantizationScales, dequantizationShifts);

    if ((inTensorDims.size() == 3ul) && (!DequantizationDetails::isPerTensor(dequantizationScales, dequantizationShifts))) {
    const bool dequantizationDimIsSupported = !getDequantizationDimIsSupported(fullyConnected);
    if ((!dequantizationDimIsSupported) &&
        (!DequantizationDetails::isPerTensor(dequantizationScales, dequantizationShifts) ||
        // if asymmetric quantization is not supported then no shifts for dequantizationDimIsSupported = false case:
        // in this case we can not dequantize with shifts
        (!supportAsymmetricQuantization && (dequantizationShifts[0] != 0.f)))) {
        return false;
    }
@@ -318,7 +314,7 @@ void FullyConnectedTransformation::calculateDequantizationForSymmetric(
    const auto prevDequantizationScaleBuffer = CNNNetworkHelper::getFloatData(CNNNetworkHelper::getBlob(scaleShift, "weights"));
    const auto prevDequantizationShiftBuffer = CNNNetworkHelper::getFloatData(CNNNetworkHelper::getBlob(scaleShift, "biases"));

    const bool dequantizationValuesAreBroadcasted = getDequantizationValuesAreBroadcasted(fullyConnected);
    const bool dequantizationValuesAreBroadcasted = !getDequantizationDimIsSupported(fullyConnected);
    for (size_t i = 0; i < outputChannelsCount; ++i) {
        dequantizationScales[i] =
            prevDequantizationScaleBuffer.get()[0] *
@@ -401,7 +397,7 @@ void FullyConnectedTransformation::calculateDequantizationForAsymmetric(
        THROW_IE_EXCEPTION << "Unexpected layer type to calculate quantization values " << scaleShift->type;
    }

    const bool dequantizationValuesAreBroadcasted = getDequantizationValuesAreBroadcasted(fullyConnected);
    const bool dequantizationValuesAreBroadcasted = !getDequantizationDimIsSupported(fullyConnected);

    dequantizationScales.resize(outputChannelsCount);
    dequantizationShifts.resize(outputChannelsCount);
@@ -412,10 +408,10 @@ void FullyConnectedTransformation::calculateDequantizationForAsymmetric(
            prevDequantizationScaleBuffer.get()[0] *
            (originalWeightsDequantizationScales.size() == 0 ?
                1.0 :
                (originalWeightsDequantizationScales.size() == 1 ? originalWeightsDequantizationScales[0] : originalWeightsDequantizationScales[i]));
                originalWeightsDequantizationScales[((originalWeightsDequantizationScales.size() == 1) || dequantizationValuesAreBroadcasted) ? 0 : i]);
    }

    if (CNNNetworkHelper::isQuantizedConstWeights(fullyConnected)) {
    if (CNNNetworkHelper::isQuantizedConstWeights(fullyConnected) && (!dequantizationValuesAreBroadcasted)) {
        const Blob::Ptr weightsBlob = CNNNetworkHelper::getWeights(fullyConnected, roundQuantizedValues);
        const auto weightsBuffer = CNNNetworkHelper::getFloatData(weightsBlob);
        const Blob::Ptr biasesBlob = CNNNetworkHelper::getBiases(fullyConnected);
@@ -432,7 +428,7 @@ void FullyConnectedTransformation::calculateDequantizationForAsymmetric(

        for (size_t w = 0; w < inputChannelsCount; ++w) {
            const float kernel = weightsBuffer.get()[channel * inputChannelsCount + w];
            const float shift = dequantizationValuesAreBroadcasted ? prevDequantizationShiftBuffer.get()[0] : prevDequantizationShiftBuffer.get()[w];
            const float shift = prevDequantizationShiftBuffer.get()[w];
            sum1 += kernel * shift * weightsDequantizationScale;
            sum2 += kernel * dataZeroPoints[w] * weightsDequantizationScale;
        }
@@ -133,3 +133,8 @@ void GemmTransformation::transform(TransformationContext& context, CNNLayer& gem

    addDequantizationLayer(context, gemm, dequantizationScales, dequantizationShifts);
}

bool GemmTransformation::isQuantized(const CNNLayer& layer) const noexcept {
    // weightable layer version overriding
    return true;
}

@@ -128,6 +128,15 @@ bool WeightableLayerTransformation::isPrecisionPreserved(const CNNLayer& layer)
    return false;
}

bool WeightableLayerTransformation::getDequantizationDimIsSupported(const CNNLayer& fullyConnected) {
    const DataPtr inputData = fullyConnected.insData[0].lock();
    if (inputData == nullptr) {
        THROW_IE_LPT_EXCEPTION(fullyConnected) << "input data is absent";
    }

    return inputData->getDims().size() != 3ul;
}

void WeightableLayerTransformation::updateLayerBiases(
    TransformationContext& context,
    const CNNLayer& weightableLayer,
@@ -135,7 +144,17 @@ void WeightableLayerTransformation::updateLayerBiases(
    std::vector<float>& dequantizationScales,
    std::vector<float>& dequantizationShifts,
    std::vector<float>& biasesShifts) const {
    if (!std::all_of(dequantizationShifts.begin(), dequantizationShifts.end(), [](float value) { return value == 0.0; })) {
    const bool dequantizationShiftsAreZero = std::all_of(
        dequantizationShifts.begin(),
        dequantizationShifts.end(),
        [](float value) { return value == 0.0; });

    const bool dequantizationDimIsNotSupported = !getDequantizationDimIsSupported(weightableLayer);
    CNNLayerPtr biasesLayer = CNNNetworkHelper::getParent(weightableLayer, 2);

    // we need to correct biases if dequantization shifts values are not zero or
    // dequantization dimension is not supported (as a result dequantization shifts can not be calculated)
    if ((dequantizationDimIsNotSupported && (biasesLayer != nullptr)) || (!dequantizationShiftsAreZero)) {
        const DataPtr insData = weightableLayer.insData[0].lock();
        if (insData == nullptr) {
            THROW_IE_LPT_EXCEPTION(weightableLayer) << "input data is absent";
@@ -144,7 +163,6 @@ void WeightableLayerTransformation::updateLayerBiases(

    std::shared_ptr<float> biasesBufferPtr;
    Blob::Ptr biasesBlob;
    CNNLayerPtr biasesLayer = CNNNetworkHelper::getParent(weightableLayer, 2);
    if (biasesLayer == nullptr) {
        if (weightableLayer.outData.size() != 1ul) {
            THROW_IE_LPT_EXCEPTION(weightableLayer) << "unexpected output data count " << weightableLayer.outData.size();
@@ -661,6 +661,13 @@ MKLDNNMemoryDesc::operator InferenceEngine::TensorDesc() const {
        blkDims.push_back(8);
        layout = Layout::BLOCKED;
        break;
    case memory::gOdhwi8o:
        order = {0, 1, 2, 3, 4, 5, 1};
        blkDims = dims;
        blkDims[1] = blkDims[1] / 8 + (blkDims[1] % 8 ? 1 : 0);
        blkDims.push_back(8);
        layout = Layout::BLOCKED;
        break;
    case memory::nChw16c:
        order = {0, 1, 2, 3, 1};
        blkDims = dims;
@@ -676,6 +683,13 @@ MKLDNNMemoryDesc::operator InferenceEngine::TensorDesc() const {
        blkDims.push_back(16);
        layout = Layout::BLOCKED;
        break;
    case memory::gOdhwi16o:
        order = {0, 1, 2, 3, 4, 5, 1};
        blkDims = dims;
        blkDims[1] = blkDims[1] / 16 + (blkDims[1] % 16 ? 1 : 0);
        blkDims.push_back(16);
        layout = Layout::BLOCKED;
        break;
    case memory::Ohwi8o:
        order = {0, 1, 2, 3, 0};
        blkDims = dims;
@@ -1267,6 +1281,13 @@ MKLDNNMemoryDesc::MKLDNNMemoryDesc(const TensorDesc& tDesc):
    } else if (blkdDims[6] == 16) {
        mkldnnFormat = memory::format::Goidhw16g;
    }
} else if (order.size() == 7 &&
           order[0] == 0 && order[1] == 1 && order[2] == 2 && order[3] == 3 && order[4] == 4 && order[5] == 5 && order[6] == 1) {
    if (blkdDims[6] == 8) {
        mkldnnFormat = memory::format::gOdhwi8o;
    } else if (blkdDims[6] == 16) {
        mkldnnFormat = memory::format::gOdhwi16o;
    }
} else if (order.size() == 8 &&
           order[0] == 0 && order[1] == 1 && order[2] == 3 && order[3] == 4 && order[4] == 2 && order[5] == 5 &&
           order[6] == 1 && order[7] == 2) {
@@ -182,8 +182,6 @@ void argmax_many_classes_has_axis(const float* src_data, float* dst_data, Shape
    vmask_type vmask;
    int s_index = i0 * dim * after_num + ib1 * block_size;

    std::memset(reinterpret_cast<void*>(&vmax_values[0]), 0, sizeof(vmax_values));

    auto vswap_func = [&](int index1, int index2) {
        vtmp = vmax_values[index1];
        vmax_values[index1] = _mm_uni_blendv_ps(vmax_values[index1], vmax_values[index2], vmask);

@@ -157,7 +157,7 @@ void MKLDNNDepthwiseNode::createDescriptor(const std::vector<InferenceEngine::Te
                                           const std::vector<InferenceEngine::TensorDesc> &outputDesc) {
    MKLDNNMemoryDesc in_candidate(inputDesc[0]);
    MKLDNNMemoryDesc out_candidate(inputDesc[0]);
    MKLDNNDims weightDims({in_candidate.getDims()[1]});
    MKLDNNDims weightDims({in_candidate.getDims().ndims() == 1 ? in_candidate.getDims()[0] : in_candidate.getDims()[1]});

    MKLDNNMemoryDesc wgh_candidate{weightDims, in_candidate.getDataType(), memory::x};
@@ -209,32 +209,34 @@ void MKLDNNFullyConnectedNode::setPostOps(mkldnn::primitive_attr &attr, bool ini
            PostOpsIntBlobMemory.push_back(MKLDNNMemoryPtr(new MKLDNNMemory(getEngine())));
            PostOpsIntBlobMemory[blob_idx]->Create(depthwiseDims, memory::data_type::f32, memory::format::x);

            PostOpsIntBlobMemory[blob_idx]->SetData(memory::data_type::f32, memory::x,
                                                    depthwiseLayer->_weights->buffer(),
                                                    depthwiseLayer->_weights->size() *
                                                    MKLDNNExtensionUtils::sizeOfDataType(memory::data_type::f32));

            // In case ndims == 3 graph optimizer allows fusing only if all weights values are the same
            if (depthwiseNode->isBroadcast() || ndims == 3) {
                float broadcastValue = static_cast<float *>(PostOpsIntBlobMemory[blob_idx]->GetData())[0];
                for (int i = 1; i < PostOpsIntBlobMemory[blob_idx]->GetPrimitiveDescriptor().desc().data.dims[0]; i++) {
                float broadcastValue = static_cast<float *>(depthwiseLayer->_weights->buffer())[0];
                for (int i = 0; i < PostOpsIntBlobMemory[blob_idx]->GetPrimitiveDescriptor().desc().data.dims[0]; i++) {
                    static_cast<float *>(PostOpsIntBlobMemory[blob_idx]->GetData())[i] = broadcastValue;
                }
            } else {
                PostOpsIntBlobMemory[blob_idx]->SetData(memory::data_type::f32, memory::x,
                                                        depthwiseLayer->_weights->buffer(),
                                                        depthwiseLayer->_weights->size() *
                                                        MKLDNNExtensionUtils::sizeOfDataType(memory::data_type::f32));
            }

            if (depthwiseNode->getAlgorithm() == depthwise_scale_shift) {
                PostOpsIntBlobMemory.push_back(MKLDNNMemoryPtr(new MKLDNNMemory(getEngine())));
                PostOpsIntBlobMemory[blob_idx + 1]->Create(depthwiseDims, memory::data_type::f32,
                                                           memory::format::x);
                PostOpsIntBlobMemory[blob_idx + 1]->SetData(memory::data_type::f32, memory::x,
                                                            depthwiseLayer->_biases->buffer(),
                                                            depthwiseLayer->_biases->size() *
                                                            MKLDNNExtensionUtils::sizeOfDataType(memory::data_type::f32));
                PostOpsIntBlobMemory[blob_idx + 1]->Create(depthwiseDims, memory::data_type::f32, memory::format::x);

                // In case ndims == 3 graph optimizer allows fusing only if all biases values are the same
                if (depthwiseNode->isBroadcast() || ndims == 3) {
                    float broadcastValue = static_cast<float *>(PostOpsIntBlobMemory[blob_idx + 1]->GetData())[0];
                    for (int i = 1; i < PostOpsIntBlobMemory[blob_idx + 1]->GetPrimitiveDescriptor().desc().data.dims[0]; i++) {
                    float broadcastValue = static_cast<float *>(depthwiseLayer->_biases->buffer())[0];
                    for (int i = 0; i < PostOpsIntBlobMemory[blob_idx + 1]->GetPrimitiveDescriptor().desc().data.dims[0]; i++) {
                        static_cast<float *>(PostOpsIntBlobMemory[blob_idx + 1]->GetData())[i] = broadcastValue;
                    }
                } else {
                    PostOpsIntBlobMemory[blob_idx + 1]->SetData(memory::data_type::f32, memory::x,
                                                                depthwiseLayer->_biases->buffer(),
                                                                depthwiseLayer->_biases->size() *
                                                                MKLDNNExtensionUtils::sizeOfDataType(memory::data_type::f32));
                }

                ops.append_depthwise(depthwiseNode->getAlgorithm(),
@@ -667,7 +667,8 @@ private:
};

MKLDNNNormalizeNode::MKLDNNNormalizeNode(const InferenceEngine::CNNLayerPtr& layer, const mkldnn::engine& eng, MKLDNNWeightsSharing::Ptr &cache)
        : MKLDNNNode(layer, eng, cache) {}
        : MKLDNNNode(layer, eng, cache), src_data_size(0lu), dst_data_size(0lu), weights_data_size(0lu),
          input_prec(Precision::UNSPECIFIED), output_prec(Precision::UNSPECIFIED), weights_prec(Precision::UNSPECIFIED) {}

void MKLDNNNormalizeNode::getSupportedDescriptors() {
    if (!descs.empty())
@@ -120,13 +120,18 @@ void MKLDNNReorderNode::createReorderPrimitive(const mkldnn::memory::desc &srcDe
    // Code block below tries to detect such cases and reinterpret data planar formats (e.g. nchw)
    // as grouped weights planar formats (e.g. goihw) since they have same physical memory layout.
    if (MKLDNNMemory::GetPlainFormat(src_blocked->GetDims()) == src_blocked->GetFormat() &&
        MKLDNNMemory::IsGroupedFormat(dst_blocked->GetFormat())) {
        src_blocked->GetDims().size() + 1 == dst_blocked->GetDims().size()) {
        try {
            mkldnn::memory::dims newDims = dst_blocked->GetDims();
            mkldnn::memory::format newFormat;
            newFormat = src_blocked->GetDims().size() == 4 ? memory::goihw :
                        src_blocked->GetDims().size() == 5 ? memory::goidhw :
                        src_blocked->GetFormat();
            if (MKLDNNMemory::IsGroupedFormat(dst_blocked->GetFormat())) {
                newFormat = src_blocked->GetDims().size() == 4 ? memory::goihw :
                            src_blocked->GetDims().size() == 5 ? memory::goidhw :
                            src_blocked->GetFormat();
            } else {
                newFormat = src_blocked->GetDims().size() == 4 ? memory::ncdhw :
                            src_blocked->GetFormat();
            }

            auto newDesc = mkldnn::memory::desc(newDims, src_blocked->GetDataType(), newFormat);
            src_blocked->Create(newDesc, srcPtr, false);
@@ -413,6 +413,16 @@ std::shared_ptr<ngraph::Node> V10Parser::createNode(const std::vector<ngraph::Ou
        std::make_shared<LayerCreator<ngraph::op::v1::ReduceLogicalOr>>("ReduceLogicalOr"),
    };

    // Check that the operation is in the default opsets
    auto isDefaultOpSet = [](const std::string& version) -> bool {
        for (size_t i = 1; i <= 3; i++) {
            std::string opset_name = "opset" + std::to_string(i);
            if (version == opset_name)
                return true;
        }
        return false;
    };

    for (size_t i = 0; i < inputs.size(); i++) {
        if (!inputs[i].get_node())
            THROW_IE_EXCEPTION << params.type << " layer " << params.name << " with id: " << params.layerId
@@ -423,21 +433,23 @@ std::shared_ptr<ngraph::Node> V10Parser::createNode(const std::vector<ngraph::Ou
    }

    std::shared_ptr<ngraph::Node> ngraphNode;
    // Try to create operation from creators
    for (const auto& creator : creators) {
        if (creator->shouldCreate(params.type)) {
            bool useCreator = false;
            // Check that opset is registered
            useCreator |= opsets.find(params.version) == opsets.end();
            if (!useCreator) {
                // Check that creator can create operation with the version from opset
                const auto opset = opsets.at(params.version);
                // Opset should contain the same version of operation or not contain operation with current type
                useCreator |= opset.contains_type(creator->getNodeType()) || !opset.contains_type(params.type);
    if (isDefaultOpSet(params.version)) {
        // Try to create operation from creators
        for (const auto& creator : creators) {
            if (creator->shouldCreate(params.type)) {
                bool useCreator = false;
                // Check that opset is registered
                useCreator |= opsets.find(params.version) == opsets.end();
                if (!useCreator) {
                    // Check that creator can create operation with the version from opset
                    const auto opset = opsets.at(params.version);
                    // Opset should contain the same version of operation or not contain operation with current type
                    useCreator |= opset.contains_type(creator->getNodeType()) || !opset.contains_type(params.type);
                }
                if (useCreator)
                    ngraphNode = creator->createLayer(inputs, node, binStream, params);
                break;
            }
            if (useCreator)
                ngraphNode = creator->createLayer(inputs, node, binStream, params);
            break;
        }
    }
@@ -0,0 +1,43 @@
// Copyright (C) 2018-2020 Intel Corporation
// SPDX-License-Identifier: Apache-2.0
//

#pragma once

#include <memory>

#include <transformations_visibility.hpp>

#include <ngraph/op/op.hpp>
#include <ngraph/op/experimental/layers/prior_box_clustered.hpp>

namespace ngraph {
namespace op {

class TRANSFORMATIONS_API PriorBoxClusteredIE : public Op {
public:
    static constexpr NodeTypeInfo type_info{"PriorBoxClusteredIE", 1};
    const NodeTypeInfo& get_type_info() const override { return type_info; }

    /// \brief Constructs a PriorBoxClusteredIE operation
    ///
    /// \param input Layer for which prior boxes are computed
    /// \param image Input to which prior boxes are scaled
    /// \param attrs PriorBoxClustered attributes
    PriorBoxClusteredIE(const Output<Node>& input,
                        const Output<Node>& image,
                        const ngraph::op::PriorBoxClusteredAttrs& attrs);

    void validate_and_infer_types() override;

    std::shared_ptr<Node> copy_with_new_args(const NodeVector& new_args) const override;

    const PriorBoxClusteredAttrs& get_attrs() const { return m_attrs; }

private:
    PriorBoxClusteredAttrs m_attrs;
};

}  // namespace op
}  // namespace ngraph

@@ -0,0 +1,42 @@
// Copyright (C) 2018-2020 Intel Corporation
// SPDX-License-Identifier: Apache-2.0
//

#pragma once

#include <memory>

#include <transformations_visibility.hpp>

#include "ngraph/op/op.hpp"
#include "ngraph/op/experimental/layers/prior_box.hpp"

namespace ngraph {
namespace op {

class TRANSFORMATIONS_API PriorBoxIE : public Op {
public:
    static constexpr NodeTypeInfo type_info{"PriorBoxIE", 1};
    const NodeTypeInfo& get_type_info() const override { return type_info; }

    /// \brief Constructs a PriorBoxIE operation
    ///
    /// \param input Layer for which prior boxes are computed
    /// \param image Input to which prior boxes are scaled
    /// \param attrs PriorBox attributes
    PriorBoxIE(const Output<Node>& input,
               const Output<Node>& image,
               const ngraph::op::PriorBoxAttrs& attrs);

    void validate_and_infer_types() override;

    std::shared_ptr<Node> copy_with_new_args(const NodeVector& new_args) const override;

    const PriorBoxAttrs& get_attrs() const { return m_attrs; }

private:
    PriorBoxAttrs m_attrs;
};

}  // namespace op
}  // namespace ngraph
@@ -16,6 +16,8 @@

// This pass must be called first in pipeline
NGRAPH_PASS(InitNodeInfo, ::ngraph::pass)
NGRAPH_PASS(ConvertPriorBox, ::ngraph::pass) // WA: ConvertPriorBox must be executed before CF
NGRAPH_PASS(ConstantFolding, ::ngraph::pass)
NGRAPH_PASS(RemoveFilteringBoxesBySize, ::ngraph::pass) // Resolves dynamism (replaces NonZero), CF needed
NGRAPH_PASS(ConstantFolding, ::ngraph::pass)
NGRAPH_PASS(StridedSliceOptimization, ::ngraph::pass) // depends on CF

@@ -0,0 +1,33 @@
// Copyright (C) 2018-2020 Intel Corporation
// SPDX-License-Identifier: Apache-2.0
//

#pragma once

#include <vector>
#include <memory>

#include <transformations_visibility.hpp>

#include <ngraph/pass/graph_rewrite.hpp>

namespace ngraph {
namespace pass {

class TRANSFORMATIONS_API ConvertPriorBox;

}  // namespace pass
}  // namespace ngraph

class ngraph::pass::ConvertPriorBox: public ngraph::pass::GraphRewrite {
public:
    ConvertPriorBox() : GraphRewrite() {
        convert_prior_box();
        convert_prior_box_clustered();
    }

private:
    void convert_prior_box();

    void convert_prior_box_clustered();
};
@@ -0,0 +1,39 @@
// Copyright (C) 2018-2020 Intel Corporation
// SPDX-License-Identifier: Apache-2.0
//

#include "ngraph_ops/prior_box_clustered_ie.hpp"

#include <memory>

#include "ngraph/op/constant.hpp"

using namespace std;
using namespace ngraph;

constexpr NodeTypeInfo op::PriorBoxClusteredIE::type_info;

op::PriorBoxClusteredIE::PriorBoxClusteredIE(const Output<Node>& input, const Output<Node>& image,
                                             const PriorBoxClusteredAttrs& attrs)
    : Op({input, image}), m_attrs(attrs) {
    constructor_validate_and_infer_types();
}

void op::PriorBoxClusteredIE::validate_and_infer_types() {
    if (get_input_partial_shape(0).is_dynamic() || get_input_partial_shape(1).is_dynamic()) {
        set_output_type(0, element::f32, PartialShape::dynamic(3));
        return;
    }

    auto input_shape = get_input_shape(0);
    auto image_shape = get_input_shape(1);

    size_t num_priors = m_attrs.widths.size();

    set_output_type(0, element::f32, Shape {1, 2, 4 * input_shape[2] * input_shape[3] * num_priors});
}

shared_ptr<Node> op::PriorBoxClusteredIE::copy_with_new_args(const NodeVector& new_args) const {
    check_new_args_count(this, new_args);
    return make_shared<PriorBoxClusteredIE>(new_args.at(0), new_args.at(1), m_attrs);
}
@@ -0,0 +1,36 @@
// Copyright (C) 2018-2020 Intel Corporation
// SPDX-License-Identifier: Apache-2.0
//

#include "ngraph_ops/prior_box_ie.hpp"

#include <memory>

#include "ngraph/op/constant.hpp"

using namespace std;
using namespace ngraph;

constexpr NodeTypeInfo op::PriorBoxIE::type_info;

op::PriorBoxIE::PriorBoxIE(const Output<Node>& input, const Output<Node>& image, const PriorBoxAttrs& attrs)
    : Op({input, image}), m_attrs(attrs) {
    constructor_validate_and_infer_types();
}

void op::PriorBoxIE::validate_and_infer_types() {
    if (get_input_partial_shape(0).is_dynamic() || get_input_partial_shape(1).is_dynamic()) {
        set_output_type(0, element::f32, PartialShape::dynamic(3));
        return;
    }
    auto input_shape = get_input_shape(0);
    auto image_shape = get_input_shape(1);

    set_output_type(0, element::f32, Shape {
        1, 2, 4 * input_shape[2] * input_shape[3] * static_cast<size_t>(op::PriorBox::number_of_priors(m_attrs))});
}

shared_ptr<Node> op::PriorBoxIE::copy_with_new_args(const NodeVector& new_args) const {
    check_new_args_count(this, new_args);
    return make_shared<PriorBoxIE>(new_args.at(0), new_args.at(1), m_attrs);
}
@@ -5,6 +5,7 @@
#include <memory>

#include "transformations/common_optimizations/common_optimizations.hpp"
#include "transformations/convert_opset1_to_legacy/convert_prior_to_ie_prior.hpp"
#include "transformations/depth_to_space_fusion.hpp"
#include "transformations/optimize_strided_slice.hpp"
#include "transformations/convert_scatter_elements_to_scatter.hpp"

@@ -17,7 +17,8 @@ void ngraph::pass::ConvertDivide::convert_divide() {

ngraph::graph_rewrite_callback callback = [](pattern::Matcher& m) {
    auto div = std::dynamic_pointer_cast<ngraph::opset1::Divide> (m.get_match_root());
    if (!div) {
    // We cannot apply this transformation to integer input data types
    if (!div || div->input(0).get_element_type().is_integral()) {
        return false;
    }

@@ -0,0 +1,294 @@
// Copyright (C) 2018-2020 Intel Corporation
// SPDX-License-Identifier: Apache-2.0
//

#include "transformations/convert_opset1_to_legacy/convert_prior_to_ie_prior.hpp"

#include <memory>
#include <vector>

#include <ngraph/opsets/opset3.hpp>
#include <ngraph/opsets/opset1.hpp>

#include <ngraph_ops/prior_box_ie.hpp>
#include <ngraph_ops/prior_box_clustered_ie.hpp>
#include <ngraph/rt_info.hpp>

void ngraph::pass::ConvertPriorBox::convert_prior_box() {
    auto data = std::make_shared<pattern::op::Label>(element::i64, Shape{1, 1, 1, 1});
    auto axes = ngraph::opset1::Constant::create(element::i64, Shape{1}, {0});
    auto image = std::make_shared<pattern::op::Label>(element::i64, Shape{1, 1, 1, 1});

    ngraph::op::PriorBoxAttrs attr;
    attr.min_size = {162.0f};
    attr.max_size = {213.0f};
    attr.aspect_ratio = {2.0f, 3.0f};
    attr.variance = {0.1f, 0.1f, 0.2f, 0.2f};
    attr.step = 64.0f;
    attr.offset = 0.5f;
    attr.clip = 0;
    attr.flip = 1;
    attr.scale_all_sizes = true;

    auto prior_box = std::make_shared<ngraph::opset1::PriorBox>(data, image, attr);
    auto unsqueeze = std::make_shared<ngraph::opset1::Unsqueeze> (prior_box, axes);

    ngraph::graph_rewrite_callback callback = [](pattern::Matcher& m) {
        auto unsqueeze = std::dynamic_pointer_cast<ngraph::opset1::Unsqueeze> (m.get_match_root());
        if (!unsqueeze) {
            return false;
        }
        auto prior_box_node = std::dynamic_pointer_cast<ngraph::opset1::PriorBox> (unsqueeze->input_value(0).get_node_shared_ptr());

        if (!prior_box_node) {
            return false;
        }

        // vector of nGraph nodes that will be replaced
        ngraph::NodeVector ops_to_replace{unsqueeze, prior_box_node};

        std::shared_ptr<Node> input_1(prior_box_node->input_value(0).get_node_shared_ptr());
        std::shared_ptr<Node> input_2(prior_box_node->input_value(1).get_node_shared_ptr());

        auto convert1 = std::dynamic_pointer_cast<ngraph::opset1::Convert> (input_1);
        auto convert2 = std::dynamic_pointer_cast<ngraph::opset1::Convert> (input_2);

        if (convert1 && convert2) {
            ops_to_replace.push_back(convert1);
            ops_to_replace.push_back(convert2);
            input_1 = convert1->input_value(0).get_node_shared_ptr();
            input_2 = convert2->input_value(0).get_node_shared_ptr();
        }

        auto strided_slice1 = std::dynamic_pointer_cast<ngraph::opset1::StridedSlice> (input_1);
        auto strided_slice2 = std::dynamic_pointer_cast<ngraph::opset1::StridedSlice> (input_2);

        if (!strided_slice1 || !strided_slice2) {
            return false;
        }

        ops_to_replace.push_back(strided_slice1);
        ops_to_replace.push_back(strided_slice2);

        // Check that StridedSlice1 cuts H,W dims for PriorBox
        auto begin = std::dynamic_pointer_cast<ngraph::opset1::Constant> (strided_slice1->input_value(1).get_node_shared_ptr());
        auto end = std::dynamic_pointer_cast<ngraph::opset1::Constant> (strided_slice1->input_value(2).get_node_shared_ptr());
        auto stride = std::dynamic_pointer_cast<ngraph::opset1::Constant> (strided_slice1->input_value(3).get_node_shared_ptr());

        if (!begin || !end || !stride) {
            return false;
        }

        auto begin_val = begin->get_vector<int64_t>();
        auto end_val = end->get_vector<int64_t>();
        auto stride_val = stride->get_vector<int64_t>();

        if (begin_val.size() != 1 && begin_val[0] != 2) {
            return false;
        }

        if (end_val.size() != 1 && end_val[0] != 4) {
            return false;
        }

        if (stride_val.size() != 1 && stride_val[0] != 1) {
            return false;
        }

        // TODO: should we check second StridedSlice?
        input_1 = strided_slice1->input_value(0).get_node_shared_ptr();
        input_2 = strided_slice2->input_value(0).get_node_shared_ptr();

        convert1 = std::dynamic_pointer_cast<ngraph::opset1::Convert> (input_1);
        convert2 = std::dynamic_pointer_cast<ngraph::opset1::Convert> (input_2);

        if (convert1 && convert2) {
            ops_to_replace.push_back(convert1);
            ops_to_replace.push_back(convert2);
            input_1 = convert1->input_value(0).get_node_shared_ptr();
            input_2 = convert2->input_value(0).get_node_shared_ptr();
        }

        // the input can be either ShapeOf-1 or ShapeOf-3
        std::shared_ptr<ngraph::op::Op> shape_of1 = std::dynamic_pointer_cast<ngraph::opset1::ShapeOf> (input_1);
        std::shared_ptr<ngraph::op::Op> shape_of2 = std::dynamic_pointer_cast<ngraph::opset1::ShapeOf> (input_2);

        if (!shape_of1 || !shape_of2) {
            shape_of1 = std::dynamic_pointer_cast<ngraph::opset3::ShapeOf>(input_1);
            shape_of2 = std::dynamic_pointer_cast<ngraph::opset3::ShapeOf>(input_2);
        }
        if (!shape_of1 || !shape_of2) {
            return false;
        }
        // keep this code for a while in case we decide to run this transformation again in the opset1->legacy
        // the input can be either ShapeOf or Convert(ShapeOf)
        // if (!shape_of1 || !shape_of2) {
        //     auto shapeof1_convert = std::dynamic_pointer_cast<ngraph::opset1::Convert> (input_1);
        //     auto shapeof2_convert = std::dynamic_pointer_cast<ngraph::opset1::Convert> (input_2);
        //     if (!shapeof1_convert || !shapeof2_convert)
        //         return false;
        //     shape_of1 = std::dynamic_pointer_cast<ngraph::opset1::ShapeOf>(shapeof1_convert->input_value(0).get_node_shared_ptr());
        //     shape_of2 = std::dynamic_pointer_cast<ngraph::opset1::ShapeOf>(shapeof2_convert->input_value(0).get_node_shared_ptr());
        //     if (!shape_of1 || !shape_of2)
        //         return false;
        //     ops_to_replace.push_back(shapeof1_convert);
        //     ops_to_replace.push_back(shapeof2_convert);
        // }

        ops_to_replace.push_back(shape_of1);
        ops_to_replace.push_back(shape_of2);

        auto prior_box_ie = std::make_shared<ngraph::op::PriorBoxIE> (shape_of1->input_value(0),
                                                                      shape_of2->input_value(0),
                                                                      prior_box_node->get_attrs());

        prior_box_ie->set_friendly_name(unsqueeze->get_friendly_name());

        // Nodes in copy runtime info function should be in topological order
        std::reverse(ops_to_replace.begin(), ops_to_replace.end());
        ngraph::copy_runtime_info(ops_to_replace, prior_box_ie);
        ngraph::replace_node(m.get_match_root(), prior_box_ie);
        return true;
    };

    auto m = std::make_shared<ngraph::pattern::Matcher>(unsqueeze, "CPUFusion.ConvertPriorBoxToPriorBoxIE");
    this->add_matcher(m, callback, PassProperty::CHANGE_DYNAMIC_STATE);
}

void ngraph::pass::ConvertPriorBox::convert_prior_box_clustered() {
    auto data = std::make_shared<pattern::op::Label>(element::i64, Shape{1, 1, 1, 1});
    auto axes = ngraph::opset1::Constant::create(element::i64, Shape{1}, {0});
    auto image = std::make_shared<pattern::op::Label>(element::i64, Shape{1, 1, 1, 1});

    ngraph::op::PriorBoxClusteredAttrs attr;
    attr.widths = {0.1f, 0.1f, 0.2f, 0.2f};
    attr.heights = {0.1f, 0.1f, 0.2f, 0.2f};
    attr.variances = {0.1f, 0.1f, 0.2f, 0.2f};
    attr.step_widths = 64.0f;
    attr.step_heights = 64.0f;
    attr.offset = 0.5f;
    attr.clip = false;

    auto prior_box = std::make_shared<ngraph::opset1::PriorBoxClustered>(data, image, attr);
    auto unsqueeze = std::make_shared<ngraph::opset1::Unsqueeze> (prior_box, axes);

    ngraph::graph_rewrite_callback callback = [](pattern::Matcher& m) {
        auto unsqueeze = std::dynamic_pointer_cast<ngraph::opset1::Unsqueeze> (m.get_match_root());
        if (!unsqueeze) {
            return false;
        }
        auto prior_box_node = std::dynamic_pointer_cast<ngraph::opset1::PriorBoxClustered> (unsqueeze->get_argument(0));

        if (!prior_box_node) {
            return false;
        }

        // vector of nGraph nodes that will be replaced
        ngraph::NodeVector ops_to_replace{unsqueeze, prior_box_node};

        std::shared_ptr<Node> input_1(prior_box_node->input_value(0).get_node_shared_ptr());
        std::shared_ptr<Node> input_2(prior_box_node->input_value(1).get_node_shared_ptr());

        auto convert1 = std::dynamic_pointer_cast<ngraph::opset1::Convert> (input_1);
        auto convert2 = std::dynamic_pointer_cast<ngraph::opset1::Convert> (input_2);

        if (convert1 && convert2) {
            ops_to_replace.push_back(convert1);
            ops_to_replace.push_back(convert2);
            input_1 = convert1->input_value(0).get_node_shared_ptr();
            input_2 = convert2->input_value(0).get_node_shared_ptr();
        }

        auto strided_slice1 = std::dynamic_pointer_cast<ngraph::opset1::StridedSlice> (input_1);
        auto strided_slice2 = std::dynamic_pointer_cast<ngraph::opset1::StridedSlice> (input_2);

        if (!strided_slice1 || !strided_slice2) {
            return false;
        }

        ops_to_replace.push_back(strided_slice1);
        ops_to_replace.push_back(strided_slice2);

        // Check that StridedSlice1 cuts H,W dims for PriorBox
        auto begin = std::dynamic_pointer_cast<ngraph::opset1::Constant> (strided_slice1->get_argument(1));
        auto end = std::dynamic_pointer_cast<ngraph::opset1::Constant> (strided_slice1->get_argument(2));
        auto stride = std::dynamic_pointer_cast<ngraph::opset1::Constant> (strided_slice1->get_argument(3));

        if (!begin || !end || !stride) {
            return false;
        }

        auto begin_val = begin->get_vector<int64_t>();
        auto end_val = end->get_vector<int64_t>();
        auto stride_val = stride->get_vector<int64_t>();

        if (begin_val.size() != 1 && begin_val[0] != 2) {
            return false;
        }

        if (end_val.size() != 1 && end_val[0] != 4) {
            return false;
        }

        if (stride_val.size() != 1 && stride_val[0] != 1) {
            return false;
        }

        // TODO: should we check second StridedSlice?
        input_1 = strided_slice1->input_value(0).get_node_shared_ptr();
        input_2 = strided_slice2->input_value(0).get_node_shared_ptr();

        convert1 = std::dynamic_pointer_cast<ngraph::opset1::Convert> (input_1);
        convert2 = std::dynamic_pointer_cast<ngraph::opset1::Convert> (input_2);

        if (convert1 && convert2) {
            ops_to_replace.push_back(convert1);
            ops_to_replace.push_back(convert2);
            input_1 = convert1->input_value(0).get_node_shared_ptr();
            input_2 = convert2->input_value(0).get_node_shared_ptr();
        }

        // the input can be either ShapeOf-1 or ShapeOf-3
        std::shared_ptr<ngraph::op::Op> shape_of1 = std::dynamic_pointer_cast<ngraph::opset1::ShapeOf> (input_1);
        std::shared_ptr<ngraph::op::Op> shape_of2 = std::dynamic_pointer_cast<ngraph::opset1::ShapeOf> (input_2);

        if (!shape_of1 || !shape_of2) {
            shape_of1 = std::dynamic_pointer_cast<ngraph::opset3::ShapeOf>(input_1);
            shape_of2 = std::dynamic_pointer_cast<ngraph::opset3::ShapeOf>(input_2);
        }
        if (!shape_of1 || !shape_of2) {
            return false;
        }
        // keep this code for a while in case we decide to run this transformation again in the opset1->legacy
        // the input can be either ShapeOf or Convert(ShapeOf)
        // if (!shape_of1 || !shape_of2) {
        //     auto shapeof1_convert = std::dynamic_pointer_cast<ngraph::opset1::Convert> (input_1);
        //     auto shapeof2_convert = std::dynamic_pointer_cast<ngraph::opset1::Convert> (input_2);
        //     if (!shapeof1_convert || !shapeof2_convert)
        //         return false;
        //     shape_of1 = std::dynamic_pointer_cast<ngraph::opset1::ShapeOf>(shapeof1_convert->input_value(0).get_node_shared_ptr());
        //     shape_of2 = std::dynamic_pointer_cast<ngraph::opset1::ShapeOf>(shapeof2_convert->input_value(0).get_node_shared_ptr());
        //     if (!shape_of1 || !shape_of2)
        //         return false;
        //     ops_to_replace.push_back(shapeof1_convert);
        //     ops_to_replace.push_back(shapeof2_convert);
        // }

        ops_to_replace.push_back(shape_of1);
        ops_to_replace.push_back(shape_of2);

        auto prior_box_ie = std::make_shared<ngraph::op::PriorBoxClusteredIE> (shape_of1->get_argument(0),
                                                                               shape_of2->get_argument(0),
                                                                               prior_box_node->get_attrs());
        prior_box_ie->set_friendly_name(unsqueeze->get_friendly_name());

        // Nodes in copy runtime info function should be in topological order
        std::reverse(ops_to_replace.begin(), ops_to_replace.end());
        ngraph::copy_runtime_info(ops_to_replace, prior_box_ie);
        ngraph::replace_node(unsqueeze, prior_box_ie);
        return true;
    };

    auto m = std::make_shared<ngraph::pattern::Matcher>(unsqueeze, "CPUFusion.ConvertPriorBoxClusteredToPriorBoxClusteredIE");
    this->add_matcher(m, callback, PassProperty::CHANGE_DYNAMIC_STATE);
}
@@ -41,10 +41,6 @@ void ngraph::pass::ConvertStridedSliceToCrop::convert_strided_slice_to_crop() {

auto input_shape = slice->get_input_shape(0);
auto output_shape = slice->get_output_shape(0);
// MKLDNN: "Crop supports only 2d, 4d and 5d blobs."
if (input_shape.size() != 2 && input_shape.size() != 4 && input_shape.size() != 5) {
return false;
}

auto begin = begin_node->cast_vector<int64_t>();
auto end = end_node->cast_vector<int64_t>();
@@ -201,6 +197,12 @@ void ngraph::pass::ConvertStridedSliceToCrop::convert_strided_slice_to_crop() {
new_ops.push_back(data_node);
}

auto data_node_shape = data_node->get_output_shape(0);
// MKLDNN: "Crop supports only 2d, 4d and 5d blobs."
if (data_node_shape.size() != 2 && data_node_shape.size() != 4 && data_node_shape.size() != 5) {
return false;
}

// Crop
data_node = std::make_shared<ngraph::op::CropIE> (data_node, axes, dim, offset);
data_node->set_friendly_name(slice->get_friendly_name());

@@ -42,22 +42,37 @@ void ngraph::pass::ConvertTopKToTopKIE::convert_topk_to_topk_ie() {
topk->get_sort_type());
new_ops.push_back(topk_ie);

Output<Node> element_output;
Output<Node> index_output;
// insert Convert if index element type not equal to i32
if (topk->get_index_element_type() == element::i32) {
// insert Convert if index element type not equal to i32 and output #1 of TopK has consumers
if (topk->get_index_element_type() == element::i32 || topk->get_output_target_inputs(1).size() == 0) {
element_output = topk_ie->output(0);
index_output = topk_ie->output(1);
} else {
topk_ie->set_friendly_name(topk->get_friendly_name());
} else if (topk->get_output_target_inputs(0).size() == 0) {
index_output = std::make_shared<opset1::Convert>(topk_ie->output(1), topk->get_index_element_type());
new_ops.push_back(index_output.get_node_shared_ptr());

// workaround for naming output #1 of TopK
index_output.get_node_shared_ptr()->set_friendly_name(topk->get_friendly_name() + ".1");
} else {
// create fake convert for 0 output, it is a workaround in purpose of correct output names preserving
element_output = std::make_shared<opset1::Convert>(topk_ie->output(0), topk->get_output_element_type(0));
index_output = std::make_shared<opset1::Convert>(topk_ie->output(1), topk->get_index_element_type());
new_ops.push_back(element_output.get_node_shared_ptr());
new_ops.push_back(index_output.get_node_shared_ptr());

// workaround for naming two outputs of TopK
element_output.get_node_shared_ptr()->set_friendly_name(topk->get_friendly_name() + ".0");
index_output.get_node_shared_ptr()->set_friendly_name(topk->get_friendly_name() + ".1");
}

topk_ie->set_friendly_name(topk->get_friendly_name());
ngraph::copy_runtime_info(topk, new_ops);
topk->output(0).replace(topk_ie->output(0));
topk->output(0).replace(element_output);
topk->output(1).replace(index_output);
return true;
};

auto m = std::make_shared<ngraph::pattern::Matcher>(topk, "ConvertTopKToTopKIE");
this->add_matcher(m, callback, PassProperty::CHANGE_DYNAMIC_STATE);
}
}

@@ -20,24 +20,40 @@ void ngraph::pass::ConvertTopK3::convert_topk3() {
if (!topk) {
return false;
}
Output<Node> last;
Output<Node> last0;
Output<Node> last1;
ngraph::NodeVector new_ops;

auto new_topk = std::make_shared<ngraph::opset2::TopK>(topk->input_value(0), topk->input_value(1),
topk->get_axis(), topk->get_mode(), topk->get_sort_type(), element::i32);
new_ops.push_back(new_topk);
// if the output is the i32 then it matches behavior of the v1::TopK otherwise need to insert Convert
if (topk->get_index_element_type() == element::i32) {
last = new_topk->output(1);
// if the output is the i32 or output #1 has no consumers
// then it matches behavior of the v1::TopK otherwise need to insert Convert
if (topk->get_index_element_type() == element::i32 || topk->get_output_target_inputs(1).size() == 0) {
last0 = new_topk->output(0);
last1 = new_topk->output(1);
new_topk->set_friendly_name(topk->get_friendly_name());
} else if (topk->get_output_target_inputs(0).size() == 0) {
last1 = std::make_shared<ngraph::opset2::Convert>(new_topk->output(1), topk->get_index_element_type());
new_ops.push_back(last1.get_node_shared_ptr());

// workaround for naming two outputs of TopK
last1.get_node_shared_ptr()->set_friendly_name(topk->get_friendly_name() + ".1");
} else {
last = std::make_shared<ngraph::opset2::Convert>(new_topk->output(1), topk->get_index_element_type());
new_ops.push_back(last.get_node_shared_ptr());
// create fake convert for 0 output, it is a workaround in purpose of correct output names preserving
last0 = std::make_shared<ngraph::opset2::Convert>(new_topk->output(0), topk->get_output_element_type(0));
last1 = std::make_shared<ngraph::opset2::Convert>(new_topk->output(1), topk->get_index_element_type());
new_ops.push_back(last0.get_node_shared_ptr());
new_ops.push_back(last1.get_node_shared_ptr());

// workaround for naming two outputs of TopK
last0.get_node_shared_ptr()->set_friendly_name(topk->get_friendly_name() + ".0");
last1.get_node_shared_ptr()->set_friendly_name(topk->get_friendly_name() + ".1");
}

new_topk->set_friendly_name(topk->get_friendly_name());
ngraph::copy_runtime_info(topk, new_ops);
topk->output(0).replace(new_topk->output(0));
topk->output(1).replace(last);
topk->output(0).replace(last0);
topk->output(1).replace(last1);
return true;
};


@@ -30,7 +30,7 @@ bool check_block_first(const ngraph::Shape& shape_input, const ngraph::Shape& sh
is_transformation_valid &= (expected_shape == shape_reshape_before);

// x'' = transpose(x', [0, K + 1, K + 2, 1, K + 3, 2, K + 4, 3, ..., K + (K + 1), K])
ngraph::AxisVector expected_permutation = {0, spatial_dims + 1};
ngraph::AxisVector expected_permutation = {0, static_cast<size_t>(spatial_dims + 1)};
for (uint64_t i = 2; i < shape_input.size(); ++i) {
expected_permutation.push_back(spatial_dims + i);
expected_permutation.push_back(i - 1);
@@ -38,7 +38,7 @@ bool check_block_first(const ngraph::Shape& shape_input, const ngraph::Shape& sh
is_transformation_valid &= (expected_permutation == permutation);

// y = reshape(x'', [N, C / (block_size ^ K), D1 * block_size, D2 * block_size, D3 * block_size, ..., DK * block_size])
expected_shape = {shape_input[0], c_dim};
expected_shape = {shape_input[0], static_cast<size_t>(c_dim)};
for (uint64_t i = 2; i < shape_input.size(); ++i)
expected_shape.push_back(shape_input[i] * possible_block_size);
is_transformation_valid &= (expected_shape == shape_reshape_after);
@@ -57,7 +57,7 @@ bool check_depth_first(const ngraph::Shape& shape_input, const ngraph::Shape& sh
uint64_t c_dim = shape_input[1] / std::pow(possible_block_size, spatial_dims);

// x' = reshape(data, [N, C / (block_size ^ K), block_size, block_size, ..., block_size, D1, D2, ..., DK])
ngraph::Shape expected_shape = {shape_input[0], c_dim};
ngraph::Shape expected_shape = {shape_input[0], static_cast<size_t>(c_dim)};
for (uint64_t i = 0; i < spatial_dims; ++i)
expected_shape.push_back(possible_block_size);
for (uint64_t i = 2; i < shape_input.size(); ++i)
@@ -73,7 +73,7 @@ bool check_depth_first(const ngraph::Shape& shape_input, const ngraph::Shape& sh
is_transformation_valid &= (expected_permutation == permutation);

// y = reshape(x'', [N, C / (block_size ^ K), D1 * block_size, D2 * block_size, D3 * block_size, ..., DK * block_size])
expected_shape = {shape_input[0], c_dim};
expected_shape = {shape_input[0], static_cast<size_t>(c_dim)};
for (uint64_t i = 2; i < shape_input.size(); ++i)
expected_shape.push_back(shape_input[i] * possible_block_size);
is_transformation_valid &= (expected_shape == shape_reshape_after);

@@ -26,7 +26,7 @@ namespace vpu {

template <typename T>
Optional<int> parseNumber(const std::string& s) {
T value;
auto value = T{};
if ((std::istringstream(s) >> value >> std::ws).eof()) {
return {value};
}

@@ -39,7 +39,7 @@ void dynamicToStaticShapeBinaryEltwise(std::shared_ptr<ngraph::Node> eltwise) {
const auto diff = std::abs(lhsRank.get_length() - rhsRank.get_length());
if (diff) {
auto & broadcastInput = lhsRank.get_length() < rhsRank.get_length() ? lhsInput : rhsInput;
const auto broadcastConst = ngraph::opset3::Constant::create(broadcastInput.get_element_type(), {static_cast<uint64_t>(diff)}, {1});
const auto broadcastConst = ngraph::opset3::Constant::create(broadcastInput.get_element_type(), {static_cast<size_t>(diff)}, {1});
broadcastInput = std::make_shared<ngraph::opset3::Concat>(ngraph::OutputVector{broadcastConst, broadcastInput}, 0);
}


@@ -392,8 +392,17 @@ inline Stage ModelObj::addNewStage(
// runAllocator
//

VPU_DECLARE_ENUM(EnableShapeAllocation,
    YES,
    NO)

VPU_DECLARE_ENUM(CheckOnlyCMX,
    YES,
    NO)

AllocationResult runAllocator(
    const Model& model,
    bool onlyCheckCMX = false);
    EnableShapeAllocation = EnableShapeAllocation::NO,
    CheckOnlyCMX = CheckOnlyCMX::NO);

}  // namespace vpu

@@ -84,9 +84,11 @@ void BackEnd::getMetaData(
stageMeta.layerName = "<Extra>";
stageMeta.layerType = "<Extra>";
} else {
stageMeta.layerName = stage->origLayer()->name;
stageMeta.layerType = stage->origLayer()->type;
visitedLayers.insert(stage->origLayer());
const auto& origLayer = stage->origLayer();
stageMeta.layerName = origLayer->params.count("originalLayersNames") ? origLayer->params["originalLayersNames"] :
origLayer->name;
stageMeta.layerType = origLayer->type;
visitedLayers.insert(origLayer);
}

return stageMeta;

@@ -184,9 +184,9 @@ CustomLayer::CustomLayer(std::string configDir, const pugi::xml_node& customLayer
         stageOrder.emplace(stageNum, CustomKernel{kernel, _configDir});
     }

-    VPU_THROW_UNLESS(stageOrder.begin()->first == 0,
+    VPU_THROW_UNLESS(!stageOrder.empty() && stageOrder.begin()->first == 0,
         "Error while binding %s custom layer: Stage 0 is not found.", _layerName);
-    VPU_THROW_UNLESS(stageOrder.rbegin()->first == stageOrder.size() - 1,
+    VPU_THROW_UNLESS(!stageOrder.empty() && stageOrder.rbegin()->first == stageOrder.size() - 1,
         "Error while binding %s custom layer: Kernels should have stage id from 0 to N.", _layerName);

     for (auto& stage : stageOrder) {
@@ -430,6 +430,19 @@ bool checkHWRestrictions(
     int kernelSizeX, int kernelSizeY,
     int kernelStride,
     HwOpMode mode, HwOpType type) {
+    // Workaround for HW ops failing on too-wide inputs:
+    // HW operations (primarily Pooling) can use only part of the
+    // available CMX, up to 1014 * 128 bits (i.e. 1014 * 16 bytes).
+    // Provided HwOpMode is 16x16, the HW needs to read up to 16 lines
+    // of the input tensor, so each line must not exceed 1014 bytes,
+    // or 507 pixels if precision is FP16.
+    // More details are available in ticket #-33366.
+    if (inTileWidth > 507) {
+        return false;
+    }
+
     const int chansPerBlock = 1 << static_cast<int>(mode);
     int noOfBlocks = divUp(inTileChannels, chansPerBlock);
@@ -193,10 +193,10 @@ void PassImpl::wrapInLoop(const Model& model, const StageList& subgraph) {
         loopEndOutputs.push_back(originalOutput);
         const auto rule = IterationRule{Dim::N, 0, 1, -1};
         endIterationComponents.emplace(std::make_pair(loopEndOutputs.size() - 1, rule), loopEndInputs.size() - 1);
-    } else {
-        for (const auto& consumerEdge : originalOutput->consumerEdges()) {
-            if (subgraph.has(consumerEdge->consumer()))
-                model->replaceStageInput(consumerEdge, output);
-        }
     }
+    for (const auto& consumerEdge : originalOutput->consumerEdges()) {
+        if (subgraph.has(consumerEdge->consumer()))
+            model->replaceStageInput(consumerEdge, output);
+    }
 }
@@ -458,7 +458,7 @@ void PassImpl::packDataInCmx(const Model& model) {
         return DataLoopStatus::NextChild;
     });

-    auto allocRes = runAllocator(model, true);
+    auto allocRes = runAllocator(model, EnableShapeAllocation::NO, CheckOnlyCMX::YES);
     env.log->trace("Allocation result : %v", allocRes.status);

     if (allocRes.status != AllocationStatus::OK) {
@@ -25,7 +25,7 @@ namespace vpu {
 // runAllocator
 //

-AllocationResult runAllocator(const Model& model, bool onlyCheckCMX) {
+AllocationResult runAllocator(const Model& model, EnableShapeAllocation enableShapeAllocation, CheckOnlyCMX checkOnlyCmx) {
     VPU_PROFILE(runAllocator);

     auto& allocator = model->getAllocator();
@@ -40,7 +40,7 @@ AllocationResult runAllocator(const Model& model, bool onlyCheckCMX) {
     // Allocate Const/Input/Output datas.
     //

-    if (!onlyCheckCMX) {
+    if (checkOnlyCmx == CheckOnlyCMX::NO) {
         auto result = allocator.preprocess(model);
         if (result.status != vpu::AllocationStatus::OK) {
             return result;
@@ -86,14 +86,14 @@ AllocationResult runAllocator(const Model& model, bool onlyCheckCMX) {
     // Allocate stage outputs.
     //

-    const auto allocateStageOutputs = [onlyCheckCMX, &allocator](const Stage& stage) -> AllocationResult {
+    const auto allocateStageOutputs = [checkOnlyCmx, &allocator](const Stage& stage) -> AllocationResult {
         for (const auto& output : stage->outputs()) {
-            if (onlyCheckCMX && output->memReqs() != MemoryType::CMX) {
+            if (checkOnlyCmx == CheckOnlyCMX::YES && output->memReqs() != MemoryType::CMX) {
                 continue;
             }

             if (!allocator.allocateData(output)) {
-                if (output->memReqs() == MemoryType::CMX && !onlyCheckCMX) {
+                if (output->memReqs() == MemoryType::CMX && checkOnlyCmx == CheckOnlyCMX::NO) {
                     if (allocator.removeCMXCandidates(output)) {
                         if (allocator.allocateData(output)) {
                             continue;
@@ -123,7 +123,7 @@ AllocationResult runAllocator(const Model& model, bool onlyCheckCMX) {
     // Allocate stage temporary buffers.
     //

-    if (!onlyCheckCMX) {
+    if (checkOnlyCmx == CheckOnlyCMX::NO) {
         for (const auto& tempBufferEdge : stage->tempBufferEdges()) {
             if (!allocator.allocateData(tempBufferEdge->tempBuffer())) {
                 allocator.setNeedToAllocNonIntermData();
@@ -157,7 +157,7 @@ AllocationResult runAllocator(const Model& model, bool onlyCheckCMX) {
     //

     for (const auto& input : stage->inputs()) {
-        if (onlyCheckCMX && input->memReqs() != MemoryType::CMX) {
+        if (checkOnlyCmx == CheckOnlyCMX::YES && input->memReqs() != MemoryType::CMX) {
             continue;
         }
@@ -168,7 +168,7 @@ AllocationResult runAllocator(const Model& model, bool onlyCheckCMX) {
     // Release stage temporary buffers.
     //

-    if (!onlyCheckCMX) {
+    if (checkOnlyCmx == CheckOnlyCMX::NO) {
         for (const auto& tempBufferEdge : stage->tempBufferEdges()) {
             allocator.freeData(tempBufferEdge->tempBuffer());
         }
@@ -195,7 +195,7 @@ AllocationResult runAllocator(const Model& model, bool onlyCheckCMX) {

     if (const auto& parentEdge = data->parentDataToShapeEdge()) {
         const auto& parent = parentEdge->parent();
-        if (parent->usage() == DataUsage::Intermediate && (!onlyCheckCMX || parent->memReqs() == MemoryType::CMX)) {
+        if (parent->usage() == DataUsage::Intermediate && (checkOnlyCmx == CheckOnlyCMX::NO || parent->memReqs() == MemoryType::CMX)) {
             allocator.freeData(parent);
         }
     }
@@ -205,9 +205,11 @@ AllocationResult runAllocator(const Model& model, bool onlyCheckCMX) {
     // Allocate shape for all datas
     //

-    for (auto data : model->datas()) {
-        const auto shapeLocation = allocator.allocateShape(data);
-        data->setShapeAllocationInfo(shapeLocation);
+    if (enableShapeAllocation == EnableShapeAllocation::YES) {
+        for (auto data : model->datas()) {
+            const auto shapeLocation = allocator.allocateShape(data);
+            data->setShapeAllocationInfo(shapeLocation);
+        }
     }

     return AllocationResult();
@@ -233,7 +235,7 @@ void PassImpl::run(const Model& model) {
     // Allocate all resources
     //

-    auto allocRes = runAllocator(model);
+    auto allocRes = runAllocator(model, EnableShapeAllocation::YES);
     IE_ASSERT(allocRes.status == AllocationStatus::OK);

     //
@@ -160,7 +160,7 @@ void PassImpl::run(const Model& model) {
         model->replaceStageInput(consumerEdge, copyOutput);
     }

-    auto allocRes = runAllocator(model, true);
+    auto allocRes = runAllocator(model, EnableShapeAllocation::NO, CheckOnlyCMX::YES);
     if (allocRes.status != AllocationStatus::OK) {
         model->replaceStageOutput(copyProducer->outputEdge(0), copyInput);
@@ -171,7 +171,7 @@ void PassImpl::run(const Model& model) {
         .childSW(swStage)
         .done();

-    auto allocRes = runAllocator(model, true);
+    auto allocRes = runAllocator(model, EnableShapeAllocation::NO, CheckOnlyCMX::YES);
     if (allocRes.status == AllocationStatus::OK) {
         // TODO: try to merge more than one SW stage?
         break;
@@ -160,7 +160,9 @@ void ParsedConfig::parse(const std::map<std::string, std::string>& config) {
     setOption(_compileConfig.hwExtraSplit, switches, config, VPU_CONFIG_KEY(HW_EXTRA_SPLIT));
     setOption(_compileConfig.injectSwOps, switches, config, VPU_CONFIG_KEY(HW_INJECT_STAGES));
     setOption(_compileConfig.mergeHwPoolToConv, switches, config, VPU_CONFIG_KEY(HW_POOL_CONV_MERGE));
+IE_SUPPRESS_DEPRECATED_START
     setOption(_compileConfig.ignoreIRStatistic, switches, config, VPU_CONFIG_KEY(IGNORE_IR_STATISTIC));
+IE_SUPPRESS_DEPRECATED_END
     setOption(_compileConfig.hwDilation, switches, config, VPU_CONFIG_KEY(HW_DILATION));
     setOption(_compileConfig.forceDeprecatedCnnConversion, switches, config, VPU_CONFIG_KEY(FORCE_DEPRECATED_CNN_CONVERSION));
     setOption(_compileConfig.disableReorder, switches, config, VPU_CONFIG_KEY(DISABLE_REORDER));
@@ -266,6 +266,8 @@ void FrontEnd::parseConcat(
     const ie::CNNLayerPtr& layer,
     const DataVector& inputs,
     const DataVector& outputs) const {
+    VPU_THROW_UNLESS(layer != nullptr, "parseConcat expects valid CNNLayerPtr, actually got nullptr");
+
     VPU_THROW_UNLESS(!inputs.empty(),
         "{} layer with name {} must have no less than 1 input, "
         "actually provided 0 inputs", layer->type, layer->name);
@@ -275,10 +277,8 @@ void FrontEnd::parseConcat(

     auto output = outputs[0];

-    auto concat = std::dynamic_pointer_cast<ie::ConcatLayer>(layer);
-    VPU_THROW_UNLESS(layer != nullptr,
-        "{} layer with name {} must be able to convert to ie::ConcatLayer",
-        layer->type, layer->name);
+    const auto& concat = std::dynamic_pointer_cast<ie::ConcatLayer>(layer);
+    VPU_THROW_UNLESS(concat != nullptr, "{} layer with name {} must be convertible to ie::ConcatLayer", layer->type, layer->name);

     VPU_THROW_UNLESS(concat->_axis < output->desc().numDims(),
         "{} layer with name {} must have axis attribute no greater than number of "
@@ -128,9 +128,8 @@ private:

 void FrontEnd::parseReduce(const Model& model, const ie::CNNLayerPtr& _layer, const DataVector& inputs, const DataVector& outputs) const {
     auto layer = std::dynamic_pointer_cast<ie::ReduceLayer>(_layer);
-    VPU_THROW_UNLESS(layer != nullptr,
-        "Layer {} of type {} is nullptr",
-        layer->name, layer->type);
+    VPU_THROW_UNLESS(layer != nullptr, "parseReduce expects valid ReduceLayer, actually got nullptr");

     VPU_THROW_UNLESS(inputs.size() == 2,
         "Layer {} of type {} expects {} inputs, but provided {}",
         layer->name, layer->type, 2, inputs.size());
@@ -107,6 +107,7 @@ Engine::Engine(std::shared_ptr<IMvnc> mvnc) :

     _pluginName = "MYRIAD";

+IE_SUPPRESS_DEPRECATED_START
     _config = {
         { KEY_VPU_HW_STAGES_OPTIMIZATION, "ON" },
         { KEY_LOG_LEVEL, "LOG_NONE" },
@@ -120,6 +121,7 @@ Engine::Engine(std::shared_ptr<IMvnc> mvnc) :
         { KEY_CONFIG_FILE, "" },
         { KEY_DEVICE_ID, "" },
     };
+IE_SUPPRESS_DEPRECATED_END
 }

 InferenceEngine::ExecutableNetwork Engine::ImportNetwork(
@@ -17,6 +17,7 @@
 #include <ie_core.hpp>
 #include <net_pass.h>

+#include <ngraph/opsets/opset3.hpp>
 #include <ngraph/function.hpp>
 #include <ngraph/variant.hpp>
 #include <ngraph/op/maximum.hpp>
@@ -680,4 +681,25 @@ TEST(CNNNGraphImplTests, TestCheckStats) {
     ASSERT_EQ(nullptr, _stats);
 }

+TEST(CNNNGraphImplTests, CanSetBatchReadValue) {
+    std::shared_ptr<ngraph::Function> ngraph;
+    {
+        auto input = std::make_shared<ngraph::opset3::Parameter>(ngraph::element::f32, ngraph::Shape{1, 2});
+        auto constant = std::make_shared<ngraph::opset3::Constant>(ngraph::element::f32, ngraph::Shape{1, 2},
+                                                                   std::vector<float>{1, 2});
+
+        auto read_value = std::make_shared<ngraph::opset3::ReadValue>(constant, "variable_id");
+        auto add = std::make_shared<ngraph::opset3::Add>(input, read_value);
+        auto result = std::make_shared<ngraph::op::Result>(add);
+
+        ngraph::ParameterVector params = {input};
+        ngraph::ResultVector results = {result};
+
+        ngraph = std::make_shared<ngraph::Function>(results, params);
+    }
+
+    InferenceEngine::details::CNNNetworkNGraphImpl cnnNet(ngraph);
+    auto status = cnnNet.getCNNNetwork()->setBatchSize(4, nullptr);
+    EXPECT_EQ(status, StatusCode::OK);
+}
 IE_SUPPRESS_DEPRECATED_END
@@ -60,9 +60,11 @@ protected:
     /* validates a read network with the reference map of CNN layers */
     void compareWithRef(const InferenceEngine::CNNNetwork& network,
                         const std::vector<InferenceEngine::CNNLayerPtr>& refLayersVec) {
+IE_SUPPRESS_DEPRECATED_START
         ASSERT_NO_THROW(FuncTestUtils::compareLayerByLayer<std::vector<InferenceEngine::CNNLayerPtr>>(
             InferenceEngine::details::CNNNetSortTopologically(network),
             refLayersVec, false));
+IE_SUPPRESS_DEPRECATED_END
     }

     const std::string _modelPath = "NetReader_test.xml";
@@ -30,16 +30,6 @@ TEST_F(NGraphReaderTests, ReadPriorBoxClusteredNetwork) {
|
||||
</port>
|
||||
</output>
|
||||
</layer>
|
||||
<layer id="15" name="in3" type="Parameter" version="opset1">
|
||||
<data element_type="f32" shape="1,2,32400"/>
|
||||
<output>
|
||||
<port id="0" precision="FP32">
|
||||
<dim>1</dim>
|
||||
<dim>2</dim>
|
||||
<dim>32400</dim>
|
||||
</port>
|
||||
</output>
|
||||
</layer>
|
||||
<layer id="2" name="shape_of1" type="ShapeOf" version="opset1">
|
||||
<input>
|
||||
<port id="0" precision="FP32">
|
||||
@@ -182,63 +172,19 @@ TEST_F(NGraphReaderTests, ReadPriorBoxClusteredNetwork) {
|
||||
</port>
|
||||
</output>
|
||||
</layer>
|
||||
<layer name="concat" id="16" type="Concat" version="opset1">
|
||||
<data axis="1"/>
|
||||
<input>
|
||||
<port id="0" precision="FP32">
|
||||
<dim>1</dim>
|
||||
<dim>2</dim>
|
||||
<dim>32400</dim>
|
||||
</port>
|
||||
<port id="1" precision="FP32">
|
||||
<dim>1</dim>
|
||||
<dim>2</dim>
|
||||
<dim>32400</dim>
|
||||
</port>
|
||||
</input>
|
||||
<output>
|
||||
<port id="2" precision="FP32">
|
||||
<dim>1</dim>
|
||||
<dim>4</dim>
|
||||
<dim>32400</dim>
|
||||
</port>
|
||||
</output>
|
||||
</layer>
|
||||
<layer id="10" name="output" type="Result" version="opset1">
|
||||
<input>
|
||||
<port id="0" precision="FP32">
|
||||
<dim>1</dim>
|
||||
<dim>4</dim>
|
||||
<dim>2</dim>
|
||||
<dim>32400</dim>
|
||||
</port>
|
||||
</input>
|
||||
</layer>
|
||||
<layer id="13" name="output_2" type="Result" version="opset1">
|
||||
<input>
|
||||
<port id="0" precision="FP32">
|
||||
<dim>1</dim>
|
||||
<dim>768</dim>
|
||||
<dim>30</dim>
|
||||
<dim>30</dim>
|
||||
</port>
|
||||
</input>
|
||||
</layer>
|
||||
<layer id="14" name="output_3" type="Result" version="opset1">
|
||||
<input>
|
||||
<port id="0" precision="FP32">
|
||||
<dim>1</dim>
|
||||
<dim>3</dim>
|
||||
<dim>512</dim>
|
||||
<dim>512</dim>
|
||||
</port>
|
||||
</input>
|
||||
</layer>
|
||||
</layers>
|
||||
<edges>
|
||||
<edge from-layer="0" from-port="0" to-layer="2" to-port="0"/>
|
||||
<edge from-layer="0" from-port="0" to-layer="13" to-port="0"/>
|
||||
<edge from-layer="1" from-port="0" to-layer="6" to-port="0"/>
|
||||
<edge from-layer="1" from-port="0" to-layer="14" to-port="0"/>
|
||||
<edge from-layer="2" from-port="1" to-layer="5" to-port="0"/>
|
||||
<edge from-layer="6" from-port="1" to-layer="7" to-port="0"/>
|
||||
<edge from-layer="3" from-port="1" to-layer="5" to-port="1"/>
|
||||
@@ -251,90 +197,66 @@ TEST_F(NGraphReaderTests, ReadPriorBoxClusteredNetwork) {
|
||||
<edge from-layer="7" from-port="4" to-layer="8" to-port="1"/>
|
||||
<edge from-layer="8" from-port="2" to-layer="11" to-port="0"/>
|
||||
<edge from-layer="12" from-port="0" to-layer="11" to-port="1"/>
|
||||
-        <edge from-layer="11" from-port="2" to-layer="16" to-port="1"/>
-        <edge from-layer="16" from-port="2" to-layer="10" to-port="0"/>
-        <edge from-layer="15" from-port="0" to-layer="16" to-port="0"/>
+        <edge from-layer="11" from-port="2" to-layer="10" to-port="0"/>
</edges>
|
||||
</net>
|
||||
)V0G0N";
|
||||
std::string modelV5 = R"V0G0N(
|
||||
<net name="Network" version="5" precision="FP32" batch="1">
|
||||
<layers>
|
||||
<layer name="in2" type="Input" precision="FP32" id="0">
|
||||
<data originalLayersNames="in2" />
|
||||
<output>
|
||||
<port id="0" precision="FP32">
|
||||
<dim>1</dim>
|
||||
<dim>3</dim>
|
||||
<dim>512</dim>
|
||||
<dim>512</dim>
|
||||
</port>
|
||||
</output>
|
||||
</layer>
|
||||
<layer name="in1" type="Input" precision="FP32" id="1">
|
||||
<data originalLayersNames="in1" />
|
||||
<output>
|
||||
<port id="0" precision="FP32">
|
||||
<dim>1</dim>
|
||||
<dim>768</dim>
|
||||
<dim>30</dim>
|
||||
<dim>30</dim>
|
||||
</port>
|
||||
</output>
|
||||
</layer>
|
||||
<layer name="in3" type="Input" precision="FP32" id="2">
|
||||
<data originalLayersNames="in3" />
|
||||
<output>
|
||||
<port id="0" precision="FP32">
|
||||
<dim>1</dim>
|
||||
<dim>2</dim>
|
||||
<dim>32400</dim>
|
||||
</port>
|
||||
</output>
|
||||
</layer>
|
||||
<layer name="Constant_49" type="Const" precision="FP32" id="3">
|
||||
<output>
|
||||
<port id="0" precision="FP32">
|
||||
<dim>1</dim>
|
||||
<dim>2</dim>
|
||||
<dim>32400</dim>
|
||||
</port>
|
||||
</output>
|
||||
<blobs>
|
||||
<custom offset="0" size="259200" precision="FP32" />
|
||||
</blobs>
|
||||
</layer>
|
||||
<layer name="concat" type="Concat" precision="FP32" id="4">
|
||||
<data axis="1" originalLayersNames="concat" />
|
||||
<input>
|
||||
<port id="0">
|
||||
<dim>1</dim>
|
||||
<dim>2</dim>
|
||||
<dim>32400</dim>
|
||||
</port>
|
||||
<port id="1">
|
||||
<dim>1</dim>
|
||||
<dim>2</dim>
|
||||
<dim>32400</dim>
|
||||
</port>
|
||||
</input>
|
||||
<output>
|
||||
<port id="2" precision="FP32">
|
||||
<dim>1</dim>
|
||||
<dim>4</dim>
|
||||
<dim>32400</dim>
|
||||
</port>
|
||||
</output>
|
||||
</layer>
|
||||
</layers>
|
||||
<edges>
|
||||
<edge from-layer="2" from-port="0" to-layer="4" to-port="0" />
|
||||
<edge from-layer="3" from-port="0" to-layer="4" to-port="1" />
|
||||
</edges>
|
||||
<layers>
|
||||
<layer id="0" name="in1" type="Input" precision="FP32">
|
||||
<output>
|
||||
<port id="0">
|
||||
<dim>1</dim>
|
||||
<dim>768</dim>
|
||||
<dim>30</dim>
|
||||
<dim>30</dim>
|
||||
</port>
|
||||
</output>
|
||||
</layer>
|
||||
<layer id="1" name="in2" type="Input" precision="FP32">
|
||||
<output>
|
||||
<port id="0">
|
||||
<dim>1</dim>
|
||||
<dim>3</dim>
|
||||
<dim>512</dim>
|
||||
<dim>512</dim>
|
||||
</port>
|
||||
</output>
|
||||
</layer>
|
||||
<layer name="ExpandDims" id="2" type="PriorBoxClustered" precision="FP32">
|
||||
<data clip="0" step_h="16.000000" step_w="16.000000" flip="1" height="44,10,30,19,94,32,61,53,17" offset="0.500000" step="16.000000" variance="0.1,0.1,0.2,0.2" width="86,13,57,39,68,34,142,50,23" originalLayersNames="ExpandDims,prior,shape_of1,shape_of2,ss1,ss2"/>
|
||||
<input>
|
||||
<port id="1">
|
||||
<dim>1</dim>
|
||||
<dim>768</dim>
|
||||
<dim>30</dim>
|
||||
<dim>30</dim>
|
||||
</port>
|
||||
<port id="2">
|
||||
<dim>1</dim>
|
||||
<dim>3</dim>
|
||||
<dim>512</dim>
|
||||
<dim>512</dim>
|
||||
</port>
|
||||
</input>
|
||||
<output>
|
||||
<port id="3">
|
||||
<dim>1</dim>
|
||||
<dim>2</dim>
|
||||
<dim>32400</dim>
|
||||
</port>
|
||||
</output>
|
||||
</layer>
|
||||
</layers>
|
||||
<edges>
|
||||
<edge from-layer="0" from-port="0" to-layer="2" to-port="1"/>
|
||||
<edge from-layer="1" from-port="0" to-layer="2" to-port="2"/>
|
||||
</edges>
|
||||
</net>
|
||||
)V0G0N";
|
||||
|
||||
-    compareIRs(model, modelV5, 259200, [](Blob::Ptr& weights) {
+    compareIRs(model, modelV5, 50, [](Blob::Ptr& weights) {
auto* buffer = weights->buffer().as<int64_t*>();
|
||||
buffer[0] = 2;
|
||||
buffer[1] = 4;
|
||||
@@ -369,16 +291,6 @@ TEST_F(NGraphReaderTests, ReadPriorBoxNetwork) {
|
||||
</port>
|
||||
</output>
|
||||
</layer>
|
||||
<layer id="15" name="in3" type="Parameter" version="opset1">
|
||||
<data element_type="f32" shape="1,2,14400"/>
|
||||
<output>
|
||||
<port id="0" precision="FP32">
|
||||
<dim>1</dim>
|
||||
<dim>2</dim>
|
||||
<dim>14400</dim>
|
||||
</port>
|
||||
</output>
|
||||
</layer>
|
||||
<layer id="2" name="shape_of1" type="ShapeOf" version="opset1">
|
||||
<input>
|
||||
<port id="0" precision="FP32">
|
||||
@@ -520,63 +432,19 @@ TEST_F(NGraphReaderTests, ReadPriorBoxNetwork) {
|
||||
</port>
|
||||
</output>
|
||||
</layer>
|
||||
<layer name="concat" id="16" type="Concat" version="opset1">
|
||||
<data axis="1"/>
|
||||
<input>
|
||||
<port id="0" precision="FP32">
|
||||
<dim>1</dim>
|
||||
<dim>2</dim>
|
||||
<dim>14400</dim>
|
||||
</port>
|
||||
<port id="1" precision="FP32">
|
||||
<dim>1</dim>
|
||||
<dim>2</dim>
|
||||
<dim>14400</dim>
|
||||
</port>
|
||||
</input>
|
||||
<output>
|
||||
<port id="2" precision="FP32">
|
||||
<dim>1</dim>
|
||||
<dim>4</dim>
|
||||
<dim>14400</dim>
|
||||
</port>
|
||||
</output>
|
||||
</layer>
|
||||
<layer id="10" name="output" type="Result" version="opset1">
|
||||
<input>
|
||||
<port id="0" precision="FP32">
|
||||
<dim>1</dim>
|
||||
<dim>4</dim>
|
||||
<dim>2</dim>
|
||||
<dim>14400</dim>
|
||||
</port>
|
||||
</input>
|
||||
</layer>
|
||||
<layer id="13" name="output_2" type="Result" version="opset1">
|
||||
<input>
|
||||
<port id="0" precision="FP32">
|
||||
<dim>1</dim>
|
||||
<dim>768</dim>
|
||||
<dim>30</dim>
|
||||
<dim>30</dim>
|
||||
</port>
|
||||
</input>
|
||||
</layer>
|
||||
<layer id="14" name="output_3" type="Result" version="opset1">
|
||||
<input>
|
||||
<port id="0" precision="FP32">
|
||||
<dim>1</dim>
|
||||
<dim>3</dim>
|
||||
<dim>512</dim>
|
||||
<dim>512</dim>
|
||||
</port>
|
||||
</input>
|
||||
</layer>
|
||||
</layers>
|
||||
<edges>
|
||||
<edge from-layer="0" from-port="0" to-layer="2" to-port="0"/>
|
||||
<edge from-layer="0" from-port="0" to-layer="13" to-port="0"/>
|
||||
<edge from-layer="1" from-port="0" to-layer="6" to-port="0"/>
|
||||
<edge from-layer="1" from-port="0" to-layer="14" to-port="0"/>
|
||||
<edge from-layer="2" from-port="1" to-layer="5" to-port="0"/>
|
||||
<edge from-layer="6" from-port="1" to-layer="7" to-port="0"/>
|
||||
<edge from-layer="3" from-port="1" to-layer="5" to-port="1"/>
|
||||
@@ -589,90 +457,66 @@ TEST_F(NGraphReaderTests, ReadPriorBoxNetwork) {
|
||||
<edge from-layer="7" from-port="4" to-layer="8" to-port="1"/>
|
||||
<edge from-layer="8" from-port="2" to-layer="11" to-port="0"/>
|
||||
<edge from-layer="12" from-port="0" to-layer="11" to-port="1"/>
|
||||
-        <edge from-layer="11" from-port="2" to-layer="16" to-port="0"/>
-        <edge from-layer="15" from-port="0" to-layer="16" to-port="1"/>
-        <edge from-layer="16" from-port="2" to-layer="10" to-port="0"/>
+        <edge from-layer="11" from-port="2" to-layer="10" to-port="0"/>
</edges>
|
||||
</net>
|
||||
)V0G0N";
|
||||
std::string modelV5 = R"V0G0N(
|
||||
<net name="Network" version="5" precision="FP32" batch="1">
|
||||
<layers>
|
||||
<layer name="in2" type="Input" precision="FP32" id="0">
|
||||
<data originalLayersNames="in2" />
|
||||
<output>
|
||||
<port id="0" precision="FP32">
|
||||
<dim>1</dim>
|
||||
<dim>3</dim>
|
||||
<dim>512</dim>
|
||||
<dim>512</dim>
|
||||
</port>
|
||||
</output>
|
||||
</layer>
|
||||
<layer name="in1" type="Input" precision="FP32" id="1">
|
||||
<data originalLayersNames="in1" />
|
||||
<output>
|
||||
<port id="0" precision="FP32">
|
||||
<dim>1</dim>
|
||||
<dim>768</dim>
|
||||
<dim>30</dim>
|
||||
<dim>30</dim>
|
||||
</port>
|
||||
</output>
|
||||
</layer>
|
||||
<layer name="Constant_49" type="Const" precision="FP32" id="2">
|
||||
<output>
|
||||
<port id="0" precision="FP32">
|
||||
<dim>1</dim>
|
||||
<dim>2</dim>
|
||||
<dim>14400</dim>
|
||||
</port>
|
||||
</output>
|
||||
<blobs>
|
||||
<custom offset="0" size="115200" precision="FP32" />
|
||||
</blobs>
|
||||
</layer>
|
||||
<layer name="in3" type="Input" precision="FP32" id="3">
|
||||
<data originalLayersNames="in3" />
|
||||
<output>
|
||||
<port id="0" precision="FP32">
|
||||
<dim>1</dim>
|
||||
<dim>2</dim>
|
||||
<dim>14400</dim>
|
||||
</port>
|
||||
</output>
|
||||
</layer>
|
||||
<layer name="concat" type="Concat" precision="FP32" id="4">
|
||||
<data axis="1" originalLayersNames="concat" />
|
||||
<input>
|
||||
<port id="0">
|
||||
<dim>1</dim>
|
||||
<dim>2</dim>
|
||||
<dim>14400</dim>
|
||||
</port>
|
||||
<port id="1">
|
||||
<dim>1</dim>
|
||||
<dim>2</dim>
|
||||
<dim>14400</dim>
|
||||
</port>
|
||||
</input>
|
||||
<output>
|
||||
<port id="2" precision="FP32">
|
||||
<dim>1</dim>
|
||||
<dim>4</dim>
|
||||
<dim>14400</dim>
|
||||
</port>
|
||||
</output>
|
||||
</layer>
|
||||
</layers>
|
||||
<edges>
|
||||
<edge from-layer="2" from-port="0" to-layer="4" to-port="0" />
|
||||
<edge from-layer="3" from-port="0" to-layer="4" to-port="1" />
|
||||
</edges>
|
||||
<layers>
|
||||
<layer id="0" name="in1" type="Input" precision="FP32">
|
||||
<output>
|
||||
<port id="0">
|
||||
<dim>1</dim>
|
||||
<dim>768</dim>
|
||||
<dim>30</dim>
|
||||
<dim>30</dim>
|
||||
</port>
|
||||
</output>
|
||||
</layer>
|
||||
<layer id="1" name="in2" type="Input" precision="FP32">
|
||||
<output>
|
||||
<port id="0">
|
||||
<dim>1</dim>
|
||||
<dim>3</dim>
|
||||
<dim>512</dim>
|
||||
<dim>512</dim>
|
||||
</port>
|
||||
</output>
|
||||
</layer>
|
||||
<layer name="ExpandDims" id="2" type="PriorBox" precision="FP32">
|
||||
<data density="" fixed_ratio="" fixed_size="" aspect_ratio="2,0.5" clip="0" flip="0" img_h="0" img_size="0" img_w="0" max_size="" min_size="51.200001,72.407555" offset="0.500000" scale_all_sizes="0" step="17.066666666666666" step_h="0" step_w="0" variance="0.1,0.1,0.2,0.2" originalLayersNames="ExpandDims,prior,shape_of1,shape_of2,ss1,ss2"/>
|
||||
<input>
|
||||
<port id="1">
|
||||
<dim>1</dim>
|
||||
<dim>768</dim>
|
||||
<dim>30</dim>
|
||||
<dim>30</dim>
|
||||
</port>
|
||||
<port id="2">
|
||||
<dim>1</dim>
|
||||
<dim>3</dim>
|
||||
<dim>512</dim>
|
||||
<dim>512</dim>
|
||||
</port>
|
||||
</input>
|
||||
<output>
|
||||
<port id="3">
|
||||
<dim>1</dim>
|
||||
<dim>2</dim>
|
||||
<dim>14400</dim>
|
||||
</port>
|
||||
</output>
|
||||
</layer>
|
||||
</layers>
|
||||
<edges>
|
||||
<edge from-layer="0" from-port="0" to-layer="2" to-port="1"/>
|
||||
<edge from-layer="1" from-port="0" to-layer="2" to-port="2"/>
|
||||
</edges>
|
||||
</net>
|
||||
)V0G0N";
|
||||
|
||||
-    compareIRs(model, modelV5, 115200, [](Blob::Ptr& weights) {
+    compareIRs(model, modelV5, 40, [](Blob::Ptr& weights) {
auto* buffer = weights->buffer().as<int64_t*>();
|
||||
buffer[0] = 2;
|
||||
buffer[1] = 4;
|
||||
|
||||
@@ -3,6 +3,7 @@
|
||||
//
|
||||
|
||||
#include <string>
|
||||
#include <generic_ie.hpp>
|
||||
#include "ngraph_reader_tests.hpp"
|
||||
TEST_F(NGraphReaderTests, ReadProposalNetwork) {
|
||||
std::string model_v10 = R"V0G0N(
|
||||
@@ -305,3 +306,100 @@ TEST_F(NGraphReaderTests, ReadProposalNetwork_2) {
|
||||
|
||||
compareIRs(model_v10, model_v6, 32);
|
||||
}
|
||||
|
||||
TEST_F(NGraphReaderTests, ReadExtensionProposalNetwork) {
|
||||
std::string model_v10 = R"V0G0N(
|
||||
<net name="Network" version="10">
|
||||
<layers>
|
||||
<layer id="0" name="in1" type="Parameter" version="opset1">
|
||||
<data element_type="f32" shape="1,12,34,62"/>
|
||||
<output>
|
||||
<port id="0" precision="FP32">
|
||||
<dim>1</dim>
|
||||
<dim>12</dim>
|
||||
<dim>34</dim>
|
||||
<dim>62</dim>
|
||||
</port>
|
||||
</output>
|
||||
</layer>
|
||||
<layer id="1" name="in2" type="Parameter" version="opset1">
|
||||
<data element_type="f32" shape="1,24,34,62"/>
|
||||
<output>
|
||||
<port id="0" precision="FP32">
|
||||
<dim>1</dim>
|
||||
<dim>24</dim>
|
||||
<dim>34</dim>
|
||||
<dim>62</dim>
|
||||
</port>
|
||||
</output>
|
||||
</layer>
|
||||
<layer id="2" name="in3" type="Const" version="opset1">
|
||||
<data offset="0" size="24"/>
|
||||
<output>
|
||||
<port id="0" precision="I64">
|
||||
<dim>3</dim>
|
||||
</port>
|
||||
</output>
|
||||
</layer>
|
||||
<layer name="proposal" type="Proposal" precision="FP32" id="3" version="extension">
|
||||
<data feat_stride="16" base_size="16" min_size="16" ratio="2.669000" scale="4.000000,6.000000,9.000000,16.000000,24.000000,32.000000" pre_nms_topn="6000" post_nms_topn="200" nms_thresh="0.600000"/>
|
||||
<input>
|
||||
<port id="1">
|
||||
<dim>1</dim>
|
||||
<dim>12</dim>
|
||||
<dim>34</dim>
|
||||
<dim>62</dim>
|
||||
</port>
|
||||
<port id="2">
|
||||
<dim>1</dim>
|
||||
<dim>24</dim>
|
||||
<dim>34</dim>
|
||||
<dim>62</dim>
|
||||
</port>
|
||||
<port id="3">
|
||||
<dim>3</dim>
|
||||
</port>
|
||||
</input>
|
||||
<output>
|
||||
<port id="3" precision="FP32">
|
||||
<dim>1000</dim>
|
||||
<dim>5</dim>
|
||||
</port>
|
||||
<port id="4" precision="FP32">
|
||||
<dim>1000</dim>
|
||||
</port>
|
||||
</output>
|
||||
</layer>
|
||||
<layer id="4" name="output" type="Result" version="opset1">
|
||||
<input>
|
||||
<port id="0" precision="FP32">
|
||||
<dim>200</dim>
|
||||
<dim>5</dim>
|
||||
</port>
|
||||
</input>
|
||||
</layer>
|
||||
</layers>
|
||||
<edges>
|
||||
<edge from-layer="0" from-port="0" to-layer="3" to-port="1"/>
|
||||
<edge from-layer="1" from-port="0" to-layer="3" to-port="2"/>
|
||||
<edge from-layer="2" from-port="0" to-layer="3" to-port="3"/>
|
||||
<edge from-layer="3" from-port="4" to-layer="4" to-port="0"/>
|
||||
</edges>
|
||||
</net>
|
||||
)V0G0N";
|
||||
|
||||
Core ie;
|
||||
Blob::Ptr weights;
|
||||
|
||||
weights = make_shared_blob<uint8_t>(TensorDesc(Precision::U8, {24}, Layout::C));
|
||||
weights->allocate();
|
||||
CommonTestUtils::fill_data(weights->buffer().as<float *>(), weights->size() / sizeof(float));
|
||||
|
||||
auto func = ie.ReadNetwork(model_v10, weights).getFunction();
|
||||
for (auto op : func->get_ordered_ops()) {
|
||||
if (op->get_friendly_name() == "proposal" && op->get_type_info() == ngraph::op::GenericIE::type_info) {
|
||||
return;
|
||||
}
|
||||
}
|
||||
FAIL() << "Custom proposal layer is not a Generic operation!";
|
||||
}
|
||||
|
||||
@@ -1,218 +0,0 @@
// Copyright (C) 2020 Intel Corporation
// SPDX-License-Identifier: Apache-2.0
//

#include <gtest/gtest.h>

#include "common_test_utils/test_common.hpp"
#include <string>
#include <memory>

#include <ngraph/opsets/opset3.hpp>
#include <ngraph/function.hpp>
#include <transformations/init_node_info.hpp>
#include <ngraph/pass/constant_folding.hpp>
#include <ngraph/ops.hpp>
#include "ngraph_test_utils.hpp"

using namespace testing;

TEST(TransformationTests, ConstFoldingPriorBox) {
    std::shared_ptr<ngraph::Function> f(nullptr), f_ref(nullptr);

    {
        auto in = std::make_shared<ngraph::opset3::Parameter>(ngraph::element::i64, ngraph::Shape{2});
        ngraph::op::PriorBoxAttrs attrs;
        attrs.min_size = {256.0f};
        attrs.max_size = {315.0f};
        attrs.aspect_ratio = {2.0f};
        attrs.flip = true;
        attrs.scale_all_sizes = true;

        auto layer_shape = ngraph::opset3::Constant::create<int64_t>(ngraph::element::i64, ngraph::Shape{2}, {1, 1});
        auto image_shape = ngraph::opset3::Constant::create<int64_t>(ngraph::element::i64, ngraph::Shape{2}, {300, 300});
        auto pb = std::make_shared<ngraph::opset3::PriorBox>(layer_shape, image_shape, attrs);
        auto res = std::make_shared<ngraph::opset3::Result>(pb);
        f = std::make_shared<ngraph::Function>(ngraph::NodeVector{res}, ngraph::ParameterVector{in});
        ngraph::pass::InitNodeInfo().run_on_function(f);
        ngraph::pass::ConstantFolding().run_on_function(f);
        ASSERT_NO_THROW(check_rt_info(f));
    }

    {
        auto layer_shape = std::make_shared<ngraph::opset3::Parameter>(ngraph::element::i64, ngraph::Shape{2});
        auto const_prior_box = ngraph::opset3::Constant::create<float>(ngraph::element::f32, ngraph::Shape{2, 16},
            { -0.426667, -0.426667, 0.426667, 0.426667, -0.473286, -0.473286, 0.473286, 0.473286,
              -0.603398, -0.301699, 0.603398, 0.301699, -0.301699, -0.603398, 0.301699, 0.603398,
              0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1,
            });
        auto res = std::make_shared<ngraph::opset3::Result>(const_prior_box);
        f_ref = std::make_shared<ngraph::Function>(ngraph::NodeVector{res}, ngraph::ParameterVector{layer_shape});
    }

    auto res = compare_functions(f, f_ref);
    ASSERT_TRUE(res.first) << res.second;

    auto fused = std::dynamic_pointer_cast<ngraph::opset3::Constant>(f->get_result()->input_value(0).get_node_shared_ptr());
    auto ref = std::dynamic_pointer_cast<ngraph::opset3::Constant>(f->get_result()->input_value(0).get_node_shared_ptr());

    EXPECT_TRUE(fused != nullptr);
    EXPECT_TRUE(ref != nullptr);
    EXPECT_TRUE(fused->get_vector<float>() == ref->get_vector<float>());
}

TEST(TransformationTests, ConstFoldingPriorBoxClustered) {
    std::shared_ptr<ngraph::Function> f(nullptr), f_ref(nullptr);

    {
        auto in = std::make_shared<ngraph::opset3::Parameter>(ngraph::element::i64, ngraph::Shape{2});
        ngraph::op::PriorBoxClusteredAttrs attrs;
        attrs.widths = {4.0f, 2.0f, 3.2f};
        attrs.heights = {1.0f, 2.0f, 1.1f};

        auto layer_shape = ngraph::opset3::Constant::create<int64_t>(ngraph::element::i64, ngraph::Shape{2}, {2, 2});
        auto image_shape = ngraph::opset3::Constant::create<int64_t>(ngraph::element::i64, ngraph::Shape{2}, {300, 300});
        auto pb = std::make_shared<ngraph::opset3::PriorBoxClustered>(layer_shape, image_shape, attrs);
        auto res = std::make_shared<ngraph::opset3::Result>(pb);
        f = std::make_shared<ngraph::Function>(ngraph::NodeVector{res}, ngraph::ParameterVector{in});
        ngraph::pass::InitNodeInfo().run_on_function(f);
        ngraph::pass::ConstantFolding().run_on_function(f);
        ASSERT_NO_THROW(check_rt_info(f));
    }

    {
        auto layer_shape = std::make_shared<ngraph::opset3::Parameter>(ngraph::element::i64, ngraph::Shape{2});
        auto const_prior_box = ngraph::opset3::Constant::create<float>(ngraph::element::f32, ngraph::Shape{2, 48},
            { -0.00666667, -0.00166667, 0.00666667, 0.00166667, -0.00333333, -0.00333333, 0.00333333,
              0.00333333, -0.00533333, -0.00183333, 0.00533333, 0.00183333, -0.00333333, -0.00166667,
              0.01, 0.00166667, 0, -0.00333333, 0.00666667, 0.00333333, -0.002, -0.00183333, 0.00866667,
              0.00183333, -0.00666667, 0.00166667, 0.00666667, 0.005, -0.00333333, 0, 0.00333333,
              0.00666667, -0.00533333, 0.0015, 0.00533333, 0.00516667, -0.00333333, 0.00166667, 0.01,
              0.005, 0, 0, 0.00666667, 0.00666667, -0.002, 0.0015, 0.00866667, 0.00516667, 0.1, 0.1,
              0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
              0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
            });
        auto res = std::make_shared<ngraph::opset3::Result>(const_prior_box);
        f_ref = std::make_shared<ngraph::Function>(ngraph::NodeVector{res}, ngraph::ParameterVector{layer_shape});
    }

    auto res = compare_functions(f, f_ref);
    ASSERT_TRUE(res.first) << res.second;

    auto fused = std::dynamic_pointer_cast<ngraph::opset3::Constant>(f->get_result()->input_value(0).get_node_shared_ptr());
    auto ref = std::dynamic_pointer_cast<ngraph::opset3::Constant>(f->get_result()->input_value(0).get_node_shared_ptr());

    EXPECT_TRUE(fused != nullptr);
    EXPECT_TRUE(ref != nullptr);
    EXPECT_TRUE(fused->get_vector<float>() == ref->get_vector<float>());
}

TEST(TransformationTests, ConstFoldingPriorBoxSubgraph) {
    std::shared_ptr<ngraph::Function> f(nullptr), f_ref(nullptr);

    {
        auto in = std::make_shared<ngraph::opset3::Parameter>(ngraph::element::i64, ngraph::Shape{2, 3, 1, 1});
        auto in_2 = std::make_shared<ngraph::opset3::Parameter>(ngraph::element::i64, ngraph::Shape{2, 3, 300, 300});
        ngraph::op::PriorBoxAttrs attrs;
        attrs.min_size = {256.0f};
        attrs.max_size = {315.0f};
        attrs.aspect_ratio = {2.0f};
        attrs.flip = true;
        attrs.scale_all_sizes = true;

        auto layer_shape = std::make_shared<ngraph::opset3::ShapeOf>(in);
        auto image_shape = std::make_shared<ngraph::opset3::ShapeOf>(in_2);

        auto begin = ngraph::opset3::Constant::create(ngraph::element::i64, ngraph::Shape{1}, {2});
        auto end = ngraph::opset3::Constant::create(ngraph::element::i64, ngraph::Shape{1}, {4});
        auto stride = ngraph::opset3::Constant::create(ngraph::element::i64, ngraph::Shape{1}, {1});
        auto ss_data = std::make_shared<ngraph::opset3::StridedSlice>(layer_shape, begin, end, stride,
            std::vector<int64_t>{0}, std::vector<int64_t>{0});

        auto ss_image = std::make_shared<ngraph::opset3::StridedSlice>(image_shape, begin, end, stride,
            std::vector<int64_t>{0}, std::vector<int64_t>{0});
        auto pb = std::make_shared<ngraph::opset3::PriorBox>(ss_data, ss_image, attrs);
        auto res = std::make_shared<ngraph::opset3::Result>(pb);
        f = std::make_shared<ngraph::Function>(ngraph::NodeVector{res}, ngraph::ParameterVector{in, in_2});
        ngraph::pass::InitNodeInfo().run_on_function(f);
        ngraph::pass::ConstantFolding().run_on_function(f);
        ASSERT_NO_THROW(check_rt_info(f));
    }

    {
        auto layer_shape = std::make_shared<ngraph::opset3::Parameter>(ngraph::element::i64, ngraph::Shape{2});
        auto const_prior_box = ngraph::opset3::Constant::create<float>(ngraph::element::f32, ngraph::Shape{2, 16},
            { -0.426667, -0.426667, 0.426667, 0.426667, -0.473286, -0.473286, 0.473286, 0.473286,
              -0.603398, -0.301699, 0.603398, 0.301699, -0.301699, -0.603398, 0.301699, 0.603398,
              0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1
            });
        auto res = std::make_shared<ngraph::opset3::Result>(const_prior_box);
        f_ref = std::make_shared<ngraph::Function>(ngraph::NodeVector{res}, ngraph::ParameterVector{layer_shape});
    }

    auto res = compare_functions(f, f_ref);
    ASSERT_TRUE(res.first) << res.second;

    auto fused = std::dynamic_pointer_cast<ngraph::opset3::Constant>(f->get_result()->input_value(0).get_node_shared_ptr());
    auto ref = std::dynamic_pointer_cast<ngraph::opset3::Constant>(f->get_result()->input_value(0).get_node_shared_ptr());

    EXPECT_TRUE(fused != nullptr);
    EXPECT_TRUE(ref != nullptr);
    EXPECT_TRUE(fused->get_vector<float>() == ref->get_vector<float>());
}

TEST(TransformationTests, ConstFoldingPriorBoxClusteredSubgraph) {
    std::shared_ptr<ngraph::Function> f(nullptr), f_ref(nullptr);

    {
        auto in = std::make_shared<ngraph::opset3::Parameter>(ngraph::element::i64, ngraph::Shape{2, 3, 2, 2});
        auto in_2 = std::make_shared<ngraph::opset3::Parameter>(ngraph::element::i64, ngraph::Shape{2, 3, 300, 300});
        ngraph::op::PriorBoxClusteredAttrs attrs;
        attrs.widths = {4.0f, 2.0f, 3.2f};
        attrs.heights = {1.0f, 2.0f, 1.1f};

        auto layer_shape = std::make_shared<ngraph::opset3::ShapeOf>(in);
        auto image_shape = std::make_shared<ngraph::opset3::ShapeOf>(in_2);

        auto begin = ngraph::opset3::Constant::create(ngraph::element::i64, ngraph::Shape{1}, {2});
        auto end = ngraph::opset3::Constant::create(ngraph::element::i64, ngraph::Shape{1}, {4});
        auto stride = ngraph::opset3::Constant::create(ngraph::element::i64, ngraph::Shape{1}, {1});
        auto ss_data = std::make_shared<ngraph::opset3::StridedSlice>(layer_shape, begin, end, stride,
            std::vector<int64_t>{0}, std::vector<int64_t>{0});

        auto ss_image = std::make_shared<ngraph::opset3::StridedSlice>(image_shape, begin, end, stride,
            std::vector<int64_t>{0}, std::vector<int64_t>{0});
        auto pb = std::make_shared<ngraph::opset3::PriorBoxClustered>(ss_data, ss_image, attrs);
        auto res = std::make_shared<ngraph::opset3::Result>(pb);
        f = std::make_shared<ngraph::Function>(ngraph::NodeVector{res}, ngraph::ParameterVector{in, in_2});
        ngraph::pass::InitNodeInfo().run_on_function(f);
        ngraph::pass::ConstantFolding().run_on_function(f);
        ASSERT_NO_THROW(check_rt_info(f));
    }

    {
        auto layer_shape = std::make_shared<ngraph::opset3::Parameter>(ngraph::element::i64, ngraph::Shape{2});
        auto const_prior_box = ngraph::opset3::Constant::create<float>(ngraph::element::f32, ngraph::Shape{2, 48},
            { -0.00666667, -0.00166667, 0.00666667, 0.00166667, -0.00333333, -0.00333333, 0.00333333,
              0.00333333, -0.00533333, -0.00183333, 0.00533333, 0.00183333, -0.00333333, -0.00166667,
              0.01, 0.00166667, 0, -0.00333333, 0.00666667, 0.00333333, -0.002, -0.00183333, 0.00866667,
              0.00183333, -0.00666667, 0.00166667, 0.00666667, 0.005, -0.00333333, 0, 0.00333333,
              0.00666667, -0.00533333, 0.0015, 0.00533333, 0.00516667, -0.00333333, 0.00166667, 0.01,
              0.005, 0, 0, 0.00666667, 0.00666667, -0.002, 0.0015, 0.00866667, 0.00516667, 0.1, 0.1,
              0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
              0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
            });
        auto res = std::make_shared<ngraph::opset3::Result>(const_prior_box);
        f_ref = std::make_shared<ngraph::Function>(ngraph::NodeVector{res}, ngraph::ParameterVector{layer_shape});
    }

    auto res = compare_functions(f, f_ref);
    ASSERT_TRUE(res.first) << res.second;

    auto fused = std::dynamic_pointer_cast<ngraph::opset3::Constant>(f->get_result()->input_value(0).get_node_shared_ptr());
    auto ref = std::dynamic_pointer_cast<ngraph::opset3::Constant>(f->get_result()->input_value(0).get_node_shared_ptr());

    EXPECT_TRUE(fused != nullptr);
    EXPECT_TRUE(ref != nullptr);
    EXPECT_TRUE(fused->get_vector<float>() == ref->get_vector<float>());
}

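The *Subgraph tests above fold completely because every input shape is static: `ShapeOf` of a statically shaped parameter is a compile-time constant, `StridedSlice` of a constant is a constant, and so `PriorBox`/`PriorBoxClustered` over those shapes folds too. A minimal sketch of that shape arithmetic (plain Python, illustrative only, not OpenVINO API):

```python
# Illustrative: why ConstFoldingPriorBoxSubgraph folds to a single Constant.
# With static shapes, ShapeOf and StridedSlice are evaluable at compile time.

def shape_of(static_shape):
    # ShapeOf of a statically shaped tensor is known without running the graph.
    return tuple(static_shape)

def strided_slice(shape, begin, end):
    # The tests slice dims [2, 4) out of the 4-D shape (stride 1, no masks).
    return shape[begin:end]

layer_hw = strided_slice(shape_of((2, 3, 1, 1)), 2, 4)
image_hw = strided_slice(shape_of((2, 3, 300, 300)), 2, 4)

# These match the {1, 1} / {300, 300} Constants fed to PriorBox in the
# non-subgraph test, which is why both tests fold to the same values.
assert layer_hw == (1, 1)
assert image_hw == (300, 300)
```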
@@ -0,0 +1,73 @@
// Copyright (C) 2020 Intel Corporation
// SPDX-License-Identifier: Apache-2.0
//

#include <gtest/gtest.h>

#include <string>
#include <memory>
#include <queue>

#include <ngraph/function.hpp>
#include <ngraph/opsets/opset1.hpp>
#include <transformations/convert_divide.hpp>
#include <transformations/init_node_info.hpp>
#include <transformations/utils/utils.hpp>

#include "ngraph_test_utils.hpp"

using namespace testing;

TEST(TransformationTests, ConvertDivide) {
    std::shared_ptr<ngraph::Function> f(nullptr), f_ref(nullptr);
    {
        auto data = std::make_shared<ngraph::opset1::Parameter>(ngraph::element::f32, ngraph::Shape{3, 1, 2});
        auto divide_constant = ngraph::opset1::Constant::create(ngraph::element::f32, ngraph::Shape{1}, {1.5});
        auto divide = std::make_shared<ngraph::opset1::Divide>(data, divide_constant);

        f = std::make_shared<ngraph::Function>(ngraph::NodeVector{divide}, ngraph::ParameterVector{data});

        ngraph::pass::InitNodeInfo().run_on_function(f);
        ngraph::pass::ConvertDivide().run_on_function(f);
        ASSERT_NO_THROW(check_rt_info(f));
    }

    {
        auto data = std::make_shared<ngraph::opset1::Parameter>(ngraph::element::f32, ngraph::Shape{3, 1, 2});
        auto divide_constant = ngraph::opset1::Constant::create(ngraph::element::f32, ngraph::Shape{1}, {1.5});
        auto pow = std::make_shared<ngraph::opset1::Power>(divide_constant,
            ngraph::opset1::Constant::create(ngraph::element::f32, ngraph::Shape{1}, {-1}));
        auto mul = std::make_shared<ngraph::opset1::Multiply>(data, pow);

        f_ref = std::make_shared<ngraph::Function>(ngraph::NodeVector{mul}, ngraph::ParameterVector{data});
    }

    auto res = compare_functions(f, f_ref);
    ASSERT_TRUE(res.first) << res.second;
}

TEST(TransformationTests, ConvertDivideNegative) {
    std::shared_ptr<ngraph::Function> f(nullptr), f_ref(nullptr);
    {
        auto data = std::make_shared<ngraph::opset1::Parameter>(ngraph::element::i32, ngraph::Shape{3, 1, 2});
        auto divide_constant = ngraph::opset1::Constant::create(ngraph::element::i32, ngraph::Shape{1}, {2});
        auto divide = std::make_shared<ngraph::opset1::Divide>(data, divide_constant);

        f = std::make_shared<ngraph::Function>(ngraph::NodeVector{divide}, ngraph::ParameterVector{data});

        ngraph::pass::InitNodeInfo().run_on_function(f);
        ngraph::pass::ConvertDivide().run_on_function(f);
        ASSERT_NO_THROW(check_rt_info(f));
    }

    {
        auto data = std::make_shared<ngraph::opset1::Parameter>(ngraph::element::i32, ngraph::Shape{3, 1, 2});
        auto divide_constant = ngraph::opset1::Constant::create(ngraph::element::i32, ngraph::Shape{1}, {2});
        auto divide = std::make_shared<ngraph::opset1::Divide>(data, divide_constant);

        f_ref = std::make_shared<ngraph::Function>(ngraph::NodeVector{divide}, ngraph::ParameterVector{data});
    }

    auto res = compare_functions(f, f_ref);
    ASSERT_TRUE(res.first) << res.second;
}

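The pass exercised above rewrites `Divide(x, c)` into `Multiply(x, Power(c, -1))`, and the negative test shows it is skipped for `i32` inputs, where that rewrite would change the result. A minimal arithmetic sketch of why (plain Python, illustrative only, not part of the test suite):

```python
# Illustrative sketch of the Divide -> Multiply(x, Power(c, -1)) rewrite
# checked by the ConvertDivide tests. Function names are ad hoc.

def divide_direct(x, c):
    return x / c

def divide_rewritten(x, c):
    return x * c ** -1.0  # Multiply(x, Power(c, -1))

# For float (f32-like) data the two forms agree up to rounding error:
assert abs(divide_direct(3.0, 1.5) - divide_rewritten(3.0, 1.5)) < 1e-12

# For integer (i32-like) data they do not: the reciprocal is fractional,
# so the rewrite would replace exact integer division with float math --
# hence ConvertDivideNegative expects the i32 Divide to stay in place.
assert 7 // 2 == 3            # integer division result
assert 7 * 2 ** -1.0 == 3.5   # power-based rewrite gives a different value
```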
@@ -177,6 +177,56 @@ TEST(TransformationTests, ConvertStridedSliceToCropNegative) {
        f_ref = std::make_shared<ngraph::Function>(ngraph::NodeVector{sslice}, ngraph::ParameterVector{input});
    }

    auto res = compare_functions(f, f_ref);
    ASSERT_TRUE(res.first) << res.second;
}

// In this test the Crop would get a 3D input, which is not supported, so the transformation will not be applied
TEST(TransformationTests, ConvertStridedSliceToCropNegative2) {
    std::shared_ptr<ngraph::Function> f(nullptr), f_ref(nullptr);
    {
        auto input = std::make_shared<ngraph::opset1::Parameter>(ngraph::element::f32, ngraph::Shape{128, 1});
        auto slice_begin = ngraph::opset1::Constant::create(ngraph::element::i64, ngraph::Shape{3}, {0, 0, 0});
        auto slice_end = ngraph::opset1::Constant::create(ngraph::element::i64, ngraph::Shape{3}, {0, 0, 0});
        auto slice_stride = ngraph::opset1::Constant::create(ngraph::element::i64, ngraph::Shape{3}, {1, 1, 1});

        std::vector<int64_t> begin_mask = {0, 1, 1};
        std::vector<int64_t> end_mask = {0, 1, 1};
        std::vector<int64_t> new_axis_mask = {1, 0, 0};
        std::vector<int64_t> shrink_axis_mask = {0, 0, 0};
        std::vector<int64_t> ellipsis_mask = {0, 0, 0};

        auto sslice = std::make_shared<ngraph::opset1::StridedSlice>(input, slice_begin, slice_end, slice_stride,
            begin_mask, end_mask,
            new_axis_mask, shrink_axis_mask, ellipsis_mask);
        sslice->set_friendly_name("strided_slice");

        f = std::make_shared<ngraph::Function>(ngraph::NodeVector{sslice}, ngraph::ParameterVector{input});
        ngraph::pass::InitNodeInfo().run_on_function(f);
        ngraph::pass::ConvertStridedSliceToCrop().run_on_function(f);
        ASSERT_NO_THROW(check_rt_info(f));
    }

    {
        auto input = std::make_shared<ngraph::opset1::Parameter>(ngraph::element::f32, ngraph::Shape{128, 1});
        auto slice_begin = ngraph::opset1::Constant::create(ngraph::element::i64, ngraph::Shape{3}, {0, 0, 0});
        auto slice_end = ngraph::opset1::Constant::create(ngraph::element::i64, ngraph::Shape{3}, {0, 0, 0});
        auto slice_stride = ngraph::opset1::Constant::create(ngraph::element::i64, ngraph::Shape{3}, {1, 1, 1});

        std::vector<int64_t> begin_mask = {0, 1, 1};
        std::vector<int64_t> end_mask = {0, 1, 1};
        std::vector<int64_t> new_axis_mask = {1, 0, 0};
        std::vector<int64_t> shrink_axis_mask = {0, 0, 0};
        std::vector<int64_t> ellipsis_mask = {0, 0, 0};

        auto sslice = std::make_shared<ngraph::opset1::StridedSlice>(input, slice_begin, slice_end, slice_stride,
            begin_mask, end_mask,
            new_axis_mask, shrink_axis_mask, ellipsis_mask);
        sslice->set_friendly_name("strided_slice");

        f_ref = std::make_shared<ngraph::Function>(ngraph::NodeVector{sslice}, ngraph::ParameterVector{input});
    }

    auto res = compare_functions(f, f_ref);
    ASSERT_TRUE(res.first) << res.second;
}

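The Negative2 case hinges on `new_axis_mask = {1, 0, 0}` raising the rank of the 2D `{128, 1}` input to 3, which the legacy Crop layer does not support, so the pass must leave the StridedSlice untouched. A small stdlib-only sketch of that shape effect (illustrative only, `apply_new_axis_mask` is a made-up helper, not an OpenVINO function):

```python
# Illustrative: new_axis_mask inserts a size-1 axis at each masked position,
# and the remaining input dims fill the unmasked positions in order.
def apply_new_axis_mask(shape, mask):
    out = []
    dims = iter(shape)
    for m in mask:
        out.append(1 if m else next(dims))
    return tuple(out)

# The {128, 1} input with mask {1, 0, 0} becomes a 3-D {1, 128, 1} tensor,
# a rank the Crop-based rewrite cannot represent.
assert apply_new_axis_mask((128, 1), (1, 0, 0)) == (1, 128, 1)
```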
@@ -157,5 +157,6 @@ TEST(TransformationTests, ConvertTopK3I64Output1) {
    ASSERT_TRUE(res.first) << res.second;

    auto result_node_of_converted_f = f->get_output_op(0);
    auto topk_node = result_node_of_converted_f->input(0).get_source_output().get_node_shared_ptr();
    auto convert_node = result_node_of_converted_f->input(0).get_source_output().get_node_shared_ptr();
    ASSERT_TRUE(convert_node->get_friendly_name() == "topk.1") << "Transformation ConvertTopK3 should keep output names.\n";
}

@@ -11,14 +11,15 @@ std::vector<std::string> disabledTestPatterns() {
    return {
        // TODO: Issue 26264
        R"(.*(MaxPool|AvgPool).*S\(1\.2\).*Rounding=CEIL.*)",
        // TODO: Issue 31839
        R"(.*(QuantConvBackpropData3D).*)",
        // TODO: Issue 31841
        R"(.*(QuantGroupConvBackpropData3D).*)",
        // TODO: Issue 31843
        R"(.*(QuantGroupConvBackpropData2D)*QG=Perchannel.*)",
        // TODO: Issue 32023
        R"(.*(QuantGroupConvBackpropData2D)*QG=Pertensor.*)",
        R"(.*(QuantConvBackpropData3D).*)",
        R"(.*(QuantConvBackpropData2D).*(QG=Perchannel).*)",
        R"(.*(QuantGroupConvBackpropData2D).*(QG=Perchannel).*)",
        // TODO: Issue 33886
        R"(.*(QuantGroupConv2D).*)",
        R"(.*(QuantGroupConv3D).*)",
        // TODO: Issue 31845
        R"(.*(FakeQuantize).*)",
        R"(.*(EltwiseLayerTest).*IS=\(.*\..*\..*\..*\..*\).*secondaryInputType=PARAMETER.*opType=SCALAR.*)",

@@ -19,7 +19,6 @@ const std::vector<InferenceEngine::Precision> netPrecisions = {
const std::vector<size_t> numOutChannels = {16, 32};

const std::vector<size_t> levels = {256};
// FIXME: Perchannel tests fail because of bug in LPT
const std::vector<QuantizationGranularity> granularity = {Pertensor, Perchannel};

/* ============= 2D GroupConvolutionBackpropData ============= */

@@ -0,0 +1,86 @@
// Copyright (C) 2020 Intel Corporation
// SPDX-License-Identifier: Apache-2.0
//

#include <vector>

#include "subgraph_tests/quantized_group_convolution.hpp"
#include "common_test_utils/test_constants.hpp"

using namespace LayerTestsDefinitions;
using namespace ngraph::helpers;

namespace {

const std::vector<InferenceEngine::Precision> netPrecisions = {
    InferenceEngine::Precision::FP32
};


const std::vector<size_t> numOutChannels = {3, 24, 48};
const std::vector<size_t> numGroups = {3};

const std::vector<size_t> levels = {256};
const std::vector<QuantizationGranularity> granularity = {Pertensor, Perchannel};
const std::vector<bool> quantizeWeights = {false, true};

/* ============= 2D GroupConvolution ============= */
const std::vector<std::vector<size_t>> inputShapes2D = {{1, 3, 10, 10}, {1, 24, 10, 10}};
const std::vector<std::vector<size_t>> kernels2D = {{1, 1}, {3, 3}};
const std::vector<std::vector<size_t>> strides2D = {{1, 1}};
const std::vector<std::vector<ptrdiff_t>> padBegins2D = {{0, 0}};
const std::vector<std::vector<ptrdiff_t>> padEnds2D = {{0, 0}};
const std::vector<std::vector<size_t>> dilations2D = {{1, 1}};


const auto quantGroupConv2DParams = ::testing::Combine(
    ::testing::ValuesIn(kernels2D),
    ::testing::ValuesIn(strides2D),
    ::testing::ValuesIn(padBegins2D),
    ::testing::ValuesIn(padEnds2D),
    ::testing::ValuesIn(dilations2D),
    ::testing::ValuesIn(numOutChannels),
    ::testing::ValuesIn(numGroups),
    ::testing::ValuesIn(levels),
    ::testing::ValuesIn(granularity),
    ::testing::ValuesIn(quantizeWeights)
);

INSTANTIATE_TEST_CASE_P(QuantGroupConv2D, QuantGroupConvLayerTest,
    ::testing::Combine(
        quantGroupConv2DParams,
        ::testing::ValuesIn(netPrecisions),
        ::testing::ValuesIn(inputShapes2D),
        ::testing::Values(CommonTestUtils::DEVICE_CPU)),
    QuantGroupConvLayerTest::getTestCaseName);

/* ============= 3D GroupConvolution ============= */
const std::vector<std::vector<size_t>> inputShapes3D = {{1, 3, 5, 5, 5}, {1, 24, 5, 5, 5}};
const std::vector<std::vector<size_t>> kernels3D = {{3, 3, 3}};
const std::vector<std::vector<size_t>> strides3D = {{1, 1, 1}};
const std::vector<std::vector<ptrdiff_t>> padBegins3D = {{0, 0, 0}};
const std::vector<std::vector<ptrdiff_t>> padEnds3D = {{0, 0, 0}};
const std::vector<std::vector<size_t>> dilations3D = {{1, 1, 1}};

const auto quantGroupConv3DParams = ::testing::Combine(
    ::testing::ValuesIn(kernels3D),
    ::testing::ValuesIn(strides3D),
    ::testing::ValuesIn(padBegins3D),
    ::testing::ValuesIn(padEnds3D),
    ::testing::ValuesIn(dilations3D),
    ::testing::ValuesIn(numOutChannels),
    ::testing::ValuesIn(numGroups),
    ::testing::ValuesIn(levels),
    ::testing::ValuesIn(granularity),
    ::testing::ValuesIn(quantizeWeights)
);

INSTANTIATE_TEST_CASE_P(QuantGroupConv3D, QuantGroupConvLayerTest,
    ::testing::Combine(
        quantGroupConv3DParams,
        ::testing::ValuesIn(netPrecisions),
        ::testing::ValuesIn(inputShapes3D),
        ::testing::Values(CommonTestUtils::DEVICE_CPU)),
    QuantGroupConvLayerTest::getTestCaseName);

}  // namespace
@@ -21,7 +21,7 @@ const std::vector<std::map<std::string, std::string>> configs = {
    }
};

INSTANTIATE_TEST_CASE_P(ConcatQuantization, ConcatQuantization,
INSTANTIATE_TEST_CASE_P(smoke_ConcatQuantization, ConcatQuantization,
    ::testing::Combine(
        ::testing::ValuesIn(netPrecisions),
        ::testing::Values(CommonTestUtils::DEVICE_GNA),

@@ -0,0 +1,39 @@
// Copyright (C) 2020 Intel Corporation
// SPDX-License-Identifier: Apache-2.0
#include <vector>
#include "subgraph_tests/multioutput_eltwise_squeeze_eltwise.hpp"
#include "common_test_utils/test_constants.hpp"

using namespace LayerTestsDefinitions;

namespace {
    std::vector<std::vector<std::vector<size_t>>> inputs{
        {{1, 16}},
        {{2, 16}},
        {{1, 160}},
        {{8, 40}},
        {{3, 8}},
        {{4, 32}},
        {{5, 64}},
        {{6, 128}},
        {{7, 256}},
        {{8, 512}},
        {{8, 1024}}
    };

    std::map<std::string, std::string> additional_config = {
        {"GNA_COMPACT_MODE", "NO"},
    };

    std::vector<InferenceEngine::Precision> netPrecisions = {InferenceEngine::Precision::FP32,
                                                             InferenceEngine::Precision::FP16,
    };

    INSTANTIATE_TEST_CASE_P(multioutput_eltwise_identity, MultioutputEltwiseReshapeEltwise,
        ::testing::Combine(
            ::testing::ValuesIn(inputs),
            ::testing::ValuesIn(netPrecisions),
            ::testing::Values(CommonTestUtils::DEVICE_GNA),
            ::testing::Values(additional_config)),
        MultioutputEltwiseReshapeEltwise::getTestCaseName);
}  // namespace
@@ -9,6 +9,7 @@ using namespace LayerTestsDefinitions;
namespace {
    std::vector<std::vector<std::vector<size_t>>> inputs{
        {{1, 4, 160}, {0, 2, 1}},
        {{1, 160, 4}, {0, 2, 1}},
        {{8, 16}, {1, 0}},
        {{1, 1, 4, 16}, {3, 1, 2, 0}},
        {{1, 8, 200}, {0, 2, 1}},

@@ -0,0 +1,53 @@
// Copyright (C) 2020 Intel Corporation
// SPDX-License-Identifier: Apache-2.0
//

#include <vector>
#include "subgraph_tests/scaleshift.hpp"
#include "common_test_utils/test_constants.hpp"

using namespace LayerTestsDefinitions;

namespace {

std::vector<std::vector<std::vector<size_t>>> inShapes = {
    {{1, 8}},
    {{2, 16}},
    {{3, 32}},
    {{4, 64}},
    {{5, 128}},
    {{6, 256}},
    {{7, 512}},
    {{8, 1024}}
};

std::vector<std::vector<float>> Scales = {
    {2.0f},
    {3.0f},
    {-1.0f},
    {-2.0f},
    {-3.0f}
};

std::vector<std::vector<float>> Shifts = {
    {1.0f},
    {2.0f},
    {3.0f},
    {-1.0f},
    {-2.0f},
    {-3.0f}
};

std::vector<InferenceEngine::Precision> netPrecisions = {InferenceEngine::Precision::FP32,
                                                         InferenceEngine::Precision::FP16,
};

INSTANTIATE_TEST_CASE_P(scale_shift, ScaleShiftLayerTest,
    ::testing::Combine(
        ::testing::ValuesIn(inShapes),
        ::testing::ValuesIn(netPrecisions),
        ::testing::Values(CommonTestUtils::DEVICE_GNA),
        ::testing::ValuesIn(Scales),
        ::testing::ValuesIn(Shifts)),
    ScaleShiftLayerTest::getTestCaseName);
}  // namespace
@@ -60,8 +60,8 @@ INSTANTIATE_TEST_CASE_P(PriorBoxClustered_Basic, PriorBoxClusteredLayerTest,
    ::testing::Combine(
        layerSpeficParams,
        ::testing::ValuesIn(netPrecisions),
        ::testing::Values(std::vector<size_t>({ 4, 4 })),
        ::testing::Values(std::vector<size_t>({ 50, 50 })),
        ::testing::Values(std::vector<size_t>({ 1, 16, 4, 4 })),
        ::testing::Values(std::vector<size_t>({ 1, 3, 50, 50 })),
        ::testing::Values(CommonTestUtils::DEVICE_GPU)),
    PriorBoxClusteredLayerTest::getTestCaseName
);

Some files were not shown because too many files have changed in this diff.