Compare commits

...

55 Commits

Author SHA1 Message Date
Alexey Suhov
023e7c2c3f update system requirements (#1321)
* update system requirements

* update release version in readme
2020-07-14 20:25:39 +03:00
Alexey Suhov
34ddb70f7d fix build target name in demos for Windows (#1248) 2020-07-07 18:26:50 +03:00
Andrew Bakalin
21e092122f [VPU] WA for static shape allocation (#1106) 2020-06-24 16:28:59 +03:00
Roman Kazantsev
92c1333653 Correct removing nodes from graph and add test for ConstToResult transform (#1083)
Signed-off-by: Roman Kazantsev <roman.kazantsev@intel.com>
2020-06-24 15:39:08 +03:00
Roman Kazantsev
c26ec8b312 [IE] Preserve output data name after merging and update output data map (#1092)
Signed-off-by: Roman Kazantsev <roman.kazantsev@intel.com>
2020-06-24 12:30:25 +03:00
Andrew Bakalin
32054ff180 [VPU] Support for originalLayersNames attribute in exec graph (#1073) 2020-06-23 12:19:15 +03:00
Ilya Churaev
7cff005ada Disable ref implementations (#951)
* Add NGRAPH_EVALUATE_ENABLE flag and disable all reference implementations

* Enable some evaluate methods

* Added dynamic library with reference implementations

* Fixed tests

* Enabled unsqueeze  CF

* Removed nGraph test library

* Disable all nGraph tests to check

* Enable some reference implementations

* Added debug message

* EVALUATE true

* Revert "Disable all nGraph tests to check"

This reverts commit 38bca3ed3dfed029e892fe609ea7e48c5cfadb67.

* Enable some implementations

* Removed some TYPE_CASE reference implementations

* Fixed reshape

* Revert types for Broadcast and Add

* Disabled failing gpu_engine.user_context test

* Disabled failed nGraph tests

* Add u8 for non_zero

* Revert "Added debug message"

This reverts commit 4b9f4894f5ae9963426830ac5e5eb833af8847aa.

* Revert "Enable some reference implementations"

This reverts commit d2001a636df7504e0ad5abe5c98725ef0be07379.

Revert "Enabled unsqueeze  CF"

This reverts commit 814a8e52cb2b673446d24e54ed11af1dd3d80fad.

Revert "Enable some evaluate methods"

This reverts commit 73767b8942d857bf60317f29120c98c528344a04.

* Revert "Add NGRAPH_EVALUATE_ENABLE flag and disable all reference implementations"

This reverts commit cfaa7d7e7bf34b617f53a556d24fea2189372592.
2020-06-23 12:17:40 +03:00
Ivan Tikhonov
06707cc53f Fix for Kaldi models with Memory layers and a batch size greater than 1 (#1025)
* fix kaldi models with memory (batch > 1)

* apply review comments

* Added test for the case using the SetBatchSize function when ReadValue op is in the network

* Check status code instead of message

* Use new ngraph api
2020-06-23 11:47:18 +03:00
Konrad Dobros
fff93d8f05 [IE CLDNN] Add work-around for 1d input to Gather (#1069) 2020-06-23 11:44:20 +03:00
Gladilov, Gleb
637ddd5dfb [IE][VPU]: Fixes klocwork issues (#1075) 2020-06-23 09:58:12 +03:00
Ivan Tikhonov
fa4c5e8e38 Fix ARM build: explicit type conversion (#1061)
* fix arm build: explicit type conversion

* Use explicit conversion in prior_box_ie.cpp
2020-06-22 23:37:54 +03:00
Maxim Vafin
c9fc6f0531 Fix OneHot transformation for Bert Squad opset 10 (#954)
* Add a transformation that squeezes the depth input of the ONNX OneHot operation, because in some TF models it has shape [1] instead of []
2020-06-22 18:58:07 +03:00
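The OneHot fix above can be illustrated with a minimal sketch (hypothetical, not the actual Model Optimizer code): the ONNX OneHot operation expects a scalar depth input (shape `[]`), while some TF-converted models supply it with shape `[1]`, so the transformation squeezes that dimension away.

```python
import numpy as np

# Depth input as seen in some TF-converted models: shape (1,) instead of a scalar.
depth = np.array([5])

# The transformation squeezes the extra dimension so depth becomes shape ().
depth_scalar = np.squeeze(depth)
```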
Denis Orlov
c9eb6ae62b [GNA] Initialize a local variable (#1066) 2020-06-22 18:49:22 +03:00
Alexander Chaiko
eef56ca80c [IE CLDNN] WA to 1d input for concat (#1040) 2020-06-22 15:25:17 +03:00
Gorokhov Dmitriy
36f1c00e02 [CPU] Fixed issue with unsupported reorder case for grouped convolutions (#893) 2020-06-22 14:06:53 +03:00
Konrad Dobros
5c43765011 [IE CLDNN] Fix activation implementation for fsv16 format (#1038)
For the b_fs_yx_fsv16 format in the reference kernel, the features for
dispatch are rounded up to a multiple of 16. This change adds a correct check
in the kernel to return early for work-items that fall inside this dispatch
padding. Previously those work-items could corrupt memory expected to be
filled with 0s, and for parametrized activation, due to bounds checking with
the modulo operator, they could have been corrupting the actual layer output.

Issue: CVS-27672
2020-06-22 09:17:00 +03:00
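The padding issue described above can be sketched in Python (a hypothetical model of the dispatch logic, not the actual CLDNN OpenCL kernel): the feature dimension of the dispatch is rounded up to a multiple of 16, so work-items beyond the real feature count must return early rather than write output chosen via a modulo-based bounds check.

```python
def round_up(value, multiple):
    # Dispatch size rounding: e.g. 17 features dispatch as 32 work-items.
    return (value + multiple - 1) // multiple * multiple

def activation(feature_id, feature_num, data, params):
    # The fix: skip work-items that exist only because of dispatch padding.
    if feature_id >= feature_num:
        return None
    # Without the early return, params[feature_id % len(params)] could select
    # a wrong per-channel parameter for padded ids and corrupt real output.
    return data[feature_id] * params[feature_id % len(params)]
```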
Ilya Lavrenov
bbfc9bbc14 Deprecated IGNORE_IR_STATISTIC VPU option (#1028) 2020-06-20 10:38:47 +03:00
Pavel Rodionov
9c607528ef [GNA] Support export model with multiple inputs/outputs and Permute layer (#1024) 2020-06-19 18:06:38 +03:00
Denis Orlov
ae9e0510f0 [GNA] Additional checks (#998) 2020-06-19 13:14:32 +03:00
Edward Shogulin
76af547c17 [LPT] BERT with specific biases support & improvement (#968)
* [LPT] BERT with biases support

* [LPT] Gemm biases and quantization

* [CPU] Fixed FullyConnected + Depthwise node fusing

* [LPT] FullyConnected 3D: symmetric quantization support

* [LPT] FullyConnected 3D: symmetric quantization support fix

* [CPU] Fixed FullyConnected + Depthwise fusing initialization

Co-authored-by: dmitrygo <dmitry.gorokhov@intel.com>
2020-06-19 13:14:20 +03:00
Kamil Magierski
5e97a3123f Fix cases when const blob precision is not FP32/FP16 (#1000)
Co-authored-by: kmagiers <kmagiers@intel.com>
2020-06-19 13:13:19 +03:00
Andrey Dmitriev
532dec140b [GNA] fix permute 0_2_1 (#993) 2020-06-19 10:20:55 +03:00
Vladimir Paramuzov
c41c6294f9 [IE CLDNN] Fix strided slice (#953) 2020-06-19 08:23:25 +03:00
Gorokhov Dmitriy
3bbe88e659 [IE Common][WA] Skipped const folding for Convolution layer (#1002) 2020-06-19 01:25:20 +03:00
Maxim Andronov
2f3d5f68cd [CPU] fix one dims scale shift (#983) 2020-06-18 14:21:07 +03:00
Evgeny Talanin
843f81a1cc [IE TESTS] Disable some Myriad tests on Win (#763) (#988)
* [IE TESTS] Disable some Myriad tests on Win

* Skip test with todo

Co-authored-by: Irina Efode <irina.efode@intel.com>
2020-06-18 13:57:21 +03:00
Pavel Esir
c596707a09 fixed some typos in MO help (#979) 2020-06-18 11:02:28 +03:00
Konrad Dobros
cf60baf2f0 [IE CLDNN] Fix gather dimensions calculation (#960) 2020-06-18 00:31:17 +03:00
Nikita Kudriavtsev
aeb70036d7 [IE Myriad] Remove Myriad 2 from supported devices in XLink (#978) 2020-06-17 17:47:55 +03:00
Daria Mityagina
dea04dae8c [IE Myriad] WrapInLoop fix: if data has consumers' inputs inside the subgraph, replace them (#958) 2020-06-17 17:27:17 +03:00
Ilya Churaev
14b44803ba Fixed cpack information, removed some links (#975) 2020-06-17 17:17:10 +03:00
Andrey Dmitriev
06286f2aae [GNA] Added fix multiple output with one go to memory and test (#888)
Added multi output

Update gna_pass_manager.cpp

test

tests

tests_2

Added pass

return old
2020-06-17 11:23:56 +03:00
Ilya Churaev
97e5fc4bae Use creators only for default opsets (#932) 2020-06-16 12:25:06 +03:00
Alexey Tarakanov
47218284b2 Support fp16 networks for releases_2020_4 (#936) 2020-06-16 10:31:57 +03:00
Andrey Dmitriev
6079a35b81 [GNA] Added test for ScaleShift and fixed power layer with non-zero shift (#922)
* [GNA] Added test ScaleShift and fixed power layer with non zero shift

added tests

* Test Assert

* rebuild
2020-06-16 00:32:28 +03:00
Roman Kazantsev
4f4352f301 Fix preserving names of output layers after TopK NGraph transformation (#928)
* Fix preserving names of output layers after TopK NGraph transformation (#843)

* Fix preserving names of output layers after TopK NGraph transformation

It helps to infer semantic-segmentation-adas-0001 model. See CVS-31977.

Signed-off-by: Roman Kazantsev <roman.kazantsev@intel.com>

* Fix a test for TopK

Signed-off-by: Roman Kazantsev <roman.kazantsev@intel.com>

* Fix TopK NGraph transformation and its test

Signed-off-by: Roman Kazantsev <roman.kazantsev@intel.com>

* Disable smoke_LoadNetworkAccuracy due to sporadic failure

Signed-off-by: Roman Kazantsev <roman.kazantsev@intel.com>
2020-06-15 20:57:45 +03:00
Anastasia Kuporosova
a67d74c41f [Python API] Fix long inference (#897) 2020-06-15 16:21:41 +03:00
Ivan Tikhonov
26c563132d Revert prior box constant folding (#906)
* Revert "Const folding and reference implementation for PriorBox(Clustered) ops (#785)"

This reverts commit 9fc818478a.

* apply codestyle for ngraph part
2020-06-15 12:38:27 +03:00
Ilya Lavrenov
dc1ca195dd Updated dates of removal for deprecated API (#911) 2020-06-15 12:24:27 +03:00
Vladimir Paramuzov
f5ad3e6f89 [IE CLDNN] Fixed clone network to preserve original CNNNetwork (#870) 2020-06-12 15:53:30 +03:00
Konrad Dobros
6c736ce001 [IE CLDNN] Fix fsv16 -> bfyx reorder removal (#873) 2020-06-12 15:43:54 +03:00
Anastasia Kuporosova
30ab6534e1 [Python API] Fixate requirements (#905) 2020-06-12 12:06:11 +03:00
Ilya Lavrenov
259a4c25ce TESTS: Added test for parallel LoadNetwork with accuracy check (#858) 2020-06-12 11:56:59 +03:00
Andrey Somsikov
347930008c Use default thread sanitizer linkage (#899)
GCC and Clang *default* sanitizer linkage differs (static vs. dynamic).
Prefer the default behavior, as the alternative has been seen causing issues.

The default GNU linker fails with unresolved symbols when linking Clang-built
binaries with the sanitizer enabled. Force use of the LLVM linker lld for
Clang builds.

Sanitizer instrumentation and link flags should be retained for all
binaries. The samples CMake configuration is updated to keep those flags
after the unset logic in ie_build_samples().
2020-06-12 00:36:03 +03:00
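The flag-selection logic of this commit can be sketched as follows (a hypothetical Python model for illustration only; the authoritative version is the CMake fragment in this diff's file changes):

```python
def tsan_linker_flags(compiler_id, compiler_version, win32=False):
    # Start from the plain thread-sanitizer flag (default linkage).
    flags = "-fsanitize=thread"
    if compiler_id in ("Clang", "AppleClang") and not win32:
        if compiler_version >= (8, 0):
            flags += " -fuse-ld=lld"      # Clang >= 8: link with LLVM's lld
        else:
            flags += " -static-libsan"    # older Clang: static sanitizer runtime
    return flags
```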
Evgeny Latkin
4fa251483a [IE][Myriad] fix HW tiling (#894) 2020-06-11 20:48:56 +03:00
Vladimir Paramuzov
30f8af70fc [IE CLDNN] fix perf for fsv16 global avg pooling (#666) 2020-06-11 20:44:37 +03:00
Andrew Bakalin
3fc6d8a188 [VPU] Update firmware (#898) 2020-06-11 20:44:20 +03:00
Denis Orlov
66c8df6a87 [GNA] Fixes in checks, asserts, etc. (#867) 2020-06-11 20:04:46 +03:00
Nikolay Shchegolev
e53eb86334 [Common] Static analysis issues. Part II. 2020-06-11 19:59:44 +03:00
Edward Shogulin
2df99d4263 [LPT] Static code analysis issues fix (#889) 2020-06-11 15:09:20 +03:00
Gleb Kazantaev
deab4d38b0 Fix NopElimination (#869) 2020-06-11 13:28:27 +03:00
Vladimir Paramuzov
412428f1dd [IE CLDNN] Always use FP32 as intermediate type for fused quantize (#829) 2020-06-11 12:22:27 +03:00
Evgeny Lazarev
167c96a8af Relaxed MO requirements for "protobuf" package (#862) 2020-06-10 18:26:16 +03:00
Gleb Kazantaev
b7363ba711 Fix divide conversion for integer input type (#853) 2020-06-10 16:25:57 +03:00
Evgeny Lazarev
5cef9f3734 Fixed StridedSlice to Crop transformation (#836) (#845)
* Fixed StridedSlice to Crop transformation so it is not applied when the rank of the data is changed

* Added unit test for StridedSlice to Crop transformation
2020-06-10 11:54:02 +03:00
248 changed files with 3762 additions and 2497 deletions

View File

@@ -1,5 +1,5 @@
# [OpenVINO™ Toolkit](https://01.org/openvinotoolkit) - Deep Learning Deployment Toolkit repository
[![Stable release](https://img.shields.io/badge/version-2020.3-green.svg)](https://github.com/openvinotoolkit/openvino/releases/tag/2020.3.0)
[![Stable release](https://img.shields.io/badge/version-2020.4-green.svg)](https://github.com/openvinotoolkit/openvino/releases/tag/2020.4.0)
[![Apache License Version 2.0](https://img.shields.io/badge/license-Apache_2.0-green.svg)](LICENSE)
This toolkit allows developers to deploy pre-trained deep learning models

View File

@@ -52,14 +52,15 @@ as a part of [Intel® Distribution of OpenVINO™].
## Build on Linux\* Systems
The software was validated on:
- Ubuntu\* 18.04 (64-bit) with default GCC\* 7.5.0
- Ubuntu\* 16.04 (64-bit) with default GCC\* 5.4.0
- CentOS\* 7.4 (64-bit) with default GCC\* 4.8.5
### Software Requirements
- [CMake]\* 3.11 or higher
- GCC\* 4.8 or higher to build the Inference Engine
- Python 2.7 or higher for Inference Engine Python API wrapper
- (Optional) [Install Intel® Graphics Compute Runtime for OpenCL™ Driver package 20.13.16352].
- Python 3.5 or higher for Inference Engine Python API wrapper
- (Optional) [Install Intel® Graphics Compute Runtime for OpenCL™ Driver package 19.41.14441].
### Build Steps
1. Clone submodules:
@@ -77,7 +78,7 @@ The software was validated on:
```
3. By default, the build enables the Inference Engine GPU plugin to infer models
on your Intel® Processor Graphics. This requires you to
[Install Intel® Graphics Compute Runtime for OpenCL™ Driver package 20.13.16352]
[Install Intel® Graphics Compute Runtime for OpenCL™ Driver package 19.41.14441]
before running the build. If you don't want to use the GPU plugin, use the
`-DENABLE_CLDNN=OFF` CMake build option and skip the installation of the
Intel® Graphics Compute Runtime for OpenCL™ Driver.
@@ -202,7 +203,7 @@ Native compilation of the Inference Engine is the most straightforward solution.
This compilation was tested on the following configuration:
* Host: Ubuntu\* 16.04 (64-bit, Intel® Core™ i7-6700K CPU @ 4.00GHz × 8)
* Host: Ubuntu\* 18.04 (64-bit, Intel® Core™ i7-6700K CPU @ 4.00GHz × 8)
* Target: Raspbian\* Stretch (32-bit, ARMv7, Raspberry Pi\* 3)
1. Install Docker\*:
@@ -337,7 +338,7 @@ The software was validated on:
- [CMake]\*3.11 or higher
- Microsoft\* Visual Studio 2017, 2019 or [Intel® C++ Compiler] 18.0
- (Optional) Intel® Graphics Driver for Windows* (26.20) [driver package].
- Python 3.4 or higher for Inference Engine Python API wrapper
- Python 3.5 or higher for Inference Engine Python API wrapper
### Build Steps
@@ -454,7 +455,7 @@ The software was validated on:
- [CMake]\* 3.11 or higher
- Clang\* compiler from Xcode\* 10.1 or higher
- Python\* 3.4 or higher for the Inference Engine Python API wrapper
- Python\* 3.5 or higher for the Inference Engine Python API wrapper
### Build Steps
@@ -574,8 +575,7 @@ This section describes how to build Inference Engine for Android x86 (64-bit) op
## Use Custom OpenCV Builds for Inference Engine
> **NOTE**: The recommended and tested version of OpenCV is 4.3. The minimum
supported version is 3.4.0.
> **NOTE**: The recommended and tested version of OpenCV is 4.4.0.
Required versions of OpenCV packages are downloaded automatically during the
building Inference Engine library. If the build script can not find and download
@@ -691,7 +691,7 @@ This target collects all dependencies, prepares the nGraph package and copies it
[Intel® Distribution of OpenVINO™]:https://software.intel.com/en-us/openvino-toolkit
[CMake]:https://cmake.org/download/
[Install Intel® Graphics Compute Runtime for OpenCL™ Driver package 20.13.16352]:https://github.com/intel/compute-runtime/releases/tag/20.13.16352
[Install Intel® Graphics Compute Runtime for OpenCL™ Driver package 19.41.14441]:https://github.com/intel/compute-runtime/releases/tag/19.41.14441
[MKL-DNN repository]:https://github.com/intel/mkl-dnn/releases/download/v0.19/mklml_lnx_2019.0.5.20190502.tgz
[MKL-DNN repository for Windows]:(https://github.com/intel/mkl-dnn/releases/download/v0.19/mklml_win_2019.0.5.20190502.zip)
[OpenBLAS]:https://sourceforge.net/projects/openblas/files/v0.2.14/OpenBLAS-v0.2.14-Win64-int64.zip/download

View File

@@ -27,8 +27,14 @@ endif()
if (ENABLE_THREAD_SANITIZER)
set(SANITIZER_COMPILER_FLAGS "-g -fsanitize=thread -fno-omit-frame-pointer")
set(SANITIZER_LINKER_FLAGS "-fsanitize=thread -static-libsan")
set(SANITIZER_LINKER_FLAGS "-fsanitize=thread")
if(CMAKE_CXX_COMPILER_ID MATCHES "^(Apple)?Clang$" AND NOT WIN32)
if(CMAKE_CXX_COMPILER_VERSION VERSION_GREATER_EQUAL 8.0)
set(SANITIZER_LINKER_FLAGS "${SANITIZER_LINKER_FLAGS} -fuse-ld=lld")
else()
set(SANITIZER_LINKER_FLAGS "${SANITIZER_LINKER_FLAGS} -static-libsan")
endif()
endif()
set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} ${SANITIZER_COMPILER_FLAGS}")
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} ${SANITIZER_COMPILER_FLAGS}")
set(CMAKE_SHARED_LINKER_FLAGS "${CMAKE_SHARED_LINKER_FLAGS} ${SANITIZER_LINKER_FLAGS}")

View File

@@ -79,7 +79,7 @@ function(ie_build_samples)
MINGW64 CMAKE_BUILD_TYPE CMAKE_MACOSX_RPATH)
unset(${var})
endforeach()
include(sanitizer)
add_subdirectory(samples)
endfunction()

View File

@@ -19,7 +19,7 @@ set(VPU_SUPPORTED_FIRMWARES usb-ma2450 usb-ma2x8x pcie-ma248x)
# Default packages
#
set(FIRMWARE_PACKAGE_VERSION 1216)
set(FIRMWARE_PACKAGE_VERSION 1223)
set(VPU_CLC_MA2X8X_VERSION "movi-cltools-20.02.0")
#

View File

@@ -1,2 +1,2 @@
numpy
cython>=0.29
numpy==1.13.3
cython==0.29.17

View File

@@ -1,2 +1,2 @@
opencv-python==3.4.4
numpy==1.18.1
opencv-python==3.4.4.19
numpy==1.13.3

View File

@@ -814,8 +814,8 @@ cdef class ExecutableNetwork:
current_request = self.requests[0]
current_request.infer(inputs)
res = {}
for out in current_request._outputs_list:
res[out] = deepcopy(current_request.output_blobs[out].buffer)
for name, value in current_request.output_blobs.items():
res[name] = deepcopy(value.buffer)
return res

View File

@@ -229,12 +229,14 @@ void InferenceEnginePython::IENetwork::serialize(const std::string &path_to_xml,
const std::vector <InferenceEngine::CNNLayerPtr>
InferenceEnginePython::IENetwork::getLayers() {
IE_SUPPRESS_DEPRECATED_START
std::vector<InferenceEngine::CNNLayerPtr> result;
std::vector<InferenceEngine::CNNLayerPtr> sorted_layers = InferenceEngine::details::CNNNetSortTopologically(*actual);
for (const auto &layer : sorted_layers) {
result.emplace_back(layer);
}
return result;
IE_SUPPRESS_DEPRECATED_END
}
PyObject* InferenceEnginePython::IENetwork::getFunction() {

View File

@@ -1,4 +1,4 @@
cython==0.29.17
opencv-python==3.4.4.19
pytest==4.0.1
attrs==19.1.0
pytest-html==1.19.0

View File

@@ -22,12 +22,12 @@
namespace InferenceEngine {
/**
* @deprecated Use InferenceEngine::Core instead. Will be removed in 2020.3
* @deprecated Use InferenceEngine::Core instead. Will be removed in 2021.1
* @brief This class is a C++ API wrapper for IInferencePlugin.
*
* It can throw exceptions safely for the application, where it is properly handled.
*/
class INFERENCE_ENGINE_DEPRECATED("Use InferenceEngine::Core instead. Will be removed in 2020.3") InferencePlugin {
class INFERENCE_ENGINE_DEPRECATED("Use InferenceEngine::Core instead. Will be removed in 2021.1") InferencePlugin {
IE_SUPPRESS_DEPRECATED_START
InferenceEnginePluginPtr actual;

View File

@@ -21,10 +21,10 @@ namespace InferenceEngine {
namespace details {
/**
* @deprecated Migrate to IR v10 and work with ngraph::Function directly. The method will be removed in 2020.3
* @deprecated Migrate to IR v10 and work with ngraph::Function directly. The method will be removed in 2021.1
* @brief This class enables range loops for CNNNetwork objects
*/
class INFERENCE_ENGINE_INTERNAL("Migrate to IR v10 and work with ngraph::Function directly. The method will be removed in 2020.3")
class INFERENCE_ENGINE_INTERNAL("Migrate to IR v10 and work with ngraph::Function directly. The method will be removed in 2021.1")
CNNNetworkIterator {
IE_SUPPRESS_DEPRECATED_START

View File

@@ -16,6 +16,7 @@
namespace InferenceEngine {
namespace details {
INFERENCE_ENGINE_INTERNAL("Migrate to IR v10 and work with ngraph::Function directly. The method will be removed in 2021.1")
INFERENCE_ENGINE_API_CPP(std::vector<CNNLayerPtr>) CNNNetSortTopologically(const ICNNNetwork& network);
} // namespace details

View File

@@ -126,7 +126,7 @@ public:
const SizeVector& getDims() const;
/**
* @deprecated Migrate to IR v10 and work with ngraph::Function directly. The method will be removed in 2020.3
* @deprecated Migrate to IR v10 and work with ngraph::Function directly. The method will be removed in 2021.1
* @brief Returns an owner of this data layer, parent layer in di-graph
* @return A weak pointer to CNNLayer that creates this data
*/
@@ -147,7 +147,7 @@ public:
void setName(const std::string& newName);
/**
* @deprecated Migrate to IR v10 and work with ngraph::Function directly. The method will be removed in 2020.3
* @deprecated Migrate to IR v10 and work with ngraph::Function directly. The method will be removed in 2021.1
* @brief Privates child layers in di-graph
* @return A map of child layers
*/

View File

@@ -2049,7 +2049,7 @@ public:
};
/**
* @deprecated Migrate to IR v10 and work with ngraph::Function directly. The method will be removed in 2020.3
* @deprecated Migrate to IR v10 and work with ngraph::Function directly. The method will be removed in 2021.1
* @brief This class represents a standard ScatterUpdate layer
*/
class INFERENCE_ENGINE_INTERNAL_CNNLAYER_CLASS(ScatterUpdateLayer): public CNNLayer {
@@ -2063,7 +2063,7 @@ public:
};
/**
* @deprecated Migrate to IR v10 and work with ngraph::Function directly. The method will be removed in 2020.3
* @deprecated Migrate to IR v10 and work with ngraph::Function directly. The method will be removed in 2021.1
* @brief This class represents a standard ScatterElementsUpdate layer
*/
class INFERENCE_ENGINE_INTERNAL_CNNLAYER_CLASS(ScatterElementsUpdateLayer): public CNNLayer {
@@ -2077,7 +2077,7 @@ public:
};
/**
* @deprecated Migrate to IR v10 and work with ngraph::Function directly. The method will be removed in 2020.3
* @deprecated Migrate to IR v10 and work with ngraph::Function directly. The method will be removed in 2021.1
* @brief This class represents an onnx ExperimentalDetectronPriorGridGenerator Layer
*/
class INFERENCE_ENGINE_INTERNAL_CNNLAYER_CLASS(ExperimentalDetectronPriorGridGeneratorLayer): public CNNLayer {

View File

@@ -123,11 +123,13 @@ DECLARE_VPU_CONFIG_VALUE(NDHWC);
DECLARE_VPU_CONFIG_KEY(CUSTOM_LAYERS);
/**
* @deprecated IR statistic is not available in IR v10. The option will be removed in 2021.1
* @brief Ignore statistic in IR by plugin.
* Plugin could use statistic present in IR in order to try to improve calculations precision.
* If you don't want statistic to be used enable this option.
* This option should be used with values: CONFIG_VALUE(YES) or CONFIG_VALUE(NO) (default)
*/
INFERENCE_ENGINE_DEPRECATED("IR statistic is not available in IR v10. The option will be removed in 2021.1")
DECLARE_VPU_CONFIG_KEY(IGNORE_IR_STATISTIC);
/**

View File

@@ -382,6 +382,9 @@ int main(int argc, char* argv[]) {
trim(strLine);
labels.push_back(strLine);
}
inputFile.close();
} else {
throw std::logic_error("Cannot read label file");
}
ClassificationResult classificationResult(outputBlob, images, batchSize, FLAGS_nt, labels);

View File

@@ -71,8 +71,8 @@ cldnn::device_info clDNNEngine::GetDeviceInfo(const std::map<std::string, std::s
}
InferenceEngine::ICNNNetwork::Ptr clDNNEngine::CloneNetwork(const InferenceEngine::ICNNNetwork& network) const {
std::shared_ptr<ICNNNetwork> clonedNetwork(nullptr);
if (network.getFunction()) {
std::shared_ptr<ICNNNetwork> clonedNetwork = cloneNetwork(network);
if (clonedNetwork->getFunction()) {
const auto transformations_callback = [](const std::shared_ptr<const ::ngraph::Node> &node) -> bool {
// DepthToSpace node implementation supports only equal input/output tensors with rank <= 5
// Reshape->Permute->Reshape pattern in theory can change output rank, so this check is added to be sure
@@ -84,8 +84,7 @@ InferenceEngine::ICNNNetwork::Ptr clDNNEngine::CloneNetwork(const InferenceEngin
return std::dynamic_pointer_cast<const ::ngraph::opset2::Gelu>(node) ||
std::dynamic_pointer_cast<const ::ngraph::opset3::ShuffleChannels>(node);
};
CNNNetwork net(network.getFunction());
auto nGraphFunc = net.getFunction();
auto nGraphFunc = clonedNetwork->getFunction();
// Disable shape inference (WA for generic operations)
::ngraph::op::GenericIE::DisableReshape noReshape(nGraphFunc);
@@ -94,9 +93,7 @@ InferenceEngine::ICNNNetwork::Ptr clDNNEngine::CloneNetwork(const InferenceEngin
ngraph::pass::ConvertOpSet3ToOpSet2(transformations_callback).run_on_function(nGraphFunc);
ngraph::pass::ConvertOpSet2ToOpSet1(transformations_callback).run_on_function(nGraphFunc);
ngraph::pass::ConvertOpSet1ToLegacy(transformations_callback).run_on_function(nGraphFunc);
clonedNetwork = InferenceEngine::details::convertFunctionToICNNNetwork(nGraphFunc, network);
} else {
clonedNetwork = cloneNet(network);
clonedNetwork = InferenceEngine::details::convertFunctionToICNNNetwork(nGraphFunc, *clonedNetwork);
}
auto implNetwork = std::dynamic_pointer_cast<InferenceEngine::details::CNNNetworkImpl>(clonedNetwork);

View File

@@ -3518,10 +3518,29 @@ void Program::AddConstantBlobInput(cldnn::topology& topology, InferenceEngine::C
return false;
};
// WA to inconsistency between input and const 1d tensors
// For Concat along batch we go with batch interpretation
// For Gather input we go with batch interpretation
bool needsBatchInterpretation = false;
if (constDims.size() == 1) {
for (auto next : GetNextLayers(layer->outData[0])) {
if (LayerTypeFromStr(next->type) == Concatenate) {
auto nextConcat = as<InferenceEngine::ConcatLayer*>(next);
if (nextConcat->_axis == cldnn::concatenation::concatenation_axis::along_b) {
needsBatchInterpretation = true;
break;
}
} else if (LayerTypeFromStr(next->type) == Gather) {
needsBatchInterpretation = true;
break;
}
}
}
// If quantize on weights has per-channel ranges, we have to swap channel and batch dimensions, because
// quantization should be applied per output channel of weights
// TODO: Check if it's still needed once LowPrecisionTransformations ready
if (inputToConstQuantize(layer)) {
if (inputToConstQuantize(layer) || needsBatchInterpretation) {
constTensor.batch[0] = constTensor.count();
constTensor.feature[0] = 1;
}
@@ -3862,11 +3881,13 @@ void Program::CreateStridedSlicePrimitive(cldnn::topology& topology, InferenceEn
tmp = stridedSliceLayer->GetParamAsUInts("shrink_axis_mask");
std::vector<uint8_t> shrink_axis_mask(tmp.begin(), tmp.end());
auto out_size = CldnnTensorFromIEDims(stridedSliceLayer->outData[0]->getTensorDesc().getDims());
std::string stridedSliceLayerName = layer_type_name_ID(layer);
auto stridedSlicePrim = cldnn::strided_slice(
stridedSliceLayerName,
inputPrimitives[0], inputPrimitives[1], inputPrimitives[2], inputPrimitives[3],
begin_mask, end_mask, new_axis_mask, shrink_axis_mask);
begin_mask, end_mask, new_axis_mask, shrink_axis_mask, out_size);
topology.add(stridedSlicePrim);
AddPrimitiveToProfiler(stridedSliceLayerName, layer);

View File

@@ -359,7 +359,7 @@ void GNAPluginNS::backend::AMIntelDNN::InitDeinterleaveComponentPrivate(intel_dn
comp.operation = kDnnDeinterleaveOp;
comp.macro_operation = kDnnMacroOpNone;
comp.orientation_in = kDnnInterleavedOrientation;
comp.orientation_out = kDnnNonInterleavedOrientation;
comp.orientation_out = kDnnInterleavedOrientation;
comp.output_scale_factor = output_scale_factor;
comp.input_scale_factor = output_scale_factor;
if (!postInitMem) {
@@ -1524,6 +1524,7 @@ void GNAPluginNS::backend::AMIntelDNN::InitGNAStruct(intel_nnet_type_t *ptr_nnet
THROW_GNA_EXCEPTION << "Encountered activation component before pooling component at." << i;
} else {
const auto poolMode = reinterpret_cast<Gna2PoolingMode*>(gnaUserAllocator(sizeof(Gna2PoolingMode)));
IE_ASSERT(poolMode != nullptr);
*poolMode = (comp.op.maxpool.do_sum_not_max) ? Gna2PoolingModeSum : Gna2PoolingModeMax;
const auto poolWindow = create_shape1D_parameter(comp.op.maxpool.num_inputs);
const auto poolStride = create_shape1D_parameter(comp.op.maxpool.num_inputs_step);
@@ -1583,6 +1584,7 @@ void GNAPluginNS::backend::AMIntelDNN::InitGNAStruct(intel_nnet_type_t *ptr_nnet
case kDnnPiecewiselinearOp:
#if GNA_LIB_VER == 2
{
IE_ASSERT(gnaOperation->Operands != nullptr);
auto& outputTensor = const_cast<Gna2Tensor&>(*gnaOperation->Operands[OutOpIdx]);
outputTensor.Data = comp.ptr_outputs;
outputTensor.Type = Gna2DataTypeFromBytes(comp.num_bytes_per_output);

View File

@@ -80,7 +80,7 @@ static const char *intel_dnn_softmax_name[kSoftmaxNumType] = {
};
typedef enum {
kDnnUnknownOrientation,
kDnnUnknownOrientation = 100,
kDnnInterleavedOrientation,
kDnnNonInterleavedOrientation,
kDnnNumOrientation

View File

@@ -199,9 +199,17 @@ class ScaleFactorPerLayer<InferenceEngine::CNNLayer *> {
if (cnnLayer->type == "Const") {
auto blob = cnnLayer->blobs["custom"];
if (blob->getTensorDesc().getPrecision() == InferenceEngine::Precision::FP16) {
auto blob_precision = blob->getTensorDesc().getPrecision();
if (blob_precision != InferenceEngine::Precision::FP32 && blob_precision != InferenceEngine::Precision::FP16) {
quant->_dst_quant.scale = 1.0f;
return true;
}
if (blob_precision == InferenceEngine::Precision::FP16) {
blob = make_fp32_blob(blob);
}
auto max_val = std::numeric_limits<float>::min();
auto min_val = std::numeric_limits<float>::max();

View File

@@ -9,6 +9,7 @@
#if GNA_LIB_VER == 2
#include "gna2_model_debug_log.hpp"
#include "gna2-model-api.h"
#include <details/ie_exception.hpp>
#include <cstdint>
#include <fstream>
@@ -52,6 +53,7 @@ template <class T>
bool NextElement(T & elementIndex, const Gna2Shape& total) {
if (total.NumberOfDimensions == 0) return false;
auto idx = total.NumberOfDimensions - 1;
IE_ASSERT(idx < GNA2_SHAPE_MAXIMUM_NUMBER_OF_DIMENSIONS);
while (elementIndex[idx] + 1 >= total.Dimensions[idx] && idx > 0) {
idx--;
}

View File

@@ -60,6 +60,7 @@ Gna2Tensor HelperGna2TensorInit3D(uint32_t x, uint32_t y, uint32_t z, Gna2DataTy
Gna2Tensor * createGna2Tensor1D(uint32_t x, uint32_t byteSize, void* data) {
const auto input = reinterpret_cast<Gna2Tensor*>(gnaUserAllocator(sizeof(Gna2Tensor)));
IE_ASSERT(input != nullptr);
*input = HelperGna2TensorInit1D(x, Gna2DataTypeFromBytes(byteSize), data);
return input;
}
@@ -74,6 +75,7 @@ Gna2Tensor * createGna2TensorPwl(uint32_t x, void* data) {
Gna2Tensor * createGna2BiasTensor1D(uint32_t x, uint32_t byteSize, void* data) {
const auto input = reinterpret_cast<Gna2Tensor*>(gnaUserAllocator(sizeof(Gna2Tensor)));
IE_ASSERT(input != nullptr);
if (byteSize == 8) {
*input = HelperGna2TensorInit1D(x, Gna2DataTypeCompoundBias, data);
} else {
@@ -84,24 +86,28 @@ Gna2Tensor * createGna2BiasTensor1D(uint32_t x, uint32_t byteSize, void* data) {
Gna2Tensor * createGna2Tensor2D(uint32_t x, uint32_t y, uint32_t byteSize, void* data) {
const auto input = reinterpret_cast<Gna2Tensor*>(gnaUserAllocator(sizeof(Gna2Tensor)));
IE_ASSERT(input != nullptr);
*input = HelperGna2TensorInit2D(x, y, Gna2DataTypeFromBytes(byteSize), data);
return input;
}
Gna2Tensor * createGna2Tensor3D(uint32_t x, uint32_t y, uint32_t z, uint32_t byteSize, void* data) {
const auto input = reinterpret_cast<Gna2Tensor*>(gnaUserAllocator(sizeof(Gna2Tensor)));
IE_ASSERT(input != nullptr);
*input = HelperGna2TensorInit3D(x, y, z, Gna2DataTypeFromBytes(byteSize), data);
return input;
}
uint32_t* create_uint32_parameter(uint32_t value) {
const auto param = reinterpret_cast<uint32_t*>(gnaUserAllocator(sizeof(uint32_t)));
IE_ASSERT(param != nullptr);
*param = value;
return param;
}
Gna2Shape* create_shape1D_parameter(uint32_t x) {
const auto shp = reinterpret_cast<Gna2Shape*>(gnaUserAllocator(sizeof(Gna2Shape)));
IE_ASSERT(shp != nullptr);
shp->NumberOfDimensions = 1;
shp->Dimensions[0] = x;
return shp;

View File

@@ -25,7 +25,7 @@
#include "gna_plugin_log.hpp"
uint8_t* GNADeviceHelper::alloc(uint32_t size_requested, uint32_t *size_granted) {
void * memPtr;
void * memPtr = nullptr;
#if GNA_LIB_VER == 1
memPtr = GNAAlloc(nGNAHandle, size_requested, size_granted);
#else

View File

@@ -337,6 +337,7 @@ void GNAGraphCompiler::ConvolutionPrimitive(InferenceEngine::CNNLayerPtr layer)
void GNAGraphCompiler::PowerPrimitive(InferenceEngine::CNNLayerPtr layer) {
auto& power = dynamic_cast<PowerLayer&>(*layer.get());
auto quantized = InferenceEngine::getInjectedData<QuantizedLayerParams>(layer);
IE_ASSERT(gnaFlags->sw_fp32 ? (quantized == nullptr) : (quantized != nullptr));
if (power.power != 1.0) {
THROW_IE_EXCEPTION << "[GNA plugin] unsupported power factor, expected 1 but was " << power.power;
@@ -386,29 +387,14 @@ void GNAGraphCompiler::PowerPrimitive(InferenceEngine::CNNLayerPtr layer) {
if (gnaFlags->sw_fp32) {
gnamem->readonly().push_value(ptr_weights, power.scale, num_rows_out, 64);
gnamem->readonly().push_value(ptr_biases, power.scale, num_rows_out, 64);
gnamem->readonly().push_value(ptr_biases, power.offset, num_rows_out, 64);
} else {
auto weightsScaledIdentity = power.scale;
auto biasesScaledIdentity = power.scale;
if (quantized != nullptr) {
weightsScaledIdentity = quantized->_weights_quant.scale * weightsScaledIdentity;
biasesScaledIdentity = quantized->_bias_quant.scale * biasesScaledIdentity;
}
auto weightQuantizedIdentity = FLOAT_TO_INT16(std::min(weightsScaledIdentity, static_cast<float>(INT16_MAX)));
auto biasesQuantizedIdentity = FLOAT_TO_INT16(std::min(biasesScaledIdentity, static_cast<float>(INT16_MAX)));
gnamem->readonly().push_value<int16_t>(ptr_weights, weightQuantizedIdentity, num_rows_out, 64);
gnamem->readonly().push_value<int32_t>(ptr_biases, biasesQuantizedIdentity, num_rows_out, 64);
}
if (power.offset != 0.0f) {
if (quantized == nullptr) {
gnamem->readonly().push_value(ptr_biases, 0.0f, num_rows_out, 64);
} else {
gnamem->readonly().push_value<int32_t>(ptr_biases, 0, num_rows_out, 64);
}
} else {
gnamem->readonly().push_value(ptr_biases, 0.0f, num_rows_out, 64);
auto quantizedScale = FLOAT_TO_INT16(std::min(quantized->_weights_quant.scale * power.scale,
static_cast<float>(INT16_MAX)));
auto quantizedOffset = FLOAT_TO_INT32(std::min(quantized->_dst_quant.scale * power.offset,
static_cast<float>(INT32_MAX)));
gnamem->readonly().push_value<int16_t>(ptr_weights, quantizedScale, num_rows_out, 64);
gnamem->readonly().push_value<int32_t>(ptr_biases, quantizedOffset, num_rows_out, 64);
}
}
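The rewritten quantization branch above clamps the scaled weight and offset against `INT16_MAX`/`INT32_MAX` before converting. The `FLOAT_TO_INT16`/`FLOAT_TO_INT32` macros themselves are not shown in this diff; a minimal sketch of the saturating conversion they are assumed to perform (names here are illustrative, not the plugin's actual definitions):

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>

// Clamp to the target integer range first, then round, so oversized scale
// factors saturate instead of overflowing (which would be undefined behavior).
static int16_t quantize_to_int16(float v) {
    v = std::min(v, static_cast<float>(INT16_MAX));
    v = std::max(v, static_cast<float>(INT16_MIN));
    return static_cast<int16_t>(std::lround(v));
}

// double is used here because float cannot represent INT32_MAX exactly.
static int32_t quantize_to_int32(double v) {
    v = std::min(v, static_cast<double>(INT32_MAX));
    v = std::max(v, static_cast<double>(INT32_MIN));
    return static_cast<int32_t>(std::llround(v));
}
```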
@@ -1417,6 +1403,7 @@ void GNAGraphCompiler::PermutePrimitive(InferenceEngine::CNNLayerPtr layer) {
}
auto layerOrder = layer->GetParamAsInts("order");
auto quantized = InferenceEngine::getInjectedData<QuantizedLayerParams>(layer);
IE_ASSERT(!layer->insData.empty());
auto inputs = layer->insData.begin()->lock();
auto inputsOrder = inputs->getTensorDesc().getDims();
auto outputs = layer->outData.front();

View File

@@ -176,6 +176,63 @@ inline std::pair<InferenceEngine::CNNLayerPtr, int> CNNNetCheckNextLayerSkipCer
return CNNNetCheckNextLayerSkipCertain(outLayer->second, 0, 0, bOnlyCheck, shouldSkip);
}
/**
 * @brief returns all layers reachable from the given one
 * @param layer - starting layer
 * @param oDataIdx - output data index; -1 means iterate over all outData indexes
 * @param shouldSkip - predicate marking layers to be skipped and traversed through
 * @return the first non-skipped layers reached from the given one
*/
template <class Layer>
inline std::vector<CNNLayerPtr> CNNNetGetAllNextLayersSkipCertain(Layer layer, int oDataIdx, const std::function<bool(CNNLayerPtr)> &shouldSkip) {
// TODO: need a generic function that creates a slice of the graph: starting from the given layer
// and skipping all non-functional layers, ending up at functional ones
std::list<CNNLayerPtr> currentSet;
std::vector<CNNLayerPtr> resultSet;
std::vector<std::map<std::string, CNNLayerPtr>> start;
if (oDataIdx == -1) {
for (int i = 0; i != layer->outData.size(); i++) {
start.push_back(layer->outData[i]->getInputTo());
}
} else {
start.push_back(layer->outData[oDataIdx]->getInputTo());
}
auto separate_layers = [&currentSet, &resultSet, &shouldSkip](std::map<std::string, CNNLayerPtr>& inputTo) {
for (auto &&bfsLayer : inputTo) {
if (shouldSkip(bfsLayer.second)) {
currentSet.push_back(bfsLayer.second);
continue;
}
resultSet.push_back(bfsLayer.second);
}
};
int startIdx, endIdx;
if (oDataIdx == -1) {
startIdx = 0;
endIdx = layer->outData.size();
} else {
startIdx = oDataIdx;
endIdx = oDataIdx + 1;
}
for (int i = startIdx; i != endIdx; i++) {
separate_layers(layer->outData[i]->getInputTo());
}
while (!currentSet.empty()) {
auto currentLayer = currentSet.front();
currentSet.pop_front();
for (auto && oData : currentLayer->outData) {
separate_layers(oData->getInputTo());
}
}
return resultSet;
}
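The new helper above collects, for each output edge, the first functional consumers while treating skipped layers as transparent pass-throughs. The same traversal can be sketched on a toy graph type (`Node` and `nextSkipping` are illustrative stand-ins, not plugin API):

```cpp
#include <cassert>
#include <functional>
#include <list>
#include <map>
#include <string>
#include <vector>

// Toy node standing in for CNNLayer: each node knows the nodes its
// outputs feed into, keyed by name (mirroring outData->getInputTo()).
struct Node {
    std::string name;
    bool functional = true;
    std::map<std::string, Node*> consumers;
};

// Collect the first *non-skipped* successors; skipped nodes are pushed onto
// the frontier and traversed through, just like in the diff above.
std::vector<Node*> nextSkipping(Node* start, const std::function<bool(Node*)>& shouldSkip) {
    std::list<Node*> frontier;
    std::vector<Node*> result;
    auto separate = [&](const std::map<std::string, Node*>& consumers) {
        for (auto&& kv : consumers) {
            if (shouldSkip(kv.second)) frontier.push_back(kv.second);
            else result.push_back(kv.second);
        }
    };
    separate(start->consumers);
    while (!frontier.empty()) {
        Node* cur = frontier.front();
        frontier.pop_front();
        separate(cur->consumers);
    }
    return result;
}
```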
/// @brief alias for strict checkNextLayer (false)
template <class Layer>
inline std::pair<InferenceEngine::CNNLayerPtr, int> CNNNetGetNextLayerSkipCertain(Layer layer, int oidx, int iidx,
@@ -474,7 +531,31 @@ inline void CNNNetworkInsertLayer(CNNLayerPtr after,
}
/**
* @brief remove givven layer from topology, currently only layers with one input data and one output data supported
* @brief returns previous layers and outData index for it
* @tparam T
* @param origin
* @param acceptanceCriteria
* @param idx
*/
template <class T>
std::vector<std::pair<CNNLayerPtr, int> > CNNNetGetPrevLayersSkip(CNNLayerPtr origin, const T &acceptanceCriteria, int idx = -1) {
std::vector<std::pair<CNNLayerPtr, int> > prevLayers;
for (int i = idx == -1 ? 0 : idx; CNNNetHasPrevLayer(origin.get(), i) && (idx == -1 || i == idx); i++) {
auto prevLayer = CNNNetPrevLayer(origin, i);
if (acceptanceCriteria(prevLayer)) {
prevLayers.push_back({prevLayer, CNNLayerFindOutDataIdx(origin, i)});
} else {
// if for some input we need to look in upper layers - original index not used here intentionally
auto prevPrevLayers = CNNNetGetPrevLayersSkip(prevLayer, acceptanceCriteria);
prevLayers.insert(prevLayers.end(), prevPrevLayers.begin(), prevPrevLayers.end());
}
}
return prevLayers;
}
/**
* @brief remove given layer from topology, currently only layers with one input data and one output data supported
*/
inline void CNNNetworkRemoveLayer(CNNLayerPtr layer) {
if (!layer) {

View File

@@ -8,6 +8,9 @@
#include <ios>
#include <iomanip>
#include <map>
#include <ie_algorithm.hpp>
#include <ie_common.h>
#include <ie_precision.hpp>
#if defined __INTEL_COMPILER || defined _MSC_VER
#include <malloc.h>
@@ -119,15 +122,26 @@ const std::map<Gna2OperationType, std::vector<uint32_t>> GnaParamSize{
sizeof(Gna2Shape),
sizeof(Gna2Shape)}},
{Gna2OperationTypeCopy, {sizeof(Gna2Shape)}},
{Gna2OperationTypeTransposition, {sizeof(Gna2Shape)}},
};
void GNAModelSerial::Import(void *basePointer, size_t gnaGraphSize, std::istream & is) {
void GNAModelSerial::Import(void *basePointer,
size_t gnaGraphSize,
std::istream & is,
std::shared_ptr<GNAPluginNS::InputDesc> inputsDesc,
std::vector<GNAPluginNS::OutputDesc> &desc,
InferenceEngine::InputsDataMap& inputsDataMap,
InferenceEngine::OutputsDataMap& outputsDataMap) {
is.exceptions(std::istream::failbit);
ImportInputs(is, basePointer, inputsDesc, inputsDataMap);
ImportOutputs(is, basePointer, desc, outputsDataMap);
for (auto operation = gna2Model->Operations; operation != gna2Model->Operations + gna2Model->NumberOfOperations; ++operation) {
readNBits<32>(operation->Type, is);
readBits(operation->NumberOfOperands, is);
operation->Operands = static_cast<Gna2Tensor const **>(gnaUserAllocator(sizeof(Gna2Tensor*) * operation->NumberOfOperands));
IE_ASSERT(operation->Operands != nullptr);
for (uint32_t i = 0; i < operation->NumberOfOperands; i++) {
Gna2Tensor t{};
readBits(t, is);
@@ -145,11 +159,10 @@ void GNAModelSerial::Import(void *basePointer, size_t gnaGraphSize, std::istream
case Gna2OperationTypeFullyConnectedAffine:
case Gna2OperationTypeConvolution:
case Gna2OperationTypeCopy:
case Gna2OperationTypeTransposition:
break;
case Gna2OperationTypeRecurrent:
THROW_GNA_EXCEPTION << "Importing of recurrent operation not supported";
case Gna2OperationTypeTransposition:
THROW_GNA_EXCEPTION << "Importing of transposition operation not supported";
default:
THROW_GNA_EXCEPTION << "Importing of unknown GNA operation type(" << operation->Type << ") not supported";
}
@@ -158,8 +171,9 @@ void GNAModelSerial::Import(void *basePointer, size_t gnaGraphSize, std::istream
else
operation->Parameters = nullptr;
for (uint32_t i = 0; i < operation->NumberOfParameters; i++) {
uint32_t paramSize;
uint32_t paramSize = 0;
readBits(paramSize, is);
IE_ASSERT(operation->Parameters != nullptr);
if (paramSize == 0) {
operation->Parameters[i] = nullptr;
continue;
@@ -235,11 +249,12 @@ void GNAModelSerial::Export(void * basePointer, size_t gnaGraphSize, std::ostrea
};
auto convert_to_serial = [getOffsetFromBase](const GNAModelSerial::RuntimeEndPoint& ep) {
ModelHeader::EndPoint out;
RuntimeEndPoint out;
out.elements_count = ep.elements_count;
out.descriptor_offset = offsetFromBase(ep.descriptor_ptr);
out.scaleFactor = ep.scaleFactor;
out.element_size = ep.element_size;
out.orientation = ep.orientation;
return out;
};
/**
@@ -256,15 +271,21 @@ void GNAModelSerial::Export(void * basePointer, size_t gnaGraphSize, std::ostrea
header.gnaMemSize = gnaGraphSize;
header.layersCount = layers.size();
header.nGroup = guessGrouping(*gna2Model);
header.input = convert_to_serial(input);
header.output = convert_to_serial(output);
header.nInputs = inputs.size();
header.nOutputs = outputs.size();
header.nRotateRows = nRotateRows;
header.nRotateColumns = nRotateColumns;
writeBits(header, os);
for (const auto &input : inputs) {
writeBits(convert_to_serial(input), os);
}
for (const auto &output : outputs) {
writeBits(convert_to_serial(output), os);
}
for (const auto & layer : layers) {
writeBits(static_cast<uint32_t>(layer.Type), os);
writeBits(layer.NumberOfOperands, os);
@@ -284,11 +305,10 @@ void GNAModelSerial::Export(void * basePointer, size_t gnaGraphSize, std::ostrea
case Gna2OperationTypeFullyConnectedAffine:
case Gna2OperationTypeConvolution:
case Gna2OperationTypeCopy:
case Gna2OperationTypeTransposition:
break;
case Gna2OperationTypeRecurrent:
THROW_GNA_EXCEPTION << "Exporting of recurrent operation not supported";
case Gna2OperationTypeTransposition:
THROW_GNA_EXCEPTION << "Exporting of interleave operation not supported";
default:
THROW_GNA_EXCEPTION << "Exporting of unknown GNA operation type(" << layer.Type << ") not supported";
}
@@ -314,9 +334,18 @@ void GNAModelSerial::Export(void * basePointer, size_t gnaGraphSize, std::ostrea
}
#else
void GNAModelSerial::Import(void *basePointer, size_t gnaGraphSize, std::istream & is) {
void GNAModelSerial::Import(void *basePointer,
size_t gnaGraphSize,
std::istream & is,
std::shared_ptr<GNAPluginNS::InputDesc> inputsDesc,
std::vector<GNAPluginNS::OutputDesc> &desc,
InferenceEngine::InputsDataMap& inputsDataMap,
InferenceEngine::OutputsDataMap& outputsDataMap) {
is.exceptions(std::istream::failbit);
ImportInputs(is, basePointer, inputsDesc, inputsDataMap);
ImportOutputs(is, basePointer, desc, outputsDataMap);
auto readPwl = [&is, basePointer](intel_pwl_func_t & value) {
readBits(value.nSegments, is);
if (value.nSegments != 0) {
@@ -466,11 +495,12 @@ void GNAModelSerial::Export(void * basePointer, size_t gnaGraphSize, std::ostrea
};
auto convert_to_serial = [getOffsetFromBase](const GNAModelSerial::RuntimeEndPoint& ep){
ModelHeader::EndPoint out;
RuntimeEndPoint out;
out.elements_count = ep.elements_count;
out.element_size = ep.element_size;
out.descriptor_offset = offsetFromBase(ep.descriptor_ptr);
out.scaleFactor = ep.scaleFactor;
out.orientation = ep.orientation;
return out;
};
/**
@@ -486,14 +516,16 @@ void GNAModelSerial::Export(void * basePointer, size_t gnaGraphSize, std::ostrea
header.gnaMemSize = gnaGraphSize;
header.layersCount = layers.size();
header.nGroup = ptr_nnet->nGroup;
header.input = convert_to_serial(input);
header.output = convert_to_serial(output);
header.nInputs = 1;
header.nOutputs = 1;
header.headerSize = sizeof(ModelHeader);
header.nRotateRows = nRotateRows;
header.nRotateColumns = nRotateColumns;
writeBits(header, os);
writeBits(convert_to_serial(inputs[0]), os);
writeBits(convert_to_serial(outputs[0]), os);
for (auto & layer : layers) {
writeBits(layer.nInputColumns, os);
@@ -572,3 +604,108 @@ void GNAModelSerial::Export(void * basePointer, size_t gnaGraphSize, std::ostrea
}
#endif
std::vector<GNAModelSerial::RuntimeEndPoint> GNAModelSerial::serializeOutputs(const InferenceEngine::OutputsDataMap& outputsDataMap,
const std::vector<GNAPluginNS::OutputDesc>& outputsDesc) {
std::vector<GNAModelSerial::RuntimeEndPoint> endPoints;
std::size_t outputIndex = 0;
for (auto const &output : outputsDataMap) {
auto outputName = output.first;
auto inputDims = output.second->getTensorDesc().getDims();
uint32_t elementsCount = static_cast<uint32_t>(InferenceEngine::details::product(inputDims.begin(), inputDims.end()));
GNAModelSerial::RuntimeEndPoint endPoint(outputsDesc[outputIndex].scale_factor,
outputsDesc[outputIndex].ptrs[0],
outputsDesc[outputIndex].num_bytes_per_element,
elementsCount,
outputsDesc[outputIndex].orientation);
endPoints.push_back(endPoint);
outputIndex++;
}
return endPoints;
}
std::vector<GNAModelSerial::RuntimeEndPoint> GNAModelSerial::serializeInputs(const InferenceEngine::InputsDataMap& inputsDataMap,
std::shared_ptr<GNAPluginNS::InputDesc> inputDesc) {
std::vector<GNAModelSerial::RuntimeEndPoint> endPoints;
std::size_t inputIndex = 0;
for (auto const& input : inputsDataMap) {
auto inputName = input.first;
auto inputDims = input.second->getTensorDesc().getDims();
double scaleFactor = inputDesc->getScaleFactor(inputIndex);
std::vector<void *> descriptor_ptr = inputDesc->getPtrInputsGlobal(inputName);
IE_ASSERT(descriptor_ptr.size() > 0);
uint32_t element_size = 2u;
uint32_t elementsCount = static_cast<uint32_t>(InferenceEngine::details::product(inputDims.begin(), inputDims.end()));
intel_dnn_orientation_t orientation = inputDesc->getOrientation(inputName);
GNAModelSerial::RuntimeEndPoint endPoint(scaleFactor,
descriptor_ptr[0],
element_size,
elementsCount,
orientation);
endPoints.push_back(endPoint);
inputIndex++;
}
return endPoints;
}
void GNAModelSerial::ImportInputs(std::istream &is,
void* basePtr,
std::shared_ptr<GNAPluginNS::InputDesc> inputsDesc,
InferenceEngine::InputsDataMap& dataMap) {
dataMap.clear();
for (auto inputIndex = 0; inputIndex < modelHeader.nInputs; inputIndex++) {
std::string name = "input" + std::to_string(inputIndex);
RuntimeEndPoint input;
is.read(reinterpret_cast<char *>(&input), sizeof(input));
inputsDesc->getPtrInputsGlobal(name).push_back(reinterpret_cast<float*>(reinterpret_cast<uint8_t *> (basePtr) + input.descriptor_offset));
inputsDesc->orientation_in[name] = input.orientation;
auto inputDims = InferenceEngine::SizeVector({modelHeader.nGroup, input.elements_count / modelHeader.nGroup});
dataMap[name] = std::make_shared<InferenceEngine::InputInfo>();
dataMap[name]->setInputData(std::make_shared<InferenceEngine::Data>(name,
InferenceEngine::TensorDesc(
InferenceEngine::Precision::FP32,
inputDims,
InferenceEngine::Layout::NC)));
inputsDesc->inputScaleFactors.push_back(input.scaleFactor);
}
}
void GNAModelSerial::ImportOutputs(std::istream &is,
void* basePtr,
std::vector<GNAPluginNS::OutputDesc> &desc,
InferenceEngine::OutputsDataMap& dataMap) {
desc.clear();
dataMap.clear();
desc.resize(modelHeader.nOutputs);
for (auto outputIndex = 0; outputIndex < modelHeader.nOutputs; outputIndex++) {
std::string name = "output" + std::to_string(outputIndex);
RuntimeEndPoint output;
is.read(reinterpret_cast<char *>(&output), sizeof(output));
GNAPluginNS::OutputDesc description;
description.ptrs.push_back(reinterpret_cast<float*>(reinterpret_cast<uint8_t *> (basePtr) + output.descriptor_offset));
description.orientation = kDnnInterleavedOrientation;
description.orientation = output.orientation;
description.num_bytes_per_element = output.element_size;
description.scale_factor = output.scaleFactor;
auto outputDims = InferenceEngine::SizeVector({modelHeader.nGroup, output.elements_count / modelHeader.nGroup});
dataMap[name] = std::make_shared<InferenceEngine::Data>(name,
InferenceEngine::TensorDesc(
InferenceEngine::Precision::FP32,
outputDims,
InferenceEngine::Layout::NC));
desc.at(outputIndex) = description;
}
}
void GNAModelSerial::setHeader(ModelHeader header) {
modelHeader = header;
}
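The Import/Export paths above stream endpoint records with `readBits`/`writeBits`. Those helpers are not shown in this diff; they are assumed to be raw byte copies of trivially copyable PODs, roughly:

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <type_traits>

// Raw-byte POD serialization, matching how the serializer reads/writes
// headers and endpoint records. (Assumed shape of the real helpers.)
template <class T>
void writeBits(const T& value, std::ostream& os) {
    static_assert(std::is_trivially_copyable<T>::value, "raw copy needs trivially copyable types");
    os.write(reinterpret_cast<const char*>(&value), sizeof(T));
}

template <class T>
void readBits(T& value, std::istream& is) {
    static_assert(std::is_trivially_copyable<T>::value, "raw copy needs trivially copyable types");
    is.read(reinterpret_cast<char*>(&value), sizeof(T));
}

// Endpoint-like POD for the round-trip demonstration below.
struct Endpoint {
    float scaleFactor;
    uint64_t descriptor_offset;
    uint32_t element_size;
    uint32_t elements_count;
};
```

Note this scheme is only as portable as the struct layout: endianness, padding, and field order must match between the exporting and importing builds, which is one reason the header carries a version number.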

View File

@@ -7,7 +7,10 @@
#include <istream>
#include <vector>
#include <utility>
#include "gna-api.h"
#include <gna-api.h>
#include "descriptions/gna_input_desc.hpp"
#include "descriptions/gna_output_desc.hpp"
#include "gna_plugin_log.hpp"
#if GNA_LIB_VER == 2
#include "gna2-model-api.h"
@@ -20,18 +23,19 @@
* 1.0 - basic support
* 1.1 - added memory information
* 2.0 - for use with GNA2 library
* 2.1 - multiple i/o support
*/
#if GNA_LIB_VER == 2
#define HEADER_MAJOR 2
#define HEADER_MINOR 0
#define HEADER_MINOR 1
#else
#define HEADER_MAJOR 1
#define HEADER_MINOR 1
#define HEADER_MINOR 2
#endif
/**
* @brief Header version 1.0
* @brief Header version 2.1
*/
struct ModelHeader {
/**
@@ -74,27 +78,8 @@ struct ModelHeader {
uint32_t nRotateRows = 0u;
uint32_t nRotateColumns = 0u;
struct EndPoint {
/**
* if the scale factor is different than the one passed into infer, the network might need to be requantized
*/
float scaleFactor = 0.f;
/**
* Offset in bytes of pointer descriptor
*/
uint64_t descriptor_offset = 0ull;
/**
* Endpoint resolution in bytes.
*/
uint32_t element_size = 0u;
/**
* Number of elements
*/
uint32_t elements_count = 0u;
};
EndPoint input;
EndPoint output;
uint32_t nInputs = 0u;
uint32_t nOutputs = 0u;
/**
* Reserved Data might be here
@@ -127,15 +112,23 @@ class GNAModelSerial {
* Number of elements
*/
uint32_t elements_count = 0;
/**
* Offset in bytes of pointer descriptor
*/
uint64_t descriptor_offset = 0ull;
intel_dnn_orientation_t orientation = kDnnUnknownOrientation;
RuntimeEndPoint() = default;
RuntimeEndPoint(double scaleFactor,
void* descriptor_ptr,
uint32_t element_size,
uint32_t elements_count) : scaleFactor(scaleFactor),
uint32_t elements_count,
intel_dnn_orientation_t orientation) : scaleFactor(scaleFactor),
descriptor_ptr(descriptor_ptr),
element_size(element_size),
elements_count(elements_count) {
elements_count(elements_count),
orientation(orientation) {
}
};
using MemoryType = std::vector<std::pair<void*, uint32_t>>;
@@ -146,11 +139,23 @@ private:
#else
intel_nnet_type_t *ptr_nnet;
#endif
RuntimeEndPoint input, output;
std::vector<RuntimeEndPoint> inputs;
std::vector<RuntimeEndPoint> outputs;
uint32_t nRotateRows = 0;
uint32_t nRotateColumns = 0;
MemoryType states, *pstates = nullptr;
ModelHeader modelHeader;
void ImportInputs(std::istream &is,
void* basePtr,
std::shared_ptr<GNAPluginNS::InputDesc> inputsDesc,
InferenceEngine::InputsDataMap& dataMap);
void ImportOutputs(std::istream &is,
void* basePtr,
std::vector<GNAPluginNS::OutputDesc> &desc,
InferenceEngine::OutputsDataMap& dataMap);
public:
#if GNA_LIB_VER == 2
@@ -160,8 +165,12 @@ private:
GNAModelSerial(
Gna2Model * model,
RuntimeEndPoint input,
RuntimeEndPoint output) : gna2Model(model), input(input), output(output) {
const std::shared_ptr<GNAPluginNS::InputDesc> inputDesc,
const std::vector<GNAPluginNS::OutputDesc>& outputsDesc,
const InferenceEngine::InputsDataMap& inputsDataMap,
const InferenceEngine::OutputsDataMap& outputsDataMap) : gna2Model(model),
inputs(serializeInputs(inputsDataMap, inputDesc)),
outputs(serializeOutputs(outputsDataMap, outputsDesc)) {
}
#else
@@ -183,8 +192,12 @@ private:
*/
GNAModelSerial(
intel_nnet_type_t *ptr_nnet,
RuntimeEndPoint input,
RuntimeEndPoint output) : ptr_nnet(ptr_nnet), input(input), output(output) {
const std::shared_ptr<GNAPluginNS::InputDesc> inputDesc,
const std::vector<GNAPluginNS::OutputDesc>& outputsDesc,
const InferenceEngine::InputsDataMap& inputsDataMap,
const InferenceEngine::OutputsDataMap& outputsDataMap) : ptr_nnet(ptr_nnet),
inputs(serializeInputs(inputsDataMap, inputDesc)),
outputs(serializeOutputs(outputsDataMap, outputsDesc)) {
}
#endif
@@ -219,7 +232,13 @@ private:
* @param basePointer
* @param is - stream without header structure - TBD: header might be needed
*/
void Import(void *basePointer, size_t gnaGraphSize, std::istream &is);
void Import(void *basePointer,
size_t gnaGraphSize,
std::istream & is,
std::shared_ptr<GNAPluginNS::InputDesc> inputsDesc,
std::vector<GNAPluginNS::OutputDesc> &desc,
InferenceEngine::InputsDataMap& inputsDataMap,
InferenceEngine::OutputsDataMap& outputsDataMap);
/**
* save gna graph to an outpus stream
@@ -231,4 +250,13 @@ private:
void Export(void *basePtr,
size_t gnaGraphSize,
std::ostream &os) const;
static std::vector<GNAModelSerial::RuntimeEndPoint> serializeOutputs(const InferenceEngine::OutputsDataMap& outputsDataMap,
const std::vector<GNAPluginNS::OutputDesc>& outputsDesc);
static std::vector<GNAModelSerial::RuntimeEndPoint> serializeInputs(const InferenceEngine::InputsDataMap& inputsDataMap,
const std::shared_ptr<GNAPluginNS::InputDesc>);
void setHeader(ModelHeader header);
};

View File

@@ -373,6 +373,7 @@ void GNAPlugin::LoadNetwork(ICNNNetwork &network) {
passes->registerPass<InsertDiagonalLayerPass>();
passes->registerPass<HandleMultipleActivationsForTheLayerPass>();
passes->registerPass<SubstituteScaleShiftBroadCastPass>();
passes->registerPass<FuseMultipleIdentitiesPass>();
passIdx = passes->run(passIdx);
};
@@ -1140,13 +1141,15 @@ InferenceEngine::IExecutableNetwork::Ptr GNAPlugin::ImportNetwork(const std::str
#else
auto serial = GNAModelSerial(&std::get<0>(nnets.back())->obj, mt);
#endif
serial.Import(basePtr, header.gnaMemSize, inputStream);
inputsDesc->getPtrInputsGlobal("input").push_back(reinterpret_cast<float*>(reinterpret_cast<uint8_t *> (basePtr) + header.input.descriptor_offset));
// TODO: import of multioutput network not supported
outputsDesc.resize(1);
auto &outputDesc = outputsDesc.front();
outputDesc.ptrs.push_back(reinterpret_cast<float*>(reinterpret_cast<uint8_t *> (basePtr) + header.output.descriptor_offset));
serial.setHeader(header);
serial.Import(basePtr,
header.gnaMemSize,
inputStream,
inputsDesc,
outputsDesc,
inputsDataMap,
outputsDataMap);
#if GNA_LIB_VER == 2
auto getOrientation = [](Gna2Operation & gnaOperation) {
@@ -1160,32 +1163,10 @@ InferenceEngine::IExecutableNetwork::Ptr GNAPlugin::ImportNetwork(const std::str
};
#endif
#if GNA_LIB_VER == 2
inputsDesc->orientation_in["input"] = getOrientation(std::get<0>(gnaModels.back())->obj.Operations[0]);
outputDesc.orientation = getOrientation(std::get<0>(gnaModels.back())->obj.Operations[std::get<0>(gnaModels.back())->obj.NumberOfOperations - 1]);
#else
#if GNA_LIB_VER == 1
inputsDesc->orientation_in["input"] = getOrientation(std::get<0>(nnets.back())->obj.pLayers[0]);
outputDesc.orientation = getOrientation(std::get<0>(nnets.back())->obj.pLayers[std::get<0>(nnets.back())->obj.nLayers - 1]);
outputsDesc[0].orientation = getOrientation(std::get<0>(nnets.back())->obj.pLayers[std::get<0>(nnets.back())->obj.nLayers - 1]);
#endif
outputDesc.num_bytes_per_element = header.output.element_size;
auto outputDims = SizeVector({header.nGroup, header.output.elements_count / header.nGroup});
auto inputDims = SizeVector({header.nGroup, header.input.elements_count / header.nGroup});
inputsDataMap["input"] = std::make_shared<InputInfo>();
inputsDataMap["input"]->setInputData(make_shared<Data>("input",
TensorDesc(
Precision::FP32,
inputDims,
Layout::NC)));
outputsDataMap["output"] = make_shared<Data>("output",
TensorDesc(
Precision::FP32,
outputDims,
Layout::NC));
outputDesc.scale_factor = header.output.scaleFactor;
inputsDesc->inputScaleFactors.push_back(header.input.scaleFactor);
num_rotate_rows = header.nRotateRows;
num_rotate_columns = header.nRotateColumns;
@@ -1214,9 +1195,11 @@ void GNAPlugin::Export(const std::string &fileName) {
THROW_GNA_EXCEPTION << " network not loaded";
}
#if GNA_LIB_VER == 1
if (inputsDesc->ptr_inputs_global_id.size() != 1) {
THROW_GNA_EXCEPTION << " exporting network with multiple inputs not supported";
}
#endif
std::fstream outStream(fileName, ios_base::out | ios_base::binary);
@@ -1229,19 +1212,16 @@ void GNAPlugin::Export(const std::string &fileName) {
#endif
}
#if GNA_LIB_VER == 2
auto serial = GNAModelSerial(&std::get<0>(gnaModels.front())->obj,
Gna2Model* modelToSerial = &std::get<0>(gnaModels.front())->obj;
#else
auto serial = GNAModelSerial(&std::get<0>(nnets.front())->obj,
intel_nnet_type_t* modelToSerial = &std::get<0>(nnets.front())->obj;
#endif
{inputsDesc->inputScaleFactors.front(),
inputsDesc->ptr_inputs_global_storage.front()[0],
2,
static_cast<uint32_t>(InferenceEngine::details::product(inputsDataMap.begin()->second->getTensorDesc().getDims()))},
{outputsDesc.front().scale_factor,
outputsDesc.front().ptrs.front(),
outputsDesc.front().num_bytes_per_element,
static_cast<uint32_t>(InferenceEngine::details::product(outputsDataMap.begin()->second->getTensorDesc().getDims()))})
.SetInputRotation(dnn->num_rotate_rows, dnn->num_rotate_columns);
auto serial = GNAModelSerial(modelToSerial,
inputsDesc,
outputsDesc,
inputsDataMap,
outputsDataMap)
.SetInputRotation(dnn->num_rotate_rows, dnn->num_rotate_columns);
for (auto && memoryConnection : graphCompiler.memory_connection) {
serial.AddState(memoryConnection.second.gna_ptr, memoryConnection.second.reserved_size);

View File

@@ -71,7 +71,7 @@ void Config::UpdateFromMap(const std::map<std::string, std::string>& config) {
key.erase(0, 1);
try {
input_index = std::stoi(key);
if (input_index < 0 | input_index > 99) {
if (input_index > 99) {
throw std::out_of_range("");
}
} catch (std::invalid_argument&) {
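The fixed condition above drops the `input_index < 0` arm (the original also used bitwise `|` where logical `||` was intended, and an index parsed from the key's digit suffix cannot be negative). A hedged sketch of the intended validation (`parseInputIndex` is an illustrative name, not the plugin's function):

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Parse a per-input key suffix such as "5" and reject anything outside
// [0, 99]. std::stoi throws std::invalid_argument on non-numeric input.
int parseInputIndex(const std::string& key) {
    int idx = std::stoi(key);
    if (idx < 0 || idx > 99) {  // logical ||, not bitwise |
        throw std::out_of_range("input index out of [0, 99]");
    }
    return idx;
}
```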

View File

@@ -107,6 +107,9 @@ class LayerInfo {
bool isConcatAlignFilter() const noexcept {
return isOfType("ConcatAlignFilter");
}
bool isLink() const noexcept {
return isOfType("Link");
}
bool isAffineFilter() const noexcept {
return isOfType("AffineFilter");
}
@@ -204,6 +207,7 @@ class LayerInfo {
if (layerOrder == std::vector<int>({ 0, 3, 2, 1 })) {
return true; // supported case
}
IE_ASSERT(!layer->insData.empty());
auto inputs = layer->insData.begin()->lock();
auto inputsOrder = inputs->getTensorDesc().getDims();

View File

@@ -40,7 +40,6 @@ public:
// length of current cycle
std::list<cnt_type> permuteCycles;
int seqId = 0;
bool newSeq = false;
for (int i = 0; i != orderVec.size();) {

View File

@@ -609,31 +609,6 @@ void InsertIdentityLayerPass::run() {
}
}
/**
* @brief returns previous layers and insData index for it
* @tparam T
* @param origin
* @param acceptanceCriteria
* @param idx
*/
// give previous layers while skipping certain layer according to expression
template <class T>
std::vector<std::pair<CNNLayerPtr, int> > CNNNetGetPrevLayersSkip(CNNLayerPtr origin, const T &acceptanceCriteria, int idx = -1) {
std::vector<std::pair<CNNLayerPtr, int> > prevLayers;
for (int i = idx == -1 ? 0 : idx; CNNNetHasPrevLayer(origin.get(), i) && (idx == -1 || i == idx); i++) {
auto prevLayer = CNNNetPrevLayer(origin, i);
if (acceptanceCriteria(prevLayer)) {
prevLayers.push_back({prevLayer, CNNLayerFindOutDataIdx(origin, i)});
} else {
// if for some input we need to look in upper layers - original index not used here intentionally
auto prevPrevLayers = CNNNetGetPrevLayersSkip(prevLayer, acceptanceCriteria);
prevLayers.insert(prevLayers.end(), prevPrevLayers.begin(), prevPrevLayers.end());
}
}
return prevLayers;
}
void InsertCopyLayerPass::run() {
for (auto & l : *pLayers) {
if (l->insData.empty()) continue;
@@ -1084,6 +1059,78 @@ void RemoveConstPass::run() {
transformer.fullTrim();
}
void FuseMultipleIdentitiesPass::run() {
for (auto &l : *pLayers) {
if (l->insData.empty()) continue;
auto isNonFunctional = [](CNNLayerPtr ptr) {
return LayerInfo(ptr).isNonFunctional();
};
auto eltwise = dynamic_cast<InferenceEngine::EltwiseLayer *>(l.get());
auto concat = dynamic_cast<InferenceEngine::ConcatLayer *>(l.get());
if (LayerInfo(l).isNonFunctional() || LayerInfo(l).has32BInput())
continue;
gnalog() << "CNNNetPrevLayer skip non functional from :: " << l->name;
auto prevLayersReached = CNNNetGetPrevLayersSkip(l, [](CNNLayerPtr ptr) {
return !LayerInfo(ptr).isNonFunctional();
});
prevLayersReached.erase(std::remove_if(prevLayersReached.begin(),
prevLayersReached.end(),
[] (const std::pair<CNNLayerPtr, int> & candidate) {
return LayerInfo(candidate.first).isLink();
}), prevLayersReached.end());
if (prevLayersReached.size() != 1 && eltwise == nullptr && concat == nullptr) {
std::stringstream layers;
for (auto && prevLayer : prevLayersReached) {
layers << prevLayer.first->name;
layers << ", ";
}
THROW_GNA_LAYER_EXCEPTION(l) << "unsupported case: connected to "
<< (prevLayersReached.empty() ? "zero" : "multiple") << " outputs : " << layers.str();
}
auto prevLayer = prevLayersReached.front().first;
auto outDataIdx = prevLayersReached.front().second;
gnalog() << ", reached " << prevLayer->name << " at " << outDataIdx << std::endl;
if (!LayerInfo(prevLayer).has32BOutput())
continue;
std::vector<CNNLayerPtr> resultSet = CNNNetGetAllNextLayersSkipCertain(prevLayer, outDataIdx, isNonFunctional);
// now result set should have all needed layers
// checking whether the result set contains an existing identity layer
CNNLayerPtr alreadyIdentity;
for (auto &&res : resultSet) {
if (LayerInfo(res).isIdentity()) {
alreadyIdentity = res;
break;
}
}
if (!alreadyIdentity) {
continue;
} else {
// just figure out how to connect to that "already identity"
// 1st stage - disconnect given layer from previous
auto directPrev = l->insData.front().lock()->getCreatorLayer().lock();
auto oDataIdx = CNNLayerFindOutDataIdx(directPrev, 0);
auto &inputTo = directPrev->outData[oDataIdx]->getInputTo();
for (auto inIterator = inputTo.begin(); inIterator != inputTo.end(); inIterator++) {
if (inIterator->second == l) {
inputTo.erase(inIterator);
break;
}
}
l->insData.clear();
//2nd stage - now setting up new connection
l->insData.push_back(alreadyIdentity->outData.front());
alreadyIdentity->outData.front()->getInputTo()[l->name] = l;
}
}
}
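The two-stage rewiring at the end of the pass (disconnect the layer from its direct predecessor, then attach it to the existing identity's output) can be sketched on simplified stand-ins for CNNLayer/Data (types and names here are illustrative, not the Inference Engine API):

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <vector>

struct Layer;
// Simplified Data: knows its consumers, keyed by layer name,
// mirroring Data::getInputTo().
struct Data {
    Layer* creator = nullptr;
    std::map<std::string, Layer*> inputTo;
};
// Simplified Layer: owns its output Data, references its inputs.
struct Layer {
    std::string name;
    std::vector<Data*> insData;
    std::vector<std::unique_ptr<Data>> outData;
};

void rewireToIdentity(Layer* layer, Layer* identity) {
    // 1st stage: disconnect the layer from its current producers
    for (Data* in : layer->insData) {
        in->inputTo.erase(layer->name);
    }
    layer->insData.clear();
    // 2nd stage: connect it to the identity's first output
    Data* out = identity->outData.front().get();
    layer->insData.push_back(out);
    out->inputTo[layer->name] = layer;
}
```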
int PassManager::run(int index) {
// #define PLOT
#ifdef PLOT

View File

@@ -149,6 +149,11 @@ DECL_PASS_BEFORE_COPY(UnrollTI);
*/
DECL_PASS_BEFORE_COPY(RemoveConst);
/**
* @brief removes extra identity layers for multi-output
*/
DECL_PASS(FuseMultipleIdentities);
struct PassManagerSettings {
Policy policy;
/// @brief whether to run passes before copy

View File

@@ -139,7 +139,7 @@ private:
friend INFERENCE_ENGINE_API_CPP(std::shared_ptr<CNNNetworkImpl>)
convertFunctionToICNNNetwork(const std::shared_ptr<const ::ngraph::Function>& graph,
const ICNNNetwork& nGraphImpl);
const ICNNNetwork& nGraphImpl, bool keep_constant_inputs);
/**
* @brief Reshape on the same shape

View File

@@ -63,9 +63,9 @@ ngraph::op::GenericIE::GenericIE(const ngraph::NodeVector& inputs,
: GenericIE(as_output_vector(inputs), params, type, outputs) {}
ngraph::op::GenericIE::GenericIE(const ngraph::OutputVector& inputs,
const std::map<std::string, InferenceEngine::Parameter>& params,
const std::string type, const std::vector<PortIE>& outputs)
: Op(inputs), params(params), outputs(outputs), type(type), initialized(0) {
const std::map<std::string, InferenceEngine::Parameter>& params_,
const std::string type_, const std::vector<PortIE>& outputs_)
: Op(inputs), params(params_), outputs(outputs_), type(type_), initialized(0) {
constructor_validate_and_infer_types();
}

View File

@@ -179,7 +179,9 @@ CNNNetwork details::ReadNetwork(const std::string& modelPath, const std::string&
THROW_IE_EXCEPTION << "Weights file " << bPath << " cannot be opened!";
// read model with weights
return reader->read(modelStream, binStream, exts);
auto network = reader->read(modelStream, binStream, exts);
modelStream.close();
return network;
}
// read model without weights
return reader->read(modelStream, exts);

View File

@@ -15,7 +15,8 @@ namespace InferenceEngine {
namespace details {
INFERENCE_ENGINE_API_CPP(std::shared_ptr<CNNNetworkImpl>)
convertFunctionToICNNNetwork(const std::shared_ptr<const ::ngraph::Function>& graph, const ICNNNetwork &network);
convertFunctionToICNNNetwork(const std::shared_ptr<const ::ngraph::Function>& graph,
const ICNNNetwork &network, bool keep_constant_inputs = false);
} // namespace details
} // namespace InferenceEngine

View File

@@ -24,6 +24,8 @@
#include "ngraph_ops/pad_ie.hpp"
#include "ngraph_ops/onehot_ie.hpp"
#include "ngraph_ops/power.hpp"
#include "ngraph_ops/prior_box_clustered_ie.hpp"
#include "ngraph_ops/prior_box_ie.hpp"
#include "ngraph_ops/proposal_ie.hpp"
#include "ngraph_ops/relu_ie.hpp"
#include "ngraph_ops/scaleshift.hpp"
@@ -472,20 +474,6 @@ InferenceEngine::details::CNNLayerCreator::CNNLayerCreator(const std::shared_ptr
return res;
});
addSpecificCreator({"PriorBox"}, [](const std::shared_ptr<::ngraph::Node>& node,
const std::map<std::string, std::string> params) -> CNNLayerPtr {
THROW_IE_EXCEPTION << "PriorBox operation has a form that is not supported." << node->get_friendly_name()
<< " should be replaced by constant during constant folding.";
return nullptr;
});
addSpecificCreator({"PriorBoxClustered"}, [](const std::shared_ptr<::ngraph::Node>& node,
const std::map<std::string, std::string> params) -> CNNLayerPtr {
THROW_IE_EXCEPTION << "PriorBoxClustered operation has a form that is not supported." << node->get_friendly_name()
<< " should be replaced by constant during constant folding.";
return nullptr;
});
}
CNNLayerPtr InferenceEngine::details::CNNLayerCreator::create() {
@@ -499,7 +487,9 @@ CNNLayerPtr InferenceEngine::details::CNNLayerCreator::create() {
return res;
}
std::shared_ptr<CNNNetworkImpl> convertFunctionToICNNNetwork(const std::shared_ptr<const ::ngraph::Function>& graph, const ICNNNetwork &network) {
std::shared_ptr<CNNNetworkImpl> convertFunctionToICNNNetwork(const std::shared_ptr<const ::ngraph::Function> &graph,
const ICNNNetwork &network,
bool keep_constant_inputs) {
IE_PROFILING_AUTO_SCOPE(convertFunctionToICNNNetwork)
const auto createCNNLayer = [](const std::shared_ptr<::ngraph::Node> &node) -> CNNLayerPtr {
class NGraphCNNLayer: public CNNLayer {
@@ -565,6 +555,10 @@ std::shared_ptr<CNNNetworkImpl> convertFunctionToICNNNetwork(const std::shared_p
std::make_shared<Builder::NodeConverter<::ngraph::op::PadIE>>(),
std::make_shared<Builder::NodeConverter<::ngraph::op::v1::Power>>(),
std::make_shared<Builder::NodeConverter<::ngraph::op::PowerIE>>(),
std::make_shared<Builder::NodeConverter<::ngraph::op::PriorBox>>(),
std::make_shared<Builder::NodeConverter<::ngraph::op::PriorBoxClustered>>(),
std::make_shared<Builder::NodeConverter<::ngraph::op::PriorBoxClusteredIE>>(),
std::make_shared<Builder::NodeConverter<::ngraph::op::PriorBoxIE>>(),
std::make_shared<Builder::NodeConverter<::ngraph::op::Proposal>>(),
std::make_shared<Builder::NodeConverter<::ngraph::op::ProposalIE>>(),
std::make_shared<Builder::NodeConverter<::ngraph::op::Relu>>(),
@@ -715,7 +709,7 @@ std::shared_ptr<CNNNetworkImpl> convertFunctionToICNNNetwork(const std::shared_p
for (const auto &layer : nodes)
op_names.insert(layer->get_name());
bool keep_constants = ::ngraph::op::util::has_op_with_type<::ngraph::op::FakeQuantize>(graph);
bool keep_constants = keep_constant_inputs || ::ngraph::op::util::has_op_with_type<::ngraph::op::FakeQuantize>(graph);
// Create layers and output data
for (const auto &layer : nodes) {
@@ -766,6 +760,20 @@ std::shared_ptr<CNNNetworkImpl> convertFunctionToICNNNetwork(const std::shared_p
cnnLayer->insData.resize(inputCount);
for (size_t i = 0; i < layer->get_output_size(); i++) {
// Memory node with index = 1 has no inputs according to the specification.
// For proper conversion, we must cut off all the layers and data nodes above ReadValue,
// if they are connected only to this layer.
// Now MO generates only constants or constant sub-graphs as input to ReadValue op.
if (std::dynamic_pointer_cast<::ngraph::op::Constant>(layer)) {
bool all_to_read_value = !layer->output(i).get_target_inputs().empty();
for (const auto &output_input : layer->output(i).get_target_inputs()) {
all_to_read_value
&= dynamic_cast<ngraph::op::ReadValue *>(output_input.get_node()) != nullptr;
}
if (all_to_read_value)
continue;
}
if (cnnLayer->type == "Memory" && cnnLayer->params["index"] == "0") {
cnnLayer->outData.clear();
continue;
@@ -773,7 +781,6 @@ std::shared_ptr<CNNNetworkImpl> convertFunctionToICNNNetwork(const std::shared_p
std::string outName = layer->get_friendly_name();
if (layer->get_output_size() != 1) outName += "." + std::to_string(i);
DataPtr &ptr = cnnNetworkImpl->getData(outName.c_str());
SizeVector dims;
dims = layer->get_output_shape(i);
for (const auto &dim : dims) {
@@ -889,6 +896,7 @@ std::shared_ptr<CNNNetworkImpl> convertFunctionToICNNNetwork(const std::shared_p
for (const auto &ext : ::ngraph::op::GenericIE::getExtensions(graph)) {
cnnNetworkImpl->AddExtension(ext, nullptr);
}
return cnnNetworkImpl;
}
} // namespace details

View File

@@ -232,7 +232,8 @@ std::vector<CNNLayerPtr> ConstTransformer::foldConstSubgraphsInternal(const std:
static std::vector<std::string> skipConstInfer = {
"FakeQuantize",
"Quantize",
"CumSum" // Const inference function for CumSum is not implemented!
"CumSum", // Const inference function for CumSum is not implemented
"Convolution" // Const inference function for Convolution is not implemented
};
const std::map<std::string, bool> ConstTransformer::getConstLayers(const std::vector<CNNLayerPtr>& sortedLayers) {

View File

@@ -34,6 +34,8 @@
#include "ngraph_ops/onehot_ie.hpp"
#include "ngraph_ops/pad_ie.hpp"
#include "ngraph_ops/power.hpp"
#include "ngraph_ops/prior_box_clustered_ie.hpp"
#include "ngraph_ops/prior_box_ie.hpp"
#include "ngraph_ops/proposal_ie.hpp"
#include "ngraph_ops/relu_ie.hpp"
#include "ngraph_ops/selu_ie.hpp"
@@ -1473,6 +1475,136 @@ CNNLayer::Ptr NodeConverter<ngraph::op::ProposalIE>::createLayer(const std::shar
return res;
}
template <>
CNNLayer::Ptr NodeConverter<ngraph::op::PriorBoxClusteredIE>::createLayer(
const std::shared_ptr<ngraph::Node>& layer) const {
LayerParams params = {layer->get_friendly_name(), "PriorBoxClustered",
details::convertPrecision(layer->get_output_element_type(0))};
auto res = std::make_shared<InferenceEngine::CNNLayer>(params);
auto castedLayer = ngraph::as_type_ptr<ngraph::op::PriorBoxClusteredIE>(layer);
if (castedLayer == nullptr) THROW_IE_EXCEPTION << "Cannot get " << params.type << " layer " << params.name;
auto attr = castedLayer->get_attrs();
std::string param;
for (const auto& val : attr.widths) {
if (!param.empty()) param += ",";
param += asString(val);
}
res->params["width"] = param;
param.clear();
for (const auto& val : attr.heights) {
if (!param.empty()) param += ",";
param += asString(val);
}
res->params["height"] = param;
param.clear();
for (const auto& val : attr.variances) {
if (!param.empty()) param += ",";
param += asString(val);
}
res->params["variance"] = param;
if (std::abs(attr.step_heights - attr.step_widths) < 1e-5) {
res->params["step"] = asString(attr.step_widths);
} else {
res->params["step_w"] = asString(attr.step_widths);
res->params["step_h"] = asString(attr.step_heights);
}
res->params["offset"] = asString(attr.offset);
res->params["clip"] = asString(attr.clip ? 1 : 0);
res->params["flip"] = "1";
return res;
}
template <>
CNNLayer::Ptr NodeConverter<ngraph::op::PriorBoxClustered>::createLayer(
const std::shared_ptr<ngraph::Node>& layer) const {
THROW_IE_EXCEPTION << "PriorBoxClustered operation must be converted to PriorBoxClusteredIE operation.";
}
template <>
CNNLayer::Ptr NodeConverter<ngraph::op::PriorBoxIE>::createLayer(const std::shared_ptr<ngraph::Node>& layer) const {
LayerParams params = {layer->get_friendly_name(), "PriorBox",
details::convertPrecision(layer->get_output_element_type(0))};
auto res = std::make_shared<InferenceEngine::CNNLayer>(params);
auto castedLayer = ngraph::as_type_ptr<ngraph::op::PriorBoxIE>(layer);
auto layer_info = params.type + " layer " + params.name;
if (castedLayer == nullptr) THROW_IE_EXCEPTION << "Cannot get " << layer_info;
auto attr = castedLayer->get_attrs();
std::string param;
auto data_pshape = castedLayer->get_input_partial_shape(0);
if (data_pshape.is_dynamic()) THROW_IE_EXCEPTION << "Dynamic 0-port input of " << layer_info << " is not supported";
auto data_shape = data_pshape.to_shape();
if (data_shape.size() != 4) THROW_IE_EXCEPTION << layer_info << " has " << data_shape.size() << " items in 0-port input, 4 expected";
auto img_pshape = castedLayer->get_input_partial_shape(1);
if (img_pshape.is_dynamic()) THROW_IE_EXCEPTION << "Dynamic 1-port input of " << layer_info << " is not supported";
auto img_shape = img_pshape.to_shape();
if (img_shape.size() != 4) THROW_IE_EXCEPTION << layer_info << " has " << img_shape.size() << " items in 1-port input, 4 expected";
if (!attr.scale_all_sizes) {
// mxnet-like PriorBox
auto img_H = img_shape[2];
auto data_H = data_shape[2];
if (attr.step == -1)
attr.step = 1. * img_H / data_H;
else
attr.step *= img_H;
for (auto& size : attr.min_size)
size *= img_H;
}
for (const auto& val : attr.max_size) {
if (!param.empty()) param += ",";
param += asString(val);
}
res->params["max_size"] = param;
param.clear();
for (const auto& val : attr.min_size) {
if (!param.empty()) param += ",";
param += asString(val);
}
res->params["min_size"] = param;
param.clear();
for (const auto& val : attr.aspect_ratio) {
if (!param.empty()) param += ",";
param += asString(val);
}
res->params["aspect_ratio"] = param;
param.clear();
for (const auto& val : attr.variance) {
if (!param.empty()) param += ",";
param += asString(val);
}
res->params["variance"] = param;
res->params["step"] = asString(attr.step);
res->params["offset"] = asString(attr.offset);
res->params["clip"] = asString(attr.clip ? 1 : 0);
res->params["flip"] = asString(attr.flip ? 1 : 0);
res->params["scale_all_sizes"] = asString(attr.scale_all_sizes ? 1 : 0);
res->params["density"] = asString(attr.density);
res->params["fixed_size"] = asString(attr.fixed_size);
res->params["fixed_ratio"] = asString(attr.fixed_ratio);
return res;
}
template <>
CNNLayer::Ptr NodeConverter<ngraph::op::PriorBox>::createLayer(const std::shared_ptr<ngraph::Node>& layer) const {
THROW_IE_EXCEPTION << "PriorBox operation must be converted to PriorBoxIE operation.";
}
template <>
CNNLayer::Ptr NodeConverter<ngraph::op::PowerIE>::createLayer(const std::shared_ptr<ngraph::Node>& layer) const {
LayerParams params = {layer->get_friendly_name(), "Power",

View File

@@ -272,6 +272,48 @@ void CombineData(DataPtr& master, DataPtr& slave) {
}
}
/**
* Preserve output data name and update output data map of the network
*
* @param in_data data whose name is updated
* @param out_data data whose name is preserved
* @param net network whose output data map is updated with in_data
*/
template <typename NET>
void SaveOutputDataName(InferenceEngine::DataPtr in_data, InferenceEngine::DataPtr out_data, NET &net) {
// TODO: update outputs of the network if out_data was output
if (out_data->getInputTo().empty()) {
auto data_name = out_data->getName();
in_data->setName(data_name);
}
}
/**
* Overload of SaveOutputDataName(InferenceEngine::DataPtr in_data, InferenceEngine::DataPtr out_data, NET &net)
* for NET = ICNNNetwork
*/
void SaveOutputDataName(InferenceEngine::DataPtr in_data, InferenceEngine::DataPtr out_data, ICNNNetwork& net) {
if (out_data->getInputTo().empty()) {
InferenceEngine::OutputsDataMap outputs_data_map;
net.getOutputsInfo(outputs_data_map);
auto out_data_name = out_data->getName();
in_data->setName(out_data_name);
if (outputs_data_map.count(out_data_name)) {
auto parent_layer_ptr = in_data->getCreatorLayer().lock();
IE_ASSERT(parent_layer_ptr != nullptr);
auto parent_layer_name = parent_layer_ptr->name;
size_t in_data_out_index = 0;
for (size_t ind = 0; ind < parent_layer_ptr->outData.size(); ++ind) {
if (parent_layer_ptr->outData[ind] == in_data) {
in_data_out_index = ind;
}
}
net.addOutput(parent_layer_name, in_data_out_index);
}
}
}
/**
* Remove layer from graph
* May be applied only for inplace layer. One input, one output,
@@ -279,7 +321,8 @@ void CombineData(DataPtr& master, DataPtr& slave) {
*
* @param layer to remove from graph
*/
void RemoveLayer(CNNLayerPtr& layer) {
template <typename NET>
void RemoveLayer(CNNLayerPtr& layer, NET &net) {
IE_ASSERT(layer->insData.size() == 1);
IE_ASSERT(layer->outData.size() == 1);
@@ -299,10 +342,8 @@ void RemoveLayer(CNNLayerPtr& layer) {
// transfer output connections into parent data
CombineData(in_data, out_data);
// Save name for output data
if (out_data->getInputTo().empty()) {
in_data->setName(out_data->getName());
}
// save name for output data and update network output
SaveOutputDataName(in_data, out_data, net);
}
/************************************************************/
@@ -1371,7 +1412,7 @@ void fixConvertLayers(NET &net) {
}
}
for (auto &layer : to_remove) {
RemoveLayer(layer);
RemoveLayer(layer, net);
}
}

View File

@@ -21,6 +21,8 @@ public:
~GemmTransformation() override {};
bool canBeTransformed(const TransformationContext& context, const CNNLayer& layer) const override;
void transform(TransformationContext& context, CNNLayer& layer) const override;
bool isQuantized(const CNNLayer& layer) const noexcept override;
};
IE_SUPPRESS_DEPRECATED_END

View File

@@ -83,6 +83,8 @@ protected:
const std::vector<float>& originalWeightsDequantizationShifts,
std::vector<float>& dequantizationScales,
std::vector<float>& dequantizationShifts) const;
static bool getDequantizationDimIsSupported(const CNNLayer& weightableLayer);
};
typedef std::shared_ptr<WeightableLayerTransformation> WeightableLayerTransformationPtr;

View File

@@ -135,7 +135,6 @@ void ConcatTransformation::transform(TransformationContext& context, CNNLayer& c
dequantizationScale = maxOutputInterval / (dataPrecision.max - dataPrecision.min);
const float max = maxOutputInterval / ((dataPrecision.max - dataPrecision.min) / dataPrecision.max);
const float min = maxOutputInterval / ((dataPrecision.max - dataPrecision.min) / dataPrecision.min);
dequantizationShift = outputLowValue - min;

View File

@@ -25,15 +25,6 @@
using namespace InferenceEngine;
using namespace InferenceEngine::details;
bool getDequantizationValuesAreBroadcasted(const CNNLayer& fullyConnected) {
const DataPtr inputData = fullyConnected.insData[0].lock();
if (inputData == nullptr) {
THROW_IE_LPT_EXCEPTION(fullyConnected) << "input data is absent";
}
return inputData->getDims().size() == 3ul;
}
bool FullyConnectedTransformation::canBeTransformed(const TransformationContext& context, const CNNLayer& fullyConnected) const {
if (!WeightableLayerTransformation::canBeTransformed(context, fullyConnected)) {
return false;
@@ -72,7 +63,12 @@ bool FullyConnectedTransformation::canBeTransformed(const TransformationContext&
std::vector<float> dequantizationShifts;
fillFromDequantizationLayer(*scaleShift, dequantizationScales, dequantizationShifts);
if ((inTensorDims.size() == 3ul) && (!DequantizationDetails::isPerTensor(dequantizationScales, dequantizationShifts))) {
const bool dequantizationDimIsSupported = !getDequantizationDimIsSupported(fullyConnected);
if ((!dequantizationDimIsSupported) &&
(!DequantizationDetails::isPerTensor(dequantizationScales, dequantizationShifts) ||
// if asymmetric quantization is not supported, then there are no shifts for the dequantizationDimIsSupported == false case:
// in this case we cannot dequantize with shifts
(!supportAsymmetricQuantization && (dequantizationShifts[0] != 0.f)))) {
return false;
}
@@ -318,7 +314,7 @@ void FullyConnectedTransformation::calculateDequantizationForSymmetric(
const auto prevDequantizationScaleBuffer = CNNNetworkHelper::getFloatData(CNNNetworkHelper::getBlob(scaleShift, "weights"));
const auto prevDequantizationShiftBuffer = CNNNetworkHelper::getFloatData(CNNNetworkHelper::getBlob(scaleShift, "biases"));
const bool dequantizationValuesAreBroadcasted = getDequantizationValuesAreBroadcasted(fullyConnected);
const bool dequantizationValuesAreBroadcasted = !getDequantizationDimIsSupported(fullyConnected);
for (size_t i = 0; i < outputChannelsCount; ++i) {
dequantizationScales[i] =
prevDequantizationScaleBuffer.get()[0] *
@@ -401,7 +397,7 @@ void FullyConnectedTransformation::calculateDequantizationForAsymmetric(
THROW_IE_EXCEPTION << "Unexpected layer type to calculate quantization values " << scaleShift->type;
}
const bool dequantizationValuesAreBroadcasted = getDequantizationValuesAreBroadcasted(fullyConnected);
const bool dequantizationValuesAreBroadcasted = !getDequantizationDimIsSupported(fullyConnected);
dequantizationScales.resize(outputChannelsCount);
dequantizationShifts.resize(outputChannelsCount);
@@ -412,10 +408,10 @@ void FullyConnectedTransformation::calculateDequantizationForAsymmetric(
prevDequantizationScaleBuffer.get()[0] *
(originalWeightsDequantizationScales.size() == 0 ?
1.0 :
(originalWeightsDequantizationScales.size() == 1 ? originalWeightsDequantizationScales[0] : originalWeightsDequantizationScales[i]));
originalWeightsDequantizationScales[((originalWeightsDequantizationScales.size() == 1) || dequantizationValuesAreBroadcasted) ? 0 : i]);
}
if (CNNNetworkHelper::isQuantizedConstWeights(fullyConnected)) {
if (CNNNetworkHelper::isQuantizedConstWeights(fullyConnected) && (!dequantizationValuesAreBroadcasted)) {
const Blob::Ptr weightsBlob = CNNNetworkHelper::getWeights(fullyConnected, roundQuantizedValues);
const auto weightsBuffer = CNNNetworkHelper::getFloatData(weightsBlob);
const Blob::Ptr biasesBlob = CNNNetworkHelper::getBiases(fullyConnected);
@@ -432,7 +428,7 @@ void FullyConnectedTransformation::calculateDequantizationForAsymmetric(
for (size_t w = 0; w < inputChannelsCount; ++w) {
const float kernel = weightsBuffer.get()[channel * inputChannelsCount + w];
const float shift = dequantizationValuesAreBroadcasted ? prevDequantizationShiftBuffer.get()[0] : prevDequantizationShiftBuffer.get()[w];
const float shift = prevDequantizationShiftBuffer.get()[w];
sum1 += kernel * shift * weightsDequantizationScale;
sum2 += kernel * dataZeroPoints[w] * weightsDequantizationScale;
}

View File

@@ -133,3 +133,8 @@ void GemmTransformation::transform(TransformationContext& context, CNNLayer& gem
addDequantizationLayer(context, gemm, dequantizationScales, dequantizationShifts);
}
bool GemmTransformation::isQuantized(const CNNLayer& layer) const noexcept {
// overrides the weightable layer version
return true;
}

View File

@@ -128,6 +128,15 @@ bool WeightableLayerTransformation::isPrecisionPreserved(const CNNLayer& layer)
return false;
}
bool WeightableLayerTransformation::getDequantizationDimIsSupported(const CNNLayer& fullyConnected) {
const DataPtr inputData = fullyConnected.insData[0].lock();
if (inputData == nullptr) {
THROW_IE_LPT_EXCEPTION(fullyConnected) << "input data is absent";
}
return inputData->getDims().size() != 3ul;
}
void WeightableLayerTransformation::updateLayerBiases(
TransformationContext& context,
const CNNLayer& weightableLayer,
@@ -135,7 +144,17 @@ void WeightableLayerTransformation::updateLayerBiases(
std::vector<float>& dequantizationScales,
std::vector<float>& dequantizationShifts,
std::vector<float>& biasesShifts) const {
if (!std::all_of(dequantizationShifts.begin(), dequantizationShifts.end(), [](float value) { return value == 0.0; })) {
const bool dequantizationShiftsAreZero = std::all_of(
dequantizationShifts.begin(),
dequantizationShifts.end(),
[](float value) { return value == 0.0; });
const bool dequantizationDimIsNotSupported = !getDequantizationDimIsSupported(weightableLayer);
CNNLayerPtr biasesLayer = CNNNetworkHelper::getParent(weightableLayer, 2);
// we need to correct biases if dequantization shift values are not zero or
// the dequantization dimension is not supported (in that case dequantization shifts cannot be calculated)
if ((dequantizationDimIsNotSupported && (biasesLayer != nullptr)) || (!dequantizationShiftsAreZero)) {
const DataPtr insData = weightableLayer.insData[0].lock();
if (insData == nullptr) {
THROW_IE_LPT_EXCEPTION(weightableLayer) << "input data is absent";
@@ -144,7 +163,6 @@ void WeightableLayerTransformation::updateLayerBiases(
std::shared_ptr<float> biasesBufferPtr;
Blob::Ptr biasesBlob;
CNNLayerPtr biasesLayer = CNNNetworkHelper::getParent(weightableLayer, 2);
if (biasesLayer == nullptr) {
if (weightableLayer.outData.size() != 1ul) {
THROW_IE_LPT_EXCEPTION(weightableLayer) << "unexpected output data count " << weightableLayer.outData.size();

View File

@@ -661,6 +661,13 @@ MKLDNNMemoryDesc::operator InferenceEngine::TensorDesc() const {
blkDims.push_back(8);
layout = Layout::BLOCKED;
break;
case memory::gOdhwi8o:
order = {0, 1, 2, 3, 4, 5, 1};
blkDims = dims;
blkDims[1] = blkDims[1] / 8 + (blkDims[1] % 8 ? 1 : 0);
blkDims.push_back(8);
layout = Layout::BLOCKED;
break;
case memory::nChw16c:
order = {0, 1, 2, 3, 1};
blkDims = dims;
@@ -676,6 +683,13 @@ MKLDNNMemoryDesc::operator InferenceEngine::TensorDesc() const {
blkDims.push_back(16);
layout = Layout::BLOCKED;
break;
case memory::gOdhwi16o:
order = {0, 1, 2, 3, 4, 5, 1};
blkDims = dims;
blkDims[1] = blkDims[1] / 16 + (blkDims[1] % 16 ? 1 : 0);
blkDims.push_back(16);
layout = Layout::BLOCKED;
break;
case memory::Ohwi8o:
order = {0, 1, 2, 3, 0};
blkDims = dims;
@@ -1267,6 +1281,13 @@ MKLDNNMemoryDesc::MKLDNNMemoryDesc(const TensorDesc& tDesc):
} else if (blkdDims[6] == 16) {
mkldnnFormat = memory::format::Goidhw16g;
}
} else if (order.size() == 7 &&
order[0] == 0 && order[1] == 1 && order[2] == 2 && order[3] == 3 && order[4] == 4 && order[5] == 5 && order[6] == 1) {
if (blkdDims[6] == 8) {
mkldnnFormat = memory::format::gOdhwi8o;
} else if (blkdDims[6] == 16) {
mkldnnFormat = memory::format::gOdhwi16o;
}
} else if (order.size() == 8 &&
order[0] == 0 && order[1] == 1 && order[2] == 3 && order[3] == 4 && order[4] == 2 && order[5] == 5 &&
order[6] == 1 && order[7] == 2) {

View File

@@ -182,8 +182,6 @@ void argmax_many_classes_has_axis(const float* src_data, float* dst_data, Shape
vmask_type vmask;
int s_index = i0 * dim * after_num + ib1 * block_size;
std::memset(reinterpret_cast<void*>(&vmax_values[0]), 0, sizeof(vmax_values));
auto vswap_func = [&](int index1, int index2) {
vtmp = vmax_values[index1];
vmax_values[index1] = _mm_uni_blendv_ps(vmax_values[index1], vmax_values[index2], vmask);

View File

@@ -157,7 +157,7 @@ void MKLDNNDepthwiseNode::createDescriptor(const std::vector<InferenceEngine::Te
const std::vector<InferenceEngine::TensorDesc> &outputDesc) {
MKLDNNMemoryDesc in_candidate(inputDesc[0]);
MKLDNNMemoryDesc out_candidate(inputDesc[0]);
MKLDNNDims weightDims({in_candidate.getDims()[1]});
MKLDNNDims weightDims({in_candidate.getDims().ndims() == 1 ? in_candidate.getDims()[0] : in_candidate.getDims()[1]});
MKLDNNMemoryDesc wgh_candidate{weightDims, in_candidate.getDataType(), memory::x};

View File

@@ -209,32 +209,34 @@ void MKLDNNFullyConnectedNode::setPostOps(mkldnn::primitive_attr &attr, bool ini
PostOpsIntBlobMemory.push_back(MKLDNNMemoryPtr(new MKLDNNMemory(getEngine())));
PostOpsIntBlobMemory[blob_idx]->Create(depthwiseDims, memory::data_type::f32, memory::format::x);
PostOpsIntBlobMemory[blob_idx]->SetData(memory::data_type::f32, memory::x,
depthwiseLayer->_weights->buffer(),
depthwiseLayer->_weights->size() *
MKLDNNExtensionUtils::sizeOfDataType(memory::data_type::f32));
// In case ndims == 3 the graph optimizer allows fusing only if all weight values are the same
if (depthwiseNode->isBroadcast() || ndims == 3) {
float broadcastValue = static_cast<float *>(PostOpsIntBlobMemory[blob_idx]->GetData())[0];
for (int i = 1; i < PostOpsIntBlobMemory[blob_idx]->GetPrimitiveDescriptor().desc().data.dims[0]; i++) {
float broadcastValue = static_cast<float *>(depthwiseLayer->_weights->buffer())[0];
for (int i = 0; i < PostOpsIntBlobMemory[blob_idx]->GetPrimitiveDescriptor().desc().data.dims[0]; i++) {
static_cast<float *>(PostOpsIntBlobMemory[blob_idx]->GetData())[i] = broadcastValue;
}
} else {
PostOpsIntBlobMemory[blob_idx]->SetData(memory::data_type::f32, memory::x,
depthwiseLayer->_weights->buffer(),
depthwiseLayer->_weights->size() *
MKLDNNExtensionUtils::sizeOfDataType(memory::data_type::f32));
}
if (depthwiseNode->getAlgorithm() == depthwise_scale_shift) {
PostOpsIntBlobMemory.push_back(MKLDNNMemoryPtr(new MKLDNNMemory(getEngine())));
PostOpsIntBlobMemory[blob_idx + 1]->Create(depthwiseDims, memory::data_type::f32,
memory::format::x);
PostOpsIntBlobMemory[blob_idx + 1]->SetData(memory::data_type::f32, memory::x,
depthwiseLayer->_biases->buffer(),
depthwiseLayer->_biases->size() *
MKLDNNExtensionUtils::sizeOfDataType(memory::data_type::f32));
PostOpsIntBlobMemory[blob_idx + 1]->Create(depthwiseDims, memory::data_type::f32, memory::format::x);
// In case ndims == 3 the graph optimizer allows fusing only if all bias values are the same
if (depthwiseNode->isBroadcast() || ndims == 3) {
float broadcastValue = static_cast<float *>(PostOpsIntBlobMemory[blob_idx + 1]->GetData())[0];
for (int i = 1; i < PostOpsIntBlobMemory[blob_idx + 1]->GetPrimitiveDescriptor().desc().data.dims[0]; i++) {
float broadcastValue = static_cast<float *>(depthwiseLayer->_biases->buffer())[0];
for (int i = 0; i < PostOpsIntBlobMemory[blob_idx + 1]->GetPrimitiveDescriptor().desc().data.dims[0]; i++) {
static_cast<float *>(PostOpsIntBlobMemory[blob_idx + 1]->GetData())[i] = broadcastValue;
}
} else {
PostOpsIntBlobMemory[blob_idx + 1]->SetData(memory::data_type::f32, memory::x,
depthwiseLayer->_biases->buffer(),
depthwiseLayer->_biases->size() *
MKLDNNExtensionUtils::sizeOfDataType(memory::data_type::f32));
}
ops.append_depthwise(depthwiseNode->getAlgorithm(),

View File

@@ -667,7 +667,8 @@ private:
};
MKLDNNNormalizeNode::MKLDNNNormalizeNode(const InferenceEngine::CNNLayerPtr& layer, const mkldnn::engine& eng, MKLDNNWeightsSharing::Ptr &cache)
: MKLDNNNode(layer, eng, cache) {}
: MKLDNNNode(layer, eng, cache), src_data_size(0lu), dst_data_size(0lu), weights_data_size(0lu),
input_prec(Precision::UNSPECIFIED), output_prec(Precision::UNSPECIFIED), weights_prec(Precision::UNSPECIFIED) {}
void MKLDNNNormalizeNode::getSupportedDescriptors() {
if (!descs.empty())

View File

@@ -120,13 +120,18 @@ void MKLDNNReorderNode::createReorderPrimitive(const mkldnn::memory::desc &srcDe
// Code block below tries to detect such cases and reinterpret data planar formats (e.g. nchw)
// as grouped weights planar formats (e.g. goihw) since they have the same physical memory layout.
if (MKLDNNMemory::GetPlainFormat(src_blocked->GetDims()) == src_blocked->GetFormat() &&
MKLDNNMemory::IsGroupedFormat(dst_blocked->GetFormat())) {
src_blocked->GetDims().size() + 1 == dst_blocked->GetDims().size()) {
try {
mkldnn::memory::dims newDims = dst_blocked->GetDims();
mkldnn::memory::format newFormat;
newFormat = src_blocked->GetDims().size() == 4 ? memory::goihw :
src_blocked->GetDims().size() == 5 ? memory::goidhw :
src_blocked->GetFormat();
if (MKLDNNMemory::IsGroupedFormat(dst_blocked->GetFormat())) {
newFormat = src_blocked->GetDims().size() == 4 ? memory::goihw :
src_blocked->GetDims().size() == 5 ? memory::goidhw :
src_blocked->GetFormat();
} else {
newFormat = src_blocked->GetDims().size() == 4 ? memory::ncdhw :
src_blocked->GetFormat();
}
auto newDesc = mkldnn::memory::desc(newDims, src_blocked->GetDataType(), newFormat);
src_blocked->Create(newDesc, srcPtr, false);

View File

@@ -413,6 +413,16 @@ std::shared_ptr<ngraph::Node> V10Parser::createNode(const std::vector<ngraph::Ou
std::make_shared<LayerCreator<ngraph::op::v1::ReduceLogicalOr>>("ReduceLogicalOr"),
};
// Check that the operation is in the default opsets
auto isDefaultOpSet = [](const std::string& version) -> bool {
for (size_t i = 1; i <= 3; i++) {
std::string opset_name = "opset" + std::to_string(i);
if (version == opset_name)
return true;
}
return false;
};
for (size_t i = 0; i < inputs.size(); i++) {
if (!inputs[i].get_node())
THROW_IE_EXCEPTION << params.type << " layer " << params.name << " with id: " << params.layerId
@@ -423,21 +433,23 @@ std::shared_ptr<ngraph::Node> V10Parser::createNode(const std::vector<ngraph::Ou
}
std::shared_ptr<ngraph::Node> ngraphNode;
// Try to create operation from creators
for (const auto& creator : creators) {
if (creator->shouldCreate(params.type)) {
bool useCreator = false;
// Check that opset is registered
useCreator |= opsets.find(params.version) == opsets.end();
if (!useCreator) {
// Check that creator can create operation with the version from opset
const auto opset = opsets.at(params.version);
// Opset should contain the same version of the operation or not contain an operation with the current type
useCreator |= opset.contains_type(creator->getNodeType()) || !opset.contains_type(params.type);
if (isDefaultOpSet(params.version)) {
// Try to create operation from creators
for (const auto& creator : creators) {
if (creator->shouldCreate(params.type)) {
bool useCreator = false;
// Check that opset is registered
useCreator |= opsets.find(params.version) == opsets.end();
if (!useCreator) {
// Check that creator can create operation with the version from opset
const auto opset = opsets.at(params.version);
// Opset should contain the same version of the operation or not contain an operation with the current type
useCreator |= opset.contains_type(creator->getNodeType()) || !opset.contains_type(params.type);
}
if (useCreator)
ngraphNode = creator->createLayer(inputs, node, binStream, params);
break;
}
if (useCreator)
ngraphNode = creator->createLayer(inputs, node, binStream, params);
break;
}
}

View File

@@ -0,0 +1,43 @@
// Copyright (C) 2018-2020 Intel Corporation
// SPDX-License-Identifier: Apache-2.0
//
#pragma once
#include <memory>
#include <transformations_visibility.hpp>
#include <ngraph/op/op.hpp>
#include <ngraph/op/experimental/layers/prior_box_clustered.hpp>
namespace ngraph {
namespace op {
class TRANSFORMATIONS_API PriorBoxClusteredIE : public Op {
public:
static constexpr NodeTypeInfo type_info{"PriorBoxClusteredIE", 1};
const NodeTypeInfo& get_type_info() const override { return type_info; }
/// \brief Constructs a PriorBoxClusteredIE operation
///
/// \param input Layer for which prior boxes are computed
/// \param image Input to which prior boxes are scaled
/// \param attrs PriorBoxClustered attributes
PriorBoxClusteredIE(const Output<Node>& input,
const Output<Node>& image,
const ngraph::op::PriorBoxClusteredAttrs& attrs);
void validate_and_infer_types() override;
std::shared_ptr<Node> copy_with_new_args(const NodeVector& new_args) const override;
const PriorBoxClusteredAttrs& get_attrs() const { return m_attrs; }
private:
PriorBoxClusteredAttrs m_attrs;
};
} // namespace op
} // namespace ngraph

View File

@@ -0,0 +1,42 @@
// Copyright (C) 2018-2020 Intel Corporation
// SPDX-License-Identifier: Apache-2.0
//
#pragma once
#include <memory>
#include <transformations_visibility.hpp>
#include "ngraph/op/op.hpp"
#include "ngraph/op/experimental/layers/prior_box.hpp"
namespace ngraph {
namespace op {
class TRANSFORMATIONS_API PriorBoxIE : public Op {
public:
static constexpr NodeTypeInfo type_info{"PriorBoxIE", 1};
const NodeTypeInfo& get_type_info() const override { return type_info; }
/// \brief Constructs a PriorBoxIE operation
///
/// \param input Layer for which prior boxes are computed
/// \param image Input to which prior boxes are scaled
/// \param attrs PriorBox attributes
PriorBoxIE(const Output<Node>& input,
const Output<Node>& image,
const ngraph::op::PriorBoxAttrs& attrs);
void validate_and_infer_types() override;
std::shared_ptr<Node> copy_with_new_args(const NodeVector& new_args) const override;
const PriorBoxAttrs& get_attrs() const { return m_attrs; }
private:
PriorBoxAttrs m_attrs;
};
} // namespace op
} // namespace ngraph

View File

@@ -16,6 +16,8 @@
// This pass must be called first in pipeline
NGRAPH_PASS(InitNodeInfo, ::ngraph::pass)
NGRAPH_PASS(ConvertPriorBox, ::ngraph::pass) // WA: ConvertPriorBox must be executed before CF
NGRAPH_PASS(ConstantFolding, ::ngraph::pass)
NGRAPH_PASS(RemoveFilteringBoxesBySize, ::ngraph::pass) // Resolves dynamism (replaces NonZero), CF needed
NGRAPH_PASS(ConstantFolding, ::ngraph::pass)
NGRAPH_PASS(StridedSliceOptimization, ::ngraph::pass) // depends on CF

View File

@@ -0,0 +1,33 @@
// Copyright (C) 2018-2020 Intel Corporation
// SPDX-License-Identifier: Apache-2.0
//
#pragma once
#include <vector>
#include <memory>
#include <transformations_visibility.hpp>
#include <ngraph/pass/graph_rewrite.hpp>
namespace ngraph {
namespace pass {
class TRANSFORMATIONS_API ConvertPriorBox;
} // namespace pass
} // namespace ngraph
class ngraph::pass::ConvertPriorBox: public ngraph::pass::GraphRewrite {
public:
ConvertPriorBox() : GraphRewrite() {
convert_prior_box();
convert_prior_box_clustered();
}
private:
void convert_prior_box();
void convert_prior_box_clustered();
};

View File

@@ -0,0 +1,39 @@
// Copyright (C) 2018-2020 Intel Corporation
// SPDX-License-Identifier: Apache-2.0
//
#include "ngraph_ops/prior_box_clustered_ie.hpp"
#include <memory>
#include "ngraph/op/constant.hpp"
using namespace std;
using namespace ngraph;
constexpr NodeTypeInfo op::PriorBoxClusteredIE::type_info;
op::PriorBoxClusteredIE::PriorBoxClusteredIE(const Output<Node>& input, const Output<Node>& image,
const PriorBoxClusteredAttrs& attrs)
: Op({input, image}), m_attrs(attrs) {
constructor_validate_and_infer_types();
}
void op::PriorBoxClusteredIE::validate_and_infer_types() {
if (get_input_partial_shape(0).is_dynamic() || get_input_partial_shape(1).is_dynamic()) {
set_output_type(0, element::f32, PartialShape::dynamic(3));
return;
}
auto input_shape = get_input_shape(0);
auto image_shape = get_input_shape(1);
size_t num_priors = m_attrs.widths.size();
set_output_type(0, element::f32, Shape {1, 2, 4 * input_shape[2] * input_shape[3] * num_priors});
}
shared_ptr<Node> op::PriorBoxClusteredIE::copy_with_new_args(const NodeVector& new_args) const {
check_new_args_count(this, new_args);
return make_shared<PriorBoxClusteredIE>(new_args.at(0), new_args.at(1), m_attrs);
}


@@ -0,0 +1,36 @@
// Copyright (C) 2018-2020 Intel Corporation
// SPDX-License-Identifier: Apache-2.0
//
#include "ngraph_ops/prior_box_ie.hpp"
#include <memory>
#include "ngraph/op/constant.hpp"
using namespace std;
using namespace ngraph;
constexpr NodeTypeInfo op::PriorBoxIE::type_info;
op::PriorBoxIE::PriorBoxIE(const Output<Node>& input, const Output<Node>& image, const PriorBoxAttrs& attrs)
: Op({input, image}), m_attrs(attrs) {
constructor_validate_and_infer_types();
}
void op::PriorBoxIE::validate_and_infer_types() {
if (get_input_partial_shape(0).is_dynamic() || get_input_partial_shape(1).is_dynamic()) {
set_output_type(0, element::f32, PartialShape::dynamic(3));
return;
}
auto input_shape = get_input_shape(0);
auto image_shape = get_input_shape(1);
set_output_type(0, element::f32, Shape {
1, 2, 4 * input_shape[2] * input_shape[3] * static_cast<size_t>(op::PriorBox::number_of_priors(m_attrs))});
}
shared_ptr<Node> op::PriorBoxIE::copy_with_new_args(const NodeVector& new_args) const {
check_new_args_count(this, new_args);
return make_shared<PriorBoxIE>(new_args.at(0), new_args.at(1), m_attrs);
}


@@ -5,6 +5,7 @@
#include <memory>
#include "transformations/common_optimizations/common_optimizations.hpp"
#include "transformations/convert_opset1_to_legacy/convert_prior_to_ie_prior.hpp"
#include "transformations/depth_to_space_fusion.hpp"
#include "transformations/optimize_strided_slice.hpp"
#include "transformations/convert_scatter_elements_to_scatter.hpp"


@@ -17,7 +17,8 @@ void ngraph::pass::ConvertDivide::convert_divide() {
ngraph::graph_rewrite_callback callback = [](pattern::Matcher& m) {
auto div = std::dynamic_pointer_cast<ngraph::opset1::Divide> (m.get_match_root());
if (!div) {
// We cannot apply this transformation for integer input data types
if (!div || div->input(0).get_element_type().is_integral()) {
return false;
}


@@ -0,0 +1,294 @@
// Copyright (C) 2018-2020 Intel Corporation
// SPDX-License-Identifier: Apache-2.0
//
#include "transformations/convert_opset1_to_legacy/convert_prior_to_ie_prior.hpp"
#include <memory>
#include <vector>
#include <ngraph/opsets/opset3.hpp>
#include <ngraph/opsets/opset1.hpp>
#include <ngraph_ops/prior_box_ie.hpp>
#include <ngraph_ops/prior_box_clustered_ie.hpp>
#include <ngraph/rt_info.hpp>
void ngraph::pass::ConvertPriorBox::convert_prior_box() {
auto data = std::make_shared<pattern::op::Label>(element::i64, Shape{1, 1, 1, 1});
auto axes = ngraph::opset1::Constant::create(element::i64, Shape{1}, {0});
auto image = std::make_shared<pattern::op::Label>(element::i64, Shape{1, 1, 1, 1});
ngraph::op::PriorBoxAttrs attr;
attr.min_size = {162.0f};
attr.max_size = {213.0f};
attr.aspect_ratio = {2.0f, 3.0f};
attr.variance = {0.1f, 0.1f, 0.2f, 0.2f};
attr.step = 64.0f;
attr.offset = 0.5f;
attr.clip = 0;
attr.flip = 1;
attr.scale_all_sizes = true;
auto prior_box = std::make_shared<ngraph::opset1::PriorBox>(data, image, attr);
auto unsqueeze = std::make_shared<ngraph::opset1::Unsqueeze> (prior_box, axes);
ngraph::graph_rewrite_callback callback = [](pattern::Matcher& m) {
auto unsqueeze = std::dynamic_pointer_cast<ngraph::opset1::Unsqueeze> (m.get_match_root());
if (!unsqueeze) {
return false;
}
auto prior_box_node = std::dynamic_pointer_cast<ngraph::opset1::PriorBox> (unsqueeze->input_value(0).get_node_shared_ptr());
if (!prior_box_node) {
return false;
}
// vector of nGraph nodes that will be replaced
ngraph::NodeVector ops_to_replace{unsqueeze, prior_box_node};
std::shared_ptr<Node> input_1(prior_box_node->input_value(0).get_node_shared_ptr());
std::shared_ptr<Node> input_2(prior_box_node->input_value(1).get_node_shared_ptr());
auto convert1 = std::dynamic_pointer_cast<ngraph::opset1::Convert> (input_1);
auto convert2 = std::dynamic_pointer_cast<ngraph::opset1::Convert> (input_2);
if (convert1 && convert2) {
ops_to_replace.push_back(convert1);
ops_to_replace.push_back(convert2);
input_1 = convert1->input_value(0).get_node_shared_ptr();
input_2 = convert2->input_value(0).get_node_shared_ptr();
}
auto strided_slice1 = std::dynamic_pointer_cast<ngraph::opset1::StridedSlice> (input_1);
auto strided_slice2 = std::dynamic_pointer_cast<ngraph::opset1::StridedSlice> (input_2);
if (!strided_slice1 || !strided_slice2) {
return false;
}
ops_to_replace.push_back(strided_slice1);
ops_to_replace.push_back(strided_slice2);
// Check that StridedSlice1 cuts H,W dims for PriorBox
auto begin = std::dynamic_pointer_cast<ngraph::opset1::Constant> (strided_slice1->input_value(1).get_node_shared_ptr());
auto end = std::dynamic_pointer_cast<ngraph::opset1::Constant> (strided_slice1->input_value(2).get_node_shared_ptr());
auto stride = std::dynamic_pointer_cast<ngraph::opset1::Constant> (strided_slice1->input_value(3).get_node_shared_ptr());
if (!begin || !end || !stride) {
return false;
}
auto begin_val = begin->get_vector<int64_t>();
auto end_val = end->get_vector<int64_t>();
auto stride_val = stride->get_vector<int64_t>();
if (begin_val.size() != 1 || begin_val[0] != 2) {
return false;
}
if (end_val.size() != 1 || end_val[0] != 4) {
return false;
}
if (stride_val.size() != 1 || stride_val[0] != 1) {
return false;
}
// TODO: should we check second StridedSlice?
input_1 = strided_slice1->input_value(0).get_node_shared_ptr();
input_2 = strided_slice2->input_value(0).get_node_shared_ptr();
convert1 = std::dynamic_pointer_cast<ngraph::opset1::Convert> (input_1);
convert2 = std::dynamic_pointer_cast<ngraph::opset1::Convert> (input_2);
if (convert1 && convert2) {
ops_to_replace.push_back(convert1);
ops_to_replace.push_back(convert2);
input_1 = convert1->input_value(0).get_node_shared_ptr();
input_2 = convert2->input_value(0).get_node_shared_ptr();
}
// the input can be either ShapeOf-1 or ShapeOf-3
std::shared_ptr<ngraph::op::Op> shape_of1 = std::dynamic_pointer_cast<ngraph::opset1::ShapeOf> (input_1);
std::shared_ptr<ngraph::op::Op> shape_of2 = std::dynamic_pointer_cast<ngraph::opset1::ShapeOf> (input_2);
if (!shape_of1 || !shape_of2) {
shape_of1 = std::dynamic_pointer_cast<ngraph::opset3::ShapeOf>(input_1);
shape_of2 = std::dynamic_pointer_cast<ngraph::opset3::ShapeOf>(input_2);
}
if (!shape_of1 || !shape_of2) {
return false;
}
// keep this code for a while in case we decide to run this transformation again in the opset1->legacy pipeline
// the input can be either ShapeOf or Convert(ShapeOf)
// if (!shape_of1 || !shape_of2) {
// auto shapeof1_convert = std::dynamic_pointer_cast<ngraph::opset1::Convert> (input_1);
// auto shapeof2_convert = std::dynamic_pointer_cast<ngraph::opset1::Convert> (input_2);
// if (!shapeof1_convert || !shapeof2_convert)
// return false;
// shape_of1 = std::dynamic_pointer_cast<ngraph::opset1::ShapeOf>(shapeof1_convert->input_value(0).get_node_shared_ptr());
// shape_of2 = std::dynamic_pointer_cast<ngraph::opset1::ShapeOf>(shapeof2_convert->input_value(0).get_node_shared_ptr());
// if (!shape_of1 || !shape_of2)
// return false;
// ops_to_replace.push_back(shapeof1_convert);
// ops_to_replace.push_back(shapeof2_convert);
// }
ops_to_replace.push_back(shape_of1);
ops_to_replace.push_back(shape_of2);
auto prior_box_ie = std::make_shared<ngraph::op::PriorBoxIE> (shape_of1->input_value(0),
shape_of2->input_value(0),
prior_box_node->get_attrs());
prior_box_ie->set_friendly_name(unsqueeze->get_friendly_name());
// Nodes in copy runtime info function should be in topological order
std::reverse(ops_to_replace.begin(), ops_to_replace.end());
ngraph::copy_runtime_info(ops_to_replace, prior_box_ie);
ngraph::replace_node(m.get_match_root(), prior_box_ie);
return true;
};
auto m = std::make_shared<ngraph::pattern::Matcher>(unsqueeze, "CPUFusion.ConvertPriorBoxToPriorBoxIE");
this->add_matcher(m, callback, PassProperty::CHANGE_DYNAMIC_STATE);
}
void ngraph::pass::ConvertPriorBox::convert_prior_box_clustered() {
auto data = std::make_shared<pattern::op::Label>(element::i64, Shape{1, 1, 1, 1});
auto axes = ngraph::opset1::Constant::create(element::i64, Shape{1}, {0});
auto image = std::make_shared<pattern::op::Label>(element::i64, Shape{1, 1, 1, 1});
ngraph::op::PriorBoxClusteredAttrs attr;
attr.widths = {0.1f, 0.1f, 0.2f, 0.2f};
attr.heights = {0.1f, 0.1f, 0.2f, 0.2f};
attr.variances = {0.1f, 0.1f, 0.2f, 0.2f};
attr.step_widths = 64.0f;
attr.step_heights = 64.0f;
attr.offset = 0.5f;
attr.clip = false;
auto prior_box = std::make_shared<ngraph::opset1::PriorBoxClustered>(data, image, attr);
auto unsqueeze = std::make_shared<ngraph::opset1::Unsqueeze> (prior_box, axes);
ngraph::graph_rewrite_callback callback = [](pattern::Matcher& m) {
auto unsqueeze = std::dynamic_pointer_cast<ngraph::opset1::Unsqueeze> (m.get_match_root());
if (!unsqueeze) {
return false;
}
auto prior_box_node = std::dynamic_pointer_cast<ngraph::opset1::PriorBoxClustered> (unsqueeze->get_argument(0));
if (!prior_box_node) {
return false;
}
// vector of nGraph nodes that will be replaced
ngraph::NodeVector ops_to_replace{unsqueeze, prior_box_node};
std::shared_ptr<Node> input_1(prior_box_node->input_value(0).get_node_shared_ptr());
std::shared_ptr<Node> input_2(prior_box_node->input_value(1).get_node_shared_ptr());
auto convert1 = std::dynamic_pointer_cast<ngraph::opset1::Convert> (input_1);
auto convert2 = std::dynamic_pointer_cast<ngraph::opset1::Convert> (input_2);
if (convert1 && convert2) {
ops_to_replace.push_back(convert1);
ops_to_replace.push_back(convert2);
input_1 = convert1->input_value(0).get_node_shared_ptr();
input_2 = convert2->input_value(0).get_node_shared_ptr();
}
auto strided_slice1 = std::dynamic_pointer_cast<ngraph::opset1::StridedSlice> (input_1);
auto strided_slice2 = std::dynamic_pointer_cast<ngraph::opset1::StridedSlice> (input_2);
if (!strided_slice1 || !strided_slice2) {
return false;
}
ops_to_replace.push_back(strided_slice1);
ops_to_replace.push_back(strided_slice2);
// Check that StridedSlice1 cuts H,W dims for PriorBox
auto begin = std::dynamic_pointer_cast<ngraph::opset1::Constant> (strided_slice1->get_argument(1));
auto end = std::dynamic_pointer_cast<ngraph::opset1::Constant> (strided_slice1->get_argument(2));
auto stride = std::dynamic_pointer_cast<ngraph::opset1::Constant> (strided_slice1->get_argument(3));
if (!begin || !end || !stride) {
return false;
}
auto begin_val = begin->get_vector<int64_t>();
auto end_val = end->get_vector<int64_t>();
auto stride_val = stride->get_vector<int64_t>();
if (begin_val.size() != 1 || begin_val[0] != 2) {
return false;
}
if (end_val.size() != 1 || end_val[0] != 4) {
return false;
}
if (stride_val.size() != 1 || stride_val[0] != 1) {
return false;
}
// TODO: should we check second StridedSlice?
input_1 = strided_slice1->input_value(0).get_node_shared_ptr();
input_2 = strided_slice2->input_value(0).get_node_shared_ptr();
convert1 = std::dynamic_pointer_cast<ngraph::opset1::Convert> (input_1);
convert2 = std::dynamic_pointer_cast<ngraph::opset1::Convert> (input_2);
if (convert1 && convert2) {
ops_to_replace.push_back(convert1);
ops_to_replace.push_back(convert2);
input_1 = convert1->input_value(0).get_node_shared_ptr();
input_2 = convert2->input_value(0).get_node_shared_ptr();
}
// the input can be either ShapeOf-1 or ShapeOf-3
std::shared_ptr<ngraph::op::Op> shape_of1 = std::dynamic_pointer_cast<ngraph::opset1::ShapeOf> (input_1);
std::shared_ptr<ngraph::op::Op> shape_of2 = std::dynamic_pointer_cast<ngraph::opset1::ShapeOf> (input_2);
if (!shape_of1 || !shape_of2) {
shape_of1 = std::dynamic_pointer_cast<ngraph::opset3::ShapeOf>(input_1);
shape_of2 = std::dynamic_pointer_cast<ngraph::opset3::ShapeOf>(input_2);
}
if (!shape_of1 || !shape_of2) {
return false;
}
// keep this code for a while in case we decide to run this transformation again in the opset1->legacy pipeline
// the input can be either ShapeOf or Convert(ShapeOf)
// if (!shape_of1 || !shape_of2) {
// auto shapeof1_convert = std::dynamic_pointer_cast<ngraph::opset1::Convert> (input_1);
// auto shapeof2_convert = std::dynamic_pointer_cast<ngraph::opset1::Convert> (input_2);
// if (!shapeof1_convert || !shapeof2_convert)
// return false;
// shape_of1 = std::dynamic_pointer_cast<ngraph::opset1::ShapeOf>(shapeof1_convert->input_value(0).get_node_shared_ptr());
// shape_of2 = std::dynamic_pointer_cast<ngraph::opset1::ShapeOf>(shapeof2_convert->input_value(0).get_node_shared_ptr());
// if (!shape_of1 || !shape_of2)
// return false;
// ops_to_replace.push_back(shapeof1_convert);
// ops_to_replace.push_back(shapeof2_convert);
// }
ops_to_replace.push_back(shape_of1);
ops_to_replace.push_back(shape_of2);
auto prior_box_ie = std::make_shared<ngraph::op::PriorBoxClusteredIE> (shape_of1->get_argument(0),
shape_of2->get_argument(0),
prior_box_node->get_attrs());
prior_box_ie->set_friendly_name(unsqueeze->get_friendly_name());
// Nodes in copy runtime info function should be in topological order
std::reverse(ops_to_replace.begin(), ops_to_replace.end());
ngraph::copy_runtime_info(ops_to_replace, prior_box_ie);
ngraph::replace_node(unsqueeze, prior_box_ie);
return true;
};
auto m = std::make_shared<ngraph::pattern::Matcher>(unsqueeze, "CPUFusion.ConvertPriorBoxClusteredToPriorBoxClusteredIE");
this->add_matcher(m, callback, PassProperty::CHANGE_DYNAMIC_STATE);
}


@@ -41,10 +41,6 @@ void ngraph::pass::ConvertStridedSliceToCrop::convert_strided_slice_to_crop() {
auto input_shape = slice->get_input_shape(0);
auto output_shape = slice->get_output_shape(0);
// MKLDNN: "Crop supports only 2d, 4d and 5d blobs."
if (input_shape.size() != 2 && input_shape.size() != 4 && input_shape.size() != 5) {
return false;
}
auto begin = begin_node->cast_vector<int64_t>();
auto end = end_node->cast_vector<int64_t>();
@@ -201,6 +197,12 @@ void ngraph::pass::ConvertStridedSliceToCrop::convert_strided_slice_to_crop() {
new_ops.push_back(data_node);
}
auto data_node_shape = data_node->get_output_shape(0);
// MKLDNN: "Crop supports only 2d, 4d and 5d blobs."
if (data_node_shape.size() != 2 && data_node_shape.size() != 4 && data_node_shape.size() != 5) {
return false;
}
// Crop
data_node = std::make_shared<ngraph::op::CropIE> (data_node, axes, dim, offset);
data_node->set_friendly_name(slice->get_friendly_name());


@@ -42,22 +42,37 @@ void ngraph::pass::ConvertTopKToTopKIE::convert_topk_to_topk_ie() {
topk->get_sort_type());
new_ops.push_back(topk_ie);
Output<Node> element_output;
Output<Node> index_output;
// insert Convert if index element type not equal to i32
if (topk->get_index_element_type() == element::i32) {
// insert Convert if the index element type is not i32 and output #1 of TopK has consumers
if (topk->get_index_element_type() == element::i32 || topk->get_output_target_inputs(1).size() == 0) {
element_output = topk_ie->output(0);
index_output = topk_ie->output(1);
} else {
topk_ie->set_friendly_name(topk->get_friendly_name());
} else if (topk->get_output_target_inputs(0).size() == 0) {
index_output = std::make_shared<opset1::Convert>(topk_ie->output(1), topk->get_index_element_type());
new_ops.push_back(index_output.get_node_shared_ptr());
// workaround for naming output #1 of TopK
index_output.get_node_shared_ptr()->set_friendly_name(topk->get_friendly_name() + ".1");
} else {
// create a fake Convert for output #0; this is a workaround to preserve correct output names
element_output = std::make_shared<opset1::Convert>(topk_ie->output(0), topk->get_output_element_type(0));
index_output = std::make_shared<opset1::Convert>(topk_ie->output(1), topk->get_index_element_type());
new_ops.push_back(element_output.get_node_shared_ptr());
new_ops.push_back(index_output.get_node_shared_ptr());
// workaround for naming two outputs of TopK
element_output.get_node_shared_ptr()->set_friendly_name(topk->get_friendly_name() + ".0");
index_output.get_node_shared_ptr()->set_friendly_name(topk->get_friendly_name() + ".1");
}
topk_ie->set_friendly_name(topk->get_friendly_name());
ngraph::copy_runtime_info(topk, new_ops);
topk->output(0).replace(topk_ie->output(0));
topk->output(0).replace(element_output);
topk->output(1).replace(index_output);
return true;
};
auto m = std::make_shared<ngraph::pattern::Matcher>(topk, "ConvertTopKToTopKIE");
this->add_matcher(m, callback, PassProperty::CHANGE_DYNAMIC_STATE);
}
}


@@ -20,24 +20,40 @@ void ngraph::pass::ConvertTopK3::convert_topk3() {
if (!topk) {
return false;
}
Output<Node> last;
Output<Node> last0;
Output<Node> last1;
ngraph::NodeVector new_ops;
auto new_topk = std::make_shared<ngraph::opset2::TopK>(topk->input_value(0), topk->input_value(1),
topk->get_axis(), topk->get_mode(), topk->get_sort_type(), element::i32);
new_ops.push_back(new_topk);
// if the output is the i32 then it matches behavior of the v1::TopK otherwise need to insert Convert
if (topk->get_index_element_type() == element::i32) {
last = new_topk->output(1);
// if the index type is i32 or output #1 has no consumers,
// then the behavior matches v1::TopK; otherwise we need to insert Convert
if (topk->get_index_element_type() == element::i32 || topk->get_output_target_inputs(1).size() == 0) {
last0 = new_topk->output(0);
last1 = new_topk->output(1);
new_topk->set_friendly_name(topk->get_friendly_name());
} else if (topk->get_output_target_inputs(0).size() == 0) {
last1 = std::make_shared<ngraph::opset2::Convert>(new_topk->output(1), topk->get_index_element_type());
new_ops.push_back(last1.get_node_shared_ptr());
// workaround for naming two outputs of TopK
last1.get_node_shared_ptr()->set_friendly_name(topk->get_friendly_name() + ".1");
} else {
last = std::make_shared<ngraph::opset2::Convert>(new_topk->output(1), topk->get_index_element_type());
new_ops.push_back(last.get_node_shared_ptr());
// create a fake Convert for output #0; this is a workaround to preserve correct output names
last0 = std::make_shared<ngraph::opset2::Convert>(new_topk->output(0), topk->get_output_element_type(0));
last1 = std::make_shared<ngraph::opset2::Convert>(new_topk->output(1), topk->get_index_element_type());
new_ops.push_back(last0.get_node_shared_ptr());
new_ops.push_back(last1.get_node_shared_ptr());
// workaround for naming two outputs of TopK
last0.get_node_shared_ptr()->set_friendly_name(topk->get_friendly_name() + ".0");
last1.get_node_shared_ptr()->set_friendly_name(topk->get_friendly_name() + ".1");
}
new_topk->set_friendly_name(topk->get_friendly_name());
ngraph::copy_runtime_info(topk, new_ops);
topk->output(0).replace(new_topk->output(0));
topk->output(1).replace(last);
topk->output(0).replace(last0);
topk->output(1).replace(last1);
return true;
};


@@ -30,7 +30,7 @@ bool check_block_first(const ngraph::Shape& shape_input, const ngraph::Shape& sh
is_transformation_valid &= (expected_shape == shape_reshape_before);
// x'' = transpose(x', [0, K + 1, K + 2, 1, K + 3, 2, K + 4, 3, ..., K + (K + 1), K])
ngraph::AxisVector expected_permutation = {0, spatial_dims + 1};
ngraph::AxisVector expected_permutation = {0, static_cast<size_t>(spatial_dims + 1)};
for (uint64_t i = 2; i < shape_input.size(); ++i) {
expected_permutation.push_back(spatial_dims + i);
expected_permutation.push_back(i - 1);
@@ -38,7 +38,7 @@ bool check_block_first(const ngraph::Shape& shape_input, const ngraph::Shape& sh
is_transformation_valid &= (expected_permutation == permutation);
// y = reshape(x'', [N, C / (block_size ^ K), D1 * block_size, D2 * block_size, D3 * block_size, ..., DK * block_size])
expected_shape = {shape_input[0], c_dim};
expected_shape = {shape_input[0], static_cast<size_t>(c_dim)};
for (uint64_t i = 2; i < shape_input.size(); ++i)
expected_shape.push_back(shape_input[i] * possible_block_size);
is_transformation_valid &= (expected_shape == shape_reshape_after);
@@ -57,7 +57,7 @@ bool check_depth_first(const ngraph::Shape& shape_input, const ngraph::Shape& sh
uint64_t c_dim = shape_input[1] / std::pow(possible_block_size, spatial_dims);
// x' = reshape(data, [N, C / (block_size ^ K), block_size, block_size, ..., block_size, D1, D2, ..., DK])
ngraph::Shape expected_shape = {shape_input[0], c_dim};
ngraph::Shape expected_shape = {shape_input[0], static_cast<size_t>(c_dim)};
for (uint64_t i = 0; i < spatial_dims; ++i)
expected_shape.push_back(possible_block_size);
for (uint64_t i = 2; i < shape_input.size(); ++i)
@@ -73,7 +73,7 @@ bool check_depth_first(const ngraph::Shape& shape_input, const ngraph::Shape& sh
is_transformation_valid &= (expected_permutation == permutation);
// y = reshape(x'', [N, C / (block_size ^ K), D1 * block_size, D2 * block_size, D3 * block_size, ..., DK * block_size])
expected_shape = {shape_input[0], c_dim};
expected_shape = {shape_input[0], static_cast<size_t>(c_dim)};
for (uint64_t i = 2; i < shape_input.size(); ++i)
expected_shape.push_back(shape_input[i] * possible_block_size);
is_transformation_valid &= (expected_shape == shape_reshape_after);


@@ -26,7 +26,7 @@ namespace vpu {
template <typename T>
Optional<int> parseNumber(const std::string& s) {
T value;
auto value = T{};
if ((std::istringstream(s) >> value >> std::ws).eof()) {
return {value};
}


@@ -39,7 +39,7 @@ void dynamicToStaticShapeBinaryEltwise(std::shared_ptr<ngraph::Node> eltwise) {
const auto diff = std::abs(lhsRank.get_length() - rhsRank.get_length());
if (diff) {
auto & broadcastInput = lhsRank.get_length() < rhsRank.get_length() ? lhsInput : rhsInput;
const auto broadcastConst = ngraph::opset3::Constant::create(broadcastInput.get_element_type(), {static_cast<uint64_t>(diff)}, {1});
const auto broadcastConst = ngraph::opset3::Constant::create(broadcastInput.get_element_type(), {static_cast<size_t>(diff)}, {1});
broadcastInput = std::make_shared<ngraph::opset3::Concat>(ngraph::OutputVector{broadcastConst, broadcastInput}, 0);
}


@@ -392,8 +392,17 @@ inline Stage ModelObj::addNewStage(
// runAllocator
//
VPU_DECLARE_ENUM(EnableShapeAllocation,
YES,
NO)
VPU_DECLARE_ENUM(CheckOnlyCMX,
YES,
NO)
AllocationResult runAllocator(
const Model& model,
bool onlyCheckCMX = false);
EnableShapeAllocation = EnableShapeAllocation::NO,
CheckOnlyCMX = CheckOnlyCMX::NO);
} // namespace vpu


@@ -84,9 +84,11 @@ void BackEnd::getMetaData(
stageMeta.layerName = "<Extra>";
stageMeta.layerType = "<Extra>";
} else {
stageMeta.layerName = stage->origLayer()->name;
stageMeta.layerType = stage->origLayer()->type;
visitedLayers.insert(stage->origLayer());
const auto& origLayer = stage->origLayer();
stageMeta.layerName = origLayer->params.count("originalLayersNames") ? origLayer->params["originalLayersNames"] :
origLayer->name;
stageMeta.layerType = origLayer->type;
visitedLayers.insert(origLayer);
}
return stageMeta;


@@ -184,9 +184,9 @@ CustomLayer::CustomLayer(std::string configDir, const pugi::xml_node& customLaye
stageOrder.emplace(stageNum, CustomKernel{kernel, _configDir});
}
VPU_THROW_UNLESS(stageOrder.begin()->first == 0,
VPU_THROW_UNLESS(!stageOrder.empty() && stageOrder.begin()->first == 0,
"Error while binding %s custom layer: Stage 0 is not found.", _layerName);
VPU_THROW_UNLESS(stageOrder.rbegin()->first == stageOrder.size() - 1,
VPU_THROW_UNLESS(!stageOrder.empty() && stageOrder.rbegin()->first == stageOrder.size() - 1,
"Error while binding %s custom layer: Kernels should have stage id from 0 to N.", _layerName);
for (auto& stage : stageOrder) {


@@ -430,6 +430,19 @@ bool checkHWRestrictions(
int kernelSizeX, int kernelSizeY,
int kernelStride,
HwOpMode mode, HwOpType type) {
// Workaround for HW ops failing on too-wide inputs:
// HW operations (primarily Pooling) appear to be able
// to use only part of the available CMX, up to 1014 * 128
// bits (i.e. 1014 * 16 bytes).
// Provided HwOpMode is 16x16, this means the HW needs
// to read up to 16 lines of the input tensor, so each
// line must not exceed 1014 bytes, or 507 pixels if the
// precision is FP16.
// More details are available in ticket #-33366
if (inTileWidth > 507) {
return false;
}
const int chansPerBlock = 1 << static_cast<int>(mode);
int noOfBlocks = divUp(inTileChannels, chansPerBlock);


@@ -193,10 +193,10 @@ void PassImpl::wrapInLoop(const Model& model, const StageList& subgraph) {
loopEndOutputs.push_back(originalOutput);
const auto rule = IterationRule{Dim::N, 0, 1, -1};
endIterationComponents.emplace(std::make_pair(loopEndOutputs.size() - 1, rule), loopEndInputs.size() - 1);
} else {
for (const auto& consumerEdge : originalOutput->consumerEdges()) {
}
for (const auto& consumerEdge : originalOutput->consumerEdges()) {
if (subgraph.has(consumerEdge->consumer()))
model->replaceStageInput(consumerEdge, output);
}
}
}
}


@@ -458,7 +458,7 @@ void PassImpl::packDataInCmx(const Model& model) {
return DataLoopStatus::NextChild;
});
auto allocRes = runAllocator(model, true);
auto allocRes = runAllocator(model, EnableShapeAllocation::NO, CheckOnlyCMX::YES);
env.log->trace("Allocation result : %v", allocRes.status);
if (allocRes.status != AllocationStatus::OK) {


@@ -25,7 +25,7 @@ namespace vpu {
// runAllocator
//
AllocationResult runAllocator(const Model& model, bool onlyCheckCMX) {
AllocationResult runAllocator(const Model& model, EnableShapeAllocation enableShapeAllocation, CheckOnlyCMX checkOnlyCmx) {
VPU_PROFILE(runAllocator);
auto& allocator = model->getAllocator();
@@ -40,7 +40,7 @@ AllocationResult runAllocator(const Model& model, bool onlyCheckCMX) {
// Allocate Const/Input/Output datas.
//
if (!onlyCheckCMX) {
if (checkOnlyCmx == CheckOnlyCMX::NO) {
auto result = allocator.preprocess(model);
if (result.status != vpu::AllocationStatus::OK) {
return result;
@@ -86,14 +86,14 @@ AllocationResult runAllocator(const Model& model, bool onlyCheckCMX) {
// Allocate stage outputs.
//
const auto allocateStageOutputs = [onlyCheckCMX, &allocator](const Stage& stage) -> AllocationResult {
const auto allocateStageOutputs = [checkOnlyCmx, &allocator](const Stage& stage) -> AllocationResult {
for (const auto& output : stage->outputs()) {
if (onlyCheckCMX && output->memReqs() != MemoryType::CMX) {
if (checkOnlyCmx == CheckOnlyCMX::YES && output->memReqs() != MemoryType::CMX) {
continue;
}
if (!allocator.allocateData(output)) {
if (output->memReqs() == MemoryType::CMX && !onlyCheckCMX) {
if (output->memReqs() == MemoryType::CMX && checkOnlyCmx == CheckOnlyCMX::NO) {
if (allocator.removeCMXCandidates(output)) {
if (allocator.allocateData(output)) {
continue;
@@ -123,7 +123,7 @@ AllocationResult runAllocator(const Model& model, bool onlyCheckCMX) {
// Allocate stage temporary buffers.
//
if (!onlyCheckCMX) {
if (checkOnlyCmx == CheckOnlyCMX::NO) {
for (const auto& tempBufferEdge : stage->tempBufferEdges()) {
if (!allocator.allocateData(tempBufferEdge->tempBuffer())) {
allocator.setNeedToAllocNonIntermData();
@@ -157,7 +157,7 @@ AllocationResult runAllocator(const Model& model, bool onlyCheckCMX) {
//
for (const auto& input : stage->inputs()) {
if (onlyCheckCMX && input->memReqs() != MemoryType::CMX) {
if (checkOnlyCmx == CheckOnlyCMX::YES && input->memReqs() != MemoryType::CMX) {
continue;
}
@@ -168,7 +168,7 @@ AllocationResult runAllocator(const Model& model, bool onlyCheckCMX) {
// Release stage temporary buffers.
//
if (!onlyCheckCMX) {
if (checkOnlyCmx == CheckOnlyCMX::NO) {
for (const auto& tempBufferEdge : stage->tempBufferEdges()) {
allocator.freeData(tempBufferEdge->tempBuffer());
}
@@ -195,7 +195,7 @@ AllocationResult runAllocator(const Model& model, bool onlyCheckCMX) {
if (const auto& parentEdge = data->parentDataToShapeEdge()) {
const auto& parent = parentEdge->parent();
if (parent->usage() == DataUsage::Intermediate && (!onlyCheckCMX || parent->memReqs() == MemoryType::CMX)) {
if (parent->usage() == DataUsage::Intermediate && (checkOnlyCmx == CheckOnlyCMX::NO || parent->memReqs() == MemoryType::CMX)) {
allocator.freeData(parent);
}
}
@@ -205,9 +205,11 @@ AllocationResult runAllocator(const Model& model, bool onlyCheckCMX) {
// Allocate shape for all datas
//
for (auto data : model->datas()) {
const auto shapeLocation = allocator.allocateShape(data);
data->setShapeAllocationInfo(shapeLocation);
if (enableShapeAllocation == EnableShapeAllocation::YES) {
for (auto data : model->datas()) {
const auto shapeLocation = allocator.allocateShape(data);
data->setShapeAllocationInfo(shapeLocation);
}
}
return AllocationResult();
@@ -233,7 +235,7 @@ void PassImpl::run(const Model& model) {
// Allocate all resources
//
auto allocRes = runAllocator(model);
auto allocRes = runAllocator(model, EnableShapeAllocation::YES);
IE_ASSERT(allocRes.status == AllocationStatus::OK);
//


@@ -160,7 +160,7 @@ void PassImpl::run(const Model& model) {
model->replaceStageInput(consumerEdge, copyOutput);
}
auto allocRes = runAllocator(model, true);
auto allocRes = runAllocator(model, EnableShapeAllocation::NO, CheckOnlyCMX::YES);
if (allocRes.status != AllocationStatus::OK) {
model->replaceStageOutput(copyProducer->outputEdge(0), copyInput);


@@ -171,7 +171,7 @@ void PassImpl::run(const Model& model) {
.childSW(swStage)
.done();
auto allocRes = runAllocator(model, true);
auto allocRes = runAllocator(model, EnableShapeAllocation::NO, CheckOnlyCMX::YES);
if (allocRes.status == AllocationStatus::OK) {
// TODO: try to merge more than one SW stage?
break;


@@ -160,7 +160,9 @@ void ParsedConfig::parse(const std::map<std::string, std::string>& config) {
setOption(_compileConfig.hwExtraSplit, switches, config, VPU_CONFIG_KEY(HW_EXTRA_SPLIT));
setOption(_compileConfig.injectSwOps, switches, config, VPU_CONFIG_KEY(HW_INJECT_STAGES));
setOption(_compileConfig.mergeHwPoolToConv, switches, config, VPU_CONFIG_KEY(HW_POOL_CONV_MERGE));
IE_SUPPRESS_DEPRECATED_START
setOption(_compileConfig.ignoreIRStatistic, switches, config, VPU_CONFIG_KEY(IGNORE_IR_STATISTIC));
IE_SUPPRESS_DEPRECATED_END
setOption(_compileConfig.hwDilation, switches, config, VPU_CONFIG_KEY(HW_DILATION));
setOption(_compileConfig.forceDeprecatedCnnConversion, switches, config, VPU_CONFIG_KEY(FORCE_DEPRECATED_CNN_CONVERSION));
setOption(_compileConfig.disableReorder, switches, config, VPU_CONFIG_KEY(DISABLE_REORDER));


@@ -266,6 +266,8 @@ void FrontEnd::parseConcat(
const ie::CNNLayerPtr& layer,
const DataVector& inputs,
const DataVector& outputs) const {
VPU_THROW_UNLESS(layer != nullptr, "parseConcat expects valid CNNLayerPtr, actually got nullptr");
VPU_THROW_UNLESS(!inputs.empty(),
"{} layer with name {} must have no less than 1 input, "
"actually provided 0 inputs", layer->type, layer->name);
@@ -275,10 +277,8 @@ void FrontEnd::parseConcat(
auto output = outputs[0];
auto concat = std::dynamic_pointer_cast<ie::ConcatLayer>(layer);
VPU_THROW_UNLESS(layer != nullptr,
"{} layer with name {} must be able to convert to ie::ConcatLayer",
layer->type, layer->name);
const auto& concat = std::dynamic_pointer_cast<ie::ConcatLayer>(layer);
VPU_THROW_UNLESS(concat != nullptr, "{} layer with name {} must be convertable to ie::ConcatLayer", layer->type, layer->name);
VPU_THROW_UNLESS(concat->_axis < output->desc().numDims(),
"{} layer with name {} must have axis attribute no greater than number of "


@@ -128,9 +128,8 @@ private:
void FrontEnd::parseReduce(const Model& model, const ie::CNNLayerPtr& _layer, const DataVector& inputs, const DataVector& outputs) const {
auto layer = std::dynamic_pointer_cast<ie::ReduceLayer>(_layer);
VPU_THROW_UNLESS(layer != nullptr,
"Layer {} of type {} is nullptr",
layer->name, layer->type);
VPU_THROW_UNLESS(layer != nullptr, "parseReduce expects valid ReduceLayer, actually got nullptr");
VPU_THROW_UNLESS(inputs.size() == 2,
"Layer {} of type {} expects {} inputs, but provided {}",
layer->name, layer->type, 2, inputs.size());


@@ -107,6 +107,7 @@ Engine::Engine(std::shared_ptr<IMvnc> mvnc) :
_pluginName = "MYRIAD";
IE_SUPPRESS_DEPRECATED_START
_config = {
{ KEY_VPU_HW_STAGES_OPTIMIZATION, "ON" },
{ KEY_LOG_LEVEL, "LOG_NONE" },
@@ -120,6 +121,7 @@ Engine::Engine(std::shared_ptr<IMvnc> mvnc) :
{ KEY_CONFIG_FILE, "" },
{ KEY_DEVICE_ID, "" },
};
IE_SUPPRESS_DEPRECATED_END
}
InferenceEngine::ExecutableNetwork Engine::ImportNetwork(


@@ -17,6 +17,7 @@
#include <ie_core.hpp>
#include <net_pass.h>
#include <ngraph/opsets/opset3.hpp>
#include <ngraph/function.hpp>
#include <ngraph/variant.hpp>
#include <ngraph/op/maximum.hpp>
@@ -680,4 +681,25 @@ TEST(CNNNGraphImplTests, TestCheckStats) {
ASSERT_EQ(nullptr, _stats);
}
TEST(CNNNGraphImplTests, CanSetBatchReadValue) {
std::shared_ptr<ngraph::Function> ngraph;
{
auto input = std::make_shared<ngraph::opset3::Parameter>(ngraph::element::f32, ngraph::Shape{1, 2});
auto constant = std::make_shared<ngraph::opset3::Constant>(ngraph::element::f32, ngraph::Shape{1, 2},
std::vector<float>{1, 2});
auto read_value = std::make_shared<ngraph::opset3::ReadValue>(constant, "variable_id");
auto add = std::make_shared<ngraph::opset3::Add>(input, read_value);
auto result = std::make_shared<ngraph::op::Result>(add);
ngraph::ParameterVector params = {input};
ngraph::ResultVector results = {result};
ngraph = std::make_shared<ngraph::Function>(results, params);
}
InferenceEngine::details::CNNNetworkNGraphImpl cnnNet(ngraph);
auto status = cnnNet.getCNNNetwork()->setBatchSize(4, nullptr);
EXPECT_EQ(status, StatusCode::OK);
}
IE_SUPPRESS_DEPRECATED_END


@@ -60,9 +60,11 @@ protected:
/* validates a read network with the reference map of CNN layers */
void compareWithRef(const InferenceEngine::CNNNetwork &network,
const std::vector<InferenceEngine::CNNLayerPtr> &refLayersVec) {
IE_SUPPRESS_DEPRECATED_START
ASSERT_NO_THROW(FuncTestUtils::compareLayerByLayer<std::vector<InferenceEngine::CNNLayerPtr>>(
InferenceEngine::details::CNNNetSortTopologically(network),
refLayersVec, false));
IE_SUPPRESS_DEPRECATED_END
}
const std::string _modelPath = "NetReader_test.xml";


@@ -30,16 +30,6 @@ TEST_F(NGraphReaderTests, ReadPriorBoxClusteredNetwork) {
</port>
</output>
</layer>
<layer id="15" name="in3" type="Parameter" version="opset1">
<data element_type="f32" shape="1,2,32400"/>
<output>
<port id="0" precision="FP32">
<dim>1</dim>
<dim>2</dim>
<dim>32400</dim>
</port>
</output>
</layer>
<layer id="2" name="shape_of1" type="ShapeOf" version="opset1">
<input>
<port id="0" precision="FP32">
@@ -182,63 +172,19 @@ TEST_F(NGraphReaderTests, ReadPriorBoxClusteredNetwork) {
</port>
</output>
</layer>
<layer name="concat" id="16" type="Concat" version="opset1">
<data axis="1"/>
<input>
<port id="0" precision="FP32">
<dim>1</dim>
<dim>2</dim>
<dim>32400</dim>
</port>
<port id="1" precision="FP32">
<dim>1</dim>
<dim>2</dim>
<dim>32400</dim>
</port>
</input>
<output>
<port id="2" precision="FP32">
<dim>1</dim>
<dim>4</dim>
<dim>32400</dim>
</port>
</output>
</layer>
<layer id="10" name="output" type="Result" version="opset1">
<input>
<port id="0" precision="FP32">
<dim>1</dim>
<dim>4</dim>
<dim>2</dim>
<dim>32400</dim>
</port>
</input>
</layer>
<layer id="13" name="output_2" type="Result" version="opset1">
<input>
<port id="0" precision="FP32">
<dim>1</dim>
<dim>768</dim>
<dim>30</dim>
<dim>30</dim>
</port>
</input>
</layer>
<layer id="14" name="output_3" type="Result" version="opset1">
<input>
<port id="0" precision="FP32">
<dim>1</dim>
<dim>3</dim>
<dim>512</dim>
<dim>512</dim>
</port>
</input>
</layer>
</layers>
<edges>
<edge from-layer="0" from-port="0" to-layer="2" to-port="0"/>
<edge from-layer="0" from-port="0" to-layer="13" to-port="0"/>
<edge from-layer="1" from-port="0" to-layer="6" to-port="0"/>
<edge from-layer="1" from-port="0" to-layer="14" to-port="0"/>
<edge from-layer="2" from-port="1" to-layer="5" to-port="0"/>
<edge from-layer="6" from-port="1" to-layer="7" to-port="0"/>
<edge from-layer="3" from-port="1" to-layer="5" to-port="1"/>
@@ -251,90 +197,66 @@ TEST_F(NGraphReaderTests, ReadPriorBoxClusteredNetwork) {
<edge from-layer="7" from-port="4" to-layer="8" to-port="1"/>
<edge from-layer="8" from-port="2" to-layer="11" to-port="0"/>
<edge from-layer="12" from-port="0" to-layer="11" to-port="1"/>
<edge from-layer="11" from-port="2" to-layer="16" to-port="1"/>
<edge from-layer="16" from-port="2" to-layer="10" to-port="0"/>
<edge from-layer="15" from-port="0" to-layer="16" to-port="0"/>
<edge from-layer="11" from-port="2" to-layer="10" to-port="0"/>
</edges>
</net>
)V0G0N";
std::string modelV5 = R"V0G0N(
<net name="Network" version="5" precision="FP32" batch="1">
<layers>
<layer name="in2" type="Input" precision="FP32" id="0">
<data originalLayersNames="in2" />
<output>
<port id="0" precision="FP32">
<dim>1</dim>
<dim>3</dim>
<dim>512</dim>
<dim>512</dim>
</port>
</output>
</layer>
<layer name="in1" type="Input" precision="FP32" id="1">
<data originalLayersNames="in1" />
<output>
<port id="0" precision="FP32">
<dim>1</dim>
<dim>768</dim>
<dim>30</dim>
<dim>30</dim>
</port>
</output>
</layer>
<layer name="in3" type="Input" precision="FP32" id="2">
<data originalLayersNames="in3" />
<output>
<port id="0" precision="FP32">
<dim>1</dim>
<dim>2</dim>
<dim>32400</dim>
</port>
</output>
</layer>
<layer name="Constant_49" type="Const" precision="FP32" id="3">
<output>
<port id="0" precision="FP32">
<dim>1</dim>
<dim>2</dim>
<dim>32400</dim>
</port>
</output>
<blobs>
<custom offset="0" size="259200" precision="FP32" />
</blobs>
</layer>
<layer name="concat" type="Concat" precision="FP32" id="4">
<data axis="1" originalLayersNames="concat" />
<input>
<port id="0">
<dim>1</dim>
<dim>2</dim>
<dim>32400</dim>
</port>
<port id="1">
<dim>1</dim>
<dim>2</dim>
<dim>32400</dim>
</port>
</input>
<output>
<port id="2" precision="FP32">
<dim>1</dim>
<dim>4</dim>
<dim>32400</dim>
</port>
</output>
</layer>
</layers>
<edges>
<edge from-layer="2" from-port="0" to-layer="4" to-port="0" />
<edge from-layer="3" from-port="0" to-layer="4" to-port="1" />
</edges>
<layers>
<layer id="0" name="in1" type="Input" precision="FP32">
<output>
<port id="0">
<dim>1</dim>
<dim>768</dim>
<dim>30</dim>
<dim>30</dim>
</port>
</output>
</layer>
<layer id="1" name="in2" type="Input" precision="FP32">
<output>
<port id="0">
<dim>1</dim>
<dim>3</dim>
<dim>512</dim>
<dim>512</dim>
</port>
</output>
</layer>
<layer name="ExpandDims" id="2" type="PriorBoxClustered" precision="FP32">
<data clip="0" step_h="16.000000" step_w="16.000000" flip="1" height="44,10,30,19,94,32,61,53,17" offset="0.500000" step="16.000000" variance="0.1,0.1,0.2,0.2" width="86,13,57,39,68,34,142,50,23" originalLayersNames="ExpandDims,prior,shape_of1,shape_of2,ss1,ss2"/>
<input>
<port id="1">
<dim>1</dim>
<dim>768</dim>
<dim>30</dim>
<dim>30</dim>
</port>
<port id="2">
<dim>1</dim>
<dim>3</dim>
<dim>512</dim>
<dim>512</dim>
</port>
</input>
<output>
<port id="3">
<dim>1</dim>
<dim>2</dim>
<dim>32400</dim>
</port>
</output>
</layer>
</layers>
<edges>
<edge from-layer="0" from-port="0" to-layer="2" to-port="1"/>
<edge from-layer="1" from-port="0" to-layer="2" to-port="2"/>
</edges>
</net>
)V0G0N";
compareIRs(model, modelV5, 259200, [](Blob::Ptr& weights) {
compareIRs(model, modelV5, 50, [](Blob::Ptr& weights) {
auto* buffer = weights->buffer().as<int64_t*>();
buffer[0] = 2;
buffer[1] = 4;
@@ -369,16 +291,6 @@ TEST_F(NGraphReaderTests, ReadPriorBoxNetwork) {
</port>
</output>
</layer>
<layer id="15" name="in3" type="Parameter" version="opset1">
<data element_type="f32" shape="1,2,14400"/>
<output>
<port id="0" precision="FP32">
<dim>1</dim>
<dim>2</dim>
<dim>14400</dim>
</port>
</output>
</layer>
<layer id="2" name="shape_of1" type="ShapeOf" version="opset1">
<input>
<port id="0" precision="FP32">
@@ -520,63 +432,19 @@ TEST_F(NGraphReaderTests, ReadPriorBoxNetwork) {
</port>
</output>
</layer>
<layer name="concat" id="16" type="Concat" version="opset1">
<data axis="1"/>
<input>
<port id="0" precision="FP32">
<dim>1</dim>
<dim>2</dim>
<dim>14400</dim>
</port>
<port id="1" precision="FP32">
<dim>1</dim>
<dim>2</dim>
<dim>14400</dim>
</port>
</input>
<output>
<port id="2" precision="FP32">
<dim>1</dim>
<dim>4</dim>
<dim>14400</dim>
</port>
</output>
</layer>
<layer id="10" name="output" type="Result" version="opset1">
<input>
<port id="0" precision="FP32">
<dim>1</dim>
<dim>4</dim>
<dim>2</dim>
<dim>14400</dim>
</port>
</input>
</layer>
<layer id="13" name="output_2" type="Result" version="opset1">
<input>
<port id="0" precision="FP32">
<dim>1</dim>
<dim>768</dim>
<dim>30</dim>
<dim>30</dim>
</port>
</input>
</layer>
<layer id="14" name="output_3" type="Result" version="opset1">
<input>
<port id="0" precision="FP32">
<dim>1</dim>
<dim>3</dim>
<dim>512</dim>
<dim>512</dim>
</port>
</input>
</layer>
</layers>
<edges>
<edge from-layer="0" from-port="0" to-layer="2" to-port="0"/>
<edge from-layer="0" from-port="0" to-layer="13" to-port="0"/>
<edge from-layer="1" from-port="0" to-layer="6" to-port="0"/>
<edge from-layer="1" from-port="0" to-layer="14" to-port="0"/>
<edge from-layer="2" from-port="1" to-layer="5" to-port="0"/>
<edge from-layer="6" from-port="1" to-layer="7" to-port="0"/>
<edge from-layer="3" from-port="1" to-layer="5" to-port="1"/>
@@ -589,90 +457,66 @@ TEST_F(NGraphReaderTests, ReadPriorBoxNetwork) {
<edge from-layer="7" from-port="4" to-layer="8" to-port="1"/>
<edge from-layer="8" from-port="2" to-layer="11" to-port="0"/>
<edge from-layer="12" from-port="0" to-layer="11" to-port="1"/>
<edge from-layer="11" from-port="2" to-layer="16" to-port="0"/>
<edge from-layer="15" from-port="0" to-layer="16" to-port="1"/>
<edge from-layer="16" from-port="2" to-layer="10" to-port="0"/>
<edge from-layer="11" from-port="2" to-layer="10" to-port="0"/>
</edges>
</net>
)V0G0N";
std::string modelV5 = R"V0G0N(
<net name="Network" version="5" precision="FP32" batch="1">
<layers>
<layer name="in2" type="Input" precision="FP32" id="0">
<data originalLayersNames="in2" />
<output>
<port id="0" precision="FP32">
<dim>1</dim>
<dim>3</dim>
<dim>512</dim>
<dim>512</dim>
</port>
</output>
</layer>
<layer name="in1" type="Input" precision="FP32" id="1">
<data originalLayersNames="in1" />
<output>
<port id="0" precision="FP32">
<dim>1</dim>
<dim>768</dim>
<dim>30</dim>
<dim>30</dim>
</port>
</output>
</layer>
<layer name="Constant_49" type="Const" precision="FP32" id="2">
<output>
<port id="0" precision="FP32">
<dim>1</dim>
<dim>2</dim>
<dim>14400</dim>
</port>
</output>
<blobs>
<custom offset="0" size="115200" precision="FP32" />
</blobs>
</layer>
<layer name="in3" type="Input" precision="FP32" id="3">
<data originalLayersNames="in3" />
<output>
<port id="0" precision="FP32">
<dim>1</dim>
<dim>2</dim>
<dim>14400</dim>
</port>
</output>
</layer>
<layer name="concat" type="Concat" precision="FP32" id="4">
<data axis="1" originalLayersNames="concat" />
<input>
<port id="0">
<dim>1</dim>
<dim>2</dim>
<dim>14400</dim>
</port>
<port id="1">
<dim>1</dim>
<dim>2</dim>
<dim>14400</dim>
</port>
</input>
<output>
<port id="2" precision="FP32">
<dim>1</dim>
<dim>4</dim>
<dim>14400</dim>
</port>
</output>
</layer>
</layers>
<edges>
<edge from-layer="2" from-port="0" to-layer="4" to-port="0" />
<edge from-layer="3" from-port="0" to-layer="4" to-port="1" />
</edges>
<layers>
<layer id="0" name="in1" type="Input" precision="FP32">
<output>
<port id="0">
<dim>1</dim>
<dim>768</dim>
<dim>30</dim>
<dim>30</dim>
</port>
</output>
</layer>
<layer id="1" name="in2" type="Input" precision="FP32">
<output>
<port id="0">
<dim>1</dim>
<dim>3</dim>
<dim>512</dim>
<dim>512</dim>
</port>
</output>
</layer>
<layer name="ExpandDims" id="2" type="PriorBox" precision="FP32">
<data density="" fixed_ratio="" fixed_size="" aspect_ratio="2,0.5" clip="0" flip="0" img_h="0" img_size="0" img_w="0" max_size="" min_size="51.200001,72.407555" offset="0.500000" scale_all_sizes="0" step="17.066666666666666" step_h="0" step_w="0" variance="0.1,0.1,0.2,0.2" originalLayersNames="ExpandDims,prior,shape_of1,shape_of2,ss1,ss2"/>
<input>
<port id="1">
<dim>1</dim>
<dim>768</dim>
<dim>30</dim>
<dim>30</dim>
</port>
<port id="2">
<dim>1</dim>
<dim>3</dim>
<dim>512</dim>
<dim>512</dim>
</port>
</input>
<output>
<port id="3">
<dim>1</dim>
<dim>2</dim>
<dim>14400</dim>
</port>
</output>
</layer>
</layers>
<edges>
<edge from-layer="0" from-port="0" to-layer="2" to-port="1"/>
<edge from-layer="1" from-port="0" to-layer="2" to-port="2"/>
</edges>
</net>
)V0G0N";
compareIRs(model, modelV5, 115200, [](Blob::Ptr& weights) {
compareIRs(model, modelV5, 40, [](Blob::Ptr& weights) {
auto* buffer = weights->buffer().as<int64_t*>();
buffer[0] = 2;
buffer[1] = 4;


@@ -3,6 +3,7 @@
//
#include <string>
#include <generic_ie.hpp>
#include "ngraph_reader_tests.hpp"
TEST_F(NGraphReaderTests, ReadProposalNetwork) {
std::string model_v10 = R"V0G0N(
@@ -305,3 +306,100 @@ TEST_F(NGraphReaderTests, ReadProposalNetwork_2) {
compareIRs(model_v10, model_v6, 32);
}
TEST_F(NGraphReaderTests, ReadExtensionProposalNetwork) {
std::string model_v10 = R"V0G0N(
<net name="Network" version="10">
<layers>
<layer id="0" name="in1" type="Parameter" version="opset1">
<data element_type="f32" shape="1,12,34,62"/>
<output>
<port id="0" precision="FP32">
<dim>1</dim>
<dim>12</dim>
<dim>34</dim>
<dim>62</dim>
</port>
</output>
</layer>
<layer id="1" name="in2" type="Parameter" version="opset1">
<data element_type="f32" shape="1,24,34,62"/>
<output>
<port id="0" precision="FP32">
<dim>1</dim>
<dim>24</dim>
<dim>34</dim>
<dim>62</dim>
</port>
</output>
</layer>
<layer id="2" name="in3" type="Const" version="opset1">
<data offset="0" size="24"/>
<output>
<port id="0" precision="I64">
<dim>3</dim>
</port>
</output>
</layer>
<layer name="proposal" type="Proposal" precision="FP32" id="3" version="extension">
<data feat_stride="16" base_size="16" min_size="16" ratio="2.669000" scale="4.000000,6.000000,9.000000,16.000000,24.000000,32.000000" pre_nms_topn="6000" post_nms_topn="200" nms_thresh="0.600000"/>
<input>
<port id="1">
<dim>1</dim>
<dim>12</dim>
<dim>34</dim>
<dim>62</dim>
</port>
<port id="2">
<dim>1</dim>
<dim>24</dim>
<dim>34</dim>
<dim>62</dim>
</port>
<port id="3">
<dim>3</dim>
</port>
</input>
<output>
<port id="3" precision="FP32">
<dim>1000</dim>
<dim>5</dim>
</port>
<port id="4" precision="FP32">
<dim>1000</dim>
</port>
</output>
</layer>
<layer id="4" name="output" type="Result" version="opset1">
<input>
<port id="0" precision="FP32">
<dim>200</dim>
<dim>5</dim>
</port>
</input>
</layer>
</layers>
<edges>
<edge from-layer="0" from-port="0" to-layer="3" to-port="1"/>
<edge from-layer="1" from-port="0" to-layer="3" to-port="2"/>
<edge from-layer="2" from-port="0" to-layer="3" to-port="3"/>
<edge from-layer="3" from-port="4" to-layer="4" to-port="0"/>
</edges>
</net>
)V0G0N";
Core ie;
Blob::Ptr weights;
weights = make_shared_blob<uint8_t>(TensorDesc(Precision::U8, {24}, Layout::C));
weights->allocate();
CommonTestUtils::fill_data(weights->buffer().as<float *>(), weights->size() / sizeof(float));
auto func = ie.ReadNetwork(model_v10, weights).getFunction();
for (auto op : func->get_ordered_ops()) {
if (op->get_friendly_name() == "proposal" && op->get_type_info() == ngraph::op::GenericIE::type_info) {
return;
}
}
FAIL() << "Custom proposal layer is not a Generic operation!";
}


@@ -1,218 +0,0 @@
// Copyright (C) 2020 Intel Corporation
// SPDX-License-Identifier: Apache-2.0
//
#include <gtest/gtest.h>
#include "common_test_utils/test_common.hpp"
#include <string>
#include <memory>
#include <ngraph/opsets/opset3.hpp>
#include <ngraph/function.hpp>
#include <transformations/init_node_info.hpp>
#include <ngraph/pass/constant_folding.hpp>
#include <ngraph/ops.hpp>
#include "ngraph_test_utils.hpp"
using namespace testing;
TEST(TransformationTests, ConstFoldingPriorBox) {
std::shared_ptr<ngraph::Function> f(nullptr), f_ref(nullptr);
{
auto in = std::make_shared<ngraph::opset3::Parameter>(ngraph::element::i64, ngraph::Shape{2});
ngraph::op::PriorBoxAttrs attrs;
attrs.min_size = {256.0f};
attrs.max_size = {315.0f};
attrs.aspect_ratio = {2.0f};
attrs.flip = true;
attrs.scale_all_sizes = true;
auto layer_shape = ngraph::opset3::Constant::create<int64_t>(ngraph::element::i64, ngraph::Shape{2}, {1, 1});
auto image_shape = ngraph::opset3::Constant::create<int64_t>(ngraph::element::i64, ngraph::Shape{2}, {300, 300});
auto pb = std::make_shared<ngraph::opset3::PriorBox>(layer_shape, image_shape, attrs);
auto res = std::make_shared<ngraph::opset3::Result>(pb);
f = std::make_shared<ngraph::Function>(ngraph::NodeVector{res}, ngraph::ParameterVector{in});
ngraph::pass::InitNodeInfo().run_on_function(f);
ngraph::pass::ConstantFolding().run_on_function(f);
ASSERT_NO_THROW(check_rt_info(f));
}
{
auto layer_shape = std::make_shared<ngraph::opset3::Parameter>(ngraph::element::i64, ngraph::Shape{2});
auto const_prior_box = ngraph::opset3::Constant::create<float>(ngraph::element::f32, ngraph::Shape{2, 16},
{ -0.426667, -0.426667, 0.426667, 0.426667, -0.473286, -0.473286, 0.473286, 0.473286,
-0.603398, -0.301699, 0.603398, 0.301699, -0.301699, -0.603398, 0.301699, 0.603398,
0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1,
});
auto res = std::make_shared<ngraph::opset3::Result>(const_prior_box);
f_ref = std::make_shared<ngraph::Function>(ngraph::NodeVector{res}, ngraph::ParameterVector{layer_shape});
}
auto res = compare_functions(f, f_ref);
ASSERT_TRUE(res.first) << res.second;
auto fused = std::dynamic_pointer_cast<ngraph::opset3::Constant>(f->get_result()->input_value(0).get_node_shared_ptr());
auto ref = std::dynamic_pointer_cast<ngraph::opset3::Constant>(f_ref->get_result()->input_value(0).get_node_shared_ptr());
EXPECT_TRUE(fused != nullptr);
EXPECT_TRUE(ref != nullptr);
EXPECT_TRUE(fused->get_vector<float>() == ref->get_vector<float>());
}
TEST(TransformationTests, ConstFoldingPriorBoxClustered) {
std::shared_ptr<ngraph::Function> f(nullptr), f_ref(nullptr);
{
auto in = std::make_shared<ngraph::opset3::Parameter>(ngraph::element::i64, ngraph::Shape{2});
ngraph::op::PriorBoxClusteredAttrs attrs;
attrs.widths = {4.0f, 2.0f, 3.2f};
attrs.heights = {1.0f, 2.0f, 1.1f};
auto layer_shape = ngraph::opset3::Constant::create<int64_t>(ngraph::element::i64, ngraph::Shape{2}, {2, 2});
auto image_shape = ngraph::opset3::Constant::create<int64_t>(ngraph::element::i64, ngraph::Shape{2}, {300, 300});
auto pb = std::make_shared<ngraph::opset3::PriorBoxClustered>(layer_shape, image_shape, attrs);
auto res = std::make_shared<ngraph::opset3::Result>(pb);
f = std::make_shared<ngraph::Function>(ngraph::NodeVector{res}, ngraph::ParameterVector{in});
ngraph::pass::InitNodeInfo().run_on_function(f);
ngraph::pass::ConstantFolding().run_on_function(f);
ASSERT_NO_THROW(check_rt_info(f));
}
{
auto layer_shape = std::make_shared<ngraph::opset3::Parameter>(ngraph::element::i64, ngraph::Shape{2});
auto const_prior_box = ngraph::opset3::Constant::create<float>(ngraph::element::f32, ngraph::Shape{2, 48},
{ -0.00666667, -0.00166667, 0.00666667, 0.00166667, -0.00333333, -0.00333333, 0.00333333,
0.00333333, -0.00533333, -0.00183333, 0.00533333, 0.00183333, -0.00333333, -0.00166667,
0.01, 0.00166667, 0, -0.00333333, 0.00666667, 0.00333333, -0.002, -0.00183333, 0.00866667,
0.00183333, -0.00666667, 0.00166667, 0.00666667, 0.005, -0.00333333, 0, 0.00333333,
0.00666667, -0.00533333, 0.0015, 0.00533333, 0.00516667, -0.00333333, 0.00166667, 0.01,
0.005, 0, 0, 0.00666667, 0.00666667, -0.002, 0.0015, 0.00866667, 0.00516667, 0.1, 0.1,
0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
});
auto res = std::make_shared<ngraph::opset3::Result>(const_prior_box);
f_ref = std::make_shared<ngraph::Function>(ngraph::NodeVector{res}, ngraph::ParameterVector{layer_shape});
}
auto res = compare_functions(f, f_ref);
ASSERT_TRUE(res.first) << res.second;
auto fused = std::dynamic_pointer_cast<ngraph::opset3::Constant>(f->get_result()->input_value(0).get_node_shared_ptr());
auto ref = std::dynamic_pointer_cast<ngraph::opset3::Constant>(f_ref->get_result()->input_value(0).get_node_shared_ptr());
EXPECT_TRUE(fused != nullptr);
EXPECT_TRUE(ref != nullptr);
EXPECT_TRUE(fused->get_vector<float>() == ref->get_vector<float>());
}
TEST(TransformationTests, ConstFoldingPriorBoxSubgraph) {
std::shared_ptr<ngraph::Function> f(nullptr), f_ref(nullptr);
{
auto in = std::make_shared<ngraph::opset3::Parameter>(ngraph::element::i64, ngraph::Shape{2, 3, 1, 1});
auto in_2 = std::make_shared<ngraph::opset3::Parameter>(ngraph::element::i64, ngraph::Shape{2, 3, 300, 300});
ngraph::op::PriorBoxAttrs attrs;
attrs.min_size = {256.0f};
attrs.max_size = {315.0f};
attrs.aspect_ratio = {2.0f};
attrs.flip = true;
attrs.scale_all_sizes = true;
auto layer_shape = std::make_shared<ngraph::opset3::ShapeOf>(in);
auto image_shape = std::make_shared<ngraph::opset3::ShapeOf>(in_2);
auto begin = ngraph::opset3::Constant::create(ngraph::element::i64, ngraph::Shape{1}, {2});
auto end = ngraph::opset3::Constant::create(ngraph::element::i64, ngraph::Shape{1}, {4});
auto stride = ngraph::opset3::Constant::create(ngraph::element::i64, ngraph::Shape{1}, {1});
auto ss_data = std::make_shared<ngraph::opset3::StridedSlice>(layer_shape, begin, end, stride,
std::vector<int64_t>{0}, std::vector<int64_t>{0});
auto ss_image = std::make_shared<ngraph::opset3::StridedSlice>(image_shape, begin, end, stride,
std::vector<int64_t>{0}, std::vector<int64_t>{0});
auto pb = std::make_shared<ngraph::opset3::PriorBox>(ss_data, ss_image, attrs);
auto res = std::make_shared<ngraph::opset3::Result>(pb);
f = std::make_shared<ngraph::Function>(ngraph::NodeVector{res}, ngraph::ParameterVector{in, in_2});
ngraph::pass::InitNodeInfo().run_on_function(f);
ngraph::pass::ConstantFolding().run_on_function(f);
ASSERT_NO_THROW(check_rt_info(f));
}
{
auto layer_shape = std::make_shared<ngraph::opset3::Parameter>(ngraph::element::i64, ngraph::Shape{2});
auto const_prior_box = ngraph::opset3::Constant::create<float>(ngraph::element::f32, ngraph::Shape{2, 16},
{ -0.426667, -0.426667, 0.426667, 0.426667, -0.473286, -0.473286, 0.473286, 0.473286,
-0.603398, -0.301699, 0.603398, 0.301699, -0.301699, -0.603398, 0.301699, 0.603398,
0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1
});
auto res = std::make_shared<ngraph::opset3::Result>(const_prior_box);
f_ref = std::make_shared<ngraph::Function>(ngraph::NodeVector{res}, ngraph::ParameterVector{layer_shape});
}
auto res = compare_functions(f, f_ref);
ASSERT_TRUE(res.first) << res.second;
auto fused = std::dynamic_pointer_cast<ngraph::opset3::Constant>(f->get_result()->input_value(0).get_node_shared_ptr());
auto ref = std::dynamic_pointer_cast<ngraph::opset3::Constant>(f_ref->get_result()->input_value(0).get_node_shared_ptr());
EXPECT_TRUE(fused != nullptr);
EXPECT_TRUE(ref != nullptr);
EXPECT_TRUE(fused->get_vector<float>() == ref->get_vector<float>());
}
TEST(TransformationTests, ConstFoldingPriorBoxClusteredSubgraph) {
std::shared_ptr<ngraph::Function> f(nullptr), f_ref(nullptr);
{
auto in = std::make_shared<ngraph::opset3::Parameter>(ngraph::element::i64, ngraph::Shape{2, 3, 2, 2});
auto in_2 = std::make_shared<ngraph::opset3::Parameter>(ngraph::element::i64, ngraph::Shape{2, 3, 300, 300});
ngraph::op::PriorBoxClusteredAttrs attrs;
attrs.widths = {4.0f, 2.0f, 3.2f};
attrs.heights = {1.0f, 2.0f, 1.1f};
auto layer_shape = std::make_shared<ngraph::opset3::ShapeOf>(in);
auto image_shape = std::make_shared<ngraph::opset3::ShapeOf>(in_2);
auto begin = ngraph::opset3::Constant::create(ngraph::element::i64, ngraph::Shape{1}, {2});
auto end = ngraph::opset3::Constant::create(ngraph::element::i64, ngraph::Shape{1}, {4});
auto stride = ngraph::opset3::Constant::create(ngraph::element::i64, ngraph::Shape{1}, {1});
auto ss_data = std::make_shared<ngraph::opset3::StridedSlice>(layer_shape, begin, end, stride,
std::vector<int64_t>{0}, std::vector<int64_t>{0});
auto ss_image = std::make_shared<ngraph::opset3::StridedSlice>(image_shape, begin, end, stride,
std::vector<int64_t>{0}, std::vector<int64_t>{0});
auto pb = std::make_shared<ngraph::opset3::PriorBoxClustered>(ss_data, ss_image, attrs);
auto res = std::make_shared<ngraph::opset3::Result>(pb);
f = std::make_shared<ngraph::Function>(ngraph::NodeVector{res}, ngraph::ParameterVector{in, in_2});
ngraph::pass::InitNodeInfo().run_on_function(f);
ngraph::pass::ConstantFolding().run_on_function(f);
ASSERT_NO_THROW(check_rt_info(f));
}
{
auto layer_shape = std::make_shared<ngraph::opset3::Parameter>(ngraph::element::i64, ngraph::Shape{2});
auto const_prior_box = ngraph::opset3::Constant::create<float>(ngraph::element::f32, ngraph::Shape{2, 48},
{ -0.00666667, -0.00166667, 0.00666667, 0.00166667, -0.00333333, -0.00333333, 0.00333333,
0.00333333, -0.00533333, -0.00183333, 0.00533333, 0.00183333, -0.00333333, -0.00166667,
0.01, 0.00166667, 0, -0.00333333, 0.00666667, 0.00333333, -0.002, -0.00183333, 0.00866667,
0.00183333, -0.00666667, 0.00166667, 0.00666667, 0.005, -0.00333333, 0, 0.00333333,
0.00666667, -0.00533333, 0.0015, 0.00533333, 0.00516667, -0.00333333, 0.00166667, 0.01,
0.005, 0, 0, 0.00666667, 0.00666667, -0.002, 0.0015, 0.00866667, 0.00516667, 0.1, 0.1,
0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
});
auto res = std::make_shared<ngraph::opset3::Result>(const_prior_box);
f_ref = std::make_shared<ngraph::Function>(ngraph::NodeVector{res}, ngraph::ParameterVector{layer_shape});
}
auto res = compare_functions(f, f_ref);
ASSERT_TRUE(res.first) << res.second;
auto fused = std::dynamic_pointer_cast<ngraph::opset3::Constant>(f->get_result()->input_value(0).get_node_shared_ptr());
auto ref = std::dynamic_pointer_cast<ngraph::opset3::Constant>(f_ref->get_result()->input_value(0).get_node_shared_ptr());
EXPECT_TRUE(fused != nullptr);
EXPECT_TRUE(ref != nullptr);
EXPECT_TRUE(fused->get_vector<float>() == ref->get_vector<float>());
}


@@ -0,0 +1,73 @@
// Copyright (C) 2020 Intel Corporation
// SPDX-License-Identifier: Apache-2.0
//
#include <gtest/gtest.h>
#include <string>
#include <memory>
#include <queue>
#include <ngraph/function.hpp>
#include <ngraph/opsets/opset1.hpp>
#include <transformations/convert_divide.hpp>
#include <transformations/init_node_info.hpp>
#include <transformations/utils/utils.hpp>
#include "ngraph_test_utils.hpp"
using namespace testing;
TEST(TransformationTests, ConvertDivide) {
std::shared_ptr<ngraph::Function> f(nullptr), f_ref(nullptr);
{
auto data = std::make_shared<ngraph::opset1::Parameter>(ngraph::element::f32, ngraph::Shape{3, 1, 2});
auto divide_constant = ngraph::opset1::Constant::create(ngraph::element::f32, ngraph::Shape{1}, {1.5});
auto divide = std::make_shared<ngraph::opset1::Divide>(data, divide_constant);
f = std::make_shared<ngraph::Function>(ngraph::NodeVector{divide}, ngraph::ParameterVector{data});
ngraph::pass::InitNodeInfo().run_on_function(f);
ngraph::pass::ConvertDivide().run_on_function(f);
ASSERT_NO_THROW(check_rt_info(f));
}
{
auto data = std::make_shared<ngraph::opset1::Parameter>(ngraph::element::f32, ngraph::Shape{3, 1, 2});
auto divide_constant = ngraph::opset1::Constant::create(ngraph::element::f32, ngraph::Shape{1}, {1.5});
auto pow = std::make_shared<ngraph::opset1::Power>(divide_constant,
ngraph::opset1::Constant::create(ngraph::element::f32, ngraph::Shape{1}, {-1}));
auto mul = std::make_shared<ngraph::opset1::Multiply>(data, pow);
f_ref = std::make_shared<ngraph::Function>(ngraph::NodeVector{mul}, ngraph::ParameterVector{data});
}
auto res = compare_functions(f, f_ref);
ASSERT_TRUE(res.first) << res.second;
}
TEST(TransformationTests, ConvertDivideNegative) {
std::shared_ptr<ngraph::Function> f(nullptr), f_ref(nullptr);
{
auto data = std::make_shared<ngraph::opset1::Parameter>(ngraph::element::i32, ngraph::Shape{3, 1, 2});
auto divide_constant = ngraph::opset1::Constant::create(ngraph::element::i32, ngraph::Shape{1}, {2});
auto divide = std::make_shared<ngraph::opset1::Divide>(data, divide_constant);
f = std::make_shared<ngraph::Function>(ngraph::NodeVector{divide}, ngraph::ParameterVector{data});
ngraph::pass::InitNodeInfo().run_on_function(f);
ngraph::pass::ConvertDivide().run_on_function(f);
ASSERT_NO_THROW(check_rt_info(f));
}
{
auto data = std::make_shared<ngraph::opset1::Parameter>(ngraph::element::i32, ngraph::Shape{3, 1, 2});
auto divide_constant = ngraph::opset1::Constant::create(ngraph::element::i32, ngraph::Shape{1}, {2});
auto divide = std::make_shared<ngraph::opset1::Divide>(data, divide_constant);
f_ref = std::make_shared<ngraph::Function>(ngraph::NodeVector{divide}, ngraph::ParameterVector{data});
}
auto res = compare_functions(f, f_ref);
ASSERT_TRUE(res.first) << res.second;
}


@@ -177,6 +177,56 @@ TEST(TransformationTests, ConvertStridedSliceToCropNegative) {
f_ref = std::make_shared<ngraph::Function>(ngraph::NodeVector{sslice}, ngraph::ParameterVector{input});
}
auto res = compare_functions(f, f_ref);
ASSERT_TRUE(res.first) << res.second;
}
// in this test the Crop would get a 3D input, which is not supported, so the transformation is not applied
TEST(TransformationTests, ConvertStridedSliceToCropNegative2) {
std::shared_ptr<ngraph::Function> f(nullptr), f_ref(nullptr);
{
auto input = std::make_shared<ngraph::opset1::Parameter>(ngraph::element::f32, ngraph::Shape{128, 1});
auto slice_begin = ngraph::opset1::Constant::create(ngraph::element::i64, ngraph::Shape{3}, {0, 0, 0});
auto slice_end = ngraph::opset1::Constant::create(ngraph::element::i64, ngraph::Shape{3}, {0, 0, 0});
auto slice_stride = ngraph::opset1::Constant::create(ngraph::element::i64, ngraph::Shape{3}, {1, 1, 1});
std::vector<int64_t> begin_mask = {0, 1, 1};
std::vector<int64_t> end_mask = {0, 1, 1};
std::vector<int64_t> new_axis_mask = {1, 0, 0};
std::vector<int64_t> shrink_axis_mask = {0, 0, 0};
std::vector<int64_t> ellipsis_mask = {0, 0, 0};
auto sslice = std::make_shared<ngraph::opset1::StridedSlice>(input, slice_begin, slice_end, slice_stride,
begin_mask, end_mask,
new_axis_mask, shrink_axis_mask, ellipsis_mask);
sslice->set_friendly_name("strided_slice");
f = std::make_shared<ngraph::Function>(ngraph::NodeVector{sslice}, ngraph::ParameterVector{input});
ngraph::pass::InitNodeInfo().run_on_function(f);
ngraph::pass::ConvertStridedSliceToCrop().run_on_function(f);
ASSERT_NO_THROW(check_rt_info(f));
}
{
auto input = std::make_shared<ngraph::opset1::Parameter>(ngraph::element::f32, ngraph::Shape{128, 1});
auto slice_begin = ngraph::opset1::Constant::create(ngraph::element::i64, ngraph::Shape{3}, {0, 0, 0});
auto slice_end = ngraph::opset1::Constant::create(ngraph::element::i64, ngraph::Shape{3}, {0, 0, 0});
auto slice_stride = ngraph::opset1::Constant::create(ngraph::element::i64, ngraph::Shape{3}, {1, 1, 1});
std::vector<int64_t> begin_mask = {0, 1, 1};
std::vector<int64_t> end_mask = {0, 1, 1};
std::vector<int64_t> new_axis_mask = {1, 0, 0};
std::vector<int64_t> shrink_axis_mask = {0, 0, 0};
std::vector<int64_t> ellipsis_mask = {0, 0, 0};
auto sslice = std::make_shared<ngraph::opset1::StridedSlice>(input, slice_begin, slice_end, slice_stride,
begin_mask, end_mask,
new_axis_mask, shrink_axis_mask, ellipsis_mask);
sslice->set_friendly_name("strided_slice");
f_ref = std::make_shared<ngraph::Function>(ngraph::NodeVector{sslice}, ngraph::ParameterVector{input});
}
auto res = compare_functions(f, f_ref);
ASSERT_TRUE(res.first) << res.second;
}


@@ -157,5 +157,6 @@ TEST(TransformationTests, ConvertTopK3I64Output1) {
ASSERT_TRUE(res.first) << res.second;
auto result_node_of_converted_f = f->get_output_op(0);
auto topk_node = result_node_of_converted_f->input(0).get_source_output().get_node_shared_ptr();
auto convert_node = result_node_of_converted_f->input(0).get_source_output().get_node_shared_ptr();
ASSERT_TRUE(convert_node->get_friendly_name() == "topk.1") << "Transformation ConvertTopK3 should keep output names.\n";
}


@@ -11,14 +11,15 @@ std::vector<std::string> disabledTestPatterns() {
return {
// TODO: Issue 26264
R"(.*(MaxPool|AvgPool).*S\(1\.2\).*Rounding=CEIL.*)",
// TODO: Issue 31839
R"(.*(QuantConvBackpropData3D).*)",
// TODO: Issue 31841
R"(.*(QuantGroupConvBackpropData3D).*)",
// TODO: Issue 31843
R"(.*(QuantGroupConvBackpropData2D)*QG=Perchannel.*)",
// TODO: Issue 32023
R"(.*(QuantGroupConvBackpropData2D)*QG=Pertensor.*)",
R"(.*(QuantConvBackpropData3D).*)",
R"(.*(QuantConvBackpropData2D).*(QG=Perchannel).*)",
R"(.*(QuantGroupConvBackpropData2D).*(QG=Perchannel).*)",
// TODO: Issue 33886
R"(.*(QuantGroupConv2D).*)",
R"(.*(QuantGroupConv3D).*)",
// TODO: Issue 31845
R"(.*(FakeQuantize).*)",
R"(.*(EltwiseLayerTest).*IS=\(.*\..*\..*\..*\..*\).*secondaryInputType=PARAMETER.*opType=SCALAR.*)",


@@ -19,7 +19,6 @@ const std::vector<InferenceEngine::Precision> netPrecisions = {
const std::vector<size_t> numOutChannels = {16, 32};
const std::vector<size_t > levels = {256};
// FIXME: Perchannel tests fail because of bug in LPT
const std::vector<QuantizationGranularity > granularity = {Pertensor, Perchannel};
/* ============= 2D GroupConvolutionBackpropData ============= */


@@ -0,0 +1,86 @@
// Copyright (C) 2020 Intel Corporation
// SPDX-License-Identifier: Apache-2.0
//
#include <vector>
#include "subgraph_tests/quantized_group_convolution.hpp"
#include "common_test_utils/test_constants.hpp"
using namespace LayerTestsDefinitions;
using namespace ngraph::helpers;
namespace {
const std::vector<InferenceEngine::Precision> netPrecisions = {
InferenceEngine::Precision::FP32
};
const std::vector<size_t> numOutChannels = {3, 24, 48};
const std::vector<size_t> numGroups = {3};
const std::vector<size_t > levels = {256};
const std::vector<QuantizationGranularity> granularity = {Pertensor, Perchannel};
const std::vector<bool> quantizeWeights = {false, true};
/* ============= 2D GroupConvolution ============= */
const std::vector<std::vector<size_t >> inputShapes2D = {{1, 3, 10, 10}, {1, 24, 10, 10}};
const std::vector<std::vector<size_t >> kernels2D = {{1, 1}, {3, 3}};
const std::vector<std::vector<size_t >> strides2D = {{1, 1}};
const std::vector<std::vector<ptrdiff_t>> padBegins2D = {{0, 0}};
const std::vector<std::vector<ptrdiff_t>> padEnds2D = {{0, 0}};
const std::vector<std::vector<size_t >> dilations2D = {{1, 1}};
const auto quantGroupConv2DParams = ::testing::Combine(
        ::testing::ValuesIn(kernels2D),
        ::testing::ValuesIn(strides2D),
        ::testing::ValuesIn(padBegins2D),
        ::testing::ValuesIn(padEnds2D),
        ::testing::ValuesIn(dilations2D),
        ::testing::ValuesIn(numOutChannels),
        ::testing::ValuesIn(numGroups),
        ::testing::ValuesIn(levels),
        ::testing::ValuesIn(granularity),
        ::testing::ValuesIn(quantizeWeights)
);
INSTANTIATE_TEST_CASE_P(QuantGroupConv2D, QuantGroupConvLayerTest,
                        ::testing::Combine(
                                quantGroupConv2DParams,
                                ::testing::ValuesIn(netPrecisions),
                                ::testing::ValuesIn(inputShapes2D),
                                ::testing::Values(CommonTestUtils::DEVICE_CPU)),
                        QuantGroupConvLayerTest::getTestCaseName);
/* ============= 3D GroupConvolution ============= */
const std::vector<std::vector<size_t >> inputShapes3D = {{1, 3, 5, 5, 5}, {1, 24, 5, 5, 5}};
const std::vector<std::vector<size_t >> kernels3D = {{3, 3, 3}};
const std::vector<std::vector<size_t >> strides3D = {{1, 1, 1}};
const std::vector<std::vector<ptrdiff_t>> padBegins3D = {{0, 0, 0}};
const std::vector<std::vector<ptrdiff_t>> padEnds3D = {{0, 0, 0}};
const std::vector<std::vector<size_t >> dilations3D = {{1, 1, 1}};
const auto quantGroupConv3DParams = ::testing::Combine(
        ::testing::ValuesIn(kernels3D),
        ::testing::ValuesIn(strides3D),
        ::testing::ValuesIn(padBegins3D),
        ::testing::ValuesIn(padEnds3D),
        ::testing::ValuesIn(dilations3D),
        ::testing::ValuesIn(numOutChannels),
        ::testing::ValuesIn(numGroups),
        ::testing::ValuesIn(levels),
        ::testing::ValuesIn(granularity),
        ::testing::ValuesIn(quantizeWeights)
);
INSTANTIATE_TEST_CASE_P(QuantGroupConv3D, QuantGroupConvLayerTest,
                        ::testing::Combine(
                                quantGroupConv3DParams,
                                ::testing::ValuesIn(netPrecisions),
                                ::testing::ValuesIn(inputShapes3D),
                                ::testing::Values(CommonTestUtils::DEVICE_CPU)),
                        QuantGroupConvLayerTest::getTestCaseName);
} // namespace


@@ -21,7 +21,7 @@ const std::vector<std::map<std::string, std::string>> configs = {
}
};
INSTANTIATE_TEST_CASE_P(ConcatQuantization, ConcatQuantization,
INSTANTIATE_TEST_CASE_P(smoke_ConcatQuantization, ConcatQuantization,
::testing::Combine(
::testing::ValuesIn(netPrecisions),
::testing::Values(CommonTestUtils::DEVICE_GNA),


@@ -0,0 +1,39 @@
// Copyright (C) 2020 Intel Corporation
// SPDX-License-Identifier: Apache-2.0
#include <vector>
#include "subgraph_tests/multioutput_eltwise_squeeze_eltwise.hpp"
#include "common_test_utils/test_constants.hpp"
using namespace LayerTestsDefinitions;
namespace {
std::vector<std::vector<std::vector<size_t>>> inputs{
{{1, 16}},
{{2, 16}},
{{1, 160}},
{{8, 40}},
{{3, 8}},
{{4, 32}},
{{5, 64}},
{{6, 128}},
{{7, 256}},
{{8, 512}},
{{8, 1024}}
};
std::map<std::string, std::string> additional_config = {
{"GNA_COMPACT_MODE", "NO"},
};
std::vector<InferenceEngine::Precision> netPrecisions = {InferenceEngine::Precision::FP32,
InferenceEngine::Precision::FP16,
};
INSTANTIATE_TEST_CASE_P(multioutput_eltwise_identity, MultioutputEltwiseReshapeEltwise,
                        ::testing::Combine(
                                ::testing::ValuesIn(inputs),
                                ::testing::ValuesIn(netPrecisions),
                                ::testing::Values(CommonTestUtils::DEVICE_GNA),
                                ::testing::Values(additional_config)),
                        MultioutputEltwiseReshapeEltwise::getTestCaseName);
} // namespace


@@ -9,6 +9,7 @@ using namespace LayerTestsDefinitions;
namespace {
std::vector<std::vector<std::vector<size_t>>> inputs{
{{1, 4, 160}, {0, 2, 1}},
{{1, 160, 4}, {0, 2, 1}},
{{8, 16}, {1, 0}},
{{1, 1, 4, 16}, {3, 1, 2, 0}},
{{1, 8, 200}, {0, 2, 1}},


@@ -0,0 +1,53 @@
// Copyright (C) 2020 Intel Corporation
// SPDX-License-Identifier: Apache-2.0
//
#include <vector>
#include "subgraph_tests/scaleshift.hpp"
#include "common_test_utils/test_constants.hpp"
using namespace LayerTestsDefinitions;
namespace {
std::vector<std::vector<std::vector<size_t>>> inShapes = {
{{1, 8}},
{{2, 16}},
{{3, 32}},
{{4, 64}},
{{5, 128}},
{{6, 256}},
{{7, 512}},
{{8, 1024}}
};
std::vector<std::vector<float >> Scales = {
{2.0f},
{3.0f},
{-1.0f},
{-2.0f},
{-3.0f}
};
std::vector<std::vector<float >> Shifts = {
{1.0f},
{2.0f},
{3.0f},
{-1.0f},
{-2.0f},
{-3.0f}
};
std::vector<InferenceEngine::Precision> netPrecisions = {InferenceEngine::Precision::FP32,
InferenceEngine::Precision::FP16,
};
INSTANTIATE_TEST_CASE_P(scale_shift, ScaleShiftLayerTest,
                        ::testing::Combine(
                                ::testing::ValuesIn(inShapes),
                                ::testing::ValuesIn(netPrecisions),
                                ::testing::Values(CommonTestUtils::DEVICE_GNA),
                                ::testing::ValuesIn(Scales),
                                ::testing::ValuesIn(Shifts)),
                        ScaleShiftLayerTest::getTestCaseName);
} // namespace


@@ -60,8 +60,8 @@ INSTANTIATE_TEST_CASE_P(PriorBoxClustered_Basic, PriorBoxClusteredLayerTest,
::testing::Combine(
layerSpeficParams,
::testing::ValuesIn(netPrecisions),
::testing::Values(std::vector<size_t>({ 4, 4 })),
::testing::Values(std::vector<size_t>({ 50, 50 })),
::testing::Values(std::vector<size_t>({ 1, 16, 4, 4 })),
::testing::Values(std::vector<size_t>({ 1, 3, 50, 50 })),
::testing::Values(CommonTestUtils::DEVICE_GPU)),
PriorBoxClusteredLayerTest::getTestCaseName
);

Some files were not shown because too many files have changed in this diff.