Compare commits

55 Commits

| Author | SHA1 | Date |
|---|---|---|
|  | 023e7c2c3f |  |
|  | 34ddb70f7d |  |
|  | 21e092122f |  |
|  | 92c1333653 |  |
|  | c26ec8b312 |  |
|  | 32054ff180 |  |
|  | 7cff005ada |  |
|  | 06707cc53f |  |
|  | fff93d8f05 |  |
|  | 637ddd5dfb |  |
|  | fa4c5e8e38 |  |
|  | c9fc6f0531 |  |
|  | c9eb6ae62b |  |
|  | eef56ca80c |  |
|  | 36f1c00e02 |  |
|  | 5c43765011 |  |
|  | bbfc9bbc14 |  |
|  | 9c607528ef |  |
|  | ae9e0510f0 |  |
|  | 76af547c17 |  |
|  | 5e97a3123f |  |
|  | 532dec140b |  |
|  | c41c6294f9 |  |
|  | 3bbe88e659 |  |
|  | 2f3d5f68cd |  |
|  | 843f81a1cc |  |
|  | c596707a09 |  |
|  | cf60baf2f0 |  |
|  | aeb70036d7 |  |
|  | dea04dae8c |  |
|  | 14b44803ba |  |
|  | 06286f2aae |  |
|  | 97e5fc4bae |  |
|  | 47218284b2 |  |
|  | 6079a35b81 |  |
|  | 4f4352f301 |  |
|  | a67d74c41f |  |
|  | 26c563132d |  |
|  | dc1ca195dd |  |
|  | f5ad3e6f89 |  |
|  | 6c736ce001 |  |
|  | 30ab6534e1 |  |
|  | 259a4c25ce |  |
|  | 347930008c |  |
|  | 4fa251483a |  |
|  | 30f8af70fc |  |
|  | 3fc6d8a188 |  |
|  | 66c8df6a87 |  |
|  | e53eb86334 |  |
|  | 2df99d4263 |  |
|  | deab4d38b0 |  |
|  | 412428f1dd |  |
|  | 167c96a8af |  |
|  | b7363ba711 |  |
|  | 5cef9f3734 |  |
@@ -1,5 +1,5 @@
 # [OpenVINO™ Toolkit](https://01.org/openvinotoolkit) - Deep Learning Deployment Toolkit repository
-[](https://github.com/openvinotoolkit/openvino/releases/tag/2020.3.0)
+[](https://github.com/openvinotoolkit/openvino/releases/tag/2020.4.0)
 [](LICENSE)

 This toolkit allows developers to deploy pre-trained deep learning models
@@ -52,14 +52,15 @@ as a part of [Intel® Distribution of OpenVINO™].
 ## Build on Linux\* Systems

 The software was validated on:
+- Ubuntu\* 18.04 (64-bit) with default GCC\* 7.5.0
 - Ubuntu\* 16.04 (64-bit) with default GCC\* 5.4.0
 - CentOS\* 7.4 (64-bit) with default GCC\* 4.8.5

 ### Software Requirements
 - [CMake]\* 3.11 or higher
 - GCC\* 4.8 or higher to build the Inference Engine
-- Python 2.7 or higher for Inference Engine Python API wrapper
+- Python 3.5 or higher for Inference Engine Python API wrapper
-- (Optional) [Install Intel® Graphics Compute Runtime for OpenCL™ Driver package 20.13.16352].
+- (Optional) [Install Intel® Graphics Compute Runtime for OpenCL™ Driver package 19.41.14441].

 ### Build Steps
 1. Clone submodules:
@@ -77,7 +78,7 @@ The software was validated on:
 ```
 3. By default, the build enables the Inference Engine GPU plugin to infer models
 on your Intel® Processor Graphics. This requires you to
-[Install Intel® Graphics Compute Runtime for OpenCL™ Driver package 20.13.16352]
+[Install Intel® Graphics Compute Runtime for OpenCL™ Driver package 19.41.14441]
 before running the build. If you don't want to use the GPU plugin, use the
 `-DENABLE_CLDNN=OFF` CMake build option and skip the installation of the
 Intel® Graphics Compute Runtime for OpenCL™ Driver.
@@ -202,7 +203,7 @@ Native compilation of the Inference Engine is the most straightforward solution.

 This compilation was tested on the following configuration:

-* Host: Ubuntu\* 16.04 (64-bit, Intel® Core™ i7-6700K CPU @ 4.00GHz × 8)
+* Host: Ubuntu\* 18.04 (64-bit, Intel® Core™ i7-6700K CPU @ 4.00GHz × 8)
 * Target: Raspbian\* Stretch (32-bit, ARMv7, Raspberry Pi\* 3)

 1. Install Docker\*:
@@ -337,7 +338,7 @@ The software was validated on:
 - [CMake]\*3.11 or higher
 - Microsoft\* Visual Studio 2017, 2019 or [Intel® C++ Compiler] 18.0
 - (Optional) Intel® Graphics Driver for Windows* (26.20) [driver package].
-- Python 3.4 or higher for Inference Engine Python API wrapper
+- Python 3.5 or higher for Inference Engine Python API wrapper

 ### Build Steps

@@ -454,7 +455,7 @@ The software was validated on:

 - [CMake]\* 3.11 or higher
 - Clang\* compiler from Xcode\* 10.1 or higher
-- Python\* 3.4 or higher for the Inference Engine Python API wrapper
+- Python\* 3.5 or higher for the Inference Engine Python API wrapper

 ### Build Steps

@@ -574,8 +575,7 @@ This section describes how to build Inference Engine for Android x86 (64-bit) op

 ## Use Custom OpenCV Builds for Inference Engine

-> **NOTE**: The recommended and tested version of OpenCV is 4.3. The minimum
-supported version is 3.4.0.
+> **NOTE**: The recommended and tested version of OpenCV is 4.4.0.

 Required versions of OpenCV packages are downloaded automatically during the
 building Inference Engine library. If the build script can not find and download
@@ -691,7 +691,7 @@ This target collects all dependencies, prepares the nGraph package and copies it

 [Intel® Distribution of OpenVINO™]:https://software.intel.com/en-us/openvino-toolkit
 [CMake]:https://cmake.org/download/
-[Install Intel® Graphics Compute Runtime for OpenCL™ Driver package 20.13.16352]:https://github.com/intel/compute-runtime/releases/tag/20.13.16352
+[Install Intel® Graphics Compute Runtime for OpenCL™ Driver package 19.41.14441]:https://github.com/intel/compute-runtime/releases/tag/19.41.14441
 [MKL-DNN repository]:https://github.com/intel/mkl-dnn/releases/download/v0.19/mklml_lnx_2019.0.5.20190502.tgz
 [MKL-DNN repository for Windows]:(https://github.com/intel/mkl-dnn/releases/download/v0.19/mklml_win_2019.0.5.20190502.zip)
 [OpenBLAS]:https://sourceforge.net/projects/openblas/files/v0.2.14/OpenBLAS-v0.2.14-Win64-int64.zip/download
@@ -27,8 +27,14 @@ endif()

 if (ENABLE_THREAD_SANITIZER)
     set(SANITIZER_COMPILER_FLAGS "-g -fsanitize=thread -fno-omit-frame-pointer")
-    set(SANITIZER_LINKER_FLAGS "-fsanitize=thread -static-libsan")
+    set(SANITIZER_LINKER_FLAGS "-fsanitize=thread")
+    if(CMAKE_CXX_COMPILER_ID MATCHES "^(Apple)?Clang$" AND NOT WIN32)
+        if(CMAKE_CXX_COMPILER_VERSION VERSION_GREATER_EQUAL 8.0)
+            set(SANITIZER_LINKER_FLAGS "${SANITIZER_LINKER_FLAGS} -fuse-ld=lld")
+        else()
+            set(SANITIZER_LINKER_FLAGS "${SANITIZER_LINKER_FLAGS} -static-libsan")
+        endif()
+    endif()
     set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} ${SANITIZER_COMPILER_FLAGS}")
     set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} ${SANITIZER_COMPILER_FLAGS}")
     set(CMAKE_SHARED_LINKER_FLAGS "${CMAKE_SHARED_LINKER_FLAGS} ${SANITIZER_LINKER_FLAGS}")
@@ -79,7 +79,7 @@ function(ie_build_samples)
         MINGW64 CMAKE_BUILD_TYPE CMAKE_MACOSX_RPATH)
         unset(${var})
     endforeach()
+    include(sanitizer)
     add_subdirectory(samples)
 endfunction()

@@ -19,7 +19,7 @@ set(VPU_SUPPORTED_FIRMWARES usb-ma2450 usb-ma2x8x pcie-ma248x)
 # Default packages
 #

-set(FIRMWARE_PACKAGE_VERSION 1216)
+set(FIRMWARE_PACKAGE_VERSION 1223)
 set(VPU_CLC_MA2X8X_VERSION "movi-cltools-20.02.0")

 #
@@ -1,2 +1,2 @@
-numpy
+numpy==1.13.3
-cython>=0.29
+cython==0.29.17
@@ -1,2 +1,2 @@
-opencv-python==3.4.4
+opencv-python==3.4.4.19
-numpy==1.18.1
+numpy==1.13.3
@@ -814,8 +814,8 @@ cdef class ExecutableNetwork:
         current_request = self.requests[0]
         current_request.infer(inputs)
         res = {}
-        for out in current_request._outputs_list:
-            res[out] = deepcopy(current_request.output_blobs[out].buffer)
+        for name, value in current_request.output_blobs.items():
+            res[name] = deepcopy(value.buffer)
         return res


@@ -229,12 +229,14 @@ void InferenceEnginePython::IENetwork::serialize(const std::string &path_to_xml,

 const std::vector <InferenceEngine::CNNLayerPtr>
 InferenceEnginePython::IENetwork::getLayers() {
+    IE_SUPPRESS_DEPRECATED_START
     std::vector<InferenceEngine::CNNLayerPtr> result;
     std::vector<InferenceEngine::CNNLayerPtr> sorted_layers = InferenceEngine::details::CNNNetSortTopologically(*actual);
     for (const auto &layer : sorted_layers) {
         result.emplace_back(layer);
     }
     return result;
+    IE_SUPPRESS_DEPRECATED_END
 }

 PyObject* InferenceEnginePython::IENetwork::getFunction() {
@@ -1,4 +1,4 @@
-cython==0.29.17
+opencv-python==3.4.4.19
 pytest==4.0.1
 attrs==19.1.0
 pytest-html==1.19.0
@@ -22,12 +22,12 @@
 namespace InferenceEngine {

 /**
- * @deprecated Use InferenceEngine::Core instead. Will be removed in 2020.3
+ * @deprecated Use InferenceEngine::Core instead. Will be removed in 2021.1
  * @brief This class is a C++ API wrapper for IInferencePlugin.
  *
  * It can throw exceptions safely for the application, where it is properly handled.
  */
-class INFERENCE_ENGINE_DEPRECATED("Use InferenceEngine::Core instead. Will be removed in 2020.3") InferencePlugin {
+class INFERENCE_ENGINE_DEPRECATED("Use InferenceEngine::Core instead. Will be removed in 2021.1") InferencePlugin {
     IE_SUPPRESS_DEPRECATED_START
     InferenceEnginePluginPtr actual;

@@ -21,10 +21,10 @@ namespace InferenceEngine {
 namespace details {

 /**
- * @deprecated Migrate to IR v10 and work with ngraph::Function directly. The method will be removed in 2020.3
+ * @deprecated Migrate to IR v10 and work with ngraph::Function directly. The method will be removed in 2021.1
  * @brief This class enables range loops for CNNNetwork objects
  */
-class INFERENCE_ENGINE_INTERNAL("Migrate to IR v10 and work with ngraph::Function directly. The method will be removed in 2020.3")
+class INFERENCE_ENGINE_INTERNAL("Migrate to IR v10 and work with ngraph::Function directly. The method will be removed in 2021.1")
 CNNNetworkIterator {
     IE_SUPPRESS_DEPRECATED_START

@@ -16,6 +16,7 @@
 namespace InferenceEngine {
 namespace details {

+INFERENCE_ENGINE_INTERNAL("Migrate to IR v10 and work with ngraph::Function directly. The method will be removed in 2021.1")
 INFERENCE_ENGINE_API_CPP(std::vector<CNNLayerPtr>) CNNNetSortTopologically(const ICNNNetwork& network);

 }  // namespace details
@@ -126,7 +126,7 @@ public:
     const SizeVector& getDims() const;

     /**
-     * @deprecated Migrate to IR v10 and work with ngraph::Function directly. The method will be removed in 2020.3
+     * @deprecated Migrate to IR v10 and work with ngraph::Function directly. The method will be removed in 2021.1
      * @brief Returns an owner of this data layer, parent layer in di-graph
      * @return A weak pointer to CNNLayer that creates this data
      */
@@ -147,7 +147,7 @@ public:
     void setName(const std::string& newName);

     /**
-     * @deprecated Migrate to IR v10 and work with ngraph::Function directly. The method will be removed in 2020.3
+     * @deprecated Migrate to IR v10 and work with ngraph::Function directly. The method will be removed in 2021.1
      * @brief Privates child layers in di-graph
      * @return A map of child layers
      */
@@ -2049,7 +2049,7 @@ public:
 };

 /**
- * @deprecated Migrate to IR v10 and work with ngraph::Function directly. The method will be removed in 2020.3
+ * @deprecated Migrate to IR v10 and work with ngraph::Function directly. The method will be removed in 2021.1
  * @brief This class represents a standard ScatterUpdate layer
  */
 class INFERENCE_ENGINE_INTERNAL_CNNLAYER_CLASS(ScatterUpdateLayer): public CNNLayer {
@@ -2063,7 +2063,7 @@ public:
 };

 /**
- * @deprecated Migrate to IR v10 and work with ngraph::Function directly. The method will be removed in 2020.3
+ * @deprecated Migrate to IR v10 and work with ngraph::Function directly. The method will be removed in 2021.1
  * @brief This class represents a standard ScatterElementsUpdate layer
  */
 class INFERENCE_ENGINE_INTERNAL_CNNLAYER_CLASS(ScatterElementsUpdateLayer): public CNNLayer {
@@ -2077,7 +2077,7 @@ public:
 };

 /**
- * @deprecated Migrate to IR v10 and work with ngraph::Function directly. The method will be removed in 2020.3
+ * @deprecated Migrate to IR v10 and work with ngraph::Function directly. The method will be removed in 2021.1
  * @brief This class represents an onnx ExperimentalDetectronPriorGridGenerator Layer
  */
 class INFERENCE_ENGINE_INTERNAL_CNNLAYER_CLASS(ExperimentalDetectronPriorGridGeneratorLayer): public CNNLayer {
@@ -123,11 +123,13 @@ DECLARE_VPU_CONFIG_VALUE(NDHWC);
 DECLARE_VPU_CONFIG_KEY(CUSTOM_LAYERS);

 /**
+ * @deprecated IR statistic is not available in IR v10. The option will be removed in 2021.1
  * @brief Ignore statistic in IR by plugin.
  * Plugin could use statistic present in IR in order to try to improve calculations precision.
  * If you don't want statistic to be used enable this option.
  * This option should be used with values: CONFIG_VALUE(YES) or CONFIG_VALUE(NO) (default)
  */
+INFERENCE_ENGINE_DEPRECATED("IR statistic is not available in IR v10. The option will be removed in 2021.1")
 DECLARE_VPU_CONFIG_KEY(IGNORE_IR_STATISTIC);

 /**
@@ -382,6 +382,9 @@ int main(int argc, char* argv[]) {
                 trim(strLine);
                 labels.push_back(strLine);
             }
+            inputFile.close();
+        } else {
+            throw std::logic_error("Cannot read label file");
         }

         ClassificationResult classificationResult(outputBlob, images, batchSize, FLAGS_nt, labels);
@@ -71,8 +71,8 @@ cldnn::device_info clDNNEngine::GetDeviceInfo(const std::map<std::string, std::s
 }

 InferenceEngine::ICNNNetwork::Ptr clDNNEngine::CloneNetwork(const InferenceEngine::ICNNNetwork& network) const {
-    std::shared_ptr<ICNNNetwork> clonedNetwork(nullptr);
-    if (network.getFunction()) {
+    std::shared_ptr<ICNNNetwork> clonedNetwork = cloneNetwork(network);
+    if (clonedNetwork->getFunction()) {
         const auto transformations_callback = [](const std::shared_ptr<const ::ngraph::Node> &node) -> bool {
             // DepthToSpace node implementation supports only equal input/output tensors with rank <= 5
             // Reshape->Permute->Reshape pattern in theory can change output rank, so this check is added to be sure
@@ -84,8 +84,7 @@ InferenceEngine::ICNNNetwork::Ptr clDNNEngine::CloneNetwork(const InferenceEngin
             return std::dynamic_pointer_cast<const ::ngraph::opset2::Gelu>(node) ||
                 std::dynamic_pointer_cast<const ::ngraph::opset3::ShuffleChannels>(node);
         };
-        CNNNetwork net(network.getFunction());
-        auto nGraphFunc = net.getFunction();
+        auto nGraphFunc = clonedNetwork->getFunction();
         // Disable shape inference (WA for generic operations)
         ::ngraph::op::GenericIE::DisableReshape noReshape(nGraphFunc);

@@ -94,9 +93,7 @@ InferenceEngine::ICNNNetwork::Ptr clDNNEngine::CloneNetwork(const InferenceEngin
         ngraph::pass::ConvertOpSet3ToOpSet2(transformations_callback).run_on_function(nGraphFunc);
         ngraph::pass::ConvertOpSet2ToOpSet1(transformations_callback).run_on_function(nGraphFunc);
         ngraph::pass::ConvertOpSet1ToLegacy(transformations_callback).run_on_function(nGraphFunc);
-        clonedNetwork = InferenceEngine::details::convertFunctionToICNNNetwork(nGraphFunc, network);
+        clonedNetwork = InferenceEngine::details::convertFunctionToICNNNetwork(nGraphFunc, *clonedNetwork);
-    } else {
-        clonedNetwork = cloneNet(network);
     }

     auto implNetwork = std::dynamic_pointer_cast<InferenceEngine::details::CNNNetworkImpl>(clonedNetwork);
@@ -3518,10 +3518,29 @@ void Program::AddConstantBlobInput(cldnn::topology& topology, InferenceEngine::C
         return false;
     };

+    // WA to inconsistency between input and const 1d tensors
+    // For Concat along batch we go with batch interpretation
+    // For Gather input we go with batch interpretation
+    bool needsBatchInterpretation = false;
+    if (constDims.size() == 1) {
+        for (auto next : GetNextLayers(layer->outData[0])) {
+            if (LayerTypeFromStr(next->type) == Concatenate) {
+                auto nextConcat = as<InferenceEngine::ConcatLayer*>(next);
+                if (nextConcat->_axis == cldnn::concatenation::concatenation_axis::along_b) {
+                    needsBatchInterpretation = true;
+                    break;
+                }
+            } else if (LayerTypeFromStr(next->type) == Gather) {
+                needsBatchInterpretation = true;
+                break;
+            }
+        }
+    }
+
     // If quantize on weights has per-channel ranges, we have to swap channel and batch dimensions, because
     // quantization should be applied per output channel of weights
     // TODO: Check if it's still needed once LowPrecisionTransformations ready
-    if (inputToConstQuantize(layer)) {
+    if (inputToConstQuantize(layer) || needsBatchInterpretation) {
         constTensor.batch[0] = constTensor.count();
         constTensor.feature[0] = 1;
     }
@@ -3862,11 +3881,13 @@ void Program::CreateStridedSlicePrimitive(cldnn::topology& topology, InferenceEn
     tmp = stridedSliceLayer->GetParamAsUInts("shrink_axis_mask");
     std::vector<uint8_t> shrink_axis_mask(tmp.begin(), tmp.end());

+    auto out_size = CldnnTensorFromIEDims(stridedSliceLayer->outData[0]->getTensorDesc().getDims());
+
     std::string stridedSliceLayerName = layer_type_name_ID(layer);
     auto stridedSlicePrim = cldnn::strided_slice(
         stridedSliceLayerName,
         inputPrimitives[0], inputPrimitives[1], inputPrimitives[2], inputPrimitives[3],
-        begin_mask, end_mask, new_axis_mask, shrink_axis_mask);
+        begin_mask, end_mask, new_axis_mask, shrink_axis_mask, out_size);

     topology.add(stridedSlicePrim);
     AddPrimitiveToProfiler(stridedSliceLayerName, layer);
@@ -359,7 +359,7 @@ void GNAPluginNS::backend::AMIntelDNN::InitDeinterleaveComponentPrivate(intel_dn
     comp.operation = kDnnDeinterleaveOp;
     comp.macro_operation = kDnnMacroOpNone;
     comp.orientation_in = kDnnInterleavedOrientation;
-    comp.orientation_out = kDnnNonInterleavedOrientation;
+    comp.orientation_out = kDnnInterleavedOrientation;
     comp.output_scale_factor = output_scale_factor;
     comp.input_scale_factor = output_scale_factor;
     if (!postInitMem) {
@@ -1524,6 +1524,7 @@ void GNAPluginNS::backend::AMIntelDNN::InitGNAStruct(intel_nnet_type_t *ptr_nnet
                 THROW_GNA_EXCEPTION << "Encountered activation component before pooling component at." << i;
             } else {
                 const auto poolMode = reinterpret_cast<Gna2PoolingMode*>(gnaUserAllocator(sizeof(Gna2PoolingMode)));
+                IE_ASSERT(poolMode != nullptr);
                 *poolMode = (comp.op.maxpool.do_sum_not_max) ? Gna2PoolingModeSum : Gna2PoolingModeMax;
                 const auto poolWindow = create_shape1D_parameter(comp.op.maxpool.num_inputs);
                 const auto poolStride = create_shape1D_parameter(comp.op.maxpool.num_inputs_step);
@@ -1583,6 +1584,7 @@ void GNAPluginNS::backend::AMIntelDNN::InitGNAStruct(intel_nnet_type_t *ptr_nnet
             case kDnnPiecewiselinearOp:
 #if GNA_LIB_VER == 2
             {
+                IE_ASSERT(gnaOperation->Operands != nullptr);
                 auto& outputTensor = const_cast<Gna2Tensor&>(*gnaOperation->Operands[OutOpIdx]);
                 outputTensor.Data = comp.ptr_outputs;
                 outputTensor.Type = Gna2DataTypeFromBytes(comp.num_bytes_per_output);
@@ -80,7 +80,7 @@ static const char *intel_dnn_softmax_name[kSoftmaxNumType] = {
 };

 typedef enum {
-    kDnnUnknownOrientation,
+    kDnnUnknownOrientation = 100,
     kDnnInterleavedOrientation,
     kDnnNonInterleavedOrientation,
     kDnnNumOrientation
@@ -199,9 +199,17 @@ class ScaleFactorPerLayer<InferenceEngine::CNNLayer *> {

     if (cnnLayer->type == "Const") {
         auto blob = cnnLayer->blobs["custom"];
-        if (blob->getTensorDesc().getPrecision() == InferenceEngine::Precision::FP16) {
+        auto blob_precision = blob->getTensorDesc().getPrecision();
+
+        if (blob_precision != InferenceEngine::Precision::FP32 && blob_precision != InferenceEngine::Precision::FP16) {
+            quant->_dst_quant.scale = 1.0f;
+            return true;
+        }
+
+        if (blob_precision == InferenceEngine::Precision::FP16) {
             blob = make_fp32_blob(blob);
         }

         auto max_val = std::numeric_limits<float>::min();
         auto min_val = std::numeric_limits<float>::max();

@@ -9,6 +9,7 @@
|
|||||||
#if GNA_LIB_VER == 2
|
#if GNA_LIB_VER == 2
|
||||||
#include "gna2_model_debug_log.hpp"
|
#include "gna2_model_debug_log.hpp"
|
||||||
#include "gna2-model-api.h"
|
#include "gna2-model-api.h"
|
||||||
|
#include <details/ie_exception.hpp>
|
||||||
|
|
||||||
#include <cstdint>
|
#include <cstdint>
|
||||||
#include <fstream>
|
#include <fstream>
|
||||||
@@ -52,6 +53,7 @@ template <class T>
|
|||||||
bool NextElement(T & elementIndex, const Gna2Shape& total) {
|
bool NextElement(T & elementIndex, const Gna2Shape& total) {
|
||||||
if (total.NumberOfDimensions == 0) return false;
|
if (total.NumberOfDimensions == 0) return false;
|
||||||
auto idx = total.NumberOfDimensions - 1;
|
auto idx = total.NumberOfDimensions - 1;
|
||||||
|
IE_ASSERT(idx < GNA2_SHAPE_MAXIMUM_NUMBER_OF_DIMENSIONS);
|
||||||
while (elementIndex[idx] + 1 >= total.Dimensions[idx] && idx > 0) {
|
while (elementIndex[idx] + 1 >= total.Dimensions[idx] && idx > 0) {
|
||||||
idx--;
|
idx--;
|
||||||
}
|
}
|
||||||
|
|||||||
@@ -60,6 +60,7 @@ Gna2Tensor HelperGna2TensorInit3D(uint32_t x, uint32_t y, uint32_t z, Gna2DataTy
 
 Gna2Tensor * createGna2Tensor1D(uint32_t x, uint32_t byteSize, void* data) {
     const auto input = reinterpret_cast<Gna2Tensor*>(gnaUserAllocator(sizeof(Gna2Tensor)));
+    IE_ASSERT(input != nullptr);
     *input = HelperGna2TensorInit1D(x, Gna2DataTypeFromBytes(byteSize), data);
     return input;
 }

@@ -74,6 +75,7 @@ Gna2Tensor * createGna2TensorPwl(uint32_t x, void* data) {
 
 Gna2Tensor * createGna2BiasTensor1D(uint32_t x, uint32_t byteSize, void* data) {
     const auto input = reinterpret_cast<Gna2Tensor*>(gnaUserAllocator(sizeof(Gna2Tensor)));
+    IE_ASSERT(input != nullptr);
     if (byteSize == 8) {
         *input = HelperGna2TensorInit1D(x, Gna2DataTypeCompoundBias, data);
     } else {

@@ -84,24 +86,28 @@ Gna2Tensor * createGna2BiasTensor1D(uint32_t x, uint32_t byteSize, void* data) {
 
 Gna2Tensor * createGna2Tensor2D(uint32_t x, uint32_t y, uint32_t byteSize, void* data) {
     const auto input = reinterpret_cast<Gna2Tensor*>(gnaUserAllocator(sizeof(Gna2Tensor)));
+    IE_ASSERT(input != nullptr);
     *input = HelperGna2TensorInit2D(x, y, Gna2DataTypeFromBytes(byteSize), data);
     return input;
 }
 
 Gna2Tensor * createGna2Tensor3D(uint32_t x, uint32_t y, uint32_t z, uint32_t byteSize, void* data) {
     const auto input = reinterpret_cast<Gna2Tensor*>(gnaUserAllocator(sizeof(Gna2Tensor)));
+    IE_ASSERT(input != nullptr);
     *input = HelperGna2TensorInit3D(x, y, z, Gna2DataTypeFromBytes(byteSize), data);
     return input;
 }
 
 uint32_t* create_uint32_parameter(uint32_t value) {
     const auto param = reinterpret_cast<uint32_t*>(gnaUserAllocator(sizeof(uint32_t)));
+    IE_ASSERT(param != nullptr);
     *param = value;
     return param;
 }
 
 Gna2Shape* create_shape1D_parameter(uint32_t x) {
     const auto shp = reinterpret_cast<Gna2Shape*>(gnaUserAllocator(sizeof(Gna2Shape)));
+    IE_ASSERT(shp != nullptr);
     shp->NumberOfDimensions = 1;
     shp->Dimensions[0] = x;
     return shp;
@@ -25,7 +25,7 @@
 #include "gna_plugin_log.hpp"
 
 uint8_t* GNADeviceHelper::alloc(uint32_t size_requested, uint32_t *size_granted) {
-    void * memPtr;
+    void * memPtr = nullptr;
 #if GNA_LIB_VER == 1
     memPtr = GNAAlloc(nGNAHandle, size_requested, size_granted);
 #else
@@ -337,6 +337,7 @@ void GNAGraphCompiler::ConvolutionPrimitive(InferenceEngine::CNNLayerPtr layer)
 void GNAGraphCompiler::PowerPrimitive(InferenceEngine::CNNLayerPtr layer) {
     auto& power = dynamic_cast<PowerLayer&>(*layer.get());
     auto quantized = InferenceEngine::getInjectedData<QuantizedLayerParams>(layer);
+    IE_ASSERT(gnaFlags->sw_fp32 ? (quantized == nullptr) : (quantized != nullptr));
 
     if (power.power != 1.0) {
         THROW_IE_EXCEPTION << "[GNA plugin] unsupported power factor, expected 1 but was " << power.power;
@@ -386,29 +387,14 @@ void GNAGraphCompiler::PowerPrimitive(InferenceEngine::CNNLayerPtr layer) {
 
     if (gnaFlags->sw_fp32) {
         gnamem->readonly().push_value(ptr_weights, power.scale, num_rows_out, 64);
-        gnamem->readonly().push_value(ptr_biases, power.scale, num_rows_out, 64);
+        gnamem->readonly().push_value(ptr_biases, power.offset, num_rows_out, 64);
     } else {
-        auto weightsScaledIdentity = power.scale;
-        auto biasesScaledIdentity = power.scale;
-        if (quantized != nullptr) {
-            weightsScaledIdentity = quantized->_weights_quant.scale * weightsScaledIdentity;
-            biasesScaledIdentity = quantized->_bias_quant.scale * biasesScaledIdentity;
-        }
-
-        auto weightQuantizedIdentity = FLOAT_TO_INT16(std::min(weightsScaledIdentity, static_cast<float>(INT16_MAX)));
-        auto biasesQuantizedIdentity = FLOAT_TO_INT16(std::min(biasesScaledIdentity, static_cast<float>(INT16_MAX)));
-        gnamem->readonly().push_value<int16_t>(ptr_weights, weightQuantizedIdentity, num_rows_out, 64);
-        gnamem->readonly().push_value<int32_t>(ptr_biases, biasesQuantizedIdentity, num_rows_out, 64);
-    }
-
-    if (power.offset != 0.0f) {
-        if (quantized == nullptr) {
-            gnamem->readonly().push_value(ptr_biases, 0.0f, num_rows_out, 64);
-        } else {
-            gnamem->readonly().push_value<int32_t>(ptr_biases, 0, num_rows_out, 64);
-        }
-    } else {
-        gnamem->readonly().push_value(ptr_biases, 0.0f, num_rows_out, 64);
+        auto quantizedScale = FLOAT_TO_INT16(std::min(quantized->_weights_quant.scale * power.scale,
+            static_cast<float>(INT16_MAX)));
+        auto quantizedOffset = FLOAT_TO_INT32(std::min(quantized->_dst_quant.scale * power.offset,
+            static_cast<float>(INT32_MAX)));
+        gnamem->readonly().push_value<int16_t>(ptr_weights, quantizedScale, num_rows_out, 64);
+        gnamem->readonly().push_value<int32_t>(ptr_biases, quantizedOffset, num_rows_out, 64);
     }
 }
 
@@ -1417,6 +1403,7 @@ void GNAGraphCompiler::PermutePrimitive(InferenceEngine::CNNLayerPtr layer) {
     }
     auto layerOrder = layer->GetParamAsInts("order");
     auto quantized = InferenceEngine::getInjectedData<QuantizedLayerParams>(layer);
+    IE_ASSERT(!layer->insData.empty());
     auto inputs = layer->insData.begin()->lock();
     auto inputsOrder = inputs->getTensorDesc().getDims();
     auto outputs = layer->outData.front();
@@ -176,6 +176,63 @@ inline std::pair<InferenceEngine::CNNLayerPtr, int> CNNNetCheckNextLayerSkipCer
     return CNNNetCheckNextLayerSkipCertain(outLayer->second, 0, 0, bOnlyCheck, shouldSkip);
 }
 
+/**
+ * @brief return all layers reachable from given one
+ * @param layer
+ * @param oDataIdx - -1 means iterate over all odata indexes
+ * @param shouldSkip
+ * @return
+ */
+template <class Layer>
+inline std::vector<CNNLayerPtr> CNNNetGetAllNextLayersSkipCertain(Layer layer, int oDataIdx, const std::function<bool(CNNLayerPtr)> &shouldSkip) {
+    // TODO: need to have generic function that creates slice of the graph : starting from given layer
+    // and skipped all non functional - ending up into functional one
+
+    std::list<CNNLayerPtr> currentSet;
+    std::vector<CNNLayerPtr> resultSet;
+
+    std::vector<std::map<std::string, CNNLayerPtr>> start;
+    if (oDataIdx == -1) {
+        for (int i = 0; i != layer->outData.size(); i++) {
+            start.push_back(layer->outData[i]->getInputTo());
+        }
+    } else {
+        start.push_back(layer->outData[oDataIdx]->getInputTo());
+    }
+
+    auto separate_layers = [&currentSet, &resultSet, &shouldSkip](std::map<std::string, CNNLayerPtr>& inputTo) {
+        for (auto &&bfsLayer : inputTo) {
+            if (shouldSkip(bfsLayer.second)) {
+                currentSet.push_back(bfsLayer.second);
+                continue;
+            }
+            resultSet.push_back(bfsLayer.second);
+        }
+    };
+
+    int startIdx, endIdx;
+    if (oDataIdx == -1) {
+        startIdx = 0;
+        endIdx = layer->outData.size();
+    } else {
+        startIdx = oDataIdx;
+        endIdx = oDataIdx + 1;
+    }
+
+    for (int i = startIdx; i != endIdx; i++) {
+        separate_layers(layer->outData[i]->getInputTo());
+    }
+
+    while (!currentSet.empty()) {
+        auto currentLayer = currentSet.front();
+        currentSet.pop_front();
+        for (auto && oData : currentLayer->outData) {
+            separate_layers(oData->getInputTo());
+        }
+    }
+    return resultSet;
+}
+
 /// @brief alias for strict checkNextLayer (false)
 template <class Layer>
 inline std::pair<InferenceEngine::CNNLayerPtr, int> CNNNetGetNextLayerSkipCertain(Layer layer, int oidx, int iidx,
@@ -474,7 +531,31 @@ inline void CNNNetworkInsertLayer(CNNLayerPtr after,
 }
 
 /**
- * @brief remove givven layer from topology, currently only layers with one input data and one output data supported
+ * @brief returns previous layers and outData index for it
+ * @tparam T
+ * @param origin
+ * @param acceptanceCriteria
+ * @param idx
+ */
+template <class T>
+std::vector<std::pair<CNNLayerPtr, int> > CNNNetGetPrevLayersSkip(CNNLayerPtr origin, const T &acceptanceCriteria, int idx = -1) {
+    std::vector<std::pair<CNNLayerPtr, int> > prevLayers;
+    for (int i = idx == -1 ? 0 : idx; CNNNetHasPrevLayer(origin.get(), i) && (idx == -1 || i == idx); i++) {
+        auto prevLayer = CNNNetPrevLayer(origin, i);
+        if (acceptanceCriteria(prevLayer)) {
+            prevLayers.push_back({prevLayer, CNNLayerFindOutDataIdx(origin, i)});
+        } else {
+            // if for some input we need to look in upper layers - original index not used here intentionally
+            auto prevPrevLayers = CNNNetGetPrevLayersSkip(prevLayer, acceptanceCriteria);
+            prevLayers.insert(prevLayers.end(), prevPrevLayers.begin(), prevPrevLayers.end());
+        }
+    }
+
+    return prevLayers;
+}
+
+/**
+ * @brief remove given layer from topology, currently only layers with one input data and one output data supported
  */
 inline void CNNNetworkRemoveLayer(CNNLayerPtr layer) {
     if (!layer) {
@@ -8,6 +8,9 @@
 #include <ios>
 #include <iomanip>
 #include <map>
+#include <ie_algorithm.hpp>
+#include <ie_common.h>
+#include <ie_precision.hpp>
 
 #if defined __INTEL_COMPILER || defined _MSC_VER
 #include <malloc.h>

@@ -119,15 +122,26 @@ const std::map<Gna2OperationType, std::vector<uint32_t>> GnaParamSize{
         sizeof(Gna2Shape),
         sizeof(Gna2Shape)}},
     {Gna2OperationTypeCopy, {sizeof(Gna2Shape)}},
+    {Gna2OperationTypeTransposition, {sizeof(Gna2Shape)}},
 };
 
-void GNAModelSerial::Import(void *basePointer, size_t gnaGraphSize, std::istream & is) {
+void GNAModelSerial::Import(void *basePointer,
+        size_t gnaGraphSize,
+        std::istream & is,
+        std::shared_ptr<GNAPluginNS::InputDesc> inputsDesc,
+        std::vector<GNAPluginNS::OutputDesc> &desc,
+        InferenceEngine::InputsDataMap& inputsDataMap,
+        InferenceEngine::OutputsDataMap& outputsDataMap) {
     is.exceptions(std::istream::failbit);
+
+    ImportInputs(is, basePointer, inputsDesc, inputsDataMap);
+    ImportOutputs(is, basePointer, desc, outputsDataMap);
 
     for (auto operation = gna2Model->Operations; operation != gna2Model->Operations + gna2Model->NumberOfOperations; ++operation) {
         readNBits<32>(operation->Type, is);
         readBits(operation->NumberOfOperands, is);
         operation->Operands = static_cast<Gna2Tensor const **>(gnaUserAllocator(sizeof(Gna2Tensor*) * operation->NumberOfOperands));
+        IE_ASSERT(operation->Operands != nullptr);
         for (uint32_t i = 0; i < operation->NumberOfOperands; i++) {
             Gna2Tensor t{};
             readBits(t, is);
@@ -145,11 +159,10 @@ void GNAModelSerial::Import(void *basePointer, size_t gnaGraphSize, std::istream
         case Gna2OperationTypeFullyConnectedAffine:
         case Gna2OperationTypeConvolution:
         case Gna2OperationTypeCopy:
+        case Gna2OperationTypeTransposition:
             break;
         case Gna2OperationTypeRecurrent:
             THROW_GNA_EXCEPTION << "Importing of recurrent operation not supported";
-        case Gna2OperationTypeTransposition:
-            THROW_GNA_EXCEPTION << "Importing of transposition operation not supported";
         default:
             THROW_GNA_EXCEPTION << "Importing of unknown GNA operation type(" << operation->Type << ") not supported";
         }

@@ -158,8 +171,9 @@ void GNAModelSerial::Import(void *basePointer, size_t gnaGraphSize, std::istream
         else
             operation->Parameters = nullptr;
         for (uint32_t i = 0; i < operation->NumberOfParameters; i++) {
-            uint32_t paramSize;
+            uint32_t paramSize = 0;
             readBits(paramSize, is);
+            IE_ASSERT(operation->Parameters != nullptr);
             if (paramSize == 0) {
                 operation->Parameters[i] = nullptr;
                 continue;
@@ -235,11 +249,12 @@ void GNAModelSerial::Export(void * basePointer, size_t gnaGraphSize, std::ostrea
     };
 
     auto convert_to_serial = [getOffsetFromBase](const GNAModelSerial::RuntimeEndPoint& ep) {
-        ModelHeader::EndPoint out;
+        RuntimeEndPoint out;
         out.elements_count = ep.elements_count;
         out.descriptor_offset = offsetFromBase(ep.descriptor_ptr);
         out.scaleFactor = ep.scaleFactor;
         out.element_size = ep.element_size;
+        out.orientation = ep.orientation;
         return out;
     };
     /**

@@ -256,15 +271,21 @@ void GNAModelSerial::Export(void * basePointer, size_t gnaGraphSize, std::ostrea
     header.gnaMemSize = gnaGraphSize;
     header.layersCount = layers.size();
     header.nGroup = guessGrouping(*gna2Model);
-    header.input = convert_to_serial(input);
-    header.output = convert_to_serial(output);
+    header.nInputs = inputs.size();
+    header.nOutputs = outputs.size();
 
     header.nRotateRows = nRotateRows;
     header.nRotateColumns = nRotateColumns;
 
 
     writeBits(header, os);
+
+    for (const auto &input : inputs) {
+        writeBits(convert_to_serial(input), os);
+    }
+    for (const auto &output : outputs) {
+        writeBits(convert_to_serial(output), os);
+    }
 
     for (const auto & layer : layers) {
         writeBits(static_cast<uint32_t>(layer.Type), os);
         writeBits(layer.NumberOfOperands, os);

@@ -284,11 +305,10 @@ void GNAModelSerial::Export(void * basePointer, size_t gnaGraphSize, std::ostrea
         case Gna2OperationTypeFullyConnectedAffine:
         case Gna2OperationTypeConvolution:
         case Gna2OperationTypeCopy:
+        case Gna2OperationTypeTransposition:
             break;
         case Gna2OperationTypeRecurrent:
             THROW_GNA_EXCEPTION << "Exporting of recurrent operation not supported";
-        case Gna2OperationTypeTransposition:
-            THROW_GNA_EXCEPTION << "Exporting of interleave operation not supported";
         default:
             THROW_GNA_EXCEPTION << "Exporting of unknown GNA operation type(" << layer.Type << ") not supported";
         }
@@ -314,9 +334,18 @@ void GNAModelSerial::Export(void * basePointer, size_t gnaGraphSize, std::ostrea
 }
 #else
 
-void GNAModelSerial::Import(void *basePointer, size_t gnaGraphSize, std::istream & is) {
+void GNAModelSerial::Import(void *basePointer,
+        size_t gnaGraphSize,
+        std::istream & is,
+        std::shared_ptr<GNAPluginNS::InputDesc> inputsDesc,
+        std::vector<GNAPluginNS::OutputDesc> &desc,
+        InferenceEngine::InputsDataMap& inputsDataMap,
+        InferenceEngine::OutputsDataMap& outputsDataMap) {
     is.exceptions(std::istream::failbit);
+
+    ImportInputs(is, basePointer, inputsDesc, inputsDataMap);
+    ImportOutputs(is, basePointer, desc, outputsDataMap);
 
     auto readPwl = [&is, basePointer](intel_pwl_func_t & value) {
         readBits(value.nSegments, is);
         if (value.nSegments != 0) {

@@ -466,11 +495,12 @@ void GNAModelSerial::Export(void * basePointer, size_t gnaGraphSize, std::ostrea
     };
 
     auto convert_to_serial = [getOffsetFromBase](const GNAModelSerial::RuntimeEndPoint& ep){
-        ModelHeader::EndPoint out;
+        RuntimeEndPoint out;
        out.elements_count = ep.elements_count;
        out.element_size = ep.element_size;
        out.descriptor_offset = offsetFromBase(ep.descriptor_ptr);
        out.scaleFactor = ep.scaleFactor;
+       out.orientation = ep.orientation;
        return out;
     };
     /**

@@ -486,14 +516,16 @@ void GNAModelSerial::Export(void * basePointer, size_t gnaGraphSize, std::ostrea
     header.gnaMemSize = gnaGraphSize;
     header.layersCount = layers.size();
     header.nGroup = ptr_nnet->nGroup;
-    header.input = convert_to_serial(input);
-    header.output = convert_to_serial(output);
+    header.nInputs = 1;
+    header.nOutputs = 1;
     header.headerSize = sizeof(ModelHeader);
     header.nRotateRows = nRotateRows;
     header.nRotateColumns = nRotateColumns;
 
 
     writeBits(header, os);
+    writeBits(convert_to_serial(inputs[0]), os);
+    writeBits(convert_to_serial(outputs[0]), os);
 
     for (auto & layer : layers) {
         writeBits(layer.nInputColumns, os);
@@ -572,3 +604,108 @@ void GNAModelSerial::Export(void * basePointer, size_t gnaGraphSize, std::ostrea
 }
 
 #endif
+
+std::vector<GNAModelSerial::RuntimeEndPoint> GNAModelSerial::serializeOutputs(const InferenceEngine::OutputsDataMap& outputsDataMap,
+        const std::vector<GNAPluginNS::OutputDesc>& outputsDesc) {
+    std::vector<GNAModelSerial::RuntimeEndPoint> endPoints;
+    std::size_t outputIndex = 0;
+    for (auto const &output : outputsDataMap) {
+        auto outputName = output.first;
+        auto inputDims = output.second->getTensorDesc().getDims();
+        uint32_t elementsCount = static_cast<uint32_t>(InferenceEngine::details::product(inputDims.begin(), inputDims.end()));
+
+        GNAModelSerial::RuntimeEndPoint endPoint(outputsDesc[outputIndex].scale_factor,
+                                                 outputsDesc[outputIndex].ptrs[0],
+                                                 outputsDesc[outputIndex].num_bytes_per_element,
+                                                 elementsCount,
+                                                 outputsDesc[outputIndex].orientation);
+        endPoints.push_back(endPoint);
+        outputIndex++;
+    }
+    return endPoints;
+}
+
+std::vector<GNAModelSerial::RuntimeEndPoint> GNAModelSerial::serializeInputs(const InferenceEngine::InputsDataMap& inputsDataMap,
+        std::shared_ptr<GNAPluginNS::InputDesc> inputDesc) {
+    std::vector<GNAModelSerial::RuntimeEndPoint> endPoints;
+
+    std::size_t inputIndex = 0;
+    for (auto const& input : inputsDataMap) {
+        auto inputName = input.first;
+        auto inputDims = input.second->getTensorDesc().getDims();
+
+        double scaleFactor = inputDesc->getScaleFactor(inputIndex);
+        std::vector<void *> descriptor_ptr = inputDesc->getPtrInputsGlobal(inputName);
+        IE_ASSERT(descriptor_ptr.size() > 0);
+        uint32_t element_size = 2u;
+        uint32_t elementsCount = static_cast<uint32_t>(InferenceEngine::details::product(inputDims.begin(), inputDims.end()));
+        intel_dnn_orientation_t orientation = inputDesc->getOrientation(inputName);
+
+        GNAModelSerial::RuntimeEndPoint endPoint(scaleFactor,
+                                                 descriptor_ptr[0],
+                                                 element_size,
+                                                 elementsCount,
+                                                 orientation);
+        endPoints.push_back(endPoint);
+        inputIndex++;
+    }
+    return endPoints;
+}
+
+void GNAModelSerial::ImportInputs(std::istream &is,
+        void* basePtr,
+        std::shared_ptr<GNAPluginNS::InputDesc> inputsDesc,
+        InferenceEngine::InputsDataMap& dataMap) {
+    dataMap.clear();
+
+    for (auto inputIndex = 0; inputIndex < modelHeader.nInputs; inputIndex++) {
+        std::string name = "input" + std::to_string(inputIndex);
+        RuntimeEndPoint input;
+        is.read(reinterpret_cast<char *>(&input), sizeof(input));
+        inputsDesc->getPtrInputsGlobal(name).push_back(reinterpret_cast<float*>(reinterpret_cast<uint8_t *> (basePtr) + input.descriptor_offset));
+        inputsDesc->orientation_in[name] = input.orientation;
+
+        auto inputDims = InferenceEngine::SizeVector({modelHeader.nGroup, input.elements_count / modelHeader.nGroup});
+
+        dataMap[name] = std::make_shared<InferenceEngine::InputInfo>();
+        dataMap[name]->setInputData(std::make_shared<InferenceEngine::Data>(name,
+                InferenceEngine::TensorDesc(
+                        InferenceEngine::Precision::FP32,
+                        inputDims,
+                        InferenceEngine::Layout::NC)));
+        inputsDesc->inputScaleFactors.push_back(input.scaleFactor);
+    }
+}
+
+void GNAModelSerial::ImportOutputs(std::istream &is,
+        void* basePtr,
+        std::vector<GNAPluginNS::OutputDesc> &desc,
+        InferenceEngine::OutputsDataMap& dataMap) {
+    desc.clear();
+    dataMap.clear();
+    desc.resize(modelHeader.nOutputs);
+
+    for (auto outputIndex = 0; outputIndex < modelHeader.nOutputs; outputIndex++) {
+        std::string name = "output" + std::to_string(outputIndex);
+        RuntimeEndPoint output;
+        is.read(reinterpret_cast<char *>(&output), sizeof(output));
+        GNAPluginNS::OutputDesc description;
+        description.ptrs.push_back(reinterpret_cast<float*>(reinterpret_cast<uint8_t *> (basePtr) + output.descriptor_offset));
+        description.orientation = kDnnInterleavedOrientation;
+        description.orientation = output.orientation;
+        description.num_bytes_per_element = output.element_size;
+        description.scale_factor = output.scaleFactor;
+
+        auto outputDims = InferenceEngine::SizeVector({modelHeader.nGroup, output.elements_count / modelHeader.nGroup});
+        dataMap[name] = std::make_shared<InferenceEngine::Data>(name,
+                InferenceEngine::TensorDesc(
+                        InferenceEngine::Precision::FP32,
+                        outputDims,
+                        InferenceEngine::Layout::NC));
+        desc.at(outputIndex) = description;
+    }
+}
+
+void GNAModelSerial::setHeader(ModelHeader header) {
+    modelHeader = header;
+}
@@ -7,7 +7,10 @@
 #include <istream>
 #include <vector>
 #include <utility>
-#include "gna-api.h"
+#include <gna-api.h>
+#include "descriptions/gna_input_desc.hpp"
+#include "descriptions/gna_output_desc.hpp"
 #include "gna_plugin_log.hpp"
 #if GNA_LIB_VER == 2
 #include "gna2-model-api.h"

@@ -20,18 +23,19 @@
 * 1.0 - basic support
 * 1.1 - added memory information
 * 2.0 - for use with GNA2 library
+* 2.1 - multiple i/o support
 */
 #if GNA_LIB_VER == 2
 #define HEADER_MAJOR 2
-#define HEADER_MINOR 0
+#define HEADER_MINOR 1
 #else
 #define HEADER_MAJOR 1
-#define HEADER_MINOR 1
+#define HEADER_MINOR 2
 #endif
 
 
 /**
- * @brief Header version 1.0
+ * @brief Header version 2.1
 */
 struct ModelHeader {
 /**
@@ -74,27 +78,8 @@ struct ModelHeader {
     uint32_t nRotateRows = 0u;
     uint32_t nRotateColumns = 0u;
 
-    struct EndPoint {
-        /**
-         * if scale factor is different then pased into infer , network might need to be requantized
-         */
-        float scaleFactor = 0.f;
-        /**
-         * Offset in bytes of pointer descriptor
-         */
-        uint64_t descriptor_offset = 0ull;
-        /**
-         * Endpoint resolution in bytes.
-         */
-        uint32_t element_size = 0u;
-        /**
-         * Number of elements
-         */
-        uint32_t elements_count = 0u;
-    };
-    EndPoint input;
-    EndPoint output;
+    uint32_t nInputs = 0u;
+    uint32_t nOutputs = 0u;
 
     /**
      * Reserved Data might be here
@@ -127,15 +112,23 @@ class GNAModelSerial {
          * Number of elements
          */
         uint32_t elements_count = 0;
+        /**
+         * Offset in bytes of pointer descriptor
+         */
+        uint64_t descriptor_offset = 0ull;
+
+        intel_dnn_orientation_t orientation = kDnnUnknownOrientation;
 
         RuntimeEndPoint() = default;
         RuntimeEndPoint(double scaleFactor,
                         void* descriptor_ptr,
                         uint32_t element_size,
-                        uint32_t elements_count) : scaleFactor(scaleFactor),
+                        uint32_t elements_count,
+                        intel_dnn_orientation_t orientation) : scaleFactor(scaleFactor),
                         descriptor_ptr(descriptor_ptr),
                         element_size(element_size),
-                        elements_count(elements_count) {
+                        elements_count(elements_count),
+                        orientation(orientation) {
         }
     };
     using MemoryType = std::vector<std::pair<void*, uint32_t>>;
@@ -146,11 +139,23 @@
 #else
     intel_nnet_type_t *ptr_nnet;
 #endif
-    RuntimeEndPoint input, output;
+    std::vector<RuntimeEndPoint> inputs;
+    std::vector<RuntimeEndPoint> outputs;
     uint32_t nRotateRows = 0;
     uint32_t nRotateColumns = 0;
 
     MemoryType states, *pstates = nullptr;
+    ModelHeader modelHeader;
+
+    void ImportInputs(std::istream &is,
+            void* basePtr,
+            std::shared_ptr<GNAPluginNS::InputDesc> inputsDesc,
+            InferenceEngine::InputsDataMap& dataMap);
+
+    void ImportOutputs(std::istream &is,
+            void* basePtr,
+            std::vector<GNAPluginNS::OutputDesc> &desc,
+            InferenceEngine::OutputsDataMap& dataMap);
 
 public:
 #if GNA_LIB_VER == 2
@@ -160,8 +165,12 @@ private:
|
|||||||
|
|
||||||
GNAModelSerial(
|
GNAModelSerial(
|
||||||
Gna2Model * model,
|
Gna2Model * model,
|
||||||
RuntimeEndPoint input,
|
const std::shared_ptr<GNAPluginNS::InputDesc> inputDesc,
|
||||||
RuntimeEndPoint output) : gna2Model(model), input(input), output(output) {
|
const std::vector<GNAPluginNS::OutputDesc>& outputsDesc,
|
||||||
|
const InferenceEngine::InputsDataMap& inputsDataMap,
|
||||||
|
const InferenceEngine::OutputsDataMap& outputsDataMap) : gna2Model(model),
|
||||||
|
inputs(serializeInputs(inputsDataMap, inputDesc)),
|
||||||
|
outputs(serializeOutputs(outputsDataMap, outputsDesc)) {
|
||||||
}
|
}
|
||||||
|
|
||||||
#else
|
#else
|
||||||
@@ -183,8 +192,12 @@ private:
|
|||||||
*/
|
*/
|
||||||
GNAModelSerial(
|
GNAModelSerial(
|
||||||
intel_nnet_type_t *ptr_nnet,
|
intel_nnet_type_t *ptr_nnet,
|
||||||
RuntimeEndPoint input,
|
const std::shared_ptr<GNAPluginNS::InputDesc> inputDesc,
|
||||||
RuntimeEndPoint output) : ptr_nnet(ptr_nnet), input(input), output(output) {
|
const std::vector<GNAPluginNS::OutputDesc>& outputsDesc,
|
||||||
|
const InferenceEngine::InputsDataMap& inputsDataMap,
|
||||||
|
const InferenceEngine::OutputsDataMap& outputsDataMap) : ptr_nnet(ptr_nnet),
|
||||||
|
inputs(serializeInputs(inputsDataMap, inputDesc)),
|
||||||
|
outputs(serializeOutputs(outputsDataMap, outputsDesc)) {
|
||||||
}
|
}
|
||||||
#endif
|
#endif
|
||||||
|
|
||||||
@@ -219,7 +232,13 @@ private:
|
|||||||
* @param basePointer
|
* @param basePointer
|
||||||
* @param is - stream without header structure - TBD heder might be needed
|
* @param is - stream without header structure - TBD heder might be needed
|
||||||
*/
|
*/
|
||||||
void Import(void *basePointer, size_t gnaGraphSize, std::istream &is);
|
void Import(void *basePointer,
|
||||||
|
size_t gnaGraphSize,
|
||||||
|
std::istream & is,
|
||||||
|
std::shared_ptr<GNAPluginNS::InputDesc> inputsDesc,
|
||||||
|
std::vector<GNAPluginNS::OutputDesc> &desc,
|
||||||
|
InferenceEngine::InputsDataMap& inputsDataMap,
|
||||||
|
InferenceEngine::OutputsDataMap& outputsDataMap);
|
||||||
|
|
||||||
/**
|
/**
|
||||||
* save gna graph to an outpus stream
|
* save gna graph to an outpus stream
|
||||||
@@ -231,4 +250,13 @@ private:
|
|||||||
void Export(void *basePtr,
|
void Export(void *basePtr,
|
||||||
size_t gnaGraphSize,
|
size_t gnaGraphSize,
|
||||||
std::ostream &os) const;
|
std::ostream &os) const;
|
||||||
|
|
||||||
|
static std::vector<GNAModelSerial::RuntimeEndPoint> serializeOutputs(const InferenceEngine::OutputsDataMap& outputsDataMap,
|
||||||
|
const std::vector<GNAPluginNS::OutputDesc>& outputsDesc);
|
||||||
|
|
||||||
|
|
||||||
|
static std::vector<GNAModelSerial::RuntimeEndPoint> serializeInputs(const InferenceEngine::InputsDataMap& inputsDataMap,
|
||||||
|
const std::shared_ptr<GNAPluginNS::InputDesc>);
|
||||||
|
|
||||||
|
void setHeader(ModelHeader header);
|
||||||
};
|
};
|
||||||
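The header changes above replace the single `input`/`output` endpoint pair with vectors filled by `serializeInputs`/`serializeOutputs`, so multi-input and multi-output networks serialize every endpoint. A minimal sketch of that idea, using simplified stand-in types (the real `RuntimeEndPoint` and data maps carry more fields than shown here):

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Illustrative stand-in for the plugin's per-endpoint metadata.
struct RuntimeEndPoint {
    float scaleFactor = 1.0f;
    uint32_t element_size = 0;
    uint32_t elements_count = 0;
};

// Sketch of the serializeInputs idea: emit one RuntimeEndPoint per entry of
// the inputs map (here just name -> element count), instead of a single
// hard-coded "input" endpoint.
std::vector<RuntimeEndPoint> serializeInputs(
        const std::map<std::string, uint32_t>& inputsDataMap) {
    std::vector<RuntimeEndPoint> endpoints;
    for (const auto& input : inputsDataMap) {
        RuntimeEndPoint ep;
        ep.element_size = sizeof(float);
        ep.elements_count = input.second;  // element count of this input
        endpoints.push_back(ep);
    }
    return endpoints;
}
```

The per-endpoint vector is what lets `Import`/`Export` later round-trip the descriptor offset and orientation of each input and output independently.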
@@ -373,6 +373,7 @@ void GNAPlugin::LoadNetwork(ICNNNetwork &network) {
     passes->registerPass<InsertDiagonalLayerPass>();
     passes->registerPass<HandleMultipleActivationsForTheLayerPass>();
     passes->registerPass<SubstituteScaleShiftBroadCastPass>();
+    passes->registerPass<FuseMultipleIdentitiesPass>();
     passIdx = passes->run(passIdx);
 };

@@ -1140,13 +1141,15 @@ InferenceEngine::IExecutableNetwork::Ptr GNAPlugin::ImportNetwork(const std::str
 #else
     auto serial = GNAModelSerial(&std::get<0>(nnets.back())->obj, mt);
 #endif
-    serial.Import(basePtr, header.gnaMemSize, inputStream);

-    inputsDesc->getPtrInputsGlobal("input").push_back(reinterpret_cast<float*>(reinterpret_cast<uint8_t *> (basePtr) + header.input.descriptor_offset));
-    // TODO: import of multioutput network not supported
-    outputsDesc.resize(1);
-    auto &outputDesc = outputsDesc.front();
-    outputDesc.ptrs.push_back(reinterpret_cast<float*>(reinterpret_cast<uint8_t *> (basePtr) + header.output.descriptor_offset));
+    serial.setHeader(header);
+    serial.Import(basePtr,
+                  header.gnaMemSize,
+                  inputStream,
+                  inputsDesc,
+                  outputsDesc,
+                  inputsDataMap,
+                  outputsDataMap);

 #if GNA_LIB_VER == 2
     auto getOrientation = [](Gna2Operation & gnaOperation) {
@@ -1160,32 +1163,10 @@ InferenceEngine::IExecutableNetwork::Ptr GNAPlugin::ImportNetwork(const std::str
     };
 #endif

-#if GNA_LIB_VER == 2
-    inputsDesc->orientation_in["input"] = getOrientation(std::get<0>(gnaModels.back())->obj.Operations[0]);
-    outputDesc.orientation = getOrientation(std::get<0>(gnaModels.back())->obj.Operations[std::get<0>(gnaModels.back())->obj.NumberOfOperations - 1]);
-#else
+#if GNA_LIB_VER == 1
     inputsDesc->orientation_in["input"] = getOrientation(std::get<0>(nnets.back())->obj.pLayers[0]);
-    outputDesc.orientation = getOrientation(std::get<0>(nnets.back())->obj.pLayers[std::get<0>(nnets.back())->obj.nLayers - 1]);
+    outputsDesc[0].orientation = getOrientation(std::get<0>(nnets.back())->obj.pLayers[std::get<0>(nnets.back())->obj.nLayers - 1]);
 #endif
-    outputDesc.num_bytes_per_element = header.output.element_size;
-
-    auto outputDims = SizeVector({header.nGroup, header.output.elements_count / header.nGroup});
-    auto inputDims = SizeVector({header.nGroup, header.input.elements_count / header.nGroup});
-
-    inputsDataMap["input"] = std::make_shared<InputInfo>();
-    inputsDataMap["input"]->setInputData(make_shared<Data>("input",
-        TensorDesc(
-            Precision::FP32,
-            inputDims,
-            Layout::NC)));
-    outputsDataMap["output"] = make_shared<Data>("output",
-        TensorDesc(
-            Precision::FP32,
-            outputDims,
-            Layout::NC));
-
-    outputDesc.scale_factor = header.output.scaleFactor;
-    inputsDesc->inputScaleFactors.push_back(header.input.scaleFactor);

     num_rotate_rows = header.nRotateRows;
     num_rotate_columns = header.nRotateColumns;
@@ -1214,9 +1195,11 @@ void GNAPlugin::Export(const std::string &fileName) {
         THROW_GNA_EXCEPTION << " network not loaded";
     }

+#if GNA_LIB_VER == 1
     if (inputsDesc->ptr_inputs_global_id.size() != 1) {
         THROW_GNA_EXCEPTION << " exporting network with multiple inputs not supported";
     }
+#endif

     std::fstream outStream(fileName, ios_base::out | ios_base::binary);

@@ -1229,19 +1212,16 @@ void GNAPlugin::Export(const std::string &fileName) {
 #endif
     }
 #if GNA_LIB_VER == 2
-    auto serial = GNAModelSerial(&std::get<0>(gnaModels.front())->obj,
+    Gna2Model* modelToSerial = &std::get<0>(gnaModels.front())->obj;
 #else
-    auto serial = GNAModelSerial(&std::get<0>(nnets.front())->obj,
+    intel_nnet_type_t* modelToSerial = &std::get<0>(nnets.front())->obj;
 #endif
-                 {inputsDesc->inputScaleFactors.front(),
-                  inputsDesc->ptr_inputs_global_storage.front()[0],
-                  2,
-                  static_cast<uint32_t>(InferenceEngine::details::product(inputsDataMap.begin()->second->getTensorDesc().getDims()))},
-                 {outputsDesc.front().scale_factor,
-                  outputsDesc.front().ptrs.front(),
-                  outputsDesc.front().num_bytes_per_element,
-                  static_cast<uint32_t>(InferenceEngine::details::product(outputsDataMap.begin()->second->getTensorDesc().getDims()))})
-        .SetInputRotation(dnn->num_rotate_rows, dnn->num_rotate_columns);
+    auto serial = GNAModelSerial(modelToSerial,
+                                 inputsDesc,
+                                 outputsDesc,
+                                 inputsDataMap,
+                                 outputsDataMap)
+                      .SetInputRotation(dnn->num_rotate_rows, dnn->num_rotate_columns);

     for (auto && memoryConnection : graphCompiler.memory_connection) {
         serial.AddState(memoryConnection.second.gna_ptr, memoryConnection.second.reserved_size);
@@ -71,7 +71,7 @@ void Config::UpdateFromMap(const std::map<std::string, std::string>& config) {
             key.erase(0, 1);
             try {
                 input_index = std::stoi(key);
-                if (input_index < 0 | input_index > 99) {
+                if (input_index > 99) {
                     throw std::out_of_range("");
                 }
             } catch (std::invalid_argument&) {
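Two things are worth noting about the validation change above: the original condition combined the comparisons with bitwise `|` rather than logical `||`, and the patch drops the lower-bound check entirely (a `-` in the key would already make `std::stoi` parse a negative number only from an explicit minus sign). A hedged sketch of the same stoi-plus-range pattern, keeping both bounds with logical OR; the function name and 0..99 range are taken from the diff, not a plugin API:

```cpp
#include <stdexcept>
#include <string>

// Sketch of the index validation: std::stoi throws std::invalid_argument for
// non-numeric keys and std::out_of_range for values outside int; the explicit
// range check below rejects indices outside 0..99, using logical || rather
// than the bitwise | that the pre-patch code used.
int parseInputIndex(const std::string& key) {
    int input_index = std::stoi(key);
    if (input_index < 0 || input_index > 99) {
        throw std::out_of_range("input index out of range: " + key);
    }
    return input_index;
}
```

Bitwise `|` on two `bool` comparisons happens to produce the same truth table as `||`, but it evaluates both operands unconditionally and hides the intent, which is why linters flag it.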
@@ -107,6 +107,9 @@ class LayerInfo {
     bool isConcatAlignFilter() const noexcept {
         return isOfType("ConcatAlignFilter");
     }
+    bool isLink() const noexcept {
+        return isOfType("Link");
+    }
     bool isAffineFilter() const noexcept {
         return isOfType("AffineFilter");
     }
@@ -204,6 +207,7 @@ class LayerInfo {
         if (layerOrder == std::vector<int>({ 0, 3, 2, 1 })) {
             return true; // supported case
         }
+        IE_ASSERT(!layer->insData.empty());
         auto inputs = layer->insData.begin()->lock();
         auto inputsOrder = inputs->getTensorDesc().getDims();

@@ -40,7 +40,6 @@ public:

     // length of current cycle
     std::list<cnt_type> permuteCycles;
-    int seqId = 0;
     bool newSeq = false;

     for (int i = 0; i != orderVec.size();) {
@@ -609,31 +609,6 @@ void InsertIdentityLayerPass::run() {
         }
     }

-/**
- * @brief returns previous layers and insData index for it
- * @tparam T
- * @param origin
- * @param acceptanceCriteria
- * @param idx
- */
-// give previous layers while skipping certain layer according to expression
-template <class T>
-std::vector<std::pair<CNNLayerPtr, int> > CNNNetGetPrevLayersSkip(CNNLayerPtr origin, const T &acceptanceCriteria, int idx = -1) {
-    std::vector<std::pair<CNNLayerPtr, int> > prevLayers;
-    for (int i = idx == -1 ? 0 : idx; CNNNetHasPrevLayer(origin.get(), i) && (idx == -1 || i == idx); i++) {
-        auto prevLayer = CNNNetPrevLayer(origin, i);
-        if (acceptanceCriteria(prevLayer)) {
-            prevLayers.push_back({prevLayer, CNNLayerFindOutDataIdx(origin, i)});
-        } else {
-            // if for some input we need to look in upper layers - original index not used here intentionally
-            auto prevPrevLayers = CNNNetGetPrevLayersSkip(prevLayer, acceptanceCriteria);
-            prevLayers.insert(prevLayers.end(), prevPrevLayers.begin(), prevPrevLayers.end());
-        }
-    }
-
-    return prevLayers;
-}
-
 void InsertCopyLayerPass::run() {
     for (auto & l : *pLayers) {
         if (l->insData.empty()) continue;
@@ -1084,6 +1059,78 @@ void RemoveConstPass::run() {
     transformer.fullTrim();
 }

+void FuseMultipleIdentitiesPass::run() {
+    for (auto &l : *pLayers) {
+        if (l->insData.empty()) continue;
+
+        auto isNonFunctional = [](CNNLayerPtr ptr) {
+            return LayerInfo(ptr).isNonFunctional();
+        };
+        auto eltwise = dynamic_cast<InferenceEngine::EltwiseLayer *>(l.get());
+        auto concat = dynamic_cast<InferenceEngine::ConcatLayer *>(l.get());
+
+        if (LayerInfo(l).isNonFunctional() || LayerInfo(l).has32BInput())
+            continue;
+        gnalog() << "CNNNetPrevLayer skip non functional from :: " << l->name;
+        auto prevLayersReached = CNNNetGetPrevLayersSkip(l, [](CNNLayerPtr ptr) {
+            return !LayerInfo(ptr).isNonFunctional();
+        });
+        prevLayersReached.erase(std::remove_if(prevLayersReached.begin(),
+                                               prevLayersReached.end(),
+                                               [] (const std::pair<CNNLayerPtr, int> & candidate) {
+            return LayerInfo(candidate.first).isLink();
+        }), prevLayersReached.end());
+
+        if (prevLayersReached.size() != 1 && eltwise == nullptr && concat == nullptr) {
+            std::stringstream layers;
+            for (auto && prevLayer : prevLayersReached) {
+                layers << prevLayer.first->name;
+                layers << ", ";
+            }
+            THROW_GNA_LAYER_EXCEPTION(l) << "unsupported case: connected to "
+                << (prevLayersReached.empty() ? "zero" : "multiple") << " outputs : " << layers.str();
+        }
+        auto prevLayer = prevLayersReached.front().first;
+        auto outDataIdx = prevLayersReached.front().second;
+        gnalog() << ", reached " << prevLayer->name << " at " << outDataIdx << std::endl;
+
+        if (!LayerInfo(prevLayer).has32BOutput())
+            continue;
+
+        std::vector<CNNLayerPtr> resultSet = CNNNetGetAllNextLayersSkipCertain(prevLayer, outDataIdx, isNonFunctional);
+
+        // now result set should have all needed layers
+        // checking that result set consist of already identity
+        CNNLayerPtr alreadyIdentity;
+        for (auto &&res : resultSet) {
+            if (LayerInfo(res).isIdentity()) {
+                alreadyIdentity = res;
+                break;
+            }
+        }
+        if (!alreadyIdentity) {
+            continue;
+        } else {
+            // just figure out how to connect to that "already identity"
+            // 1st stage - disconnect given layer from previous
+            auto directPrev = l->insData.front().lock()->getCreatorLayer().lock();
+            auto oDataIdx = CNNLayerFindOutDataIdx(directPrev, 0);
+            auto &inputTo = directPrev->outData[oDataIdx]->getInputTo();
+            for (auto inIterator = inputTo.begin(); inIterator != inputTo.end(); inIterator++) {
+                if (inIterator->second == l) {
+                    inputTo.erase(inIterator);
+                    break;
+                }
+            }
+            l->insData.clear();
+
+            //2nd stage - now setting up new connection
+            l->insData.push_back(alreadyIdentity->outData.front());
+            alreadyIdentity->outData.front()->getInputTo()[l->name] = l;
+        }
+    }
+}
+
 int PassManager::run(int index) {
 // #define PLOT
 #ifdef PLOT
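The new pass leans on `CNNNetGetPrevLayersSkip`: when a predecessor fails the acceptance predicate (e.g. it is non-functional), the walk recurses through it instead of stopping. A minimal sketch of that traversal with plain stand-in node types, not the Inference Engine graph API:

```cpp
#include <functional>
#include <string>
#include <vector>

// Simplified graph node: name plus predecessor pointers.
struct Node {
    std::string name;
    std::vector<Node*> prev;
};

// Sketch of the "get previous layers, skipping some" idea: accepted
// predecessors are collected; rejected ones are transparently walked through,
// so their own predecessors are considered instead.
void collectPrevSkip(Node* origin,
                     const std::function<bool(Node*)>& accept,
                     std::vector<Node*>& out) {
    for (Node* p : origin->prev) {
        if (accept(p)) {
            out.push_back(p);
        } else {
            collectPrevSkip(p, accept, out);  // skip this layer, look further up
        }
    }
}
```

With a chain `a -> b -> c` where `b` is rejected by the predicate, collecting from `c` yields just `a`, which mirrors how the pass "skips non functional" layers before fusing identities.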
@@ -149,6 +149,11 @@ DECL_PASS_BEFORE_COPY(UnrollTI);
 */
 DECL_PASS_BEFORE_COPY(RemoveConst);

+/**
+* @brief removed extra identity layer for multi-output
+*/
+DECL_PASS(FuseMultipleIdentities);
+
 struct PassManagerSettings {
     Policy policy;
     /// @brief whether to run passes before copy
@@ -139,7 +139,7 @@ private:

     friend INFERENCE_ENGINE_API_CPP(std::shared_ptr<CNNNetworkImpl>)
     convertFunctionToICNNNetwork(const std::shared_ptr<const ::ngraph::Function>& graph,
-                                 const ICNNNetwork& nGraphImpl);
+                                 const ICNNNetwork& nGraphImpl, bool keep_constant_inputs);

     /**
      * @brief Reshape on the same shape
@@ -63,9 +63,9 @@ ngraph::op::GenericIE::GenericIE(const ngraph::NodeVector& inputs,
     : GenericIE(as_output_vector(inputs), params, type, outputs) {}

 ngraph::op::GenericIE::GenericIE(const ngraph::OutputVector& inputs,
-                                 const std::map<std::string, InferenceEngine::Parameter>& params,
-                                 const std::string type, const std::vector<PortIE>& outputs)
-    : Op(inputs), params(params), outputs(outputs), type(type), initialized(0) {
+                                 const std::map<std::string, InferenceEngine::Parameter>& params_,
+                                 const std::string type_, const std::vector<PortIE>& outputs_)
+    : Op(inputs), params(params_), outputs(outputs_), type(type_), initialized(0) {
     constructor_validate_and_infer_types();
 }

@@ -179,7 +179,9 @@ CNNNetwork details::ReadNetwork(const std::string& modelPath, const std::string&
                 THROW_IE_EXCEPTION << "Weights file " << bPath << " cannot be opened!";

             // read model with weights
-            return reader->read(modelStream, binStream, exts);
+            auto network = reader->read(modelStream, binStream, exts);
+            modelStream.close();
+            return network;
         }
         // read model without weights
         return reader->read(modelStream, exts);
@@ -15,7 +15,8 @@ namespace InferenceEngine {
 namespace details {

 INFERENCE_ENGINE_API_CPP(std::shared_ptr<CNNNetworkImpl>)
-convertFunctionToICNNNetwork(const std::shared_ptr<const ::ngraph::Function>& graph, const ICNNNetwork &network);
+convertFunctionToICNNNetwork(const std::shared_ptr<const ::ngraph::Function>& graph,
+                             const ICNNNetwork &network, bool keep_constant_inputs = false);

 }  // namespace details
 }  // namespace InferenceEngine
@@ -24,6 +24,8 @@
 #include "ngraph_ops/pad_ie.hpp"
 #include "ngraph_ops/onehot_ie.hpp"
 #include "ngraph_ops/power.hpp"
+#include "ngraph_ops/prior_box_clustered_ie.hpp"
+#include "ngraph_ops/prior_box_ie.hpp"
 #include "ngraph_ops/proposal_ie.hpp"
 #include "ngraph_ops/relu_ie.hpp"
 #include "ngraph_ops/scaleshift.hpp"
@@ -472,20 +474,6 @@ InferenceEngine::details::CNNLayerCreator::CNNLayerCreator(const std::shared_ptr
         return res;

     });
-
-    addSpecificCreator({"PriorBox"}, [](const std::shared_ptr<::ngraph::Node>& node,
-                                        const std::map<std::string, std::string> params) -> CNNLayerPtr {
-        THROW_IE_EXCEPTION << "PriorBox operation has a form that is not supported." << node->get_friendly_name()
-                           << " should be replaced by constant during constant folding.";
-        return nullptr;
-    });
-
-    addSpecificCreator({"PriorBoxClustered"}, [](const std::shared_ptr<::ngraph::Node>& node,
-                                                 const std::map<std::string, std::string> params) -> CNNLayerPtr {
-        THROW_IE_EXCEPTION << "PriorBoxClustered operation has a form that is not supported." << node->get_friendly_name()
-                           << " should be replaced by constant during constant folding.";
-        return nullptr;
-    });
 }

 CNNLayerPtr InferenceEngine::details::CNNLayerCreator::create() {
@@ -499,7 +487,9 @@ CNNLayerPtr InferenceEngine::details::CNNLayerCreator::create() {
     return res;
 }

-std::shared_ptr<CNNNetworkImpl> convertFunctionToICNNNetwork(const std::shared_ptr<const ::ngraph::Function>& graph, const ICNNNetwork &network) {
+std::shared_ptr<CNNNetworkImpl> convertFunctionToICNNNetwork(const std::shared_ptr<const ::ngraph::Function> &graph,
+                                                             const ICNNNetwork &network,
+                                                             bool keep_constant_inputs) {
     IE_PROFILING_AUTO_SCOPE(convertFunctionToICNNNetwork)
     const auto createCNNLayer = [](const std::shared_ptr<::ngraph::Node> &node) -> CNNLayerPtr {
         class NGraphCNNLayer: public CNNLayer {
@@ -565,6 +555,10 @@ std::shared_ptr<CNNNetworkImpl> convertFunctionToICNNNetwork(const std::shared_p
         std::make_shared<Builder::NodeConverter<::ngraph::op::PadIE>>(),
         std::make_shared<Builder::NodeConverter<::ngraph::op::v1::Power>>(),
         std::make_shared<Builder::NodeConverter<::ngraph::op::PowerIE>>(),
+        std::make_shared<Builder::NodeConverter<::ngraph::op::PriorBox>>(),
+        std::make_shared<Builder::NodeConverter<::ngraph::op::PriorBoxClustered>>(),
+        std::make_shared<Builder::NodeConverter<::ngraph::op::PriorBoxClusteredIE>>(),
+        std::make_shared<Builder::NodeConverter<::ngraph::op::PriorBoxIE>>(),
         std::make_shared<Builder::NodeConverter<::ngraph::op::Proposal>>(),
         std::make_shared<Builder::NodeConverter<::ngraph::op::ProposalIE>>(),
         std::make_shared<Builder::NodeConverter<::ngraph::op::Relu>>(),
@@ -715,7 +709,7 @@ std::shared_ptr<CNNNetworkImpl> convertFunctionToICNNNetwork(const std::shared_p
     for (const auto &layer : nodes)
         op_names.insert(layer->get_name());

-    bool keep_constants = ::ngraph::op::util::has_op_with_type<::ngraph::op::FakeQuantize>(graph);
+    bool keep_constants = keep_constant_inputs || ::ngraph::op::util::has_op_with_type<::ngraph::op::FakeQuantize>(graph);

     // Create layers and output data
     for (const auto &layer : nodes) {
@@ -766,6 +760,20 @@ std::shared_ptr<CNNNetworkImpl> convertFunctionToICNNNetwork(const std::shared_p
     cnnLayer->insData.resize(inputCount);

     for (size_t i = 0; i < layer->get_output_size(); i++) {
+        // Memory node with index = 1 has no inputs according to the specification.
+        // For proper conversion, we must cut off all the layers and data nodes above ReadValue,
+        // if they are connected only with this layer.
+        // Now MO generates only constants or constant sub-graphs as input to ReadValue op.
+        if (std::dynamic_pointer_cast<::ngraph::op::Constant>(layer)) {
+            bool all_to_read_value = !layer->output(i).get_target_inputs().empty();
+            for (const auto &output_input : layer->output(i).get_target_inputs()) {
+                all_to_read_value
+                        &= dynamic_cast<ngraph::op::ReadValue *>(output_input.get_node()) != nullptr;
+            }
+            if (all_to_read_value)
+                continue;
+        }
+
         if (cnnLayer->type == "Memory" && cnnLayer->params["index"] == "0") {
             cnnLayer->outData.clear();
             continue;
@@ -773,7 +781,6 @@ std::shared_ptr<CNNNetworkImpl> convertFunctionToICNNNetwork(const std::shared_p
         std::string outName = layer->get_friendly_name();
         if (layer->get_output_size() != 1) outName += "." + std::to_string(i);
         DataPtr &ptr = cnnNetworkImpl->getData(outName.c_str());
-
         SizeVector dims;
         dims = layer->get_output_shape(i);
         for (const auto &dim : dims) {
@@ -889,6 +896,7 @@ std::shared_ptr<CNNNetworkImpl> convertFunctionToICNNNetwork(const std::shared_p
     for (const auto &ext : ::ngraph::op::GenericIE::getExtensions(graph)) {
         cnnNetworkImpl->AddExtension(ext, nullptr);
     }
+
     return cnnNetworkImpl;
 }
 }  // namespace details
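The `all_to_read_value` addition above drops a constant's output only when every consumer of that output is a `ReadValue` node (and the consumer set is non-empty). The core check can be sketched with consumer type names standing in for the ngraph node types:

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Sketch of the "all consumers are ReadValue" test: start from "has at least
// one consumer" and AND in a per-consumer type check, exactly like the
// &= accumulation in the converter loop.
bool allConsumersAreReadValue(const std::vector<std::string>& consumerTypes) {
    return !consumerTypes.empty() &&
           std::all_of(consumerTypes.begin(), consumerTypes.end(),
                       [](const std::string& t) { return t == "ReadValue"; });
}
```

Seeding the flag with `!empty()` matters: without it, a dangling constant with no consumers would vacuously pass the check and be cut off.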
@@ -232,7 +232,8 @@ std::vector<CNNLayerPtr> ConstTransformer::foldConstSubgraphsInternal(const std:
     static std::vector<std::string> skipConstInfer = {
         "FakeQuantize",
         "Quantize",
-        "CumSum"  // Const inference function for CumSum is not implemented!
+        "CumSum",  // Const inference function for CumSum is not implemented
+        "Convolution"  // Const inference function for Convolution is not implemented
     };

 const std::map<std::string, bool> ConstTransformer::getConstLayers(const std::vector<CNNLayerPtr>& sortedLayers) {
@@ -34,6 +34,8 @@
 #include "ngraph_ops/onehot_ie.hpp"
 #include "ngraph_ops/pad_ie.hpp"
 #include "ngraph_ops/power.hpp"
+#include "ngraph_ops/prior_box_clustered_ie.hpp"
+#include "ngraph_ops/prior_box_ie.hpp"
 #include "ngraph_ops/proposal_ie.hpp"
 #include "ngraph_ops/relu_ie.hpp"
 #include "ngraph_ops/selu_ie.hpp"
@@ -1473,6 +1475,136 @@ CNNLayer::Ptr NodeConverter<ngraph::op::ProposalIE>::createLayer(const std::shar
     return res;
 }

+template <>
+CNNLayer::Ptr NodeConverter<ngraph::op::PriorBoxClusteredIE>::createLayer(
+        const std::shared_ptr<ngraph::Node>& layer) const {
+    LayerParams params = {layer->get_friendly_name(), "PriorBoxClustered",
+                          details::convertPrecision(layer->get_output_element_type(0))};
+    auto res = std::make_shared<InferenceEngine::CNNLayer>(params);
+    auto castedLayer = ngraph::as_type_ptr<ngraph::op::PriorBoxClusteredIE>(layer);
+    if (castedLayer == nullptr) THROW_IE_EXCEPTION << "Cannot get " << params.type << " layer " << params.name;
+
+    auto attr = castedLayer->get_attrs();
+    std::string param;
+    for (const auto& val : attr.widths) {
+        if (!param.empty()) param += ",";
+        param += asString(val);
+    }
+    res->params["width"] = param;
+
+    param.clear();
+    for (const auto& val : attr.heights) {
+        if (!param.empty()) param += ",";
+        param += asString(val);
+    }
+    res->params["height"] = param;
+
+    param.clear();
+    for (const auto& val : attr.variances) {
+        if (!param.empty()) param += ",";
+        param += asString(val);
+    }
+    res->params["variance"] = param;
+
+    if (std::abs(attr.step_heights - attr.step_widths) < 1e-5) {
+        res->params["step"] = asString(attr.step_widths);
+    } else {
+        res->params["step_w"] = asString(attr.step_widths);
+        res->params["step_h"] = asString(attr.step_heights);
+    }
+    res->params["offset"] = asString(attr.offset);
+    res->params["clip"] = asString(attr.clip ? 1 : 0);
+    res->params["flip"] = "1";
+
+    return res;
+}
+
+template <>
+CNNLayer::Ptr NodeConverter<ngraph::op::PriorBoxClustered>::createLayer(
+        const std::shared_ptr<ngraph::Node>& layer) const {
+    THROW_IE_EXCEPTION << "PriorBoxClustered operation must be converted to PriorBoxClusteredIE operation.";
+}
+
+template <>
+CNNLayer::Ptr NodeConverter<ngraph::op::PriorBoxIE>::createLayer(const std::shared_ptr<ngraph::Node>& layer) const {
+    LayerParams params = {layer->get_friendly_name(), "PriorBox",
+                          details::convertPrecision(layer->get_output_element_type(0))};
+    auto res = std::make_shared<InferenceEngine::CNNLayer>(params);
+    auto castedLayer = ngraph::as_type_ptr<ngraph::op::PriorBoxIE>(layer);
+    auto layer_info = params.type + " layer " + params.name;
+
+    if (castedLayer == nullptr) THROW_IE_EXCEPTION << "Cannot get " << layer_info;
+
+    auto attr = castedLayer->get_attrs();
+    std::string param;
+
+    auto data_pshape = castedLayer->get_input_partial_shape(0);
+    if (data_pshape.is_dynamic()) THROW_IE_EXCEPTION << "Dynamic 0-port input of " << layer_info << " is not supported";
+    auto data_shape = data_pshape.to_shape();
+    if (data_shape.size() != 4) THROW_IE_EXCEPTION << layer_info << " has " << data_shape.size() << " items in 0-port input, 4 expected";
+
+    auto img_pshape = castedLayer->get_input_partial_shape(1);
+    if (img_pshape.is_dynamic()) THROW_IE_EXCEPTION << "Dynamic 1-port input of " << layer_info << " is not supported";
+    auto img_shape = img_pshape.to_shape();
+    if (img_shape.size() != 4) THROW_IE_EXCEPTION << layer_info << " has " << data_shape.size() << " items in 1-port input, 4 expected";
+
+    if (!attr.scale_all_sizes) {
+        // mxnet-like PriorBox
+        auto img_H = img_shape[2];
+        auto data_H = data_shape[2];
+        if (attr.step == -1)
+            attr.step = 1. * img_H / data_H;
+        else
+            attr.step *= img_H;
+        for (auto& size : attr.min_size)
+            size *= img_H;
+    }
+
+    for (const auto& val : attr.max_size) {
+        if (!param.empty()) param += ",";
+        param += asString(val);
+    }
+    res->params["max_size"] = param;
+
+    param.clear();
+    for (const auto& val : attr.min_size) {
+        if (!param.empty()) param += ",";
+        param += asString(val);
+    }
+    res->params["min_size"] = param;
|
||||||
|
|
||||||
|
param.clear();
|
||||||
|
for (const auto& val : attr.aspect_ratio) {
|
||||||
|
if (!param.empty()) param += ",";
|
||||||
|
param += asString(val);
|
||||||
|
}
|
||||||
|
res->params["aspect_ratio"] = param;
|
||||||
|
|
||||||
|
param.clear();
|
||||||
|
for (const auto& val : attr.variance) {
|
||||||
|
if (!param.empty()) param += ",";
|
||||||
|
param += asString(val);
|
||||||
|
}
|
||||||
|
res->params["variance"] = param;
|
||||||
|
|
||||||
|
res->params["step"] = asString(attr.step);
|
||||||
|
res->params["offset"] = asString(attr.offset);
|
||||||
|
res->params["clip"] = asString(attr.clip ? 1 : 0);
|
||||||
|
res->params["flip"] = asString(attr.flip ? 1 : 0);
|
||||||
|
res->params["scale_all_sizes"] = asString(attr.scale_all_sizes ? 1 : 0);
|
||||||
|
|
||||||
|
res->params["density"] = asString(attr.density);
|
||||||
|
res->params["fixed_size"] = asString(attr.fixed_size);
|
||||||
|
res->params["fixed_ratio"] = asString(attr.fixed_ratio);
|
||||||
|
|
||||||
|
return res;
|
||||||
|
}
|
||||||
|
|
||||||
|
template <>
|
||||||
|
CNNLayer::Ptr NodeConverter<ngraph::op::PriorBox>::createLayer(const std::shared_ptr<ngraph::Node>& layer) const {
|
||||||
|
THROW_IE_EXCEPTION << "PriorBox operation must be converted to PriorBoxIE operation.";
|
||||||
|
}
|
||||||
|
|
||||||
template <>
|
template <>
|
||||||
CNNLayer::Ptr NodeConverter<ngraph::op::PowerIE>::createLayer(const std::shared_ptr<ngraph::Node>& layer) const {
|
CNNLayer::Ptr NodeConverter<ngraph::op::PowerIE>::createLayer(const std::shared_ptr<ngraph::Node>& layer) const {
|
||||||
LayerParams params = {layer->get_friendly_name(), "Power",
|
LayerParams params = {layer->get_friendly_name(), "Power",
|
||||||
|
|||||||
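The width/height/variance serialization repeated above boils down to comma-joining a float vector into one IR param string. A minimal standalone sketch of that pattern (the `asString` below is a simplified stand-in for the toolkit's helper, not the real one):

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Simplified stand-in for the toolkit's asString() float formatter.
static std::string asString(float v) {
    std::ostringstream ss;
    ss << v;
    return ss.str();
}

// Comma-join attribute values, the same pattern the converter uses
// for the "width", "height" and "variance" params.
std::string joinAsParam(const std::vector<float>& values) {
    std::string param;
    for (const auto& val : values) {
        if (!param.empty()) param += ",";
        param += asString(val);
    }
    return param;
}
```

An empty vector yields an empty param string, which matches how an absent attribute list would serialize.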
@@ -272,6 +272,48 @@ void CombineData(DataPtr& master, DataPtr& slave) {
     }
 }
 
+/**
+ * Preserve output data name and update output data map of the network
+ *
+ * @param in_data name to update
+ * @param out_data name to preserve
+ * @param net output data map to update with in_data
+ */
+template <typename NET>
+void SaveOutputDataName(InferenceEngine::DataPtr in_data, InferenceEngine::DataPtr out_data, NET &net) {
+    // TODO: update outputs of the network if out_data was output
+    if (out_data->getInputTo().empty()) {
+        auto data_name = out_data->getName();
+        in_data->setName(data_name);
+    }
+}
+
+/**
+ * void SaveOutputDataName(InferenceEngine::DataPtr in_data, InferenceEngine::DataPtr out_data, NET &net), where
+ * NET = ICNNNetwork
+ */
+void SaveOutputDataName(InferenceEngine::DataPtr in_data, InferenceEngine::DataPtr out_data, ICNNNetwork& net) {
+    if (out_data->getInputTo().empty()) {
+        InferenceEngine::OutputsDataMap outputs_data_map;
+        net.getOutputsInfo(outputs_data_map);
+        auto out_data_name = out_data->getName();
+        in_data->setName(out_data_name);
+        if (outputs_data_map.count(out_data_name)) {
+            auto parent_layer_ptr = in_data->getCreatorLayer().lock();
+            IE_ASSERT(parent_layer_ptr != nullptr);
+            auto parent_layer_name = parent_layer_ptr->name;
+            size_t in_data_out_index = 0;
+            for (size_t ind = 0; ind < parent_layer_ptr->outData.size(); ++ind) {
+                if (parent_layer_ptr->outData[ind] == in_data) {
+                    in_data_out_index = ind;
+                }
+            }
+            net.addOutput(parent_layer_name, in_data_out_index);
+        }
+    }
+}
+
 /**
  * Remove layer from graph
  * May be applied only for inplace layer. One input, one output,
@@ -279,7 +321,8 @@ void CombineData(DataPtr& master, DataPtr& slave) {
  *
  * @param layer to remove from graph
  */
-void RemoveLayer(CNNLayerPtr& layer) {
+template <typename NET>
+void RemoveLayer(CNNLayerPtr& layer, NET &net) {
     IE_ASSERT(layer->insData.size() == 1);
     IE_ASSERT(layer->outData.size() == 1);
 
@@ -299,10 +342,8 @@ void RemoveLayer(CNNLayerPtr& layer) {
     // transfer output connections into parent data
     CombineData(in_data, out_data);
 
-    // Save name for output data
-    if (out_data->getInputTo().empty()) {
-        in_data->setName(out_data->getName());
-    }
+    // save name for output data and update network output
+    SaveOutputDataName(in_data, out_data, net);
 }
 
 /************************************************************/
@@ -1371,7 +1412,7 @@ void fixConvertLayers(NET &net) {
         }
     }
     for (auto &layer : to_remove) {
-        RemoveLayer(layer);
+        RemoveLayer(layer, net);
     }
 }
@@ -21,6 +21,8 @@ public:
     ~GemmTransformation() override {};
     bool canBeTransformed(const TransformationContext& context, const CNNLayer& layer) const override;
     void transform(TransformationContext& context, CNNLayer& layer) const override;
+
+    bool isQuantized(const CNNLayer& layer) const noexcept override;
 };
 
 IE_SUPPRESS_DEPRECATED_END
@@ -83,6 +83,8 @@ protected:
         const std::vector<float>& originalWeightsDequantizationShifts,
         std::vector<float>& dequantizationScales,
         std::vector<float>& dequantizationShifts) const;
+
+    static bool getDequantizationDimIsSupported(const CNNLayer& weightableLayer);
 };
 
 typedef std::shared_ptr<WeightableLayerTransformation> WeightableLayerTransformationPtr;
@@ -135,7 +135,6 @@ void ConcatTransformation::transform(TransformationContext& context, CNNLayer& c
 
         dequantizationScale = maxOutputInterval / (dataPrecision.max - dataPrecision.min);
-        const float max = maxOutputInterval / ((dataPrecision.max - dataPrecision.min) / dataPrecision.max);
         const float min = maxOutputInterval / ((dataPrecision.max - dataPrecision.min) / dataPrecision.min);
         dequantizationShift = outputLowValue - min;
 
@@ -25,15 +25,6 @@
 using namespace InferenceEngine;
 using namespace InferenceEngine::details;
 
-bool getDequantizationValuesAreBroadcasted(const CNNLayer& fullyConnected) {
-    const DataPtr inputData = fullyConnected.insData[0].lock();
-    if (inputData == nullptr) {
-        THROW_IE_LPT_EXCEPTION(fullyConnected) << "input data is absent";
-    }
-
-    return inputData->getDims().size() == 3ul;
-}
-
 bool FullyConnectedTransformation::canBeTransformed(const TransformationContext& context, const CNNLayer& fullyConnected) const {
     if (!WeightableLayerTransformation::canBeTransformed(context, fullyConnected)) {
         return false;
@@ -72,7 +63,12 @@ bool FullyConnectedTransformation::canBeTransformed(const TransformationContext&
     std::vector<float> dequantizationShifts;
     fillFromDequantizationLayer(*scaleShift, dequantizationScales, dequantizationShifts);
 
-    if ((inTensorDims.size() == 3ul) && (!DequantizationDetails::isPerTensor(dequantizationScales, dequantizationShifts))) {
+    const bool dequantizationDimIsSupported = !getDequantizationDimIsSupported(fullyConnected);
+    if ((!dequantizationDimIsSupported) &&
+        (!DequantizationDetails::isPerTensor(dequantizationScales, dequantizationShifts) ||
+        // if asymmetric quantization is not supported then no shifts for dequantizationDimIsSupported = false case:
+        // in this case we can not dequantize with shifts
+        (!supportAsymmetricQuantization && (dequantizationShifts[0] != 0.f)))) {
         return false;
     }
 
@@ -318,7 +314,7 @@ void FullyConnectedTransformation::calculateDequantizationForSymmetric(
     const auto prevDequantizationScaleBuffer = CNNNetworkHelper::getFloatData(CNNNetworkHelper::getBlob(scaleShift, "weights"));
     const auto prevDequantizationShiftBuffer = CNNNetworkHelper::getFloatData(CNNNetworkHelper::getBlob(scaleShift, "biases"));
 
-    const bool dequantizationValuesAreBroadcasted = getDequantizationValuesAreBroadcasted(fullyConnected);
+    const bool dequantizationValuesAreBroadcasted = !getDequantizationDimIsSupported(fullyConnected);
     for (size_t i = 0; i < outputChannelsCount; ++i) {
         dequantizationScales[i] =
             prevDequantizationScaleBuffer.get()[0] *
@@ -401,7 +397,7 @@ void FullyConnectedTransformation::calculateDequantizationForAsymmetric(
         THROW_IE_EXCEPTION << "Unexpected layer type to calculate quantization values " << scaleShift->type;
     }
 
-    const bool dequantizationValuesAreBroadcasted = getDequantizationValuesAreBroadcasted(fullyConnected);
+    const bool dequantizationValuesAreBroadcasted = !getDequantizationDimIsSupported(fullyConnected);
 
     dequantizationScales.resize(outputChannelsCount);
     dequantizationShifts.resize(outputChannelsCount);
@@ -412,10 +408,10 @@ void FullyConnectedTransformation::calculateDequantizationForAsymmetric(
             prevDequantizationScaleBuffer.get()[0] *
             (originalWeightsDequantizationScales.size() == 0 ?
                 1.0 :
-                (originalWeightsDequantizationScales.size() == 1 ? originalWeightsDequantizationScales[0] : originalWeightsDequantizationScales[i]));
+                originalWeightsDequantizationScales[((originalWeightsDequantizationScales.size() == 1) || dequantizationValuesAreBroadcasted) ? 0 : i]);
     }
 
-    if (CNNNetworkHelper::isQuantizedConstWeights(fullyConnected)) {
+    if (CNNNetworkHelper::isQuantizedConstWeights(fullyConnected) && (!dequantizationValuesAreBroadcasted)) {
         const Blob::Ptr weightsBlob = CNNNetworkHelper::getWeights(fullyConnected, roundQuantizedValues);
         const auto weightsBuffer = CNNNetworkHelper::getFloatData(weightsBlob);
         const Blob::Ptr biasesBlob = CNNNetworkHelper::getBiases(fullyConnected);
@@ -432,7 +428,7 @@ void FullyConnectedTransformation::calculateDequantizationForAsymmetric(
 
         for (size_t w = 0; w < inputChannelsCount; ++w) {
             const float kernel = weightsBuffer.get()[channel * inputChannelsCount + w];
-            const float shift = dequantizationValuesAreBroadcasted ? prevDequantizationShiftBuffer.get()[0] : prevDequantizationShiftBuffer.get()[w];
+            const float shift = prevDequantizationShiftBuffer.get()[w];
             sum1 += kernel * shift * weightsDequantizationScale;
             sum2 += kernel * dataZeroPoints[w] * weightsDequantizationScale;
         }
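The index arithmetic in the `@@ -412` hunk above can be isolated: with an empty weights-dequantization-scale vector the neutral factor 1.0 is used, with a single element or broadcasted values every output channel reads element 0, and otherwise channel `i` reads element `i`. A hedged sketch of just that selection (the function name is illustrative, not from the source):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Illustrative reduction of the scale-selection logic above: empty -> 1.0,
// single value or broadcasted -> element 0, otherwise per-channel element i.
float selectWeightsScale(const std::vector<float>& scales, bool broadcasted, std::size_t i) {
    if (scales.empty()) return 1.0f;
    return scales[(scales.size() == 1 || broadcasted) ? 0 : i];
}
```

Folding the broadcast flag into the subscript avoids the nested ternary the old code used and keeps a single out-of-bounds-safe access path.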
@@ -133,3 +133,8 @@ void GemmTransformation::transform(TransformationContext& context, CNNLayer& gem
 
     addDequantizationLayer(context, gemm, dequantizationScales, dequantizationShifts);
 }
+
+bool GemmTransformation::isQuantized(const CNNLayer& layer) const noexcept {
+    // weightable layer version overriding
+    return true;
+}
@@ -128,6 +128,15 @@ bool WeightableLayerTransformation::isPrecisionPreserved(const CNNLayer& layer)
     return false;
 }
 
+bool WeightableLayerTransformation::getDequantizationDimIsSupported(const CNNLayer& fullyConnected) {
+    const DataPtr inputData = fullyConnected.insData[0].lock();
+    if (inputData == nullptr) {
+        THROW_IE_LPT_EXCEPTION(fullyConnected) << "input data is absent";
+    }
+
+    return inputData->getDims().size() != 3ul;
+}
+
 void WeightableLayerTransformation::updateLayerBiases(
     TransformationContext& context,
     const CNNLayer& weightableLayer,
@@ -135,7 +144,17 @@ void WeightableLayerTransformation::updateLayerBiases(
     std::vector<float>& dequantizationScales,
     std::vector<float>& dequantizationShifts,
     std::vector<float>& biasesShifts) const {
-    if (!std::all_of(dequantizationShifts.begin(), dequantizationShifts.end(), [](float value) { return value == 0.0; })) {
+    const bool dequantizationShiftsAreZero = std::all_of(
+        dequantizationShifts.begin(),
+        dequantizationShifts.end(),
+        [](float value) { return value == 0.0; });
+
+    const bool dequantizationDimIsNotSupported = !getDequantizationDimIsSupported(weightableLayer);
+    CNNLayerPtr biasesLayer = CNNNetworkHelper::getParent(weightableLayer, 2);
+
+    // we need to correct biases if dequantization shifts values are not zero or
+    // dequantization dimension is not supported (as result dequantization shifts can not be calculated)
+    if ((dequantizationDimIsNotSupported && (biasesLayer != nullptr)) || (!dequantizationShiftsAreZero)) {
         const DataPtr insData = weightableLayer.insData[0].lock();
         if (insData == nullptr) {
             THROW_IE_LPT_EXCEPTION(weightableLayer) << "input data is absent";
@@ -144,7 +163,6 @@ void WeightableLayerTransformation::updateLayerBiases(
 
         std::shared_ptr<float> biasesBufferPtr;
         Blob::Ptr biasesBlob;
-        CNNLayerPtr biasesLayer = CNNNetworkHelper::getParent(weightableLayer, 2);
         if (biasesLayer == nullptr) {
             if (weightableLayer.outData.size() != 1ul) {
                 THROW_IE_LPT_EXCEPTION(weightableLayer) << "unexpected output data count " << weightableLayer.outData.size();
@@ -661,6 +661,13 @@ MKLDNNMemoryDesc::operator InferenceEngine::TensorDesc() const {
             blkDims.push_back(8);
             layout = Layout::BLOCKED;
             break;
+        case memory::gOdhwi8o:
+            order = {0, 1, 2, 3, 4, 5, 1};
+            blkDims = dims;
+            blkDims[1] = blkDims[1] / 8 + (blkDims[1] % 8 ? 1 : 0);
+            blkDims.push_back(8);
+            layout = Layout::BLOCKED;
+            break;
         case memory::nChw16c:
             order = {0, 1, 2, 3, 1};
             blkDims = dims;
@@ -676,6 +683,13 @@ MKLDNNMemoryDesc::operator InferenceEngine::TensorDesc() const {
             blkDims.push_back(16);
             layout = Layout::BLOCKED;
             break;
+        case memory::gOdhwi16o:
+            order = {0, 1, 2, 3, 4, 5, 1};
+            blkDims = dims;
+            blkDims[1] = blkDims[1] / 16 + (blkDims[1] % 16 ? 1 : 0);
+            blkDims.push_back(16);
+            layout = Layout::BLOCKED;
+            break;
         case memory::Ohwi8o:
             order = {0, 1, 2, 3, 0};
             blkDims = dims;
@@ -1267,6 +1281,13 @@ MKLDNNMemoryDesc::MKLDNNMemoryDesc(const TensorDesc& tDesc):
         } else if (blkdDims[6] == 16) {
             mkldnnFormat = memory::format::Goidhw16g;
         }
+    } else if (order.size() == 7 &&
+               order[0] == 0 && order[1] == 1 && order[2] == 2 && order[3] == 3 && order[4] == 4 && order[5] == 5 && order[6] == 1) {
+        if (blkdDims[6] == 8) {
+            mkldnnFormat = memory::format::gOdhwi8o;
+        } else if (blkdDims[6] == 16) {
+            mkldnnFormat = memory::format::gOdhwi16o;
+        }
     } else if (order.size() == 8 &&
                order[0] == 0 && order[1] == 1 && order[2] == 3 && order[3] == 4 && order[4] == 2 && order[5] == 5 &&
                order[6] == 1 && order[7] == 2) {
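The gOdhwi8o/gOdhwi16o branches above both round the channel dimension up to whole blocks with the `dims[1] / B + (dims[1] % B ? 1 : 0)` idiom, i.e. a ceiling division by the block size. Isolated for clarity (helper name is hypothetical, not from the source):

```cpp
#include <cassert>
#include <cstddef>

// Ceiling division used for blocked layouts: the number of B-sized blocks
// needed to cover `channels` channels, with a partial final block counted.
std::size_t blockedChannelDim(std::size_t channels, std::size_t block) {
    return channels / block + (channels % block ? 1 : 0);
}
```

The appended `blkDims.push_back(8)` / `push_back(16)` then stores the inner block extent itself, so the blocked shape always covers at least the original channel count.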
@@ -182,8 +182,6 @@ void argmax_many_classes_has_axis(const float* src_data, float* dst_data, Shape
         vmask_type vmask;
         int s_index = i0 * dim * after_num + ib1 * block_size;
 
-        std::memset(reinterpret_cast<void*>(&vmax_values[0]), 0, sizeof(vmax_values));
-
         auto vswap_func = [&](int index1, int index2) {
             vtmp = vmax_values[index1];
             vmax_values[index1] = _mm_uni_blendv_ps(vmax_values[index1], vmax_values[index2], vmask);
@@ -157,7 +157,7 @@ void MKLDNNDepthwiseNode::createDescriptor(const std::vector<InferenceEngine::Te
                                            const std::vector<InferenceEngine::TensorDesc> &outputDesc) {
     MKLDNNMemoryDesc in_candidate(inputDesc[0]);
     MKLDNNMemoryDesc out_candidate(inputDesc[0]);
-    MKLDNNDims weightDims({in_candidate.getDims()[1]});
+    MKLDNNDims weightDims({in_candidate.getDims().ndims() == 1 ? in_candidate.getDims()[0] : in_candidate.getDims()[1]});
 
     MKLDNNMemoryDesc wgh_candidate{weightDims, in_candidate.getDataType(), memory::x};
 
@@ -209,32 +209,34 @@ void MKLDNNFullyConnectedNode::setPostOps(mkldnn::primitive_attr &attr, bool ini
             PostOpsIntBlobMemory.push_back(MKLDNNMemoryPtr(new MKLDNNMemory(getEngine())));
             PostOpsIntBlobMemory[blob_idx]->Create(depthwiseDims, memory::data_type::f32, memory::format::x);
 
-            PostOpsIntBlobMemory[blob_idx]->SetData(memory::data_type::f32, memory::x,
-                                                    depthwiseLayer->_weights->buffer(),
-                                                    depthwiseLayer->_weights->size() *
-                                                    MKLDNNExtensionUtils::sizeOfDataType(memory::data_type::f32));
-
+            // In case ndims == 3 graph optimizer allows fusing only if all weights values are the same
             if (depthwiseNode->isBroadcast() || ndims == 3) {
-                float broadcastValue = static_cast<float *>(PostOpsIntBlobMemory[blob_idx]->GetData())[0];
-                for (int i = 1; i < PostOpsIntBlobMemory[blob_idx]->GetPrimitiveDescriptor().desc().data.dims[0]; i++) {
+                float broadcastValue = static_cast<float *>(depthwiseLayer->_weights->buffer())[0];
+                for (int i = 0; i < PostOpsIntBlobMemory[blob_idx]->GetPrimitiveDescriptor().desc().data.dims[0]; i++) {
                     static_cast<float *>(PostOpsIntBlobMemory[blob_idx]->GetData())[i] = broadcastValue;
                 }
+            } else {
+                PostOpsIntBlobMemory[blob_idx]->SetData(memory::data_type::f32, memory::x,
+                                                        depthwiseLayer->_weights->buffer(),
+                                                        depthwiseLayer->_weights->size() *
+                                                        MKLDNNExtensionUtils::sizeOfDataType(memory::data_type::f32));
             }
 
             if (depthwiseNode->getAlgorithm() == depthwise_scale_shift) {
                 PostOpsIntBlobMemory.push_back(MKLDNNMemoryPtr(new MKLDNNMemory(getEngine())));
-                PostOpsIntBlobMemory[blob_idx + 1]->Create(depthwiseDims, memory::data_type::f32,
-                                                           memory::format::x);
-                PostOpsIntBlobMemory[blob_idx + 1]->SetData(memory::data_type::f32, memory::x,
-                                                            depthwiseLayer->_biases->buffer(),
-                                                            depthwiseLayer->_biases->size() *
-                                                            MKLDNNExtensionUtils::sizeOfDataType(memory::data_type::f32));
+                PostOpsIntBlobMemory[blob_idx + 1]->Create(depthwiseDims, memory::data_type::f32, memory::format::x);
 
+                // In case ndims == 3 graph optimizer allows fusing only if all biases values are the same
                 if (depthwiseNode->isBroadcast() || ndims == 3) {
-                    float broadcastValue = static_cast<float *>(PostOpsIntBlobMemory[blob_idx + 1]->GetData())[0];
-                    for (int i = 1; i < PostOpsIntBlobMemory[blob_idx + 1]->GetPrimitiveDescriptor().desc().data.dims[0]; i++) {
+                    float broadcastValue = static_cast<float *>(depthwiseLayer->_biases->buffer())[0];
+                    for (int i = 0; i < PostOpsIntBlobMemory[blob_idx + 1]->GetPrimitiveDescriptor().desc().data.dims[0]; i++) {
                         static_cast<float *>(PostOpsIntBlobMemory[blob_idx + 1]->GetData())[i] = broadcastValue;
                     }
+                } else {
+                    PostOpsIntBlobMemory[blob_idx + 1]->SetData(memory::data_type::f32, memory::x,
+                                                                depthwiseLayer->_biases->buffer(),
+                                                                depthwiseLayer->_biases->size() *
+                                                                MKLDNNExtensionUtils::sizeOfDataType(memory::data_type::f32));
                 }
 
                 ops.append_depthwise(depthwiseNode->getAlgorithm(),
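The `i = 1` → `i = 0` loop-bound change above matters because the broadcast value now comes from the layer's weight/bias blob instead of from the destination buffer itself, so element 0 of the destination must also be written. A toy reproduction of the fixed behaviour, under the assumption that the destination starts uninitialized (names are illustrative):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Fixed broadcast fill: read the value from a separate source blob and
// write every destination element, including index 0.
std::vector<float> broadcastFromSource(const std::vector<float>& src, std::size_t n) {
    std::vector<float> dst(n, -1.0f);  // -1 stands in for uninitialized memory
    float broadcastValue = src[0];
    for (std::size_t i = 0; i < n; i++)
        dst[i] = broadcastValue;
    return dst;
}
```

Starting the loop at 1 with a separate source would have left `dst[0]` holding its uninitialized value.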
@@ -667,7 +667,8 @@ private:
 };
 
 MKLDNNNormalizeNode::MKLDNNNormalizeNode(const InferenceEngine::CNNLayerPtr& layer, const mkldnn::engine& eng, MKLDNNWeightsSharing::Ptr &cache)
-        : MKLDNNNode(layer, eng, cache) {}
+        : MKLDNNNode(layer, eng, cache), src_data_size(0lu), dst_data_size(0lu), weights_data_size(0lu),
+        input_prec(Precision::UNSPECIFIED), output_prec(Precision::UNSPECIFIED), weights_prec(Precision::UNSPECIFIED) {}
 
 void MKLDNNNormalizeNode::getSupportedDescriptors() {
     if (!descs.empty())
@@ -120,13 +120,18 @@ void MKLDNNReorderNode::createReorderPrimitive(const mkldnn::memory::desc &srcDe
     // Code block below tries to detect such cases and reinterpret data planar formats (e.g. nchw)
     // as grouped weights planar formats (e.g. goihw) since they have same physical memory layout.
     if (MKLDNNMemory::GetPlainFormat(src_blocked->GetDims()) == src_blocked->GetFormat() &&
-        MKLDNNMemory::IsGroupedFormat(dst_blocked->GetFormat())) {
+        src_blocked->GetDims().size() + 1 == dst_blocked->GetDims().size()) {
         try {
             mkldnn::memory::dims newDims = dst_blocked->GetDims();
             mkldnn::memory::format newFormat;
-            newFormat = src_blocked->GetDims().size() == 4 ? memory::goihw :
-                        src_blocked->GetDims().size() == 5 ? memory::goidhw :
-                        src_blocked->GetFormat();
+            if (MKLDNNMemory::IsGroupedFormat(dst_blocked->GetFormat())) {
+                newFormat = src_blocked->GetDims().size() == 4 ? memory::goihw :
+                            src_blocked->GetDims().size() == 5 ? memory::goidhw :
+                            src_blocked->GetFormat();
+            } else {
+                newFormat = src_blocked->GetDims().size() == 4 ? memory::ncdhw :
+                            src_blocked->GetFormat();
+            }
 
             auto newDesc = mkldnn::memory::desc(newDims, src_blocked->GetDataType(), newFormat);
             src_blocked->Create(newDesc, srcPtr, false);
@@ -413,6 +413,16 @@ std::shared_ptr<ngraph::Node> V10Parser::createNode(const std::vector<ngraph::Ou
         std::make_shared<LayerCreator<ngraph::op::v1::ReduceLogicalOr>>("ReduceLogicalOr"),
     };
 
+    // Check that operation in default opsets
+    auto isDefaultOpSet = [](const std::string& version) -> bool {
+        for (size_t i = 1; i <= 3; i++) {
+            std::string opset_name = "opset" + std::to_string(i);
+            if (version == opset_name)
+                return true;
+        }
+        return false;
+    };
+
     for (size_t i = 0; i < inputs.size(); i++) {
         if (!inputs[i].get_node())
             THROW_IE_EXCEPTION << params.type << " layer " << params.name << " with id: " << params.layerId
@@ -423,21 +433,23 @@ std::shared_ptr<ngraph::Node> V10Parser::createNode(const std::vector<ngraph::Ou
     }
 
     std::shared_ptr<ngraph::Node> ngraphNode;
-    // Try to create operation from creators
-    for (const auto& creator : creators) {
-        if (creator->shouldCreate(params.type)) {
-            bool useCreator = false;
-            // Check that opset is registered
-            useCreator |= opsets.find(params.version) == opsets.end();
-            if (!useCreator) {
-                // Check that creator can create operation with the version from opset
-                const auto opset = opsets.at(params.version);
-                // Opset should contains the same version of operation or doesn't contain operation with current type
-                useCreator |= opset.contains_type(creator->getNodeType()) || !opset.contains_type(params.type);
-            }
-            if (useCreator)
-                ngraphNode = creator->createLayer(inputs, node, binStream, params);
-            break;
+    if (isDefaultOpSet(params.version)) {
+        // Try to create operation from creators
+        for (const auto& creator : creators) {
+            if (creator->shouldCreate(params.type)) {
+                bool useCreator = false;
+                // Check that opset is registered
+                useCreator |= opsets.find(params.version) == opsets.end();
+                if (!useCreator) {
+                    // Check that creator can create operation with the version from opset
+                    const auto opset = opsets.at(params.version);
+                    // Opset should contains the same version of operation or doesn't contain operation with current type
+                    useCreator |= opset.contains_type(creator->getNodeType()) || !opset.contains_type(params.type);
+                }
+                if (useCreator)
+                    ngraphNode = creator->createLayer(inputs, node, binStream, params);
+                break;
+            }
         }
     }
 
@@ -0,0 +1,43 @@
+// Copyright (C) 2018-2020 Intel Corporation
+// SPDX-License-Identifier: Apache-2.0
+//
+
+#pragma once
+
+#include <memory>
+
+#include <transformations_visibility.hpp>
+
+#include <ngraph/op/op.hpp>
+#include <ngraph/op/experimental/layers/prior_box_clustered.hpp>
+
+namespace ngraph {
+namespace op {
+
+class TRANSFORMATIONS_API PriorBoxClusteredIE : public Op {
+public:
+    static constexpr NodeTypeInfo type_info{"PriorBoxClusteredIE", 1};
+    const NodeTypeInfo& get_type_info() const override { return type_info; }
+
+    /// \brief Constructs a PriorBoxClusteredIE operation
+    ///
+    /// \param layer Layer for which prior boxes are computed
+    /// \param image Input Input to which prior boxes are scaled
+    /// \param attrs PriorBoxClustered attributes
+    PriorBoxClusteredIE(const Output<Node>& input,
+                        const Output<Node>& image,
+                        const ngraph::op::PriorBoxClusteredAttrs& attrs);
+
+    void validate_and_infer_types() override;
+
+    std::shared_ptr<Node> copy_with_new_args(const NodeVector& new_args) const override;
+
+    const PriorBoxClusteredAttrs& get_attrs() const { return m_attrs; }
+
+private:
+    PriorBoxClusteredAttrs m_attrs;
+};
+
+}  // namespace op
+}  // namespace ngraph
+
@@ -0,0 +1,42 @@
+// Copyright (C) 2018-2020 Intel Corporation
+// SPDX-License-Identifier: Apache-2.0
+//
+
+#pragma once
+
+#include <memory>
+
+#include <transformations_visibility.hpp>
+
+#include "ngraph/op/op.hpp"
+#include "ngraph/op/experimental/layers/prior_box.hpp"
+
+namespace ngraph {
+namespace op {
+
+class TRANSFORMATIONS_API PriorBoxIE : public Op {
+public:
+    static constexpr NodeTypeInfo type_info{"PriorBoxIE", 1};
+    const NodeTypeInfo& get_type_info() const override { return type_info; }
+
+    /// \brief Constructs a PriorBoxIE operation
+    ///
+    /// \param layer Layer for which prior boxes are computed
+    /// \param image Input Input to which prior boxes are scaled
+    /// \param attrs PriorBox attributes
+    PriorBoxIE(const Output<Node>& input,
+               const Output<Node>& image,
+               const ngraph::op::PriorBoxAttrs& attrs);
+
+    void validate_and_infer_types() override;
+
+    std::shared_ptr<Node> copy_with_new_args(const NodeVector& new_args) const override;
+
+    const PriorBoxAttrs& get_attrs() const { return m_attrs; }
+
+private:
+    PriorBoxAttrs m_attrs;
+};
+
+}  // namespace op
+}  // namespace ngraph
@@ -16,6 +16,8 @@
 
 // This pass must be called first in pipeline
 NGRAPH_PASS(InitNodeInfo, ::ngraph::pass)
+NGRAPH_PASS(ConvertPriorBox, ::ngraph::pass) // WA: ConvertPriorBox must be executed before CF
+NGRAPH_PASS(ConstantFolding, ::ngraph::pass)
 NGRAPH_PASS(RemoveFilteringBoxesBySize, ::ngraph::pass) // Resolves dynamism (replaces NonZero), CF needed
 NGRAPH_PASS(ConstantFolding, ::ngraph::pass)
 NGRAPH_PASS(StridedSliceOptimization, ::ngraph::pass) // depends on CF
@@ -0,0 +1,33 @@
+// Copyright (C) 2018-2020 Intel Corporation
+// SPDX-License-Identifier: Apache-2.0
+//
+
+#pragma once
+
+#include <vector>
+#include <memory>
+
+#include <transformations_visibility.hpp>
+
+#include <ngraph/pass/graph_rewrite.hpp>
+
+namespace ngraph {
+namespace pass {
+
+class TRANSFORMATIONS_API ConvertPriorBox;
+
+}  // namespace pass
+}  // namespace ngraph
+
+class ngraph::pass::ConvertPriorBox: public ngraph::pass::GraphRewrite {
+public:
+    ConvertPriorBox() : GraphRewrite() {
+        convert_prior_box();
+        convert_prior_box_clustered();
+    }
+
+private:
+    void convert_prior_box();
+
+    void convert_prior_box_clustered();
+};
@@ -0,0 +1,39 @@
+// Copyright (C) 2018-2020 Intel Corporation
+// SPDX-License-Identifier: Apache-2.0
+//
+
+#include "ngraph_ops/prior_box_clustered_ie.hpp"
+
+#include <memory>
+
+#include "ngraph/op/constant.hpp"
+
+using namespace std;
+using namespace ngraph;
+
+constexpr NodeTypeInfo op::PriorBoxClusteredIE::type_info;
+
+op::PriorBoxClusteredIE::PriorBoxClusteredIE(const Output<Node>& input, const Output<Node>& image,
+                                             const PriorBoxClusteredAttrs& attrs)
+    : Op({input, image}), m_attrs(attrs) {
+    constructor_validate_and_infer_types();
+}
+
+void op::PriorBoxClusteredIE::validate_and_infer_types() {
+    if (get_input_partial_shape(0).is_dynamic() || get_input_partial_shape(1).is_dynamic()) {
+        set_output_type(0, element::f32, PartialShape::dynamic(3));
+        return;
+    }
+
+    auto input_shape = get_input_shape(0);
+    auto image_shape = get_input_shape(1);
+
+    size_t num_priors = m_attrs.widths.size();
+
+    set_output_type(0, element::f32, Shape {1, 2, 4 * input_shape[2] * input_shape[3] * num_priors});
+}
+
+shared_ptr<Node> op::PriorBoxClusteredIE::copy_with_new_args(const NodeVector& new_args) const {
+    check_new_args_count(this, new_args);
+    return make_shared<PriorBoxClusteredIE>(new_args.at(0), new_args.at(1), m_attrs);
+}
@@ -0,0 +1,36 @@
+// Copyright (C) 2018-2020 Intel Corporation
+// SPDX-License-Identifier: Apache-2.0
+//
+
+#include "ngraph_ops/prior_box_ie.hpp"
+
+#include <memory>
+
+#include "ngraph/op/constant.hpp"
+
+using namespace std;
+using namespace ngraph;
+
+constexpr NodeTypeInfo op::PriorBoxIE::type_info;
+
+op::PriorBoxIE::PriorBoxIE(const Output<Node>& input, const Output<Node>& image, const PriorBoxAttrs& attrs)
+    : Op({input, image}), m_attrs(attrs) {
+    constructor_validate_and_infer_types();
+}
+
+void op::PriorBoxIE::validate_and_infer_types() {
+    if (get_input_partial_shape(0).is_dynamic() || get_input_partial_shape(1).is_dynamic()) {
+        set_output_type(0, element::f32, PartialShape::dynamic(3));
+        return;
+    }
+    auto input_shape = get_input_shape(0);
+    auto image_shape = get_input_shape(1);
+
+    set_output_type(0, element::f32, Shape {
+        1, 2, 4 * input_shape[2] * input_shape[3] * static_cast<size_t>(op::PriorBox::number_of_priors(m_attrs))});
+}
+
+shared_ptr<Node> op::PriorBoxIE::copy_with_new_args(const NodeVector& new_args) const {
+    check_new_args_count(this, new_args);
+    return make_shared<PriorBoxIE>(new_args.at(0), new_args.at(1), m_attrs);
+}
@@ -5,6 +5,7 @@
 #include <memory>
 
 #include "transformations/common_optimizations/common_optimizations.hpp"
+#include "transformations/convert_opset1_to_legacy/convert_prior_to_ie_prior.hpp"
 #include "transformations/depth_to_space_fusion.hpp"
 #include "transformations/optimize_strided_slice.hpp"
 #include "transformations/convert_scatter_elements_to_scatter.hpp"
@@ -17,7 +17,8 @@ void ngraph::pass::ConvertDivide::convert_divide() {
 
     ngraph::graph_rewrite_callback callback = [](pattern::Matcher& m) {
         auto div = std::dynamic_pointer_cast<ngraph::opset1::Divide> (m.get_match_root());
-        if (!div) {
+        // We can not apply this transformation in case with integer input data type
+        if (!div || div->input(0).get_element_type().is_integral()) {
             return false;
         }
 
@@ -0,0 +1,294 @@
+// Copyright (C) 2018-2020 Intel Corporation
+// SPDX-License-Identifier: Apache-2.0
+//
+
+#include "transformations/convert_opset1_to_legacy/convert_prior_to_ie_prior.hpp"
+
+#include <memory>
+#include <vector>
+
+#include <ngraph/opsets/opset3.hpp>
+#include <ngraph/opsets/opset1.hpp>
+
+#include <ngraph_ops/prior_box_ie.hpp>
+#include <ngraph_ops/prior_box_clustered_ie.hpp>
+#include <ngraph/rt_info.hpp>
+
+void ngraph::pass::ConvertPriorBox::convert_prior_box() {
+    auto data = std::make_shared<pattern::op::Label>(element::i64, Shape{1, 1, 1, 1});
+    auto axes = ngraph::opset1::Constant::create(element::i64, Shape{1}, {0});
+    auto image = std::make_shared<pattern::op::Label>(element::i64, Shape{1, 1, 1, 1});
+
+    ngraph::op::PriorBoxAttrs attr;
+    attr.min_size = {162.0f};
+    attr.max_size = {213.0f};
+    attr.aspect_ratio = {2.0f, 3.0f};
+    attr.variance = {0.1f, 0.1f, 0.2f, 0.2f};
+    attr.step = 64.0f;
+    attr.offset = 0.5f;
+    attr.clip = 0;
+    attr.flip = 1;
+    attr.scale_all_sizes = true;
+
+    auto prior_box = std::make_shared<ngraph::opset1::PriorBox>(data, image, attr);
+    auto unsqueeze = std::make_shared<ngraph::opset1::Unsqueeze> (prior_box, axes);
+
+    ngraph::graph_rewrite_callback callback = [](pattern::Matcher& m) {
+        auto unsqueeze = std::dynamic_pointer_cast<ngraph::opset1::Unsqueeze> (m.get_match_root());
+        if (!unsqueeze) {
+            return false;
+        }
+        auto prior_box_node = std::dynamic_pointer_cast<ngraph::opset1::PriorBox> (unsqueeze->input_value(0).get_node_shared_ptr());
+
+        if (!prior_box_node) {
+            return false;
+        }
+
+        // vector of nGraph nodes that will be replaced
+        ngraph::NodeVector ops_to_replace{unsqueeze, prior_box_node};
+
+        std::shared_ptr<Node> input_1(prior_box_node->input_value(0).get_node_shared_ptr());
+        std::shared_ptr<Node> input_2(prior_box_node->input_value(1).get_node_shared_ptr());
+
+        auto convert1 = std::dynamic_pointer_cast<ngraph::opset1::Convert> (input_1);
+        auto convert2 = std::dynamic_pointer_cast<ngraph::opset1::Convert> (input_2);
+
+        if (convert1 && convert2) {
+            ops_to_replace.push_back(convert1);
+            ops_to_replace.push_back(convert2);
+            input_1 = convert1->input_value(0).get_node_shared_ptr();
+            input_2 = convert2->input_value(0).get_node_shared_ptr();
+        }
+
+        auto strided_slice1 = std::dynamic_pointer_cast<ngraph::opset1::StridedSlice> (input_1);
+        auto strided_slice2 = std::dynamic_pointer_cast<ngraph::opset1::StridedSlice> (input_2);
+
+        if (!strided_slice1 || !strided_slice2) {
+            return false;
+        }
+
+        ops_to_replace.push_back(strided_slice1);
+        ops_to_replace.push_back(strided_slice2);
+
+        // Check that StridedSlice1 cuts H,W dims for PriorBox
+        auto begin = std::dynamic_pointer_cast<ngraph::opset1::Constant> (strided_slice1->input_value(1).get_node_shared_ptr());
+        auto end = std::dynamic_pointer_cast<ngraph::opset1::Constant> (strided_slice1->input_value(2).get_node_shared_ptr());
+        auto stride = std::dynamic_pointer_cast<ngraph::opset1::Constant> (strided_slice1->input_value(3).get_node_shared_ptr());
+
+        if (!begin || !end || !stride) {
+            return false;
+        }
+
+        auto begin_val = begin->get_vector<int64_t>();
+        auto end_val = end->get_vector<int64_t>();
+        auto stride_val = stride->get_vector<int64_t>();
+
+        if (begin_val.size() != 1 && begin_val[0] != 2) {
+            return false;
+        }
+
+        if (end_val.size() != 1 && end_val[0] != 4) {
+            return false;
+        }
+
+        if (stride_val.size() != 1 && stride_val[0] != 1) {
+            return false;
+        }
+
+        // TODO: should we check second StridedSlice?
+        input_1 = strided_slice1->input_value(0).get_node_shared_ptr();
+        input_2 = strided_slice2->input_value(0).get_node_shared_ptr();
+
+        convert1 = std::dynamic_pointer_cast<ngraph::opset1::Convert> (input_1);
+        convert2 = std::dynamic_pointer_cast<ngraph::opset1::Convert> (input_2);
+
+        if (convert1 && convert2) {
+            ops_to_replace.push_back(convert1);
+            ops_to_replace.push_back(convert2);
+            input_1 = convert1->input_value(0).get_node_shared_ptr();
+            input_2 = convert2->input_value(0).get_node_shared_ptr();
+        }
+
+        // the input can be either ShapeOf-1 or ShapeOf-3
+        std::shared_ptr<ngraph::op::Op> shape_of1 = std::dynamic_pointer_cast<ngraph::opset1::ShapeOf> (input_1);
+        std::shared_ptr<ngraph::op::Op> shape_of2 = std::dynamic_pointer_cast<ngraph::opset1::ShapeOf> (input_2);
+
+        if (!shape_of1 || !shape_of2) {
+            shape_of1 = std::dynamic_pointer_cast<ngraph::opset3::ShapeOf>(input_1);
+            shape_of2 = std::dynamic_pointer_cast<ngraph::opset3::ShapeOf>(input_2);
+        }
+        if (!shape_of1 || !shape_of2) {
+            return false;
+        }
+        // keep this code for a while if will decide to run this transformation again in the opset1->legacy
+        // the input can be either ShapeOf or Convert(ShapeOf)
+        // if (!shape_of1 || !shape_of2) {
+        //     auto shapeof1_convert = std::dynamic_pointer_cast<ngraph::opset1::Convert> (input_1);
+        //     auto shapeof2_convert = std::dynamic_pointer_cast<ngraph::opset1::Convert> (input_2);
+        //     if (!shapeof1_convert || !shapeof2_convert)
+        //         return false;
+        //     shape_of1 = std::dynamic_pointer_cast<ngraph::opset1::ShapeOf>(shapeof1_convert->input_value(0).get_node_shared_ptr());
+        //     shape_of2 = std::dynamic_pointer_cast<ngraph::opset1::ShapeOf>(shapeof2_convert->input_value(0).get_node_shared_ptr());
+        //     if (!shape_of1 || !shape_of2)
+        //         return false;
+        //     ops_to_replace.push_back(shapeof1_convert);
+        //     ops_to_replace.push_back(shapeof2_convert);
+        // }
+
+        ops_to_replace.push_back(shape_of1);
+        ops_to_replace.push_back(shape_of2);
+
+        auto prior_box_ie = std::make_shared<ngraph::op::PriorBoxIE> (shape_of1->input_value(0),
+                                                                      shape_of2->input_value(0),
+                                                                      prior_box_node->get_attrs());
+
+        prior_box_ie->set_friendly_name(unsqueeze->get_friendly_name());
+
+        // Nodes in copy runtime info function should be in topological order
+        std::reverse(ops_to_replace.begin(), ops_to_replace.end());
+        ngraph::copy_runtime_info(ops_to_replace, prior_box_ie);
+        ngraph::replace_node(m.get_match_root(), prior_box_ie);
+        return true;
+    };
+
+    auto m = std::make_shared<ngraph::pattern::Matcher>(unsqueeze, "CPUFusion.ConvertPriorBoxToPriorBoxIE");
+    this->add_matcher(m, callback, PassProperty::CHANGE_DYNAMIC_STATE);
+}
+
+void ngraph::pass::ConvertPriorBox::convert_prior_box_clustered() {
+    auto data = std::make_shared<pattern::op::Label>(element::i64, Shape{1, 1, 1, 1});
+    auto axes = ngraph::opset1::Constant::create(element::i64, Shape{1}, {0});
+    auto image = std::make_shared<pattern::op::Label>(element::i64, Shape{1, 1, 1, 1});
+
+    ngraph::op::PriorBoxClusteredAttrs attr;
+    attr.widths = {0.1f, 0.1f, 0.2f, 0.2f};
+    attr.heights = {0.1f, 0.1f, 0.2f, 0.2f};
+    attr.variances = {0.1f, 0.1f, 0.2f, 0.2f};
+    attr.step_widths = 64.0f;
+    attr.step_heights = 64.0f;
+    attr.offset = 0.5f;
+    attr.clip = false;
+
+    auto prior_box = std::make_shared<ngraph::opset1::PriorBoxClustered>(data, image, attr);
+    auto unsqueeze = std::make_shared<ngraph::opset1::Unsqueeze> (prior_box, axes);
+
+    ngraph::graph_rewrite_callback callback = [](pattern::Matcher& m) {
+        auto unsqueeze = std::dynamic_pointer_cast<ngraph::opset1::Unsqueeze> (m.get_match_root());
+        if (!unsqueeze) {
+            return false;
+        }
+        auto prior_box_node = std::dynamic_pointer_cast<ngraph::opset1::PriorBoxClustered> (unsqueeze->get_argument(0));
+
+        if (!prior_box_node) {
+            return false;
+        }
+
+        // vector of nGraph nodes that will be replaced
+        ngraph::NodeVector ops_to_replace{unsqueeze, prior_box_node};
+
+        std::shared_ptr<Node> input_1(prior_box_node->input_value(0).get_node_shared_ptr());
+        std::shared_ptr<Node> input_2(prior_box_node->input_value(1).get_node_shared_ptr());
+
+        auto convert1 = std::dynamic_pointer_cast<ngraph::opset1::Convert> (input_1);
+        auto convert2 = std::dynamic_pointer_cast<ngraph::opset1::Convert> (input_2);
+
+        if (convert1 && convert2) {
+            ops_to_replace.push_back(convert1);
+            ops_to_replace.push_back(convert2);
+            input_1 = convert1->input_value(0).get_node_shared_ptr();
+            input_2 = convert2->input_value(0).get_node_shared_ptr();
+        }
+
+        auto strided_slice1 = std::dynamic_pointer_cast<ngraph::opset1::StridedSlice> (input_1);
+        auto strided_slice2 = std::dynamic_pointer_cast<ngraph::opset1::StridedSlice> (input_2);
+
+        if (!strided_slice1 || !strided_slice2) {
+            return false;
+        }
+
+        ops_to_replace.push_back(strided_slice1);
+        ops_to_replace.push_back(strided_slice2);
+
+        // Check that StridedSlice1 cuts H,W dims for PriorBox
+        auto begin = std::dynamic_pointer_cast<ngraph::opset1::Constant> (strided_slice1->get_argument(1));
+        auto end = std::dynamic_pointer_cast<ngraph::opset1::Constant> (strided_slice1->get_argument(2));
+        auto stride = std::dynamic_pointer_cast<ngraph::opset1::Constant> (strided_slice1->get_argument(3));
+
+        if (!begin || !end || !stride) {
+            return false;
+        }
+
+        auto begin_val = begin->get_vector<int64_t>();
+        auto end_val = end->get_vector<int64_t>();
+        auto stride_val = stride->get_vector<int64_t>();
+
+        if (begin_val.size() != 1 && begin_val[0] != 2) {
+            return false;
+        }
+
+        if (end_val.size() != 1 && end_val[0] != 4) {
+            return false;
+        }
+
+        if (stride_val.size() != 1 && stride_val[0] != 1) {
+            return false;
+        }
+
+        // TODO: should we check second StridedSlice?
+        input_1 = strided_slice1->input_value(0).get_node_shared_ptr();
+        input_2 = strided_slice2->input_value(0).get_node_shared_ptr();
+
+        convert1 = std::dynamic_pointer_cast<ngraph::opset1::Convert> (input_1);
+        convert2 = std::dynamic_pointer_cast<ngraph::opset1::Convert> (input_2);
+
+        if (convert1 && convert2) {
+            ops_to_replace.push_back(convert1);
+            ops_to_replace.push_back(convert2);
+            input_1 = convert1->input_value(0).get_node_shared_ptr();
+            input_2 = convert2->input_value(0).get_node_shared_ptr();
+        }
+
+        // the input can be either ShapeOf-1 or ShapeOf-3
+        std::shared_ptr<ngraph::op::Op> shape_of1 = std::dynamic_pointer_cast<ngraph::opset1::ShapeOf> (input_1);
+        std::shared_ptr<ngraph::op::Op> shape_of2 = std::dynamic_pointer_cast<ngraph::opset1::ShapeOf> (input_2);
+
+        if (!shape_of1 || !shape_of2) {
+            shape_of1 = std::dynamic_pointer_cast<ngraph::opset3::ShapeOf>(input_1);
+            shape_of2 = std::dynamic_pointer_cast<ngraph::opset3::ShapeOf>(input_2);
+        }
+        if (!shape_of1 || !shape_of2) {
+            return false;
+        }
+        // keep this code for a while if will decide to run this transformation again in the opset1->legacy
+        // the input can be either ShapeOf or Convert(ShapeOf)
+        // if (!shape_of1 || !shape_of2) {
+        //     auto shapeof1_convert = std::dynamic_pointer_cast<ngraph::opset1::Convert> (input_1);
+        //     auto shapeof2_convert = std::dynamic_pointer_cast<ngraph::opset1::Convert> (input_2);
+        //     if (!shapeof1_convert || !shapeof2_convert)
+        //         return false;
+        //     shape_of1 = std::dynamic_pointer_cast<ngraph::opset1::ShapeOf>(shapeof1_convert->input_value(0).get_node_shared_ptr());
+        //     shape_of2 = std::dynamic_pointer_cast<ngraph::opset1::ShapeOf>(shapeof2_convert->input_value(0).get_node_shared_ptr());
+        //     if (!shape_of1 || !shape_of2)
+        //         return false;
+        //     ops_to_replace.push_back(shapeof1_convert);
+        //     ops_to_replace.push_back(shapeof2_convert);
+        // }
+
+        ops_to_replace.push_back(shape_of1);
+        ops_to_replace.push_back(shape_of2);
+
+        auto prior_box_ie = std::make_shared<ngraph::op::PriorBoxClusteredIE> (shape_of1->get_argument(0),
+                                                                               shape_of2->get_argument(0),
+                                                                               prior_box_node->get_attrs());
+        prior_box_ie->set_friendly_name(unsqueeze->get_friendly_name());
+
+        // Nodes in copy runtime info function should be in topological order
+        std::reverse(ops_to_replace.begin(), ops_to_replace.end());
+        ngraph::copy_runtime_info(ops_to_replace, prior_box_ie);
+        ngraph::replace_node(unsqueeze, prior_box_ie);
+        return true;
+    };
+
+    auto m = std::make_shared<ngraph::pattern::Matcher>(unsqueeze, "CPUFusion.ConvertPriorBoxClusteredToPriorBoxClusteredIE");
+    this->add_matcher(m, callback, PassProperty::CHANGE_DYNAMIC_STATE);
+}
@@ -41,10 +41,6 @@ void ngraph::pass::ConvertStridedSliceToCrop::convert_strided_slice_to_crop() {
 
         auto input_shape = slice->get_input_shape(0);
         auto output_shape = slice->get_output_shape(0);
-        // MKLDNN: "Crop supports only 2d, 4d and 5d blobs."
-        if (input_shape.size() != 2 && input_shape.size() != 4 && input_shape.size() != 5) {
-            return false;
-        }
 
         auto begin = begin_node->cast_vector<int64_t>();
         auto end = end_node->cast_vector<int64_t>();
@@ -201,6 +197,12 @@ void ngraph::pass::ConvertStridedSliceToCrop::convert_strided_slice_to_crop() {
             new_ops.push_back(data_node);
         }
 
+        auto data_node_shape = data_node->get_output_shape(0);
+        // MKLDNN: "Crop supports only 2d, 4d and 5d blobs."
+        if (data_node_shape.size() != 2 && data_node_shape.size() != 4 && data_node_shape.size() != 5) {
+            return false;
+        }
+
         // Crop
         data_node = std::make_shared<ngraph::op::CropIE> (data_node, axes, dim, offset);
         data_node->set_friendly_name(slice->get_friendly_name());
@@ -42,22 +42,37 @@ void ngraph::pass::ConvertTopKToTopKIE::convert_topk_to_topk_ie() {
                                                             topk->get_sort_type());
         new_ops.push_back(topk_ie);
 
+        Output<Node> element_output;
         Output<Node> index_output;
-        // insert Convert if index element type not equal to i32
-        if (topk->get_index_element_type() == element::i32) {
+        // insert Convert if index element type not equal to i32 and output #1 of TopK has consumers
+        if (topk->get_index_element_type() == element::i32 || topk->get_output_target_inputs(1).size() == 0) {
+            element_output = topk_ie->output(0);
             index_output = topk_ie->output(1);
-        } else {
+            topk_ie->set_friendly_name(topk->get_friendly_name());
+        } else if (topk->get_output_target_inputs(0).size() == 0) {
             index_output = std::make_shared<opset1::Convert>(topk_ie->output(1), topk->get_index_element_type());
             new_ops.push_back(index_output.get_node_shared_ptr());
+
+            // workaround for naming output #1 of TopK
+            index_output.get_node_shared_ptr()->set_friendly_name(topk->get_friendly_name() + ".1");
+        } else {
+            // create fake convert for 0 output, it is a workaround in purpose of correct output names preserving
+            element_output = std::make_shared<opset1::Convert>(topk_ie->output(0), topk->get_output_element_type(0));
+            index_output = std::make_shared<opset1::Convert>(topk_ie->output(1), topk->get_index_element_type());
+            new_ops.push_back(element_output.get_node_shared_ptr());
+            new_ops.push_back(index_output.get_node_shared_ptr());
+
+            // workaround for naming two outputs of TopK
+            element_output.get_node_shared_ptr()->set_friendly_name(topk->get_friendly_name() + ".0");
+            index_output.get_node_shared_ptr()->set_friendly_name(topk->get_friendly_name() + ".1");
         }
 
-        topk_ie->set_friendly_name(topk->get_friendly_name());
         ngraph::copy_runtime_info(topk, new_ops);
-        topk->output(0).replace(topk_ie->output(0));
+        topk->output(0).replace(element_output);
         topk->output(1).replace(index_output);
         return true;
     };
 
     auto m = std::make_shared<ngraph::pattern::Matcher>(topk, "ConvertTopKToTopKIE");
     this->add_matcher(m, callback, PassProperty::CHANGE_DYNAMIC_STATE);
 }
@@ -20,24 +20,40 @@ void ngraph::pass::ConvertTopK3::convert_topk3() {
         if (!topk) {
             return false;
         }
-        Output<Node> last;
+        Output<Node> last0;
+        Output<Node> last1;
         ngraph::NodeVector new_ops;
 
         auto new_topk = std::make_shared<ngraph::opset2::TopK>(topk->input_value(0), topk->input_value(1),
                                                                topk->get_axis(), topk->get_mode(), topk->get_sort_type(), element::i32);
         new_ops.push_back(new_topk);
-        // if the output is the i32 then it matches behavior of the v1::TopK otherwise need to insert Convert
-        if (topk->get_index_element_type() == element::i32) {
-            last = new_topk->output(1);
+        // if the output is the i32 or output #1 has no consumers
+        // then it matches behavior of the v1::TopK otherwise need to insert Convert
+        if (topk->get_index_element_type() == element::i32 || topk->get_output_target_inputs(1).size() == 0) {
+            last0 = new_topk->output(0);
+            last1 = new_topk->output(1);
+            new_topk->set_friendly_name(topk->get_friendly_name());
+        } else if (topk->get_output_target_inputs(0).size() == 0) {
+            last1 = std::make_shared<ngraph::opset2::Convert>(new_topk->output(1), topk->get_index_element_type());
+            new_ops.push_back(last1.get_node_shared_ptr());
+
+            // workaround for naming two outputs of TopK
+            last1.get_node_shared_ptr()->set_friendly_name(topk->get_friendly_name() + ".1");
         } else {
-            last = std::make_shared<ngraph::opset2::Convert>(new_topk->output(1), topk->get_index_element_type());
-            new_ops.push_back(last.get_node_shared_ptr());
+            // create fake convert for 0 output, it is a workaround in purpose of correct output names preserving
+            last0 = std::make_shared<ngraph::opset2::Convert>(new_topk->output(0), topk->get_output_element_type(0));
+            last1 = std::make_shared<ngraph::opset2::Convert>(new_topk->output(1), topk->get_index_element_type());
+            new_ops.push_back(last0.get_node_shared_ptr());
+            new_ops.push_back(last1.get_node_shared_ptr());
+
+            // workaround for naming two outputs of TopK
+            last0.get_node_shared_ptr()->set_friendly_name(topk->get_friendly_name() + ".0");
+            last1.get_node_shared_ptr()->set_friendly_name(topk->get_friendly_name() + ".1");
         }
 
-        new_topk->set_friendly_name(topk->get_friendly_name());
         ngraph::copy_runtime_info(topk, new_ops);
-        topk->output(0).replace(new_topk->output(0));
-        topk->output(1).replace(last);
+        topk->output(0).replace(last0);
+        topk->output(1).replace(last1);
         return true;
     };
 
```diff
@@ -30,7 +30,7 @@ bool check_block_first(const ngraph::Shape& shape_input, const ngraph::Shape& sh
     is_transformation_valid &= (expected_shape == shape_reshape_before);

     // x'' = transpose(x', [0, K + 1, K + 2, 1, K + 3, 2, K + 4, 3, ..., K + (K + 1), K])
-    ngraph::AxisVector expected_permutation = {0, spatial_dims + 1};
+    ngraph::AxisVector expected_permutation = {0, static_cast<size_t>(spatial_dims + 1)};
     for (uint64_t i = 2; i < shape_input.size(); ++i) {
         expected_permutation.push_back(spatial_dims + i);
         expected_permutation.push_back(i - 1);
@@ -38,7 +38,7 @@ bool check_block_first(const ngraph::Shape& shape_input, const ngraph::Shape& sh
     is_transformation_valid &= (expected_permutation == permutation);

     // y = reshape(x'', [N, C / (block_size ^ K), D1 * block_size, D2 * block_size, D3 * block_size, ..., DK * block_size])
-    expected_shape = {shape_input[0], c_dim};
+    expected_shape = {shape_input[0], static_cast<size_t>(c_dim)};
     for (uint64_t i = 2; i < shape_input.size(); ++i)
         expected_shape.push_back(shape_input[i] * possible_block_size);
     is_transformation_valid &= (expected_shape == shape_reshape_after);
@@ -57,7 +57,7 @@ bool check_depth_first(const ngraph::Shape& shape_input, const ngraph::Shape& sh
     uint64_t c_dim = shape_input[1] / std::pow(possible_block_size, spatial_dims);

     // x' = reshape(data, [N, C / (block_size ^ K), block_size, block_size, ..., block_size, D1, D2, ..., DK])
-    ngraph::Shape expected_shape = {shape_input[0], c_dim};
+    ngraph::Shape expected_shape = {shape_input[0], static_cast<size_t>(c_dim)};
     for (uint64_t i = 0; i < spatial_dims; ++i)
         expected_shape.push_back(possible_block_size);
     for (uint64_t i = 2; i < shape_input.size(); ++i)
@@ -73,7 +73,7 @@ bool check_depth_first(const ngraph::Shape& shape_input, const ngraph::Shape& sh
     is_transformation_valid &= (expected_permutation == permutation);

     // y = reshape(x'', [N, C / (block_size ^ K), D1 * block_size, D2 * block_size, D3 * block_size, ..., DK * block_size])
-    expected_shape = {shape_input[0], c_dim};
+    expected_shape = {shape_input[0], static_cast<size_t>(c_dim)};
     for (uint64_t i = 2; i < shape_input.size(); ++i)
         expected_shape.push_back(shape_input[i] * possible_block_size);
     is_transformation_valid &= (expected_shape == shape_reshape_after);
```
```diff
@@ -26,7 +26,7 @@ namespace vpu {

 template <typename T>
 Optional<int> parseNumber(const std::string& s) {
-    T value;
+    auto value = T{};
     if ((std::istringstream(s) >> value >> std::ws).eof()) {
         return {value};
     }
```
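The `T value;` to `auto value = T{};` change above value-initializes the parse target, so a failed extraction can never hand back an uninitialized value. A standalone sketch of the same parsing idiom, with `std::optional` standing in for the project's `vpu::Optional` (an assumption, not the real type):

```cpp
#include <cassert>
#include <optional>
#include <sstream>
#include <string>

// Sketch of the parseNumber idiom from the diff above; std::optional stands
// in for vpu::Optional. T{} value-initializes, so a failed parse never
// reads an uninitialized value.
template <typename T>
std::optional<T> parseNumber(const std::string& s) {
    auto value = T{};
    // Succeed only if the whole string is consumed (trailing spaces allowed).
    if ((std::istringstream(s) >> value >> std::ws).eof()) {
        return value;
    }
    return std::nullopt;
}
```

The `>> std::ws` followed by the `eof()` check is what rejects strings with trailing garbage such as `"4x"`.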
```diff
@@ -39,7 +39,7 @@ void dynamicToStaticShapeBinaryEltwise(std::shared_ptr<ngraph::Node> eltwise) {
     const auto diff = std::abs(lhsRank.get_length() - rhsRank.get_length());
     if (diff) {
         auto & broadcastInput = lhsRank.get_length() < rhsRank.get_length() ? lhsInput : rhsInput;
-        const auto broadcastConst = ngraph::opset3::Constant::create(broadcastInput.get_element_type(), {static_cast<uint64_t>(diff)}, {1});
+        const auto broadcastConst = ngraph::opset3::Constant::create(broadcastInput.get_element_type(), {static_cast<size_t>(diff)}, {1});
         broadcastInput = std::make_shared<ngraph::opset3::Concat>(ngraph::OutputVector{broadcastConst, broadcastInput}, 0);
     }

```
```diff
@@ -392,8 +392,17 @@ inline Stage ModelObj::addNewStage(
 // runAllocator
 //

+VPU_DECLARE_ENUM(EnableShapeAllocation,
+    YES,
+    NO)
+
+VPU_DECLARE_ENUM(CheckOnlyCMX,
+    YES,
+    NO)
+
 AllocationResult runAllocator(
         const Model& model,
-        bool onlyCheckCMX = false);
+        EnableShapeAllocation = EnableShapeAllocation::NO,
+        CheckOnlyCMX = CheckOnlyCMX::NO);

 }  // namespace vpu
```
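The declaration change above swaps an opaque boolean flag for two dedicated enums, so call sites read `runAllocator(model, EnableShapeAllocation::NO, CheckOnlyCMX::YES)` instead of `runAllocator(model, true)`. A minimal self-contained sketch of the pattern; plain `enum class` stands in for the project's `VPU_DECLARE_ENUM` macro, and the function body is a hypothetical stand-in, not the real allocator:

```cpp
#include <cassert>

// Plain enum classes stand in for VPU_DECLARE_ENUM here.
enum class EnableShapeAllocation { YES, NO };
enum class CheckOnlyCMX { YES, NO };

// Hypothetical stand-in for the real allocator entry point: counts which
// phases would run, mirroring how the enums gate work in runAllocator.
int runAllocatorSketch(EnableShapeAllocation shapes = EnableShapeAllocation::NO,
                       CheckOnlyCMX cmxOnly = CheckOnlyCMX::NO) {
    int phases = 0;
    if (cmxOnly == CheckOnlyCMX::NO) {
        ++phases;  // also allocate Const/Input/Output data and temp buffers
    }
    if (shapes == EnableShapeAllocation::YES) {
        ++phases;  // also allocate shape locations for all datas
    }
    return phases;
}
```

Unlike two bool parameters, the two enum types cannot be accidentally swapped at a call site; the compiler rejects the mismatch.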
```diff
@@ -84,9 +84,11 @@ void BackEnd::getMetaData(
         stageMeta.layerName = "<Extra>";
         stageMeta.layerType = "<Extra>";
     } else {
-        stageMeta.layerName = stage->origLayer()->name;
-        stageMeta.layerType = stage->origLayer()->type;
-        visitedLayers.insert(stage->origLayer());
+        const auto& origLayer = stage->origLayer();
+        stageMeta.layerName = origLayer->params.count("originalLayersNames") ? origLayer->params["originalLayersNames"] :
+                                                                               origLayer->name;
+        stageMeta.layerType = origLayer->type;
+        visitedLayers.insert(origLayer);
     }

     return stageMeta;
```
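The metadata change above prefers a fused layer's `originalLayersNames` param over its own name when reporting per-stage metadata. A reduced sketch of that lookup over a plain `std::map` (the helper name is hypothetical):

```cpp
#include <cassert>
#include <map>
#include <string>

// Reduced sketch of the getMetaData naming change above: when several
// original layers were fused into one, the "originalLayersNames" param
// carries their names; otherwise fall back to the layer's own name.
std::string reportedLayerName(const std::map<std::string, std::string>& params,
                              const std::string& ownName) {
    const auto it = params.find("originalLayersNames");
    return it != params.end() ? it->second : ownName;
}
```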
```diff
@@ -184,9 +184,9 @@ CustomLayer::CustomLayer(std::string configDir, const pugi::xml_node& customLaye
         stageOrder.emplace(stageNum, CustomKernel{kernel, _configDir});
     }

-    VPU_THROW_UNLESS(stageOrder.begin()->first == 0,
+    VPU_THROW_UNLESS(!stageOrder.empty() && stageOrder.begin()->first == 0,
         "Error while binding %s custom layer: Stage 0 is not found.", _layerName);
-    VPU_THROW_UNLESS(stageOrder.rbegin()->first == stageOrder.size() - 1,
+    VPU_THROW_UNLESS(!stageOrder.empty() && stageOrder.rbegin()->first == stageOrder.size() - 1,
         "Error while binding %s custom layer: Kernels should have stage id from 0 to N.", _layerName);

     for (auto& stage : stageOrder) {
```
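The two `VPU_THROW_UNLESS` guards above now check `!stageOrder.empty()` first; without it, `begin()` and `rbegin()` would be dereferenced on an empty map. Taken together the conditions require contiguous stage ids `0..N-1`, as this sketch (with illustrative names) shows:

```cpp
#include <cassert>
#include <map>
#include <string>

// The combined stage-order check from the diff above: a map keyed by stage
// number is valid only if it is non-empty and, since std::map keys are
// sorted and unique, a first key of 0 plus a last key of size()-1 implies
// the ids run contiguously from 0 to N-1.
bool stagesAreContiguous(const std::map<int, std::string>& stageOrder) {
    return !stageOrder.empty()
        && stageOrder.begin()->first == 0
        && stageOrder.rbegin()->first == static_cast<int>(stageOrder.size()) - 1;
}
```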
```diff
@@ -430,6 +430,19 @@ bool checkHWRestrictions(
         int kernelSizeX, int kernelSizeY,
         int kernelStride,
         HwOpMode mode, HwOpType type) {
+    // Workaround for HW ops failure if too wide input:
+    // Looks like HW operations (primarily Pooling) can
+    // use only part of available CMX, up to 1014 * 128
+    // bits (i.e. 1014 * 16 bytes)
+    // Provided HwOpMode is 16x16, this means HW needs
+    // to read up to 16 lines of input tensor, so each
+    // line mustn't exceed 1014 bytes or 507 pixels if
+    // precision is FP16
+    // More details available with the ticket #-33366
+    if (inTileWidth > 507) {
+        return false;
+    }
+
     const int chansPerBlock = 1 << static_cast<int>(mode);
     int noOfBlocks = divUp(inTileChannels, chansPerBlock);

```
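The 507-pixel cutoff added above follows directly from the numbers in the comment, and the arithmetic can be checked at compile time (the constant names here are illustrative, not from the source):

```cpp
#include <cassert>

// Derivation of the 507-pixel limit in checkHWRestrictions: HW ops can use
// up to 1014 * 128 bits of CMX (i.e. 1014 * 16 bytes); a 16x16 HwOpMode
// reads up to 16 input lines at once, so one line may occupy at most
// 1014 bytes, i.e. 507 FP16 (2-byte) pixels. Constant names are illustrative.
constexpr int kCmxBytesPerOp  = 1014 * 16;
constexpr int kLinesPerOp     = 16;
constexpr int kBytesPerPixel  = 2;  // FP16
constexpr int kMaxInTileWidth = kCmxBytesPerOp / kLinesPerOp / kBytesPerPixel;

static_assert(kMaxInTileWidth == 507, "matches the `inTileWidth > 507` check");

bool fitsHwWidthLimit(int inTileWidth) {
    return inTileWidth <= kMaxInTileWidth;
}
```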
```diff
@@ -193,10 +193,10 @@ void PassImpl::wrapInLoop(const Model& model, const StageList& subgraph) {
             loopEndOutputs.push_back(originalOutput);
             const auto rule = IterationRule{Dim::N, 0, 1, -1};
             endIterationComponents.emplace(std::make_pair(loopEndOutputs.size() - 1, rule), loopEndInputs.size() - 1);
-        } else {
-            for (const auto& consumerEdge : originalOutput->consumerEdges()) {
+        }
+        for (const auto& consumerEdge : originalOutput->consumerEdges()) {
+            if (subgraph.has(consumerEdge->consumer()))
                 model->replaceStageInput(consumerEdge, output);
-            }
         }
     }
 }
```
```diff
@@ -458,7 +458,7 @@ void PassImpl::packDataInCmx(const Model& model) {
         return DataLoopStatus::NextChild;
     });

-    auto allocRes = runAllocator(model, true);
+    auto allocRes = runAllocator(model, EnableShapeAllocation::NO, CheckOnlyCMX::YES);
    env.log->trace("Allocation result : %v", allocRes.status);

     if (allocRes.status != AllocationStatus::OK) {
```
```diff
@@ -25,7 +25,7 @@ namespace vpu {
 // runAllocator
 //

-AllocationResult runAllocator(const Model& model, bool onlyCheckCMX) {
+AllocationResult runAllocator(const Model& model, EnableShapeAllocation enableShapeAllocation, CheckOnlyCMX checkOnlyCmx) {
     VPU_PROFILE(runAllocator);

     auto& allocator = model->getAllocator();
@@ -40,7 +40,7 @@ AllocationResult runAllocator(const Model& model, bool onlyCheckCMX) {
     // Allocate Const/Input/Output datas.
     //

-    if (!onlyCheckCMX) {
+    if (checkOnlyCmx == CheckOnlyCMX::NO) {
         auto result = allocator.preprocess(model);
         if (result.status != vpu::AllocationStatus::OK) {
             return result;
@@ -86,14 +86,14 @@ AllocationResult runAllocator(const Model& model, bool onlyCheckCMX) {
     // Allocate stage outputs.
     //

-    const auto allocateStageOutputs = [onlyCheckCMX, &allocator](const Stage& stage) -> AllocationResult {
+    const auto allocateStageOutputs = [checkOnlyCmx, &allocator](const Stage& stage) -> AllocationResult {
         for (const auto& output : stage->outputs()) {
-            if (onlyCheckCMX && output->memReqs() != MemoryType::CMX) {
+            if (checkOnlyCmx == CheckOnlyCMX::YES && output->memReqs() != MemoryType::CMX) {
                 continue;
             }

             if (!allocator.allocateData(output)) {
-                if (output->memReqs() == MemoryType::CMX && !onlyCheckCMX) {
+                if (output->memReqs() == MemoryType::CMX && checkOnlyCmx == CheckOnlyCMX::NO) {
                     if (allocator.removeCMXCandidates(output)) {
                         if (allocator.allocateData(output)) {
                             continue;
@@ -123,7 +123,7 @@ AllocationResult runAllocator(const Model& model, bool onlyCheckCMX) {
         // Allocate stage temporary buffers.
         //

-        if (!onlyCheckCMX) {
+        if (checkOnlyCmx == CheckOnlyCMX::NO) {
             for (const auto& tempBufferEdge : stage->tempBufferEdges()) {
                 if (!allocator.allocateData(tempBufferEdge->tempBuffer())) {
                     allocator.setNeedToAllocNonIntermData();
@@ -157,7 +157,7 @@ AllocationResult runAllocator(const Model& model, bool onlyCheckCMX) {
         //

         for (const auto& input : stage->inputs()) {
-            if (onlyCheckCMX && input->memReqs() != MemoryType::CMX) {
+            if (checkOnlyCmx == CheckOnlyCMX::YES && input->memReqs() != MemoryType::CMX) {
                 continue;
             }

@@ -168,7 +168,7 @@ AllocationResult runAllocator(const Model& model, bool onlyCheckCMX) {
         // Release stage temporary buffers.
         //

-        if (!onlyCheckCMX) {
+        if (checkOnlyCmx == CheckOnlyCMX::NO) {
             for (const auto& tempBufferEdge : stage->tempBufferEdges()) {
                 allocator.freeData(tempBufferEdge->tempBuffer());
             }
@@ -195,7 +195,7 @@ AllocationResult runAllocator(const Model& model, bool onlyCheckCMX) {

         if (const auto& parentEdge = data->parentDataToShapeEdge()) {
             const auto& parent = parentEdge->parent();
-            if (parent->usage() == DataUsage::Intermediate && (!onlyCheckCMX || parent->memReqs() == MemoryType::CMX)) {
+            if (parent->usage() == DataUsage::Intermediate && (checkOnlyCmx == CheckOnlyCMX::NO || parent->memReqs() == MemoryType::CMX)) {
                 allocator.freeData(parent);
             }
         }
@@ -205,9 +205,11 @@ AllocationResult runAllocator(const Model& model, bool onlyCheckCMX) {
     // Allocate shape for all datas
     //

-    for (auto data : model->datas()) {
-        const auto shapeLocation = allocator.allocateShape(data);
-        data->setShapeAllocationInfo(shapeLocation);
+    if (enableShapeAllocation == EnableShapeAllocation::YES) {
+        for (auto data : model->datas()) {
+            const auto shapeLocation = allocator.allocateShape(data);
+            data->setShapeAllocationInfo(shapeLocation);
+        }
     }

     return AllocationResult();
@@ -233,7 +235,7 @@ void PassImpl::run(const Model& model) {
     // Allocate all resources
     //

-    auto allocRes = runAllocator(model);
+    auto allocRes = runAllocator(model, EnableShapeAllocation::YES);
     IE_ASSERT(allocRes.status == AllocationStatus::OK);

     //
```
```diff
@@ -160,7 +160,7 @@ void PassImpl::run(const Model& model) {
             model->replaceStageInput(consumerEdge, copyOutput);
         }

-        auto allocRes = runAllocator(model, true);
+        auto allocRes = runAllocator(model, EnableShapeAllocation::NO, CheckOnlyCMX::YES);
         if (allocRes.status != AllocationStatus::OK) {
             model->replaceStageOutput(copyProducer->outputEdge(0), copyInput);

@@ -171,7 +171,7 @@ void PassImpl::run(const Model& model) {
                 .childSW(swStage)
                 .done();

-            auto allocRes = runAllocator(model, true);
+            auto allocRes = runAllocator(model, EnableShapeAllocation::NO, CheckOnlyCMX::YES);
             if (allocRes.status == AllocationStatus::OK) {
                 // TODO: try to merge more than one SW stage?
                 break;
```
```diff
@@ -160,7 +160,9 @@ void ParsedConfig::parse(const std::map<std::string, std::string>& config) {
     setOption(_compileConfig.hwExtraSplit, switches, config, VPU_CONFIG_KEY(HW_EXTRA_SPLIT));
     setOption(_compileConfig.injectSwOps, switches, config, VPU_CONFIG_KEY(HW_INJECT_STAGES));
     setOption(_compileConfig.mergeHwPoolToConv, switches, config, VPU_CONFIG_KEY(HW_POOL_CONV_MERGE));
+IE_SUPPRESS_DEPRECATED_START
     setOption(_compileConfig.ignoreIRStatistic, switches, config, VPU_CONFIG_KEY(IGNORE_IR_STATISTIC));
+IE_SUPPRESS_DEPRECATED_END
     setOption(_compileConfig.hwDilation, switches, config, VPU_CONFIG_KEY(HW_DILATION));
     setOption(_compileConfig.forceDeprecatedCnnConversion, switches, config, VPU_CONFIG_KEY(FORCE_DEPRECATED_CNN_CONVERSION));
     setOption(_compileConfig.disableReorder, switches, config, VPU_CONFIG_KEY(DISABLE_REORDER));
```
```diff
@@ -266,6 +266,8 @@ void FrontEnd::parseConcat(
         const ie::CNNLayerPtr& layer,
         const DataVector& inputs,
         const DataVector& outputs) const {
+    VPU_THROW_UNLESS(layer != nullptr, "parseConcat expects valid CNNLayerPtr, actually got nullptr");
+
     VPU_THROW_UNLESS(!inputs.empty(),
         "{} layer with name {} must have no less than 1 input, "
         "actually provided 0 inputs", layer->type, layer->name);
@@ -275,10 +277,8 @@ void FrontEnd::parseConcat(

     auto output = outputs[0];

-    auto concat = std::dynamic_pointer_cast<ie::ConcatLayer>(layer);
-    VPU_THROW_UNLESS(layer != nullptr,
-        "{} layer with name {} must be able to convert to ie::ConcatLayer",
-        layer->type, layer->name);
+    const auto& concat = std::dynamic_pointer_cast<ie::ConcatLayer>(layer);
+    VPU_THROW_UNLESS(concat != nullptr, "{} layer with name {} must be convertable to ie::ConcatLayer", layer->type, layer->name);

     VPU_THROW_UNLESS(concat->_axis < output->desc().numDims(),
         "{} layer with name {} must have axis attribute no grater than number of "
```
```diff
@@ -128,9 +128,8 @@ private:

 void FrontEnd::parseReduce(const Model& model, const ie::CNNLayerPtr& _layer, const DataVector& inputs, const DataVector& outputs) const {
     auto layer = std::dynamic_pointer_cast<ie::ReduceLayer>(_layer);
-    VPU_THROW_UNLESS(layer != nullptr,
-        "Layer {} of type {} is nullptr",
-        layer->name, layer->type);
+    VPU_THROW_UNLESS(layer != nullptr, "parseReduce expects valid ReduceLayer, actually got nullptr");
     VPU_THROW_UNLESS(inputs.size() == 2,
         "Layer {} of type {} expects {} inputs, but provided {}",
         layer->name, layer->type, 2, inputs.size());
```
```diff
@@ -107,6 +107,7 @@ Engine::Engine(std::shared_ptr<IMvnc> mvnc) :

     _pluginName = "MYRIAD";

+IE_SUPPRESS_DEPRECATED_START
     _config = {
         { KEY_VPU_HW_STAGES_OPTIMIZATION, "ON" },
         { KEY_LOG_LEVEL, "LOG_NONE" },
@@ -120,6 +121,7 @@ Engine::Engine(std::shared_ptr<IMvnc> mvnc) :
         { KEY_CONFIG_FILE, "" },
         { KEY_DEVICE_ID, "" },
     };
+IE_SUPPRESS_DEPRECATED_END
 }

 InferenceEngine::ExecutableNetwork Engine::ImportNetwork(
```
```diff
@@ -17,6 +17,7 @@
 #include <ie_core.hpp>
 #include <net_pass.h>

+#include <ngraph/opsets/opset3.hpp>
 #include <ngraph/function.hpp>
 #include <ngraph/variant.hpp>
 #include <ngraph/op/maximum.hpp>
@@ -680,4 +681,25 @@ TEST(CNNNGraphImplTests, TestCheckStats) {
     ASSERT_EQ(nullptr, _stats);
 }

+TEST(CNNNGraphImplTests, CanSetBatchReadValue) {
+    std::shared_ptr<ngraph::Function> ngraph;
+    {
+        auto input = std::make_shared<ngraph::opset3::Parameter>(ngraph::element::f32, ngraph::Shape{1, 2});
+        auto constant = std::make_shared<ngraph::opset3::Constant>(ngraph::element::f32, ngraph::Shape{1, 2},
+                std::vector<float>{1, 2});
+
+        auto read_value = std::make_shared<ngraph::opset3::ReadValue>(constant, "variable_id");
+        auto add = std::make_shared<ngraph::opset3::Add>(input, read_value);
+        auto result = std::make_shared<ngraph::op::Result>(add);
+
+        ngraph::ParameterVector params = {input};
+        ngraph::ResultVector results = {result};
+
+        ngraph = std::make_shared<ngraph::Function>(results, params);
+    }
+
+    InferenceEngine::details::CNNNetworkNGraphImpl cnnNet(ngraph);
+    auto status = cnnNet.getCNNNetwork()->setBatchSize(4, nullptr);
+    EXPECT_EQ(status, StatusCode::OK);
+}
 IE_SUPPRESS_DEPRECATED_END
```
```diff
@@ -60,9 +60,11 @@ protected:
     /* validates a read network with the reference map of CNN layers */
     void compareWithRef(const InferenceEngine::CNNNetwork &network,
                         const std::vector<InferenceEngine::CNNLayerPtr> &refLayersVec) {
+IE_SUPPRESS_DEPRECATED_START
         ASSERT_NO_THROW(FuncTestUtils::compareLayerByLayer<std::vector<InferenceEngine::CNNLayerPtr>>(
             InferenceEngine::details::CNNNetSortTopologically(network),
             refLayersVec, false));
+IE_SUPPRESS_DEPRECATED_END
     }

     const std::string _modelPath = "NetReader_test.xml";
```
```diff
@@ -30,16 +30,6 @@ TEST_F(NGraphReaderTests, ReadPriorBoxClusteredNetwork) {
                 </port>
             </output>
         </layer>
-        <layer id="15" name="in3" type="Parameter" version="opset1">
-            <data element_type="f32" shape="1,2,32400"/>
-            <output>
-                <port id="0" precision="FP32">
-                    <dim>1</dim>
-                    <dim>2</dim>
-                    <dim>32400</dim>
-                </port>
-            </output>
-        </layer>
         <layer id="2" name="shape_of1" type="ShapeOf" version="opset1">
             <input>
                 <port id="0" precision="FP32">
```
```diff
@@ -182,63 +172,19 @@ TEST_F(NGraphReaderTests, ReadPriorBoxClusteredNetwork) {
                 </port>
             </output>
         </layer>
-        <layer name="concat" id="16" type="Concat" version="opset1">
-            <data axis="1"/>
-            <input>
-                <port id="0" precision="FP32">
-                    <dim>1</dim>
-                    <dim>2</dim>
-                    <dim>32400</dim>
-                </port>
-                <port id="1" precision="FP32">
-                    <dim>1</dim>
-                    <dim>2</dim>
-                    <dim>32400</dim>
-                </port>
-            </input>
-            <output>
-                <port id="2" precision="FP32">
-                    <dim>1</dim>
-                    <dim>4</dim>
-                    <dim>32400</dim>
-                </port>
-            </output>
-        </layer>
         <layer id="10" name="output" type="Result" version="opset1">
             <input>
                 <port id="0" precision="FP32">
                     <dim>1</dim>
-                    <dim>4</dim>
+                    <dim>2</dim>
                     <dim>32400</dim>
                 </port>
             </input>
         </layer>
-        <layer id="13" name="output_2" type="Result" version="opset1">
-            <input>
-                <port id="0" precision="FP32">
-                    <dim>1</dim>
-                    <dim>768</dim>
-                    <dim>30</dim>
-                    <dim>30</dim>
-                </port>
-            </input>
-        </layer>
-        <layer id="14" name="output_3" type="Result" version="opset1">
-            <input>
-                <port id="0" precision="FP32">
-                    <dim>1</dim>
-                    <dim>3</dim>
-                    <dim>512</dim>
-                    <dim>512</dim>
-                </port>
-            </input>
-        </layer>
     </layers>
     <edges>
         <edge from-layer="0" from-port="0" to-layer="2" to-port="0"/>
-        <edge from-layer="0" from-port="0" to-layer="13" to-port="0"/>
         <edge from-layer="1" from-port="0" to-layer="6" to-port="0"/>
-        <edge from-layer="1" from-port="0" to-layer="14" to-port="0"/>
         <edge from-layer="2" from-port="1" to-layer="5" to-port="0"/>
         <edge from-layer="6" from-port="1" to-layer="7" to-port="0"/>
         <edge from-layer="3" from-port="1" to-layer="5" to-port="1"/>
```
```diff
@@ -251,90 +197,66 @@ TEST_F(NGraphReaderTests, ReadPriorBoxClusteredNetwork) {
         <edge from-layer="7" from-port="4" to-layer="8" to-port="1"/>
         <edge from-layer="8" from-port="2" to-layer="11" to-port="0"/>
         <edge from-layer="12" from-port="0" to-layer="11" to-port="1"/>
-        <edge from-layer="11" from-port="2" to-layer="16" to-port="1"/>
-        <edge from-layer="16" from-port="2" to-layer="10" to-port="0"/>
-        <edge from-layer="15" from-port="0" to-layer="16" to-port="0"/>
+        <edge from-layer="11" from-port="2" to-layer="10" to-port="0"/>
     </edges>
 </net>
 )V0G0N";
 std::string modelV5 = R"V0G0N(
 <net name="Network" version="5" precision="FP32" batch="1">
     <layers>
-        <layer name="in2" type="Input" precision="FP32" id="0">
-            <data originalLayersNames="in2" />
-            <output>
-                <port id="0" precision="FP32">
-                    <dim>1</dim>
-                    <dim>3</dim>
-                    <dim>512</dim>
-                    <dim>512</dim>
-                </port>
-            </output>
-        </layer>
-        <layer name="in1" type="Input" precision="FP32" id="1">
-            <data originalLayersNames="in1" />
-            <output>
-                <port id="0" precision="FP32">
-                    <dim>1</dim>
-                    <dim>768</dim>
-                    <dim>30</dim>
-                    <dim>30</dim>
-                </port>
-            </output>
-        </layer>
-        <layer name="in3" type="Input" precision="FP32" id="2">
-            <data originalLayersNames="in3" />
-            <output>
-                <port id="0" precision="FP32">
-                    <dim>1</dim>
-                    <dim>2</dim>
-                    <dim>32400</dim>
-                </port>
-            </output>
-        </layer>
-        <layer name="Constant_49" type="Const" precision="FP32" id="3">
-            <output>
-                <port id="0" precision="FP32">
-                    <dim>1</dim>
-                    <dim>2</dim>
-                    <dim>32400</dim>
-                </port>
-            </output>
-            <blobs>
-                <custom offset="0" size="259200" precision="FP32" />
-            </blobs>
-        </layer>
-        <layer name="concat" type="Concat" precision="FP32" id="4">
-            <data axis="1" originalLayersNames="concat" />
-            <input>
-                <port id="0">
-                    <dim>1</dim>
-                    <dim>2</dim>
-                    <dim>32400</dim>
-                </port>
-                <port id="1">
-                    <dim>1</dim>
-                    <dim>2</dim>
-                    <dim>32400</dim>
-                </port>
-            </input>
-            <output>
-                <port id="2" precision="FP32">
-                    <dim>1</dim>
-                    <dim>4</dim>
-                    <dim>32400</dim>
-                </port>
-            </output>
-        </layer>
-    </layers>
-    <edges>
-        <edge from-layer="2" from-port="0" to-layer="4" to-port="0" />
-        <edge from-layer="3" from-port="0" to-layer="4" to-port="1" />
-    </edges>
+        <layer id="0" name="in1" type="Input" precision="FP32">
+            <output>
+                <port id="0">
+                    <dim>1</dim>
+                    <dim>768</dim>
+                    <dim>30</dim>
+                    <dim>30</dim>
+                </port>
+            </output>
+        </layer>
+        <layer id="1" name="in2" type="Input" precision="FP32">
+            <output>
+                <port id="0">
+                    <dim>1</dim>
+                    <dim>3</dim>
+                    <dim>512</dim>
+                    <dim>512</dim>
+                </port>
+            </output>
+        </layer>
+        <layer name="ExpandDims" id="2" type="PriorBoxClustered" precision="FP32">
+            <data clip="0" step_h="16.000000" step_w="16.000000" flip="1" height="44,10,30,19,94,32,61,53,17" offset="0.500000" step="16.000000" variance="0.1,0.1,0.2,0.2" width="86,13,57,39,68,34,142,50,23" originalLayersNames="ExpandDims,prior,shape_of1,shape_of2,ss1,ss2"/>
+            <input>
+                <port id="1">
+                    <dim>1</dim>
+                    <dim>768</dim>
+                    <dim>30</dim>
+                    <dim>30</dim>
+                </port>
+                <port id="2">
+                    <dim>1</dim>
+                    <dim>3</dim>
+                    <dim>512</dim>
+                    <dim>512</dim>
+                </port>
+            </input>
+            <output>
+                <port id="3">
+                    <dim>1</dim>
+                    <dim>2</dim>
+                    <dim>32400</dim>
+                </port>
+            </output>
+        </layer>
+    </layers>
+    <edges>
+        <edge from-layer="0" from-port="0" to-layer="2" to-port="1"/>
+        <edge from-layer="1" from-port="0" to-layer="2" to-port="2"/>
+    </edges>
 </net>
 )V0G0N";

-    compareIRs(model, modelV5, 259200, [](Blob::Ptr& weights) {
+    compareIRs(model, modelV5, 50, [](Blob::Ptr& weights) {
         auto* buffer = weights->buffer().as<int64_t*>();
```
|
||||||
buffer[0] = 2;
|
buffer[0] = 2;
|
||||||
buffer[1] = 4;
|
buffer[1] = 4;
|
||||||
@@ -369,16 +291,6 @@ TEST_F(NGraphReaderTests, ReadPriorBoxNetwork) {
 </port>
 </output>
 </layer>
-<layer id="15" name="in3" type="Parameter" version="opset1">
-<data element_type="f32" shape="1,2,14400"/>
-<output>
-<port id="0" precision="FP32">
-<dim>1</dim>
-<dim>2</dim>
-<dim>14400</dim>
-</port>
-</output>
-</layer>
 <layer id="2" name="shape_of1" type="ShapeOf" version="opset1">
 <input>
 <port id="0" precision="FP32">
@@ -520,63 +432,19 @@ TEST_F(NGraphReaderTests, ReadPriorBoxNetwork) {
 </port>
 </output>
 </layer>
-<layer name="concat" id="16" type="Concat" version="opset1">
-<data axis="1"/>
-<input>
-<port id="0" precision="FP32">
-<dim>1</dim>
-<dim>2</dim>
-<dim>14400</dim>
-</port>
-<port id="1" precision="FP32">
-<dim>1</dim>
-<dim>2</dim>
-<dim>14400</dim>
-</port>
-</input>
-<output>
-<port id="2" precision="FP32">
-<dim>1</dim>
-<dim>4</dim>
-<dim>14400</dim>
-</port>
-</output>
-</layer>
 <layer id="10" name="output" type="Result" version="opset1">
 <input>
 <port id="0" precision="FP32">
 <dim>1</dim>
-<dim>4</dim>
+<dim>2</dim>
 <dim>14400</dim>
 </port>
 </input>
 </layer>
-<layer id="13" name="output_2" type="Result" version="opset1">
-<input>
-<port id="0" precision="FP32">
-<dim>1</dim>
-<dim>768</dim>
-<dim>30</dim>
-<dim>30</dim>
-</port>
-</input>
-</layer>
-<layer id="14" name="output_3" type="Result" version="opset1">
-<input>
-<port id="0" precision="FP32">
-<dim>1</dim>
-<dim>3</dim>
-<dim>512</dim>
-<dim>512</dim>
-</port>
-</input>
-</layer>
 </layers>
 <edges>
 <edge from-layer="0" from-port="0" to-layer="2" to-port="0"/>
-<edge from-layer="0" from-port="0" to-layer="13" to-port="0"/>
 <edge from-layer="1" from-port="0" to-layer="6" to-port="0"/>
-<edge from-layer="1" from-port="0" to-layer="14" to-port="0"/>
 <edge from-layer="2" from-port="1" to-layer="5" to-port="0"/>
 <edge from-layer="6" from-port="1" to-layer="7" to-port="0"/>
 <edge from-layer="3" from-port="1" to-layer="5" to-port="1"/>
@@ -589,90 +457,66 @@ TEST_F(NGraphReaderTests, ReadPriorBoxNetwork) {
 <edge from-layer="7" from-port="4" to-layer="8" to-port="1"/>
 <edge from-layer="8" from-port="2" to-layer="11" to-port="0"/>
 <edge from-layer="12" from-port="0" to-layer="11" to-port="1"/>
-<edge from-layer="11" from-port="2" to-layer="16" to-port="0"/>
-<edge from-layer="15" from-port="0" to-layer="16" to-port="1"/>
-<edge from-layer="16" from-port="2" to-layer="10" to-port="0"/>
+<edge from-layer="11" from-port="2" to-layer="10" to-port="0"/>
 </edges>
 </net>
 )V0G0N";
 std::string modelV5 = R"V0G0N(
 <net name="Network" version="5" precision="FP32" batch="1">
 <layers>
-<layer name="in2" type="Input" precision="FP32" id="0">
-<data originalLayersNames="in2" />
-<output>
-<port id="0" precision="FP32">
-<dim>1</dim>
-<dim>3</dim>
-<dim>512</dim>
-<dim>512</dim>
-</port>
-</output>
-</layer>
-<layer name="in1" type="Input" precision="FP32" id="1">
-<data originalLayersNames="in1" />
-<output>
-<port id="0" precision="FP32">
-<dim>1</dim>
-<dim>768</dim>
-<dim>30</dim>
-<dim>30</dim>
-</port>
-</output>
-</layer>
-<layer name="Constant_49" type="Const" precision="FP32" id="2">
-<output>
-<port id="0" precision="FP32">
-<dim>1</dim>
-<dim>2</dim>
-<dim>14400</dim>
+<layer id="0" name="in1" type="Input" precision="FP32">
+<output>
+<port id="0">
+<dim>1</dim>
+<dim>768</dim>
+<dim>30</dim>
+<dim>30</dim>
+</port>
+</output>
+</layer>
+<layer id="1" name="in2" type="Input" precision="FP32">
+<output>
+<port id="0">
+<dim>1</dim>
+<dim>3</dim>
+<dim>512</dim>
+<dim>512</dim>
+</port>
+</output>
+</layer>
+<layer name="ExpandDims" id="2" type="PriorBox" precision="FP32">
+<data density="" fixed_ratio="" fixed_size="" aspect_ratio="2,0.5" clip="0" flip="0" img_h="0" img_size="0" img_w="0" max_size="" min_size="51.200001,72.407555" offset="0.500000" scale_all_sizes="0" step="17.066666666666666" step_h="0" step_w="0" variance="0.1,0.1,0.2,0.2" originalLayersNames="ExpandDims,prior,shape_of1,shape_of2,ss1,ss2"/>
+<input>
+<port id="1">
+<dim>1</dim>
+<dim>768</dim>
+<dim>30</dim>
+<dim>30</dim>
 </port>
-</output>
-<blobs>
-<custom offset="0" size="115200" precision="FP32" />
-</blobs>
-</layer>
-<layer name="in3" type="Input" precision="FP32" id="3">
-<data originalLayersNames="in3" />
+<port id="2">
+<dim>1</dim>
+<dim>3</dim>
+<dim>512</dim>
+<dim>512</dim>
+</port>
+</input>
 <output>
-<port id="0" precision="FP32">
+<port id="3">
 <dim>1</dim>
 <dim>2</dim>
 <dim>14400</dim>
 </port>
 </output>
 </layer>
-<layer name="concat" type="Concat" precision="FP32" id="4">
-<data axis="1" originalLayersNames="concat" />
-<input>
-<port id="0">
-<dim>1</dim>
-<dim>2</dim>
-<dim>14400</dim>
-</port>
-<port id="1">
-<dim>1</dim>
-<dim>2</dim>
-<dim>14400</dim>
-</port>
-</input>
-<output>
-<port id="2" precision="FP32">
-<dim>1</dim>
-<dim>4</dim>
-<dim>14400</dim>
-</port>
-</output>
-</layer>
-</layers>
-<edges>
-<edge from-layer="2" from-port="0" to-layer="4" to-port="0" />
-<edge from-layer="3" from-port="0" to-layer="4" to-port="1" />
-</edges>
+</layers>
+<edges>
+<edge from-layer="0" from-port="0" to-layer="2" to-port="1"/>
+<edge from-layer="1" from-port="0" to-layer="2" to-port="2"/>
+</edges>
 </net>
 )V0G0N";

-compareIRs(model, modelV5, 115200, [](Blob::Ptr& weights) {
+compareIRs(model, modelV5, 40, [](Blob::Ptr& weights) {
 auto* buffer = weights->buffer().as<int64_t*>();
 buffer[0] = 2;
 buffer[1] = 4;
@@ -3,6 +3,7 @@
 //

 #include <string>
+#include <generic_ie.hpp>
 #include "ngraph_reader_tests.hpp"
 TEST_F(NGraphReaderTests, ReadProposalNetwork) {
 std::string model_v10 = R"V0G0N(
@@ -305,3 +306,100 @@ TEST_F(NGraphReaderTests, ReadProposalNetwork_2) {

 compareIRs(model_v10, model_v6, 32);
 }
+
+TEST_F(NGraphReaderTests, ReadExtensionProposalNetwork) {
+std::string model_v10 = R"V0G0N(
+<net name="Network" version="10">
+<layers>
+<layer id="0" name="in1" type="Parameter" version="opset1">
+<data element_type="f32" shape="1,12,34,62"/>
+<output>
+<port id="0" precision="FP32">
+<dim>1</dim>
+<dim>12</dim>
+<dim>34</dim>
+<dim>62</dim>
+</port>
+</output>
+</layer>
+<layer id="1" name="in2" type="Parameter" version="opset1">
+<data element_type="f32" shape="1,24,34,62"/>
+<output>
+<port id="0" precision="FP32">
+<dim>1</dim>
+<dim>24</dim>
+<dim>34</dim>
+<dim>62</dim>
+</port>
+</output>
+</layer>
+<layer id="2" name="in3" type="Const" version="opset1">
+<data offset="0" size="24"/>
+<output>
+<port id="0" precision="I64">
+<dim>3</dim>
+</port>
+</output>
+</layer>
+<layer name="proposal" type="Proposal" precision="FP32" id="3" version="extension">
+<data feat_stride="16" base_size="16" min_size="16" ratio="2.669000" scale="4.000000,6.000000,9.000000,16.000000,24.000000,32.000000" pre_nms_topn="6000" post_nms_topn="200" nms_thresh="0.600000"/>
+<input>
+<port id="1">
+<dim>1</dim>
+<dim>12</dim>
+<dim>34</dim>
+<dim>62</dim>
+</port>
+<port id="2">
+<dim>1</dim>
+<dim>24</dim>
+<dim>34</dim>
+<dim>62</dim>
+</port>
+<port id="3">
+<dim>3</dim>
+</port>
+</input>
+<output>
+<port id="3" precision="FP32">
+<dim>1000</dim>
+<dim>5</dim>
+</port>
+<port id="4" precision="FP32">
+<dim>1000</dim>
+</port>
+</output>
+</layer>
+<layer id="4" name="output" type="Result" version="opset1">
+<input>
+<port id="0" precision="FP32">
+<dim>200</dim>
+<dim>5</dim>
+</port>
+</input>
+</layer>
+</layers>
+<edges>
+<edge from-layer="0" from-port="0" to-layer="3" to-port="1"/>
+<edge from-layer="1" from-port="0" to-layer="3" to-port="2"/>
+<edge from-layer="2" from-port="0" to-layer="3" to-port="3"/>
+<edge from-layer="3" from-port="4" to-layer="4" to-port="0"/>
+</edges>
+</net>
+)V0G0N";
+
+Core ie;
+Blob::Ptr weights;
+
+weights = make_shared_blob<uint8_t>(TensorDesc(Precision::U8, {24}, Layout::C));
+weights->allocate();
+CommonTestUtils::fill_data(weights->buffer().as<float *>(), weights->size() / sizeof(float));
+
+auto func = ie.ReadNetwork(model_v10, weights).getFunction();
+for (auto op : func->get_ordered_ops()) {
+if (op->get_friendly_name() == "proposal" && op->get_type_info() == ngraph::op::GenericIE::type_info) {
+return;
+}
+}
+FAIL() << "Custom proposal layer is not a Generic operation!";
+}
@@ -1,218 +0,0 @@
-// Copyright (C) 2020 Intel Corporation
-// SPDX-License-Identifier: Apache-2.0
-//
-
-#include <gtest/gtest.h>
-
-#include "common_test_utils/test_common.hpp"
-#include <string>
-#include <memory>
-
-#include <ngraph/opsets/opset3.hpp>
-#include <ngraph/function.hpp>
-#include <transformations/init_node_info.hpp>
-#include <ngraph/pass/constant_folding.hpp>
-#include <ngraph/ops.hpp>
-#include "ngraph_test_utils.hpp"
-
-using namespace testing;
-
-TEST(TransformationTests, ConstFoldingPriorBox) {
-std::shared_ptr<ngraph::Function> f(nullptr), f_ref(nullptr);
-
-{
-auto in = std::make_shared<ngraph::opset3::Parameter>(ngraph::element::i64, ngraph::Shape{2});
-ngraph::op::PriorBoxAttrs attrs;
-attrs.min_size = {256.0f};
-attrs.max_size = {315.0f};
-attrs.aspect_ratio = {2.0f};
-attrs.flip = true;
-attrs.scale_all_sizes = true;
-
-auto layer_shape = ngraph::opset3::Constant::create<int64_t>(ngraph::element::i64, ngraph::Shape{2}, {1, 1});
-auto image_shape = ngraph::opset3::Constant::create<int64_t>(ngraph::element::i64, ngraph::Shape{2}, {300, 300});
-auto pb = std::make_shared<ngraph::opset3::PriorBox>(layer_shape, image_shape, attrs);
-auto res = std::make_shared<ngraph::opset3::Result>(pb);
-f = std::make_shared<ngraph::Function>(ngraph::NodeVector{res}, ngraph::ParameterVector{in});
-ngraph::pass::InitNodeInfo().run_on_function(f);
-ngraph::pass::ConstantFolding().run_on_function(f);
-ASSERT_NO_THROW(check_rt_info(f));
-}
-
-{
-auto layer_shape = std::make_shared<ngraph::opset3::Parameter>(ngraph::element::i64, ngraph::Shape{2});
-auto const_prior_box = ngraph::opset3::Constant::create<float>(ngraph::element::f32, ngraph::Shape{2, 16},
-{ -0.426667, -0.426667, 0.426667, 0.426667, -0.473286, -0.473286, 0.473286, 0.473286,
--0.603398, -0.301699, 0.603398, 0.301699, -0.301699, -0.603398, 0.301699, 0.603398,
-0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1,
-});
-auto res = std::make_shared<ngraph::opset3::Result>(const_prior_box);
-f_ref = std::make_shared<ngraph::Function>(ngraph::NodeVector{res}, ngraph::ParameterVector{layer_shape});
-}
-
-auto res = compare_functions(f, f_ref);
-ASSERT_TRUE(res.first) << res.second;
-
-auto fused = std::dynamic_pointer_cast<ngraph::opset3::Constant>(f->get_result()->input_value(0).get_node_shared_ptr());
-auto ref = std::dynamic_pointer_cast<ngraph::opset3::Constant>(f->get_result()->input_value(0).get_node_shared_ptr());
-
-EXPECT_TRUE(fused != nullptr);
-EXPECT_TRUE(ref != nullptr);
-EXPECT_TRUE(fused->get_vector<float>() == ref->get_vector<float>());
-}
-
-TEST(TransformationTests, ConstFoldingPriorBoxClustered) {
-std::shared_ptr<ngraph::Function> f(nullptr), f_ref(nullptr);
-
-{
-auto in = std::make_shared<ngraph::opset3::Parameter>(ngraph::element::i64, ngraph::Shape{2});
-ngraph::op::PriorBoxClusteredAttrs attrs;
-attrs.widths = {4.0f, 2.0f, 3.2f};
-attrs.heights = {1.0f, 2.0f, 1.1f};
-
-auto layer_shape = ngraph::opset3::Constant::create<int64_t>(ngraph::element::i64, ngraph::Shape{2}, {2, 2});
-auto image_shape = ngraph::opset3::Constant::create<int64_t>(ngraph::element::i64, ngraph::Shape{2}, {300, 300});
-auto pb = std::make_shared<ngraph::opset3::PriorBoxClustered>(layer_shape, image_shape, attrs);
-auto res = std::make_shared<ngraph::opset3::Result>(pb);
-f = std::make_shared<ngraph::Function>(ngraph::NodeVector{res}, ngraph::ParameterVector{in});
-ngraph::pass::InitNodeInfo().run_on_function(f);
-ngraph::pass::ConstantFolding().run_on_function(f);
-ASSERT_NO_THROW(check_rt_info(f));
-}
-
-{
-auto layer_shape = std::make_shared<ngraph::opset3::Parameter>(ngraph::element::i64, ngraph::Shape{2});
-auto const_prior_box = ngraph::opset3::Constant::create<float>(ngraph::element::f32, ngraph::Shape{2, 48},
-{ -0.00666667, -0.00166667, 0.00666667, 0.00166667, -0.00333333, -0.00333333, 0.00333333,
-0.00333333, -0.00533333, -0.00183333, 0.00533333, 0.00183333, -0.00333333, -0.00166667,
-0.01, 0.00166667, 0, -0.00333333, 0.00666667, 0.00333333, -0.002, -0.00183333, 0.00866667,
-0.00183333, -0.00666667, 0.00166667, 0.00666667, 0.005, -0.00333333, 0, 0.00333333,
-0.00666667, -0.00533333, 0.0015, 0.00533333, 0.00516667, -0.00333333, 0.00166667, 0.01,
-0.005, 0, 0, 0.00666667, 0.00666667, -0.002, 0.0015, 0.00866667, 0.00516667, 0.1, 0.1,
-0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
-0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
-});
-auto res = std::make_shared<ngraph::opset3::Result>(const_prior_box);
-f_ref = std::make_shared<ngraph::Function>(ngraph::NodeVector{res}, ngraph::ParameterVector{layer_shape});
-}
-
-auto res = compare_functions(f, f_ref);
-ASSERT_TRUE(res.first) << res.second;
-
-auto fused = std::dynamic_pointer_cast<ngraph::opset3::Constant>(f->get_result()->input_value(0).get_node_shared_ptr());
-auto ref = std::dynamic_pointer_cast<ngraph::opset3::Constant>(f->get_result()->input_value(0).get_node_shared_ptr());
-
-EXPECT_TRUE(fused != nullptr);
-EXPECT_TRUE(ref != nullptr);
-EXPECT_TRUE(fused->get_vector<float>() == ref->get_vector<float>());
-}
-
-TEST(TransformationTests, ConstFoldingPriorBoxSubgraph) {
-std::shared_ptr<ngraph::Function> f(nullptr), f_ref(nullptr);
-
-{
-auto in = std::make_shared<ngraph::opset3::Parameter>(ngraph::element::i64, ngraph::Shape{2, 3, 1, 1});
-auto in_2 = std::make_shared<ngraph::opset3::Parameter>(ngraph::element::i64, ngraph::Shape{2, 3, 300, 300});
-ngraph::op::PriorBoxAttrs attrs;
-attrs.min_size = {256.0f};
-attrs.max_size = {315.0f};
-attrs.aspect_ratio = {2.0f};
-attrs.flip = true;
-attrs.scale_all_sizes = true;
-
-auto layer_shape = std::make_shared<ngraph::opset3::ShapeOf>(in);
-auto image_shape = std::make_shared<ngraph::opset3::ShapeOf>(in_2);
-
-auto begin = ngraph::opset3::Constant::create(ngraph::element::i64, ngraph::Shape{1}, {2});
-auto end = ngraph::opset3::Constant::create(ngraph::element::i64, ngraph::Shape{1}, {4});
-auto stride = ngraph::opset3::Constant::create(ngraph::element::i64, ngraph::Shape{1}, {1});
-auto ss_data = std::make_shared<ngraph::opset3::StridedSlice>(layer_shape, begin, end, stride,
-std::vector<int64_t>{0}, std::vector<int64_t>{0});
-
-auto ss_image = std::make_shared<ngraph::opset3::StridedSlice>(image_shape, begin, end, stride,
-std::vector<int64_t>{0}, std::vector<int64_t>{0});
-auto pb = std::make_shared<ngraph::opset3::PriorBox>(ss_data, ss_image, attrs);
-auto res = std::make_shared<ngraph::opset3::Result>(pb);
-f = std::make_shared<ngraph::Function>(ngraph::NodeVector{res}, ngraph::ParameterVector{in, in_2});
-ngraph::pass::InitNodeInfo().run_on_function(f);
-ngraph::pass::ConstantFolding().run_on_function(f);
-ASSERT_NO_THROW(check_rt_info(f));
-}
-
-{
-auto layer_shape = std::make_shared<ngraph::opset3::Parameter>(ngraph::element::i64, ngraph::Shape{2});
-auto const_prior_box = ngraph::opset3::Constant::create<float>(ngraph::element::f32, ngraph::Shape{2, 16},
-{ -0.426667, -0.426667, 0.426667, 0.426667, -0.473286, -0.473286, 0.473286, 0.473286,
--0.603398, -0.301699, 0.603398, 0.301699, -0.301699, -0.603398, 0.301699, 0.603398,
-0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1
-});
-auto res = std::make_shared<ngraph::opset3::Result>(const_prior_box);
-f_ref = std::make_shared<ngraph::Function>(ngraph::NodeVector{res}, ngraph::ParameterVector{layer_shape});
-}
-
-auto res = compare_functions(f, f_ref);
-ASSERT_TRUE(res.first) << res.second;
-
-auto fused = std::dynamic_pointer_cast<ngraph::opset3::Constant>(f->get_result()->input_value(0).get_node_shared_ptr());
-auto ref = std::dynamic_pointer_cast<ngraph::opset3::Constant>(f->get_result()->input_value(0).get_node_shared_ptr());
-
-EXPECT_TRUE(fused != nullptr);
-EXPECT_TRUE(ref != nullptr);
-EXPECT_TRUE(fused->get_vector<float>() == ref->get_vector<float>());
-}
-
-TEST(TransformationTests, ConstFoldingPriorBoxClusteredSubgraph) {
-std::shared_ptr<ngraph::Function> f(nullptr), f_ref(nullptr);
-
-{
-auto in = std::make_shared<ngraph::opset3::Parameter>(ngraph::element::i64, ngraph::Shape{2, 3, 2, 2});
-auto in_2 = std::make_shared<ngraph::opset3::Parameter>(ngraph::element::i64, ngraph::Shape{2, 3, 300, 300});
-ngraph::op::PriorBoxClusteredAttrs attrs;
-attrs.widths = {4.0f, 2.0f, 3.2f};
-attrs.heights = {1.0f, 2.0f, 1.1f};
-
-auto layer_shape = std::make_shared<ngraph::opset3::ShapeOf>(in);
-auto image_shape = std::make_shared<ngraph::opset3::ShapeOf>(in_2);
-
-auto begin = ngraph::opset3::Constant::create(ngraph::element::i64, ngraph::Shape{1}, {2});
-auto end = ngraph::opset3::Constant::create(ngraph::element::i64, ngraph::Shape{1}, {4});
-auto stride = ngraph::opset3::Constant::create(ngraph::element::i64, ngraph::Shape{1}, {1});
-auto ss_data = std::make_shared<ngraph::opset3::StridedSlice>(layer_shape, begin, end, stride,
-std::vector<int64_t>{0}, std::vector<int64_t>{0});
-
-auto ss_image = std::make_shared<ngraph::opset3::StridedSlice>(image_shape, begin, end, stride,
-std::vector<int64_t>{0}, std::vector<int64_t>{0});
-auto pb = std::make_shared<ngraph::opset3::PriorBoxClustered>(ss_data, ss_image, attrs);
-auto res = std::make_shared<ngraph::opset3::Result>(pb);
-f = std::make_shared<ngraph::Function>(ngraph::NodeVector{res}, ngraph::ParameterVector{in, in_2});
-ngraph::pass::InitNodeInfo().run_on_function(f);
-ngraph::pass::ConstantFolding().run_on_function(f);
-ASSERT_NO_THROW(check_rt_info(f));
-}
-
-{
-auto layer_shape = std::make_shared<ngraph::opset3::Parameter>(ngraph::element::i64, ngraph::Shape{2});
-auto const_prior_box = ngraph::opset3::Constant::create<float>(ngraph::element::f32, ngraph::Shape{2, 48},
-{ -0.00666667, -0.00166667, 0.00666667, 0.00166667, -0.00333333, -0.00333333, 0.00333333,
-0.00333333, -0.00533333, -0.00183333, 0.00533333, 0.00183333, -0.00333333, -0.00166667,
-0.01, 0.00166667, 0, -0.00333333, 0.00666667, 0.00333333, -0.002, -0.00183333, 0.00866667,
-0.00183333, -0.00666667, 0.00166667, 0.00666667, 0.005, -0.00333333, 0, 0.00333333,
-0.00666667, -0.00533333, 0.0015, 0.00533333, 0.00516667, -0.00333333, 0.00166667, 0.01,
-0.005, 0, 0, 0.00666667, 0.00666667, -0.002, 0.0015, 0.00866667, 0.00516667, 0.1, 0.1,
-0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
-0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
-});
-auto res = std::make_shared<ngraph::opset3::Result>(const_prior_box);
-f_ref = std::make_shared<ngraph::Function>(ngraph::NodeVector{res}, ngraph::ParameterVector{layer_shape});
-}
-
-auto res = compare_functions(f, f_ref);
-ASSERT_TRUE(res.first) << res.second;
-
-auto fused = std::dynamic_pointer_cast<ngraph::opset3::Constant>(f->get_result()->input_value(0).get_node_shared_ptr());
-auto ref = std::dynamic_pointer_cast<ngraph::opset3::Constant>(f->get_result()->input_value(0).get_node_shared_ptr());
-
-EXPECT_TRUE(fused != nullptr);
-EXPECT_TRUE(ref != nullptr);
-EXPECT_TRUE(fused->get_vector<float>() == ref->get_vector<float>());
-}
@@ -0,0 +1,73 @@
+// Copyright (C) 2020 Intel Corporation
+// SPDX-License-Identifier: Apache-2.0
+//
+
+#include <gtest/gtest.h>
+
+#include <string>
+#include <memory>
+#include <queue>
+
+#include <ngraph/function.hpp>
+#include <ngraph/opsets/opset1.hpp>
+#include <transformations/convert_divide.hpp>
+#include <transformations/init_node_info.hpp>
+#include <transformations/utils/utils.hpp>
+
+#include "ngraph_test_utils.hpp"
+
+using namespace testing;
+
+TEST(TransformationTests, ConvertDivide) {
+std::shared_ptr<ngraph::Function> f(nullptr), f_ref(nullptr);
+{
+auto data = std::make_shared<ngraph::opset1::Parameter>(ngraph::element::f32, ngraph::Shape{3, 1, 2});
+auto divide_constant = ngraph::opset1::Constant::create(ngraph::element::f32, ngraph::Shape{1}, {1.5});
+auto divide = std::make_shared<ngraph::opset1::Divide>(data, divide_constant);
+
+f = std::make_shared<ngraph::Function>(ngraph::NodeVector{divide}, ngraph::ParameterVector{data});
+
+ngraph::pass::InitNodeInfo().run_on_function(f);
+ngraph::pass::ConvertDivide().run_on_function(f);
+ASSERT_NO_THROW(check_rt_info(f));
+}
+
+{
+auto data = std::make_shared<ngraph::opset1::Parameter>(ngraph::element::f32, ngraph::Shape{3, 1, 2});
+auto divide_constant = ngraph::opset1::Constant::create(ngraph::element::f32, ngraph::Shape{1}, {1.5});
+auto pow = std::make_shared<ngraph::opset1::Power>(divide_constant,
+ngraph::opset1::Constant::create(ngraph::element::f32, ngraph::Shape{1}, {-1}));
+auto mul = std::make_shared<ngraph::opset1::Multiply>(data, pow);
+
+f_ref = std::make_shared<ngraph::Function>(ngraph::NodeVector{mul}, ngraph::ParameterVector{data});
+}
+
+auto res = compare_functions(f, f_ref);
+ASSERT_TRUE(res.first) << res.second;
+}
+
+TEST(TransformationTests, ConvertDivideNegative) {
+std::shared_ptr<ngraph::Function> f(nullptr), f_ref(nullptr);
+{
+auto data = std::make_shared<ngraph::opset1::Parameter>(ngraph::element::i32, ngraph::Shape{3, 1, 2});
+auto divide_constant = ngraph::opset1::Constant::create(ngraph::element::i32, ngraph::Shape{1}, {2});
+auto divide = std::make_shared<ngraph::opset1::Divide>(data, divide_constant);
+
+f = std::make_shared<ngraph::Function>(ngraph::NodeVector{divide}, ngraph::ParameterVector{data});
+
+ngraph::pass::InitNodeInfo().run_on_function(f);
+ngraph::pass::ConvertDivide().run_on_function(f);
+ASSERT_NO_THROW(check_rt_info(f));
+}
+
+{
+auto data = std::make_shared<ngraph::opset1::Parameter>(ngraph::element::i32, ngraph::Shape{3, 1, 2});
+auto divide_constant = ngraph::opset1::Constant::create(ngraph::element::i32, ngraph::Shape{1}, {2});
+auto divide = std::make_shared<ngraph::opset1::Divide>(data, divide_constant);
+
+f_ref = std::make_shared<ngraph::Function>(ngraph::NodeVector{divide}, ngraph::ParameterVector{data});
+}
+
+auto res = compare_functions(f, f_ref);
+ASSERT_TRUE(res.first) << res.second;
+}
@@ -177,6 +177,56 @@ TEST(TransformationTests, ConvertStridedSliceToCropNegative) {
         f_ref = std::make_shared<ngraph::Function>(ngraph::NodeVector{sslice}, ngraph::ParameterVector{input});
     }
+
+    auto res = compare_functions(f, f_ref);
+    ASSERT_TRUE(res.first) << res.second;
+}
+
+// in this test the Crop will get 3D input which is not supported so the transformation will not be applied
+TEST(TransformationTests, ConvertStridedSliceToCropNegative2) {
+    std::shared_ptr<ngraph::Function> f(nullptr), f_ref(nullptr);
+    {
+        auto input = std::make_shared<ngraph::opset1::Parameter>(ngraph::element::f32, ngraph::Shape{128, 1});
+        auto slice_begin = ngraph::opset1::Constant::create(ngraph::element::i64, ngraph::Shape{3}, {0, 0, 0});
+        auto slice_end = ngraph::opset1::Constant::create(ngraph::element::i64, ngraph::Shape{3}, {0, 0, 0});
+        auto slice_stride = ngraph::opset1::Constant::create(ngraph::element::i64, ngraph::Shape{3}, {1, 1, 1});
+
+        std::vector<int64_t> begin_mask = {0, 1, 1};
+        std::vector<int64_t> end_mask = {0, 1, 1};
+        std::vector<int64_t> new_axis_mask = {1, 0, 0};
+        std::vector<int64_t> shrink_axis_mask = {0, 0, 0};
+        std::vector<int64_t> ellipsis_mask = {0, 0, 0};
+
+        auto sslice = std::make_shared<ngraph::opset1::StridedSlice>(input, slice_begin, slice_end, slice_stride,
+                                                                     begin_mask, end_mask,
+                                                                     new_axis_mask, shrink_axis_mask, ellipsis_mask);
+        sslice->set_friendly_name("strided_slice");
+
+        f = std::make_shared<ngraph::Function>(ngraph::NodeVector{sslice}, ngraph::ParameterVector{input});
+        ngraph::pass::InitNodeInfo().run_on_function(f);
+        ngraph::pass::ConvertStridedSliceToCrop().run_on_function(f);
+        ASSERT_NO_THROW(check_rt_info(f));
+    }
+
+    {
+        auto input = std::make_shared<ngraph::opset1::Parameter>(ngraph::element::f32, ngraph::Shape{128, 1});
+        auto slice_begin = ngraph::opset1::Constant::create(ngraph::element::i64, ngraph::Shape{3}, {0, 0, 0});
+        auto slice_end = ngraph::opset1::Constant::create(ngraph::element::i64, ngraph::Shape{3}, {0, 0, 0});
+        auto slice_stride = ngraph::opset1::Constant::create(ngraph::element::i64, ngraph::Shape{3}, {1, 1, 1});
+
+        std::vector<int64_t> begin_mask = {0, 1, 1};
+        std::vector<int64_t> end_mask = {0, 1, 1};
+        std::vector<int64_t> new_axis_mask = {1, 0, 0};
+        std::vector<int64_t> shrink_axis_mask = {0, 0, 0};
+        std::vector<int64_t> ellipsis_mask = {0, 0, 0};
+
+        auto sslice = std::make_shared<ngraph::opset1::StridedSlice>(input, slice_begin, slice_end, slice_stride,
+                                                                     begin_mask, end_mask,
+                                                                     new_axis_mask, shrink_axis_mask, ellipsis_mask);
+        sslice->set_friendly_name("strided_slice");
+
+        f_ref = std::make_shared<ngraph::Function>(ngraph::NodeVector{sslice}, ngraph::ParameterVector{input});
+    }
+
     auto res = compare_functions(f, f_ref);
     ASSERT_TRUE(res.first) << res.second;
 }
@@ -157,5 +157,6 @@ TEST(TransformationTests, ConvertTopK3I64Output1) {
     ASSERT_TRUE(res.first) << res.second;

     auto result_node_of_converted_f = f->get_output_op(0);
-    auto topk_node = result_node_of_converted_f->input(0).get_source_output().get_node_shared_ptr();
+    auto convert_node = result_node_of_converted_f->input(0).get_source_output().get_node_shared_ptr();
+    ASSERT_TRUE(convert_node->get_friendly_name() == "topk.1") << "Transformation ConvertTopK3 should keep output names.\n";
 }
@@ -11,14 +11,15 @@ std::vector<std::string> disabledTestPatterns() {
     return {
         // TODO: Issue 26264
        R"(.*(MaxPool|AvgPool).*S\(1\.2\).*Rounding=CEIL.*)",
-        // TODO: Issue 31839
-        R"(.*(QuantConvBackpropData3D).*)",
         // TODO: Issue 31841
         R"(.*(QuantGroupConvBackpropData3D).*)",
         // TODO: Issue 31843
-        R"(.*(QuantGroupConvBackpropData2D)*QG=Perchannel.*)",
-        // TODO: Issue 32023
-        R"(.*(QuantGroupConvBackpropData2D)*QG=Pertensor.*)",
+        R"(.*(QuantConvBackpropData3D).*)",
+        R"(.*(QuantConvBackpropData2D).*(QG=Perchannel).*)",
+        R"(.*(QuantGroupConvBackpropData2D).*(QG=Perchannel).*)",
+        // TODO: Issue 33886
+        R"(.*(QuantGroupConv2D).*)",
+        R"(.*(QuantGroupConv3D).*)",
         // TODO: Issue 31845
         R"(.*(FakeQuantize).*)",
         R"(.*(EltwiseLayerTest).*IS=\(.*\..*\..*\..*\..*\).*secondaryInputType=PARAMETER.*opType=SCALAR.*)",
@@ -19,7 +19,6 @@ const std::vector<InferenceEngine::Precision> netPrecisions = {
 const std::vector<size_t> numOutChannels = {16, 32};

 const std::vector<size_t > levels = {256};
-// FIXME: Perchannel tests fail because of bug in LPT
 const std::vector<QuantizationGranularity > granularity = {Pertensor, Perchannel};

 /* ============= 2D GroupConvolutionBackpropData ============= */
@@ -0,0 +1,86 @@
+// Copyright (C) 2020 Intel Corporation
+// SPDX-License-Identifier: Apache-2.0
+//
+
+#include <vector>
+
+#include "subgraph_tests/quantized_group_convolution.hpp"
+#include "common_test_utils/test_constants.hpp"
+
+using namespace LayerTestsDefinitions;
+using namespace ngraph::helpers;
+
+namespace {
+
+const std::vector<InferenceEngine::Precision> netPrecisions = {
+        InferenceEngine::Precision::FP32
+};
+
+const std::vector<size_t> numOutChannels = {3, 24, 48};
+const std::vector<size_t> numGroups = {3};
+
+const std::vector<size_t > levels = {256};
+const std::vector<QuantizationGranularity> granularity = {Pertensor, Perchannel};
+const std::vector<bool> quantizeWeights = {false, true};
+
+/* ============= 2D GroupConvolution ============= */
+const std::vector<std::vector<size_t >> inputShapes2D = {{1, 3, 10, 10}, {1, 24, 10, 10}};
+const std::vector<std::vector<size_t >> kernels2D = {{1, 1}, {3, 3}};
+const std::vector<std::vector<size_t >> strides2D = {{1, 1}};
+const std::vector<std::vector<ptrdiff_t>> padBegins2D = {{0, 0}};
+const std::vector<std::vector<ptrdiff_t>> padEnds2D = {{0, 0}};
+const std::vector<std::vector<size_t >> dilations2D = {{1, 1}};
+
+const auto quantGroupConv2DParams = ::testing::Combine(
+        ::testing::ValuesIn(kernels2D),
+        ::testing::ValuesIn(strides2D),
+        ::testing::ValuesIn(padBegins2D),
+        ::testing::ValuesIn(padEnds2D),
+        ::testing::ValuesIn(dilations2D),
+        ::testing::ValuesIn(numOutChannels),
+        ::testing::ValuesIn(numGroups),
+        ::testing::ValuesIn(levels),
+        ::testing::ValuesIn(granularity),
+        ::testing::ValuesIn(quantizeWeights)
+);
+
+INSTANTIATE_TEST_CASE_P(QuantGroupConv2D, QuantGroupConvLayerTest,
+                        ::testing::Combine(
+                                quantGroupConv2DParams,
+                                ::testing::ValuesIn(netPrecisions),
+                                ::testing::ValuesIn(inputShapes2D),
+                                ::testing::Values(CommonTestUtils::DEVICE_CPU)),
+                        QuantGroupConvLayerTest::getTestCaseName);
+
+/* ============= 3D GroupConvolution ============= */
+const std::vector<std::vector<size_t >> inputShapes3D = {{1, 3, 5, 5, 5}, {1, 24, 5, 5, 5}};
+const std::vector<std::vector<size_t >> kernels3D = {{3, 3, 3}};
+const std::vector<std::vector<size_t >> strides3D = {{1, 1, 1}};
+const std::vector<std::vector<ptrdiff_t>> padBegins3D = {{0, 0, 0}};
+const std::vector<std::vector<ptrdiff_t>> padEnds3D = {{0, 0, 0}};
+const std::vector<std::vector<size_t >> dilations3D = {{1, 1, 1}};
+
+const auto quantGroupConv3DParams = ::testing::Combine(
+        ::testing::ValuesIn(kernels3D),
+        ::testing::ValuesIn(strides3D),
+        ::testing::ValuesIn(padBegins3D),
+        ::testing::ValuesIn(padEnds3D),
+        ::testing::ValuesIn(dilations3D),
+        ::testing::ValuesIn(numOutChannels),
+        ::testing::ValuesIn(numGroups),
+        ::testing::ValuesIn(levels),
+        ::testing::ValuesIn(granularity),
+        ::testing::ValuesIn(quantizeWeights)
+);
+
+INSTANTIATE_TEST_CASE_P(QuantGroupConv3D, QuantGroupConvLayerTest,
+                        ::testing::Combine(
+                                quantGroupConv3DParams,
+                                ::testing::ValuesIn(netPrecisions),
+                                ::testing::ValuesIn(inputShapes3D),
+                                ::testing::Values(CommonTestUtils::DEVICE_CPU)),
+                        QuantGroupConvLayerTest::getTestCaseName);
+
+} // namespace
@@ -21,7 +21,7 @@ const std::vector<std::map<std::string, std::string>> configs = {
     }
 };

-INSTANTIATE_TEST_CASE_P(ConcatQuantization, ConcatQuantization,
+INSTANTIATE_TEST_CASE_P(smoke_ConcatQuantization, ConcatQuantization,
     ::testing::Combine(
         ::testing::ValuesIn(netPrecisions),
         ::testing::Values(CommonTestUtils::DEVICE_GNA),
@@ -0,0 +1,39 @@
+// Copyright (C) 2020 Intel Corporation
+// SPDX-License-Identifier: Apache-2.0
+#include <vector>
+#include "subgraph_tests/multioutput_eltwise_squeeze_eltwise.hpp"
+#include "common_test_utils/test_constants.hpp"
+
+using namespace LayerTestsDefinitions;
+
+namespace {
+    std::vector<std::vector<std::vector<size_t>>> inputs{
+            {{1, 16}},
+            {{2, 16}},
+            {{1, 160}},
+            {{8, 40}},
+            {{3, 8}},
+            {{4, 32}},
+            {{5, 64}},
+            {{6, 128}},
+            {{7, 256}},
+            {{8, 512}},
+            {{8, 1024}}
+    };
+
+    std::map<std::string, std::string> additional_config = {
+        {"GNA_COMPACT_MODE", "NO"},
+    };
+
+    std::vector<InferenceEngine::Precision> netPrecisions = {InferenceEngine::Precision::FP32,
+                                                             InferenceEngine::Precision::FP16,
+    };
+
+    INSTANTIATE_TEST_CASE_P(multioutput_eltwise_identity, MultioutputEltwiseReshapeEltwise,
+                            ::testing::Combine(
+                                    ::testing::ValuesIn(inputs),
+                                    ::testing::ValuesIn(netPrecisions),
+                                    ::testing::Values(CommonTestUtils::DEVICE_GNA),
+                                    ::testing::Values(additional_config)),
+                            MultioutputEltwiseReshapeEltwise::getTestCaseName);
+} // namespace
@@ -9,6 +9,7 @@ using namespace LayerTestsDefinitions;
 namespace {
 std::vector<std::vector<std::vector<size_t>>> inputs{
         {{1, 4 , 160}, {0, 2, 1}},
+        {{1, 160, 4}, {0, 2, 1}},
         {{8, 16}, {1, 0}},
         {{1, 1, 4, 16}, {3, 1, 2, 0}},
         {{1, 8, 200}, {0, 2, 1}},
@@ -0,0 +1,53 @@
+// Copyright (C) 2020 Intel Corporation
+// SPDX-License-Identifier: Apache-2.0
+//
+
+#include <vector>
+#include "subgraph_tests/scaleshift.hpp"
+#include "common_test_utils/test_constants.hpp"
+
+using namespace LayerTestsDefinitions;
+
+namespace {
+
+std::vector<std::vector<std::vector<size_t>>> inShapes = {
+        {{1, 8}},
+        {{2, 16}},
+        {{3, 32}},
+        {{4, 64}},
+        {{5, 128}},
+        {{6, 256}},
+        {{7, 512}},
+        {{8, 1024}}
+};
+
+std::vector<std::vector<float >> Scales = {
+        {2.0f},
+        {3.0f},
+        {-1.0f},
+        {-2.0f},
+        {-3.0f}
+};
+
+std::vector<std::vector<float >> Shifts = {
+        {1.0f},
+        {2.0f},
+        {3.0f},
+        {-1.0f},
+        {-2.0f},
+        {-3.0f}
+};
+
+std::vector<InferenceEngine::Precision> netPrecisions = {InferenceEngine::Precision::FP32,
+                                                         InferenceEngine::Precision::FP16,
+};
+
+INSTANTIATE_TEST_CASE_P(scale_shift, ScaleShiftLayerTest,
+                        ::testing::Combine(
+                                ::testing::ValuesIn(inShapes),
+                                ::testing::ValuesIn(netPrecisions),
+                                ::testing::Values(CommonTestUtils::DEVICE_GNA),
+                                ::testing::ValuesIn(Scales),
+                                ::testing::ValuesIn(Shifts)),
+                        ScaleShiftLayerTest::getTestCaseName);
+} // namespace
@@ -60,8 +60,8 @@ INSTANTIATE_TEST_CASE_P(PriorBoxClustered_Basic, PriorBoxClusteredLayerTest,
     ::testing::Combine(
         layerSpeficParams,
         ::testing::ValuesIn(netPrecisions),
-        ::testing::Values(std::vector<size_t>({ 4, 4 })),
-        ::testing::Values(std::vector<size_t>({ 50, 50 })),
+        ::testing::Values(std::vector<size_t>({ 1, 16, 4, 4 })),
+        ::testing::Values(std::vector<size_t>({ 1, 3, 50, 50 })),
         ::testing::Values(CommonTestUtils::DEVICE_GPU)),
     PriorBoxClusteredLayerTest::getTestCaseName
 );