Publishing R4 (#41)

* Publishing R4
This commit is contained in:
Alexey Suhov
2018-11-23 16:19:43 +03:00
committed by openvino-pushbot
parent 54eab18036
commit 55a41d7570
5728 changed files with 94337 additions and 755031 deletions

.gitignore vendored
View File

@@ -282,7 +282,6 @@ report/
/CMakeCache.txt
.vimprj/
build_IA32/
doc/
.dir-locals.el
GTAGS
GPATH

.gitmodules vendored Normal file
View File

@@ -0,0 +1,3 @@
[submodule "inference-engine/thirdparty/ade"]
path = inference-engine/thirdparty/ade
url = https://github.com/opencv/ade.git

View File

@@ -2,29 +2,33 @@
The software was validated on:
- Ubuntu\* 16.04 with default GCC\* 5.4.0
- CentOS\* 7.4 with default GCC\* 4.8.5 (using clDNN library built separately with GCC\* 5.2)
- CentOS\* 7.4 with default GCC\* 4.8.5
- [Intel® Graphics Compute Runtime for OpenCL™ Driver package 18.28.11080](https://github.com/intel/compute-runtime/releases/tag/18.28.11080).
### Software Requirements
- [CMake\*](https://cmake.org/download/) 3.9 or higher
- GCC\* 4.8 or higher to build the Inference Engine
- GCC\* 5.2 or higher to build the Compute Library for Deep Neural Networks (clDNN library)
- OpenBLAS\*
### Build Steps
1. Install OpenBLAS and other dependencies using the `install_dependencies.sh` script in the project root folder.
2. Create a build folder:
1. Clone submodules:
```sh
git submodule init
git submodule update --recursive
```
2. Install build dependencies using the `install_dependencies.sh` script in the project root folder.
3. Create a build folder:
```sh
mkdir build
```
3. Inference Engine uses a CMake-based build system. In the created `build` directory, run `cmake` to fetch project dependencies and create Unix makefiles, then run `make` to build the project:
4. Inference Engine uses a CMake-based build system. In the created `build` directory, run `cmake` to fetch project dependencies and create Unix makefiles, then run `make` to build the project:
```sh
cmake -DCMAKE_BUILD_TYPE=Release ..
make -j16
```
You can use the following additional build options:
- Use `BLAS_INCLUDE_DIRS` and `BLAS_LIBRARIES` cmake options to specify path to OpenBLAS headers and library, for example use the following options on CentOS\*: `-DBLAS_INCLUDE_DIRS=/usr/include/openblas -DBLAS_LIBRARIES=/usr/lib64/libopenblas.so.0`
- To build clDNN from source, specify the `-DENABLE_CLDNN_BUILD=ON` option for `cmake`. By default, a pre-built version of the clDNN library is used.
- The internal JIT GEMM implementation is used by default.
- To switch to the OpenBLAS\* implementation, use the `GEMM=OPENBLAS` option together with the `BLAS_INCLUDE_DIRS` and `BLAS_LIBRARIES` cmake options to specify the path to the OpenBLAS headers and library. For example, use the following options on CentOS\*: `-DGEMM=OPENBLAS -DBLAS_INCLUDE_DIRS=/usr/include/openblas -DBLAS_LIBRARIES=/usr/lib64/libopenblas.so.0` (a combined configuration example is shown after this list).
- To switch to the optimized MKL-ML\* GEMM implementation, use the `GEMM=MKL` and `MKLROOT` cmake options to specify the path to the unpacked MKL-ML package with `include` and `lib` folders, for example: `-DGEMM=MKL -DMKLROOT=<path_to_MKL>`. The MKL-ML\* package can be downloaded [here](https://github.com/intel/mkl-dnn/releases/download/v0.17/mklml_lnx_2019.0.1.20180928.tgz)
- To switch on/off the CPU and GPU plugins, use `cmake` options `-DENABLE_MKL_DNN=ON/OFF` and `-DENABLE_CLDNN=ON/OFF`.
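As an illustration, the options above can be combined into a single configuration command. A sketch using the CentOS\* OpenBLAS paths shown above:
```sh
cmake -DCMAKE_BUILD_TYPE=Release \
      -DGEMM=OPENBLAS \
      -DBLAS_INCLUDE_DIRS=/usr/include/openblas \
      -DBLAS_LIBRARIES=/usr/lib64/libopenblas.so.0 ..
make -j16
```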
## Build on Windows\* Systems:
@@ -39,25 +43,31 @@ The software was validated on:
- [Intel® C++ Compiler](https://software.intel.com/en-us/intel-parallel-studio-xe) 18.0 to build the Inference Engine on Windows.
### Build Steps
1. Download and install [Intel® C++ Compiler](https://software.intel.com/en-us/intel-parallel-studio-xe) 18.0
2. Install OpenBLAS:
1. Clone submodules:
```sh
git submodule init
git submodule update --recursive
```
2. Download and install [Intel® C++ Compiler](https://software.intel.com/en-us/intel-parallel-studio-xe) 18.0
3. Install OpenBLAS:
1. Download [OpenBLAS\*](https://sourceforge.net/projects/openblas/files/v0.2.14/OpenBLAS-v0.2.14-Win64-int64.zip/download)
2. Unzip the downloaded package to a directory on your machine. In this document, this directory is referred to as `<OPENBLAS_DIR>`.
3. Create build directory:
4. Create build directory:
```sh
mkdir build
```
4. In the `build` directory, run `cmake` to fetch project dependencies and generate a Visual Studio solution:
5. In the `build` directory, run `cmake` to fetch project dependencies and generate a Visual Studio solution:
```sh
cd build
cmake -G "Visual Studio 15 2017 Win64" -T "Intel C++ Compiler 18.0" -DOS_FOLDER=ON ^
-DBLAS_INCLUDE_DIRS=<OPENBLAS_DIR>\include ^
-DBLAS_LIBRARIES=<OPENBLAS_DIR>\lib\libopenblas.dll.a ^
-DCMAKE_BUILD_TYPE=Release ^
-DICCLIB="C:\Program Files (x86)\IntelSWTools\compilers_and_libraries_2018\windows\compiler\lib" ..
```
5. Build generated solution in Visual Studio 2017 or run `cmake --build .` to build from the command line.
- To switch to the OpenBLAS GEMM implementation, use the `-DGEMM=OPENBLAS` cmake option and specify the path to OpenBLAS using the `-DBLAS_INCLUDE_DIRS=<OPENBLAS_DIR>\include` and `-DBLAS_LIBRARIES=<OPENBLAS_DIR>\lib\libopenblas.dll.a` options. A prebuilt OpenBLAS\* package can be downloaded [here](https://sourceforge.net/projects/openblas/files/v0.2.14/OpenBLAS-v0.2.14-Win64-int64.zip/download) and the mingw64\* runtime dependencies [here](https://sourceforge.net/projects/openblas/files/v0.2.14/mingw64_dll.zip/download).
- To switch to the optimized MKL-ML GEMM implementation, use the `GEMM=MKL` and `MKLROOT` cmake options to specify the path to the unpacked MKL-ML package with `include` and `lib` folders, for example: `-DGEMM=MKL -DMKLROOT=<path_to_MKL>`. The MKL-ML\* package can be downloaded [here](https://github.com/intel/mkl-dnn/releases/download/v0.17/mklml_win_2019.0.1.20180928.zip). A combined example is shown after step 6.
6. Build generated solution in Visual Studio 2017 or run `cmake --build . --config Release` to build from the command line.
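As an illustration, a command-line configuration and build that switches to the MKL-ML GEMM backend could look like the following sketch; the `<path_to_MKL>` location is a placeholder, and the generator and toolset are taken from step 5:
```sh
cd build
cmake -G "Visual Studio 15 2017 Win64" -T "Intel C++ Compiler 18.0" ^
      -DGEMM=MKL -DMKLROOT=<path_to_MKL> ^
      -DCMAKE_BUILD_TYPE=Release ..
cmake --build . --config Release
```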
---
\* Other names and brands may be claimed as the property of others.

View File

@@ -2,10 +2,11 @@
#
# SPDX-License-Identifier: Apache-2.0
#
include ("features")
include("features")
include("mode")
include("omp")
if (THREADING STREQUAL "OMP")
include("omp")
endif()
include("itt")
#64 bits platform
@@ -40,7 +41,7 @@ if (WIN32)
if (MINGW)
SET(ENABLE_CLDNN OFF) # don't have a MinGW DLL for linking
set(ENABLE_SAMPLES_CORE OFF)
set(ENABLE_SAMPLES OFF)
endif()
endif()
@@ -63,14 +64,6 @@ if (NOT ENABLE_MKL_DNN)
set(GEMM OPENBLAS)
endif()
if (NOT ENABLE_VPU)
set(ENABLE_MYRIAD OFF)
endif()
if (NOT ENABLE_MYRIAD)
set(ENABLE_VPU OFF)
endif()
# the next section sets defines to be accessible in C++/C code for certain features
if (ENABLE_PROFILING_RAW)
add_definitions(-DENABLE_PROFILING_RAW=1)
@@ -100,8 +93,6 @@ if (ENABLE_OBJECT_DETECTION_TESTS)
add_definitions(-DENABLE_OBJECT_DETECTION_TESTS=1)
endif()
# model-dependent tests
if (DEVELOPMENT_PLUGIN_MODE)
message (STATUS "Enabled development plugin mode")
@@ -121,14 +112,9 @@ if (VERBOSE_BUILD)
set(CMAKE_VERBOSE_MAKEFILE ON)
endif()
if (NOT ENABLE_OMP)
if (THREADING STREQUAL "TBB" OR THREADING STREQUAL "SEQ")
set(ENABLE_INTEL_OMP OFF)
message(STATUS "ENABLE_INTEL_OMP should be disabled if THREADING is TBB or Sequential. ENABLE_INTEL_OMP option is " ${ENABLE_INTEL_OMP})
endif()
if (NOT GEMM STREQUAL "MKL" AND NOT GEMM STREQUAL "OPENBLAS")
message("FATAL_ERROR" "GEMM should be set to MKL|OPENBLAS")
endif()
print_enabled_features()
message(STATUS "GEMM = ${GEMM}")
print_enabled_features()

View File

@@ -4,6 +4,7 @@
#
cmake_minimum_required(VERSION 2.8)
cmake_policy(SET CMP0054 NEW)
#features trigger supported by build system
include(check_features)
@@ -40,19 +41,6 @@ endif()
set(MODELS_PATH "${TEMP}/models")
debug_message(STATUS "MODELS_PATH=" ${MODELS_PATH})
#clDNN
if (ENABLE_CLDNN AND NOT ENABLE_CLDNN_BUILD)
if(NOT IE_SUBMODULE_IN_CLDNN)
RESOLVE_DEPENDENCY(CLDNN
ARCHIVE_UNIFIED "cldnn-main-03988.zip"
TARGET_PATH "${TEMP}/clDNN"
ENVIRONMENT "CLDNN"
VERSION_REGEX ".*_(([a-z]+-)?[a-z]+-[0-9]+)---.*"
FOLDER) #new cldnn package dont have toplevel cldnn folder
debug_message(STATUS "clDNN=" ${CLDNN})
endif ()
endif ()
## enable cblas_gemm from OpenBLAS package
if (GEMM STREQUAL "OPENBLAS")
if(NOT BLAS_LIBRARIES OR NOT BLAS_INCLUDE_DIRS)
@@ -67,51 +55,87 @@ debug_message(STATUS "openblas=" ${BLAS_LIBRARIES})
endif ()
#MKL-ml package
if (GEMM STREQUAL "MKL" OR ENABLE_INTEL_OMP)
if (GEMM STREQUAL "MKL")
if(NOT MKLROOT)
message(FATAL_ERROR "MKLROOT not found: install MKL and set -DMKLROOT=<path_to_MKL>")
endif()
debug_message(STATUS "mkl_ml=" ${MKLROOT})
endif ()
if (ENABLE_INTEL_OMP)
if (WIN32)
RESOLVE_DEPENDENCY(MKL
ARCHIVE_WIN "mkltiny_win_20180512.zip"
TARGET_PATH "${TEMP}/mkltiny_win_20180512"
ENVIRONMENT "MKLROOT"
RESOLVE_DEPENDENCY(OMP
ARCHIVE_WIN "iomp.zip"
TARGET_PATH "${TEMP}/omp"
ENVIRONMENT "OMP"
VERSION_REGEX ".*_([a-z]*_([a-z0-9]+\\.)*[0-9]+).*")
elseif(LINUX)
RESOLVE_DEPENDENCY(MKL
ARCHIVE_LIN "mkltiny_lnx_20180511.tgz"
TARGET_PATH "${TEMP}/mkltiny_lnx_20180511"
ENVIRONMENT "MKLROOT"
RESOLVE_DEPENDENCY(OMP
ARCHIVE_LIN "iomp.tgz"
TARGET_PATH "${TEMP}/omp"
ENVIRONMENT "OMP"
VERSION_REGEX ".*_([a-z]*_([a-z0-9]+\\.)*[0-9]+).*")
endif()
debug_message(STATUS "mkl_ml=" ${MKL})
log_rpath_from_dir(OMP "${OMP}/lib")
debug_message(STATUS "intel_omp=" ${OMP})
endif ()
#TBB package
if (THREADING STREQUAL "TBB")
if (WIN32)
#TODO: add target_path to be platform specific as well, to avoid following if
RESOLVE_DEPENDENCY(TBB
ARCHIVE_WIN "tbb2018_20180618_win.zip" #TODO: windows zip archive created incorrectly using old name for folder
TARGET_PATH "${TEMP}/tbb"
ENVIRONMENT "TBBROOT"
VERSION_REGEX ".*_([a-z]*_([a-z0-9]+\\.)*[0-9]+).*")
elseif(LINUX)
RESOLVE_DEPENDENCY(TBB
ARCHIVE_LIN "tbb2018_20180618_lin.tgz"
TARGET_PATH "${TEMP}/tbb"
ENVIRONMENT "TBBROOT")
endif()
set(TBB_INCLUDE_DIRS "${TBB}/include")
find_path(TBB_INCLUDE_DIRS tbb/tbb.h)
find_library(TBB_LIBRARIES_RELEASE tbb HINTS "${TBB}/lib")
if (TBB_INCLUDE_DIRS AND TBB_LIBRARIES_RELEASE)
log_rpath_from_dir(TBB "${TBB}/lib")
else()
message("FATAL_ERROR" "TBB is unset")
endif()
debug_message(STATUS "tbb=" ${TBB})
endif ()
if (ENABLE_OPENCV)
if (WIN32)
RESOLVE_DEPENDENCY(OPENCV
ARCHIVE_WIN "opencv_3.4.3.zip"
TARGET_PATH "${TEMP}/opencv"
ARCHIVE_WIN "opencv_4.0.0-0256.zip"
TARGET_PATH "${TEMP}/opencv_4.0.0"
ENVIRONMENT "OpenCV_DIR"
VERSION_REGEX ".*_([0-9]+.[0-9]+.[0-9]+).*")
log_rpath_from_dir(OPENCV "\\opencv\\x64\\vc14\\bin")
set( ENV{OpenCV_DIR} ${OPENCV} )
log_rpath_from_dir(OPENCV "\\opencv_4.0.0\\bin")
set( ENV{OpenCV_DIR} ${OPENCV}/cmake )
elseif(LINUX)
if (${LINUX_OS_NAME} STREQUAL "Ubuntu 16.04")
RESOLVE_DEPENDENCY(OPENCV
ARCHIVE_LIN "opencv_3.4.3_ubuntu16.tar.bz2"
TARGET_PATH "${TEMP}/opencv_ubuntu16"
ARCHIVE_LIN "opencv_4.0.0-0256_ubuntu16.tgz"
TARGET_PATH "${TEMP}/opencv_4.0.0_ubuntu"
ENVIRONMENT "OpenCV_DIR"
VERSION_REGEX ".*_([0-9]+.[0-9]+.[0-9]+).*")
log_rpath_from_dir(OPENCV "opencv_ubuntu16/lib")
log_rpath_from_dir(OPENCV "opencv_4.0.0_ubuntu/lib")
elseif (${LINUX_OS_NAME} STREQUAL "CentOS 7")
RESOLVE_DEPENDENCY(OPENCV
ARCHIVE_LIN "opencv_3.4.3_centos7.tar.bz2"
TARGET_PATH "${TEMP}/opencv_centos7"
ARCHIVE_LIN "opencv_4.0.0-0256_centos.tgz"
TARGET_PATH "${TEMP}/opencv_4.0.0_centos"
ENVIRONMENT "OpenCV_DIR"
VERSION_REGEX ".*_([0-9]+.[0-9]+.[0-9]+).*")
log_rpath_from_dir(OPENCV "opencv_centos7/lib")
log_rpath_from_dir(OPENCV "opencv_4.0.0_centos/lib")
endif()
set( ENV{OpenCV_DIR} ${OPENCV}/share )
set( ENV{OpenCV_DIR} ${OPENCV}/cmake )
endif()
debug_message(STATUS "opencv=" ${OPENCV})
endif()
include(omp)
if (THREADING STREQUAL "OMP")
include(omp)
endif ()

View File

@@ -24,4 +24,4 @@ function (Download from to fatal result output)
endfunction(Download)
include ("download_and_apply")
include ("download_and_extract")
include ("download_and_extract")

View File

@@ -53,4 +53,4 @@ function (DownloadAndCheck from to fatal result)
file(REMOVE ${to}.md5)
set(${result} "${status_res}" PARENT_SCOPE)
endfunction(DownloadAndCheck)
endfunction(DownloadAndCheck)

View File

@@ -144,7 +144,7 @@ function (CheckOrDownloadAndExtract component RELATIVE_URL archive_name unpacked
set (status "ON")
set (on_master FALSE)
set (URL "https://download.01.org/openvinotoolkit/2018_R3/dldt/inference_engine/${RELATIVE_URL}")
set (URL "https://download.01.org/openvinotoolkit/2018_R4/dldt/inference_engine/${RELATIVE_URL}")
#no message on recursive calls
if (${use_alternatives})

View File

@@ -45,4 +45,4 @@ function (extract archive_path unpacked_path folder result)
endif()
endif()
endfunction (extract)
endfunction (extract)

View File

@@ -15,18 +15,25 @@ ie_option (ENABLE_MKL_DNN "MKL-DNN plugin for inference engine" ON)
ie_option (ENABLE_CLDNN "clDnn based plugin for inference engine" ON)
ie_option (ENABLE_CLDNN_BUILD "build clDnn from sources" OFF)
ie_option (ENABLE_PROFILING_ITT "ITT tracing of IE and plugins internals" ON)
ie_option (ENABLE_PROFILING_RAW "Raw counters profiling (just values, no start/stop time or timeline)" OFF)
# "MKL-DNN library might use MKL-ML or OpenBLAS for gemm tasks: OPENBLAS|MKL"
if (NOT GEMM)
set (GEMM "OPENBLAS")
endif()
#
ie_option (ENABLE_OMP "MKL-DNN library based on OMP implementation" ON)
# "MKL-DNN library might use MKL-ML or OpenBLAS for gemm tasks: MKL|OPENBLAS|JIT"
if (NOT GEMM STREQUAL "MKL" AND NOT GEMM STREQUAL "OPENBLAS" AND NOT GEMM STREQUAL "JIT")
set (GEMM "JIT")
message(STATUS "GEMM should be set to MKL|OPENBLAS|JIT. Default option is " ${GEMM})
endif()
list (APPEND IE_OPTIONS GEMM)
# "MKL-DNN library based on OMP or TBB or Sequential implementation: TBB|OMP|SEQ"
if (NOT THREADING STREQUAL "TBB" AND NOT THREADING STREQUAL "OMP" AND NOT THREADING STREQUAL "SEQ")
set (THREADING "OMP")
message(STATUS "THREADING should be set to TBB|OMP|SEQ. Default option is " ${THREADING})
endif()
list (APPEND IE_OPTIONS THREADING)
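# Example configure-time usage (illustrative): cmake -DTHREADING=TBB -DGEMM=MKL -DMKLROOT=<path_to_MKL> ..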
ie_option (ENABLE_INTEL_OMP "MKL-DNN library based on Intel OMP implementation" ON)
@@ -60,4 +67,3 @@ ie_option (ENABLE_PLUGIN_RPATH "enables rpath information to be present in plugi
# name of the environment variable that stores the path to the temp directory
set (DL_SDK_TEMP "DL_SDK_TEMP")

View File

@@ -8,7 +8,7 @@ cmake_minimum_required(VERSION 2.8)
if (UNIX)
function(get_linux_name res_var)
if (NOT EXISTS "/etc/lsb-release")
execute_process(COMMAND find /etc/ -maxdepth 1 -type f -name *-release -exec cat {} \;
execute_process(COMMAND find -L /etc/ -maxdepth 1 -type f -name *-release -exec cat {} \;
OUTPUT_VARIABLE release_data RESULT_VARIABLE result)
set(name_regex "NAME=\"([^ \"\n]*).*\"\n")
set(version_regex "VERSION=\"([0-9]+(\\.[0-9]+)?)[^\n]*\"")

View File

@@ -3,17 +3,19 @@
# SPDX-License-Identifier: Apache-2.0
#
cmake_policy(SET CMP0054 NEW)
if (APPLE OR WIN32)
find_path(OMP_INC omp.h)
find_library(OMP_LIB iomp5
PATHS ${MKL}/lib)
PATHS ${OMP}/lib)
if (OMP_INC AND OMP_LIB)
set(HAVE_OMP TRUE)
get_filename_component(OMP_LIB_DIR "${OMP_LIB}" PATH)
else()
if (ENABLE_OMP)
if (THREADING STREQUAL "OMP")
find_package(OpenMP)
if (NOT OPENMP_FOUND)
message(WARNING "OpenMP not found. OpenMP support will be disabled.")
@@ -34,7 +36,7 @@ macro(enable_omp)
elseif(UNIX) # Linux
add_definitions(-fopenmp)
elseif(WIN32) # Windows
if (ENABLE_OMP)
if (THREADING STREQUAL "OMP")
set(OPENMP_FLAGS "/Qopenmp /openmp")
set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} ${CMAKE_CCXX_FLAGS} ${OPENMP_FLAGS}")
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} ${CMAKE_CCXX_FLAGS} ${OPENMP_FLAGS}")
@@ -45,13 +47,13 @@ macro(enable_omp)
if (WIN32)
find_library(intel_omp_lib
libiomp5md
PATHS ${MKL}/lib ${ICCLIB})
PATHS ${OMP}/lib ${ICCLIB})
set (CMAKE_EXE_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS} /nodefaultlib:vcomp")
set (CMAKE_SHARED_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS} /nodefaultlib:vcomp")
else()
find_library(intel_omp_lib
iomp5
PATHS ${MKL}/lib)
PATHS ${OMP}/lib)
endif()
endif()
endmacro(enable_omp)

View File

@@ -3,6 +3,7 @@
# SPDX-License-Identifier: Apache-2.0
#
# Usage: ie_option(<option_variable> "description" <initial value or boolean expression> [IF <condition>])
function (ie_option variable description value)
option(${variable} "${description}" ${value})
list (APPEND IE_OPTIONS "${variable}")

View File

@@ -8,4 +8,4 @@ if (ENABLE_SANITIZER)
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -fsanitize=address -fuse-ld=gold")
set(CMAKE_SHARED_LINKER_FLAGS "${CMAKE_SHARED_LINKER_FLAGS} -fsanitize=address")
set(CMAKE_EXE_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS} -fsanitize=address")
endif()
endif()

View File

@@ -1,7 +1,7 @@
# Copyright (C) 2018 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
#
set(InferenceEngine_VERSION 1.2.0)
set(InferenceEngine_VERSION 1.4.0)
set(PACKAGE_VERSION ${InferenceEngine_VERSION})
set(PACKAGE_VERSION_EXACT False)
@@ -14,4 +14,4 @@ endif()
if(PACKAGE_FIND_VERSION VERSION_LESS PACKAGE_VERSION)
set(PACKAGE_VERSION_COMPATIBLE True)
endif()
endif()

View File

@@ -75,10 +75,17 @@ else()
set(os_name "${os_name} ${CMAKE_MATCH_1}")
if (NOT os_name)
message(FATAL_ERROR "Cannot detect OS via reading /etc/*-release:\n ${release_data}")
if(InferenceEngine_FIND_REQUIRED)
message(FATAL_ERROR "Cannot detect OS via reading /etc/*-release:\n ${release_data}")
elseif(NOT InferenceEngine_FIND_QUIETLY)
message(WARNING "Cannot detect OS via reading /etc/*-release:\n ${release_data}")
endif()
return()
endif()
message (STATUS "/etc/*-release distrib: ${os_name}")
if (NOT InferenceEngine_FIND_QUIETLY)
message (STATUS "/etc/*-release distrib: ${os_name}")
endif()
if (${os_name} STREQUAL "Ubuntu 14.04")
set(_OS_PATH "ubuntu_14.04/")
@@ -89,7 +96,12 @@ else()
elseif (${os_name} STREQUAL "poky 2.0")
set(_OS_PATH "ubuntu_16.04/")
else()
message(FATAL_ERROR "${os_name} is not supported. List of supported OS: Ubuntu 14.04, Ubuntu 16.04, CentOS 7")
if(InferenceEngine_FIND_REQUIRED)
message(FATAL_ERROR "${os_name} is not supported. List of supported OS: Ubuntu 14.04, Ubuntu 16.04, CentOS 7")
elseif(NOT InferenceEngine_FIND_QUIETLY)
message(WARNING "${os_name} is not supported. List of supported OS: Ubuntu 14.04, Ubuntu 16.04, CentOS 7")
endif()
return()
endif()
endif()
endif()
@@ -98,18 +110,23 @@ else()
unset(IE_INCLUDE_DIR CACHE)
endif()
if(IE_SRC_DIR AND NOT "${IE_ROOT_DIR}/src" EQUAL "${IE_SRC_DIR}")
unset(IE_SRC_DIR CACHE)
endif()
if(IE_LIBRARY AND NOT "${IE_ROOT_DIR}/lib/${_OS_PATH}/${_ARCH}" EQUAL "${IE_LIBRARY}")
unset(IE_LIBRARY CACHE)
endif()
set(_IE_ROOT_INCLUDE_DIR "${IE_ROOT_DIR}/include")
set(_IE_ROOT_SRC_DIR "${IE_ROOT_DIR}/src")
set(_IE_ROOT_LIBRARY "${IE_ROOT_DIR}/lib/${_OS_PATH}/${_ARCH}")
find_path(IE_INCLUDE_DIR inference_engine.hpp "${_IE_ROOT_INCLUDE_DIR}")
#message("InferenceEngine_INCLUDE_DIR=${IE_INCLUDE_DIR}:${_IE_ROOT_INCLUDE_DIR}")
find_path(IE_SRC_DIR extension "${_IE_ROOT_SRC_DIR}")
include(FindPackageHandleStandardArgs)
if (WIN32)
find_library(IE_RELEASE_LIBRARY inference_engine "${_IE_ROOT_LIBRARY}/Release")
find_library(IE_DEBUG_LIBRARY inference_engine "${_IE_ROOT_LIBRARY}/Debug")
@@ -146,6 +163,9 @@ else()
set(InferenceEngine_INCLUDE_DIRS ${IE_INCLUDE_DIR})
set(InferenceEngine_LIBRARIES IE::inference_engine)
set(InferenceEngine_FOUND TRUE)
add_subdirectory(${IE_SRC_DIR}/extension EXCLUDE_FROM_ALL ie_cpu_extension)
add_library(IE::ie_cpu_extension ALIAS ie_cpu_extension)
endif()
endif()

View File

@@ -1,4 +1,4 @@
# Overview of Inference Engine Python* API {#InferEnginePythonAPI}
# Overview of Inference Engine Python* API
**NOTE:** This is a preview version of the Inference Engine Python\* API for evaluation purposes only.
The module structure and the API itself may change in future releases.
@@ -32,20 +32,21 @@ after running the environment configuration script.
This class stores the main information about a layer and allows you to modify some layer parameters
### Class attributes:
* `name` - name of the layer
* `type` - layer type
* `precision` - layer base operating precision
* `affinity` - layer affinity set by user or default affinity set by IEPlugin.set_initial_affinity() method.
The affinity attribute provides getter and setter interface, so the layer affinity can be modified directly in following way
* `name` - Name of the layer
* `type`- Layer type
* `precision` - Layer base operating precision. Provides getter and setter interfaces.
* `affinity` - Layer affinity set by user or a default affinity set by the `IEPlugin.set_initial_affinity()` method.
The affinity attribute provides getter and setter interfaces, so the layer affinity can be modified directly.
For example:
```py
>>> net = IENetwork.from_ir(model=path_to_xml_file, weights=path_to_bin_file)
>>> plugin = IEPlugin(device="HETERO:FPGA,CPU")
>>> plugin.set_config({"TARGET_FALLBACK": "HETERO:FPGA,CPU"})
>>> plugin.set_initial_affinity(net)
>>> for l in net.layers.values():
... if l.type == "Convolution":
... l.affinity = "CPU"
>>> net = IENetwork.from_ir(model=path_to_xml_file, weights=path_to_bin_file)
>>> plugin = IEPlugin(device="HETERO:FPGA,CPU")
>>> plugin.set_config({"TARGET_FALLBACK": "HETERO:FPGA,CPU"})
>>> plugin.set_initial_affinity(net)
>>> for l in net.layers.values():
... if l.type == "Convolution":
... l.affinity = "CPU"
```
@@ -61,18 +62,18 @@ To understand how default and non-default affinities are set:
1. Call the `net.layers` function right after model loading and check that the layer affinity parameter is empty.
2. Call `plugin.set_initial_affinity(net)`.
3. Call `net.layers` and check layer affinity parameters to see how plugin set default affinity
3. Call `net.layers` and check the layer affinity parameters to see how the plugin set the default affinities
4. Set layer affinities as described above
5. Call `net.layers` again and check the layer affinity parameters to see how they changed after the manual affinity
setting
Please refer to `affinity_setting_sample.py` to see the full usage pipeline.
Please refer to `affinity_setting_demo.py` to see the full usage pipeline; a short sketch of the flow is also given below.
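A minimal sketch of this inspection flow, reusing the HETERO example above (the layer name `conv0` is illustrative):
```py
>>> net = IENetwork.from_ir(model=path_to_xml_file, weights=path_to_bin_file)
>>> plugin = IEPlugin(device="HETERO:FPGA,CPU")
>>> plugin.set_config({"TARGET_FALLBACK": "HETERO:FPGA,CPU"})
>>> [l.affinity for l in net.layers.values()]   # step 1: affinities are empty right after loading
>>> plugin.set_initial_affinity(net)            # step 2
>>> [l.affinity for l in net.layers.values()]   # step 3: default affinities set by the plugin
>>> net.layers['conv0'].affinity = "CPU"        # step 4: manual override
```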
* `weights` - dictionary with layer weights, biases or custom blobs if any
* `params` - layer specific parameters. Provides getter and setter interface which allows to get and\or modify layer parameters.
Please note that some modifications can be ignored and\or overwriten by target plugin (e.g. modification of
convolution kernel size will be reflected in layer parameters but finally the plugin will ignore it and will
use initial kernel size)
* `weights` - Dictionary with layer weights, biases, or custom blobs if any
* `params` - Layer-specific parameters. Provides getter and setter interfaces to get and modify layer parameters.
Please note that some modifications can be ignored and/or overwritten by the target plugin (e.g., a modified
convolution kernel size will be reflected in the layer parameters, but the plugin will ultimately ignore it and
use the initial kernel size). A short usage sketch is given below.
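A minimal sketch of reading and writing these attributes (the layer name `conv0` is illustrative, borrowed from the `IENetwork.layers` example below):
```py
>>> net = IENetwork.from_ir(model=path_to_xml_file, weights=path_to_bin_file)
>>> layer = net.layers['conv0']
>>> params = layer.params        # layer-specific parameters as a dictionary
>>> layer.params = params        # write (possibly modified) parameters back
>>> blobs = layer.weights        # dictionary with weights/biases blobs, if any
```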
## <a name="ienetwork-class"></a>IENetwork
@@ -86,41 +87,53 @@ There is no explicit class constructor. Use `from_ir` class method to read the I
### Class attributes:
* `name` - Name of the loaded network
* `inputs` - a dictionary of input layer name as a key and input data shape as a value
* `inputs` - A dictionary that maps input layer names to <a name="inputinfo-class"></a>InputInfo objects.
For example, to get a shape of the input layer:
* Usage example:
```py
>>> net = IENetwork.from_ir(model=path_to_xml_file, weights=path_to_bin_file)
>>> net.inputs
{'data': [1, 3, 224, 224]}
```
* `outputs` - a list of output layer names
```py
>>> net = IENetwork.from_ir(model=path_to_xml_file, weights=path_to_bin_file)
>>> net.inputs
{'data': <inference_engine.ie_api.InputInfo object at 0x7efe042dedd8>}
>>> net.inputs['data'].shape
[1, 3, 224, 224]
```
* Usage example:
```py
>>> net = IENetwork.from_ir(model=path_to_xml_file, weights=path_to_bin_file)
>>> net.outputs
['prob']
```
* `outputs` - A dictionary that maps output layer names to <a name="inputinfo-class"></a>OutputInfo objects
For example, to get a shape of the output layer:
* `batch_size` - Batch size of the network. Provides getter and setter interface which allows to get and modify the
network batch size in the following way:
```py
>>> net = IENetwork.from_ir(model=path_to_xml_file, weights=path_to_bin_file)
>>> net.batch_size
1
>>> net.batch_size = 4
>>> net.batch_size
4
```
* `layers` - return dictionary with the network layer names as key and <a name="ienetlayer-class"></a>IENetLayer objects containing layer properties
as value
```py
>>> net = IENetwork.from_ir(model=path_to_xml_file, weights=path_to_bin_file)
>>> net.outputs
{'prob': <inference_engine.ie_api.OutputInfo object at 0x7efe03ab95d0>}
>>> net.outputs['prob'].shape
[1, 1000]
```
* `batch_size` - Batch size of the network. Provides getter and setter interfaces to get and modify the
network batch size. For example:
```py
>>> net = IENetwork.from_ir(model=path_to_xml_file, weights=path_to_bin_file)
>>> net.batch_size
1
>>> net.batch_size = 4
>>> net.batch_size
4
>>> net.inputs['data'].shape
[4, 3, 224, 224]
```
* `layers` - Dictionary that maps network layer names to <a name="ienetlayer-class"></a>`IENetLayer`
objects containing layer properties. For example, to list all network layers:
```py
>>> net = IENetwork.from_ir(model=path_to_xml_file, weights=path_to_bin_file)
>>> net.layers
{'conv0': <inference_engine.ie_api.IENetLayer object at 0x7f3a4c102370>}
```
```py
>>> net = IENetwork.from_ir(model=path_to_xml_file, weights=path_to_bin_file)
>>> net.layers
{'conv0': <inference_engine.ie_api.IENetLayer object at 0x7f3a4c102370>
...
}
```
### Class Methods
* `from_ir(model: str, weights: str)`
@@ -131,19 +144,20 @@ There is no explicit class constructor. Use `from_ir` class method to read the I
* Parameters:
* model - path to `.xml` file of the IR
* weights - path to `.bin` file of the IR
* model - Path to `.xml` file of the IR
* weights - Path to `.bin` file of the IR
* Return value:
An instance of the `IENetwork` class
* Usage example:
```py
>>> net = IENetwork.from_ir(model=path_to_xml_file, weights=path_to_bin_file)
>>> net
<inference_engine.ie_api.IENetwork object at 0x7fd7dbce54b0>
```
```py
>>> net = IENetwork.from_ir(model=path_to_xml_file, weights=path_to_bin_file)
>>> net
<inference_engine.ie_api.IENetwork object at 0x7fd7dbce54b0>
```
### Instance Methods
@@ -156,24 +170,89 @@ There is no explicit class constructor. Use `from_ir` class method to read the I
* Parameters:
* `outputs` - a list of layer names to be set as model outputs. In case of setting one layer as output, string with one layer can be provided.
* `outputs` - List of layer names to be set as model outputs. If a single layer is being set as output, a string with the layer name can be provided.
* Return value:
None
* Usage example:
```py
>>> net = IENetwork.from_ir(model=path_to_xml_file, weights=path_to_bin_file)
>>> net.add_outputs(["conv5_1/dwise', conv2_1/expand'])]
>>> net.outputs
['prob', 'conv5_1/dwise', 'conv2_1/expand']
```
Note that the last layers (nodes without successors in graph representation of the model) are set as output
by default. In the case above, `prob` layer is a default output and `conv5_1/dwise`, `conv2_1/expand` are user-defined
outputs.
```py
>>> net = IENetwork.from_ir(model=path_to_xml_file, weights=path_to_bin_file)
>>> net.add_outputs(["conv5_1/dwise", "conv2_1/expand"])
>>> net.outputs
['prob', 'conv5_1/dwise', 'conv2_1/expand']
```
**Note**
The last layers (nodes without successors in the graph representation of the model) are set as outputs
by default. In the case above, the `prob` layer is a default output, and `conv5_1/dwise` and `conv2_1/expand` are user-defined
outputs.
* `reshape(input_shapes: dict)`:
* Description:
The method reshapes the network to change spatial dimensions, batch size, or any dimension.
**Note:**
Before using this method, make sure that the target shape is applicable to the network.
Changing the network shape to an arbitrary value may lead to unpredictable behavior.
* Parameters:
* `input_shapes` - The dictionary that maps input layer names to tuples with the target shape
* Return value:
None
* Usage example:
```py
>>> net = IENetwork.from_ir(model=path_to_xml_file, weights=path_to_bin_file)
>>> input_layer = next(iter(net.inputs))
>>> n, c, h, w = net.inputs[input_layer].shape
>>> net.reshape({input_layer: (n, c, h*2, w*2)})
```
## <a name="inputinfo-class"></a>InputInfo
This class contains the information about the network input layers
### Class attributes:
* `precision` - Precision of the input data provided by the user. Provides setter and getter interfaces
to get and modify the input layer precision.
List of applicable precisions: FP32, FP16, I32, I16, I8, U32, U16
**Note**: Support for any calculation precision depends on the target plugin
* `layout` - Layout of the input data provided by the user. Provides setter and getter interfaces
to get and modify the input layer layout.
List of applicable layouts: NCHW, NHWC, OIHW, C, CHW, HW, NC, CN, BLOCKED
* `shape` - Input layer data shape
## <a name="outputinfo-class"></a>OutputInfo
This class contains the information about the network output layers
### Class attributes:
* `precision` - Precision of the output data. Provides setter and getter interfaces
to get and modify output layer precision.
* `layout` - Layout of the output data
* `shape` - Output layer data shape (see the usage sketch below)
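A minimal sketch of working with these attributes; the input name `data` and output name `prob` are illustrative, taken from the examples above:
```py
>>> net = IENetwork.from_ir(model=path_to_xml_file, weights=path_to_bin_file)
>>> net.inputs['data'].shape                   # query the input shape
[1, 3, 224, 224]
>>> net.inputs['data'].precision = "FP16"      # must be one of the supported precisions
>>> net.inputs['data'].layout = "NHWC"         # must be one of the supported layouts
>>> net.outputs['prob'].precision = "FP32"     # output precision can be changed as well
```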
## <a name="ieplugin-class"></a>IEPlugin Class
This class is the main plugin interface and serves to initialize and configure the plugin.
@@ -184,8 +263,8 @@ This class is the main plugin interface and serves to initialize and configure t
* Parameters:
* `device` - target device name. Supported devices: CPU, GPU, FPGA, MYRIAD, HETERO
* `plugin_dirs` - list of paths to plugin directories
* `device` - Target device name. Supported devices: CPU, GPU, FPGA, MYRIAD, HETERO
* `plugin_dirs` - List of paths to plugin directories
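A minimal construction sketch; the device names are taken from the supported devices list above, and the plugin directory path is a placeholder:
```py
>>> plugin = IEPlugin(device="CPU")
>>> plugin_h = IEPlugin(device="HETERO:FPGA,CPU",
...                     plugin_dirs=["<path_to_plugin_dirs>"])
```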
### Properties
@@ -194,7 +273,7 @@ This class is the main plugin interface and serves to initialize and configure t
### Instance Methods
* `load(network: IENetwork, num_requests: int=1, config=None)`
* ```load(network: IENetwork, num_requests: int=1, config=None)```
* Description:
@@ -204,23 +283,25 @@ This class is the main plugin interface and serves to initialize and configure t
* Parameters:
* `network` - a valid IENetwork instance created by `IENetwork.from_ir()` method
* `num_requests` - a positive integer value of infer requests to be created. Number of infer requests may be limited
* `network` - A valid IENetwork instance created by `IENetwork.from_ir()` method
* `num_requests` - A positive integer value of infer requests to be created. Number of infer requests may be limited
by device capabilities.
* `config` - a dictionary of plugin configuration keys and their values
* `config` - A dictionary of plugin configuration keys and their values
* Return value:
None
* Usage example:
```py
>>> net = IENetwork.from_ir(model=path_to_xml_file, weights=path_to_bin_file)
>>> plugin = IEPlugin(device="CPU")
>>> exec_net = plugin.load(network=net, num_requsts=2)
>>> exec_net
<inference_engine.ie_api.ExecutableNetwork object at 0x7f5140bbcd38>
```
```py
>>> net = IENetwork.from_ir(model=path_to_xml_file, weights=path_to_bin_file)
>>> plugin = IEPlugin(device="CPU")
>>> exec_net = plugin.load(network=net, num_requests=2)
>>> exec_net
<inference_engine.ie_api.ExecutableNetwork object at 0x7f5140bbcd38>
```
* `set_initial_affinity(net: IENetwork)`
* Description:
@@ -230,7 +311,7 @@ This class is the main plugin interface and serves to initialize and configure t
* Parameters:
* `net` - a valid instance of IENetwork
* `net` - A valid instance of IENetwork
* Return value:
@@ -248,17 +329,20 @@ This class is the main plugin interface and serves to initialize and configure t
* Parameters:
* `extension_path` - a full path to CPU extensions library
* `extension_path` - A full path to CPU extensions library
* Return value:
None
* Usage example:
```py
>>> plugin = IEPlugin(device="CPU")
>>> plugin.add_cpu_extenstions(ext_lib_path)
```
```py
>>> plugin = IEPlugin(device="CPU")
>>> plugin.add_cpu_extension(ext_lib_path)
```
* `set_config(config: dict)`
* Description:
@@ -268,7 +352,7 @@ This class is the main plugin interface and serves to initialize and configure t
* Parameters:
* `config` - a dictionary of keys and values of acceptable configuration parameters
* `config` - A dictionary of keys and values of acceptable configuration parameters
* Return value:
@@ -279,6 +363,7 @@ This class is the main plugin interface and serves to initialize and configure t
See `set_affinity` method of the `IENetwork` class.
* `get_supported_layers(net: IENetwork)`
* Description:
Returns the set of layers supported by the plugin. Please note that in case of CPU plugin support of
@@ -286,7 +371,7 @@ This class is the main plugin interface and serves to initialize and configure t
* Parameters:
* `net` - a valid instance of IENetwork
* `net` - A valid instance of IENetwork
* Return value:
@@ -306,16 +391,19 @@ There is no explicit class constructor. To make a valid instance of `ExecutableN
### Class attributes
* `requests` - a tuple of InferRequest instances
* `requests` - A tuple of InferRequest instances
* Usage example:
```py
>>> net = IENetwork.from_ir(model=path_to_xml_file, weights=path_to_bin_file)
>>> plugin = IEPlugin(device="CPU")
>>> exec_net = plugin.load(network=net, num_requsts=2)
>>> exec_net.requests
(<inference_engine.ie_api.InferRequest object at 0x7f66f56c57e0>, <inference_engine.ie_api.InferRequest object at 0x7f66f56c58b8>, <inference_engine.ie_api.InferRequest object at 0x7f66f56c5900>)
```
```py
>>> net = IENetwork.from_ir(model=path_to_xml_file, weights=path_to_bin_file)
>>> plugin = IEPlugin(device="CPU")
>>> exec_net = plugin.load(network=net, num_requests=3)
>>> exec_net.requests
(<inference_engine.ie_api.InferRequest object at 0x7f66f56c57e0>,
<inference_engine.ie_api.InferRequest object at 0x7f66f56c58b8>,
<inference_engine.ie_api.InferRequest object at 0x7f66f56c5900>)
```
### Instance Methods
@@ -327,27 +415,28 @@ There is no explicit class constructor. To make a valid instance of `ExecutableN
Wraps `infer()` method of the `InferRequest` class
* Parameters:
* `inputs` - a dictionary of input layer name as a key and `numpy.ndarray` of proper shape with input data for the layer as a value
* `inputs` - A dictionary that maps input layer names to `numpy.ndarray` objects of proper shape with input data for the layer
* Return value:
A dictionary of output layer name as a key and `numpy.ndarray` with output data of the layer as a value
A dictionary that maps output layer names to `numpy.ndarray` objects with output data of the layer
* Usage example:
```py
>>> net = IENetwork.from_ir(model=path_to_xml_file, weights=path_to_bin_file)
>>> plugin = IEPlugin(device="CPU")
>>> exec_net = plugin.load(network=net, num_requsts=2)
>>> res = exec_net.infer({'data': img})
>>> res
{'prob': array([[[[2.83426580e-08]],
[[2.40166020e-08]],
[[1.29469613e-09]],
[[2.95946148e-08]]
......
]])}
```
For illustration of input data preparation, please see samples (for example, `classification_sample.py`).
```py
>>> net = IENetwork.from_ir(model=path_to_xml_file, weights=path_to_bin_file)
>>> plugin = IEPlugin(device="CPU")
>>> exec_net = plugin.load(network=net, num_requests=2)
>>> res = exec_net.infer({'data': img})
>>> res
{'prob': array([[[[2.83426580e-08]],
[[2.40166020e-08]],
[[1.29469613e-09]],
[[2.95946148e-08]]
......
]])}
```
For illustration of input data preparation, please see samples (for example, `classification_sample.py`).
* `start_async(request_id, inputs=None)`
@@ -358,21 +447,23 @@ There is no explicit class constructor. To make a valid instance of `ExecutableN
* Parameters:
* `request_id` - index of infer request to start inference
* `inputs` - a dictionary of input layer name as a key and `numpy.ndarray` of proper shape with input data for the layer as a value
* `request_id` - Index of infer request to start inference
* `inputs` - A dictionary that maps input layer names to `numpy.ndarray` objects of proper shape with input data for the layer
* Return value:
A handler of specified infer request, which is an instance of the `InferRequest` class.
* Usage example:
```py
>>> infer_request_handle = exec_net.start_async(request_id=0, inputs={input_blob: image})
>>> infer_status = infer_request_handle.wait()
>>> res = infer_request_handle.outputs[out_blob]
```
For more details about infer requests processing, see `classification_sample_async.py` (simplified case) and
`object_detection_demo_ssd_async.py` (real synchronous use case) samples.
```py
>>> infer_request_handle = exec_net.start_async(request_id=0, inputs={input_blob: image})
>>> infer_status = infer_request_handle.wait()
>>> res = infer_request_handle.outputs[out_blob]
```
For more details about infer requests processing, see `classification_sample_async.py` (simplified case) and
`object_detection_demo_ssd_async.py` (real asynchronous use case) samples.
## <a name="inferrequest"></a>InferRequest Class
@@ -386,19 +477,20 @@ class with specified number of requests to get `ExecutableNetwork` instance whic
### Class attributes
* `inputs` - a dictionary of input layer name as a key and `numpy.ndarray` of proper shape with input data for the layer as a value
* `outputs` - a dictionary of output layer name as a key and `numpy.ndarray` with output data of the layer as a value
* `inputs` - A dictionary that maps input layer names to `numpy.ndarray` objects of proper shape with input data for the layer
* `outputs` - A dictionary that maps output layer names to `numpy.ndarray` objects with output data of the layer
* Usage example:
```py
>>> exec_net.requests[0].inputs['data'][:] = image
>>> exec_net.requests[0].infer()
>>> res = exec_net.requests[0].outputs['prob']
>>> np.flip(np.sort(np.squeeze(res)),0)
array([4.85416055e-01, 1.70385033e-01, 1.21873841e-01, 1.18894853e-01,
5.45198545e-02, 2.44456064e-02, 5.41366823e-03, 3.42589128e-03,
2.26027006e-03, 2.12283316e-03 ...])
```
* Usage example:
```py
>>> exec_net.requests[0].inputs['data'][:] = image
>>> exec_net.requests[0].infer()
>>> res = exec_net.requests[0].outputs['prob']
>>> np.flip(np.sort(np.squeeze(res)),0)
array([4.85416055e-01, 1.70385033e-01, 1.21873841e-01, 1.18894853e-01,
5.45198545e-02, 2.44456064e-02, 5.41366823e-03, 3.42589128e-03,
2.26027006e-03, 2.12283316e-03 ...])
```
### Instance Methods
@@ -413,22 +505,23 @@ To run inference, please use simplified methods `infer()` and `start_async()` of
* Parameters:
* `inputs` - a dictionary of input layer name as a key and `numpy.ndarray` of proper shape with input data for the layer as a value
* `inputs` - A dictionary that maps input layer names to `numpy.ndarray` objects of proper shape with input data for the layer
* Return value:
None
* Usage example:
```py
>>> exec_net = plugin.load(network=net, num_requests=2)
>>> exec_net.requests[0].infer({input_blob: image})
>>> res = exec_net.requests[0].outputs['prob']
>>> np.flip(np.sort(np.squeeze(res)),0)
array([4.85416055e-01, 1.70385033e-01, 1.21873841e-01, 1.18894853e-01,
5.45198545e-02, 2.44456064e-02, 5.41366823e-03, 3.42589128e-03,
2.26027006e-03, 2.12283316e-03 ...])
```
```py
>>> exec_net = plugin.load(network=net, num_requests=2)
>>> exec_net.requests[0].infer({input_blob: image})
>>> res = exec_net.requests[0].outputs['prob']
>>> np.flip(np.sort(np.squeeze(res)),0)
array([4.85416055e-01, 1.70385033e-01, 1.21873841e-01, 1.18894853e-01,
5.45198545e-02, 2.44456064e-02, 5.41366823e-03, 3.42589128e-03,
2.26027006e-03, 2.12283316e-03 ...])
```
* `async_infer(inputs=None)`
@@ -438,23 +531,24 @@ To run inference, please use simplified methods `infer()` and `start_async()` of
* Parameters:
* `inputs` - a dictionary of input layer name as a key and `numpy.ndarray` of proper shape with input data for the layer as a value
* `inputs` - A dictionary that maps input layer names to `numpy.ndarray` objects of proper shape with input data for the layer
* Return value:
None
* Usage example:
```py
>>> exec_net = plugin.load(network=net, num_requests=2)
>>> exec_net.requests[0].async_infer({input_blob: image})
>>> exec_net.requests[0].wait()
>>> res = exec_net.requests[0].outputs['prob']
>>> np.flip(np.sort(np.squeeze(res)),0)
array([4.85416055e-01, 1.70385033e-01, 1.21873841e-01, 1.18894853e-01,
5.45198545e-02, 2.44456064e-02, 5.41366823e-03, 3.42589128e-03,
2.26027006e-03, 2.12283316e-03 ...])
```
```py
>>> exec_net = plugin.load(network=net, num_requests=2)
>>> exec_net.requests[0].async_infer({input_blob: image})
>>> exec_net.requests[0].wait()
>>> res = exec_net.requests[0].outputs['prob']
>>> np.flip(np.sort(np.squeeze(res)),0)
array([4.85416055e-01, 1.70385033e-01, 1.21873841e-01, 1.18894853e-01,
5.45198545e-02, 2.44456064e-02, 5.41366823e-03, 3.42589128e-03,
2.26027006e-03, 2.12283316e-03 ...])
```
* `wait(timeout=-1)`
@@ -467,14 +561,14 @@ To run inference, please use simplified methods `infer()` and `start_async()` of
There are special values of the timeout parameter:
* 0 - immediately returns the inference status. It does not block or interrupt execution.
* 0 - Immediately returns the inference status. It does not block or interrupt execution.
To find the meaning of the statuses, please refer to InferenceEngine::StatusCode in the Inference Engine C++ documentation
* -1 - waits until inference result becomes available (default value)
* -1 - Waits until inference result becomes available (default value)
* Parameters:
* `timeout` - time to wait in milliseconds or special (0, -1) cases described above.
* `timeout` - Time to wait in milliseconds or special (0, -1) cases described above.
If not specified, `timeout` value is set to -1 by default.
* Usage example:
@@ -498,19 +592,20 @@ To run inference, please use simplified methods `infer()` and `start_async()` of
* Usage example:
```py
>>> exec_net = plugin.load(network=net, num_requests=2)
>>> exec_net.requests[0].infer({input_blob: image})
>>> exec_net.requests[0].get_perf_counts()
{'Conv2D': {'exec_type': 'jit_avx2_1x1',
'real_time': 154,
'cpu_time': 154,
'status': 'EXECUTED',
'layer_type': 'Convolution'},
'Relu6': {'exec_type': 'undef',
'real_time': 0,
'cpu_time': 0,
'status': 'NOT_RUN',
'layer_type': 'Clamp'}
...
}
```py
>>> exec_net = plugin.load(network=net, num_requests=2)
>>> exec_net.requests[0].infer({input_blob: image})
>>> exec_net.requests[0].get_perf_counts()
{'Conv2D': {'exec_type': 'jit_avx2_1x1',
'real_time': 154,
'cpu_time': 154,
'status': 'EXECUTED',
'layer_type': 'Convolution'},
'Relu6': {'exec_type': 'undef',
'real_time': 0,
'cpu_time': 0,
'status': 'NOT_RUN',
'layer_type': 'Clamp'}
...
}
```

View File

@@ -9,7 +9,6 @@ from .ie_api_impl_defs cimport Blob, TensorDesc
from libcpp.string cimport string
from libcpp.vector cimport vector
from libcpp.memory cimport unique_ptr
from libcpp cimport bool
cdef class BlobBuffer:
cdef Blob.Ptr ptr
@@ -57,3 +56,9 @@ cdef class IENetReader:
cdef class IENetLayer:
cdef C.IENetLayer impl
cdef class InputInfo:
cdef C.InputInfo impl
cdef class OutputInfo:
cdef C.OutputInfo impl

View File

@@ -14,6 +14,7 @@ from libcpp.memory cimport unique_ptr
from libc.stdint cimport int64_t
import os
import numpy as np
from copy import deepcopy
cdef extern from "<utility>" namespace "std" nogil:
cdef unique_ptr[C.IEExecNetwork] move(unique_ptr[C.IEExecNetwork])
@@ -32,7 +33,8 @@ cdef dict_to_c_map(py_dict):
c_map[k.encode()] = v.encode()
return c_map
supported_precisions = ["fp32", "fp16", "q78", "i32", "i16", "i8", "u32", "u16"]
supported_precisions = ["FP32", "FP16", "Q78", "I32", "I16", "I8", "U32", "U16"]
supported_layouts = ["NCHW", "NHWC", "OIHW", "C", "CHW", "HW", "NC", "CN", "BLOCKED"]
known_plugins = ['CPU', 'GPU', 'FPGA', 'MYRIAD', 'HETERO']
def get_version():
@@ -62,6 +64,7 @@ cdef class IENetLayer:
weights_buffer.reset(weights.second)
weights_map[weights.first.decode()] = weights_buffer.to_numpy()
return weights_map
@property
def params(self):
return {k.decode(): v.decode() for k, v in self.impl.params}
@@ -73,6 +76,56 @@ cdef class IENetLayer:
def params(self, params_map):
self.impl.setParams(dict_to_c_map(params_map))
@precision.setter
def precision(self, precision: str):
self.impl.setPrecision(precision.upper().encode())
cdef class InputInfo:
@property
def precision(self):
return self.impl.precision.decode()
@property
def layout(self):
return self.impl.layout.decode()
@property
def shape(self):
return self.impl.dims
@precision.setter
def precision(self, precision):
if precision.upper() not in supported_precisions:
raise AttributeError(
"Unsupported precision {}! List of supported precisions: {}".format(precision, supported_precisions))
self.impl.setPrecision(precision.encode())
@layout.setter
def layout(self, layout):
if layout.upper() not in supported_layouts:
raise AttributeError(
"Unsupported layout {}! List of supported layouts: {}".format(layout, supported_layouts))
self.impl.setLayout(layout.encode())
cdef class OutputInfo:
@property
def precision(self):
return self.impl.precision.decode()
@property
def layout(self):
return self.impl.layout.decode()
@property
def shape(self):
return self.impl.dims
@precision.setter
def precision(self, precision):
if precision.upper() not in supported_precisions:
raise AttributeError(
"Unsupported precision {}! List of supported precisions: {}".format(precision, supported_precisions))
self.impl.setPrecision(precision.encode())
# @layout.setter
# def layout(self, layout):
# self.impl.setLayout(layout.encode())
cdef class ExecutableNetwork:
def __init__(self):
self._requests = []
@@ -80,8 +133,8 @@ cdef class ExecutableNetwork:
def infer(self, inputs=None):
current_request = self.requests[0]
current_request.infer(inputs)
if inputs is not None:
return {k: v for k, v in current_request.outputs.items()}
return deepcopy(current_request.outputs)
def start_async(self, request_id, inputs=None):
if request_id not in list(range(len(self.requests))):
@@ -147,7 +200,7 @@ cdef class InferRequest:
def _fill_inputs(self, inputs):
for k, v in inputs.items():
self.inputs[k][:] = v
self._inputs[k][:] = v
cdef class IENetwork:
@property
@@ -157,11 +210,25 @@ cdef class IENetwork:
@property
def inputs(self):
return {k.decode(): v for k, v in self.impl.inputs}
cdef map[string, C.InputInfo] c_inputs = self.impl.getInputs()
inputs = {}
cdef InputInfo in_info
for input in c_inputs:
in_info = InputInfo()
in_info.impl = input.second
inputs[input.first.decode()] = in_info
return inputs
@property
def outputs(self):
return [k.decode() for k in self.impl.outputs]
cdef map[string, C.OutputInfo] c_outputs = self.impl.getOutputs()
outputs = {}
cdef OutputInfo out_info
for out in c_outputs:
out_info = OutputInfo()
out_info.impl = out.second
outputs[out.first.decode()] = out_info
return outputs
@property
def batch_size(self):
@@ -176,7 +243,7 @@ cdef class IENetwork:
@property
def layers(self):
cdef map[string, C.IENetLayer] c_layers = <map[string, C.IENetLayer]>self.impl.getLayers()
cdef map[string, C.IENetLayer] c_layers = <map[string, C.IENetLayer]> self.impl.getLayers()
layers = {}
cdef IENetLayer net_l = IENetLayer()
for l in c_layers:
@@ -188,22 +255,23 @@ cdef class IENetwork:
@classmethod
def from_ir(cls, model: str, weights: str):
if not os.path.isfile(model):
raise FileNotFoundError("Path to the model {} doesn't exists or it's a directory".format(model))
raise Exception("Path to the model {} doesn't exists or it's a directory".format(model))
if not os.path.isfile(weights):
raise FileNotFoundError("Path to the weights {} doesn't exists or it's a directory".format(weights))
raise Exception("Path to the weights {} doesn't exists or it's a directory".format(weights))
net_reader = IENetReader()
return net_reader.read(model, weights)
# TODO: Use an enum for the precision type instead of a string parameter when Python 2 support is no longer required.
def add_outputs(self, outputs, precision="FP32"):
if precision.lower() not in supported_precisions:
raise AttributeError("Unsupported precision {}! List of supported precisions: {}".format(precision, supported_precisions))
if precision.upper() not in supported_precisions:
raise AttributeError(
"Unsupported precision {}! List of supported precisions: {}".format(precision, supported_precisions))
if not isinstance(outputs, list):
outputs = [outputs]
cdef vector[string] _outputs
for l in outputs:
_outputs.push_back(l.encode())
self.impl.addOutputs(_outputs, precision.lower().encode())
self.impl.addOutputs(_outputs, precision.upper().encode())
def reshape(self, input_shapes: dict):
cdef map[string, vector[size_t]] c_input_shapes;
@@ -241,7 +309,8 @@ cdef class IEPlugin:
cpdef ExecutableNetwork load(self, IENetwork network, int num_requests=1, config=None):
if num_requests <= 0:
raise ValueError("Incorrect number of requests specified: {}. Expected positive integer number.".format(num_requests))
raise ValueError(
"Incorrect number of requests specified: {}. Expected positive integer number.".format(num_requests))
cdef ExecutableNetwork exec_net = ExecutableNetwork()
cdef vector[string] inputs_list
cdef vector[string] outputs_list
@@ -275,13 +344,13 @@ cdef class IEPlugin:
return exec_net
cpdef void set_initial_affinity(self,IENetwork net) except *:
cpdef void set_initial_affinity(self, IENetwork net) except *:
if self.device.find("HETERO") == -1:
raise RuntimeError("set_initial_affinity method applicable only for HETERO device")
self.impl.setInitialAffinity(net.impl)
cpdef set get_supported_layers(self,IENetwork net):
return set([l.decode() for l in self.impl.queryNetwork(net.impl)])
cpdef set get_supported_layers(self, IENetwork net):
return set([l.decode() for l in self.impl.queryNetwork(net.impl)])
@property
def device(self):
@@ -305,8 +374,6 @@ cdef class IEPlugin:
c_config[to_std_string(k)] = to_std_string(v)
self.impl.setConfig(c_config)
cdef class IENetReader:
def read(self, model: str, weights: str) -> IENetwork:
cdef IENetwork net = IENetwork()
@@ -349,7 +416,6 @@ cdef class BlobBuffer:
buffer.strides = self.strides.data()
buffer.suboffsets = NULL
cdef char*_get_blob_format(self, const TensorDesc & desc):
cdef Precision precision = desc.getPrecision()
name = bytes(precision.name()).decode()

View File

@@ -6,6 +6,25 @@
#include "ie_api_impl.hpp"
#include "hetero/hetero_plugin_config.hpp"
#include "ie_iinfer_request.hpp"
std::map <std::string,InferenceEngine::Precision> precision_map = {{"FP32", InferenceEngine::Precision::FP32},
{"FP16", InferenceEngine::Precision::FP16},
{"Q78", InferenceEngine::Precision::Q78},
{"I32", InferenceEngine::Precision::I32},
{"I16", InferenceEngine::Precision::I16},
{"I8", InferenceEngine::Precision::I8},
{"U16", InferenceEngine::Precision::U16},
{"U8", InferenceEngine::Precision::U8}};
std::map <std::string,InferenceEngine::Layout> layout_map = {{"ANY", InferenceEngine::Layout::ANY},
{"NCHW", InferenceEngine::Layout::NCHW},
{"NHWC", InferenceEngine::Layout::NHWC},
{"OIHW", InferenceEngine::Layout::OIHW},
{"C", InferenceEngine::Layout::C},
{"CHW", InferenceEngine::Layout::CHW},
{"HW", InferenceEngine::Layout::HW},
{"NC", InferenceEngine::Layout::NC},
{"CN", InferenceEngine::Layout::CN},
{"BLOCKED", InferenceEngine::Layout::BLOCKED}};
#define stringify( name ) # name
#define IE_CHECK_CALL(expr) { \
auto ret = (expr); \
@@ -14,33 +33,18 @@
} \
} \
InferenceEnginePython::IENetwork InferenceEnginePython::IENetReader::read(std::string const &model,
std::string const &weights)
{
InferenceEngine::CNNNetReader net_reader;
net_reader.ReadNetwork(model);
net_reader.ReadWeights(weights);
const std::string &net_name = net_reader.getName();
std::map<std::string, std::vector<size_t>> inputs;
const InferenceEngine::InputsDataMap &inputsInfo = net_reader.getNetwork().getInputsInfo();
for (auto &item : inputsInfo)
{
const InferenceEngine::TensorDesc &inputTensorDesc = item.second->getTensorDesc();
InferenceEngine::SizeVector dims = inputTensorDesc.getDims();
inputs[item.first] = dims;
}
// TODO: store output shapes for each output
std::vector<std::string> outputs;
const InferenceEngine::OutputsDataMap &outputsInfo = net_reader.getNetwork().getOutputsInfo();
for (auto &item : outputsInfo)
{
outputs.push_back(item.first);
}
InferenceEngine::CNNNetwork network = net_reader.getNetwork();
std::size_t batch_size = network.getBatchSize();
return {network, net_name, batch_size, inputs, outputs};
return {network, net_name, batch_size};
}
std::map<std::string, InferenceEnginePython::IENetLayer> InferenceEnginePython::IENetwork::getLayers()
@@ -91,17 +95,47 @@ std::map<std::string, InferenceEnginePython::IENetLayer> InferenceEnginePython::
return result;
}
std::map<std::string, InferenceEnginePython::InputInfo> InferenceEnginePython::IENetwork::getInputs(){
std::map<std::string, InferenceEnginePython::InputInfo> inputs;
const InferenceEngine::InputsDataMap &inputsInfo = actual.getInputsInfo();
for (auto & in : inputsInfo){
InferenceEnginePython::InputInfo info;
info.actual = *in.second;
const InferenceEngine::TensorDesc &inputTensorDesc = in.second->getTensorDesc();
info.dims = inputTensorDesc.getDims();
for (auto it : precision_map )
if (it.second == in.second->getPrecision())
info.precision = it.first;
for (auto it : layout_map )
if (it.second == in.second->getLayout())
info.layout = it.first;
inputs[in.first] = info;
}
return inputs;
}
std::map<std::string, InferenceEnginePython::OutputInfo> InferenceEnginePython::IENetwork::getOutputs(){
std::map<std::string, InferenceEnginePython::OutputInfo> outputs;
const InferenceEngine::OutputsDataMap &outputsInfo = actual.getOutputsInfo();
for (auto & out : outputsInfo){
InferenceEnginePython::OutputInfo info;
info.actual = out.second;
const InferenceEngine::TensorDesc &inputTensorDesc = out.second->getTensorDesc();
info.dims = inputTensorDesc.getDims();
for (auto it : precision_map )
if (it.second == out.second->getPrecision())
info.precision = it.first;
for (auto it : layout_map )
if (it.second == out.second->getLayout())
info.layout = it.first;
outputs[out.first] = info;
}
return outputs;
}
void InferenceEnginePython::IENetwork::addOutputs(const std::vector<std::string> & out_layers, const std::string &precision)
{
std::map <std::string,InferenceEngine::Precision> precision_map = {{"fp32", InferenceEngine::Precision::FP32},
{"fp16", InferenceEngine::Precision::FP16},
{"q78", InferenceEngine::Precision::Q78},
{"i32", InferenceEngine::Precision::I32},
{"i16", InferenceEngine::Precision::I16},
{"i8", InferenceEngine::Precision::I8},
{"u16", InferenceEngine::Precision::U16},
{"u8", InferenceEngine::Precision::U8}};
for (auto && l : out_layers)
{
InferenceEngine::OutputsDataMap outputsDataMap = actual.getOutputsInfo();
@@ -118,32 +152,29 @@ void InferenceEnginePython::IENetwork::addOutputs(const std::vector<std::string>
actual.addOutput(l);
InferenceEngine::OutputsDataMap outputsDataMapUpd = actual.getOutputsInfo();
outputsDataMapUpd[l]->setPrecision(precision_map[precision]);
outputs.push_back(l);
}
}
void InferenceEnginePython::IENetwork::setBatch(const size_t size)
{
actual.setBatchSize(size);
const InferenceEngine::InputsDataMap &inputsInfo = actual.getInputsInfo();
for (auto &item : inputsInfo)
{
const InferenceEngine::TensorDesc &inputTensorDesc = item.second->getTensorDesc();
InferenceEngine::SizeVector dims = inputTensorDesc.getDims();
inputs[item.first] = dims;
}
}
void InferenceEnginePython::IENetwork::reshape(const std::map<std::string, std::vector<size_t>> & input_shapes){
actual.reshape(input_shapes);
const InferenceEngine::InputsDataMap &inputsInfo = actual.getInputsInfo();
for (auto &item : inputsInfo)
{
const InferenceEngine::TensorDesc &inputTensorDesc = item.second->getTensorDesc();
InferenceEngine::SizeVector dims = inputTensorDesc.getDims();
inputs[item.first] = dims;
}
}
void InferenceEnginePython::InputInfo::setPrecision(std::string precision){
actual.setPrecision(precision_map[precision]);
}
void InferenceEnginePython::InputInfo::setLayout(std::string layout){
actual.setLayout(layout_map[layout]);
}
void InferenceEnginePython::OutputInfo::setPrecision(std::string precision){
actual->setPrecision(precision_map[precision]);
}
InferenceEnginePython::IEPlugin::IEPlugin(const std::string &device, const std::vector<std::string> &plugin_dirs)
{
@@ -211,6 +242,9 @@ std::map<std::string, InferenceEngine::Blob::Ptr> InferenceEnginePython::IENetLa
return weights;
}
void InferenceEnginePython::IENetLayer::setPrecision(std::string precision){
layer_ptr->precision = precision_map[precision];
}
void InferenceEnginePython::IEPlugin::addCpuExtension(const std::string &extension_path)
{
InferenceEngine::ResponseDesc response;
@@ -295,13 +329,13 @@ std::vector<std::string> InferenceEnginePython::InferRequestWrap::getOutputsList
}
void InferenceEnginePython::InferRequestWrap::infer() {
InferenceEngine::ResponseDesc responseDesc;
request_ptr->Infer(&responseDesc);
InferenceEngine::ResponseDesc response;
IE_CHECK_CALL(request_ptr->Infer(&response));
}
void InferenceEnginePython::InferRequestWrap::infer_async() {
InferenceEngine::ResponseDesc responseDesc;
request_ptr->StartAsync(&responseDesc);
InferenceEngine::ResponseDesc response;
IE_CHECK_CALL(request_ptr->StartAsync(&response));
}
int InferenceEnginePython::InferRequestWrap::wait(int64_t timeout) {

View File

@@ -14,13 +14,8 @@
#include <sstream>
#include "ie_extension.h"
namespace InferenceEnginePython {
//struct BlobInfo {
// int layout;
// std::vector<std::size_t> dims;
// std::string name;
// std::vector<std::string> inputTo;
//};
struct IENetLayer {
InferenceEngine::CNNLayerPtr layer_ptr;
std::string name;
@@ -28,11 +23,25 @@ struct IENetLayer {
std::string precision;
std::string affinity;
std::map<std::string, std::string> params;
// std::map<std::string, InferenceEnginePython::BlobInfo> blob_info;
// std::map<std::string, InferenceEngine::Blob::Ptr> weights;
void setAffinity(const std::string & target_affinity);
void setParams(const std::map<std::string, std::string> & params_map);
std::map<std::string, InferenceEngine::Blob::Ptr> getWeights();
void setPrecision(std::string precision);
};
struct InputInfo{
InferenceEngine::InputInfo actual;
std::vector<size_t> dims;
std::string precision;
std::string layout;
void setPrecision(std::string precision);
void setLayout(std::string layout);
};
struct OutputInfo{
InferenceEngine::DataPtr actual;
std::vector<size_t> dims;
std::string precision;
std::string layout;
void setPrecision(std::string precision);
};
struct ProfileInfo {
std::string status;
@@ -46,15 +55,11 @@ struct IENetwork {
InferenceEngine::CNNNetwork actual;
std::string name;
std::size_t batch_size;
std::map<std::string, std::vector<size_t>> inputs;
std::vector<std::string> outputs;
void setPrecision() {
InferenceEngine::CNNNetwork one;
InferenceEngine::CNNNetwork second(std::move(one));
}
void setBatch(const size_t size);
void addOutputs(const std::vector<std::string> &out_layers, const std::string &precision);
std::map<std::string, InferenceEnginePython::IENetLayer> getLayers();
std::map<std::string, InferenceEnginePython::InputInfo> getInputs();
std::map<std::string, InferenceEnginePython::OutputInfo> getOutputs();
void reshape(const std::map<std::string, std::vector<size_t>> & input_shapes);
};

View File

@@ -43,12 +43,21 @@ cdef extern from "ie_api_impl.hpp" namespace "InferenceEnginePython":
void setAffinity(const string & target_affinity) except +
void setParams(const map[string, string] & params_map) except +
map[string, Blob.Ptr] getWeights() except +
void setPrecision(string precision) except +
cdef cppclass InputInfo:
vector[size_t] dims
string precision
string layout
void setPrecision(string precision)
void setLayout(string layout)
cdef cppclass OutputInfo:
vector[size_t] dims
string precision
string layout
void setPrecision(string precision)
# cdef cppclass BlobInfo:
# int layout
# vector[size_t] dims
# string name
# vector[string] inputTo
cdef cppclass ProfileInfo:
string status
@@ -71,8 +80,9 @@ cdef extern from "ie_api_impl.hpp" namespace "InferenceEnginePython":
string name
size_t batch_size
map[string, vector[size_t]] inputs
vector[string] outputs
map[string, IENetLayer] getLayers() except +
map[string, InputInfo] getInputs() except +
map[string, OutputInfo] getOutputs() except +
void addOutputs(vector[string] &, string &) except +
void setAffinity(map[string, string] &types_affinity_map, map[string, string] &layers_affinity_map) except +
void setBatch(size_t size) except +

View File

@@ -1,112 +0,0 @@
#!/usr/bin/env python
"""
Copyright (c) 2018 Intel Corporation
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
"""
from __future__ import print_function
import sys
import os
from argparse import ArgumentParser
import cv2
import numpy as np
import logging as log
from openvino.inference_engine import IENetwork, IEPlugin
def build_argparser():
parser = ArgumentParser()
parser.add_argument("-m", "--model", help="Path to an .xml file with a trained model.", required=True, type=str)
parser.add_argument("-i", "--input", help="Path to a folder with images or path to an image files", required=True,
type=str)
parser.add_argument("-l", "--cpu_extension",
help="MKLDNN (CPU)-targeted custom layers.Absolute path to a shared library with the kernels "
"impl.", type=str, default=None)
parser.add_argument("-pp", "--plugin_dir", help="Path to a plugin folder", type=str, default=None)
parser.add_argument("-d", "--device",
help="Specify hetero plugin configuration; e.g. HETERO:FPGA,CPU", default="HETERO:CPU,GPU",
type=str)
parser.add_argument("-nt", "--number_top", help="Number of top results", default=10, type=int)
return parser
def main():
log.basicConfig(format="[ %(levelname)s ] %(message)s", level=log.INFO, stream=sys.stdout)
args = build_argparser().parse_args()
assert args.device.split(':')[0] == "HETERO", "This sample supports only Hetero Plugin. " \
"Please specify correct device, e.g. HETERO:FPGA,CPU"
model_xml = args.model
model_bin = os.path.splitext(model_xml)[0] + ".bin"
# Plugin initialization for specified device and load extensions library if specified
plugin = IEPlugin(device=args.device, plugin_dirs=args.plugin_dir)
if args.cpu_extension and 'CPU' in args.device:
plugin.add_cpu_extension(args.cpu_extension)
# Read IR
net = IENetwork.from_ir(model=model_xml, weights=model_bin)
if "CPU" in plugin.device:
supported_layers = plugin.get_supported_layers(net)
not_supported_layers = [l for l in net.layers.keys() if l not in supported_layers]
if len(not_supported_layers) != 0:
log.error("Following layers are not supported by the plugin for specified device {}:\n {}".
format(plugin.device, ', '.join(not_supported_layers)))
log.error("Please try to specify cpu extensions library path in sample's command line parameters using -l "
"or --cpu_extension command line argument")
sys.exit(1)
net_ops = set([l.type for l in net.layers.values()])
if not any([op == "Convolution" for op in net_ops]):
log.warning("Specified IR doesn't contain any Convolution operations for which affinity going to be set.\n"
"Try to use another topology to make the affinity setting result more visible.")
# Configure the plugin to initialize default affinity for network in set_initial_affinity() function.
plugin.set_config({"TARGET_FALLBACK": args.device.split(':')[1]})
# Enable graph visualization
plugin.set_config({"HETERO_DUMP_GRAPH_DOT": "YES"})
plugin.set_initial_affinity(net)
for l in net.layers.values():
if l.type == "Convolution":
l.affinity = "GPU"
assert len(net.inputs.keys()) == 1, "Sample supports only single input topologies"
assert len(net.outputs) == 1, "Sample supports only single output topologies"
input_blob = next(iter(net.inputs))
out_blob = next(iter(net.outputs))
# Read and pre-process input image
n, c, h, w = net.inputs[input_blob]
image = cv2.imread(args.input)
image = cv2.resize(image, (w, h))
image = image.transpose((2, 0, 1)) # Change data layout from HWC to CHW
image = image.reshape((n, c, h, w))
# Load network to the plugin
exec_net = plugin.load(network=net)
del net
# Start sync inference
res = exec_net.infer(inputs={input_blob: image})
top_ind = np.argsort(res[out_blob], axis=1)[0, -args.number_top:][::-1]
for i in top_ind:
log.info("%f #%d" % (res[out_blob][0, i], i))
del exec_net
del plugin
cwd = os.getcwd()
log.info(
"Graphs representing default and resulting affinities dumped to {} and {} files respectively"
.format(os.path.join(cwd, 'hetero_affinity.dot'), os.path.join(cwd, 'hetero_subgraphs.dot'))
)
if __name__ == '__main__':
sys.exit(main() or 0)

View File

@@ -60,7 +60,7 @@ def main():
log.info("Loading network files:\n\t{}\n\t{}".format(model_xml, model_bin))
net = IENetwork.from_ir(model=model_xml, weights=model_bin)
if "CPU" in plugin.device:
if plugin.device == "CPU":
supported_layers = plugin.get_supported_layers(net)
not_supported_layers = [l for l in net.layers.keys() if l not in supported_layers]
if len(not_supported_layers) != 0:
@@ -79,7 +79,7 @@ def main():
net.batch_size = len(args.input)
# Read and pre-process input images
n, c, h, w = net.inputs[input_blob]
n, c, h, w = net.inputs[input_blob].shape
images = np.ndarray(shape=(n, c, h, w))
for i in range(n):
image = cv2.imread(args.input[i])

View File

@@ -60,7 +60,7 @@ def main():
log.info("Loading network files:\n\t{}\n\t{}".format(model_xml, model_bin))
net = IENetwork.from_ir(model=model_xml, weights=model_bin)
if "CPU" in plugin.device:
if plugin.device == "CPU":
supported_layers = plugin.get_supported_layers(net)
not_supported_layers = [l for l in net.layers.keys() if l not in supported_layers]
if len(not_supported_layers) != 0:
@@ -78,7 +78,7 @@ def main():
net.batch_size = len(args.input)
# Read and pre-process input images
n, c, h, w = net.inputs[input_blob]
n, c, h, w = net.inputs[input_blob].shape
images = np.ndarray(shape=(n, c, h, w))
for i in range(n):
image = cv2.imread(args.input[i])

View File

@@ -1,176 +0,0 @@
#!/usr/bin/env python
"""
Copyright (c) 2018 Intel Corporation
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
"""
from __future__ import print_function
import sys
import os
from argparse import ArgumentParser
import cv2
import time
import logging as log
from openvino.inference_engine import IENetwork, IEPlugin
def build_argparser():
parser = ArgumentParser()
parser.add_argument("-m", "--model", help="Path to an .xml file with a trained model.", required=True, type=str)
parser.add_argument("-i", "--input",
help="Path to video file or image. 'cam' for capturing video stream from camera", required=True,
type=str)
parser.add_argument("-l", "--cpu_extension",
help="MKLDNN (CPU)-targeted custom layers.Absolute path to a shared library with the kernels "
"impl.", type=str, default=None)
parser.add_argument("-pp", "--plugin_dir", help="Path to a plugin folder", type=str, default=None)
parser.add_argument("-d", "--device",
help="Specify the target device to infer on; CPU, GPU, FPGA or MYRIAD is acceptable. Sample "
"will look for a suitable plugin for device specified (CPU by default)", default="CPU",
type=str)
parser.add_argument("--labels", help="Labels mapping file", default=None, type=str)
parser.add_argument("-pt", "--prob_threshold", help="Probability threshold for detections filtering",
default=0.5, type=float)
return parser
def main():
log.basicConfig(format="[ %(levelname)s ] %(message)s", level=log.INFO, stream=sys.stdout)
args = build_argparser().parse_args()
model_xml = args.model
model_bin = os.path.splitext(model_xml)[0] + ".bin"
# Plugin initialization for specified device and load extensions library if specified
log.info("Initializing plugin for {} device...".format(args.device))
plugin = IEPlugin(device=args.device, plugin_dirs=args.plugin_dir)
if args.cpu_extension and 'CPU' in args.device:
plugin.add_cpu_extension(args.cpu_extension)
# Read IR
log.info("Reading IR...")
net = IENetwork.from_ir(model=model_xml, weights=model_bin)
if "CPU" in plugin.device:
supported_layers = plugin.get_supported_layers(net)
not_supported_layers = [l for l in net.layers.keys() if l not in supported_layers]
if len(not_supported_layers) != 0:
log.error("Following layers are not supported by the plugin for specified device {}:\n {}".
format(plugin.device, ', '.join(not_supported_layers)))
log.error("Please try to specify cpu extensions library path in sample's command line parameters using -l "
"or --cpu_extension command line argument")
sys.exit(1)
assert len(net.inputs.keys()) == 1, "Sample supports only single input topologies"
assert len(net.outputs) == 1, "Sample supports only single output topologies"
input_blob = next(iter(net.inputs))
out_blob = next(iter(net.outputs))
log.info("Loading IR to the plugin...")
exec_net = plugin.load(network=net, num_requests=2)
# Read and pre-process input image
n, c, h, w = net.inputs[input_blob]
del net
if args.input == 'cam':
input_stream = 0
else:
input_stream = args.input
assert os.path.isfile(args.input), "Specified input file doesn't exist"
if args.labels:
with open(args.labels, 'r') as f:
labels_map = [x.strip() for x in f]
else:
labels_map = None
cap = cv2.VideoCapture(input_stream)
cur_request_id = 0
next_request_id = 1
log.info("Starting inference in async mode...")
log.info("To switch between sync and async modes press Tab button")
log.info("To stop the sample execution press Esc button")
is_async_mode = True
render_time = 0
while cap.isOpened():
ret, frame = cap.read()
if not ret:
break
initial_w = cap.get(3)
initial_h = cap.get(4)
in_frame = cv2.resize(frame, (w, h))
in_frame = in_frame.transpose((2, 0, 1)) # Change data layout from HWC to CHW
in_frame = in_frame.reshape((n, c, h, w))
# Main sync point:
# in the truly Async mode we start the NEXT infer request, while waiting for the CURRENT to complete
# in the regular mode we start the CURRENT request and immediately wait for it's completion
inf_start = time.time()
if is_async_mode:
exec_net.start_async(request_id=next_request_id, inputs={input_blob: in_frame})
else:
exec_net.start_async(request_id=cur_request_id, inputs={input_blob: in_frame})
if exec_net.requests[cur_request_id].wait(-1) == 0:
inf_end = time.time()
det_time = inf_end - inf_start
# Parse detection results of the current request
res = exec_net.requests[cur_request_id].outputs[out_blob]
for obj in res[0][0]:
# Draw only objects when probability more than specified threshold
if obj[2] > args.prob_threshold:
xmin = int(obj[3] * initial_w)
ymin = int(obj[4] * initial_h)
xmax = int(obj[5] * initial_w)
ymax = int(obj[6] * initial_h)
class_id = int(obj[1])
# Draw box and label\class_id
color = (min(class_id * 12.5, 255), min(class_id * 7, 255), min(class_id * 5, 255))
cv2.rectangle(frame, (xmin, ymin), (xmax, ymax), color, 2)
det_label = labels_map[class_id] if labels_map else str(class_id)
cv2.putText(frame, det_label + ' ' + str(round(obj[2] * 100, 1)) + ' %', (xmin, ymin - 7),
cv2.FONT_HERSHEY_COMPLEX, 0.6, color, 1)
# Draw performance stats
inf_time_message = "Inference time: N\A for async mode" if is_async_mode else \
"Inference time: {:.3f} ms".format(det_time * 1000)
render_time_message = "OpenCV rendering time: {:.3f} ms".format(render_time * 1000)
async_mode_message = "Async mode is on. Processing request {}".format(cur_request_id) if is_async_mode else \
"Async mode is off. Processing request {}".format(cur_request_id)
cv2.putText(frame, inf_time_message, (15, 15), cv2.FONT_HERSHEY_COMPLEX, 0.5, (200, 10, 10), 1)
cv2.putText(frame, render_time_message, (15, 30), cv2.FONT_HERSHEY_COMPLEX, 0.5, (10, 10, 200), 1)
cv2.putText(frame, async_mode_message, (10, int(initial_h - 20)), cv2.FONT_HERSHEY_COMPLEX, 0.5,
(10, 10, 200), 1)
#
render_start = time.time()
cv2.imshow("Detection Results", frame)
render_end = time.time()
render_time = render_end - render_start
key = cv2.waitKey(1)
if key == 27:
break
if (9 == key):
is_async_mode = not is_async_mode
log.info("Switched to {} mode".format("async" if is_async_mode else "sync"))
if is_async_mode:
cur_request_id, next_request_id = next_request_id, cur_request_id
cv2.destroyAllWindows()
del exec_net
del plugin
if __name__ == '__main__':
sys.exit(main() or 0)

View File

@@ -82,7 +82,7 @@ def main():
log.info("Loading network files:\n\t{}\n\t{}".format(model_xml, model_bin))
net = IENetwork.from_ir(model=model_xml, weights=model_bin)
if "CPU" in plugin.device:
if plugin.device == "CPU":
supported_layers = plugin.get_supported_layers(net)
not_supported_layers = [l for l in net.layers.keys() if l not in supported_layers]
if len(not_supported_layers) != 0:
@@ -100,7 +100,7 @@ def main():
net.batch_size = len(args.input)
# Read and pre-process input images
n, c, h, w = net.inputs[input_blob]
n, c, h, w = net.inputs[input_blob].shape
images = np.ndarray(shape=(n, c, h, w))
for i in range(n):
image = cv2.imread(args.input[i])

View File

@@ -69,7 +69,7 @@ def main():
log.info("Loading network files:\n\t{}\n\t{}".format(model_xml, model_bin))
net = IENetwork.from_ir(model=model_xml, weights=model_bin)
if "CPU" in plugin.device:
if plugin.device == "CPU":
supported_layers = plugin.get_supported_layers(net)
not_supported_layers = [l for l in net.layers.keys() if l not in supported_layers]
if len(not_supported_layers) != 0:
@@ -88,7 +88,7 @@ def main():
net.batch_size = len(args.input)
# Read and pre-process input images
n, c, h, w = net.inputs[input_blob]
n, c, h, w = net.inputs[input_blob].shape
images = np.ndarray(shape=(n, c, h, w))
for i in range(n):
image = cv2.imread(args.input[i])

View File

@@ -75,7 +75,11 @@ public:
CNNNetwork getNetwork() {
// network obj are to be updated upon this call
if (network.get() == nullptr) {
network.reset(new CNNNetwork(actual));
try {
network.reset(new CNNNetwork(actual));
} catch (...) {
THROW_IE_EXCEPTION << "Could not allocate memory";
}
}
return *network.get();
}

View File

@@ -191,7 +191,7 @@ public:
* @brief Helper method to collect all input shapes with the names of the corresponding Data objects
* @return Map of pairs: input's name and its dimension.
*/
virtual ICNNNetwork::InputShapes getInputShapes() {
virtual ICNNNetwork::InputShapes getInputShapes() const {
ICNNNetwork::InputShapes shapes;
InputsDataMap inputs;
actual->getInputsInfo(inputs);
@@ -207,6 +207,10 @@ public:
return std::move(shapes);
}
/**
* @brief Run shape inference with new input shapes for the network
* @param inputShapes - map of pairs: name of corresponding data and its dimension.
*/
virtual void reshape(const ICNNNetwork::InputShapes &inputShapes) {
CALL_STATUS_FNC(reshape, inputShapes);
}
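A short usage sketch of the two wrapper methods above: getInputShapes() collects the current input dimensions and reshape() re-runs shape inference with modified ones (doubling the spatial dimensions is just an illustration):

```cpp
#include <cpp/ie_cnn_network.h>

// 'network' is assumed to come from CNNNetReader::getNetwork().
void doubleSpatialDims(InferenceEngine::CNNNetwork &network) {
    InferenceEngine::ICNNNetwork::InputShapes shapes = network.getInputShapes();
    for (auto &item : shapes) {
        InferenceEngine::SizeVector &dims = item.second;
        if (dims.size() == 4) {   // NCHW input: scale H and W
            dims[2] *= 2;
            dims[3] *= 2;
        }
    }
    network.reshape(shapes);      // shape inference runs with the new shapes
}
```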

View File

@@ -51,4 +51,4 @@ class MemoryState {
}
};
} // namespace InferenceEngine
} // namespace InferenceEngine

View File

@@ -11,6 +11,9 @@
#include <set>
#include <cctype>
namespace InferenceEngine {
namespace details {
/**
* @brief provides case-less comparison for stl algorithms
* @tparam Key type, usually std::string
@@ -73,3 +76,6 @@ using caseless_map = std::map<Key, Value, CaselessLess<Key>>;
template <class Key>
using caseless_set = std::set<Key, CaselessLess<Key>>;
} // namespace details
} // namespace InferenceEngine
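For illustration, a small sketch of the case-insensitive containers declared in this header (assuming it is included as details/caseless.hpp):

```cpp
#include <details/caseless.hpp>
#include <iostream>
#include <string>

int main() {
    InferenceEngine::details::caseless_map<std::string, int> layerCounts;
    layerCounts["Convolution"] = 3;
    // Lookups are case-insensitive, so "convolution" finds the same entry.
    std::cout << layerCounts.at("convolution") << std::endl;  // prints 3
    return 0;
}
```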

View File

@@ -90,7 +90,7 @@ class CNNNetworkIterator {
*/
const CNNLayerPtr &operator*() const {
if (nullptr == currentLayer) {
THROW_IE_EXCEPTION << "iterator of ouf bound";
THROW_IE_EXCEPTION << "iterator out of bound";
}
return currentLayer;
}

View File

@@ -0,0 +1,21 @@
// Copyright (C) 2018 Intel Corporation
//
// SPDX-License-Identifier: Apache-2.0
//
/**
* @brief A header file for CNNNetwork tools
* @file ie_cnn_network_tools.h
*/
#pragma once
#include <vector>
#include "ie_common.h"
#include "ie_icnn_network.hpp"
namespace InferenceEngine {
namespace details {
INFERENCE_ENGINE_API_CPP(std::vector<CNNLayerPtr>) CNNNetSortTopologically(const ICNNNetwork & network);
} // namespace details
} // namespace InferenceEngine
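The new helper returns layers in topological order. A hedged sketch of walking a network front to back with it:

```cpp
#include <ie_cnn_network_tools.h>
#include <ie_layers.h>
#include <iostream>

// 'network' is assumed to be a parsed ICNNNetwork (e.g. obtained via CNNNetReader).
void printExecutionOrder(const InferenceEngine::ICNNNetwork &network) {
    std::vector<InferenceEngine::CNNLayerPtr> sorted =
        InferenceEngine::details::CNNNetSortTopologically(network);
    for (const auto &layer : sorted) {
        std::cout << layer->name << " (" << layer->type << ")" << std::endl;
    }
}
```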

View File

@@ -68,6 +68,7 @@ inline void extract_exception(StatusCode status, char *msg) {
case RESULT_NOT_READY:throw ResultNotReady(msg);
case NOT_ALLOCATED:throw NotAllocated(msg);
case INFER_NOT_STARTED:throw InferNotStarted(msg);
case NETWORK_NOT_READ:throw NetworkNotRead(msg);
default:THROW_IE_EXCEPTION << msg;
}
}

View File

@@ -22,4 +22,4 @@ class NoReleaseOn : public T {
};
} // namespace details
} // namespace InferenceEngine
} // namespace InferenceEngine

View File

@@ -84,4 +84,4 @@ std::shared_ptr<IAllocator> make_pre_allocator(T *ptr, size_t size) {
}
} // namespace details
} // namespace InferenceEngine
} // namespace InferenceEngine

View File

@@ -29,6 +29,7 @@ namespace HeteroConfigParams {
* This option should be used with values: CONFIG_VALUE(NO) (default) or CONFIG_VALUE(YES)
*/
DECLARE_HETERO_CONFIG_KEY(DUMP_GRAPH_DOT);
DECLARE_HETERO_CONFIG_KEY(DUMP_DLA_MESSAGES);
} // namespace HeteroConfigParams
} // namespace InferenceEngine
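A sketch of enabling the graph dump from C++; the device string, plugin search path, and header location are assumptions based on the declarations above:

```cpp
#include <hetero/hetero_plugin_config.hpp>
#include <ie_plugin_config.hpp>
#include <ie_plugin_dispatcher.hpp>
#include <cpp/ie_plugin_cpp.hpp>

InferenceEngine::InferencePlugin loadHeteroPlugin() {
    InferenceEngine::PluginDispatcher dispatcher({""});
    InferenceEngine::InferencePlugin plugin(dispatcher.getPluginByDevice("HETERO:FPGA,CPU"));
    // Dump hetero affinity graphs as .dot files into the working directory.
    plugin.SetConfig({{InferenceEngine::HeteroConfigParams::KEY_HETERO_DUMP_GRAPH_DOT,
                       CONFIG_VALUE(YES)}});
    return plugin;
}
```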

View File

@@ -120,7 +120,7 @@ public:
: tensorDesc(p, SizeVector(dims.rbegin(), dims.rend()), l) {}
/**
* @depricated It works with reversed dimensions. Please create a new blob if you want to change a size.
* @deprecated It works with reversed dimensions. Please create a new blob if you want to change a size.
* @brief Changes Tensor size to the specified dimensions. If it was allocated, the previous data is deallocated and lost.
* @param dims New dimensions to set
* @param layout New layout to set
@@ -290,11 +290,16 @@ public:
* @param data_size Length of the pre-allocated array. If not set, size is assumed equal
* to the dot product of dims.
*/
TBlob(const TensorDesc& tensorDesc, T* ptr, size_t data_sze = 0): Blob(tensorDesc) {
if (size() != 0 && ptr == nullptr) {
TBlob(const TensorDesc& tensorDesc, T* ptr, size_t data_size = 0): Blob(tensorDesc) {
if (data_size == 0) {
data_size = size();
}
if (data_size != 0 && ptr == nullptr) {
THROW_IE_EXCEPTION << "Using Blob on external nullptr memory";
}
_allocator = details::make_pre_allocator(ptr, size());
_allocator = details::make_pre_allocator(ptr, data_size);
// blob on attached memory is always allocated, so we are not forcing the user to call allocate()
allocate();
}
@@ -327,11 +332,14 @@ public:
* @param ptr Pointer to the pre-allocated memory
* @param data_size Length of the pre-allocated array. If not set, size is assumed equal to dot product of dims.
*/
TBlob(Precision p, Layout l, const SizeVector& dims, T* ptr, size_t data_sze = 0) : Blob(p, l, dims) {
if (size() != 0 && ptr == nullptr) {
TBlob(Precision p, Layout l, const SizeVector& dims, T* ptr, size_t data_size = 0) : Blob(p, l, dims) {
if (data_size == 0) {
data_size = size();
}
if (data_size != 0 && ptr == nullptr) {
THROW_IE_EXCEPTION << "Using Blob on external nullptr memory";
}
_allocator = details::make_pre_allocator(ptr, size());
_allocator = details::make_pre_allocator(ptr, data_size);
// blob on attached memory is always allocated, so we are not forcing user to call allocate
allocate();
}
@@ -416,7 +424,10 @@ public:
if (tensorDesc.getDims().size() == 0) {
tensorDesc.setDims({static_cast<unsigned int>(that.size())});
}
allocate();
// minimisation of reallocations
if (_handle == nullptr) {
allocate();
}
auto memptr = data();
memcpy(memptr, that.data(), product(tensorDesc.getDims()) * sizeof(T));
}
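The corrected data_size handling matters when a blob wraps caller-owned memory: the pre-allocator is now sized from the argument (or from the tensor size when the argument is omitted) instead of always from size(). A minimal sketch:

```cpp
#include <ie_blob.h>
#include <iostream>
#include <vector>

int main() {
    std::vector<float> external(1 * 3 * 224 * 224);   // caller-owned buffer
    InferenceEngine::TensorDesc desc(InferenceEngine::Precision::FP32,
                                     {1, 3, 224, 224},
                                     InferenceEngine::Layout::NCHW);
    // Wraps the existing memory without copying; no separate allocate() call is needed.
    InferenceEngine::TBlob<float> blob(desc, external.data(), external.size());
    std::cout << blob.size() << std::endl;             // 150528 elements, backed by 'external'
    return 0;
}
```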

View File

@@ -67,6 +67,12 @@ union UserValue {
void *v_ptr;
};
enum CellType {
ORIG,
LSTM,
GRU
};
/**
* @enum Layout
* @brief Layouts that the inference engine supports
@@ -94,6 +100,29 @@ enum Layout : uint8_t {
BLOCKED = 200,
};
inline std::ostream & operator << (std::ostream &out, const Layout & p) {
switch (p) {
#define PRINT_LAYOUT(name)\
case name : out << #name; break;
PRINT_LAYOUT(ANY);
PRINT_LAYOUT(NCHW);
PRINT_LAYOUT(NHWC);
PRINT_LAYOUT(OIHW);
PRINT_LAYOUT(C);
PRINT_LAYOUT(CHW);
PRINT_LAYOUT(HW);
PRINT_LAYOUT(NC);
PRINT_LAYOUT(CN);
PRINT_LAYOUT(BLOCKED);
#undef PRINT_LAYOUT
default:
out << static_cast<int>(p);
break;
}
return out;
}
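With the new stream operator a Layout value prints as its name, which is handy in logs; for example:

```cpp
#include <ie_common.h>
#include <iostream>

int main() {
    InferenceEngine::Layout layout = InferenceEngine::NCHW;
    std::cout << layout << std::endl;                    // prints "NCHW"
    std::cout << InferenceEngine::BLOCKED << std::endl;  // prints "BLOCKED"
    return 0;
}
```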
/**
* @struct InferenceEngineProfileInfo
@@ -157,7 +186,8 @@ enum StatusCode : int {
REQUEST_BUSY = -8,
RESULT_NOT_READY = -9,
NOT_ALLOCATED = -10,
INFER_NOT_STARTED = -11
INFER_NOT_STARTED = -11,
NETWORK_NOT_READ = -12
};
/**
@@ -216,6 +246,10 @@ class InferNotStarted : public std::logic_error
{ using std::logic_error::logic_error; };
} // namespace InferenceEngine
/** @brief This class represents StatusCode::NETWORK_NOT_READ exception */
class NetworkNotRead : public std::logic_error
{ using std::logic_error::logic_error; };
#if defined(_WIN32)
#define __PRETTY_FUNCTION__ __FUNCSIG__
#else

View File

@@ -104,7 +104,7 @@ public:
* Batch is defined as the last element in the dimensions vector.
* @param batch_size Batch size to set
*/
inline void setBatchSize(size_t batch_size);
void setBatchSize(size_t batch_size);
/**
* @brief Sets the layout value for this Data instance

View File

@@ -11,7 +11,6 @@
#include "details/ie_so_pointer.hpp"
#include "ie_iextension.h"
#include "mkldnn/mkldnn_extension_ptr.hpp"
#include <string>
#include <memory>
#include <map>
@@ -166,8 +165,8 @@ public:
* @param resp Response descriptor
* @return Status code
*/
StatusCode getPrimitiveTypes(char**& types, unsigned int& size, ResponseDesc* resp) noexcept override {
return actual->getPrimitiveTypes(types, size, resp);
StatusCode getShapeInferTypes(char**& types, unsigned int& size, ResponseDesc* resp) noexcept override {
return actual->getShapeInferTypes(types, size, resp);
}
/**
@@ -204,11 +203,7 @@ inline std::shared_ptr<IShapeInferExtension> make_so_pointer(const std::string &
*/
template<>
inline std::shared_ptr<IExtension> make_so_pointer(const std::string &name) {
try {
return std::make_shared<Extension>(name);
} catch (InferenceEngine::details::InferenceEngineException& ex) {
return std::make_shared<MKLDNNPlugin::MKLDNNExtension>(name);
}
return std::make_shared<Extension>(name);
}
} // namespace InferenceEngine
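With the legacy MKL-DNN fallback removed, extensions are loaded only through the IExtension path. A hedged sketch of loading a custom-layer library and registering it with a plugin (the library name is a placeholder):

```cpp
#include <ie_extension.h>
#include <cpp/ie_plugin_cpp.hpp>

void registerExtension(InferenceEngine::InferencePlugin &plugin) {
    auto extension =
        InferenceEngine::make_so_pointer<InferenceEngine::IExtension>("libcpu_extension.so");
    plugin.AddExtension(extension);   // makes the custom kernels visible to the plugin
}
```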

View File

@@ -29,6 +29,7 @@ class ICNNNetReader : public details::IRelease {
public:
/**
* @brief Parses the topology part of the IR (.xml)
* This method can be called only once to read a network. To read another network instance, create a new reader instance.
* @param filepath The full path to the .xml file of the IR
* @param resp Response message
* @return Result code
@@ -37,6 +38,7 @@ public:
/**
* @brief Parses the topology part of the IR (.xml) given the xml as a buffer
* This method can be called only once to read a network. To read another network instance, create a new reader instance.
* @param model Pointer to a char array with the IR
* @param resp Response message
* @param size Size of the char array in bytes

View File

@@ -17,6 +17,7 @@
#include "details/ie_irelease.hpp"
#include "ie_preprocess.hpp"
#include "ie_input_info.hpp"
#include "ie_icnn_network_stats.hpp"
#include "ie_iextension.h"
#include <memory>
#include <map>
@@ -28,6 +29,8 @@ namespace InferenceEngine {
* @brief A collection that contains string as key, and Data smart pointer as value
*/
using OutputsDataMap = std::map<std::string, DataPtr>;
class IShapeInferExtension;
using IShapeInferExtensionPtr = std::shared_ptr<IShapeInferExtension>;
/**
* @brief This is the main interface to describe the NN topology
@@ -143,9 +146,10 @@ public:
* @note There are several limitations and it's not recommended to use it. Set batch to the input shape and call @reshape.
* @param size Size of batch to set
* @return Status code of the operation
* @note: Current implementation of the function sets batch size to the first dimension of 4D input layers in the networks
* and starts shape inference for IR starting from v3, for IR v2 it sets batch to the first dimension for all layers.
* Custom layers might require custom shape infer implementation, use @IShapeInferExtension interface to register them.
* @note: The current implementation of the function sets batch size to the first dimension of all layers in the network.
* Before calling it, make sure that all your layers have the batch in the first dimension; otherwise the method works incorrectly.
* This limitation is resolved via [Shape Inference feature](./docs/Inference_Engine_Developer_Guide/ShapeInference.md)
* by using InferenceEngine::ICNNNetwork::reshape method.
*/
virtual StatusCode setBatchSize(size_t size, ResponseDesc* responseDesc) noexcept = 0;
@@ -161,12 +165,11 @@ public:
using InputShapes = std::map<std::string, SizeVector>;
/**
* @brief - Run shape inference with new input shapes for the network
* @param inputShapes - map of pairs: name of corresponding data and its dimension.
* @note currently all inputs are required
* @param resp Pointer to the response message that holds a description of an error if any occurred
* @return Status code of the operation
*/
* @brief Run shape inference with new input shapes for the network
* @param inputShapes - map of pairs: name of corresponding data and its dimension.
* @param resp Pointer to the response message that holds a description of an error if any occurred
* @return Status code of the operation
*/
virtual StatusCode reshape(const InputShapes& inputShapes, ResponseDesc* resp) noexcept { return NOT_IMPLEMENTED; };
/**
@@ -177,5 +180,7 @@ public:
*/
virtual StatusCode
AddExtension(const IShapeInferExtensionPtr& extension, ResponseDesc* resp) noexcept { return NOT_IMPLEMENTED; };
virtual StatusCode getStats(ICNNNetworkStats** stats, ResponseDesc* resp) const noexcept { return NOT_IMPLEMENTED; };
};
} // namespace InferenceEngine

View File

@@ -13,37 +13,39 @@
#include <memory>
#include <limits>
#include <vector>
#include <map>
#include "details/ie_irelease.hpp"
namespace InferenceEngine {
class NetworkNodeStats;
using NetworkNodeStatsPtr = std::shared_ptr<NetworkNodeStats>;
using NetworkNodeStatsWeakPtr = std::weak_ptr<NetworkNodeStats>;
using NetworkStatsMap = std::map<std::string, NetworkNodeStatsPtr>;
/**
* @class ICNNNetworkStats
* @brief This is the interface to describe the NN topology scoring statistics
*/
class ICNNNetworkStats : public details::IRelease {
public:
virtual void SaveToFile(const std::string& xmlPath, const std::string& binPath) const = 0;
virtual void LoadFromFile(const std::string& xmlPath, const std::string& binPath) = 0;
virtual void setNodesStats(const NetworkStatsMap& stats) = 0;
virtual const NetworkStatsMap& getNodesStats() const = 0;
virtual bool isEmpty() const = 0;
};
class NetworkNodeStats;
using NetworkNodeStatsPtr = std::shared_ptr<NetworkNodeStats>;
using NetworkNodeStatsWeakPtr = std::weak_ptr<NetworkNodeStats>;
class NetworkNodeStats {
public:
NetworkNodeStats() { }
explicit NetworkNodeStats(int statCount) {
float min = std::numeric_limits<float>::max();
float max = std::numeric_limits<float>::min();
float mn = (std::numeric_limits<float>::max)();
float mx = (std::numeric_limits<float>::min)();
for (int i = 0; i < statCount; i++) {
_minOutputs.push_back(min);
_maxOutputs.push_back(max);
_minOutputs.push_back(mn);
_maxOutputs.push_back(mx);
}
}

View File

@@ -22,7 +22,6 @@
#include "details/ie_no_copy.hpp"
#if defined(_WIN32) && defined(IMPLEMENT_INFERENCE_EXTENSION_API)
#define INFERENCE_EXTENSION_API(TYPE) extern "C" __declspec(dllexport) TYPE
#else
@@ -137,7 +136,9 @@ public:
* @return Status code
*/
virtual StatusCode getShapes(const std::vector<TensorDesc>& inShapes, std::vector<TensorDesc>& outShapes,
ResponseDesc* resp) noexcept = 0;
ResponseDesc* resp) noexcept {
return NOT_IMPLEMENTED;
}
/**
* @brief Gets all possible implementations for the given cnn Layer
@@ -156,6 +157,8 @@ class IShapeInferImpl {
public:
using Ptr = std::shared_ptr<IShapeInferImpl>;
virtual ~IShapeInferImpl() = default;
/**
* @brief check that reshape can be applied, that parameters and shapes are valid
*/
@@ -191,13 +194,13 @@ public:
virtual void Unload() noexcept = 0;
/**
* @brief Gets the array with types of layers which are included in the extension
* @brief Fills the passed array with types of layers whose shape inference implementations are included in the extension
* @param types Array to store the layer types
* @param size Size of the layer types array
* @param resp Response descriptor
* @return Status code
*/
virtual StatusCode getPrimitiveTypes(char**& types, unsigned int& size, ResponseDesc* resp) noexcept = 0;
virtual StatusCode getShapeInferTypes(char**& types, unsigned int& size, ResponseDesc* resp) noexcept = 0;
/**
* @brief Gets shape propagation implementation for the given string-type of cnn Layer
@@ -218,9 +221,20 @@ public:
virtual StatusCode getFactoryFor(ILayerImplFactory*& factory, const CNNLayer* cnnLayer,
ResponseDesc* resp) noexcept = 0;
StatusCode getShapeInferImpl(IShapeInferImpl::Ptr& impl,
const char* type,
ResponseDesc* resp) noexcept override {
/**
* @brief Fills the passed array with types of layers whose kernel implementations are included in the extension
* @param types Array to store the layer types
* @param size Size of the layer types array
* @param resp Response descriptor
* @return Status code
*/
virtual StatusCode getPrimitiveTypes(char**& types, unsigned int& size, ResponseDesc* resp) noexcept = 0;
StatusCode getShapeInferTypes(char**& types, unsigned int& size, ResponseDesc* resp) noexcept override {
return NOT_IMPLEMENTED;
};
StatusCode getShapeInferImpl(IShapeInferImpl::Ptr& impl, const char* type, ResponseDesc* resp) noexcept override {
return NOT_IMPLEMENTED;
};
};

View File

@@ -84,6 +84,8 @@ public:
QueryNetworkResult &res) noexcept {
QueryNetwork(device, network, res);
};
virtual void SetLogCallback(IErrorListener &listener) = 0;
};
using MapDeviceLoaders = std::map<std::string, InferenceEngine::IHeteroDeviceLoader::Ptr>;

View File

@@ -58,4 +58,4 @@ class IMemoryState : public details::no_copy {
virtual StatusCode GetLastState(Blob::CPtr & lastState, ResponseDesc *resp) const noexcept = 0;
};
} // namespace InferenceEngine
} // namespace InferenceEngine

View File

@@ -20,6 +20,9 @@
#include "ie_data.h"
#include "ie_blob.h"
#include "ie_device.hpp"
#include "ie_layers_property.hpp"
#include "ie_icnn_network.hpp"
namespace InferenceEngine {
/**
@@ -459,108 +462,92 @@ public:
using CNNLayer::CNNLayer;
};
/**
* @brief Convenient way to declare a property with backward compatibility to 2D members
*/
#define DEFINE_PROP(prop_name) \
PropertyVector<unsigned int> prop_name;\
unsigned int &prop_name##_x = prop_name.at(X_AXIS);\
unsigned int &prop_name##_y = prop_name.at(Y_AXIS);\
/**
* @brief This class represents a standard 3D Convolution Layer
*/
class ConvolutionLayer : public WeightableLayer {
public:
/**
* @brief A convolution kernel width
* @brief A convolution kernel array [X, Y, Z, ...]
*/
unsigned int _kernel_x = 0;
DEFINE_PROP(_kernel);
/**
* @brief A convolution kernel height
* @brief A convolution paddings begin array [X, Y, Z, ...]
*/
unsigned int _kernel_y = 0;
DEFINE_PROP(_padding);
/**
* @brief An input convolution stride width
* @brief A convolution paddings end array [X, Y, Z, ...]
*/
unsigned int _stride_x = 1;
PropertyVector<unsigned int> _pads_end;
/**
* @brief An Input convolution stride height
* @brief A convolution strides array [X, Y, Z, ...]
*/
unsigned int _stride_y = 1;
DEFINE_PROP(_stride);
/**
* @brief A convolution dilations array [X, Y, Z, ...]
*/
DEFINE_PROP(_dilation);
/**
* @brief A number of output feature maps (size) generating the 3'rd output dimension
*/
unsigned int _out_depth = 0;
/**
* @brief Input padding width
*/
unsigned int _padding_x = 0;
/**
* @brief Input padding height
*/
unsigned int _padding_y = 0;
/**
* @brief Dilation width
*/
unsigned int _dilation_x = 1;
/**
* @brief Dilation height
*/
unsigned int _dilation_y = 1;
unsigned int _out_depth = 0u;
/**
* @brief Number of groups
*/
unsigned int _group = 1;
unsigned int _group = 1u;
/**
* @brief Creates a new ConvolutionLayer instance.
*/
using WeightableLayer::WeightableLayer;
explicit ConvolutionLayer(const LayerParams &p) : WeightableLayer(p),
_kernel(2, 0u), _padding(2, 0u), _stride(2, 1u), _dilation(2, 1u) {}
/**
* @brief assignment operator
*/
ConvolutionLayer & operator = (const ConvolutionLayer & that) {
if (&that != this) {
WeightableLayer::operator=(that);
_kernel = that._kernel;
_padding = that._padding;
_pads_end = that._pads_end;
_stride = that._stride;
_dilation = that._dilation;
_out_depth = that._out_depth;
_group = that._group;
}
return *this;
}
/**
* @brief move assignment operator
*/
ConvolutionLayer& operator = (ConvolutionLayer &&) = default;
/**
* @brief copy constructor
*/
ConvolutionLayer(const ConvolutionLayer & that) : WeightableLayer(that) {
operator = (that);
}
/**
* @brief move constructor
*/
ConvolutionLayer(ConvolutionLayer &&) = default;
};
/**
* @brief This class represents a standard deconvolution layer
*/
class DeconvolutionLayer : public WeightableLayer {
public:
/**
* @brief Deconvolution kernel width
*/
unsigned int _kernel_x = 0;
/**
* @brief Deconvolution kernel height
*/
unsigned int _kernel_y = 0;
/**
* @brief Input Deconvolution stride width
*/
unsigned int _stride_x = 0;
/**
* @brief Input Deconvolution stride height
*/
unsigned int _stride_y = 0;
/**
* @brief number of output feature maps (size) generating the 3'rd output dimension
*/
unsigned int _out_depth = 0;
/**
* @brief Input padding width
*/
unsigned int _padding_x = 0;
/**
* @brief Input padding height
*/
unsigned int _padding_y = 0;
/**
* @brief Dilation width
*/
unsigned int _dilation_x = 0;
/**
* @brief Dilation height
*/
unsigned int _dilation_y = 0;
/**
* @brief Number of groups
*/
unsigned int _group = 0;
/**
* @brief Creates a new DeconvolutionLayer instance.
*/
using WeightableLayer::WeightableLayer;
class DeconvolutionLayer : public ConvolutionLayer {
public:
using ConvolutionLayer::ConvolutionLayer;
using ConvolutionLayer::operator=;
};
/**
@@ -569,29 +556,21 @@ public:
class PoolingLayer : public CNNLayer {
public:
/**
* @brief Pooling kernel width
* @brief Pooling kernel array [X, Y, Z, ...]
*/
unsigned int _kernel_x = 0;
DEFINE_PROP(_kernel);
/**
* @brief Pooling kernel height
* @brief Pooling paddings begin array [X, Y, Z, ...]
*/
unsigned int _kernel_y = 0;
DEFINE_PROP(_padding);
/**
* @brief Input Pooling stride width
* @brief Pooling paddings end array [X, Y, Z, ...]
*/
unsigned int _stride_x = 0;
PropertyVector<unsigned int> _pads_end;
/**
* @brief Input Pooling stride height
* @brief Pooling strides array [X, Y, Z, ...]
*/
unsigned int _stride_y = 0;
/**
* @brief Input padding width
*/
unsigned int _padding_x = 0;
/**
* @brief Input padding height
*/
unsigned int _padding_y = 0;
DEFINE_PROP(_stride);
/**
* @enum PoolType
@@ -618,9 +597,44 @@ public:
/**
* @brief Creates a new PoolingLayer instance.
*/
using CNNLayer::CNNLayer;
explicit PoolingLayer(const LayerParams &p) : CNNLayer(p),
_kernel(2, 0u), _padding(2, 0u), _stride(2, 0u) {}
/**
* @brief assignment operator
*/
PoolingLayer & operator = (const PoolingLayer & that) {
if (&that != this) {
CNNLayer::operator=(that);
_kernel = that._kernel;
_padding = that._padding;
_pads_end = that._pads_end;
_stride = that._stride;
_type = that._type;
_exclude_pad = that._exclude_pad;
}
return *this;
}
/**
* @brief move assignment operator
*/
PoolingLayer& operator = (PoolingLayer &&) = default;
/**
* @brief copy constructor
*/
PoolingLayer(const PoolingLayer & that) : CNNLayer(that) {
operator=(that);
}
/**
* @brief move constructor
*/
PoolingLayer(PoolingLayer &&) = default;
};
#undef DEFINE_PROP
/**
* @brief This class represents a fully connected layer
*/
@@ -836,7 +850,6 @@ public:
*/
std::vector<int> axis;
/**
* @deprecated result size is defined by second input
* @brief A vector of dimensions to be preserved
*/
std::vector<int> dim;
@@ -912,6 +925,66 @@ public:
using WeightableLayer::WeightableLayer;
};
/**
* @brief This class represents RNN sequence layer
*/
class RNNLayer : public WeightableLayer {
public:
CellType cellType;
/**
* @brief An axis by which iteration is performed. Axis=0 means first input blob dimension is sequence, axis=1 means first dimension is batch.
*/
unsigned int _axis = 1;
using WeightableLayer::WeightableLayer;
/**
* @brief Creates a new RNNLayer instance.
*/
explicit RNNLayer(const LayerParams &p) : WeightableLayer(p) {}
};
/**
* @brief This class represents LSTMCell pseudo-layer to be used in TensorIterator
*/
class LSTMCell : public WeightableLayer {
public:
using WeightableLayer::WeightableLayer;
};
class ICNNNetReader;
/**
* @brief This class represents TensorIterator layer
*/
class TensorIterator : public CNNLayer {
public:
using CNNNetReaderPtr = std::shared_ptr<ICNNNetReader>;
CNNNetReaderPtr reader;
struct BackEdge {
int fromLayer;
int fromPort;
int toLayer;
int toPort;
};
struct Port {
int external_port_id;
int internal_layer_id;
int internal_port_id;
int axis;
int part_size;
int stride;
};
std::vector<Port> input_ports;
std::vector<Port> output_ports;
std::vector<BackEdge> backEdges;
using CNNLayer::CNNLayer;
};
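With DEFINE_PROP the old 2D fields become references into per-axis property vectors, so existing _kernel_x-style code keeps working while new code indexes by axis. A small sketch (the layer name is arbitrary):

```cpp
#include <ie_layers.h>
#include <iostream>

int main() {
    InferenceEngine::LayerParams params{"conv1", "Convolution", InferenceEngine::Precision::FP32};
    InferenceEngine::ConvolutionLayer conv(params);

    // N-D style: index the property vectors by axis.
    conv._kernel[InferenceEngine::X_AXIS] = 3;
    conv._kernel[InferenceEngine::Y_AXIS] = 3;
    conv._pads_end.insert(InferenceEngine::X_AXIS, 1);
    conv._pads_end.insert(InferenceEngine::Y_AXIS, 1);

    // Backward-compatible aliases: _kernel_x is a reference to _kernel[X_AXIS].
    std::cout << conv._kernel_x << "x" << conv._kernel_y << std::endl;  // prints 3x3
    return 0;
}
```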
/**
* @class PReLULayer
* @brief This class represents a Layer which performs Scale and Shift

View File

@@ -0,0 +1,125 @@
// Copyright (C) 2018 Intel Corporation
//
// SPDX-License-Identifier: Apache-2.0
//
/**
* @brief A header file describing the property-style structure used by CNNLayers
* @file ie_layers_property.hpp
*/
#pragma once
namespace InferenceEngine {
constexpr const int MAX_DIMS_NUMBER = 12;
enum eDIMS_AXIS : uint8_t {
X_AXIS = 0,
Y_AXIS,
Z_AXIS
};
template<class T, int N = MAX_DIMS_NUMBER>
class PropertyVector {
T _axises[N] = {};
bool _allocated[N] = {};
size_t _length = 0;
public:
PropertyVector() = default;
PropertyVector(size_t len, T val) {
if (len > N) {
THROW_IE_EXCEPTION << "Property size exceeded the limit of: " << N;
}
for (int i = 0; i < len; i++) {
_axises[i] = val;
_allocated[i] = true;
}
_length = len;
}
/**
* @brief Allows access up to the capacity size
* @param index Axis index to access
* @return Reference to the stored value at the given index
*/
T &at(int index) {
if (index >= N) {
THROW_IE_EXCEPTION << "Property index is out of bounds (" << index << "/" << N << ")";
}
return _axises[index];
}
const T &operator[](size_t index) const {
if (index >= N || !_allocated[index]) {
THROW_IE_EXCEPTION << "Property index (" << index << ") is out of bounds";
}
return _axises[index];
}
T &operator[](size_t index) {
if (index >= N || !_allocated[index]) {
THROW_IE_EXCEPTION << "Property index (" << index << ") is out of bounds";
}
return _axises[index];
}
PropertyVector &operator=(const PropertyVector &src) {
if (this != &src) {
_length = src.size();
for (size_t i = 0; i < N; i++) {
_allocated[i] = src._allocated[i];
if (_allocated[i]) {
_axises[i] = src[i];
}
}
}
return *this;
}
bool operator==(const PropertyVector& src) const {
if (this == &src) return true;
if (_length != src.size()) return false;
for (size_t i = 0; i < N; i++)
if ((_allocated[i] != src._allocated[i]) ||
(_allocated[i] && _axises[i] != src._axises[i])) return false;
return true;
}
size_t size() const {
return _length;
}
void insert(size_t axis, const T &val) {
if (axis < N) {
if (!_allocated[axis]) {
_allocated[axis] = true;
_length++;
}
_axises[axis] = val;
} else {
THROW_IE_EXCEPTION << "Layer Property insertion at(axis) should be in [0, " << N << ")";
}
}
void remove(size_t axis) {
if (axis < N && _allocated[axis]) {
_allocated[axis] = false;
_length--;
}
}
void clear() {
for (int i = 0; i != N; i++) {
_allocated[i] = 0;
}
_length = 0u;
}
bool exist(size_t axis) const {
return (axis < N && _allocated[axis]);
}
};
} // namespace InferenceEngine
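A brief sketch of the PropertyVector semantics above: axes are allocated explicitly, size() counts allocated axes, and unallocated or out-of-range access throws (the header is pulled in via ie_layers.h):

```cpp
#include <ie_layers.h>
#include <iostream>

int main() {
    InferenceEngine::PropertyVector<unsigned int> pads(2, 0u);  // X and Y allocated with value 0
    pads.insert(InferenceEngine::Z_AXIS, 1);                    // grow to a third axis
    std::cout << pads.size() << std::endl;                      // prints 3
    std::cout << pads[InferenceEngine::Z_AXIS] << std::endl;    // prints 1
    std::cout << pads.exist(3) << std::endl;                    // prints 0: axis 3 was never allocated
    return 0;
}
```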

View File

@@ -0,0 +1,354 @@
// Copyright (C) 2018 Intel Corporation
//
// SPDX-License-Identifier: Apache-2.0
//
/**
* @brief Contains declarations and definitions for sequential and multi-threading implementations.
* Multi-threading support is implemented in two variants: using the Threading Building Blocks library and OpenMP* product.
* To build a particular implementation, use the corresponding identifier: IE_THREAD_TBB, IE_THREAD_OMP or IE_THREAD_SEQ.
* @file ie_parallel.hpp
*/
#pragma once
#define IE_THREAD_TBB 0
#define IE_THREAD_OMP 1
#define IE_THREAD_SEQ 2
#if IE_THREAD == IE_THREAD_TBB
#include "tbb/parallel_for.h"
#include "tbb/task_arena.h"
#include "tbb/parallel_reduce.h"
#include "tbb/blocked_range.h"
#include "tbb/blocked_range2d.h"
inline int parallel_get_max_threads() { return tbb::this_task_arena::max_concurrency(); }
inline int parallel_get_num_threads() { return parallel_get_max_threads(); }
inline int parallel_get_thread_num() { return tbb::this_task_arena::current_thread_index(); }
inline void parallel_set_num_threads(int n) { return; }
#elif IE_THREAD == IE_THREAD_OMP
#include <omp.h>
/* MSVC still supports omp 2.0 only */
#if defined(_MSC_VER) && !defined(__INTEL_COMPILER)
# define collapse(x)
#endif // defined(_MSC_VER) && !defined(__INTEL_COMPILER)
inline int parallel_get_max_threads() { return omp_get_max_threads(); }
inline int parallel_get_num_threads() { return omp_get_num_threads(); }
inline int parallel_get_thread_num() { return omp_get_thread_num(); }
inline void parallel_set_num_threads(int n) { omp_set_num_threads(n); }
#elif IE_THREAD == IE_THREAD_SEQ
inline int parallel_get_max_threads() { return 1; }
inline int parallel_get_num_threads() { return 1; }
inline int parallel_get_thread_num() { return 0; }
inline void parallel_set_num_threads(int n) { return; }
#endif
namespace InferenceEngine {
template <typename F>
void parallel_nt(int nthr, F func) {
#if IE_THREAD == IE_THREAD_TBB
if (nthr == 0) nthr = parallel_get_max_threads();
if (nthr == 1) {
func(0, 1);
return;
}
tbb::parallel_for(0, nthr, [&](int ithr) {
func(ithr, nthr);
});
#elif IE_THREAD == IE_THREAD_OMP
if (nthr == 1) {
func(0, 1);
return;
}
# pragma omp parallel num_threads(nthr)
func(parallel_get_thread_num(), parallel_get_num_threads());
#elif IE_THREAD == IE_THREAD_SEQ
func(0, 1);
#endif
}
template <typename T0, typename R, typename F>
R parallel_sum(const T0 D0, R &input, F func) {
#if IE_THREAD == IE_THREAD_TBB
return tbb::parallel_reduce(
tbb::blocked_range<T0>(0, D0), input,
[&](const tbb::blocked_range<T0>& r, R init)->R {
R sum = init;
for (T0 dim1 = r.begin(); dim1 < r.end(); ++dim1)
sum += func(dim1);
return sum;
},
[](R x, R y)->R {
return x + y;
});
#else
R sum = input;
#if IE_THREAD == IE_THREAD_OMP
#pragma omp parallel for reduction(+ : sum) schedule(static)
#endif
for (T0 dim1 = 0; dim1 < D0; dim1++) {
sum += func(dim1);
}
return sum;
#endif
}
template <typename T0, typename T1, typename R, typename F>
R parallel_sum2d(const T0 D0, const T1 D1, R input, F func) {
#if IE_THREAD == IE_THREAD_TBB
return tbb::parallel_reduce(
tbb::blocked_range2d<T0, T1>(0, D0, 0, D1), input,
[&](const tbb::blocked_range2d<T0, T1>& r, R init)->R {
R sum = init;
for (T0 dim2 = r.rows().begin(); dim2 < r.rows().end(); dim2++) {
for (T1 dim1 = r.cols().begin(); dim1 < r.cols().end(); dim1++) {
sum += func(dim2, dim1);
}
}
return sum;
},
[](R x, R y)->R {
return x + y;
});
#else
R sum = input;
#if IE_THREAD == IE_THREAD_OMP
#pragma omp parallel for collapse(2) reduction(+ : sum) schedule(static)
#endif
for (T0 dim2 = 0; dim2 < D0; dim2++) {
for (T1 dim1 = 0; dim1 < D1; dim1++) {
sum += func(dim2, dim1);
}
}
return sum;
#endif
}
template<typename T>
inline T parallel_it_init(T start) { return start; }
template<typename T, typename Q, typename R, typename... Args>
inline T parallel_it_init(T start, Q &x, const R &X, Args &&... tuple) {
start = parallel_it_init(start, static_cast<Args>(tuple)...);
x = start % X;
return start / X;
}
inline bool parallel_it_step() { return true; }
template<typename Q, typename R, typename... Args>
inline bool parallel_it_step(Q &x, const R &X, Args &&... tuple) {
if (parallel_it_step(static_cast<Args>(tuple)...)) {
x = (x + 1) % X;
return x == 0;
}
return false;
}
template <typename T, typename Q>
inline void splitter(T n, Q team, Q tid, T &n_start, T &n_end) {
if (team <= 1 || n == 0) {
n_start = 0;
n_end = n;
} else {
T n1 = (n + (T)team - 1) / (T)team;
T n2 = n1 - 1;
T T1 = n - n2 * (T)team;
n_end = (T)tid < T1 ? n1 : n2;
n_start = (T)tid <= T1 ? tid * n1 : T1 * n1 + ((T)tid - T1) * n2;
}
n_end += n_start;
}
template <typename T0, typename F>
void for_1d(const int ithr, const int nthr, const T0 &D0, F func) {
T0 d0{ 0 }, end{ 0 };
splitter(D0, nthr, ithr, d0, end);
for (; d0 < end; ++d0) func(d0);
}
template <typename T0, typename F>
void parallel_for(const T0 &D0, F func) {
#if IE_THREAD == IE_THREAD_TBB
const int nthr = parallel_get_max_threads();
tbb::parallel_for(0, nthr, [&](int ithr) {
for_1d(ithr, nthr, D0, func);
});
#elif IE_THREAD == IE_THREAD_OMP
# pragma omp parallel
for_1d(parallel_get_thread_num(), parallel_get_num_threads(), D0, func);
#elif IE_THREAD == IE_THREAD_SEQ
for_1d(0, 1, D0, func);
#endif
}
template <typename T0, typename T1, typename F>
void for_2d(const int ithr, const int nthr, const T0 &D0, const T1 &D1, F func) {
const size_t work_amount = (size_t)D0 * D1;
if (work_amount == 0) return;
size_t start{ 0 }, end{ 0 };
splitter(work_amount, nthr, ithr, start, end);
T0 d0{ 0 }; T1 d1{ 0 };
parallel_it_init(start, d0, D0, d1, D1);
for (size_t iwork = start; iwork < end; ++iwork) {
func(d0, d1);
parallel_it_step(d0, D0, d1, D1);
}
}
template <typename T0, typename T1, typename F>
void parallel_for2d(const T0 &D0, const T1 &D1, F func) {
#if IE_THREAD == IE_THREAD_TBB
const int nthr = parallel_get_max_threads();
tbb::parallel_for(0, nthr, [&](int ithr) {
for_2d(ithr, nthr, D0, D1, func);
});
#elif IE_THREAD == IE_THREAD_OMP
# pragma omp parallel
for_2d(parallel_get_thread_num(), parallel_get_num_threads(), D0, D1, func);
#elif IE_THREAD == IE_THREAD_SEQ
for_2d(0, 1, D0, D1, func);
#endif
}
template <typename T0, typename T1, typename T2, typename F>
void for_3d(const int ithr, const int nthr, const T0 &D0, const T1 &D1,
const T2 &D2, F func) {
const size_t work_amount = (size_t)D0 * D1 * D2;
if (work_amount == 0) return;
size_t start{ 0 }, end{ 0 };
splitter(work_amount, nthr, ithr, start, end);
T0 d0{ 0 }; T1 d1{ 0 }; T2 d2{ 0 };
parallel_it_init(start, d0, D0, d1, D1, d2, D2);
for (size_t iwork = start; iwork < end; ++iwork) {
func(d0, d1, d2);
parallel_it_step(d0, D0, d1, D1, d2, D2);
}
}
template <typename T0, typename T1, typename T2, typename F>
void parallel_for3d(const T0 &D0, const T1 &D1, const T2 &D2, F func) {
#if IE_THREAD == IE_THREAD_TBB
const int nthr = parallel_get_max_threads();
tbb::parallel_for(0, nthr, [&](int ithr) {
for_3d(ithr, nthr, D0, D1, D2, func);
});
#elif IE_THREAD == IE_THREAD_OMP
# pragma omp parallel
for_3d(parallel_get_thread_num(), parallel_get_num_threads(), D0, D1, D2, func);
#elif IE_THREAD == IE_THREAD_SEQ
for_3d(0, 1, D0, D1, D2, func);
#endif
}
template <typename T0, typename T1, typename T2, typename T3, typename F>
void for_4d(const int ithr, const int nthr, const T0 &D0, const T1 &D1,
const T2 &D2, const T3 &D3, F func) {
const size_t work_amount = (size_t)D0 * D1 * D2 * D3;
if (work_amount == 0) return;
size_t start{ 0 }, end{ 0 };
splitter(work_amount, nthr, ithr, start, end);
T0 d0{ 0 }; T1 d1{ 0 }; T2 d2{ 0 }; T3 d3{ 0 };
parallel_it_init(start, d0, D0, d1, D1, d2, D2, d3, D3);
for (size_t iwork = start; iwork < end; ++iwork) {
func(d0, d1, d2, d3);
parallel_it_step(d0, D0, d1, D1, d2, D2, d3, D3);
}
}
template <typename T0, typename T1, typename T2, typename T3, typename F>
void parallel_for4d(const T0 &D0, const T1 &D1, const T2 &D2, const T3 &D3, F func) {
#if IE_THREAD == IE_THREAD_TBB
const int nthr = parallel_get_max_threads();
tbb::parallel_for(0, nthr, [&](int ithr) {
for_4d(ithr, nthr, D0, D1, D2, D3, func);
});
#elif IE_THREAD == IE_THREAD_OMP
# pragma omp parallel
for_4d(parallel_get_thread_num(), parallel_get_num_threads(), D0, D1, D2, D3, func);
#elif IE_THREAD == IE_THREAD_SEQ
for_4d(0, 1, D0, D1, D2, D3, func);
#endif
}
template <typename T0, typename T1, typename T2, typename T3, typename T4, typename F>
void for_5d(const int ithr, const int nthr, const T0 &D0, const T1 &D1,
const T2 &D2, const T3 &D3, const T4 &D4, F func) {
const size_t work_amount = (size_t)D0 * D1 * D2 * D3 * D4;
if (work_amount == 0) return;
size_t start{ 0 }, end{ 0 };
splitter(work_amount, nthr, ithr, start, end);
T0 d0{ 0 }; T1 d1{ 0 }; T2 d2{ 0 }; T3 d3{ 0 }; T4 d4{ 0 };
parallel_it_init(start, d0, D0, d1, D1, d2, D2, d3, D3, d4, D4);
for (size_t iwork = start; iwork < end; ++iwork) {
func(d0, d1, d2, d3, d4);
parallel_it_step(d0, D0, d1, D1, d2, D2, d3, D3, d4, D4);
}
}
template <typename T0, typename T1, typename T2, typename T3, typename T4, typename F>
void parallel_for5d(const T0 &D0, const T1 &D1, const T2 &D2, const T3 &D3,
const T4 &D4, F func) {
#if IE_THREAD == IE_THREAD_TBB
const int nthr = parallel_get_max_threads();
tbb::parallel_for(0, nthr, [&](int ithr) {
for_5d(ithr, nthr, D0, D1, D2, D3, D4, func);
});
#elif IE_THREAD == IE_THREAD_OMP
# pragma omp parallel
for_5d(parallel_get_thread_num(), parallel_get_num_threads(), D0, D1, D2, D3, D4, func);
#elif IE_THREAD == IE_THREAD_SEQ
for_5d(0, 1, D0, D1, D2, D3, D4, func);
#endif
}
template <typename T0, typename T1, typename T2, typename T3, typename T4, typename T5, typename F>
void for_6d(const int ithr, const int nthr, const T0 &D0, const T1 &D1,
const T2 &D2, const T3 &D3, const T4 &D4, const T5 &D5, F func) {
const size_t work_amount = (size_t)D0 * D1 * D2 * D3 * D4 * D5;
if (work_amount == 0) return;
size_t start{ 0 }, end{ 0 };
splitter(work_amount, nthr, ithr, start, end);
T0 d0{ 0 }; T1 d1{ 0 }; T2 d2{ 0 }; T3 d3{ 0 }; T4 d4{ 0 }; T5 d5{ 0 };
parallel_it_init(start, d0, D0, d1, D1, d2, D2, d3, D3, d4, D4,
d5, D5);
for (size_t iwork = start; iwork < end; ++iwork) {
func(d0, d1, d2, d3, d4, d5);
parallel_it_step(d0, D0, d1, D1, d2, D2, d3, D3, d4, D4, d5, D5);
}
}
template <typename T0, typename T1, typename T2, typename T3, typename T4, typename T5, typename F>
void parallel_for6d(const T0 &D0, const T1 &D1, const T2 &D2, const T3 &D3,
const T4 &D4, const T5 &D5, F func) {
#if IE_THREAD == IE_THREAD_TBB
const int nthr = parallel_get_max_threads();
tbb::parallel_for(0, nthr, [&](int ithr) {
for_6d(ithr, nthr, D0, D1, D2, D3, D4, D5, func);
});
#elif IE_THREAD == IE_THREAD_OMP
# pragma omp parallel
for_6d(parallel_get_thread_num(), parallel_get_num_threads(), D0, D1, D2, D3, D4, D5, func);
#elif IE_THREAD == IE_THREAD_SEQ
for_6d(0, 1, D0, D1, D2, D3, D4, D5, func);
#endif
}
} // namespace InferenceEngine
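A small usage sketch of the helpers above. IE_THREAD must be defined to one of the three backends before the header is included; IE_THREAD_SEQ is assumed here so the snippet builds without TBB or OpenMP:

```cpp
#define IE_THREAD IE_THREAD_SEQ   // normally set by the build system (e.g. -DIE_THREAD=IE_THREAD_TBB)
#include <ie_parallel.hpp>

#include <iostream>
#include <vector>

int main() {
    const int N = 8, C = 3;
    std::vector<float> data(N * C, 0.0f);

    // Fill a 2D index space; sequential with IE_THREAD_SEQ, parallel with TBB or OpenMP.
    InferenceEngine::parallel_for2d(N, C, [&](int n, int c) {
        data[n * C + c] = static_cast<float>(n * C + c);
    });

    // Reduce over a 1D range; the initial value is passed by reference, so it must be an lvalue.
    float init = 0.0f;
    float total = InferenceEngine::parallel_sum(N * C, init, [&](int i) {
        return data[i];
    });
    std::cout << total << std::endl;  // 0 + 1 + ... + 23 = 276
    return 0;
}
```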

View File

@@ -21,6 +21,8 @@
#include <ie_device.hpp>
#include <ie_plugin_dispatcher.hpp>
#include <ie_plugin_config.hpp>
#include <ie_icnn_network.hpp>
#include <ie_icnn_network_stats.hpp>
#include <cpp/ie_cnn_net_reader.h>
#include <cpp/ie_plugin_cpp.hpp>
#include <cpp/ie_executable_network.hpp>
@@ -177,5 +179,4 @@ void copyToFloat(float *dst, const InferenceEngine::Blob *src) {
for (size_t i = 0; i < t_blob->size(); i++) dst[i] = srcPtr[i];
}
} // namespace InferenceEngine

View File

@@ -1,67 +0,0 @@
// Copyright (C) 2018 Intel Corporation
//
// SPDX-License-Identifier: Apache-2.0
//
/**
* @brief A header file for the main MKL-DNN Extension API
* @file mkldnn_extension.hpp
*/
#pragma once
#include <ie_iextension.h>
#include "mkldnn_generic_primitive.hpp"
namespace InferenceEngine {
namespace MKLDNNPlugin {
/**
* @deprecated use new extensibility API
* @brief The IMKLDNNExtension class provides the main extension interface
*/
class IMKLDNNExtension : public IExtension {
public:
/**
* @brief Creates a generic layer and returns a pointer to an instance
* @param primitive Pointer to newly created layer
* @param layer Layer parameters (source for name, type, precision, attr, weights...)
* @param resp Optional: pointer to an already allocated object to contain information in case of failure
* @return Status code of the operation: OK (0) for success
*/
virtual InferenceEngine::StatusCode CreateGenericPrimitive(IMKLDNNGenericPrimitive*& primitive,
const InferenceEngine::CNNLayerPtr& layer,
InferenceEngine::ResponseDesc *resp) const noexcept = 0;
/**
* @brief This method isn't implemented for the old API
*/
StatusCode getPrimitiveTypes(char**& types, unsigned int& size, ResponseDesc* resp) noexcept override {
return NOT_IMPLEMENTED;
};
/**
* @brief This method isn't implemented for the old API
*/
StatusCode getFactoryFor(ILayerImplFactory *&factory, const CNNLayer *cnnLayer, ResponseDesc *resp) noexcept override {
return NOT_IMPLEMENTED;
}
/**
* @brief Gets shape propagation implementation for the given string-type of cnn Layer
* @param impl the vector with implementations which is ordered by priority
* @param resp response descriptor
* @return status code
*/
StatusCode getShapeInferImpl(IShapeInferImpl::Ptr& impl, const char* type, ResponseDesc* resp) noexcept override {
return NOT_IMPLEMENTED;
};
};
/**
* @deprecated use new extensibility API
* @brief Creates the default instance of the extension
* @return The MKL-DNN Extension interface
*/
INFERENCE_EXTENSION_API(StatusCode) CreateMKLDNNExtension(IMKLDNNExtension*& ext, ResponseDesc* resp) noexcept;
} // namespace MKLDNNPlugin
} // namespace InferenceEngine

View File

@@ -1,138 +0,0 @@
// Copyright (C) 2018 Intel Corporation
//
// SPDX-License-Identifier: Apache-2.0
//
/**
* @brief A header file that defines a wrapper class for handling extension instantiation and releasing resources
* @file mkldnn_extension_ptr.hpp
*/
#pragma once
#include "details/ie_so_pointer.hpp"
#include "mkldnn/mkldnn_extension.hpp"
#include <string>
#include <memory>
namespace InferenceEngine {
namespace details {
/**
* @deprecated use new extensibility API
* @brief The SOCreatorTrait class defines the name of the fabric
* for creating MKLDNNPlugin::IMKLDNNExtension object in DLL
*/
template<>
class SOCreatorTrait<MKLDNNPlugin::IMKLDNNExtension> {
public:
/**
* @brief The name of the factory for creating an MKLDNNPlugin::IMKLDNNExtension object in a DLL
*/
static constexpr auto name = "CreateMKLDNNExtension";
};
} // namespace details
namespace MKLDNNPlugin {
/**
* @deprecated use new extensibility API
* @brief This class is a C++ helper to work with objects created using extensions.
* Implements different interfaces.
*/
class MKLDNNExtension : public MKLDNNPlugin::IMKLDNNExtension {
public:
/**
* @brief Loads extension from a shared library
* @param name Logical name of the extension library (library name without the lib prefix or the .dll/.so suffix)
*/
explicit MKLDNNExtension(const std::string &name)
: actual(name) {}
/**
* @brief Creates a generic layer and returns a pointer to an instance
* @param primitive Pointer to a newly created layer
* @param layer Layer parameters (source for name, type, precision, attr, weights...)
* @param resp Optional: pointer to an already allocated object to contain information in case of failure
* @return Status code of the operation: OK (0) for success
*/
InferenceEngine::StatusCode CreateGenericPrimitive(IMKLDNNGenericPrimitive *&primitive,
const InferenceEngine::CNNLayerPtr &layer,
InferenceEngine::ResponseDesc *resp) const noexcept override {
return actual->CreateGenericPrimitive(primitive, layer, resp);
}
/**
* @brief This method isn't implemented for the old API
*/
InferenceEngine::StatusCode getPrimitiveTypes(char**& types, unsigned int& size,
InferenceEngine::ResponseDesc* resp) noexcept override {
return actual->getPrimitiveTypes(types, size, resp);
}
/**
* @brief This method isn't implemented for the old API
*/
InferenceEngine::StatusCode getFactoryFor(InferenceEngine::ILayerImplFactory *&factory,
const InferenceEngine::CNNLayer *cnnLayer,
InferenceEngine::ResponseDesc *resp) noexcept override {
return actual->getFactoryFor(factory, cnnLayer, resp);
}
/**
* @brief This method isn't implemented for the old API
*/
InferenceEngine::StatusCode getShapeInferImpl(InferenceEngine::IShapeInferImpl::Ptr& impl, const char* type,
InferenceEngine::ResponseDesc* resp) noexcept override {
return actual->getShapeInferImpl(impl, type, resp);
};
/**
* @brief Gets the extension version information
* @param versionInfo A pointer to version info, set by plugin
*/
void GetVersion(const InferenceEngine::Version *&versionInfo) const noexcept override {
actual->GetVersion(versionInfo);
}
/**
* @brief Sets a log callback that is used to track what is going on inside
* @param listener Logging listener
*/
void SetLogCallback(InferenceEngine::IErrorListener &listener) noexcept override {
actual->SetLogCallback(listener);
}
/**
* @brief Cleans the resources up
*/
void Unload() noexcept override {
actual->Unload();
}
/**
* @brief Does nothing since destruction is done via regular mechanism
*/
void Release() noexcept override {}
protected:
/**
* @brief An SOPointer instance to the loaded templated object
*/
InferenceEngine::details::SOPointer<MKLDNNPlugin::IMKLDNNExtension> actual;
};
} // namespace MKLDNNPlugin
/**
* @deprecated use new extensibility API
* @brief Creates a special shared_pointer wrapper for the given type from a specific shared module
* @param name Name of the shared library file
* @return shared_pointer A wrapper for the given type from a specific shared module
*/
template<>
inline std::shared_ptr<MKLDNNPlugin::IMKLDNNExtension> make_so_pointer(const std::string &name) {
return std::make_shared<MKLDNNPlugin::MKLDNNExtension>(name);
}
} // namespace InferenceEngine
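A short usage sketch of the specialization above; the include path and the library soname "my_mkldnn_ext" are assumptions:
```cpp
#include "mkldnn/mkldnn_extension_ptr.hpp"

int main() {
    // Loads libmy_mkldnn_ext.so (or my_mkldnn_ext.dll), which must export CreateMKLDNNExtension,
    // and wraps it in the MKLDNNExtension helper. The soname is hypothetical.
    auto extension = InferenceEngine::make_so_pointer<
            InferenceEngine::MKLDNNPlugin::IMKLDNNExtension>("my_mkldnn_ext");
    const InferenceEngine::Version* version = nullptr;
    extension->GetVersion(version);  // call is forwarded to the loaded library
    return 0;
}
```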

View File

@@ -1,200 +0,0 @@
// Copyright (C) 2018 Intel Corporation
//
// SPDX-License-Identifier: Apache-2.0
//
/**
* @brief A header file for the main MKL-DNN Extension API to work with weights and primitives in memory
* @file mkldnn_extension_types.hpp
*/
#pragma once
#include "ie_common.h"
#include "ie_precision.hpp"
namespace InferenceEngine {
namespace MKLDNNPlugin {
/**
* @deprecated use new extensibility API
* @brief Defines formats from MKL-DNN which are supported in the MKL-DNN plugin of IE.
*/
enum MemoryFormat {
/** Undefined memory format, used for empty memory descriptors. */
format_undef = 0,
/** Unspecified format. The primitive selects a format
* automatically. */
any,
/** A tensor in a generic format described by the stride and blocking
* values in each dimension. See #mkldnn_blocking_desc_t for more
* information. */
blocked,
/** 1D data tensor. */
x,
/** 2D data tensor. */
nc,
/** 4D data tensor in the @c nchw format typically used in Caffe. */
nchw,
/** 4D data tensor in the @c nhwc format typically used in TensorFlow. */
nhwc,
/** 4D data tensor in the @c chwn format typically used in Neon. */
chwn,
/** 4D data tensor in the @c nchw format with channels data laid out in
* memory in 8-element blocks. */
nChw8c,
/** 4D data tensor in the @c nchw format with channels data laid out in
* memory in 16-element blocks. */
nChw16c,
/** 2D weights tensor in the format (output channels, input channels). */
oi,
/** 2D weights tensor in the format (input channels, output channels). */
io,
/** 4D weights tensor in the format (output channels, input channels,
* height, width). */
oihw,
/** 4D weights tensor in the format (input channels, height, width,
* output channels). */
ihwo,
/** 4D weights tensor in the format (height, width, input channels,
* output channels). */
hwio,
/** 4D weights tensor in the @c oihw format with both input and output
* channels data laid out in memory in 8-element blocks. */
OIhw8i8o,
/** 4D weights tensor in the @c oihw format with both input and output
* channels data laid out in memory in 16-element blocks. */
OIhw16i16o,
/** 4D weights tensor in the @c oihw format with output channels data
* laid out in memory in 16-element blocks and input channels data
* laid out in memory in 8-element blocks blocked by pairs. */
OIhw8i16o2i,
/** 4D weights tensor in the @c oihw format with input channels data
* laid out in memory in 16-element blocks and output channels data
* laid out in memory in 8-element blocks blocked by pairs. */
OIhw8o16i2o,
/** 4D weights tensor in the @c oihw format with both input and output
* channels data laid out in memory in 8-element blocks. */
OIhw8o8i,
/** 4D weights tensor in the @c oihw format with both input and output
* channels data laid out in memory in 16-element blocks. */
OIhw16o16i,
/** 4D weights tensor in the format (output channels, input channels,
* height, width) with output channels data laid out in memory in 8-element
* blocks. */
Oihw8o,
/** 4D weights tensor in the format (output channels, input channels,
* height, width) with output channels data laid out in memory in
* 16-element blocks. */
Oihw16o,
/** 4D weights tensor in the format (output channels, width, height, input
* channels) with output channels data laid out in memory in 8-element
* blocks. */
Ohwi8o,
/** 4D weights tensor in the format (output channels, width, height, input
* channels) with output channels data laid out in memory in 16-element
* blocks. */
Ohwi16o,
/** 4D weights tensor in the @c oihw format with output channels data laid
* out in memory in 16-element blocks and input channels data in 4-element blocks. */
OhIw16o4i,
/** 5D weights tensor in the @c oihw format with extra outer dimension for
* groups. */
goihw,
/** 5D weights tensor in the blocked version of @c goihw format with both
* input and output channels data laid out in memory in 8-element blocks.
*/
gOIhw8i8o,
/** 5D weights tensor in the blocked version of @c goihw format with both
* input and output channels data laid out in memory in 16-element blocks.
*/
gOIhw16i16o,
/** 5D weights tensor in the @c oihw format with output channels data
* laid out in memory in 16-element blocks and input channels data
* laid out in memory in 8-element blocks blocked by pairs. */
gOIhw8i16o2i,
/** 5D weights tensor in the @c oihw format with input channels data
* laid out in memory in 16-element blocks and output channels data
* laid out in memory in 8-element blocks blocked by pairs. */
gOIhw8o16i2o,
/** 5D weights tensor in the blocked version of @c goihw format with both
* input and output channels data laid out in memory in 8-element blocks.
*/
gOIhw8o8i,
/** 5D weights tensor in the blocked version of @c goihw format with both
* input and output channels data laid out in memory in 16-element blocks.
*/
gOIhw16o16i,
/** 5D weights tensor in the blocked version of @c goihw format with output
* channels data laid out in memory in 8-element blocks. */
gOihw8o,
/** 5D weights tensor in the blocked version of @c goihw format with output
* channels data laid out in memory in 16-element blocks. */
gOihw16o,
/** 5D weights tensor in the blocked version of @c goihw format with output
* channels data laid out in memory in 8-element blocks. */
gOhwi8o,
/** 5D weights tensor in the blocked version of @c goihw format with output
* channels data laid out in memory in 16-element blocks. */
gOhwi16o,
/** 5D weights tensor in the @c goihw format with output channels data laid
* out in memory in 16-element blocks and input channels data in 4-element blocks. */
gOhIw16o4i,
/** 4D weights tensor in the oihw format with input channels data laid out
* in memory in 8-element blocks. */
oIhw8i = nChw8c,
/** 4D weights tensor in the oihw format with input channels data laid out
* in memory in 16-element blocks. */
oIhw16i = nChw16c,
};
/**
* @deprecated use new extensibility API
* @brief Stores necessary information about the primitive memory object,
* such as precision, dimensions, and memory format.
*/
struct MKLDNNPrimitiveMemory {
/**
* @brief precision type
*/
Precision precision;
/**
* @brief dimensions of the given primitive
*/
SizeVector dims;
/**
* @brief memory format of the given primitive
*/
MemoryFormat format;
/**
* @brief primitive data stored
*/
void *data;
/**
* @brief A constructor.
*/
MKLDNNPrimitiveMemory() : format(format_undef), data(nullptr) {}
};
/**
* @deprecated use new extensibility API
* @brief Stores necessary information about the primitive weights.
*/
struct MKLDNNWeightsMemory {
/**
* @brief size of weights
*/
size_t size;
/**
* @brief pointer to weights data
*/
void *data;
/**
* @brief A constructor.
*/
MKLDNNWeightsMemory() : size(0), data(nullptr) {}
};
} // namespace MKLDNNPlugin
} // namespace InferenceEngine
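As an illustrative sketch (the include path, function name, and dimensions are assumptions), the structures above describe a tensor that is handed to a generic primitive like this:
```cpp
#include "mkldnn/mkldnn_extension_types.hpp"

// Hedged sketch: describe a 1x3x224x224 FP32 tensor in plain nchw layout.
InferenceEngine::MKLDNNPlugin::MKLDNNPrimitiveMemory describeInput(void* rawData) {
    InferenceEngine::MKLDNNPlugin::MKLDNNPrimitiveMemory mem;
    mem.precision = InferenceEngine::Precision::FP32;
    mem.dims = {1, 3, 224, 224};                        // N, C, H, W
    mem.format = InferenceEngine::MKLDNNPlugin::nchw;   // plain NCHW layout
    mem.data = rawData;                                 // the memory stays owned by the caller
    return mem;
}
```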

View File

@@ -1,123 +0,0 @@
// Copyright (C) 2018 Intel Corporation
//
// SPDX-License-Identifier: Apache-2.0
//
/**
* @brief a header file for MKL-DNN Generic Primitive API
* @file mkldnn_generic_primitive.hpp
*/
#pragma once
#include "mkldnn_extension_types.hpp"
#include "details/ie_irelease.hpp"
#include <vector>
namespace InferenceEngine {
namespace MKLDNNPlugin {
/**
* @deprecated use new extensibility API
* @brief The MKLDNNGenericFormats stores weights, biases, inputs and outputs of the primitive
*/
class MKLDNNGenericFormats {
public:
/**
* @brief A constructor
* @param ins - vector of inputs
* @param outs - vector of outputs
* @param weights - weights, format_undef by default
* @param biases - biases, format_undef by default
*/
MKLDNNGenericFormats(const std::vector<MemoryFormat> &ins, const std::vector<MemoryFormat> &outs,
const MemoryFormat weights = MemoryFormat::format_undef,
const MemoryFormat biases = MemoryFormat::format_undef) : inputs(ins), outputs(outs) {
this->weights = weights;
this->biases = biases;
}
/**
* @brief Get input formats
* @return vector of input formats
*/
const std::vector<MemoryFormat>& GetInputs() const noexcept {
return inputs;
}
/**
* @brief Get output formats
* @return vector of output formats
*/
const std::vector<MemoryFormat>& GetOutputs() const noexcept {
return outputs;
}
/**
* @brief Get weights format
* @return weights format
*/
const MemoryFormat& GetWeights() const noexcept {
return weights;
}
/**
* @brief Get biases format
* @return biases format
*/
const MemoryFormat& GetBiases() const noexcept {
return biases;
}
private:
std::vector<MemoryFormat> inputs;
std::vector<MemoryFormat> outputs;
MemoryFormat weights;
MemoryFormat biases;
};
/**
* @deprecated use new extensibility API
* @brief The IMKLDNNGenericPrimitive is the main Generic Primitive interface
*/
class IMKLDNNGenericPrimitive : public InferenceEngine::details::IRelease {
public:
void Release() noexcept override {
delete this;
}
/**
* @brief Sets inputs and outputs
* @param inputs - vector of input primitives
* @param outputs - vector of output primitives
*/
void SetMemory(const std::vector<MKLDNNPrimitiveMemory>& inputs,
const std::vector<MKLDNNPrimitiveMemory>& outputs) noexcept {
this->inputs = inputs;
this->outputs = outputs;
}
/**
* @brief Gets supported formats
* @return vector of supported formats
*/
virtual std::vector<MKLDNNGenericFormats> GetSupportedFormats() noexcept = 0;
/**
* @brief Entry point for the actual execution of the primitive.
* There is no error reporting mechanism; static checks should be done in the constructor
*/
virtual void Execute() noexcept = 0;
protected:
/**
* @brief Vector of input primitives
*/
std::vector<MKLDNNPrimitiveMemory> inputs;
/**
* @brief Vector of output primitives
*/
std::vector<MKLDNNPrimitiveMemory> outputs;
};
} // namespace MKLDNNPlugin
} // namespace InferenceEngine
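A minimal sketch of a generic primitive built on this interface; the class name, the ReLU-style arithmetic, and the plain nchw input/output choice are illustrative assumptions rather than part of the API:
```cpp
#include "mkldnn/mkldnn_generic_primitive.hpp"

// Hypothetical generic primitive: element-wise ReLU over one FP32 input and one output.
class MyPrimitive : public InferenceEngine::MKLDNNPlugin::IMKLDNNGenericPrimitive {
public:
    std::vector<InferenceEngine::MKLDNNPlugin::MKLDNNGenericFormats>
    GetSupportedFormats() noexcept override {
        using InferenceEngine::MKLDNNPlugin::MemoryFormat;
        using InferenceEngine::MKLDNNPlugin::MKLDNNGenericFormats;
        // One input and one output, both in plain nchw layout, no weights or biases.
        return { MKLDNNGenericFormats({MemoryFormat::nchw}, {MemoryFormat::nchw}) };
    }

    void Execute() noexcept override {
        // SetMemory() has already filled the inherited "inputs" and "outputs" vectors.
        const auto& in = inputs[0];
        const auto& out = outputs[0];
        size_t count = 1;
        for (size_t d : in.dims) count *= d;
        const float* src = static_cast<const float*>(in.data);
        float* dst = static_cast<float*>(out.data);
        for (size_t i = 0; i < count; ++i)
            dst[i] = src[i] > 0.0f ? src[i] : 0.0f;
    }
};
```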

View File

@@ -5,14 +5,7 @@ cmake_minimum_required (VERSION 2.8)
project(Samples)
list (APPEND CMAKE_MODULE_PATH ${CMAKE_CURRENT_SOURCE_DIR}/cmake)
find_package(InferenceEngine 1.2)
if (NOT InferenceEngine_FOUND)
message(FATAL_ERROR "")
endif()
if("${CMAKE_BUILD_TYPE}" STREQUAL "")
if (CMAKE_BUILD_TYPE STREQUAL "")
message(STATUS "CMAKE_BUILD_TYPE not defined, 'Release' will be used")
set(CMAKE_BUILD_TYPE "Release")
endif()
@@ -27,35 +20,35 @@ if (NOT(BIN_FOLDER))
set (BIN_FOLDER ${ARCH})
endif()
if (NOT (IE_MAIN_SOURCE_DIR))
set(NEED_EXTENSIONS TRUE)
if (WIN32)
set (IE_MAIN_SOURCE_DIR ${CMAKE_SOURCE_DIR}/../bin/)
else()
set (IE_MAIN_SOURCE_DIR ${CMAKE_CURRENT_BINARY_DIR})
endif()
if (NOT(IE_MAIN_SOURCE_DIR))
# in case samples are built outside the IE repo
set (IE_MAIN_SAMPLES_DIR ${CMAKE_CURRENT_BINARY_DIR})
else()
# in case samples are built from the IE repo
set (IE_MAIN_SAMPLES_DIR ${IE_MAIN_SOURCE_DIR})
endif()
if(NOT(UNIX))
set (CMAKE_LIBRARY_OUTPUT_DIRECTORY ${IE_MAIN_SOURCE_DIR}/${BIN_FOLDER})
set (CMAKE_LIBRARY_PATH ${IE_MAIN_SOURCE_DIR}/${BIN_FOLDER})
set (CMAKE_ARCHIVE_OUTPUT_DIRECTORY ${IE_MAIN_SOURCE_DIR}/${BIN_FOLDER})
set (CMAKE_COMPILE_PDB_OUTPUT_DIRECTORY ${IE_MAIN_SOURCE_DIR}/${BIN_FOLDER})
set (CMAKE_PDB_OUTPUT_DIRECTORY ${IE_MAIN_SOURCE_DIR}/${BIN_FOLDER})
set (CMAKE_RUNTIME_OUTPUT_DIRECTORY ${IE_MAIN_SOURCE_DIR}/${BIN_FOLDER})
set (LIBRARY_OUTPUT_DIRECTORY ${IE_MAIN_SOURCE_DIR}/${BIN_FOLDER})
set (CMAKE_LIBRARY_OUTPUT_DIRECTORY ${IE_MAIN_SAMPLES_DIR}/${BIN_FOLDER})
set (CMAKE_LIBRARY_PATH ${IE_MAIN_SAMPLES_DIR}/${BIN_FOLDER})
set (CMAKE_ARCHIVE_OUTPUT_DIRECTORY ${IE_MAIN_SAMPLES_DIR}/${BIN_FOLDER})
set (CMAKE_COMPILE_PDB_OUTPUT_DIRECTORY ${IE_MAIN_SAMPLES_DIR}/${BIN_FOLDER})
set (CMAKE_PDB_OUTPUT_DIRECTORY ${IE_MAIN_SAMPLES_DIR}/${BIN_FOLDER})
set (CMAKE_RUNTIME_OUTPUT_DIRECTORY ${IE_MAIN_SAMPLES_DIR}/${BIN_FOLDER})
set (LIBRARY_OUTPUT_DIRECTORY ${IE_MAIN_SAMPLES_DIR}/${BIN_FOLDER})
set (LIBRARY_OUTPUT_PATH ${LIBRARY_OUTPUT_DIRECTORY}) # compatibility issue: linux uses LIBRARY_OUTPUT_PATH, windows uses LIBRARY_OUTPUT_DIRECTORY
else ()
set (CMAKE_LIBRARY_OUTPUT_DIRECTORY ${IE_MAIN_SOURCE_DIR}/${BIN_FOLDER}/${CMAKE_BUILD_TYPE}/lib)
set (CMAKE_ARCHIVE_OUTPUT_DIRECTORY ${IE_MAIN_SOURCE_DIR}/${BIN_FOLDER}/${CMAKE_BUILD_TYPE}/lib)
set (CMAKE_COMPILE_PDB_OUTPUT_DIRECTORY ${IE_MAIN_SOURCE_DIR}/${BIN_FOLDER}/${CMAKE_BUILD_TYPE})
set (CMAKE_PDB_OUTPUT_DIRECTORY ${IE_MAIN_SOURCE_DIR}/${BIN_FOLDER}/${CMAKE_BUILD_TYPE})
set (CMAKE_RUNTIME_OUTPUT_DIRECTORY ${IE_MAIN_SOURCE_DIR}/${BIN_FOLDER}/${CMAKE_BUILD_TYPE})
set (LIBRARY_OUTPUT_DIRECTORY ${IE_MAIN_SOURCE_DIR}/${BIN_FOLDER}/${CMAKE_BUILD_TYPE}/lib)
set (CMAKE_LIBRARY_OUTPUT_DIRECTORY ${IE_MAIN_SAMPLES_DIR}/${BIN_FOLDER}/${CMAKE_BUILD_TYPE}/lib)
set (CMAKE_ARCHIVE_OUTPUT_DIRECTORY ${IE_MAIN_SAMPLES_DIR}/${BIN_FOLDER}/${CMAKE_BUILD_TYPE}/lib)
set (CMAKE_COMPILE_PDB_OUTPUT_DIRECTORY ${IE_MAIN_SAMPLES_DIR}/${BIN_FOLDER}/${CMAKE_BUILD_TYPE})
set (CMAKE_PDB_OUTPUT_DIRECTORY ${IE_MAIN_SAMPLES_DIR}/${BIN_FOLDER}/${CMAKE_BUILD_TYPE})
set (CMAKE_RUNTIME_OUTPUT_DIRECTORY ${IE_MAIN_SAMPLES_DIR}/${BIN_FOLDER}/${CMAKE_BUILD_TYPE})
set (LIBRARY_OUTPUT_DIRECTORY ${IE_MAIN_SAMPLES_DIR}/${BIN_FOLDER}/${CMAKE_BUILD_TYPE}/lib)
set (LIBRARY_OUTPUT_PATH ${LIBRARY_OUTPUT_DIRECTORY}/lib)
endif()
set(CMAKE_CXX_FLAGS "-std=c++11 ${CMAKE_CXX_FLAGS}")
find_package(InferenceEngine 1.4 REQUIRED)
if (WIN32)
if(NOT "${CMAKE_SIZEOF_VOID_P}" EQUAL "8")
message(FATAL_ERROR "Only 64-bit supported on Windows")
@@ -65,7 +58,7 @@ if (WIN32)
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -D_SCL_SECURE_NO_WARNINGS -DNOMINMAX")
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} /EHsc") #no asynchronous structured exception handling
set(CMAKE_EXE_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS} /LARGEADDRESSAWARE")
if (ENABLE_OMP)
if (THREADING STREQUAL "OMP")
find_package(OpenMP)
if (OPENMP_FOUND)
set (CMAKE_C_FLAGS "${CMAKE_C_FLAGS} ${OpenMP_C_FLAGS}")
@@ -81,8 +74,6 @@ else()
endif()
endif()
include(feature_defs OPTIONAL)
####################################
## to use C++11
set (CMAKE_CXX_STANDARD 11)
@@ -105,28 +96,35 @@ if (UNIX)
SET(LIB_DL dl)
endif()
# Find OpenCV libray if exists
# Find OpenCV library if exists
find_package(OpenCV)
include_directories(${OpenCV_INCLUDE_DIRS})
if(OpenCV_FOUND)
include_directories(${OpenCV_INCLUDE_DIRS})
add_definitions(-DUSE_OPENCV)
else()
set (BUILD_VALIDATION_APP OFF)
message(WARNING "No suitable OpenCV version detected, BUILD_VALIDATION_APP is set to OFF")
endif()
add_subdirectory(common/format_reader)
if (NEED_EXTENSIONS)
add_subdirectory(extension)
endif()
####################################################
# SAMPLES list
####################################################
add_subdirectory(classification_sample)
add_subdirectory(classification_sample_async)
add_subdirectory(hello_autoresize_classification)
add_subdirectory(hello_classification)
add_subdirectory(hello_request_classification)
add_subdirectory(hello_shape_infer_ssd)
add_subdirectory(object_detection_sample_ssd)
add_subdirectory(style_transfer_sample)
if (OpenCV_FOUND)
add_subdirectory(benchmark_app)
add_subdirectory(calibration_tool)
if (BUILD_VALIDATION_APP)
add_subdirectory(validation_app)
else()
message(STATUS "Validation app build is switched off")
endif()
####################################################

View File

@@ -1,80 +0,0 @@
Inference Engine Samples {#SamplesOverview}
================
The Inference Engine sample applications are simple console applications that demonstrate how you can use Intel's Deep Learning Inference Engine in your applications.
The Deep Learning Inference Engine release package provides the following sample applications available in the samples
directory in the Inference Engine installation directory:
- [CPU Extensions](@ref CPUExtensions) library with topology-specific layers (like DetectionOutput used in the SSD*, below)
- [Hello Autoresize Classification Sample](@ref InferenceEngineHelloAutoresizeClassificationSample) - Input of any size and layout can be set to an infer request which will be pre-processed automatically during inference (the sample supports only images as inputs)
- [Hello Infer Request Classification Sample](@ref InferenceEngineHelloRequestClassificationSample) - Inference of image classification networks via Infer Request API (the sample supports only images as inputs)
- [Image Classification Sample](@ref InferenceEngineClassificationSampleApplication) - Inference of image classification networks like AlexNet and GoogLeNet (the sample supports only images as inputs)
- [Image Classification Sample, pipelined](@ref InferenceEngineClassificationPipelinedSampleApplication) - Maximizes performance via pipelined execution (the sample supports only images as inputs)
- [Neural Style Transfer Sample](@ref InferenceEngineNeuralStyleTransferSampleApplication) - Style Transfer sample (the sample supports only images as inputs)
- [Object Detection for SSD Sample](@ref InferenceEngineObjectDetectionSSDSampleApplication) - Inference of object detection networks based on the SSD; this sample is a simplified version that supports only images as inputs
- [Validation App](@ref InferenceEngineValidationApp) - Infers a pack of images and reports the total accuracy (only images as inputs)
## <a name="build_samples_linux"></a> Building the Sample Applications on Linux*
The officially supported Linux build environment is the following:
* Ubuntu* 16.04 LTS 64-bit or CentOS* 7.4 64-bit
* GCC* 5.4.0 (for Ubuntu* 16.04) or GCC* 4.8.5 (for CentOS* 7.4)
* CMake* version 2.8 or higher.
* OpenCV 3.3 or later (required for some samples)
<br>You can build the sample applications using the <i>CMake</i> file in the `samples` directory.
Create a new directory and change your current directory to the new one:
```sh
mkdir build
cd build
```
Run <i>CMake</i> to generate Make files:
```sh
cmake -DCMAKE_BUILD_TYPE=Release <path_to_inference_engine_samples_directory>
```
To build samples with debug information, use the following command:
```sh
cmake -DCMAKE_BUILD_TYPE=Debug <path_to_inference_engine_samples_directory>
```
Run <i>Make</i> to build the application:
```sh
make
```
For ease of reference, the Inference Engine installation folder is referred to as <code><INSTALL_DIR></code>.
After that, you can find binaries for all sample applications in the <code>intel64/Release</code> subfolder.
## <a name="build_samples_windows"></a> Building the Sample Applications on Microsoft Windows* OS
The recommended Windows build environment is the following:
* Microsoft Windows* 10
* Microsoft* Visual Studio* 2015 including Microsoft Visual Studio 2015 Community or Microsoft Visual Studio 2017
* CMake* version 2.8 or later
* OpenCV* 3.3 or later
Generate Microsoft Visual Studio solution file using <code>create_msvc_solution.bat</code> file in the <code>samples</code> directory and then build the solution <code>samples\build\Samples.sln</code> in the Microsoft Visual Studio 2015.
## Running the Sample Applications
Before running compiled binary files, make sure your application can find the Inference Engine libraries.
Use the `setvars.sh` script, which will set all necessary environment variables.
For that, run (assuming that you are in a <code><INSTALL_DIR>/deployment_tools/inference_engine/bin/intel64/Release</code> folder):
<pre>
source ../../setvars.sh
</pre>
Then run the required sample with the appropriate command-line options, providing the IR information (typically with the "-m" option).
Please note that the Inference Engine assumes that the weights are in the same folder as the _.xml_ file.
## See Also
* [Introduction to Intel's Deep Learning Inference Engine](@ref Intro)
---
\* Other names and brands may be claimed as the property of others.

View File

@@ -0,0 +1,43 @@
# Copyright (c) 2018 Intel Corporation
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
# http://www.apache.org/licenses/LICENSE-2.0
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
cmake_minimum_required(VERSION 2.8)
set (TARGET_NAME "benchmark_app")
if( BUILD_SAMPLE_NAME AND NOT ${BUILD_SAMPLE_NAME} STREQUAL ${TARGET_NAME} )
message(STATUS "SAMPLE ${TARGET_NAME} SKIPPED")
return()
endif()
file (GLOB SRC
${CMAKE_CURRENT_SOURCE_DIR}/*.cpp
)
# Create named folders for the sources within the .vcproj
# Empty name lists them directly under the .vcproj
source_group("src" FILES ${SRC})
link_directories(${LIB_FOLDER})
# Create executable from sources.
add_executable(${TARGET_NAME} ${SRC})
set_target_properties(${TARGET_NAME} PROPERTIES "CMAKE_CXX_FLAGS" "${CMAKE_CXX_FLAGS} -fPIE"
COMPILE_PDB_NAME ${TARGET_NAME})
target_link_libraries(${TARGET_NAME} ${InferenceEngine_LIBRARIES} IE::ie_cpu_extension format_reader gflags)
if(UNIX)
target_link_libraries(${TARGET_NAME} ${LIB_DL} pthread)
endif()

View File

@@ -0,0 +1,87 @@
# Benchmark Application Demo
This topic demonstrates how to run the Benchmark Application demo, which performs inference using convolutional networks.
## How It Works
**NOTE:** To achieve benchmark results similar to the official published results, set CPU frequency to 2.9GHz and GPU frequency to 1GHz.
Upon the start-up, the application reads command-line parameters and loads a network and images to the Inference Engine plugin. The number of infer requests and execution approach depend on a mode defined with the `-api` command-line parameter.
### Synchronous API
For synchronous mode, the primary metric is latency. The application creates one infer request and executes the `Infer` method. The number of executions is defined by one of two values:
* Number of iterations defined with the `-niter` command-line argument
* Predefined duration if `-niter` is not specified. The predefined duration value depends on the device.
During the execution, the application collects two types of metrics:
* Latency for each infer request executed with `Infer` method
* Duration of all executions
The reported latency value is calculated as the median of all collected latencies. The reported throughput value is derived from the reported latency and additionally depends on the batch size.
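For illustration, the sync-mode numbers relate as in the following sketch (a simplified rendering of the computation in the demo sources further below; the function names are only for this sketch):
```cpp
#include <algorithm>
#include <vector>

// Hedged sketch of the sync-mode metric computation used by the demo.
static double reportedLatencyMs(std::vector<float> latenciesMs) {
    std::sort(latenciesMs.begin(), latenciesMs.end());
    const size_t n = latenciesMs.size();
    return (n % 2 != 0) ? latenciesMs[n / 2]
                        : (latenciesMs[n / 2] + latenciesMs[n / 2 - 1]) / 2.0;
}

static double reportedThroughputFps(double latencyMs, size_t batchSize) {
    return batchSize * 1000.0 / latencyMs;  // throughput follows from latency and batch size
}
```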
### Asynchronous API
For asynchronous mode, the primary metric is throughput in frames per second (FPS). The application creates a certain number of infer requests and executes the `StartAsync` method. The number of infer requests is specified with the `-nireq` command-line parameter. The number of executions is defined by one of two values:
* Number of iterations defined with the `-niter` command-line argument
* Predefined duration if `-niter` is not specified. The predefined duration value depends on the device.
The infer requests are executed asynchronously. The `Wait` method is used to wait for a previous execution to complete. The application measures the duration of all infer request executions and reports the throughput metric based on the batch size and the total execution duration.
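Similarly, the async-mode throughput follows from the batch size, the number of executed steps, and the total wall-clock duration (again a simplified sketch; the function name is only for this sketch):
```cpp
// Hedged sketch of the async-mode throughput computation used by the demo.
static double reportedAsyncThroughputFps(size_t batchSize, size_t stepsExecuted, double totalDurationMs) {
    return batchSize * 1000.0 * stepsExecuted / totalDurationMs;  // frames per second
}
```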
## Running
Running the application with the `-h` option yields the following usage message:
```sh
./benchmark_app -h
InferenceEngine:
API version ............ <version>
Build .................. <number>
[ INFO ] Parsing input parameters
benchmark_app [OPTION]
Options:
-h Print a usage message
-i "<path>" Required. Path to a folder with images or to image files.
-m "<path>" Required. Path to an .xml file with a trained model.
-pp "<path>" Path to a plugin folder.
-api "<sync/async>" Required. Enable using sync/async API.
-d "<device>" Specify a target device to infer on: CPU, GPU, FPGA or MYRIAD. Use "-d HETERO:<comma separated devices list>" format to specify HETERO plugin. The application looks for a suitable plugin for the specified device.
-niter "<integer>" Optional. Number of iterations. If not specified, the number of iterations is calculated depending on a device.
-nireq "<integer>" Optional. Number of infer requests (default value is 2).
-l "<absolute_path>" Required for CPU custom layers. Absolute path to a shared library with the kernels implementations.
Or
-c "<absolute_path>" Required for GPU custom kernels. Absolute path to an .xml file with the kernels description.
-b "<integer>" Optional. Batch size value. If not specified, the batch size value is determined from IR.
```
Running the application with the empty list of options yields the usage message given above and an error message.
To run the demo, you can use one-layer public models or one-layer pre-trained and optimized models delivered with the package that support images as input.
For example, to do inference on an image using a trained network with multiple outputs on CPU, run the following command:
```sh
./benchmark_app -i <path_to_image>/inputImage.bmp -m <path_to_model>/multiple-output.xml -d CPU
```
**NOTE**: Public models should be first converted to the Inference Engine format (\*.xml + \*.bin) using the [Model Optimizer tool](./docs/Model_Optimizer_Developer_Guide/Deep_Learning_Model_Optimizer_DevGuide.md).
## Demo Output
The application output depends on the API used. For the synchronous API, the application outputs latency and throughput:
```
[ INFO ] Start inference synchronously (60000 ms duration)
[ INFO ] Latency: 37.91 ms
[ INFO ] Throughput: 52.7566 FPS
```
For the asynchronous API, the application outputs only throughput:
```
[ INFO ] Start inference asynchronously (60000 ms duration, 2 inference requests in parallel)
[ INFO ] Throughput: 48.2031 FPS
```
## See Also
* [Using Inference Engine Samples](./docs/Inference_Engine_Developer_Guide/Samples_Overview.md)

View File

@@ -0,0 +1,119 @@
// Copyright (C) 2018 Intel Corporation
//
// SPDX-License-Identifier: Apache-2.0
//
#pragma once
#include <string>
#include <vector>
#include <gflags/gflags.h>
#include <iostream>
#ifdef _WIN32
#include <os/windows/w_dirent.h>
#else
#include <sys/stat.h>
#include <dirent.h>
#endif
/// @brief message for help argument
static const char help_message[] = "Print a usage message";
/// @brief message for images argument
static const char image_message[] = "Required. Path to a folder with images or to image files.";
/// @brief message for images argument
static const char multi_input_message[] = "Path to a multi-input file.";
/// @brief message for model argument
static const char model_message[] = "Required. Path to an .xml file with a trained model.";
/// @brief message for plugin_path argument
static const char plugin_path_message[] = "Path to a plugin folder.";
/// @brief message for plugin argument
static const char api_message[] = "Required. Enable using sync/async API.";
/// @brief message for assigning cnn calculation to device
static const char target_device_message[] = "Specify a target device to infer on: CPU, GPU, FPGA or MYRIAD. " \
"Use \"-d HETERO:<comma separated devices list>\" format to specify HETERO plugin. " \
"The application looks for a suitable plugin for the specified device.";
/// @brief message for iterations count
static const char iterations_count_message[] = "Optional. Number of iterations. " \
"If not specified, the number of iterations is calculated depending on a device.";
/// @brief message for iterations count
static const char infer_requests_count_message[] = "Optional. Number of infer requests (default value is 2).";
/// @brief message for user library argument
static const char custom_cpu_library_message[] = "Required for CPU custom layers. Absolute path to a shared library with the kernels implementations.";
/// @brief message for clDNN custom kernels desc
static const char custom_cldnn_message[] = "Required for GPU custom kernels. Absolute path to an .xml file with the kernels description.";
static const char batch_size_message[] = "Batch size value. If not specified, the batch size value is determined from IR";
/// @brief Define flag for showing help message <br>
DEFINE_bool(h, false, help_message);
/// @brief Define parameter for set image file <br>
/// i or mif is a required parameter
DEFINE_string(i, "", image_message);
/// @brief Define parameter for set model file <br>
/// It is a required parameter
DEFINE_string(m, "", model_message);
/// @brief Define parameter for set path to plugins <br>
DEFINE_string(pp, "", plugin_path_message);
/// @brief Define parameter for the execution mode (sync/async API) <br>
DEFINE_string(api, "async", api_message);
/// @brief Define parameter for the target device to infer on <br>
DEFINE_string(d, "", target_device_message);
/// @brief Absolute path to CPU library with user layers <br>
/// It is a required parameter
DEFINE_string(l, "", custom_cpu_library_message);
/// @brief Define parameter for clDNN custom kernels path <br>
/// Default is ./lib
DEFINE_string(c, "", custom_cldnn_message);
/// @brief Iterations count (default 0)
/// Sync mode: iterations count
/// Async mode: StartAsync counts
DEFINE_int32(niter, 0, iterations_count_message);
/// @brief Number of infer requests in parallel
DEFINE_int32(nireq, 2, infer_requests_count_message);
/// @brief Define parameter for batch size <br>
/// Default is 0 (that means don't specify)
DEFINE_int32(b, 0, batch_size_message);
/**
* @brief This function show a help message
*/
static void showUsage() {
std::cout << std::endl;
std::cout << "universal_app [OPTION]" << std::endl;
std::cout << "Options:" << std::endl;
std::cout << std::endl;
std::cout << " -h " << help_message << std::endl;
std::cout << " -i \"<path>\" " << image_message << std::endl;
std::cout << " -m \"<path>\" " << model_message << std::endl;
std::cout << " -pp \"<path>\" " << plugin_path_message << std::endl;
std::cout << " -api \"<sync/async>\" " << api_message << std::endl;
std::cout << " -d \"<device>\" " << target_device_message << std::endl;
std::cout << " -niter \"<integer>\" " << iterations_count_message << std::endl;
std::cout << " -l \"<absolute_path>\" " << custom_cpu_library_message << std::endl;
std::cout << " Or" << std::endl;
std::cout << " -c \"<absolute_path>\" " << custom_cldnn_message << std::endl;
std::cout << " -nireq \"<integer>\" " << infer_requests_count_message << std::endl;
std::cout << " -b \"<integer>\" " << batch_size_message << std::endl;
}

View File

@@ -0,0 +1,417 @@
// Copyright (C) 2018 Intel Corporation
//
// SPDX-License-Identifier: Apache-2.0
//
#include <algorithm>
#include <chrono>
#include <memory>
#include <map>
#include <string>
#include <vector>
#include <utility>
#include <inference_engine.hpp>
#include <format_reader_ptr.h>
#include <samples/common.hpp>
#include <samples/slog.hpp>
#include <samples/args_helper.hpp>
#include "benchmark_app.h"
using namespace InferenceEngine;
long long getDurationInNanoseconds(const std::string& device);
double getMedianValue(const std::vector<float>& sortedTimes);
void fillBlobWithImage(
Blob::Ptr& inputBlob,
const std::vector<std::string>& filePaths,
const size_t batchSize,
const InferenceEngine::InputInfo& info);
static const std::vector<std::pair<std::string, long long>> deviceDurationsInSeconds{
{ "CPU", 60LL },
{ "GPU", 60LL },
{ "VPU", 60LL },
{ "MYRIAD", 60LL },
{ "FPGA", 120LL },
{ "UNKNOWN", 120LL }
};
/**
* @brief The entry point of the benchmark application
*/
int main(int argc, char *argv[]) {
try {
slog::info << "InferenceEngine: " << InferenceEngine::GetInferenceEngineVersion() << slog::endl;
slog::info << "Parsing input parameters" << slog::endl;
gflags::ParseCommandLineNonHelpFlags(&argc, &argv, true);
if (FLAGS_h) {
showUsage();
return 0;
}
if (FLAGS_m.empty()) {
throw std::logic_error("Model required is not set. Please use -h.");
}
if (FLAGS_api.empty()) {
throw std::logic_error("API not selected. Please use -h.");
}
if (FLAGS_api != "async" && FLAGS_api != "sync") {
throw std::logic_error("Incorrect API. Please use -h.");
}
if (FLAGS_i.empty()) {
throw std::logic_error("Input is not set. Please use -h.");
}
if (FLAGS_niter < 0) {
throw std::logic_error("Number of iterations should be positive (invalid -niter option value)");
}
if (FLAGS_nireq < 0) {
throw std::logic_error("Number of inference requests should be positive (invalid -nireq option value)");
}
if (FLAGS_b < 0) {
throw std::logic_error("Batch size should be positive (invalid -b option value)");
}
std::vector<std::string> inputs;
parseInputFilesArguments(inputs);
if (inputs.size() == 0ULL) {
throw std::logic_error("no images found");
}
// --------------------------- 1. Load Plugin for inference engine -------------------------------------
slog::info << "Loading plugin" << slog::endl;
InferencePlugin plugin = PluginDispatcher({ FLAGS_pp }).getPluginByDevice(FLAGS_d);
if (!FLAGS_l.empty()) {
// CPU (MKLDNN) extension is loaded as a shared library and passed as a pointer to the base extension
const std::shared_ptr<IExtension> extension_ptr = InferenceEngine::make_so_pointer<InferenceEngine::IExtension>(FLAGS_l);
plugin.AddExtension(extension_ptr);
slog::info << "CPU (MKLDNN) extensions is loaded " << FLAGS_l << slog::endl;
} else if (!FLAGS_c.empty()) {
// Load clDNN Extensions
plugin.SetConfig({ {CONFIG_KEY(CONFIG_FILE), FLAGS_c} });
slog::info << "GPU extensions is loaded " << FLAGS_c << slog::endl;
}
InferenceEngine::ResponseDesc resp;
const Version *pluginVersion = plugin.GetVersion();
slog::info << pluginVersion << slog::endl << slog::endl;
// --------------------------- 2. Read IR Generated by ModelOptimizer (.xml and .bin files) ------------
slog::info << "Loading network files" << slog::endl;
InferenceEngine::CNNNetReader netBuilder;
netBuilder.ReadNetwork(FLAGS_m);
const std::string binFileName = fileNameNoExt(FLAGS_m) + ".bin";
netBuilder.ReadWeights(binFileName);
InferenceEngine::CNNNetwork cnnNetwork = netBuilder.getNetwork();
const InferenceEngine::InputsDataMap inputInfo(cnnNetwork.getInputsInfo());
if (inputInfo.empty()) {
throw std::logic_error("no inputs info is provided");
}
if (inputInfo.size() != 1) {
throw std::logic_error("only one input layer network is supported");
}
// --------------------------- 3. Resize network to match image sizes and given batch----------------------
if (FLAGS_b != 0) {
// We support models having only one input layer
ICNNNetwork::InputShapes shapes = cnnNetwork.getInputShapes();
const ICNNNetwork::InputShapes::iterator& it = shapes.begin();
if (it->second.size() != 4) {
throw std::logic_error("Unsupported model for batch size changing in automatic mode");
}
it->second[0] = FLAGS_b;
slog::info << "Resizing network to batch = " << FLAGS_b << slog::endl;
cnnNetwork.reshape(shapes);
}
const size_t batchSize = cnnNetwork.getBatchSize();
const Precision precision = inputInfo.begin()->second->getPrecision();
slog::info << (FLAGS_b != 0 ? "Network batch size was changed to: " : "Network batch size: ") << batchSize <<
", precision: " << precision << slog::endl;
// --------------------------- 4. Configure input & output ---------------------------------------------
const InferenceEngine::Precision inputPrecision = InferenceEngine::Precision::U8;
for (auto& item : inputInfo) {
/** Set the precision of input data provided by the user, should be called before load of the network to the plugin **/
item.second->setInputPrecision(inputPrecision);
}
const size_t imagesCount = inputs.size();
if (batchSize > imagesCount) {
slog::warn << "Network batch size " << batchSize << " is greater than images count " << imagesCount <<
", some input files will be duplicated" << slog::endl;
} else if (batchSize < imagesCount) {
slog::warn << "Network batch size " << batchSize << " is less then images count " << imagesCount <<
", some input files will be ignored" << slog::endl;
}
// ------------------------------ Prepare output blobs -------------------------------------------------
slog::info << "Preparing output blobs" << slog::endl;
InferenceEngine::OutputsDataMap outputInfo(cnnNetwork.getOutputsInfo());
InferenceEngine::BlobMap outputBlobs;
for (auto& item : outputInfo) {
const InferenceEngine::DataPtr outData = item.second;
if (!outData) {
throw std::logic_error("output data pointer is not valid");
}
InferenceEngine::SizeVector outputDims = outData->dims;
const InferenceEngine::Precision outputPrecision = InferenceEngine::Precision::FP32;
/** Set the precision of output data provided by the user, should be called before load of the network to the plugin **/
outData->precision = outputPrecision;
InferenceEngine::TBlob<float>::Ptr output = InferenceEngine::make_shared_blob<float>(item.second->getTensorDesc());
output->allocate();
outputBlobs[item.first] = output;
}
// --------------------------- 5. Loading model to the plugin ------------------------------------------
slog::info << "Loading model to the plugin" << slog::endl;
const std::map<std::string, std::string> networkConfig;
InferenceEngine::ExecutableNetwork exeNetwork = plugin.LoadNetwork(cnnNetwork, networkConfig);
// --------------------------- 6. Performance measurements stuff ------------------------------------------
typedef std::chrono::high_resolution_clock Time;
typedef std::chrono::nanoseconds ns;
std::vector<float> times;
long long durationInNanoseconds;
if (FLAGS_niter != 0) {
durationInNanoseconds = 0LL;
times.reserve(FLAGS_niter);
} else {
durationInNanoseconds = getDurationInNanoseconds(FLAGS_d);
}
if (FLAGS_api == "sync") {
InferRequest inferRequest = exeNetwork.CreateInferRequest();
slog::info << "Sync request created" << slog::endl;
for (const InputsDataMap::value_type& item : inputInfo) {
Blob::Ptr inputBlob = inferRequest.GetBlob(item.first);
fillBlobWithImage(inputBlob, inputs, batchSize, *item.second);
}
if (FLAGS_niter != 0) {
slog::info << "Start inference synchronously (" << FLAGS_niter << " sync inference executions)" << slog::endl << slog::endl;
} else {
slog::info << "Start inference synchronously (" << durationInNanoseconds * 0.000001 << " ms duration)" << slog::endl << slog::endl;
}
const auto startTime = Time::now();
auto currentTime = Time::now();
size_t iteration = 0ULL;
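// Run either FLAGS_niter iterations or, if -niter is not set, keep iterating until the
// predefined duration for the target device has elapsed.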
while ((iteration < FLAGS_niter) || ((FLAGS_niter == 0LL) && ((currentTime - startTime).count() < durationInNanoseconds))) {
const auto iterationStartTime = Time::now();
inferRequest.Infer();
currentTime = Time::now();
const auto iterationDurationNs = std::chrono::duration_cast<ns>(currentTime - iterationStartTime);
times.push_back(static_cast<double>(iterationDurationNs.count()) * 0.000001);
iteration++;
}
std::sort(times.begin(), times.end());
const double latency = getMedianValue(times);
slog::info << "Latency: " << latency << " ms" << slog::endl;
slog::info << "Throughput: " << batchSize * 1000.0 / latency << " FPS" << slog::endl;
} else if (FLAGS_api == "async") {
std::vector<InferRequest> inferRequests;
inferRequests.reserve(FLAGS_nireq);
for (size_t i = 0; i < FLAGS_nireq; i++) {
InferRequest inferRequest = exeNetwork.CreateInferRequest();
inferRequests.push_back(inferRequest);
for (const InputsDataMap::value_type& item : inputInfo) {
Blob::Ptr inputBlob = inferRequest.GetBlob(item.first);
fillBlobWithImage(inputBlob, inputs, batchSize, *item.second);
}
}
if (FLAGS_niter != 0) {
slog::info << "Start inference asynchronously (" << FLAGS_niter <<
" async inference executions, " << FLAGS_nireq <<
" inference requests in parallel)" << slog::endl << slog::endl;
} else {
slog::info << "Start inference asynchronously (" << durationInNanoseconds * 0.000001 <<
" ms duration, " << FLAGS_nireq <<
" inference requests in parallel)" << slog::endl << slog::endl;
}
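// The infer requests form a ring: "currentInference" indexes the request started on this step,
// while "previousInference" lags FLAGS_nireq - 1 slots behind and is the request whose
// completion is awaited before its slot is reused.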
size_t currentInference = 0ULL;
bool requiredInferenceRequestsWereExecuted = false;
long long previousInference = 1LL - FLAGS_nireq;
// warming up - out of scope
inferRequests[0].StartAsync();
inferRequests[0].Wait(InferenceEngine::IInferRequest::WaitMode::RESULT_READY);
const size_t stepsCount = FLAGS_niter + FLAGS_nireq - 1;
/** Start inference & calculate performance **/
const auto startTime = Time::now();
size_t step = 0ULL;
while ((!requiredInferenceRequestsWereExecuted) ||
(step < stepsCount) ||
((FLAGS_niter == 0LL) && ((Time::now() - startTime).count() < durationInNanoseconds))) {
// start new inference
inferRequests[currentInference].StartAsync();
// wait for the previous inference execution to complete, if there is one
if (previousInference >= 0) {
const StatusCode code = inferRequests[previousInference].Wait(InferenceEngine::IInferRequest::WaitMode::RESULT_READY);
if (code != StatusCode::OK) {
throw std::logic_error("Wait");
}
}
currentInference++;
if (currentInference >= FLAGS_nireq) {
currentInference = 0;
requiredInferenceRequestsWereExecuted = true;
}
previousInference++;
if (previousInference >= FLAGS_nireq) {
previousInference = 0;
}
step++;
}
// wait for the remaining inference executions to complete
for (size_t notCompletedIndex = 0ULL; notCompletedIndex < (FLAGS_nireq - 1); ++notCompletedIndex) {
if (previousInference >= 0) {
const StatusCode code = inferRequests[previousInference].Wait(InferenceEngine::IInferRequest::WaitMode::RESULT_READY);
if (code != StatusCode::OK) {
throw std::logic_error("Wait");
}
}
previousInference++;
if (previousInference >= FLAGS_nireq) {
previousInference = 0LL;
}
}
const double totalDuration = std::chrono::duration_cast<ns>(Time::now() - startTime).count() * 0.000001;
const double fps = batchSize * 1000.0 * step / totalDuration;
slog::info << "Throughput: " << fps << " FPS" << slog::endl;
} else {
throw std::logic_error("unknown api command line argument value");
}
} catch (const std::exception& ex) {
slog::err << ex.what() << slog::endl;
return 3;
}
return 0;
}
long long getDurationInNanoseconds(const std::string& device) {
auto duration = 0LL;
for (const auto& deviceDurationInSeconds : deviceDurationsInSeconds) {
if (device.find(deviceDurationInSeconds.first) != std::string::npos) {
duration = std::max(duration, deviceDurationInSeconds.second);
}
}
if (duration == 0LL) {
const auto unknownDeviceIt = find_if(
deviceDurationsInSeconds.begin(),
deviceDurationsInSeconds.end(),
[](std::pair<std::string, long long> deviceDuration) { return deviceDuration.first == "UNKNOWN"; });
if (unknownDeviceIt == deviceDurationsInSeconds.end()) {
throw std::logic_error("UNKNOWN device was not found in device duration list");
}
duration = unknownDeviceIt->second;
slog::warn << "Default duration " << duration << " seconds for unknown device '" << device << "' is used" << slog::endl;
}
return duration * 1000000000LL;
}
double getMedianValue(const std::vector<float>& sortedTimes) {
return (sortedTimes.size() % 2 != 0) ?
sortedTimes[sortedTimes.size() / 2ULL] :
(sortedTimes[sortedTimes.size() / 2ULL] + sortedTimes[sortedTimes.size() / 2ULL - 1ULL]) / 2.0;
}
void fillBlobWithImage(
Blob::Ptr& inputBlob,
const std::vector<std::string>& filePaths,
const size_t batchSize,
const InferenceEngine::InputInfo& info) {
uint8_t* inputBlobData = inputBlob->buffer().as<uint8_t*>();
const SizeVector& inputBlobDims = inputBlob->dims();
slog::info << "Input dimensions (" << info.getTensorDesc().getLayout() << "): ";
for (const auto& i : info.getTensorDesc().getDims()) {
slog::info << i << " ";
}
slog::info << slog::endl;
/** Collect images data ptrs **/
std::vector<std::shared_ptr<uint8_t>> vreader;
vreader.reserve(batchSize);
for (size_t i = 0ULL, inputIndex = 0ULL; i < batchSize; i++, inputIndex++) {
if (inputIndex >= filePaths.size()) {
inputIndex = 0ULL;
}
FormatReader::ReaderPtr reader(filePaths[inputIndex].c_str());
if (reader.get() == nullptr) {
slog::warn << "Image " << filePaths[inputIndex] << " cannot be read!" << slog::endl << slog::endl;
continue;
}
/** Getting image data **/
std::shared_ptr<uint8_t> imageData(reader->getData(info.getDims()[0], info.getDims()[1]));
if (imageData) {
vreader.push_back(imageData);
}
}
/** Fill input tensor with images. First b channel, then g and r channels **/
const size_t numChannels = inputBlobDims[2];
const size_t imageSize = inputBlobDims[1] * inputBlobDims[0];
/** Iterate over all input images **/
for (size_t imageId = 0; imageId < vreader.size(); ++imageId) {
/** Iterate over all pixel in image (b,g,r) **/
for (size_t pid = 0; pid < imageSize; pid++) {
/** Iterate over all channels **/
for (size_t ch = 0; ch < numChannels; ++ch) {
/** [images stride + channels stride + pixel id ] all in bytes **/
inputBlobData[imageId * imageSize * numChannels + ch * imageSize + pid] = vreader.at(imageId).get()[pid*numChannels + ch];
}
}
}
}

View File

@@ -0,0 +1,61 @@
#!/bin/bash
# Copyright (c) 2018 Intel Corporation
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
error() {
local code="${3:-1}"
if [[ -n "$2" ]];then
echo "Error on or near line $1: $2; exiting with status ${code}"
else
echo "Error on or near line $1; exiting with status ${code}"
fi
exit "${code}"
}
trap 'error ${LINENO}' ERR
SAMPLES_PATH="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"
if [[ -z "${InferenceEngine_DIR}" ]]; then
printf "\nInferenceEngine_DIR environment variable is not set. Trying to find setupvars.sh to set it. \n"
setvars_path=$SAMPLES_PATH/../..
if [ -e "$setvars_path/inference_engine/bin/setvars.sh" ]; then # for Intel Deep Learning Deployment Toolkit package
setvars_path="$setvars_path/inference_engine/bin/setvars.sh"
elif [ -e "$setvars_path/../bin/setupvars.sh" ]; then # for OpenVINO package
setvars_path="$setvars_path/../bin/setupvars.sh"
elif [ -e "$setvars_path/../setupvars.sh" ]; then
setvars_path="$setvars_path/../setupvars.sh"
else
printf "Error: setupvars.sh is not found in hardcoded paths. \n\n"
exit 1
fi
if ! source $setvars_path ; then
printf "Unable to run ./setupvars.sh. Please check its presence. \n\n"
exit 1
fi
fi
if ! command -v cmake &>/dev/null; then
printf "\n\nCMAKE is not installed. It is required to build Inference Engine samples. Please install it. \n\n"
exit 1
fi
build_dir=$HOME/inference_engine_samples_build
mkdir -p $build_dir
cd $build_dir
cmake -DCMAKE_BUILD_TYPE=Release $SAMPLES_PATH
make -j8
printf "\nBuild completed, you can find binaries for all samples in the $HOME/inference_engine_samples_build/intel64/Release subfolder.\n\n"

View File

@@ -0,0 +1,68 @@
# Copyright (c) 2018 Intel Corporation
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
# http://www.apache.org/licenses/LICENSE-2.0
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
cmake_minimum_required(VERSION 2.8)
set (TARGET_NAME "calibration_tool")
file (GLOB MAIN_SRC
${CMAKE_CURRENT_SOURCE_DIR}/*.cpp
${CMAKE_CURRENT_SOURCE_DIR}/../validation_app/pugixml/*.cpp
${CMAKE_CURRENT_SOURCE_DIR}/../validation_app/ClassificationProcessor.cpp
${CMAKE_CURRENT_SOURCE_DIR}/../validation_app/classification_set_generator.cpp
${CMAKE_CURRENT_SOURCE_DIR}/../validation_app/image_decoder.cpp
${CMAKE_CURRENT_SOURCE_DIR}/../validation_app/ObjectDetectionProcessor.cpp
${CMAKE_CURRENT_SOURCE_DIR}/../validation_app/Processor.cpp
${CMAKE_CURRENT_SOURCE_DIR}/../validation_app/VOCAnnotationParser.cpp
)
file (GLOB MAIN_HEADERS
${CMAKE_CURRENT_SOURCE_DIR}/*.hpp
${CMAKE_CURRENT_SOURCE_DIR}/pugixml/*.hpp
)
# Create named folders for the sources within the .vcproj
# Empty name lists them directly under the .vcproj
source_group("src" FILES ${MAIN_SRC})
source_group("include" FILES ${MAIN_HEADERS})
# opencv include folders
find_package(OpenCV QUIET COMPONENTS core imgproc highgui imgcodecs)
if(NOT(OpenCV_FOUND))
find_package(OpenCV QUIET COMPONENTS world)
if(NOT(OpenCV_FOUND))
message(WARNING "No suitable OpenCV version detected, " ${TARGET_NAME} " skipped")
return()
endif()
endif()
# Properties->C/C++->General->Additional Include Directories
include_directories (${CMAKE_CURRENT_SOURCE_DIR}/../classification_sample/core
${CMAKE_CURRENT_SOURCE_DIR}/../common
${CMAKE_CURRENT_SOURCE_DIR}/../common/os/windows
${CMAKE_CURRENT_SOURCE_DIR}/../../include
${OpenCV_INCLUDE_DIRS}
${CMAKE_CURRENT_SOURCE_DIR}/../validation_app)
link_directories(${LIB_FOLDER})
# Create executable from sources.
add_executable(${TARGET_NAME} ${MAIN_SRC} ${MAIN_HEADERS})
set_target_properties(${TARGET_NAME} PROPERTIES "CMAKE_CXX_FLAGS" "${CMAKE_CXX_FLAGS} -fPIE"
COMPILE_PDB_NAME ${TARGET_NAME})
target_link_libraries(${TARGET_NAME} gflags IE::ie_cpu_extension ${InferenceEngine_LIBRARIES} ${OpenCV_LIBRARIES})
if (UNIX)
target_link_libraries(${TARGET_NAME} dl)
endif()

View File

@@ -0,0 +1,103 @@
# Calibration Tool
The Inference Engine Calibration Tool calibrates a given FP32 model so that it can be run in low-precision 8-bit integer
mode while keeping the input data of this model in the original precision.
## Calibration Tool Options
The core command-line options for the Calibration Tool are the same as for
[Validation Application](./samples/validation_app/README.md). However, the Calibration Tool has the following specific options: `-t`, `-subset`, `-output`, and `-threshold`.
Running the Calibration Tool with the `-h` option yields the following usage message with all CLI options listed:
```sh
Usage: calibration_tool [OPTION]
Available options:
-h Print a help message
-t <type> Type of an inferred network ("C" by default)
-t "C" to calibrate Classification network and write the calibrated network to IR
-t "OD" to calibrate Object Detection network and write the calibrated network to IR
-t "RawC" to collect only statistics for Classification network and write statistics to IR. With this option, a model is not calibrated. For calibration and statisctics collection, use "-t C" instead.
-t "RawOD" to collect only statistics for Object Detection network and write statistics to IR. With this option, a model is not calibrated. For calibration and statisctics collection, use "-t OD" instead
-i <path> Required. Path to a directory with validation images. For Classification models, the directory must contain folders named as labels with images inside or a .txt file with a list of images. For Object Detection models, the dataset must be in VOC format.
-m <path> Required. Path to an .xml file with a trained model, including model name and extension.
-l <absolute_path> Required for CPU custom layers. Absolute path to a shared library with the kernel implementations.
-c <absolute_path> Required for GPU custom kernels. Absolute path to an .xml file with the kernel descriptions.
-d <device> Target device to infer on: CPU (default), GPU, FPGA, or MYRIAD. The application looks for a suitable plugin for the specified device.
-b N Batch size value. If not specified, the batch size value is taken from IR
-ppType <type> Preprocessing type. Options: "None", "Resize", "ResizeCrop"
-ppSize N Preprocessing size (used with ppType="ResizeCrop")
-ppWidth W Preprocessing width (overrides -ppSize, used with ppType="ResizeCrop")
-ppHeight H Preprocessing height (overrides -ppSize, used with ppType="ResizeCrop")
--dump Dump file names and inference results to a .csv file
-subset Number of pictures from the whole validation set to create the calibration dataset. Default value is 0, which stands for the whole provided dataset
-output <output_IR> Output name for calibrated model. Default is <original_model_name>_i8.xml|bin
-threshold Threshold for a maximum accuracy drop of the quantized model. Must be an integer number (percent) without a percent sign. Default value is 1, which stands for an accepted accuracy drop of 1%
Classification-specific options:
-Czb true "Zero is a background" flag. Some networks are trained with a modified dataset where the class IDs are enumerated from 1, but 0 is an undefined "background" class (which is never detected)
Object detection-specific options:
-ODkind <kind> Type of an Object Detection model. Options: SSD
-ODa <path> Required for Object Detection models. Path to a directory containing an .xml file with annotations for images.
-ODc <file> Required for Object Detection models. Path to a file with a list of classes
-ODsubdir <name> Directory between the path to images (specified with -i) and image name (specified in the .xml file). For VOC2007 dataset, use JPEGImages.
```
The tool options are divided into two categories:
1. **Common options** named with a single letter or a word, such as <code>-b</code> or <code>--dump</code>.
These options are the same in all calibration tool modes.
2. **Network type-specific options** named as an acronym of the network type (<code>C</code> or <code>OD</code>)
followed by a letter or a word.
## Calibrate a Classification Model
To calibrate a classification convolutional neural network (CNN)
on a subset of images (first 2000 images) from the given dataset (specified with the `-i` option), run the following command:
```bash
./calibration_tool -t C -i <path_to_images_directory_or_txt_file> -m <path_to_classification_model>/<model_name>.xml -d <CPU|GPU> -subset 2000
```
The dataset must have the correct format. Classification models support two formats: folders
named as labels that contain all images of that class, and an ImageNet*-like format with a
`.txt` file containing a list of images and class IDs.
For more information on the structure of the datasets, refer to the **Prepare a Dataset** section of the
[Validation Application document](./samples/validation_app/README.md).
If you use a subset of the given dataset, prefer the ImageNet-like format over the
"folder as classes" format. This gives a more accurate calibration because the subset is more
likely to contain images representing different classes.
For example, to calibrate the pretrained TensorFlow\* `inception_v4_tf.xml` classification model,
run the following command:
```bash
./calibration_tool -t C -m inception_v4_tf.xml -i ILSVRC2012_val.txt -Czb false -ppType "ResizeCrop" -ppSize 342 -b 1 -d CPU -subset 2000
```
## Calibrate an Object Detection Model
This topic demonstrates how to run the Calibration Tool on an Object Detection CNN on a set of images. Please
review the list of Object Detection models used for validation of the Calibration Tool
in the [8-bit Inference Introduction](./docs/Inference_Engine_Developer_Guide/Int8Inference.md).
Any network that can be inferred with the Inference Engine and has the same input and output
format as the SSD CNN should be supported as well.
### Run SSD Network on the VOC dataset
Before you start calibrating the model, make sure your dataset is in the correct format. For more information,
refer to the **Prepare a Dataset** section of the
[Validation Application document](./samples/validation_app/README.md).
Once you have prepared the dataset, you can calibrate the model on it by running the following command:
```bash
./calibration_tool -d CPU -t OD -ODa "<path_to_image_annotations>/VOCdevkit/VOC2007/Annotations" -i "<path_to_image_directory>/VOCdevkit" -m "<path_to_model>/vgg_voc0712_ssd_300x300.xml" -ODc "<path_to_classes_list>/VOC_SSD_Classes.txt" -ODsubdir JPEGImages -subset 500
```
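The command above assumes a standard VOC-style directory layout. As a quick sanity check before running
the tool (paths mirror the placeholders used in the command and are illustrative), you can list the
annotation and image folders:
```bash
# -ODa points to the per-image XML annotations, -i to the dataset root,
# and -ODsubdir names the image folder between the dataset root and the image names
ls "<path_to_image_annotations>/VOCdevkit/VOC2007/Annotations" | head -n 3
ls "<path_to_image_directory>/VOCdevkit/VOC2007/JPEGImages" | head -n 3
```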
## See Also
* [Using Inference Engine Samples](./docs/Inference_Engine_Developer_Guide/Samples_Overview.md)
View File
@@ -0,0 +1,847 @@
// Copyright (C) 2018 Intel Corporation
//
// SPDX-License-Identifier: Apache-2.0
//
#include "calibrator_processors.h"
#include <string> // std::string
#include <iostream> // std::cout
#include <sstream> // std::stringstream
#include <iomanip>
#include <algorithm>
#include <map>
#include <memory>
#include <utility>
#include <list>
#include "details/ie_cnn_network_tools.h"
#include "details/caseless.hpp"
using namespace InferenceEngine;
using namespace InferenceEngine::details;
using InferenceEngine::details::InferenceEngineException;
CNNLayerPtr Int8Calibrator::addScaleShiftBeforeLayer(std::string name, CNNLayer::Ptr beforeLayer, size_t port, std::vector<float> scale) {
if (beforeLayer->insData.size() < port) {
THROW_IE_EXCEPTION << "cannot find appropraite port for addScaleShiftBeforeLayer";
}
DataPtr pData = beforeLayer->insData[port].lock();
LayerParams params;
params.name = name;
params.precision = Precision::FP32;
params.type = "ScaleShift";
CNNLayerPtr lptr = std::make_shared<ScaleShiftLayer>(params);
ScaleShiftLayer *pScaleShift = dynamic_cast<ScaleShiftLayer *>(lptr.get());
SizeVector wdims({ pData->dims[2] });
if (scale.size() == 1) {
scale.resize(wdims[0]);
for (int i = 1; i < wdims[0]; i++) {
scale[i] = scale[0];
}
}
if (scale.size() != pData->dims[2]) {
THROW_IE_EXCEPTION << "Failed to add scaleshift before " << beforeLayer->name << " due to scales and layer output dims incossitency";
}
Blob::Ptr weights = nullptr;
weights = make_shared_blob<float>(Precision::FP32, Layout::C, wdims);
weights->allocate();
float *buffer = weights->buffer().as<float *>();
if (buffer == nullptr) {
THROW_IE_EXCEPTION << "Could not allocate weights buffer";
}
for (size_t i = 0, idx = 0; i < pData->dims[2]; i++) {
buffer[i] = scale[i];
}
pScaleShift->_weights = weights;
SizeVector bdims({ pData->dims[2] });
Blob::Ptr biases = nullptr;
biases = make_shared_blob<float>(Precision::FP32, Layout::C, bdims);
biases->allocate();
buffer = biases->buffer().as<float *>();
for (size_t i = 0, idx = 0; i < pData->dims[2]; i++) {
buffer[i] = 0.f;
}
pScaleShift->_biases = biases;
Data *edge2 = new Data(*pData.get());
DataPtr newEdge(edge2);
lptr->insData.push_back(pData);
lptr->outData.push_back(newEdge);
newEdge->name = /*"EdgeAfter_" +*/ params.name;
newEdge->creatorLayer = lptr;
newEdge->inputTo.clear();
newEdge->inputTo[beforeLayer->name] = beforeLayer;
pData->inputTo.erase(beforeLayer->name);
pData->inputTo[params.name] = lptr;
for (size_t i = 0; i < beforeLayer->insData.size(); i++) {
DataPtr d = beforeLayer->insData[i].lock();
if (d == pData) {
beforeLayer->insData[i] = newEdge;
break;
}
}
return lptr;
}
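// Computes the normalized root-mean-square deviation (NRMSD) between two blobs:
// NRMSD = sqrt(mean((ref - res)^2)) / (max(ref) - min(ref))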
float Int8Calibrator::compare_NRMSD(InferenceEngine::Blob::Ptr res, InferenceEngine::Blob::Ptr ref) {
float *res_ptr = res->buffer().as<float *>();
size_t res_size = res->size();
float *ref_ptr = ref->buffer().as<float *>();
size_t ref_size = ref->size();
float sum = 0;
float mmin = ref_ptr[0], mmax = ref_ptr[0];
for (size_t i = 0; i < ref_size; i++) {
float sqr = (ref_ptr[i] - res_ptr[i]);
sqr *= sqr;
sum += sqr;
mmin = std::min(mmin, ref_ptr[i]);
mmax = std::max(mmax, ref_ptr[i]);
}
sum /= ref_size;
sum = pow(sum, 0.5);
sum /= mmax - mmin;
return sum;
}
InferenceEngine::NetworkStatsMap Int8Calibrator::getStatistic(float threshold) {
InferenceEngine::NetworkStatsMap netNodesStats;
// go over all outputs and get aggregated statistics
for (auto l : _statData.registeredLayers()) {
NetworkNodeStatsPtr nodeStats;
size_t channels = _statData.getNumberChannels(l);
if (netNodesStats.find(l) == netNodesStats.end()) {
nodeStats = NetworkNodeStatsPtr(new NetworkNodeStats(channels));
netNodesStats[l] = nodeStats;
} else {
nodeStats = netNodesStats[l];
}
for (size_t c = 0; c < channels; c++) {
_statData.getDataMinMax(l, c, nodeStats->_minOutputs[c], nodeStats->_maxOutputs[c], threshold);
}
}
return netNodesStats;
}
void Int8Calibrator::collectFP32Statistic() {
_collectByLayer = false;
_collectStatistic = true;
networkReaderC = InferenceEngine::CNNNetReader();
networkReaderC.ReadNetwork(_modelFileNameI8C);
if (!networkReaderC.isParseSuccess()) THROW_IE_EXCEPTION << "Cannot load the model: parsing failed";
if (_cBatch == 0) {
// Zero means "take batch value from the IR"
_cBatch = networkReaderC.getNetwork().getBatchSize();
} else {
// Not zero means "use the specified value"
networkReaderC.getNetwork().setBatchSize(_cBatch);
}
/** Extract model name and load weights **/
std::string binFileName = fileNameNoExt(_modelFileNameI8C) + ".bin";
networkReaderC.ReadWeights(binFileName.c_str());
auto network = networkReaderC.getNetwork();
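// Insert an identity ScaleShift (scale == 1, shift == 0) right after every network input
// so that activation statistics can also be collected for the input data itself.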
std::vector<CNNLayerPtr> layersAfterInputs;
std::string hackPrefix = "scaleshifted_input:";
for (auto &&layer : network) {
if (layer->insData.size() > 0) {
std::string inName = layer->input()->getName();
for (auto &&input : network.getInputsInfo()) {
if (inName == input.first) {
layersAfterInputs.push_back(layer);
_inputsFromLayers[hackPrefix + layer->name] = inName;
}
}
}
}
for (auto &&layer : layersAfterInputs) {
std::string firstInputName = hackPrefix + layer->name;
auto scaleShiftLayer = addScaleShiftBeforeLayer(firstInputName, layer, 0, { 1.f });
((ICNNNetwork&)network).addLayer(scaleShiftLayer);
}
// 1. add all layers as outputs
for (auto &&layer : network) {
std::string layerType = network.getLayerByName(layer->name.c_str())->type;
if (/*layerType != "Split" &&*/layerType != "Input") {
network.addOutput(layer->name);
}
_statData.registerLayer(layer->name);
}
ExecutableNetwork executable_network = _pluginI8C.LoadNetwork(network, { { CONFIG_KEY(EXCLUSIVE_ASYNC_REQUESTS), CONFIG_VALUE(YES) } });
_inferRequestI8C = executable_network.CreateInferRequest();
}
void Int8Calibrator::validateInt8Config(const InferenceEngine::NetworkStatsMap &stat,
const std::map<std::string, bool> &layersToInt8) {
_collectByLayer = false;
_collectStatistic = false;
networkReaderC = InferenceEngine::CNNNetReader();
networkReaderC.ReadNetwork(_modelFileNameI8C);
if (!networkReaderC.isParseSuccess()) THROW_IE_EXCEPTION << "Cannot load the model: parsing failed";
if (_cBatch == 0) {
// Zero means "take batch value from the IR"
_cBatch = networkReaderC.getNetwork().getBatchSize();
} else {
// Not zero means "use the specified value"
networkReaderC.getNetwork().setBatchSize(_cBatch);
}
/** Extract model name and load weights **/
std::string binFileName = fileNameNoExt(_modelFileNameI8C) + ".bin";
networkReaderC.ReadWeights(binFileName.c_str());
// Initialize statistic
ICNNNetworkStats *pstats = nullptr;
StatusCode s = ((ICNNNetwork&)networkReaderC.getNetwork()).getStats(&pstats, nullptr);
if (s == StatusCode::OK && pstats) {
pstats->setNodesStats(stat);
}
auto network = networkReaderC.getNetwork();
for (auto l : layersToInt8) {
network.getLayerByName(l.first.c_str())->
params["quantization_level"] = (l.second == false) ? "FP32" : "I8";
}
ExecutableNetwork executable_network = _pluginI8C.LoadNetwork(network, { { CONFIG_KEY(EXCLUSIVE_ASYNC_REQUESTS), CONFIG_VALUE(YES) } });
_inferRequestI8C = executable_network.CreateInferRequest();
}
CNNNetwork Int8Calibrator::createICNNNetworkForLayer(CNNLayer::Ptr layerToClone) {
CNNLayer::Ptr layerRelU = layerToClone->outData[0]->inputTo.begin()->second;
InferenceEngine::CNNNetReader reader1;
std::string inputName = layerToClone->insData[0].lock()->name;
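// Build a minimal Input -> Convolution -> ReLU -> ScaleShift IR as an in-memory XML string.
// The convolution attributes written here are placeholders; they are overwritten below
// with the parameters and weights of the original layer.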
std::string model = "<net name=\"L\" version=\"2\" batch=\"1\"><layers> " \
"<layer name=\"" +
inputName +
"\" type=\"Input\" precision=\"FP32\" id=\"0\"> "\
"<output>"\
"<port id=\"0\">"\
"<dim>1</dim>"\
"<dim>3</dim>"\
"<dim>224</dim>"\
"<dim>224</dim>"\
"</port>"\
"</output>"\
"</layer>" \
"<layer name=\"" +
layerToClone->name +
"\" type=\"Convolution\" precision=\"FP32\" id=\"1\">" \
"<convolution_data stride-x=\"2\" stride-y=\"2\" pad-x=\"3\" pad-y=\"3\" kernel-x=\"7\" kernel-y=\"7\" output=\"64\" group=\"1\" />"\
"<input>"\
"<port id=\"1\">"\
"<dim>1</dim>"\
"<dim>3</dim>"\
"<dim>224</dim>"\
"<dim>224</dim>"\
"</port>"\
"</input>"\
"<output>"\
"<port id=\"2\">"\
"<dim>1</dim>"\
"<dim>64</dim>"\
"<dim>112</dim>"\
"<dim>112</dim>"\
"</port>"\
"</output>"\
"</layer>"\
"<layer name=\"" +
layerRelU->name +
"\" type=\"ReLU\" precision=\"FP32\" id=\"2\">"\
"<input>"
"<port id=\"3\">"\
"<dim>1</dim>"\
"<dim>64</dim>"\
"<dim>112</dim>"\
"<dim>112</dim>"\
"</port>"\
"</input>"\
"<output>"\
"<port id=\"4\">"\
"<dim>1</dim>"\
"<dim>64</dim>"\
"<dim>112</dim>"\
"<dim>112</dim>"\
"</port>"\
"</output>"\
"</layer>"\
"<layer name=\"" +
layerToClone->name +
"_\" type=\"ScaleShift\" precision=\"FP32\" id=\"3\">"\
"<input>"
"<port id=\"5\">"\
"<dim>1</dim>"\
"<dim>64</dim>"\
"<dim>112</dim>"\
"<dim>112</dim>"\
"</port>"\
"</input>"\
"<output>"\
"<port id=\"6\">"\
"<dim>1</dim>"\
"<dim>64</dim>"\
"<dim>112</dim>"\
"<dim>112</dim>"\
"</port>"\
"</output>"\
"</layer>"\
"</layers> <edges>"\
"<edge from-layer=\"0\" from-port=\"0\" to-layer=\"1\" to-port=\"1\"/> "\
"<edge from-layer=\"1\" from-port=\"2\" to-layer=\"2\" to-port=\"3\"/> "\
"<edge from-layer=\"2\" from-port=\"4\" to-layer=\"3\" to-port=\"5\"/> "\
"</edges></net>";
reader1.ReadNetwork(model.c_str(), model.length());
ICNNNetwork &n = reader1.getNetwork();
InferenceEngine::InputsDataMap inputs;
n.getInputsInfo(inputs);
CNNLayerPtr inputLayer = inputs.begin()->second->getInputData()->creatorLayer.lock();
CNNLayerPtr convLayer;
n.getLayerByName(layerToClone->name.c_str(), convLayer, nullptr);
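// Copy convolution parameters, weights, and biases from the original layer into the
// template network so the cloned single-layer network matches the layer being measured.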
ConvolutionLayer *pConvS = dynamic_cast<ConvolutionLayer *>(layerToClone.get());
ConvolutionLayer *pConvT = dynamic_cast<ConvolutionLayer *>(convLayer.get());
pConvT->_kernel_x = pConvS->_kernel_x;
pConvT->_kernel_y = pConvS->_kernel_y;
pConvT->_stride_x = pConvS->_stride_x;
pConvT->_stride_y = pConvS->_stride_y;
pConvT->_out_depth = pConvS->_out_depth;
pConvT->_padding_x = pConvS->_padding_x;
pConvT->_padding_y = pConvS->_padding_y;
pConvT->_dilation_x = pConvS->_dilation_x;
pConvT->_dilation_y = pConvS->_dilation_y;
pConvT->_group = pConvS->_group;
pConvT->_weights = pConvS->_weights;
pConvT->_biases = pConvS->_biases;
pConvT->blobs = pConvS->blobs;
std::shared_ptr<Data> cur = layerToClone->insData[0].lock();
if (cur == nullptr) {
THROW_IE_EXCEPTION << "[Samples] shared ptr layerToClone->insData[0].lock() return nullptr";
}
DataPtr inputEdge = std::make_shared<Data>(*cur.get());
inputEdge->getInputTo().clear();
inputEdge->name = inputName;
inputEdge->creatorLayer = inputLayer;
inputEdge->inputTo[layerToClone->name] = convLayer;
inputs.begin()->second->setInputData(inputEdge);
convLayer->insData.clear();
convLayer->insData.push_back(inputEdge);
inputLayer->outData.clear();
inputLayer->outData.push_back(inputEdge);
DataPtr convEdge = std::make_shared<Data>(*layerToClone->outData[0].get());
convEdge->getInputTo().clear();
convEdge->creatorLayer = convLayer;
convEdge->name = convLayer->name;
convLayer->outData.clear();
convLayer->outData.push_back(convEdge);
CNNLayerPtr reluLayer;
n.getLayerByName(layerRelU->name.c_str(), reluLayer, nullptr);
DataPtr reluEdge = std::make_shared<Data>(*layerRelU->outData[0].get());
reluEdge->getInputTo().clear();
reluEdge->creatorLayer = reluLayer;
reluEdge->name = reluLayer->name;
reluLayer->insData.clear();
reluLayer->insData.push_back(convEdge);
reluLayer->outData.clear();
reluLayer->outData.push_back(reluEdge);
convEdge->inputTo[reluLayer->name] = reluLayer;
CNNLayerPtr ssLayer;
std::string ssLayerName = convLayer->name + "_";
n.getLayerByName(ssLayerName.c_str(), ssLayer, nullptr);
DataPtr ssEdge = std::make_shared<Data>(*layerRelU->outData[0].get());
ssEdge->getInputTo().clear();
ssEdge->creatorLayer = ssLayer;
ssEdge->name = ssLayer->name;
ssLayer->insData.clear();
ssLayer->insData.push_back(reluEdge);
ssLayer->outData.clear();
ssLayer->outData.push_back(ssEdge);
reluEdge->inputTo[ssLayer->name] = ssLayer;
n.addOutput(ssLayer->name);
// filling weights and biases
size_t channels = ssEdge->getTensorDesc().getDims()[1];
Blob::Ptr weights = nullptr;
SizeVector wdims;
wdims.push_back(channels);
weights = make_shared_blob<float, const SizeVector>(Precision::FP32, Layout::C, wdims);
weights->allocate();
float *dataw = weights->buffer().as<float *>();
for (size_t i = 0; i < channels; i++) {
dataw[i] = 1.0f;
}
ssLayer->blobs["weights"] = weights;
Blob::Ptr biases = nullptr;
SizeVector bdims;
bdims.push_back(channels);
biases = make_shared_blob<float, const SizeVector>(Precision::FP32, Layout::C, bdims);
biases->allocate();
float *datab = biases->buffer().as<float *>();
for (size_t i = 0; i < channels; i++) {
datab[i] = 0.0f;
}
ssLayer->blobs["biases"] = biases;
auto wss = dynamic_cast<WeightableLayer*>(ssLayer.get());
wss->_weights = weights;
wss->_biases = biases;
return reader1.getNetwork();
}
void Int8Calibrator::collectByLayerStatistic(const InferenceEngine::NetworkStatsMap &stat) {
_collectByLayer = true;
_collectStatistic = false;
networkReaderC = InferenceEngine::CNNNetReader();
networkReaderC.ReadNetwork(_modelFileNameI8C);
if (!networkReaderC.isParseSuccess()) THROW_IE_EXCEPTION << "Cannot load the model: parsing failed";
if (_cBatch != 0) {
networkReaderC.getNetwork().setBatchSize(_cBatch);
}
/** Extract model name and load weights **/
std::string binFileName = fileNameNoExt(_modelFileNameI8C) + ".bin";
networkReaderC.ReadWeights(binFileName.c_str());
auto network = networkReaderC.getNetwork();
// 1. add all layers as outputs
for (auto &&layer : network) {
std::string layerType = network.getLayerByName(layer->name.c_str())->type;
if (/*layerType != "Split" &&*/layerType != "Input") {
network.addOutput(layer->name);
}
if (layerType == "Convolution") {
_layersAccuracyDrop[layer->name] = 0.f;
}
}
ExecutableNetwork executable_network = _pluginI8C.LoadNetwork(network, { { CONFIG_KEY(EXCLUSIVE_ASYNC_REQUESTS), CONFIG_VALUE(YES) } });
_inferRequestI8C = executable_network.CreateInferRequest();
// 2. go over all layers which affect accuracy and create network basing on it
for (auto l : _layersAccuracyDrop) {
CNNLayerPtr layerToClone = network.getLayerByName(l.first.c_str());
CNNLayerPtr layerRelU = nullptr;
// verification if there is a Conv-ReLU pattern
// currently this is the only supported case:
// the convolution must have a single output and that output must feed a ReLU layer
bool quantization = false;
if (layerToClone->outData.size() == 1 && layerToClone->outData[0]->inputTo.size() == 1) {
layerRelU = layerToClone->outData[0]->inputTo.begin()->second;
if (layerRelU->type == "ReLU") {
quantization = true;
}
}
if (quantization) {
CNNNetwork n = createICNNNetworkForLayer(layerToClone);
if (_cBatch != 0) {
n.setBatchSize(_cBatch);
}
// Initialize statistic
ICNNNetworkStats *pstats = nullptr;
ICNNNetwork &in = n;
StatusCode s = in.getStats(&pstats, nullptr);
if (s == StatusCode::OK && pstats) {
pstats->setNodesStats(stat);
}
InferenceEngine::InputsDataMap inputs = n.getInputsInfo();
DataPtr q = inputs.begin()->second->getInputData();
ExecutableNetwork enetwork = _pluginI8C.LoadNetwork(n, { { CONFIG_KEY(EXCLUSIVE_ASYNC_REQUESTS), CONFIG_VALUE(YES) } });
_singleLayerNetworks.push_back(enetwork);
InferenceEngine::InferRequest request = enetwork.CreateInferRequest();
std::string inputName = layerToClone->insData[0].lock()->name;
request.SetBlob(inputName, _inferRequestI8C.GetBlob(inputName));
_singleLayerRequests[layerToClone->name] = { request, layerRelU->name, layerToClone->name };
}
}
}
void Int8Calibrator::collectCalibrationStatistic() {
if (_collectByLayer) {
std::map<std::string, SingleLayerData>::iterator it = _singleLayerRequests.begin();
while (it != _singleLayerRequests.end()) {
it->second._request.Infer();
Blob::Ptr expected = _inferRequestI8C.GetBlob(it->second._outputName);
std::string i8Out = it->second._outputI8Name + "_";
Blob::Ptr result = it->second._request.GetBlob(i8Out.c_str());
float diff = compare_NRMSD(result, expected);
it->second._int8Accuracy.push_back(diff);
it++;
}
}
if (_collectStatistic) {
for (auto l : _statData.registeredLayers()) {
auto outBlob = _inferRequestI8C.GetBlob(l);
std::string outName = l;
if (_inputsFromLayers.find(l) != _inputsFromLayers.end()) {
outName = _inputsFromLayers[l];
}
size_t N, C, statCount;
if (outBlob->dims().size() == 4 && outBlob->layout() == Layout::NCHW) {
N = outBlob->dims()[3];
C = outBlob->dims()[2];
statCount = C;
} else if (outBlob->dims().size() == 2 && outBlob->layout() == Layout::NC) {
N = outBlob->dims()[1];
C = outBlob->dims()[0];
statCount = 1;
} else {
continue;
}
// Counting min/max outputs per channel
for (size_t n = 0; n < N; n++) {
if (outBlob->dims().size() == 4) {
size_t _HW = outBlob->dims()[0] * outBlob->dims()[1];
for (size_t c = 0; c < C; c++) {
if (outBlob->getTensorDesc().getPrecision() == Precision::FP32) {
float *ptr = &outBlob->buffer().as<float *>()[(n * C + c) * _HW];
_statData.addTensorStatistics(outName, c, ptr, _HW);
} else if (outBlob->getTensorDesc().getPrecision() == Precision::U8) {
uint8_t *ptr = &outBlob->buffer().as<uint8_t *>()[(n * C + c) * _HW];
_statData.addTensorStatistics(outName, c, ptr, _HW);
} else {
throw std::logic_error(std::string("Unsupported precision: ") + outBlob->getTensorDesc().getPrecision().name());
}
}
} else if (outBlob->dims().size() == 2) {
if (outBlob->getTensorDesc().getPrecision() == Precision::FP32) {
float *ptr = &outBlob->buffer().as<float *>()[n * C];
_statData.addTensorStatistics(outName, 0, ptr, C);
} else if (outBlob->getTensorDesc().getPrecision() == Precision::U8) {
uint8_t *ptr = &outBlob->buffer().as<uint8_t *>()[n * C];
_statData.addTensorStatistics(outName, 0, ptr, C);
} else {
throw std::logic_error(std::string("Unsupported precision: ") + outBlob->getTensorDesc().getPrecision().name());
}
}
}
}
}
}
void Int8Calibrator::calculateLayersAccuracyDrop() {
_layersAccuracyDrop.clear();
std::map<std::string, SingleLayerData>::iterator it = _singleLayerRequests.begin();
while (it != _singleLayerRequests.end()) {
// calculate average metric per layer over all images and sort in desc order
float mo = 0.f;
for (auto d : it->second._int8Accuracy) {
mo += d;
}
mo = mo / it->second._int8Accuracy.size();
_layersAccuracyDrop[it->first] = mo;
it++;
}
// correction of accuracy drop to have sorted values for cases when accuracy drop is equal
// correction is added according to topological order
// this will prioritize returning of layers to FP32 starting from layers closer to the end of network
std::vector<CNNLayerPtr> ordered = InferenceEngine::details::CNNNetSortTopologically(networkReaderC.getNetwork());
float c = 0.00001f;
for (auto l : ordered) {
auto it = _layersAccuracyDrop.find(l->name);
if (it != _layersAccuracyDrop.end()) {
it->second += c;
}
c += 0.00001f;
}
_singleLayerRequests.clear();
}
std::map<std::string, float> Int8Calibrator::layersAccuracyDrop() {
return _layersAccuracyDrop;
}
//--------------------------------------------------------------------------------------------------
ClassificationCalibrator::ClassificationCalibrator(int nPictures, const std::string &flags_m,
const std::string &flags_d, const std::string &flags_i,
int flags_b, InferenceEngine::InferencePlugin plugin,
CsvDumper &dumper, const std::string &flags_l,
PreprocessingOptions preprocessingOptions, bool zeroBackground) :
ClassificationProcessor(flags_m, flags_d, flags_i, flags_b,
plugin, dumper, flags_l,
preprocessingOptions, zeroBackground) {
_modelFileNameI8C = modelFileName;
_pluginI8C = plugin;
_nPictures = nPictures;
_cBatch = flags_b;
}
shared_ptr<Processor::InferenceMetrics> ClassificationCalibrator::Process() {
inferRequest = _inferRequestI8C;
int top1Result = 0, total = 0;
ClassificationSetGenerator generator;
auto validationMap = generator.getValidationMap(imagesPath);
ImageDecoder decoder;
// ----------------------------Do inference-------------------------------------------------------------
std::vector<int> expected(batch);
std::vector<std::string> files(batch);
int captured = 0;
if (!_nPictures) {
_nPictures = validationMap.size();
}
ConsoleProgress progress(_nPictures);
CalibrationMetrics im;
std::string firstInputName = this->inputInfo.begin()->first;
std::string firstOutputName = this->outInfo.begin()->first;
auto firstInputBlob = inferRequest.GetBlob(firstInputName);
auto firstOutputBlob = inferRequest.GetBlob(firstOutputName);
size_t ipics = 0;
auto iter = validationMap.begin();
while (iter != validationMap.end() && ipics < _nPictures) {
int b = 0;
int filesWatched = 0;
for (; b < batch && iter != validationMap.end() && ipics + b < _nPictures ; b++, iter++, filesWatched++) {
expected[b] = iter->first;
try {
decoder.insertIntoBlob(iter->second, b, *firstInputBlob, preprocessingOptions);
files[b] = iter->second;
} catch (const InferenceEngineException &iex) {
slog::warn << "Can't read file " << iter->second << slog::endl;
// Could be some non-image file in directory
b--;
continue;
}
}
ipics += batch;
Infer(progress, filesWatched, im);
collectCalibrationStatistic();
std::vector<unsigned> results;
auto firstOutputData = firstOutputBlob->buffer().as<PrecisionTrait<Precision::FP32>::value_type *>();
InferenceEngine::TopResults(1, *firstOutputBlob, results);
for (int i = 0; i < b; i++) {
int expc = expected[i];
if (zeroBackground) expc++;
bool top1Scored = (results[i] == expc);
if (top1Scored) top1Result++;
total++;
}
}
progress.finish();
calculateLayersAccuracyDrop();
im.AccuracyResult = static_cast<float>(top1Result) / static_cast<float>(total);
return std::shared_ptr<Processor::InferenceMetrics>(new CalibrationMetrics(im));
}
//--------------------------------------------------------------------------------------------------
SSDObjectDetectionCalibrator::SSDObjectDetectionCalibrator(int nPictures, const std::string &flags_m,
const std::string &flags_d, const std::string &flags_i,
const std::string &subdir, int flags_b,
double threshold,
InferencePlugin plugin, CsvDumper &dumper,
const std::string &flags_a, const std::string &classes_list_file) :
SSDObjectDetectionProcessor(flags_m, flags_d, flags_i, subdir, flags_b,
threshold,
plugin, dumper,
flags_a, classes_list_file) {
_modelFileNameI8C = modelFileName;
_pluginI8C = plugin;
_nPictures = nPictures;
}
shared_ptr<Processor::InferenceMetrics> SSDObjectDetectionCalibrator::Process() {
inferRequest = _inferRequestI8C;
// Parsing PASCAL VOC2012 format
VOCAnnotationParser vocAnnParser;
VOCAnnotationCollector annCollector(annotationsPath);
if (annCollector.annotations().size() == 0) {
ObjectDetectionInferenceMetrics emptyIM(this->threshold);
return std::shared_ptr<InferenceMetrics>(new ObjectDetectionInferenceMetrics(emptyIM));
}
// Getting desired results from annotations
std::map<std::string, ImageDescription> desiredForFiles;
for (auto &ann : annCollector.annotations()) {
std::list<DetectedObject> dobList;
for (auto &obj : ann.objects) {
DetectedObject dob(classes[obj.name], obj.bndbox.xmin, obj.bndbox.ymin, obj.bndbox.xmax, obj.bndbox.ymax, 1.0, obj.difficult != 0);
dobList.push_back(dob);
}
ImageDescription id(dobList);
desiredForFiles.insert(std::pair<std::string, ImageDescription>(ann.folder + "/" + (!subdir.empty() ? subdir + "/" : "") + ann.filename, id));
}
ImageDecoder decoder;
const int maxProposalCount = outputDims[1];
const int objectSize = outputDims[0];
for (auto &item : outInfo) {
DataPtr outputData = item.second;
if (!outputData) {
throw std::logic_error("output data pointer is not valid");
}
}
// -----------------------------------------------------------------------------------------------------
// ----------------------------Do inference-------------------------------------------------------------
std::vector<VOCAnnotation> expected(batch);
if (!_nPictures) {
_nPictures = annCollector.annotations().size();
}
ConsoleProgress progress(_nPictures);
ObjectDetectionInferenceMetrics im(threshold);
vector<VOCAnnotation>::const_iterator iter = annCollector.annotations().begin();
std::map<std::string, ImageDescription> scaledDesiredForFiles;
std::string firstInputName = this->inputInfo.begin()->first;
auto firstInputBlob = inferRequest.GetBlob(firstInputName);
size_t ipics = 0;
while (iter != annCollector.annotations().end() && ipics < _nPictures) {
std::vector<std::string> files;
int b = 0;
int filesWatched = 0;
for (; b < batch && iter != annCollector.annotations().end(); b++, iter++, filesWatched++) {
expected[b] = *iter;
string filename = iter->folder + "/" + (!subdir.empty() ? subdir + "/" : "") + iter->filename;
try {
Size orig_size = decoder.insertIntoBlob(std::string(imagesPath) + "/" + filename, b, *firstInputBlob, preprocessingOptions);
float scale_x, scale_y;
scale_x = 1.0 / iter->size.width; // orig_size.width;
scale_y = 1.0 / iter->size.height; // orig_size.height;
if (scaleProposalToInputSize) {
scale_x *= firstInputBlob->dims()[0];
scale_y *= firstInputBlob->dims()[1];
}
// Scaling the desired result (taken from the annotation) to the network size
scaledDesiredForFiles.insert(std::pair<std::string, ImageDescription>(filename, desiredForFiles.at(filename).scale(scale_x, scale_y)));
files.push_back(filename);
} catch (const InferenceEngineException &iex) {
slog::warn << "Can't read file " << this->imagesPath + "/" + filename << slog::endl;
// Could be some non-image file in directory
b--;
continue;
}
ipics++;
}
if (files.size() == batch) {
InferenceEngine::StatusCode sts;
InferenceEngine::ResponseDesc dsc;
// Infer model
Infer(progress, filesWatched, im);
collectCalibrationStatistic();
// Processing the inference result
std::map<std::string, std::list<DetectedObject>> detectedObjects = processResult(files);
// Calculating similarity
//
for (int b = 0; b < files.size(); b++) {
ImageDescription result(detectedObjects[files[b]]);
im.apc.consumeImage(result, scaledDesiredForFiles.at(files[b]));
}
}
}
progress.finish();
calculateLayersAccuracyDrop();
CalibrationMetrics imCalibration;
const ObjectDetectionInferenceMetrics &odim = dynamic_cast<const ObjectDetectionInferenceMetrics&>(im);
if (im.nRuns > 0) {
std::map<int, double> appc = odim.apc.calculateAveragePrecisionPerClass();
double mAP = 0;
for (auto i : appc) {
mAP += i.second;
}
imCalibration.AccuracyResult = mAP / appc.size();
}
return std::shared_ptr<Processor::InferenceMetrics>(new CalibrationMetrics(imCalibration));
}
View File
@@ -0,0 +1,178 @@
// Copyright (C) 2018 Intel Corporation
//
// SPDX-License-Identifier: Apache-2.0
//
#pragma once
#include <vector>
#include <string>
#include "inference_engine.hpp"
#include "ClassificationProcessor.hpp"
#include "SSDObjectDetectionProcessor.hpp"
#include "data_stats.h"
#include <map>
#include <memory>
/**
* Calibrator class representing unified stages for calibration of any kind of networks
*/
class Int8Calibrator {
public:
/**
* Intermediate structure storing data for measuring the by-layer accuracy drop statistic
*/
struct SingleLayerData {
InferenceEngine::InferRequest _request;
std::string _outputName;
std::string _outputI8Name;
std::vector<float> _int8Accuracy;
};
/**
* Initializes state to collect the accuracy of the FP32 network and the activation statistics.
* The activation statistics are stored in _statData and hold the max/min values for all
* layers and all pictures.
* The inference of all pictures and the actual collection of the statistics happen during the call of
* Processor::Process()
*/
void collectFP32Statistic();
/**
* Initializes a state to collect the intermediate numeric accuracy drop that happens during quantization
* of a certain layer to int8. The numeric accuracy drop is measured using the NRMSD metric.
*
* For this purpose it creates a dedicated network for the layer and initializes this
* network with statistics that make the dedicated network execute in int8 mode.
*
* In addition to the dedicated networks, the full original network is executed in FP32 mode with
* all layers registered as output ones.
* Information from these layers is used as
* a) input to the dedicated single-layer networks
* b) the reference for the NRMSD comparison between the I8 and FP32 calculations
*
* The inference of all pictures and the actual collection of the drop happen during the call of
* Processor::Process()
* @param stat - activation statistics used to run the dedicated networks in int8 mode
*/
void collectByLayerStatistic(const InferenceEngine::NetworkStatsMap &stat);
/**
* Initializes state to collect the accuracy drop in int8 mode to be compared later against the FP32
* accuracy metric.
*
* The inference of all pictures and the actual collection of the accuracy happen during the call of
* Processor::Process()
*
* @param stat - The statistic for normalization
* @param layersToInt8 - map of layers planned to be executed in int8. If a layer is absent from this
* map, it is assumed that it will be executed in int8
*/
void validateInt8Config(const InferenceEngine::NetworkStatsMap &stat,
const std::map<std::string, bool>& layersToInt8);
/**
* Statistics collected in collectFP32Statistic are processed with the threshold passed as a parameter
* to this method. All values for each layer and for all pictures are sorted, and the number of min/max
* values that exceed the threshold is thrown off
* @param threshold - parameter for throwing off outliers in the activation statistics
* @return InferenceEngine::NetworkStatsMap - mapping of layer name to NetworkNodeStatsPtr
*/
InferenceEngine::NetworkStatsMap getStatistic(float threshold);
/**
* returns by-layer accuracy drop container
*/
std::map<std::string, float> layersAccuracyDrop();
protected:
/**
* This function should be called from the final calibrator after each Infer for each picture.
* It calculates the by-layer accuracy drop and also collects activation value statistics
*/
void collectCalibrationStatistic();
/**
* This function should be called from the calibration class after inference of all pictures.
* It calculates the average NRMSD-based accuracy drop for each layer and fills _layersAccuracyDrop
*/
void calculateLayersAccuracyDrop();
bool _collectByLayer = false;
bool _collectStatistic = true;
InferencePlugin _pluginI8C;
std::string _modelFileNameI8C;
InferenceEngine::CNNNetReader networkReaderC;
InferenceEngine::InferRequest _inferRequestI8C;
int _cBatch = 0;
int _nPictures;
private:
/**
* Helper function for getting statistics for input layers. To collect statistics for them, we
* add a ScaleShift just after the input with scale == 1 and shift == 0
*/
CNNLayerPtr addScaleShiftBeforeLayer(std::string name, InferenceEngine::CNNLayer::Ptr beforeLayer,
size_t port, std::vector<float> scale);
/**
* Returns Normalized root-mean-square deviation metric for two blobs passed to the function
*/
float compare_NRMSD(InferenceEngine::Blob::Ptr res, InferenceEngine::Blob::Ptr ref);
/**
* Creates a dedicated int8 network around the selected layer. Currently this network, besides the layer
* itself, has to contain ReLU and ScaleShift layers.
* Since the Inference Engine API is mostly directed at loading networks from IR, we need to create
* such an IR first, read it from a stream, and modify the network to match the required parameters
*/
InferenceEngine::CNNNetwork createICNNNetworkForLayer(InferenceEngine::CNNLayer::Ptr layerToClone);
std::map<std::string, float> _layersAccuracyDrop;
std::vector<InferenceEngine::ExecutableNetwork> _singleLayerNetworks;
std::map<std::string, SingleLayerData> _singleLayerRequests;
std::map<std::string, std::string> _inputsFromLayers;
AggregatedDataStats _statData;
};
/**
* This class represents the single generalized metric that is used for comparison of
* accuracy drop
*/
struct CalibrationMetrics : public ClassificationProcessor::InferenceMetrics {
public:
float AccuracyResult = 0;
};
/**
* Calibration class for classification networks.
* Responsible for proper post-processing of results and calculation of the Top-1 metric, which is used as
* the universal accuracy metric and participates in verification of the accuracy drop
*/
class ClassificationCalibrator : public ClassificationProcessor, public Int8Calibrator {
public:
ClassificationCalibrator(int nPictures, const std::string &flags_m, const std::string &flags_d,
const std::string &flags_i, int flags_b,
InferenceEngine::InferencePlugin plugin, CsvDumper &dumper, const std::string &flags_l,
PreprocessingOptions preprocessingOptions, bool zeroBackground);
shared_ptr<InferenceMetrics> Process() override;
};
/**
* Calibration class for SSD object detection networks.
* Responsible for proper post-processing of results and calculation of the mAP metric, which is used as
* the universal accuracy metric and participates in verification of the accuracy drop
*/
class SSDObjectDetectionCalibrator : public SSDObjectDetectionProcessor, public Int8Calibrator {
public:
SSDObjectDetectionCalibrator(int nPictures, const std::string &flags_m, const std::string &flags_d,
const std::string &flags_i, const std::string &subdir, int flags_b,
double threshold,
InferencePlugin plugin, CsvDumper &dumper,
const std::string &flags_a, const std::string &classes_list_file);
shared_ptr<InferenceMetrics> Process() override;
};
View File
@@ -0,0 +1,105 @@
// Copyright (C) 2018 Intel Corporation
//
// SPDX-License-Identifier: Apache-2.0
//
#include <stdlib.h>
#include <cfloat>
#include <cmath>
#include <stdint.h>
#include <iostream>
#include <limits>
#include <vector>
#include <algorithm>
#include <string>
#include "data_stats.h"
TensorStatistic::TensorStatistic(float* data, size_t count, size_t nbuckets) {
_min = std::numeric_limits<float>::max();
_max = std::numeric_limits<float>::lowest();
for (size_t i = 0; i < count; i++) {
float val = static_cast<float>(data[i]);
if (_min > val) {
_min = val;
}
if (_max < val) {
_max = val;
}
}
if (_min == _max) {
return;
}
}
float TensorStatistic::getMaxValue() const {
return _max;
}
float TensorStatistic::getMinValue() const {
return _min;
}
std::vector<std::string> AggregatedDataStats::registeredLayers() {
std::vector<std::string> layers;
for (auto l : _data) {
layers.push_back(l.first);
}
return layers;
}
void AggregatedDataStats::registerLayer(std::string layer) {
_data[layer];
}
void AggregatedDataStats::addTensorStatistics(const std::string& name, size_t channel, float* data, size_t count) {
auto&& byChannel = _data[name];
byChannel[channel].push_back(TensorStatistic(data, count));
}
void AggregatedDataStats::addTensorStatistics(const std::string &name, size_t channel, uint8_t *data, size_t count) {
std::vector<float> intermediate;
for (size_t i = 0; i < count; i++) {
intermediate.push_back(data[i]);
}
addTensorStatistics(name, channel, intermediate.data(), count);
}
size_t AggregatedDataStats::getNumberChannels(const std::string& name) const {
auto it = _data.find(name);
if (it != _data.end()) {
return it->second.size();
}
return 0;
}
void AggregatedDataStats::getDataMinMax(const std::string& name, size_t channel, float& min, float& max, float threshold) {
// take data by name
auto it = _data.find(name);
if (it != _data.end()) {
auto stats = it->second[channel];
// having absolute min/max values, we can create new statistic
std::vector<float> maxValues;
std::vector<float> minValues;
for (size_t i = 0; i < stats.size(); i++) {
const TensorStatistic& tsS = stats[i];
maxValues.push_back(tsS.getMaxValue());
minValues.push_back(tsS.getMinValue());
}
// define number of elements to throw out
size_t elementToTake = maxValues.size() * threshold / 100;
int elementsToThrow = maxValues.size() - elementToTake;
std::sort(maxValues.begin(), maxValues.end());
std::sort(minValues.begin(), minValues.end());
min = minValues[elementsToThrow];
max = maxValues[elementToTake - 1];
} else {
min = max = 0.f;
}
}
View File
@@ -0,0 +1,32 @@
// Copyright (C) 2018 Intel Corporation
//
// SPDX-License-Identifier: Apache-2.0
//
#pragma once
#include <vector>
#include <map>
#include <string>
struct TensorStatistic {
TensorStatistic(float* data, size_t count, size_t nbuckets = 1000);
float getMaxValue() const;
float getMinValue() const;
protected:
float _min;
float _max;
};
class AggregatedDataStats {
public:
void addTensorStatistics(const std::string& name, size_t channel, float* data, size_t count);
void addTensorStatistics(const std::string &name, size_t channel, uint8_t *data, size_t count);
void getDataMinMax(const std::string& name, size_t channel, float& min, float& max, float threshold);
size_t getNumberChannels(const std::string& name) const;
std::vector <std::string> registeredLayers();
void registerLayer(std::string layer);
protected:
std::map<std::string, std::map<size_t, std::vector<TensorStatistic> > > _data;
};
View File
@@ -0,0 +1,521 @@
// Copyright (C) 2018 Intel Corporation
//
// SPDX-License-Identifier: Apache-2.0
//
/**
* @brief The entry point for Inference Engine validation application
* @file validation_app/main.cpp
*/
#include <gflags/gflags.h>
#include <algorithm>
#include <functional>
#include <iostream>
#include <map>
#include <fstream>
#include <random>
#include <string>
#include <tuple>
#include <vector>
#include <limits>
#include <iomanip>
#include <memory>
#include <ext_list.hpp>
#include <samples/common.hpp>
#include <samples/slog.hpp>
#include "user_exception.hpp"
#include "calibrator_processors.h"
#include "SSDObjectDetectionProcessor.hpp"
#include "YOLOObjectDetectionProcessor.hpp"
#include "network_serializer.h"
#include "ie_icnn_network_stats.hpp"
#include "details/caseless.hpp"
using namespace std;
using namespace InferenceEngine;
using namespace InferenceEngine::details;
using InferenceEngine::details::InferenceEngineException;
#define DEFAULT_PATH_P "./lib"
/// @brief Message for help argument
static const char help_message[] = "Print a help message";
/// @brief Message for images argument
static const char image_message[] = "Required. Path to a directory with validation images. For Classification models, the directory must contain"
" folders named as labels with images inside or a .txt file with"
" a list of images. For Object Detection models, the dataset must be in"
" VOC format.";
/// @brief Message for plugin_path argument
static const char plugin_path_message[] = "Path to a plugin folder";
/// @brief message for model argument
static const char model_message[] = "Required. Path to an .xml file with a trained model, including model name and "
"extension.";
/// @brief Message for plugin argument
static const char plugin_message[] = "Plugin name. For example, CPU. If this parameter is passed, "
"the sample looks for a specified plugin only.";
/// @brief Message for assigning cnn calculation to device
static const char target_device_message[] = "Target device to infer on: CPU (default), GPU, FPGA, or MYRIAD."
" The application looks for a suitable plugin for the specified device.";
/// @brief Message for label argument
static const char label_message[] = "Path to a file with labels for a model";
/// @brief Message for batch argument
static const char batch_message[] = "Batch size value. If not specified, the batch size value is taken from IR";
/// @brief Message for dump argument
static const char dump_message[] = "Dump file names and inference results to a .csv file";
/// @brief Message for network type
static const char type_message[] = "Type of an inferred network (\"C\" by default)";
/// @brief Message for pp-type
static const char preprocessing_type[] = "Preprocessing type. Options: \"None\", \"Resize\", \"ResizeCrop\"";
/// @brief Message for pp-crop-size
static const char preprocessing_size[] = "Preprocessing size (used with ppType=\"ResizeCrop\")";
static const char preprocessing_width[] = "Preprocessing width (overrides -ppSize, used with ppType=\"ResizeCrop\")";
static const char preprocessing_height[] = "Preprocessing height (overrides -ppSize, used with ppType=\"ResizeCrop\")";
static const char obj_detection_annotations_message[] = "Required for Object Detection models. Path to a directory"
" containing an .xml file with annotations for images.";
static const char obj_detection_classes_message[] = "Required for Object Detection models. Path to a file with"
" a list of classes";
static const char obj_detection_subdir_message[] = "Directory between the path to images (specified with -i) and image name (specified in the"
" .xml file). For VOC2007 dataset, use JPEGImages.";
static const char obj_detection_kind_message[] = "Type of an Object Detection model. Options: SSD";
/// @brief Message for GPU custom kernels desc
static const char custom_cldnn_message[] = "Required for GPU custom kernels. "
"Absolute path to an .xml file with the kernel descriptions.";
/// @brief Message for user library argument
static const char custom_cpu_library_message[] = "Required for CPU custom layers. "
"Absolute path to a shared library with the kernel implementations.";
static const char zero_background_message[] = "\"Zero is a background\" flag. Some networks are trained with a modified"
" dataset where the class IDs "
" are enumerated from 1, but 0 is an undefined \"background\" class"
" (which is never detected)";
/// @brief Network type options and their descriptions
static const char* types_descriptions[][2] = {
{ "C", "calibrate Classification network and write the calibrated network to IR" },
// { "SS", "semantic segmentation" }, // Not supported yet
{ "OD", "calibrate Object Detection network and write the calibrated network to IR" },
{ "RawC", "collect only statistics for Classification network and write statistics to IR. With this option, a model is not calibrated. For calibration "
"and statisctics collection, use \"-t C\" instead." },
{ "RawOD", "collect only statistics for Object Detection network and write statistics to IR. With this option, a model is not calibrated. For calibration "
"and statisctics collection, use \"-t OD\" instead" },
{ nullptr, nullptr }
};
static const char accuracy_threshold_message[] = "Threshold for a maximum accuracy drop of quantized model."
" Must be an integer number (percents)"
" without a percent sign. Default value is 1, which stands for accepted"
" accuracy drop in 1%";
static const char number_of_pictures_message[] = "Number of pictures from the whole validation set to"
"create the calibration dataset. Default value is 0, which stands for"
"the whole provided dataset";
static const char output_model_name[] = "Output name for calibrated model. Default is <original_model_name>_i8.xml|bin";
/// @brief Define flag for showing help message <br>
DEFINE_bool(h, false, help_message);
/// @brief Define parameter for a path to images <br>
/// It is a required parameter
DEFINE_string(i, "", image_message);
/// @brief Define parameter for a path to model file <br>
/// It is a required parameter
DEFINE_string(m, "", model_message);
/// @brief Define parameter for a plugin name <br>
/// It is a required parameter
DEFINE_string(p, "", plugin_message);
/// @brief Define parameter for a path to a file with labels <br>
/// Default is empty
DEFINE_string(OCl, "", label_message);
/// @brief Define parameter for a path to plugins <br>
/// Default is ./lib
DEFINE_string(pp, DEFAULT_PATH_P, plugin_path_message);
/// @brief Define parameter for a target device to infer on <br>
DEFINE_string(d, "CPU", target_device_message);
/// @brief Define parameter for batch size <br>
/// Default is 0 (which means that batch size is not specified)
DEFINE_int32(b, 0, batch_message);
/// @brief Define flag to dump results to a file <br>
DEFINE_bool(dump, false, dump_message);
/// @brief Define parameter for a network type
DEFINE_string(t, "C", type_message);
/// @brief Define parameter for preprocessing type
DEFINE_string(ppType, "", preprocessing_type);
/// @brief Define parameter for preprocessing size
DEFINE_int32(ppSize, 0, preprocessing_size);
DEFINE_int32(ppWidth, 0, preprocessing_width);
DEFINE_int32(ppHeight, 0, preprocessing_height);
DEFINE_bool(Czb, false, zero_background_message);
DEFINE_string(ODa, "", obj_detection_annotations_message);
DEFINE_string(ODc, "", obj_detection_classes_message);
DEFINE_string(ODsubdir, "", obj_detection_subdir_message);
/// @brief Define parameter for a type of Object Detection network
DEFINE_string(ODkind, "SSD", obj_detection_kind_message);
/// @brief Define parameter for GPU kernels path <br>
/// Default is ./lib
DEFINE_string(c, "", custom_cldnn_message);
/// @brief Define parameter for a path to CPU library with user layers <br>
/// It is an optional parameter
DEFINE_string(l, "", custom_cpu_library_message);
/// @brief Define parameter for accuracy drop threshold
DEFINE_double(threshold, 1.0f, accuracy_threshold_message);
DEFINE_int32(subset, 0, number_of_pictures_message);
DEFINE_string(output, "", output_model_name);
/**
* @brief This function shows a help message
*/
static void showUsage() {
std::cout << std::endl;
std::cout << "Usage: calibration_tool [OPTION]" << std::endl << std::endl;
std::cout << "Available options:" << std::endl;
std::cout << std::endl;
std::cout << " -h " << help_message << std::endl;
std::cout << " -t <type> " << type_message << std::endl;
for (int i = 0; types_descriptions[i][0] != nullptr; i++) {
std::cout << " -t \"" << types_descriptions[i][0] << "\" to " << types_descriptions[i][1] << std::endl;
}
std::cout << " -i <path> " << image_message << std::endl;
std::cout << " -m <path> " << model_message << std::endl;
std::cout << " -l <absolute_path> " << custom_cpu_library_message << std::endl;
std::cout << " -c <absolute_path> " << custom_cldnn_message << std::endl;
std::cout << " -d <device> " << target_device_message << std::endl;
std::cout << " -b N " << batch_message << std::endl;
std::cout << " -ppType <type> " << preprocessing_type << std::endl;
std::cout << " -ppSize N " << preprocessing_size << std::endl;
std::cout << " -ppWidth W " << preprocessing_width << std::endl;
std::cout << " -ppHeight H " << preprocessing_height << std::endl;
std::cout << " --dump " << dump_message << std::endl;
std::cout << " -subset " << number_of_pictures_message << std::endl;
std::cout << " -output <output_IR> " << output_model_name << std::endl;
std::cout << " -threshold " << accuracy_threshold_message << std::endl;
std::cout << std::endl;
std::cout << " Classification-specific options:" << std::endl;
std::cout << " -Czb true " << zero_background_message << std::endl;
std::cout << std::endl;
std::cout << " Object detection-specific options:" << std::endl;
std::cout << " -ODkind <kind> " << obj_detection_kind_message << std::endl;
std::cout << " -ODa <path> " << obj_detection_annotations_message << std::endl;
std::cout << " -ODc <file> " << obj_detection_classes_message << std::endl;
std::cout << " -ODsubdir <name> " << obj_detection_subdir_message << std::endl << std::endl;
}
enum NetworkType {
Undefined = -1,
Classification,
ObjDetection,
RawC,
RawOD
};
std::string strtolower(const std::string& s) {
std::string res = s;
std::transform(res.begin(), res.end(), res.begin(), ::tolower);
return res;
}
void SaveCalibratedIR(const std::string &originalName,
const std::string &outModelName,
const std::map<std::string, bool>& layersToInt8,
const InferenceEngine::NetworkStatsMap& statMap) {
slog::info << "Layers profile for Int8 quantization\n";
CNNNetReader networkReader;
networkReader.ReadNetwork(originalName);
if (!networkReader.isParseSuccess()) THROW_IE_EXCEPTION << "Cannot load the model: parsing failed";
/** Extract model name and load weights **/
std::string binFileName = fileNameNoExt(originalName) + ".bin";
networkReader.ReadWeights(binFileName.c_str());
auto network = networkReader.getNetwork();
for (auto &&layer : network) {
if (CaselessEq<std::string>()(layer->type, "convolution")) {
auto it = layersToInt8.find(layer->name);
if (it != layersToInt8.end() && it->second == false) {
layer->params["quantization_level"] = "FP32";
std::cout << layer->name << ": " << "FP32" << std::endl;
} else {
layer->params["quantization_level"] = "I8";
std::cout << layer->name << ": " << "I8" << std::endl;
}
}
}
ICNNNetworkStats* pstats = nullptr;
StatusCode s = ((ICNNNetwork&)networkReader.getNetwork()).getStats(&pstats, nullptr);
if (s == StatusCode::OK && pstats) {
pstats->setNodesStats(statMap);
}
slog::info << "Write calibrated network to " << outModelName << ".(xml|bin) IR file\n";
CNNNetworkSerializer serializer;
serializer.Serialize(outModelName + ".xml", outModelName + ".bin", networkReader.getNetwork());
}
/**
* @brief The main function of inference engine sample application
* @param argc - The number of arguments
* @param argv - Arguments
* @return 0 if all good
*/
int main(int argc, char *argv[]) {
try {
slog::info << "InferenceEngine: " << GetInferenceEngineVersion() << slog::endl;
// ---------------------------Parsing and validating input arguments--------------------------------------
slog::info << "Parsing input parameters" << slog::endl;
bool noOptions = argc == 1;
gflags::ParseCommandLineNonHelpFlags(&argc, &argv, true);
if (FLAGS_h || noOptions) {
showUsage();
return 1;
}
UserExceptions ee;
NetworkType netType = Undefined;
// Checking the network type
if (std::string(FLAGS_t) == "C") {
netType = Classification;
} else if (std::string(FLAGS_t) == "OD") {
netType = ObjDetection;
} else if (std::string(FLAGS_t) == "RawC") {
netType = RawC;
} else if (std::string(FLAGS_t) == "RawOD") {
netType = RawOD;
} else {
ee << UserException(5, "Unknown network type specified (invalid -t option)");
}
// Checking required options
if (FLAGS_m.empty()) ee << UserException(3, "Model file is not specified (missing -m option)");
if (FLAGS_i.empty()) ee << UserException(4, "Images list is not specified (missing -i option)");
if (FLAGS_d.empty()) ee << UserException(5, "Target device is not specified (missing -d option)");
if (FLAGS_b < 0) ee << UserException(6, "Batch size must not be negative (invalid -b option value)");
if (netType == ObjDetection) {
// Checking required OD-specific options
if (FLAGS_ODa.empty()) ee << UserException(11, "Annotations folder is not specified for object detection (missing -ODa option)");
if (FLAGS_ODc.empty()) ee << UserException(12, "Classes file is not specified (missing -ODc option)");
if (FLAGS_b > 0) ee << UserException(13, "Batch option other than 0 is not supported for Object Detection networks");
}
if (!ee.empty()) throw ee;
// -----------------------------------------------------------------------------------------------------
// ---------------------Loading plugin for Inference Engine------------------------------------------------
slog::info << "Loading plugin" << slog::endl;
/** Loading the library with extensions if provided**/
InferencePlugin plugin = PluginDispatcher({ FLAGS_pp, "../../../lib/intel64", "" }).getPluginByDevice(FLAGS_d);
/** Loading default extensions **/
if (FLAGS_d.find("CPU") != std::string::npos) {
/**
* cpu_extensions library is compiled from "extension" folder containing
* custom CPU plugin layer implementations. These layers are not supported
* by CPU, but they can be useful for inferring custom topologies.
**/
plugin.AddExtension(std::make_shared<Extensions::Cpu::CpuExtensions>());
}
if (!FLAGS_l.empty()) {
// CPU extensions are loaded as a shared library and passed as a pointer to base extension
IExtensionPtr extension_ptr = make_so_pointer<IExtension>(FLAGS_l);
plugin.AddExtension(extension_ptr);
slog::info << "CPU Extension loaded: " << FLAGS_l << slog::endl;
}
if (!FLAGS_c.empty()) {
// GPU extensions are loaded from an .xml description and OpenCL kernel files
plugin.SetConfig({{PluginConfigParams::KEY_CONFIG_FILE, FLAGS_c}});
slog::info << "GPU Extension loaded: " << FLAGS_c << slog::endl;
}
printPluginVersion(plugin, std::cout);
CsvDumper dumper(FLAGS_dump);
std::shared_ptr<Processor> processor;
PreprocessingOptions preprocessingOptions;
if (strtolower(FLAGS_ppType.c_str()) == "none") {
preprocessingOptions = PreprocessingOptions(false, ResizeCropPolicy::DoNothing);
} else if (strtolower(FLAGS_ppType) == "resizecrop") {
size_t ppWidth = FLAGS_ppSize;
size_t ppHeight = FLAGS_ppSize;
if (FLAGS_ppWidth > 0) ppWidth = FLAGS_ppWidth;
if (FLAGS_ppHeight > 0) ppHeight = FLAGS_ppHeight;
if (FLAGS_ppSize > 0 || (FLAGS_ppWidth > 0 && FLAGS_ppHeight > 0)) {
preprocessingOptions = PreprocessingOptions(false, ResizeCropPolicy::ResizeThenCrop, ppWidth, ppHeight);
} else {
THROW_USER_EXCEPTION(2) << "Size must be specified for preprocessing type " << FLAGS_ppType;
}
} else if (strtolower(FLAGS_ppType) == "resize" || FLAGS_ppType.empty()) {
preprocessingOptions = PreprocessingOptions(false, ResizeCropPolicy::Resize);
} else {
THROW_USER_EXCEPTION(2) << "Unknown preprocessing type: " << FLAGS_ppType;
}
if (netType == Classification || netType == RawC) {
processor = std::shared_ptr<Processor>(
new ClassificationCalibrator(FLAGS_subset, FLAGS_m, FLAGS_d, FLAGS_i, FLAGS_b,
plugin, dumper, FLAGS_l, preprocessingOptions, FLAGS_Czb));
} else if (netType == ObjDetection || netType == RawOD) {
if (FLAGS_ODkind == "SSD") {
processor = std::shared_ptr<Processor>(
new SSDObjectDetectionCalibrator(FLAGS_subset, FLAGS_m, FLAGS_d, FLAGS_i, FLAGS_ODsubdir, FLAGS_b,
0.5, plugin, dumper, FLAGS_ODa, FLAGS_ODc));
/* } else if (FLAGS_ODkind == "YOLO") {
processor = std::shared_ptr<Processor>(
new YOLOObjectDetectionProcessor(FLAGS_m, FLAGS_d, FLAGS_i, FLAGS_ODsubdir, FLAGS_b,
0.5, plugin, dumper, FLAGS_ODa, FLAGS_ODc));
*/
}
} else {
THROW_USER_EXCEPTION(2) << "Unknown network type specified" << FLAGS_ppType;
}
if (!processor.get()) {
THROW_USER_EXCEPTION(2) << "Processor pointer is invalid" << FLAGS_ppType;
}
Int8Calibrator* calibrator = dynamic_cast<Int8Calibrator*>(processor.get());
if (netType != RawC && netType != RawOD) {
slog::info << "Collecting accuracy metric in FP32 mode to get a baseline, collecting activation statistics" << slog::endl;
} else {
slog::info << "Collecting activation statistics" << slog::endl;
}
calibrator->collectFP32Statistic();
shared_ptr<Processor::InferenceMetrics> pIMFP32 = processor->Process();
const CalibrationMetrics* mFP32 = dynamic_cast<const CalibrationMetrics*>(pIMFP32.get());
std::cout << " FP32 Accuracy: " << OUTPUT_FLOATING(100.0 * mFP32->AccuracyResult) << "% " << std::endl;
InferenceEngine::NetworkStatsMap statMap;
std::map<std::string, bool> layersToInt8;
bool bAccuracy = false;
if (netType != RawC && netType != RawOD) {
slog::info << "Verification of network accuracy if all possible layers converted to INT8" << slog::endl;
float bestThreshold = 100.f;
float maximalAccuracy = 0.f;
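// Sweep the outlier-rejection threshold for activation statistics from 100% down to 95.5%
// in 0.5% steps and remember the threshold that gives the best int8 accuracy.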
for (float threshold = 100.0f; threshold > 95.0f; threshold -= 0.5) {
std::cout << "Validate int8 accuracy, threshold for activation statistics = " << threshold << std::endl;
InferenceEngine::NetworkStatsMap tmpStatMap = calibrator->getStatistic(threshold);
calibrator->validateInt8Config(tmpStatMap, {});
shared_ptr<Processor::InferenceMetrics> pIM_I8 = processor->Process();
const CalibrationMetrics *mI8 = dynamic_cast<const CalibrationMetrics *>(pIM_I8.get());
if (maximalAccuracy < mI8->AccuracyResult) {
maximalAccuracy = mI8->AccuracyResult;
bestThreshold = threshold;
}
std::cout << " Accuracy is " << OUTPUT_FLOATING(100.0 * mI8->AccuracyResult) << "%" << std::endl;
}
statMap = calibrator->getStatistic(bestThreshold);
if ((mFP32->AccuracyResult - maximalAccuracy) > (FLAGS_threshold / 100)) {
slog::info << "Accuracy of all layers conversion does not correspond to the required threshold\n";
cout << "FP32 Accuracy: " << OUTPUT_FLOATING(100.0 * mFP32->AccuracyResult) << "% vs " <<
"all Int8 layers Accuracy: " << OUTPUT_FLOATING(100.0 * maximalAccuracy) << "%, " <<
"threshold for activation statistics: " << bestThreshold << "%" << std::endl;
slog::info << "Collecting intermediate per-layer accuracy drop" << slog::endl;
// collect per-layer statistics on the accuracy drop
calibrator->collectByLayerStatistic(statMap);
processor->Process();
// start reducing the number of layers converted to INT8
std::map<std::string, float> layersAccuracyDrop = calibrator->layersAccuracyDrop();
std::map<float, std::string> orderedLayersAccuracyDrop;
for (auto d : layersAccuracyDrop) {
orderedLayersAccuracyDrop[d.second] = d.first;
layersToInt8[d.first] = true;
}
std::map<float, std::string>::const_reverse_iterator it = orderedLayersAccuracyDrop.crbegin();
shared_ptr<Processor::InferenceMetrics> pIM_I8;
const CalibrationMetrics *mI8;
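// Greedily return the layers with the largest accuracy drop to FP32, one at a time, until the overall drop fits the threshold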
while (it != orderedLayersAccuracyDrop.crend() && bAccuracy == false) {
slog::info << "Returning of '" << it->second << "' to FP32 precision, start validation\n";
layersToInt8[it->second] = false;
calibrator->validateInt8Config(statMap, layersToInt8);
pIM_I8 = processor->Process();
mI8 = dynamic_cast<const CalibrationMetrics *>(pIM_I8.get());
maximalAccuracy = mI8->AccuracyResult;
if ((mFP32->AccuracyResult - maximalAccuracy) > (FLAGS_threshold / 100)) {
cout << "FP32 Accuracy: " << OUTPUT_FLOATING(100.0 * mFP32->AccuracyResult) << "% vs " <<
"current Int8 configuration Accuracy: " << OUTPUT_FLOATING(100.0 * maximalAccuracy) << "%" << std::endl;
} else {
bAccuracy = true;
}
it++;
}
} else {
bAccuracy = true;
}
if (bAccuracy) {
slog::info << "Achieved required accuracy drop satisfying threshold\n";
cout << "FP32 accuracy: " << OUTPUT_FLOATING(100.0 * mFP32->AccuracyResult) << "% vs " <<
"current Int8 configuration accuracy: " << OUTPUT_FLOATING(100.0 * maximalAccuracy) << "% " <<
"with threshold for activation statistic: " << bestThreshold << "%" << std::endl;
std::string outModelName = FLAGS_output.empty() ? fileNameNoExt(FLAGS_m) + "_i8" : fileNameNoExt(FLAGS_output);
SaveCalibratedIR(FLAGS_m, outModelName, layersToInt8, statMap);
} else {
slog::info << "Required threshold of accuracy drop cannot be achieved with any int8 quantization\n";
}
} else {
std::cout << "Collected activation statistics, writing maximum values to IR" << std::endl;
statMap = calibrator->getStatistic(100.0f);
std::string outModelName = FLAGS_output.empty() ? fileNameNoExt(FLAGS_m) + "_i8" : fileNameNoExt(FLAGS_output);
SaveCalibratedIR(FLAGS_m, outModelName, layersToInt8, statMap);
}
if (dumper.dumpEnabled()) {
slog::info << "Dump file generated: " << dumper.getFilename() << slog::endl;
}
} catch (const InferenceEngineException& ex) {
slog::err << "Inference problem: \n" << ex.what() << slog::endl;
return 1;
} catch (const UserException& ex) {
slog::err << "Input problem: \n" << ex.what() << slog::endl;
showUsage();
return ex.exitCode();
} catch (const UserExceptions& ex) {
if (ex.list().size() == 1) {
slog::err << "Input problem: " << ex.what() << slog::endl;
showUsage();
return ex.list().begin()->exitCode();
} else {
const char* s = ex.what();
slog::err << "Input problems: \n" << ex.what() << slog::endl;
showUsage();
return ex.list().begin()->exitCode();
}
}
return 0;
}

View File

@@ -0,0 +1,381 @@
// Copyright (C) 2018 Intel Corporation
//
// SPDX-License-Identifier: Apache-2.0
//
#include <fstream>
#include <map>
#include <vector>
#include <string>
#include <ie_precision.hpp>
#include "details/ie_cnn_network_tools.h"
#include "details/caseless.hpp"
#include "ie_layers_property.hpp"
#include "network_serializer.h"
#include "../common/samples/common.hpp"
using namespace InferenceEngine;
using namespace details;
template<typename T>
std::string arrayToIRProperty(const T& property) {
std::string sProperty;
for (size_t i = 0; i < property.size(); i++) {
sProperty = sProperty + std::to_string(property[i]) +
std::string((i != property.size() - 1) ? "," : "");
}
return sProperty;
}
template<typename T>
std::string arrayRevertToIRProperty(const T& property) {
std::string sProperty;
for (size_t i = 0; i < property.size(); i++) {
sProperty = sProperty + std::to_string(property[property.size() - i - 1]) +
std::string((i != property.size() - 1) ? "," : "");
}
return sProperty;
}
void CNNNetworkSerializer::Serialize(const std::string &xmlPath, const std::string &binPath,
ICNNNetwork &network) {
std::ofstream ofsBin(binPath, std::ofstream::out | std::ofstream::binary);
pugi::xml_document doc;
pugi::xml_node net = doc.append_child("net");
char name[1024];
network.getName(name, 1024);
net.append_attribute("name").set_value(name);
net.append_attribute("version").set_value("3");
net.append_attribute("batch").set_value("1");
pugi::xml_node layers = net.append_child("layers");
size_t dataOffset = 0;
std::string dataName = "data";
std::vector<CNNLayerPtr> ordered;
ordered = CNNNetSortTopologically(network);
std::map<CNNLayer::Ptr, int> matching;
for (size_t i = 0; i < ordered.size(); i++) {
matching[ordered[i]] = i;
}
for (size_t i = 0; i < ordered.size(); i++) {
CNNLayerPtr node = ordered[i];
pugi::xml_node layer = layers.append_child("layer");
Precision precision = node->precision;
layer.append_attribute("name").set_value(node->name.c_str());
layer.append_attribute("type").set_value(node->type.c_str());
layer.append_attribute("precision").set_value(precision.name());
layer.append_attribute("id").set_value(i);
updateStdLayerParams(node);
auto &params = node->params;
if (params.size()) {
pugi::xml_node data = layer.append_child(dataName.c_str());
for (auto it : params) {
data.append_attribute(it.first.c_str()).set_value(it.second.c_str());
}
}
if (node->insData.size()) {
pugi::xml_node input = layer.append_child("input");
for (size_t iport = 0; iport < node->insData.size(); iport++) {
DataPtr d = node->insData[iport].lock();
pugi::xml_node port = input.append_child("port");
port.append_attribute("id").set_value(iport);
for (auto dim : d->getDims()) {
port.append_child("dim").text().set(dim);
}
}
}
if (node->outData.size()) {
pugi::xml_node input = layer.append_child("output");
for (size_t oport = 0; oport < node->outData.size(); oport++) {
pugi::xml_node port = input.append_child("port");
port.append_attribute("id").set_value(node->insData.size() + oport);
for (auto dim : node->outData[oport]->getDims()) {
port.append_child("dim").text().set(dim);
}
}
}
if (node->blobs.size()) {
auto blobsNode = layer.append_child("blobs");
for (auto dataIt : node->blobs) {
const char *dataPtr = dataIt.second->buffer().as<char*>();
size_t dataSize = dataIt.second->byteSize();
pugi::xml_node data = blobsNode.append_child(dataIt.first.c_str());
data.append_attribute("offset").set_value(dataOffset);
data.append_attribute("size").set_value(dataSize);
dataOffset += dataSize;
ofsBin.write(dataPtr, dataSize);
}
}
}
pugi::xml_node edges = net.append_child("edges");
for (size_t i = 0; i < ordered.size(); i++) {
CNNLayer::Ptr node = ordered[i];
if (node->outData.size()) {
auto itFrom = matching.find(node);
if (itFrom == matching.end()) {
THROW_IE_EXCEPTION << "Internal error, cannot find " << node->name << " in matching container during serialization of IR";
}
for (size_t oport = 0; oport < node->outData.size(); oport++) {
DataPtr outData = node->outData[oport];
for (auto inputTo : outData->inputTo) {
auto itTo = matching.find(inputTo.second);
if (itTo == matching.end()) {
THROW_IE_EXCEPTION << "Broken edge form layer " << node->name << " to layer " << inputTo.first<< "during serialization of IR";
}
size_t foundPort = -1;
for (size_t iport = 0; iport < inputTo.second->insData.size(); iport++) {
if (inputTo.second->insData[iport].lock() == outData) {
foundPort = iport;
}
}
if (foundPort == -1) {
THROW_IE_EXCEPTION << "Broken edge from layer to parent, cannot find parent " << outData->name << " for layer " << inputTo.second->name
<< "\ninitial layer for edge output " << node->name;
}
pugi::xml_node edge = edges.append_child("edge");
edge.append_attribute("from-layer").set_value(itFrom->second);
edge.append_attribute("from-port").set_value(oport + node->insData.size());
edge.append_attribute("to-layer").set_value(itTo->second);
edge.append_attribute("to-port").set_value(foundPort);
}
}
}
}
InputsDataMap inputInfo;
network.getInputsInfo(inputInfo);
// assuming that pre-processing is specified for only one input
for (auto ii : inputInfo) {
auto pp = ii.second->getPreProcess();
size_t nInChannels = pp.getNumberOfChannels();
if (nInChannels) {
pugi::xml_node preproc = net.append_child("pre-process");
preproc.append_attribute("reference-layer-name").set_value(ii.first.c_str());
preproc.append_attribute("mean-precision").set_value(Precision(Precision::FP32).name());
for (size_t ch = 0; ch < nInChannels; ch++) {
PreProcessChannel::Ptr &preProcessChannel = pp[ch];
auto channel = preproc.append_child("channel");
channel.append_attribute("id").set_value(ch);
auto mean = channel.append_child("mean");
if (!preProcessChannel->meanData) {
mean.append_attribute("value").set_value(preProcessChannel->meanValue);
} else {
THROW_IE_EXCEPTION << "Mean data is not supported yet for serialization of the model";
}
}
}
}
// add statistics to the file if they exist
ICNNNetworkStats* netNodesStats = nullptr;
auto stats = net.append_child("statistics");
network.getStats(&netNodesStats, nullptr);
NetworkStatsMap statsmap = netNodesStats->getNodesStats();
auto joinCommas = [&](std::vector<float>& v) -> std::string {
std::string res;
for (size_t i = 0; i < v.size(); ++i) {
res += std::to_string(v[i]);
if (i < v.size() - 1) {
res += ", ";
}
}
return res;
};
for (auto itStats : statsmap) {
auto layer = stats.append_child("layer");
layer.append_child("name").text().set(itStats.first.c_str());
layer.append_child("min").text().set(joinCommas(itStats.second->_minOutputs).c_str());
layer.append_child("max").text().set(joinCommas(itStats.second->_maxOutputs).c_str());
}
doc.save_file(xmlPath.c_str());
}
void CNNNetworkSerializer::updateStdLayerParams(CNNLayer::Ptr layer) {
auto layerPtr = layer.get();
auto type = layer->type;
auto &params = layer->params;
if (CaselessEq<std::string>()(layer->type, "power")) {
PowerLayer *lr = dynamic_cast<PowerLayer *>(layerPtr);
params["scale"] = std::to_string(lr->scale);
params["shift"] = std::to_string(lr->offset);
params["power"] = std::to_string(lr->power);
} else if (CaselessEq<std::string>()(layer->type, "convolution") ||
CaselessEq<std::string>()(layer->type, "deconvolution")) {
ConvolutionLayer *lr = dynamic_cast<ConvolutionLayer *>(layerPtr);
params["kernel"] = arrayRevertToIRProperty(lr->_kernel);
params["pads_begin"] = arrayRevertToIRProperty(lr->_padding);
params["pads_end"] = arrayRevertToIRProperty(lr->_pads_end);
params["strides"] = arrayRevertToIRProperty(lr->_stride);
params["dilations"] = arrayRevertToIRProperty(lr->_dilation);
params["output"] = std::to_string(lr->_out_depth);
params["group"] = std::to_string(lr->_group);
} else if (CaselessEq<std::string>()(layer->type, "relu")) {
ReLULayer *lr = dynamic_cast<ReLULayer *>(layerPtr);
if (lr->negative_slope != 0.0f) {
params["negative_slope"] = std::to_string(lr->negative_slope);
}
} else if (CaselessEq<std::string>()(layer->type, "norm") ||
CaselessEq<std::string>()(layer->type, "lrn")) {
NormLayer *lr = dynamic_cast<NormLayer *>(layerPtr);
params["alpha"] = std::to_string(lr->_alpha);
params["beta"] = std::to_string(lr->_beta);
params["local-size"] = std::to_string(lr->_size);
params["region"] = lr->_isAcrossMaps ? "across" : "same";
} else if (CaselessEq<std::string>()(layer->type, "pooling")) {
PoolingLayer *lr = dynamic_cast<PoolingLayer *>(layerPtr);
params["kernel"] = arrayRevertToIRProperty(lr->_kernel);
params["pads_begin"] = arrayRevertToIRProperty(lr->_padding);
params["pads_end"] = arrayRevertToIRProperty(lr->_pads_end);
params["strides"] = arrayRevertToIRProperty(lr->_stride);
switch (lr->_type) {
case PoolingLayer::MAX:
params["pool-method"] = "max";
break;
case PoolingLayer::AVG:
params["pool-method"] = "avg";
break;
default:
THROW_IE_EXCEPTION << "Found unsupported pooling method: " << lr->_type;
}
} else if (CaselessEq<std::string>()(layer->type, "split")) {
SplitLayer *lr = dynamic_cast<SplitLayer *>(layerPtr);
params["axis"] = std::to_string(lr->_axis);
} else if (CaselessEq<std::string>()(layer->type, "concat")) {
ConcatLayer *lr = dynamic_cast<ConcatLayer *>(layerPtr);
params["axis"] = std::to_string(lr->_axis);
} else if (CaselessEq<std::string>()(layer->type, "FullyConnected") ||
CaselessEq<std::string>()(layer->type, "InnerProduct")) {
FullyConnectedLayer *lr = dynamic_cast<FullyConnectedLayer *>(layerPtr);
params["out-size"] = std::to_string(lr->_out_num);
} else if (CaselessEq<std::string>()(layer->type, "softmax")) {
SoftMaxLayer *lr = dynamic_cast<SoftMaxLayer *>(layerPtr);
params["axis"] = std::to_string(lr->axis);
} else if (CaselessEq<std::string>()(layer->type, "reshape")) {
// TODO: add support for the Flatten layer here if it is created via the API
ReshapeLayer *lr = dynamic_cast<ReshapeLayer *>(layerPtr);
params["axis"] = std::to_string(lr->axis);
params["num_axes"] = std::to_string(lr->num_axes);
params["dim"] = arrayToIRProperty(lr->shape);
} else if (CaselessEq<std::string>()(layer->type, "Eltwise")) {
EltwiseLayer *lr = dynamic_cast<EltwiseLayer *>(layerPtr);
std::string op;
switch (lr->_operation) {
case EltwiseLayer::Sum:
op = "sum";
break;
case EltwiseLayer::Prod:
op = "prod";
break;
case EltwiseLayer::Max:
op = "max";
break;
default:
break;
}
params["operation"] = op;
} else if (CaselessEq<std::string>()(layer->type, "scaleshift")) {
ScaleShiftLayer *lr = dynamic_cast<ScaleShiftLayer *>(layerPtr);
params["broadcast"] = std::to_string(lr->_broadcast);
} else if (CaselessEq<std::string>()(layer->type, "crop")) {
CropLayer *lr = dynamic_cast<CropLayer *>(layerPtr);
params["axis"] = arrayToIRProperty(lr->axis);
params["offset"] = arrayToIRProperty(lr->offset);
params["dim"] = arrayToIRProperty(lr->dim);
} else if (CaselessEq<std::string>()(layer->type, "tile")) {
TileLayer *lr = dynamic_cast<TileLayer *>(layerPtr);
params["axis"] = std::to_string(lr->axis);
params["tiles"] = std::to_string(lr->tiles);
} else if (CaselessEq<std::string>()(layer->type, "prelu")) {
PReLULayer *lr = dynamic_cast<PReLULayer *>(layerPtr);
params["channel_shared"] = std::to_string(lr->_channel_shared);
} else if (CaselessEq<std::string>()(layer->type, "clamp")) {
ClampLayer *lr = dynamic_cast<ClampLayer *>(layerPtr);
params["min"] = std::to_string(lr->min_value);
params["max"] = std::to_string(lr->max_value);
} else if (CaselessEq<std::string>()(layer->type, "BatchNormalization")) {
BatchNormalizationLayer *lr = dynamic_cast<BatchNormalizationLayer *>(layerPtr);
params["epsilon"] = std::to_string(lr->epsilon);
} else if (CaselessEq<std::string>()(layer->type, "grn")) {
GRNLayer *lr = dynamic_cast<GRNLayer *>(layerPtr);
params["bias"] = std::to_string(lr->bias);
} else if (CaselessEq<std::string>()(layer->type, "mvn")) {
MVNLayer *lr = dynamic_cast<MVNLayer *>(layerPtr);
params["across_channels"] = std::to_string(lr->across_channels);
params["normalize_variance"] = std::to_string(lr->normalize);
} else if (CaselessEq<std::string>()(layer->type, "rnn") ||
CaselessEq<std::string>()(layer->type, "TensorIterator") ||
CaselessEq<std::string>()(layer->type, "LSTMCell")) {
THROW_IE_EXCEPTION << "Not covered layers for writing to IR";
}
if (layer->params.find("quantization_level") != layer->params.end()) {
params["quantization_level"] = layer->params["quantization_level"];
}
// update of weightable layers
WeightableLayer *pwlayer = dynamic_cast<WeightableLayer *>(layerPtr);
if (pwlayer) {
if (pwlayer->_weights) {
pwlayer->blobs["weights"] = pwlayer->_weights;
}
if (pwlayer->_biases) {
pwlayer->blobs["biases"] = pwlayer->_biases;
}
}
}

View File

@@ -0,0 +1,21 @@
// Copyright (C) 2018 Intel Corporation
//
// SPDX-License-Identifier: Apache-2.0
//
#pragma once
#include "inference_engine.hpp"
#include <pugixml/pugixml.hpp>
#include <string>
/** Class for serialization of a model presented as ICNNNetwork to disk
*/
class CNNNetworkSerializer {
public:
void Serialize(const std::string &xmlPath, const std::string &binPath,
InferenceEngine::ICNNNetwork& network);
protected:
void updateStdLayerParams(InferenceEngine::CNNLayer::Ptr layer);
};
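A minimal usage sketch of this class (not part of the commit; it assumes `network` is an `InferenceEngine::ICNNNetwork` reference obtained elsewhere, e.g. from `CNNNetReader`, and the output file names are illustrative):
```cpp
// Serialize a (possibly calibrated) network back to IR: an .xml topology file
// plus a .bin file with the weight blobs written by Serialize().
CNNNetworkSerializer serializer;
serializer.Serialize("calibrated_model.xml", "calibrated_model.bin", network);
```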

View File

@@ -36,7 +36,7 @@ add_executable(${TARGET_NAME} ${SRC})
set_target_properties(${TARGET_NAME} PROPERTIES "CMAKE_CXX_FLAGS" "${CMAKE_CXX_FLAGS} -fPIE"
COMPILE_PDB_NAME ${TARGET_NAME})
target_link_libraries(${TARGET_NAME} ${InferenceEngine_LIBRARIES} cpu_extension format_reader gflags)
target_link_libraries(${TARGET_NAME} ${InferenceEngine_LIBRARIES} IE::ie_cpu_extension format_reader gflags)
if(UNIX)
target_link_libraries(${TARGET_NAME} ${LIB_DL} pthread)

View File

@@ -1,4 +1,4 @@
# Image Classification Sample {#InferenceEngineClassificationSampleApplication}
# Image Classification Sample
This topic demonstrates how to build and run the Image Classification sample application, which does
inference using image classification networks like AlexNet and GoogLeNet.
@@ -37,6 +37,8 @@ Options:
Number of iterations (default 1)
-pc
Enables per-layer performance report
-p_msg
Enables messages from a plugin
```
@@ -63,4 +65,4 @@ Engine plugin. When inference is done, the application creates an
output image and outputs data to the standard output stream.
## See Also
* [Using Inference Engine Samples](@ref SamplesOverview)
* [Using Inference Engine Samples](./docs/Inference_Engine_Developer_Guide/Samples_Overview.md)

View File

@@ -1,18 +1,7 @@
/*
// Copyright (c) 2018 Intel Corporation
// Copyright (C) 2018 Intel Corporation
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
// SPDX-License-Identifier: Apache-2.0
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
*/
#pragma once
@@ -61,6 +50,8 @@ static const char custom_cldnn_message[] = "Required for clDNN (GPU)-targeted cu
static const char custom_cpu_library_message[] = "Required for MKLDNN (CPU)-targeted custom layers." \
"Absolute path to a shared library with the kernels impl.";
/// @brief message for plugin messages
static const char plugin_message[] = "Enables messages from a plugin";
/// @brief Define flag for showing help message <br>
DEFINE_bool(h, false, help_message);
@@ -96,6 +87,9 @@ DEFINE_string(l, "", custom_cpu_library_message);
/// @brief Iterations count (default 1)
DEFINE_int32(ni, 1, iterations_count_message);
/// @brief Enable plugin messages
DEFINE_bool(p_msg, false, plugin_message);
/**
* @brief This function shows a help message
*/
@@ -115,4 +109,5 @@ static void showUsage() {
std::cout << " -nt \"<integer>\" " << ntop_message << std::endl;
std::cout << " -ni \"<integer>\" " << iterations_count_message << std::endl;
std::cout << " -pc " << performance_counter_message << std::endl;
std::cout << " -p_msg " << plugin_message << std::endl;
}

View File

@@ -1,18 +1,7 @@
/*
// Copyright (c) 2018 Intel Corporation
// Copyright (C) 2018 Intel Corporation
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
// SPDX-License-Identifier: Apache-2.0
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
*/
#include <fstream>
#include <vector>
@@ -32,6 +21,8 @@
using namespace InferenceEngine;
ConsoleErrorListener error_listener;
bool ParseAndCheckCommandLine(int argc, char *argv[]) {
// ---------------------------Parsing and validation of input args--------------------------------------
gflags::ParseCommandLineNonHelpFlags(&argc, &argv, true);
@@ -72,13 +63,16 @@ int main(int argc, char *argv[]) {
/** This vector stores paths to the processed images **/
std::vector<std::string> imageNames;
parseImagesArguments(imageNames);
parseInputFilesArguments(imageNames);
if (imageNames.empty()) throw std::logic_error("No suitable images were found");
// -----------------------------------------------------------------------------------------------------
// --------------------------- 1. Load Plugin for inference engine -------------------------------------
slog::info << "Loading plugin" << slog::endl;
InferencePlugin plugin = PluginDispatcher({ FLAGS_pp, "../../../lib/intel64" , "" }).getPluginByDevice(FLAGS_d);
if (FLAGS_p_msg) {
static_cast<InferenceEngine::InferenceEnginePluginPtr>(plugin)->SetLogCallback(error_listener);
}
/** Loading default extensions **/
if (FLAGS_d.find("CPU") != std::string::npos) {

View File

@@ -37,7 +37,7 @@ set_target_properties(${TARGET_NAME} PROPERTIES "CMAKE_CXX_FLAGS" "${CMAKE_CXX_F
COMPILE_PDB_NAME ${TARGET_NAME})
target_link_libraries(${TARGET_NAME} ${InferenceEngine_LIBRARIES} cpu_extension format_reader gflags)
target_link_libraries(${TARGET_NAME} ${InferenceEngine_LIBRARIES} IE::ie_cpu_extension format_reader gflags)
if(UNIX)
target_link_libraries(${TARGET_NAME} ${LIB_DL} pthread)

View File

@@ -1,4 +1,4 @@
# Image Classification Sample Async {#InferenceEngineClassificationPipelinedSampleApplication}
# Image Classification Sample Async
This sample demonstrates how to build and execute inference in pipelined mode using classification networks as an example.
@@ -52,6 +52,8 @@ Options:
Enables per-layer performance report
-nireq "<integer>"
Number of infer requests for pipelined mode (default 1)
-p_msg
Enables messages from a plugin
```
@@ -78,4 +80,4 @@ Then in the loop it starts inference for the current infer request and switch fo
When inference is done, the application outputs data to the standard output stream.
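The pipelined pattern described above boils down to rotating over several infer requests: start the current one asynchronously, then wait for the previous one and read its results. A rough sketch (simplified and not taken from the sample source; `executable_network`, `FLAGS_nireq`, and `FLAGS_ni` follow the names used in the sample):
```cpp
std::vector<InferRequest> requests;
for (int i = 0; i < FLAGS_nireq; i++) {
    requests.push_back(executable_network.CreateInferRequest());
}
int current = 0, previous = -1;
for (int iter = 0; iter < FLAGS_ni; iter++) {
    // (setting the input blob for requests[current] is omitted for brevity)
    requests[current].StartAsync();                     // kick off the current request
    if (previous >= 0) {
        requests[previous].Wait(IInferRequest::WaitMode::RESULT_READY);
        // ... read and report the output blob of requests[previous] ...
    }
    previous = current;
    current = (current + 1) % FLAGS_nireq;
}
if (previous >= 0) {
    requests[previous].Wait(IInferRequest::WaitMode::RESULT_READY);  // drain the last request
}
```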
## See Also
* [Using Inference Engine Samples](@ref SamplesOverview)
* [Using Inference Engine Samples](./docs/Inference_Engine_Developer_Guide/Samples_Overview.md)

View File

@@ -1,18 +1,7 @@
/*
// Copyright (c) 2018 Intel Corporation
// Copyright (C) 2018 Intel Corporation
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
// SPDX-License-Identifier: Apache-2.0
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
*/
#pragma once
@@ -65,6 +54,9 @@ static const char custom_cldnn_message[] = "Required for clDNN (GPU)-targeted cu
static const char custom_cpu_library_message[] = "Required for MKLDNN (CPU)-targeted custom layers." \
"Absolute path to a shared library with the kernels impl.";
/// @brief message for plugin messages
static const char plugin_message[] = "Enables messages from a plugin";
/// @brief Define flag for showing help message <br>
DEFINE_bool(h, false, help_message);
@@ -103,6 +95,9 @@ DEFINE_int32(ni, 1, iterations_count_message);
/// @brief Number of infer requests
DEFINE_int32(nireq, 1, ninfer_request_message);
/// @brief Enable plugin messages
DEFINE_bool(p_msg, false, plugin_message);
/**
* @brief This function shows a help message
*/
@@ -123,4 +118,5 @@ static void showUsage() {
std::cout << " -ni \"<integer>\" " << iterations_count_message << std::endl;
std::cout << " -pc " << performance_counter_message << std::endl;
std::cout << " -nireq \"<integer>\" " << ninfer_request_message << std::endl;
std::cout << " -p_msg " << plugin_message << std::endl;
}

View File

@@ -1,18 +1,7 @@
/*
// Copyright (c) 2018 Intel Corporation
// Copyright (C) 2018 Intel Corporation
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
// SPDX-License-Identifier: Apache-2.0
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
*/
/**
* @brief The entry point of the Inference Engine sample application
@@ -30,7 +19,7 @@
#include <inference_engine.hpp>
#include <format_reader/format_reader_ptr.h>
#include <format_reader_ptr.h>
#include <samples/common.hpp>
#include <samples/slog.hpp>
@@ -43,6 +32,8 @@
using namespace InferenceEngine;
ConsoleErrorListener error_listener;
bool ParseAndCheckCommandLine(int argc, char *argv[]) {
// ---------------------------Parsing and validation of input args--------------------------------------
slog::info << "Parsing input parameters" << slog::endl;
@@ -88,13 +79,16 @@ int main(int argc, char *argv[]) {
/** This vector stores paths to the processed images **/
std::vector<std::string> imageNames;
parseImagesArguments(imageNames);
parseInputFilesArguments(imageNames);
if (imageNames.empty()) throw std::logic_error("No suitable images were found");
// -----------------------------------------------------------------------------------------------------
// --------------------------- 1. Load Plugin for inference engine -------------------------------------
slog::info << "Loading plugin" << slog::endl;
InferencePlugin plugin = PluginDispatcher({ FLAGS_pp, "../../../lib/intel64" , "" }).getPluginByDevice(FLAGS_d);
if (FLAGS_p_msg) {
static_cast<InferenceEngine::InferenceEnginePluginPtr>(plugin)->SetLogCallback(error_listener);
}
/** Loading default extensions **/
if (FLAGS_d.find("CPU") != std::string::npos) {
@@ -194,7 +188,6 @@ int main(int argc, char *argv[]) {
if (FLAGS_pc) {
config[PluginConfigParams::KEY_PERF_COUNT] = PluginConfigParams::YES;
}
ExecutableNetwork executable_network = plugin.LoadNetwork(network, {});
// -----------------------------------------------------------------------------------------------------

View File

@@ -1,57 +0,0 @@
# Copyright (c) 2018 Intel Corporation
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
cmake_minimum_required (VERSION 2.8)
include(CPUID)
include(OptimizationFlags)
set(OpenCV_STATIC OFF)
set (BUILD_VALIDATION_APP OFF)
find_package(OpenCV 3.3 QUIET COMPONENTS core imgproc highgui imgcodecs)
if(NOT(OpenCV_FOUND))
find_package(OpenCV 3.3 QUIET COMPONENTS world)
endif()
if (OpenCV_FOUND)
set (BUILD_VALIDATION_APP ON)
else()
message(WARNING "No suitable OpenCV version detected, BUILD_VALIDATION_APP is set to OFF")
endif()
macro(enable_omp)
if(UNIX) # Linux
add_definitions(-fopenmp)
find_library(intel_omp_lib iomp5
PATHS ${InferenceEngine_INCLUDE_DIRS}/../external/mkltiny_lnx/lib
)
elseif(WIN32) # Windows
if(${CMAKE_CXX_COMPILER_ID} STREQUAL MSVC)
set(OPENMP_FLAGS "/Qopenmp /openmp")
set(CMAKE_SHARED_LINKER_FLAGS " ${CMAKE_SHARED_LINKER_FLAGS} /nodefaultlib:vcomp")
elseif(${CMAKE_CXX_COMPILER_ID} STREQUAL Intel)
set(OPENMP_FLAGS "/Qopenmp /openmp")
else()
message("Unknown compiler ID. OpenMP support is disabled.")
endif()
set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} ${OPENMP_FLAGS}")
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} ${OPENMP_FLAGS}")
find_library(intel_omp_lib
libiomp5md
PATHS "${InferenceEngine_INCLUDE_DIRS}/../lib/intel64/${CMAKE_BUILD_TYPE}"
)
endif()
endmacro(enable_omp)

View File

@@ -1,6 +1,17 @@
# Copyright (C) 2018 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
# Copyright (c) 2018 Intel Corporation
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
cmake_minimum_required(VERSION 2.8)
set (TARGET_NAME "format_reader")
@@ -15,8 +26,12 @@ file (GLOB LIBRARY_HEADERS
# Find OpenCV library if it exists
find_package(OpenCV)
include_directories(${OpenCV_INCLUDE_DIRS})
if(OpenCV_FOUND)
include_directories(${OpenCV_INCLUDE_DIRS})
else()
message(STATUS "OPENCV is disabled or not found, " ${TARGET_NAME} " is built without OPENCV support")
endif()
if(UNIX)
list(REMOVE_ITEM MAIN_SRC ${CMAKE_CURRENT_SOURCE_DIR}/dllmain.cpp)
else()
@@ -37,4 +52,4 @@ add_library(${TARGET_NAME} SHARED ${MAIN_SRC} ${LIBRARY_HEADERS})
target_link_libraries(${TARGET_NAME} ${OpenCV_LIBRARIES})
set_target_properties(${TARGET_NAME} PROPERTIES "CMAKE_CXX_FLAGS" "${CMAKE_CXX_FLAGS} -fPIE"
COMPILE_PDB_NAME ${TARGET_NAME})
COMPILE_PDB_NAME ${TARGET_NAME})

View File

@@ -25,22 +25,22 @@ private:
static Register<BitMap> reg;
typedef struct {
unsigned short type; /* Magic identifier */
unsigned int size; /* File size in bytes */
unsigned int reserved;
unsigned int offset; /* Offset to image data, bytes */
unsigned short type = 0u; /* Magic identifier */
unsigned int size = 0u; /* File size in bytes */
unsigned int reserved = 0u;
unsigned int offset = 0u; /* Offset to image data, bytes */
} BmpHeader;
typedef struct {
unsigned int size; /* Header size in bytes */
int width, height; /* Width and height of image */
unsigned short planes; /* Number of colour planes */
unsigned short bits; /* Bits per pixel */
unsigned int compression; /* Compression type */
unsigned int imagesize; /* Image size in bytes */
int xresolution, yresolution; /* Pixels per meter */
unsigned int ncolours; /* Number of colours */
unsigned int importantcolours; /* Important colours */
unsigned int size = 0u; /* Header size in bytes */
int width = 0, height = 0; /* Width and height of image */
unsigned short planes = 0u; /* Number of colour planes */
unsigned short bits = 0u; /* Bits per pixel */
unsigned int compression = 0u; /* Compression type */
unsigned int imagesize = 0u; /* Image size in bytes */
int xresolution = 0, yresolution = 0; /* Pixels per meter */
unsigned int ncolours = 0u; /* Number of colours */
unsigned int importantcolours = 0u; /* Important colours */
} BmpInfoHeader;
public:

View File

@@ -23,19 +23,21 @@
#endif
/**
* @brief This function check input args and find images in given folder
* @brief This function checks input args and existence of specified files in a given folder
* @param arg path to a file to be checked for existence
* @return files updated vector of verified input files
*/
void readImagesArguments(std::vector<std::string> &images, const std::string& arg) {
void readInputFilesArguments(std::vector<std::string> &files, const std::string& arg) {
struct stat sb;
if (stat(arg.c_str(), &sb) != 0) {
std::cout << "[ WARNING ] File " << arg << " cannot be opened!" << std::endl;
slog::warn << "File " << arg << " cannot be opened!" << slog::endl;
return;
}
if (S_ISDIR(sb.st_mode)) {
DIR *dp;
dp = opendir(arg.c_str());
if (dp == nullptr) {
std::cout << "[ WARNING ] Directory " << arg << " cannot be opened!" << std::endl;
slog::warn << "Directory " << arg << " cannot be opened!" << slog::endl;
return;
}
@@ -43,19 +45,29 @@ void readImagesArguments(std::vector<std::string> &images, const std::string& ar
while (nullptr != (ep = readdir(dp))) {
std::string fileName = ep->d_name;
if (fileName == "." || fileName == "..") continue;
std::cout << "[ INFO ] Add file " << ep->d_name << " from directory " << arg << "." << std::endl;
images.push_back(arg + "/" + ep->d_name);
files.push_back(arg + "/" + ep->d_name);
}
closedir(dp);
} else {
files.push_back(arg);
}
if (files.size() < 20) {
slog::info << "Files were added: " << files.size() << slog::endl;
for (std::string filePath : files) {
slog::info << " " << filePath << slog::endl;
}
} else {
images.push_back(arg);
slog::info << "Files were added: " << files.size() << ". Too many to display each of them." << slog::endl;
}
}
/**
* @brief This function find -i/--images key in input args
* It's necessary to process multiple values for single key
* @return files updated vector of verified input files
*/
void parseImagesArguments(std::vector<std::string> &images) {
void parseInputFilesArguments(std::vector<std::string> &files) {
std::vector<std::string> args = gflags::GetArgvs();
bool readArguments = false;
for (size_t i = 0; i < args.size(); i++) {
@@ -69,6 +81,6 @@ void parseImagesArguments(std::vector<std::string> &images) {
if (args.at(i).c_str()[0] == '-') {
break;
}
readImagesArguments(images, args.at(i));
readInputFilesArguments(files, args.at(i));
}
}

View File

@@ -46,6 +46,20 @@
#endif
#endif
/**
* @brief This class represents a console error listener.
*
*/
class ConsoleErrorListener : public InferenceEngine::IErrorListener {
/**
* @brief The plugin calls this method with a null terminated error message (in case of error)
* @param msg Error message
*/
void onError(const char *msg) noexcept override {
std::clog << "Plugin message: " << msg << std::endl;
}
};
/**
* @brief Trims from both ends (in place)
* @param s - string to trim
@@ -183,7 +197,7 @@ static UNUSED std::ostream &operator<<(std::ostream &os, const PluginVersion &ve
}
inline void printPluginVersion(InferenceEngine::InferenceEnginePluginPtr ptr, std::ostream& stream) {
const PluginVersion *pluginVersion;
const PluginVersion *pluginVersion = nullptr;
ptr->GetVersion((const InferenceEngine::Version*&)pluginVersion);
stream << pluginVersion << std::endl;
}
@@ -462,9 +476,10 @@ static UNUSED bool writeOutputBmp(std::string name, unsigned char *data, size_t
* @param width - width of the rectangle
* @param rectangles - vector of rectangle coordinates, 4 values per entry in classes
* @param classes - vector of classes
* @param thickness - thickness of a line (in pixels) to be used for bounding boxes
*/
static UNUSED void addRectangles(unsigned char *data, size_t height, size_t width, std::vector<int> rectangles, std::vector<int> classes) {
std::vector<Color> colors = {
static UNUSED void addRectangles(unsigned char *data, size_t height, size_t width, std::vector<int> rectangles, std::vector<int> classes, int thickness = 1) {
std::vector<Color> colors = { // colors to be used for bounding boxes
{ 128, 64, 128 },
{ 232, 35, 244 },
{ 70, 70, 70 },
@@ -497,38 +512,47 @@ static UNUSED void addRectangles(unsigned char *data, size_t height, size_t widt
int w = rectangles.at(i * 4 + 2);
int h = rectangles.at(i * 4 + 3);
int cls = classes.at(i) % colors.size(); // color of a bounding box line
if (x < 0) x = 0;
if (y < 0) y = 0;
if (w < 0) w = 0;
if (h < 0) h = 0;
if (x >= width) { x = width - 1; w = 0; }
if (y >= height) { y = height - 1; h = 0; }
if (x >= width) { x = width - 1; w = 0; thickness = 1; }
if (y >= height) { y = height - 1; h = 0; thickness = 1; }
if (x + w >= width) { w = width - x - 1; }
if (y + h >= height) { h = height - y - 1; }
size_t shift_first = y*width * 3;
size_t shift_second = (y + h)*width * 3;
int cls = classes.at(i) % colors.size();
for (int i = x; i < x + w; i++) {
data[shift_first + i * 3] = colors.at(cls).red();
data[shift_first + i * 3 + 1] = colors.at(cls).green();
data[shift_first + i * 3 + 2] = colors.at(cls).blue();
data[shift_second + i * 3] = colors.at(cls).red();
data[shift_second + i * 3 + 1] = colors.at(cls).green();
data[shift_second + i * 3 + 2] = colors.at(cls).blue();
thickness = std::min(std::min(thickness, w / 2 + 1), h / 2 + 1);
size_t shift_first;
size_t shift_second;
for (int t = 0; t < thickness; t++) {
shift_first = (y + t) * width * 3;
shift_second = (y + h - t) * width * 3;
for (int i = x; i < x + w + 1; i++) {
data[shift_first + i * 3] = colors.at(cls).red();
data[shift_first + i * 3 + 1] = colors.at(cls).green();
data[shift_first + i * 3 + 2] = colors.at(cls).blue();
data[shift_second + i * 3] = colors.at(cls).red();
data[shift_second + i * 3 + 1] = colors.at(cls).green();
data[shift_second + i * 3 + 2] = colors.at(cls).blue();
}
}
shift_first = x * 3;
shift_second = (x + w) * 3;
for (int i = y; i < y + h; i++) {
data[shift_first + i*width * 3] = colors.at(cls).red();
data[shift_first + i*width * 3 + 1] = colors.at(cls).green();
data[shift_first + i*width * 3 + 2] = colors.at(cls).blue();
data[shift_second + i*width * 3] = colors.at(cls).red();
data[shift_second + i*width * 3 + 1] = colors.at(cls).green();
data[shift_second + i*width * 3 + 2] = colors.at(cls).blue();
for (int t = 0; t < thickness; t++) {
shift_first = (x + t) * 3;
shift_second = (x + w - t) * 3;
for (int i = y; i < y + h + 1; i++) {
data[shift_first + i * width * 3] = colors.at(cls).red();
data[shift_first + i * width * 3 + 1] = colors.at(cls).green();
data[shift_first + i * width * 3 + 2] = colors.at(cls).blue();
data[shift_second + i * width * 3] = colors.at(cls).red();
data[shift_second + i * width * 3 + 1] = colors.at(cls).green();
data[shift_second + i * width * 3 + 2] = colors.at(cls).blue();
}
}
}
}
@@ -1091,4 +1115,4 @@ static InferenceEngine::Blob::Ptr wrapMat2Blob(const cv::Mat &mat) {
return InferenceEngine::make_shared_blob<uint8_t>(tDesc, mat.data);
}
#endif
#endif

View File

@@ -0,0 +1,31 @@
@echo off
:: Copyright (c) 2018 Intel Corporation
::
:: Licensed under the Apache License, Version 2.0 (the "License");
:: you may not use this file except in compliance with the License.
:: You may obtain a copy of the License at
::
:: http://www.apache.org/licenses/LICENSE-2.0
::
:: Unless required by applicable law or agreed to in writing, software
:: distributed under the License is distributed on an "AS IS" BASIS,
:: WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
:: See the License for the specific language governing permissions and
:: limitations under the License.
@setlocal
set "ROOT_DIR=%~dp0"
set "SOLUTION_DIR64=%USERPROFILE%\Documents\Intel\OpenVINO\inference_engine_samples_2015"
if exist "%SOLUTION_DIR64%" rd /s /q "%SOLUTION_DIR64%"
if "%InferenceEngine_DIR%"=="" set "InferenceEngine_DIR=%ROOT_DIR%\..\share"
if exist "%ROOT_DIR%\..\..\bin\setupvars.bat" call "%ROOT_DIR%\..\..\bin\setupvars.bat"
if exist "%ROOT_DIR%\..\..\..\bin\setupvars.bat" call "%ROOT_DIR%\..\..\..\bin\setupvars.bat"
echo Creating Visual Studio 2015 (x64) files in %SOLUTION_DIR64%... && ^
cd "%ROOT_DIR%" && cmake -E make_directory "%SOLUTION_DIR64%" && cd "%SOLUTION_DIR64%" && cmake -G "Visual Studio 14 2015 Win64" "%ROOT_DIR%"
echo Done.
pause

View File

@@ -0,0 +1,31 @@
@echo off
:: Copyright (c) 2018 Intel Corporation
::
:: Licensed under the Apache License, Version 2.0 (the "License");
:: you may not use this file except in compliance with the License.
:: You may obtain a copy of the License at
::
:: http://www.apache.org/licenses/LICENSE-2.0
::
:: Unless required by applicable law or agreed to in writing, software
:: distributed under the License is distributed on an "AS IS" BASIS,
:: WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
:: See the License for the specific language governing permissions and
:: limitations under the License.
@setlocal
set "ROOT_DIR=%~dp0"
set "SOLUTION_DIR64=%USERPROFILE%\Documents\Intel\OpenVINO\inference_engine_samples_2017"
if exist "%SOLUTION_DIR64%" rd /s /q "%SOLUTION_DIR64%"
if "%InferenceEngine_DIR%"=="" set "InferenceEngine_DIR=%ROOT_DIR%\..\share"
if exist "%ROOT_DIR%\..\..\bin\setupvars.bat" call "%ROOT_DIR%\..\..\bin\setupvars.bat"
if exist "%ROOT_DIR%\..\..\..\bin\setupvars.bat" call "%ROOT_DIR%\..\..\..\bin\setupvars.bat"
echo Creating Visual Studio 2017 (x64) files in %SOLUTION_DIR64%... && ^
cd "%ROOT_DIR%" && cmake -E make_directory "%SOLUTION_DIR64%" && cd "%SOLUTION_DIR64%" && cmake -G "Visual Studio 15 2017 Win64" "%ROOT_DIR%"
echo Done.
pause

View File

@@ -26,15 +26,15 @@ file (GLOB SRC
# Find OpenCV library if it exists
find_package(OpenCV)
if(OpenCV_FOUND)
include_directories(${OpenCV_INCLUDE_DIRS})
else()
if(NOT(OpenCV_FOUND))
message(STATUS "OPENCV is disabled or not found, " ${TARGET_NAME} " skiped")
return()
endif()
source_group("src" FILES ${SRC})
include_directories(${OpenCV_INCLUDE_DIRS})
link_directories(${LIB_FOLDER})
# Create library file from sources.
@@ -46,7 +46,6 @@ set_target_properties(${TARGET_NAME} PROPERTIES "CMAKE_CXX_FLAGS" "${CMAKE_CXX_F
target_link_libraries(${TARGET_NAME} ${InferenceEngine_LIBRARIES} ${OpenCV_LIBRARIES})
if(UNIX)
target_link_libraries(${TARGET_NAME} ${LIB_DL})
endif()

View File

@@ -1,15 +1,15 @@
# Hello Autoresize Classification Sample {#InferenceEngineHelloAutoresizeClassificationSample}
# Hello Autoresize Classification Sample
This topic describes how to run the Hello Autoresize Classification sample application.
The sample is a simplified version of [Image Classification Sample](@ref InferenceEngineClassificationSampleApplication).
The sample is a simplified version of [Image Classification Sample](./samples/classification_sample/README.md).
It is intended to demonstrate how to use the new input autoresize API of the Inference Engine in applications. Refer to
[Integrate with customer application New Request API](@ref IntegrateIEInAppNewAPI) for details.
[Integrate with customer application New Request API](./docs/Inference_Engine_Developer_Guide/Integrate_with_customer_application_new_API.md) for details.
There is also a new API for cropping a ROI object and setting it as input without additional memory re-allocation.
Properly demonstrating this new API requires running several networks in a pipeline, which is out of scope of this sample.
Please refer to [Object Detection for SSD Demo app](@ref InferenceEngineObjectDetectionSSDDemoAsyncApplication) or
[Security Barrier Camera Demo](@ref InferenceEngineSecurityBarrierCameraDemoApplication) or
[Crossroad Camera Demo](@ref InferenceEngineCrossroadCameraDemoApplication) for an example of using the new crop ROI API.
Please refer to [Object Detection for SSD Demo app](./samples/object_detection_demo_ssd_async/README.md) or
[Security Barrier Camera Demo](./samples/security_barrier_camera_demo/README.md) or
[Crossroad Camera Demo](./samples/crossroad_camera_demo/README.md) for an example of using the new crop ROI API.
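At its core, the autoresize flow configures the input for resizing and then feeds the original-sized image blob to the request. A short sketch of the intended usage (not the sample source; `image_data`, `channels`, `image_height`, and `image_width` are placeholders for the loaded image):
```cpp
// Ask the plugin to resize the input to the network's input size automatically.
InputInfo::Ptr input_info = network.getInputsInfo().begin()->second;
input_info->getPreProcess().setResizeAlgorithm(ResizeAlgorithm::RESIZE_BILINEAR);
input_info->setLayout(Layout::NHWC);
input_info->setPrecision(Precision::U8);

// Wrap the original (unresized) image into a blob; the resize happens inside the plugin.
Blob::Ptr image_blob = make_shared_blob<uint8_t>(
    TensorDesc(Precision::U8, {1, channels, image_height, image_width}, Layout::NHWC), image_data);
infer_request.SetBlob(network.getInputsInfo().begin()->first, image_blob);
infer_request.Infer();
```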
## Running
@@ -23,4 +23,4 @@ You can do inference on an image using a trained AlexNet network on Intel&reg; P
The application outputs top-10 inference results.
## See Also
* [Using Inference Engine Samples](@ref SamplesOverview)
* [Using Inference Engine Samples](./docs/Inference_Engine_Developer_Guide/Samples_Overview.md)

View File

@@ -1,18 +1,7 @@
/*
// Copyright (c) 2018 Intel Corporation
// Copyright (C) 2018 Intel Corporation
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
// SPDX-License-Identifier: Apache-2.0
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
*/
#include <iomanip>
#include <vector>

View File

@@ -26,9 +26,7 @@ file (GLOB SRC
# Find OpenCV library if it exists
find_package(OpenCV)
if(OpenCV_FOUND)
include_directories(${OpenCV_INCLUDE_DIRS})
else()
if(NOT(OpenCV_FOUND))
message(STATUS "OPENCV is disabled or not found, " ${TARGET_NAME} " skiped")
return()
endif()
@@ -37,6 +35,8 @@ endif()
# Empty name lists them directly under the .vcproj
source_group("src" FILES ${SRC})
include_directories(${OpenCV_INCLUDE_DIRS})
link_directories(${LIB_FOLDER})
# Create library file from sources.

View File

@@ -1,18 +1,7 @@
/*
// Copyright (c) 2018 Intel Corporation
// Copyright (C) 2018 Intel Corporation
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
// SPDX-License-Identifier: Apache-2.0
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
*/
#include <iomanip>
#include <vector>

View File

@@ -26,9 +26,7 @@ file (GLOB SRC
# Find OpenCV library if it exists
find_package(OpenCV)
if(OpenCV_FOUND)
include_directories(${OpenCV_INCLUDE_DIRS})
else()
if(NOT(OpenCV_FOUND))
message(STATUS "OPENCV is disabled or not found, " ${TARGET_NAME} " skiped")
return()
endif()
@@ -37,6 +35,8 @@ endif()
# Empty name lists them directly under the .vcproj
source_group("src" FILES ${SRC})
include_directories(${OpenCV_INCLUDE_DIRS})
link_directories(${LIB_FOLDER})
# Create library file from sources.

View File

@@ -1,9 +1,9 @@
# Hello Infer Request Classification Sample {#InferenceEngineHelloRequestClassificationSample}
# Hello Infer Request Classification Sample
This topic describes how to run the Hello Infer Classification sample application.
The sample is a simplified version of [Image Classification Sample](@ref InferenceEngineClassificationSampleApplication).
The sample is a simplified version of [Image Classification Sample](./samples/classification_sample/README.md).
It is intended to demonstrate how to use the new Infer Request API of the Inference Engine in applications. Refer to
[Integrate with customer application New Request API](@ref IntegrateIEInAppNewAPI) for details.
[Integrate with customer application New Request API](./docs/Inference_Engine_Developer_Guide/Integrate_with_customer_application_new_API.md) for details.
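In essence, the Infer Request API usage shown by the sample reduces to creating a request from an `ExecutableNetwork`, filling its input blob, running inference, and reading the output blob. A condensed sketch (not the sample source; `input_name` and `output_name` are placeholders):
```cpp
ExecutableNetwork executable_network = plugin.LoadNetwork(network, {});
InferRequest infer_request = executable_network.CreateInferRequest();

Blob::Ptr input = infer_request.GetBlob(input_name);
// ... copy the preprocessed image data into `input` ...
infer_request.Infer();

Blob::Ptr output = infer_request.GetBlob(output_name);
const float* scores = output->buffer().as<float*>();   // top-N results are read from here
```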
## Running
@@ -18,4 +18,4 @@ The application outputs top-10 inference results.
## See Also
* [Using Inference Engine Samples](@ref SamplesOverview)
* [Using Inference Engine Samples](./docs/Inference_Engine_Developer_Guide/Samples_Overview.md)

View File

@@ -1,18 +1,7 @@
/*
// Copyright (c) 2018 Intel Corporation
// Copyright (C) 2018 Intel Corporation
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
// SPDX-License-Identifier: Apache-2.0
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
*/
#include <iomanip>
#include <vector>

View File

@@ -0,0 +1,56 @@
# Copyright (c) 2018 Intel Corporation
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
# http://www.apache.org/licenses/LICENSE-2.0
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
cmake_minimum_required(VERSION 2.8)
set(TARGET_NAME "hello_shape_infer_ssd")
if (BUILD_SAMPLE_NAME AND NOT ${BUILD_SAMPLE_NAME} STREQUAL ${TARGET_NAME})
message(STATUS "SAMPLE ${TARGET_NAME} SKIPPED")
return()
endif ()
file(GLOB SRC
${CMAKE_CURRENT_SOURCE_DIR}/*.cpp
)
file(GLOB HEADERS
${CMAKE_CURRENT_SOURCE_DIR}/*.hpp
)
# Find OpenCV library if it exists
find_package(OpenCV)
if(NOT(OpenCV_FOUND))
message(STATUS "OPENCV is disabled or not found, " ${TARGET_NAME} " skiped")
return()
endif()
# Create named folders for the sources within the .vcproj
# Empty name lists them directly under the .vcproj
source_group("src" FILES ${SRC})
source_group("headers" FILES ${HEADERS})
include_directories(${OpenCV_INCLUDE_DIRS})
link_directories(${LIB_FOLDER})
# Create library file from sources.
add_executable(${TARGET_NAME} ${SRC} ${HEADERS})
set_target_properties(${TARGET_NAME} PROPERTIES COMPILE_PDB_NAME ${TARGET_NAME})
target_link_libraries(${TARGET_NAME} ${InferenceEngine_LIBRARIES} IE::ie_cpu_extension ${OpenCV_LIBRARIES})
if (UNIX)
target_link_libraries(${TARGET_NAME} ${LIB_DL})
endif ()

View File

@@ -0,0 +1,20 @@
# Hello Shape Infer Sample
This topic demonstrates how to run the Hello Shape Infer SSD application, which does inference using object detection
networks like SSD-VGG. The sample shows how to use the [Shape Inference feature](./docs/Inference_Engine_Developer_Guide/ShapeInference.md).
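The shape inference step itself amounts to querying the current input shapes, editing them, and calling `reshape()`. A condensed sketch of what the sample does (see the full source for details):
```cpp
auto input_shapes = network.getInputShapes();
std::string input_name;
SizeVector input_shape;
std::tie(input_name, input_shape) = *input_shapes.begin();
input_shape[0] = batch_size;   // N
input_shape[2] = image.rows;   // H
input_shape[3] = image.cols;   // W
input_shapes[input_name] = input_shape;
network.reshape(input_shapes);
```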
## Running
You can use the following command to do inference on Intel&reg; Processors on an image using a trained SSD network:
```sh
./hello_shape_infer_ssd <path_to_model>/ssd_300.xml <path_to_image>/500x500.bmp CPU 3
```
### Outputs
The application renders an image with detected objects enclosed in rectangles. It outputs the list of classes
of the detected objects along with the respective confidence values and the coordinates of the
rectangles to the standard output stream.
## See Also
* [Using Inference Engine Samples](./docs/Inference_Engine_Developer_Guide/Samples_Overview.md)

View File

@@ -0,0 +1,173 @@
// Copyright (C) 2018 Intel Corporation
//
// SPDX-License-Identifier: Apache-2.0
//
#include <vector>
#include <memory>
#include <string>
#include <opencv2/opencv.hpp>
#include <inference_engine.hpp>
#include <samples/common.hpp>
#include <ext_list.hpp>
#include "shape_infer_extension.hpp"
using namespace InferenceEngine;
int main(int argc, char* argv[]) {
try {
// ------------------------------ Parsing and validation of input args ---------------------------------
if (argc != 5) {
std::cout << "Usage : ./hello_shape_infer_ssd <path_to_model> <path_to_image> <device> <batch>"
<< std::endl;
return EXIT_FAILURE;
}
const std::string input_model{argv[1]};
const std::string input_image_path{argv[2]};
const std::string device_name{argv[3]};
const size_t batch_size{std::stoul(argv[4])};
// -----------------------------------------------------------------------------------------------------
// --------------------------- 1. Load Plugin for inference engine -------------------------------------
InferencePlugin plugin = PluginDispatcher({"../../../lib/intel64", ""}).getPluginByDevice(device_name);
IExtensionPtr cpuExtension, inPlaceExtension;
if (device_name == "CPU") {
cpuExtension = std::make_shared<Extensions::Cpu::CpuExtensions>();
inPlaceExtension = std::make_shared<InPlaceExtension>();
plugin.AddExtension(cpuExtension);
// register sample's custom kernel (CustomReLU)
plugin.AddExtension(inPlaceExtension);
}
// -----------------------------------------------------------------------------------------------------
// --------------------------- 2. Read IR Generated by ModelOptimizer (.xml and .bin files) ------------
CNNNetReader network_reader;
network_reader.ReadNetwork(input_model);
network_reader.ReadWeights(input_model.substr(0, input_model.size() - 4) + ".bin");
CNNNetwork network = network_reader.getNetwork();
OutputsDataMap outputs_info(network.getOutputsInfo());
InputsDataMap inputs_info(network.getInputsInfo());
if (inputs_info.size() != 1 || outputs_info.size() != 1)
throw std::logic_error("Sample supports SSD-like networks with exactly one input and one output");
// --------------------------- Resize network to match image sizes and given batch----------------------
if (device_name == "CPU") {
// register shape inference functions (SpatialTransformer) from CPU Extension
network.AddExtension(cpuExtension);
// register sample's custom shape inference (CustomReLU)
network.AddExtension(inPlaceExtension);
}
auto input_shapes = network.getInputShapes();
std::string input_name;
SizeVector input_shape;
std::tie(input_name, input_shape) = *input_shapes.begin();
cv::Mat image = cv::imread(input_image_path);
input_shape[0] = batch_size;
input_shape[2] = image.rows;
input_shape[3] = image.cols;
input_shapes[input_name] = input_shape;
std::cout << "Resizing network to the image size = [" << image.rows << "x" << image.cols << "] "
<< "with batch = " << batch_size << std::endl;
network.reshape(input_shapes);
// -----------------------------------------------------------------------------------------------------
// --------------------------- 3. Configure input & output ---------------------------------------------
// --------------------------- Prepare input blobs -----------------------------------------------------
InputInfo::Ptr input_info;
std::tie(input_name, input_info) = *inputs_info.begin();
input_info->setLayout(Layout::NCHW);
input_info->setPrecision(Precision::U8);
// --------------------------- Prepare output blobs ----------------------------------------------------
DataPtr output_info;
std::string output_name;
std::tie(output_name, output_info) = *outputs_info.begin();
if (output_info->creatorLayer.lock()->type != "DetectionOutput")
throw std::logic_error("Can't find a DetectionOutput layer in the topology");
const SizeVector output_shape = output_info->getTensorDesc().getDims();
const int max_proposal_count = output_shape[2];
const int object_size = output_shape[3];
if (object_size != 7) {
throw std::logic_error("Output item should have 7 as a last dimension");
}
if (output_shape.size() != 4) {
throw std::logic_error("Incorrect output dimensions for SSD model");
}
if (output_info == nullptr) {
THROW_IE_EXCEPTION << "[SAMPLES] shared_ptr ouput_info == nullptr";
}
output_info->setPrecision(Precision::FP32);
auto dumpVec = [](const SizeVector& vec) -> std::string {
if (vec.empty()) return "[]";
std::stringstream oss;
oss << "[" << vec[0];
for (size_t i = 1; i < vec.size(); i++) oss << "," << vec[i];
oss << "]";
return oss.str();
};
std::cout << "Resulting input shape = " << dumpVec(input_shape) << std::endl;
std::cout << "Resulting output shape = " << dumpVec(output_shape) << std::endl;
// -----------------------------------------------------------------------------------------------------
// --------------------------- 4. Loading model to the plugin ------------------------------------------
ExecutableNetwork executable_network = plugin.LoadNetwork(network, {});
// -----------------------------------------------------------------------------------------------------
// --------------------------- 5. Create infer request -------------------------------------------------
InferRequest infer_request = executable_network.CreateInferRequest();
// -----------------------------------------------------------------------------------------------------
// --------------------------- 6. Prepare input --------------------------------------------------------
Blob::Ptr input = infer_request.GetBlob(input_name);
for (int b = 0; b < batch_size; b++) {
matU8ToBlob<uint8_t>(image, input, b);
}
// -----------------------------------------------------------------------------------------------------
// --------------------------- 7. Do inference --------------------------------------------------------
infer_request.Infer();
// -----------------------------------------------------------------------------------------------------
// --------------------------- 8. Process output ------------------------------------------------------
Blob::Ptr output = infer_request.GetBlob(output_name);
const float* detection = output->buffer().as<PrecisionTrait<Precision::FP32>::value_type*>();
/* Each detection has image_id that denotes processed image */
for (int cur_proposal = 0; cur_proposal < max_proposal_count; cur_proposal++) {
float image_id = detection[cur_proposal * object_size + 0];
float label = detection[cur_proposal * object_size + 1];
float confidence = detection[cur_proposal * object_size + 2];
/* CPU and GPU plugins have differences in the DetectionOutput layer, so we need both checks */
if (image_id < 0 || confidence == 0) {
continue;
}
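/* Scale the normalized box coordinates to pixel positions in the source image */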
float xmin = detection[cur_proposal * object_size + 3] * image.cols;
float ymin = detection[cur_proposal * object_size + 4] * image.rows;
float xmax = detection[cur_proposal * object_size + 5] * image.cols;
float ymax = detection[cur_proposal * object_size + 6] * image.rows;
if (confidence > 0.5) {
/** Draw only objects with confidence above 50% **/
std::ostringstream conf;
conf << ":" << std::fixed << std::setprecision(3) << confidence;
cv::rectangle(image, cv::Point2f(xmin, ymin), cv::Point2f(xmax, ymax), cv::Scalar(0, 0, 255));
std::cout << "[" << cur_proposal << "," << label << "] element, prob = " << confidence <<
", bbox = (" << xmin << "," << ymin << ")-(" << xmax << "," << ymax << ")" << ", batch id = "
<< image_id << std::endl;
}
}
cv::imwrite("hello_shape_infer_ssd_output.jpg", image);
std::cout << "The resulting image was saved in the file: hello_shape_infer_ssd_output.jpg" << std::endl;
// -----------------------------------------------------------------------------------------------------
} catch (const std::exception& ex) {
std::cerr << ex.what() << std::endl;
return EXIT_FAILURE;
}
return EXIT_SUCCESS;
}

View File

@@ -0,0 +1,146 @@
// Copyright (C) 2018 Intel Corporation
//
// SPDX-License-Identifier: Apache-2.0
//
#include <map>
#include <memory>
#include <string>
#include <algorithm>
#include <vector>
#include <inference_engine.hpp>
#define CUSTOM_RELU_TYPE std::string("CustomReLU")
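// This sample extension registers a custom "CustomReLU" layer: an execution kernel (ILayerExecImpl),
// a factory that creates it, and a shape inference implementation (IShapeInferImpl), all exposed through IExtension.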
class CustomReLUImpl : public InferenceEngine::ILayerExecImpl {
public:
explicit CustomReLUImpl(const InferenceEngine::CNNLayer& layer) : _layer(layer) {}
InferenceEngine::StatusCode getSupportedConfigurations(std::vector<InferenceEngine::LayerConfig>& conf,
InferenceEngine::ResponseDesc* resp) noexcept override {
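// Advertise a single supported configuration that mirrors the tensor descriptors of the layer's first input and output.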
InferenceEngine::DataConfig inDataConfig;
InferenceEngine::DataConfig outDataConfig;
auto firstInput = *_layer.insData.begin();
auto firstOutput = *_layer.outData.begin();
inDataConfig.desc = firstInput.lock()->getTensorDesc();
outDataConfig.desc = firstOutput->getTensorDesc();
InferenceEngine::LayerConfig layerConfig;
layerConfig.inConfs = {inDataConfig};
layerConfig.outConfs = {outDataConfig};
conf.push_back(layerConfig);
return InferenceEngine::StatusCode::OK;
}
InferenceEngine::StatusCode
init(InferenceEngine::LayerConfig& config, InferenceEngine::ResponseDesc* resp) noexcept override {
return InferenceEngine::StatusCode::OK;
}
InferenceEngine::StatusCode
execute(std::vector<InferenceEngine::Blob::Ptr>& inputs, std::vector<InferenceEngine::Blob::Ptr>& outputs,
InferenceEngine::ResponseDesc* resp) noexcept override {
static bool wasCalled = false;
if (!wasCalled) {
std::cout << "Running " + CUSTOM_RELU_TYPE + " kernel for the first time (next messages won't be printed)"
<< std::endl;
wasCalled = true;
}
for (size_t i = 0; i < inputs.size(); i++) {
auto inputBlob = inputs[i];
auto outputBlob = outputs[i];
auto inputData = inputBlob->buffer().as<InferenceEngine::PrecisionTrait<InferenceEngine::Precision::FP32>::value_type*>();
auto outputData = outputBlob->buffer().as<InferenceEngine::PrecisionTrait<InferenceEngine::Precision::FP32>::value_type*>();
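// Element-wise ReLU: negative values are clamped to zero, non-negative values are copied unchanged.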
for (size_t j = 0; j < inputBlob->size(); j++) {
outputData[j] = inputData[j] < 0 ? 0 : inputData[j];
}
}
return InferenceEngine::StatusCode::OK;
}
private:
const InferenceEngine::CNNLayer _layer;
};
class CustomReLUFactory : public InferenceEngine::ILayerImplFactory {
public:
explicit CustomReLUFactory(const InferenceEngine::CNNLayer* layer) : _layer(*layer) {}
InferenceEngine::StatusCode
getImplementations(std::vector<InferenceEngine::ILayerImpl::Ptr>& impls,
InferenceEngine::ResponseDesc* resp) noexcept override {
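// The factory provides a single implementation: the custom ReLU kernel defined above.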
impls.push_back(std::make_shared<CustomReLUImpl>(_layer));
return InferenceEngine::StatusCode::OK;
}
private:
InferenceEngine::CNNLayer _layer;
};
class CustomReLUResizeImpl : public InferenceEngine::IShapeInferImpl {
public:
InferenceEngine::StatusCode inferShapes(const std::vector<InferenceEngine::SizeVector>& inShapes,
const std::map<std::string, std::string>& params,
const std::map<std::string, InferenceEngine::Blob::Ptr>& blobs,
std::vector<InferenceEngine::SizeVector>& outShapes,
InferenceEngine::ResponseDesc* desc) noexcept override {
static bool wasCalled = false;
if (!wasCalled) {
std::cout << "Running " + CUSTOM_RELU_TYPE +
" shape inference for the first time (next messages won't be printed)" << std::endl;
wasCalled = true;
}
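// The custom layer is element-wise, so every output keeps the shape of the corresponding input.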
outShapes = inShapes;
return InferenceEngine::StatusCode::OK;
}
};
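// IExtension that exposes the kernel factory and the shape inference implementation above for the "CustomReLU" layer type.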
class InPlaceExtension : public InferenceEngine::IExtension {
public:
InPlaceExtension() {
_shapeInferImpl = std::make_shared<CustomReLUResizeImpl>();
}
InferenceEngine::StatusCode
getPrimitiveTypes(char**& types, unsigned int& size, InferenceEngine::ResponseDesc* resp) noexcept override {
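// Report the single custom layer type as a newly allocated, null-terminated C string.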
size = 1;
types = new char* [size];
std::string type = CUSTOM_RELU_TYPE;
types[0] = new char[type.size() + 1];
std::copy(type.begin(), type.end(), types[0]);
types[0][type.size()] = 0;
return InferenceEngine::StatusCode::OK;
};
InferenceEngine::StatusCode
getShapeInferTypes(char**& types, unsigned int& size, InferenceEngine::ResponseDesc* resp) noexcept override {
return getPrimitiveTypes(types, size, resp);
};
InferenceEngine::StatusCode getShapeInferImpl(InferenceEngine::IShapeInferImpl::Ptr& impl, const char* type,
InferenceEngine::ResponseDesc* resp) noexcept override {
if (CUSTOM_RELU_TYPE.compare(type) != 0) return InferenceEngine::StatusCode::NOT_IMPLEMENTED;
impl = _shapeInferImpl;
return InferenceEngine::StatusCode::OK;
}
void GetVersion(const InferenceEngine::Version*& versionInfo) const noexcept override {};
void SetLogCallback(InferenceEngine::IErrorListener& listener) noexcept override {};
void Unload() noexcept override {};
void Release() noexcept override {}
InferenceEngine::StatusCode
getFactoryFor(InferenceEngine::ILayerImplFactory*& factory, const InferenceEngine::CNNLayer* cnnLayer,
InferenceEngine::ResponseDesc* resp) noexcept override {
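// Only the "CustomReLU" type is handled; for it, hand out a factory that creates the execution kernel.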
if (cnnLayer->type != CUSTOM_RELU_TYPE)
return InferenceEngine::StatusCode::NOT_IMPLEMENTED;
factory = new CustomReLUFactory(cnnLayer);
return InferenceEngine::StatusCode::OK;
};
private:
InferenceEngine::IShapeInferImpl::Ptr _shapeInferImpl;
};
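// A hedged usage sketch (the exact registration calls live in the sample's main(); the `network` and
// `plugin` objects are assumed to already exist, as in the snippet above):
//   auto extension = std::make_shared<InPlaceExtension>();
//   network.AddExtension(extension);   // lets CNNNetwork::reshape() infer CustomReLU output shapes
//   plugin.AddExtension(extension);    // lets the plugin execute the CustomReLU kernel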

View File

@@ -43,7 +43,7 @@ add_dependencies(${TARGET_NAME} gflags)
set_target_properties(${TARGET_NAME} PROPERTIES "CMAKE_CXX_FLAGS" "${CMAKE_CXX_FLAGS} -fPIE"
COMPILE_PDB_NAME ${TARGET_NAME})
target_link_libraries(${TARGET_NAME} format_reader cpu_extension ${InferenceEngine_LIBRARIES} gflags)
target_link_libraries(${TARGET_NAME} format_reader IE::ie_cpu_extension ${InferenceEngine_LIBRARIES} gflags)
if(UNIX)
target_link_libraries( ${TARGET_NAME} ${LIB_DL} pthread)

Some files were not shown because too many files have changed in this diff.