committed by openvino-pushbot
parent 54eab18036
commit 55a41d7570

.gitignore (vendored)
@@ -282,7 +282,6 @@ report/
/CMakeCache.txt
.vimprj/
build_IA32/
doc/
.dir-locals.el
GTAGS
GPATH
.gitmodules (vendored, new file)

@@ -0,0 +1,3 @@
[submodule "inference-engine/thirdparty/ade"]
    path = inference-engine/thirdparty/ade
    url = https://github.com/opencv/ade.git
@@ -2,29 +2,33 @@

The software was validated on:
- Ubuntu\* 16.04 with default GCC\* 5.4.0
- CentOS\* 7.4 with default GCC\* 4.8.5 (using clDNN library built separately with GCC\* 5.2)
- CentOS\* 7.4 with default GCC\* 4.8.5
- [Intel® Graphics Compute Runtime for OpenCL™ Driver package 18.28.11080](https://github.com/intel/compute-runtime/releases/tag/18.28.11080).

### Software Requirements
- [CMake\*](https://cmake.org/download/) 3.9 or higher
- GCC\* 4.8 or higher to build the Inference Engine
- GCC\* 5.2 or higher to build the Compute Library for Deep Neural Networks (clDNN library)
- OpenBLAS\*

### Build Steps
1. Install OpenBLAS and other dependencies using the `install_dependencies.sh` script in the project root folder.
2. Create a build folder:
1. Clone submodules:
```sh
git submodule init
git submodule update --recursive
```
2. Install build dependencies using the `install_dependencies.sh` script in the project root folder.
3. Create a build folder:
```sh
mkdir build
```
3. Inference Engine uses a CMake-based build system. In the created `build` directory, run `cmake` to fetch project dependencies and create Unix makefiles, then run `make` to build the project:
4. Inference Engine uses a CMake-based build system. In the created `build` directory, run `cmake` to fetch project dependencies and create Unix makefiles, then run `make` to build the project:
```sh
cmake -DCMAKE_BUILD_TYPE=Release ..
make -j16
```
You can use the following additional build options:
- Use the `BLAS_INCLUDE_DIRS` and `BLAS_LIBRARIES` cmake options to specify the path to the OpenBLAS headers and library. For example, use the following options on CentOS\*: `-DBLAS_INCLUDE_DIRS=/usr/include/openblas -DBLAS_LIBRARIES=/usr/lib64/libopenblas.so.0`
- To build clDNN from sources, specify the `-DENABLE_CLDNN_BUILD=ON` option for `cmake`. By default, a pre-built version of the clDNN library is used.
- The internal JIT GEMM implementation is used by default.
- To switch to the OpenBLAS\* implementation, use the `GEMM=OPENBLAS` option together with the `BLAS_INCLUDE_DIRS` and `BLAS_LIBRARIES` cmake options to specify the path to the OpenBLAS headers and library. For example, use the following options on CentOS\*: `-DGEMM=OPENBLAS -DBLAS_INCLUDE_DIRS=/usr/include/openblas -DBLAS_LIBRARIES=/usr/lib64/libopenblas.so.0`
- To switch to the optimized MKL-ML\* GEMM implementation, use the `GEMM=MKL` and `MKLROOT` cmake options to specify the path to the unpacked MKL-ML package with `include` and `lib` folders. For example, use the following options: `-DGEMM=MKL -DMKLROOT=<path_to_MKL>`. The MKL-ML\* package can be downloaded [here](https://github.com/intel/mkl-dnn/releases/download/v0.17/mklml_lnx_2019.0.1.20180928.tgz)
- To switch the CPU and GPU plugins on or off, use the `cmake` options `-DENABLE_MKL_DNN=ON/OFF` and `-DENABLE_CLDNN=ON/OFF`.

## Build on Windows\* Systems:
@@ -39,25 +43,31 @@ The software was validated on:
- [Intel® C++ Compiler](https://software.intel.com/en-us/intel-parallel-studio-xe) 18.0 to build the Inference Engine on Windows.

### Build Steps
1. Download and install [Intel® C++ Compiler](https://software.intel.com/en-us/intel-parallel-studio-xe) 18.0
2. Install OpenBLAS:
1. Clone submodules:
```sh
git submodule init
git submodule update --recursive
```
2. Download and install [Intel® C++ Compiler](https://software.intel.com/en-us/intel-parallel-studio-xe) 18.0
3. Install OpenBLAS:
   1. Download [OpenBLAS\*](https://sourceforge.net/projects/openblas/files/v0.2.14/OpenBLAS-v0.2.14-Win64-int64.zip/download)
   2. Unzip the downloaded package to a directory on your machine. In this document, this directory is referred to as `<OPENBLAS_DIR>`.
3. Create a build directory:
4. Create a build directory:
```sh
mkdir build
```
4. In the `build` directory, run `cmake` to fetch project dependencies and generate a Visual Studio solution:
5. In the `build` directory, run `cmake` to fetch project dependencies and generate a Visual Studio solution:
```sh
cd build
cmake -G "Visual Studio 15 2017 Win64" -T "Intel C++ Compiler 18.0" -DOS_FOLDER=ON ^
    -DBLAS_INCLUDE_DIRS=<OPENBLAS_DIR>\include ^
    -DBLAS_LIBRARIES=<OPENBLAS_DIR>\lib\libopenblas.dll.a ^
    -DCMAKE_BUILD_TYPE=Release ^
    -DICCLIB="C:\Program Files (x86)\IntelSWTools\compilers_and_libraries_2018\windows\compiler\lib" ..
```

5. Build the generated solution in Visual Studio 2017 or run `cmake --build .` to build from the command line.
- To switch to the OpenBLAS GEMM implementation, use the `-DGEMM=OPENBLAS` cmake option and specify the path to OpenBLAS using the `-DBLAS_INCLUDE_DIRS=<OPENBLAS_DIR>\include` and `-DBLAS_LIBRARIES=<OPENBLAS_DIR>\lib\libopenblas.dll.a` options. A prebuilt OpenBLAS\* package can be downloaded [here](https://sourceforge.net/projects/openblas/files/v0.2.14/OpenBLAS-v0.2.14-Win64-int64.zip/download), and mingw64* runtime dependencies [here](https://sourceforge.net/projects/openblas/files/v0.2.14/mingw64_dll.zip/download)
- To switch to the optimized MKL-ML GEMM implementation, use the `GEMM=MKL` and `MKLROOT` cmake options to specify the path to the unpacked MKL-ML package with `include` and `lib` folders. For example, use the following options: `-DGEMM=MKL -DMKLROOT=<path_to_MKL>`. The MKL-ML\* package can be downloaded [here](https://github.com/intel/mkl-dnn/releases/download/v0.17/mklml_win_2019.0.1.20180928.zip)

6. Build the generated solution in Visual Studio 2017 or run `cmake --build . --config Release` to build from the command line.

---
\* Other names and brands may be claimed as the property of others.
@@ -2,10 +2,11 @@
#
# SPDX-License-Identifier: Apache-2.0
#

include ("features")
include("features")
include("mode")
include("omp")
if (THREADING STREQUAL "OMP")
    include("omp")
endif()
include("itt")

#64 bits platform
@@ -40,7 +41,7 @@ if (WIN32)

if (MINGW)
    SET(ENABLE_CLDNN OFF) # dont have mingw dll for linking
    set(ENABLE_SAMPLES_CORE OFF)
    set(ENABLE_SAMPLES OFF)
endif()
endif()

@@ -63,14 +64,6 @@ if (NOT ENABLE_MKL_DNN)
    set(GEMM OPENBLAS)
endif()

if (NOT ENABLE_VPU)
    set(ENABLE_MYRIAD OFF)
endif()

if (NOT ENABLE_MYRIAD)
    set(ENABLE_VPU OFF)
endif()

#next section set defines to be accesible in c++/c code for certain feature
if (ENABLE_PROFILING_RAW)
    add_definitions(-DENABLE_PROFILING_RAW=1)
@@ -100,8 +93,6 @@ if (ENABLE_OBJECT_DETECTION_TESTS)
    add_definitions(-DENABLE_OBJECT_DETECTION_TESTS=1)
endif()

#models dependend tests

if (DEVELOPMENT_PLUGIN_MODE)
    message (STATUS "Enabled development plugin mode")

@@ -121,14 +112,9 @@ if (VERBOSE_BUILD)
    set(CMAKE_VERBOSE_MAKEFILE ON)
endif()

if (NOT ENABLE_OMP)
if (THREADING STREQUAL "TBB" OR THREADING STREQUAL "SEQ")
    set(ENABLE_INTEL_OMP OFF)
    message(STATUS "ENABLE_INTEL_OMP should be disabled if THREADING is TBB or Sequential. ENABLE_INTEL_OMP option is " ${ENABLE_INTEL_OMP})
endif()

if (NOT GEMM STREQUAL "MKL" AND NOT GEMM STREQUAL "OPENBLAS")
    message("FATAL_ERROR" "GEMM should be set to MKL|OPENBLAS")
endif()

print_enabled_features()

message(STATUS "GEMM = ${GEMM}")
print_enabled_features()
@@ -4,6 +4,7 @@
#

cmake_minimum_required(VERSION 2.8)
cmake_policy(SET CMP0054 NEW)

#features trigger supported by build system
include(check_features)
@@ -40,19 +41,6 @@ endif()
set(MODELS_PATH "${TEMP}/models")
debug_message(STATUS "MODELS_PATH=" ${MODELS_PATH})

#clDNN
if (ENABLE_CLDNN AND NOT ENABLE_CLDNN_BUILD)
    if(NOT IE_SUBMODULE_IN_CLDNN)
        RESOLVE_DEPENDENCY(CLDNN
                ARCHIVE_UNIFIED "cldnn-main-03988.zip"
                TARGET_PATH "${TEMP}/clDNN"
                ENVIRONMENT "CLDNN"
                VERSION_REGEX ".*_(([a-z]+-)?[a-z]+-[0-9]+)---.*"
                FOLDER) #new cldnn package dont have toplevel cldnn folder
        debug_message(STATUS "clDNN=" ${CLDNN})
    endif ()
endif ()

## enable cblas_gemm from OpenBLAS package
if (GEMM STREQUAL "OPENBLAS")
    if(NOT BLAS_LIBRARIES OR NOT BLAS_INCLUDE_DIRS)
@@ -67,51 +55,87 @@ debug_message(STATUS "openblas=" ${BLAS_LIBRARIES})
endif ()

#MKL-ml package
if (GEMM STREQUAL "MKL" OR ENABLE_INTEL_OMP)
if (GEMM STREQUAL "MKL")
    if(NOT MKLROOT)
        message(FATAL_ERROR "MKLROOT not found: install MKL and set -DMKLROOT=<path_to_MKL>")
    endif()
    debug_message(STATUS "mkl_ml=" ${MKLROOT})
endif ()

if (ENABLE_INTEL_OMP)
    if (WIN32)
        RESOLVE_DEPENDENCY(MKL
                ARCHIVE_WIN "mkltiny_win_20180512.zip"
                TARGET_PATH "${TEMP}/mkltiny_win_20180512"
                ENVIRONMENT "MKLROOT"
        RESOLVE_DEPENDENCY(OMP
                ARCHIVE_WIN "iomp.zip"
                TARGET_PATH "${TEMP}/omp"
                ENVIRONMENT "OMP"
                VERSION_REGEX ".*_([a-z]*_([a-z0-9]+\\.)*[0-9]+).*")
    elseif(LINUX)
        RESOLVE_DEPENDENCY(MKL
                ARCHIVE_LIN "mkltiny_lnx_20180511.tgz"
                TARGET_PATH "${TEMP}/mkltiny_lnx_20180511"
                ENVIRONMENT "MKLROOT"
        RESOLVE_DEPENDENCY(OMP
                ARCHIVE_LIN "iomp.tgz"
                TARGET_PATH "${TEMP}/omp"
                ENVIRONMENT "OMP"
                VERSION_REGEX ".*_([a-z]*_([a-z0-9]+\\.)*[0-9]+).*")
    endif()
    debug_message(STATUS "mkl_ml=" ${MKL})
    log_rpath_from_dir(OMP "${OMP}/lib")
    debug_message(STATUS "intel_omp=" ${OMP})
endif ()

#TBB package
if (THREADING STREQUAL "TBB")
    if (WIN32)
        #TODO: add target_path to be platform specific as well, to avoid following if
        RESOLVE_DEPENDENCY(TBB
                ARCHIVE_WIN "tbb2018_20180618_win.zip" #TODO: windows zip archive created incorrectly using old name for folder
                TARGET_PATH "${TEMP}/tbb"
                ENVIRONMENT "TBBROOT"
                VERSION_REGEX ".*_([a-z]*_([a-z0-9]+\\.)*[0-9]+).*")
    elseif(LINUX)
        RESOLVE_DEPENDENCY(TBB
                ARCHIVE_LIN "tbb2018_20180618_lin.tgz"
                TARGET_PATH "${TEMP}/tbb"
                ENVIRONMENT "TBBROOT")
    endif()
    set(TBB_INCLUDE_DIRS "${TBB}/include")
    find_path(TBB_INCLUDE_DIRS tbb/tbb.h)
    find_library(TBB_LIBRARIES_RELEASE tbb HINTS "${TBB}/lib")
    if (TBB_INCLUDE_DIRS AND TBB_LIBRARIES_RELEASE)
        log_rpath_from_dir(TBB "${TBB}/lib")
    else()
        message("FATAL_ERROR" "TBB is unset")
    endif()
    debug_message(STATUS "tbb=" ${TBB})
endif ()

if (ENABLE_OPENCV)
    if (WIN32)
        RESOLVE_DEPENDENCY(OPENCV
                ARCHIVE_WIN "opencv_3.4.3.zip"
                TARGET_PATH "${TEMP}/opencv"
                ARCHIVE_WIN "opencv_4.0.0-0256.zip"
                TARGET_PATH "${TEMP}/opencv_4.0.0"
                ENVIRONMENT "OpenCV_DIR"
                VERSION_REGEX ".*_([0-9]+.[0-9]+.[0-9]+).*")
        log_rpath_from_dir(OPENCV "\\opencv\\x64\\vc14\\bin")
        set( ENV{OpenCV_DIR} ${OPENCV} )
        log_rpath_from_dir(OPENCV "\\opencv_4.0.0\\bin")
        set( ENV{OpenCV_DIR} ${OPENCV}/cmake )
    elseif(LINUX)
        if (${LINUX_OS_NAME} STREQUAL "Ubuntu 16.04")
            RESOLVE_DEPENDENCY(OPENCV
                    ARCHIVE_LIN "opencv_3.4.3_ubuntu16.tar.bz2"
                    TARGET_PATH "${TEMP}/opencv_ubuntu16"
                    ARCHIVE_LIN "opencv_4.0.0-0256_ubuntu16.tgz"
                    TARGET_PATH "${TEMP}/opencv_4.0.0_ubuntu"
                    ENVIRONMENT "OpenCV_DIR"
                    VERSION_REGEX ".*_([0-9]+.[0-9]+.[0-9]+).*")
            log_rpath_from_dir(OPENCV "opencv_ubuntu16/lib")
            log_rpath_from_dir(OPENCV "opencv_4.0.0_ubuntu/lib")
        elseif (${LINUX_OS_NAME} STREQUAL "CentOS 7")
            RESOLVE_DEPENDENCY(OPENCV
                    ARCHIVE_LIN "opencv_3.4.3_centos7.tar.bz2"
                    TARGET_PATH "${TEMP}/opencv_centos7"
                    ARCHIVE_LIN "opencv_4.0.0-0256_centos.tgz"
                    TARGET_PATH "${TEMP}/opencv_4.0.0_centos"
                    ENVIRONMENT "OpenCV_DIR"
                    VERSION_REGEX ".*_([0-9]+.[0-9]+.[0-9]+).*")
            log_rpath_from_dir(OPENCV "opencv_centos7/lib")
            log_rpath_from_dir(OPENCV "opencv_4.0.0_centos/lib")
        endif()
        set( ENV{OpenCV_DIR} ${OPENCV}/share )
        set( ENV{OpenCV_DIR} ${OPENCV}/cmake )
    endif()
    debug_message(STATUS "opencv=" ${OPENCV})
endif()

include(omp)
if (THREADING STREQUAL "OMP")
    include(omp)
endif ()
@@ -24,4 +24,4 @@ function (Download from to fatal result output)
endfunction(Download)

include ("download_and_apply")
include ("download_and_extract")
include ("download_and_extract")
@@ -53,4 +53,4 @@ function (DownloadAndCheck from to fatal result)
    file(REMOVE ${to}.md5)
    set(${result} "${status_res}" PARENT_SCOPE)

endfunction(DownloadAndCheck)
endfunction(DownloadAndCheck)
@@ -144,7 +144,7 @@ function (CheckOrDownloadAndExtract component RELATIVE_URL archive_name unpacked
    set (status "ON")
    set (on_master FALSE)

    set (URL "https://download.01.org/openvinotoolkit/2018_R3/dldt/inference_engine/${RELATIVE_URL}")
    set (URL "https://download.01.org/openvinotoolkit/2018_R4/dldt/inference_engine/${RELATIVE_URL}")

    #no message on recursive calls
    if (${use_alternatives})

@@ -45,4 +45,4 @@ function (extract archive_path unpacked_path folder result)
    endif()

endif()
endfunction (extract)
endfunction (extract)
@@ -15,18 +15,25 @@ ie_option (ENABLE_MKL_DNN "MKL-DNN plugin for inference engine" ON)

ie_option (ENABLE_CLDNN "clDnn based plugin for inference engine" ON)

ie_option (ENABLE_CLDNN_BUILD "build clDnn from sources" OFF)

ie_option (ENABLE_PROFILING_ITT "ITT tracing of IE and plugins internals" ON)

ie_option (ENABLE_PROFILING_RAW "Raw counters profiling (just values, no start/stop time or timeline)" OFF)

# "MKL-DNN library might use MKL-ML or OpenBLAS for gemm tasks: OPENBLAS|MKL"
if (NOT GEMM)
    set (GEMM "OPENBLAS")
endif()
#

ie_option (ENABLE_OMP "MKL-DNN library based on OMP implementation" ON)
# "MKL-DNN library might use MKL-ML or OpenBLAS for gemm tasks: MKL|OPENBLAS|JIT"
if (NOT GEMM STREQUAL "MKL" AND NOT GEMM STREQUAL "OPENBLAS" AND NOT GEMM STREQUAL "JIT")
    set (GEMM "JIT")
    message(STATUS "GEMM should be set to MKL|OPENBLAS|JIT. Default option is " ${GEMM})
endif()
list (APPEND IE_OPTIONS GEMM)

# "MKL-DNN library based on OMP or TBB or Sequential implementation: TBB|OMP|SEQ"
if (NOT THREADING STREQUAL "TBB" AND NOT THREADING STREQUAL "OMP" AND NOT THREADING STREQUAL "SEQ")
    set (THREADING "OMP")
    message(STATUS "THREADING should be set to TBB|OMP|SEQ. Default option is " ${THREADING})
endif()
list (APPEND IE_OPTIONS THREADING)

ie_option (ENABLE_INTEL_OMP "MKL-DNN library based on Intel OMP implementation" ON)

@@ -60,4 +67,3 @@ ie_option (ENABLE_PLUGIN_RPATH "enables rpath information to be present in plugi

#name of environment variable stored path to temp directory"
set (DL_SDK_TEMP "DL_SDK_TEMP")
@@ -8,7 +8,7 @@ cmake_minimum_required(VERSION 2.8)
if (UNIX)
    function(get_linux_name res_var)
        if (NOT EXISTS "/etc/lsb-release")
            execute_process(COMMAND find /etc/ -maxdepth 1 -type f -name *-release -exec cat {} \;
            execute_process(COMMAND find -L /etc/ -maxdepth 1 -type f -name *-release -exec cat {} \;
                    OUTPUT_VARIABLE release_data RESULT_VARIABLE result)
            set(name_regex "NAME=\"([^ \"\n]*).*\"\n")
            set(version_regex "VERSION=\"([0-9]+(\\.[0-9]+)?)[^\n]*\"")
@@ -3,17 +3,19 @@
# SPDX-License-Identifier: Apache-2.0
#

cmake_policy(SET CMP0054 NEW)

if (APPLE OR WIN32)

    find_path(OMP_INC omp.h)
    find_library(OMP_LIB iomp5
            PATHS ${MKL}/lib)
            PATHS ${OMP}/lib)

    if (OMP_INC AND OMP_LIB)
        set(HAVE_OMP TRUE)
        get_filename_component(OMP_LIB_DIR "${OMP_LIB}" PATH)
else()
    if (ENABLE_OMP)
    if (THREADING STREQUAL "OMP")
        find_package(OpenMP)
        if (NOT OPENMP_FOUND)
            message(WARNING "OpenMP not found. OpenMP support will be disabled.")
@@ -34,7 +36,7 @@ macro(enable_omp)
    elseif(UNIX) # Linux
        add_definitions(-fopenmp)
    elseif(WIN32) # Windows
        if (ENABLE_OMP)
        if (THREADING STREQUAL "OMP")
            set(OPENMP_FLAGS "/Qopenmp /openmp")
            set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} ${CMAKE_CCXX_FLAGS} ${OPENMP_FLAGS}")
            set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} ${CMAKE_CCXX_FLAGS} ${OPENMP_FLAGS}")
@@ -45,13 +47,13 @@ macro(enable_omp)
    if (WIN32)
        find_library(intel_omp_lib
                libiomp5md
                PATHS ${MKL}/lib ${ICCLIB})
                PATHS ${OMP}/lib ${ICCLIB})
        set (CMAKE_EXE_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS} /nodefaultlib:vcomp")
        set (CMAKE_SHARED_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS} /nodefaultlib:vcomp")
    else()
        find_library(intel_omp_lib
                iomp5
                PATHS ${MKL}/lib)
                PATHS ${OMP}/lib)
    endif()
endif()
endmacro(enable_omp)
@@ -3,6 +3,7 @@
# SPDX-License-Identifier: Apache-2.0
#

# Usage: ie_option(<option_variable> "description" <initial value or boolean expression> [IF <condition>])
function (ie_option variable description value)
    option(${variable} "${description}" ${value})
    list (APPEND IE_OPTIONS "${variable}")
@@ -8,4 +8,4 @@ if (ENABLE_SANITIZER)
    set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -fsanitize=address -fuse-ld=gold")
    set(CMAKE_SHARED_LINKER_FLAGS "${CMAKE_SHARED_LINKER_FLAGS} -fsanitize=address")
    set(CMAKE_EXE_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS} -fsanitize=address")
endif()
endif()
@@ -1,7 +1,7 @@
# Copyright (C) 2018 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
#
set(InferenceEngine_VERSION 1.2.0)
set(InferenceEngine_VERSION 1.4.0)
set(PACKAGE_VERSION ${InferenceEngine_VERSION})

set(PACKAGE_VERSION_EXACT False)
@@ -14,4 +14,4 @@ endif()

if(PACKAGE_FIND_VERSION VERSION_LESS PACKAGE_VERSION)
    set(PACKAGE_VERSION_COMPATIBLE True)
endif()
endif()
@@ -75,10 +75,17 @@ else()
    set(os_name "${os_name} ${CMAKE_MATCH_1}")

    if (NOT os_name)
        message(FATAL_ERROR "Cannot detect OS via reading /etc/*-release:\n ${release_data}")
        if(InferenceEngine_FIND_REQUIRED)
            message(FATAL_ERROR "Cannot detect OS via reading /etc/*-release:\n ${release_data}")
        elseif(NOT InferenceEngine_FIND_QUIETLY)
            message(WARNING "Cannot detect OS via reading /etc/*-release:\n ${release_data}")
        endif()
        return()
    endif()

    message (STATUS "/etc/*-release distrib: ${os_name}")
    if (NOT InferenceEngine_FIND_QUIETLY)
        message (STATUS "/etc/*-release distrib: ${os_name}")
    endif()

    if (${os_name} STREQUAL "Ubuntu 14.04")
        set(_OS_PATH "ubuntu_14.04/")
@@ -89,7 +96,12 @@ else()
    elseif (${os_name} STREQUAL "poky 2.0")
        set(_OS_PATH "ubuntu_16.04/")
    else()
        message(FATAL_ERROR "${os_name} is not supported. List of supported OS: Ubuntu 14.04, Ubuntu 16.04, CentOS 7")
        if(InferenceEngine_FIND_REQUIRED)
            message(FATAL_ERROR "${os_name} is not supported. List of supported OS: Ubuntu 14.04, Ubuntu 16.04, CentOS 7")
        elseif(NOT InferenceEngine_FIND_QUIETLY)
            message(WARNING "${os_name} is not supported. List of supported OS: Ubuntu 14.04, Ubuntu 16.04, CentOS 7")
        endif()
        return()
    endif()
    endif()
endif()
@@ -98,18 +110,23 @@ else()
    unset(IE_INCLUDE_DIR CACHE)
endif()

if(IE_SRC_DIR AND NOT "${IE_ROOT_DIR}/src" EQUAL "${IE_SRC_DIR}")
    unset(IE_SRC_DIR CACHE)
endif()

if(IE_LIBRARY AND NOT "${IE_ROOT_DIR}/lib/${_OS_PATH}/${_ARCH}" EQUAL "${IE_LIBRARY}")
    unset(IE_LIBRARY CACHE)
endif()

set(_IE_ROOT_INCLUDE_DIR "${IE_ROOT_DIR}/include")
set(_IE_ROOT_SRC_DIR "${IE_ROOT_DIR}/src")
set(_IE_ROOT_LIBRARY "${IE_ROOT_DIR}/lib/${_OS_PATH}/${_ARCH}")

find_path(IE_INCLUDE_DIR inference_engine.hpp "${_IE_ROOT_INCLUDE_DIR}")
#message("InferenceEngine_INCLUDE_DIR=${IE_INCLUDE_DIR}:${_IE_ROOT_INCLUDE_DIR}")
find_path(IE_SRC_DIR extension "${_IE_ROOT_SRC_DIR}")

include(FindPackageHandleStandardArgs)

if (WIN32)
    find_library(IE_RELEASE_LIBRARY inference_engine "${_IE_ROOT_LIBRARY}/Release")
    find_library(IE_DEBUG_LIBRARY inference_engine "${_IE_ROOT_LIBRARY}/Debug")
@@ -146,6 +163,9 @@ else()
    set(InferenceEngine_INCLUDE_DIRS ${IE_INCLUDE_DIR})
    set(InferenceEngine_LIBRARIES IE::inference_engine)
    set(InferenceEngine_FOUND TRUE)

    add_subdirectory(${IE_SRC_DIR}/extension EXCLUDE_FROM_ALL ie_cpu_extension)
    add_library(IE::ie_cpu_extension ALIAS ie_cpu_extension)
endif()
endif()
@@ -1,4 +1,4 @@
# Overview of Inference Engine Python* API {#InferEnginePythonAPI}
# Overview of Inference Engine Python* API

**NOTE:** It is a preview version of the Inference Engine Python\* API for evaluation purpose only.
Module structure and API itself may be changed in future releases.
@@ -32,20 +32,21 @@ after running the environment configuration script.
This class stores main information about the layer and allows modifying some layer parameters
### Class attributes:

* `name` - name of the layer
* `type` - layer type
* `precision` - layer base operating precision
* `affinity` - layer affinity set by user or default affinity set by IEPlugin.set_initial_affinity() method.
The affinity attribute provides getter and setter interfaces, so the layer affinity can be modified directly in the following way

* `name` - Name of the layer
* `type` - Layer type
* `precision` - Layer base operating precision. Provides getter and setter interfaces.
* `affinity` - Layer affinity set by user or a default affinity set by the `IEPlugin.set_initial_affinity()` method.
The affinity attribute provides getter and setter interfaces, so the layer affinity can be modified directly.
For example:

```py
>>> net = IENetwork.from_ir(model=path_to_xml_file, weights=path_to_bin_file)
>>> plugin = IEPlugin(device="HETERO:FPGA,CPU")
>>> plugin.set_config({"TARGET_FALLBACK": "HETERO:FPGA,CPU"})
>>> plugin.set_initial_affinity(net)
>>> for l in net.layers.values():
...     if l.type == "Convolution":
...         l.affinity = "CPU"
>>> net = IENetwork.from_ir(model=path_to_xml_file, weights=path_to_bin_file)
>>> plugin = IEPlugin(device="HETERO:FPGA,CPU")
>>> plugin.set_config({"TARGET_FALLBACK": "HETERO:FPGA,CPU"})
>>> plugin.set_initial_affinity(net)
>>> for l in net.layers.values():
...     if l.type == "Convolution":
...         l.affinity = "CPU"

```

@@ -61,18 +62,18 @@ To understand how default and non-default affinities are set:

1. Call `net.layers` function right after model loading and check that the layer affinity parameter is empty.
2. Call `plugin.set_default_affinity(net)`.
3. Call `net.layers` and check layer affinity parameters to see how the plugin set the default affinity
3. Call `net.layers` and check layer affinity parameters to see how the plugin set a default affinity
4. Set layer affinity as described above
5. Call `net.layers` again and check layer affinity parameters to see how it was changed after manual affinity
setting

Please refer to `affinity_setting_sample.py` to see the full usage pipeline.
Please refer to `affinity_setting_demo.py` to see the full usage pipeline.

* `weights` - dictionary with layer weights, biases or custom blobs if any
* `params` - layer specific parameters. Provides getter and setter interfaces which allow getting and/or modifying layer parameters.
Please note that some modifications can be ignored and/or overwritten by the target plugin (e.g. a modification of
the convolution kernel size will be reflected in layer parameters, but the plugin will finally ignore it and
use the initial kernel size)
* `weights` - Dictionary with layer weights, biases or custom blobs if any
* `params` - Layer specific parameters. Provides getter and setter interfaces to get and modify layer parameters.
Please note that some modifications can be ignored and/or overwritten by the target plugin (e.g. a modification of
the convolution kernel size will be reflected in layer parameters, but the plugin will finally ignore it and
use the initial kernel size)
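For instance, a minimal sketch of working with these setters (the layer name `conv0` follows the `layers` example below; the `dilation` parameter key is hypothetical and depends on the layer type):

```py
>>> net = IENetwork.from_ir(model=path_to_xml_file, weights=path_to_bin_file)
>>> layer = net.layers['conv0']   # 'conv0' as in the layers example below
>>> params = layer.params         # read the parameter dictionary
>>> params['dilation'] = '2,2'    # 'dilation' is a hypothetical key for illustration
>>> layer.params = params         # assign the modified dictionary back through the setter
>>> layer.precision = "FP16"      # the precision setter works the same way
```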
## <a name="ienetwork-class"></a>IENetwork
|
||||
|
||||
@@ -86,41 +87,53 @@ There is no explicit class constructor. Use `from_ir` class method to read the I
|
||||
### Class attributes:
|
||||
|
||||
* `name` - Name of the loaded network
|
||||
* `inputs` - a dictionary of input layer name as a key and input data shape as a value
|
||||
* `inputs` - A dictionary that maps input layer names to <a name="inputinfo-class"></a>InputInfo objects.
|
||||
For example, to get a shape of the input layer:
|
||||
|
||||
* Usage example:
|
||||
```py
|
||||
>>> net = IENetwork.from_ir(model=path_to_xml_file, weights=path_to_bin_file)
|
||||
>>> net.inputs
|
||||
{'data': [1, 3, 224, 224]}
|
||||
```
|
||||
* `outputs` - a list of output layer names
|
||||
```py
|
||||
>>> net = IENetwork.from_ir(model=path_to_xml_file, weights=path_to_bin_file)
|
||||
>>> net.inputs
|
||||
{'data': <inference_engine.ie_api.InputInfo object at 0x7efe042dedd8>}
|
||||
>>> net.inputs['data'].shape
|
||||
[1, 3, 224, 224]
|
||||
```
|
||||
|
||||
* Usage example:
|
||||
```py
|
||||
>>> net = IENetwork.from_ir(model=path_to_xml_file, weights=path_to_bin_file)
|
||||
>>> net.outputs
|
||||
['prob']
|
||||
```
|
||||
* `outputs` - A dictionary that maps output layer names to <a name="inputinfo-class"></a>OutputInfo objects
|
||||
For example, to get a shape of the output layer:
|
||||
|
||||
* `batch_size` - Batch size of the network. Provides getter and setter interface which allows to get and modify the
|
||||
network batch size in the following way:
|
||||
```py
|
||||
>>> net = IENetwork.from_ir(model=path_to_xml_file, weights=path_to_bin_file)
|
||||
>>> net.batch_size
|
||||
1
|
||||
>>> net.batch_size = 4
|
||||
>>> net.batch_size
|
||||
4
|
||||
```
|
||||
* `layers` - return dictionary with the network layer names as key and <a name="ienetlayer-class"></a>IENetLayer objects containing layer properties
|
||||
as value
|
||||
```py
|
||||
>>> net = IENetwork.from_ir(model=path_to_xml_file, weights=path_to_bin_file)
|
||||
>>> net.inputs
|
||||
{'prob': <inference_engine.ie_api.OutputInfo object at 0x7efe03ab95d0>}
|
||||
>>> net.outputs['prob'].shape
|
||||
[1, 1000]
|
||||
```
|
||||
|
||||
* `batch_size` - Batch size of the network. Provides getter and setter interfaces to get and modify the
|
||||
network batch size. For example:
|
||||
|
||||
```py
|
||||
>>> net = IENetwork.from_ir(model=path_to_xml_file, weights=path_to_bin_file)
|
||||
>>> net.batch_size
|
||||
1
|
||||
>>> net.batch_size = 4
|
||||
>>> net.batch_size
|
||||
4
|
||||
>>> net.inputs['data'].shape
|
||||
[4, 3, 224, 224]
|
||||
```
|
||||
|
||||
* `layers` - Return dictionary that maps network layer names to <a name="ienetlayer-class"></a>`IENetLayer`
|
||||
objects containing layer properties. For example, to list all network layers:
|
||||
|
||||
```py
|
||||
>>> net = IENetwork.from_ir(model=path_to_xml_file, weights=path_to_bin_file)
|
||||
>>> net.layers
|
||||
{'conv0': <inference_engine.ie_api.IENetLayer object at 0x7f3a4c102370>}
|
||||
```
|
||||
```py
|
||||
>>> net = IENetwork.from_ir(model=path_to_xml_file, weights=path_to_bin_file)
|
||||
>>> net.layers
|
||||
{'conv0': <inference_engine.ie_api.IENetLayer object at 0x7f3a4c102370>
|
||||
...
|
||||
}
|
||||
```
|
||||
|
||||
### Class Methods

* `from_ir(model: str, weights: str)`
@@ -131,19 +144,20 @@ There is no explicit class constructor. Use `from_ir` class method to read the I

* Parameters:

    * model - path to `.xml` file of the IR
    * weights - path to `.bin` file of the IR
    * model - Path to `.xml` file of the IR
    * weights - Path to `.bin` file of the IR

* Return value:

    An instance of the `IENetwork` class

* Usage example:
```py
>>> net = IENetwork.from_ir(model=path_to_xml_file, weights=path_to_bin_file)
>>> net
<inference_engine.ie_api.IENetwork object at 0x7fd7dbce54b0>
```

```py
>>> net = IENetwork.from_ir(model=path_to_xml_file, weights=path_to_bin_file)
>>> net
<inference_engine.ie_api.IENetwork object at 0x7fd7dbce54b0>
```

### Instance Methods

@@ -156,24 +170,89 @@ There is no explicit class constructor. Use `from_ir` class method to read the I

* Parameters:

    * `outputs` - a list of layer names to be set as model outputs. In case of setting one layer as output, string with one layer can be provided.
    * `outputs` - List of layer names to be set as model outputs. When setting a single layer as output, a string with the layer name can be provided instead.

* Return value:

    None

* Usage example:
```py
>>> net = IENetwork.from_ir(model=path_to_xml_file, weights=path_to_bin_file)
>>> net.add_outputs(["conv5_1/dwise", "conv2_1/expand"])
>>> net.outputs
['prob', 'conv5_1/dwise', 'conv2_1/expand']
```

Note that the last layers (nodes without successors in graph representation of the model) are set as output
by default. In the case above, `prob` layer is a default output and `conv5_1/dwise`, `conv2_1/expand` are user-defined
outputs.

```py
>>> net = IENetwork.from_ir(model=path_to_xml_file, weights=path_to_bin_file)
>>> net.add_outputs(["conv5_1/dwise", "conv2_1/expand"])
>>> net.outputs
['prob', 'conv5_1/dwise', 'conv2_1/expand']
```

**Note**

The last layers (nodes without successors in graph representation of the model) are set as output
by default. In the case above, `prob` layer is a default output and `conv5_1/dwise`, `conv2_1/expand` are user-defined
outputs.
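The implementation in this commit (`ie_api.pyx`) also accepts an optional `precision` argument for the added outputs, defaulting to "FP32". As a hedged sketch:

```py
>>> net.add_outputs(["conv5_1/dwise"], precision="FP16")  # precision must be in the supported list
```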
* `reshape(input_shapes: dict)`:

* Description:

    The method reshapes the network to change spatial dimensions, batch size, or any dimension.

    **Note:**

    Before using this method, make sure that the target shape is applicable to the network.
    Changing the network shape to an arbitrary value may lead to unpredictable behaviour.

* Parameters:

    * `input_shapes` - The dictionary that maps input layer names to tuples with the target shape

* Return value:

    None

* Usage example:

```py
>>> net = IENetwork.from_ir(model=path_to_xml_file, weights=path_to_bin_file)
>>> input_layer = next(iter(net.inputs))
>>> n, c, h, w = net.inputs[input_layer].shape
>>> net.reshape({input_layer: (n, c, h*2, w*2)})
```
## <a name="inputinfo-class"></a>InputInfo
|
||||
|
||||
This class contains the information about the network input layers
|
||||
|
||||
### Class attributes:
|
||||
|
||||
* `precision` - Precision of the input data provided by user. Provides setter and getter interfaces
|
||||
to get and modify input layer precision.
|
||||
|
||||
List of applicable precisions: FP32 FP16, I32, I16, I8, U32, U16
|
||||
|
||||
**Note**: Support of any calculation precision depends on the target plugin
|
||||
|
||||
* `layout` - Layout of the input data provided by user. Provides setter and getter interfaces
|
||||
to get and modify input layer layout.
|
||||
|
||||
List of applicable layouts: NCHW, NHWC, OIHW, C, CHW, HW, NC, CN, BLOCKED
|
||||
|
||||
* `shape` - input layer data shape
|
||||
|
||||
|
||||
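As a minimal sketch of adjusting these attributes before loading the network (the input name `data` follows the examples above; values must come from the lists of applicable precisions and layouts):

```py
>>> net = IENetwork.from_ir(model=path_to_xml_file, weights=path_to_bin_file)
>>> net.inputs['data'].precision
'FP32'
>>> net.inputs['data'].precision = 'FP16'  # must be one of the applicable precisions
>>> net.inputs['data'].layout = 'NHWC'     # must be one of the applicable layouts
```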
## <a name="outputinfo-class"></a>OutputInfo
|
||||
|
||||
This class contains the information about the network input layers
|
||||
|
||||
### Class attributes:
|
||||
|
||||
* `precision` - Precision of the output data. Provides setter and getter interfaces
|
||||
to get and modify output layer precision.
|
||||
|
||||
* `layout` - Layout of the output data provided by user
|
||||
|
||||
* `shape` - Input layer data shape
|
||||
|
||||
## <a name="ieplugin-class"></a>IEPlugin Class
|
||||
|
||||
This class is the main plugin interface and serves to initialize and configure the plugin.
|
||||
@@ -184,8 +263,8 @@ This class is the main plugin interface and serves to initialize and configure t
|
||||
|
||||
* Parameters:
|
||||
|
||||
* `device` - target device name. Supported devices: CPU, GPU, FPGA, MYRIAD, HETERO
|
||||
* `plugin_dirs` - list of paths to plugin directories
|
||||
* `device` - Target device name. Supported devices: CPU, GPU, FPGA, MYRIAD, HETERO
|
||||
* `plugin_dirs` - List of paths to plugin directories
|
||||
|
||||
### Properties
|
||||
|
||||
@@ -194,7 +273,7 @@ This class is the main plugin interface and serves to initialize and configure t
|
||||
|
||||
### Instance Methods
|
||||
|
||||
* `load(network: IENetwork, num_requests: int=1, config=None)`
|
||||
* ```load(network: IENetwork, num_requests: int=1, config=None)```
|
||||
|
||||
* Description:
|
||||
|
||||
@@ -204,23 +283,25 @@ This class is the main plugin interface and serves to initialize and configure t
|
||||
|
||||
* Parameters:
|
||||
|
||||
* `network` - a valid IENetwork instance created by `IENetwork.from_ir()` method
|
||||
* `num_requests` - a positive integer value of infer requests to be created. Number of infer requests may be limited
|
||||
* `network` - A valid IENetwork instance created by `IENetwork.from_ir()` method
|
||||
* `num_requests` - A positive integer value of infer requests to be created. Number of infer requests may be limited
|
||||
by device capabilities.
|
||||
* `config` - a dictionary of plugin configuration keys and their values
|
||||
* `config` - A dictionary of plugin configuration keys and their values
|
||||
|
||||
* Return value:
|
||||
|
||||
None
|
||||
|
||||
* Usage example:
|
||||
```py
|
||||
>>> net = IENetwork.from_ir(model=path_to_xml_file, weights=path_to_bin_file)
|
||||
>>> plugin = IEPlugin(device="CPU")
|
||||
>>> exec_net = plugin.load(network=net, num_requsts=2)
|
||||
>>> exec_net
|
||||
<inference_engine.ie_api.ExecutableNetwork object at 0x7f5140bbcd38>
|
||||
```
|
||||
|
||||
```py
|
||||
>>> net = IENetwork.from_ir(model=path_to_xml_file, weights=path_to_bin_file)
|
||||
>>> plugin = IEPlugin(device="CPU")
|
||||
>>> exec_net = plugin.load(network=net, num_requsts=2)
|
||||
>>> exec_net
|
||||
<inference_engine.ie_api.ExecutableNetwork object at 0x7f5140bbcd38>
|
||||
```
|
||||
|
||||
* `set_initial_affinity(net: IENetwork)`
|
||||
|
||||
* Description:
|
||||
@@ -230,7 +311,7 @@ This class is the main plugin interface and serves to initialize and configure t
|
||||
|
||||
* Parameters:
|
||||
|
||||
* `net` - a valid instance of IENetwork
|
||||
* `net` - A valid instance of IENetwork
|
||||
|
||||
* Return value:
|
||||
|
||||
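A minimal sketch, mirroring the affinity example in the IENetLayer section (per the implementation in this commit, the method raises a RuntimeError for non-HETERO devices):

```py
>>> plugin = IEPlugin(device="HETERO:FPGA,CPU")
>>> plugin.set_initial_affinity(net)  # net is a valid IENetwork instance
```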
@@ -248,17 +329,20 @@ This class is the main plugin interface and serves to initialize and configure t

* Parameters:

    * `extension_path` - a full path to CPU extensions library
    * `extension_path` - A full path to the CPU extensions library

* Return value:

    None

* Usage example:
```py
>>> plugin = IEPlugin(device="CPU")
>>> plugin.add_cpu_extension(ext_lib_path)
```

```py
>>> plugin = IEPlugin(device="CPU")
>>> plugin.add_cpu_extension(ext_lib_path)
```

* `set_config(config: dict)`

* Description:
@@ -268,7 +352,7 @@ This class is the main plugin interface and serves to initialize and configure t

* Parameters:

    * `config` - a dictionary of keys and values of acceptable configuration parameters
    * `config` - A dictionary of keys and values of acceptable configuration parameters

* Return value:

@@ -279,6 +363,7 @@ This class is the main plugin interface and serves to initialize and configure t
    See `set_affinity` method of the `IENetwork` class.
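As a minimal sketch, reusing the configuration key from the affinity example in the IENetLayer section (acceptable keys depend on the target plugin):

```py
>>> plugin = IEPlugin(device="HETERO:FPGA,CPU")
>>> plugin.set_config({"TARGET_FALLBACK": "HETERO:FPGA,CPU"})
```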
* `get_supported_layers(net: IENetwork)`

* Description:

    Returns the set of layers supported by the plugin. Please note that in case of the CPU plugin, support of
@@ -286,7 +371,7 @@ This class is the main plugin interface and serves to initialize and configure t

* Parameters:

    * `net` - a valid instance of IENetwork
    * `net` - A valid instance of IENetwork

* Return value:
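A hedged sketch of using the returned set, for example to find layers that would need a fallback device or a CPU extension:

```py
>>> plugin = IEPlugin(device="CPU")
>>> supported = plugin.get_supported_layers(net)      # set of supported layer names
>>> unsupported = set(net.layers.keys()) - supported  # layers the plugin cannot run
```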
@@ -306,16 +391,19 @@ There is no explicit class constructor. To make a valid instance of `ExecutableN

### Class attributes

* `requests` - a tuple of InferRequest instances
* `requests` - A tuple of InferRequest instances

* Usage example:
```py
>>> net = IENetwork.from_ir(model=path_to_xml_file, weights=path_to_bin_file)
>>> plugin = IEPlugin(device="CPU")
>>> exec_net = plugin.load(network=net, num_requests=2)
>>> exec_net.requests
(<inference_engine.ie_api.InferRequest object at 0x7f66f56c57e0>, <inference_engine.ie_api.InferRequest object at 0x7f66f56c58b8>, <inference_engine.ie_api.InferRequest object at 0x7f66f56c5900>)
```

```py
>>> net = IENetwork.from_ir(model=path_to_xml_file, weights=path_to_bin_file)
>>> plugin = IEPlugin(device="CPU")
>>> exec_net = plugin.load(network=net, num_requests=3)
>>> exec_net.requests
(<inference_engine.ie_api.InferRequest object at 0x7f66f56c57e0>,
 <inference_engine.ie_api.InferRequest object at 0x7f66f56c58b8>,
 <inference_engine.ie_api.InferRequest object at 0x7f66f56c5900>)
```

### Instance Methods
@@ -327,27 +415,28 @@ There is no explicit class constructor. To make a valid instance of `ExecutableN
    Wraps `infer()` method of the `InferRequest` class

* Parameters:
    * `inputs` - a dictionary of input layer name as a key and `numpy.ndarray` of proper shape with input data for the layer as a value
    * `inputs` - A dictionary that maps input layer names to `numpy.ndarray` objects of proper shape with input data for the layer

* Return value:

    A dictionary of output layer name as a key and `numpy.ndarray` with output data of the layer as a value
    A dictionary that maps output layer names to `numpy.ndarray` objects with output data of the layer

* Usage example:
```py
>>> net = IENetwork.from_ir(model=path_to_xml_file, weights=path_to_bin_file)
>>> plugin = IEPlugin(device="CPU")
>>> exec_net = plugin.load(network=net, num_requests=2)
>>> res = exec_net.infer({'data': img})
>>> res
{'prob': array([[[[2.83426580e-08]],
                 [[2.40166020e-08]],
                 [[1.29469613e-09]],
                 [[2.95946148e-08]]
                 ......
               ]])}
```
For illustration of input data preparation, please see samples (for example, `classification_sample.py`).

```py
>>> net = IENetwork.from_ir(model=path_to_xml_file, weights=path_to_bin_file)
>>> plugin = IEPlugin(device="CPU")
>>> exec_net = plugin.load(network=net, num_requests=2)
>>> res = exec_net.infer({'data': img})
>>> res
{'prob': array([[[[2.83426580e-08]],
                 [[2.40166020e-08]],
                 [[1.29469613e-09]],
                 [[2.95946148e-08]]
                 ......
               ]])}
```
For illustration of input data preparation, please see samples (for example, `classification_sample.py`).

* `start_async(request_id, inputs=None)`

@@ -358,21 +447,23 @@ There is no explicit class constructor. To make a valid instance of `ExecutableN

* Parameters:

    * `request_id` - index of infer request to start inference
    * `inputs` - a dictionary of input layer name as a key and `numpy.ndarray` of proper shape with input data for the layer as a value
    * `request_id` - Index of the infer request to start inference on
    * `inputs` - A dictionary that maps input layer names to `numpy.ndarray` objects of proper shape with input data for the layer

* Return value:

    A handler of the specified infer request, which is an instance of the `InferRequest` class.

* Usage example:
```py
>>> infer_request_handle = exec_net.start_async(request_id=0, inputs={input_blob: image})
>>> infer_status = infer_request_handle.wait()
>>> res = infer_request_handle.outputs[out_blob]
```
For more details about infer requests processing, see `classification_sample_async.py` (simplified case) and
`object_detection_demo_ssd_async.py` (real synchronous use case) samples.

```py
>>> infer_request_handle = exec_net.start_async(request_id=0, inputs={input_blob: image})
>>> infer_status = infer_request_handle.wait()
>>> res = infer_request_handle.outputs[out_blob]
```

For more details about infer requests processing, see `classification_sample_async.py` (simplified case) and
`object_detection_demo_ssd_async.py` (real asynchronous use case) samples.
## <a name="inferrequest"></a>InferRequest Class
|
||||
|
||||
@@ -386,19 +477,20 @@ class with specified number of requests to get `ExecutableNetwork` instance whic
|
||||
|
||||
### Class attributes
|
||||
|
||||
* `inputs` - a dictionary of input layer name as a key and `numpy.ndarray` of proper shape with input data for the layer as a value
|
||||
* `outputs` - a dictionary of output layer name as a key and `numpy.ndarray` with output data of the layer as a value
|
||||
* `inputs` - A dictionary that maps input layer names to `numpy.ndarray` objects of proper shape with input data for the layer
|
||||
* `outputs` - A dictionary that maps output layer names to `numpy.ndarray` objects with output data of the layer
|
||||
|
||||
* Usage example:
|
||||
```py
|
||||
>>> exec_net.requests[0].inputs['data'][:] = image
|
||||
>>> exec_net.requests[0].infer()
|
||||
>>> res = exec_net.requests[0].outputs['prob']
|
||||
>>> np.flip(np.sort(np.squeeze(res)),0)
|
||||
array([4.85416055e-01, 1.70385033e-01, 1.21873841e-01, 1.18894853e-01,
|
||||
5.45198545e-02, 2.44456064e-02, 5.41366823e-03, 3.42589128e-03,
|
||||
2.26027006e-03, 2.12283316e-03 ...])
|
||||
```
|
||||
* Usage example:
|
||||
|
||||
```py
|
||||
>>> exec_net.requests[0].inputs['data'][:] = image
|
||||
>>> exec_net.requests[0].infer()
|
||||
>>> res = exec_net.requests[0].outputs['prob']
|
||||
>>> np.flip(np.sort(np.squeeze(res)),0)
|
||||
array([4.85416055e-01, 1.70385033e-01, 1.21873841e-01, 1.18894853e-01,
|
||||
5.45198545e-02, 2.44456064e-02, 5.41366823e-03, 3.42589128e-03,
|
||||
2.26027006e-03, 2.12283316e-03 ...])
|
||||
```
|
||||
|
||||
### Instance Methods
|
||||
|
||||
@@ -413,22 +505,23 @@ To run inference, please use simplified methods `infer()` and `start_async()` of

* Parameters:

    * `inputs` - a dictionary of input layer name as a key and `numpy.ndarray` of proper shape with input data for the layer as a value
    * `inputs` - A dictionary that maps input layer names to `numpy.ndarray` objects of proper shape with input data for the layer

* Return value:

    None

* Usage example:
```py
>>> exec_net = plugin.load(network=net, num_requests=2)
>>> exec_net.requests[0].infer({input_blob: image})
>>> res = exec_net.requests[0].outputs['prob']
>>> np.flip(np.sort(np.squeeze(res)),0)
array([4.85416055e-01, 1.70385033e-01, 1.21873841e-01, 1.18894853e-01,
       5.45198545e-02, 2.44456064e-02, 5.41366823e-03, 3.42589128e-03,
       2.26027006e-03, 2.12283316e-03 ...])
```

```py
>>> exec_net = plugin.load(network=net, num_requests=2)
>>> exec_net.requests[0].infer({input_blob: image})
>>> res = exec_net.requests[0].outputs['prob']
>>> np.flip(np.sort(np.squeeze(res)),0)
array([4.85416055e-01, 1.70385033e-01, 1.21873841e-01, 1.18894853e-01,
       5.45198545e-02, 2.44456064e-02, 5.41366823e-03, 3.42589128e-03,
       2.26027006e-03, 2.12283316e-03 ...])
```

* `async_infer(inputs=None)`

@@ -438,23 +531,24 @@ To run inference, please use simplified methods `infer()` and `start_async()` of

* Parameters:

    * `inputs` - a dictionary of input layer name as a key and `numpy.ndarray` of proper shape with input data for the layer as a value
    * `inputs` - A dictionary that maps input layer names to `numpy.ndarray` objects of proper shape with input data for the layer

* Return value:

    None

* Usage example:
```py
>>> exec_net = plugin.load(network=net, num_requests=2)
>>> exec_net.requests[0].async_infer({input_blob: image})
>>> exec_net.requests[0].wait()
>>> res = exec_net.requests[0].outputs['prob']
>>> np.flip(np.sort(np.squeeze(res)),0)
array([4.85416055e-01, 1.70385033e-01, 1.21873841e-01, 1.18894853e-01,
       5.45198545e-02, 2.44456064e-02, 5.41366823e-03, 3.42589128e-03,
       2.26027006e-03, 2.12283316e-03 ...])
```

```py
>>> exec_net = plugin.load(network=net, num_requests=2)
>>> exec_net.requests[0].async_infer({input_blob: image})
>>> exec_net.requests[0].wait()
>>> res = exec_net.requests[0].outputs['prob']
>>> np.flip(np.sort(np.squeeze(res)),0)
array([4.85416055e-01, 1.70385033e-01, 1.21873841e-01, 1.18894853e-01,
       5.45198545e-02, 2.44456064e-02, 5.41366823e-03, 3.42589128e-03,
       2.26027006e-03, 2.12283316e-03 ...])
```
* `wait(timeout=-1)`

@@ -467,14 +561,14 @@ To run inference, please use simplified methods `infer()` and `start_async()` of

    There are special values of the timeout parameter:

    * 0 - immediately returns the inference status. It does not block or interrupt execution.
    * 0 - Immediately returns the inference status. It does not block or interrupt execution.
    To find out the meaning of statuses, please refer to InferenceEngine::StatusCode in the Inference Engine C++ documentation

    * -1 - waits until inference result becomes available (default value)
    * -1 - Waits until the inference result becomes available (default value)

* Parameters:

    * `timeout` - time to wait in milliseconds or special (0, -1) cases described above.
    * `timeout` - Time to wait in milliseconds or the special (0, -1) cases described above.
    If not specified, `timeout` value is set to -1 by default.

* Usage example:
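(The committed usage example falls outside this hunk. The following is a hedged sketch of non-blocking polling with `timeout=0`, assuming a return value of 0 corresponds to `StatusCode::OK` as in the C++ API.)

```py
>>> exec_net.requests[0].async_infer({input_blob: image})
>>> status = exec_net.requests[0].wait(timeout=0)  # poll without blocking
>>> if status == 0:                                # assumed: 0 == StatusCode::OK
...     res = exec_net.requests[0].outputs[out_blob]
```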
@@ -498,19 +592,20 @@ To run inference, please use simplified methods `infer()` and `start_async()` of

* Usage example:

```py
>>> exec_net = plugin.load(network=net, num_requests=2)
>>> exec_net.requests[0].infer({input_blob: image})
>>> exec_net.requests[0].get_perf_counts()
{'Conv2D': {'exec_type': 'jit_avx2_1x1',
            'real_time': 154,
            'cpu_time': 154,
            'status': 'EXECUTED',
            'layer_type': 'Convolution'},
 'Relu6':  {'exec_type': 'undef',
            'real_time': 0,
            'cpu_time': 0,
            'status': 'NOT_RUN',
            'layer_type': 'Clamp'}
 ...
}
```

```py
>>> exec_net = plugin.load(network=net, num_requests=2)
>>> exec_net.requests[0].infer({input_blob: image})
>>> exec_net.requests[0].get_perf_counts()
{'Conv2D': {'exec_type': 'jit_avx2_1x1',
            'real_time': 154,
            'cpu_time': 154,
            'status': 'EXECUTED',
            'layer_type': 'Convolution'},
 'Relu6':  {'exec_type': 'undef',
            'real_time': 0,
            'cpu_time': 0,
            'status': 'NOT_RUN',
            'layer_type': 'Clamp'}
 ...
}
```
@@ -9,7 +9,6 @@ from .ie_api_impl_defs cimport Blob, TensorDesc
from libcpp.string cimport string
from libcpp.vector cimport vector
from libcpp.memory cimport unique_ptr
from libcpp cimport bool

cdef class BlobBuffer:
    cdef Blob.Ptr ptr
@@ -57,3 +56,9 @@ cdef class IENetReader:

cdef class IENetLayer:
    cdef C.IENetLayer impl

cdef class InputInfo:
    cdef C.InputInfo impl

cdef class OutputInfo:
    cdef C.OutputInfo impl
@@ -14,6 +14,7 @@ from libcpp.memory cimport unique_ptr
|
||||
from libc.stdint cimport int64_t
|
||||
import os
|
||||
import numpy as np
|
||||
from copy import deepcopy
|
||||
|
||||
cdef extern from "<utility>" namespace "std" nogil:
|
||||
cdef unique_ptr[C.IEExecNetwork] move(unique_ptr[C.IEExecNetwork])
|
||||
@@ -32,7 +33,8 @@ cdef dict_to_c_map(py_dict):
|
||||
c_map[k.encode()] = v.encode()
|
||||
return c_map
|
||||
|
||||
supported_precisions = ["fp32", "fp16", "q78", "i32", "i16", "i8", "u32", "u16"]
|
||||
supported_precisions = ["FP32", "FP16", "Q78", "I32", "I16", "I8", "U32", "U16"]
|
||||
supported_layouts = ["NCHW", "NHWC", "OIHW", "C", "CHW", "HW", "NC", "CN", "BLOCKED"]
|
||||
known_plugins = ['CPU', 'GPU', 'FPGA', 'MYRIAD', 'HETERO']
|
||||
|
||||
def get_version():
|
||||
@@ -62,6 +64,7 @@ cdef class IENetLayer:
|
||||
weights_buffer.reset(weights.second)
|
||||
weights_map[weights.first.decode()] = weights_buffer.to_numpy()
|
||||
return weights_map
|
||||
|
||||
@property
|
||||
def params(self):
|
||||
return {k.decode(): v.decode() for k, v in self.impl.params}
|
||||
@@ -73,6 +76,56 @@ cdef class IENetLayer:
|
||||
def params(self, params_map):
|
||||
self.impl.setParams(dict_to_c_map(params_map))
|
||||
|
||||
@precision.setter
|
||||
def precision(self, precision: str):
|
||||
self.impl.setPrecision(precision.upper().encode())
|
||||
|
||||
|
||||
cdef class InputInfo:
|
||||
@property
|
||||
def precision(self):
|
||||
return self.impl.precision.decode()
|
||||
@property
|
||||
def layout(self):
|
||||
return self.impl.layout.decode()
|
||||
@property
|
||||
def shape(self):
|
||||
return self.impl.dims
|
||||
|
||||
@precision.setter
|
||||
def precision(self, precision):
|
||||
if precision.upper() not in supported_precisions:
|
||||
raise AttributeError(
|
||||
"Unsupported precision {}! List of supported precisions: {}".format(precision, supported_precisions))
|
||||
self.impl.setPrecision(precision.encode())
|
||||
@layout.setter
|
||||
def layout(self, layout):
|
||||
if layout.upper() not in supported_layouts:
|
||||
raise AttributeError(
|
||||
"Unsupported layout {}! List of supported layouts: {}".format(layout, supported_layouts))
|
||||
self.impl.setLayout(layout.encode())
|
||||
|
||||
|
||||
cdef class OutputInfo:
|
||||
@property
|
||||
def precision(self):
|
||||
return self.impl.precision.decode()
|
||||
@property
|
||||
def layout(self):
|
||||
return self.impl.layout.decode()
|
||||
@property
|
||||
def shape(self):
|
||||
return self.impl.dims
|
||||
@precision.setter
|
||||
def precision(self, precision):
|
||||
if precision.upper() not in supported_precisions:
|
||||
raise AttributeError(
|
||||
"Unsupported precision {}! List of supported precisions: {}".format(precision, supported_precisions))
|
||||
self.impl.setPrecision(precision.encode())
|
||||
# @layout.setter
|
||||
# def layout(self, layout):
|
||||
# self.impl.setLayout(layout.encode())
|
||||
|
||||
cdef class ExecutableNetwork:
|
||||
def __init__(self):
|
||||
self._requests = []
|
||||
@@ -80,8 +133,8 @@ cdef class ExecutableNetwork:
|
||||
def infer(self, inputs=None):
|
||||
current_request = self.requests[0]
|
||||
current_request.infer(inputs)
|
||||
if inputs is not None:
|
||||
return {k: v for k, v in current_request.outputs.items()}
|
||||
return deepcopy(current_request.outputs)
|
||||
|
||||
|
||||
def start_async(self, request_id, inputs=None):
|
||||
if request_id not in list(range(len(self.requests))):
|
||||
@@ -147,7 +200,7 @@ cdef class InferRequest:
|
||||
|
||||
def _fill_inputs(self, inputs):
|
||||
for k, v in inputs.items():
|
||||
self.inputs[k][:] = v
|
||||
self._inputs[k][:] = v
|
||||
|
||||
cdef class IENetwork:
|
||||
@property
|
||||
@@ -157,11 +210,25 @@ cdef class IENetwork:
|
||||
|
||||
@property
|
||||
def inputs(self):
|
||||
return {k.decode(): v for k, v in self.impl.inputs}
|
||||
cdef map[string, C.InputInfo] c_inputs = self.impl.getInputs()
|
||||
inputs = {}
|
||||
cdef InputInfo in_info
|
||||
for input in c_inputs:
|
||||
in_info = InputInfo()
|
||||
in_info.impl = input.second
|
||||
inputs[input.first.decode()] = in_info
|
||||
return inputs
|
||||
|
||||
@property
|
||||
def outputs(self):
|
||||
return [k.decode() for k in self.impl.outputs]
|
||||
cdef map[string, C.OutputInfo] c_outputs = self.impl.getOutputs()
|
||||
outputs = {}
|
||||
cdef OutputInfo out_info
|
||||
for out in c_outputs:
|
||||
out_info = OutputInfo()
|
||||
out_info.impl = out.second
|
||||
outputs[out.first.decode()] = out_info
|
||||
return outputs
|
||||
|
||||
@property
|
||||
def batch_size(self):
|
||||
@@ -176,7 +243,7 @@ cdef class IENetwork:
|
||||
|
||||
@property
|
||||
def layers(self):
|
||||
cdef map[string, C.IENetLayer] c_layers = <map[string, C.IENetLayer]>self.impl.getLayers()
|
||||
cdef map[string, C.IENetLayer] c_layers = <map[string, C.IENetLayer]> self.impl.getLayers()
|
||||
layers = {}
|
||||
cdef IENetLayer net_l = IENetLayer()
|
||||
for l in c_layers:
|
||||
@@ -188,22 +255,23 @@ cdef class IENetwork:
|
||||
@classmethod
|
||||
def from_ir(cls, model: str, weights: str):
|
||||
if not os.path.isfile(model):
|
||||
raise FileNotFoundError("Path to the model {} doesn't exists or it's a directory".format(model))
|
||||
raise Exception("Path to the model {} doesn't exists or it's a directory".format(model))
|
||||
if not os.path.isfile(weights):
|
||||
raise FileNotFoundError("Path to the weights {} doesn't exists or it's a directory".format(weights))
|
||||
raise Exception("Path to the weights {} doesn't exists or it's a directory".format(weights))
|
||||
net_reader = IENetReader()
|
||||
return net_reader.read(model, weights)

    # TODO: Use enum with precision type instead of string parameter when python2 support will not be required.
    def add_outputs(self, outputs, precision="FP32"):
        if precision.lower() not in supported_precisions:
            raise AttributeError("Unsupported precision {}! List of supported precisions: {}".format(precision, supported_precisions))
        if precision.upper() not in supported_precisions:
            raise AttributeError(
                "Unsupported precision {}! List of supported precisions: {}".format(precision, supported_precisions))
        if not isinstance(outputs, list):
            outputs = [outputs]
        cdef vector[string] _outputs
        for l in outputs:
            _outputs.push_back(l.encode())
        self.impl.addOutputs(_outputs, precision.lower().encode())
        self.impl.addOutputs(_outputs, precision.upper().encode())
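
Precision names are validated case-insensitively but passed upper-case to the core. A hedged sketch of exposing an intermediate layer as an extra output (the layer names are hypothetical):

```python
# "conv5_1" and "pool4" are hypothetical names; pick real ones from net.layers.
net.add_outputs("conv5_1")                               # a bare string is wrapped in a list
net.add_outputs(["conv5_1", "pool4"], precision="FP16")  # names must be in upper-case table
```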

    def reshape(self, input_shapes: dict):
        cdef map[string, vector[size_t]] c_input_shapes;
@@ -241,7 +309,8 @@ cdef class IEPlugin:

    cpdef ExecutableNetwork load(self, IENetwork network, int num_requests=1, config=None):
        if num_requests <= 0:
            raise ValueError("Incorrect number of requests specified: {}. Expected positive integer number.".format(num_requests))
            raise ValueError(
                "Incorrect number of requests specified: {}. Expected positive integer number.".format(num_requests))
        cdef ExecutableNetwork exec_net = ExecutableNetwork()
        cdef vector[string] inputs_list
        cdef vector[string] outputs_list
@@ -275,13 +344,13 @@ cdef class IEPlugin:

        return exec_net

    cpdef void set_initial_affinity(self,IENetwork net) except *:
    cpdef void set_initial_affinity(self, IENetwork net) except *:
        if self.device.find("HETERO") == -1:
            raise RuntimeError("set_initial_affinity method applicable only for HETERO device")
        self.impl.setInitialAffinity(net.impl)

    cpdef set get_supported_layers(self,IENetwork net):
        return set([l.decode() for l in self.impl.queryNetwork(net.impl)])
    cpdef set get_supported_layers(self, IENetwork net):
        return set([l.decode() for l in self.impl.queryNetwork(net.impl)])
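
The samples in this commit use `get_supported_layers` to validate a network before loading it; condensed:

```python
# Condensed from the sample code in this commit; raises instead of logging.
if plugin.device == "CPU":
    supported_layers = plugin.get_supported_layers(net)
    not_supported = [l for l in net.layers.keys() if l not in supported_layers]
    if not_supported:
        raise RuntimeError("Unsupported layers: {}".format(", ".join(not_supported)))
```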

    @property
    def device(self):
@@ -305,8 +374,6 @@ cdef class IEPlugin:
            c_config[to_std_string(k)] = to_std_string(v)
        self.impl.setConfig(c_config)


cdef class IENetReader:
    def read(self, model: str, weights: str) -> IENetwork:
        cdef IENetwork net = IENetwork()
@@ -349,7 +416,6 @@ cdef class BlobBuffer:
        buffer.strides = self.strides.data()
        buffer.suboffsets = NULL

    cdef char*_get_blob_format(self, const TensorDesc & desc):
        cdef Precision precision = desc.getPrecision()
        name = bytes(precision.name()).decode()

@@ -6,6 +6,25 @@
#include "ie_api_impl.hpp"
#include "hetero/hetero_plugin_config.hpp"
#include "ie_iinfer_request.hpp"
std::map <std::string,InferenceEngine::Precision> precision_map = {{"FP32", InferenceEngine::Precision::FP32},
                                                                   {"FP16", InferenceEngine::Precision::FP16},
                                                                   {"Q78",  InferenceEngine::Precision::Q78},
                                                                   {"I32",  InferenceEngine::Precision::I32},
                                                                   {"I16",  InferenceEngine::Precision::I16},
                                                                   {"I8",   InferenceEngine::Precision::I8},
                                                                   {"U16",  InferenceEngine::Precision::U16},
                                                                   {"U8",   InferenceEngine::Precision::U8}};

std::map <std::string,InferenceEngine::Layout> layout_map = {{"ANY",     InferenceEngine::Layout::ANY},
                                                             {"NCHW",    InferenceEngine::Layout::NCHW},
                                                             {"NHWC",    InferenceEngine::Layout::NHWC},
                                                             {"OIHW",    InferenceEngine::Layout::OIHW},
                                                             {"C",       InferenceEngine::Layout::C},
                                                             {"CHW",     InferenceEngine::Layout::CHW},
                                                             {"HW",      InferenceEngine::Layout::HW},
                                                             {"NC",      InferenceEngine::Layout::NC},
                                                             {"CN",      InferenceEngine::Layout::CN},
                                                             {"BLOCKED", InferenceEngine::Layout::BLOCKED}};
#define stringify( name ) # name
#define IE_CHECK_CALL(expr) { \
    auto ret = (expr); \
@@ -14,33 +33,18 @@
    } \
} \


InferenceEnginePython::IENetwork InferenceEnginePython::IENetReader::read(std::string const &model,
                                                                          std::string const &weights)
{
    InferenceEngine::CNNNetReader net_reader;
    net_reader.ReadNetwork(model);
    net_reader.ReadWeights(weights);

    const std::string &net_name = net_reader.getName();
    std::map<std::string, std::vector<size_t>> inputs;
    const InferenceEngine::InputsDataMap &inputsInfo = net_reader.getNetwork().getInputsInfo();
    for (auto &item : inputsInfo)
    {
        const InferenceEngine::TensorDesc &inputTensorDesc = item.second->getTensorDesc();
        InferenceEngine::SizeVector dims = inputTensorDesc.getDims();
        inputs[item.first] = dims;
    }

    // TODO: store output shapes for each output
    std::vector<std::string> outputs;
    const InferenceEngine::OutputsDataMap &outputsInfo = net_reader.getNetwork().getOutputsInfo();
    for (auto &item : outputsInfo)
    {
        outputs.push_back(item.first);
    }
    InferenceEngine::CNNNetwork network = net_reader.getNetwork();
    std::size_t batch_size = network.getBatchSize();
    return {network, net_name, batch_size, inputs, outputs};
    return {network, net_name, batch_size};
}

std::map<std::string, InferenceEnginePython::IENetLayer> InferenceEnginePython::IENetwork::getLayers()
@@ -91,17 +95,47 @@ std::map<std::string, InferenceEnginePython::IENetLayer> InferenceEnginePython::
    return result;
}
std::map<std::string, InferenceEnginePython::InputInfo> InferenceEnginePython::IENetwork::getInputs(){
    std::map<std::string, InferenceEnginePython::InputInfo> inputs;
    const InferenceEngine::InputsDataMap &inputsInfo = actual.getInputsInfo();
    for (auto & in : inputsInfo){
        InferenceEnginePython::InputInfo info;
        info.actual = *in.second;
        const InferenceEngine::TensorDesc &inputTensorDesc = in.second->getTensorDesc();
        info.dims = inputTensorDesc.getDims();
        for (auto it : precision_map )
            if (it.second == in.second->getPrecision())
                info.precision = it.first;
        for (auto it : layout_map )
            if (it.second == in.second->getLayout())
                info.layout = it.first;
        inputs[in.first] = info;
    }
    return inputs;
}

std::map<std::string, InferenceEnginePython::OutputInfo> InferenceEnginePython::IENetwork::getOutputs(){
    std::map<std::string, InferenceEnginePython::OutputInfo> outputs;
    const InferenceEngine::OutputsDataMap &outputsInfo = actual.getOutputsInfo();
    for (auto & out : outputsInfo){
        InferenceEnginePython::OutputInfo info;
        info.actual = out.second;
        const InferenceEngine::TensorDesc &inputTensorDesc = out.second->getTensorDesc();
        info.dims = inputTensorDesc.getDims();
        for (auto it : precision_map )
            if (it.second == out.second->getPrecision())
                info.precision = it.first;
        for (auto it : layout_map )
            if (it.second == out.second->getLayout())
                info.layout = it.first;
        outputs[out.first] = info;
    }
    return outputs;
}

void InferenceEnginePython::IENetwork::addOutputs(const std::vector<std::string> & out_layers, const std::string &precision)
{
    std::map <std::string,InferenceEngine::Precision> precision_map = {{"fp32", InferenceEngine::Precision::FP32},
                                                                       {"fp16", InferenceEngine::Precision::FP16},
                                                                       {"q78",  InferenceEngine::Precision::Q78},
                                                                       {"i32",  InferenceEngine::Precision::I32},
                                                                       {"i16",  InferenceEngine::Precision::I16},
                                                                       {"i8",   InferenceEngine::Precision::I8},
                                                                       {"u16",  InferenceEngine::Precision::U16},
                                                                       {"u8",   InferenceEngine::Precision::U8}};

    for (auto && l : out_layers)
    {
        InferenceEngine::OutputsDataMap outputsDataMap = actual.getOutputsInfo();
@@ -118,32 +152,29 @@ void InferenceEnginePython::IENetwork::addOutputs(const std::vector<std::string>
        actual.addOutput(l);
        InferenceEngine::OutputsDataMap outputsDataMapUpd = actual.getOutputsInfo();
        outputsDataMapUpd[l]->setPrecision(precision_map[precision]);
        outputs.push_back(l);
    }
}

void InferenceEnginePython::IENetwork::setBatch(const size_t size)
{
    actual.setBatchSize(size);
    const InferenceEngine::InputsDataMap &inputsInfo = actual.getInputsInfo();
    for (auto &item : inputsInfo)
    {
        const InferenceEngine::TensorDesc &inputTensorDesc = item.second->getTensorDesc();
        InferenceEngine::SizeVector dims = inputTensorDesc.getDims();
        inputs[item.first] = dims;
    }
}
void InferenceEnginePython::IENetwork::reshape(const std::map<std::string, std::vector<size_t>> & input_shapes){
    actual.reshape(input_shapes);
    const InferenceEngine::InputsDataMap &inputsInfo = actual.getInputsInfo();
    for (auto &item : inputsInfo)
    {
        const InferenceEngine::TensorDesc &inputTensorDesc = item.second->getTensorDesc();
        InferenceEngine::SizeVector dims = inputTensorDesc.getDims();
        inputs[item.first] = dims;
    }

}
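
`setBatch` and `reshape` no longer refresh a cached shape map because shapes are now queried live through `getInputs()`. From Python, reshaping then looks like this (the dimensions are placeholders):

```python
input_blob = next(iter(net.inputs))
net.reshape({input_blob: (1, 3, 480, 480)})  # placeholder dims
print(net.inputs[input_blob].shape)          # reflects the new shape immediately
```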

void InferenceEnginePython::InputInfo::setPrecision(std::string precision){
    actual.setPrecision(precision_map[precision]);
}

void InferenceEnginePython::InputInfo::setLayout(std::string layout){
    actual.setLayout(layout_map[layout]);
}

void InferenceEnginePython::OutputInfo::setPrecision(std::string precision){
    actual->setPrecision(precision_map[precision]);
}

InferenceEnginePython::IEPlugin::IEPlugin(const std::string &device, const std::vector<std::string> &plugin_dirs)
{

@@ -211,6 +242,9 @@ std::map<std::string, InferenceEngine::Blob::Ptr> InferenceEnginePython::IENetLa
    return weights;
}

void InferenceEnginePython::IENetLayer::setPrecision(std::string precision){
    layer_ptr->precision = precision_map[precision];
}
void InferenceEnginePython::IEPlugin::addCpuExtension(const std::string &extension_path)
{
    InferenceEngine::ResponseDesc response;
@@ -295,13 +329,13 @@ std::vector<std::string> InferenceEnginePython::InferRequestWrap::getOutputsList
}

void InferenceEnginePython::InferRequestWrap::infer() {
    InferenceEngine::ResponseDesc responseDesc;
    request_ptr->Infer(&responseDesc);
    InferenceEngine::ResponseDesc response;
    IE_CHECK_CALL(request_ptr->Infer(&response));
}

void InferenceEnginePython::InferRequestWrap::infer_async() {
    InferenceEngine::ResponseDesc responseDesc;
    request_ptr->StartAsync(&responseDesc);
    InferenceEngine::ResponseDesc response;
    IE_CHECK_CALL(request_ptr->StartAsync(&response));
}
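
Routing `Infer`/`StartAsync` through `IE_CHECK_CALL` means native failures are no longer silently dropped. On the Python side they should surface as exceptions (a sketch; the exact exception class depends on the Cython `except +` mapping):

```python
try:
    exec_net.infer(inputs={input_blob: image})
except RuntimeError as e:  # native StatusCode errors propagate as exceptions
    print("Inference failed:", e)
```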

int InferenceEnginePython::InferRequestWrap::wait(int64_t timeout) {

@@ -14,13 +14,8 @@
#include <sstream>
#include "ie_extension.h"


namespace InferenceEnginePython {
//struct BlobInfo {
//    int layout;
//    std::vector<std::size_t> dims;
//    std::string name;
//    std::vector<std::string> inputTo;
//};
struct IENetLayer {
    InferenceEngine::CNNLayerPtr layer_ptr;
    std::string name;
@@ -28,11 +23,25 @@ struct IENetLayer {
    std::string precision;
    std::string affinity;
    std::map<std::string, std::string> params;
    // std::map<std::string, InferenceEnginePython::BlobInfo> blob_info;
    // std::map<std::string, InferenceEngine::Blob::Ptr> weights;
    void setAffinity(const std::string & target_affinity);
    void setParams(const std::map<std::string, std::string> & params_map);
    std::map<std::string, InferenceEngine::Blob::Ptr> getWeights();
    void setPrecision(std::string precision);
};
struct InputInfo{
    InferenceEngine::InputInfo actual;
    std::vector<size_t> dims;
    std::string precision;
    std::string layout;
    void setPrecision(std::string precision);
    void setLayout(std::string layout);
};
struct OutputInfo{
    InferenceEngine::DataPtr actual;
    std::vector<size_t> dims;
    std::string precision;
    std::string layout;
    void setPrecision(std::string precision);
};
struct ProfileInfo {
    std::string status;
@@ -46,15 +55,11 @@ struct IENetwork {
    InferenceEngine::CNNNetwork actual;
    std::string name;
    std::size_t batch_size;
    std::map<std::string, std::vector<size_t>> inputs;
    std::vector<std::string> outputs;
    void setPrecision() {
        InferenceEngine::CNNNetwork one;
        InferenceEngine::CNNNetwork second(std::move(one));
    }
    void setBatch(const size_t size);
    void addOutputs(const std::vector<std::string> &out_layers, const std::string &precision);
    std::map<std::string, InferenceEnginePython::IENetLayer> getLayers();
    std::map<std::string, InferenceEnginePython::InputInfo> getInputs();
    std::map<std::string, InferenceEnginePython::OutputInfo> getOutputs();
    void reshape(const std::map<std::string, std::vector<size_t>> & input_shapes);
};


@@ -43,12 +43,21 @@ cdef extern from "ie_api_impl.hpp" namespace "InferenceEnginePython":
        void setAffinity(const string & target_affinity) except +
        void setParams(const map[string, string] & params_map) except +
        map[string, Blob.Ptr] getWeights() except +
        void setPrecision(string precision) except +

    cdef cppclass InputInfo:
        vector[size_t] dims
        string precision
        string layout
        void setPrecision(string precision)
        void setLayout(string layout)

    cdef cppclass OutputInfo:
        vector[size_t] dims
        string precision
        string layout
        void setPrecision(string precision)

    # cdef cppclass BlobInfo:
    #     int layout
    #     vector[size_t] dims
    #     string name
    #     vector[string] inputTo

    cdef cppclass ProfileInfo:
        string status
@@ -71,8 +80,9 @@ cdef extern from "ie_api_impl.hpp" namespace "InferenceEnginePython":
        string name
        size_t batch_size
        map[string, vector[size_t]] inputs
        vector[string] outputs
        map[string, IENetLayer] getLayers() except +
        map[string, InputInfo] getInputs() except +
        map[string, OutputInfo] getOutputs() except +
        void addOutputs(vector[string] &, string &) except +
        void setAffinity(map[string, string] &types_affinity_map, map[string, string] &layers_affinity_map) except +
        void setBatch(size_t size) except +

@@ -1,112 +0,0 @@
#!/usr/bin/env python
"""
Copyright (c) 2018 Intel Corporation

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

     http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
"""

from __future__ import print_function
import sys
import os
from argparse import ArgumentParser
import cv2
import numpy as np
import logging as log
from openvino.inference_engine import IENetwork, IEPlugin


def build_argparser():
    parser = ArgumentParser()
    parser.add_argument("-m", "--model", help="Path to an .xml file with a trained model.", required=True, type=str)
    parser.add_argument("-i", "--input", help="Path to a folder with images or path to an image files", required=True,
                        type=str)
    parser.add_argument("-l", "--cpu_extension",
                        help="MKLDNN (CPU)-targeted custom layers.Absolute path to a shared library with the kernels "
                             "impl.", type=str, default=None)
    parser.add_argument("-pp", "--plugin_dir", help="Path to a plugin folder", type=str, default=None)
    parser.add_argument("-d", "--device",
                        help="Specify hetero plugin configuration; e.g. HETERO:FPGA,CPU", default="HETERO:CPU,GPU",
                        type=str)
    parser.add_argument("-nt", "--number_top", help="Number of top results", default=10, type=int)

    return parser


def main():
    log.basicConfig(format="[ %(levelname)s ] %(message)s", level=log.INFO, stream=sys.stdout)
    args = build_argparser().parse_args()
    assert args.device.split(':')[0] == "HETERO", "This sample supports only Hetero Plugin. " \
                                                  "Please specify correct device, e.g. HETERO:FPGA,CPU"
    model_xml = args.model
    model_bin = os.path.splitext(model_xml)[0] + ".bin"

    # Plugin initialization for specified device and load extensions library if specified
    plugin = IEPlugin(device=args.device, plugin_dirs=args.plugin_dir)
    if args.cpu_extension and 'CPU' in args.device:
        plugin.add_cpu_extension(args.cpu_extension)
    # Read IR
    net = IENetwork.from_ir(model=model_xml, weights=model_bin)

    if "CPU" in plugin.device:
        supported_layers = plugin.get_supported_layers(net)
        not_supported_layers = [l for l in net.layers.keys() if l not in supported_layers]
        if len(not_supported_layers) != 0:
            log.error("Following layers are not supported by the plugin for specified device {}:\n {}".
                      format(plugin.device, ', '.join(not_supported_layers)))
            log.error("Please try to specify cpu extensions library path in sample's command line parameters using -l "
                      "or --cpu_extension command line argument")
            sys.exit(1)
    net_ops = set([l.type for l in net.layers.values()])
    if not any([op == "Convolution" for op in net_ops]):
        log.warning("Specified IR doesn't contain any Convolution operations for which affinity going to be set.\n"
                    "Try to use another topology to make the affinity setting result more visible.")

    # Configure the plugin to initialize default affinity for network in set_initial_affinity() function.
    plugin.set_config({"TARGET_FALLBACK": args.device.split(':')[1]})
    # Enable graph visualization
    plugin.set_config({"HETERO_DUMP_GRAPH_DOT": "YES"})
    plugin.set_initial_affinity(net)

    for l in net.layers.values():
        if l.type == "Convolution":
            l.affinity = "GPU"

    assert len(net.inputs.keys()) == 1, "Sample supports only single input topologies"
    assert len(net.outputs) == 1, "Sample supports only single output topologies"
    input_blob = next(iter(net.inputs))
    out_blob = next(iter(net.outputs))
    # Read and pre-process input image
    n, c, h, w = net.inputs[input_blob]
    image = cv2.imread(args.input)
    image = cv2.resize(image, (w, h))
    image = image.transpose((2, 0, 1))  # Change data layout from HWC to CHW
    image = image.reshape((n, c, h, w))
    # Load network to the plugin
    exec_net = plugin.load(network=net)
    del net
    # Start sync inference
    res = exec_net.infer(inputs={input_blob: image})
    top_ind = np.argsort(res[out_blob], axis=1)[0, -args.number_top:][::-1]
    for i in top_ind:
        log.info("%f #%d" % (res[out_blob][0, i], i))
    del exec_net
    del plugin
    cwd = os.getcwd()
    log.info(
        "Graphs representing default and resulting affinities dumped to {} and {} files respectively"
        .format(os.path.join(cwd, 'hetero_affinity.dot'), os.path.join(cwd, 'hetero_subgraphs.dot'))
    )


if __name__ == '__main__':
    sys.exit(main() or 0)
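
The removed HETERO sample above was the in-tree illustration of per-layer affinity; its core flow, condensed into a sketch (the device string and layer type are taken from the sample):

```python
plugin = IEPlugin(device="HETERO:GPU,CPU")
plugin.set_config({"TARGET_FALLBACK": "GPU,CPU"})   # default affinity fallback order
plugin.set_initial_affinity(net)
for l in net.layers.values():
    if l.type == "Convolution":
        l.affinity = "GPU"                          # pin convolutions to the GPU subgraph
exec_net = plugin.load(network=net)
```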
@@ -60,7 +60,7 @@ def main():
    log.info("Loading network files:\n\t{}\n\t{}".format(model_xml, model_bin))
    net = IENetwork.from_ir(model=model_xml, weights=model_bin)

    if "CPU" in plugin.device:
    if plugin.device == "CPU":
        supported_layers = plugin.get_supported_layers(net)
        not_supported_layers = [l for l in net.layers.keys() if l not in supported_layers]
        if len(not_supported_layers) != 0:
@@ -79,7 +79,7 @@ def main():
    net.batch_size = len(args.input)

    # Read and pre-process input images
    n, c, h, w = net.inputs[input_blob]
    n, c, h, w = net.inputs[input_blob].shape
    images = np.ndarray(shape=(n, c, h, w))
    for i in range(n):
        image = cv2.imread(args.input[i])
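
The same two-line migration recurs in each sample hunk below: the device check is tightened and dims are read through `InputInfo`. Side by side:

```python
# Old API: the inputs map held raw dims.
# n, c, h, w = net.inputs[input_blob]
# New API: dims hang off the InputInfo object.
n, c, h, w = net.inputs[input_blob].shape
```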

@@ -60,7 +60,7 @@ def main():
    log.info("Loading network files:\n\t{}\n\t{}".format(model_xml, model_bin))
    net = IENetwork.from_ir(model=model_xml, weights=model_bin)

    if "CPU" in plugin.device:
    if plugin.device == "CPU":
        supported_layers = plugin.get_supported_layers(net)
        not_supported_layers = [l for l in net.layers.keys() if l not in supported_layers]
        if len(not_supported_layers) != 0:
@@ -78,7 +78,7 @@ def main():
    net.batch_size = len(args.input)

    # Read and pre-process input images
    n, c, h, w = net.inputs[input_blob]
    n, c, h, w = net.inputs[input_blob].shape
    images = np.ndarray(shape=(n, c, h, w))
    for i in range(n):
        image = cv2.imread(args.input[i])

@@ -1,176 +0,0 @@
#!/usr/bin/env python
"""
Copyright (c) 2018 Intel Corporation

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

     http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
"""

from __future__ import print_function
import sys
import os
from argparse import ArgumentParser
import cv2
import time
import logging as log
from openvino.inference_engine import IENetwork, IEPlugin


def build_argparser():
    parser = ArgumentParser()
    parser.add_argument("-m", "--model", help="Path to an .xml file with a trained model.", required=True, type=str)
    parser.add_argument("-i", "--input",
                        help="Path to video file or image. 'cam' for capturing video stream from camera", required=True,
                        type=str)
    parser.add_argument("-l", "--cpu_extension",
                        help="MKLDNN (CPU)-targeted custom layers.Absolute path to a shared library with the kernels "
                             "impl.", type=str, default=None)
    parser.add_argument("-pp", "--plugin_dir", help="Path to a plugin folder", type=str, default=None)
    parser.add_argument("-d", "--device",
                        help="Specify the target device to infer on; CPU, GPU, FPGA or MYRIAD is acceptable. Sample "
                             "will look for a suitable plugin for device specified (CPU by default)", default="CPU",
                        type=str)
    parser.add_argument("--labels", help="Labels mapping file", default=None, type=str)
    parser.add_argument("-pt", "--prob_threshold", help="Probability threshold for detections filtering",
                        default=0.5, type=float)

    return parser


def main():
    log.basicConfig(format="[ %(levelname)s ] %(message)s", level=log.INFO, stream=sys.stdout)
    args = build_argparser().parse_args()
    model_xml = args.model
    model_bin = os.path.splitext(model_xml)[0] + ".bin"
    # Plugin initialization for specified device and load extensions library if specified
    log.info("Initializing plugin for {} device...".format(args.device))
    plugin = IEPlugin(device=args.device, plugin_dirs=args.plugin_dir)
    if args.cpu_extension and 'CPU' in args.device:
        plugin.add_cpu_extension(args.cpu_extension)

    # Read IR
    log.info("Reading IR...")
    net = IENetwork.from_ir(model=model_xml, weights=model_bin)

    if "CPU" in plugin.device:
        supported_layers = plugin.get_supported_layers(net)
        not_supported_layers = [l for l in net.layers.keys() if l not in supported_layers]
        if len(not_supported_layers) != 0:
            log.error("Following layers are not supported by the plugin for specified device {}:\n {}".
                      format(plugin.device, ', '.join(not_supported_layers)))
            log.error("Please try to specify cpu extensions library path in sample's command line parameters using -l "
                      "or --cpu_extension command line argument")
            sys.exit(1)
    assert len(net.inputs.keys()) == 1, "Sample supports only single input topologies"
    assert len(net.outputs) == 1, "Sample supports only single output topologies"
    input_blob = next(iter(net.inputs))
    out_blob = next(iter(net.outputs))
    log.info("Loading IR to the plugin...")
    exec_net = plugin.load(network=net, num_requests=2)
    # Read and pre-process input image
    n, c, h, w = net.inputs[input_blob]
    del net
    if args.input == 'cam':
        input_stream = 0
    else:
        input_stream = args.input
        assert os.path.isfile(args.input), "Specified input file doesn't exist"
    if args.labels:
        with open(args.labels, 'r') as f:
            labels_map = [x.strip() for x in f]
    else:
        labels_map = None

    cap = cv2.VideoCapture(input_stream)

    cur_request_id = 0
    next_request_id = 1

    log.info("Starting inference in async mode...")
    log.info("To switch between sync and async modes press Tab button")
    log.info("To stop the sample execution press Esc button")
    is_async_mode = True
    render_time = 0
    while cap.isOpened():
        ret, frame = cap.read()
        if not ret:
            break
        initial_w = cap.get(3)
        initial_h = cap.get(4)
        in_frame = cv2.resize(frame, (w, h))
        in_frame = in_frame.transpose((2, 0, 1))  # Change data layout from HWC to CHW
        in_frame = in_frame.reshape((n, c, h, w))

        # Main sync point:
        # in the truly Async mode we start the NEXT infer request, while waiting for the CURRENT to complete
        # in the regular mode we start the CURRENT request and immediately wait for it's completion
        inf_start = time.time()
        if is_async_mode:
            exec_net.start_async(request_id=next_request_id, inputs={input_blob: in_frame})
        else:
            exec_net.start_async(request_id=cur_request_id, inputs={input_blob: in_frame})
        if exec_net.requests[cur_request_id].wait(-1) == 0:
            inf_end = time.time()
            det_time = inf_end - inf_start

            # Parse detection results of the current request
            res = exec_net.requests[cur_request_id].outputs[out_blob]
            for obj in res[0][0]:
                # Draw only objects when probability more than specified threshold
                if obj[2] > args.prob_threshold:
                    xmin = int(obj[3] * initial_w)
                    ymin = int(obj[4] * initial_h)
                    xmax = int(obj[5] * initial_w)
                    ymax = int(obj[6] * initial_h)
                    class_id = int(obj[1])
                    # Draw box and label\class_id
                    color = (min(class_id * 12.5, 255), min(class_id * 7, 255), min(class_id * 5, 255))
                    cv2.rectangle(frame, (xmin, ymin), (xmax, ymax), color, 2)
                    det_label = labels_map[class_id] if labels_map else str(class_id)
                    cv2.putText(frame, det_label + ' ' + str(round(obj[2] * 100, 1)) + ' %', (xmin, ymin - 7),
                                cv2.FONT_HERSHEY_COMPLEX, 0.6, color, 1)

            # Draw performance stats
            inf_time_message = "Inference time: N\A for async mode" if is_async_mode else \
                "Inference time: {:.3f} ms".format(det_time * 1000)
            render_time_message = "OpenCV rendering time: {:.3f} ms".format(render_time * 1000)
            async_mode_message = "Async mode is on. Processing request {}".format(cur_request_id) if is_async_mode else \
                "Async mode is off. Processing request {}".format(cur_request_id)

            cv2.putText(frame, inf_time_message, (15, 15), cv2.FONT_HERSHEY_COMPLEX, 0.5, (200, 10, 10), 1)
            cv2.putText(frame, render_time_message, (15, 30), cv2.FONT_HERSHEY_COMPLEX, 0.5, (10, 10, 200), 1)
            cv2.putText(frame, async_mode_message, (10, int(initial_h - 20)), cv2.FONT_HERSHEY_COMPLEX, 0.5,
                        (10, 10, 200), 1)

        #
        render_start = time.time()
        cv2.imshow("Detection Results", frame)
        render_end = time.time()
        render_time = render_end - render_start

        key = cv2.waitKey(1)
        if key == 27:
            break
        if (9 == key):
            is_async_mode = not is_async_mode
            log.info("Switched to {} mode".format("async" if is_async_mode else "sync"))

        if is_async_mode:
            cur_request_id, next_request_id = next_request_id, cur_request_id

    cv2.destroyAllWindows()
    del exec_net
    del plugin


if __name__ == '__main__':
    sys.exit(main() or 0)
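
The removed demo's "main sync point" comment describes the async scheme: start the NEXT request while waiting on the CURRENT one, then swap ids. A minimal sketch of that ping-pong pattern (`frames` is a placeholder iterable of preprocessed inputs):

```python
exec_net = plugin.load(network=net, num_requests=2)
cur_id, next_id = 0, 1
for frame in frames:  # placeholder: preprocessed NCHW frames
    exec_net.start_async(request_id=next_id, inputs={input_blob: frame})
    if exec_net.requests[cur_id].wait(-1) == 0:       # 0 == OK
        res = exec_net.requests[cur_id].outputs[out_blob]
    cur_id, next_id = next_id, cur_id                 # swap for the next iteration
```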
@@ -82,7 +82,7 @@ def main():
    log.info("Loading network files:\n\t{}\n\t{}".format(model_xml, model_bin))
    net = IENetwork.from_ir(model=model_xml, weights=model_bin)

    if "CPU" in plugin.device:
    if plugin.device == "CPU":
        supported_layers = plugin.get_supported_layers(net)
        not_supported_layers = [l for l in net.layers.keys() if l not in supported_layers]
        if len(not_supported_layers) != 0:
@@ -100,7 +100,7 @@ def main():
    net.batch_size = len(args.input)

    # Read and pre-process input images
    n, c, h, w = net.inputs[input_blob]
    n, c, h, w = net.inputs[input_blob].shape
    images = np.ndarray(shape=(n, c, h, w))
    for i in range(n):
        image = cv2.imread(args.input[i])

@@ -69,7 +69,7 @@ def main():
    log.info("Loading network files:\n\t{}\n\t{}".format(model_xml, model_bin))
    net = IENetwork.from_ir(model=model_xml, weights=model_bin)

    if "CPU" in plugin.device:
    if plugin.device == "CPU":
        supported_layers = plugin.get_supported_layers(net)
        not_supported_layers = [l for l in net.layers.keys() if l not in supported_layers]
        if len(not_supported_layers) != 0:
@@ -88,7 +88,7 @@ def main():
    net.batch_size = len(args.input)

    # Read and pre-process input images
    n, c, h, w = net.inputs[input_blob]
    n, c, h, w = net.inputs[input_blob].shape
    images = np.ndarray(shape=(n, c, h, w))
    for i in range(n):
        image = cv2.imread(args.input[i])

@@ -75,7 +75,11 @@ public:
    CNNNetwork getNetwork() {
        // network object is to be updated upon this call
        if (network.get() == nullptr) {
            network.reset(new CNNNetwork(actual));
            try {
                network.reset(new CNNNetwork(actual));
            } catch (...) {
                THROW_IE_EXCEPTION << "Could not allocate memory";
            }
        }
        return *network.get();
    }

@@ -191,7 +191,7 @@ public:
     * @brief - Helper method to collect all input shapes with names of corresponding Data objects
     * @return Map of pairs: input's name and its dimension.
     */
    virtual ICNNNetwork::InputShapes getInputShapes() {
    virtual ICNNNetwork::InputShapes getInputShapes() const {
        ICNNNetwork::InputShapes shapes;
        InputsDataMap inputs;
        actual->getInputsInfo(inputs);
@@ -207,6 +207,10 @@ public:
        return std::move(shapes);
    }

    /**
     * @brief Run shape inference with new input shapes for the network
     * @param inputShapes - map of pairs: name of corresponding data and its dimension.
     */
    virtual void reshape(const ICNNNetwork::InputShapes &inputShapes) {
        CALL_STATUS_FNC(reshape, inputShapes);
    }

@@ -51,4 +51,4 @@ class MemoryState {
    }
};

} // namespace InferenceEngine
} // namespace InferenceEngine
@@ -11,6 +11,9 @@
#include <set>
#include <cctype>

namespace InferenceEngine {
namespace details {

/**
 * @brief provides case-less comparison for stl algorithms
 * @tparam Key type, usually std::string
@@ -73,3 +76,6 @@ using caseless_map = std::map<Key, Value, CaselessLess<Key>>;

template <class Key>
using caseless_set = std::set<Key, CaselessLess<Key>>;

}  // namespace details
}  // namespace InferenceEngine
@@ -90,7 +90,7 @@ class CNNNetworkIterator {
     */
    const CNNLayerPtr &operator*() const {
        if (nullptr == currentLayer) {
            THROW_IE_EXCEPTION << "iterator of ouf bound";
            THROW_IE_EXCEPTION << "iterator out of bound";
        }
        return currentLayer;
    }

21
inference-engine/include/details/ie_cnn_network_tools.h
Normal file
@@ -0,0 +1,21 @@
// Copyright (C) 2018 Intel Corporation
//
// SPDX-License-Identifier: Apache-2.0
//

/**
 * @brief A header file for CNNNetwork tools
 * @file ie_cnn_network_tools.h
 */
#pragma once
#include <vector>
#include "ie_common.h"
#include "ie_icnn_network.hpp"

namespace InferenceEngine {
namespace details {

INFERENCE_ENGINE_API_CPP(std::vector<CNNLayerPtr>) CNNNetSortTopologically(const ICNNNetwork & network);

}  // namespace details
}  // namespace InferenceEngine
@@ -68,6 +68,7 @@ inline void extract_exception(StatusCode status, char *msg) {
        case RESULT_NOT_READY:throw ResultNotReady(msg);
        case NOT_ALLOCATED:throw NotAllocated(msg);
        case INFER_NOT_STARTED:throw InferNotStarted(msg);
        case NETWORK_NOT_READ:throw NetworkNotRead(msg);
        default:THROW_IE_EXCEPTION << msg;
    }
}

@@ -22,4 +22,4 @@ class NoReleaseOn : public T {
};

} // namespace details
} // namespace InferenceEngine
} // namespace InferenceEngine
@@ -84,4 +84,4 @@ std::shared_ptr<IAllocator> make_pre_allocator(T *ptr, size_t size) {
}

} // namespace details
} // namespace InferenceEngine
} // namespace InferenceEngine
@@ -29,6 +29,7 @@ namespace HeteroConfigParams {
 * This option should be used with values: CONFIG_VALUE(NO) (default) or CONFIG_VALUE(YES)
 */
DECLARE_HETERO_CONFIG_KEY(DUMP_GRAPH_DOT);
DECLARE_HETERO_CONFIG_KEY(DUMP_DLA_MESSAGES);

}  // namespace HeteroConfigParams
}  // namespace InferenceEngine

@@ -120,7 +120,7 @@ public:
        : tensorDesc(p, SizeVector(dims.rbegin(), dims.rend()), l) {}

    /**
     * @depricated It works with reversed dimensions. Please create a new blob if you want to change a size.
     * @deprecated It works with reversed dimensions. Please create a new blob if you want to change a size.
     * @brief Changes Tensor size to the specified dimensions. If it was allocated, the previous data is deallocated and lost.
     * @param dims New dimensions to set
     * @param layout New layout to set
@@ -290,11 +290,16 @@ public:
     * @param data_size Length of the pre-allocated array. If not set, size is assumed equal
     * to the dot product of dims.
     */
    TBlob(const TensorDesc& tensorDesc, T* ptr, size_t data_sze = 0): Blob(tensorDesc) {
        if (size() != 0 && ptr == nullptr) {
    TBlob(const TensorDesc& tensorDesc, T* ptr, size_t data_size = 0): Blob(tensorDesc) {
        if (data_size == 0) {
            data_size = size();
        }

        if (data_size != 0 && ptr == nullptr) {
            THROW_IE_EXCEPTION << "Using Blob on external nullptr memory";
        }
        _allocator = details::make_pre_allocator(ptr, size());

        _allocator = details::make_pre_allocator(ptr, data_size);
        // blob on attached memory is always allocated, so we are not forcing the user to call allocate()
        allocate();
    }
@@ -327,11 +332,14 @@ public:
     * @param ptr Pointer to the pre-allocated memory
     * @param data_size Length of the pre-allocated array. If not set, size is assumed equal to dot product of dims.
     */
    TBlob(Precision p, Layout l, const SizeVector& dims, T* ptr, size_t data_sze = 0) : Blob(p, l, dims) {
        if (size() != 0 && ptr == nullptr) {
    TBlob(Precision p, Layout l, const SizeVector& dims, T* ptr, size_t data_size = 0) : Blob(p, l, dims) {
        if (data_size == 0) {
            data_size = size();
        }
        if (data_size != 0 && ptr == nullptr) {
            THROW_IE_EXCEPTION << "Using Blob on external nullptr memory";
        }
        _allocator = details::make_pre_allocator(ptr, size());
        _allocator = details::make_pre_allocator(ptr, data_size);
        // blob on attached memory is always allocated, so we are not forcing user to call allocate
        allocate();
    }
@@ -416,7 +424,10 @@ public:
        if (tensorDesc.getDims().size() == 0) {
            tensorDesc.setDims({static_cast<unsigned int>(that.size())});
        }
        allocate();
        // minimisation of reallocations
        if (_handle == nullptr) {
            allocate();
        }
        auto memptr = data();
        memcpy(memptr, that.data(), product(tensorDesc.getDims()) * sizeof(T));
    }

@@ -67,6 +67,12 @@ union UserValue {
    void *v_ptr;
};

enum CellType {
    ORIG,
    LSTM,
    GRU
};

/**
 * @enum Layout
 * @brief Layouts that the inference engine supports
@@ -94,6 +100,29 @@ enum Layout : uint8_t {

    BLOCKED = 200,
};
inline std::ostream & operator << (std::ostream &out, const Layout & p) {
    switch (p) {
#define PRINT_LAYOUT(name)\
        case name : out << #name; break;

        PRINT_LAYOUT(ANY);
        PRINT_LAYOUT(NCHW);
        PRINT_LAYOUT(NHWC);
        PRINT_LAYOUT(OIHW);
        PRINT_LAYOUT(C);
        PRINT_LAYOUT(CHW);
        PRINT_LAYOUT(HW);
        PRINT_LAYOUT(NC);
        PRINT_LAYOUT(CN);
        PRINT_LAYOUT(BLOCKED);
#undef PRINT_LAYOUT
        default:
            out << static_cast<int>(p);
            break;
    }
    return out;
}


/**
 * @struct InferenceEngineProfileInfo
@@ -157,7 +186,8 @@ enum StatusCode : int {
    REQUEST_BUSY = -8,
    RESULT_NOT_READY = -9,
    NOT_ALLOCATED = -10,
    INFER_NOT_STARTED = -11
    INFER_NOT_STARTED = -11,
    NETWORK_NOT_READ = -12
};

/**
@@ -216,6 +246,10 @@ class InferNotStarted : public std::logic_error
{ using std::logic_error::logic_error; };
} // namespace InferenceEngine

/** @brief This class represents StatusCode::NETWORK_NOT_READ exception */
class NetworkNotRead : public std::logic_error
{ using std::logic_error::logic_error; };

#if defined(_WIN32)
#define __PRETTY_FUNCTION__ __FUNCSIG__
#else

@@ -104,7 +104,7 @@ public:
     * Batch is defined as the last element in the dimensions vector.
     * @param batch_size Batch size to set
     */
    inline void setBatchSize(size_t batch_size);
    void setBatchSize(size_t batch_size);

    /**
     * @brief Sets the layout value for this Data instance

@@ -11,7 +11,6 @@

#include "details/ie_so_pointer.hpp"
#include "ie_iextension.h"
#include "mkldnn/mkldnn_extension_ptr.hpp"
#include <string>
#include <memory>
#include <map>
@@ -166,8 +165,8 @@ public:
     * @param resp Response descriptor
     * @return Status code
     */
    StatusCode getPrimitiveTypes(char**& types, unsigned int& size, ResponseDesc* resp) noexcept override {
        return actual->getPrimitiveTypes(types, size, resp);
    StatusCode getShapeInferTypes(char**& types, unsigned int& size, ResponseDesc* resp) noexcept override {
        return actual->getShapeInferTypes(types, size, resp);
    }

    /**
@@ -204,11 +203,7 @@ inline std::shared_ptr<IShapeInferExtension> make_so_pointer(const std::string &
 */
template<>
inline std::shared_ptr<IExtension> make_so_pointer(const std::string &name) {
    try {
        return std::make_shared<Extension>(name);
    } catch (InferenceEngine::details::InferenceEngineException& ex) {
        return std::make_shared<MKLDNNPlugin::MKLDNNExtension>(name);
    }
    return std::make_shared<Extension>(name);
}

} // namespace InferenceEngine

@@ -29,6 +29,7 @@ class ICNNNetReader : public details::IRelease {
public:
    /**
     * @brief Parses the topology part of the IR (.xml)
     * This method can be called once only to read network. If you need to read another network instance then create new reader instance.
     * @param filepath The full path to the .xml file of the IR
     * @param resp Response message
     * @return Result code
@@ -37,6 +38,7 @@ public:

    /**
     * @brief Parses the topology part of the IR (.xml) given the xml as a buffer
     * This method can be called once only to read network. If you need to read another network instance then create new reader instance.
     * @param model Pointer to a char array with the IR
     * @param resp Response message
     * @param size Size of the char array in bytes

@@ -17,6 +17,7 @@
#include "details/ie_irelease.hpp"
#include "ie_preprocess.hpp"
#include "ie_input_info.hpp"
#include "ie_icnn_network_stats.hpp"
#include "ie_iextension.h"
#include <memory>
#include <map>
@@ -28,6 +29,8 @@ namespace InferenceEngine {
 * @brief A collection that contains string as key, and Data smart pointer as value
 */
using OutputsDataMap = std::map<std::string, DataPtr>;
class IShapeInferExtension;
using IShapeInferExtensionPtr = std::shared_ptr<IShapeInferExtension>;

/**
 * @brief This is the main interface to describe the NN topology
@@ -143,9 +146,10 @@ public:
     * @note There are several limitations and it's not recommended to use it. Set batch to the input shape and call @reshape.
     * @param size Size of batch to set
     * @return Status code of the operation
     * @note: Current implementation of the function sets batch size to the first dimension of 4D input layers in the networks
     * and starts shape inference for IR starting from v3, for IR v2 it sets batch to the first dimension for all layers.
     * Custom layers might require custom shape infer implementation, use @IShapeInferExtension interface to register them.
     * @note: Current implementation of the function sets batch size to the first dimension of all layers in the networks.
     * Before calling it make sure that all your layers have batch in the first dimension, otherwise the method works incorrectly.
     * This limitation is resolved via [Shape Inference feature](./docs/Inference_Engine_Developer_Guide/ShapeInference.md)
     * by using InferenceEngine::ICNNNetwork::reshape method.
     */
    virtual StatusCode setBatchSize(size_t size, ResponseDesc* responseDesc) noexcept = 0;
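
The rewritten note warns that `setBatchSize` simply overwrites the first dimension of every layer, so `reshape` is the safer route when that assumption may not hold. From Python the two options look like this (dims are placeholders):

```python
net.batch_size = 4                            # fast path; assumes batch is dim 0 everywhere
net.reshape({input_blob: (4, 3, 224, 224)})   # shape-inference path, placeholder dims
```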

@@ -161,12 +165,11 @@ public:
    using InputShapes = std::map<std::string, SizeVector>;

    /**
     * @brief - Run shape inference with new input shapes for the network
     * @param inputShapes - map of pairs: name of corresponding data and its dimension.
     * @note currently all inputs are required
     * @param resp Pointer to the response message that holds a description of an error if any occurred
     * @return Status code of the operation
     */
     * @brief Run shape inference with new input shapes for the network
     * @param inputShapes - map of pairs: name of corresponding data and its dimension.
     * @param resp Pointer to the response message that holds a description of an error if any occurred
     * @return Status code of the operation
     */
    virtual StatusCode reshape(const InputShapes& inputShapes, ResponseDesc* resp) noexcept { return NOT_IMPLEMENTED; };

    /**
@@ -177,5 +180,7 @@ public:
     */
    virtual StatusCode
    AddExtension(const IShapeInferExtensionPtr& extension, ResponseDesc* resp) noexcept { return NOT_IMPLEMENTED; };

    virtual StatusCode getStats(ICNNNetworkStats** stats, ResponseDesc* resp) const noexcept { return NOT_IMPLEMENTED; };
};
} // namespace InferenceEngine

@@ -13,37 +13,39 @@
#include <memory>
#include <limits>
#include <vector>
#include <map>
#include "details/ie_irelease.hpp"

namespace InferenceEngine {

class NetworkNodeStats;

using NetworkNodeStatsPtr = std::shared_ptr<NetworkNodeStats>;
using NetworkNodeStatsWeakPtr = std::weak_ptr<NetworkNodeStats>;
using NetworkStatsMap = std::map<std::string, NetworkNodeStatsPtr>;
/**
 * @class ICNNNetworkStats
 * @brief This is the interface to describe the NN topology scoring statistics
 */
class ICNNNetworkStats : public details::IRelease {
public:
    virtual void SaveToFile(const std::string& xmlPath, const std::string& binPath) const = 0;
    virtual void LoadFromFile(const std::string& xmlPath, const std::string& binPath) = 0;
    virtual void setNodesStats(const NetworkStatsMap& stats) = 0;
    virtual const NetworkStatsMap& getNodesStats() const = 0;

    virtual bool isEmpty() const = 0;
};


class NetworkNodeStats;

using NetworkNodeStatsPtr = std::shared_ptr<NetworkNodeStats>;
using NetworkNodeStatsWeakPtr = std::weak_ptr<NetworkNodeStats>;

class NetworkNodeStats {
public:
    NetworkNodeStats() { }
    explicit NetworkNodeStats(int statCount) {
        float min = std::numeric_limits<float>::max();
        float max = std::numeric_limits<float>::min();
        float mn = (std::numeric_limits<float>::max)();
        float mx = (std::numeric_limits<float>::min)();

        for (int i = 0; i < statCount; i++) {
            _minOutputs.push_back(min);
            _maxOutputs.push_back(max);
            _minOutputs.push_back(mn);
            _maxOutputs.push_back(mx);
        }
    }


@@ -22,7 +22,6 @@
#include "details/ie_no_copy.hpp"



#if defined(_WIN32) && defined(IMPLEMENT_INFERENCE_EXTENSION_API)
#define INFERENCE_EXTENSION_API(TYPE) extern "C" __declspec(dllexport) TYPE
#else
@@ -137,7 +136,9 @@ public:
     * @return Status code
     */
    virtual StatusCode getShapes(const std::vector<TensorDesc>& inShapes, std::vector<TensorDesc>& outShapes,
                                 ResponseDesc* resp) noexcept = 0;
                                 ResponseDesc* resp) noexcept {
        return NOT_IMPLEMENTED;
    }

    /**
     * @brief Gets all possible implementations for the given cnn Layer
@@ -156,6 +157,8 @@ class IShapeInferImpl {
public:
    using Ptr = std::shared_ptr<IShapeInferImpl>;

    virtual ~IShapeInferImpl() = default;

    /**
     * @brief check that reshape can be applied, that parameters and shapes are valid
     */
@@ -191,13 +194,13 @@ public:
    virtual void Unload() noexcept = 0;

    /**
     * @brief Gets the array with types of layers which are included in the extension
     * @brief Fills passed array with types of layers which shape infer implementations are included in the extension
     * @param types Array to store the layer types
     * @param size Size of the layer types array
     * @param resp Response descriptor
     * @return Status code
     */
    virtual StatusCode getPrimitiveTypes(char**& types, unsigned int& size, ResponseDesc* resp) noexcept = 0;
    virtual StatusCode getShapeInferTypes(char**& types, unsigned int& size, ResponseDesc* resp) noexcept = 0;

    /**
     * @brief Gets shape propagation implementation for the given string-type of cnn Layer
@@ -218,9 +221,20 @@ public:
    virtual StatusCode getFactoryFor(ILayerImplFactory*& factory, const CNNLayer* cnnLayer,
                                     ResponseDesc* resp) noexcept = 0;

    StatusCode getShapeInferImpl(IShapeInferImpl::Ptr& impl,
                                 const char* type,
                                 ResponseDesc* resp) noexcept override {
    /**
     * @brief Fills passed array with types of layers which kernel implementations are included in the extension
     * @param types Array to store the layer types
     * @param size Size of the layer types array
     * @param resp Response descriptor
     * @return Status code
     */
    virtual StatusCode getPrimitiveTypes(char**& types, unsigned int& size, ResponseDesc* resp) noexcept = 0;

    StatusCode getShapeInferTypes(char**& types, unsigned int& size, ResponseDesc* resp) noexcept override {
        return NOT_IMPLEMENTED;
    };

    StatusCode getShapeInferImpl(IShapeInferImpl::Ptr& impl, const char* type, ResponseDesc* resp) noexcept override {
        return NOT_IMPLEMENTED;
    };
};

@@ -84,6 +84,8 @@ public:
                              QueryNetworkResult &res) noexcept {
        QueryNetwork(device, network, res);
    };

    virtual void SetLogCallback(IErrorListener &listener) = 0;
};

using MapDeviceLoaders = std::map<std::string, InferenceEngine::IHeteroDeviceLoader::Ptr>;

@@ -58,4 +58,4 @@ class IMemoryState : public details::no_copy {
    virtual StatusCode GetLastState(Blob::CPtr & lastState, ResponseDesc *resp) const noexcept = 0;
};

} // namespace InferenceEngine
} // namespace InferenceEngine
@@ -20,6 +20,9 @@
#include "ie_data.h"
#include "ie_blob.h"
#include "ie_device.hpp"
#include "ie_layers_property.hpp"

#include "ie_icnn_network.hpp"

namespace InferenceEngine {
/**
@@ -459,108 +462,92 @@ public:
    using CNNLayer::CNNLayer;
};

/**
 * @brief convenient way to declare property with backward compatibility to 2D members
 */
#define DEFINE_PROP(prop_name) \
PropertyVector<unsigned int> prop_name;\
unsigned int &prop_name##_x = prop_name.at(X_AXIS);\
unsigned int &prop_name##_y = prop_name.at(Y_AXIS);\

/**
 * @brief This class represents a standard 3D Convolution Layer
 */
class ConvolutionLayer : public WeightableLayer {
public:
    /**
     * @brief A convolution kernel width
     * @brief A convolution kernel array [X, Y, Z, ...]
     */
    unsigned int _kernel_x = 0;
    DEFINE_PROP(_kernel);
    /**
     * @brief A convolution kernel height
     * @brief A convolution paddings begin array [X, Y, Z, ...]
     */
    unsigned int _kernel_y = 0;
    DEFINE_PROP(_padding);
    /**
     * @brief An input convolution stride width
     * @brief A convolution paddings end array [X, Y, Z, ...]
     */
    unsigned int _stride_x = 1;
    PropertyVector<unsigned int> _pads_end;
    /**
     * @brief An Input convolution stride height
     * @brief A convolution strides array [X, Y, Z, ...]
     */
    unsigned int _stride_y = 1;
    DEFINE_PROP(_stride);
    /**
     * @brief A convolution dilations array [X, Y, Z, ...]
     */
    DEFINE_PROP(_dilation);
    /**
     * @brief A number of output feature maps (size) generating the 3'rd output dimension
     */
    unsigned int _out_depth = 0;
    /**
     * @brief Input padding width
     */
    unsigned int _padding_x = 0;
    /**
     * @brief Input padding height
     */
    unsigned int _padding_y = 0;
    /**
     * @brief Dilation width
     */
    unsigned int _dilation_x = 1;
    /**
     * @brief Dilation height
     */
    unsigned int _dilation_y = 1;
    unsigned int _out_depth = 0u;
    /**
     * @brief Number of groups
     */
    unsigned int _group = 1;
    unsigned int _group = 1u;

    /**
     * @brief Creates a new ConvolutionLayer instance.
     */
    using WeightableLayer::WeightableLayer;
    explicit ConvolutionLayer(const LayerParams &p) : WeightableLayer(p),
            _kernel(2, 0u), _padding(2, 0u), _stride(2, 1u), _dilation(2, 1u) {}
    /**
     * @brief assignment operator
     */
    ConvolutionLayer & operator = (const ConvolutionLayer & that) {
        if (&that != this) {
            WeightableLayer::operator=(that);
            _kernel = that._kernel;
            _padding = that._padding;
            _pads_end = that._pads_end;
            _stride = that._stride;
            _dilation = that._dilation;
            _out_depth = that._out_depth;
            _group = that._group;
        }
        return *this;
    }
    /**
     * @brief move assignment operator
     */
    ConvolutionLayer& operator = (ConvolutionLayer &&) = default;
    /**
     * @brief copy constructor
     */
    ConvolutionLayer(const ConvolutionLayer & that) : WeightableLayer(that) {
        operator = (that);
    }
    /**
     * @brief move constructor
     */
    ConvolutionLayer(ConvolutionLayer &&) = default;
};

/**
 * @brief This class represents a standard deconvolution layer
 */
class DeconvolutionLayer : public WeightableLayer {
public:
    /**
     * @brief Deconvolution kernel width
     */
    unsigned int _kernel_x = 0;
    /**
     * @brief Deconvolution kernel height
     */
    unsigned int _kernel_y = 0;
    /**
     * @brief Input Deconvolution stride width
     */
    unsigned int _stride_x = 0;
    /**
     * @brief Input Deconvolution stride height
     */
    unsigned int _stride_y = 0;
    /**
     * @brief number of output feature maps (size) generating the 3'rd output dimension
     */
    unsigned int _out_depth = 0;
    /**
     * @brief Input padding width
     */
    unsigned int _padding_x = 0;
    /**
     * @brief Input padding height
     */
    unsigned int _padding_y = 0;
    /**
     * @brief Dilation width
     */
    unsigned int _dilation_x = 0;
    /**
     * @brief Dilation height
     */
    unsigned int _dilation_y = 0;
    /**
     * @brief Number of groups
     */
    unsigned int _group = 0;

    /**
     * @brief Creates a new DeconvolutionLayer instance.
     */
    using WeightableLayer::WeightableLayer;
class DeconvolutionLayer : public ConvolutionLayer {
public:
    using ConvolutionLayer::ConvolutionLayer;
    using ConvolutionLayer::operator=;
};

/**
@@ -569,29 +556,21 @@ public:
class PoolingLayer : public CNNLayer {
public:
    /**
     * @brief Pooling kernel width
     * @brief Pooling kernel array [X, Y, Z, ...]
     */
    unsigned int _kernel_x = 0;
    DEFINE_PROP(_kernel);
    /**
     * @brief Pooling kernel height
     * @brief Pooling paddings begin array [X, Y, Z, ...]
     */
    unsigned int _kernel_y = 0;
    DEFINE_PROP(_padding);
    /**
     * @brief Input Pooling stride width
     * @brief Pooling paddings end array [X, Y, Z, ...]
     */
    unsigned int _stride_x = 0;
    PropertyVector<unsigned int> _pads_end;
    /**
     * @brief Input Pooling stride height
     * @brief Pooling strides array [X, Y, Z, ...]
     */
    unsigned int _stride_y = 0;
    /**
     * @brief Input padding width
     */
    unsigned int _padding_x = 0;
    /**
     * @brief Input padding height
     */
    unsigned int _padding_y = 0;
    DEFINE_PROP(_stride);

    /**
     * @enum PoolType
@@ -618,9 +597,44 @@ public:
    /**
     * @brief Creates a new PoolingLayer instance.
     */
    using CNNLayer::CNNLayer;
    explicit PoolingLayer(const LayerParams &p) : CNNLayer(p),
            _kernel(2, 0u), _padding(2, 0u), _stride(2, 0u) {}

    /**
     * @brief assignment operator
     */
    PoolingLayer & operator = (const PoolingLayer & that) {
        if (&that != this) {
            CNNLayer::operator=(that);
            _kernel = that._kernel;
            _padding = that._padding;
            _pads_end = that._pads_end;
            _stride = that._stride;
            _type = that._type;
            _exclude_pad = that._exclude_pad;
        }
        return *this;
    }
    /**
     * @brief move assignment operator
     */
    PoolingLayer& operator = (PoolingLayer &&) = default;

    /**
     * @brief copy constructor
     */
    PoolingLayer(const PoolingLayer & that) : CNNLayer(that) {
        operator=(that);
    }

    /**
     * @brief move constructor
     */
    PoolingLayer(PoolingLayer &&) = default;
};

#undef DEFINE_PROP
|
||||
|
||||
/**
|
||||
* @brief This class represents a fully connected layer
|
||||
*/
|
||||
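The same migration applies to pooling, with the addition of a separate end-padding vector. As a sketch of how downstream code might consume the new begin/end padding split (floor rounding assumed; the real plugins also honor rounding type and `_exclude_pad`, which this ignores):

```cpp
#include <inference_engine.hpp>

// Illustrative only: output size of one pooled axis under the new properties.
unsigned int PooledDim(const InferenceEngine::PoolingLayer &pool,
                       unsigned int inputDim, size_t axis) {
    const unsigned int padBegin = pool._padding[axis];
    // _pads_end may be left unset by older IRs; fall back to symmetric padding.
    const unsigned int padEnd =
        pool._pads_end.exist(axis) ? pool._pads_end[axis] : padBegin;
    return (inputDim + padBegin + padEnd - pool._kernel[axis]) / pool._stride[axis] + 1;
}
```
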
@@ -836,7 +850,6 @@ public:
     */
    std::vector<int> axis;
    /**
     * @deprecated result size is defined by second input
     * @brief A vector of dimensions to be preserved
     */
    std::vector<int> dim;
@@ -912,6 +925,66 @@ public:
    using WeightableLayer::WeightableLayer;
};

/**
 * @brief This class represents RNN sequence layer
 */
class RNNLayer : public WeightableLayer {
public:
    CellType cellType;

    /**
     * @brief An axis by which iteration is performed. Axis=0 means first input blob dimension is sequence, axis=1 means first dimension is batch.
     */
    unsigned int _axis = 1;

    using WeightableLayer::WeightableLayer;

    /**
     * @brief Creates a new RNNLayer instance.
     */
    explicit RNNLayer(const LayerParams &p) : WeightableLayer(p) {}
};

/**
 * @brief This class represents LSTMCell pseudo-layer to be used in TensorIterator
 */
class LSTMCell : public WeightableLayer {
public:
    using WeightableLayer::WeightableLayer;
};

class ICNNNetReader;
/**
 * @brief This class represents TensorIterator layer
 */
class TensorIterator : public CNNLayer {
public:
    using CNNNetReaderPtr = std::shared_ptr<ICNNNetReader>;
    CNNNetReaderPtr reader;

    struct BackEdge {
        int fromLayer;
        int fromPort;
        int toLayer;
        int toPort;
    };

    struct Port {
        int external_port_id;
        int internal_layer_id;
        int internal_port_id;
        int axis;
        int part_size;
        int stride;
    };

    std::vector<Port> input_ports;
    std::vector<Port> output_ports;
    std::vector<BackEdge> backEdges;

    using CNNLayer::CNNLayer;
};

/**
 * @class PReLULayer
 * @brief This class represents a Layer which performs Scale and Shift

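`TensorIterator` is only a data holder here: the body network comes from `reader`, `Port` entries describe how external blobs map to (and are sliced for) body layers, and `BackEdge` entries carry state between iterations. A purely illustrative wiring of one input port and one recurrent edge (field meanings inferred from the names; the IDs are invented):

```cpp
#include <inference_engine.hpp>

void WireIterator(InferenceEngine::TensorIterator &ti) {
    InferenceEngine::TensorIterator::Port in;
    in.external_port_id = 0;   // port on the TensorIterator layer itself
    in.internal_layer_id = 1;  // layer inside the body network
    in.internal_port_id = 0;
    in.axis = 1;               // slice the input along axis 1
    in.part_size = 1;          // one slice per iteration
    in.stride = 1;
    ti.input_ports.push_back(in);

    // Route iteration N's output (layer 2, port 1) back into iteration N+1
    // (layer 1, port 2):
    ti.backEdges.push_back({2, 1, 1, 2});
}
```
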
125
inference-engine/include/ie_layers_property.hpp
Normal file
@@ -0,0 +1,125 @@
// Copyright (C) 2018 Intel Corporation
//
// SPDX-License-Identifier: Apache-2.0
//

/**
 * @brief A header file describing the property-style structure used by CNNLayers
 * @file ie_layers_property.hpp
 */
#pragma once

namespace InferenceEngine {

constexpr const int MAX_DIMS_NUMBER = 12;

enum eDIMS_AXIS : uint8_t {
    X_AXIS = 0,
    Y_AXIS,
    Z_AXIS
};

template<class T, int N = MAX_DIMS_NUMBER>
class PropertyVector {
    T _axises[N] = {};
    bool _allocated[N] = {};
    size_t _length = 0;

public:
    PropertyVector() = default;

    PropertyVector(size_t len, T val) {
        if (len > N) {
            THROW_IE_EXCEPTION << "Property size exceeded limit of: " << N;
        }
        for (int i = 0; i < len; i++) {
            _axises[i] = val;
            _allocated[i] = true;
        }
        _length = len;
    }

    /**
     * @brief allows access up to the capacity size
     * @param index
     * @return
     */
    T &at(int index) {
        if (index >= N) {
            THROW_IE_EXCEPTION << "Property index is out of bounds (" << index << "/" << N << ")";
        }
        return _axises[index];
    }

    const T &operator[](size_t index) const {
        if (index >= N || !_allocated[index]) {
            THROW_IE_EXCEPTION << "Property index (" << index << ") is out of bounds";
        }
        return _axises[index];
    }

    T &operator[](size_t index) {
        if (index >= N || !_allocated[index]) {
            THROW_IE_EXCEPTION << "Property index (" << index << ") is out of bounds";
        }
        return _axises[index];
    }

    PropertyVector &operator=(const PropertyVector &src) {
        if (this != &src) {
            _length = src.size();
            for (size_t i = 0; i < N; i++) {
                _allocated[i] = src._allocated[i];
                if (_allocated[i]) {
                    _axises[i] = src[i];
                }
            }
        }
        return *this;
    }

    bool operator==(const PropertyVector& src) const {
        if (this == &src) return true;
        if (_length != src.size()) return false;
        for (size_t i = 0; i < N; i++)
            if ((_allocated[i] != src._allocated[i]) ||
                (_allocated[i] && _axises[i] != src._axises[i])) return false;
        return true;
    }

    size_t size() const {
        return _length;
    }

    void insert(size_t axis, const T &val) {
        if (axis < N) {
            if (!_allocated[axis]) {
                _allocated[axis] = true;
                _length++;
            }
            _axises[axis] = val;
        } else {
            THROW_IE_EXCEPTION << "Layer Property insertion at(axis) should be in [0," << N << ")";
        }
    }

    void remove(size_t axis) {
        if (axis < N && _allocated[axis]) {
            _allocated[axis] = false;
            _length--;
        }
    }

    void clear() {
        for (int i = 0; i != N; i++) {
            _allocated[i] = 0;
        }
        _length = 0u;
    }

    bool exist(size_t axis) const {
        return (axis < N && _allocated[axis]);
    }
};

} // namespace InferenceEngine
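In short, `PropertyVector` is a fixed-capacity map from axis index to value with per-slot occupancy tracking, not a resizable array. A small sketch of its behavior (standalone toy; `THROW_IE_EXCEPTION` comes from the usual IE common headers):

```cpp
#include <inference_engine.hpp>

using InferenceEngine::PropertyVector;
using InferenceEngine::X_AXIS;
using InferenceEngine::Y_AXIS;

void PropertyVectorDemo() {
    PropertyVector<unsigned int> kernel(2, 0u);  // axes 0 and 1 allocated, both 0
    kernel[X_AXIS] = 3;           // writes require an already-allocated slot
    kernel[Y_AXIS] = 3;
    kernel.insert(2, 1);          // allocates a third (Z) slot on demand
    // kernel.size() == 3, kernel.exist(5) == false;
    // kernel[5] would throw because slot 5 was never allocated.
    kernel.remove(2);             // size() drops back to 2
}
```
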
354
inference-engine/include/ie_parallel.hpp
Normal file
@@ -0,0 +1,354 @@
// Copyright (C) 2018 Intel Corporation
//
// SPDX-License-Identifier: Apache-2.0
//

/**
 * @brief Contains declarations and definitions for sequential and multi-threading implementations.
 * Multi-threading support is implemented in two variants: using the Threading Building Blocks library and OpenMP* product.
 * To build a particular implementation, use the corresponding identifier: IE_THREAD_TBB, IE_THREAD_OMP or IE_THREAD_SEQ.
 * @file ie_parallel.hpp
 */

#pragma once

#define IE_THREAD_TBB 0
#define IE_THREAD_OMP 1
#define IE_THREAD_SEQ 2

#if IE_THREAD == IE_THREAD_TBB
#include "tbb/parallel_for.h"
#include "tbb/task_arena.h"

#include "tbb/parallel_reduce.h"
#include "tbb/blocked_range.h"
#include "tbb/blocked_range2d.h"

inline int parallel_get_max_threads() { return tbb::this_task_arena::max_concurrency(); }
inline int parallel_get_num_threads() { return parallel_get_max_threads(); }
inline int parallel_get_thread_num() { return tbb::this_task_arena::current_thread_index(); }
inline void parallel_set_num_threads(int n) { return; }

#elif IE_THREAD == IE_THREAD_OMP
#include <omp.h>
/* MSVC still supports omp 2.0 only */
#if defined(_MSC_VER) && !defined(__INTEL_COMPILER)
# define collapse(x)
#endif // defined(_MSC_VER) && !defined(__INTEL_COMPILER)
inline int parallel_get_max_threads() { return omp_get_max_threads(); }
inline int parallel_get_num_threads() { return omp_get_num_threads(); }
inline int parallel_get_thread_num() { return omp_get_thread_num(); }
inline void parallel_set_num_threads(int n) { omp_set_num_threads(n); }

#elif IE_THREAD == IE_THREAD_SEQ
inline int parallel_get_max_threads() { return 1; }
inline int parallel_get_num_threads() { return 1; }
inline int parallel_get_thread_num() { return 0; }
inline void parallel_set_num_threads(int n) { return; }
#endif


namespace InferenceEngine {

template <typename F>
void parallel_nt(int nthr, F func) {
#if IE_THREAD == IE_THREAD_TBB
    if (nthr == 0) nthr = parallel_get_max_threads();
    if (nthr == 1) {
        func(0, 1);
        return;
    }

    tbb::parallel_for(0, nthr, [&](int ithr) {
        func(ithr, nthr);
    });
#elif IE_THREAD == IE_THREAD_OMP
    if (nthr == 1) {
        func(0, 1);
        return;
    }

# pragma omp parallel num_threads(nthr)
    func(parallel_get_thread_num(), parallel_get_num_threads());
#elif IE_THREAD == IE_THREAD_SEQ
    func(0, 1);
#endif
}

template <typename T0, typename R, typename F>
R parallel_sum(const T0 D0, R &input, F func) {
#if IE_THREAD == IE_THREAD_TBB
    return tbb::parallel_reduce(
        tbb::blocked_range<T0>(0, D0), input,
        [&](const tbb::blocked_range<T0>& r, R init)->R {
            R sum = init;
            for (T0 dim1 = r.begin(); dim1 < r.end(); ++dim1)
                sum += func(dim1);
            return sum;
        },
        [](R x, R y)->R {
            return x + y;
        });
#else
    R sum = input;
#if IE_THREAD == IE_THREAD_OMP
    #pragma omp parallel for reduction(+ : sum) schedule(static)
#endif
    for (T0 dim1 = 0; dim1 < D0; dim1++) {
        sum += func(dim1);
    }
    return sum;
#endif
}

template <typename T0, typename T1, typename R, typename F>
R parallel_sum2d(const T0 D0, const T1 D1, R input, F func) {
#if IE_THREAD == IE_THREAD_TBB
    return tbb::parallel_reduce(
        tbb::blocked_range2d<T0, T1>(0, D0, 0, D1), input,
        [&](const tbb::blocked_range2d<T0, T1>& r, R init)->R {
            R sum = init;
            for (T0 dim2 = r.rows().begin(); dim2 < r.rows().end(); dim2++) {
                for (T1 dim1 = r.cols().begin(); dim1 < r.cols().end(); dim1++) {
                    sum += func(dim2, dim1);
                }
            }
            return sum;
        },
        [](R x, R y)->R {
            return x + y;
        });
#else
    R sum = input;
#if IE_THREAD == IE_THREAD_OMP
    #pragma omp parallel for collapse(2) reduction(+ : sum) schedule(static)
#endif
    for (T0 dim2 = 0; dim2 < D0; dim2++) {
        for (T1 dim1 = 0; dim1 < D1; dim1++) {
            sum += func(dim2, dim1);
        }
    }
    return sum;
#endif
}

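To make the reduction helpers' contract concrete, a minimal caller might look like this (not part of the commit; building with `-DIE_THREAD=2` selects the sequential backend if neither TBB nor OpenMP is available):

```cpp
#include <vector>
#include <ie_parallel.hpp>

float SumOfSquares(const std::vector<float> &v) {
    float zero = 0.0f;  // passed by reference as the reduction identity
    // Each worker accumulates func(i) over its sub-range of [0, v.size());
    // partial sums are then combined with operator+.
    return InferenceEngine::parallel_sum(v.size(), zero, [&](size_t i) {
        return v[i] * v[i];
    });
}
```
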
template<typename T>
inline T parallel_it_init(T start) { return start; }
template<typename T, typename Q, typename R, typename... Args>
inline T parallel_it_init(T start, Q &x, const R &X, Args &&... tuple) {
    start = parallel_it_init(start, static_cast<Args>(tuple)...);
    x = start % X;
    return start / X;
}

inline bool parallel_it_step() { return true; }
template<typename Q, typename R, typename... Args>
inline bool parallel_it_step(Q &x, const R &X, Args &&... tuple) {
    if (parallel_it_step(static_cast<Args>(tuple)...)) {
        x = (x + 1) % X;
        return x == 0;
    }
    return false;
}

template <typename T, typename Q>
inline void splitter(T n, Q team, Q tid, T &n_start, T &n_end) {
    if (team <= 1 || n == 0) {
        n_start = 0;
        n_end = n;
    } else {
        T n1 = (n + (T)team - 1) / (T)team;
        T n2 = n1 - 1;
        T T1 = n - n2 * (T)team;
        n_end = (T)tid < T1 ? n1 : n2;
        n_start = (T)tid <= T1 ? tid * n1 : T1 * n1 + ((T)tid - T1) * n2;
    }

    n_end += n_start;
}

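The `parallel_it_*` pair decodes a flat work-item index into per-dimension counters (innermost dimension fastest) and then advances them like an odometer; `splitter` hands each thread a near-equal contiguous range of those flat indices. A standalone trace of the decoding (sizes and the starting index are invented):

```cpp
#include <cstdio>
#include <ie_parallel.hpp>

int main() {
    const int D0 = 2, D1 = 3;  // a 2x3 iteration space: 6 flat work items
    int d0 = 0, d1 = 0;
    size_t start = 4;          // pretend this thread's range begins at item 4
    // Row-major unpack: d1 = 4 % 3 = 1, then d0 = (4 / 3) % 2 = 1.
    InferenceEngine::parallel_it_init(start, d0, D0, d1, D1);
    std::printf("(%d,%d)\n", d0, d1);                    // prints (1,1)
    InferenceEngine::parallel_it_step(d0, D0, d1, D1);   // odometer tick
    std::printf("(%d,%d)\n", d0, d1);                    // prints (1,2)
    return 0;
}
```
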
template <typename T0, typename F>
void for_1d(const int ithr, const int nthr, const T0 &D0, F func) {
    T0 d0{ 0 }, end{ 0 };
    splitter(D0, nthr, ithr, d0, end);
    for (; d0 < end; ++d0) func(d0);
}

template <typename T0, typename F>
void parallel_for(const T0 &D0, F func) {
#if IE_THREAD == IE_THREAD_TBB
    const int nthr = parallel_get_max_threads();
    tbb::parallel_for(0, nthr, [&](int ithr) {
        for_1d(ithr, nthr, D0, func);
    });
#elif IE_THREAD == IE_THREAD_OMP
# pragma omp parallel
    for_1d(parallel_get_thread_num(), parallel_get_num_threads(), D0, func);
#elif IE_THREAD == IE_THREAD_SEQ
    for_1d(0, 1, D0, func);
#endif
}

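And the 1-D entry point in use (a sketch; with TBB or OpenMP enabled the lambda must be safe to run concurrently):

```cpp
#include <vector>
#include <ie_parallel.hpp>

void ReluInPlace(std::vector<float> &data) {
    // splitter() gives each thread one contiguous [begin, end) chunk, so the
    // scheduling cost is paid once per thread rather than once per element.
    InferenceEngine::parallel_for(data.size(), [&](size_t i) {
        data[i] = data[i] > 0.0f ? data[i] : 0.0f;
    });
}
```
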
template <typename T0, typename T1, typename F>
void for_2d(const int ithr, const int nthr, const T0 &D0, const T1 &D1, F func) {
    const size_t work_amount = (size_t)D0 * D1;
    if (work_amount == 0) return;
    size_t start{ 0 }, end{ 0 };
    splitter(work_amount, nthr, ithr, start, end);

    T0 d0{ 0 }; T1 d1{ 0 };
    parallel_it_init(start, d0, D0, d1, D1);
    for (size_t iwork = start; iwork < end; ++iwork) {
        func(d0, d1);
        parallel_it_step(d0, D0, d1, D1);
    }
}

template <typename T0, typename T1, typename F>
void parallel_for2d(const T0 &D0, const T1 &D1, F func) {
#if IE_THREAD == IE_THREAD_TBB
    const int nthr = parallel_get_max_threads();
    tbb::parallel_for(0, nthr, [&](int ithr) {
        for_2d(ithr, nthr, D0, D1, func);
    });
#elif IE_THREAD == IE_THREAD_OMP
# pragma omp parallel
    for_2d(parallel_get_thread_num(), parallel_get_num_threads(), D0, D1, func);
#elif IE_THREAD == IE_THREAD_SEQ
    for_2d(0, 1, D0, D1, func);
#endif
}


template <typename T0, typename T1, typename T2, typename F>
void for_3d(const int ithr, const int nthr, const T0 &D0, const T1 &D1,
            const T2 &D2, F func) {
    const size_t work_amount = (size_t)D0 * D1 * D2;
    if (work_amount == 0) return;
    size_t start{ 0 }, end{ 0 };
    splitter(work_amount, nthr, ithr, start, end);

    T0 d0{ 0 }; T1 d1{ 0 }; T2 d2{ 0 };
    parallel_it_init(start, d0, D0, d1, D1, d2, D2);
    for (size_t iwork = start; iwork < end; ++iwork) {
        func(d0, d1, d2);
        parallel_it_step(d0, D0, d1, D1, d2, D2);
    }
}

template <typename T0, typename T1, typename T2, typename F>
void parallel_for3d(const T0 &D0, const T1 &D1, const T2 &D2, F func) {
#if IE_THREAD == IE_THREAD_TBB
    const int nthr = parallel_get_max_threads();
    tbb::parallel_for(0, nthr, [&](int ithr) {
        for_3d(ithr, nthr, D0, D1, D2, func);
    });
#elif IE_THREAD == IE_THREAD_OMP
# pragma omp parallel
    for_3d(parallel_get_thread_num(), parallel_get_num_threads(), D0, D1, D2, func);
#elif IE_THREAD == IE_THREAD_SEQ
    for_3d(0, 1, D0, D1, D2, func);
#endif
}

template <typename T0, typename T1, typename T2, typename T3, typename F>
void for_4d(const int ithr, const int nthr, const T0 &D0, const T1 &D1,
            const T2 &D2, const T3 &D3, F func) {
    const size_t work_amount = (size_t)D0 * D1 * D2 * D3;
    if (work_amount == 0) return;
    size_t start{ 0 }, end{ 0 };
    splitter(work_amount, nthr, ithr, start, end);

    T0 d0{ 0 }; T1 d1{ 0 }; T2 d2{ 0 }; T3 d3{ 0 };
    parallel_it_init(start, d0, D0, d1, D1, d2, D2, d3, D3);
    for (size_t iwork = start; iwork < end; ++iwork) {
        func(d0, d1, d2, d3);
        parallel_it_step(d0, D0, d1, D1, d2, D2, d3, D3);
    }
}

template <typename T0, typename T1, typename T2, typename T3, typename F>
void parallel_for4d(const T0 &D0, const T1 &D1, const T2 &D2, const T3 &D3, F func) {
#if IE_THREAD == IE_THREAD_TBB
    const int nthr = parallel_get_max_threads();
    tbb::parallel_for(0, nthr, [&](int ithr) {
        for_4d(ithr, nthr, D0, D1, D2, D3, func);
    });
#elif IE_THREAD == IE_THREAD_OMP
# pragma omp parallel
    for_4d(parallel_get_thread_num(), parallel_get_num_threads(), D0, D1, D2, D3, func);
#elif IE_THREAD == IE_THREAD_SEQ
    for_4d(0, 1, D0, D1, D2, D3, func);
#endif
}

template <typename T0, typename T1, typename T2, typename T3, typename T4, typename F>
void for_5d(const int ithr, const int nthr, const T0 &D0, const T1 &D1,
            const T2 &D2, const T3 &D3, const T4 &D4, F func) {
    const size_t work_amount = (size_t)D0 * D1 * D2 * D3 * D4;
    if (work_amount == 0) return;
    size_t start{ 0 }, end{ 0 };
    splitter(work_amount, nthr, ithr, start, end);

    T0 d0{ 0 }; T1 d1{ 0 }; T2 d2{ 0 }; T3 d3{ 0 }; T4 d4{ 0 };
    parallel_it_init(start, d0, D0, d1, D1, d2, D2, d3, D3, d4, D4);
    for (size_t iwork = start; iwork < end; ++iwork) {
        func(d0, d1, d2, d3, d4);
        parallel_it_step(d0, D0, d1, D1, d2, D2, d3, D3, d4, D4);
    }
}

template <typename T0, typename T1, typename T2, typename T3, typename T4, typename F>
void parallel_for5d(const T0 &D0, const T1 &D1, const T2 &D2, const T3 &D3,
                    const T4 &D4, F func) {
#if IE_THREAD == IE_THREAD_TBB
    const int nthr = parallel_get_max_threads();
    tbb::parallel_for(0, nthr, [&](int ithr) {
        for_5d(ithr, nthr, D0, D1, D2, D3, D4, func);
    });
#elif IE_THREAD == IE_THREAD_OMP
# pragma omp parallel
    for_5d(parallel_get_thread_num(), parallel_get_num_threads(), D0, D1, D2, D3, D4, func);
#elif IE_THREAD == IE_THREAD_SEQ
    for_5d(0, 1, D0, D1, D2, D3, D4, func);
#endif
}


template <typename T0, typename T1, typename T2, typename T3, typename T4, typename T5, typename F>
void for_6d(const int ithr, const int nthr, const T0 &D0, const T1 &D1,
            const T2 &D2, const T3 &D3, const T4 &D4, const T5 &D5, F func) {
    const size_t work_amount = (size_t)D0 * D1 * D2 * D3 * D4 * D5;
    if (work_amount == 0) return;
    size_t start{ 0 }, end{ 0 };
    splitter(work_amount, nthr, ithr, start, end);

    T0 d0{ 0 }; T1 d1{ 0 }; T2 d2{ 0 }; T3 d3{ 0 }; T4 d4{ 0 }; T5 d5{ 0 };
    parallel_it_init(start, d0, D0, d1, D1, d2, D2, d3, D3, d4, D4,
                     d5, D5);
    for (size_t iwork = start; iwork < end; ++iwork) {
        func(d0, d1, d2, d3, d4, d5);
        parallel_it_step(d0, D0, d1, D1, d2, D2, d3, D3, d4, D4, d5, D5);
    }
}

template <typename T0, typename T1, typename T2, typename T3, typename T4, typename T5, typename F>
void parallel_for6d(const T0 &D0, const T1 &D1, const T2 &D2, const T3 &D3,
                    const T4 &D4, const T5 &D5, F func) {
#if IE_THREAD == IE_THREAD_TBB
    const int nthr = parallel_get_max_threads();
    tbb::parallel_for(0, nthr, [&](int ithr) {
        for_6d(ithr, nthr, D0, D1, D2, D3, D4, D5, func);
    });
#elif IE_THREAD == IE_THREAD_OMP
# pragma omp parallel
    for_6d(parallel_get_thread_num(), parallel_get_num_threads(), D0, D1, D2, D3, D4, D5, func);
#elif IE_THREAD == IE_THREAD_SEQ
    for_6d(0, 1, D0, D1, D2, D3, D4, D5, func);
#endif
}

} // namespace InferenceEngine

@@ -21,6 +21,8 @@
#include <ie_device.hpp>
#include <ie_plugin_dispatcher.hpp>
#include <ie_plugin_config.hpp>
#include <ie_icnn_network.hpp>
#include <ie_icnn_network_stats.hpp>
#include <cpp/ie_cnn_net_reader.h>
#include <cpp/ie_plugin_cpp.hpp>
#include <cpp/ie_executable_network.hpp>
@@ -177,5 +179,4 @@ void copyToFloat(float *dst, const InferenceEngine::Blob *src) {
    for (size_t i = 0; i < t_blob->size(); i++) dst[i] = srcPtr[i];
}

} // namespace InferenceEngine

@@ -1,67 +0,0 @@
// Copyright (C) 2018 Intel Corporation
//
// SPDX-License-Identifier: Apache-2.0
//

/**
 * @brief A header file for the main MKL-DNN Extension API
 * @file mkldnn_extension.hpp
 */
#pragma once

#include <ie_iextension.h>

#include "mkldnn_generic_primitive.hpp"

namespace InferenceEngine {
namespace MKLDNNPlugin {

/**
 * @deprecated use new extensibility API
 * @brief The IMKLDNNExtension class provides the main extension interface
 */
class IMKLDNNExtension : public IExtension {
public:
    /**
     * @brief Creates a generic layer and returns a pointer to an instance
     * @param primitive Pointer to newly created layer
     * @param layer Layer parameters (source for name, type, precision, attr, weights...)
     * @param resp Optional: pointer to an already allocated object to contain information in case of failure
     * @return Status code of the operation: OK (0) for success
     */
    virtual InferenceEngine::StatusCode CreateGenericPrimitive(IMKLDNNGenericPrimitive*& primitive,
                                                               const InferenceEngine::CNNLayerPtr& layer,
                                                               InferenceEngine::ResponseDesc *resp) const noexcept = 0;
    /**
     * @brief This method isn't implemented for the old API
     */
    StatusCode getPrimitiveTypes(char**& types, unsigned int& size, ResponseDesc* resp) noexcept override {
        return NOT_IMPLEMENTED;
    }
    /**
     * @brief This method isn't implemented for the old API
     */
    StatusCode getFactoryFor(ILayerImplFactory *&factory, const CNNLayer *cnnLayer, ResponseDesc *resp) noexcept override {
        return NOT_IMPLEMENTED;
    }

    /**
     * @brief Gets shape propagation implementation for the given string-type of cnn Layer
     * @param impl the vector with implementations which is ordered by priority
     * @param resp response descriptor
     * @return status code
     */
    StatusCode getShapeInferImpl(IShapeInferImpl::Ptr& impl, const char* type, ResponseDesc* resp) noexcept override {
        return NOT_IMPLEMENTED;
    }
};

/**
 * @deprecated use new extensibility API
 * @brief Creates the default instance of the extension
 * @return The MKL-DNN Extension interface
 */
INFERENCE_EXTENSION_API(StatusCode) CreateMKLDNNExtension(IMKLDNNExtension*& ext, ResponseDesc* resp) noexcept;

} // namespace MKLDNNPlugin
} // namespace InferenceEngine
@@ -1,138 +0,0 @@
// Copyright (C) 2018 Intel Corporation
//
// SPDX-License-Identifier: Apache-2.0
//

/**
 * @brief A header file that defines a wrapper class for handling extension instantiation and releasing resources
 * @file mkldnn_extension_ptr.hpp
 */
#pragma once

#include "details/ie_so_pointer.hpp"
#include "mkldnn/mkldnn_extension.hpp"
#include <string>
#include <memory>

namespace InferenceEngine {
namespace details {

/**
 * @deprecated use new extensibility API
 * @brief The SOCreatorTrait class defines the name of the factory
 * for creating MKLDNNPlugin::IMKLDNNExtension object in DLL
 */
template<>
class SOCreatorTrait<MKLDNNPlugin::IMKLDNNExtension> {
public:
    /**
     * @brief The name of the factory for creating an MKLDNNPlugin::IMKLDNNExtension object in DLL
     */
    static constexpr auto name = "CreateMKLDNNExtension";
};

} // namespace details

namespace MKLDNNPlugin {

/**
 * @deprecated use new extensibility API
 * @brief This class is a C++ helper to work with objects created using extensions.
 * Implements different interfaces.
 */
class MKLDNNExtension : public MKLDNNPlugin::IMKLDNNExtension {
public:
    /**
     * @brief Loads extension from a shared library
     * @param name Logical name of the extension library (soname without .dll/.so/lib prefix)
     */
    explicit MKLDNNExtension(const std::string &name)
        : actual(name) {}

    /**
     * @brief Creates a generic layer and returns a pointer to an instance
     * @param primitive Pointer to a newly created layer
     * @param layer Layer parameters (source for name, type, precision, attr, weights...)
     * @param resp Optional: pointer to an already allocated object to contain information in case of failure
     * @return Status code of the operation: OK (0) for success
     */
    InferenceEngine::StatusCode CreateGenericPrimitive(IMKLDNNGenericPrimitive *&primitive,
                                                       const InferenceEngine::CNNLayerPtr &layer,
                                                       InferenceEngine::ResponseDesc *resp) const noexcept override {
        return actual->CreateGenericPrimitive(primitive, layer, resp);
    }

    /**
     * @brief This method isn't implemented for the old API
     */
    InferenceEngine::StatusCode getPrimitiveTypes(char**& types, unsigned int& size,
                                                  InferenceEngine::ResponseDesc* resp) noexcept override {
        return actual->getPrimitiveTypes(types, size, resp);
    }

    /**
     * @brief This method isn't implemented for the old API
     */
    InferenceEngine::StatusCode getFactoryFor(InferenceEngine::ILayerImplFactory *&factory,
                                              const InferenceEngine::CNNLayer *cnnLayer,
                                              InferenceEngine::ResponseDesc *resp) noexcept override {
        return actual->getFactoryFor(factory, cnnLayer, resp);
    }

    /**
     * @brief This method isn't implemented for the old API
     */
    InferenceEngine::StatusCode getShapeInferImpl(InferenceEngine::IShapeInferImpl::Ptr& impl, const char* type,
                                                  InferenceEngine::ResponseDesc* resp) noexcept override {
        return actual->getShapeInferImpl(impl, type, resp);
    }

    /**
     * @brief Gets the extension version information
     * @param versionInfo A pointer to version info, set by plugin
     */
    void GetVersion(const InferenceEngine::Version *&versionInfo) const noexcept override {
        actual->GetVersion(versionInfo);
    }

    /**
     * @brief Sets a log callback that is used to track what is going on inside
     * @param listener Logging listener
     */
    void SetLogCallback(InferenceEngine::IErrorListener &listener) noexcept override {
        actual->SetLogCallback(listener);
    }

    /**
     * @brief Cleans the resources up
     */
    void Unload() noexcept override {
        actual->Unload();
    }

    /**
     * @brief Does nothing since destruction is done via regular mechanism
     */
    void Release() noexcept override {}

protected:
    /**
     * @brief An SOPointer instance to the loaded templated object
     */
    InferenceEngine::details::SOPointer<MKLDNNPlugin::IMKLDNNExtension> actual;
};
} // namespace MKLDNNPlugin

/**
 * @deprecated use new extensibility API
 * @brief Creates a special shared_pointer wrapper for the given type from a specific shared module
 * @param name Name of the shared library file
 * @return shared_pointer A wrapper for the given type from a specific shared module
 */
template<>
inline std::shared_ptr<MKLDNNPlugin::IMKLDNNExtension> make_so_pointer(const std::string &name) {
    return std::make_shared<MKLDNNPlugin::MKLDNNExtension>(name);
}

} // namespace InferenceEngine
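For context, the `SOCreatorTrait` specialization above is what made the one-line loading idiom below possible; this sketch shows the removed API as it was meant to be used (the library name is invented):

```cpp
#include <string>
#include "mkldnn/mkldnn_extension_ptr.hpp"

void LoadOldStyleExtension() {
    // Resolves the "CreateMKLDNNExtension" factory symbol named by
    // SOCreatorTrait and keeps the shared object loaded while in use.
    std::shared_ptr<InferenceEngine::MKLDNNPlugin::IMKLDNNExtension> ext =
        InferenceEngine::make_so_pointer<
            InferenceEngine::MKLDNNPlugin::IMKLDNNExtension>("my_mkldnn_extension");
    const InferenceEngine::Version *version = nullptr;
    ext->GetVersion(version);
}
```
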
@@ -1,200 +0,0 @@
// Copyright (C) 2018 Intel Corporation
//
// SPDX-License-Identifier: Apache-2.0
//

/**
 * @brief A header file for the main MKL-DNN Extension API to work with weights and primitives in memory
 * @file mkldnn_extension_types.hpp
 */
#pragma once

#include "ie_common.h"
#include "ie_precision.hpp"

namespace InferenceEngine {
namespace MKLDNNPlugin {

/**
 * @deprecated use new extensibility API
 * @brief Defines formats from MKL-DNN which are supported in the MKL-DNN plugin of IE.
 */
enum MemoryFormat {
    /** Undefined memory format, used for empty memory descriptors. */
    format_undef = 0,
    /** Unspecified format. The primitive selects a format
     * automatically. */
    any,
    /** A tensor in a generic format described by the stride and blocking
     * values in each dimension. See #mkldnn_blocking_desc_t for more
     * information. */
    blocked,
    /** 1D data tensor. */
    x,
    /** 2D data tensor. */
    nc,
    /** 4D data tensor in the @c nchw format typically used in Caffe. */
    nchw,
    /** 4D data tensor in the @c nhwc format typically used in TensorFlow. */
    nhwc,
    /** 4D data tensor in the @c chwn format typically used in Neon. */
    chwn,
    /** 4D data tensor in the @c nchw format with channels data laid out in
     * memory in 8-element blocks. */
    nChw8c,
    /** 4D data tensor in the @c nchw format with channels data laid out in
     * memory in 16-element blocks. */
    nChw16c,
    /** 2D weights tensor in the format (input channels, output channels). */
    oi,
    /** 2D weights tensor in the format (input channels, output channels). */
    io,
    /** 4D weights tensor in the format (input channels, output channels,
     * width, height). */
    oihw,
    /** 4D weights tensor in the format (input channels, height, width,
     * output channels). */
    ihwo,
    /** 4D weights tensor in the format (height, width, input channels,
     * output channels). */
    hwio,
    /** 4D weights tensor in the @c oihw format with both input and output
     * channels data laid out in memory in 8-element blocks. */
    OIhw8i8o,
    /** 4D weights tensor in the @c oihw format with both input and output
     * channels data laid out in memory in 16-element blocks. */
    OIhw16i16o,
    /** 4D weights tensor in the @c oihw format with output channels data
     * laid out in memory in 16-element blocks and input channels data
     * laid out in memory in 8-element blocks blocked by pairs. */
    OIhw8i16o2i,
    /** 4D weights tensor in the @c oihw format with input channels data
     * laid out in memory in 16-element blocks and output channels data
     * laid out in memory in 8-element blocks blocked by pairs. */
    OIhw8o16i2o,
    /** 4D weights tensor in the @c oihw format with both input and output
     * channels data laid out in memory in 8-element blocks. */
    OIhw8o8i,
    /** 4D weights tensor in the @c oihw format with both input and output
     * channels data laid out in memory in 16-element blocks. */
    OIhw16o16i,
    /** 4D weights tensor in the format (output channels, input channels,
     * height, width) with output channels data laid out in memory in 8-element
     * blocks. */
    Oihw8o,
    /** 4D weights tensor in the format (output channels, input channels,
     * height, width) with output channels data laid out in memory in
     * 16-element blocks. */
    Oihw16o,
    /** 4D weights tensor in the format (output channels, width, height, input
     * channels) with output channels data laid out in memory in 8-element
     * blocks. */
    Ohwi8o,
    /** 4D weights tensor in the format (output channels, width, height, input
     * channels) with output channels data laid out in memory in 16-element
     * blocks. */
    Ohwi16o,
    /** 4D weights tensor in the @c oihw format with both input and output
     * channels data laid out in memory in 16-element and 4-element blocks. */
    OhIw16o4i,
    /** 5D weights tensor in the @c oihw format with extra outer dimension for
     * groups. */
    goihw,
    /** 5D weights tensor in the blocked version of @c goihw format with both
     * input and output channels data laid out in memory in 8-element blocks.
     */
    gOIhw8i8o,
    /** 5D weights tensor in the blocked version of @c goihw format with both
     * input and output channels data laid out in memory in 16-element blocks.
     */
    gOIhw16i16o,
    /** 5D weights tensor in the @c oihw format with output channels data
     * laid out in memory in 16-element blocks and input channels data
     * laid out in memory in 8-element blocks blocked by pairs. */
    gOIhw8i16o2i,
    /** 5D weights tensor in the @c oihw format with input channels data
     * laid out in memory in 16-element blocks and output channels data
     * laid out in memory in 8-element blocks blocked by pairs. */
    gOIhw8o16i2o,
    /** 5D weights tensor in the blocked version of @c goihw format with both
     * input and output channels data laid out in memory in 8-element blocks.
     */
    gOIhw8o8i,
    /** 5D weights tensor in the blocked version of @c goihw format with both
     * input and output channels data laid out in memory in 16-element blocks.
     */
    gOIhw16o16i,
    /** 5D weights tensor in the blocked version of @c goihw format with output
     * channels data laid out in memory in 8-element blocks. */
    gOihw8o,
    /** 5D weights tensor in the blocked version of @c goihw format with output
     * channels data laid out in memory in 16-element blocks. */
    gOihw16o,
    /** 5D weights tensor in the blocked version of @c goihw format with output
     * channels data laid out in memory in 8-element blocks. */
    gOhwi8o,
    /** 5D weights tensor in the blocked version of @c goihw format with output
     * channels data laid out in memory in 16-element blocks. */
    gOhwi16o,
    /** 5D weights tensor in the @c goihw format with both input and output
     * channels data laid out in memory in 16-element and 4-element blocks. */
    gOhIw16o4i,
    /** 4D weights tensor in the oihw format with input channels data laid out
     * in memory in 8-element blocks. */
    oIhw8i = nChw8c,
    /** 4D weights tensor in the oihw format with input channels data laid out
     * in memory in 16-element blocks. */
    oIhw16i = nChw16c,
};

/**
 * @deprecated use new extensibility API
 * @brief Stores necessary information about the primitive memory object,
 * such as precision, dimensions, memory format etc.
 */
struct MKLDNNPrimitiveMemory {
    /**
     * @brief precision type
     */
    Precision precision;
    /**
     * @brief dimensions of the given primitive
     */
    SizeVector dims;
    /**
     * @brief memory format of the given primitive
     */
    MemoryFormat format;
    /**
     * @brief primitive data stored
     */
    void *data;

    /**
     * @brief A constructor.
     */
    MKLDNNPrimitiveMemory() : format(format_undef), data(nullptr) {}
};

/**
 * @deprecated use new extensibility API
 * @brief Stores necessary information about the primitive weights.
 */
struct MKLDNNWeightsMemory {
    /**
     * @brief size of weights
     */
    size_t size;
    /**
     * @brief pointer to weights data
     */
    void *data;

    /**
     * @brief A constructor.
     */
    MKLDNNWeightsMemory() : size(0), data(nullptr) {}
};

} // namespace MKLDNNPlugin
} // namespace InferenceEngine
@@ -1,123 +0,0 @@
// Copyright (C) 2018 Intel Corporation
//
// SPDX-License-Identifier: Apache-2.0
//

/**
 * @brief A header file for the MKL-DNN Generic Primitive API
 * @file mkldnn_generic_primitive.hpp
 */
#pragma once

#include "mkldnn_extension_types.hpp"
#include "details/ie_irelease.hpp"
#include <vector>

namespace InferenceEngine {
namespace MKLDNNPlugin {

/**
 * @deprecated use new extensibility API
 * @brief The MKLDNNGenericFormats stores weights, biases, inputs and outputs of the primitive
 */
class MKLDNNGenericFormats {
public:
    /**
     * @brief A default constructor
     * @param ins - vector of inputs
     * @param outs - vector of outputs
     * @param weights - weights, format_undef by default
     * @param biases - biases, format_undef by default
     */
    MKLDNNGenericFormats(const std::vector<MemoryFormat> &ins, const std::vector<MemoryFormat> &outs,
                         const MemoryFormat weights = MemoryFormat::format_undef,
                         const MemoryFormat biases = MemoryFormat::format_undef) : inputs(ins), outputs(outs) {
        this->weights = weights;
        this->biases = biases;
    }

    /**
     * @brief Get input formats
     * @return vector of input formats
     */
    const std::vector<MemoryFormat>& GetInputs() const noexcept {
        return inputs;
    }

    /**
     * @brief Get output formats
     * @return vector of output formats
     */
    const std::vector<MemoryFormat>& GetOutputs() const noexcept {
        return outputs;
    }

    /**
     * @brief Get weights format
     * @return weights format
     */
    const MemoryFormat& GetWeights() const noexcept {
        return weights;
    }

    /**
     * @brief Get biases format
     * @return biases format
     */
    const MemoryFormat& GetBiases() const noexcept {
        return biases;
    }

private:
    std::vector<MemoryFormat> inputs;
    std::vector<MemoryFormat> outputs;
    MemoryFormat weights;
    MemoryFormat biases;
};

/**
 * @deprecated use new extensibility API
 * @brief The IMKLDNNGenericPrimitive is the main Generic Primitive interface
 */
class IMKLDNNGenericPrimitive : public InferenceEngine::details::IRelease {
public:
    void Release() noexcept override {
        delete this;
    }

    /**
     * @brief Sets inputs and outputs
     * @param inputs - vector of input primitives
     * @param outputs - vector of output primitives
     */
    void SetMemory(const std::vector<MKLDNNPrimitiveMemory>& inputs,
                   const std::vector<MKLDNNPrimitiveMemory>& outputs) noexcept {
        this->inputs = inputs;
        this->outputs = outputs;
    }

    /**
     * @brief Gets supported formats
     * @return vector of supported formats
     */
    virtual std::vector<MKLDNNGenericFormats> GetSupportedFormats() noexcept = 0;

    /**
     * @brief Entry point of actual execution of the primitive.
     * The error reporting mechanism is missing; static checks should be done in the constructor
     */
    virtual void Execute() noexcept = 0;

protected:
    /**
     * @brief Vector of input primitives
     */
    std::vector<MKLDNNPrimitiveMemory> inputs;
    /**
     * @brief Vector of output primitives
     */
    std::vector<MKLDNNPrimitiveMemory> outputs;
};

} // namespace MKLDNNPlugin
} // namespace InferenceEngine
@@ -5,14 +5,7 @@ cmake_minimum_required (VERSION 2.8)

project(Samples)

list (APPEND CMAKE_MODULE_PATH ${CMAKE_CURRENT_SOURCE_DIR}/cmake)

find_package(InferenceEngine 1.2)
if (NOT InferenceEngine_FOUND)
    message(FATAL_ERROR "")
endif()

if("${CMAKE_BUILD_TYPE}" STREQUAL "")
if (CMAKE_BUILD_TYPE STREQUAL "")
    message(STATUS "CMAKE_BUILD_TYPE not defined, 'Release' will be used")
    set(CMAKE_BUILD_TYPE "Release")
endif()
@@ -27,35 +20,35 @@ if (NOT(BIN_FOLDER))
    set (BIN_FOLDER ${ARCH})
endif()

if (NOT (IE_MAIN_SOURCE_DIR))
    set(NEED_EXTENSIONS TRUE)
    if (WIN32)
        set (IE_MAIN_SOURCE_DIR ${CMAKE_SOURCE_DIR}/../bin/)
    else()
        set (IE_MAIN_SOURCE_DIR ${CMAKE_CURRENT_BINARY_DIR})
    endif()
if (NOT(IE_MAIN_SOURCE_DIR))
    # in case samples are built outside of the IE repo
    set (IE_MAIN_SAMPLES_DIR ${CMAKE_CURRENT_BINARY_DIR})
else()
    # in case samples are built from the IE repo
    set (IE_MAIN_SAMPLES_DIR ${IE_MAIN_SOURCE_DIR})
endif()

if(NOT(UNIX))
    set (CMAKE_LIBRARY_OUTPUT_DIRECTORY ${IE_MAIN_SOURCE_DIR}/${BIN_FOLDER})
    set (CMAKE_LIBRARY_PATH ${IE_MAIN_SOURCE_DIR}/${BIN_FOLDER})
    set (CMAKE_ARCHIVE_OUTPUT_DIRECTORY ${IE_MAIN_SOURCE_DIR}/${BIN_FOLDER})
    set (CMAKE_COMPILE_PDB_OUTPUT_DIRECTORY ${IE_MAIN_SOURCE_DIR}/${BIN_FOLDER})
    set (CMAKE_PDB_OUTPUT_DIRECTORY ${IE_MAIN_SOURCE_DIR}/${BIN_FOLDER})
    set (CMAKE_RUNTIME_OUTPUT_DIRECTORY ${IE_MAIN_SOURCE_DIR}/${BIN_FOLDER})
    set (LIBRARY_OUTPUT_DIRECTORY ${IE_MAIN_SOURCE_DIR}/${BIN_FOLDER})
    set (CMAKE_LIBRARY_OUTPUT_DIRECTORY ${IE_MAIN_SAMPLES_DIR}/${BIN_FOLDER})
    set (CMAKE_LIBRARY_PATH ${IE_MAIN_SAMPLES_DIR}/${BIN_FOLDER})
    set (CMAKE_ARCHIVE_OUTPUT_DIRECTORY ${IE_MAIN_SAMPLES_DIR}/${BIN_FOLDER})
    set (CMAKE_COMPILE_PDB_OUTPUT_DIRECTORY ${IE_MAIN_SAMPLES_DIR}/${BIN_FOLDER})
    set (CMAKE_PDB_OUTPUT_DIRECTORY ${IE_MAIN_SAMPLES_DIR}/${BIN_FOLDER})
    set (CMAKE_RUNTIME_OUTPUT_DIRECTORY ${IE_MAIN_SAMPLES_DIR}/${BIN_FOLDER})
    set (LIBRARY_OUTPUT_DIRECTORY ${IE_MAIN_SAMPLES_DIR}/${BIN_FOLDER})
    set (LIBRARY_OUTPUT_PATH ${LIBRARY_OUTPUT_DIRECTORY}) # compatibility issue: linux uses LIBRARY_OUTPUT_PATH, windows uses LIBRARY_OUTPUT_DIRECTORY
else ()
    set (CMAKE_LIBRARY_OUTPUT_DIRECTORY ${IE_MAIN_SOURCE_DIR}/${BIN_FOLDER}/${CMAKE_BUILD_TYPE}/lib)
    set (CMAKE_ARCHIVE_OUTPUT_DIRECTORY ${IE_MAIN_SOURCE_DIR}/${BIN_FOLDER}/${CMAKE_BUILD_TYPE}/lib)
    set (CMAKE_COMPILE_PDB_OUTPUT_DIRECTORY ${IE_MAIN_SOURCE_DIR}/${BIN_FOLDER}/${CMAKE_BUILD_TYPE})
    set (CMAKE_PDB_OUTPUT_DIRECTORY ${IE_MAIN_SOURCE_DIR}/${BIN_FOLDER}/${CMAKE_BUILD_TYPE})
    set (CMAKE_RUNTIME_OUTPUT_DIRECTORY ${IE_MAIN_SOURCE_DIR}/${BIN_FOLDER}/${CMAKE_BUILD_TYPE})
    set (LIBRARY_OUTPUT_DIRECTORY ${IE_MAIN_SOURCE_DIR}/${BIN_FOLDER}/${CMAKE_BUILD_TYPE}/lib)
    set (CMAKE_LIBRARY_OUTPUT_DIRECTORY ${IE_MAIN_SAMPLES_DIR}/${BIN_FOLDER}/${CMAKE_BUILD_TYPE}/lib)
    set (CMAKE_ARCHIVE_OUTPUT_DIRECTORY ${IE_MAIN_SAMPLES_DIR}/${BIN_FOLDER}/${CMAKE_BUILD_TYPE}/lib)
    set (CMAKE_COMPILE_PDB_OUTPUT_DIRECTORY ${IE_MAIN_SAMPLES_DIR}/${BIN_FOLDER}/${CMAKE_BUILD_TYPE})
    set (CMAKE_PDB_OUTPUT_DIRECTORY ${IE_MAIN_SAMPLES_DIR}/${BIN_FOLDER}/${CMAKE_BUILD_TYPE})
    set (CMAKE_RUNTIME_OUTPUT_DIRECTORY ${IE_MAIN_SAMPLES_DIR}/${BIN_FOLDER}/${CMAKE_BUILD_TYPE})
    set (LIBRARY_OUTPUT_DIRECTORY ${IE_MAIN_SAMPLES_DIR}/${BIN_FOLDER}/${CMAKE_BUILD_TYPE}/lib)
    set (LIBRARY_OUTPUT_PATH ${LIBRARY_OUTPUT_DIRECTORY}/lib)
endif()

set(CMAKE_CXX_FLAGS "-std=c++11 ${CMAKE_CXX_FLAGS}")
find_package(InferenceEngine 1.4 REQUIRED)

if (WIN32)
    if(NOT "${CMAKE_SIZEOF_VOID_P}" EQUAL "8")
        message(FATAL_ERROR "Only 64-bit supported on Windows")
@@ -65,7 +58,7 @@ if (WIN32)
    set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -D_SCL_SECURE_NO_WARNINGS -DNOMINMAX")
    set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} /EHsc") #no asynchronous structured exception handling
    set(CMAKE_EXE_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS} /LARGEADDRESSAWARE")
    if (ENABLE_OMP)
    if (THREADING STREQUAL "OMP")
        find_package(OpenMP)
        if (OPENMP_FOUND)
            set (CMAKE_C_FLAGS "${CMAKE_C_FLAGS} ${OpenMP_C_FLAGS}")
@@ -81,8 +74,6 @@ else()
    endif()
endif()

include(feature_defs OPTIONAL)

####################################
## to use C++11
set (CMAKE_CXX_STANDARD 11)
@@ -105,28 +96,35 @@ if (UNIX)
    SET(LIB_DL dl)
endif()

# Find OpenCV libray if exists
# Find OpenCV library if exists
find_package(OpenCV)
include_directories(${OpenCV_INCLUDE_DIRS})
if(OpenCV_FOUND)
    include_directories(${OpenCV_INCLUDE_DIRS})
    add_definitions(-DUSE_OPENCV)
else()
    set (BUILD_VALIDATION_APP OFF)
    message(WARNING "No suitable OpenCV version detected, BUILD_VALIDATION_APP is set to OFF")
endif()

add_subdirectory(common/format_reader)

if (NEED_EXTENSIONS)
    add_subdirectory(extension)
endif()

####################################################
# SAMPLES list
####################################################
add_subdirectory(classification_sample)
add_subdirectory(classification_sample_async)
add_subdirectory(hello_autoresize_classification)
add_subdirectory(hello_classification)
add_subdirectory(hello_request_classification)
add_subdirectory(hello_shape_infer_ssd)
add_subdirectory(object_detection_sample_ssd)
add_subdirectory(style_transfer_sample)

if (OpenCV_FOUND)
    add_subdirectory(benchmark_app)
    add_subdirectory(calibration_tool)
    if (BUILD_VALIDATION_APP)
        add_subdirectory(validation_app)
    else()
        message(STATUS "Validation app build is switched off")
    endif()
####################################################

@@ -1,80 +0,0 @@
Inference Engine Samples {#SamplesOverview}
================

The Inference Engine sample applications are simple console applications that demonstrate how you can use Intel's Deep Learning Inference Engine in your applications.

The Deep Learning Inference Engine release package provides the following sample applications, available in the samples
directory in the Inference Engine installation directory:

 - [CPU Extensions](@ref CPUExtensions) library with topology-specific layers (like DetectionOutput used in the SSD*, below)
 - [Hello Autoresize Classification Sample](@ref InferenceEngineHelloAutoresizeClassificationSample) - Input of any size and layout can be set to an infer request and will be pre-processed automatically during inference (the sample supports only images as inputs)
 - [Hello Infer Request Classification Sample](@ref InferenceEngineHelloRequestClassificationSample) - Inference of image classification networks via the Infer Request API (the sample supports only images as inputs)
 - [Image Classification Sample](@ref InferenceEngineClassificationSampleApplication) - Inference of image classification networks like AlexNet and GoogLeNet (the sample supports only images as inputs)
 - [Image Classification Sample, pipelined](@ref InferenceEngineClassificationPipelinedSampleApplication) - Maximizes performance via pipelined execution (the sample supports only images as inputs)
 - [Neural Style Transfer Sample](@ref InferenceEngineNeuralStyleTransferSampleApplication) - Style Transfer sample (the sample supports only images as inputs)
 - [Object Detection for SSD Sample](@ref InferenceEngineObjectDetectionSSDSampleApplication) - Inference of object detection networks based on SSD; this sample is a simplified version that supports only images as inputs
 - [Validation App](@ref InferenceEngineValidationApp) - Infers a pack of images, reporting total accuracy (only images as inputs)

## <a name="build_samples_linux"></a> Building the Sample Applications on Linux*
The officially supported Linux build environment is the following:

* Ubuntu* 16.04 LTS 64-bit or CentOS* 7.4 64-bit
* GCC* 5.4.0 (for Ubuntu* 16.04) or GCC* 4.8.5 (for CentOS* 7.4)
* CMake* version 2.8 or higher
* OpenCV 3.3 or later (required for some samples)

<br>You can build the sample applications using the <i>CMake</i> file in the `samples` directory.

Create a new directory and change your current directory to the new one:
```sh
mkdir build
cd build
```
Run <i>CMake</i> to generate Make files:
```sh
cmake -DCMAKE_BUILD_TYPE=Release <path_to_inference_engine_samples_directory>
```

To build samples with debug information, use the following command:
```sh
cmake -DCMAKE_BUILD_TYPE=Debug <path_to_inference_engine_samples_directory>
```

Run <i>Make</i> to build the application:
```sh
make
```

For ease of reference, the Inference Engine installation folder is referred to as <code><INSTALL_DIR></code>.

After that you can find binaries for all sample applications in the <code>intel64/Release</code> subfolder.

## <a name="build_samples_windows"></a> Building the Sample Applications on Microsoft Windows* OS

The recommended Windows build environment is the following:
* Microsoft Windows* 10
* Microsoft* Visual Studio* 2015 (including Microsoft Visual Studio 2015 Community) or Microsoft Visual Studio 2017
* CMake* version 2.8 or later
* OpenCV* 3.3 or later


Generate a Microsoft Visual Studio solution file using the <code>create_msvc_solution.bat</code> file in the <code>samples</code> directory, then build the solution <code>samples\build\Samples.sln</code> in Microsoft Visual Studio 2015.

## Running the Sample Applications

Before running compiled binary files, make sure your application can find the Inference Engine libraries.
Use the `setvars.sh` script, which sets all necessary environment variables.

To do so, run (assuming that you are in the <code><INSTALL_DIR>/deployment_tools/inference_engine/bin/intel64/Release</code> folder):
<pre>
source ../../setvars.sh
</pre>

All that is left is to run the required sample with the appropriate command, providing IR information (typically with the "-m" command-line option).
Please note that the Inference Engine assumes that weights are in the same folder as the _.xml_ file.

## See Also
* [Introduction to Intel's Deep Learning Inference Engine](@ref Intro)

---
\* Other names and brands may be claimed as the property of others.
43
inference-engine/samples/benchmark_app/CMakeLists.txt
Normal file
@@ -0,0 +1,43 @@
# Copyright (c) 2018 Intel Corporation

# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at

#      http://www.apache.org/licenses/LICENSE-2.0

# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
cmake_minimum_required(VERSION 2.8)

set (TARGET_NAME "benchmark_app")

if( BUILD_SAMPLE_NAME AND NOT ${BUILD_SAMPLE_NAME} STREQUAL ${TARGET_NAME} )
    message(STATUS "SAMPLE ${TARGET_NAME} SKIPPED")
    return()
endif()

file (GLOB SRC
        ${CMAKE_CURRENT_SOURCE_DIR}/*.cpp
)

# Create named folders for the sources within the .vcproj
# Empty name lists them directly under the .vcproj
source_group("src" FILES ${SRC})

link_directories(${LIB_FOLDER})

# Create an executable from the sources.
add_executable(${TARGET_NAME} ${SRC})

set_target_properties(${TARGET_NAME} PROPERTIES "CMAKE_CXX_FLAGS" "${CMAKE_CXX_FLAGS} -fPIE"
                      COMPILE_PDB_NAME ${TARGET_NAME})

target_link_libraries(${TARGET_NAME} ${InferenceEngine_LIBRARIES} IE::ie_cpu_extension format_reader gflags)

if(UNIX)
    target_link_libraries(${TARGET_NAME} ${LIB_DL} pthread)
endif()
87
inference-engine/samples/benchmark_app/README.md
Normal file
@@ -0,0 +1,87 @@
# Benchmark Application Demo

This topic demonstrates how to run the Benchmark Application demo, which performs inference using convolutional networks.

## How It Works

**NOTE:** To achieve benchmark results similar to the officially published results, set CPU frequency to 2.9 GHz and GPU frequency to 1 GHz.

Upon start-up, the application reads command-line parameters and loads a network and images to the Inference Engine plugin. The number of infer requests and the execution approach depend on the mode defined with the `-api` command-line parameter.


### Synchronous API
For synchronous mode, the primary metric is latency. The application creates one infer request and executes the `Infer` method. The number of executions is defined by one of two values:
* Number of iterations defined with the `-niter` command-line argument
* Predefined duration if `-niter` is skipped. The predefined duration value depends on the device.

During the execution, the application collects two types of metrics:
* Latency for each infer request executed with the `Infer` method
* Duration of all executions

The reported latency is the mean of all collected latencies. The reported throughput is derived from the reported latency and additionally depends on the batch size.

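A stripped-down sketch of the synchronous loop described above (using the public C++ wrapper API; network loading is elided and the names are invented):

```cpp
#include <chrono>
#include <inference_engine.hpp>

double MeanLatencyMs(InferenceEngine::ExecutableNetwork &exeNetwork, int niter) {
    InferenceEngine::InferRequest request = exeNetwork.CreateInferRequest();
    double totalMs = 0.0;
    for (int i = 0; i < niter; i++) {
        auto t0 = std::chrono::high_resolution_clock::now();
        request.Infer();  // blocking call: one latency sample per iteration
        auto t1 = std::chrono::high_resolution_clock::now();
        totalMs += std::chrono::duration<double, std::milli>(t1 - t0).count();
    }
    return totalMs / niter;  // throughput then follows from latency and batch size
}
```
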
### Asynchronous API
For asynchronous mode, the primary metric is throughput in frames per second (FPS). The application creates a certain number of infer requests and executes the `StartAsync` method. The number of infer requests is specified with the `-nireq` command-line parameter. The number of executions is defined by one of two values:
* Number of iterations defined with the `-niter` command-line argument
* Predefined duration if `-niter` is skipped. The predefined duration value depends on the device.

The infer requests are executed asynchronously. The `Wait` method is used to wait for a previous execution to complete. The application measures all infer request executions and reports the throughput metric based on the batch size and the total execution duration.

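The asynchronous pipeline, reduced to its core idea (a sketch, not the demo's source: keep `nireq` requests in flight and wait for the oldest one before reusing it):

```cpp
#include <vector>
#include <inference_engine.hpp>

void RunAsync(InferenceEngine::ExecutableNetwork &exeNetwork, int nireq, int niter) {
    std::vector<InferenceEngine::InferRequest> requests;
    for (int i = 0; i < nireq; i++)
        requests.push_back(exeNetwork.CreateInferRequest());

    for (int i = 0; i < niter; i++) {
        InferenceEngine::InferRequest &r = requests[i % nireq];
        if (i >= nireq)  // this request is in flight: wait before reusing it
            r.Wait(InferenceEngine::IInferRequest::WaitMode::RESULT_READY);
        r.StartAsync();
    }
    for (auto &r : requests)  // drain whatever is still running
        r.Wait(InferenceEngine::IInferRequest::WaitMode::RESULT_READY);
}
```
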
## Running

Running the application with the `-h` option yields the following usage message:
```sh
./benchmark_app -h
InferenceEngine:
        API version ............ <version>
        Build .................. <number>
[ INFO ] Parsing input parameters

benchmark_app [OPTION]
Options:

    -h                        Print a usage message
    -i "<path>"               Required. Path to a folder with images or to image files.
    -m "<path>"               Required. Path to an .xml file with a trained model.
    -pp "<path>"              Path to a plugin folder.
    -api "<sync/async>"       Required. Enable using sync/async API.
    -d "<device>"             Specify a target device to infer on: CPU, GPU, FPGA or MYRIAD. Use "-d HETERO:<comma separated devices list>" format to specify HETERO plugin. The application looks for a suitable plugin for the specified device.
    -niter "<integer>"        Optional. Number of iterations. If not specified, the number of iterations is calculated depending on a device.
    -nireq "<integer>"        Optional. Number of infer requests (default value is 2).
    -l "<absolute_path>"      Required for CPU custom layers. Absolute path to a shared library with the kernels implementations.
          Or
    -c "<absolute_path>"      Required for GPU custom kernels. Absolute path to an .xml file with the kernels description.
    -b "<integer>"            Optional. Batch size value. If not specified, the batch size value is determined from IR.
```

Running the application with an empty list of options yields the usage message given above and an error message.

To run the demo, you can use single-input-layer public models or single-input-layer pre-trained and optimized models delivered with the package that support images as input.

For example, to do inference on an image using a trained network with multiple outputs on CPU, run the following command:

```sh
./benchmark_app -i <path_to_image>/inputImage.bmp -m <path_to_model>/multiple-output.xml -d CPU
```

**NOTE**: Public models should first be converted to the Inference Engine format (\*.xml + \*.bin) using the [Model Optimizer tool](./docs/Model_Optimizer_Developer_Guide/Deep_Learning_Model_Optimizer_DevGuide.md).
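To benchmark throughput with the asynchronous API instead, combine the options documented above; an illustrative invocation (the paths, request count, and iteration count below are placeholders, not recommended values):

```sh
./benchmark_app -i <path_to_images_folder> -m <path_to_model>/model.xml -d CPU -api async -nireq 4 -niter 100
```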
## Demo Output

The application output depends on the used API. For synchronous API, the application outputs latency and throughput:
```
[ INFO ] Start inference synchronously (60000 ms duration)

[ INFO ] Latency: 37.91 ms
[ INFO ] Throughput: 52.7566 FPS
```

For asynchronous API, the application outputs only throughput:
```
[ INFO ] Start inference asynchronously (60000 ms duration, 2 inference requests in parallel)

[ INFO ] Throughput: 48.2031 FPS
```

## See Also
* [Using Inference Engine Samples](./docs/Inference_Engine_Developer_Guide/Samples_Overview.md)
119
inference-engine/samples/benchmark_app/benchmark_app.h
Normal file
119
inference-engine/samples/benchmark_app/benchmark_app.h
Normal file
@@ -0,0 +1,119 @@
// Copyright (C) 2018 Intel Corporation
//
// SPDX-License-Identifier: Apache-2.0
//

#pragma once

#include <string>
#include <vector>
#include <gflags/gflags.h>
#include <iostream>

#ifdef _WIN32
#include <os/windows/w_dirent.h>
#else
#include <sys/stat.h>
#include <dirent.h>
#endif

/// @brief message for help argument
static const char help_message[] = "Print a usage message";

/// @brief message for images argument
static const char image_message[] = "Required. Path to a folder with images or to image files.";

/// @brief message for multi-input file argument
static const char multi_input_message[] = "Path to a file containing multiple inputs.";

/// @brief message for model argument
static const char model_message[] = "Required. Path to an .xml file with a trained model.";

/// @brief message for plugin_path argument
static const char plugin_path_message[] = "Path to a plugin folder.";

/// @brief message for API argument
static const char api_message[] = "Required. Enable using sync/async API.";

/// @brief message for assigning cnn calculation to device
static const char target_device_message[] = "Specify a target device to infer on: CPU, GPU, FPGA or MYRIAD. " \
"Use \"-d HETERO:<comma separated devices list>\" format to specify HETERO plugin. " \
"The application looks for a suitable plugin for the specified device.";

/// @brief message for iterations count
static const char iterations_count_message[] = "Optional. Number of iterations. " \
"If not specified, the number of iterations is calculated depending on a device.";

/// @brief message for infer requests count
static const char infer_requests_count_message[] = "Optional. Number of infer requests (default value is 2).";

/// @brief message for user library argument
static const char custom_cpu_library_message[] = "Required for CPU custom layers. Absolute path to a shared library with the kernels implementations.";

/// @brief message for clDNN custom kernels desc
static const char custom_cldnn_message[] = "Required for GPU custom kernels. Absolute path to an .xml file with the kernels description.";

/// @brief message for batch size
static const char batch_size_message[] = "Optional. Batch size value. If not specified, the batch size value is determined from IR.";

/// @brief Define flag for showing help message <br>
DEFINE_bool(h, false, help_message);

/// @brief Define parameter for setting the image file <br>
/// It is a required parameter
DEFINE_string(i, "", image_message);

/// @brief Define parameter for setting the model file <br>
/// It is a required parameter
DEFINE_string(m, "", model_message);

/// @brief Define parameter for setting the path to plugins <br>
DEFINE_string(pp, "", plugin_path_message);

/// @brief Define parameter for selecting the sync/async API <br>
DEFINE_string(api, "async", api_message);

/// @brief The target device to infer on <br>
DEFINE_string(d, "", target_device_message);

/// @brief Absolute path to CPU library with user layers <br>
/// It is a required parameter for CPU custom layers
DEFINE_string(l, "", custom_cpu_library_message);

/// @brief Define parameter for clDNN custom kernels path <br>
/// It is a required parameter for GPU custom kernels
DEFINE_string(c, "", custom_cldnn_message);

/// @brief Iterations count (default 0)
/// Sync mode: iterations count
/// Async mode: StartAsync counts
DEFINE_int32(niter, 0, iterations_count_message);

/// @brief Number of infer requests in parallel
DEFINE_int32(nireq, 2, infer_requests_count_message);

/// @brief Define parameter for batch size <br>
/// Default is 0 (that means "do not specify"; the value is taken from IR)
DEFINE_int32(b, 0, batch_size_message);


/**
 * @brief This function shows a help message
 */
static void showUsage() {
    std::cout << std::endl;
    std::cout << "benchmark_app [OPTION]" << std::endl;
    std::cout << "Options:" << std::endl;
    std::cout << std::endl;
    std::cout << "    -h                        " << help_message << std::endl;
    std::cout << "    -i \"<path>\"               " << image_message << std::endl;
    std::cout << "    -m \"<path>\"               " << model_message << std::endl;
    std::cout << "    -pp \"<path>\"              " << plugin_path_message << std::endl;
    std::cout << "    -api \"<sync/async>\"       " << api_message << std::endl;
    std::cout << "    -d \"<device>\"             " << target_device_message << std::endl;
    std::cout << "    -niter \"<integer>\"        " << iterations_count_message << std::endl;
    std::cout << "    -l \"<absolute_path>\"      " << custom_cpu_library_message << std::endl;
    std::cout << "          Or" << std::endl;
    std::cout << "    -c \"<absolute_path>\"      " << custom_cldnn_message << std::endl;
    std::cout << "    -nireq \"<integer>\"        " << infer_requests_count_message << std::endl;
    std::cout << "    -b \"<integer>\"            " << batch_size_message << std::endl;
}
417
inference-engine/samples/benchmark_app/main.cpp
Normal file
417
inference-engine/samples/benchmark_app/main.cpp
Normal file
@@ -0,0 +1,417 @@
// Copyright (C) 2018 Intel Corporation
//
// SPDX-License-Identifier: Apache-2.0
//

#include <algorithm>
#include <chrono>
#include <memory>
#include <map>
#include <string>
#include <vector>
#include <utility>

#include <inference_engine.hpp>
#include <format_reader_ptr.h>

#include <samples/common.hpp>
#include <samples/slog.hpp>
#include <samples/args_helper.hpp>

#include "benchmark_app.h"

using namespace InferenceEngine;

long long getDurationInNanoseconds(const std::string& device);

double getMedianValue(const std::vector<float>& sortedTimes);

void fillBlobWithImage(
    Blob::Ptr& inputBlob,
    const std::vector<std::string>& filePaths,
    const size_t batchSize,
    const InferenceEngine::InputInfo& info);

// Default benchmarking duration per device, in seconds (used when -niter is not set)
static const std::vector<std::pair<std::string, long long>> deviceDurationsInSeconds{
    { "CPU", 60LL },
    { "GPU", 60LL },
    { "VPU", 60LL },
    { "MYRIAD", 60LL },
    { "FPGA", 120LL },
    { "UNKNOWN", 120LL }
};

/**
 * @brief The entry point of the benchmark application
 */
int main(int argc, char *argv[]) {
    try {
        slog::info << "InferenceEngine: " << InferenceEngine::GetInferenceEngineVersion() << slog::endl;

        slog::info << "Parsing input parameters" << slog::endl;
        gflags::ParseCommandLineNonHelpFlags(&argc, &argv, true);
        if (FLAGS_h) {
            showUsage();
            return 0;
        }

        if (FLAGS_m.empty()) {
            throw std::logic_error("Required model is not set. Please use -h.");
        }

        if (FLAGS_api.empty()) {
            throw std::logic_error("API is not selected. Please use -h.");
        }

        if (FLAGS_api != "async" && FLAGS_api != "sync") {
            throw std::logic_error("Incorrect API. Please use -h.");
        }

        if (FLAGS_i.empty()) {
            throw std::logic_error("Input is not set. Please use -h.");
        }

        if (FLAGS_niter < 0) {
            throw std::logic_error("Number of iterations must not be negative (invalid -niter option value)");
        }

        if (FLAGS_nireq < 0) {
            throw std::logic_error("Number of inference requests must not be negative (invalid -nireq option value)");
        }

        if (FLAGS_b < 0) {
            throw std::logic_error("Batch size must not be negative (invalid -b option value)");
        }

        std::vector<std::string> inputs;
        parseInputFilesArguments(inputs);
        if (inputs.size() == 0ULL) {
            throw std::logic_error("no images found");
        }

        // --------------------------- 1. Load Plugin for inference engine -------------------------------------

        slog::info << "Loading plugin" << slog::endl;
        InferencePlugin plugin = PluginDispatcher({ FLAGS_pp }).getPluginByDevice(FLAGS_d);

        if (!FLAGS_l.empty()) {
            // A CPU (MKLDNN) extension is loaded as a shared library and passed as a pointer to a base extension
            const std::shared_ptr<IExtension> extension_ptr = InferenceEngine::make_so_pointer<InferenceEngine::IExtension>(FLAGS_l);
            plugin.AddExtension(extension_ptr);
            slog::info << "CPU (MKLDNN) extension is loaded " << FLAGS_l << slog::endl;
        } else if (!FLAGS_c.empty()) {
            // Load clDNN extensions
            plugin.SetConfig({ {CONFIG_KEY(CONFIG_FILE), FLAGS_c} });
            slog::info << "GPU extensions are loaded " << FLAGS_c << slog::endl;
        }

        InferenceEngine::ResponseDesc resp;

        const Version *pluginVersion = plugin.GetVersion();
        slog::info << pluginVersion << slog::endl << slog::endl;

        // --------------------------- 2. Read IR generated by the Model Optimizer (.xml and .bin files) ------------

        slog::info << "Loading network files" << slog::endl;

        InferenceEngine::CNNNetReader netBuilder;
        netBuilder.ReadNetwork(FLAGS_m);
        const std::string binFileName = fileNameNoExt(FLAGS_m) + ".bin";
        netBuilder.ReadWeights(binFileName);

        InferenceEngine::CNNNetwork cnnNetwork = netBuilder.getNetwork();
        const InferenceEngine::InputsDataMap inputInfo(cnnNetwork.getInputsInfo());
        if (inputInfo.empty()) {
            throw std::logic_error("no inputs info is provided");
        }

        if (inputInfo.size() != 1) {
            throw std::logic_error("only networks with one input layer are supported");
        }

        // --------------------------- 3. Resize network to match image sizes and given batch ----------------------
        if (FLAGS_b != 0) {
            // We support models having only one input layer
            ICNNNetwork::InputShapes shapes = cnnNetwork.getInputShapes();
            const ICNNNetwork::InputShapes::iterator& it = shapes.begin();
            if (it->second.size() != 4) {
                throw std::logic_error("Unsupported model for batch size changing in automatic mode");
            }
            it->second[0] = FLAGS_b;
            slog::info << "Resizing network to batch = " << FLAGS_b << slog::endl;
            cnnNetwork.reshape(shapes);
        }

        const size_t batchSize = cnnNetwork.getBatchSize();
        const Precision precision = inputInfo.begin()->second->getPrecision();
        slog::info << (FLAGS_b != 0 ? "Network batch size was changed to: " : "Network batch size: ") << batchSize <<
            ", precision: " << precision << slog::endl;

        // --------------------------- 4. Configure input & output ---------------------------------------------

        const InferenceEngine::Precision inputPrecision = InferenceEngine::Precision::U8;
        for (auto& item : inputInfo) {
            /** Set the precision of input data provided by the user; should be called before loading the network to the plugin **/
            item.second->setInputPrecision(inputPrecision);
        }

        const size_t imagesCount = inputs.size();
        if (batchSize > imagesCount) {
            slog::warn << "Network batch size " << batchSize << " is greater than images count " << imagesCount <<
                ", some input files will be duplicated" << slog::endl;
        } else if (batchSize < imagesCount) {
            slog::warn << "Network batch size " << batchSize << " is less than images count " << imagesCount <<
                ", some input files will be ignored" << slog::endl;
        }

        // ------------------------------ Prepare output blobs -------------------------------------------------
        slog::info << "Preparing output blobs" << slog::endl;
        InferenceEngine::OutputsDataMap outputInfo(cnnNetwork.getOutputsInfo());
        InferenceEngine::BlobMap outputBlobs;
        for (auto& item : outputInfo) {
            const InferenceEngine::DataPtr outData = item.second;
            if (!outData) {
                throw std::logic_error("output data pointer is not valid");
            }
            InferenceEngine::SizeVector outputDims = outData->dims;
            const InferenceEngine::Precision outputPrecision = InferenceEngine::Precision::FP32;

            /** Set the precision of output data provided by the user; should be called before loading the network to the plugin **/
            outData->precision = outputPrecision;
            InferenceEngine::TBlob<float>::Ptr output = InferenceEngine::make_shared_blob<float>(item.second->getTensorDesc());
            output->allocate();
            outputBlobs[item.first] = output;
        }

        // --------------------------- 5. Loading model to the plugin ------------------------------------------

        slog::info << "Loading model to the plugin" << slog::endl;
        const std::map<std::string, std::string> networkConfig;
        InferenceEngine::ExecutableNetwork exeNetwork = plugin.LoadNetwork(cnnNetwork, networkConfig);

        // --------------------------- 6. Performance measurements ------------------------------------------

        typedef std::chrono::high_resolution_clock Time;
        typedef std::chrono::nanoseconds ns;

        std::vector<float> times;
        long long durationInNanoseconds;
        if (FLAGS_niter != 0) {
            durationInNanoseconds = 0LL;
            times.reserve(FLAGS_niter);
        } else {
            durationInNanoseconds = getDurationInNanoseconds(FLAGS_d);
        }

        if (FLAGS_api == "sync") {
            InferRequest inferRequest = exeNetwork.CreateInferRequest();
            slog::info << "Sync request created" << slog::endl;

            for (const InputsDataMap::value_type& item : inputInfo) {
                Blob::Ptr inputBlob = inferRequest.GetBlob(item.first);
                fillBlobWithImage(inputBlob, inputs, batchSize, *item.second);
            }

            if (FLAGS_niter != 0) {
                slog::info << "Start inference synchronously (" << FLAGS_niter << " sync inference executions)" << slog::endl << slog::endl;
            } else {
                slog::info << "Start inference synchronously (" << durationInNanoseconds * 0.000001 << " ms duration)" << slog::endl << slog::endl;
            }

            const auto startTime = Time::now();
            auto currentTime = Time::now();

            size_t iteration = 0ULL;
            while ((iteration < FLAGS_niter) || ((FLAGS_niter == 0LL) && ((currentTime - startTime).count() < durationInNanoseconds))) {
                const auto iterationStartTime = Time::now();
                inferRequest.Infer();
                currentTime = Time::now();

                const auto iterationDurationNs = std::chrono::duration_cast<ns>(currentTime - iterationStartTime);
                times.push_back(static_cast<double>(iterationDurationNs.count()) * 0.000001);

                iteration++;
            }

            std::sort(times.begin(), times.end());
            const double latency = getMedianValue(times);
            slog::info << "Latency: " << latency << " ms" << slog::endl;

            slog::info << "Throughput: " << batchSize * 1000.0 / latency << " FPS" << slog::endl;
        } else if (FLAGS_api == "async") {
            std::vector<InferRequest> inferRequests;
            inferRequests.reserve(FLAGS_nireq);

            for (size_t i = 0; i < FLAGS_nireq; i++) {
                InferRequest inferRequest = exeNetwork.CreateInferRequest();
                inferRequests.push_back(inferRequest);

                for (const InputsDataMap::value_type& item : inputInfo) {
                    Blob::Ptr inputBlob = inferRequest.GetBlob(item.first);
                    fillBlobWithImage(inputBlob, inputs, batchSize, *item.second);
                }
            }

            if (FLAGS_niter != 0) {
                slog::info << "Start inference asynchronously (" << FLAGS_niter <<
                    " async inference executions, " << FLAGS_nireq <<
                    " inference requests in parallel)" << slog::endl << slog::endl;
            } else {
                slog::info << "Start inference asynchronously (" << durationInNanoseconds * 0.000001 <<
                    " ms duration, " << FLAGS_nireq <<
                    " inference requests in parallel)" << slog::endl << slog::endl;
            }

            size_t currentInference = 0ULL;
            bool requiredInferenceRequestsWereExecuted = false;
            long long previousInference = 1LL - FLAGS_nireq;

            // warm up: one execution out of scope, excluded from measurements
            inferRequests[0].StartAsync();
            inferRequests[0].Wait(InferenceEngine::IInferRequest::WaitMode::RESULT_READY);

            const size_t stepsCount = FLAGS_niter + FLAGS_nireq - 1;

            /** Start inference & calculate performance **/
            const auto startTime = Time::now();

            size_t step = 0ULL;
            while ((!requiredInferenceRequestsWereExecuted) ||
                (step < stepsCount) ||
                ((FLAGS_niter == 0LL) && ((Time::now() - startTime).count() < durationInNanoseconds))) {
                // start a new inference
                inferRequests[currentInference].StartAsync();

                // wait for the oldest in-flight inference execution, if any
                if (previousInference >= 0) {
                    const StatusCode code = inferRequests[previousInference].Wait(InferenceEngine::IInferRequest::WaitMode::RESULT_READY);
                    if (code != StatusCode::OK) {
                        throw std::logic_error("Wait");
                    }
                }

                currentInference++;
                if (currentInference >= FLAGS_nireq) {
                    currentInference = 0;
                    requiredInferenceRequestsWereExecuted = true;
                }

                previousInference++;
                if (previousInference >= FLAGS_nireq) {
                    previousInference = 0;
                }

                step++;
            }

            // wait for the remaining in-flight inference executions
            for (size_t notCompletedIndex = 0ULL; notCompletedIndex < (FLAGS_nireq - 1); ++notCompletedIndex) {
                if (previousInference >= 0) {
                    const StatusCode code = inferRequests[previousInference].Wait(InferenceEngine::IInferRequest::WaitMode::RESULT_READY);
                    if (code != StatusCode::OK) {
                        throw std::logic_error("Wait");
                    }
                }

                previousInference++;
                if (previousInference >= FLAGS_nireq) {
                    previousInference = 0LL;
                }
            }

            const double totalDuration = std::chrono::duration_cast<ns>(Time::now() - startTime).count() * 0.000001;
            const double fps = batchSize * 1000.0 * step / totalDuration;
            slog::info << "Throughput: " << fps << " FPS" << slog::endl;
        } else {
            throw std::logic_error("unknown api command line argument value");
        }
    } catch (const std::exception& ex) {
        slog::err << ex.what() << slog::endl;
        return 3;
    }

    return 0;
}

long long getDurationInNanoseconds(const std::string& device) {
    auto duration = 0LL;
    for (const auto& deviceDurationInSeconds : deviceDurationsInSeconds) {
        if (device.find(deviceDurationInSeconds.first) != std::string::npos) {
            duration = std::max(duration, deviceDurationInSeconds.second);
        }
    }

    if (duration == 0LL) {
        const auto unknownDeviceIt = std::find_if(
            deviceDurationsInSeconds.begin(),
            deviceDurationsInSeconds.end(),
            [](std::pair<std::string, long long> deviceDuration) { return deviceDuration.first == "UNKNOWN"; });

        if (unknownDeviceIt == deviceDurationsInSeconds.end()) {
            throw std::logic_error("UNKNOWN device was not found in the device duration list");
        }
        duration = unknownDeviceIt->second;
        slog::warn << "Default duration " << duration << " seconds is used for unknown device '" << device << "'" << slog::endl;
    }

    return duration * 1000000000LL;
}

double getMedianValue(const std::vector<float>& sortedTimes) {
    return (sortedTimes.size() % 2 != 0) ?
        sortedTimes[sortedTimes.size() / 2ULL] :
        (sortedTimes[sortedTimes.size() / 2ULL] + sortedTimes[sortedTimes.size() / 2ULL - 1ULL]) / 2.0;
}

void fillBlobWithImage(
    Blob::Ptr& inputBlob,
    const std::vector<std::string>& filePaths,
    const size_t batchSize,
    const InferenceEngine::InputInfo& info) {

    uint8_t* inputBlobData = inputBlob->buffer().as<uint8_t*>();
    const SizeVector& inputBlobDims = inputBlob->dims();

    slog::info << "Input dimensions (" << info.getTensorDesc().getLayout() << "): ";
    for (const auto& i : info.getTensorDesc().getDims()) {
        slog::info << i << " ";
    }
    slog::info << slog::endl;

    /** Collect image data pointers **/
    std::vector<std::shared_ptr<uint8_t>> vreader;
    vreader.reserve(batchSize);

    for (size_t i = 0ULL, inputIndex = 0ULL; i < batchSize; i++, inputIndex++) {
        // if there are fewer images than the batch size, reuse them from the beginning
        if (inputIndex >= filePaths.size()) {
            inputIndex = 0ULL;
        }

        FormatReader::ReaderPtr reader(filePaths[inputIndex].c_str());
        if (reader.get() == nullptr) {
            slog::warn << "Image " << filePaths[inputIndex] << " cannot be read!" << slog::endl << slog::endl;
            continue;
        }

        /** Get image data **/
        std::shared_ptr<uint8_t> imageData(reader->getData(info.getDims()[0], info.getDims()[1]));
        if (imageData) {
            vreader.push_back(imageData);
        }
    }

    /** Fill the input tensor with images: first the b channel, then the g and r channels **/
    const size_t numChannels = inputBlobDims[2];
    const size_t imageSize = inputBlobDims[1] * inputBlobDims[0];
    /** Iterate over all input images **/
    for (size_t imageId = 0; imageId < vreader.size(); ++imageId) {
        /** Iterate over all pixels in the image (b, g, r) **/
        for (size_t pid = 0; pid < imageSize; pid++) {
            /** Iterate over all channels **/
            for (size_t ch = 0; ch < numChannels; ++ch) {
                /** [image stride + channel stride + pixel id], all in bytes:
                    the interleaved (HWC) source is rearranged into planar (CHW) layout **/
                inputBlobData[imageId * imageSize * numChannels + ch * imageSize + pid] = vreader.at(imageId).get()[pid * numChannels + ch];
            }
        }
    }
}
61
inference-engine/samples/build_samples.sh
Normal file
61
inference-engine/samples/build_samples.sh
Normal file
@@ -0,0 +1,61 @@
#!/bin/bash

# Copyright (c) 2018 Intel Corporation
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#      http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

error() {
    local code="${3:-1}"
    if [[ -n "$2" ]]; then
        echo "Error on or near line $1: $2; exiting with status ${code}"
    else
        echo "Error on or near line $1; exiting with status ${code}"
    fi
    exit "${code}"
}
trap 'error ${LINENO}' ERR

SAMPLES_PATH="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"

if [[ -z "${InferenceEngine_DIR}" ]]; then
    printf "\nInferenceEngine_DIR environment variable is not set. Trying to find setupvars.sh to set it.\n"

    setvars_path=$SAMPLES_PATH/../..
    if [ -e "$setvars_path/inference_engine/bin/setvars.sh" ]; then # for the Intel Deep Learning Deployment Toolkit package
        setvars_path="$setvars_path/inference_engine/bin/setvars.sh"
    elif [ -e "$setvars_path/../bin/setupvars.sh" ]; then # for the OpenVINO package
        setvars_path="$setvars_path/../bin/setupvars.sh"
    elif [ -e "$setvars_path/../setupvars.sh" ]; then
        setvars_path="$setvars_path/../setupvars.sh"
    else
        printf "Error: setupvars.sh is not found in hardcoded paths.\n\n"
        exit 1
    fi
    if ! source "$setvars_path"; then
        printf "Unable to run ./setupvars.sh. Please check its presence.\n\n"
        exit 1
    fi
fi

if ! command -v cmake &>/dev/null; then
    printf "\n\nCMake is not installed. It is required to build Inference Engine samples. Please install it.\n\n"
    exit 1
fi

build_dir=$HOME/inference_engine_samples_build
mkdir -p "$build_dir"
cd "$build_dir"
cmake -DCMAKE_BUILD_TYPE=Release "$SAMPLES_PATH"
make -j8

printf "\nBuild completed, you can find binaries for all samples in the $HOME/inference_engine_samples_build/intel64/Release subfolder.\n\n"
68
inference-engine/samples/calibration_tool/CMakeLists.txt
Normal file
68
inference-engine/samples/calibration_tool/CMakeLists.txt
Normal file
@@ -0,0 +1,68 @@
# Copyright (c) 2018 Intel Corporation

# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at

#      http://www.apache.org/licenses/LICENSE-2.0

# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
cmake_minimum_required(VERSION 2.8)

set (TARGET_NAME "calibration_tool")

file (GLOB MAIN_SRC
        ${CMAKE_CURRENT_SOURCE_DIR}/*.cpp
        ${CMAKE_CURRENT_SOURCE_DIR}/../validation_app/pugixml/*.cpp
        ${CMAKE_CURRENT_SOURCE_DIR}/../validation_app/ClassificationProcessor.cpp
        ${CMAKE_CURRENT_SOURCE_DIR}/../validation_app/classification_set_generator.cpp
        ${CMAKE_CURRENT_SOURCE_DIR}/../validation_app/image_decoder.cpp
        ${CMAKE_CURRENT_SOURCE_DIR}/../validation_app/ObjectDetectionProcessor.cpp
        ${CMAKE_CURRENT_SOURCE_DIR}/../validation_app/Processor.cpp
        ${CMAKE_CURRENT_SOURCE_DIR}/../validation_app/VOCAnnotationParser.cpp
        )

file (GLOB MAIN_HEADERS
        ${CMAKE_CURRENT_SOURCE_DIR}/*.hpp
        ${CMAKE_CURRENT_SOURCE_DIR}/pugixml/*.hpp
        )

# Create named folders for the sources within the .vcproj
# Empty name lists them directly under the .vcproj
source_group("src" FILES ${MAIN_SRC})
source_group("include" FILES ${MAIN_HEADERS})

# OpenCV include folders
find_package(OpenCV QUIET COMPONENTS core imgproc highgui imgcodecs)
if(NOT(OpenCV_FOUND))
    find_package(OpenCV QUIET COMPONENTS world)
    if(NOT(OpenCV_FOUND))
        message(WARNING "No suitable OpenCV version detected, " ${TARGET_NAME} " skipped")
        return()
    endif()
endif()

# Properties->C/C++->General->Additional Include Directories
include_directories (${CMAKE_CURRENT_SOURCE_DIR}/../classification_sample/core
        ${CMAKE_CURRENT_SOURCE_DIR}/../common
        ${CMAKE_CURRENT_SOURCE_DIR}/../common/os/windows
        ${CMAKE_CURRENT_SOURCE_DIR}/../../include
        ${OpenCV_INCLUDE_DIRS}
        ${CMAKE_CURRENT_SOURCE_DIR}/../validation_app)

link_directories(${LIB_FOLDER})

# Create an executable from sources
add_executable(${TARGET_NAME} ${MAIN_SRC} ${MAIN_HEADERS})

set_target_properties(${TARGET_NAME} PROPERTIES "CMAKE_CXX_FLAGS" "${CMAKE_CXX_FLAGS} -fPIE"
        COMPILE_PDB_NAME ${TARGET_NAME})
target_link_libraries(${TARGET_NAME} gflags IE::ie_cpu_extension ${InferenceEngine_LIBRARIES} ${OpenCV_LIBRARIES})
if (UNIX)
    target_link_libraries(${TARGET_NAME} dl)
endif()
103
inference-engine/samples/calibration_tool/README.md
Normal file
103
inference-engine/samples/calibration_tool/README.md
Normal file
@@ -0,0 +1,103 @@
# Calibration Tool

The Inference Engine Calibration Tool calibrates a given FP32 model so that it can be run in low-precision 8-bit integer
mode while keeping the input data of this model in the original precision.

## Calibration Tool Options

The core command-line options for the Calibration Tool are the same as for the
[Validation Application](./samples/validation_app/README.md). However, the Calibration Tool has the following specific options: `-t`, `-subset`, `-output`, and `-threshold`.

Running the Calibration Tool with the `-h` option yields the following usage message with all CLI options listed:
```sh
Usage: calibration_tool [OPTION]

Available options:

    -h                        Print a help message
    -t <type>                 Type of an inferred network ("C" by default)
      -t "C" to calibrate a Classification network and write the calibrated network to IR
      -t "OD" to calibrate an Object Detection network and write the calibrated network to IR
      -t "RawC" to collect only statistics for a Classification network and write the statistics to IR. With this option, the model is not calibrated. For calibration and statistics collection, use "-t C" instead.
      -t "RawOD" to collect only statistics for an Object Detection network and write the statistics to IR. With this option, the model is not calibrated. For calibration and statistics collection, use "-t OD" instead
    -i <path>                 Required. Path to a directory with validation images. For Classification models, the directory must contain folders named as labels with images inside or a .txt file with a list of images. For Object Detection models, the dataset must be in VOC format.
    -m <path>                 Required. Path to an .xml file with a trained model, including model name and extension.
    -l <absolute_path>        Required for CPU custom layers. Absolute path to a shared library with the kernel implementations.
    -c <absolute_path>        Required for GPU custom kernels. Absolute path to an .xml file with the kernel descriptions.
    -d <device>               Target device to infer on: CPU (default), GPU, FPGA, or MYRIAD. The application looks for a suitable plugin for the specified device.
    -b N                      Batch size value. If not specified, the batch size value is taken from IR
    -ppType <type>            Preprocessing type. Options: "None", "Resize", "ResizeCrop"
    -ppSize N                 Preprocessing size (used with ppType="ResizeCrop")
    -ppWidth W                Preprocessing width (overrides -ppSize, used with ppType="ResizeCrop")
    -ppHeight H               Preprocessing height (overrides -ppSize, used with ppType="ResizeCrop")
    --dump                    Dump file names and inference results to a .csv file
    -subset                   Number of pictures from the whole validation set to create the calibration dataset. Default value is 0, which stands for the whole provided dataset
    -output <output_IR>       Output name for the calibrated model. Default is <original_model_name>_i8.xml|bin
    -threshold                Threshold for a maximum accuracy drop of the quantized model. Must be an integer number (percent) without a percent sign. Default value is 1, which stands for an accepted accuracy drop of 1%

    Classification-specific options:
    -Czb true                 "Zero is a background" flag. Some networks are trained with a modified dataset where the class IDs are enumerated from 1, but 0 is an undefined "background" class (which is never detected)

    Object detection-specific options:
    -ODkind <kind>            Type of an Object Detection model. Options: SSD
    -ODa <path>               Required for Object Detection models. Path to a directory containing an .xml file with annotations for images.
    -ODc <file>               Required for Object Detection models. Path to a file with a list of classes
    -ODsubdir <name>          Directory between the path to images (specified with -i) and image name (specified in the .xml file). For the VOC2007 dataset, use JPEGImages.
```

The tool options are divided into two categories:
1. **Common options** named with a single letter or a word, such as <code>-b</code> or <code>--dump</code>.
These options are the same in all calibration tool modes.
2. **Network type-specific options** named as an acronym of the network type (<code>C</code> or <code>OD</code>)
followed by a letter or a word.


## Calibrate a Classification Model

To calibrate a classification convolutional neural network (CNN)
on a subset of images (first 2000 images) from the given dataset (specified with the `-i` option), run the following command:

```bash
./calibration_tool -t C -i <path_to_images_directory_or_txt_file> -m <path_to_classification_model>/<model_name>.xml -d <CPU|GPU> -subset 2000
```

The dataset must have the correct format. Classification models support two formats: folders
named as labels that contain all images of this class, and an ImageNet*-like format, with a
`.txt` file containing a list of images and the IDs of classes.
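For illustration only, an ImageNet-like `.txt` list might look like the following; the file names and class IDs here are hypothetical examples, not files shipped with the package:

```
ILSVRC2012_val_00000001.JPEG 65
ILSVRC2012_val_00000002.JPEG 970
ILSVRC2012_val_00000003.JPEG 230
```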
For more information on the structure of the datasets, refer to the **Prepare a Dataset** section of the
[Validation Application document](./samples/validation_app/README.md).

If you decide to use a subset of the given dataset, use the ImageNet-like format
instead of the "folders as classes" format. This yields a more accurate calibration, as you are likely to get images
representing different classes.

For example, to calibrate the pretrained TensorFlow\* `inception_v4_tf.xml` classification model,
run the following command:

```bash
./calibration_tool -t C -m inception_v4_tf.xml -i ILSVRC2012_val.txt -Czb false -ppType "ResizeCrop" -ppSize 342 -b 1 -d CPU -subset 2000
```

## Calibrate an Object Detection Model

This topic demonstrates how to run the Calibration Tool on an Object Detection CNN on a set of images. Please
review the list of Object Detection models used for validation of the Calibration Tool
in the [8-bit Inference Introduction](./docs/Inference_Engine_Developer_Guide/Int8Inference.md).
Any network that can be inferred with the Inference Engine and has the same input and output
format as the SSD CNN should be supported as well.

### Run an SSD Network on the VOC Dataset

Before you start calibrating the model, make sure your dataset is in the correct format. For more information,
refer to the **Prepare a Dataset** section of the
[Validation Application document](./samples/validation_app/README.md).

Once you have prepared the dataset, you can calibrate the model on it by running the following command:
```bash
./calibration_tool -d CPU -t OD -ODa "<path_to_image_annotations>/VOCdevkit/VOC2007/Annotations" -i "<path_to_image_directory>/VOCdevkit" -m "<path_to_model>/vgg_voc0712_ssd_300x300.xml" -ODc "<path_to_classes_list>/VOC_SSD_Classes.txt" -ODsubdir JPEGImages -subset 500
```

## See Also

* [Using Inference Engine Samples](./docs/Inference_Engine_Developer_Guide/Samples_Overview.md)
@@ -0,0 +1,847 @@
// Copyright (C) 2018 Intel Corporation
//
// SPDX-License-Identifier: Apache-2.0
//

#include "calibrator_processors.h"
#include <string>      // std::string
#include <iostream>    // std::cout
#include <sstream>     // std::stringstream
#include <iomanip>
#include <algorithm>
#include <cmath>       // pow
#include <map>
#include <memory>
#include <utility>
#include <list>
#include "details/ie_cnn_network_tools.h"
#include "details/caseless.hpp"

using namespace InferenceEngine;
using namespace InferenceEngine::details;

using InferenceEngine::details::InferenceEngineException;

CNNLayerPtr Int8Calibrator::addScaleShiftBeforeLayer(std::string name, CNNLayer::Ptr beforeLayer, size_t port, std::vector<float> scale) {
    if (beforeLayer->insData.size() < port) {
        THROW_IE_EXCEPTION << "cannot find an appropriate port for addScaleShiftBeforeLayer";
    }

    DataPtr pData = beforeLayer->insData[port].lock();
    LayerParams params;
    params.name = name;
    params.precision = Precision::FP32;
    params.type = "ScaleShift";
    CNNLayerPtr lptr = std::make_shared<ScaleShiftLayer>(params);
    ScaleShiftLayer *pScaleShift = dynamic_cast<ScaleShiftLayer *>(lptr.get());

    SizeVector wdims({ pData->dims[2] });

    // a single scale value is broadcast to all channels
    if (scale.size() == 1) {
        scale.resize(wdims[0]);
        for (int i = 1; i < wdims[0]; i++) {
            scale[i] = scale[0];
        }
    }

    if (scale.size() != pData->dims[2]) {
        THROW_IE_EXCEPTION << "Failed to add a ScaleShift before " << beforeLayer->name << " due to inconsistency between the scales and the layer output dims";
    }

    Blob::Ptr weights = nullptr;
    weights = make_shared_blob<float>(Precision::FP32, Layout::C, wdims);
    weights->allocate();
    float *buffer = weights->buffer().as<float *>();
    if (buffer == nullptr) {
        THROW_IE_EXCEPTION << "Could not allocate weights buffer";
    }
    for (size_t i = 0; i < pData->dims[2]; i++) {
        buffer[i] = scale[i];
    }
    pScaleShift->_weights = weights;


    SizeVector bdims({ pData->dims[2] });
    Blob::Ptr biases = nullptr;
    biases = make_shared_blob<float>(Precision::FP32, Layout::C, bdims);
    biases->allocate();
    buffer = biases->buffer().as<float *>();
    for (size_t i = 0; i < pData->dims[2]; i++) {
        buffer[i] = 0.f;
    }
    pScaleShift->_biases = biases;

    // rewire the graph: the new ScaleShift consumes pData and feeds beforeLayer through a new edge
    Data *edge2 = new Data(*pData.get());
    DataPtr newEdge(edge2);
    lptr->insData.push_back(pData);
    lptr->outData.push_back(newEdge);
    newEdge->name = /*"EdgeAfter_" +*/ params.name;
    newEdge->creatorLayer = lptr;
    newEdge->inputTo.clear();
    newEdge->inputTo[beforeLayer->name] = beforeLayer;

    pData->inputTo.erase(beforeLayer->name);
    pData->inputTo[params.name] = lptr;

    for (size_t i = 0; i < beforeLayer->insData.size(); i++) {
        DataPtr d = beforeLayer->insData[i].lock();
        if (d == pData) {
            beforeLayer->insData[i] = newEdge;
            break;
        }
    }
    return lptr;
}


float Int8Calibrator::compare_NRMSD(InferenceEngine::Blob::Ptr res, InferenceEngine::Blob::Ptr ref) {
    float *res_ptr = res->buffer().as<float *>();

    float *ref_ptr = ref->buffer().as<float *>();
    size_t ref_size = ref->size();

    float sum = 0;

    float mmin = ref_ptr[0], mmax = ref_ptr[0];

    for (size_t i = 0; i < ref_size; i++) {
        float sqr = (ref_ptr[i] - res_ptr[i]);
        sqr *= sqr;
        sum += sqr;

        mmin = std::min(mmin, ref_ptr[i]);
        mmax = std::max(mmax, ref_ptr[i]);
    }
    sum /= ref_size;

    sum = pow(sum, 0.5);

    sum /= mmax - mmin;

    return sum;
}
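// compare_NRMSD above computes the normalized root-mean-square deviation between the INT8
// result and the FP32 reference over all N output values:
//
//     NRMSD = sqrt((1/N) * sum_i (ref_i - res_i)^2) / (max(ref) - min(ref))
//
// i.e. an RMS error normalized by the dynamic range of the reference output.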
|
||||
|
||||
InferenceEngine::NetworkStatsMap Int8Calibrator::getStatistic(float threshold) {
|
||||
InferenceEngine::NetworkStatsMap netNodesStats;
|
||||
// go over all outputs and get aggregated statistics
|
||||
for (auto l : _statData.registeredLayers()) {
|
||||
NetworkNodeStatsPtr nodeStats;
|
||||
size_t channels = _statData.getNumberChannels(l);
|
||||
if (netNodesStats.find(l) == netNodesStats.end()) {
|
||||
nodeStats = NetworkNodeStatsPtr(new NetworkNodeStats(channels));
|
||||
|
||||
netNodesStats[l] = nodeStats;
|
||||
} else {
|
||||
nodeStats = netNodesStats[l];
|
||||
}
|
||||
for (size_t c = 0; c < channels; c++) {
|
||||
_statData.getDataMinMax(l, c, nodeStats->_minOutputs[c], nodeStats->_maxOutputs[c], threshold);
|
||||
}
|
||||
}
|
||||
return netNodesStats;
|
||||
}
|
||||
|
||||
|
||||
void Int8Calibrator::collectFP32Statistic() {
|
||||
_collectByLayer = false;
|
||||
_collectStatistic = true;
|
||||
|
||||
networkReaderC = InferenceEngine::CNNNetReader();
|
||||
networkReaderC.ReadNetwork(_modelFileNameI8C);
|
||||
if (!networkReaderC.isParseSuccess()) THROW_IE_EXCEPTION << "cannot load a failed Model";
|
||||
if (_cBatch == 0) {
|
||||
// Zero means "take batch value from the IR"
|
||||
_cBatch = networkReaderC.getNetwork().getBatchSize();
|
||||
} else {
|
||||
// Not zero means "use the specified value"
|
||||
networkReaderC.getNetwork().setBatchSize(_cBatch);
|
||||
}
|
||||
|
||||
/** Extract model name and load weights **/
|
||||
std::string binFileName = fileNameNoExt(_modelFileNameI8C) + ".bin";
|
||||
networkReaderC.ReadWeights(binFileName.c_str());
|
||||
|
||||
auto network = networkReaderC.getNetwork();
|
||||
|
||||
|
||||
std::vector<CNNLayerPtr> layersAfterInputs;
|
||||
|
||||
std::string hackPrefix = "scaleshifted_input:";
|
||||
|
||||
for (auto &&layer : network) {
|
||||
if (layer->insData.size() > 0) {
|
||||
std::string inName = layer->input()->getName();
|
||||
for (auto &&input : network.getInputsInfo()) {
|
||||
if (inName == input.first) {
|
||||
layersAfterInputs.push_back(layer);
|
||||
_inputsFromLayers[hackPrefix + layer->name] = inName;
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
for (auto &&layer : layersAfterInputs) {
|
||||
std::string firstInputName = hackPrefix + layer->name;
|
||||
auto scaleShiftLayer = addScaleShiftBeforeLayer(firstInputName, layer, 0, { 1.f });
|
||||
((ICNNNetwork&)network).addLayer(scaleShiftLayer);
|
||||
}
|
||||
|
||||
|
||||
// 1. add all layers as output one
|
||||
for (auto &&layer : network) {
|
||||
std::string layerType = network.getLayerByName(layer->name.c_str())->type;
|
||||
if (/*layerType != "Split" &&*/layerType != "Input") {
|
||||
network.addOutput(layer->name);
|
||||
}
|
||||
_statData.registerLayer(layer->name);
|
||||
}
|
||||
|
||||
ExecutableNetwork executable_network = _pluginI8C.LoadNetwork(network, { { CONFIG_KEY(EXCLUSIVE_ASYNC_REQUESTS), CONFIG_VALUE(YES) } });
|
||||
_inferRequestI8C = executable_network.CreateInferRequest();
|
||||
}
|
||||
|
||||
void Int8Calibrator::validateInt8Config(const InferenceEngine::NetworkStatsMap &stat,
|
||||
const std::map<std::string, bool> &layersToInt8) {
|
||||
_collectByLayer = false;
|
||||
_collectStatistic = false;
|
||||
networkReaderC = InferenceEngine::CNNNetReader();
|
||||
networkReaderC.ReadNetwork(_modelFileNameI8C);
|
||||
if (!networkReaderC.isParseSuccess()) THROW_IE_EXCEPTION << "cannot load a failed Model";
|
||||
if (_cBatch == 0) {
|
||||
// Zero means "take batch value from the IR"
|
||||
_cBatch = networkReaderC.getNetwork().getBatchSize();
|
||||
} else {
|
||||
// Not zero means "use the specified value"
|
||||
networkReaderC.getNetwork().setBatchSize(_cBatch);
|
||||
}
|
||||
|
||||
/** Extract model name and load weights **/
|
||||
std::string binFileName = fileNameNoExt(_modelFileNameI8C) + ".bin";
|
||||
networkReaderC.ReadWeights(binFileName.c_str());
|
||||
|
||||
// Initialize statistic
|
||||
ICNNNetworkStats *pstats = nullptr;
|
||||
StatusCode s = ((ICNNNetwork&)networkReaderC.getNetwork()).getStats(&pstats, nullptr);
|
||||
if (s == StatusCode::OK && pstats) {
|
||||
pstats->setNodesStats(stat);
|
||||
}
|
||||
|
||||
auto network = networkReaderC.getNetwork();
|
||||
for (auto l : layersToInt8) {
|
||||
network.getLayerByName(l.first.c_str())->
|
||||
params["quantization_level"] = (l.second == false) ? "FP32" : "I8";
|
||||
}
|
||||
|
||||
ExecutableNetwork executable_network = _pluginI8C.LoadNetwork(network, { { CONFIG_KEY(EXCLUSIVE_ASYNC_REQUESTS), CONFIG_VALUE(YES) } });
|
||||
_inferRequestI8C = executable_network.CreateInferRequest();
|
||||
}
|
||||
|
||||
CNNNetwork Int8Calibrator::createICNNNetworkForLayer(CNNLayer::Ptr layerToClone) {
|
||||
CNNLayer::Ptr layerRelU = layerToClone->outData[0]->inputTo.begin()->second;
|
||||
|
||||
InferenceEngine::CNNNetReader reader1;
|
||||
std::string inpuitName = layerToClone->insData[0].lock()->name;
|
||||
std::string model = "<net name=\"L\" version=\"2\" batch=\"1\"><layers> " \
|
||||
"<layer name=\"" +
|
||||
inpuitName +
|
||||
"\" type=\"Input\" precision=\"FP32\" id=\"0\"> "\
|
||||
"<output>"\
|
||||
"<port id=\"0\">"\
|
||||
"<dim>1</dim>"\
|
||||
"<dim>3</dim>"\
|
||||
"<dim>224</dim>"\
|
||||
"<dim>224</dim>"\
|
||||
"</port>"\
|
||||
"</output>"\
|
||||
"</layer>" \
|
||||
"<layer name=\"" +
|
||||
layerToClone->name +
|
||||
"\" type=\"Convolution\" precision=\"FP32\" id=\"1\">" \
|
||||
"<convolution_data stride-x=\"2\" stride-y=\"2\" pad-x=\"3\" pad-y=\"3\" kernel-x=\"7\" kernel-y=\"7\" output=\"64\" group=\"1\" />"\
|
||||
"<input>"\
|
||||
"<port id=\"1\">"\
|
||||
"<dim>1</dim>"\
|
||||
"<dim>3</dim>"\
|
||||
"<dim>224</dim>"\
|
||||
"<dim>224</dim>"\
|
||||
"</port>"\
|
||||
"</input>"\
|
||||
"<output>"\
|
||||
"<port id=\"2\">"\
|
||||
"<dim>1</dim>"\
|
||||
"<dim>64</dim>"\
|
||||
"<dim>112</dim>"\
|
||||
"<dim>112</dim>"\
|
||||
"</port>"\
|
||||
"</output>"\
|
||||
"</layer>"\
|
||||
"<layer name=\"" +
|
||||
layerRelU->name +
|
||||
"\" type=\"ReLU\" precision=\"FP32\" id=\"2\">"\
|
||||
"<input>"
|
||||
"<port id=\"3\">"\
|
||||
"<dim>1</dim>"\
|
||||
"<dim>64</dim>"\
|
||||
"<dim>112</dim>"\
|
||||
"<dim>112</dim>"\
|
||||
"</port>"\
|
||||
"</input>"\
|
||||
"<output>"\
|
||||
"<port id=\"4\">"\
|
||||
"<dim>1</dim>"\
|
||||
"<dim>64</dim>"\
|
||||
"<dim>112</dim>"\
|
||||
"<dim>112</dim>"\
|
||||
"</port>"\
|
||||
"</output>"\
|
||||
"</layer>"\
|
||||
"<layer name=\"" +
|
||||
layerToClone->name +
|
||||
"_\" type=\"ScaleShift\" precision=\"FP32\" id=\"3\">"\
|
||||
"<input>"
|
||||
"<port id=\"5\">"\
|
||||
"<dim>1</dim>"\
|
||||
"<dim>64</dim>"\
|
||||
"<dim>112</dim>"\
|
||||
"<dim>112</dim>"\
|
||||
"</port>"\
|
||||
"</input>"\
|
||||
"<output>"\
|
||||
"<port id=\"6\">"\
|
||||
"<dim>1</dim>"\
|
||||
"<dim>64</dim>"\
|
||||
"<dim>112</dim>"\
|
||||
"<dim>112</dim>"\
|
||||
"</port>"\
|
||||
"</output>"\
|
||||
"</layer>"\
|
||||
"</layers> <edges>"\
|
||||
"<edge from-layer=\"0\" from-port=\"0\" to-layer=\"1\" to-port=\"1\"/> "\
|
||||
"<edge from-layer=\"1\" from-port=\"2\" to-layer=\"2\" to-port=\"3\"/> "\
|
||||
"<edge from-layer=\"2\" from-port=\"4\" to-layer=\"3\" to-port=\"5\"/> "\
|
||||
"</edges></net>";
|
||||
|
||||
reader1.ReadNetwork(model.c_str(), model.length());
|
||||
ICNNNetwork &n = reader1.getNetwork();
|
||||
|
||||
InferenceEngine::InputsDataMap inputs;
|
||||
n.getInputsInfo(inputs);
|
||||
CNNLayerPtr inputLayer = inputs.begin()->second->getInputData()->creatorLayer.lock();
|
||||
|
||||
CNNLayerPtr convLayer;
|
||||
n.getLayerByName(layerToClone->name.c_str(), convLayer, nullptr);
|
||||
ConvolutionLayer *pConvS = dynamic_cast<ConvolutionLayer *>(layerToClone.get());
|
||||
ConvolutionLayer *pConvT = dynamic_cast<ConvolutionLayer *>(convLayer.get());
|
||||
pConvT->_kernel_x = pConvS->_kernel_x;
|
||||
pConvT->_kernel_y = pConvS->_kernel_y;
|
||||
pConvT->_stride_x = pConvS->_stride_x;
|
||||
pConvT->_stride_y = pConvS->_stride_y;
|
||||
pConvT->_out_depth = pConvS->_out_depth;
|
||||
pConvT->_padding_x = pConvS->_padding_x;
|
||||
pConvT->_padding_y = pConvS->_padding_y;
|
||||
pConvT->_dilation_x = pConvS->_dilation_x;
|
||||
pConvT->_dilation_y = pConvS->_dilation_y;
|
||||
pConvT->_group = pConvS->_group;
|
||||
pConvT->_weights = pConvS->_weights;
|
||||
pConvT->_biases = pConvS->_biases;
|
||||
pConvT->blobs = pConvS->blobs;
|
||||
|
||||
std::shared_ptr<Data> cur = layerToClone->insData[0].lock();
|
||||
if (cur == nullptr) {
|
||||
THROW_IE_EXCEPTION << "[Samples] shared ptr layerToClone->insData[0].lock() return nullptr";
|
||||
}
|
||||
DataPtr inputEdge = std::make_shared<Data>(*cur.get());
|
||||
|
||||
inputEdge->getInputTo().clear();
|
||||
inputEdge->name = inpuitName;
|
||||
inputEdge->creatorLayer = inputLayer;
|
||||
inputEdge->inputTo[layerToClone->name] = convLayer;
|
||||
inputEdge->getInputTo().clear();
|
||||
inputEdge->inputTo[layerToClone->name] = convLayer;
|
||||
|
||||
inputs.begin()->second->setInputData(inputEdge);
|
||||
|
||||
convLayer->insData.clear();
|
||||
convLayer->insData.push_back(inputEdge);
|
||||
|
||||
inputLayer->outData.clear();
|
||||
inputLayer->outData.push_back(inputEdge);
|
||||
|
||||
DataPtr convEdge = std::make_shared<Data>(*layerToClone->outData[0].get());
|
||||
convEdge->getInputTo().clear();
|
||||
convEdge->creatorLayer = convLayer;
|
||||
convEdge->name = convLayer->name;
|
||||
convLayer->outData.clear();
|
||||
convLayer->outData.push_back(convEdge);
|
||||
|
||||
CNNLayerPtr reluLayer;
|
||||
n.getLayerByName(layerRelU->name.c_str(), reluLayer, nullptr);
|
||||
DataPtr reluEdge = std::make_shared<Data>(*layerRelU->outData[0].get());
|
||||
reluEdge->getInputTo().clear();
|
||||
reluEdge->creatorLayer = reluLayer;
|
||||
reluEdge->name = reluLayer->name;
|
||||
reluLayer->insData.clear();
|
||||
reluLayer->insData.push_back(convEdge);
|
||||
reluLayer->outData.clear();
|
||||
reluLayer->outData.push_back(reluEdge);
|
||||
|
||||
convEdge->inputTo[reluLayer->name] = reluLayer;
|
||||
|
||||
CNNLayerPtr ssLayer;
|
||||
std::string ssLayerName = convLayer->name + "_";
|
||||
n.getLayerByName(ssLayerName.c_str(), ssLayer, nullptr);
|
||||
DataPtr ssEdge = std::make_shared<Data>(*layerRelU->outData[0].get());
|
||||
ssEdge->getInputTo().clear();
|
||||
ssEdge->creatorLayer = ssLayer;
|
||||
ssEdge->name = ssLayer->name;
|
||||
ssLayer->insData.clear();
|
||||
ssLayer->insData.push_back(reluEdge);
|
||||
ssLayer->outData.clear();
|
||||
ssLayer->outData.push_back(ssEdge);
|
||||
|
||||
reluEdge->inputTo[ssLayer->name] = ssLayer;
|
||||
|
||||
n.addOutput(ssLayer->name);
|
||||
|
||||
// filling weights and biases
|
||||
size_t channels = ssEdge->getTensorDesc().getDims()[1];
|
||||
Blob::Ptr weights = nullptr;
|
||||
SizeVector wdims;
|
||||
wdims.push_back(channels);
|
||||
weights = make_shared_blob<float, const SizeVector>(Precision::FP32, Layout::C, wdims);
|
||||
weights->allocate();
|
||||
float *dataw = weights->buffer().as<float *>();
|
||||
for (size_t i = 0; i < channels; i++) {
|
||||
dataw[i] = 1.0f;
|
||||
}
|
||||
ssLayer->blobs["weights"] = weights;
|
||||
|
||||
Blob::Ptr biases = nullptr;
|
||||
SizeVector bdims;
|
||||
bdims.push_back(channels);
|
||||
biases = make_shared_blob<float, const SizeVector>(Precision::FP32, Layout::C, bdims);
|
||||
biases->allocate();
|
||||
float *datab = biases->buffer().as<float *>();
|
||||
for (size_t i = 0; i < channels; i++) {
|
||||
datab[i] = 0.0f;
|
||||
}
|
||||
ssLayer->blobs["biases"] = biases;
|
||||
|
||||
auto wss = dynamic_cast<WeightableLayer*>(ssLayer.get());
|
||||
wss->_weights = weights;
|
||||
wss->_biases = biases;
|
||||
|
||||
return reader1.getNetwork();
|
||||
}
|
||||
|
||||
void Int8Calibrator::collectByLayerStatistic(const InferenceEngine::NetworkStatsMap &stat) {
|
||||
_collectByLayer = true;
|
||||
_collectStatistic = false;
|
||||
networkReaderC = InferenceEngine::CNNNetReader();
|
||||
networkReaderC.ReadNetwork(_modelFileNameI8C);
|
||||
if (!networkReaderC.isParseSuccess()) THROW_IE_EXCEPTION << "cannot load a failed Model";
|
||||
if (_cBatch != 0) {
|
||||
networkReaderC.getNetwork().setBatchSize(_cBatch);
|
||||
}
|
||||
|
||||
/** Extract model name and load weights **/
|
||||
std::string binFileName = fileNameNoExt(_modelFileNameI8C) + ".bin";
|
||||
networkReaderC.ReadWeights(binFileName.c_str());
|
||||
|
||||
auto network = networkReaderC.getNetwork();
|
||||
// 1. add all layers as output one
|
||||
for (auto &&layer : network) {
|
||||
std::string layerType = network.getLayerByName(layer->name.c_str())->type;
|
||||
if (/*layerType != "Split" &&*/layerType != "Input") {
|
||||
network.addOutput(layer->name);
|
||||
}
|
||||
|
||||
if (layerType == "Convolution") {
|
||||
_layersAccuracyDrop[layer->name] = 0.f;
|
||||
}
|
||||
}
|
||||
|
||||
ExecutableNetwork executable_network = _pluginI8C.LoadNetwork(network, { { CONFIG_KEY(EXCLUSIVE_ASYNC_REQUESTS), CONFIG_VALUE(YES) } });
|
||||
_inferRequestI8C = executable_network.CreateInferRequest();
|
||||
|
||||
// 2. go over all layers which affect accuracy and create network basing on it
|
||||
for (auto l : _layersAccuracyDrop) {
|
||||
CNNLayerPtr layerToClone = network.getLayerByName(l.first.c_str());
|
||||
CNNLayerPtr layerRelU = nullptr;
|
||||
// verification if there is Conv-RELU patern
|
||||
// currently it is only supported
|
||||
|
||||
// if only one output from conv and if it is an output to relu
|
||||
bool quattization = false;
|
||||
if (layerToClone->outData.size() == 1 && layerToClone->outData[0]->inputTo.size() == 1) {
|
||||
layerRelU = layerToClone->outData[0]->inputTo.begin()->second;
|
||||
if (layerRelU->type == "ReLU") {
|
||||
quattization = true;
|
||||
}
|
||||
}
|
||||
|
||||
if (quattization) {
|
||||
CNNNetwork n = createICNNNetworkForLayer(layerToClone);
|
||||
if (_cBatch != 0) {
|
||||
n.setBatchSize(_cBatch);
|
||||
}
|
||||
|
||||
// Initialize statistic
|
||||
ICNNNetworkStats *pstats = nullptr;
|
||||
ICNNNetwork &in = n;
|
||||
StatusCode s = in.getStats(&pstats, nullptr);
|
||||
if (s == StatusCode::OK && pstats) {
|
||||
pstats->setNodesStats(stat);
|
||||
}
|
||||
|
||||
InferenceEngine::InputsDataMap inputs = n.getInputsInfo();
|
||||
DataPtr q = inputs.begin()->second->getInputData();
|
||||
|
||||
ExecutableNetwork enetwork = _pluginI8C.LoadNetwork(n, { { CONFIG_KEY(EXCLUSIVE_ASYNC_REQUESTS), CONFIG_VALUE(YES) } });
|
||||
_singleLayerNetworks.push_back(enetwork);
|
||||
InferenceEngine::InferRequest request = enetwork.CreateInferRequest();
|
||||
std::string inpuitName = layerToClone->insData[0].lock()->name;
|
||||
request.SetBlob(inpuitName, _inferRequestI8C.GetBlob(inpuitName));
|
||||
_singleLayerRequests[layerToClone->name] = { request, layerRelU->name, layerToClone->name };
|
||||
}
|
||||
}
|
||||
}

void Int8Calibrator::collectCalibrationStatistic() {
    if (_collectByLayer) {
        std::map<std::string, SingleLayerData>::iterator it = _singleLayerRequests.begin();
        while (it != _singleLayerRequests.end()) {
            it->second._request.Infer();
            Blob::Ptr expected = _inferRequestI8C.GetBlob(it->second._outputName);
            std::string i8Out = it->second._outputI8Name + "_";
            Blob::Ptr result = it->second._request.GetBlob(i8Out.c_str());
            float diff = compare_NRMSD(result, expected);
            it->second._int8Accuracy.push_back(diff);
            it++;
        }
    }
    if (_collectStatistic) {
        for (auto l : _statData.registeredLayers()) {
            auto outBlob = _inferRequestI8C.GetBlob(l);

            std::string outName = l;
            if (_inputsFromLayers.find(l) != _inputsFromLayers.end()) {
                outName = _inputsFromLayers[l];
            }

            size_t N, C, statCount;
            if (outBlob->dims().size() == 4 && outBlob->layout() == Layout::NCHW) {
                N = outBlob->dims()[3];
                C = outBlob->dims()[2];
                statCount = C;
            } else if (outBlob->dims().size() == 2 && outBlob->layout() == Layout::NC) {
                N = outBlob->dims()[1];
                C = outBlob->dims()[0];
                statCount = 1;
            } else {
                continue;
            }

            // Counting min/max outputs per channel
            for (size_t n = 0; n < N; n++) {
                if (outBlob->dims().size() == 4) {
                    size_t _HW = outBlob->dims()[0] * outBlob->dims()[1];
                    for (size_t c = 0; c < C; c++) {
                        if (outBlob->getTensorDesc().getPrecision() == Precision::FP32) {
                            float *ptr = &outBlob->buffer().as<float *>()[(n * C + c) * _HW];
                            _statData.addTensorStatistics(outName, c, ptr, _HW);
                        } else if (outBlob->getTensorDesc().getPrecision() == Precision::U8) {
                            uint8_t *ptr = &outBlob->buffer().as<uint8_t *>()[(n * C + c) * _HW];
                            _statData.addTensorStatistics(outName, c, ptr, _HW);
                        } else {
                            throw std::logic_error(std::string("Unsupported precision: ") + outBlob->getTensorDesc().getPrecision().name());
                        }
                    }
                } else if (outBlob->dims().size() == 2) {
                    if (outBlob->getTensorDesc().getPrecision() == Precision::FP32) {
                        float *ptr = &outBlob->buffer().as<float *>()[n * C];
                        _statData.addTensorStatistics(outName, 0, ptr, C);
                    } else if (outBlob->getTensorDesc().getPrecision() == Precision::U8) {
                        uint8_t *ptr = &outBlob->buffer().as<uint8_t *>()[n * C];
                        _statData.addTensorStatistics(outName, 0, ptr, C);
                    } else {
                        throw std::logic_error(std::string("Unsupported precision: ") + outBlob->getTensorDesc().getPrecision().name());
                    }
                }
            }
        }
    }
}

void Int8Calibrator::calculateLayersAccuracyDrop() {
    _layersAccuracyDrop.clear();

    std::map<std::string, SingleLayerData>::iterator it = _singleLayerRequests.begin();
    while (it != _singleLayerRequests.end()) {
        // calculate the average NRMSD per layer over all images
        float mo = 0.f;
        for (auto d : it->second._int8Accuracy) {
            mo += d;
        }
        mo = mo / it->second._int8Accuracy.size();
        _layersAccuracyDrop[it->first] = mo;
        it++;
    }

    // Add a small, growing correction in topological order so that equal accuracy
    // drops still produce distinct, ordered values. This prioritizes returning
    // layers closer to the end of the network back to FP32 first.
    std::vector<CNNLayerPtr> ordered = InferenceEngine::details::CNNNetSortTopologically(networkReaderC.getNetwork());
    float c = 0.00001f;
    for (auto l : ordered) {
        auto it = _layersAccuracyDrop.find(l->name);
        if (it != _layersAccuracyDrop.end()) {
            it->second += c;
        }
        c += 0.00001f;
    }
    _singleLayerRequests.clear();
}

std::map<std::string, float> Int8Calibrator::layersAccuracyDrop() {
    return _layersAccuracyDrop;
}


//--------------------------------------------------------------------------------------------------

ClassificationCalibrator::ClassificationCalibrator(int nPictures, const std::string &flags_m,
                                                   const std::string &flags_d, const std::string &flags_i,
                                                   int flags_b, InferenceEngine::InferencePlugin plugin,
                                                   CsvDumper &dumper, const std::string &flags_l,
                                                   PreprocessingOptions preprocessingOptions, bool zeroBackground) :
    ClassificationProcessor(flags_m, flags_d, flags_i, flags_b,
                            plugin, dumper, flags_l,
                            preprocessingOptions, zeroBackground) {
    _modelFileNameI8C = modelFileName;
    _pluginI8C = plugin;
    _nPictures = nPictures;
    _cBatch = flags_b;
}

shared_ptr<Processor::InferenceMetrics> ClassificationCalibrator::Process() {
    inferRequest = _inferRequestI8C;
    int top1Result = 0, total = 0;

    ClassificationSetGenerator generator;

    auto validationMap = generator.getValidationMap(imagesPath);
    ImageDecoder decoder;

    // ----------------------------Do inference-------------------------------------------------------------
    std::vector<int> expected(batch);
    std::vector<std::string> files(batch);
    int captured = 0;

    if (!_nPictures) {
        _nPictures = validationMap.size();
    }

    ConsoleProgress progress(_nPictures);

    CalibrationMetrics im;

    std::string firstInputName = this->inputInfo.begin()->first;
    std::string firstOutputName = this->outInfo.begin()->first;
    auto firstInputBlob = inferRequest.GetBlob(firstInputName);
    auto firstOutputBlob = inferRequest.GetBlob(firstOutputName);

    size_t ipics = 0;
    auto iter = validationMap.begin();
    while (iter != validationMap.end() && ipics < _nPictures) {
        int b = 0;
        int filesWatched = 0;
        for (; b < batch && iter != validationMap.end() && ipics + b < _nPictures; b++, iter++, filesWatched++) {
            expected[b] = iter->first;
            try {
                decoder.insertIntoBlob(iter->second, b, *firstInputBlob, preprocessingOptions);
                files[b] = iter->second;
            } catch (const InferenceEngineException &iex) {
                slog::warn << "Can't read file " << iter->second << slog::endl;
                // Could be some non-image file in the directory
                b--;
                continue;
            }
        }
        ipics += batch;

        Infer(progress, filesWatched, im);
        collectCalibrationStatistic();

        std::vector<unsigned> results;
        auto firstOutputData = firstOutputBlob->buffer().as<PrecisionTrait<Precision::FP32>::value_type *>();
        InferenceEngine::TopResults(1, *firstOutputBlob, results);

        for (int i = 0; i < b; i++) {
            int expc = expected[i];
            if (zeroBackground) expc++;
            bool top1Scored = (results[i] == expc);
            if (top1Scored) top1Result++;
            total++;
        }
    }
    progress.finish();

    calculateLayersAccuracyDrop();

    im.AccuracyResult = static_cast<float>(top1Result) / static_cast<float>(total);

    return std::shared_ptr<Processor::InferenceMetrics>(new CalibrationMetrics(im));
}

//--------------------------------------------------------------------------------------------------
SSDObjectDetectionCalibrator::SSDObjectDetectionCalibrator(int nPictures, const std::string &flags_m,
                                                           const std::string &flags_d, const std::string &flags_i,
                                                           const std::string &subdir, int flags_b,
                                                           double threshold,
                                                           InferencePlugin plugin, CsvDumper &dumper,
                                                           const std::string &flags_a, const std::string &classes_list_file) :
    SSDObjectDetectionProcessor(flags_m, flags_d, flags_i, subdir, flags_b,
                                threshold,
                                plugin, dumper,
                                flags_a, classes_list_file) {
    _modelFileNameI8C = modelFileName;
    _pluginI8C = plugin;
    _nPictures = nPictures;
}

shared_ptr<Processor::InferenceMetrics> SSDObjectDetectionCalibrator::Process() {
    inferRequest = _inferRequestI8C;

    // Parsing PASCAL VOC2012 format
    VOCAnnotationParser vocAnnParser;
    VOCAnnotationCollector annCollector(annotationsPath);

    if (annCollector.annotations().size() == 0) {
        ObjectDetectionInferenceMetrics emptyIM(this->threshold);

        return std::shared_ptr<InferenceMetrics>(new ObjectDetectionInferenceMetrics(emptyIM));
    }

    // Getting the desired results from the annotations
    std::map<std::string, ImageDescription> desiredForFiles;

    for (auto &ann : annCollector.annotations()) {
        std::list<DetectedObject> dobList;
        for (auto &obj : ann.objects) {
            DetectedObject dob(classes[obj.name], obj.bndbox.xmin, obj.bndbox.ymin, obj.bndbox.xmax, obj.bndbox.ymax, 1.0, obj.difficult != 0);
            dobList.push_back(dob);
        }
        ImageDescription id(dobList);
        desiredForFiles.insert(std::pair<std::string, ImageDescription>(ann.folder + "/" + (!subdir.empty() ? subdir + "/" : "") + ann.filename, id));
    }


    ImageDecoder decoder;

    const int maxProposalCount = outputDims[1];
    const int objectSize = outputDims[0];

    for (auto &item : outInfo) {
        DataPtr outputData = item.second;
        if (!outputData) {
            throw std::logic_error("output data pointer is not valid");
        }
    }
    // -----------------------------------------------------------------------------------------------------

    // ----------------------------Do inference-------------------------------------------------------------

    std::vector<VOCAnnotation> expected(batch);

    if (!_nPictures) {
        _nPictures = annCollector.annotations().size();
    }

    ConsoleProgress progress(_nPictures);

    ObjectDetectionInferenceMetrics im(threshold);

    vector<VOCAnnotation>::const_iterator iter = annCollector.annotations().begin();

    std::map<std::string, ImageDescription> scaledDesiredForFiles;

    std::string firstInputName = this->inputInfo.begin()->first;
    auto firstInputBlob = inferRequest.GetBlob(firstInputName);
    size_t ipics = 0;

    while (iter != annCollector.annotations().end() && ipics < _nPictures) {
        std::vector<std::string> files;
        int b = 0;

        int filesWatched = 0;
        for (; b < batch && iter != annCollector.annotations().end(); b++, iter++, filesWatched++) {
            expected[b] = *iter;
            string filename = iter->folder + "/" + (!subdir.empty() ? subdir + "/" : "") + iter->filename;
            try {
                Size orig_size = decoder.insertIntoBlob(std::string(imagesPath) + "/" + filename, b, *firstInputBlob, preprocessingOptions);
                float scale_x, scale_y;

                scale_x = 1.0 / iter->size.width;   // orig_size.width;
                scale_y = 1.0 / iter->size.height;  // orig_size.height;

                if (scaleProposalToInputSize) {
                    scale_x *= firstInputBlob->dims()[0];
                    scale_y *= firstInputBlob->dims()[1];
                }

                // Scaling the desired result (taken from the annotation) to the network size
                scaledDesiredForFiles.insert(std::pair<std::string, ImageDescription>(filename, desiredForFiles.at(filename).scale(scale_x, scale_y)));

                files.push_back(filename);
            } catch (const InferenceEngineException &iex) {
                slog::warn << "Can't read file " << this->imagesPath + "/" + filename << slog::endl;
                // Could be some non-image file in the directory
                b--;
                continue;
            }
            ipics++;
        }

        if (files.size() == batch) {
            InferenceEngine::StatusCode sts;
            InferenceEngine::ResponseDesc dsc;

            // Infer the model
            Infer(progress, filesWatched, im);
            collectCalibrationStatistic();

            // Process the inference result
            std::map<std::string, std::list<DetectedObject>> detectedObjects = processResult(files);

            // Calculate similarity
            for (size_t b = 0; b < files.size(); b++) {
                ImageDescription result(detectedObjects[files[b]]);
                im.apc.consumeImage(result, scaledDesiredForFiles.at(files[b]));
            }
        }
    }
    progress.finish();

    calculateLayersAccuracyDrop();

    CalibrationMetrics imCalibration;
    const ObjectDetectionInferenceMetrics &odim = dynamic_cast<const ObjectDetectionInferenceMetrics&>(im);
    if (im.nRuns > 0) {
        std::map<int, double> appc = odim.apc.calculateAveragePrecisionPerClass();

        double mAP = 0;
        for (auto i : appc) {
            mAP += i.second;
        }
        imCalibration.AccuracyResult = mAP / appc.size();
    }
    return std::shared_ptr<Processor::InferenceMetrics>(new CalibrationMetrics(imCalibration));
}

178
inference-engine/samples/calibration_tool/calibrator_processors.h
Normal file
@@ -0,0 +1,178 @@
// Copyright (C) 2018 Intel Corporation
//
// SPDX-License-Identifier: Apache-2.0
//

#pragma once

#include <vector>
#include <string>
#include "inference_engine.hpp"
#include "ClassificationProcessor.hpp"
#include "SSDObjectDetectionProcessor.hpp"
#include "data_stats.h"
#include <map>
#include <memory>

/**
 * Calibrator class representing the unified stages for calibration of any kind of network
 */
class Int8Calibrator {
public:
    /**
     * Intermediate structure storing data for the by-layer measurement of accuracy drop
     */
    struct SingleLayerData {
        InferenceEngine::InferRequest _request;
        std::string _outputName;
        std::string _outputI8Name;
        std::vector<float> _int8Accuracy;
    };

    /**
     * Initializes the state to collect the accuracy of the FP32 network and the statistics
     * of activations. The activation statistics are stored in _statData and hold the min/max
     * values for all layers and all pictures.
     * The inference of all pictures and the actual collection of the statistics happen during
     * the call of Processor::Process()
     */
    void collectFP32Statistic();

    /**
     * Initializes the state to collect the intermediate numeric accuracy drop happening during
     * quantization of a certain layer to int8. The numeric accuracy drop is measured using the
     * NRMSD metric.
     *
     * For this purpose it creates a dedicated network for each such layer and initializes this
     * network with statistics that cause the dedicated network to be executed in int8 mode.
     *
     * In addition, the full original network is executed in FP32 mode with all layers
     * registered as outputs.
     * Information from these layers is used as
     * a) input to the dedicated single-layer networks
     * b) the FP32 reference for the NRMSD comparison between int8 and FP32 results
     *
     * The inference of all pictures and the actual collection of the drop happen during the
     * call of Processor::Process()
     * @param stat
     */
    void collectByLayerStatistic(const InferenceEngine::NetworkStatsMap &stat);

    /**
     * Initializes the state to collect the accuracy in int8 mode, to be compared later against
     * the FP32 accuracy metric.
     *
     * The inference of all pictures and the actual collection of the accuracy happen during
     * the call of Processor::Process()
     *
     * @param stat - the statistics used for normalization
     * @param layersToInt8 - list of layers planned to be executed in int8. If a layer is absent
     * from this map, it is assumed that it will be executed in int8
     */
    void validateInt8Config(const InferenceEngine::NetworkStatsMap &stat,
                            const std::map<std::string, bool>& layersToInt8);

    /**
     * The statistics collected by collectFP32Statistic are processed with the threshold passed
     * as a parameter to this method. All values for each layer over all pictures are sorted,
     * and the min/max values exceeding the threshold are thrown away
     * @param threshold - parameter for throwing away outliers in the activation statistics
     * @return InferenceEngine::NetworkStatsMap - mapping of layer name to NetworkNodeStatsPtr
     */
    InferenceEngine::NetworkStatsMap getStatistic(float threshold);

    /**
     * Returns the by-layer accuracy drop container
     */
    std::map<std::string, float> layersAccuracyDrop();
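
    // The methods above are typically driven in the order sketched below (a
    // condensed, illustrative version of what main.cpp in this change does;
    // the 99.5f threshold is just an example value, error handling omitted):
    //
    //     calibrator->collectFP32Statistic();
    //     processor->Process();                         // FP32 baseline + activation stats
    //     auto stat = calibrator->getStatistic(99.5f);  // trim outliers
    //     calibrator->validateInt8Config(stat, {});     // all layers in int8
    //     processor->Process();                         // int8 accuracy
    //     calibrator->collectByLayerStatistic(stat);    // if the drop is too big
    //     processor->Process();                         // per-layer NRMSD drop
    //     auto drops = calibrator->layersAccuracyDrop();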

protected:
    /**
     * This function should be called from the final calibrator after each Infer of each picture.
     * It calculates the by-layer accuracy drop and also collects the activation value statistics
     */
    void collectCalibrationStatistic();

    /**
     * This function should be called from the calibration class after the Infer of all pictures.
     * It calculates the average NRMSD-based accuracy drop for each layer and fills _layersAccuracyDrop
     */
    void calculateLayersAccuracyDrop();

    bool _collectByLayer = false;
    bool _collectStatistic = true;
    InferencePlugin _pluginI8C;
    std::string _modelFileNameI8C;
    InferenceEngine::CNNNetReader networkReaderC;
    InferenceEngine::InferRequest _inferRequestI8C;
    int _cBatch = 0;

    int _nPictures;

private:
    /**
     * Helper function for getting statistics for input layers. To collect them, we add a
     * ScaleShift layer just after the input, with scale == 1 and shift == 0
     */
    CNNLayerPtr addScaleShiftBeforeLayer(std::string name, InferenceEngine::CNNLayer::Ptr beforeLayer,
                                         size_t port, std::vector<float> scale);

    /**
     * Returns the normalized root-mean-square deviation (NRMSD) metric for the two blobs
     * passed to the function
     */
    float compare_NRMSD(InferenceEngine::Blob::Ptr res, InferenceEngine::Blob::Ptr ref);
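
    // For reference: NRMSD is assumed here to follow the common definition
    //
    //     NRMSD = sqrt((1/n) * sum_i (res_i - ref_i)^2) / (max(ref) - min(ref)),
    //
    // i.e. the root-mean-square error normalized by the range of the reference
    // values; the exact normalization used lives in calibrator_processors.cpp.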

    /**
     * Creates a dedicated int8 network around the selected layer. Currently this network,
     * besides the layer itself, has to contain ReLU and ScaleShift layers.
     * Since the Inference Engine API is mostly directed at loading a network from IR, we need
     * to create such an IR first, read it through a stream, and modify the network to match
     * the required parameters
     */
    InferenceEngine::CNNNetwork createICNNNetworkForLayer(InferenceEngine::CNNLayer::Ptr layerToClone);

    std::map<std::string, float> _layersAccuracyDrop;
    std::vector<InferenceEngine::ExecutableNetwork> _singleLayerNetworks;
    std::map<std::string, SingleLayerData> _singleLayerRequests;
    std::map<std::string, std::string> _inputsFromLayers;
    AggregatedDataStats _statData;
};

/**
 * This class represents the single generalized metric which is used for comparison of the
 * accuracy drop
 */
struct CalibrationMetrics : public ClassificationProcessor::InferenceMetrics {
public:
    float AccuracyResult = 0;
};

/**
 * Calibration class for classification networks.
 * Responsible for the proper post-processing of results and the calculation of the Top-1 metric,
 * which is used as the universal accuracy metric and participates in the verification of the
 * accuracy drop
 */
class ClassificationCalibrator : public ClassificationProcessor, public Int8Calibrator {
public:
    ClassificationCalibrator(int nPictures, const std::string &flags_m, const std::string &flags_d,
                             const std::string &flags_i, int flags_b,
                             InferenceEngine::InferencePlugin plugin, CsvDumper &dumper, const std::string &flags_l,
                             PreprocessingOptions preprocessingOptions, bool zeroBackground);

    shared_ptr<InferenceMetrics> Process() override;
};


/**
 * Calibration class for SSD object detection networks.
 * Responsible for the proper post-processing of results and the calculation of the mAP metric,
 * which is used as the universal accuracy metric and participates in the verification of the
 * accuracy drop
 */
class SSDObjectDetectionCalibrator : public SSDObjectDetectionProcessor, public Int8Calibrator {
public:
    SSDObjectDetectionCalibrator(int nPictures, const std::string &flags_m, const std::string &flags_d,
                                 const std::string &flags_i, const std::string &subdir, int flags_b,
                                 double threshold,
                                 InferencePlugin plugin, CsvDumper &dumper,
                                 const std::string &flags_a, const std::string &classes_list_file);

    shared_ptr<InferenceMetrics> Process() override;
};

105
inference-engine/samples/calibration_tool/data_stats.cpp
Normal file
@@ -0,0 +1,105 @@
// Copyright (C) 2018 Intel Corporation
//
// SPDX-License-Identifier: Apache-2.0
//

#include <stdlib.h>
#include <cfloat>
#include <cmath>
#include <stdint.h>
#include <iostream>
#include <limits>
#include <vector>
#include <algorithm>
#include <string>

#include "data_stats.h"


TensorStatistic::TensorStatistic(float* data, size_t count, size_t nbuckets) {
    // Note: nbuckets is currently unused; only the min/max of the buffer are tracked
    _min = std::numeric_limits<float>::max();
    _max = std::numeric_limits<float>::lowest();  // lowest(), not min(): min() is the smallest positive float
    for (size_t i = 0; i < count; i++) {
        float val = static_cast<float>(data[i]);
        if (_min > val) {
            _min = val;
        }

        if (_max < val) {
            _max = val;
        }
    }
}

float TensorStatistic::getMaxValue() const {
    return _max;
}


float TensorStatistic::getMinValue() const {
    return _min;
}

std::vector<std::string> AggregatedDataStats::registeredLayers() {
    std::vector<std::string> layers;
    for (auto l : _data) {
        layers.push_back(l.first);
    }
    return layers;
}

void AggregatedDataStats::registerLayer(std::string layer) {
    _data[layer];
}

void AggregatedDataStats::addTensorStatistics(const std::string& name, size_t channel, float* data, size_t count) {
    auto&& byChannel = _data[name];
    byChannel[channel].push_back(TensorStatistic(data, count));
}

void AggregatedDataStats::addTensorStatistics(const std::string &name, size_t channel, uint8_t *data, size_t count) {
    std::vector<float> intermediate;
    for (size_t i = 0; i < count; i++) {
        intermediate.push_back(data[i]);
    }
    addTensorStatistics(name, channel, intermediate.data(), count);
}

size_t AggregatedDataStats::getNumberChannels(const std::string& name) const {
    auto it = _data.find(name);
    if (it != _data.end()) {
        return it->second.size();
    }
    return 0;
}

void AggregatedDataStats::getDataMinMax(const std::string& name, size_t channel, float& min, float& max, float threshold) {
    // take the data by name
    auto it = _data.find(name);
    if (it != _data.end()) {
        auto stats = it->second[channel];
        // having the absolute min/max values, we can create a new statistic
        std::vector<float> maxValues;
        std::vector<float> minValues;
        for (size_t i = 0; i < stats.size(); i++) {
            const TensorStatistic& tsS = stats[i];
            maxValues.push_back(tsS.getMaxValue());
            minValues.push_back(tsS.getMinValue());
        }
        // define the number of elements to throw out
        size_t elementToTake = maxValues.size() * threshold / 100;
        int elementsToThrow = maxValues.size() - elementToTake;
        std::sort(maxValues.begin(), maxValues.end());
        std::sort(minValues.begin(), minValues.end());

        min = minValues[elementsToThrow];
        max = maxValues[elementToTake - 1];
    } else {
        min = max = 0.f;
    }
}
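
// Worked example for the trimming above (illustrative numbers): with 200
// collected TensorStatistic entries and threshold = 99.5, elementToTake =
// 200 * 99.5 / 100 = 199 and elementsToThrow = 1, so the single largest max
// (maxValues[199]) and the single smallest min (minValues[0]) are discarded:
// max = maxValues[198] and min = minValues[1].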

32
inference-engine/samples/calibration_tool/data_stats.h
Normal file
@@ -0,0 +1,32 @@
// Copyright (C) 2018 Intel Corporation
//
// SPDX-License-Identifier: Apache-2.0
//

#pragma once

#include <vector>
#include <map>
#include <string>

struct TensorStatistic {
    TensorStatistic(float* data, size_t count, size_t nbuckets = 1000);
    float getMaxValue() const;
    float getMinValue() const;
protected:
    float _min;
    float _max;
};

class AggregatedDataStats {
public:
    void addTensorStatistics(const std::string& name, size_t channel, float* data, size_t count);
    void addTensorStatistics(const std::string &name, size_t channel, uint8_t *data, size_t count);
    void getDataMinMax(const std::string& name, size_t channel, float& min, float& max, float threshold);
    size_t getNumberChannels(const std::string& name) const;
    std::vector<std::string> registeredLayers();
    void registerLayer(std::string layer);
protected:
    std::map<std::string, std::map<size_t, std::vector<TensorStatistic> > > _data;
};
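
// Minimal usage sketch (illustrative only; the layer name and values are made
// up): register a layer, feed it per-channel activation buffers, then query
// the outlier-trimmed range.
//
//     AggregatedDataStats stats;
//     stats.registerLayer("conv1");
//     float act[4] = {0.1f, 2.5f, -0.3f, 7.9f};
//     stats.addTensorStatistics("conv1", 0 /*channel*/, act, 4);
//     float min = 0.f, max = 0.f;
//     stats.getDataMinMax("conv1", 0, min, max, 100.f /*keep all values*/);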

521
inference-engine/samples/calibration_tool/main.cpp
Normal file
@@ -0,0 +1,521 @@
// Copyright (C) 2018 Intel Corporation
//
// SPDX-License-Identifier: Apache-2.0
//

/**
 * @brief The entry point for the Inference Engine calibration tool
 * @file calibration_tool/main.cpp
 */
#include <gflags/gflags.h>
#include <algorithm>
#include <functional>
#include <iostream>
#include <map>
#include <fstream>
#include <random>
#include <string>
#include <tuple>
#include <vector>
#include <limits>
#include <iomanip>
#include <memory>

#include <ext_list.hpp>

#include <samples/common.hpp>
#include <samples/slog.hpp>

#include "user_exception.hpp"
#include "calibrator_processors.h"
#include "SSDObjectDetectionProcessor.hpp"
#include "YOLOObjectDetectionProcessor.hpp"
#include "network_serializer.h"
#include "ie_icnn_network_stats.hpp"
#include "details/caseless.hpp"

using namespace std;
using namespace InferenceEngine;
using namespace InferenceEngine::details;

using InferenceEngine::details::InferenceEngineException;

#define DEFAULT_PATH_P "./lib"

/// @brief Message for help argument
static const char help_message[] = "Print a help message";
/// @brief Message for images argument
static const char image_message[] = "Required. Path to a directory with validation images. For Classification models, the directory must contain"
                                    " folders named as labels with images inside or a .txt file with"
                                    " a list of images. For Object Detection models, the dataset must be in"
                                    " VOC format.";
/// @brief Message for plugin_path argument
static const char plugin_path_message[] = "Path to a plugin folder";
/// @brief Message for model argument
static const char model_message[] = "Required. Path to an .xml file with a trained model, including model name and "
                                    "extension.";
/// @brief Message for plugin argument
static const char plugin_message[] = "Plugin name. For example, CPU. If this parameter is passed, "
                                     "the sample looks for a specified plugin only.";
/// @brief Message for assigning cnn calculation to device
static const char target_device_message[] = "Target device to infer on: CPU (default), GPU, FPGA, or MYRIAD."
                                            " The application looks for a suitable plugin for the specified device.";
/// @brief Message for label argument
static const char label_message[] = "Path to a file with labels for a model";
/// @brief Message for batch argument
static const char batch_message[] = "Batch size value. If not specified, the batch size value is taken from IR";
/// @brief Message for dump argument
static const char dump_message[] = "Dump file names and inference results to a .csv file";
/// @brief Message for network type
static const char type_message[] = "Type of an inferred network (\"C\" by default)";
/// @brief Message for pp-type
static const char preprocessing_type[] = "Preprocessing type. Options: \"None\", \"Resize\", \"ResizeCrop\"";
/// @brief Message for pp-crop-size
static const char preprocessing_size[] = "Preprocessing size (used with ppType=\"ResizeCrop\")";
static const char preprocessing_width[] = "Preprocessing width (overrides -ppSize, used with ppType=\"ResizeCrop\")";
static const char preprocessing_height[] = "Preprocessing height (overrides -ppSize, used with ppType=\"ResizeCrop\")";

static const char obj_detection_annotations_message[] = "Required for Object Detection models. Path to a directory"
                                                        " containing an .xml file with annotations for images.";

static const char obj_detection_classes_message[] = "Required for Object Detection models. Path to a file with"
                                                    " a list of classes";

static const char obj_detection_subdir_message[] = "Directory between the path to images (specified with -i) and image name (specified in the"
                                                   " .xml file). For the VOC2007 dataset, use JPEGImages.";

static const char obj_detection_kind_message[] = "Type of an Object Detection model. Options: SSD";

/// @brief Message for GPU custom kernels desc
static const char custom_cldnn_message[] = "Required for GPU custom kernels. "
                                           "Absolute path to an .xml file with the kernel descriptions.";

/// @brief Message for user library argument
static const char custom_cpu_library_message[] = "Required for CPU custom layers. "
                                                 "Absolute path to a shared library with the kernel implementations.";

static const char zero_background_message[] = "\"Zero is a background\" flag. Some networks are trained with a modified"
                                              " dataset where the class IDs"
                                              " are enumerated from 1, but 0 is an undefined \"background\" class"
                                              " (which is never detected)";

/// @brief Network type options and their descriptions
static const char* types_descriptions[][2] = {
    { "C", "calibrate Classification network and write the calibrated network to IR" },
//    { "SS", "semantic segmentation" },    // Not supported yet
    { "OD", "calibrate Object Detection network and write the calibrated network to IR" },
    { "RawC", "collect only statistics for Classification network and write statistics to IR. With this option, a model is not calibrated. For calibration "
              "and statistics collection, use \"-t C\" instead." },
    { "RawOD", "collect only statistics for Object Detection network and write statistics to IR. With this option, a model is not calibrated. For calibration "
               "and statistics collection, use \"-t OD\" instead" },
    { nullptr, nullptr }
};

static const char accuracy_threshold_message[] = "Threshold for a maximum accuracy drop of the quantized model."
                                                 " Must be a number (in percent)"
                                                 " without a percent sign. Default value is 1, which stands for an accepted"
                                                 " accuracy drop of 1%";
static const char number_of_pictures_message[] = "Number of pictures from the whole validation set used to"
                                                 " create the calibration dataset. Default value is 0, which stands for"
                                                 " the whole provided dataset";
static const char output_model_name[] = "Output name for the calibrated model. Default is <original_model_name>_i8.xml|bin";

/// @brief Define flag for showing help message <br>
DEFINE_bool(h, false, help_message);
/// @brief Define parameter for a path to images <br>
/// It is a required parameter
DEFINE_string(i, "", image_message);
/// @brief Define parameter for a path to model file <br>
/// It is a required parameter
DEFINE_string(m, "", model_message);
/// @brief Define parameter for a plugin name <br>
/// It is a required parameter
DEFINE_string(p, "", plugin_message);
/// @brief Define parameter for a path to a file with labels <br>
/// Default is empty
DEFINE_string(OCl, "", label_message);
/// @brief Define parameter for a path to plugins <br>
/// Default is ./lib
DEFINE_string(pp, DEFAULT_PATH_P, plugin_path_message);
/// @brief Define parameter for a target device to infer on <br>
DEFINE_string(d, "CPU", target_device_message);
/// @brief Define parameter for batch size <br>
/// Default is 0 (which means that batch size is not specified)
DEFINE_int32(b, 0, batch_message);
/// @brief Define flag to dump results to a file <br>
DEFINE_bool(dump, false, dump_message);
/// @brief Define parameter for a network type
DEFINE_string(t, "C", type_message);

/// @brief Define parameter for preprocessing type
DEFINE_string(ppType, "", preprocessing_type);

/// @brief Define parameter for preprocessing size
DEFINE_int32(ppSize, 0, preprocessing_size);
DEFINE_int32(ppWidth, 0, preprocessing_width);
DEFINE_int32(ppHeight, 0, preprocessing_height);

DEFINE_bool(Czb, false, zero_background_message);

DEFINE_string(ODa, "", obj_detection_annotations_message);

DEFINE_string(ODc, "", obj_detection_classes_message);

DEFINE_string(ODsubdir, "", obj_detection_subdir_message);

/// @brief Define parameter for a type of Object Detection network
DEFINE_string(ODkind, "SSD", obj_detection_kind_message);

/// @brief Define parameter for GPU kernels path <br>
/// Default is ./lib
DEFINE_string(c, "", custom_cldnn_message);

/// @brief Define parameter for a path to CPU library with user layers <br>
/// It is an optional parameter
DEFINE_string(l, "", custom_cpu_library_message);

/// @brief Define parameter for accuracy drop threshold
DEFINE_double(threshold, 1.0f, accuracy_threshold_message);

DEFINE_int32(subset, 0, number_of_pictures_message);

DEFINE_string(output, "", output_model_name);

/**
 * @brief This function shows a help message
 */
static void showUsage() {
    std::cout << std::endl;
    std::cout << "Usage: calibration_tool [OPTION]" << std::endl << std::endl;
    std::cout << "Available options:" << std::endl;
    std::cout << std::endl;
    std::cout << "    -h                        " << help_message << std::endl;
    std::cout << "    -t <type>                 " << type_message << std::endl;
    for (int i = 0; types_descriptions[i][0] != nullptr; i++) {
        std::cout << "      -t \"" << types_descriptions[i][0] << "\" to " << types_descriptions[i][1] << std::endl;
    }
    std::cout << "    -i <path>                 " << image_message << std::endl;
    std::cout << "    -m <path>                 " << model_message << std::endl;
    std::cout << "    -l <absolute_path>        " << custom_cpu_library_message << std::endl;
    std::cout << "    -c <absolute_path>        " << custom_cldnn_message << std::endl;
    std::cout << "    -d <device>               " << target_device_message << std::endl;
    std::cout << "    -b N                      " << batch_message << std::endl;
    std::cout << "    -ppType <type>            " << preprocessing_type << std::endl;
    std::cout << "    -ppSize N                 " << preprocessing_size << std::endl;
    std::cout << "    -ppWidth W                " << preprocessing_width << std::endl;
    std::cout << "    -ppHeight H               " << preprocessing_height << std::endl;
    std::cout << "    --dump                    " << dump_message << std::endl;
    std::cout << "    -subset                   " << number_of_pictures_message << std::endl;
    std::cout << "    -output <output_IR>       " << output_model_name << std::endl;
    std::cout << "    -threshold                " << accuracy_threshold_message << std::endl;

    std::cout << std::endl;
    std::cout << "    Classification-specific options:" << std::endl;
    std::cout << "      -Czb true               " << zero_background_message << std::endl;

    std::cout << std::endl;
    std::cout << "    Object detection-specific options:" << std::endl;
    std::cout << "      -ODkind <kind>          " << obj_detection_kind_message << std::endl;
    std::cout << "      -ODa <path>             " << obj_detection_annotations_message << std::endl;
    std::cout << "      -ODc <file>             " << obj_detection_classes_message << std::endl;
    std::cout << "      -ODsubdir <name>        " << obj_detection_subdir_message << std::endl << std::endl;
}

enum NetworkType {
    Undefined = -1,
    Classification,
    ObjDetection,
    RawC,
    RawOD
};

std::string strtolower(const std::string& s) {
    std::string res = s;
    std::transform(res.begin(), res.end(), res.begin(), ::tolower);
    return res;
}

void SaveCalibratedIR(const std::string &originalName,
                      const std::string &outModelName,
                      const std::map<std::string, bool>& layersToInt8,
                      const InferenceEngine::NetworkStatsMap& statMap) {
    slog::info << "Layers profile for Int8 quantization\n";
    CNNNetReader networkReader;
    networkReader.ReadNetwork(originalName);
    if (!networkReader.isParseSuccess()) THROW_IE_EXCEPTION << "cannot load a failed Model";

    /** Extract model name and load weights **/
    std::string binFileName = fileNameNoExt(originalName) + ".bin";
    networkReader.ReadWeights(binFileName.c_str());

    auto network = networkReader.getNetwork();
    for (auto &&layer : network) {
        if (CaselessEq<std::string>()(layer->type, "convolution")) {
            auto it = layersToInt8.find(layer->name);
            if (it != layersToInt8.end() && it->second == false) {
                layer->params["quantization_level"] = "FP32";
                std::cout << layer->name << ": " << "FP32" << std::endl;
            } else {
                layer->params["quantization_level"] = "I8";
                std::cout << layer->name << ": " << "I8" << std::endl;
            }
        }
    }


    ICNNNetworkStats* pstats = nullptr;
    StatusCode s = ((ICNNNetwork&)networkReader.getNetwork()).getStats(&pstats, nullptr);
    if (s == StatusCode::OK && pstats) {
        pstats->setNodesStats(statMap);
    }

    slog::info << "Write calibrated network to " << outModelName << ".(xml|bin) IR file\n";
    CNNNetworkSerializer serializer;
    serializer.Serialize(outModelName + ".xml", outModelName + ".bin", networkReader.getNetwork());
}
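
// The serialized IR then carries the chosen precision per convolution as an
// attribute of the layer's <data> element, e.g. (illustrative snippet, the
// layer name is made up):
//
//     <layer name="conv1" type="Convolution" precision="FP32" id="2">
//         <data ... quantization_level="I8"/>
//     </layer>
//
// together with a <statistics> section holding the per-layer min/max values
// (see network_serializer.cpp below).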

/**
 * @brief The main function of the calibration tool
 * @param argc - The number of arguments
 * @param argv - Arguments
 * @return 0 if all good
 */
int main(int argc, char *argv[]) {
    try {
        slog::info << "InferenceEngine: " << GetInferenceEngineVersion() << slog::endl;

        // ---------------------------Parsing and validating input arguments--------------------------------------
        slog::info << "Parsing input parameters" << slog::endl;

        bool noOptions = argc == 1;

        gflags::ParseCommandLineNonHelpFlags(&argc, &argv, true);
        if (FLAGS_h || noOptions) {
            showUsage();
            return 1;
        }

        UserExceptions ee;

        NetworkType netType = Undefined;
        // Checking the network type
        if (std::string(FLAGS_t) == "C") {
            netType = Classification;
        } else if (std::string(FLAGS_t) == "OD") {
            netType = ObjDetection;
        } else if (std::string(FLAGS_t) == "RawC") {
            netType = RawC;
        } else if (std::string(FLAGS_t) == "RawOD") {
            netType = RawOD;
        } else {
            ee << UserException(5, "Unknown network type specified (invalid -t option)");
        }

        // Checking required options
        if (FLAGS_m.empty()) ee << UserException(3, "Model file is not specified (missing -m option)");
        if (FLAGS_i.empty()) ee << UserException(4, "Images list is not specified (missing -i option)");
        if (FLAGS_d.empty()) ee << UserException(5, "Target device is not specified (missing -d option)");
        if (FLAGS_b < 0) ee << UserException(6, "Batch must be positive (invalid -b option value)");

        if (netType == ObjDetection) {
            // Checking required OD-specific options
            if (FLAGS_ODa.empty()) ee << UserException(11, "Annotations folder is not specified for object detection (missing -ODa option)");
            if (FLAGS_ODc.empty()) ee << UserException(12, "Classes file is not specified (missing -ODc option)");
            if (FLAGS_b > 0) ee << UserException(13, "Batch option other than 0 is not supported for Object Detection networks");
        }

        if (!ee.empty()) throw ee;
        // -----------------------------------------------------------------------------------------------------

        // ---------------------Loading plugin for Inference Engine------------------------------------------------
        slog::info << "Loading plugin" << slog::endl;
        /** Loading the library with extensions if provided **/
        InferencePlugin plugin = PluginDispatcher({ FLAGS_pp, "../../../lib/intel64", "" }).getPluginByDevice(FLAGS_d);

        /** Loading default extensions **/
        if (FLAGS_d.find("CPU") != std::string::npos) {
            /**
             * The cpu_extensions library is compiled from the "extension" folder containing
             * custom CPU layer implementations. These layers are not supported
             * by the CPU plugin natively, but they can be useful for inferring custom topologies.
             **/
            plugin.AddExtension(std::make_shared<Extensions::Cpu::CpuExtensions>());
        }

        if (!FLAGS_l.empty()) {
            // CPU extensions are loaded as a shared library and passed as a pointer to the base extension
            IExtensionPtr extension_ptr = make_so_pointer<IExtension>(FLAGS_l);
            plugin.AddExtension(extension_ptr);
            slog::info << "CPU Extension loaded: " << FLAGS_l << slog::endl;
        }
        if (!FLAGS_c.empty()) {
            // GPU extensions are loaded from an .xml description and OpenCL kernel files
            plugin.SetConfig({{PluginConfigParams::KEY_CONFIG_FILE, FLAGS_c}});
            slog::info << "GPU Extension loaded: " << FLAGS_c << slog::endl;
        }

        printPluginVersion(plugin, std::cout);

        CsvDumper dumper(FLAGS_dump);

        std::shared_ptr<Processor> processor;

        PreprocessingOptions preprocessingOptions;
        if (strtolower(FLAGS_ppType.c_str()) == "none") {
            preprocessingOptions = PreprocessingOptions(false, ResizeCropPolicy::DoNothing);
        } else if (strtolower(FLAGS_ppType) == "resizecrop") {
            size_t ppWidth = FLAGS_ppSize;
            size_t ppHeight = FLAGS_ppSize;

            if (FLAGS_ppWidth > 0) ppWidth = FLAGS_ppWidth;
            if (FLAGS_ppHeight > 0) ppHeight = FLAGS_ppHeight;

            if (FLAGS_ppSize > 0 || (FLAGS_ppWidth > 0 && FLAGS_ppHeight > 0)) {
                preprocessingOptions = PreprocessingOptions(false, ResizeCropPolicy::ResizeThenCrop, ppWidth, ppHeight);
            } else {
                THROW_USER_EXCEPTION(2) << "Size must be specified for preprocessing type " << FLAGS_ppType;
            }
        } else if (strtolower(FLAGS_ppType) == "resize" || FLAGS_ppType.empty()) {
            preprocessingOptions = PreprocessingOptions(false, ResizeCropPolicy::Resize);
        } else {
            THROW_USER_EXCEPTION(2) << "Unknown preprocessing type: " << FLAGS_ppType;
        }

        if (netType == Classification || netType == RawC) {
            processor = std::shared_ptr<Processor>(
                new ClassificationCalibrator(FLAGS_subset, FLAGS_m, FLAGS_d, FLAGS_i, FLAGS_b,
                                             plugin, dumper, FLAGS_l, preprocessingOptions, FLAGS_Czb));
        } else if (netType == ObjDetection || netType == RawOD) {
            if (FLAGS_ODkind == "SSD") {
                processor = std::shared_ptr<Processor>(
                    new SSDObjectDetectionCalibrator(FLAGS_subset, FLAGS_m, FLAGS_d, FLAGS_i, FLAGS_ODsubdir, FLAGS_b,
                                                     0.5, plugin, dumper, FLAGS_ODa, FLAGS_ODc));
/*            } else if (FLAGS_ODkind == "YOLO") {
                processor = std::shared_ptr<Processor>(
                    new YOLOObjectDetectionProcessor(FLAGS_m, FLAGS_d, FLAGS_i, FLAGS_ODsubdir, FLAGS_b,
                                                     0.5, plugin, dumper, FLAGS_ODa, FLAGS_ODc));
*/
            }
        } else {
            THROW_USER_EXCEPTION(2) << "Unknown network type specified: " << FLAGS_t;
        }
        if (!processor.get()) {
            THROW_USER_EXCEPTION(2) << "Processor pointer is invalid";
        }

        Int8Calibrator* calibrator = dynamic_cast<Int8Calibrator*>(processor.get());

        if (netType != RawC && netType != RawOD) {
            slog::info << "Collecting accuracy metric in FP32 mode to get a baseline, collecting activation statistics" << slog::endl;
        } else {
            slog::info << "Collecting activation statistics" << slog::endl;
        }
        calibrator->collectFP32Statistic();
        shared_ptr<Processor::InferenceMetrics> pIMFP32 = processor->Process();
        const CalibrationMetrics* mFP32 = dynamic_cast<const CalibrationMetrics*>(pIMFP32.get());
        std::cout << "    FP32 Accuracy: " << OUTPUT_FLOATING(100.0 * mFP32->AccuracyResult) << "% " << std::endl;

        InferenceEngine::NetworkStatsMap statMap;
        std::map<std::string, bool> layersToInt8;
        bool bAccuracy = false;

        if (netType != RawC && netType != RawOD) {
            slog::info << "Verification of network accuracy if all possible layers converted to INT8" << slog::endl;
            float bestThreshold = 100.f;
            float maximalAccuracy = 0.f;
            for (float threshold = 100.0f; threshold > 95.0f; threshold -= 0.5f) {
                std::cout << "Validate int8 accuracy, threshold for activation statistics = " << threshold << std::endl;
                InferenceEngine::NetworkStatsMap tmpStatMap = calibrator->getStatistic(threshold);
                calibrator->validateInt8Config(tmpStatMap, {});
                shared_ptr<Processor::InferenceMetrics> pIM_I8 = processor->Process();
                const CalibrationMetrics *mI8 = dynamic_cast<const CalibrationMetrics *>(pIM_I8.get());
                if (maximalAccuracy < mI8->AccuracyResult) {
                    maximalAccuracy = mI8->AccuracyResult;
                    bestThreshold = threshold;
                }
                std::cout << "   Accuracy is " << OUTPUT_FLOATING(100.0 * mI8->AccuracyResult) << "%" << std::endl;
            }

            statMap = calibrator->getStatistic(bestThreshold);

            if ((mFP32->AccuracyResult - maximalAccuracy) > (FLAGS_threshold / 100)) {
                slog::info << "Accuracy of all layers conversion does not correspond to the required threshold\n";
                cout << "FP32 Accuracy: " << OUTPUT_FLOATING(100.0 * mFP32->AccuracyResult) << "% vs " <<
                    "all Int8 layers Accuracy: " << OUTPUT_FLOATING(100.0 * maximalAccuracy) << "%, " <<
                    "threshold for activation statistics: " << bestThreshold << "%" << std::endl;
                slog::info << "Collecting intermediate per-layer accuracy drop" << slog::endl;
                // getting statistics on the accuracy drop by layer
                calibrator->collectByLayerStatistic(statMap);
                processor->Process();
                // starting to reduce the number of layers being converted to Int8
                std::map<std::string, float> layersAccuracyDrop = calibrator->layersAccuracyDrop();

                std::map<float, std::string> orderedLayersAccuracyDrop;
                for (auto d : layersAccuracyDrop) {
                    orderedLayersAccuracyDrop[d.second] = d.first;
                    layersToInt8[d.first] = true;
                }
                std::map<float, std::string>::const_reverse_iterator it = orderedLayersAccuracyDrop.crbegin();

                shared_ptr<Processor::InferenceMetrics> pIM_I8;
                const CalibrationMetrics *mI8;
                while (it != orderedLayersAccuracyDrop.crend() && bAccuracy == false) {
                    slog::info << "Returning of '" << it->second << "' to FP32 precision, start validation\n";
                    layersToInt8[it->second] = false;
                    calibrator->validateInt8Config(statMap, layersToInt8);
                    pIM_I8 = processor->Process();
                    mI8 = dynamic_cast<const CalibrationMetrics *>(pIM_I8.get());
                    maximalAccuracy = mI8->AccuracyResult;
                    if ((mFP32->AccuracyResult - maximalAccuracy) > (FLAGS_threshold / 100)) {
                        cout << "FP32 Accuracy: " << OUTPUT_FLOATING(100.0 * mFP32->AccuracyResult) << "% vs " <<
                            "current Int8 configuration Accuracy: " << OUTPUT_FLOATING(100.0 * maximalAccuracy) << "%" << std::endl;
                    } else {
                        bAccuracy = true;
                    }
                    it++;
                }
            } else {
                bAccuracy = true;
            }

            if (bAccuracy) {
                slog::info << "Achieved required accuracy drop satisfying threshold\n";
                cout << "FP32 accuracy: " << OUTPUT_FLOATING(100.0 * mFP32->AccuracyResult) << "% vs " <<
                    "current Int8 configuration accuracy: " << OUTPUT_FLOATING(100.0 * maximalAccuracy) << "% " <<
                    "with threshold for activation statistic: " << bestThreshold << "%" << std::endl;
                std::string outModelName = FLAGS_output.empty() ? fileNameNoExt(FLAGS_m) + "_i8" : fileNameNoExt(FLAGS_output);
                SaveCalibratedIR(FLAGS_m, outModelName, layersToInt8, statMap);
            } else {
                slog::info << "Required threshold of accuracy drop cannot be achieved with any int8 quantization\n";
            }
        } else {
            std::cout << "Collected activation statistics, writing maximum values to IR" << std::endl;
            statMap = calibrator->getStatistic(100.0f);
            std::string outModelName = FLAGS_output.empty() ? fileNameNoExt(FLAGS_m) + "_i8" : fileNameNoExt(FLAGS_output);
            SaveCalibratedIR(FLAGS_m, outModelName, layersToInt8, statMap);
        }

        if (dumper.dumpEnabled()) {
            slog::info << "Dump file generated: " << dumper.getFilename() << slog::endl;
        }
    } catch (const InferenceEngineException& ex) {
        slog::err << "Inference problem: \n" << ex.what() << slog::endl;
        return 1;
    } catch (const UserException& ex) {
        slog::err << "Input problem: \n" << ex.what() << slog::endl;
        showUsage();
        return ex.exitCode();
    } catch (const UserExceptions& ex) {
        if (ex.list().size() == 1) {
            slog::err << "Input problem: " << ex.what() << slog::endl;
            showUsage();
            return ex.list().begin()->exitCode();
        } else {
            slog::err << "Input problems: \n" << ex.what() << slog::endl;
            showUsage();
            return ex.list().begin()->exitCode();
        }
    }
    return 0;
}

381
inference-engine/samples/calibration_tool/network_serializer.cpp
Normal file
@@ -0,0 +1,381 @@
// Copyright (C) 2018 Intel Corporation
//
// SPDX-License-Identifier: Apache-2.0
//

#include <fstream>
#include <map>
#include <vector>
#include <string>
#include <ie_precision.hpp>
#include "details/ie_cnn_network_tools.h"
#include "details/caseless.hpp"
#include "ie_layers_property.hpp"
#include "network_serializer.h"
#include "../common/samples/common.hpp"

using namespace InferenceEngine;
using namespace details;

template<typename T>
std::string arrayToIRProperty(const T& property) {
    std::string sProperty;
    for (size_t i = 0; i < property.size(); i++) {
        sProperty = sProperty + std::to_string(property[i]) +
                    std::string((i != property.size() - 1) ? "," : "");
    }
    return sProperty;
}

template<typename T>
std::string arrayRevertToIRProperty(const T& property) {
    std::string sProperty;
    for (size_t i = 0; i < property.size(); i++) {
        sProperty = sProperty + std::to_string(property[property.size() - i - 1]) +
                    std::string((i != property.size() - 1) ? "," : "");
    }
    return sProperty;
}
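
// Example (illustrative values): for a kernel property holding {3, 5},
// arrayToIRProperty() yields "3,5" while arrayRevertToIRProperty() yields
// "5,3" - the reversed order expected by IR v3 attributes such as "kernel"
// and "strides" (see updateStdLayerParams below).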


void CNNNetworkSerializer::Serialize(const std::string &xmlPath, const std::string &binPath,
                                     ICNNNetwork &network) {
    std::ofstream ofsBin(binPath, std::ofstream::out | std::ofstream::binary);

    pugi::xml_document doc;

    pugi::xml_node net = doc.append_child("net");

    char name[1024];
    network.getName(name, 1024);

    net.append_attribute("name").set_value(name);
    net.append_attribute("version").set_value("3");
    net.append_attribute("batch").set_value("1");

    pugi::xml_node layers = net.append_child("layers");

    size_t dataOffset = 0;

    std::string dataName = "data";

    std::vector<CNNLayerPtr> ordered;

    ordered = CNNNetSortTopologically(network);

    std::map<CNNLayer::Ptr, int> matching;
    for (size_t i = 0; i < ordered.size(); i++) {
        matching[ordered[i]] = i;
    }

    for (size_t i = 0; i < ordered.size(); i++) {
        CNNLayerPtr node = ordered[i];

        pugi::xml_node layer = layers.append_child("layer");
        Precision precision = node->precision;
        layer.append_attribute("name").set_value(node->name.c_str());
        layer.append_attribute("type").set_value(node->type.c_str());
        layer.append_attribute("precision").set_value(precision.name());
        layer.append_attribute("id").set_value(i);

        updateStdLayerParams(node);

        auto &params = node->params;

        if (params.size()) {
            pugi::xml_node data = layer.append_child(dataName.c_str());

            for (auto it : params) {
                data.append_attribute(it.first.c_str()).set_value(it.second.c_str());
            }
        }

        if (node->insData.size()) {
            pugi::xml_node input = layer.append_child("input");

            for (size_t iport = 0; iport < node->insData.size(); iport++) {
                DataPtr d = node->insData[iport].lock();
                pugi::xml_node port = input.append_child("port");

                port.append_attribute("id").set_value(iport);

                for (auto dim : d->getDims()) {
                    port.append_child("dim").text().set(dim);
                }
            }
        }
        if (node->outData.size()) {
            pugi::xml_node output = layer.append_child("output");
            for (size_t oport = 0; oport < node->outData.size(); oport++) {
                pugi::xml_node port = output.append_child("port");

                port.append_attribute("id").set_value(node->insData.size() + oport);

                for (auto dim : node->outData[oport]->getDims()) {
                    port.append_child("dim").text().set(dim);
                }
            }
        }
        if (node->blobs.size()) {
            auto blobsNode = layer.append_child("blobs");
            for (auto dataIt : node->blobs) {
                const char *dataPtr = dataIt.second->buffer().as<char*>();

                size_t dataSize = dataIt.second->byteSize();
                pugi::xml_node data = blobsNode.append_child(dataIt.first.c_str());
                data.append_attribute("offset").set_value(dataOffset);
                data.append_attribute("size").set_value(dataSize);

                dataOffset += dataSize;
                ofsBin.write(dataPtr, dataSize);
            }
        }
    }

    pugi::xml_node edges = net.append_child("edges");

    for (size_t i = 0; i < ordered.size(); i++) {
        CNNLayer::Ptr node = ordered[i];

        if (node->outData.size()) {
            auto itFrom = matching.find(node);
            if (itFrom == matching.end()) {
                THROW_IE_EXCEPTION << "Internal error, cannot find " << node->name << " in the matching container during serialization of IR";
            }
            for (size_t oport = 0; oport < node->outData.size(); oport++) {
                DataPtr outData = node->outData[oport];
                for (auto inputTo : outData->inputTo) {
                    auto itTo = matching.find(inputTo.second);
                    if (itTo == matching.end()) {
                        THROW_IE_EXCEPTION << "Broken edge from layer " << node->name << " to layer " << inputTo.first << " during serialization of IR";
                    }

                    size_t foundPort = -1;
                    for (size_t iport = 0; iport < inputTo.second->insData.size(); iport++) {
                        if (inputTo.second->insData[iport].lock() == outData) {
                            foundPort = iport;
                        }
                    }
                    if (foundPort == -1) {
                        THROW_IE_EXCEPTION << "Broken edge from layer to parent, cannot find parent " << outData->name << " for layer " << inputTo.second->name
                                           << "\ninitial layer for edge output " << node->name;
                    }
                    pugi::xml_node edge = edges.append_child("edge");

                    edge.append_attribute("from-layer").set_value(itFrom->second);
                    edge.append_attribute("from-port").set_value(oport + node->insData.size());

                    edge.append_attribute("to-layer").set_value(itTo->second);
                    edge.append_attribute("to-port").set_value(foundPort);
                }
            }
        }
    }


    InputsDataMap inputInfo;
    network.getInputsInfo(inputInfo);

    // assuming that we have preprocessing only for one input
    for (auto ii : inputInfo) {
        auto pp = ii.second->getPreProcess();
        size_t nInChannels = pp.getNumberOfChannels();
        if (nInChannels) {
            pugi::xml_node preproc = net.append_child("pre-process");

            preproc.append_attribute("reference-layer-name").set_value(ii.first.c_str());
            preproc.append_attribute("mean-precision").set_value(Precision(Precision::FP32).name());

            for (size_t ch = 0; ch < nInChannels; ch++) {
                PreProcessChannel::Ptr &preProcessChannel = pp[ch];
                auto channel = preproc.append_child("channel");
                channel.append_attribute("id").set_value(ch);

                auto mean = channel.append_child("mean");

                if (!preProcessChannel->meanData) {
                    mean.append_attribute("value").set_value(preProcessChannel->meanValue);
                } else {
                    THROW_IE_EXCEPTION << "Mean data is not supported yet for serialization of the model";
                }
            }
        }
    }


    // adding statistics to the file if statistics exist
    ICNNNetworkStats* netNodesStats = nullptr;
    auto stats = net.append_child("statistics");
    network.getStats(&netNodesStats, nullptr);
    NetworkStatsMap statsmap = netNodesStats->getNodesStats();

    auto joinCommas = [&](std::vector<float>& v) -> std::string {
        std::string res;

        for (size_t i = 0; i < v.size(); ++i) {
            res += std::to_string(v[i]);
            if (i < v.size() - 1) {
                res += ", ";
            }
        }

        return res;
    };

    for (auto itStats : statsmap) {
        auto layer = stats.append_child("layer");

        layer.append_child("name").text().set(itStats.first.c_str());

        layer.append_child("min").text().set(joinCommas(itStats.second->_minOutputs).c_str());
        layer.append_child("max").text().set(joinCommas(itStats.second->_maxOutputs).c_str());
    }
|
||||
|
||||
doc.save_file(xmlPath.c_str());
|
||||
}
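
// For orientation, a sketch of the IR layout the Serialize() body above emits.
// Layer names, ids, and dims below are illustrative only, not taken from a real
// network:
//
//   <net ...>
//     <layers>
//       <layer name="conv1" type="Convolution" precision="FP32" id="0">
//         <data kernel="3,3" strides="1,1" .../>
//         <input><port id="0"><dim>1</dim>...</port></input>
//         <output><port id="1"><dim>1</dim>...</port></output>
//         <blobs><weights offset="0" size="..."/></blobs>
//       </layer>
//     </layers>
//     <edges>
//       <edge from-layer="0" from-port="1" to-layer="1" to-port="0"/>
//     </edges>
//     <pre-process .../>
//     <statistics .../>
//   </net>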

void CNNNetworkSerializer::updateStdLayerParams(CNNLayer::Ptr layer) {
    auto layerPtr = layer.get();
    auto type = layer->type;
    auto &params = layer->params;

    if (CaselessEq<std::string>()(layer->type, "power")) {
        PowerLayer *lr = dynamic_cast<PowerLayer *>(layerPtr);

        params["scale"] = std::to_string(lr->scale);
        params["shift"] = std::to_string(lr->offset);
        params["power"] = std::to_string(lr->power);
    } else if (CaselessEq<std::string>()(layer->type, "convolution") ||
               CaselessEq<std::string>()(layer->type, "deconvolution")) {
        ConvolutionLayer *lr = dynamic_cast<ConvolutionLayer *>(layerPtr);

        params["kernel"] = arrayRevertToIRProperty(lr->_kernel);
        params["pads_begin"] = arrayRevertToIRProperty(lr->_padding);
        params["pads_end"] = arrayRevertToIRProperty(lr->_pads_end);
        params["strides"] = arrayRevertToIRProperty(lr->_stride);
        params["dilations"] = arrayRevertToIRProperty(lr->_dilation);
        params["output"] = std::to_string(lr->_out_depth);
        params["group"] = std::to_string(lr->_group);
    } else if (CaselessEq<std::string>()(layer->type, "relu")) {
        ReLULayer *lr = dynamic_cast<ReLULayer *>(layerPtr);
        if (lr->negative_slope != 0.0f) {
            params["negative_slope"] = std::to_string(lr->negative_slope);
        }
    } else if (CaselessEq<std::string>()(layer->type, "norm") ||
               CaselessEq<std::string>()(layer->type, "lrn")) {
        NormLayer *lr = dynamic_cast<NormLayer *>(layerPtr);

        params["alpha"] = std::to_string(lr->_alpha);
        params["beta"] = std::to_string(lr->_beta);
        params["local-size"] = std::to_string(lr->_size);
        params["region"] = lr->_isAcrossMaps ? "across" : "same";
    } else if (CaselessEq<std::string>()(layer->type, "pooling")) {
        PoolingLayer *lr = dynamic_cast<PoolingLayer *>(layerPtr);

        params["kernel"] = arrayRevertToIRProperty(lr->_kernel);
        params["pads_begin"] = arrayRevertToIRProperty(lr->_padding);
        params["pads_end"] = arrayRevertToIRProperty(lr->_pads_end);
        params["strides"] = arrayRevertToIRProperty(lr->_stride);

        switch (lr->_type) {
            case PoolingLayer::MAX:
                params["pool-method"] = "max";
                break;
            case PoolingLayer::AVG:
                params["pool-method"] = "avg";
                break;

            default:
                THROW_IE_EXCEPTION << "Found unsupported pooling method: " << lr->_type;
        }
    } else if (CaselessEq<std::string>()(layer->type, "split")) {
        SplitLayer *lr = dynamic_cast<SplitLayer *>(layerPtr);
        params["axis"] = std::to_string(lr->_axis);
    } else if (CaselessEq<std::string>()(layer->type, "concat")) {
        ConcatLayer *lr = dynamic_cast<ConcatLayer *>(layerPtr);
        params["axis"] = std::to_string(lr->_axis);
    } else if (CaselessEq<std::string>()(layer->type, "FullyConnected") ||
               CaselessEq<std::string>()(layer->type, "InnerProduct")) {
        FullyConnectedLayer *lr = dynamic_cast<FullyConnectedLayer *>(layerPtr);
        params["out-size"] = std::to_string(lr->_out_num);
    } else if (CaselessEq<std::string>()(layer->type, "softmax")) {
        SoftMaxLayer *lr = dynamic_cast<SoftMaxLayer *>(layerPtr);
        params["axis"] = std::to_string(lr->axis);
    } else if (CaselessEq<std::string>()(layer->type, "reshape")) {
        // support for the Flatten layer needs to be added here if it is created from the API
        ReshapeLayer *lr = dynamic_cast<ReshapeLayer *>(layerPtr);
        params["axis"] = std::to_string(lr->axis);
        params["num_axes"] = std::to_string(lr->num_axes);
        params["dim"] = arrayToIRProperty(lr->shape);
    } else if (CaselessEq<std::string>()(layer->type, "Eltwise")) {
        EltwiseLayer *lr = dynamic_cast<EltwiseLayer *>(layerPtr);

        std::string op;

        switch (lr->_operation) {
            case EltwiseLayer::Sum:
                op = "sum";
                break;
            case EltwiseLayer::Prod:
                op = "prod";
                break;
            case EltwiseLayer::Max:
                op = "max";
                break;
            default:
                break;
        }

        params["operation"] = op;
    } else if (CaselessEq<std::string>()(layer->type, "scaleshift")) {
        ScaleShiftLayer *lr = dynamic_cast<ScaleShiftLayer *>(layerPtr);
        params["broadcast"] = std::to_string(lr->_broadcast);
    } else if (CaselessEq<std::string>()(layer->type, "crop")) {
        CropLayer *lr = dynamic_cast<CropLayer *>(layerPtr);
        params["axis"] = arrayToIRProperty(lr->axis);
        params["offset"] = arrayToIRProperty(lr->offset);
        params["dim"] = arrayToIRProperty(lr->dim);
    } else if (CaselessEq<std::string>()(layer->type, "tile")) {
        TileLayer *lr = dynamic_cast<TileLayer *>(layerPtr);
        params["axis"] = std::to_string(lr->axis);
        params["tiles"] = std::to_string(lr->tiles);
    } else if (CaselessEq<std::string>()(layer->type, "prelu")) {
        PReLULayer *lr = dynamic_cast<PReLULayer *>(layerPtr);
        params["channel_shared"] = std::to_string(lr->_channel_shared);
    } else if (CaselessEq<std::string>()(layer->type, "clamp")) {
        ClampLayer *lr = dynamic_cast<ClampLayer *>(layerPtr);
        params["min"] = std::to_string(lr->min_value);
        params["max"] = std::to_string(lr->max_value);
    } else if (CaselessEq<std::string>()(layer->type, "BatchNormalization")) {
        BatchNormalizationLayer *lr = dynamic_cast<BatchNormalizationLayer *>(layerPtr);
        params["epsilon"] = std::to_string(lr->epsilon);
    } else if (CaselessEq<std::string>()(layer->type, "grn")) {
        GRNLayer *lr = dynamic_cast<GRNLayer *>(layerPtr);
        params["bias"] = std::to_string(lr->bias);
    } else if (CaselessEq<std::string>()(layer->type, "mvn")) {
        MVNLayer *lr = dynamic_cast<MVNLayer *>(layerPtr);
        params["across_channels"] = std::to_string(lr->across_channels);
        params["normalize_variance"] = std::to_string(lr->normalize);
    } else if (CaselessEq<std::string>()(layer->type, "rnn") ||
               CaselessEq<std::string>()(layer->type, "TensorIterator") ||
               CaselessEq<std::string>()(layer->type, "LSTMCell")) {
        THROW_IE_EXCEPTION << "Writing layers of this type to IR is not covered yet";
    }

    if (layer->params.find("quantization_level") != layer->params.end()) {
        params["quantization_level"] = layer->params["quantization_level"];
    }

    // update of weightable layers
    WeightableLayer *pwlayer = dynamic_cast<WeightableLayer *>(layerPtr);
    if (pwlayer) {
        if (pwlayer->_weights) {
            pwlayer->blobs["weights"] = pwlayer->_weights;
        }
        if (pwlayer->_biases) {
            pwlayer->blobs["biases"] = pwlayer->_biases;
        }
    }
}
@@ -0,0 +1,21 @@
// Copyright (C) 2018 Intel Corporation
//
// SPDX-License-Identifier: Apache-2.0
//

#pragma once

#include "inference_engine.hpp"
#include <pugixml/pugixml.hpp>
#include <string>

/** Class for serializing a model represented as ICNNNetwork to disk
 */
class CNNNetworkSerializer {
public:
    void Serialize(const std::string &xmlPath, const std::string &binPath,
                   InferenceEngine::ICNNNetwork& network);

protected:
    void updateStdLayerParams(InferenceEngine::CNNLayer::Ptr layer);
};
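
// A minimal usage sketch (assuming `network` is an already loaded ICNNNetwork,
// e.g. obtained via CNNNetReader; the output file names are placeholders):
//
//   CNNNetworkSerializer serializer;
//   serializer.Serialize("serialized.xml", "serialized.bin", network);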
@@ -36,7 +36,7 @@ add_executable(${TARGET_NAME} ${SRC})
set_target_properties(${TARGET_NAME} PROPERTIES "CMAKE_CXX_FLAGS" "${CMAKE_CXX_FLAGS} -fPIE"
COMPILE_PDB_NAME ${TARGET_NAME})

target_link_libraries(${TARGET_NAME} ${InferenceEngine_LIBRARIES} cpu_extension format_reader gflags)
target_link_libraries(${TARGET_NAME} ${InferenceEngine_LIBRARIES} IE::ie_cpu_extension format_reader gflags)

if(UNIX)
    target_link_libraries(${TARGET_NAME} ${LIB_DL} pthread)

@@ -1,4 +1,4 @@
# Image Classification Sample {#InferenceEngineClassificationSampleApplication}
# Image Classification Sample

This topic demonstrates how to build and run the Image Classification sample application, which does
inference using image classification networks like AlexNet and GoogLeNet.
@@ -37,6 +37,8 @@ Options:
          Number of iterations (default 1)
    -pc
          Enables per-layer performance report
    -p_msg
          Enables messages from a plugin

```
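
For example, a run on CPU might look like this (assuming the sample's usual `-m`/`-i`/`-d` flags for model, input, and device; the paths are placeholders):
```sh
./classification_sample -m <path_to_model>/alexnet_fp32.xml -i <path_to_image>/cat.bmp -d CPU
```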

@@ -63,4 +65,4 @@ Engine plugin. When inference is done, the application creates an
output image and outputs data to the standard output stream.

## See Also
* [Using Inference Engine Samples](@ref SamplesOverview)
* [Using Inference Engine Samples](./docs/Inference_Engine_Developer_Guide/Samples_Overview.md)

@@ -1,18 +1,7 @@
/*
// Copyright (c) 2018 Intel Corporation
// Copyright (C) 2018 Intel Corporation
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
// SPDX-License-Identifier: Apache-2.0
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
*/

#pragma once

@@ -61,6 +50,8 @@ static const char custom_cldnn_message[] = "Required for clDNN (GPU)-targeted cu
static const char custom_cpu_library_message[] = "Required for MKLDNN (CPU)-targeted custom layers." \
                                                 "Absolute path to a shared library with the kernels impl.";

/// @brief message for plugin messages
static const char plugin_message[] = "Enables messages from a plugin";

/// @brief Define flag for showing help message <br>
DEFINE_bool(h, false, help_message);
@@ -96,6 +87,9 @@ DEFINE_string(l, "", custom_cpu_library_message);
/// @brief Iterations count (default 1)
DEFINE_int32(ni, 1, iterations_count_message);

/// @brief Enable plugin messages
DEFINE_bool(p_msg, false, plugin_message);

/**
 * @brief This function shows a help message
 */
@@ -115,4 +109,5 @@ static void showUsage() {
    std::cout << "    -nt \"<integer>\" " << ntop_message << std::endl;
    std::cout << "    -ni \"<integer>\" " << iterations_count_message << std::endl;
    std::cout << "    -pc " << performance_counter_message << std::endl;
    std::cout << "    -p_msg " << plugin_message << std::endl;
}

@@ -1,18 +1,7 @@
/*
// Copyright (c) 2018 Intel Corporation
// Copyright (C) 2018 Intel Corporation
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
// SPDX-License-Identifier: Apache-2.0
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
*/

#include <fstream>
#include <vector>
@@ -32,6 +21,8 @@

using namespace InferenceEngine;

ConsoleErrorListener error_listener;

bool ParseAndCheckCommandLine(int argc, char *argv[]) {
    // ---------------------------Parsing and validation of input args--------------------------------------
    gflags::ParseCommandLineNonHelpFlags(&argc, &argv, true);
@@ -72,13 +63,16 @@ int main(int argc, char *argv[]) {

        /** This vector stores paths to the processed images **/
        std::vector<std::string> imageNames;
        parseImagesArguments(imageNames);
        parseInputFilesArguments(imageNames);
        if (imageNames.empty()) throw std::logic_error("No suitable images were found");
        // -----------------------------------------------------------------------------------------------------

        // --------------------------- 1. Load Plugin for inference engine -------------------------------------
        slog::info << "Loading plugin" << slog::endl;
        InferencePlugin plugin = PluginDispatcher({ FLAGS_pp, "../../../lib/intel64" , "" }).getPluginByDevice(FLAGS_d);
        if (FLAGS_p_msg) {
            static_cast<InferenceEngine::InferenceEnginePluginPtr>(plugin)->SetLogCallback(error_listener);
        }

        /** Loading default extensions **/
        if (FLAGS_d.find("CPU") != std::string::npos) {

@@ -37,7 +37,7 @@ set_target_properties(${TARGET_NAME} PROPERTIES "CMAKE_CXX_FLAGS" "${CMAKE_CXX_F
COMPILE_PDB_NAME ${TARGET_NAME})

target_link_libraries(${TARGET_NAME} ${InferenceEngine_LIBRARIES} cpu_extension format_reader gflags)
target_link_libraries(${TARGET_NAME} ${InferenceEngine_LIBRARIES} IE::ie_cpu_extension format_reader gflags)

if(UNIX)
    target_link_libraries(${TARGET_NAME} ${LIB_DL} pthread)

@@ -1,4 +1,4 @@
# Image Classification Sample Async {#InferenceEngineClassificationPipelinedSampleApplication}
# Image Classification Sample Async

This sample demonstrates how to build and execute inference in pipelined mode, using classification networks as an example.

@@ -52,6 +52,8 @@ Options:
          Enables per-layer performance report
    -nireq "<integer>"
          Number of infer requests for pipelined mode (default 1)
    -p_msg
          Enables messages from a plugin

```
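
For example, a pipelined run on CPU with two parallel infer requests might look like this (flag names follow the options above; the paths are placeholders):
```sh
./classification_sample_async -m <path_to_model>/alexnet_fp32.xml -i <path_to_image>/cat.bmp -d CPU -nireq 2
```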

@@ -78,4 +80,4 @@ Then in the loop it starts inference for the current infer request and switch fo
When inference is done, the application outputs data to the standard output stream.

## See Also
* [Using Inference Engine Samples](@ref SamplesOverview)
* [Using Inference Engine Samples](./docs/Inference_Engine_Developer_Guide/Samples_Overview.md)

@@ -1,18 +1,7 @@
/*
// Copyright (c) 2018 Intel Corporation
// Copyright (C) 2018 Intel Corporation
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
// SPDX-License-Identifier: Apache-2.0
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
*/

#pragma once

@@ -65,6 +54,9 @@ static const char custom_cldnn_message[] = "Required for clDNN (GPU)-targeted cu
static const char custom_cpu_library_message[] = "Required for MKLDNN (CPU)-targeted custom layers." \
                                                 "Absolute path to a shared library with the kernels impl.";

/// @brief message for plugin messages
static const char plugin_message[] = "Enables messages from a plugin";


/// @brief Define flag for showing help message <br>
DEFINE_bool(h, false, help_message);
@@ -103,6 +95,9 @@ DEFINE_int32(ni, 1, iterations_count_message);
/// @brief Number of infer requests
DEFINE_int32(nireq, 1, ninfer_request_message);

/// @brief Enable plugin messages
DEFINE_bool(p_msg, false, plugin_message);

/**
 * @brief This function shows a help message
 */
@@ -123,4 +118,5 @@ static void showUsage() {
    std::cout << "    -ni \"<integer>\" " << iterations_count_message << std::endl;
    std::cout << "    -pc " << performance_counter_message << std::endl;
    std::cout << "    -nireq \"<integer>\" " << ninfer_request_message << std::endl;
    std::cout << "    -p_msg " << plugin_message << std::endl;
}

@@ -1,18 +1,7 @@
/*
// Copyright (c) 2018 Intel Corporation
// Copyright (C) 2018 Intel Corporation
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
// SPDX-License-Identifier: Apache-2.0
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
*/

/**
 * @brief The entry point of the Inference Engine sample application
@@ -30,7 +19,7 @@

#include <inference_engine.hpp>

#include <format_reader/format_reader_ptr.h>
#include <format_reader_ptr.h>

#include <samples/common.hpp>
#include <samples/slog.hpp>
@@ -43,6 +32,8 @@

using namespace InferenceEngine;

ConsoleErrorListener error_listener;

bool ParseAndCheckCommandLine(int argc, char *argv[]) {
    // ---------------------------Parsing and validation of input args--------------------------------------
    slog::info << "Parsing input parameters" << slog::endl;
@@ -88,13 +79,16 @@ int main(int argc, char *argv[]) {

        /** This vector stores paths to the processed images **/
        std::vector<std::string> imageNames;
        parseImagesArguments(imageNames);
        parseInputFilesArguments(imageNames);
        if (imageNames.empty()) throw std::logic_error("No suitable images were found");
        // -----------------------------------------------------------------------------------------------------

        // --------------------------- 1. Load Plugin for inference engine -------------------------------------
        slog::info << "Loading plugin" << slog::endl;
        InferencePlugin plugin = PluginDispatcher({ FLAGS_pp, "../../../lib/intel64" , "" }).getPluginByDevice(FLAGS_d);
        if (FLAGS_p_msg) {
            static_cast<InferenceEngine::InferenceEnginePluginPtr>(plugin)->SetLogCallback(error_listener);
        }

        /** Loading default extensions **/
        if (FLAGS_d.find("CPU") != std::string::npos) {
@@ -194,7 +188,6 @@ int main(int argc, char *argv[]) {
        if (FLAGS_pc) {
            config[PluginConfigParams::KEY_PERF_COUNT] = PluginConfigParams::YES;
        }

        ExecutableNetwork executable_network = plugin.LoadNetwork(network, {});
        // -----------------------------------------------------------------------------------------------------


@@ -1,57 +0,0 @@
# Copyright (c) 2018 Intel Corporation
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

cmake_minimum_required (VERSION 2.8)

include(CPUID)
include(OptimizationFlags)

set(OpenCV_STATIC OFF)

set (BUILD_VALIDATION_APP OFF)

find_package(OpenCV 3.3 QUIET COMPONENTS core imgproc highgui imgcodecs)
if(NOT(OpenCV_FOUND))
    find_package(OpenCV 3.3 QUIET COMPONENTS world)
endif()

if (OpenCV_FOUND)
    set (BUILD_VALIDATION_APP ON)
else()
    message(WARNING "No suitable OpenCV version detected, BUILD_VALIDATION_APP is set to OFF")
endif()

macro(enable_omp)
    if(UNIX) # Linux
        add_definitions(-fopenmp)
        find_library(intel_omp_lib iomp5
            PATHS ${InferenceEngine_INCLUDE_DIRS}/../external/mkltiny_lnx/lib
        )
    elseif(WIN32) # Windows
        if(${CMAKE_CXX_COMPILER_ID} STREQUAL MSVC)
            set(OPENMP_FLAGS "/Qopenmp /openmp")
            set(CMAKE_SHARED_LINKER_FLAGS " ${CMAKE_SHARED_LINKER_FLAGS} /nodefaultlib:vcomp")
        elseif(${CMAKE_CXX_COMPILER_ID} STREQUAL Intel)
            set(OPENMP_FLAGS "/Qopenmp /openmp")
        else()
            message("Unknown compiler ID. OpenMP support is disabled.")
        endif()
        set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} ${OPENMP_FLAGS}")
        set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} ${OPENMP_FLAGS}")
        find_library(intel_omp_lib
            libiomp5md
            PATHS "${InferenceEngine_INCLUDE_DIRS}/../lib/intel64/${CMAKE_BUILD_TYPE}"
        )
    endif()
endmacro(enable_omp)
@@ -1,6 +1,17 @@
# Copyright (C) 2018 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
# Copyright (c) 2018 Intel Corporation
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

cmake_minimum_required(VERSION 2.8)

set (TARGET_NAME "format_reader")
@@ -15,8 +26,12 @@ file (GLOB LIBRARY_HEADERS

# Find OpenCV library if it exists
find_package(OpenCV)
include_directories(${OpenCV_INCLUDE_DIRS})

if(OpenCV_FOUND)
    include_directories(${OpenCV_INCLUDE_DIRS})
else()
    message(STATUS "OPENCV is disabled or not found, " ${TARGET_NAME} " is built without OPENCV support")
endif()

if(UNIX)
    list(REMOVE_ITEM MAIN_SRC ${CMAKE_CURRENT_SOURCE_DIR}/dllmain.cpp)
else()
@@ -37,4 +52,4 @@ add_library(${TARGET_NAME} SHARED ${MAIN_SRC} ${LIBRARY_HEADERS})
target_link_libraries(${TARGET_NAME} ${OpenCV_LIBRARIES})

set_target_properties(${TARGET_NAME} PROPERTIES "CMAKE_CXX_FLAGS" "${CMAKE_CXX_FLAGS} -fPIE"
COMPILE_PDB_NAME ${TARGET_NAME})
COMPILE_PDB_NAME ${TARGET_NAME})

@@ -25,22 +25,22 @@ private:
    static Register<BitMap> reg;

    typedef struct {
        unsigned short type;       /* Magic identifier */
        unsigned int size;         /* File size in bytes */
        unsigned int reserved;
        unsigned int offset;       /* Offset to image data, bytes */
        unsigned short type = 0u;  /* Magic identifier */
        unsigned int size = 0u;    /* File size in bytes */
        unsigned int reserved = 0u;
        unsigned int offset = 0u;  /* Offset to image data, bytes */
    } BmpHeader;

    typedef struct {
        unsigned int size;                     /* Header size in bytes */
        int width, height;                     /* Width and height of image */
        unsigned short planes;                 /* Number of colour planes */
        unsigned short bits;                   /* Bits per pixel */
        unsigned int compression;              /* Compression type */
        unsigned int imagesize;                /* Image size in bytes */
        int xresolution, yresolution;          /* Pixels per meter */
        unsigned int ncolours;                 /* Number of colours */
        unsigned int importantcolours;         /* Important colours */
        unsigned int size = 0u;                /* Header size in bytes */
        int width = 0, height = 0;             /* Width and height of image */
        unsigned short planes = 0u;            /* Number of colour planes */
        unsigned short bits = 0u;              /* Bits per pixel */
        unsigned int compression = 0u;         /* Compression type */
        unsigned int imagesize = 0u;           /* Image size in bytes */
        int xresolution = 0, yresolution = 0;  /* Pixels per meter */
        unsigned int ncolours = 0u;            /* Number of colours */
        unsigned int importantcolours = 0u;    /* Important colours */
    } BmpInfoHeader;

public:

@@ -23,19 +23,21 @@
#endif

/**
 * @brief This function checks input args and finds images in a given folder
 * @brief This function checks input args and the existence of specified files in a given folder
 * @param arg path to a file to be checked for existence
 * @return files updated vector of verified input files
 */
void readImagesArguments(std::vector<std::string> &images, const std::string& arg) {
void readInputFilesArguments(std::vector<std::string> &files, const std::string& arg) {
    struct stat sb;
    if (stat(arg.c_str(), &sb) != 0) {
        std::cout << "[ WARNING ] File " << arg << " cannot be opened!" << std::endl;
        slog::warn << "File " << arg << " cannot be opened!" << slog::endl;
        return;
    }
    if (S_ISDIR(sb.st_mode)) {
        DIR *dp;
        dp = opendir(arg.c_str());
        if (dp == nullptr) {
            std::cout << "[ WARNING ] Directory " << arg << " cannot be opened!" << std::endl;
            slog::warn << "Directory " << arg << " cannot be opened!" << slog::endl;
            return;
        }

@@ -43,19 +45,29 @@ void readImagesArguments(std::vector<std::string> &images, const std::string& ar
        while (nullptr != (ep = readdir(dp))) {
            std::string fileName = ep->d_name;
            if (fileName == "." || fileName == "..") continue;
            std::cout << "[ INFO ] Add file " << ep->d_name << " from directory " << arg << "." << std::endl;
            images.push_back(arg + "/" + ep->d_name);
            files.push_back(arg + "/" + ep->d_name);
        }
        closedir(dp);
    } else {
        files.push_back(arg);
    }

    if (files.size() < 20) {
        slog::info << "Files were added: " << files.size() << slog::endl;
        for (std::string filePath : files) {
            slog::info << "    " << filePath << slog::endl;
        }
    } else {
        images.push_back(arg);
        slog::info << "Files were added: " << files.size() << ". Too many to display each of them." << slog::endl;
    }
}

/**
 * @brief This function finds the -i/--images key in input args
 * It is necessary to process multiple values for a single key
 * @return files updated vector of verified input files
 */
void parseImagesArguments(std::vector<std::string> &images) {
void parseInputFilesArguments(std::vector<std::string> &files) {
    std::vector<std::string> args = gflags::GetArgvs();
    bool readArguments = false;
    for (size_t i = 0; i < args.size(); i++) {
@@ -69,6 +81,6 @@ void parseImagesArguments(std::vector<std::string> &images) {
        if (args.at(i).c_str()[0] == '-') {
            break;
        }
        readImagesArguments(images, args.at(i));
        readInputFilesArguments(files, args.at(i));
    }
}
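
// Typical call site, as in the samples above: collect all -i/--images values,
// then fail early if nothing usable was found:
//
//   std::vector<std::string> imageNames;
//   parseInputFilesArguments(imageNames);
//   if (imageNames.empty()) throw std::logic_error("No suitable images were found");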

@@ -46,6 +46,20 @@
#endif
#endif

/**
 * @brief This class represents a console error listener.
 *
 */
class ConsoleErrorListener : public InferenceEngine::IErrorListener {
    /**
     * @brief The plugin calls this method with a null-terminated error message (in case of error)
     * @param msg Error message
     */
    void onError(const char *msg) noexcept override {
        std::clog << "Plugin message: " << msg << std::endl;
    }
};
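
// Wiring used by the samples above (`plugin` is an InferencePlugin; the cast
// exposes the SetLogCallback entry point):
//
//   ConsoleErrorListener error_listener;
//   static_cast<InferenceEngine::InferenceEnginePluginPtr>(plugin)->SetLogCallback(error_listener);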

/**
 * @brief Trims from both ends (in place)
 * @param s - string to trim
@@ -183,7 +197,7 @@ static UNUSED std::ostream &operator<<(std::ostream &os, const PluginVersion &ve
}

inline void printPluginVersion(InferenceEngine::InferenceEnginePluginPtr ptr, std::ostream& stream) {
    const PluginVersion *pluginVersion;
    const PluginVersion *pluginVersion = nullptr;
    ptr->GetVersion((const InferenceEngine::Version*&)pluginVersion);
    stream << pluginVersion << std::endl;
}
@@ -462,9 +476,10 @@ static UNUSED bool writeOutputBmp(std::string name, unsigned char *data, size_t
 * @param width - width of the rectangle
 * @param rectangles - vector points for the rectangle, should be 4x compared to num classes
 * @param classes - vector of classes
 * @param thickness - thickness of a line (in pixels) to be used for bounding boxes
 */
static UNUSED void addRectangles(unsigned char *data, size_t height, size_t width, std::vector<int> rectangles, std::vector<int> classes) {
    std::vector<Color> colors = {
static UNUSED void addRectangles(unsigned char *data, size_t height, size_t width, std::vector<int> rectangles, std::vector<int> classes, int thickness = 1) {
    std::vector<Color> colors = {  // colors to be used for bounding boxes
        { 128, 64,  128 },
        { 232, 35,  244 },
        { 70,  70,  70 },
@@ -497,38 +512,47 @@ static UNUSED void addRectangles(unsigned char *data, size_t height, size_t widt
        int w = rectangles.at(i * 4 + 2);
        int h = rectangles.at(i * 4 + 3);

        int cls = classes.at(i) % colors.size();  // color of a bounding box line

        if (x < 0) x = 0;
        if (y < 0) y = 0;
        if (w < 0) w = 0;
        if (h < 0) h = 0;

        if (x >= width) { x = width - 1; w = 0; }
        if (y >= height) { y = height - 1; h = 0; }
        if (x >= width) { x = width - 1; w = 0; thickness = 1; }
        if (y >= height) { y = height - 1; h = 0; thickness = 1; }

        if (x + w >= width) { w = width - x - 1; }
        if (y + h >= height) { h = height - y - 1; }

        size_t shift_first = y*width * 3;
        size_t shift_second = (y + h)*width * 3;
        int cls = classes.at(i) % colors.size();
        for (int i = x; i < x + w; i++) {
            data[shift_first + i * 3] = colors.at(cls).red();
            data[shift_first + i * 3 + 1] = colors.at(cls).green();
            data[shift_first + i * 3 + 2] = colors.at(cls).blue();
            data[shift_second + i * 3] = colors.at(cls).red();
            data[shift_second + i * 3 + 1] = colors.at(cls).green();
            data[shift_second + i * 3 + 2] = colors.at(cls).blue();
        thickness = std::min(std::min(thickness, w / 2 + 1), h / 2 + 1);

        size_t shift_first;
        size_t shift_second;
        for (int t = 0; t < thickness; t++) {
            shift_first = (y + t) * width * 3;
            shift_second = (y + h - t) * width * 3;
            for (int i = x; i < x + w + 1; i++) {
                data[shift_first + i * 3] = colors.at(cls).red();
                data[shift_first + i * 3 + 1] = colors.at(cls).green();
                data[shift_first + i * 3 + 2] = colors.at(cls).blue();
                data[shift_second + i * 3] = colors.at(cls).red();
                data[shift_second + i * 3 + 1] = colors.at(cls).green();
                data[shift_second + i * 3 + 2] = colors.at(cls).blue();
            }
        }

        shift_first = x * 3;
        shift_second = (x + w) * 3;
        for (int i = y; i < y + h; i++) {
            data[shift_first + i*width * 3] = colors.at(cls).red();
            data[shift_first + i*width * 3 + 1] = colors.at(cls).green();
            data[shift_first + i*width * 3 + 2] = colors.at(cls).blue();
            data[shift_second + i*width * 3] = colors.at(cls).red();
            data[shift_second + i*width * 3 + 1] = colors.at(cls).green();
            data[shift_second + i*width * 3 + 2] = colors.at(cls).blue();
        for (int t = 0; t < thickness; t++) {
            shift_first = (x + t) * 3;
            shift_second = (x + w - t) * 3;
            for (int i = y; i < y + h + 1; i++) {
                data[shift_first + i * width * 3] = colors.at(cls).red();
                data[shift_first + i * width * 3 + 1] = colors.at(cls).green();
                data[shift_first + i * width * 3 + 2] = colors.at(cls).blue();
                data[shift_second + i * width * 3] = colors.at(cls).red();
                data[shift_second + i * width * 3 + 1] = colors.at(cls).green();
                data[shift_second + i * width * 3 + 2] = colors.at(cls).blue();
            }
        }
    }
}
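
// Example use (a sketch): draw a 2-pixel-thick box for class 0 at (50, 60) with
// size 100x80 over an RGB image stored as height * width * 3 bytes:
//
//   addRectangles(imageData, height, width, {50, 60, 100, 80}, {0}, 2);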

@@ -1091,4 +1115,4 @@ static InferenceEngine::Blob::Ptr wrapMat2Blob(const cv::Mat &mat) {

    return InferenceEngine::make_shared_blob<uint8_t>(tDesc, mat.data);
}
#endif
#endif
inference-engine/samples/create_msvc2015_solution.bat (new file, 31 lines)
@@ -0,0 +1,31 @@
|
||||
@echo off
|
||||
|
||||
:: Copyright (c) 2018 Intel Corporation
|
||||
::
|
||||
:: Licensed under the Apache License, Version 2.0 (the "License");
|
||||
:: you may not use this file except in compliance with the License.
|
||||
:: You may obtain a copy of the License at
|
||||
::
|
||||
:: http://www.apache.org/licenses/LICENSE-2.0
|
||||
::
|
||||
:: Unless required by applicable law or agreed to in writing, software
|
||||
:: distributed under the License is distributed on an "AS IS" BASIS,
|
||||
:: WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
:: See the License for the specific language governing permissions and
|
||||
:: limitations under the License.
|
||||
|
||||
|
||||
@setlocal
|
||||
set "ROOT_DIR=%~dp0"
|
||||
|
||||
set "SOLUTION_DIR64=%USERPROFILE%\Documents\Intel\OpenVINO\inference_engine_samples_2015"
|
||||
if exist "%SOLUTION_DIR64%" rd /s /q "%SOLUTION_DIR64%"
|
||||
if "%InferenceEngine_DIR%"=="" set "InferenceEngine_DIR=%ROOT_DIR%\..\share"
|
||||
if exist "%ROOT_DIR%\..\..\bin\setupvars.bat" call "%ROOT_DIR%\..\..\bin\setupvars.bat"
|
||||
if exist "%ROOT_DIR%\..\..\..\bin\setupvars.bat" call "%ROOT_DIR%\..\..\..\bin\setupvars.bat"
|
||||
|
||||
echo Creating Visual Studio 2015 (x64) files in %SOLUTION_DIR64%... && ^
|
||||
cd "%ROOT_DIR%" && cmake -E make_directory "%SOLUTION_DIR64%" && cd "%SOLUTION_DIR64%" && cmake -G "Visual Studio 14 2015 Win64" "%ROOT_DIR%"
|
||||
|
||||
echo Done.
|
||||
pause
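
:: Typical use: run the script from the samples folder; it regenerates the
:: solution under %USERPROFILE%\Documents\Intel\OpenVINO, which can then be
:: opened in Visual Studio 2015:
::
::   create_msvc2015_solution.bat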
inference-engine/samples/create_msvc2017_solution.bat (new file, 31 lines)
@@ -0,0 +1,31 @@
|
||||
@echo off
|
||||
|
||||
:: Copyright (c) 2018 Intel Corporation
|
||||
::
|
||||
:: Licensed under the Apache License, Version 2.0 (the "License");
|
||||
:: you may not use this file except in compliance with the License.
|
||||
:: You may obtain a copy of the License at
|
||||
::
|
||||
:: http://www.apache.org/licenses/LICENSE-2.0
|
||||
::
|
||||
:: Unless required by applicable law or agreed to in writing, software
|
||||
:: distributed under the License is distributed on an "AS IS" BASIS,
|
||||
:: WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
:: See the License for the specific language governing permissions and
|
||||
:: limitations under the License.
|
||||
|
||||
|
||||
@setlocal
|
||||
set "ROOT_DIR=%~dp0"
|
||||
|
||||
set "SOLUTION_DIR64=%USERPROFILE%\Documents\Intel\OpenVINO\inference_engine_samples_2017"
|
||||
if exist "%SOLUTION_DIR64%" rd /s /q "%SOLUTION_DIR64%"
|
||||
if "%InferenceEngine_DIR%"=="" set "InferenceEngine_DIR=%ROOT_DIR%\..\share"
|
||||
if exist "%ROOT_DIR%\..\..\bin\setupvars.bat" call "%ROOT_DIR%\..\..\bin\setupvars.bat"
|
||||
if exist "%ROOT_DIR%\..\..\..\bin\setupvars.bat" call "%ROOT_DIR%\..\..\..\bin\setupvars.bat"
|
||||
|
||||
echo Creating Visual Studio 2017 (x64) files in %SOLUTION_DIR64%... && ^
|
||||
cd "%ROOT_DIR%" && cmake -E make_directory "%SOLUTION_DIR64%" && cd "%SOLUTION_DIR64%" && cmake -G "Visual Studio 15 2017 Win64" "%ROOT_DIR%"
|
||||
|
||||
echo Done.
|
||||
pause
|
||||
@@ -26,15 +26,15 @@ file (GLOB SRC
|
||||
|
||||
# Find OpenCV libray if exists
|
||||
find_package(OpenCV)
|
||||
if(OpenCV_FOUND)
|
||||
include_directories(${OpenCV_INCLUDE_DIRS})
|
||||
else()
|
||||
if(NOT(OpenCV_FOUND))
|
||||
message(STATUS "OPENCV is disabled or not found, " ${TARGET_NAME} " skiped")
|
||||
return()
|
||||
endif()
|
||||
|
||||
source_group("src" FILES ${SRC})
|
||||
|
||||
include_directories(${OpenCV_INCLUDE_DIRS})
|
||||
|
||||
link_directories(${LIB_FOLDER})
|
||||
|
||||
# Create library file from sources.
|
||||
@@ -46,7 +46,6 @@ set_target_properties(${TARGET_NAME} PROPERTIES "CMAKE_CXX_FLAGS" "${CMAKE_CXX_F
|
||||
|
||||
target_link_libraries(${TARGET_NAME} ${InferenceEngine_LIBRARIES} ${OpenCV_LIBRARIES})
|
||||
|
||||
|
||||
if(UNIX)
|
||||
target_link_libraries(${TARGET_NAME} ${LIB_DL})
|
||||
endif()
|
||||
|
||||
@@ -1,15 +1,15 @@
|
||||
# Hello Autoresize Classification Sample {#InferenceEngineHelloAutoresizeClassificationSample}
|
||||
# Hello Autoresize Classification Sample
|
||||
|
||||
This topic describes how to run the Hello Autoresize Classification sample application.
|
||||
The sample is simplified version of [Image Classification Sample](@ref InferenceEngineClassificationSampleApplication).
|
||||
The sample is simplified version of [Image Classification Sample](./samples/classification_sample/README.md).
|
||||
It's intended to demonstrate using of new input autoresize API of Inference Engine in applications. Refer to
|
||||
[Integrate with customer application New Request API](@ref IntegrateIEInAppNewAPI) for details.
|
||||
[Integrate with customer application New Request API](./docs/Inference_Engine_Developer_Guide/Integrate_with_customer_application_new_API.md) for details.
|
||||
|
||||
There is also new API introduced to crop a ROI object and set it as input without additional memory re-allocation.
|
||||
To properly demonstrate this new API it's required to run several networks in pipeline which is out of scope of this sample.
|
||||
Please refer to [Object Detection for SSD Demo app](@ref InferenceEngineObjectDetectionSSDDemoAsyncApplication) or
|
||||
[Security Barrier Camera Demo](@ref InferenceEngineSecurityBarrierCameraDemoApplication) or
|
||||
[Crossroad Camera Demo](@ref InferenceEngineCrossroadCameraDemoApplication) with an example of using of new crop ROI API.
|
||||
Please refer to [Object Detection for SSD Demo app](./samples/object_detection_demo_ssd_async/README.md) or
|
||||
[Security Barrier Camera Demo](./samples/security_barrier_camera_demo/README.md) or
|
||||
[Crossroad Camera Demo](./samples/crossroad_camera_demo/README.md) with an example of using of new crop ROI API.
|
||||
|
||||
## Running
|
||||
|
||||
@@ -23,4 +23,4 @@ You can do inference on an image using a trained AlexNet network on Intel® P
|
||||
The application outputs top-10 inference results.
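
A typical invocation looks like this (the positional argument order follows the other hello_* samples; the paths are placeholders):
```sh
./hello_autoresize_classification <path_to_model>/alexnet_fp32.xml <path_to_image>/cat.bmp CPU
```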

## See Also
* [Using Inference Engine Samples](@ref SamplesOverview)
* [Using Inference Engine Samples](./docs/Inference_Engine_Developer_Guide/Samples_Overview.md)

@@ -1,18 +1,7 @@
/*
// Copyright (c) 2018 Intel Corporation
// Copyright (C) 2018 Intel Corporation
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
// SPDX-License-Identifier: Apache-2.0
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
*/

#include <iomanip>
#include <vector>

@@ -26,9 +26,7 @@ file (GLOB SRC

# Find OpenCV library if it exists
find_package(OpenCV)
if(OpenCV_FOUND)
    include_directories(${OpenCV_INCLUDE_DIRS})
else()
if(NOT(OpenCV_FOUND))
    message(STATUS "OPENCV is disabled or not found, " ${TARGET_NAME} " skipped")
    return()
endif()
@@ -37,6 +35,8 @@ endif()
# Empty name lists them directly under the .vcproj
source_group("src" FILES ${SRC})

include_directories(${OpenCV_INCLUDE_DIRS})

link_directories(${LIB_FOLDER})

# Create library file from sources.

@@ -1,18 +1,7 @@
/*
// Copyright (c) 2018 Intel Corporation
// Copyright (C) 2018 Intel Corporation
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
// SPDX-License-Identifier: Apache-2.0
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
*/

#include <iomanip>
#include <vector>

@@ -26,9 +26,7 @@ file (GLOB SRC

# Find OpenCV library if it exists
find_package(OpenCV)
if(OpenCV_FOUND)
    include_directories(${OpenCV_INCLUDE_DIRS})
else()
if(NOT(OpenCV_FOUND))
    message(STATUS "OPENCV is disabled or not found, " ${TARGET_NAME} " skipped")
    return()
endif()
@@ -37,6 +35,8 @@ endif()
# Empty name lists them directly under the .vcproj
source_group("src" FILES ${SRC})

include_directories(${OpenCV_INCLUDE_DIRS})

link_directories(${LIB_FOLDER})

# Create library file from sources.

@@ -1,9 +1,9 @@
# Hello Infer Request Classification Sample {#InferenceEngineHelloRequestClassificationSample}
# Hello Infer Request Classification Sample

This topic describes how to run the Hello Infer Classification sample application.
The sample is a simplified version of [Image Classification Sample](@ref InferenceEngineClassificationSampleApplication).
The sample is a simplified version of [Image Classification Sample](./samples/classification_sample/README.md).
It is intended to demonstrate the use of the new Infer Request API of the Inference Engine in applications. Refer to
[Integrate with customer application New Request API](@ref IntegrateIEInAppNewAPI) for details.
[Integrate with customer application New Request API](./docs/Inference_Engine_Developer_Guide/Integrate_with_customer_application_new_API.md) for details.

## Running

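A typical invocation might look like this (the binary name and positional argument order are assumptions based on the other hello_* samples; the paths are placeholders):
```sh
./hello_request_classification <path_to_model>/alexnet_fp32.xml <path_to_image>/cat.bmp CPU
```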
@@ -18,4 +18,4 @@ The application outputs top-10 inference results.

## See Also
* [Using Inference Engine Samples](@ref SamplesOverview)
* [Using Inference Engine Samples](./docs/Inference_Engine_Developer_Guide/Samples_Overview.md)

@@ -1,18 +1,7 @@
/*
// Copyright (c) 2018 Intel Corporation
// Copyright (C) 2018 Intel Corporation
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
// SPDX-License-Identifier: Apache-2.0
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
*/

#include <iomanip>
#include <vector>

@@ -0,0 +1,56 @@
# Copyright (c) 2018 Intel Corporation

# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at

# http://www.apache.org/licenses/LICENSE-2.0

# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
cmake_minimum_required(VERSION 2.8)

set(TARGET_NAME "hello_shape_infer_ssd")

if (BUILD_SAMPLE_NAME AND NOT ${BUILD_SAMPLE_NAME} STREQUAL ${TARGET_NAME})
    message(STATUS "SAMPLE ${TARGET_NAME} SKIPPED")
    return()
endif ()

file(GLOB SRC
        ${CMAKE_CURRENT_SOURCE_DIR}/*.cpp
        )

file(GLOB HEADERS
        ${CMAKE_CURRENT_SOURCE_DIR}/*.hpp
        )

# Find OpenCV library if it exists
find_package(OpenCV)
if(NOT(OpenCV_FOUND))
    message(STATUS "OPENCV is disabled or not found, " ${TARGET_NAME} " skipped")
    return()
endif()

# Create named folders for the sources within the .vcproj
# Empty name lists them directly under the .vcproj
source_group("src" FILES ${SRC})
source_group("headers" FILES ${HEADERS})

include_directories(${OpenCV_INCLUDE_DIRS})

link_directories(${LIB_FOLDER})

# Create library file from sources.
add_executable(${TARGET_NAME} ${SRC} ${HEADERS})

set_target_properties(${TARGET_NAME} PROPERTIES COMPILE_PDB_NAME ${TARGET_NAME})

target_link_libraries(${TARGET_NAME} ${InferenceEngine_LIBRARIES} IE::ie_cpu_extension ${OpenCV_LIBRARIES})

if (UNIX)
    target_link_libraries(${TARGET_NAME} ${LIB_DL})
endif ()
inference-engine/samples/hello_shape_infer_ssd/README.md (new file, 20 lines)
@@ -0,0 +1,20 @@
# Hello Shape Infer Sample

This topic demonstrates how to run the Hello Shape Infer SSD application, which does inference using object detection
networks like SSD-VGG. The sample shows how to use the [Shape Inference feature](./docs/Inference_Engine_Developer_Guide/ShapeInference.md).
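
At its core, the feature amounts to a few calls (a sketch mirroring `main.cpp` below; the batch value is illustrative):
```cpp
auto input_shapes = network.getInputShapes();  // maps input name -> SizeVector
std::string name;
InferenceEngine::SizeVector shape;
std::tie(name, shape) = *input_shapes.begin();
shape[0] = 3;                                  // e.g. change the batch dimension
input_shapes[name] = shape;
network.reshape(input_shapes);                 // propagates new shapes through the network
```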

## Running

You can use the following command to do inference on Intel® Processors on an image using a trained SSD network:
```sh
./hello_shape_infer_ssd <path_to_model>/ssd_300.xml <path_to_image>/500x500.bmp CPU 3
```

### Outputs

The application renders an image with detected objects enclosed in rectangles. It outputs the list of classes
of the detected objects along with the respective confidence values and the coordinates of the
rectangles to the standard output stream.

## See Also
* [Using Inference Engine Samples](./docs/Inference_Engine_Developer_Guide/Samples_Overview.md)
inference-engine/samples/hello_shape_infer_ssd/main.cpp (new file, 173 lines)
@@ -0,0 +1,173 @@
|
||||
// Copyright (C) 2018 Intel Corporation
|
||||
//
|
||||
// SPDX-License-Identifier: Apache-2.0
|
||||
//
|
||||
|
||||
#include <vector>
|
||||
#include <memory>
|
||||
#include <string>
|
||||
|
||||
#include <opencv2/opencv.hpp>
|
||||
#include <inference_engine.hpp>
|
||||
#include <samples/common.hpp>
|
||||
#include <ext_list.hpp>
|
||||
|
||||
#include "shape_infer_extension.hpp"
|
||||
|
||||
using namespace InferenceEngine;
|
||||
|
||||
int main(int argc, char* argv[]) {
|
||||
try {
|
||||
// ------------------------------ Parsing and validation of input args ---------------------------------
|
||||
if (argc != 5) {
|
||||
std::cout << "Usage : ./hello_shape_infer_ssd <path_to_model> <path_to_image> <device> <batch>"
|
||||
<< std::endl;
|
||||
return EXIT_FAILURE;
|
||||
}
|
||||
const std::string input_model{argv[1]};
|
||||
const std::string input_image_path{argv[2]};
|
||||
const std::string device_name{argv[3]};
|
||||
const size_t batch_size{std::stoul(argv[4])};
|
||||
// -----------------------------------------------------------------------------------------------------
|
||||
|
||||
// --------------------------- 1. Load Plugin for inference engine -------------------------------------
|
||||
InferencePlugin plugin = PluginDispatcher({"../../../lib/intel64", ""}).getPluginByDevice(device_name);
|
||||
IExtensionPtr cpuExtension, inPlaceExtension;
|
||||
if (device_name == "CPU") {
|
||||
cpuExtension = std::make_shared<Extensions::Cpu::CpuExtensions>();
|
||||
inPlaceExtension = std::make_shared<InPlaceExtension>();
|
||||
plugin.AddExtension(cpuExtension);
|
||||
// register sample's custom kernel (CustomReLU)
|
||||
plugin.AddExtension(inPlaceExtension);
|
||||
}
|
||||
// -----------------------------------------------------------------------------------------------------
|
||||
|
||||
// --------------------------- 2. Read IR Generated by ModelOptimizer (.xml and .bin files) ------------
|
||||
CNNNetReader network_reader;
|
||||
network_reader.ReadNetwork(input_model);
|
||||
network_reader.ReadWeights(input_model.substr(0, input_model.size() - 4) + ".bin");
|
||||
CNNNetwork network = network_reader.getNetwork();
|
||||
|
||||
OutputsDataMap outputs_info(network.getOutputsInfo());
|
||||
InputsDataMap inputs_info(network.getInputsInfo());
|
||||
if (inputs_info.size() != 1 && outputs_info.size() != 1)
|
||||
throw std::logic_error("Sample supports clean SSD network with one input and one output");
|
||||
|
||||
// --------------------------- Resize network to match image sizes and given batch----------------------
|
||||
if (device_name == "CPU") {
|
||||
// register shape inference functions (SpatialTransformer) from CPU Extension
|
||||
network.AddExtension(cpuExtension);
|
||||
// register sample's custom shape inference (CustomReLU)
|
||||
network.AddExtension(inPlaceExtension);
|
||||
}
|
||||
auto input_shapes = network.getInputShapes();
|
||||
std::string input_name;
|
||||
SizeVector input_shape;
|
||||
std::tie(input_name, input_shape) = *input_shapes.begin();
|
||||
cv::Mat image = cv::imread(input_image_path);
|
||||
input_shape[0] = batch_size;
|
||||
input_shape[2] = image.rows;
|
||||
input_shape[3] = image.cols;
|
||||
input_shapes[input_name] = input_shape;
|
||||
std::cout << "Resizing network to the image size = [" << image.rows << "x" << image.cols << "] "
|
||||
<< "with batch = " << batch_size << std::endl;
|
||||
network.reshape(input_shapes);
|
||||
// -----------------------------------------------------------------------------------------------------
|
||||
|
||||
// --------------------------- 3. Configure input & output ---------------------------------------------
|
||||
// --------------------------- Prepare input blobs -----------------------------------------------------
|
||||
InputInfo::Ptr input_info;
|
||||
std::tie(input_name, input_info) = *inputs_info.begin();
|
||||
input_info->setLayout(Layout::NCHW);
|
||||
input_info->setPrecision(Precision::U8);
|
||||
// --------------------------- Prepare output blobs ----------------------------------------------------
|
||||
DataPtr output_info;
|
||||
std::string output_name;
|
||||
std::tie(output_name, output_info) = *outputs_info.begin();
|
||||
if (output_info->creatorLayer.lock()->type != "DetectionOutput")
|
||||
throw std::logic_error("Can't find a DetectionOutput layer in the topology");
|
||||
const SizeVector output_shape = output_info->getTensorDesc().getDims();
|
||||
const int max_proposal_count = output_shape[2];
|
||||
const int object_size = output_shape[3];
|
||||
if (object_size != 7) {
|
||||
throw std::logic_error("Output item should have 7 as a last dimension");
|
||||
}
|
||||
if (output_shape.size() != 4) {
|
||||
throw std::logic_error("Incorrect output dimensions for SSD model");
|
||||
}
|
||||
if (output_info == nullptr) {
|
||||
THROW_IE_EXCEPTION << "[SAMPLES] shared_ptr ouput_info == nullptr";
|
||||
}
|
||||
|
||||
output_info->setPrecision(Precision::FP32);
|
||||
|
||||
auto dumpVec = [](const SizeVector& vec) -> std::string {
|
||||
if (vec.empty()) return "[]";
|
||||
std::stringstream oss;
|
||||
oss << "[" << vec[0];
|
||||
for (size_t i = 1; i < vec.size(); i++) oss << "," << vec[i];
|
||||
oss << "]";
|
||||
return oss.str();
|
||||
};
|
||||
std::cout << "Resulting input shape = " << dumpVec(input_shape) << std::endl;
|
||||
std::cout << "Resulting output shape = " << dumpVec(output_shape) << std::endl;
|
||||
// -----------------------------------------------------------------------------------------------------
|
||||

    // --------------------------- 4. Loading model to the plugin ------------------------------------------
    ExecutableNetwork executable_network = plugin.LoadNetwork(network, {});
    // -----------------------------------------------------------------------------------------------------

    // --------------------------- 5. Create infer request -------------------------------------------------
    InferRequest infer_request = executable_network.CreateInferRequest();
    // -----------------------------------------------------------------------------------------------------
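
    /* Note: LoadNetwork compiles the reshaped CNNNetwork for the chosen device; the empty map is
       the per-plugin configuration. One synchronous request is enough for this sample, though
       several requests could be created from the same executable network to pipeline inference. */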

    // --------------------------- 6. Prepare input --------------------------------------------------------
    Blob::Ptr input = infer_request.GetBlob(input_name);
    for (int b = 0; b < batch_size; b++) {
        matU8ToBlob<uint8_t>(image, input, b);
    }
    // -----------------------------------------------------------------------------------------------------
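
    /* matU8ToBlob() is one of the samples' common helpers: it resizes the BGR image to the blob's
       spatial size if needed and writes it into batch position b in planar NCHW order, so the loop
       above simply replicates the same image across the whole batch. */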

    // --------------------------- 7. Do inference ---------------------------------------------------------
    infer_request.Infer();
    // -----------------------------------------------------------------------------------------------------

    // --------------------------- 8. Process output -------------------------------------------------------
    Blob::Ptr output = infer_request.GetBlob(output_name);
    const float* detection = output->buffer().as<PrecisionTrait<Precision::FP32>::value_type*>();

    /* Each detection is a row of 7 floats: image_id, label, confidence, xmin, ymin, xmax, ymax,
       with box coordinates normalized to [0, 1]; image_id denotes the processed image in the batch */
    for (int cur_proposal = 0; cur_proposal < max_proposal_count; cur_proposal++) {
        float image_id = detection[cur_proposal * object_size + 0];
        float label = detection[cur_proposal * object_size + 1];
        float confidence = detection[cur_proposal * object_size + 2];
        /* CPU and GPU plugins fill the DetectionOutput layer differently, so both checks are needed */
        if (image_id < 0 || confidence == 0) {
            continue;
        }

        float xmin = detection[cur_proposal * object_size + 3] * image.cols;
        float ymin = detection[cur_proposal * object_size + 4] * image.rows;
        float xmax = detection[cur_proposal * object_size + 5] * image.cols;
        float ymax = detection[cur_proposal * object_size + 6] * image.rows;

        if (confidence > 0.5) {
            /** Draw only objects with >50% probability and label each box with its confidence **/
            std::ostringstream conf;
            conf << ":" << std::fixed << std::setprecision(3) << confidence;
            cv::rectangle(image, cv::Point2f(xmin, ymin), cv::Point2f(xmax, ymax), cv::Scalar(0, 0, 255));
            cv::putText(image, conf.str(), cv::Point2f(xmin, ymin - 5),
                        cv::FONT_HERSHEY_SIMPLEX, 0.5, cv::Scalar(0, 0, 255));
            std::cout << "[" << cur_proposal << "," << label << "] element, prob = " << confidence
                      << ", bbox = (" << xmin << "," << ymin << ")-(" << xmax << "," << ymax << ")"
                      << ", batch id = " << image_id << std::endl;
        }
    }
    cv::imwrite("hello_shape_infer_ssd_output.jpg", image);
    std::cout << "The resulting image was saved in the file: hello_shape_infer_ssd_output.jpg" << std::endl;
    // -----------------------------------------------------------------------------------------------------
    } catch (const std::exception& ex) {
        std::cerr << ex.what() << std::endl;
        return EXIT_FAILURE;
    }
    return EXIT_SUCCESS;
}
@@ -0,0 +1,146 @@
// Copyright (C) 2018 Intel Corporation
//
// SPDX-License-Identifier: Apache-2.0
//

#include <algorithm>
#include <iostream>  // std::cout is used below
#include <map>
#include <memory>
#include <string>
#include <vector>

#include <inference_engine.hpp>

static const std::string CUSTOM_RELU_TYPE = "CustomReLU";
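
/* This header provides everything needed to both execute and shape-infer the sample's custom layer:
 *  - CustomReLUImpl / CustomReLUFactory: an ILayerExecImpl and the ILayerImplFactory that creates
 *    it, so the plugin can run the CustomReLU kernel;
 *  - CustomReLUResizeImpl: an IShapeInferImpl consulted by CNNNetwork::reshape();
 *  - InPlaceExtension: the IExtension that maps the "CustomReLU" layer type to both of the above. */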

class CustomReLUImpl : public InferenceEngine::ILayerExecImpl {
public:
    explicit CustomReLUImpl(const InferenceEngine::CNNLayer& layer) : _layer(layer) {}

    InferenceEngine::StatusCode getSupportedConfigurations(std::vector<InferenceEngine::LayerConfig>& conf,
                                                           InferenceEngine::ResponseDesc* resp) noexcept override {
        // Advertise a single supported configuration: the tensor descriptors
        // that the layer's first input and first output already carry
        InferenceEngine::DataConfig inDataConfig;
        InferenceEngine::DataConfig outDataConfig;
        auto firstInput = *_layer.insData.begin();
        auto firstOutput = *_layer.outData.begin();
        inDataConfig.desc = firstInput.lock()->getTensorDesc();
        outDataConfig.desc = firstOutput->getTensorDesc();
        InferenceEngine::LayerConfig layerConfig;
        layerConfig.inConfs = {inDataConfig};
        layerConfig.outConfs = {outDataConfig};
        conf.push_back(layerConfig);
        return InferenceEngine::StatusCode::OK;
    }

    InferenceEngine::StatusCode
    init(InferenceEngine::LayerConfig& config, InferenceEngine::ResponseDesc* resp) noexcept override {
        return InferenceEngine::StatusCode::OK;
    }

    InferenceEngine::StatusCode
    execute(std::vector<InferenceEngine::Blob::Ptr>& inputs, std::vector<InferenceEngine::Blob::Ptr>& outputs,
            InferenceEngine::ResponseDesc* resp) noexcept override {
        static bool wasCalled = false;
        if (!wasCalled) {
            std::cout << "Running " + CUSTOM_RELU_TYPE + " kernel for the first time (next messages won't be printed)"
                      << std::endl;
            wasCalled = true;
        }
        // Plain ReLU: copy each input blob to the matching output, clamping negatives to zero
        for (size_t i = 0; i < inputs.size(); i++) {
            auto inputBlob = inputs[i];
            auto outputBlob = outputs[i];
            auto inputData = inputBlob->buffer().as<InferenceEngine::PrecisionTrait<InferenceEngine::Precision::FP32>::value_type*>();
            auto outputData = outputBlob->buffer().as<InferenceEngine::PrecisionTrait<InferenceEngine::Precision::FP32>::value_type*>();
            for (size_t j = 0; j < inputBlob->size(); j++) {
                outputData[j] = inputData[j] < 0 ? 0 : inputData[j];
            }
        }
        return InferenceEngine::StatusCode::OK;
    }

private:
    const InferenceEngine::CNNLayer _layer;
};

class CustomReLUFactory : public InferenceEngine::ILayerImplFactory {
public:
    explicit CustomReLUFactory(const InferenceEngine::CNNLayer* layer) : _layer(*layer) {}

    InferenceEngine::StatusCode
    getImplementations(std::vector<InferenceEngine::ILayerImpl::Ptr>& impls,
                       InferenceEngine::ResponseDesc* resp) noexcept override {
        // A factory may return several implementations; this one offers a single CustomReLU kernel
        impls.push_back(std::make_shared<CustomReLUImpl>(_layer));
        return InferenceEngine::StatusCode::OK;
    }

private:
    InferenceEngine::CNNLayer _layer;
};

class CustomReLUResizeImpl : public InferenceEngine::IShapeInferImpl {
public:
    InferenceEngine::StatusCode inferShapes(const std::vector<InferenceEngine::SizeVector>& inShapes,
                                            const std::map<std::string, std::string>& params,
                                            const std::map<std::string, InferenceEngine::Blob::Ptr>& blobs,
                                            std::vector<InferenceEngine::SizeVector>& outShapes,
                                            InferenceEngine::ResponseDesc* desc) noexcept override {
        static bool wasCalled = false;
        if (!wasCalled) {
            std::cout << "Running " + CUSTOM_RELU_TYPE +
                         " shape inference for the first time (next messages won't be printed)" << std::endl;
            wasCalled = true;
        }
        // ReLU is element-wise, so output shapes simply mirror input shapes
        outShapes = inShapes;
        return InferenceEngine::StatusCode::OK;
    }
};

class InPlaceExtension : public InferenceEngine::IExtension {
public:
    InPlaceExtension() {
        _shapeInferImpl = std::make_shared<CustomReLUResizeImpl>();
    }

    InferenceEngine::StatusCode
    getPrimitiveTypes(char**& types, unsigned int& size, InferenceEngine::ResponseDesc* resp) noexcept override {
        // The caller takes ownership of the returned array of C strings
        size = 1;
        types = new char*[size];
        std::string type = CUSTOM_RELU_TYPE;
        types[0] = new char[type.size() + 1];
        std::copy(type.begin(), type.end(), types[0]);
        types[0][type.size()] = 0;
        return InferenceEngine::OK;
    }

    InferenceEngine::StatusCode
    getShapeInferTypes(char**& types, unsigned int& size, InferenceEngine::ResponseDesc* resp) noexcept override {
        // The extension shape-infers exactly the same type list it executes
        return getPrimitiveTypes(types, size, resp);
    }

    InferenceEngine::StatusCode getShapeInferImpl(InferenceEngine::IShapeInferImpl::Ptr& impl, const char* type,
                                                  InferenceEngine::ResponseDesc* resp) noexcept override {
        if (CUSTOM_RELU_TYPE.compare(type) != 0) return InferenceEngine::StatusCode::NOT_IMPLEMENTED;
        impl = _shapeInferImpl;
        return InferenceEngine::StatusCode::OK;
    }

    void GetVersion(const InferenceEngine::Version*& versionInfo) const noexcept override {}

    void SetLogCallback(InferenceEngine::IErrorListener& listener) noexcept override {}

    void Unload() noexcept override {}

    void Release() noexcept override {}

    InferenceEngine::StatusCode
    getFactoryFor(InferenceEngine::ILayerImplFactory*& factory, const InferenceEngine::CNNLayer* cnnLayer,
                  InferenceEngine::ResponseDesc* resp) noexcept override {
        if (cnnLayer->type != CUSTOM_RELU_TYPE)
            return InferenceEngine::StatusCode::NOT_IMPLEMENTED;
        factory = new CustomReLUFactory(cnnLayer);
        return InferenceEngine::StatusCode::OK;
    }

private:
    InferenceEngine::IShapeInferImpl::Ptr _shapeInferImpl;
};
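
/* At run time the plugin resolves a CustomReLU node through two independent lookups, both keyed by
   the layer type string: getFactoryFor() supplies the execution kernel, while getShapeInferImpl()
   supplies the shape-inference routine used by CNNNetwork::reshape(). */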
@@ -43,7 +43,7 @@ add_dependencies(${TARGET_NAME} gflags)
 set_target_properties(${TARGET_NAME} PROPERTIES "CMAKE_CXX_FLAGS" "${CMAKE_CXX_FLAGS} -fPIE"
                       COMPILE_PDB_NAME ${TARGET_NAME})

-target_link_libraries(${TARGET_NAME} format_reader cpu_extension ${InferenceEngine_LIBRARIES} gflags)
+target_link_libraries(${TARGET_NAME} format_reader IE::ie_cpu_extension ${InferenceEngine_LIBRARIES} gflags)

 if(UNIX)
     target_link_libraries( ${TARGET_NAME} ${LIB_DL} pthread)