Compare commits

...

2 Commits

Author SHA1 Message Date
Alexey Suhov
0ef92871b6 Publishing 2019 R1.1 content and Myriad plugin sources (#162)
* Publishing 2019 R1.1 content and Myriad plugin sources
2019-05-27 21:18:32 +03:00
Alexey Suhov
e206d06f18 Publishing 2019 R1.0.1 content 2019-04-30 18:55:07 +03:00
575 changed files with 55161 additions and 2286 deletions


@@ -15,7 +15,12 @@ Deep Learning Deployment Toolkit is licensed under [Apache License Version 2.0](
## Documentation
* [OpenVINO™ Release Notes](https://software.intel.com/en-us/articles/OpenVINO-RelNotes)
* Inference Engine [build instructions](inference-engine/README.md)
* [Inference Engine build instructions](inference-engine/README.md)
* [Get Started with Deep Learning Deployment Toolkit on Linux*](get-started-linux.md)
* [Introduction to Deep Learning Deployment Toolkit](https://docs.openvinotoolkit.org/latest/_docs_IE_DG_Introduction.html)
* [Inference Engine Developer Guide](https://docs.openvinotoolkit.org/latest/_docs_IE_DG_Deep_Learning_Inference_Engine_DevGuide.html)
* [Model Optimizer Developer Guide](https://docs.openvinotoolkit.org/latest/_docs_MO_DG_Deep_Learning_Model_Optimizer_DevGuide.html)
## How to Contribute
We welcome community contributions to the Deep Learning Deployment Toolkit repository. If you have an idea how to improve the product, please share it with us by following these steps:

get-started-linux.md Normal file

@@ -0,0 +1,203 @@
# Get Started with OpenVINO™ Deep Learning Deployment Toolkit (DLDT) on Linux*
This guide provides the information you need to start using the DLDT on Linux*. With this guide, you will learn how to:
1. [Configure the Model Optimizer](#configure-the-model-optimizer)
2. [Prepare a model for sample inference:](#prepare-a-model-for-sample-inference)
1. [Download a pre-trained model](#download-a-trained-model)
2. [Convert the model to an Intermediate Representation (IR) with the Model Optimizer](#convert-the-model-to-an-intermediate-representation-with-the-model-optimizer)
3. [Run the Image Classification Sample Application with the model](#run-the-image-classification-sample-application)
## Prerequisites
1. This guide assumes that you have already cloned the `dldt` repo and successfully built the Inference Engine and Samples using the [build instructions](inference-engine/README.md).
2. The original directory structure of the repository is kept unchanged.
> **NOTE**: Below, the directory to which the `dldt` repository is cloned is referred to as `<DLDT_DIR>`.
## Configure the Model Optimizer
The Model Optimizer is a Python\*-based command line tool for importing trained models from popular deep learning frameworks such as Caffe\*, TensorFlow\*, Apache MXNet\*, ONNX\* and Kaldi\*.
You cannot perform inference on your trained model without running the model through the Model Optimizer. When you run a pre-trained model through the Model Optimizer, your output is an Intermediate Representation (IR) of the network. The Intermediate Representation is a pair of files that describe the whole model:
- `.xml`: Describes the network topology
- `.bin`: Contains the weights and biases binary data
For more information about the Model Optimizer, refer to the [Model Optimizer Developer Guide](https://docs.openvinotoolkit.org/latest/_docs_MO_DG_Deep_Learning_Model_Optimizer_DevGuide.html). 
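As a minimal illustration of the IR naming convention described above, the `.xml`/`.bin` pair can be derived from the model file name with shell parameter expansion (the model name here is the SqueezeNet example used later in this guide; this is a sketch, not part of the Model Optimizer):

```shell
# Hypothetical input; the actual path comes from the Model Downloader step later in this guide
MODEL="squeezenet1.1.caffemodel"
BASE="${MODEL%.*}"       # strip the framework-specific extension
IR_XML="${BASE}.xml"     # network topology
IR_BIN="${BASE}.bin"     # weights and biases binary data
echo "IR pair: $IR_XML + $IR_BIN"
```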
### Model Optimizer Configuration Steps
You can choose to either configure all supported frameworks at once **OR** configure one framework at a time. Choose the option that best suits your needs. If you see error messages, make sure you installed all dependencies.
> **NOTE**: Since the TensorFlow framework is not officially supported on CentOS*, the Model Optimizer for TensorFlow cannot be configured and run on those systems.
> **IMPORTANT**: Internet access is required to execute the following steps successfully. If you access the Internet only through a proxy server, make sure that the proxy is configured in your OS environment.
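For example, a common way to make a proxy visible to the scripts below is to export the standard environment variables in the shell session (the address here is a placeholder, not a real server):

```shell
# Placeholder proxy address -- replace with your actual proxy server and port
export http_proxy="http://proxy.example.com:8080"
export https_proxy="http://proxy.example.com:8080"
export no_proxy="localhost,127.0.0.1"
```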
**Option 1: Configure all supported frameworks at the same time**
1. Go to the Model Optimizer prerequisites directory:
```sh
cd <DLDT_DIR>/model_optimizer/install_prerequisites
```
2. Run the script to configure the Model Optimizer for Caffe\*, TensorFlow\*, MXNet\*, Kaldi\*, and ONNX\*:
```sh
sudo ./install_prerequisites.sh
```
**Option 2: Configure each framework separately**
Configure individual frameworks separately **ONLY** if you did not select **Option 1** above.
1. Go to the Model Optimizer prerequisites directory:
```sh
cd <DLDT_DIR>/model_optimizer/install_prerequisites
```
2. Run the script for your model framework. You can run more than one script:
- For **Caffe**:
```sh
sudo ./install_prerequisites_caffe.sh
```
- For **TensorFlow**:
```sh
sudo ./install_prerequisites_tf.sh
```
- For **MXNet**:
```sh
sudo ./install_prerequisites_mxnet.sh
```
- For **ONNX**:
```sh
sudo ./install_prerequisites_onnx.sh
```
- For **Kaldi**:
```sh
sudo ./install_prerequisites_kaldi.sh
```
The Model Optimizer is configured for one or more frameworks. Continue to the next section to download and prepare a model for running a sample inference.
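The framework-to-script mapping above can be sketched as a small shell helper (illustrative only; the script names are the ones listed in Option 2):

```shell
# Map a framework name to its Model Optimizer prerequisites script
framework_script() {
    case "$1" in
        caffe)      echo "install_prerequisites_caffe.sh" ;;
        tensorflow) echo "install_prerequisites_tf.sh" ;;
        mxnet)      echo "install_prerequisites_mxnet.sh" ;;
        onnx)       echo "install_prerequisites_onnx.sh" ;;
        kaldi)      echo "install_prerequisites_kaldi.sh" ;;
        *)          echo "install_prerequisites.sh" ;;    # all frameworks at once
    esac
}
framework_script tensorflow   # prints install_prerequisites_tf.sh
```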
## Prepare a Model for Sample Inference
This section describes how to get a pre-trained model for sample inference and how to prepare the model's optimized Intermediate Representation that the Inference Engine uses.
### Download a Trained Model
To run the Image Classification Sample you'll need a pre-trained model to run the inference on. This guide will use the public SqueezeNet 1.1 Caffe* model. You can find and download this model manually or use the OpenVINO™ [Model Downloader](https://github.com/opencv/open_model_zoo/tree/master/model_downloader).
With the Model Downloader, you can download other popular public deep learning topologies and the [OpenVINO™ pre-trained models](https://github.com/opencv/open_model_zoo/tree/master/intel_models) prepared for running inference for a wide list of inference scenarios: object detection, object recognition, object re-identification, human pose estimation, action recognition and others.
To download the SqueezeNet 1.1 Caffe* model to a models folder with the Model Downloader:
1. Install the [prerequisites](https://github.com/opencv/open_model_zoo/tree/master/model_downloader#prerequisites).
2. Run `downloader.py`, specifying the topology name and a `<models_dir>` path. For example, to download the model to the `~/public_models` directory:
```sh
./downloader.py --name squeezenet1.1 --output_dir ~/public_models
```
When the model files are successfully downloaded, output similar to the following is printed:
```sh
###############|| Downloading topologies ||###############
========= Downloading /home/username/public_models/classification/squeezenet/1.1/caffe/squeezenet1.1.prototxt
========= Downloading /home/username/public_models/classification/squeezenet/1.1/caffe/squeezenet1.1.caffemodel
... 100%, 4834 KB, 3157 KB/s, 1 seconds passed
###############|| Post processing ||###############
========= Changing input dimensions in squeezenet1.1.prototxt =========
```
### Convert the model to an Intermediate Representation with the Model Optimizer
> **NOTE**: This section assumes that you have configured the Model Optimizer using the instructions from the [Configure the Model Optimizer](#configure-the-model-optimizer) section.
1. Create a `<ir_dir>` directory that will contain the Intermediate Representation (IR) of the model.
2. The Inference Engine can perform inference on a [list of supported devices](https://docs.openvinotoolkit.org/latest/_docs_IE_DG_supported_plugins_Supported_Devices.html) using specific device plugins. Different plugins support models of [different precision formats](https://docs.openvinotoolkit.org/latest/_docs_IE_DG_supported_plugins_Supported_Devices.html#supported_model_formats), such as FP32, FP16, and INT8. To prepare an IR to run inference on particular hardware, run the Model Optimizer with the appropriate `--data_type` option:
**For CPU (FP32):**
```sh
python3 <DLDT_DIR>/model_optimizer/mo.py --input_model <models_dir>/classification/squeezenet/1.1/caffe/squeezenet1.1.caffemodel --data_type FP32 --output_dir <ir_dir>
```
**For GPU and MYRIAD (FP16):**
```sh
python3 <DLDT_DIR>/model_optimizer/mo.py --input_model <models_dir>/classification/squeezenet/1.1/caffe/squeezenet1.1.caffemodel --data_type FP16 --output_dir <ir_dir>
```
After the Model Optimizer script is completed, the produced IR files (`squeezenet1.1.xml`, `squeezenet1.1.bin`) are in the specified `<ir_dir>` directory.
3. Copy the `squeezenet1.1.labels` file from the `<DLDT_DIR>/inference-engine/samples/sample_data/` directory to the model IR directory. This file contains the ImageNet class names, so the inference results show text labels instead of class numbers:
```sh
cp <DLDT_DIR>/inference-engine/samples/sample_data/squeezenet1.1.labels <ir_dir>
```
Now you are ready to run the Image Classification Sample Application.
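Before launching the sample, it can help to confirm that the IR pair and the labels file are all in place. A small sketch, demonstrated here in a throwaway temporary directory with stand-in files; on a real setup, point `ir_dir` at your `<ir_dir>`:

```shell
# Demonstrated in a temporary directory so the check itself is runnable anywhere
ir_dir="$(mktemp -d)"
touch "$ir_dir/squeezenet1.1.xml" "$ir_dir/squeezenet1.1.bin" "$ir_dir/squeezenet1.1.labels"  # stand-ins
ir_ok=yes
for f in squeezenet1.1.xml squeezenet1.1.bin squeezenet1.1.labels; do
    [ -f "$ir_dir/$f" ] || ir_ok=no    # any missing file flips the flag
done
echo "IR files present: $ir_ok"
rm -rf "$ir_dir"
```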
## Run the Image Classification Sample Application
The Inference Engine sample applications were compiled automatically when you built the Inference Engine using the [build instructions](inference-engine/README.md). The binary files are located in the `<DLDT_DIR>/inference-engine/bin/intel64/Release` directory.
Follow the steps below to run the Image Classification sample application on the prepared IR and with an input image:
1. Go to the samples build directory:
```sh
cd <DLDT_DIR>/inference-engine/bin/intel64/Release
```
2. Run the sample executable, specifying the `car.png` file from the `<DLDT_DIR>/inference-engine/samples/sample_data/` directory as the input image, the IR of your model, and a plugin for the hardware device to perform inference on:
**For CPU:**
```sh
./classification_sample -i <DLDT_DIR>/inference-engine/samples/sample_data/car.png -m <ir_dir>/squeezenet1.1.xml -d CPU
```
**For GPU:**
```sh
./classification_sample -i <DLDT_DIR>/inference-engine/samples/sample_data/car.png -m <ir_dir>/squeezenet1.1.xml -d GPU
```
**For MYRIAD:**
>**NOTE**: Running inference on VPU devices (Intel® Movidius™ Neural Compute Stick or Intel® Neural Compute Stick 2) with the MYRIAD plugin requires performing [additional hardware configuration steps](inference-engine/README.md#optional-additional-installation-steps-for-the-intel-movidius-neural-compute-stick-and-neural-compute-stick-2).
```sh
./classification_sample -i <DLDT_DIR>/inference-engine/samples/sample_data/car.png -m <ir_dir>/squeezenet1.1.xml -d MYRIAD
```
When the sample application completes, the label and confidence for the top 10 categories are printed on the screen. Below is a sample output with inference results on CPU:
```sh
Top 10 results:
Image /home/user/dldt/inference-engine/samples/sample_data/car.png
classid probability label
------- ----------- -----
817 0.8363345 sports car, sport car
511 0.0946488 convertible
479 0.0419131 car wheel
751 0.0091071 racer, race car, racing car
436 0.0068161 beach wagon, station wagon, wagon, estate car, beach waggon, station waggon, waggon
656 0.0037564 minivan
586 0.0025741 half track
717 0.0016069 pickup, pickup truck
864 0.0012027 tow truck, tow car, wrecker
581 0.0005882 grille, radiator grille
total inference time: 2.6642941
Average running time of one iteration: 2.6642941 ms
Throughput: 375.3339402 FPS
[ INFO ] Execution successful
```
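Since the result table is plain text, it is straightforward to post-process. A sketch that extracts the top prediction from an embedded excerpt of the output above (the excerpt is hard-coded here purely for illustration):

```shell
# Excerpt of the result table above, embedded for illustration
results='817     0.8363345 sports car, sport car
511     0.0946488 convertible'
# The first row holds the top prediction; columns 1 and 2 are classid and probability
top_classid=$(printf '%s\n' "$results" | head -n1 | awk '{print $1}')
top_prob=$(printf '%s\n' "$results" | head -n1 | awk '{print $2}')
echo "top class: $top_classid (probability $top_prob)"
```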
## Additional Resources
* [OpenVINO™ Release Notes](https://software.intel.com/en-us/articles/OpenVINO-RelNotes)
* [Inference Engine build instructions](inference-engine/README.md)
* [Introduction to Intel® Deep Learning Deployment Toolkit](https://docs.openvinotoolkit.org/latest/_docs_IE_DG_Introduction.html)
* [Inference Engine Developer Guide](https://docs.openvinotoolkit.org/latest/_docs_IE_DG_Deep_Learning_Inference_Engine_DevGuide.html)
* [Model Optimizer Developer Guide](https://docs.openvinotoolkit.org/latest/_docs_MO_DG_Deep_Learning_Model_Optimizer_DevGuide.html)
* [Inference Engine Samples Overview](https://docs.openvinotoolkit.org/latest/_docs_IE_DG_Samples_Overview.html).


@@ -2,7 +2,7 @@
# SPDX-License-Identifier: Apache-2.0
#
cmake_minimum_required(VERSION 3.8 FATAL_ERROR)
cmake_minimum_required(VERSION 3.5 FATAL_ERROR)
project(InferenceEngine)


@@ -1,5 +1,34 @@
## Repository components
# Build Inference Engine
## Contents
- [Introduction](#introduction)
- [Build on Linux* Systems](#build-on-linux-systems)
- [Software Requirements](#software-requirements)
- [Build Steps](#build-steps)
- [Additional Build Options](#additional-build-options)
- [Build for Raspbian* Stretch OS](#build-for-raspbian-stretch-os)
- [Hardware Requirements](#hardware-requirements)
- [Native Compilation](#native-compilation)
- [Cross Compilation Using Docker*](#cross-compilation-using-docker)
- [Additional Build Options](#additional-build-options-1)
- [Build on Windows* Systems](#build-on-windows-systems)
- [Software Requirements](#software-requirements-1)
- [Build Steps](#build-steps-1)
- [Additional Build Options](#additional-build-options-2)
- [Building Inference Engine with Ninja* Build System](#building-inference-engine-with-ninja-build-system)
- [Build on macOS* Systems](#build-on-macos-systems)
- [Software Requirements](#software-requirements-2)
- [Build Steps](#build-steps-2)
- [Additional Build Options](#additional-build-options-3)
- [Use Custom OpenCV Builds for Inference Engine](#use-custom-opencv-builds-for-inference-engine)
- [(Optional) Additional Installation Steps for the Intel® Movidius™ Neural Compute Stick and Neural Compute Stick 2](#optional-additional-installation-steps-for-the-intel-movidius-neural-compute-stick-and-neural-compute-stick-2)
- [For Linux, Raspbian Stretch* OS](#for-linux-raspbian-stretch-os)
- [For Windows](#for-windows-1)
- [Next Steps](#next-steps)
- [Additional Resources](#additional-resources)
## Introduction
The Inference Engine can run inference on models in different formats, with various input and output formats.
The open source version of Inference Engine includes the following plugins:
@@ -9,21 +38,22 @@ The open source version of Inference Engine includes the following plugins:
| CPU plugin | Intel® Xeon® with Intel® AVX2 and AVX512, Intel® Core™ Processors with Intel® AVX2, Intel® Atom® Processors with Intel® SSE |
| GPU plugin | Intel® Processor Graphics, including Intel® HD Graphics and Intel® Iris® Graphics |
| GNA plugin | Intel® Speech Enabling Developer Kit, Amazon Alexa* Premium Far-Field Developer Kit, Intel® Pentium® Silver processor J5005, Intel® Celeron® processor J4005, Intel® Core™ i3-8121U processor |
| MYRIAD plugin | Intel® Movidius™ Neural Compute Stick powered by the Intel® Movidius™ Myriad™ 2, Intel® Neural Compute Stick 2 powered by the Intel® Movidius™ Myriad™ X |
| Heterogeneous plugin | The Heterogeneous plugin enables inference of a single network across several Intel® devices. |
Inference Engine plugins for Intel® FPGA and Intel® Movidius™ Neural Compute Stick are distributed only in a binary form as a part of [Intel® Distribution of OpenVINO™](https://software.intel.com/en-us/openvino-toolkit).
Inference Engine plugin for Intel® FPGA is distributed only in a binary form as a part of [Intel® Distribution of OpenVINO™](https://software.intel.com/en-us/openvino-toolkit).
## Build on Linux\* Systems
## Build on Linux* Systems
The software was validated on:
- Ubuntu\* 16.04 (64-bit) with default GCC\* 5.4.0
- CentOS\* 7.4 (64-bit) with default GCC\* 4.8.5
- [Intel® Graphics Compute Runtime for OpenCL™ Driver package 18.28.11080](https://github.com/intel/compute-runtime/releases/tag/18.28.11080).
### Software Requirements
- [CMake\*](https://cmake.org/download/) 3.9 or higher
- [CMake\*](https://cmake.org/download/) 3.5 or higher
- GCC\* 4.8 or higher to build the Inference Engine
- Python 2.7 or higher for Inference Engine Python API wrapper
- (Optional) [Install Intel® Graphics Compute Runtime for OpenCL™ Driver package 19.04.12237](https://github.com/intel/compute-runtime/releases/tag/19.04.12237).
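The version minimums above can be checked mechanically; the helper below is an illustrative sketch (not part of the build scripts), using `sort -V` for the version comparison:

```shell
# version_ge HAVE NEED -- succeeds if HAVE >= NEED (sort -V orders version strings)
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n1)" = "$2" ]
}
# Sample version strings; substitute e.g. "$(cmake --version | awk 'NR==1{print $3}')"
version_ge "3.9.6" "3.5" && cmake_ok=yes || cmake_ok=no
version_ge "4.8.5" "4.8" && gcc_ok=yes || gcc_ok=no
echo "cmake: $cmake_ok, gcc: $gcc_ok"
```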
### Build Steps
1. Clone submodules:
@@ -33,34 +63,42 @@ The software was validated on:
git submodule update --recursive
```
2. Install build dependencies using the `install_dependencies.sh` script in the project root folder.
3. Create a build folder:
3. By default, the build enables the Inference Engine GPU plugin to infer models on your Intel® Processor Graphics. This requires you to [Install Intel® Graphics Compute Runtime for OpenCL™ Driver package 19.04.12237](https://github.com/intel/compute-runtime/releases/tag/19.04.12237) before running the build. If you don't want to use the GPU plugin, use the `-DENABLE_CLDNN=OFF` CMake build option and skip the installation of the Intel® Graphics Compute Runtime for OpenCL™ Driver.
4. Create a build folder:
```sh
mkdir build
mkdir build && cd build
```
4. Inference Engine uses a CMake-based build system. In the created `build` directory, run `cmake` to fetch project dependencies and create Unix makefiles, then run `make` to build the project:
5. Inference Engine uses a CMake-based build system. In the created `build` directory, run `cmake` to fetch project dependencies and create Unix makefiles, then run `make` to build the project:
```sh
cmake -DCMAKE_BUILD_TYPE=Release ..
make -j16
make --jobs=$(nproc --all)
```
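The `$(nproc --all)` expression above sets the job count to the number of available cores; a small sketch with a fallback for systems where `nproc` is unavailable:

```shell
# Use all available cores for make; fall back to 4 if nproc is unavailable
JOBS=$(nproc --all 2>/dev/null || echo 4)
echo "make --jobs=$JOBS"
```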
You can use the following additional build options:
- Internal JIT GEMM implementation is used by default.
- To switch to OpenBLAS\* implementation, use `GEMM=OPENBLAS` option and `BLAS_INCLUDE_DIRS` and `BLAS_LIBRARIES` cmake options to specify path to OpenBLAS headers and library, for example use the following options on CentOS\*: `-DGEMM=OPENBLAS -DBLAS_INCLUDE_DIRS=/usr/include/openblas -DBLAS_LIBRARIES=/usr/lib64/libopenblas.so.0`
- To switch to the optimized MKL-ML\* GEMM implementation, use `-DGEMM=MKL` and `-DMKLROOT=<path_to_MKL>` cmake options to specify a path to unpacked MKL-ML with the `include` and `lib` folders. MKL-ML\* package can be downloaded [here](https://github.com/intel/mkl-dnn/releases/download/v0.17/mklml_lnx_2019.0.1.20180928.tgz)
### Additional Build Options
You can use the following additional build options:
- Internal JIT GEMM implementation is used by default.
- To switch to OpenBLAS\* implementation, use the `GEMM=OPENBLAS` option and `BLAS_INCLUDE_DIRS` and `BLAS_LIBRARIES` CMake options to specify path to the OpenBLAS headers and library. For example use the following options on CentOS\*: `-DGEMM=OPENBLAS -DBLAS_INCLUDE_DIRS=/usr/include/openblas -DBLAS_LIBRARIES=/usr/lib64/libopenblas.so.0`.
- To switch to the optimized MKL-ML\* GEMM implementation, use `-DGEMM=MKL` and `-DMKLROOT=<path_to_MKL>` CMake options to specify a path to unpacked MKL-ML with the `include` and `lib` folders. MKL-ML\* package can be downloaded from the [MKL-DNN repository](https://github.com/intel/mkl-dnn/releases/download/v0.17/mklml_lnx_2019.0.1.20180928.tgz).
- Threading Building Blocks (TBB) is used by default. To build the Inference Engine with OpenMP* threading, set the `-DTHREADING=OMP` option.
- Required versions of TBB and OpenCV packages are downloaded automatically by the CMake-based script. If you already have installed TBB or OpenCV packages configured in your environment, you may need to clean the `TBBROOT` and `OpenCV_DIR` environment variables before running the `cmake` command, otherwise they won't be downloaded and the build may fail if incompatible versions were installed.
- Required versions of TBB and OpenCV packages are downloaded automatically by the CMake-based script. If you want to use the automatically downloaded packages but you already have installed TBB or OpenCV packages configured in your environment, you may need to clean the `TBBROOT` and `OpenCV_DIR` environment variables before running the `cmake` command, otherwise they won't be downloaded and the build may fail if incompatible versions were installed.
- If the CMake-based build script can not find and download the OpenCV package that is supported on your platform, or if you want to use a custom build of the OpenCV library, refer to the [Use Custom OpenCV Builds](#use-custom-opencv-builds-for-inference-engine) section for details.
- To build the Python API wrapper, use the `-DENABLE_PYTHON=ON` option. To specify an exact Python version, use the following options:
```sh
-DPYTHON_EXECUTABLE=`which python3.7` \
-DPYTHON_LIBRARY=/usr/lib/x86_64-linux-gnu/libpython3.7m.so \
-DPYTHON_INCLUDE_DIR=/usr/include/python3.7
```
- To switch on/off the CPU and GPU plugins, use `cmake` options `-DENABLE_MKL_DNN=ON/OFF` and `-DENABLE_CLDNN=ON/OFF`.
```sh
-DPYTHON_EXECUTABLE=`which python3.7` \
-DPYTHON_LIBRARY=/usr/lib/x86_64-linux-gnu/libpython3.7m.so \
-DPYTHON_INCLUDE_DIR=/usr/include/python3.7
```
- To switch off/on the CPU and GPU plugins, use the `cmake` options `-DENABLE_MKL_DNN=ON/OFF` and `-DENABLE_CLDNN=ON/OFF` respectively.
5. Adding to your project
For CMake projects, set an environment variable `InferenceEngine_DIR`:
@@ -79,16 +117,179 @@ You can use the following additional build options:
target_link_libraries(${PROJECT_NAME} ${InferenceEngine_LIBRARIES} dl)
```
## Build on Windows\* Systems:
## Build for Raspbian Stretch* OS
> **NOTE**: Only the MYRIAD plugin is supported.
### Hardware Requirements
* Raspberry Pi\* 2 or 3 with Raspbian\* Stretch OS (32-bit). Check that its CPU supports the ARMv7 instruction set (the `uname -m` command returns `armv7l`).
> **NOTE**: Although the Raspberry Pi\* CPU is ARMv8, the 32-bit OS detects the ARMv7 CPU instruction set. The default `gcc` compiler applies the ARMv6 architecture flag for compatibility with earlier board versions. For more information, run the `gcc -Q --help=target` command and refer to the description of the `-march=` option.
You can compile the Inference Engine for Raspberry Pi\* in one of two ways:
* [Native Compilation](#native-compilation), which is the simplest way, but time-consuming
* [Cross Compilation Using Docker*](#cross-compilation-using-docker), which is the recommended way
### Native Compilation
Native compilation of the Inference Engine is the most straightforward solution. However, it might take at least one hour to complete on Raspberry Pi\* 3.
1. Install dependencies:
```bash
sudo apt-get update
sudo apt-get install -y git cmake libusb-1.0-0-dev
```
2. Go to the `inference-engine` directory of the cloned `dldt` repository:
```bash
cd dldt/inference-engine
```
3. Initialize submodules:
```bash
git submodule init
git submodule update --recursive
```
4. Create a build folder:
```bash
mkdir build && cd build
```
5. Build the Inference Engine:
```bash
cmake -DCMAKE_BUILD_TYPE=Release \
-DENABLE_SSE42=OFF \
-DTHREADING=SEQ \
-DENABLE_GNA=OFF .. && make -j2
```
### Cross Compilation Using Docker*
This compilation was tested on the following configuration:
* Host: Ubuntu\* 16.04 (64-bit, Intel® Core™ i7-6700K CPU @ 4.00GHz × 8)
* Target: Raspbian\* Stretch (32-bit, ARMv7, Raspberry Pi\* 3)
1. Install Docker\*:
```bash
sudo apt-get install -y docker.io
```
2. Add the current user to the `docker` group:
```bash
sudo usermod -a -G docker $USER
```
Log out and log back in for this change to take effect.
3. Create a directory named `ie_cross_armhf` and add a text file named `Dockerfile`
with the following content:
```docker
FROM debian:stretch
USER root
RUN dpkg --add-architecture armhf && \
apt-get update && \
apt-get install -y --no-install-recommends \
build-essential \
crossbuild-essential-armhf \
git \
wget \
libusb-1.0-0-dev:armhf \
libgtk-3-dev:armhf \
libavcodec-dev:armhf \
libavformat-dev:armhf \
libswscale-dev:armhf \
libgstreamer1.0-dev:armhf \
libgstreamer-plugins-base1.0-dev:armhf \
libpython3-dev:armhf \
python3-pip
RUN wget https://www.cmake.org/files/v3.14/cmake-3.14.3.tar.gz && \
tar xf cmake-3.14.3.tar.gz && \
(cd cmake-3.14.3 && ./bootstrap --parallel=$(nproc --all) && make --jobs=$(nproc --all) && make install) && \
rm -rf cmake-3.14.3 cmake-3.14.3.tar.gz
```
It uses the Debian\* Stretch (Debian 9) OS for compilation because it is the base of Raspbian\* Stretch.
4. Build a Docker\* image:
```bash
docker image build -t ie_cross_armhf ie_cross_armhf
```
5. Run the Docker\* container, mounting the source code folder from the host:
```bash
docker run -it -v /absolute/path/to/dldt:/dldt ie_cross_armhf /bin/bash
```
6. While in the container:
1. Go to the `inference-engine` directory of the cloned `dldt` repository:
```bash
cd dldt/inference-engine
```
2. Create a build folder:
```bash
mkdir build && cd build
```
3. Build the Inference Engine:
```bash
cmake -DCMAKE_BUILD_TYPE=Release \
-DCMAKE_TOOLCHAIN_FILE="../cmake/arm.toolchain.cmake" \
-DTHREADS_PTHREAD_ARG="-pthread" \
-DENABLE_SSE42=OFF \
-DTHREADING=SEQ \
-DENABLE_GNA=OFF .. && make --jobs=$(nproc --all)
```
7. Press Ctrl+D to exit Docker\*. You can find the resulting binaries in the `dldt/inference-engine/bin/armv7l/` directory and the OpenCV* installation in `dldt/inference-engine/temp`.
>**NOTE**: Native applications that link to cross-compiled Inference Engine library require an extra compilation flag `-march=armv7-a`.
### Additional Build Options
You can use the following additional build options:
- Required versions of OpenCV packages are downloaded automatically by the CMake-based script. If you want to use the automatically downloaded packages but you already have installed OpenCV packages configured in your environment, you may need to clean the `OpenCV_DIR` environment variable before running the `cmake` command, otherwise they won't be downloaded and the build may fail if incompatible versions were installed.
- If the CMake-based build script can not find and download the OpenCV package that is supported on your platform, or if you want to use a custom build of the OpenCV library, refer to the [Use Custom OpenCV Builds](#use-custom-opencv-builds-for-inference-engine) section for details.
- To build the Python API wrapper, install the `libpython3-dev:armhf` and `python3-pip` packages using `apt-get`, then install the `numpy` and `cython` Python modules using `pip3`, and add the following CMake options:
```sh
-DENABLE_PYTHON=ON \
-DPYTHON_EXECUTABLE=/usr/bin/python3.5 \
-DPYTHON_LIBRARY=/usr/lib/arm-linux-gnueabihf/libpython3.5m.so \
-DPYTHON_INCLUDE_DIR=/usr/include/python3.5
```
## Build on Windows* Systems
The software was validated on:
- Microsoft\* Windows\* 10 (64-bit) with Visual Studio 2017 and Intel® C++ Compiler 2018 Update 3
- [Intel® Graphics Driver for Windows* [24.20] driver package](https://downloadcenter.intel.com/download/27803/Graphics-Intel-Graphics-Driver-for-Windows-10?v=t).
### Software Requirements
- [CMake\*](https://cmake.org/download/) 3.9 or higher
- [CMake\*](https://cmake.org/download/) 3.5 or higher
- [OpenBLAS\*](https://sourceforge.net/projects/openblas/files/v0.2.14/OpenBLAS-v0.2.14-Win64-int64.zip/download) and [mingw64\* runtime dependencies](https://sourceforge.net/projects/openblas/files/v0.2.14/mingw64_dll.zip/download).
- [Intel® C++ Compiler](https://software.intel.com/en-us/intel-parallel-studio-xe) 18.0 to build the Inference Engine on Windows.
- (Optional) [Intel® Graphics Driver for Windows* [25.20] driver package](https://downloadcenter.intel.com/download/28646/Intel-Graphics-Windows-10-DCH-Drivers?product=80939).
- Python 3.4 or higher for Inference Engine Python API wrapper
### Build Steps
@@ -101,11 +302,12 @@ The software was validated on:
3. Install OpenBLAS:
1. Download [OpenBLAS\*](https://sourceforge.net/projects/openblas/files/v0.2.14/OpenBLAS-v0.2.14-Win64-int64.zip/download)
2. Unzip the downloaded package to a directory on your machine. In this document, this directory is referred to as `<OPENBLAS_DIR>`.
4. Create build directory:
4. By default, the build enables the Inference Engine GPU plugin to infer models on your Intel® Processor Graphics. This requires you to [download and install the Intel® Graphics Driver for Windows* [25.20] driver package](https://downloadcenter.intel.com/download/28646/Intel-Graphics-Windows-10-DCH-Drivers?product=80939) before running the build. If you don't want to use the GPU plugin, use the `-DENABLE_CLDNN=OFF` CMake build option and skip the installation of the Intel® Graphics Driver.
5. Create build directory:
```sh
mkdir build
```
5. In the `build` directory, run `cmake` to fetch project dependencies and generate a Visual Studio solution:
6. In the `build` directory, run `cmake` to fetch project dependencies and generate a Visual Studio solution:
```sh
cd build
cmake -G "Visual Studio 15 2017 Win64" -T "Intel C++ Compiler 18.0" ^
@@ -113,26 +315,32 @@ cmake -G "Visual Studio 15 2017 Win64" -T "Intel C++ Compiler 18.0" ^
-DICCLIB="C:\Program Files (x86)\IntelSWTools\compilers_and_libraries_2018\windows\compiler\lib" ..
```
7. Build generated solution in Visual Studio 2017 or run `cmake --build . --config Release` to build from the command line.
8. Before running the samples, add paths to TBB and OpenCV binaries used for the build to the `%PATH%` environment variable. By default, TBB binaries are downloaded by the CMake-based script to the `<dldt_repo>/inference-engine/temp/tbb/lib` folder, OpenCV binaries - to the `<dldt_repo>/inference-engine/temp/opencv_4.1.0/bin` folder.
### Additional Build Options
- Internal JIT GEMM implementation is used by default.
- To switch to OpenBLAS GEMM implementation, use -DGEMM=OPENBLAS cmake option and specify path to OpenBLAS using `-DBLAS_INCLUDE_DIRS=<OPENBLAS_DIR>\include` and `-DBLAS_LIBRARIES=<OPENBLAS_DIR>\lib\libopenblas.dll.a` options. Prebuilt OpenBLAS\* package can be downloaded [here](https://sourceforge.net/projects/openblas/files/v0.2.14/OpenBLAS-v0.2.14-Win64-int64.zip/download), mingw64* runtime dependencies [here](https://sourceforge.net/projects/openblas/files/v0.2.14/mingw64_dll.zip/download)
- To switch to the optimized MKL-ML\* GEMM implementation, use `-DGEMM=MKL` and `-DMKLROOT=<path_to_MKL>` cmake options to specify a path to unpacked MKL-ML with the `include` and `lib` folders. MKL-ML\* package can be downloaded [here](https://github.com/intel/mkl-dnn/releases/download/v0.17/mklml_win_2019.0.1.20180928.zip)
- To switch to OpenBLAS GEMM implementation, use the `-DGEMM=OPENBLAS` CMake option and specify path to OpenBLAS using the `-DBLAS_INCLUDE_DIRS=<OPENBLAS_DIR>\include` and `-DBLAS_LIBRARIES=<OPENBLAS_DIR>\lib\libopenblas.dll.a` options. Prebuilt OpenBLAS\* package can be downloaded [here](https://sourceforge.net/projects/openblas/files/v0.2.14/OpenBLAS-v0.2.14-Win64-int64.zip/download). mingw64* runtime dependencies can be downloaded [here](https://sourceforge.net/projects/openblas/files/v0.2.14/mingw64_dll.zip/download).
- To switch to the optimized MKL-ML\* GEMM implementation, use the `-DGEMM=MKL` and `-DMKLROOT=<path_to_MKL>` CMake options to specify a path to unpacked MKL-ML with the `include` and `lib` folders. MKL-ML\* package can be downloaded from the [MKL-DNN repository](https://github.com/intel/mkl-dnn/releases/download/v0.17/mklml_win_2019.0.1.20180928.zip).
- Threading Building Blocks (TBB) is used by default. To build the Inference Engine with OpenMP* threading, set the `-DTHREADING=OMP` option.
- Required versions of TBB and OpenCV packages are downloaded automatically by the CMake-based script. If you already have installed TBB or OpenCV packages configured in your environment, you may need to clean the `TBBROOT` and `OpenCV_DIR` environment variables before running the `cmake` command, otherwise they won't be downloaded and the build may fail if incompatible versions were installed.
- Required versions of TBB and OpenCV packages are downloaded automatically by the CMake-based script. If you want to use the automatically downloaded packages but you already have installed TBB or OpenCV packages configured in your environment, you may need to clean the `TBBROOT` and `OpenCV_DIR` environment variables before running the `cmake` command, otherwise they won't be downloaded and the build may fail if incompatible versions were installed.
- If the CMake-based build script cannot find and download the OpenCV package that is supported on your platform, or if you want to use a custom build of the OpenCV library, refer to the [Use Custom OpenCV Builds](#use-custom-opencv-builds-for-inference-engine) section for details.
- To switch the CPU and GPU plugins off or on, use the `-DENABLE_MKL_DNN=ON/OFF` and `-DENABLE_CLDNN=ON/OFF` CMake options respectively.
- To build the Python API wrapper, use the `-DENABLE_PYTHON=ON` option. To specify an exact Python version, use the following options:
```sh
-DPYTHON_EXECUTABLE="C:\Program Files\Python37\python.exe" ^
-DPYTHON_LIBRARY="C:\Program Files\Python37\libs\python37.lib" ^
-DPYTHON_INCLUDE_DIR="C:\Program Files\Python37\include"
```
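Putting several of the options above together, a full Windows configuration step might look like the following sketch. The generator name, MKL-ML path, and Python locations are illustrative assumptions and should be adjusted for your machine:

```sh
:: Illustrative configuration, run from the created build directory.
:: Adjust the MKL-ML and Python paths for your system.
cmake -G "Visual Studio 15 2017 Win64" ^
    -DGEMM=MKL -DMKLROOT="C:\mklml_win_2019.0.1.20180928" ^
    -DENABLE_PYTHON=ON ^
    -DPYTHON_EXECUTABLE="C:\Program Files\Python37\python.exe" ^
    -DPYTHON_LIBRARY="C:\Program Files\Python37\libs\python37.lib" ^
    -DPYTHON_INCLUDE_DIR="C:\Program Files\Python37\include" ..
```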
6. Build the generated solution in Visual Studio 2017, or run `cmake --build . --config Release` to build from the command line.
7. Before running the samples, add the paths to the TBB and OpenCV binaries used for the build to the `%PATH%` environment variable. By default, TBB binaries are downloaded by the CMake-based script to the `<dldt_repo>/inference-engine/temp/tbb/lib` folder, and OpenCV binaries to the `<dldt_repo>/inference-engine/temp/opencv_4.1.0/bin` folder.
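As a sketch, assuming the default download locations and a repository cloned to `C:\dldt` (an illustrative path), the environment update in a `cmd` session could look like:

```sh
:: Illustrative paths; substitute your actual clone location for C:\dldt.
set PATH=C:\dldt\inference-engine\temp\tbb\lib;%PATH%
set PATH=C:\dldt\inference-engine\temp\opencv_4.1.0\bin;%PATH%
```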
### Building Inference Engine with Ninja* Build System
```sh
call "C:\Program Files (x86)\IntelSWTools\compilers_and_libraries_2018\windows\bin\ipsxe-comp-vars.bat" intel64 vs2017
cmake -G Ninja -Wno-dev -DCMAKE_BUILD_TYPE=Release ..
cmake --build . --config Release
```
## Build on macOS* Systems
> **NOTE**: The current version of the OpenVINO™ toolkit for macOS* supports inference on Intel CPUs only.
The software was validated on:
- macOS\* 10.14, 64-bit
### Software Requirements
- [CMake\*](https://cmake.org/download/) 3.5 or higher
- Clang\* compiler from Xcode\* 10.1
- Python\* 3.4 or higher for the Inference Engine Python API wrapper
4. Inference Engine uses a CMake-based build system. In the created `build` directory, run `cmake` to fetch project dependencies and create Unix makefiles, then run `make` to build the project:
```sh
cmake -DCMAKE_BUILD_TYPE=Release ..
make --jobs=$(nproc --all)
```
### Additional Build Options
You can use the following additional build options:
- Internal JIT GEMM implementation is used by default.
- To switch to the optimized MKL-ML\* GEMM implementation, use the `-DGEMM=MKL` and `-DMKLROOT=<path_to_MKL>` CMake options to specify a path to unpacked MKL-ML with the `include` and `lib` folders. MKL-ML\* package can be downloaded from the [MKL-DNN repository](https://github.com/intel/mkl-dnn/releases/download/v0.17.1/mklml_mac_2019.0.1.20180928.tgz).
- Threading Building Blocks (TBB) is used by default. To build the Inference Engine with OpenMP* threading, set the `-DTHREADING=OMP` option.
- Required versions of the TBB and OpenCV packages are downloaded automatically by the CMake-based script. If you want to use the automatically downloaded packages but already have TBB or OpenCV packages configured in your environment, you may need to clean the `TBBROOT` and `OpenCV_DIR` environment variables before running the `cmake` command; otherwise, they won't be downloaded, and the build may fail if incompatible versions were installed.
- If the CMake-based build script cannot find and download the OpenCV package that is supported on your platform, or if you want to use a custom build of the OpenCV library, refer to the [Use Custom OpenCV Builds](#use-custom-opencv-builds-for-inference-engine) section for details.
- To build the Python API wrapper, use the `-DENABLE_PYTHON=ON` option. To specify an exact Python version, use the following options:
```sh
-DPYTHON_EXECUTABLE=/Library/Frameworks/Python.framework/Versions/3.7/bin/python3.7 \
-DPYTHON_INCLUDE_DIR=/Library/Frameworks/Python.framework/Versions/3.7/include/python3.7m
```
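As on the other platforms, options can be combined in a single configuration call. The following sketch enables the Python API wrapper with the framework Python 3.7 paths shown above; the exact paths depend on your Python installation:

```sh
# Illustrative macOS configuration, run from the created build directory.
cmake -DCMAKE_BUILD_TYPE=Release \
    -DENABLE_PYTHON=ON \
    -DPYTHON_EXECUTABLE=/Library/Frameworks/Python.framework/Versions/3.7/bin/python3.7 \
    -DPYTHON_INCLUDE_DIR=/Library/Frameworks/Python.framework/Versions/3.7/include/python3.7m ..
```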
## Use Custom OpenCV Builds for Inference Engine
> **NOTE**: The recommended and tested version of OpenCV is 4.1. The minimum supported version is 3.4.0.
Required versions of OpenCV packages are downloaded automatically during the Inference Engine build. If the build script cannot find and download the OpenCV package that is supported on your platform, you can use one of the following options:
* Download the most suitable version from the list of available prebuilt packages at [https://download.01.org/opencv/2019/openvinotoolkit](https://download.01.org/opencv/2019/openvinotoolkit) in the `<release_version>/inference_engine` directory.
* Use a system-provided OpenCV package (e.g., by running the `apt install libopencv-dev` command). The following modules must be enabled: `imgcodecs`, `videoio`, `highgui`.
* Get the OpenCV package using a package manager: pip, conda, conan, etc. The package must include the development components (header files and CMake scripts).
* Build OpenCV from source using the [build instructions](https://docs.opencv.org/master/df/d65/tutorial_table_of_content_introduction.html) on the OpenCV site.
After you have the OpenCV build, perform the following preparation steps before running the Inference Engine build:
1. Set the `OpenCV_DIR` environment variable to the directory where the `OpenCVConfig.cmake` file of your custom OpenCV build is located.
2. Disable automatic package downloading by passing the `-DENABLE_OPENCV=OFF` option to the CMake-based build script for the Inference Engine.
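For example, on Linux the two preparation steps could be scripted as follows; the OpenCV install location is an illustrative assumption:

```sh
# Point CMake at the custom OpenCV build (the directory containing OpenCVConfig.cmake).
export OpenCV_DIR=/opt/opencv-custom/lib/cmake/opencv4
# Disable the automatic OpenCV download when configuring the Inference Engine.
cmake -DENABLE_OPENCV=OFF -DCMAKE_BUILD_TYPE=Release ..
```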
## (Optional) Additional Installation Steps for the Intel® Movidius™ Neural Compute Stick and Neural Compute Stick 2
> **NOTE**: These steps are only required if you want to perform inference on the Intel® Movidius™ Neural Compute Stick or the Intel® Neural Compute Stick 2 using the Inference Engine MYRIAD Plugin. See also [Intel® Neural Compute Stick 2 Get Started](https://software.intel.com/en-us/neural-compute-stick/get-started).
### For Linux, Raspbian\* Stretch OS
1. Add the current Linux user to the `users` group:
```sh
sudo usermod -a -G users "$(whoami)"
```
Log out and log in for it to take effect.
2. To perform inference on Intel® Movidius™ Neural Compute Stick and Intel® Neural Compute Stick 2, install the USB rules as follows:
```sh
cat <<EOF > 97-myriad-usbboot.rules
SUBSYSTEM=="usb", ATTRS{idProduct}=="2150", ATTRS{idVendor}=="03e7", GROUP="users", MODE="0666", ENV{ID_MM_DEVICE_IGNORE}="1"
SUBSYSTEM=="usb", ATTRS{idProduct}=="2485", ATTRS{idVendor}=="03e7", GROUP="users", MODE="0666", ENV{ID_MM_DEVICE_IGNORE}="1"
SUBSYSTEM=="usb", ATTRS{idProduct}=="f63b", ATTRS{idVendor}=="03e7", GROUP="users", MODE="0666", ENV{ID_MM_DEVICE_IGNORE}="1"
EOF
```
```sh
sudo cp 97-myriad-usbboot.rules /etc/udev/rules.d/
```
```sh
sudo udevadm control --reload-rules
```
```sh
sudo udevadm trigger
```
```sh
sudo ldconfig
```
```sh
rm 97-myriad-usbboot.rules
```
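After the rules are installed and the stick is re-plugged, the setup can be sanity-checked from the shell. `03e7` is the Movidius vendor ID used in the rules above; the exact `lsusb` output depends on the stick revision:

```sh
# Confirm the rules file is in place.
ls -l /etc/udev/rules.d/97-myriad-usbboot.rules
# Confirm the stick is visible on the USB bus (vendor ID 03e7).
lsusb | grep -i "03e7" || echo "Neural Compute Stick not detected"
```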
### For Windows
For Intel® Movidius™ Neural Compute Stick and Intel® Neural Compute Stick 2, install the Movidius™ VSC driver:
1. Go to the `<DLDT_ROOT_DIR>/inference-engine/thirdparty/movidius/MovidiusDriver` directory, where the `DLDT_ROOT_DIR` is the directory to which the DLDT repository was cloned.
2. Right-click the `Movidius_VSC_Device.inf` file and choose **Install** from the pop-up menu.
You have installed the driver for your Intel® Movidius™ Neural Compute Stick or Intel® Neural Compute Stick 2.
## Next Steps
Congratulations, you have built the Inference Engine. To get started with the OpenVINO™ DLDT, proceed to the Get Started guides:
* [Get Started with Deep Learning Deployment Toolkit on Linux*](../get-started-linux.md)
## Additional Resources
* [OpenVINO™ Release Notes](https://software.intel.com/en-us/articles/OpenVINO-RelNotes)
* [Introduction to Intel® Deep Learning Deployment Toolkit](https://docs.openvinotoolkit.org/latest/_docs_IE_DG_Introduction.html)
* [Inference Engine Samples Overview](https://docs.openvinotoolkit.org/latest/_docs_IE_DG_Samples_Overview.html)
* [Inference Engine Developer Guide](https://docs.openvinotoolkit.org/latest/_docs_IE_DG_Deep_Learning_Inference_Engine_DevGuide.html)
* [Model Optimizer Developer Guide](https://docs.openvinotoolkit.org/latest/_docs_MO_DG_Deep_Learning_Model_Optimizer_DevGuide.html)
---
\* Other names and brands may be claimed as the property of others.

View File

@@ -30,6 +30,7 @@ endif()
if (APPLE)
set(ENABLE_GNA OFF)
set(ENABLE_CLDNN OFF)
SET(ENABLE_MYRIAD OFF)
endif()
@@ -60,6 +61,14 @@ if (NOT ENABLE_MKL_DNN)
set(ENABLE_MKL OFF)
endif()
if (NOT ENABLE_VPU)
set(ENABLE_MYRIAD OFF)
endif()
if (NOT ENABLE_MYRIAD)
set(ENABLE_VPU OFF)
endif()
#next section sets defines to be accessible in c++/c code for certain features
if (ENABLE_PROFILING_RAW)
add_definitions(-DENABLE_PROFILING_RAW=1)
@@ -69,6 +78,22 @@ if (ENABLE_CLDNN)
add_definitions(-DENABLE_CLDNN=1)
endif()
if (ENABLE_MYRIAD)
add_definitions(-DENABLE_MYRIAD=1)
endif()
if (ENABLE_MYX_PCIE AND ENABLE_MYRIAD)
add_definitions(-DENABLE_MYX_PCIE=1)
endif()
if (ENABLE_MYRIAD_NO_BOOT AND ENABLE_MYRIAD )
add_definitions(-DENABLE_MYRIAD_NO_BOOT=1)
endif()
if (ENABLE_MYX_PCIE AND ENABLE_MYRIAD_NO_BOOT)
message(FATAL_ERROR "ENABLE_MYX_PCIE and ENABLE_MYRIAD_NO_BOOT can't be enabled at the same time")
endif()
if (ENABLE_MKL_DNN)
add_definitions(-DENABLE_MKL_DNN=1)
endif()

View File

@@ -37,6 +37,24 @@ else()
set(MODELS_BRANCH "master")
endif()
if (ENABLE_MYRIAD)
RESOLVE_DEPENDENCY(VPU_FIRMWARE_MA2450
ARCHIVE_UNIFIED firmware_ma2450_491.zip
TARGET_PATH "${TEMP}/vpu/firmware/ma2450"
ENVIRONMENT "VPU_FIRMWARE_MA2450"
FOLDER)
debug_message(STATUS "ma2450=" ${VPU_FIRMWARE_MA2450})
endif ()
if (ENABLE_MYRIAD)
RESOLVE_DEPENDENCY(VPU_FIRMWARE_MA2480
ARCHIVE_UNIFIED firmware_ma2480_mdk_R7_9.zip
TARGET_PATH "${TEMP}/vpu/firmware/ma2480"
ENVIRONMENT "VPU_FIRMWARE_MA2480"
FOLDER)
debug_message(STATUS "ma2480=" ${VPU_FIRMWARE_MA2480})
endif ()
## enable cblas_gemm from OpenBLAS package
if (GEMM STREQUAL "OPENBLAS")
if(NOT BLAS_LIBRARIES OR NOT BLAS_INCLUDE_DIRS)
@@ -100,7 +118,7 @@ elseif(LINUX)
ENVIRONMENT "TBBROOT")
else(APPLE)
RESOLVE_DEPENDENCY(TBB
ARCHIVE_MAC "tbb2019_20190130_mac.tgz"
ARCHIVE_MAC "tbb2019_20190414_mac.tgz"
TARGET_PATH "${TEMP}/tbb"
ENVIRONMENT "TBBROOT"
VERSION_REGEX ".*_([a-z]*_([a-z0-9]+\\.)*[0-9]+).*")

View File

@@ -44,9 +44,15 @@ set (IE_DEBUG_POSTFIX_WIN "d")
set (IE_RELEASE_POSTFIX_WIN "")
set (IE_DEBUG_POSTFIX_LIN "")
set (IE_RELEASE_POSTFIX_LIN "")
set (IE_DEBUG_POSTFIX_MAC "d")
set (IE_RELEASE_POSTFIX_MAC "")
if (WIN32)
set (IE_DEBUG_POSTFIX ${IE_DEBUG_POSTFIX_WIN})
set (IE_RELEASE_POSTFIX ${IE_RELEASE_POSTFIX_WIN})
elseif(APPLE)
set (IE_DEBUG_POSTFIX ${IE_DEBUG_POSTFIX_MAC})
set (IE_RELEASE_POSTFIX ${IE_RELEASE_POSTFIX_MAC})
else()
set (IE_DEBUG_POSTFIX ${IE_DEBUG_POSTFIX_LIN})
set (IE_RELEASE_POSTFIX ${IE_RELEASE_POSTFIX_LIN})
@@ -56,6 +62,14 @@ list (APPEND IE_OPTIONS IE_DEBUG_POSTFIX)
set(IE_RELEASE_POSTFIX "${IE_RELEASE_POSTFIX}" CACHE STRING "Release postfix" FORCE)
list (APPEND IE_OPTIONS IE_RELEASE_POSTFIX)
ie_option (ENABLE_VPU "vpu targeted plugins for inference engine" ON)
ie_option (ENABLE_MYRIAD "myriad targeted plugin for inference engine" ON)
ie_option (ENABLE_MYX_PCIE "myriad plugin with support PCIE device" OFF)
ie_option (ENABLE_MYRIAD_NO_BOOT "myriad plugin will skip device boot" OFF)
ie_option (ENABLE_TESTS "unit and functional tests" OFF)
ie_option (ENABLE_GAPI_TESTS "unit tests for GAPI kernels" OFF)
@@ -94,6 +108,8 @@ ie_option (ENABLE_DEBUG_SYMBOLS "generates symbols for debugging" OFF)
ie_option (ENABLE_PYTHON "enables ie python bridge build" OFF)
ie_option (TREAT_WARNING_AS_ERROR "Treat build warnings as errors" ON)
ie_option(ENABLE_CPPLINT "Enable cpplint checks during the build" OFF)
ie_option(ENABLE_CPPLINT_REPORT "Build cpplint report instead of failing the build" OFF)

View File

@@ -108,6 +108,14 @@ else()
FOUND_VAR INFERENCEENGINE_FOUND
REQUIRED_VARS IE_RELEASE_LIBRARY IE_DEBUG_LIBRARY IE_INCLUDE_DIR
FAIL_MESSAGE "Inference Engine cannot be found at ${_IE_ROOT_LIBRARY}. Please consult InferenceEngineConfig.cmake module's help page.")
elseif (APPLE)
find_library(IE_RELEASE_LIBRARY inference_engine@IE_RELEASE_POSTFIX_MAC@ "${IE_LIB_DIR}")
find_library(IE_DEBUG_LIBRARY inference_engine@IE_DEBUG_POSTFIX_MAC@ "${IE_LIB_DIR}")
find_package_handle_standard_args( InferenceEngine
FOUND_VAR INFERENCEENGINE_FOUND
REQUIRED_VARS IE_RELEASE_LIBRARY IE_DEBUG_LIBRARY IE_INCLUDE_DIR
FAIL_MESSAGE "Inference Engine cannot be found at ${_IE_ROOT_LIBRARY}. Please consult InferenceEngineConfig.cmake module's help page.")
else()
find_library(IE_LIBRARY inference_engine@IE_RELEASE_POSTFIX_LIN@ "${IE_LIB_DIR}")
find_package_handle_standard_args( InferenceEngine
@@ -132,6 +140,12 @@ else()
MAP_IMPORTED_CONFIG_RELEASE Release
MAP_IMPORTED_CONFIG_RELWITHDEBINFO Release
INTERFACE_INCLUDE_DIRECTORIES "${IE_INCLUDE_DIR}")
elseif (APPLE)
set_target_properties(IE::inference_engine PROPERTIES
IMPORTED_LOCATION_RELEASE "${IE_RELEASE_LIBRARY}"
IMPORTED_LOCATION_DEBUG "${IE_DEBUG_LIBRARY}"
INTERFACE_INCLUDE_DIRECTORIES "${IE_INCLUDE_DIR}")
target_link_libraries(IE::inference_engine INTERFACE ${CMAKE_DL_LIBS})
else()
set_target_properties(IE::inference_engine PROPERTIES
IMPORTED_LOCATION "${IE_LIBRARY}"

View File

@@ -8,7 +8,7 @@ This topic demonstrates how to run the Benchmark Application demo, which perform
Upon the start-up, the application reads command-line parameters and loads a network and images to the Inference Engine plugin. The number of infer requests and execution approach depend on a mode defined with the `-api` command-line parameter.
> **NOTE**: By default, Inference Engine samples and demos expect input with BGR channels order. If you trained your model to work with RGB order, you need to manually rearrange the default channels order in the sample or demo application or reconvert your model using the Model Optimizer tool with `--reverse_input_channels` argument specified. For more information about the argument, refer to **When to Reverse Input Channels** section of [Converting a Model Using General Conversion Parameters](./docs/MO_DG/prepare_model/convert_model/Converting_Model_General.md).
### Synchronous API
For synchronous mode, the primary metric is latency. The application creates one infer request and executes the `Infer` method. The number of executions is defined by one of two values:

View File

@@ -3,13 +3,13 @@
This topic demonstrates how to run the Image Classification sample application, which performs
inference using image classification networks such as AlexNet and GoogLeNet.
## How It Works
Upon the start-up, the sample application reads command line parameters and loads a network and an image to the Inference
Engine plugin. When inference is done, the application creates an
output image and outputs data to the standard output stream.
> **NOTE**: By default, Inference Engine samples and demos expect input with BGR channels order. If you trained your model to work with RGB order, you need to manually rearrange the default channels order in the sample or demo application or reconvert your model using the Model Optimizer tool with `--reverse_input_channels` argument specified. For more information about the argument, refer to **When to Reverse Input Channels** section of [Converting a Model Using General Conversion Parameters](./docs/MO_DG/prepare_model/convert_model/Converting_Model_General.md).
## Running
@@ -62,18 +62,16 @@ For example, to perform inference of an AlexNet model (previously converted to t
python3 classification_sample.py -i <path_to_image>/cat.bmp -m <path_to_model>/alexnet_fp32.xml
```
## Sample Output
By default, the application outputs top-10 inference results.
Add the `-nt` option to the previous command to modify the number of top output results.
For example, to get the top-5 results on GPU, run the following command:
```
python3 classification_sample.py -i <path_to_image>/cat.bmp -m <path_to_model>/alexnet_fp32.xml -nt 5 -d GPU
```
## See Also
* [Using Inference Engine Samples](./docs/IE_DG/Samples_Overview.md)
* [Model Optimizer tool](./docs/MO_DG/Deep_Learning_Model_Optimizer_DevGuide.md)
* [Model Downloader](https://github.com/opencv/open_model_zoo/tree/2018/model_downloader)

View File

@@ -16,7 +16,7 @@ Another required aspect of good throughput is a number of iterations. Only with
The batch mode is an attribute independent of the pipelined mode. Pipelined mode works efficiently with any batch size.
## How It Works
Upon the start-up, the sample application reads command line parameters and loads a network and an image to the Inference
Engine plugin.
@@ -26,13 +26,13 @@ Then in a loop it starts inference for the current infer request and switches to
When inference is done, the application outputs data to the standard output stream.
> **NOTE**: By default, Inference Engine samples and demos expect input with BGR channels order. If you trained your model to work with RGB order, you need to manually rearrange the default channels order in the sample or demo application or reconvert your model using the Model Optimizer tool with `--reverse_input_channels` argument specified. For more information about the argument, refer to **When to Reverse Input Channels** section of [Converting a Model Using General Conversion Parameters](./docs/MO_DG/prepare_model/convert_model/Converting_Model_General.md).
## Running
Running the application with the <code>-h</code> option yields the following usage message:
```
python3 classification_sample_async.py -h
```
The command yields the following usage message:
```
@@ -80,7 +80,7 @@ You can do inference on an image using a trained AlexNet network on FPGA with fa
python3 classification_sample_async.py -i <path_to_image>/cat.bmp -m <path_to_model>/alexnet_fp32.xml -nt 5 -d HETERO:FPGA,CPU -nireq 2 -ni 200
```
## Sample Output
By default, the application outputs top-10 inference results for each infer request.
It also provides the throughput value measured in frames per second.

View File

@@ -7,7 +7,7 @@ inference of style transfer models.
## How It Works
> **NOTE**: By default, Inference Engine samples and demos expect input with BGR channels order. If you trained your model to work with RGB order, you need to manually rearrange the default channels order in the sample or demo application or reconvert your model using the Model Optimizer tool with `--reverse_input_channels` argument specified. For more information about the argument, refer to **When to Reverse Input Channels** section of [Converting a Model Using General Conversion Parameters](./docs/MO_DG/prepare_model/convert_model/Converting_Model_General.md).
## Running

View File

@@ -74,7 +74,7 @@ public:
ConcatLayer& setAxis(size_t axis);
private:
size_t axis;
size_t axis = 1;
};
} // namespace Builder

View File

@@ -98,7 +98,7 @@ public:
EltwiseLayer& setScales(const std::vector<float>& scales);
private:
EltwiseType type;
EltwiseType type = SUM;
};
} // namespace Builder

View File

@@ -1,4 +1,4 @@
// Copyright (C) 2019 Intel Corporation
// Copyright (C) 2018-2019 Intel Corporation
// SPDX-License-Identifier: Apache-2.0
//

View File

@@ -161,8 +161,8 @@ public:
PoolingLayer& setExcludePad(bool exclude);
private:
PoolingType type;
RoundingType roundingType;
PoolingType type = MAX;
RoundingType roundingType = CEIL;
};
} // namespace Builder

View File

@@ -44,6 +44,9 @@ public:
const Version *GetVersion() {
const Version *versionInfo = nullptr;
actual->GetVersion(versionInfo);
if (versionInfo == nullptr) {
THROW_IE_EXCEPTION << "Unknown device is used";
}
return versionInfo;
}

View File

@@ -23,8 +23,9 @@ namespace details {
template<class NT, class LT>
class INetworkIterator: public std::iterator<std::input_iterator_tag, std::shared_ptr<LT>> {
public:
explicit INetworkIterator(NT * network, bool toEnd = false): network(network), currentIdx(0) {
if (!network || toEnd)
explicit INetworkIterator(NT * network, bool toEnd): network(network), currentIdx(0) {}
explicit INetworkIterator(NT * network): network(network), currentIdx(0) {
if (!network)
return;
const auto& inputs = network->getInputs();

View File

@@ -30,7 +30,7 @@ class SharedObjectLoader {
private:
HMODULE shared_object;
public:
public:
/**
* @brief Loads a library with the name specified. The library is loaded according to the
* WinAPI LoadLibrary rules
@@ -38,6 +38,20 @@ private:
*/
explicit SharedObjectLoader(LPCTSTR pluginName) {
char cwd[1024];
// Exclude the current directory from the DLL search path process-wide.
// If an application-specific path was configured before, then the
// current directory is already excluded.
// GetDllDirectory does not distinguish whether the application-specific
// path was set to "" or NULL, so reset it to "" to keep the
// application safe.
if (GetDllDirectory(0, NULL) <= 1) {
SetDllDirectory(
#if defined UNICODE
L"");
#else
"");
#endif
}
shared_object = LoadLibrary(pluginName);
if (!shared_object) {
THROW_IE_EXCEPTION << "Cannot load library '"

View File

@@ -82,7 +82,7 @@ public:
* @brief Constructor. Creates an empty Blob object with the specified precision.
* @param tensorDesc Defines the layout and dims of the blob
*/
explicit Blob(TensorDesc tensorDesc): tensorDesc(tensorDesc) {}
explicit Blob(const TensorDesc &tensorDesc): tensorDesc(tensorDesc) {}
/**
* @deprecated Please use TensorDesc for Blob initialization
@@ -126,17 +126,21 @@ public:
* @return Total number of elements (a product of all the dimensions)
*/
size_t Resize(const SizeVector &dims, Layout layout = Layout::ANY) noexcept {
bool bret = deallocate();
try {
bool bret = deallocate();
if (layout != Layout::ANY) {
tensorDesc = TensorDesc(tensorDesc.getPrecision(), SizeVector(dims.rbegin(), dims.rend()), layout);
} else {
tensorDesc.setDims(SizeVector(dims.rbegin(), dims.rend()));
if (layout != Layout::ANY) {
tensorDesc = TensorDesc(tensorDesc.getPrecision(), SizeVector(dims.rbegin(), dims.rend()), layout);
} else {
tensorDesc.setDims(SizeVector(dims.rbegin(), dims.rend()));
}
if (!bret) {
allocate();
}
return product(tensorDesc.getDims());
} catch (...) {
return 0;
}
if (!bret) {
allocate();
}
return product(tensorDesc.getDims());
}
/**
@@ -147,16 +151,20 @@ public:
* @return The total number of elements (a product of all the dims)
*/
size_t Reshape(const SizeVector &dims, Layout layout = Layout::ANY) noexcept {
if (product(tensorDesc.getDims()) != product(dims)) {
try {
if (product(tensorDesc.getDims()) != product(dims)) {
return 0;
}
if (layout != Layout::ANY) {
tensorDesc = TensorDesc(tensorDesc.getPrecision(), SizeVector(dims.rbegin(), dims.rend()), layout);
} else {
tensorDesc.setDims(SizeVector(dims.rbegin(), dims.rend()));
}
return product(tensorDesc.getDims());
} catch (...) {
return 0;
}
if (layout != Layout::ANY) {
tensorDesc = TensorDesc(tensorDesc.getPrecision(), SizeVector(dims.rbegin(), dims.rend()), layout);
} else {
tensorDesc.setDims(SizeVector(dims.rbegin(), dims.rend()));
}
return product(tensorDesc.getDims());
}
/**

View File

@@ -27,10 +27,8 @@ enum class TargetDevice : uint8_t {
eGPU = 3,
eFPGA = 4,
eMYRIAD = 5,
eHDDL = 6,
eGNA = 7,
eHETERO = 8,
eKMB = 9,
};
/**
@@ -52,10 +50,8 @@ class TargetDeviceInfo {
DECL_DEVICE(GPU),
DECL_DEVICE(FPGA),
DECL_DEVICE(MYRIAD),
DECL_DEVICE(HDDL),
DECL_DEVICE(GNA),
DECL_DEVICE(HETERO),
DECL_DEVICE(KMB)
};
#undef DECLARE
return g_allDeviceInfos;
@@ -68,11 +64,9 @@ class TargetDeviceInfo {
{ "GPU", InferenceEngine::TargetDevice::eGPU },
{ "FPGA", InferenceEngine::TargetDevice::eFPGA },
{ "MYRIAD", InferenceEngine::TargetDevice::eMYRIAD },
{ "HDDL", InferenceEngine::TargetDevice::eHDDL },
{ "GNA", InferenceEngine::TargetDevice::eGNA },
{ "BALANCED", InferenceEngine::TargetDevice::eBalanced },
{ "HETERO", InferenceEngine::TargetDevice::eHETERO },
{ "KMB", InferenceEngine::TargetDevice::eKMB }
};
auto val = deviceFromNameMap.find(deviceName);
return val != deviceFromNameMap.end() ? val->second : InferenceEngine::TargetDevice::eDefault;

View File

@@ -701,7 +701,7 @@ public:
/**
* @brief A pad value which is used to fill pad area
*/
float _pad_value = -1.0f;
float _pad_value = 0.0f;
/**
* @brief A convolution kernel array [X, Y, Z, ...]

View File

@@ -11,6 +11,8 @@
#pragma once
#include <cstddef>
#define IE_THREAD_TBB 0
#define IE_THREAD_OMP 1
#define IE_THREAD_SEQ 2
@@ -70,7 +72,7 @@ inline void parallel_set_num_threads(int n) { return; }
namespace InferenceEngine {
template <typename F>
void parallel_nt(int nthr, F func) {
void parallel_nt(int nthr, const F &func) {
#if IE_THREAD == IE_THREAD_TBB
if (nthr == 0) nthr = parallel_get_max_threads();
if (nthr == 1) {
@@ -95,7 +97,7 @@ void parallel_nt(int nthr, F func) {
}
template <typename F>
void parallel_nt_static(int nthr, F func) {
void parallel_nt_static(int nthr, const F &func) {
#if IE_THREAD == IE_THREAD_SEQ
const bool serial = true;
#else
@@ -124,7 +126,7 @@ void parallel_nt_static(int nthr, F func) {
}
template <typename T0, typename R, typename F>
R parallel_sum(const T0 D0, R &input, F func) {
R parallel_sum(const T0 &D0, const R &input, const F &func) {
#if IE_THREAD == IE_THREAD_TBB
return tbb::parallel_reduce(
tbb::blocked_range<T0>(0, D0), input,
@@ -157,7 +159,7 @@ R parallel_sum(const T0 D0, R &input, F func) {
}
template <typename T0, typename T1, typename R, typename F>
R parallel_sum2d(const T0 D0, const T1 D1, R input, F func) {
R parallel_sum2d(const T0 &D0, const T1 &D1, const R &input, const F &func) {
#if IE_THREAD == IE_THREAD_TBB
return tbb::parallel_reduce(
tbb::blocked_range2d<T0, T1>(0, D0, 0, D1), input,
@@ -196,7 +198,7 @@ R parallel_sum2d(const T0 D0, const T1 D1, R input, F func) {
#endif
}
template <typename T0, typename T1, typename T2, typename R, typename F>
R parallel_sum3d(const T0 D0, const T1 D1, const T2 D2, R input, F func) {
R parallel_sum3d(const T0 &D0, const T1 &D1, const T2 &D2, const R &input, const F &func) {
#if IE_THREAD == IE_THREAD_TBB
return tbb::parallel_reduce(
tbb::blocked_range3d<T0, T1, T2>(0, D0, 0, D1, 0, D2), input,
@@ -261,7 +263,7 @@ inline bool parallel_it_step(Q &x, const R &X, Args &&... tuple) {
}
template <typename T, typename Q>
inline void splitter(T n, Q team, Q tid, T &n_start, T &n_end) {
inline void splitter(const T &n, const Q &team, const Q &tid, T &n_start, T &n_end) {
if (team <= 1 || n == 0) {
n_start = 0;
n_end = n;
@@ -278,14 +280,14 @@ inline void splitter(T n, Q team, Q tid, T &n_start, T &n_end) {
template <typename T0, typename F>
void for_1d(const int ithr, const int nthr, const T0 &D0, F func) {
void for_1d(const int &ithr, const int &nthr, const T0 &D0, const F &func) {
T0 d0{ 0 }, end{ 0 };
splitter(D0, nthr, ithr, d0, end);
for (; d0 < end; ++d0) func(d0);
}
template <typename T0, typename F>
void parallel_for(const T0 &D0, F func) {
void parallel_for(const T0 &D0, const F &func) {
#if IE_THREAD == IE_THREAD_TBB
const int nthr = parallel_get_max_threads();
tbb::parallel_for(0, nthr, [&](int ithr) {
@@ -301,7 +303,7 @@ void parallel_for(const T0 &D0, F func) {
template <typename T0, typename T1, typename F>
void for_2d(const int ithr, const int nthr, const T0 &D0, const T1 &D1, F func) {
void for_2d(const int &ithr, const int &nthr, const T0 &D0, const T1 &D1, const F &func) {
const size_t work_amount = (size_t)D0 * D1;
if (work_amount == 0) return;
size_t start{ 0 }, end{ 0 };
@@ -316,7 +318,7 @@ void for_2d(const int ithr, const int nthr, const T0 &D0, const T1 &D1, F func)
}
template <typename T0, typename T1, typename F>
void parallel_for2d(const T0 &D0, const T1 &D1, F func) {
void parallel_for2d(const T0 &D0, const T1 &D1, const F &func) {
#if IE_THREAD == IE_THREAD_TBB
const int nthr = parallel_get_max_threads();
tbb::parallel_for(0, nthr, [&](int ithr) {
@@ -332,8 +334,8 @@ void parallel_for2d(const T0 &D0, const T1 &D1, F func) {
template <typename T0, typename T1, typename T2, typename F>
void for_3d(const int ithr, const int nthr, const T0 &D0, const T1 &D1,
const T2 &D2, F func) {
void for_3d(const int &ithr, const int &nthr, const T0 &D0, const T1 &D1,
const T2 &D2, const F &func) {
const size_t work_amount = (size_t)D0 * D1 * D2;
if (work_amount == 0) return;
size_t start{ 0 }, end{ 0 };
@@ -348,7 +350,7 @@ void for_3d(const int ithr, const int nthr, const T0 &D0, const T1 &D1,
}
template <typename T0, typename T1, typename T2, typename F>
void parallel_for3d(const T0 &D0, const T1 &D1, const T2 &D2, F func) {
void parallel_for3d(const T0 &D0, const T1 &D1, const T2 &D2, const F &func) {
#if IE_THREAD == IE_THREAD_TBB
const int nthr = parallel_get_max_threads();
tbb::parallel_for(0, nthr, [&](int ithr) {
@@ -363,8 +365,8 @@ void parallel_for3d(const T0 &D0, const T1 &D1, const T2 &D2, F func) {
}
template <typename T0, typename T1, typename T2, typename T3, typename F>
void for_4d(const int ithr, const int nthr, const T0 &D0, const T1 &D1,
const T2 &D2, const T3 &D3, F func) {
void for_4d(const int &ithr, const int &nthr, const T0 &D0, const T1 &D1,
const T2 &D2, const T3 &D3, const F &func) {
const size_t work_amount = (size_t)D0 * D1 * D2 * D3;
if (work_amount == 0) return;
size_t start{ 0 }, end{ 0 };
@@ -379,7 +381,7 @@ void for_4d(const int ithr, const int nthr, const T0 &D0, const T1 &D1,
}
template <typename T0, typename T1, typename T2, typename T3, typename F>
void parallel_for4d(const T0 &D0, const T1 &D1, const T2 &D2, const T3 &D3, F func) {
void parallel_for4d(const T0 &D0, const T1 &D1, const T2 &D2, const T3 &D3, const F &func) {
#if IE_THREAD == IE_THREAD_TBB
const int nthr = parallel_get_max_threads();
tbb::parallel_for(0, nthr, [&](int ithr) {
@@ -394,8 +396,8 @@ void parallel_for4d(const T0 &D0, const T1 &D1, const T2 &D2, const T3 &D3, F fu
}
template <typename T0, typename T1, typename T2, typename T3, typename T4, typename F>
void for_5d(const int ithr, const int nthr, const T0 &D0, const T1 &D1,
const T2 &D2, const T3 &D3, const T4 &D4, F func) {
void for_5d(const int &ithr, const int &nthr, const T0 &D0, const T1 &D1,
const T2 &D2, const T3 &D3, const T4 &D4, const F &func) {
const size_t work_amount = (size_t)D0 * D1 * D2 * D3 * D4;
if (work_amount == 0) return;
size_t start{ 0 }, end{ 0 };
@@ -411,7 +413,7 @@ void for_5d(const int ithr, const int nthr, const T0 &D0, const T1 &D1,
template <typename T0, typename T1, typename T2, typename T3, typename T4, typename F>
void parallel_for5d(const T0 &D0, const T1 &D1, const T2 &D2, const T3 &D3,
const T4 &D4, F func) {
const T4 &D4, const F &func) {
#if IE_THREAD == IE_THREAD_TBB
const int nthr = parallel_get_max_threads();
tbb::parallel_for(0, nthr, [&](int ithr) {
@@ -427,7 +429,7 @@ void parallel_for5d(const T0 &D0, const T1 &D1, const T2 &D2, const T3 &D3,
template <typename T0, typename T1, typename T2, typename T3, typename T4, typename T5, typename F>
void for_6d(const int ithr, const int nthr, const T0 &D0, const T1 &D1,
void for_6d(const int &ithr, const int &nthr, const T0 &D0, const T1 &D1,
const T2 &D2, const T3 &D3, const T4 &D4, const T5 &D5, F func) {
const size_t work_amount = (size_t)D0 * D1 * D2 * D3 * D4 * D5;
if (work_amount == 0) return;


@@ -83,27 +83,32 @@ public:
/** @brief checks whether given storage class T can be used to store objects of current precision */
template <class T>
bool hasStorageType(const char * typeName = nullptr) const noexcept {
if (precisionInfo.value != BIN) {
if (sizeof(T) != size()) {
return false;
try {
if (precisionInfo.value != BIN) {
if (sizeof(T) != size()) {
return false;
}
}
}
#define CASE(x, y) case x: return std::is_same<T, y>()
#define CASE2(x, y1, y2) case x: return std::is_same<T, y1>() || std::is_same<T, y2>()
switch (precisionInfo.value) {
CASE(FP32, float);
CASE2(FP16, int16_t, uint16_t);
CASE(I16, int16_t);
CASE(I32, int32_t);
CASE(U16, uint16_t);
CASE(U8, uint8_t);
CASE(I8, int8_t);
CASE2(Q78, int16_t, uint16_t);
CASE2(BIN, int8_t, uint8_t);
default : return areSameStrings(name(), typeName == nullptr ? typeid(T).name() : typeName);
switch (precisionInfo.value) {
CASE(FP32, float);
CASE2(FP16, int16_t, uint16_t);
CASE(I16, int16_t);
CASE(I32, int32_t);
CASE(U16, uint16_t);
CASE(U8, uint8_t);
CASE(I8, int8_t);
CASE2(Q78, int16_t, uint16_t);
CASE2(BIN, int8_t, uint8_t);
default :
return areSameStrings(name(), typeName == nullptr ? typeid(T).name() : typeName);
#undef CASE
#undef CASE2
}
} catch (...) {
return false;
}
}
@@ -172,7 +177,7 @@ public:
/**
* @brief Returns size in bytes of single element of that precision
* @deprecated : size of precision will be report in bits in future releases
* @deprecated : size of precision will be reported in bits in future releases
*/
size_t size() const {
if (precisionInfo.bitsSize == 0) {
@@ -182,7 +187,7 @@ public:
}
/** @brief Checks if it is a floating point */
bool is_float() const {
bool is_float() const noexcept {
return precisionInfo.isFloat;
}
@@ -306,7 +311,7 @@ inline Precision::PrecisionInfo Precision::makePrecisionInfo(const char *name) {
Precision::PrecisionInfo info;
info.name = name;
int nBits = precision == BIN ? 1 : 8;
size_t nBits = precision == BIN ? 1 : 8;
info.bitsSize = nBits * type_size_or_zero<typename PrecisionTrait<precision>::value_type>();
info.isFloat = is_floating<precision>();
info.value = precision;
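The `CASE` macros in `hasStorageType` above expand into `std::is_same` checks of the requested storage type against each precision's value type. The dispatch pattern in isolation looks like the following sketch; `Prec` and `matches` are illustrative names, not the Inference Engine API:

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Sketch of the enum-to-storage-type dispatch that hasStorageType()
// performs via its CASE macros. Illustrative only.
enum class Prec { FP32, I16, U8 };

template <class T>
bool matches(Prec p) {
    switch (p) {
        case Prec::FP32: return std::is_same<T, float>::value;
        case Prec::I16:  return std::is_same<T, std::int16_t>::value;
        case Prec::U8:   return std::is_same<T, std::uint8_t>::value;
    }
    return false;
}
```

Wrapping the real switch in `try { ... } catch (...) { return false; }`, as this change does, keeps the `noexcept` promise of `hasStorageType` even if a comparison path throws.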


@@ -20,12 +20,6 @@
#define DECLARE_VPU_CONFIG_KEY(name) DECLARE_CONFIG_KEY(VPU_##name)
#define DECLARE_VPU_CONFIG_VALUE(name) DECLARE_CONFIG_VALUE(VPU_##name)
#define VPU_HDDL_CONFIG_KEY(name) InferenceEngine::VPUConfigParams::_CONFIG_KEY(VPU_HDDL_##name)
#define VPU_HDDL_CONFIG_VALUE(name) InferenceEngine::VPUConfigParams::VPU_HDDL_##name
#define DECLARE_VPU_HDDL_CONFIG_KEY(name) DECLARE_CONFIG_KEY(VPU_HDDL_##name)
#define DECLARE_VPU_HDDL_CONFIG_VALUE(name) DECLARE_CONFIG_VALUE(VPU_HDDL_##name)
namespace InferenceEngine {
namespace VPUConfigParams {
@@ -68,92 +62,6 @@ DECLARE_VPU_CONFIG_KEY(PRINT_RECEIVE_TENSOR_TIME);
*/
DECLARE_VPU_CONFIG_KEY(FORCE_RESET);
/**
* @brief [Only for HDDLPlugin]
* Type: Arbitrary non-empty string. If empty (""), equals no set, default: "";
* This option allows to specify the number of MYX devices used for inference a specific Executable network.
* Note: Only one network would be allocated to one device.
* The number of devices for the tag is specified in the hddl_service.config file.
* Example:
* "service_settings":
* {
* "graph_tag_map":
* {
* "tagA":3
* }
* }
* It means that an executable network marked with tagA will be executed on 3 devices
*/
DECLARE_VPU_HDDL_CONFIG_KEY(GRAPH_TAG);
/**
* @brief [Only for HDDLPlugin]
* Type: Arbitrary non-empty string. If empty (""), equals no set, default: "";
* This config makes the executable networks be allocated on one certain device (instead of multiple devices),
* and all inference through this executable network will be done on this device.
* Note: Only one network would be allocated to one device.
* The number of devices which will be used for stream-affinity must be specified in hddl_service.config file.
* Example:
* "service_settings":
* {
* "stream_device_number":5
* }
* It means that 5 device will be used for stream-affinity
*/
DECLARE_VPU_HDDL_CONFIG_KEY(STREAM_ID);
/**
* @brief [Only for HDDLPlugin]
* Type: Arbitrary non-empty string. If empty (""), equals no set, default: "";
* This config allows the user to control devices flexibly. This config gives a "tag" to a certain device while
* allocating a network to it. Afterwards, the user can allocate/deallocate networks to this device with this "tag".
* Devices used for such use case is controlled by a so-called "Bypass Scheduler" in HDDL backend, and the number
* of such device need to be specified in hddl_service.config file.
* Example:
* "service_settings":
* {
* "bypass_device_number": 5
* }
* It means that 5 device will be used for Bypass scheduler.
*/
DECLARE_VPU_HDDL_CONFIG_KEY(DEVICE_TAG);
/**
* @brief [Only for HDDLPlugin]
* Type: "YES/NO", default is "NO".
* This config is a sub-config of DEVICE_TAG, and only available when "DEVICE_TAG" is set. After a user loads a
* network, the user gets a handle for the network.
* If "YES", the network allocated is bind to the device (with the specified "DEVICE_TAG"), which means all afterwards
* inference through this network handle will be executed on this device only.
* If "NO", the network allocated is not bind to the device (with the specified "DEVICE_TAG"). If the same network
* is allocated on multiple other devices (also set BIND_DEVICE to "False"), then inference through any handle of these
* networks may be executed on any of these devices those have the network loaded.
*/
DECLARE_VPU_HDDL_CONFIG_KEY(BIND_DEVICE);
/**
* @brief [Only for HDDLPlugin]
* Type: A signed int wrapped in a string, default is "0".
* This config is a sub-config of DEVICE_TAG, and only available when "DEVICE_TAG" is set and "BIND_DEVICE" is "False".
* When there are multiple devices running a certain network (a same network running on multiple devices in Bypass Scheduler),
* the device with a larger number has a higher priority, and more inference tasks will be fed to it with priority.
*/
DECLARE_VPU_HDDL_CONFIG_KEY(RUNTIME_PRIORITY);
/**
* @brief [Only for HDDLPlugin]
* Type: "YES/NO", default is "NO". **Note: ONLY available when "DEVICE_TAG" is set.
* This config should be used only when the network has been loaded already with the same network content, the same
* "DEVICE_TAG" as used this time and "BIND_DEVICE" of the loaded network had been set to "NO".
* This config is only used to update the "RUNTIME_PRIORITY" of previous loaded network, and the application should keep using
* the network handle that previous allocated to do inference.
* - If "Yes": the "RUNTIME_PRIORITY" must be specified with an integer, and it will be set as the new runtime priority for that network on that device.
* - If "No": load this network to the device.
* **Note: If "BIND_DEVICE" of the previously loaded network was "Yes", the behavior of "update runtime priority" is undefined.
*/
DECLARE_VPU_HDDL_CONFIG_KEY(UPDATE_RUNTIME_PRIORITY);
/**
* @brief This option allows to pass extra configuration for executable network.
* By default, it is empty string, which means - no configuration.


@@ -22,7 +22,6 @@ function yes_or_no {
# install dependencies
if [[ -f /etc/lsb-release ]]; then
# Ubuntu
system_ver=`cat /etc/lsb-release | grep -i "DISTRIB_RELEASE" | cut -d "=" -f2`
sudo -E apt update
sudo -E apt-get install -y \
build-essential \
@@ -52,12 +51,12 @@ if [[ -f /etc/lsb-release ]]; then
gstreamer1.0-plugins-base \
libusb-1.0-0-dev \
libopenblas-dev
if [ $system_ver = "18.04" ]; then
sudo -E apt-get install -y libpng-dev
if apt-cache search --names-only '^libpng12'| grep -q libpng12; then
sudo -E apt-get install -y libpng12-dev
else
sudo -E apt-get install -y libpng12-dev
sudo -E apt-get install -y libpng-dev
fi
else
elif [[ -f /etc/redhat-release ]]; then
# CentOS 7.x
sudo -E yum install -y centos-release-scl epel-release
sudo -E yum install -y \
@@ -125,5 +124,6 @@ else
echo "FFmpeg installation skipped. You may build FFmpeg from sources as described here: https://trac.ffmpeg.org/wiki/CompilationGuide/Centos"
echo
fi
else
echo "Unknown OS, please install build dependencies manually"
fi


@@ -59,6 +59,11 @@ if (WIN32)
set (CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -D_SCL_SECURE_NO_WARNINGS -DNOMINMAX")
set (CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} /EHsc") #no asynchronous structured exception handling
set (CMAKE_EXE_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS} /LARGEADDRESSAWARE")
if (TREAT_WARNING_AS_ERROR)
set (CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} /WX") #treating warnings as errors
endif ()
if (${CMAKE_CXX_COMPILER_ID} STREQUAL MSVC)
set (CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} /wd4251 /wd4275 /wd4267") #disable some warnings
endif()


@@ -14,7 +14,7 @@ Upon start-up, the application reads command-line parameters and loads a network
plugin, which is chosen depending on a specified device. The number of infer requests and execution approach depend
on the mode defined with the `-api` command-line parameter.
> **NOTE**: By default, Inference Engine samples and demos expect input with BGR channels order. If you trained your model to work with RGB order, you need to manually rearrange the default channels order in the sample or demo application or reconvert your model using the Model Optimizer tool with `--reverse_input_channels` argument specified. For more information about the argument, refer to **When to Specify Input Shapes** section of [Converting a Model Using General Conversion Parameters](./docs/MO_DG/prepare_model/convert_model/Converting_Model_General.md).
> **NOTE**: By default, Inference Engine samples and demos expect input with BGR channels order. If you trained your model to work with RGB order, you need to manually rearrange the default channels order in the sample or demo application or reconvert your model using the Model Optimizer tool with `--reverse_input_channels` argument specified. For more information about the argument, refer to **When to Reverse Input Channels** section of [Converting a Model Using General Conversion Parameters](./docs/MO_DG/prepare_model/convert_model/Converting_Model_General.md).
If you run the application in the synchronous mode, it creates one infer request and executes the `Infer` method.
If you run the application in the asynchronous mode, it creates as many infer requests as specified in the `-nireq`


@@ -11,6 +11,7 @@
#include <utility>
#include <inference_engine.hpp>
#include <ext_list.hpp>
#include <format_reader_ptr.h>
#include <vpu/vpu_plugin_config.hpp>
@@ -60,18 +61,6 @@ bool ParseAndCheckCommandLine(int argc, char *argv[]) {
throw std::logic_error("Input is not set. Please use -h.");
}
if (FLAGS_niter < 0) {
throw std::logic_error("Number of iterations should be positive (invalid -niter option value)");
}
if (FLAGS_nireq < 0) {
throw std::logic_error("Number of inference requests should be positive (invalid -nireq option value)");
}
if (FLAGS_b < 0) {
throw std::logic_error("Batch size should be positive (invalid -b option value)");
}
if (!FLAGS_report_type.empty() &&
FLAGS_report_type != noCntReport && FLAGS_report_type != medianCntReport && FLAGS_report_type != detailedCntReport) {
std::string err = "only " + std::string(noCntReport) + "/" + std::string(medianCntReport) + "/" + std::string(detailedCntReport) +
@@ -113,12 +102,19 @@ int main(int argc, char *argv[]) {
InferencePlugin plugin = PluginDispatcher({ FLAGS_pp }).getPluginByDevice(FLAGS_d);
if (!FLAGS_l.empty()) {
// CPU (MKLDNN) extensions is loaded as a shared library and passed as a pointer to base extension
const std::shared_ptr<IExtension> extension_ptr = InferenceEngine::make_so_pointer<InferenceEngine::IExtension>(FLAGS_l);
plugin.AddExtension(extension_ptr);
slog::info << "CPU (MKLDNN) extensions is loaded " << FLAGS_l << slog::endl;
} else if (!FLAGS_c.empty()) {
if (FLAGS_d.find("CPU") != std::string::npos) {
// Loading default CPU extensions
plugin.AddExtension(std::make_shared<Extensions::Cpu::CpuExtensions>());
if (!FLAGS_l.empty()) {
// CPU (MKLDNN) extensions is loaded as a shared library and passed as a pointer to base extension
const auto extension_ptr = InferenceEngine::make_so_pointer<InferenceEngine::IExtension>(FLAGS_l);
plugin.AddExtension(extension_ptr);
slog::info << "CPU (MKLDNN) extensions is loaded " << FLAGS_l << slog::endl;
}
}
if ((FLAGS_d.find("GPU") != std::string::npos) && !FLAGS_c.empty()) {
// Load clDNN Extensions
plugin.SetConfig({ {CONFIG_KEY(CONFIG_FILE), FLAGS_c} });
slog::info << "GPU extensions is loaded " << FLAGS_c << slog::endl;


@@ -34,7 +34,7 @@ public:
std::string report_folder;
};
explicit StatisticsReport(Config config) : _config(std::move(config)) {
explicit StatisticsReport(const Config &config) : _config(config) {
if (_config.niter > 0) {
_performanceCounters.reserve(_config.niter);
}


@@ -1,9 +1,13 @@
# Calibration Tool
# C++ Calibration Tool [DEPRECATED]
Inference Engine Calibration Tool calibrates a given FP32 model so that it can be run in low-precision 8-bit integer
> **NOTE**: OpenVINO 2019 R1 release introduced a [Python\* version of the Calibration Tool](./inference-engine/tools/calibration_tool/README.md). This is now the recommended version since it supports a larger set of topologies and datasets. The [C++ version of the Calibration Tool](./inference-engine/samples/calibration_tool/README.md) is still in the package but deprecated and will not be updated for new releases.
The C++ Calibration Tool calibrates a given FP32 model so that it can be run in low-precision 8-bit integer
mode while keeping the input data of this model in the original precision.
> **NOTE**: By default, Inference Engine samples and demos expect input with BGR channels order. If you trained your model to work with RGB order, you need to manually rearrange the default channels order in the sample or demo application or reconvert your model using the Model Optimizer tool with `--reverse_input_channels` argument specified. For more information about the argument, refer to **When to Specify Input Shapes** section of [Converting a Model Using General Conversion Parameters](./docs/MO_DG/prepare_model/convert_model/Converting_Model_General.md).
> **NOTE**: INT8 models are currently supported only by the CPU plugin. For the full list of supported configurations, see the [Supported Devices](./docs/IE_DG/supported_plugins/Supported_Devices.md) topic.
> **NOTE**: By default, Inference Engine samples and demos expect input with BGR channels order. If you trained your model to work with RGB order, you need to manually rearrange the default channels order in the sample or demo application or reconvert your model using the Model Optimizer tool with `--reverse_input_channels` argument specified. For more information about the argument, refer to **When to Reverse Input Channels** section of [Converting a Model Using General Conversion Parameters](./docs/MO_DG/prepare_model/convert_model/Converting_Model_General.md).
## Calibration Tool Options


@@ -33,6 +33,9 @@ CNNLayerPtr Int8Calibrator::addScaleShiftBeforeLayer(std::string name, CNNLayer:
params.type = "ScaleShift";
CNNLayerPtr lptr = std::make_shared<ScaleShiftLayer>(params);
ScaleShiftLayer *pScaleShift = dynamic_cast<ScaleShiftLayer *>(lptr.get());
if (pScaleShift == nullptr) {
THROW_IE_EXCEPTION << "Layer " << lptr->name << " is not instance of ScaleShiftLayer class";
}
SizeVector wdims({ pData->dims[2] });
@@ -94,10 +97,14 @@ CNNLayerPtr Int8Calibrator::addScaleShiftBeforeLayer(std::string name, CNNLayer:
float Int8Calibrator::compare_NRMSD(InferenceEngine::Blob::Ptr res, InferenceEngine::Blob::Ptr ref) {
float *res_ptr = res->buffer().as<float *>();
auto *res_ptr = res->buffer().as<float *>();
auto *ref_ptr = ref->buffer().as<float *>();
float *ref_ptr = ref->buffer().as<float *>();
size_t ref_size = ref->size();
if (ref_size == 0) {
throw std::logic_error("ref_size can't be equal to zero");
}
float sum = 0;
@@ -111,9 +118,7 @@ float Int8Calibrator::compare_NRMSD(InferenceEngine::Blob::Ptr res, InferenceEng
mmin = std::min(mmin, ref_ptr[i]);
mmax = std::max(mmax, ref_ptr[i]);
}
if (std::fabs(ref_size) < std::numeric_limits<double>::epsilon()) {
throw std::logic_error("ref_size can't be equal to zero");
}
sum /= ref_size;
sum = pow(sum, 0.5f);
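The function patched above computes the normalized root-mean-square deviation between an INT8 result and the FP32 reference. Written as a standalone helper it can be sketched as follows; this is an illustrative function, not the `Int8Calibrator` member, and it assumes the normalization is by the reference value range:

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Illustrative standalone NRMSD: RMS of the element-wise error,
// normalized by the reference value range (mmax - mmin).
float nrmsd(const std::vector<float> &res, const std::vector<float> &ref) {
    if (ref.empty() || res.size() != ref.size())
        throw std::logic_error("ref must be non-empty and match res in size");
    float sum = 0.f, mmin = ref[0], mmax = ref[0];
    for (std::size_t i = 0; i < ref.size(); ++i) {
        const float d = res[i] - ref[i];
        sum += d * d;
        mmin = std::min(mmin, ref[i]);
        mmax = std::max(mmax, ref[i]);
    }
    sum = std::sqrt(sum / ref.size());
    return (mmax > mmin) ? sum / (mmax - mmin) : sum;
}
```

Checking `ref_size` before dividing, as the patch moves the check to do, avoids a division by zero for empty blobs.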
@@ -278,6 +283,9 @@ CNNNetwork Int8Calibrator::createICNNNetworkForLayer(CNNLayer::Ptr layerToClone,
size_t outputWidth = outputData->getTensorDesc().getDims()[3];
ConvolutionLayer *pConvS = dynamic_cast<ConvolutionLayer *>(layerToClone.get());
if (pConvS == nullptr) {
THROW_IE_EXCEPTION << "Layer " << layerToClone->name << " is not instance of ConvolutionLayer class";
}
std::string model = "<net name=\"L\" version=\"2\" batch=\"1\"><layers> "\
"<layer name=\"" +
@@ -361,6 +369,10 @@ CNNNetwork Int8Calibrator::createICNNNetworkForLayer(CNNLayer::Ptr layerToClone,
CNNLayerPtr convLayer;
n.getLayerByName(layerToClone->name.c_str(), convLayer, nullptr);
ConvolutionLayer *pConvT = dynamic_cast<ConvolutionLayer *>(convLayer.get());
if (pConvT == nullptr) {
THROW_IE_EXCEPTION << "Layer " << convLayer->name << " is not instance of ConvolutionLayer class";
}
pConvT->_weights = pConvS->_weights;
pConvT->_biases = pConvS->_biases;
pConvT->blobs = pConvS->blobs;


@@ -107,7 +107,7 @@ protected:
InferenceEngine::InferRequest _inferRequestI8C;
int _cBatch = 0;
size_t _nPictures;
size_t _nPictures = 0;
private:
/**


@@ -425,7 +425,10 @@ int main(int argc, char *argv[]) {
THROW_USER_EXCEPTION(2) << "Processor pointer is invalid" << FLAGS_ppType;
}
Int8Calibrator* calibrator = dynamic_cast<Int8Calibrator*>(processor.get());
auto calibrator = dynamic_cast<Int8Calibrator*>(processor.get());
if (calibrator == nullptr) {
THROW_USER_EXCEPTION(2) << "processor object is not instance of Int8Calibrator class";
}
if (netType != RawC && netType != RawOD) {
slog::info << "Collecting accuracy metric in FP32 mode to get a baseline, collecting activation statistics" << slog::endl;
@@ -434,7 +437,10 @@ int main(int argc, char *argv[]) {
}
calibrator->collectFP32Statistic();
shared_ptr<Processor::InferenceMetrics> pIMFP32 = processor->Process(FLAGS_stream_output);
const CalibrationMetrics* mFP32 = dynamic_cast<const CalibrationMetrics*>(pIMFP32.get());
const auto mFP32 = dynamic_cast<const CalibrationMetrics*>(pIMFP32.get());
if (mFP32 == nullptr) {
THROW_USER_EXCEPTION(2) << "FP32 inference metrics object is not instance of CalibrationMetrics class";
}
std:: cout << " FP32 Accuracy: " << OUTPUT_FLOATING(100.0 * mFP32->AccuracyResult) << "% " << std::endl;
InferenceEngine::NetworkStatsMap statMap;
@@ -450,7 +456,10 @@ int main(int argc, char *argv[]) {
InferenceEngine::NetworkStatsMap tmpStatMap = calibrator->getStatistic(threshold);
calibrator->validateInt8Config(tmpStatMap, {}, FLAGS_convert_fc);
shared_ptr<Processor::InferenceMetrics> pIM_I8 = processor->Process(FLAGS_stream_output);
const CalibrationMetrics *mI8 = dynamic_cast<const CalibrationMetrics *>(pIM_I8.get());
auto *mI8 = dynamic_cast<const CalibrationMetrics *>(pIM_I8.get());
if (mI8 == nullptr) {
THROW_USER_EXCEPTION(2) << "INT8 inference metrics object is not instance of CalibrationMetrics class";
}
if (maximalAccuracy < mI8->AccuracyResult) {
maximalAccuracy = mI8->AccuracyResult;
bestThreshold = threshold;
@@ -477,7 +486,7 @@ int main(int argc, char *argv[]) {
orderedLayersAccuracyDrop[d.second] = d.first;
layersToInt8[d.first] = true;
}
std::map<float, std::string>::const_reverse_iterator it = orderedLayersAccuracyDrop.crbegin();
auto it = orderedLayersAccuracyDrop.crbegin();
shared_ptr<Processor::InferenceMetrics> pIM_I8;
const CalibrationMetrics *mI8;
@@ -537,6 +546,12 @@ int main(int argc, char *argv[]) {
showUsage();
return ex.list().begin()->exitCode();
}
} catch (const std::exception& ex) {
slog::err << ex.what() << slog::endl;
return 1;
} catch (...) {
slog::err << "Unknown/internal exception happened." << slog::endl;
return 1;
}
return 0;
}
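The recurring fix in this file, checking each `dynamic_cast` result before dereferencing it, boils down to the pattern below; `Base` and `Derived` are illustrative stand-ins for `Processor` and `Int8Calibrator`:

```cpp
#include <cassert>
#include <stdexcept>

// Sketch of the defensive downcast pattern introduced above: a failed
// dynamic_cast on a pointer yields nullptr, which is turned into a
// diagnosable exception instead of a null-pointer dereference.
struct Base { virtual ~Base() = default; };
struct Derived : Base { int value = 42; };

int checked_downcast_value(Base *p) {
    auto d = dynamic_cast<Derived *>(p);
    if (d == nullptr)
        throw std::runtime_error("object is not an instance of Derived");
    return d->value;
}
```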


@@ -11,7 +11,7 @@ Upon the start-up, the sample application reads command line parameters and load
Engine plugin. When inference is done, the application creates an
output image and outputs data to the standard output stream.
> **NOTE**: By default, Inference Engine samples and demos expect input with BGR channels order. If you trained your model to work with RGB order, you need to manually rearrange the default channels order in the sample or demo application or reconvert your model using the Model Optimizer tool with `--reverse_input_channels` argument specified. For more information about the argument, refer to **When to Specify Input Shapes** section of [Converting a Model Using General Conversion Parameters](./docs/MO_DG/prepare_model/convert_model/Converting_Model_General.md).
> **NOTE**: By default, Inference Engine samples and demos expect input with BGR channels order. If you trained your model to work with RGB order, you need to manually rearrange the default channels order in the sample or demo application or reconvert your model using the Model Optimizer tool with `--reverse_input_channels` argument specified. For more information about the argument, refer to **When to Reverse Input Channels** section of [Converting a Model Using General Conversion Parameters](./docs/MO_DG/prepare_model/convert_model/Converting_Model_General.md).
## Running
Running the application with the `-h` option yields the following usage message:


@@ -29,7 +29,7 @@ Then in a loop it starts inference for the current infer request and switches to
When inference is done, the application outputs data to the standard output stream.
> **NOTE**: By default, Inference Engine samples and demos expect input with BGR channels order. If you trained your model to work with RGB order, you need to manually rearrange the default channels order in the sample or demo application or reconvert your model using the Model Optimizer tool with `--reverse_input_channels` argument specified. For more information about the argument, refer to **When to Specify Input Shapes** section of [Converting a Model Using General Conversion Parameters](./docs/MO_DG/prepare_model/convert_model/Converting_Model_General.md).
> **NOTE**: By default, Inference Engine samples and demos expect input with BGR channels order. If you trained your model to work with RGB order, you need to manually rearrange the default channels order in the sample or demo application or reconvert your model using the Model Optimizer tool with `--reverse_input_channels` argument specified. For more information about the argument, refer to **When to Reverse Input Channels** section of [Converting a Model Using General Conversion Parameters](./docs/MO_DG/prepare_model/convert_model/Converting_Model_General.md).
## Running


@@ -11,7 +11,7 @@ Please refer to [Object Detection for SSD Demo](./inference-engine/samples/objec
[Security Barrier Camera Demo](./inference-engine/samples/security_barrier_camera_demo/README.md), or
[Crossroad Camera Demo](./inference-engine/samples/crossroad_camera_demo/README.md) with an example of using of new crop ROI API.
> **NOTE**: By default, Inference Engine samples and demos expect input with BGR channels order. If you trained your model to work with RGB order, you need to manually rearrange the default channels order in the sample or demo application or reconvert your model using the Model Optimizer tool with `--reverse_input_channels` argument specified. For more information about the argument, refer to **When to Specify Input Shapes** section of [Converting a Model Using General Conversion Parameters](./docs/MO_DG/prepare_model/convert_model/Converting_Model_General.md).
> **NOTE**: By default, Inference Engine samples and demos expect input with BGR channels order. If you trained your model to work with RGB order, you need to manually rearrange the default channels order in the sample or demo application or reconvert your model using the Model Optimizer tool with `--reverse_input_channels` argument specified. For more information about the argument, refer to **When to Reverse Input Channels** section of [Converting a Model Using General Conversion Parameters](./docs/MO_DG/prepare_model/convert_model/Converting_Model_General.md).
## Running


@@ -5,7 +5,7 @@ The sample is simplified version of [Image Classification Sample](./inference-en
It demonstrates how to use the new Infer Request API of Inference Engine in applications. Refer to
[Integrate the Inference Engine New Request API with Your Application](./docs/IE_DG/Integrate_with_customer_application_new_API.md) for details.
> **NOTE**: By default, Inference Engine samples and demos expect input with BGR channels order. If you trained your model to work with RGB order, you need to manually rearrange the default channels order in the sample or demo application or reconvert your model using the Model Optimizer tool with `--reverse_input_channels` argument specified. For more information about the argument, refer to **When to Specify Input Shapes** section of [Converting a Model Using General Conversion Parameters](./docs/MO_DG/prepare_model/convert_model/Converting_Model_General.md).
> **NOTE**: By default, Inference Engine samples and demos expect input with BGR channels order. If you trained your model to work with RGB order, you need to manually rearrange the default channels order in the sample or demo application or reconvert your model using the Model Optimizer tool with `--reverse_input_channels` argument specified. For more information about the argument, refer to **When to Reverse Input Channels** section of [Converting a Model Using General Conversion Parameters](./docs/MO_DG/prepare_model/convert_model/Converting_Model_General.md).
## Running


@@ -3,7 +3,7 @@
This topic demonstrates how to run the Hello Shape Infer SSD application, which does inference using object detection
networks like SSD-VGG. The sample shows how to use [Shape Inference feature](./docs/IE_DG/ShapeInference.md).
> **NOTE**: By default, Inference Engine samples and demos expect input with BGR channels order. If you trained your model to work with RGB order, you need to manually rearrange the default channels order in the sample or demo application or reconvert your model using the Model Optimizer tool with `--reverse_input_channels` argument specified. For more information about the argument, refer to **When to Specify Input Shapes** section of [Converting a Model Using General Conversion Parameters](./docs/MO_DG/prepare_model/convert_model/Converting_Model_General.md).
> **NOTE**: By default, Inference Engine samples and demos expect input with BGR channels order. If you trained your model to work with RGB order, you need to manually rearrange the default channels order in the sample or demo application or reconvert your model using the Model Optimizer tool with `--reverse_input_channels` argument specified. For more information about the argument, refer to **When to Reverse Input Channels** section of [Converting a Model Using General Conversion Parameters](./docs/MO_DG/prepare_model/convert_model/Converting_Model_General.md).
## Running


@@ -13,7 +13,7 @@ When inference is done, the application outputs inference results to the standar
> **NOTE**: This sample is implemented to support models with FP32 weights only.
> **NOTE**: By default, Inference Engine samples and demos expect input with BGR channels order. If you trained your model to work with RGB order, you need to manually rearrange the default channels order in the sample or demo application or reconvert your model using the Model Optimizer tool with `--reverse_input_channels` argument specified. For more information about the argument, refer to **When to Specify Input Shapes** section of [Converting a Model Using General Conversion Parameters](./docs/MO_DG/prepare_model/convert_model/Converting_Model_General.md).
> **NOTE**: By default, Inference Engine samples and demos expect input with BGR channels order. If you trained your model to work with RGB order, you need to manually rearrange the default channels order in the sample or demo application or reconvert your model using the Model Optimizer tool with `--reverse_input_channels` argument specified. For more information about the argument, refer to **When to Reverse Input Channels** section of [Converting a Model Using General Conversion Parameters](./docs/MO_DG/prepare_model/convert_model/Converting_Model_General.md).
## Running


@@ -9,7 +9,7 @@ Upon the start-up the sample application reads command line parameters and loads
Engine plugin. When inference is done, the application creates an
output image and outputs data to the standard output stream.
> **NOTE**: By default, Inference Engine samples and demos expect input with BGR channels order. If you trained your model to work with RGB order, you need to manually rearrange the default channels order in the sample or demo application or reconvert your model using the Model Optimizer tool with `--reverse_input_channels` argument specified. For more information about the argument, refer to **When to Specify Input Shapes** section of [Converting a Model Using General Conversion Parameters](./docs/MO_DG/prepare_model/convert_model/Converting_Model_General.md).
> **NOTE**: By default, Inference Engine samples and demos expect input with BGR channels order. If you trained your model to work with RGB order, you need to manually rearrange the default channels order in the sample or demo application or reconvert your model using the Model Optimizer tool with `--reverse_input_channels` argument specified. For more information about the argument, refer to **When to Reverse Input Channels** section of [Converting a Model Using General Conversion Parameters](./docs/MO_DG/prepare_model/convert_model/Converting_Model_General.md).
## Running


@@ -100,7 +100,6 @@ static std::map<std::string, std::string> parseConfig(const std::string &configN
static std::size_t getNumberRequests(const std::string &plugin) {
static const std::unordered_map<std::string, std::size_t> supported_plugins = {
{ "MYRIAD", 4 },
{ "HDDL", 100 },
{ "FPGA", 3 },
};

(Binary image file added, 303 KiB; diff not shown. A large file diff was also suppressed.)

View File

@@ -810,7 +810,7 @@ int main(int argc, char *argv[]) {
inputFrame,
inputBlob->byteSize());
auto index = frameIndex - 2 * FLAGS_cw;
int index = static_cast<int>(frameIndex) - 2 * FLAGS_cw;
inferRequest.inferRequest.StartAsync();
inferRequest.frameIndex = index < 0 ? -2 : index;
inferRequest.numFramesThisBatch = numFramesThisBatch;

View File

@@ -139,7 +139,7 @@ DEFINE_int32(bs, 1, batch_size_message);
/// @brief Number of threads to use for inference on the CPU (also affects Hetero cases)
DEFINE_int32(nthreads, 1, infer_num_threads_message);
/// @brief Batch size (default 0)
/// @brief Context window size (default 0)
DEFINE_int32(cw, 0, context_window_message);
/**

View File

@@ -5,7 +5,7 @@ inference of style transfer models.
> **NOTE**: The OpenVINO™ toolkit does not include a pre-trained model to run the Neural Style Transfer sample. A public model from the [Zhaw's Neural Style Transfer repository](https://github.com/zhaw/neural_style) can be used. Read the [Converting a Style Transfer Model from MXNet*](./docs/MO_DG/prepare_model/convert_model/mxnet_specific/Convert_Style_Transfer_From_MXNet.md) topic from the [Model Optimizer Developer Guide](./docs/MO_DG/Deep_Learning_Model_Optimizer_DevGuide.md) to learn about how to get the trained model and how to convert it to the Inference Engine format (\*.xml + \*.bin).
> **NOTE**: By default, Inference Engine samples and demos expect input with BGR channels order. If you trained your model to work with RGB order, you need to manually rearrange the default channels order in the sample or demo application or reconvert your model using the Model Optimizer tool with `--reverse_input_channels` argument specified. For more information about the argument, refer to **When to Specify Input Shapes** section of [Converting a Model Using General Conversion Parameters](./docs/MO_DG/prepare_model/convert_model/Converting_Model_General.md).
> **NOTE**: By default, Inference Engine samples and demos expect input with BGR channels order. If you trained your model to work with RGB order, you need to manually rearrange the default channels order in the sample or demo application or reconvert your model using the Model Optimizer tool with `--reverse_input_channels` argument specified. For more information about the argument, refer to **When to Reverse Input Channels** section of [Converting a Model Using General Conversion Parameters](./docs/MO_DG/prepare_model/convert_model/Converting_Model_General.md).
## Running

View File

@@ -0,0 +1,115 @@
body {
background-color: #ffffff;
color: black;
margin-right: 1in;
margin-left: 1in;
}
h1, h2, h3, h4, h5, h6 {
color: #3366ff;
font-family: sans-serif;
}
@media print {
/* Darker version for printing */
h1, h2, h3, h4, h5, h6 {
color: #000080;
font-family: helvetica, sans-serif;
}
}
h1 {
text-align: center;
font-size: 18pt;
}
h2 {
margin-left: -0.5in;
}
h3 {
margin-left: -0.25in;
}
h4 {
margin-left: -0.125in;
}
hr {
margin-left: -1in;
}
/* Definition lists: definition term bold */
dt {
font-weight: bold;
}
address {
text-align: right;
}
/* Use the <code> tag for bits of code and <var> for variables and objects. */
code,pre,samp,var {
color: #006000;
}
/* Use the <file> tag for file and directory paths and names. */
file {
color: #905050;
font-family: monospace;
}
/* Use the <kbd> tag for stuff the user should type. */
kbd {
color: #600000;
}
div.note p {
float: right;
width: 3in;
margin-right: 0%;
padding: 1px;
border: 2px solid #6060a0;
background-color: #fffff0;
}
UL.nobullets {
list-style-type: none;
list-style-image: none;
margin-left: -1em;
}
/*
body:after {
content: "Google Confidential";
}
*/
/* pretty printing styles. See prettify.js */
.str { color: #080; }
.kwd { color: #008; }
.com { color: #800; }
.typ { color: #606; }
.lit { color: #066; }
.pun { color: #660; }
.pln { color: #000; }
.tag { color: #008; }
.atn { color: #606; }
.atv { color: #080; }
pre.prettyprint { padding: 2px; border: 1px solid #888; }
.embsrc { background: #eee; }
@media print {
.str { color: #060; }
.kwd { color: #006; font-weight: bold; }
.com { color: #600; font-style: italic; }
.typ { color: #404; font-weight: bold; }
.lit { color: #044; }
.pun { color: #440; }
.pln { color: #000; }
.tag { color: #006; font-weight: bold; }
.atn { color: #404; }
.atv { color: #060; }
}
/* Table Column Headers */
.hdr {
color: #006;
font-weight: bold;
background-color: #dddddd; }
.hdr2 {
color: #006;
background-color: #eeeeee; }

View File

@@ -0,0 +1,648 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<html>
<head>
<title>How To Use Gflags (formerly Google Commandline Flags)</title>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<link href="designstyle.css" type="text/css" rel="stylesheet">
<style type="text/css">
<!--
ol.bluelist li {
color: #3366ff;
font-family: sans-serif;
}
ol.bluelist li p {
color: #000;
font-family: "Times Roman", times, serif;
}
ul.blacklist li {
color: #000;
font-family: "Times Roman", times, serif;
}
//-->
</style>
</head>
<body>
<h1>How To Use gflags (formerly Google Commandline Flags)</h1>
<small>(as of
<script type=text/javascript>
var lm = new Date(document.lastModified);
document.write(lm.toDateString());
</script>)
</small>
<br>
<blockquote><dl>
<dt> Table of contents </dt>
<dd> <a href="#intro">Introduction</a> </dd>
<dd> <a href="#download">Download and Installation</a> </dd>
<dd> <a href="#cmake">Declare dependency on gflags with CMake</a></dd>
<dd> <a href="#bazel">Declare dependency on gflags with Bazel</a></dd>
<dd> <a href="#define">DEFINE: Defining Flags In Program</A> </dd>
<dd> <a href="#using">Accessing the Flag</A> </dd>
<dd> <a href="#declare">DECLARE: Using the Flag in a Different File</a> </dd>
<dd> <a href="#validate">RegisterFlagValidator: Sanity-checking Flag Values</a> </dd>
<dd> <a href="#together">Putting It Together: How to Set Up Flags</a> </dd>
<dd> <a href="#commandline">Setting Flags on the Command Line</a> </dd>
<dd> <a href="#varz">Setting Flags at Runtime</a> </dd>
<dd> <a href="#default">Changing the Default Flag Value</a> </dd>
<dd> <a href="#special">Special Flags</a> </dd>
<dd> <a href="#api">The API</a> </dd>
<dd> <a href="#misc">Miscellaneous Notes</a> </dd>
<dd> <a href="#issues">Issues and Feature Requests</a> </dd>
<dd> <br/> </dd>
</dl></blockquote>
<h2> <A NAME=intro>Introduction, and Comparison to Other Commandline
Flags Libraries</A> </h2>
<p><b>Commandline flags</b> are flags that users specify on the
command line when they run an executable. In the command</p>
<pre>
fgrep -l -f /var/tmp/foo johannes brahms
</pre>
<p><code>-l</code> and <code>-f /var/tmp/foo</code> are the two
commandline flags. (<code>johannes</code> and <code>brahms</code>,
which don't start with a dash, are <b>commandline arguments</b>.)</p>
<p>Typically, an application lists what flags the user is allowed to
pass in, and what arguments they take -- in this example,
<code>-l</code> takes no argument, and <code>-f</code> takes a
string (in particular, a filename) as an argument. Users can use a
library to help parse the commandline and store the flags in some data
structure.</p>
<p>Gflags, the commandline flags library used within Google,
differs from other libraries,
such as <code>getopt()</code>, in that flag definitions can be
scattered around the source code, and not just listed in one place
such as <code>main()</code>. In practice, this means that a single
source-code file will define and use flags that are meaningful to that
file. Any application that links in that file will get the flags, and
the gflags library will automatically handle that
flag appropriately.</p>
<p>There's significant gain in flexibility, and ease of code reuse,
due to this technique. However, there is a danger that two files will
define the same flag, and then give an error when they're linked
together.</p>
<p>The rest of this document describes how to use the commandlineflag
library. It's a C++ library, so examples are in C++. However, there
is a Python port with the same functionality, and this discussion
translates directly to Python.</p>
<h2> <A NAME=download>Download and Installation</A> </h2>
<p>The gflags library can be downloaded from <A href="https://github.com/gflags/gflags">GitHub</A>.
You can clone the project using the command:</p>
<pre>
git clone https://github.com/gflags/gflags.git
</pre>
<p>Build and installation instructions are provided in the
<A href="https://github.com/gflags/gflags/blob/master/INSTALL.md">INSTALL</A> file.
The installation of the gflags package includes configuration files for popular build systems
such as <A href="https://www.freedesktop.org/wiki/Software/pkg-config/">pkg-config</A>,
<A href="#cmake">CMake</A>, and <A href="#bazel">Bazel</A>.</p>
<h2> <A name=cmake>Declare dependency on gflags with CMake</A></h2>
<p>Using gflags within a project which uses <A href="http://www.cmake.org">CMake</A> for its build system is easy.
You can either require an external installation of the gflags package and find it using CMake's find_package
command, or include the gflags project as subtree or submodule within your project's source tree and add the directory
using CMake's add_subdirectory command.
<p>To use an external gflags installation, add the following CMake code to your <code>CMakeLists.txt</code> file.</p>
<p>Find gflags installation. The <code>gflags_DIR</code> variable must be set to the &lt;prefix&gt;/lib/cmake/gflags directory
containing the gflags-config.cmake file if &lt;prefix&gt; is a non-standard location. Otherwise, CMake should find
the gflags installation automatically.</p>
<pre>
find_package(gflags REQUIRED)
</pre>
<p>To request a particular imported gflags library target to link against, use the <code>COMPONENTS</code> option of
the find_package command. For example, to force the use of the single-threaded static library, use the command</p>
<pre>
find_package(gflags COMPONENTS nothreads_static)
</pre>
<p>Note that this will raise a fatal error when the installed gflags package does not contain the requested library.
It is therefore recommended to only specify the particular component to look for if a specific library must be used.
Otherwise, the gflags-config.cmake module will choose a suitable and available library for you. By default, the
multi-threaded gflags library with shared linkage is chosen if available.</p>
<p>When the source tree of the gflags project is included as subtree or submodule in the "gflags" directory of your project,
replace the above find_package command by <code>add_subdirectory(gflags)</code>. See the top of the <code>gflags/CMakeLists.txt</code>
file for a listing of available CMake variables that can be set before this command to configure the build of the
gflags library. The default build settings are the build of a single-threaded static library which does not require
any installation of the gflags subproject products.</p>
<p>Finally, add your executable build target which uses gflags to parse the command arguments with dependency on the
imported gflags library target:</p>
<pre>
add_executable(foo main.cc)
target_link_libraries(foo gflags)
</pre>
<h2> <A name=bazel>Declare dependency on gflags with Bazel</A></h2>
<p>To use gflags within a project which uses <A href="https://bazel.build/">Bazel</A> as build tool,
add the following lines to your <code>WORKSPACE</code> file
(see also Bazel documentation of <A href="https://www.bazel.io/versions/master/docs/be/workspace.html#git_repository">git_repository</A>):
<pre>
git_repository(
name = "com_github_gflags_gflags",
commit = "&lt;INSERT COMMIT SHA HERE&gt;",
remote = "https://github.com/gflags/gflags.git",
)
bind(
name = "gflags",
actual = "@com_github_gflags_gflags//:gflags",
)
bind(
name = "gflags_nothreads",
actual = "@com_github_gflags_gflags//:gflags_nothreads",
)
</pre>
<p>You can then add <code>//external:gflags</code> to the <code>deps</code> section of a <code>cc_binary</code>
or <code>cc_library</code> rule, and <code>#include "gflags/gflags.h"</code> to include it in your source code.
This uses the shared gflags library with multi-threading enabled. In order to use the single-threaded shared
gflags library, use the external dependency <code>//external:gflags_nothreads</code> instead.</p>
<p>For example, see the following <code>BUILD</code> rule of the gflags/example project:</p>
<pre>
cc_binary(
name = "foo",
srcs = ["main.cc"],
deps = ["//external:gflags"],
)
</pre>
<h2> <A name=define>DEFINE: Defining Flags In Program</A> </h2>
<p> Defining a flag is easy: just use the appropriate macro for the
type you want the flag to be, as defined at the bottom of
<code>gflags/gflags.h</code>. Here's an example file,
<code>foo.cc</code>:</p>
<pre>
#include &lt;gflags/gflags.h&gt;
DEFINE_bool(big_menu, true, "Include 'advanced' options in the menu listing");
DEFINE_string(languages, "english,french,german",
"comma-separated list of languages to offer in the 'lang' menu");
</pre>
<p><code>DEFINE_bool</code> defines a boolean flag. Here are the
types supported:</p>
<ul>
<li> <code>DEFINE_bool</code>: boolean
<li> <code>DEFINE_int32</code>: 32-bit integer
<li> <code>DEFINE_int64</code>: 64-bit integer
<li> <code>DEFINE_uint64</code>: unsigned 64-bit integer
<li> <code>DEFINE_double</code>: double
<li> <code>DEFINE_string</code>: C++ string
</ul>
<p>Note that there are no 'complex' types like lists: the "languages"
flag in our example is a list of strings, but is defined of type
"string", not "list_of_string" or similar. This is by design. We'd
rather use only simple types for the flags, and allow for complex,
arbitrary parsing routines to parse them, than to try to put the logic
inside the flags library proper.</p>
<p>All DEFINE macros take the same three arguments: the name of the
flag, its default value, and a 'help' string that describes its use.
The 'help' string is displayed when the user runs the application with
the <A HREF="#special"><code>--help</code> flag</A>.</p>
<p>You can define a flag in any source-code file in your executable.
Only define a flag once! If you want to access a flag in more than
one source file, DEFINE it in one file, and <A
HREF="#declare">DECLARE</A> it in the others. Even better, DEFINE it
in <code>foo.cc</code> and DECLARE it in <code>foo.h</code>; then
everyone who <code>#includes foo.h</code> can use the flag.</p>
<p>
Defining flags in libraries rather than in main() is powerful, but
does have some costs. One is that a library might not have a good
default value for its flags, for example if the flag holds a
filename that might not exist in some environments. To mitigate such problems,
you can use <a href="#validate">flag validators</a> to ensure prompt
notification (in the form of a crash) of an invalid flag value.
</p>
<p>Note that while most functions in this library are defined in the
<code>google</code> namespace, <code>DEFINE_foo</code> (and
<code>DECLARE_foo</code>, <A HREF="#declare">below</A>), should always
be in the global namespace.</p>
<h2> <A name=using>Accessing the Flag</A> </h2>
<p>All defined flags are available to the program as just a normal
variable, with the prefix <code>FLAGS_</code> prepended. In the above
example, the macros define two variables, <code>FLAGS_big_menu</code>
(a bool), and <code>FLAGS_languages</code> (a C++ string).</p>
<p>You can read and write to the flag just like any other
variable:</p>
<pre>
if (FLAGS_consider_made_up_languages)
FLAGS_languages += ",klingon"; // implied by --consider_made_up_languages
if (FLAGS_languages.find("finnish") != string::npos)
HandleFinnish();
</pre>
<p>You can also get and set flag values via special functions in
<code>gflags.h</code>. That's a rarer use case, though.</p>
<h2> <A name=declare>DECLARE: Using the Flag in a Different File</A> </h2>
<p>Accessing a flag in the manner of the previous section only works
if the flag was <code>DEFINE</code>-ed at the top of the file. If it
wasn't, you'll get an 'unknown variable' error.</p>
<p>The <code>DECLARE_type</code> macro is available when you want to
use a flag that's defined in another file. For instance, if I were
writing <code>bar.cc</code> but wanted to access the big_menu flag, I
would put this near the top of <code>bar.cc</code>:</p>
<pre>
DECLARE_bool(big_menu);
</pre>
<p>This is functionally equivalent to saying <code>extern
FLAGS_big_menu</code>.</p>
<p>Note that such an extern declaration introduces a dependency
between your file and the file that defines the <code>big_menu</code>
flag: <code>foo.cc</code>, in this case. Such implicit dependencies
can be difficult to manage in large projects. For that reason we
recommend the following guideline:</p>
<blockquote>
If you DEFINE a flag in <code>foo.cc</code>, either don't DECLARE it
at all, only DECLARE it in tightly related tests, or only DECLARE
it in <code>foo.h</code>.
</blockquote>
<p>You should go the do-not-DECLARE route when the flag is only needed
by <code>foo.cc</code>, and not in any other file. If you want to
modify the value of the flag in the related test file to see if it is
functioning as expected, DECLARE it in the <code>foo_test.cc</code>
file.
<p>If the flag does span multiple files, DECLARE it in the associated
<code>.h</code> file, and make others <code>#include</code> that
<code>.h</code> file if they want to access the flag. The
<code>#include</code> will make explicit the dependency between the
two files. This causes the flag to be a global variable.</p>
<h2> <A name=validate>RegisterFlagValidator: Sanity-checking Flag Values</A> </h2>
<p>After DEFINE-ing a flag, you may optionally register a validator
function with the flag. If you do this, after the flag is parsed from
the commandline, and whenever its value is changed via a call to
<code>SetCommandLineOption()</code>, the validator function is called
with the new value as an argument. The validator function should
return 'true' if the flag value is valid, and false otherwise.
If the function returns false for the new setting of the
flag, the flag will retain its current value. If it returns false for the
default value, ParseCommandLineFlags will die.
<p>Here is an example use of this functionality:</p>
<pre>
static bool ValidatePort(const char* flagname, int32 value) {
if (value > 0 && value < 32768) // value is ok
return true;
printf("Invalid value for --%s: %d\n", flagname, (int)value);
return false;
}
DEFINE_int32(port, 0, "What port to listen on");
DEFINE_validator(port, &ValidatePort);
</pre>
<p>By doing the registration at global initialization time (right
after the DEFINE_int32), we ensure that the registration happens before
the commandline is parsed at the beginning of <code>main()</code>.</p>
<p>The above used <code>DEFINE_validator</code> macro calls the
<code>RegisterFlagValidator()</code> function which returns true if the
registration is successful. It returns false if the registration fails
because a) the first argument does not refer to a commandline flag, or
b) a different validator has already been registered for this flag.
The return value is available as a global static boolean variable named
<code>&lt;flag&gt;_validator_registered</code>.</p>
<h2> <A name=together>Putting It Together: How to Set Up Flags</A> </h2>
<p>The final piece is the one that tells the executable to process the
commandline flags, and set the <code>FLAGS_*</code> variables to the
appropriate, non-default value based on what is seen on the
commandline. This is equivalent to the <code>getopt()</code> call in
the getopt library, but has much less overhead to use. In fact, it's
just a single function call:</p>
<pre>
gflags::ParseCommandLineFlags(&argc, &argv, true);
</pre>
<p>Usually, this code is at the beginning of <code>main()</code>.
<code>argc</code> and <code>argv</code> are exactly as passed in to
<code>main()</code>. This routine might modify them, which is why
pointers to them are passed in.</p>
<p>The last argument is called "remove_flags". If true, then
<code>ParseCommandLineFlags</code> removes the flags and their
arguments from <code>argv</code>, and modifies <code>argc</code>
appropriately. In this case, after the function call,
<code>argv</code> will hold only commandline arguments, and not
commandline flags.</p>
<p>If, on the other hand, <code>remove_flags</code> is false, then
<code>ParseCommandLineFlags</code> will leave argc unchanged, but will
rearrange the arguments in argv so that the flags are all at the
beginning. For example, if the input is <code>"/bin/foo" "arg1" "-q"
"arg2"</code> (which is legal but weird), the function will rearrange
<code>argv</code> so it reads <code>"/bin/foo", "-q", "arg1",
"arg2"</code>. In this case, <code>ParseCommandLineFlags</code>
returns the index into argv that holds the first commandline argument:
that is, the index past the last flag. (In this example, it would
return 2, since <code>argv[2]</code> points to <code>arg1</code>.)</p>
<p>In either case, the <code>FLAGS_*</code> variables are modified
based on what was <A HREF="#commandline">passed in on the
commandline</A>.</p>
<h2> <A name=commandline>Setting Flags on the Command Line</A> </h2>
<p>The reason you make something a flag instead of a compile-time
constant, is so users can specify a non-default value on the
commandline. Here's how they might do it for an application that
links in <code>foo.cc</code>:</p>
<pre>
app_containing_foo --nobig_menu -languages="chinese,japanese,korean" ...
</pre>
<p>This sets <code>FLAGS_big_menu = false;</code> and
<code>FLAGS_languages = "chinese,japanese,korean"</code>, when
<code>ParseCommandLineFlags</code> is run.</p>
<p>Note the atypical syntax for setting a boolean flag to false:
putting "no" in front of its name. There's a fair bit of flexibility
to how flags may be specified. Here's an example of all the ways to
specify the "languages" flag:</p>
<ul>
<li> <code>app_containing_foo --languages="chinese,japanese,korean"</code>
<li> <code>app_containing_foo -languages="chinese,japanese,korean"</code>
<li> <code>app_containing_foo --languages "chinese,japanese,korean"</code>
<li> <code>app_containing_foo -languages "chinese,japanese,korean"</code>
</ul>
<p>For boolean flags, the possibilities are slightly different:</p>
<ul>
<li> <code>app_containing_foo --big_menu</code>
<li> <code>app_containing_foo --nobig_menu</code>
<li> <code>app_containing_foo --big_menu=true</code>
<li> <code>app_containing_foo --big_menu=false</code>
</ul>
<p>(as well as the single-dash variant on all of these).</p>
<p>Despite this flexibility, we recommend using only a single form:
<code>--variable=value</code> for non-boolean flags, and
<code>--variable/--novariable</code> for boolean flags. This
consistency will make your code more readable, and is also the format
required for certain special-use cases like <A
HREF="#flagfiles">flagfiles</A>.</p>
<p>It is a fatal error to specify a flag on the commandline that has
not been DEFINED somewhere in the executable. If you need that
functionality for some reason -- say you want to use the same set of
flags for several executables, but not all of them DEFINE every flag
in your list -- you can specify <A
HREF="#special"><code>--undefok</code></A> to suppress the error.</p>
<p>As in getopt(), <code>--</code> by itself will terminate flags
processing. So in <code>foo -f1 1 -- -f2 2</code>, <code>f1</code> is
considered a flag, but <code>-f2</code> is not.</p>
<p>If a flag is specified more than once, only the last specification
is used; the others are ignored.</p>
<p>Note that flags do not have single-letter synonyms, like they do in
the getopt library, nor do we allow "combining" flags behind a
single dash, as in <code>ls -la</code>.</p>
<h2> <A name=default>Changing the Default Flag Value</A> </h2>
<p>Sometimes a flag is defined in a library, and you want to change
its default value in one application but not others. It's simple to
do this: just assign a new value to the flag in <code>main()</code>,
before calling <code>ParseCommandLineFlags()</code>:</p>
<pre>
DECLARE_bool(lib_verbose); // mylib has a lib_verbose flag, default is false
int main(int argc, char** argv) {
FLAGS_lib_verbose = true; // in my app, I want a verbose lib by default
ParseCommandLineFlags(...);
}
</pre>
<p>For this application, users can still set the flag value on the
commandline, but if they do not, the flag's value will default to
true.</p>
<h2> <A name="special">Special Flags</a> </h2>
<p>There are a few flags defined by the commandlineflags module
itself, and are available to all applications that use
commandlineflags. These fall into
three categories. First are the 'reporting' flags that, when found, cause
the application to print some information about itself and exit.</p>
<table><tr valign=top>
<td><code>--help</code></td>
<td>shows all flags from all files, sorted by file and then by name;
shows the flagname, its default value, and its help string</td>
</tr><tr valign=top>
<td><code>--helpfull</code></td>
<td>same as -help, but unambiguously asks for all flags
(in case -help changes in the future)</td>
</tr><tr valign=top>
<td><code>--helpshort</code></td>
<td>shows only flags for the file with the same name as the executable
(usually the one containing <code>main()</code>)</td>
</tr><tr valign=top>
<td><code>--helpxml</code></td>
<td>like --help, but output is in xml for easier parsing</td>
</tr><tr valign=top>
<td><code>--helpon=FILE &nbsp;</code></td>
<td>shows only flags defined in FILE.*</td>
</tr><tr valign=top>
<td><code>--helpmatch=S</code></td>
<td>shows only flags defined in *S*.*</td>
</tr><tr valign=top>
<td><code>--helppackage</code></td>
<td>shows flags defined in files in same directory as <code>main()</code></td>
</tr><tr valign=top>
<td><code>--version</code></td>
<td>prints version info for the executable</td>
</tr></table>
<p>Second are the flags that affect how other flags are parsed.</p>
<table><tr valign=top>
<td><code>--undefok=flagname,flagname,...</code></td>
<td>for those names listed as the argument to <code>--undefok</code>,
suppress the normal error-exit that occurs when
<code>--name</code> is seen on the commandline, but
<code>name</code> has not been DEFINED anywhere in the
application
</table>
<p>Third are the 'recursive' flags, that cause other flag values to be
set: <code>--fromenv</code>, <code>--tryfromenv</code>,
<code>--flagfile</code>. These are described below in more
detail.</p>
<h3> <code>--fromenv</code> </h3>
<p><code>--fromenv=foo,bar</code> says to read the values for the
<code>foo</code> and <code>bar</code> flags from the environment.
In concert with this flag, you must actually set the values in the
environment, via a line like one of the two below:</p>
<pre>
export FLAGS_foo=xxx; export FLAGS_bar=yyy # sh
setenv FLAGS_foo xxx; setenv FLAGS_bar yyy # tcsh
</pre>
<p>This is equivalent to specifying <code>--foo=xxx</code>,
<code>--bar=yyy</code> on the commandline.</p>
<p>Note it is a fatal error to say <code>--fromenv=foo</code> if
<code>foo</code> is not DEFINED somewhere in the application. (Though
you can suppress this error via <code>--undefok=foo</code>, just like
for any other flag.)</p>
<p>It is also a fatal error to say <code>--fromenv=foo</code> if
<code>FLAGS_foo</code> is not actually defined in the environment.</p>
<h3> <code>--tryfromenv</code> </h3>
<p><code>--tryfromenv</code> is exactly like <code>--fromenv</code>,
except it is <b>not</b> a fatal error to say
<code>--tryfromenv=foo</code> if <code>FLAGS_foo</code> is not
actually defined in the environment. Instead, in such cases,
<code>FLAGS_foo</code> just keeps its default value as specified in
the application.</p>
<p>Note it is still an error to say <code>--tryfromenv=foo</code> if
<code>foo</code> is not DEFINED somewhere in the application.</p>
<h3> <code>--flagfile</code> </h3>
<p><code>--flagfile=f</code> tells the commandlineflags module to read
the file <code>f</code>, and to run all the flag-assignments found in
that file as if these flags had been specified on the commandline.</p>
<p>In its simplest form, <code>f</code> should just be a list of flag
assignments, one per line. Unlike on the commandline, the equals sign
separating a flagname from its argument is <i>required</i> for
flagfiles. An example flagfile, <code>/tmp/myflags</code>:</p>
<pre>
--nobig_menus
--languages=english,french
</pre>
<p>With this flagfile, the following two lines are equivalent:</p>
<pre>
./myapp --foo --nobig_menus --languages=english,french --bar
./myapp --foo --flagfile=/tmp/myflags --bar
</pre>
<p>Note that many errors are silently suppressed in flagfiles. In
particular, unrecognized flagnames are silently ignored, as are flags
that are missing a required value (e.g., a flagfile that just says
<code>--languages</code>).</p>
<p>The general format of a flagfile is a bit more complicated than the
simple, common case above. It is: a sequence of filenames, one per
line, followed by a sequence of flags, one per line, repeated as many
times as desired. Filenames in a flagfile can use wildcards
(<code>*</code> and <code>?</code>), and the sequence of flags located
after a sequence of filenames is processed only if the current
executable's name matches one of the filenames. It is possible to
start the flagfile with a sequence of flags instead of a sequence of
filenames; if such a sequence of flags is present, these flags are
applied to the current executable no matter what it is.</p>
<p>Lines that start with a <code>#</code> are ignored as comments.
Leading whitespace is also ignored in flagfiles, as are blank
lines.</p>
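A small illustrative flagfile following the pattern just described (the filename patterns are hypothetical):

```
# Flags placed before any filename pattern apply to every executable.
--languages=english,french

# Filename patterns (wildcards allowed), one per line; the flags that
# follow apply only when the current executable matches one of them.
/usr/local/bin/myapp
./myapp*
--nobig_menu
```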
<p>It is possible for a flagfile to use the <code>--flagfile</code>
flag to include another flagfile.</p>
<p>Flags are always processed in the expected order. That is,
processing begins by examining the flags specified directly on the
command line. If a flagfile is specified, its contents are processed,
and then processing continues with remaining flags from the command
line.</p>
<h2> <A name="api">The API</a> </h2>
<p>In addition to accessing <code>FLAGS_foo</code> directly, it is
possible to access the flags programmatically, through an API. It is
also possible to access information about a flag, such as its default
value and help-string. A <code>FlagSaver</code> makes it easy to
modify flags and then automatically undo the modifications later.
Finally, there are somewhat unrelated, but useful, routines to easily
access parts of <code>argv</code> outside main, including the program
name (<code>argv[0]</code>).</p>
<p>For more information about these routines, and other useful helper
methods such as <code>gflags::SetUsageMessage()</code> and
<code>gflags::SetVersionString</code>, see <code>gflags.h</code>.</p>
<h2> <A name="misc">Miscellaneous Notes</A> </h2>
<p>If your application has code like this:</p>
<pre>
#define STRIP_FLAG_HELP 1 // this must go before the #include!
#include &lt;gflags/gflags.h&gt;
</pre>
<p>we will remove the help messages from the compiled source. This can
reduce the size of the resulting binary somewhat, and may also be
useful for security reasons.</p>
<h2> <A name="issues">Issues and Feature Requests</A> </h2>
<p>Please report any issues or ideas for additional features on <A href="https://github.com/gflags/gflags/issues">GitHub</A>.
We would also like to encourage <A href="https://github.com/gflags/gflags/pulls">pull requests</A> for bug fixes and implementations of new features.</p>
<hr>
<address>
Craig Silverstein, Andreas Schuh<br>
<script type=text/javascript>
var lm = new Date(document.lastModified);
document.write(lm.toDateString());
</script>
</address>
</body>
</html>

View File

@@ -15,7 +15,7 @@ Possible use cases of the tool:
* Use the Validation Application as another sample: although the code is much more complex than in the classification and object
detection samples, the source code is open and can be re-used.
> **NOTE**: By default, Inference Engine samples and demos expect input with BGR channels order. If you trained your model to work with RGB order, you need to manually rearrange the default channels order in the sample or demo application or reconvert your model using the Model Optimizer tool with `--reverse_input_channels` argument specified. For more information about the argument, refer to **When to Specify Input Shapes** section of [Converting a Model Using General Conversion Parameters](./docs/MO_DG/prepare_model/convert_model/Converting_Model_General.md).
> **NOTE**: By default, Inference Engine samples and demos expect input with BGR channels order. If you trained your model to work with RGB order, you need to manually rearrange the default channels order in the sample or demo application or reconvert your model using the Model Optimizer tool with `--reverse_input_channels` argument specified. For more information about the argument, refer to **When to Reverse Input Channels** section of [Converting a Model Using General Conversion Parameters](./docs/MO_DG/prepare_model/convert_model/Converting_Model_General.md).
## Validation Application Options
@@ -59,7 +59,7 @@ The tool options are divided into two categories:
## General Workflow
> **NOTE**: By default, Inference Engine samples expect input images to have BGR channels order. If you trained you model to work with images in RGB order, you need to manually rearrange the default channels order in the sample application or reconvert your model using the Model Optimizer tool with `--reverse_input_channels` argument specified. For more information about the argument, refer to [When to Specify Input Shapes](./docs/MO_DG/prepare_model/convert_model/Converting_Model_General.md#when_to_reverse_input_channels).
> **NOTE**: By default, Inference Engine samples expect input images to have BGR channels order. If you trained you model to work with images in RGB order, you need to manually rearrange the default channels order in the sample application or reconvert your model using the Model Optimizer tool with `--reverse_input_channels` argument specified. For more information about the argument, refer to [When to Reverse Input Channels](./docs/MO_DG/prepare_model/convert_model/Converting_Model_General.md#when_to_reverse_input_channels).
When executed, the Validation Application performs the following steps:
@@ -157,7 +157,7 @@ The correct way to use such dataset is to specify the path as `-i <path>/dataset
### Dataset Format for Object Detection (VOC-like)
Object Detection SSD models can be inferred on the original dataset that was used as a testing dataset during the model training.
To prepare the VOC dataset, follow the steps below :
To prepare the VOC dataset, follow the steps below:
1. Download the pre-trained SSD-300 model from the SSD GitHub* repository at
[https://github.com/weiliu89/caffe/tree/ssd](https://github.com/weiliu89/caffe/tree/ssd).
@@ -167,7 +167,7 @@ To prepare the VOC dataset, follow the steps below :
$wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtest_06-Nov-2007.tar
tar -xvf VOCtest_06-Nov-2007.tar
```
3. Convert the model with the [Model Optimizer](docs/MO_DG/prepare_model/convert_model/Convert_Model_From_Caffe.md).
3. Convert the model with the [Model Optimizer](./docs/MO_DG/prepare_model/convert_model/Convert_Model_From_Caffe.md).
4. Create a proper `.txt` class file from the original `labelmap_voc.prototxt`. The new file must be in
the following format:

@@ -23,7 +23,7 @@ int getLoadModeForChannels(int channels, int base) {
case 3:
return base | IMREAD_COLOR;
}
return base | IMREAD_UNCHANGED;
return IMREAD_UNCHANGED;
}
template <class T>

@@ -360,6 +360,12 @@ int main(int argc, char *argv[]) {
showUsage();
return ex.list().begin()->exitCode();
}
} catch (const std::exception& ex) {
slog::err << ex.what() << slog::endl;
return 1;
} catch (...) {
slog::err << "Unknown/internal exception happened." << slog::endl;
return 1;
}
return 0;
}

@@ -18,6 +18,10 @@ if(ENABLE_CLDNN)
add_subdirectory(cldnn_engine)
endif()
if(ENABLE_VPU)
add_subdirectory(vpu)
endif()
if (ENABLE_GNA)
add_subdirectory(gna_plugin)
endif()

@@ -222,12 +222,16 @@ void CLDNNGraph::Config::LoadFromMap(const std::map<std::string, std::string>& c
} else if (key.compare(CLDNNConfigParams::KEY_CLDNN_GRAPH_DUMPS_DIR) == 0) {
if (!val.empty()) {
graph_dumps_dir = val;
mkdir(graph_dumps_dir.c_str(), 0755);
if (mkdir(graph_dumps_dir.c_str(), 0755) != 0) {
THROW_IE_EXCEPTION << "Couldn't create clDNN graph dump directory!";
}
}
} else if (key.compare(CLDNNConfigParams::KEY_CLDNN_SOURCES_DUMPS_DIR) == 0) {
if (!val.empty()) {
sources_dumps_dir = val;
mkdir(sources_dumps_dir.c_str(), 0755);
if (mkdir(sources_dumps_dir.c_str(), 0755) != 0) {
THROW_IE_EXCEPTION << "Couldn't create clDNN source dump directory!";
}
}
} else if (key.compare(PluginConfigParams::KEY_EXCLUSIVE_ASYNC_REQUESTS) == 0) {
if (val.compare(PluginConfigParams::YES) == 0) {
@@ -310,7 +314,7 @@ CLDNNGraph::CLDNNGraph(InferenceEngine::ICNNNetwork& network, const Config& conf
}
bool res = !NetPass::CombineRNNSeq(network) ? NetPass::UnrollTI(network) : true;
res &= NetPass::UnrollRNN_if(network, [] (RNNCellBase rnn) -> bool {
res &= NetPass::UnrollRNN_if(network, [] (const RNNCellBase& rnn) -> bool {
if (rnn.clip != 0.0f)
return true;
if (rnn.type == "GRUCell" ||
@@ -386,6 +390,15 @@ CLDNNGraph::CLDNNGraph(InferenceEngine::ICNNNetwork& network, const Config& conf
m_env.debugOptions.ClearTimedEvents();
}
template<typename LayerTypePtr>
LayerTypePtr as(const CNNLayerPtr& in_ptr) {
auto result_ptr = dynamic_cast<LayerTypePtr> (in_ptr.get());
if (nullptr == result_ptr) {
THROW_IE_EXCEPTION << "CNNLayerPtr is not suitable for casting to requested layer type";
}
return result_ptr;
}
inline std::string layer_type_name_ID(InferenceEngine::CNNLayer* layer) {
return layer->type + ":" + layer->name;
}
@@ -683,7 +696,7 @@ cldnn::concatenation::concatenation_axis CLDNNGraph::ConcatAxisFromIEAxis(unsign
void CLDNNGraph::CreatePrimitiveFromBlob(cldnn::primitive_id primID,
const InferenceEngine::Blob::Ptr pBlob,
cldnn::layout blobLayout,
const cldnn::layout& blobLayout,
size_t blobByteOffset,
WeightRearrangeType rearrange) {
auto mem = cldnn::memory::allocate(*(m_env.engine), blobLayout);
@@ -765,7 +778,7 @@ void CLDNNGraph::CreateWeightAndBiasPrimitives(const InferenceEngine::CNNLayerPt
switch (LayerTypeFromStr(layer->type)) {
case Convolution: {
auto convLayer = dynamic_cast<InferenceEngine::ConvolutionLayer *> (layer.get());
auto convLayer = as<InferenceEngine::ConvolutionLayer *> (layer);
if ((inFeatures % groupSize) || (convLayer->_out_depth % groupSize)) {
THROW_CLDNN_EXCEPTION("Invalid group size in layer " << convLayer->name);
}
@@ -784,7 +797,7 @@ void CLDNNGraph::CreateWeightAndBiasPrimitives(const InferenceEngine::CNNLayerPt
}
break;
case Deconvolution: {
auto deconvLayer = dynamic_cast<InferenceEngine::DeconvolutionLayer *> (layer.get());
auto deconvLayer = as<InferenceEngine::DeconvolutionLayer *> (layer);
if ((inFeatures % groupSize) || (deconvLayer->_out_depth % groupSize)) {
THROW_CLDNN_EXCEPTION("Invalid group size in layer " << deconvLayer->name);
}
@@ -1044,7 +1057,7 @@ void CLDNNGraph::CreateSingleLayerPrimitive(InferenceEngine::CNNLayerPtr &layer)
void CLDNNGraph::CreateScaleShiftPrimitive(InferenceEngine::CNNLayerPtr &layer) {
ValidateLayer(layer, 1);
auto inputPrimitives = GetPrevLayersPrimitives(layer);
auto scaleShiftLayer = dynamic_cast<InferenceEngine::ScaleShiftLayer*> (layer.get());
auto scaleShiftLayer = as<InferenceEngine::ScaleShiftLayer*> (layer);
// create scales and biases
cldnn::primitive_id scalePrimID = scaleShiftLayer->name + m_scalesTag;
@@ -1085,7 +1098,7 @@ void CLDNNGraph::CreateScaleShiftPrimitive(InferenceEngine::CNNLayerPtr &layer)
void CLDNNGraph::CreateProposalPrimitive(InferenceEngine::CNNLayerPtr & layer) {
ValidateLayer(layer, 3);
auto proposalLayer = dynamic_cast<InferenceEngine::GenericLayer*> (layer.get());
auto proposalLayer = as<InferenceEngine::GenericLayer*> (layer);
float nms_thresh = proposalLayer->GetParamAsFloat("nms_thresh", 0.7f);
int min_size = proposalLayer->GetParamAsInt("min_size", 16);
@@ -1157,7 +1170,7 @@ void CLDNNGraph::CreateProposalPrimitive(InferenceEngine::CNNLayerPtr & layer) {
void CLDNNGraph::CreatePReLUPrimitive(InferenceEngine::CNNLayerPtr &layer) {
ValidateLayer(layer, 1);
auto inputPrimitives = GetPrevLayersPrimitives(layer);
auto preluLayer = dynamic_cast<InferenceEngine::GenericLayer*> (layer.get());
auto preluLayer = as<InferenceEngine::GenericLayer*> (layer);
std::string preluLayerName = layer_type_name_ID(layer);
auto inDataPtr = preluLayer->insData[0].lock();
@@ -1207,7 +1220,7 @@ void CLDNNGraph::CreateBatchNormalizationPrimitive(InferenceEngine::CNNLayerPtr
auto inputPrimitives = GetPrevLayersPrimitives(layer);
std::string bnLayerName = layer_type_name_ID(layer);
auto bnLayer = dynamic_cast<InferenceEngine::BatchNormalizationLayer *> (layer.get());
auto bnLayer = as<InferenceEngine::BatchNormalizationLayer *> (layer);
cldnn::primitive_id weightID = bnLayerName + "_" + m_scalesTag;
cldnn::primitive_id biasID = bnLayerName + "_" + m_biasesTag;
@@ -1222,8 +1235,7 @@ void CLDNNGraph::CreateBatchNormalizationPrimitive(InferenceEngine::CNNLayerPtr
m_topology->add(scalePrim);
m_env.profilingIDs.push_back(bnLayerName);
return;
#endif // _SCALE_BN_OPT
#else
cldnn::tensor blobTensor(0);
switch (bnLayer->outData[0]->dims.size()) {
case 2:
@@ -1258,12 +1270,13 @@ void CLDNNGraph::CreateBatchNormalizationPrimitive(InferenceEngine::CNNLayerPtr
m_env.primitiveIDs[bnLayerName] = bnLayerName;
m_topology->add(bnPrim);
m_env.profilingIDs.push_back(bnLayerName);
#endif // _SCALE_BN_OPT
}
void CLDNNGraph::CreateFlattenPrimitive(InferenceEngine::CNNLayerPtr &layer) {
ValidateLayer(layer, 1);
auto inputPrimitives = GetPrevLayersPrimitives(layer);
auto flattenLayer = dynamic_cast<InferenceEngine::GenericLayer*> (layer.get());
auto flattenLayer = as<InferenceEngine::GenericLayer*> (layer);
std::string flattenLayerName = layer_type_name_ID(layer);
auto flattenPrim = cldnn::reshape(
@@ -1279,7 +1292,7 @@ void CLDNNGraph::CreateFlattenPrimitive(InferenceEngine::CNNLayerPtr &layer) {
void CLDNNGraph::CreatePermutePrimitive(InferenceEngine::CNNLayerPtr &layer) {
ValidateLayer(layer, 1);
auto inputPrimitives = GetPrevLayersPrimitives(layer);
auto permuteLayer = dynamic_cast<InferenceEngine::GenericLayer*> (layer.get());
auto permuteLayer = as<InferenceEngine::GenericLayer*> (layer);
std::vector<uint16_t> ie_order;
for (auto& a : permuteLayer->GetParamAsInts("order"))
ie_order.push_back(static_cast<uint16_t>(a));
@@ -1320,7 +1333,7 @@ void CLDNNGraph::CreatePermutePrimitive(InferenceEngine::CNNLayerPtr &layer) {
void CLDNNGraph::CreateReshapePrimitive(InferenceEngine::CNNLayerPtr &layer) {
ValidateLayer(layer, 1);
auto inputPrimitives = GetPrevLayersPrimitives(layer);
auto reshapeLayer = dynamic_cast<InferenceEngine::ReshapeLayer*> (layer.get());
auto reshapeLayer = as<InferenceEngine::ReshapeLayer*> (layer);
IE_ASSERT(reshapeLayer->outData.size());
std::string reshapeLayerName = layer_type_name_ID(layer);
@@ -1337,7 +1350,7 @@ void CLDNNGraph::CreateReshapePrimitive(InferenceEngine::CNNLayerPtr &layer) {
void CLDNNGraph::CreateNormalizePrimitive(InferenceEngine::CNNLayerPtr &layer) {
ValidateLayer(layer, 1);
auto inputPrimitives = GetPrevLayersPrimitives(layer);
auto normLayer = dynamic_cast<InferenceEngine::GenericLayer*> (layer.get());
auto normLayer = as<InferenceEngine::GenericLayer*> (layer);
ValidateGenericLayerBlobs(normLayer, { "weights" });
CreateGenericLayerBlobPrimitives(normLayer);
@@ -1365,7 +1378,7 @@ void CLDNNGraph::CreateNormalizePrimitive(InferenceEngine::CNNLayerPtr &layer) {
void CLDNNGraph::CreateDetectionOutputPrimitive(InferenceEngine::CNNLayerPtr &layer) {
ValidateLayer(layer, 3);
auto detectionLayer = dynamic_cast<InferenceEngine::GenericLayer*> (layer.get());
auto detectionLayer = as<InferenceEngine::GenericLayer*> (layer);
uint32_t num_classes = detectionLayer->GetParamAsUInt("num_classes", 1);
bool share_location = detectionLayer->GetParamsAsBool("share_location", true);
@@ -1421,7 +1434,7 @@ void CLDNNGraph::CreateDetectionOutputPrimitive(InferenceEngine::CNNLayerPtr &la
void CLDNNGraph::CreatePriorBoxPrimitive(InferenceEngine::CNNLayerPtr &layer) {
ValidateLayer(layer, 2);
auto priorBoxLayer = dynamic_cast<InferenceEngine::GenericLayer*> (layer.get());
auto priorBoxLayer = as<InferenceEngine::GenericLayer*> (layer);
// params
std::vector<float> min_size = priorBoxLayer->GetParamAsFloats("min_size");
@@ -1491,7 +1504,7 @@ void CLDNNGraph::CreatePriorBoxPrimitive(InferenceEngine::CNNLayerPtr &layer) {
void CLDNNGraph::CreateDeconvolutionPrimitive(InferenceEngine::CNNLayerPtr &layer) {
ValidateLayer(layer, 1);
auto inputPrimitives = GetPrevLayersPrimitives(layer);
auto deconvLayer = dynamic_cast<InferenceEngine::DeconvolutionLayer *> (layer.get());
auto deconvLayer = as<InferenceEngine::DeconvolutionLayer *> (layer);
if (deconvLayer->_dilation[X_AXIS] != 1 || deconvLayer->_dilation[Y_AXIS] != 1) {
THROW_CLDNN_EXCEPTION("Unsupported dilation in deconvolution " << layer->name);
@@ -1544,7 +1557,7 @@ void CLDNNGraph::CreateCropPrimitive(InferenceEngine::CNNLayerPtr &layer) {
THROW_CLDNN_EXCEPTION("Unsupported fuse in layer: " << layer->name << " with: " << layer->_fusedWith->name);
}
auto inputPrimitives = GetPrevLayersPrimitives(layer);
auto cropLayer = dynamic_cast<InferenceEngine::CropLayer*> (layer.get());
auto cropLayer = as<InferenceEngine::CropLayer*> (layer);
IE_ASSERT(cropLayer->axis.size() == cropLayer->offset.size());
// IE_ASSERT(cropLayer->outData[0] && cropLayer->outData[0]->dims.size() == 4);
@@ -1582,7 +1595,7 @@ void CLDNNGraph::CreateCropPrimitive(InferenceEngine::CNNLayerPtr &layer) {
void CLDNNGraph::CreateROIPoolingPrimitive(InferenceEngine::CNNLayerPtr &layer) {
ValidateLayer(layer, 2);
auto roiPoolingLayer = dynamic_cast<InferenceEngine::GenericLayer*> (layer.get());
auto roiPoolingLayer = as<InferenceEngine::GenericLayer*> (layer);
// params
int pooled_width = roiPoolingLayer->GetParamAsInt("pooled_w", 0);
@@ -1613,7 +1626,7 @@ void CLDNNGraph::CreateROIPoolingPrimitive(InferenceEngine::CNNLayerPtr &layer)
void CLDNNGraph::CreatePSROIPoolingPrimitive(InferenceEngine::CNNLayerPtr &layer) {
ValidateLayer(layer, 2);
auto psROIPoolingLayer = dynamic_cast<InferenceEngine::GenericLayer*> (layer.get());
auto psROIPoolingLayer = as<InferenceEngine::GenericLayer*> (layer);
// params
int group_size = psROIPoolingLayer->GetParamAsInt("group_size");
@@ -1650,7 +1663,7 @@ void CLDNNGraph::CreatePSROIPoolingPrimitive(InferenceEngine::CNNLayerPtr &layer
void CLDNNGraph::CreateCustomLayerPrimitive(InferenceEngine::CNNLayerPtr & layer, CLDNNCustomLayerPtr customLayer) {
ValidateLayer(layer, 0);
// todo: handling fusing
auto genericLayer = dynamic_cast<InferenceEngine::GenericLayer*> (layer.get());
auto genericLayer = as<InferenceEngine::GenericLayer*> (layer);
auto inputPrimitives = GetPrevLayersPrimitives(layer);
// Handle defines
@@ -1678,10 +1691,10 @@ void CLDNNGraph::CreateCustomLayerPrimitive(InferenceEngine::CNNLayerPtr & layer
if (blob.second->dims().size() != 1) {
THROW_CLDNN_EXCEPTION("Invalid dimensions for blob " << blob.first << " in layer " << genericLayer->name);
}
CreatePrimitiveFromBlob(blobId, blob.second, cldnn::layout(
DataTypeFromPrecision(blob.second->precision()),
m_defaultFormat,
cldnn::tensor(1, 1, TensorValue(blob.second->dims()[0]), 1)));
cldnn::layout genericBlobLayout(DataTypeFromPrecision(blob.second->precision()),
m_defaultFormat,
cldnn::tensor(1, 1, TensorValue(blob.second->dims()[0]), 1));
CreatePrimitiveFromBlob(blobId, blob.second, genericBlobLayout);
// save index in blobIndex
blobIndex[blob.first] = reorderedInputs.size();
// add to reorderedInputs
@@ -1838,7 +1851,7 @@ void CLDNNGraph::CreateSimplerNMSPrimitive(InferenceEngine::CNNLayerPtr &layer)
ValidateLayer(layer, 3);
IE_ASSERT(layer->insData[0].lock()->dims[3] == 1); // only handling input batch size 1
IE_ASSERT(layer->insData[1].lock()->dims[3] == 1); // only handling input batch size 1
auto simpleNMSLayer = dynamic_cast<InferenceEngine::GenericLayer*> (layer.get());
auto simpleNMSLayer = as<InferenceEngine::GenericLayer*> (layer);
int max_num_proposals = simpleNMSLayer->GetParamAsInt("max_num_proposals");
float iou_threshold = simpleNMSLayer->GetParamAsFloat("iou_threshold", 0.7f);
@@ -1872,7 +1885,7 @@ void CLDNNGraph::CreateSimplerNMSPrimitive(InferenceEngine::CNNLayerPtr &layer)
void CLDNNGraph::CreateEltwisePrimitive(InferenceEngine::CNNLayerPtr &layer) {
ValidateEltwiseLayer(layer);
auto eltwiseLayer = dynamic_cast<InferenceEngine::EltwiseLayer *> (layer.get());
auto eltwiseLayer = as<InferenceEngine::EltwiseLayer *> (layer);
auto inputPrimitives = GetPrevLayersPrimitives(layer);
std::vector<float> coefficients = eltwiseLayer->coeff;
@@ -1897,7 +1910,7 @@ void CLDNNGraph::CreateEltwisePrimitive(InferenceEngine::CNNLayerPtr &layer) {
void CLDNNGraph::CreateConcatenatePrimitive(InferenceEngine::CNNLayerPtr &layer) {
ValidateLayer(layer, 0);
auto concatLayer = dynamic_cast<InferenceEngine::ConcatLayer *> (layer.get());
auto concatLayer = as<InferenceEngine::ConcatLayer *> (layer);
auto inputPrimitives = GetPrevLayersPrimitives(layer);
std::string concatLayerName = layer_type_name_ID(layer);
auto concatPrim = cldnn::concatenation(
@@ -1911,7 +1924,7 @@ void CLDNNGraph::CreateConcatenatePrimitive(InferenceEngine::CNNLayerPtr &layer)
void CLDNNGraph::CreateSplitPrimitive(InferenceEngine::CNNLayerPtr &layer) {
ValidateLayer(layer, 1);
auto splitLayer = dynamic_cast<InferenceEngine::SplitLayer *> (layer.get());
auto splitLayer = as<InferenceEngine::SplitLayer *> (layer);
if (IsValidSplitConvMerge(splitLayer)) {
// AlextNet style split->conv*2->merge
CreateFusedSplitConvMergePrimitive(layer);
@@ -2014,16 +2027,15 @@ std::cout << "Splitting layer: " << layer->name << "\n\tSize:" << CldnnTensorFro
void CLDNNGraph::CreateFusedSplitConvMergePrimitive(InferenceEngine::CNNLayerPtr &layer) {
auto inputPrimitives = GetPrevLayersPrimitives(layer);
// only handle the split->conv->merge topology for now
auto splitLayer = dynamic_cast<InferenceEngine::SplitLayer *> (layer.get());
auto splitLayer = as<InferenceEngine::SplitLayer *> (layer);
IE_ASSERT(IsValidSplitConvMerge(splitLayer));
auto convLayer1 =
dynamic_cast<InferenceEngine::ConvolutionLayer *> (GetNextSingleLayer(splitLayer->outData[0]).get());
as<InferenceEngine::ConvolutionLayer *> (GetNextSingleLayer(splitLayer->outData[0]));
auto convLayer2 =
dynamic_cast<InferenceEngine::ConvolutionLayer *> (GetNextSingleLayer(splitLayer->outData[1]).get());
as<InferenceEngine::ConvolutionLayer *> (GetNextSingleLayer(splitLayer->outData[1]));
auto concatLayer =
dynamic_cast<InferenceEngine::ConcatLayer *> (GetNextSingleLayer(
GetNextSingleLayer(splitLayer->outData[0])).get());
as<InferenceEngine::ConcatLayer *> (GetNextSingleLayer(GetNextSingleLayer(splitLayer->outData[0])));
if (convLayer1 == nullptr ||
convLayer2 == nullptr ||
@@ -2078,7 +2090,7 @@ void CLDNNGraph::CreateFusedSplitConvMergePrimitive(InferenceEngine::CNNLayerPtr
void CLDNNGraph::CreatePowerPrimitive(InferenceEngine::CNNLayerPtr &layer) {
ValidateLayer(layer, 1);
auto inputPrimitives = GetPrevLayersPrimitives(layer);
auto powerLayer = dynamic_cast<InferenceEngine::PowerLayer *> (layer.get());
auto powerLayer = as<InferenceEngine::PowerLayer *> (layer);
if (powerLayer->power != 1.0f && powerLayer->power != 0.5f) {
THROW_CLDNN_EXCEPTION("Power Layer " << layer->name << "uses unsupported power value");
}
@@ -2130,7 +2142,7 @@ void CLDNNGraph::CreatePowerPrimitive(InferenceEngine::CNNLayerPtr &layer) {
void CLDNNGraph::CreateSoftMaxPrimitive(InferenceEngine::CNNLayerPtr &layer) {
ValidateLayer(layer, 1);
auto inputPrimitives = GetPrevLayersPrimitives(layer);
auto softmaxLayer = dynamic_cast<InferenceEngine::SoftMaxLayer *> (layer.get());
auto softmaxLayer = as<InferenceEngine::SoftMaxLayer *> (layer);
// additional WA for clDNN FullyConnected output in BX instead of BF
int inputOrder = 0;
@@ -2157,17 +2169,16 @@ void CLDNNGraph::CreateSoftMaxPrimitive(InferenceEngine::CNNLayerPtr &layer) {
void CLDNNGraph::CreateFullyConnectedPrimitive(InferenceEngine::CNNLayerPtr &layer) {
ValidateLayer(layer, 1);
auto inputPrimitives = GetPrevLayersPrimitives(layer);
auto fcLayer = dynamic_cast<InferenceEngine::FullyConnectedLayer *> (layer.get());
auto fcLayer = as<InferenceEngine::FullyConnectedLayer *> (layer);
std::string fcLayerName = layer_type_name_ID(layer);
// create bias primitive
cldnn::primitive_id biasesPrimID = "";
if (fcLayer->_biases != nullptr) {
biasesPrimID = fcLayerName + m_biasesTag;
CreatePrimitiveFromBlob(biasesPrimID,
fcLayer->_biases,
cldnn::layout(DataTypeFromPrecision(fcLayer->precision), m_defaultFormat,
cldnn::spatial(TensorValue(fcLayer->_out_num))));
cldnn::layout fcbLayout(DataTypeFromPrecision(fcLayer->precision), m_defaultFormat,
cldnn::spatial(TensorValue(fcLayer->_out_num)));
CreatePrimitiveFromBlob(biasesPrimID, fcLayer->_biases, fcbLayout);
}
// create weights primitive
@@ -2188,9 +2199,8 @@ void CLDNNGraph::CreateFullyConnectedPrimitive(InferenceEngine::CNNLayerPtr &lay
break;
default: THROW_CLDNN_EXCEPTION("Invalid data dimensions");
}
CreatePrimitiveFromBlob(weightsPrimID,
fcLayer->_weights,
cldnn::layout(DataTypeFromPrecision(fcLayer->precision), m_defaultFormat, weightsDims));
cldnn::layout fcwLayout(DataTypeFromPrecision(fcLayer->precision), m_defaultFormat, weightsDims);
CreatePrimitiveFromBlob(weightsPrimID, fcLayer->_weights, fcwLayout);
auto fcPrim = cldnn::fully_connected(fcLayerName,
inputPrimitives[0],
@@ -2207,7 +2217,7 @@ void CLDNNGraph::CreateFullyConnectedPrimitive(InferenceEngine::CNNLayerPtr &lay
void CLDNNGraph::CreatePoolingPrimitive(InferenceEngine::CNNLayerPtr &layer) {
ValidateLayer(layer, 1);
auto inputPrimitives = GetPrevLayersPrimitives(layer);
auto poolLayer = dynamic_cast<InferenceEngine::PoolingLayer *> (layer.get());
auto poolLayer = as<InferenceEngine::PoolingLayer *> (layer);
std::string poolLayerName = layer_type_name_ID(layer);
auto allPads = getPaddings(*poolLayer);
@@ -2293,7 +2303,7 @@ void CLDNNGraph::CreatePoolingPrimitive(InferenceEngine::CNNLayerPtr &layer) {
void CLDNNGraph::CreateLRNPrimitive(InferenceEngine::CNNLayerPtr &layer) {
ValidateLayer(layer, 1);
auto inputPrimitives = GetPrevLayersPrimitives(layer);
auto lrnLayer = dynamic_cast<InferenceEngine::NormLayer *> (layer.get());
auto lrnLayer = as<InferenceEngine::NormLayer *> (layer);
std::string lrnLayerName = layer_type_name_ID(layer);
auto lrnPrim = cldnn::lrn(
lrnLayerName,
@@ -2403,7 +2413,7 @@ void CLDNNGraph::CreateActivationPrimitive(InferenceEngine::CNNLayerPtr &layer,
void CLDNNGraph::CreateCopyPrimitive(InferenceEngine::CNNLayerPtr &layer) {
ValidateLayer(layer, 1);
auto inputPrimitives = GetPrevLayersPrimitives(layer);
auto copyLayer = dynamic_cast<InferenceEngine::GenericLayer*> (layer.get());
auto copyLayer = as<InferenceEngine::GenericLayer*> (layer);
// Optimize out and just update references
std::string layerName = layer_type_name_ID(layer);
@@ -2415,7 +2425,7 @@ void CLDNNGraph::CreateUpsamplingPrimitive(InferenceEngine::CNNLayerPtr &layer)
// Assuming multi-input will be handled by prev concat/eltwise layers
ValidateLayer(layer, 1);
auto inputPrimitives = GetPrevLayersPrimitives(layer);
auto upsamplingLayer = dynamic_cast<InferenceEngine::GenericLayer*> (layer.get());
auto upsamplingLayer = as<InferenceEngine::GenericLayer*> (layer);
uint32_t scale = upsamplingLayer->GetParamAsUInt("scale");
uint32_t numFilter = upsamplingLayer->GetParamAsUInt("num_filter");
std::string sampleType = upsamplingLayer->GetParamAsString("sample_type");
@@ -2436,7 +2446,7 @@ void CLDNNGraph::CreateUpsamplingPrimitive(InferenceEngine::CNNLayerPtr &layer)
void CLDNNGraph::CreateResamplePrimitive(InferenceEngine::CNNLayerPtr &layer) {
ValidateLayer(layer, 1);
auto inputPrimitives = GetPrevLayersPrimitives(layer);
auto resampleLayer = dynamic_cast<InferenceEngine::GenericLayer*> (layer.get());
auto resampleLayer = as<InferenceEngine::GenericLayer*> (layer);
auto outDims = layer->outData[0]->dims;
size_t inFeatures = 1;
@@ -2472,7 +2482,7 @@ void CLDNNGraph::CreateResamplePrimitive(InferenceEngine::CNNLayerPtr &layer) {
void CLDNNGraph::CreateYOLO2RegionPrimitive(InferenceEngine::CNNLayerPtr &layer) {
ValidateLayer(layer, 1);
auto inputPrimitives = GetPrevLayersPrimitives(layer);
auto YOLOregionLayer = dynamic_cast<InferenceEngine::GenericLayer*> (layer.get());
auto YOLOregionLayer = as<InferenceEngine::GenericLayer*> (layer);
uint32_t coords = YOLOregionLayer->GetParamAsUInt("coords", 4);
uint32_t classes = YOLOregionLayer->GetParamAsUInt("classes", 20);
@@ -2503,7 +2513,7 @@ void CLDNNGraph::CreateYOLO2RegionPrimitive(InferenceEngine::CNNLayerPtr &layer)
void CLDNNGraph::CreateYOLO2ReorgPrimitive(InferenceEngine::CNNLayerPtr &layer) {
ValidateLayer(layer, 1);
auto inputPrimitives = GetPrevLayersPrimitives(layer);
auto YOLOreorgLayer = dynamic_cast<InferenceEngine::GenericLayer*> (layer.get());
auto YOLOreorgLayer = as<InferenceEngine::GenericLayer*> (layer);
uint32_t stride = YOLOreorgLayer->GetParamAsUInt("stride");
std::string YOLOreorgLayerName = layer_type_name_ID(layer);
@@ -2520,7 +2530,7 @@ void CLDNNGraph::CreateYOLO2ReorgPrimitive(InferenceEngine::CNNLayerPtr &layer)
void CLDNNGraph::CreateArgMaxPrimitive(InferenceEngine::CNNLayerPtr &layer) {
ValidateLayer(layer, 1);
auto inputPrimitives = GetPrevLayersPrimitives(layer);
auto ArgMaxLayer = dynamic_cast<InferenceEngine::GenericLayer*> (layer.get());
auto ArgMaxLayer = as<InferenceEngine::GenericLayer*> (layer);
const cldnn::arg_max_min::out_type otype = cldnn::arg_max_min::out_type::max;
if (HasParam(ArgMaxLayer->params, "out_max_val")) {
@@ -2565,7 +2575,7 @@ void CLDNNGraph::CreateArgMaxPrimitive(InferenceEngine::CNNLayerPtr &layer) {
void CLDNNGraph::CreateMaxUnpoolingPrimitive(InferenceEngine::CNNLayerPtr &layer) {
ValidateLayer(layer, 2);
auto UnpoolingLayer = dynamic_cast<InferenceEngine::GenericLayer*> (layer.get());
auto UnpoolingLayer = as<InferenceEngine::GenericLayer*> (layer);
cldnn::primitive_id real_input, argmax_mutable;
@@ -2610,7 +2620,7 @@ void CLDNNGraph::CreateMaxUnpoolingPrimitive(InferenceEngine::CNNLayerPtr &layer
void CLDNNGraph::CreateMVNPrimitive(InferenceEngine::CNNLayerPtr &layer) {
ValidateLayer(layer, 1);
auto inputPrimitives = GetPrevLayersPrimitives(layer);
auto MvnLayer = dynamic_cast<InferenceEngine::GenericLayer*> (layer.get());
auto MvnLayer = as<InferenceEngine::GenericLayer*> (layer);
bool across_channels = MvnLayer->GetParamsAsBool("across_channels", false);
bool normalize_variance = MvnLayer->GetParamsAsBool("normalize_variance", true);
@@ -2632,7 +2642,7 @@ void CLDNNGraph::CreateMVNPrimitive(InferenceEngine::CNNLayerPtr &layer) {
void CLDNNGraph::CreateTilePrimitive(InferenceEngine::CNNLayerPtr &layer) {
ValidateLayer(layer, 1);
auto inputPrimitives = GetPrevLayersPrimitives(layer);
auto tileLayer = dynamic_cast<InferenceEngine::GenericLayer*> (layer.get());
auto tileLayer = as<InferenceEngine::GenericLayer*> (layer);
int axis = tileLayer->GetParamAsInt("axis", 1);
int tiles = tileLayer->GetParamAsInt("tiles");
@@ -2661,7 +2671,7 @@ void CLDNNGraph::CreateTilePrimitive(InferenceEngine::CNNLayerPtr &layer) {
void CLDNNGraph::CreatePadPrimitive(InferenceEngine::CNNLayerPtr &layer) {
ValidateLayer(layer, 1);
auto inputPrimitives = GetPrevLayersPrimitives(layer);
auto padLayer = dynamic_cast<InferenceEngine::GenericLayer*> (layer.get());
auto padLayer = as<InferenceEngine::GenericLayer*> (layer);
auto PadTensorFromArgs = [](const std::string &s) -> cldnn::tensor {
std::stringstream ss(s);
@@ -2731,7 +2741,7 @@ void CLDNNGraph::CreateLSTMCellPrimitive(InferenceEngine::CNNLayerPtr &layer) {
cldnn::primitive_id weightID = layerName + m_weightsTag;
cldnn::primitive_id recurrentID = layerName + "_recurrent" + m_weightsTag;
cldnn::primitive_id biasID = layerName + m_biasesTag;
auto cellLayer = dynamic_cast<InferenceEngine::LSTMCell*> (layer.get());
auto cellLayer = as<InferenceEngine::LSTMCell*> (layer);
/* check incoming CNN layer and setup required variables */
{
@@ -2779,7 +2789,7 @@ void CLDNNGraph::CreateLSTMCellPrimitive(InferenceEngine::CNNLayerPtr &layer) {
auto rmem = cldnn::memory::allocate(*(m_env.engine), RLayout);
auto rtmpPointer = rmem.pointer<char>();
auto wLayer = dynamic_cast<InferenceEngine::WeightableLayer *> (layer.get());
auto wLayer = as<InferenceEngine::WeightableLayer *> (layer);
auto pWeightsBlob = wLayer->_weights;
auto blobBytes = static_cast<const char *>(pWeightsBlob->buffer());
const size_t WchunkSz = lstm_input_size * elementSize;
@@ -2875,7 +2885,7 @@ void CLDNNGraph::CreateRNNPrimitive(InferenceEngine::CNNLayerPtr &layer) {
cldnn::primitive_id weightID = layerName + m_weightsTag;
cldnn::primitive_id recurrentID = layerName + "_recurrent" + m_weightsTag;
cldnn::primitive_id biasID = layerName + m_biasesTag;
auto rnnLayer = dynamic_cast<InferenceEngine::RNNSequenceLayer*> (layer.get());
auto rnnLayer = as<InferenceEngine::RNNSequenceLayer*> (layer);
bool permute_input = (1 != rnnLayer->axis);
/* check incoming CNN layer and setup required variables */
@@ -2938,7 +2948,7 @@ void CLDNNGraph::CreateRNNPrimitive(InferenceEngine::CNNLayerPtr &layer) {
auto rmem = cldnn::memory::allocate(*(m_env.engine), RLayout);
auto rtmpPointer = rmem.pointer<char>();
auto wLayer = dynamic_cast<InferenceEngine::WeightableLayer *> (layer.get());
auto wLayer = as<InferenceEngine::WeightableLayer *> (layer);
auto pWeightsBlob = wLayer->_weights;
auto blobBytes = static_cast<const char *>(pWeightsBlob->buffer());
const size_t WchunkSz = lstm_input_size * elementSize;
@@ -3107,7 +3117,7 @@ void CLDNNGraph::AddConstantBlobInput(InferenceEngine::CNNLayerPtr &layer) {
void CLDNNGraph::CreateConvolutionPrimitive(InferenceEngine::CNNLayerPtr &layer) {
ValidateLayer(layer, 1);
auto inputPrimitives = GetPrevLayersPrimitives(layer);
auto convLayer = dynamic_cast<InferenceEngine::ConvolutionLayer *> (layer.get());
auto convLayer = as<InferenceEngine::ConvolutionLayer *> (layer);
std::vector<cldnn::primitive_id> weightPrimID;
std::vector<cldnn::primitive_id> biasPrimID;
@@ -3156,7 +3166,7 @@ void CLDNNGraph::CreateGatherPrimitive(InferenceEngine::CNNLayerPtr &layer) {
ValidateLayer(layer, 2);
auto inputPrimitives = GetPrevLayersPrimitives(layer);
auto gatherLayer = dynamic_cast<InferenceEngine::GenericLayer*> (layer.get());
auto gatherLayer = as<InferenceEngine::GenericLayer*> (layer);
int axis = gatherLayer->GetParamAsInt("axis", 0);
@@ -3191,7 +3201,7 @@ void CLDNNGraph::CreateDepthToSpacePrimitive(InferenceEngine::CNNLayerPtr &layer
ValidateLayer(layer, 1);
auto inputPrimitives = GetPrevLayersPrimitives(layer);
auto depthToSpace = dynamic_cast<InferenceEngine::GenericLayer*> (layer.get());
auto depthToSpace = as<InferenceEngine::GenericLayer*> (layer);
size_t blockSize = depthToSpace->GetParamAsInt("block_size", 2);
@@ -3218,7 +3228,7 @@ void CLDNNGraph::CreateShuffleChannelsPrimitive(InferenceEngine::CNNLayerPtr &la
ValidateLayer(layer, 1);
auto inputPrimitives = GetPrevLayersPrimitives(layer);
auto shuffleChannels = dynamic_cast<InferenceEngine::GenericLayer*> (layer.get());
auto shuffleChannels = as<InferenceEngine::GenericLayer*> (layer);
const int32_t numberOfDims = shuffleChannels->input()->getDims().size();
int32_t group = shuffleChannels->GetParamAsInt("group", 1);
@@ -3252,7 +3262,7 @@ void CLDNNGraph::CreateShuffleChannelsPrimitive(InferenceEngine::CNNLayerPtr &la
void CLDNNGraph::CreateStridedSlicePrimitive(InferenceEngine::CNNLayerPtr &layer) {
auto inputPrimitives = GetPrevLayersPrimitives(layer);
auto stridedSliceLayer = dynamic_cast<InferenceEngine::GenericLayer*> (layer.get());
auto stridedSliceLayer = as<InferenceEngine::GenericLayer*> (layer);
auto tmp = stridedSliceLayer->GetParamAsUInts("end_mask");
std::vector<uint8_t> end_mask(tmp.begin(), tmp.end());
@@ -3278,7 +3288,7 @@ void CLDNNGraph::CreateReverseSequencePrimitive(InferenceEngine::CNNLayerPtr &la
ValidateLayer(layer, 2);
auto inputPrimitives = GetPrevLayersPrimitives(layer);
auto reverseSequence = dynamic_cast<InferenceEngine::GenericLayer*> (layer.get());
auto reverseSequence = as<InferenceEngine::GenericLayer*> (layer);
const int32_t numberOfDims = reverseSequence->input()->getDims().size();
const auto input = reverseSequence->insData[0].lock()->getDims();
@@ -3329,9 +3339,9 @@ bool CLDNNGraph::IsValidSplitConvMerge(const InferenceEngine::SplitLayer *splitL
}
auto convLayer1 =
dynamic_cast<InferenceEngine::ConvolutionLayer *> (GetNextSingleLayer(splitLayer->outData[0]).get());
as<InferenceEngine::ConvolutionLayer *> (GetNextSingleLayer(splitLayer->outData[0]));
auto convLayer2 =
dynamic_cast<InferenceEngine::ConvolutionLayer *> (GetNextSingleLayer(splitLayer->outData[1]).get());
as<InferenceEngine::ConvolutionLayer *> (GetNextSingleLayer(splitLayer->outData[1]));
if (!convLayer1 || !convLayer2) { // outputs aren't convolutions
return false;
}
@@ -3353,8 +3363,8 @@ bool CLDNNGraph::IsValidSplitConvMerge(const InferenceEngine::SplitLayer *splitL
return false;
}
auto concatLayer =
dynamic_cast<InferenceEngine::ConcatLayer *> (
GetNextSingleLayer(GetNextSingleLayer(splitLayer->outData[0])).get());
as<InferenceEngine::ConcatLayer *> (
GetNextSingleLayer(GetNextSingleLayer(splitLayer->outData[0])));
if (!concatLayer || // not a merge layer
concatLayer->_axis != 1 || // merge on unsupported axis
concatLayer->outData.size() != 1) { // too many outputs
@@ -3696,12 +3706,13 @@ void CLDNNGraph::CreateGenericLayerBlobPrimitives(const InferenceEngine::Generic
if (blob.second->dims().size() != 1) {
THROW_CLDNN_EXCEPTION("Unhandled blob dim in layer " + layer->name);
}
CreatePrimitiveFromBlob(
layer->type + ":" + layer->name + "_" + blob.first + m_weightsTag,
blob.second,
cldnn::layout(
DataTypeFromPrecision(blob.second->precision()),
m_defaultFormat, cldnn::spatial(TensorValue(blob.second->dims()[0]))));
cldnn::layout genericLayout(DataTypeFromPrecision(blob.second->precision()),
m_defaultFormat,
cldnn::spatial(TensorValue(blob.second->dims()[0])));
CreatePrimitiveFromBlob(layer->type + ":" + layer->name + "_" + blob.first + m_weightsTag,
blob.second, genericLayout);
}
}
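The hunks above repeatedly replace `dynamic_cast<T*>(layer.get())` with an `as<T*>(layer)` helper. A minimal sketch of what such a checked-cast helper might look like — the actual `as<>` in the Inference Engine sources may differ, and the `Base`/`Derived` types here are illustrative stand-ins:

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>

// Hypothetical checked-cast helper in the spirit of the as<> calls above:
// unlike a bare dynamic_cast, a failed cast throws a descriptive exception
// instead of returning nullptr for the caller to dereference.
template <typename To, typename From>
To as(const std::shared_ptr<From>& ptr) {
    auto casted = dynamic_cast<To>(ptr.get());
    if (casted == nullptr) {
        throw std::runtime_error("Cannot cast layer to requested type");
    }
    return casted;
}

// Minimal stand-ins for the layer hierarchy (illustrative only).
struct Base { virtual ~Base() = default; };
struct Derived : Base { int value = 42; };
```

The payoff is that a mismatched layer type surfaces as a diagnosable exception at the cast site rather than a null dereference later.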

View File

@@ -195,7 +195,7 @@ protected:
static cldnn::softmax::dimension_t SoftmaxDimensionFromIEAxis(const InferenceEngine::SoftMaxLayer* softmaxLayer, bool isPrevFC = false);
void CreatePrimitiveFromBlob(cldnn::primitive_id primID,
const InferenceEngine::Blob::Ptr pBlob,
cldnn::layout blobLayout,
const cldnn::layout& blobLayout,
size_t blobByteOffset = 0,
WeightRearrangeType rearrange = NO_REARRANGE);
void CreateWeightAndBiasPrimitives(const InferenceEngine::CNNLayerPtr& layer,


@@ -359,7 +359,7 @@ void CLDNNInferRequest::SetBatch(int new_batch) {
m_curBatch = new_batch;
}
CLDNNInferRequest::CLDNNInferRequest(InferenceEnv env, bool useProfiling,
CLDNNInferRequest::CLDNNInferRequest(const InferenceEnv& env, bool useProfiling,
InputsDataMap networkInputs, OutputsDataMap networkOutputs)
: InferRequestInternal(networkInputs, networkOutputs),
m_env(env),


@@ -27,7 +27,7 @@ public:
void
GetPerformanceCounts(std::map<std::string, InferenceEngine::InferenceEngineProfileInfo> &perfMap) const override;
CLDNNInferRequest(InferenceEnv env, bool useProfiling,
CLDNNInferRequest(const InferenceEnv& env, bool useProfiling,
InferenceEngine::InputsDataMap networkInputs, InferenceEngine::OutputsDataMap networkOutputs);
CLDNNInferRequest(const CLDNNInferRequest &) = delete;


@@ -16,8 +16,11 @@ if (NOT(IE_MAIN_SOURCE_DIR))
endif()
endif()
# treating warnings as errors
if (WIN32)
if (TREAT_WARNING_AS_ERROR)
set (CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} /WX") #treating warnings as errors
endif ()
if (${CMAKE_CXX_COMPILER_ID} STREQUAL MSVC)
set (CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} /wd4251 /wd4275 /wd4267") #disable some warnings
endif()


@@ -1,4 +1,4 @@
// Copyright (C) 2019 Intel Corporation
// Copyright (C) 2018-2019 Intel Corporation
// SPDX-License-Identifier: Apache-2.0
//


@@ -1,4 +1,4 @@
// Copyright (C) 2019 Intel Corporation
// Copyright (C) 2018-2019 Intel Corporation
// SPDX-License-Identifier: Apache-2.0
//


@@ -1,4 +1,4 @@
// Copyright (C) 2019 Intel Corporation
// Copyright (C) 2018-2019 Intel Corporation
// SPDX-License-Identifier: Apache-2.0
//


@@ -1,4 +1,4 @@
// Copyright (C) 2019 Intel Corporation
// Copyright (C) 2018-2019 Intel Corporation
// SPDX-License-Identifier: Apache-2.0
//


@@ -106,7 +106,7 @@ private:
void pad_symmetric(const float *src_data, float* dst_data);
PadMode padMode = CONSTANT;
float pad_value;
float pad_value = 0.f;
SizeVector src_dims;
SizeVector dst_dims;
std::vector<unsigned int> pads_begin;


@@ -1,4 +1,4 @@
// Copyright (C) 2019 Intel Corporation
// Copyright (C) 2018-2019 Intel Corporation
// SPDX-License-Identifier: Apache-2.0
//


@@ -385,93 +385,100 @@ public:
roi_indices_.resize(post_nms_topn_);
addConfig(layer, {DataConfigurator(ConfLayout::PLN), DataConfigurator(ConfLayout::PLN), DataConfigurator(ConfLayout::PLN)},
{DataConfigurator(ConfLayout::PLN)});
} catch (InferenceEngine::details::InferenceEngineException &ex) {
} catch (const InferenceEngine::details::InferenceEngineException &ex) {
errorMsg = ex.what();
}
}
StatusCode execute(std::vector<Blob::Ptr> &inputs, std::vector<Blob::Ptr> &outputs,
ResponseDesc *resp) noexcept override {
if (inputs.size() != 3 || outputs.empty()) {
try {
if (inputs.size() != 3 || outputs.empty()) {
THROW_IE_EXCEPTION << "Incorrect number of input or output edges!";
}
// Prepare memory
const float *p_bottom_item = inputs[0]->buffer();
const float *p_d_anchor_item = inputs[1]->buffer();
const float *p_img_info_cpu = inputs[2]->buffer();
float *p_roi_item = outputs[0]->buffer();
size_t img_info_size = inputs[2]->getTensorDesc().getDims()[1];
// No second output so ignoring this
// Dtype* p_score_item = (top.size() > 1) ? top[1]->mutable_cpu_data() : NULL;
// bottom shape: (2 x num_anchors) x H x W
const int bottom_H = inputs[0]->getTensorDesc().getDims()[2];
const int bottom_W = inputs[0]->getTensorDesc().getDims()[3];
// input image height & width
const float img_H = p_img_info_cpu[swap_xy ? 1 : 0];
const float img_W = p_img_info_cpu[swap_xy ? 0 : 1];
// scale factor for height & width
const float scale_H = p_img_info_cpu[2];
const float scale_W = img_info_size > 3 ? p_img_info_cpu[3] : scale_H;
// minimum box width & height
const float min_box_H = min_size_ * scale_H;
const float min_box_W = min_size_ * scale_W;
// number of all proposals = num_anchors * H * W
const int num_proposals = anchors_shape_0 * bottom_H * bottom_W;
// number of top-n proposals before NMS
const int pre_nms_topn = std::min<int>(num_proposals, pre_nms_topn_);
// number of final RoIs
int num_rois = 0;
// enumerate all proposals
// num_proposals = num_anchors * H * W
// (x1, y1, x2, y2, score) for each proposal
// NOTE: for bottom, only foreground scores are passed
struct ProposalBox {
float x0;
float y0;
float x1;
float y1;
float score;
};
std::vector<ProposalBox> proposals_(num_proposals);
std::vector<float> unpacked_boxes(4 * pre_nms_topn);
std::vector<int> is_dead(pre_nms_topn);
// Execute
int nn = inputs[0]->getTensorDesc().getDims()[0];
for (int n = 0; n < nn; ++n) {
enumerate_proposals_cpu(p_bottom_item + num_proposals + n * num_proposals * 2,
p_d_anchor_item + n * num_proposals * 4,
&anchors_[0], reinterpret_cast<float *>(&proposals_[0]),
anchors_shape_0, bottom_H, bottom_W, img_H, img_W,
min_box_H, min_box_W, feat_stride_,
box_coordinate_scale_, box_size_scale_,
coordinates_offset, initial_clip, swap_xy, clip_before_nms);
std::partial_sort(proposals_.begin(), proposals_.begin() + pre_nms_topn, proposals_.end(),
[](const ProposalBox &struct1, const ProposalBox &struct2) {
return (struct1.score > struct2.score);
});
unpack_boxes(reinterpret_cast<float *>(&proposals_[0]), &unpacked_boxes[0], pre_nms_topn);
nms_cpu(pre_nms_topn, &is_dead[0], &unpacked_boxes[0], &roi_indices_[0], &num_rois, 0, nms_thresh_,
post_nms_topn_, coordinates_offset);
retrieve_rois_cpu(num_rois, n, pre_nms_topn, &unpacked_boxes[0], &roi_indices_[0],
p_roi_item + n * post_nms_topn_ * 5,
post_nms_topn_, normalize_, img_H, img_W, clip_after_nms);
}
return OK;
} catch (const InferenceEngine::details::InferenceEngineException& e) {
if (resp) {
std::string errorMsg = "Incorrect number of input or output edges!";
std::string errorMsg = e.what();
errorMsg.copy(resp->msg, sizeof(resp->msg) - 1);
}
return GENERAL_ERROR;
}
// Prepare memory
const float* p_bottom_item = inputs[0]->buffer();
const float* p_d_anchor_item = inputs[1]->buffer();
const float* p_img_info_cpu = inputs[2]->buffer();
float* p_roi_item = outputs[0]->buffer();
size_t img_info_size = inputs[2]->getTensorDesc().getDims()[1];
// No second output so ignoring this
// Dtype* p_score_item = (top.size() > 1) ? top[1]->mutable_cpu_data() : NULL;
// bottom shape: (2 x num_anchors) x H x W
const int bottom_H = inputs[0]->getTensorDesc().getDims()[2];
const int bottom_W = inputs[0]->getTensorDesc().getDims()[3];
// input image height & width
const float img_H = p_img_info_cpu[swap_xy ? 1 : 0];
const float img_W = p_img_info_cpu[swap_xy ? 0 : 1];
// scale factor for height & width
const float scale_H = p_img_info_cpu[2];
const float scale_W = img_info_size > 3 ? p_img_info_cpu[3] : scale_H;
// minimum box width & height
const float min_box_H = min_size_ * scale_H;
const float min_box_W = min_size_ * scale_W;
// number of all proposals = num_anchors * H * W
const int num_proposals = anchors_shape_0 * bottom_H * bottom_W;
// number of top-n proposals before NMS
const int pre_nms_topn = std::min<int>(num_proposals, pre_nms_topn_);
// number of final RoIs
int num_rois = 0;
// enumerate all proposals
// num_proposals = num_anchors * H * W
// (x1, y1, x2, y2, score) for each proposal
// NOTE: for bottom, only foreground scores are passed
struct ProposalBox {
float x0;
float y0;
float x1;
float y1;
float score;
};
std::vector<ProposalBox> proposals_(num_proposals);
std::vector<float> unpacked_boxes(4 * pre_nms_topn);
std::vector<int> is_dead(pre_nms_topn);
// Execute
int nn = inputs[0]->getTensorDesc().getDims()[0];
for (int n = 0; n < nn; ++n) {
enumerate_proposals_cpu(p_bottom_item + num_proposals + n*num_proposals*2, p_d_anchor_item + n*num_proposals*4,
&anchors_[0], reinterpret_cast<float *>(&proposals_[0]),
anchors_shape_0, bottom_H, bottom_W, img_H, img_W,
min_box_H, min_box_W, feat_stride_,
box_coordinate_scale_, box_size_scale_,
coordinates_offset, initial_clip, swap_xy, clip_before_nms);
std::partial_sort(proposals_.begin(), proposals_.begin() + pre_nms_topn, proposals_.end(),
[](const ProposalBox& struct1, const ProposalBox& struct2) {
return (struct1.score > struct2.score);
});
unpack_boxes(reinterpret_cast<float *>(&proposals_[0]), &unpacked_boxes[0], pre_nms_topn);
nms_cpu(pre_nms_topn, &is_dead[0], &unpacked_boxes[0], &roi_indices_[0], &num_rois, 0, nms_thresh_, post_nms_topn_, coordinates_offset);
retrieve_rois_cpu(num_rois, n, pre_nms_topn, &unpacked_boxes[0], &roi_indices_[0], p_roi_item + n*post_nms_topn_*5,
post_nms_topn_, normalize_, img_H, img_W, clip_after_nms);
}
return OK;
}
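The pattern adopted in the `execute` hunk above — wrapping the whole body of a `noexcept` interface method in `try`/`catch`, copying the message into the fixed-size response buffer, and returning an error status — can be sketched in isolation. The `StatusCode` and `ResponseDesc` here are simplified stand-ins, not the actual Inference Engine declarations:

```cpp
#include <cassert>
#include <cstring>
#include <stdexcept>
#include <string>

enum StatusCode { OK = 0, GENERAL_ERROR = -1 };

// Stand-in for InferenceEngine::ResponseDesc: a fixed-size message buffer.
struct ResponseDesc { char msg[256] = {}; };

// A noexcept entry point must not let exceptions escape; instead it catches
// them, copies the message into resp->msg, and returns an error code.
StatusCode execute(int numInputs, ResponseDesc* resp) noexcept {
    try {
        if (numInputs != 3) {
            throw std::runtime_error("Incorrect number of input or output edges!");
        }
        return OK;
    } catch (const std::exception& e) {
        if (resp) {
            std::string errorMsg = e.what();
            // std::string::copy does not null-terminate; the zero-initialized
            // buffer and the size limit keep the message well-formed.
            errorMsg.copy(resp->msg, sizeof(resp->msg) - 1);
        }
        return GENERAL_ERROR;
    }
}
```

This keeps all validation inside one `try` block, so every `THROW_IE_EXCEPTION`-style failure path reaches the caller as a status code plus message instead of terminating the process.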
private:
@@ -507,16 +514,20 @@ public:
// set output shapes by input shapes.
StatusCode getShapes(const std::vector<TensorDesc>& inShapes, std::vector<TensorDesc>& outShapes,
ResponseDesc *resp) noexcept override {
if (inShapes.size() != 1) {
try {
if (inShapes.size() != 1) {
THROW_IE_EXCEPTION << "Incorrect input shapes!";
}
outShapes.clear();
outShapes.emplace_back(cnnLayer.precision, inShapes[0].getDims(), inShapes[0].getLayout());
return OK;
} catch (const InferenceEngine::details::InferenceEngineException& e) {
if (resp) {
std::string errorMsg = "Incorrect input shapes!";
std::string errorMsg = e.what();
errorMsg.copy(resp->msg, sizeof(resp->msg) - 1);
}
return GENERAL_ERROR;
}
outShapes.clear();
outShapes.emplace_back(cnnLayer.precision, inShapes[0].getDims(), inShapes[0].getLayout());
return OK;
}
};


@@ -1,4 +1,4 @@
// Copyright (C) 2019 Intel Corporation
// Copyright (C) 2018-2019 Intel Corporation
// SPDX-License-Identifier: Apache-2.0
//
@@ -422,16 +422,20 @@ public:
// set output shapes by input shapes.
StatusCode getShapes(const std::vector<TensorDesc>& inShapes, std::vector<TensorDesc>& outShapes,
ResponseDesc *resp) noexcept override {
if (inShapes.size() != 1) {
try {
if (inShapes.size() != 1) {
THROW_IE_EXCEPTION << "Incorrect input shapes!";
}
outShapes.clear();
outShapes.emplace_back(cnnLayer.precision, inShapes[0].getDims(), inShapes[0].getLayout());
return OK;
} catch (const InferenceEngine::details::InferenceEngineException& e) {
if (resp) {
std::string errorMsg = "Incorrect input shapes!";
std::string errorMsg = e.what();
errorMsg.copy(resp->msg, sizeof(resp->msg) - 1);
}
return GENERAL_ERROR;
}
outShapes.clear();
outShapes.emplace_back(cnnLayer.precision, inShapes[0].getDims(), inShapes[0].getLayout());
return OK;
}
};


@@ -1,4 +1,4 @@
// Copyright (C) 2019 Intel Corporation
// Copyright (C) 2018-2019 Intel Corporation
// SPDX-License-Identifier: Apache-2.0
//


@@ -1,4 +1,4 @@
// Copyright (C) 2019 Intel Corporation
// Copyright (C) 2018-2019 Intel Corporation
// SPDX-License-Identifier: Apache-2.0
//


@@ -1,4 +1,4 @@
// Copyright (C) 2019 Intel Corporation
// Copyright (C) 2018-2019 Intel Corporation
// SPDX-License-Identifier: Apache-2.0
//


@@ -1,4 +1,4 @@
// Copyright (C) 2019 Intel Corporation
// Copyright (C) 2018-2019 Intel Corporation
// SPDX-License-Identifier: Apache-2.0
//


@@ -1,4 +1,4 @@
// Copyright (C) 2019 Intel Corporation
// Copyright (C) 2018-2019 Intel Corporation
// SPDX-License-Identifier: Apache-2.0
//


@@ -1,4 +1,4 @@
// Copyright (C) 2019 Intel Corporation
// Copyright (C) 2018-2019 Intel Corporation
// SPDX-License-Identifier: Apache-2.0
//


@@ -1,4 +1,4 @@
// Copyright (C) 2019 Intel Corporation
// Copyright (C) 2018-2019 Intel Corporation
// SPDX-License-Identifier: Apache-2.0
//


@@ -1,4 +1,4 @@
// Copyright (C) 2019 Intel Corporation
// Copyright (C) 2018-2019 Intel Corporation
// SPDX-License-Identifier: Apache-2.0
//


@@ -1,4 +1,4 @@
// Copyright (C) 2019 Intel Corporation
// Copyright (C) 2018-2019 Intel Corporation
// SPDX-License-Identifier: Apache-2.0
//


@@ -1,8 +1,7 @@
// Copyright (C) 2018-2019 Intel Corporation
// SPDX-License-Identifier: Apache-2.0
//
// dnn.cpp : component based neural network class for ease of use
//
extern bool global_debug;
#include <cstdlib>


@@ -252,7 +252,9 @@ class AmIntelDnn {
ptr_sumgroup_sizes(NULL),
num_sumgroup_sizes(0),
ptr_priors(NULL),
ptr_dnn_memory_(NULL) {
ptr_dnn_memory_(NULL),
num_bytes_dnn_memory_(0),
number_type_(kDnnNumNumberType) {
}
~AmIntelDnn() {


@@ -1,8 +1,6 @@
// Copyright (C) 2018-2019 Intel Corporation
// SPDX-License-Identifier: Apache-2.0
//
// dnn_memory.cpp : memory manipulation routines
//
#include <cstdio>
#include <cstdlib>


@@ -1,7 +1,6 @@
// Copyright (C) 2018-2019 Intel Corporation
// SPDX-License-Identifier: Apache-2.0
//
// dnn_memory.hpp : memory manipulation routines
#pragma once


@@ -1,8 +1,6 @@
// Copyright (C) 2018-2019 Intel Corporation
// SPDX-License-Identifier: Apache-2.0
//
// dnn_traits.hpp : c++ trait approach to define dnn objects
//
#pragma once


@@ -1,8 +1,6 @@
// Copyright (C) 2018-2019 Intel Corporation
// SPDX-License-Identifier: Apache-2.0
//
// floatmath.cpp : unoptimized floating point math routines (for reference)
//
#include "floatmath.h"
#include "pwl.h"


@@ -43,6 +43,7 @@ class CPPWrapper<intel_nnet_type_t> {
for (int i = 0; i < obj.nLayers; i++) {
obj.pLayers[i].pLayerStruct = nullptr;
}
obj.nGroup = 0;
}
~CPPWrapper() {
for (int i = 0; i < obj.nLayers; i++) {

View File

@@ -115,6 +115,8 @@ void GNADeviceHelper::updateGnaPerfCounters() {
void GNADeviceHelper::getGnaPerfCounters(std::map<std::string, InferenceEngine::InferenceEngineProfileInfo>& retPerfCounters) {
InferenceEngine::InferenceEngineProfileInfo info;
info.status = InferenceEngine::InferenceEngineProfileInfo::EXECUTED;
info.cpu_uSec = 0;
info.execution_index = 0;
// Hardware
info.realTime_uSec = nGNAPerfResultsTotal.hw.total;


@@ -1,8 +1,6 @@
// Copyright (C) 2018-2019 Intel Corporation
// SPDX-License-Identifier: Apache-2.0
//
// gna_helper.cpp : various GNA-related utility functions
//
#include "lstm.hpp"


@@ -100,12 +100,16 @@ class LayerInfo {
bool isEltwiseSum() const noexcept {
IS_VALID();
if (!isEltwise()) return false;
return dynamic_cast<const InferenceEngine::EltwiseLayer*>(layer)->_operation ==
InferenceEngine::EltwiseLayer::Sum;
// dynamic_cast<const InferenceEngine::EltwiseLayer *>(layer) is validated in isEltwise function
// coverity[var_deref_op]
return dynamic_cast<const InferenceEngine::EltwiseLayer *>(layer)->_operation ==
InferenceEngine::EltwiseLayer::Sum;
}
bool isEltwiseMul() const noexcept {
IS_VALID();
if (!isEltwise()) return false;
// dynamic_cast<const InferenceEngine::EltwiseLayer *>(layer) is validated in isEltwise function
// coverity[var_deref_op]
return dynamic_cast<const InferenceEngine::EltwiseLayer*>(layer)->_operation ==
InferenceEngine::EltwiseLayer::Prod;
}
@@ -156,8 +160,13 @@ class LayerInfo {
}
bool isCropAffined() const noexcept {
auto cropLayer = dynamic_cast<InferenceEngine::CropLayer *> (layer);
size_t cropOffset = cropLayer->offset.back() * cropLayer->precision.size();
return (ALIGN64(cropOffset) != cropOffset);
if (cropLayer != nullptr && !cropLayer->offset.empty()) {
try {
size_t cropOffset = cropLayer->offset.back() * cropLayer->precision.size();
return (ALIGN64(cropOffset) != cropOffset);
} catch (InferenceEngine::details::InferenceEngineException& e) {}
}
return false;
}
bool isCopy() const noexcept {
IS_VALID();

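The hardened `isCropAffined` above decides whether a crop needs an affine fallback by testing whether its byte offset is 64-byte aligned (`ALIGN64(cropOffset) != cropOffset`). The macro arithmetic behind that check can be sketched as follows; `align_up` and `needs_affine` are illustrative names, not the actual macros:

```cpp
#include <cassert>
#include <cstddef>

// Round x up to the next multiple of `align` (align must be a power of two),
// mirroring what an ALIGN/ALIGN64-style macro computes.
constexpr size_t align_up(size_t x, size_t align) {
    return (x + align - 1) & ~(align - 1);
}

// A crop offset that is not already 64-byte aligned cannot be expressed as a
// plain crop on this hardware path, so it falls back to an affine layer.
constexpr bool needs_affine(size_t offset) {
    return align_up(offset, 64) != offset;
}
```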

@@ -36,7 +36,7 @@ struct MemRequest {
uint8_t _element_size;
size_t _num_elements;
size_t _alignment;
size_t _offset;
size_t _offset = 0;
// expansion in bytes due to large depended layers
size_t _padding = 0;
MemRequest(rRegion region,


@@ -106,19 +106,19 @@ class GNAModelSerial {
/**
* if scale factor is different than passed into infer, network might need to be requantized
*/
float scaleFactor;
float scaleFactor = 0;
/**
* Pointer descriptor
*/
void* descriptor_ptr;
void* descriptor_ptr = nullptr;
/**
* Endpoint resolution in bytes.
*/
uint32_t element_size;
uint32_t element_size = 0;
/**
* Number of elements
*/
uint32_t elements_count;
uint32_t elements_count = 0;
RuntimeEndPoint() = default;
RuntimeEndPoint(double scaleFactor,

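Several hunks in this commit (here and in `gna_plugin.hpp`, `mem_requests.hpp`, the Pad layer) add default member initializers such as `= 0` and `= nullptr` so that the defaulted constructor no longer leaves fields indeterminate. The effect can be reduced to a small sketch; `EndPoint` is an illustrative stand-in for `RuntimeEndPoint`:

```cpp
#include <cassert>
#include <cstdint>

// With default member initializers, even a `= default` constructor produces
// well-defined field values, so reading them before an explicit assignment
// is no longer undefined behavior.
struct EndPoint {
    float scaleFactor = 0;
    void* descriptor_ptr = nullptr;
    uint32_t element_size = 0;
    uint32_t elements_count = 0;
    EndPoint() = default;
};
```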

@@ -1275,7 +1275,12 @@ void GNAPlugin::PWLPrimitive(InferenceEngine::CNNLayerPtr layer) {
THROW_GNA_EXCEPTION << "Activation function type not yet supported: " << type;
}
auto activation_type = DnnActivation::fromType(it->second);
activation_type.negative_slope = (it->second == kActRelu) ? dynamic_cast<ReLULayer*>(layer.get())->negative_slope : 0.0f;
if (it->second == kActRelu) {
auto reluLayer = dynamic_cast<ReLULayer *>(layer.get());
activation_type.negative_slope = reluLayer != nullptr ? reluLayer->negative_slope : 0.0f;
} else {
activation_type.negative_slope = 0.0f;
}
// TODO: need to take graph dependency instead of linear
auto &prevComponent = dnnComponentsForLayer.back().second;
@@ -1649,20 +1654,23 @@ void GNAPlugin::LoadNetwork(ICNNNetwork &network) {
for (auto layer = sortedNoMem.begin(); layer != sortedNoMem.end(); ++layer) {
CreateLayerPrimitive(*layer);
}
if (dnnComponentsForLayer.empty()) {
THROW_GNA_EXCEPTION << "No outputs found in dnn components structure";
}
DnnComponentsForLayer::iterator output_component = std::find_if(dnnComponentsForLayer.begin(),
dnnComponentsForLayer.end(),
[&](const std::pair<std::string, intel_dnn_component_t>& v)
{ return outputsDataMap.begin()->first == v.first; });
if (output_component == dnnComponentsForLayer.end()) {
if (dnnComponentsForLayer.empty()) {
THROW_GNA_EXCEPTION << "No outputs found in internal structures";
}
// likely layer is fused. Take last one
output_component = std::prev(dnnComponentsForLayer.end());
auto it = dnnComponentsForLayer.begin();
std::advance(it, dnnComponentsForLayer.size() - 1);
output_component = it;
gnalog() << "Output layer "<< outputsDataMap.begin()->first
<< " has not been found in component list. Took "
<< output_component->first << " instead \n" << std::flush;
<< " has not been found in component list. Took "
<< output_component->first << " instead \n" << std::flush;
}
gnamem->bind_ptr(&ptr_outputs_global.front(), &output_component->second.ptr_outputs);
@@ -1775,6 +1783,10 @@ void GNAPlugin::LoadNetwork(ICNNNetwork &network) {
orientation_out = output_component->second.orientation_out;
num_bytes_per_output = output_component->second.num_bytes_per_output;
if (sortedNet.empty()) {
THROW_GNA_EXCEPTION << "Sorted network is empty";
}
// find output layer
auto output = std::find_if(sortedNet.begin(),
sortedNet.end(),
@@ -1782,7 +1794,9 @@ void GNAPlugin::LoadNetwork(ICNNNetwork &network) {
{ return outputsDataMap.begin()->first == v.get()->name; });
if (output == sortedNet.end()) {
// likely layer is fused. Take last one
output = std::prev(sortedNet.end());
auto it = sortedNet.begin();
std::advance(it, sortedNet.size() - 1);
output = it;
}
auto quantized = InferenceEngine::getInjectedData<QuantizedLayerParams>(*output);
output_scale_factor = quantized != nullptr ? quantized->_dst_quant.scale : 1.0f;
@@ -2461,36 +2475,38 @@ void GNAPlugin::connectOutput(InferenceEngine::CNNLayerPtr layer, void *ptr, voi
[&name](GNAPlugin::GNAConcatLayer::ConcatConnectedLayerInfo &item) {
return item.name == name;
});
// reserve full size for concat
if (!concatLayerInfoItem.output_allocation_flag) {
// check if this concat is being included by other one
// by going thru each concat and checking inputs
auto included =
std::find_if(concat_connection.begin(),
concat_connection.end(),
[&concatLayerInfo]
(const std::pair<std::string, GNAPlugin::GNAConcatLayer> &concatItem) -> bool {
auto it = std::find_if(concatItem.second.concatInputLayers.begin(),
concatItem.second.concatInputLayers.end(),
[&concatLayerInfo]
(const GNAPlugin::GNAConcatLayer::ConcatConnectedLayerInfo &item) -> bool {
return item.name == concatLayerInfo->first;
});
return it != concatItem.second.concatInputLayers.end();
});
if (included == concat_connection.end()) {
gnamem->reserve_ptr(&concatLayerInfoItem.gna_ptr, ALIGN64(concatLayerInfoItem.reserved_size));
if (it != concatLayerInfoItem.concatInputLayers.end()) {
// reserve full size for concat
if (!concatLayerInfoItem.output_allocation_flag) {
// check if this concat is being included by other one
// by going thru each concat and checking inputs
auto included =
std::find_if(concat_connection.begin(),
concat_connection.end(),
[&concatLayerInfo]
(const std::pair<std::string, GNAPlugin::GNAConcatLayer> &concatItem) -> bool {
auto it = std::find_if(concatItem.second.concatInputLayers.begin(),
concatItem.second.concatInputLayers.end(),
[&concatLayerInfo]
(const GNAPlugin::GNAConcatLayer::ConcatConnectedLayerInfo &item) -> bool {
return item.name == concatLayerInfo->first;
});
return it != concatItem.second.concatInputLayers.end();
});
if (included == concat_connection.end()) {
gnamem->reserve_ptr(&concatLayerInfoItem.gna_ptr, ALIGN64(concatLayerInfoItem.reserved_size));
for (auto && inputLayer : concatLayerInfoItem.concatInputLayers) {
if ( InferenceEngine::details::CaselessEq<std::string>()
(inputLayer.name, "input") ) {
bytes_alllocated_for_input[inputLayer.name] = ALIGN64(concatLayerInfoItem.reserved_size) - inputLayer.offset;
for (auto &&inputLayer : concatLayerInfoItem.concatInputLayers) {
if (InferenceEngine::details::CaselessEq<std::string>()
(inputLayer.name, "input")) {
bytes_alllocated_for_input[inputLayer.name] = ALIGN64(concatLayerInfoItem.reserved_size) - inputLayer.offset;
}
}
}
concatLayerInfo->second.output_allocation_flag = true;
}
concatLayerInfo->second.output_allocation_flag = true;
gnamem->bind_ptr(ptr, &concatLayerInfoItem.gna_ptr, it->offset);
}
gnamem->bind_ptr(ptr, &concatLayerInfoItem.gna_ptr, it->offset);
} else {
// error
}


@@ -67,7 +67,7 @@ class GNAPlugin : public InferenceEngine::IInferencePluginInternal, public std::
uint32_t num_feature_maps = 1;
uint32_t num_memory_bytes;
uint32_t num_memory_bytes = 0;
std::unordered_map<std::string, std::list<std::vector<void *>>::iterator> ptr_inputs_global_id;
std::list<std::vector<void *>> ptr_inputs_global_storage;
@@ -79,7 +79,7 @@ class GNAPlugin : public InferenceEngine::IInferencePluginInternal, public std::
uint32_t *ptr_active_indices = NULL;
uint32_t num_active_indices = 0;
uint32_t num_group_in = 0;
uint32_t num_bytes_weight;
uint32_t num_bytes_weight = 0;
uint32_t num_bytes_per_output = 0;
bool use_dynamic_quantization = false;


@@ -411,7 +411,7 @@ void GNAPlugin::insertCopyLayer(std::vector<InferenceEngine::CNNLayerPtr> & laye
if ((LayerInfo(l).isMemory() && LayerInfo(prevLayer).isConcat()) ||
(LayerInfo(l).isConcat() && LayerInfo(prevLayer).isCrop())) {
if (LayerInfo(prevLayer).isCrop()) {
auto cropLayer = dynamic_cast<InferenceEngine::CropLayer *> (prevLayer.get());
auto cropLayer = LayerInfo(prevLayer).as<CropLayer*>();
size_t cropOffset = cropLayer->offset.back() * cropLayer->precision.size();
if (ALIGN(cropOffset, 8) != cropOffset) {
// The crop will be replaced by affine.


@@ -1,8 +1,6 @@
// Copyright (C) 2018-2019 Intel Corporation
// SPDX-License-Identifier: Apache-2.0
//
// lstm.cpp : GNA LSTM macro layer definition
//
#include "lstm.hpp"


@@ -1,8 +1,6 @@
// Copyright (C) 2018-2019 Intel Corporation
// SPDX-License-Identifier: Apache-2.0
//
// pwl_design.cpp : simple activation function designer
//
#include "pwl.h"
#include "gna_plugin_log.hpp"


@@ -301,6 +301,9 @@ inline void quantizeWeightsBiasesConv(const QuantDesc & quantDesc,
auto inputData = conv->insData[0].lock();
uint32_t num_rows = getBiasSizeForLayer(conv);
if (num_rows == 0) {
THROW_GNA_EXCEPTION << "Invalid num rows";
}
uint32_t num_columns = conv->_weights->size() / num_rows;
uint32_t num_rows_padded = num_rows;


@@ -34,7 +34,9 @@ class ModelQuantizer {
// one of solution is to create not copyNet overloads, that accepts 2 functors, one for layer copy
// and another one for net copy
auto rawNet = dynamic_cast<InferenceEngine::details::CNNNetworkImpl *>(copiedNet.get());
rawNet->setPrecision(T::mandatory().getNetPrecision());
if (rawNet != nullptr) {
rawNet->setPrecision(T::mandatory().getNetPrecision());
}
// allow client code to access copied topology, to avoid copies if user would like to chain quantisation with
// another preprocessing


@@ -4,6 +4,7 @@
#include <cstring>
#include <iostream>
#include <details/ie_exception.hpp>
#include "quantization.h"
void QuantizeAffine16(float *ptr_float_weights,
@@ -496,6 +497,9 @@ void QuantizeAffine8(float *ptr_float_weights, float *ptr_float_biases,
float input_scale_factor, float *ptr_weight_scale_factor,
float *ptr_output_scale_factor, uint32_t num_rows, uint32_t num_columns,
uint32_t num_rows_padded, uint32_t num_columns_padded) {
if (ptr_int_biases == nullptr) {
THROW_IE_EXCEPTION << "Int biases are empty";
}
uint32_t num_saturate = 0;
if (*ptr_weight_scale_factor == 1.0) {
@@ -547,11 +551,11 @@ void QuantizeAffine8(float *ptr_float_weights, float *ptr_float_biases,
value = scaled_row_max / static_cast<float>(MAX_VAL_1B_WEIGHT);
ptr_int_biases[row].multiplier = (uint8_t) (value + 0.5);
for (uint32_t col = 0; col < num_columns; col++) {
int8_t *ptr_weight_8 = ptr_int_weights + (row*num_columns_padded + col);
int8_t *ptr_weight_8 = ptr_int_weights + (row * num_columns_padded + col);
rounding_value = (ptr_float_weights[row * num_columns + col] > 0) ? 0.5f : -0.5f;
value = ptr_float_weights[row*num_columns + col] * (*ptr_weight_scale_factor / ptr_int_biases[row].multiplier) + rounding_value;
value = ptr_float_weights[row * num_columns + col] * (*ptr_weight_scale_factor / ptr_int_biases[row].multiplier) + rounding_value;
if (value > 127.0) {
*ptr_weight_8 = 127;
num_saturate++;
@@ -559,11 +563,11 @@ void QuantizeAffine8(float *ptr_float_weights, float *ptr_float_biases,
*ptr_weight_8 = -128;
num_saturate++;
} else {
*ptr_weight_8 = (int8_t)value;
*ptr_weight_8 = (int8_t) value;
}
}
for (uint32_t col = num_columns; col < num_columns_padded; col++) {
int8_t *ptr_weight_8 = ptr_int_weights + (row*num_columns_padded + col);
int8_t *ptr_weight_8 = ptr_int_weights + (row * num_columns_padded + col);
*ptr_weight_8 = 0;
}
}
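The `QuantizeAffine8` loop being reformatted above scales each float weight, rounds half away from zero, and saturates the result to the int8 range while counting saturations. That per-weight rule can be isolated as a small helper; `quantize_weight` is an illustrative reduction, not a function from the sources:

```cpp
#include <cassert>
#include <cstdint>

// Quantize one float weight: scale, round half away from zero, saturate to
// [-128, 127], and report whether saturation occurred (the loop above keeps
// a num_saturate counter for the same purpose).
int8_t quantize_weight(float w, float scale, bool* saturated) {
    float rounding = (w > 0) ? 0.5f : -0.5f;
    float value = w * scale + rounding;
    if (value > 127.0f)  { if (saturated) *saturated = true; return 127; }
    if (value < -128.0f) { if (saturated) *saturated = true; return -128; }
    if (saturated) *saturated = false;
    return static_cast<int8_t>(value);
}
```

In the real code `scale` is `*ptr_weight_scale_factor / ptr_int_biases[row].multiplier`, so the per-row multiplier keeps large rows inside the representable range.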


@@ -191,7 +191,7 @@ class ScaleFactorPerLayer<InferenceEngine::EltwiseLayer*> {
continue;
} else if (info.has16BOutput() && info.isActivation()) {
auto newOutputScale = quantParams->_dst_quant.scale / maxValue;
if (newOutputScale > std::numeric_limits<int16_t>::max() / 2) {
if (newOutputScale > static_cast<float>(std::numeric_limits<int16_t>::max()) / 2) {
break;
}
auto quantDataForActivation = InferenceEngine::getInjectedData<QuantizedLayerParams>(*in);
@@ -413,7 +413,9 @@ class ScaleFactorCalculator {
}
return ptr == cnnLayer.get();
});
idx++;
if (idx != net.end()) {
idx++;
}
needRestart = true;
return true;
}


@@ -1,8 +1,6 @@
// Copyright (C) 2018-2019 Intel Corporation
// SPDX-License-Identifier: Apache-2.0
//
// util.cpp : various utility functions for debugging, file i/o, etc.
//
#include <cinttypes>
#ifndef _WIN32


@@ -1,17 +1,5 @@
//
// Copyright (C) 2018-2019 Intel Corporation.
//
// This software and the related documents are Intel copyrighted materials,
// and your use of them is governed by the express license under which they
// were provided to you (End User License Agreement for the Intel(R) Software
// Development Products (Version May 2017)). Unless the License provides
// otherwise, you may not use, modify, copy, publish, distribute, disclose or
// transmit this software or the related documents without Intel's prior
// written permission.
//
// This software and the related documents are provided as is, with no
// express or implied warranties, other than those that are expressly
// stated in the License.
// Copyright (C) 2018-2019 Intel Corporation
// SPDX-License-Identifier: Apache-2.0
//
#include "fallback_policy.h"
@@ -66,7 +54,7 @@ void FallbackPolicy::init(const std::string &config, const std::map<std::string,
if (_deviceLoaders.find(d) == _deviceLoaders.end()) {
IHeteroDeviceLoader::Ptr loader;
loader = std::make_shared<HeteroDeviceLoader>(d);
HeteroDeviceLoader *pdl = dynamic_cast<HeteroDeviceLoader *>(loader.get());
HeteroDeviceLoader *pdl = static_cast<HeteroDeviceLoader *>(loader.get());
pdl->initConfigs(allConfigs, extensions);
_deviceLoaders[d] = loader;
}


@@ -1,17 +1,5 @@
//
// Copyright (C) 2018-2019 Intel Corporation.
//
// This software and the related documents are Intel copyrighted materials,
// and your use of them is governed by the express license under which they
// were provided to you (End User License Agreement for the Intel(R) Software
// Development Products (Version May 2017)). Unless the License provides
// otherwise, you may not use, modify, copy, publish, distribute, disclose or
// transmit this software or the related documents without Intel's prior
// written permission.
//
// This software and the related documents are provided as is, with no
// express or implied warranties, other than those that are expressly
// stated in the License.
// Copyright (C) 2018-2019 Intel Corporation
// SPDX-License-Identifier: Apache-2.0
//
#pragma once

Some files were not shown because too many files have changed in this diff.