[OV2.0] Preprocessing documentation (#10451)

* [OV2.0] Preprocessing documentation - first draft

* Small update

* Added ov::Layout overview

* Fix code style

* Preprocessing details - ~50% done

* Corrected links

* Fixed comments, added more docs

* Minor updates

* Couple more links

* Fixed comments

* Remove 'future' link
Mikhail Nosov 2022-02-21 19:20:23 +03:00 committed by GitHub
parent 65d1575642
commit f82533005b
14 changed files with 1181 additions and 57 deletions

View File

@ -10,6 +10,7 @@
openvino_docs_IE_DG_Integrate_with_customer_application_new_API
openvino_docs_OV_Runtime_UG_Model_Representation
openvino_docs_OV_Runtime_UG_Preprocessing_Overview
<!-- rename to "Changing input shapes" -->
openvino_docs_IE_DG_ShapeInference
openvino_docs_IE_DG_Device_Plugins

View File

@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:8fed5e153636e3e556e000e3e5fc48b9da8f5a1272490550066d647d306ec24f
size 81575

View File

@ -0,0 +1,154 @@
# Layout API overview {#openvino_docs_OV_Runtime_UG_Layout_Overview}
## Introduction
In a few words, with the layout `NCHW` it is easier to understand what the model's shape `{8, 3, 224, 224}` means. Without a layout, it is just a 4-dimensional tensor.
The concept of layout helps you (and your application) understand what each particular dimension of an input/output tensor means. For example, if your input has shape `{1, 3, 720, 1280}` and layout "NCHW", it is clear that `N(batch) = 1`, `C(channels) = 3`, `H(height) = 720` and `W(width) = 1280`. Without layout information, `{1, 3, 720, 1280}` gives your application no idea what these numbers mean or how to resize the input image to fit the model's expectations.
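For example, a minimal sketch (the shape values are illustrative) of how a layout lets application code locate dimensions by meaning instead of by hard-coded index:
```cpp
#include <iostream>
#include <openvino/core/layout.hpp>
#include <openvino/core/shape.hpp>

int main() {
    // With a layout attached, dimensions are found by meaning, not by magic index
    ov::Layout layout("NCHW");
    ov::Shape shape{1, 3, 720, 1280};
    std::cout << "H=" << shape[ov::layout::height_idx(layout)]    // 720
              << ", W=" << shape[ov::layout::width_idx(layout)];  // 1280
    return 0;
}
```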
Reasons why you may want to care about input/output layout:
- Perform model modifications:
    - Apply [preprocessing](./preprocessing_overview.md) steps, like subtracting means, dividing by scales, resizing an image, converting RGB<->BGR
    - Set/get batch for a model
- Same operations, used during the model conversion phase, see [Model Optimizer model conversion](../MO_DG/prepare_model/convert_model/Converting_Model.md)
- Improve readability of a model's input and output
## Layout syntax
### Short
The easiest way is to fully specify each dimension with a single letter:
@sphinxdirective
.. tab:: C++
.. doxygensnippet:: docs/snippets/ov_layout.cpp
:language: cpp
:fragment: [ov:layout:simple]
.. tab:: Python
.. doxygensnippet:: docs/snippets/ov_layout.py
:language: python
:fragment: [ov:layout:simple]
@endsphinxdirective
This assigns 'N' to the first dimension, 'C' to the second, 'H' to the third, and 'W' to the fourth.
### Advanced
The advanced syntax allows assigning a whole word to a dimension. To do this, wrap the layout in square brackets `[]` and separate each name with a comma `,`:
@sphinxdirective
.. tab:: C++
.. doxygensnippet:: docs/snippets/ov_layout.cpp
:language: cpp
:fragment: [ov:layout:complex]
.. tab:: Python
.. doxygensnippet:: docs/snippets/ov_layout.py
:language: python
:fragment: [ov:layout:complex]
@endsphinxdirective
### Partially defined layout
If some dimension is not important, its name can be set to `?`:
@sphinxdirective
.. tab:: C++
.. doxygensnippet:: docs/snippets/ov_layout.cpp
:language: cpp
:fragment: [ov:layout:partially_defined]
.. tab:: Python
.. doxygensnippet:: docs/snippets/ov_layout.py
:language: python
:fragment: [ov:layout:partially_defined]
@endsphinxdirective
### Dynamic layout
If the number of dimensions is not important, an ellipsis `...` can be used to specify a variadic number of dimensions:
@sphinxdirective
.. tab:: C++
.. doxygensnippet:: docs/snippets/ov_layout.cpp
:language: cpp
:fragment: [ov:layout:dynamic]
.. tab:: Python
.. doxygensnippet:: docs/snippets/ov_layout.py
:language: python
:fragment: [ov:layout:dynamic]
@endsphinxdirective
### Predefined names
Layout has predefined names for dimensions widely used in computer vision:
- N/Batch - batch size
- C/Channels - channels dimension
- D/Depth - depth
- H/Height - height
- W/Width - width
These names are used in the [Preprocessing API](./preprocessing_overview.md), and there is a set of helper functions to get the appropriate dimension index from a layout:
@sphinxdirective
.. tab:: C++
.. doxygensnippet:: docs/snippets/ov_layout.cpp
:language: cpp
:fragment: [ov:layout:predefined]
.. tab:: Python
.. doxygensnippet:: docs/snippets/ov_layout.py
:language: python
:fragment: [ov:layout:predefined]
@endsphinxdirective
### Equality
Layout names are case-insensitive, which means that ```Layout("NCHW") == Layout("nChW") == Layout("[N,c,H,w]")```
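A minimal sketch demonstrating this equality:
```cpp
#include <cassert>
#include <openvino/core/layout.hpp>

int main() {
    // Case and syntax variants of the same layout compare equal
    assert(ov::Layout("NCHW") == ov::Layout("nChW"));
    assert(ov::Layout("nChW") == ov::Layout("[N,c,H,w]"));
    return 0;
}
```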
### Dump layout
A layout can be converted to a string in the advanced syntax format, which can be useful for debugging and serialization:
@sphinxdirective
.. tab:: C++
.. doxygensnippet:: docs/snippets/ov_layout.cpp
:language: cpp
:fragment: [ov:layout:dump]
.. tab:: Python
.. doxygensnippet:: docs/snippets/ov_layout.py
:language: python
:fragment: [ov:layout:dump]
@endsphinxdirective
## See also
* <code>ov::Layout</code> C++ class documentation

View File

@ -0,0 +1,346 @@
# Preprocessing API - details {#openvino_docs_OV_Runtime_UG_Preprocessing_Details}
## Preprocessing capabilities
### Addressing particular input/output
If your model has only one input, then a simple <code>ov::preprocess::PrePostProcessor::input()</code> call will get a reference to the preprocessing builder for this input (tensor, steps, model):
@sphinxdirective
.. tab:: C++
.. doxygensnippet:: docs/snippets/ov_preprocessing.cpp
:language: cpp
:fragment: [ov:preprocess:input_1]
.. tab:: Python
.. doxygensnippet:: docs/snippets/ov_preprocessing.py
:language: python
:fragment: [ov:preprocess:input_1]
@endsphinxdirective
In general, when a model has multiple inputs/outputs, each one can be addressed by its tensor name:
@sphinxdirective
.. tab:: C++
.. doxygensnippet:: docs/snippets/ov_preprocessing.cpp
:language: cpp
:fragment: [ov:preprocess:input_name]
.. tab:: Python
.. doxygensnippet:: docs/snippets/ov_preprocessing.py
:language: python
:fragment: [ov:preprocess:input_name]
@endsphinxdirective
Or by its index:
@sphinxdirective
.. tab:: C++
.. doxygensnippet:: docs/snippets/ov_preprocessing.cpp
:language: cpp
:fragment: [ov:preprocess:input_index]
.. tab:: Python
.. doxygensnippet:: docs/snippets/ov_preprocessing.py
:language: python
:fragment: [ov:preprocess:input_index]
@endsphinxdirective
C++ references:
* <code>ov::preprocess::InputTensorInfo</code>
* <code>ov::preprocess::OutputTensorInfo</code>
* <code>ov::preprocess::PrePostProcessor</code>
### Supported preprocessing operations
C++ references:
* <code>ov::preprocess::PreProcessSteps</code>
#### Mean/Scale normalization
Typical data normalization includes 2 operations for each data item: subtracting the mean value and dividing by the standard deviation. This can be done with the following code:
@sphinxdirective
.. tab:: C++
.. doxygensnippet:: docs/snippets/ov_preprocessing.cpp
:language: cpp
:fragment: [ov:preprocess:mean_scale]
.. tab:: Python
.. doxygensnippet:: docs/snippets/ov_preprocessing.py
:language: python
:fragment: [ov:preprocess:mean_scale]
@endsphinxdirective
In the computer vision area, normalization is usually done separately for the R, G, B values. To do this, a [layout with a 'C' dimension](./layout_overview.md) must be defined. Example:
@sphinxdirective
.. tab:: C++
.. doxygensnippet:: docs/snippets/ov_preprocessing.cpp
:language: cpp
:fragment: [ov:preprocess:mean_scale_array]
.. tab:: Python
.. doxygensnippet:: docs/snippets/ov_preprocessing.py
:language: python
:fragment: [ov:preprocess:mean_scale_array]
@endsphinxdirective
C++ references:
* <code>ov::preprocess::PreProcessSteps::mean()</code>
* <code>ov::preprocess::PreProcessSteps::scale()</code>
#### Convert precision
In computer vision, an image is represented as an array of unsigned 8-bit integer values (one per color channel), while the model accepts floating point tensors.
To integrate the precision conversion into the execution graph as a preprocessing step, just do:
@sphinxdirective
.. tab:: C++
.. doxygensnippet:: docs/snippets/ov_preprocessing.cpp
:language: cpp
:fragment: [ov:preprocess:convert_element_type]
.. tab:: Python
.. doxygensnippet:: docs/snippets/ov_preprocessing.py
:language: python
:fragment: [ov:preprocess:convert_element_type]
@endsphinxdirective
C++ references:
* <code>ov::preprocess::InputTensorInfo::set_element_type()</code>
* <code>ov::preprocess::PreProcessSteps::convert_element_type()</code>
#### Convert layout (transpose)
Transposing matrices/tensors is a typical operation in deep learning: you may have a 640x480 BMP image, which is an array of `{480, 640, 3}` elements, while the deep learning model requires input with shape `{1, 3, 480, 640}`.
Using the [layout](./layout_overview.md) of the user's tensor and the layout of the original model, the conversion can be done implicitly:
@sphinxdirective
.. tab:: C++
.. doxygensnippet:: docs/snippets/ov_preprocessing.cpp
:language: cpp
:fragment: [ov:preprocess:convert_layout]
.. tab:: Python
.. doxygensnippet:: docs/snippets/ov_preprocessing.py
:language: python
:fragment: [ov:preprocess:convert_layout]
@endsphinxdirective
Or, if you prefer a manual transpose of axes without using a [layout](./layout_overview.md) in your code, just do:
@sphinxdirective
.. tab:: C++
.. doxygensnippet:: docs/snippets/ov_preprocessing.cpp
:language: cpp
:fragment: [ov:preprocess:convert_layout_2]
.. tab:: Python
.. doxygensnippet:: docs/snippets/ov_preprocessing.py
:language: python
:fragment: [ov:preprocess:convert_layout_2]
@endsphinxdirective
This performs the same transpose, but we believe the approach using source and destination layouts is easier to read and understand.
C++ references:
* <code>ov::preprocess::PreProcessSteps::convert_layout()</code>
* <code>ov::preprocess::InputTensorInfo::set_layout()</code>
* <code>ov::preprocess::InputModelInfo::set_layout()</code>
* <code>ov::Layout</code>
#### Resize image
Resizing an image is a typical preprocessing step for computer vision tasks. With the preprocessing API, this step can also be integrated into the execution graph and performed on the target device.
To resize the input image, you need to define the `H` and `W` dimensions of the [layout](./layout_overview.md):
@sphinxdirective
.. tab:: C++
.. doxygensnippet:: docs/snippets/ov_preprocessing.cpp
:language: cpp
:fragment: [ov:preprocess:resize_1]
.. tab:: Python
.. doxygensnippet:: docs/snippets/ov_preprocessing.py
:language: python
:fragment: [ov:preprocess:resize_1]
@endsphinxdirective
Or, if the original model has known spatial dimensions (width and height), the target width/height can be omitted:
@sphinxdirective
.. tab:: C++
.. doxygensnippet:: docs/snippets/ov_preprocessing.cpp
:language: cpp
:fragment: [ov:preprocess:resize_2]
.. tab:: Python
.. doxygensnippet:: docs/snippets/ov_preprocessing.py
:language: python
:fragment: [ov:preprocess:resize_2]
@endsphinxdirective
C++ references:
* <code>ov::preprocess::PreProcessSteps::resize()</code>
* <code>ov::preprocess::ResizeAlgorithm</code>
#### Color conversion
A typical use case is reversing color channels from RGB to BGR and vice versa. To do this, specify the source color format in the `tensor` section and perform the `convert_color` preprocessing operation. In the example below, the user has a `BGR` image and needs to convert it to `RGB`, as required for the model's input:
@sphinxdirective
.. tab:: C++
.. doxygensnippet:: docs/snippets/ov_preprocessing.cpp
:language: cpp
:fragment: [ov:preprocess:convert_color_1]
.. tab:: Python
.. doxygensnippet:: docs/snippets/ov_preprocessing.py
:language: python
:fragment: [ov:preprocess:convert_color_1]
@endsphinxdirective
#### Color conversion - NV12/I420
Preprocessing also supports YUV-family source color formats, i.e. NV12 and I420.
In advanced cases, such YUV images can be split into separate planes, e.g. for NV12 images the Y-component may come from one source and the UV-component from another. Concatenating such components manually in the user's application is not a perfect solution from the performance and device utilization perspectives, so the Preprocessing API can be used instead. For such cases there are the `NV12_TWO_PLANES` and `I420_THREE_PLANES` source color formats, which split the original `input` into 2 or 3 inputs:
@sphinxdirective
.. tab:: C++
.. doxygensnippet:: docs/snippets/ov_preprocessing.cpp
:language: cpp
:fragment: [ov:preprocess:convert_color_2]
.. tab:: Python
.. doxygensnippet:: docs/snippets/ov_preprocessing.py
:language: python
:fragment: [ov:preprocess:convert_color_2]
@endsphinxdirective
In this example, the original `input` is split into the `input/y` and `input/uv` inputs. You can fill `input/y` from one source and `input/uv` from another. Color conversion to `RGB` will be performed using these sources; this is more optimal as there will be no additional copies of NV12 buffers.
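For illustration, a sketch of filling such split inputs at inference time (the `input/y`/`input/uv` names follow the convention above; the plane shapes assume a 480x640 NV12 image):
```cpp
#include <openvino/runtime/infer_request.hpp>
#include <openvino/runtime/tensor.hpp>

// A sketch: fill each generated plane input from its own source buffer
void infer_nv12(ov::InferRequest& request, void* y_data, void* uv_data) {
    ov::Tensor y_plane(ov::element::u8, {1, 480, 640, 1}, y_data);   // Y: full resolution
    ov::Tensor uv_plane(ov::element::u8, {1, 240, 320, 2}, uv_data); // UV: half resolution
    request.set_tensor("input/y", y_plane);
    request.set_tensor("input/uv", uv_plane);
    request.infer();
}
```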
C++ references:
* <code>ov::preprocess::ColorFormat</code>
* <code>ov::preprocess::PreProcessSteps::convert_color</code>
### Custom operations
The Preprocessing API also allows adding custom preprocessing steps to the execution graph. A custom step is a function that accepts the current 'input' node and returns a new node after adding the preprocessing operation.
> **Note:** A custom preprocessing function shall only insert node(s) after the input; it is applied during model compilation and will NOT be called during the execution phase. This may not look trivial and requires some knowledge of [OpenVINO™ operations](../ops/opset.md).
If you need to insert additional operations into the execution graph right after the input, such as specific crops and/or resizes, the Preprocessing API can be a good choice to implement this:
@sphinxdirective
.. tab:: C++
.. doxygensnippet:: docs/snippets/ov_preprocessing.cpp
:language: cpp
:fragment: [ov:preprocess:custom]
.. tab:: Python
.. doxygensnippet:: docs/snippets/ov_preprocessing.py
:language: python
:fragment: [ov:preprocess:custom]
@endsphinxdirective
C++ references:
* <code>ov::preprocess::PreProcessSteps::custom()</code>
* [Available Operations Sets](../ops/opset.md)
## Postprocessing
Postprocessing steps can be added to model outputs. As with preprocessing, these steps are also integrated into the graph and executed on the selected device.
Preprocessing uses the flow **User tensor** -> **Steps** -> **Model input**.
Postprocessing is the reverse: **Model output** -> **Steps** -> **User tensor**.
Compared to preprocessing, fewer operations are needed at the postprocessing stage, so currently only the following postprocessing operations are supported:
- Convert [layout](./layout_overview.md)
- Convert element type
- Custom operations
Usage of these operations is similar to preprocessing. An example is shown below:
@sphinxdirective
.. tab:: C++
.. doxygensnippet:: docs/snippets/ov_preprocessing.cpp
:language: cpp
:fragment: [ov:preprocess:postprocess]
.. tab:: Python
.. doxygensnippet:: docs/snippets/ov_preprocessing.py
:language: python
:fragment: [ov:preprocess:postprocess]
@endsphinxdirective
C++ references:
* <code>ov::preprocess::PostProcessSteps</code>
* <code>ov::preprocess::OutputModelInfo</code>
* <code>ov::preprocess::OutputTensorInfo</code>

View File

@ -0,0 +1,169 @@
# Overview of Preprocessing API {#openvino_docs_OV_Runtime_UG_Preprocessing_Overview}
@sphinxdirective
.. toctree::
:maxdepth: 1
:hidden:
openvino_docs_OV_Runtime_UG_Preprocessing_Details
openvino_docs_OV_Runtime_UG_Layout_Overview
@endsphinxdirective
## Introduction
When your input data doesn't perfectly fit the neural network model's input tensor, additional operations/steps are needed to transform the data into the format expected by the model. These operations are known as "preprocessing".
### Example
Consider the following standard example: a deep learning model expects input with shape `{1, 3, 224, 224}`, `FP32` precision, `RGB` color channel order, and requires data normalization (subtract the mean and divide by the scale factor). But you have just a `640x480` `BGR` image (data is `{480, 640, 3}`). This means we need operations which will:
- Convert the U8 buffer to FP32
- Transform to `planar` format: from `{1, 480, 640, 3}` to `{1, 3, 480, 640}`
- Resize the image from 640x480 to 224x224
- Perform the `BGR->RGB` conversion, as the model expects `RGB`
- For each pixel, subtract the mean values and divide by the scale factor
![](img/preprocess_not_fit.png)
Even though all these steps can be implemented manually in the application's code before the actual inference, it is also possible to do them with the Preprocessing API. Reasons to use this API:
- The Preprocessing API is easy to use
- Preprocessing steps are integrated into the execution graph and performed on the selected device (CPU/GPU/VPU/etc.) rather than always being executed on the CPU. This improves utilization of the selected device.
## Preprocessing API
Intuitively, the Preprocessing API consists of the following parts:
1. **Tensor:** Declares the user's data format: shape, [layout](./layout_overview.md), precision, and color format of the actual user data
2. **Steps:** Describes the sequence of preprocessing steps to apply to the user's data
3. **Model:** Specifies the model's data format. Usually, precision and shape are already known for the model; only additional information, like [layout](./layout_overview.md), can be specified
> **Note:** All graph modifications to a model shall be performed after the model is read from disk and **before** it is loaded on the actual device.
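As a roadmap, here is a compact sketch of how these three parts fit together; the following sections build it up step by step (the model path is a placeholder):
```cpp
#include <openvino/core/preprocess/pre_post_process.hpp>
#include <openvino/runtime/core.hpp>

int main() {
    ov::Core core;
    auto model = core.read_model("model.xml"); // placeholder path
    ov::preprocess::PrePostProcessor ppp(model);
    // 1. Tensor: declare the user's data format
    ppp.input().tensor()
        .set_element_type(ov::element::u8)
        .set_shape({1, 480, 640, 3})
        .set_layout("NHWC")
        .set_color_format(ov::preprocess::ColorFormat::BGR);
    // 3. Model: only the layout needs to be specified
    ppp.input().model().set_layout("NCHW");
    // 2. Steps: the sequence applied to the user's data
    ppp.input().preprocess()
        .convert_element_type(ov::element::f32)
        .convert_color(ov::preprocess::ColorFormat::RGB)
        .resize(ov::preprocess::ResizeAlgorithm::RESIZE_LINEAR);
    model = ppp.build(); // steps become part of the model's graph
    return 0;
}
```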
### PrePostProcessor object
The `ov::preprocess::PrePostProcessor` class allows specifying preprocessing and postprocessing steps for a model read from disk:
@sphinxdirective
.. tab:: C++
.. doxygensnippet:: docs/snippets/ov_preprocessing.cpp
:language: cpp
:fragment: [ov:preprocess:create]
.. tab:: Python
.. doxygensnippet:: docs/snippets/ov_preprocessing.py
:language: python
:fragment: [ov:preprocess:create]
@endsphinxdirective
### Declare user's data format
To address a particular input of the model/preprocessor, use the `ov::preprocess::PrePostProcessor::input(input_name)` method:
@sphinxdirective
.. tab:: C++
.. doxygensnippet:: docs/snippets/ov_preprocessing.cpp
:language: cpp
:fragment: [ov:preprocess:tensor]
.. tab:: Python
.. doxygensnippet:: docs/snippets/ov_preprocessing.py
:language: python
:fragment: [ov:preprocess:tensor]
@endsphinxdirective
Here we've specified all the information about the user's input:
- Precision is U8 (unsigned 8-bit integer)
- Data represents a tensor with `{1,480,640,3}` shape
- [Layout](./layout_overview.md) is "NHWC", meaning height=480, width=640, channels=3
- Color format is `BGR`
### Declare model's layout
The model's input already has information about precision and shape; the Preprocessing API is not intended to modify this. The only thing that may be specified is the input data's [layout](./layout_overview.md):
@sphinxdirective
.. tab:: C++
.. doxygensnippet:: docs/snippets/ov_preprocessing.cpp
:language: cpp
:fragment: [ov:preprocess:model]
.. tab:: Python
.. doxygensnippet:: docs/snippets/ov_preprocessing.py
:language: python
:fragment: [ov:preprocess:model]
@endsphinxdirective
Now, if the model's input has shape `{1,3,224,224}`, preprocessing will be able to identify the model's `height=224`, `width=224`, and `channels=3`. The height/width information is necessary for 'resize', and `channels` is needed for mean/scale normalization.
### Preprocessing steps
Now we can define the sequence of preprocessing steps:
@sphinxdirective
.. tab:: C++
.. doxygensnippet:: docs/snippets/ov_preprocessing.cpp
:language: cpp
:fragment: [ov:preprocess:steps]
.. tab:: Python
.. doxygensnippet:: docs/snippets/ov_preprocessing.py
:language: python
:fragment: [ov:preprocess:steps]
@endsphinxdirective
Here:
- Convert U8 to FP32 precision
- Convert the current color format (BGR) to RGB
- Resize to the model's height/width. **Note** that if the model accepts dynamic size, e.g. `{?, 3, ?, ?}`, `resize` will not know how to resize the picture, so in this case you should specify the target height/width in this step; see the sketch after this list and <code>ov::preprocess::PreProcessSteps::resize()</code>
- Subtract the mean from each channel. At this step, the color format is already RGB, so `100.5` will be subtracted from each `Red` component and `101.5` from each `Blue` one
- Divide each pixel's data by the appropriate scale value. In this example, each `Red` component will be divided by 50, `Green` by 51, and `Blue` by 52
- **Note:** the last `convert_layout` step is commented out, as it is not necessary to specify the final layout conversion; PrePostProcessor will add such a conversion automatically
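For the dynamic-size case mentioned in the resize step, a minimal sketch (the 224x224 target is illustrative):
```cpp
#include <openvino/core/preprocess/pre_post_process.hpp>

// If the model's spatial dimensions are dynamic, e.g. {?, 3, ?, ?},
// give `resize` an explicit target height/width
void resize_with_target(ov::preprocess::InputInfo& input) {
    input.preprocess().resize(ov::preprocess::ResizeAlgorithm::RESIZE_LINEAR, 224, 224);
}
```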
### Integrate steps into model
We've finished declaring the preprocessing steps; now it is time to build them. For debugging purposes, it is possible to print the `PrePostProcessor` configuration on screen:
@sphinxdirective
.. tab:: C++
.. doxygensnippet:: docs/snippets/ov_preprocessing.cpp
:language: cpp
:fragment: [ov:preprocess:build]
.. tab:: Python
.. doxygensnippet:: docs/snippets/ov_preprocessing.py
:language: python
:fragment: [ov:preprocess:build]
@endsphinxdirective
After this, `model` will accept U8 input with `{1, 480, 640, 3}` shape and `BGR` channel order. All conversion steps will be integrated into the execution graph. Now you can load the model on the device and pass your image to the model as is, without any data manipulation on the application's side.
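For instance, a minimal sketch of this final flow (the device name, shapes, and the image data pointer are placeholders):
```cpp
#include <memory>
#include <openvino/runtime/core.hpp>

// A sketch: run the built model directly on raw U8 BGR data
ov::Tensor infer_bgr(const std::shared_ptr<ov::Model>& model, void* bgr_data) {
    ov::Core core;
    auto compiled_model = core.compile_model(model, "CPU"); // device is a placeholder
    auto request = compiled_model.create_infer_request();
    // U8 {1, 480, 640, 3} matches the tensor format declared earlier
    ov::Tensor input(ov::element::u8, {1, 480, 640, 3}, bgr_data);
    request.set_input_tensor(input);
    request.infer();
    return request.get_output_tensor();
}
```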
## See Also
* [Preprocessing Details](./preprocessing_details.md)
* [Layout API overview](./layout_overview.md)
* <code>ov::preprocess::PrePostProcessor</code> C++ class documentation

View File

@ -57,7 +57,7 @@ Glossary of terms used in the OpenVINO™
| Term | Description |
| :--- |:-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| Batch | Number of images to analyze during one call of infer. Maximum batch size is a property of the network and it is set before loading of the network to the plugin. In NHWC, NCHW and NCDHW image data layout representation, the N refers to the number of images in the batch |
| Tensor | Memory container used for storing inputs, outputs of the network, weights and biases of the layers |
| Device (Affinitity) | A preferred Intel(R) hardware device to run the inference (CPU, GPU, etc.) |
@ -69,7 +69,7 @@ Glossary of terms used in the OpenVINO™
| OpenVINO™ Runtime | A C++ library with a set of classes that you can use in your application to infer input data (images) and get the result |
| OpenVINO™ API | The basic default API for all supported devices, which allows you to load a model from Intermediate Representation, set input and output formats and execute the model on various devices |
| OpenVINO™ <code>Core</code> | OpenVINO™ Core is a software component that manages inference on certain Intel(R) hardware devices: CPU, GPU, MYRIAD, GNA, etc. |
| <code>ov::Layout</code> | Image data layout refers to the representation of images batch. Layout shows a sequence of 4D or 5D tensor data in memory. A typical NCHW format represents pixel in horizontal direction, rows by vertical dimension, planes by channel and images into batch. See also [Layout API Overview](./OV_Runtime_UG/layout_overview.md) |
| <code>ov::element::Type</code> | Represents data element type. For example, f32 is 32-bit floating point, f16 is 16-bit floating point. Element type can be changed before loading the network to the plugin |

View File

@ -0,0 +1,55 @@
// Copyright (C) 2018-2022 Intel Corporation
// SPDX-License-Identifier: Apache-2.0
//
#include <openvino/core/layout.hpp>
int main() {
ov::Layout layout;
//! [ov:layout:simple]
layout = ov::Layout("NHWC");
//! [ov:layout:simple]
//! [ov:layout:complex]
// Each dimension has name separated by comma, layout is wrapped with square brackets
layout = ov::Layout("[time,temperature,humidity]");
//! [ov:layout:complex]
//! [ov:layout:partially_defined]
// First dimension is batch, 4th is 'channels'. Others are not important for us
layout = ov::Layout("N??C");
// Or the same using advanced syntax
layout = ov::Layout("[n,?,?,c]");
//! [ov:layout:partially_defined]
//! [ov:layout:dynamic]
// First dimension is 'batch' others are whatever
layout = ov::Layout("N...");
// Second dimension is 'channels' others are whatever
layout = ov::Layout("?C...");
// Last dimension is 'channels' others are whatever
layout = ov::Layout("...C");
//! [ov:layout:dynamic]
//! [ov:layout:predefined]
// returns 0 for batch
ov::layout::batch_idx("NCDHW");
// returns 1 for channels
ov::layout::channels_idx("NCDHW");
// returns 2 for depth
ov::layout::depth_idx("NCDHW");
// returns -2 for height
ov::layout::height_idx("...HW");
// returns -1 for width
ov::layout::width_idx("...HW");
//! [ov:layout:predefined]
//! [ov:layout:dump]
layout = ov::Layout("NCHW");
std::cout << layout.to_string(); // prints [N,C,H,W]
//! [ov:layout:dump]
return 0;
}

View File

@ -0,0 +1,54 @@
# Copyright (C) 2018-2022 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
#
# ! [ov:layout:simple]
from openvino.runtime import Layout
layout = Layout('NCHW')
# ! [ov:layout:simple]
# ! [ov:layout:complex]
# Each dimension has name separated by comma
# Layout is wrapped with square brackets
layout = Layout('[time,temperature,humidity]')
# ! [ov:layout:complex]
# ! [ov:layout:partially_defined]
# First dimension is batch, 4th is 'channels'.
# Others are not important for us
layout = Layout('N??C')
# Or the same using advanced syntax
layout = Layout('[n,?,?,c]')
# ! [ov:layout:partially_defined]
# ! [ov:layout:dynamic]
# First dimension is 'batch' others are whatever
layout = Layout('N...')
# Second dimension is 'channels' others are whatever
layout = Layout('?C...')
# Last dimension is 'channels' others are whatever
layout = Layout('...C')
# ! [ov:layout:dynamic]
# ! [ov:layout:predefined]
from openvino.runtime import layout_helpers
# returns 0 for batch
layout_helpers.batch_idx(Layout('NCDHW'))
# returns 1 for channels
layout_helpers.channels_idx(Layout('NCDHW'))
# returns 2 for depth
layout_helpers.depth_idx(Layout('NCDHW'))
# returns -2 for height
layout_helpers.height_idx(Layout('...HW'))
# returns -1 for width
layout_helpers.width_idx(Layout('...HW'))
# ! [ov:layout:predefined]
# ! [ov:layout:dump]
layout = Layout('NCHW')
print(layout) # prints [N,C,H,W]
# ! [ov:layout:dump]

View File

@ -0,0 +1,152 @@
// Copyright (C) 2018-2022 Intel Corporation
// SPDX-License-Identifier: Apache-2.0
//
#include <openvino/runtime/core.hpp>
#include <openvino/opsets/opset8.hpp>
#include <openvino/core/preprocess/pre_post_process.hpp>
void ppp_input_1(ov::preprocess::PrePostProcessor& ppp) {
//! [ov:preprocess:input_1]
ppp.input() // no index/name is needed if model has one input
.preprocess().scale(50.f);
ppp.output() // same for output
.postprocess().convert_element_type(ov::element::u8);
//! [ov:preprocess:input_1]
//! [ov:preprocess:mean_scale]
ppp.input("input").preprocess().mean(128).scale(127);
//! [ov:preprocess:mean_scale]
//! [ov:preprocess:mean_scale_array]
// Suppose model's shape is {1, 3, 224, 224}
ppp.input("input").model().set_layout("NCHW"); // N=1, C=3, H=224, W=224
// Mean/Scale has 3 values which matches with C=3
ppp.input("input").preprocess()
.mean({103.94, 116.78, 123.68}).scale({57.21, 57.45, 57.73});
//! [ov:preprocess:mean_scale_array]
//! [ov:preprocess:convert_element_type]
// First define data type for your tensor
ppp.input("input").tensor().set_element_type(ov::element::u8);
// Then define preprocessing step
ppp.input("input").preprocess().convert_element_type(ov::element::f32);
// If conversion is needed to `model's` element type, 'f32' can be omitted
ppp.input("input").preprocess().convert_element_type();
//! [ov:preprocess:convert_element_type]
//! [ov:preprocess:convert_layout]
// First define layout for your tensor
ppp.input("input").tensor().set_layout("NHWC");
// Then define layout of model
ppp.input("input").model().set_layout("NCHW");
std::cout << ppp; // Will print 'implicit layout conversion step'
//! [ov:preprocess:convert_layout]
//! [ov:preprocess:convert_layout_2]
ppp.input("input").tensor().set_shape({1, 480, 640, 3});
// Model expects shape {1, 3, 480, 640}
ppp.input("input").preprocess().convert_layout({0, 3, 1, 2});
// 0 -> 0; 3 -> 1; 1 -> 2; 2 -> 3
//! [ov:preprocess:convert_layout_2]
//! [ov:preprocess:resize_1]
ppp.input("input").tensor().set_shape({1, 3, 960, 1280});
ppp.input("input").model().set_layout("??HW");
ppp.input("input").preprocess().resize(ov::preprocess::ResizeAlgorithm::RESIZE_LINEAR, 480, 640);
//! [ov:preprocess:resize_1]
//! [ov:preprocess:resize_2]
ppp.input("input").tensor().set_shape({1, 3, 960, 1280});
ppp.input("input").model().set_layout("??HW"); // Model accepts {1, 3, 480, 640} shape
// Resize to model's dimension
ppp.input("input").preprocess().resize(ov::preprocess::ResizeAlgorithm::RESIZE_LINEAR);
//! [ov:preprocess:resize_2]
//! [ov:preprocess:convert_color_1]
ppp.input("input").tensor().set_color_format(ov::preprocess::ColorFormat::BGR);
ppp.input("input").preprocess().convert_color(ov::preprocess::ColorFormat::RGB);
//! [ov:preprocess:convert_color_1]
//! [ov:preprocess:convert_color_2]
// This will split original `input` into 2 separate inputs: 'input/y' and 'input/uv'
ppp.input("input").tensor().set_color_format(ov::preprocess::ColorFormat::NV12_TWO_PLANES);
ppp.input("input").preprocess().convert_color(ov::preprocess::ColorFormat::RGB);
std::cout << ppp; // Dump preprocessing steps to see what will happen
//! [ov:preprocess:convert_color_2]
}
void ppp_input_2(ov::preprocess::PrePostProcessor& ppp) {
//! [ov:preprocess:input_index]
auto &input_1 = ppp.input(1); // Gets 2nd input in a model
auto &output_1 = ppp.output(2); // Get output with index=2 (3rd one) in a model
//! [ov:preprocess:input_index]
}
void ppp_input_name(ov::preprocess::PrePostProcessor& ppp) {
//! [ov:preprocess:input_name]
auto &input_image = ppp.input("image");
auto &output_result = ppp.output("result");
//! [ov:preprocess:input_name]
}
int main() {
std::string model_path;
std::string input_name;
//! [ov:preprocess:create]
ov::Core core;
std::shared_ptr<ov::Model> model = core.read_model(model_path);
ov::preprocess::PrePostProcessor ppp(model);
//! [ov:preprocess:create]
//! [ov:preprocess:tensor]
ov::preprocess::InputInfo& input = ppp.input(input_name);
input.tensor()
.set_element_type(ov::element::u8)
.set_shape({1, 480, 640, 3})
.set_layout("NHWC")
.set_color_format(ov::preprocess::ColorFormat::BGR);
//! [ov:preprocess:tensor]
//! [ov:preprocess:model]
// The model's input already 'knows' its shape and data type, no need to specify them here
input.model().set_layout("NCHW");
//! [ov:preprocess:model]
//! [ov:preprocess:steps]
input.preprocess()
.convert_element_type(ov::element::f32)
.convert_color(ov::preprocess::ColorFormat::RGB)
.resize(ov::preprocess::ResizeAlgorithm::RESIZE_LINEAR)
.mean({100.5, 101, 101.5})
.scale({50., 51., 52.});
// Not needed, such conversion will be added implicitly
// .convert_layout("NCHW");
//! [ov:preprocess:steps]
//! [ov:preprocess:custom]
ppp.input("input_image").preprocess()
.custom([](const ov::Output<ov::Node>& node) {
// Custom nodes can be inserted as Pre-processing steps
return std::make_shared<ov::opset8::Abs>(node);
});
//! [ov:preprocess:custom]
//! [ov:preprocess:postprocess]
// Model's output has 'NCHW' layout
ppp.output("result_image").model().set_layout("NCHW");
// Set target user's tensor to U8 type + 'NHWC' layout
// Precision & layout conversions will be done implicitly
ppp.output("result_image").tensor()
.set_layout("NHWC")
.set_element_type(ov::element::u8);
// Also it is possible to insert some custom operations
ppp.output("result_image").postprocess()
.custom([](const ov::Output<ov::Node>& node) {
// Custom nodes can be inserted as Post-processing steps
return std::make_shared<ov::opset8::Abs>(node);
});
//! [ov:preprocess:postprocess]
//! [ov:preprocess:build]
std::cout << "Dump preprocessor: " << ppp << std::endl;
model = ppp.build();
//! [ov:preprocess:build]
OPENVINO_ASSERT(model, "Model is invalid");
return 0;
}

View File

@ -0,0 +1,171 @@
# Copyright (C) 2018-2022 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
#
from openvino.preprocess import ResizeAlgorithm, ColorFormat
from openvino.runtime import Layout, Type
xml_path = ''
input_name = ''
# ! [ov:preprocess:create]
from openvino.preprocess import PrePostProcessor
from openvino.runtime import Core
core = Core()
model = core.read_model(model=xml_path)
ppp = PrePostProcessor(model)
# ! [ov:preprocess:create]
# ! [ov:preprocess:tensor]
from openvino.preprocess import ColorFormat
from openvino.runtime import Layout, Type
ppp.input(input_name).tensor() \
.set_element_type(Type.u8) \
.set_shape([1, 480, 640, 3]) \
.set_layout(Layout('NHWC')) \
.set_color_format(ColorFormat.BGR)
# ! [ov:preprocess:tensor]
# ! [ov:preprocess:model]
# The model's input already 'knows' its shape and data type, no need to specify them here
ppp.input(input_name).model().set_layout(Layout('NCHW'))
# ! [ov:preprocess:model]
# ! [ov:preprocess:steps]
from openvino.preprocess import ResizeAlgorithm
ppp.input(input_name).preprocess() \
.convert_element_type(Type.f32) \
.convert_color(ColorFormat.RGB) \
.resize(ResizeAlgorithm.RESIZE_LINEAR) \
.mean([100.5, 101, 101.5]) \
.scale([50., 51., 52.])
# .convert_layout(Layout('NCHW'))  # Not needed, such conversion will be added implicitly
# ! [ov:preprocess:steps]
# ! [ov:preprocess:build]
print(f'Dump preprocessor: {ppp}')
model = ppp.build()
# ! [ov:preprocess:build]
# ! [ov:preprocess:input_index]
ppp.input(1) # Gets 2nd input in a model
ppp.output(2) # Gets output with index=2 (3rd one) in a model
# ! [ov:preprocess:input_index]
# ! [ov:preprocess:input_name]
ppp.input('image')
ppp.output('result')
# ! [ov:preprocess:input_name]
# ! [ov:preprocess:input_1]
# no index/name is needed if model has one input
ppp.input().preprocess().scale(50.)
# same for output
ppp.output() \
.postprocess().convert_element_type(Type.u8)
# ! [ov:preprocess:input_1]
# ! [ov:preprocess:mean_scale]
ppp.input('input').preprocess().mean(128).scale(127)
# ! [ov:preprocess:mean_scale]
# ! [ov:preprocess:mean_scale_array]
# Suppose model's shape is {1, 3, 224, 224}
# N=1, C=3, H=224, W=224
ppp.input('input').model().set_layout(Layout('NCHW'))
# Mean/Scale has 3 values which matches with C=3
ppp.input('input').preprocess() \
.mean([103.94, 116.78, 123.68]).scale([57.21, 57.45, 57.73])
# ! [ov:preprocess:mean_scale_array]
# ! [ov:preprocess:convert_element_type]
# First define data type for your tensor
ppp.input('input').tensor().set_element_type(Type.u8)
# Then define preprocessing step
ppp.input('input').preprocess().convert_element_type(Type.f32)
# If conversion is needed to `model's` element type, 'f32' can be omitted
ppp.input('input').preprocess().convert_element_type()
# ! [ov:preprocess:convert_element_type]
# ! [ov:preprocess:convert_layout]
# First define layout for your tensor
ppp.input('input').tensor().set_layout(Layout('NHWC'))
# Then define layout of model
ppp.input('input').model().set_layout(Layout('NCHW'))
print(ppp) # Will print 'implicit layout conversion step'
# ! [ov:preprocess:convert_layout]
# ! [ov:preprocess:convert_layout_2]
ppp.input('input').tensor().set_shape([1, 480, 640, 3])
# Model expects shape {1, 3, 480, 640}
ppp.input('input').preprocess()\
.convert_layout([0, 3, 1, 2])
# 0 -> 0; 3 -> 1; 1 -> 2; 2 -> 3
# ! [ov:preprocess:convert_layout_2]
# ! [ov:preprocess:resize_1]
ppp.input('input').tensor().set_shape([1, 3, 960, 1280])
ppp.input('input').model().set_layout(Layout('??HW'))
ppp.input('input').preprocess()\
.resize(ResizeAlgorithm.RESIZE_LINEAR, 480, 640)
# ! [ov:preprocess:resize_1]
# ! [ov:preprocess:resize_2]
ppp.input('input').tensor().set_shape([1, 3, 960, 1280])
# Model accepts {1, 3, 480, 640} shape, thus last dimensions are 'H' and 'W'
ppp.input('input').model().set_layout(Layout('??HW'))
# Resize to model's dimension
ppp.input('input').preprocess().resize(ResizeAlgorithm.RESIZE_LINEAR)
# ! [ov:preprocess:resize_2]
# ! [ov:preprocess:convert_color_1]
ppp.input('input').tensor().set_color_format(ColorFormat.BGR)
ppp.input('input').preprocess().convert_color(ColorFormat.RGB)
# ! [ov:preprocess:convert_color_1]
# ! [ov:preprocess:convert_color_2]
# This will split original `input` into 2 separate inputs: 'input/y' and 'input/uv'
ppp.input('input').tensor()\
.set_color_format(ColorFormat.NV12_TWO_PLANES)
ppp.input('input').preprocess()\
.convert_color(ColorFormat.RGB)
print(ppp) # Dump preprocessing steps to see what will happen
# ! [ov:preprocess:convert_color_2]
# ! [ov:preprocess:custom]
# It is possible to insert some custom operations
import openvino.runtime.opset8 as ops
from openvino.runtime import Output
from openvino.runtime.utils.decorators import custom_preprocess_function
@custom_preprocess_function
def custom_abs(output: Output):
# Custom nodes can be inserted as Preprocessing steps
return ops.abs(output)
ppp.input("input_image").preprocess() \
.custom(custom_abs)
# ! [ov:preprocess:custom]
# ! [ov:preprocess:postprocess]
# Model's output has 'NCHW' layout
ppp.output('result_image').model().set_layout(Layout('NCHW'))
# Set target user's tensor to U8 type + 'NHWC' layout
# Precision & layout conversions will be done implicitly
ppp.output('result_image').tensor()\
.set_layout(Layout("NHWC"))\
.set_element_type(Type.u8)
# Also it is possible to insert some custom operations
import openvino.runtime.opset8 as ops
from openvino.runtime import Output
from openvino.runtime.utils.decorators import custom_preprocess_function
@custom_preprocess_function
def custom_abs(output: Output):
# Custom nodes can be inserted as Post-processing steps
return ops.abs(output)
ppp.output("result_image").postprocess()\
.custom(custom_abs)
# ! [ov:preprocess:postprocess]

View File

@ -16,6 +16,23 @@
namespace ov {
/// \brief ov::Layout represents the text information of tensor's dimensions/axes. E.g. layout `NCHW` means that 4D
/// tensor `{-1, 3, 480, 640}` will have:
/// - 0: `N = -1`: batch dimension is dynamic
/// - 1: `C = 3`: number of channels is '3'
/// - 2: `H = 480`: image height is 480
/// - 3: `W = 640`: image width is 640
///
/// Examples: `ov::Layout` can be specified for:
/// - Preprocessing purposes. E.g.
/// - To apply normalization (means/scales) it is usually required to set 'C' dimension in a layout.
/// - To resize the image to specified width/height it is needed to set 'H' and 'W' dimensions in a layout
/// - To transpose image - source and target layout can be set (see
/// `ov::preprocess::PreProcessSteps::convert_layout`)
/// - To set/get model's batch (see `ov::get_batch`/`ov::set_batch`) it is required in general to specify 'N' dimension
/// in layout for appropriate inputs
///
/// Refer also to `ov::layout` namespace for various additional helper functions of `ov::Layout`
class OPENVINO_API Layout {
public:
/// \brief Constructs a dynamic Layout with no layout information.
@ -61,6 +78,7 @@ public:
/// \brief String representation of Layout
std::string to_string() const;
/// \brief Returns 'true' if layout has no information, i.e. equals to Layout()
bool empty() const {
return *this == Layout();
}

View File

@ -42,37 +42,39 @@ public:
/// \brief Add 'convert layout' operation to specified layout.
///
/// \param dst_layout New layout after conversion. If not specified - destination layout is obtained from
/// appropriate tensor output properties.
///
/// \return Reference to 'this' to allow chaining with other calls in a builder-like manner.
///
/// Adds appropriate 'transpose' operation between model layout and user's desired layout.
/// Current implementation requires source and destination layout to have the same number of dimensions.
///
/// Example: when model data has output in 'NCHW' layout ([1, 3, 224, 224]) but user needs
/// interleaved output image ('NHWC', [1, 224, 224, 3]). Post-processing may look like this:
///
/// \code{.cpp}
/// auto proc = PrePostProcessor(function);
/// proc.output().model().set_layout("NCHW");            // model output is NCHW
/// proc.output().postprocess().convert_layout("NHWC");  // User needs output as NHWC
/// \endcode
PostProcessSteps& convert_layout(const Layout& dst_layout = {});
/// \brief Add convert layout operation by direct specification of transposed dimensions.
///
/// \param dims Dimensions array specifying places for new axis. If not empty, array size (N) must match to input
/// shape rank. Array values shall contain all values from 0 to N-1. If empty, no actual conversion will be added.
///
/// \return Reference to 'this' to allow chaining with other calls in a builder-like manner.
///
/// Example: model produces output with shape [1, 3, 480, 640] and user needs
/// interleaved output image [1, 480, 640, 3]. Post-processing may look like this:
///
/// \code{.cpp}
/// auto proc = PrePostProcessor(function);
/// proc.output().postprocess().convert_layout({0, 2, 3, 1});
/// function = proc.build();
/// \endcode
PostProcessSteps& convert_layout(const std::vector<uint64_t>& dims);
/// \brief Signature for custom postprocessing operation. Custom postprocessing operation takes one output node and

View File

@ -15,10 +15,6 @@ class Model;
namespace preprocess {
/// \brief Main class for adding pre- and post- processing steps to existing ov::Model
/// API has Builder-like style to allow chaining calls in client's code, like
/// \code{.cpp}
/// auto proc = PrePostProcessor(function).input(<for input1>).input(<input2>);
/// \endcode
///
/// This is a helper class for writing easy pre- and post- processing operations on ov::Model object assuming that
/// any preprocess operation takes one input and produces one output.

View File

@ -117,44 +117,48 @@ public:
/// \brief Add 'convert layout' operation to specified layout.
///
/// \param dst_layout New layout after conversion. If not specified - destination layout is obtained from
/// appropriate model input properties.
///
/// \return Reference to 'this' to allow chaining with other calls in a builder-like manner.
///
/// Adds appropriate 'transpose' operation between user layout and target layout.
/// Current implementation requires source and destination layout to have the same number of dimensions.
///
/// Example: when user data has 'NHWC' layout (example is RGB image, [1, 224, 224, 3]) but model expects
/// planar input image ('NCHW', [1, 3, 224, 224]). Preprocessing may look like this:
///
/// \code{.cpp}
/// auto proc = PrePostProcessor(model);
/// proc.input().tensor().set_layout("NHWC");          // User data is NHWC
/// proc.input().preprocess().convert_layout("NCHW");  // model expects input as NCHW
/// \endcode
PreProcessSteps& convert_layout(const Layout& dst_layout = {});
/// \brief Add convert layout operation by direct specification of transposed dimensions.
///
/// \param dims Dimensions array specifying places for new axis. If not empty, array size (N) must match to input
/// shape rank. Array values shall contain all values from 0 to N-1. If empty, no actual conversion will be added.
///
/// \return Reference to 'this' to allow chaining with other calls in a builder-like manner.
///
/// Example: when user data has input RGB image {1x480x640x3} but model expects
/// planar input image ('NCHW', [1, 3, 480, 640]). Preprocessing may look like this:
///
/// \code{.cpp}
/// auto proc = PrePostProcessor(function);
/// proc.input().preprocess().convert_layout({0, 3, 1, 2});
/// \endcode
PreProcessSteps& convert_layout(const std::vector<uint64_t>& dims);
/// \brief Reverse channels operation.
///
/// \return Reference to 'this' to allow chaining with other calls in a builder-like manner.
///
/// Adds appropriate operation which reverses channels layout. Operation requires layout having a 'C'
/// dimension. Operation convert_color (RGB<->BGR) also reverses channels, but only for NHWC layout.
///
/// Example: when user data has 'NCHW' layout (example is [1, 3, 224, 224] RGB order) but model expects
/// BGR planes order. Preprocessing may look like this:
///
/// \code{.cpp}
/// auto proc = PrePostProcessor(function);
/// proc.input().tensor().set_layout("NCHW"); // User data is NCHW, RGB order
/// proc.input().preprocess().reverse_channels();
/// \endcode
PreProcessSteps& reverse_channels();
};