DOCS shift to rst Custom operations (#16254)

* move to rst

* move to rst

* change intro

* fix_directive

* fix code snippets

* sphinx snippets fix

* change link

* align tab

* snippet path fix

* fix code snippet path

* fix code snippets

* fix hyperlink

* change format

* change intro

* fix list format
This commit is contained in:
Tatiana Savina 2023-03-16 10:55:39 +01:00 committed by GitHub
parent 461cc2aee8
commit c18f3824b0
3 changed files with 406 additions and 255 deletions

View File

@@ -1,237 +1,353 @@
# How to Implement Custom GPU Operations {#openvino_docs_Extensibility_UG_GPU}
@sphinxdirective
To enable operations not supported by OpenVINO™ out of the box, you may need an extension for the OpenVINO operation set and a custom kernel for the device you target. This article describes custom kernel support for the GPU device.

The GPU code path abstracts many details of OpenCL. You need to provide the kernel code in OpenCL C and an XML configuration file that connects the kernel and its parameters to the parameters of the operation.

There are two options for using the custom operation configuration file:

* Include a section with your kernels into the automatically-loaded ``<lib_path>/cldnn_global_custom_kernels/cldnn_global_custom_kernels.xml`` file.
* Call the :ref:`ov::Core::set_property() <doxid-classov_1_1_core_1aa953cb0a1601dbc9a34ef6ba82b8476e>` method from your application with the ``"CONFIG_FILE"`` key and the configuration file name as a value before loading the network that uses custom operations to the plugin:

.. tab-set::

   .. tab-item:: C++

      .. doxygensnippet:: docs/snippets/gpu/custom_kernels_api.cpp
         :language: cpp
         :fragment: [part0]

   .. tab-item:: Python

      .. doxygensnippet:: docs/snippets/gpu/custom_kernels_api.py
         :language: python
         :fragment: [part0]
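If you prefer a self-contained illustration over the referenced snippet, a minimal C++ sketch of this call may look as follows (the model and configuration file names are assumptions for illustration only):

.. code-block:: cpp

   #include <openvino/runtime/core.hpp>

   int main() {
       ov::Core core;
       // Point the GPU plugin to the custom kernel configuration file
       // before reading or compiling a model that uses the custom operation.
       core.set_property("GPU", {{"CONFIG_FILE", "custom_layer_example.xml"}});
       auto model = core.read_model("model_with_custom_op.xml");
       auto compiled_model = core.compile_model(model, "GPU");
       return 0;
   }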
All OpenVINO samples, except the trivial ``hello_classification``, and most Open Model Zoo demos
feature a dedicated command-line option ``-c`` to load custom kernels. For example, to load custom operations for the classification sample, run the command below:

.. code-block:: sh

   $ ./classification_sample -m <path_to_model>/bvlc_alexnet_fp16.xml -i ./validation_set/daily/227x227/apron.bmp -d GPU
    -c <absolute_path_to_config>/custom_layer_example.xml
.. _config-file-format:

Configuration File Format
#########################

The configuration file is expected to follow the ``.xml`` file structure
with a node of the type ``CustomLayer`` for every custom operation you provide.

The definitions described in the sections below use the following notations:
.. list-table::
   :header-rows: 1

   * - Notation
     - Description
   * - (0/1)
     - Can have zero or one instance of this node or attribute
   * - (1)
     - Must have only one instance of this node or attribute
   * - (0+)
     - Can have any number of instances of this node or attribute
   * - (1+)
     - Can have one or more instances of this node or attribute
CustomLayer Node and Sub-Node Structure
+++++++++++++++++++++++++++++++++++++++

The ``CustomLayer`` node contains the entire configuration for a single custom operation.

.. list-table::
   :header-rows: 1

   * - Attribute Name
     - #
     - Description
   * - ``name``
     - (1)
     - The name of the operation type to be used. This name should be identical to the type used in the OpenVINO IR.
   * - ``type``
     - (1)
     - Must be ``SimpleGPU``.
   * - ``version``
     - (1)
     - Must be ``1``.

**Sub-nodes**: ``Kernel`` (1), ``Buffers`` (1), ``CompilerOptions`` (0+),
``WorkSizes`` (0/1)
Kernel Node and Sub-Node Structure
++++++++++++++++++++++++++++++++++

The ``Kernel`` node contains all kernel source code configuration.

**Sub-nodes**: ``Source`` (1+), ``Define`` (0+)

Source Node and Sub-Node Structure
++++++++++++++++++++++++++++++++++

The ``Source`` node points to a single OpenCL source file.

.. list-table::
   :header-rows: 1

   * - Attribute Name
     - #
     - Description
   * - ``filename``
     - (1)
     - Name of the file containing OpenCL source code. The path is relative to your executable. Multiple source nodes will have their sources concatenated in order.

**Sub-nodes**: None
Define Node and Sub-Node Structure
++++++++++++++++++++++++++++++++++

The ``Define`` node configures a single ``#define`` instruction to be added to
the sources during compilation (JIT).

.. list-table::
   :header-rows: 1

   * - Attribute Name
     - #
     - Description
   * - ``name``
     - (1)
     - The name of the defined JIT. For static constants, this can include the value as well, which is taken as a string.
   * - ``param``
     - (0/1)
     - This parameter value is used as the value of this JIT definition.
   * - ``type``
     - (0/1)
     - The parameter type. Accepted values: ``int``, ``float``, and ``int[]``, ``float[]`` for arrays.
   * - ``default``
     - (0/1)
     - The default value to be used if the specified parameters are missing from the operation in the OpenVINO IR.

**Sub-nodes:** None

The resulting JIT has the following form:
``#define [name] [type] [value/default]``.
Buffers Node and Sub-Node Structure
+++++++++++++++++++++++++++++++++++

The ``Buffers`` node configures all input/output buffers for the OpenCL entry
function. The node itself has no attributes.

**Sub-nodes:** ``Data`` (0+), ``Tensor`` (1+)

Data Node and Sub-Node Structure
++++++++++++++++++++++++++++++++

The ``Data`` node configures a single input with static data, for example,
weights or biases.

.. list-table::
   :header-rows: 1

   * - Attribute Name
     - #
     - Description
   * - ``name``
     - (1)
     - Name of a blob attached to an operation in the OpenVINO IR.
   * - ``arg-index``
     - (1)
     - 0-based index in the entry function arguments to be bound to.

**Sub-nodes**: None
Tensor Node and Sub-Node Structure
++++++++++++++++++++++++++++++++++

The ``Tensor`` node configures a single input or output tensor.

.. list-table::
   :header-rows: 1

   * - Attribute Name
     - #
     - Description
   * - ``arg-index``
     - (1)
     - 0-based index in the entry function arguments to be bound to.
   * - ``type``
     - (1)
     - ``input`` or ``output``
   * - ``port-index``
     - (1)
     - 0-based index in the operation input/output ports in the OpenVINO IR
   * - ``format``
     - (0/1)
     - Data layout declaration for the tensor. Accepted values: ``BFYX``, ``BYXF``, ``YXFB``, ``FYXB``, and the same values in all lowercase. Default value: ``BFYX``.
CompilerOptions Node and Sub-Node Structure
+++++++++++++++++++++++++++++++++++++++++++

The ``CompilerOptions`` node configures the compilation flags for the OpenCL
sources.

.. list-table::
   :header-rows: 1

   * - Attribute Name
     - #
     - Description
   * - ``options``
     - (1)
     - Options string to be passed to the OpenCL compiler

**Sub-nodes**: None
WorkSizes Node and Sub-Node Structure
+++++++++++++++++++++++++++++++++++++

The ``WorkSizes`` node configures the global/local work sizes to be used when
queuing an OpenCL program for execution.

.. list-table::
   :header-rows: 1

   * - Attribute Name
     - #
     - Description
   * - ``global``, ``local``
     - (0/1), (0/1)
     - An array of up to three integers or formulas for defining the OpenCL work sizes to be used during execution. The formulas can use the values of the B, F, Y, X dimensions and contain the operators ``+``, ``-``, ``/``, ``*``, ``%``. All operators are evaluated in integer arithmetic. Default value: ``global="B*F*Y*X" local=""``
   * - ``dim``
     - (0/1)
     - A tensor to take the work size from. Accepted values: ``input N``, ``output``, where ``N`` is the index of an input tensor starting with 0. Default value: ``output``

**Sub-nodes**: None
Example Configuration File
##########################

The following code sample provides an example configuration file in XML
format. For information on the configuration file structure, see the
:ref:`Configuration File Format <config-file-format>` section.

.. code-block:: xml

   <CustomLayer name="ReLU" type="SimpleGPU" version="1">
     <Kernel entry="example_relu_kernel">
       <Source filename="custom_layer_kernel.cl"/>
       <Define name="neg_slope" type="float" param="negative_slope" default="0.0"/>
     </Kernel>
     <Buffers>
       <Tensor arg-index="0" type="input" port-index="0" format="BFYX"/>
       <Tensor arg-index="1" type="output" port-index="0" format="BFYX"/>
     </Buffers>
     <CompilerOptions options="-cl-mad-enable"/>
     <WorkSizes global="X,Y,B*F"/>
   </CustomLayer>
Built-In Definitions for Custom Layers
######################################

The following table includes definitions that are attached before
user sources.

For an example, see :ref:`Example Kernel <example-kernel>`.

.. list-table::
   :header-rows: 1

   * - Name
     - Value
   * - ``NUM_INPUTS``
     - Number of the input tensors bound to this kernel
   * - ``GLOBAL_WORKSIZE``
     - An array of global work sizes used to execute this kernel
   * - ``GLOBAL_WORKSIZE_SIZE``
     - The size of the ``GLOBAL_WORKSIZE`` array
   * - ``LOCAL_WORKSIZE``
     - An array of local work sizes used to execute this kernel
   * - ``LOCAL_WORKSIZE_SIZE``
     - The size of the ``LOCAL_WORKSIZE`` array
   * - ``<TENSOR>_DIMS``
     - An array of the tensor dimension sizes. Always ordered as ``BFYX``
   * - ``<TENSOR>_DIMS_SIZE``
     - The size of the ``<TENSOR>_DIMS`` array
   * - ``<TENSOR>_TYPE``
     - The data type of the tensor: ``float``, ``half``, or ``char``
   * - ``<TENSOR>_FORMAT_<TENSOR_FORMAT>``
     - The format of the tensor: ``BFYX``, ``BYXF``, ``YXFB``, ``FYXB``, or ``ANY``. The format is concatenated to the defined name. You can use the tensor format to define code paths in your code with ``#ifdef/#endif``.
   * - ``<TENSOR>_LOWER_PADDING``
     - An array of padding elements used for the tensor dimensions before they start. Always ordered as ``BFYX``.
   * - ``<TENSOR>_LOWER_PADDING_SIZE``
     - The size of the ``<TENSOR>_LOWER_PADDING`` array
   * - ``<TENSOR>_UPPER_PADDING``
     - An array of padding elements used for the tensor dimensions after they end. Always ordered as ``BFYX``.
   * - ``<TENSOR>_UPPER_PADDING_SIZE``
     - The size of the ``<TENSOR>_UPPER_PADDING`` array
   * - ``<TENSOR>_PITCHES``
     - The offset (in elements) between adjacent elements in each dimension. Always ordered as ``BFYX``.
   * - ``<TENSOR>_PITCHES_SIZE``
     - The size of the ``<TENSOR>_PITCHES`` array
   * - ``<TENSOR>_OFFSET``
     - The number of elements from the start of the tensor to the first valid element, bypassing the lower padding.

All ``<TENSOR>`` values are automatically defined for every tensor
bound to this operation, such as ``INPUT0``, ``INPUT1``, and ``OUTPUT0``, as shown
in the following example:

.. code-block:: c

   #define INPUT0_DIMS_SIZE 4
   #define INPUT0_DIMS (int []){ 1,96,55,55, }
.. _example-kernel:

Example Kernel
##############

.. code-block:: c

   #pragma OPENCL EXTENSION cl_khr_fp16 : enable
   __kernel void example_relu_kernel(
       const __global INPUT0_TYPE* input0,
       __global OUTPUT0_TYPE* output)
   {
       const uint idx = get_global_id(0);
       const uint idy = get_global_id(1);
       const uint idbf = get_global_id(2); // batches*features, as OpenCL supports 3D nd-ranges only
       const uint feature = idbf % OUTPUT0_DIMS[1];
       const uint batch = idbf / OUTPUT0_DIMS[1];
       // notice that pitches are in elements, not in bytes!
       const uint in_id = batch*INPUT0_PITCHES[0] + feature*INPUT0_PITCHES[1] + idy*INPUT0_PITCHES[2] + idx*INPUT0_PITCHES[3] + INPUT0_OFFSET;
       const uint out_id = batch*OUTPUT0_PITCHES[0] + feature*OUTPUT0_PITCHES[1] + idy*OUTPUT0_PITCHES[2] + idx*OUTPUT0_PITCHES[3] + OUTPUT0_OFFSET;

       INPUT0_TYPE value = input0[in_id];
       // neg_slope (which is non-zero for leaky ReLU) is put automatically as a #define, refer to the config xml
       output[out_id] = value < 0 ? value * neg_slope : value;
   }

.. note::

   As described in the previous section, all items such as ``INPUT0_TYPE`` are actually defined as OpenCL (pre-)compiler inputs by OpenVINO for efficiency reasons. See the :ref:`Debugging Tips <debugging-tips>` below for information on debugging the results.
.. _debugging-tips:

Debugging Tips
##############

**Using printf in the OpenCL™ Kernels**

To debug specific values, use ``printf`` in your kernels.
However, be careful not to output excessively, as this could generate too much data. The ``printf`` output buffer is limited in size, so your output can be truncated to fit it. Also, because of buffering, you actually get the entire buffer of output only when the execution ends.

For more information, refer to the `printf Function <https://www.khronos.org/registry/OpenCL/sdk/1.2/docs/man/xhtml/printfFunction.html>`__.
@endsphinxdirective

View File

@@ -18,12 +18,9 @@
openvino_docs_transformations
OpenVINO Plugin Developer Guide <openvino_docs_ie_plugin_dg_overview>
@endsphinxdirective
The Intel® Distribution of OpenVINO™ toolkit supports neural network models trained with various frameworks, including
TensorFlow, PyTorch, ONNX, PaddlePaddle, Apache MXNet, Caffe, and Kaldi. The list of supported operations is different for
each of the supported frameworks. To see the operations supported by your framework, refer to :doc:`Supported Framework Operations <openvino_docs_MO_DG_prepare_model_Supported_Frameworks_Layers>`.
Custom operations, which are not included in the list, are not recognized by OpenVINO out-of-the-box. The need for custom operations may arise in two cases:
@@ -35,31 +32,33 @@ Importing models with such operations requires additional steps. This guide illu
Defining a new custom operation basically consists of two parts:

1. Definition of operation semantics in OpenVINO, the code that describes how this operation should be inferred consuming input tensor(s) and producing output tensor(s). The implementation of execution kernels for :doc:`GPU <openvino_docs_Extensibility_UG_GPU>` is described in separate guides.

2. Mapping rule that facilitates conversion of framework operation representation to OpenVINO defined operation semantics.

The first part is required for inference. The second part is required for successful import of a model containing such operations from the original framework model format. There are several options to implement each part. The following sections will describe them in detail.
Definition of Operation Semantics
#################################

If the custom operation can be mathematically represented as a combination of existing OpenVINO operations and such decomposition gives the desired performance, then low-level operation implementation is not required. Refer to the latest OpenVINO operation set when deciding on the feasibility of such decomposition. You can use any valid combination of existing operations. The next section of this document describes how to map a custom operation.

If such decomposition is not possible or appears too bulky with a large number of constituent operations that do not perform well, then a new class for the custom operation should be implemented, as described in the :doc:`Custom Operation Guide <openvino_docs_Extensibility_UG_add_openvino_ops>`.

You might prefer implementing a custom operation class if you already have a generic C++ implementation of the operation kernel. Otherwise, try to decompose the operation first, as described above. Then, after verifying correctness of inference and the resulting performance, you may move on to the optional implementation of Bare Metal C++.
Mapping from Framework Operation
################################

Mapping of a custom operation is implemented differently depending on the model format used for import. You may choose one of the following:

1. If a model is represented in the ONNX (including models exported from PyTorch to ONNX), PaddlePaddle or TensorFlow formats, then one of the classes from :doc:`Frontend Extension API <openvino_docs_Extensibility_UG_Frontend_Extensions>` should be used. It consists of several classes available in C++ which can be used with the ``--extensions`` option in Model Optimizer or when a model is imported directly to OpenVINO runtime using the ``read_model`` method. Python API is also available for runtime model import.

2. If a model is represented in the Caffe, Kaldi or MXNet formats, then :doc:`Model Optimizer Extensions <openvino_docs_MO_DG_prepare_model_customize_model_optimizer_Customize_Model_Optimizer>` should be used. This approach is available for model conversion in Model Optimizer only.
The existence of two approaches is explained by the two different types of frontends used for model conversion in OpenVINO: new frontends (ONNX, PaddlePaddle and TensorFlow) and legacy frontends (Caffe, Kaldi and Apache MXNet). Model Optimizer can use both types of frontends, in contrast to the direct import of a model with the ``read_model`` method, which can use new frontends only. Follow one of the appropriate guides referenced above to implement mappings depending on the framework frontend.
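For instance, a minimal C++ sketch of direct import through a new frontend might look like this (the library and model file names are illustrative assumptions):

.. code-block:: cpp

   #include <openvino/runtime/core.hpp>

   int main() {
       ov::Core core;
       // Extensions with the custom operation and its mapping must be
       // registered before the model is read by the new frontend.
       core.add_extension("libcustom_extensions.so");  // illustrative name
       auto model = core.read_model("model_with_custom_op.onnx");
       return 0;
   }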
If you are implementing extensions for the new ONNX, PaddlePaddle or TensorFlow frontends and plan to use the ``--extensions`` option in Model Optimizer for model conversion, then the extensions should be:

1. Implemented in C++ only.
@@ -69,109 +68,123 @@ Model Optimizer does not support new frontend extensions written in Python API.
The remaining part of this guide describes the application of the Frontend Extension API for new frontends.

Registering Extensions
######################

A custom operation class and a new mapping frontend extension class object should be registered to be usable in OpenVINO runtime.

.. note::

   This documentation is derived from the `Template extension <https://github.com/openvinotoolkit/openvino/tree/master/src/core/template_extension/new>`__, which demonstrates the details of extension development. It is based on the minimalistic ``Identity`` operation that is a placeholder for your real custom operation. Review the complete, fully compilable code to see how it works.
Use the :ref:`ov::Core::add_extension <doxid-classov_1_1_core_1a68d0dea1cbcd42a67bea32780e32acea>` method to load the extensions to the :ref:`ov::Core <doxid-classov_1_1_core>` object. This method allows loading a library with extensions or extensions from the code.

Load Extensions to Core
+++++++++++++++++++++++

Extensions can be loaded from code with the :ref:`ov::Core::add_extension <doxid-classov_1_1_core_1a68d0dea1cbcd42a67bea32780e32acea>` method:

.. tab-set::

   .. tab-item:: C++
      :sync: cpp

      .. doxygensnippet:: docs/snippets/ov_extensions.cpp
         :language: cpp
         :fragment: [add_extension]

   .. tab-item:: Python
      :sync: py

      .. doxygensnippet:: docs/snippets/ov_extensions.py
         :language: python
         :fragment: [add_extension]
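As an illustration, here is a minimal C++ sketch of loading an extension from code, assuming the custom ``Identity`` operation class from the Template extension (the header path is illustrative):

.. code-block:: cpp

   #include <openvino/runtime/core.hpp>
   #include "identity.hpp"  // header with the custom operation class (illustrative path)

   int main() {
       ov::Core core;
       // Register the custom operation so that OpenVINO IR using it can be read.
       core.add_extension<TemplateExtension::Identity>();
       return 0;
   }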
The ``Identity`` is a custom operation class defined in the :doc:`Custom Operation Guide <openvino_docs_Extensibility_UG_add_openvino_ops>`. This is sufficient to enable reading OpenVINO IR which uses the ``Identity`` extension operation emitted by Model Optimizer. In order to load the original model directly to the runtime, add a mapping extension:

.. tab-set::

   .. tab-item:: C++
      :sync: cpp

      .. doxygensnippet:: docs/snippets/ov_extensions.cpp
         :language: cpp
         :fragment: [add_frontend_extension]

   .. tab-item:: Python
      :sync: py

      .. doxygensnippet:: docs/snippets/ov_extensions.py
         :language: python
         :fragment: [add_frontend_extension]
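A minimal C++ sketch of such a mapping extension, assuming the framework operation is also named ``Identity`` (the header path is illustrative):

.. code-block:: cpp

   #include <openvino/runtime/core.hpp>
   #include <openvino/frontend/extension.hpp>
   #include "identity.hpp"  // illustrative path

   int main() {
       ov::Core core;
       core.add_extension<TemplateExtension::Identity>();
       // Map the framework operation "Identity" to the custom OpenVINO operation.
       core.add_extension(ov::frontend::OpExtension<TemplateExtension::Identity>("Identity"));
       return 0;
   }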
When the Python API is used, there is no way to implement a custom OpenVINO operation. Even if a custom OpenVINO operation is implemented in C++ and loaded into the runtime by a shared library, there is still no way to add a frontend mapping extension that refers to this custom operation. In this case, use the C++ shared library approach to implement both the operation semantics and the framework mapping.

Python can still be used to map and decompose operations when only operations from the standard OpenVINO operation set are used.
.. _create_library_with_extensions:

Create a Library with Extensions
++++++++++++++++++++++++++++++++

An extension library should be created in the following cases:

* Conversion of a model with custom operations in Model Optimizer.
* Loading a model with custom operations in a Python application. This applies to both framework model and OpenVINO IR.
* Loading models with custom operations in tools that support loading extensions from a library, for example the ``benchmark_app``.
To create an extension library, for example, to load the extensions into Model Optimizer, perform the following:

1. Create an entry point for the extension library. OpenVINO provides the :ref:`OPENVINO_CREATE_EXTENSIONS() <doxid-core_2include_2openvino_2core_2extension_8hpp_1acdadcfa0eff763d8b4dadb8a9cb6f6e6>` macro, which allows you to define an entry point to a library with OpenVINO Extensions.
   This macro should have a vector of all OpenVINO Extensions as an argument.

   Based on that, the declaration of an extension class might look like the following:

   .. doxygensnippet:: ./src/core/template_extension/new/ov_extension.cpp
      :language: cpp
      :fragment: [ov_extension:entry_point]
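   For reference, such an entry point may look roughly as follows; this is a sketch modeled on the Template extension, and the included header path is an illustrative assumption:

   .. code-block:: cpp

      #include <openvino/core/extension.hpp>
      #include <openvino/core/op_extension.hpp>
      #include "identity.hpp"  // illustrative path

      // Exports the vector of extensions from the shared library.
      OPENVINO_CREATE_EXTENSIONS(
          std::vector<ov::Extension::Ptr>({
              std::make_shared<ov::OpExtension<TemplateExtension::Identity>>()
          }));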
2. Configure the build of your extension library, using the following CMake script:

   .. doxygensnippet:: ./src/core/template_extension/new/CMakeLists.txt
      :language: cmake
      :fragment: [cmake:extension]

   This CMake script finds OpenVINO, using the ``find_package`` CMake command.
3. Build the extension library, running the commands below:

   .. code-block:: sh

      $ cd src/core/template_extension/new
      $ mkdir build
      $ cd build
      $ cmake -DOpenVINO_DIR=<OpenVINO_DIR> ../
      $ cmake --build .
4. After the build, you may use the path to your extension library to load your extensions to OpenVINO Runtime:

   .. tab-set::

      .. tab-item:: C++
         :sync: cpp

         .. doxygensnippet:: docs/snippets/ov_extensions.cpp
            :language: cpp
            :fragment: [add_extension_lib]

      .. tab-item:: Python
         :sync: py

         .. doxygensnippet:: docs/snippets/ov_extensions.py
            :language: python
            :fragment: [add_extension_lib]
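   In C++, this amounts to a single call; a minimal sketch (the library name is illustrative):

   .. code-block:: cpp

      #include <openvino/runtime/core.hpp>

      int main() {
          ov::Core core;
          // Loads all extensions exported by the library via OPENVINO_CREATE_EXTENSIONS.
          core.add_extension("libtemplate_extension.so");  // illustrative name
          return 0;
      }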
See Also
########

* :doc:`OpenVINO Transformations <openvino_docs_transformations>`
* :doc:`Using OpenVINO Runtime Samples <openvino_docs_OV_UG_Samples_Overview>`
* :doc:`Hello Shape Infer SSD sample <openvino_inference_engine_samples_hello_reshape_ssd_README>`
@endsphinxdirective

View File

@@ -1,59 +1,81 @@
# Custom OpenVINO™ Operations {#openvino_docs_Extensibility_UG_add_openvino_ops}
@sphinxdirective

OpenVINO™ Extension API allows you to register custom operations to support models with operations which OpenVINO™ does not support out-of-the-box. This capability requires writing code in C++, so if you are using Python to develop your application, you need to build a separate shared library implemented in C++ first and load it in Python using the ``add_extension`` API. Please refer to :ref:`Create library with extensions <create_library_with_extensions>` for more details on library creation and usage. The remaining part of this document describes how to implement an operation class.

Operation Class
###############

To add your custom operation, create a new class that extends ``ov::Op``, which is in turn derived from :ref:`ov::Node <doxid-classov_1_1_node>`, the base class for all graph operations in OpenVINO™. To add ``ov::Op``, include the following file:

.. doxygensnippet:: ./src/core/template_extension/new/identity.hpp
   :language: cpp
   :fragment: [op:common_include]
Follow the steps below to add a custom operation:

1. Add the ``OPENVINO_OP`` macro which defines a ``NodeTypeInfo`` object that identifies the type of the operation to the graph users and helps with dynamic type resolution. The type info of an operation currently consists of a string operation identifier and a string for the operation version.

2. Implement default constructor and constructors that optionally take the operation inputs and attributes as parameters.

3. Override the shape inference method ``validate_and_infer_types``. This method is called multiple times during graph manipulations to determine the shapes and element types of the operation's outputs. To access the input shapes and input element types, use the ``get_input_partial_shape()`` and ``get_input_element_type()`` methods of :ref:`ov::Node <doxid-classov_1_1_node>`. Set the inferred shape and element type of the output using ``set_output_type``.

4. Override the ``clone_with_new_inputs`` method, which enables graph manipulation routines to create copies of this operation and connect it to different nodes during optimization.

5. Override the ``visit_attributes`` method, which enables serialization and deserialization of operation attributes. An ``AttributeVisitor`` is passed to the method, and the implementation is expected to walk over all the attributes in the op using the type-aware ``on_attribute`` helper. Helpers are already implemented for standard C++ types like ``int64_t``, ``float``, ``bool``, ``vector``, and for existing OpenVINO defined types.

6. Override ``evaluate``, which is an optional method that enables fallback of some devices to this implementation and the application of constant folding if there is a custom operation on the constant branch. If your operation has an ``evaluate`` method, you also need to override the ``has_evaluate`` method, which reports whether ``evaluate`` is available for the operation.
Based on that, the declaration of an operation class can look as follows:
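A sketch of such a declaration, modeled on the Template extension's ``Identity`` operation (the member signatures correspond to the steps above):

.. code-block:: cpp

   #include <openvino/op/op.hpp>

   namespace TemplateExtension {

   class Identity : public ov::op::Op {
   public:
       OPENVINO_OP("Identity");

       Identity() = default;
       Identity(const ov::Output<ov::Node>& arg);

       // Step 3: shape and element type inference.
       void validate_and_infer_types() override;
       // Step 4: copying during graph transformations.
       std::shared_ptr<ov::Node> clone_with_new_inputs(const ov::OutputVector& new_args) const override;
       // Step 5: attribute (de)serialization.
       bool visit_attributes(ov::AttributeVisitor& visitor) override;
       // Step 6: optional reference implementation.
       bool evaluate(ov::TensorVector& outputs, const ov::TensorVector& inputs) const override;
       bool has_evaluate() const override;
   };

   }  // namespace TemplateExtension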
Operation Constructors
++++++++++++++++++++++

OpenVINO™ operation contains two constructors:

* Default constructor, which enables you to create an operation without attributes
* Constructor that creates and validates an operation with specified inputs and attributes

.. doxygensnippet:: ./src/core/template_extension/new/identity.cpp
   :language: cpp
   :fragment: [op:ctor]
``validate_and_infer_types()``
++++++++++++++++++++++++++++++

The :ref:`ov::Node::validate_and_infer_types <doxid-classov_1_1_node_1ac5224b5be848ec670d2078d9816d12e7>` method validates operation attributes and calculates output shapes using attributes of the operation.

.. doxygensnippet:: ./src/core/template_extension/new/identity.cpp
   :language: cpp
   :fragment: [op:validate]
``clone_with_new_inputs()``
+++++++++++++++++++++++++++

The :ref:`ov::Node::clone_with_new_inputs <doxid-classov_1_1_node_1a04cb103fa069c3b7944ab7c44d94f5ff>` method creates a copy of the operation with new inputs.

.. doxygensnippet:: ./src/core/template_extension/new/identity.cpp
   :language: cpp
   :fragment: [op:copy]
``visit_attributes()``
++++++++++++++++++++++

The :ref:`ov::Node::visit_attributes <doxid-classov_1_1_node_1a9743b56d352970486d17dae2416d958e>` method enables you to visit all operation attributes.

.. doxygensnippet:: ./src/core/template_extension/new/identity.cpp
   :language: cpp
   :fragment: [op:visit_attributes]
``evaluate()`` and ``has_evaluate()``
+++++++++++++++++++++++++++++++++++++

The :ref:`ov::Node::evaluate <doxid-classov_1_1_node_1acfb82acc8349d7138aeaa05217c7014e>` method enables you to apply constant folding to an operation.

.. doxygensnippet:: ./src/core/template_extension/new/identity.cpp
   :language: cpp
   :fragment: [op:evaluate]
@endsphinxdirective