PReLU specification refactoring (#5016)

* Review spec of PReLU operation

* Address review comments

   * Correct second input description
   * Add note to clarify input channel dimension
   * Add additional equivalent formula for op
   * Change reference link to abstract
   * Add additional examples

* Address review comments related to wording

* Fix IR layer examples
Gabriele Galiero Casay 2021-04-01 14:32:39 +02:00, committed by GitHub
parent 4021cb7519
commit 80acd27096

**Versioned name**: *PReLU-1*
**Category**: *Activation function*
**Short description**: Parametric rectified linear unit element-wise activation function.
**Detailed description**
*PReLU* operation is introduced in this [article](https://arxiv.org/abs/1502.01852v1).
*PReLU* performs element-wise parametric *ReLU* operation on a given input tensor, based on the following mathematical formula:
\f[
PReLU(x) = \left\{\begin{array}{r}
x \quad \mbox{if } x \geq 0 \\
\alpha x \quad \mbox{if } x < 0
\end{array}\right.
\f]
where α is a learnable per-channel parameter corresponding to the negative slope, defined by the second input `slope`.
Another mathematical representation that may be found in other references:
\f[
PReLU(x) = \max(0, x) + \alpha\cdot\min(0, x)
\f]
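
As a sanity check of the two formulas above, here is a minimal NumPy sketch (illustrative only, not part of the specification; the function names are made up) showing that both formulations produce the same result:

```python
import numpy as np

def prelu_piecewise(x, alpha):
    # PReLU(x) = x        if x >= 0
    #            alpha*x  if x <  0
    return np.where(x >= 0, x, alpha * x)

def prelu_minmax(x, alpha):
    # Equivalent form: PReLU(x) = max(0, x) + alpha * min(0, x)
    return np.maximum(0, x) + alpha * np.minimum(0, x)

x = np.array([-2.0, -0.5, 0.0, 1.5])
alpha = 0.25
assert np.allclose(prelu_piecewise(x, alpha), prelu_minmax(x, alpha))
print(prelu_piecewise(x, alpha))  # [-0.5   -0.125  0.     1.5  ]
```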
**Attributes**: *PReLU* operation has no attributes.
**Inputs**
* **1**: `data`. A tensor of type `T` and arbitrary shape. **Required**.
* **2**: `slope`. A 1D tensor of type `T` containing the negative slope values, one value per channel of the `data` input tensor. **Required**.
* **Note**: The channel dimension corresponds to the second dimension of the `data` input tensor. If the rank of `data` is less than 2, the number of channels is 1. The sketch after this list illustrates the alignment.
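
The channel alignment described in the note can be sketched as follows, assuming NumPy-style broadcasting; `align_slope` is a hypothetical helper for illustration, not an API of the operation:

```python
import numpy as np

def align_slope(slope, data_rank):
    # Hypothetical helper: place the per-channel slope values on the
    # channel axis (the second dimension) so NumPy broadcasting applies
    # one slope value per channel of `data`.
    if data_rank < 2:
        return slope  # rank < 2: single channel, slope applies to all elements
    shape = [1] * data_rank
    shape[1] = slope.size
    return slope.reshape(shape)

data = np.random.randn(2, 3, 4, 4)       # e.g. N, C, H, W with 3 channels
slope = np.array([0.1, 0.2, 0.3])        # one negative-slope value per channel
aligned = align_slope(slope, data.ndim)  # shape (1, 3, 1, 1)
out = np.where(data >= 0, data, aligned * data)
assert out.shape == data.shape
```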
**Outputs**
* **1**: The result of element-wise *PReLU* operation applied to `data` input tensor with negative slope values from `slope` input tensor. A tensor of type `T` and the same shape as `data` input tensor.
**Types**
* *T*: arbitrary supported floating-point type.
**Examples**
*Example: 1D input tensor `data`*
```xml
<layer ... type="PReLU">
    <input>
        <port id="0">
            <dim>128</dim>
        </port>
        <port id="1">
            <dim>1</dim>
        </port>
    </input>
    <output>
        <port id="2">
            <dim>128</dim>
        </port>
    </output>
</layer>
```
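
For this 1D case the rank of `data` is less than 2, so there is a single channel and the single `slope` value applies to every element; a quick illustrative check:

```python
import numpy as np

data = np.random.randn(128).astype(np.float32)
slope = np.array([0.1], dtype=np.float32)  # one channel, one slope value

out = np.where(data >= 0, data, slope * data)  # broadcasts over all 128 elements
assert out.shape == (128,)
```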
*Example: 2D input tensor `data`*
```xml
<layer ... type="PReLU">
    <input>
        <port id="0">
            <dim>20</dim>
            <dim>128</dim>
        </port>
        <port id="1">
            <dim>128</dim>
        </port>
    </input>
    <output>
        <port id="2">
            <dim>20</dim>
            <dim>128</dim>
        </port>
    </output>
</layer>
```
*Example: 4D input tensor `data`*
```xml
<layer ... type="PReLU">
    <input>
        <port id="0">
            <dim>1</dim>
            <dim>20</dim>
            <dim>128</dim>
            <dim>128</dim>
        </port>
        <port id="1">
            <dim>20</dim>
        </port>
    </input>
    <output>
        <port id="2">
            <dim>1</dim>
            <dim>20</dim>
            <dim>128</dim>
            <dim>128</dim>
        </port>
    </output>
</layer>
```
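
And a brief NumPy check of this 4D example (a sketch under the same assumptions as above, not normative): `slope` of shape `[20]` is applied along the channel dimension of `data` of shape `[1, 20, 128, 128]`:

```python
import numpy as np

data = np.random.randn(1, 20, 128, 128).astype(np.float32)
slope = np.random.rand(20).astype(np.float32)

# Reshape slope to (1, 20, 1, 1) so one value applies per channel.
out = np.where(data >= 0, data, slope.reshape(1, 20, 1, 1) * data)
assert out.shape == data.shape  # output shape matches `data`, as specified
```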