Swish specification refactoring (#5015)
* Review spec of Swish operation
* Change reference link to abstract
* Minor change in example section
* Fix minor wording issues
parent 80acd27096
commit f7863847ad
@ -2,38 +2,40 @@

**Versioned name**: *Swish-4*

**Category**: *Activation*
**Category**: *Activation function*

**Short description**: Swish takes one input tensor and produces output tensor where the Swish function is applied to the tensor elementwise.
**Short description**: *Swish* performs element-wise activation function on a given input tensor.

**Detailed description**: For each element from the input tensor calculates corresponding
element in the output tensor with the following formula:
**Detailed description**

*Swish* operation is introduced in this [article](https://arxiv.org/abs/1710.05941).
It performs element-wise activation function on a given input tensor, based on the following mathematical formula:

\f[
Swish(x) = x / (1.0 + e^{-(beta * x)})
Swish(x) = x\cdot \sigma(\beta x) = x \left(1 + e^{-(\beta x)}\right)^{-1}
\f]

The Swish operation is introduced in the [article](https://arxiv.org/pdf/1710.05941.pdf).
where β corresponds to `beta` scalar input.
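
The formula can be illustrated with a minimal pure-Python sketch. This is not the implementation used by any inference runtime; the function names `swish` and `swish_elementwise` are hypothetical, and the only facts taken from the specification are the formula itself and the default `beta` of 1.0. The sign branch is an added numerical-stability choice so that `exp()` never receives a large positive argument.

```python
import math


def swish(x: float, beta: float = 1.0) -> float:
    """Swish(x) = x * sigmoid(beta * x) = x / (1 + exp(-beta * x))."""
    z = beta * x
    if z >= 0:
        # exp(-z) is at most 1 here, so it cannot overflow.
        return x / (1.0 + math.exp(-z))
    # For negative z, use the equivalent form sigmoid(z) = e^z / (1 + e^z).
    ez = math.exp(z)
    return x * ez / (1.0 + ez)


def swish_elementwise(data, beta=1.0):
    """Apply swish to every element of a flat list; the length is preserved."""
    return [swish(v, beta) for v in data]


if __name__ == "__main__":
    values = [-2.0, 0.0, 2.0]
    print(swish_elementwise(values))        # beta omitted, default 1.0 is used
    print(swish_elementwise(values, 2.0))   # beta provided explicitly
```

With `beta` equal to 1.0 and an input of 1.0, the result reduces to the sigmoid of 1, and an input of 0.0 always maps to 0.0 regardless of `beta`.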

**Attributes**:
**Attributes**: *Swish* operation has no attributes.

**Inputs**:

* **1**: Multidimensional input tensor of type *T*. **Required**.
* **1**: `data`. A tensor of type `T` and arbitrary shape. **Required**.

* **2**: Scalar with non-negative value of type *T*. Multiplication parameter *beta* for the sigmoid. If the input is not connected then the default value 1.0 is used. **Optional**
* **2**: `beta`. A non-negative scalar value of type `T`. Multiplication parameter for the sigmoid. Default value 1.0 is used. **Optional**.

**Outputs**:

* **1**: The resulting tensor of the same shape and type as input tensor.
* **1**: The result of element-wise *Swish* function applied to the input tensor `data`. A tensor of type `T` and the same shape as `data` input tensor.

**Types**

* *T*: arbitrary supported floating point type.
* *T*: arbitrary supported floating-point type.

**Examples**

**Example**

*Example: Second input `beta` provided*
```xml
<layer ... type="Swish">
    <input>
@ -41,13 +43,30 @@ The Swish operation is introduced in the [article](https://arxiv.org/pdf/1710.05
            <dim>256</dim>
            <dim>56</dim>
        </port>
        <port id="1"/>
        <port id="1"> <!-- beta value: 2.0 -->
        </port>
    </input>
    <output>
        <port id="1">
        <port id="2">
            <dim>256</dim>
            <dim>56</dim>
        </port>
    </output>
</layer>
```

*Example: Second input `beta` not provided*
```xml
<layer ... type="Swish">
    <input>
        <port id="0">
            <dim>128</dim>
        </port>
    </input>
    <output>
        <port id="1">
            <dim>128</dim>
        </port>
    </output>
</layer>
```