From f7863847adb70c22f66289204801e320da3fa5bb Mon Sep 17 00:00:00 2001
From: Gabriele Galiero Casay
Date: Thu, 1 Apr 2021 15:07:20 +0200
Subject: [PATCH] Swish specification refactoring (#5015)

* Review spec of Swish operation
* Change reference link to abstract
* Minor change in example section
* Fix minor wording issues
---
 docs/ops/activation/Swish_4.md | 51 +++++++++++++++++++++++-----------
 1 file changed, 35 insertions(+), 16 deletions(-)

diff --git a/docs/ops/activation/Swish_4.md b/docs/ops/activation/Swish_4.md
index 78bcb3866e7..1a8b7d1b51a 100644
--- a/docs/ops/activation/Swish_4.md
+++ b/docs/ops/activation/Swish_4.md
@@ -2,38 +2,40 @@
 
 **Versioned name**: *Swish-4*
 
-**Category**: *Activation*
+**Category**: *Activation function*
 
-**Short description**: Swish takes one input tensor and produces output tensor where the Swish function is applied to the tensor elementwise.
+**Short description**: *Swish* performs element-wise activation function on a given input tensor.
 
-**Detailed description**: For each element from the input tensor calculates corresponding
-element in the output tensor with the following formula:
+**Detailed description**
+
+*Swish* operation is introduced in this [article](https://arxiv.org/abs/1710.05941).
+It performs element-wise activation function on a given input tensor, based on the following mathematical formula:
 
 \f[
-Swish(x) = x / (1.0 + e^{-(beta * x)})
+Swish(x) = x\cdot \sigma(\beta x) = x \left(1 + e^{-(\beta x)}\right)^{-1}
 \f]
 
-The Swish operation is introduced in the [article](https://arxiv.org/pdf/1710.05941.pdf).
+where β corresponds to `beta` scalar input.
 
-**Attributes**:
+**Attributes**: *Swish* operation has no attributes.
 
 **Inputs**:
 
-* **1**: Multidimensional input tensor of type *T*. **Required**.
+* **1**: `data`. A tensor of type `T` and arbitrary shape. **Required**.
 
-* **2**: Scalar with non-negative value of type *T*. Multiplication parameter *beta* for the sigmoid. If the input is not connected then the default value 1.0 is used. **Optional**
+* **2**: `beta`. A non-negative scalar value of type `T`. Multiplication parameter for the sigmoid. Default value 1.0 is used. **Optional**.
 
 **Outputs**:
 
-* **1**: The resulting tensor of the same shape and type as input tensor.
+* **1**: The result of element-wise *Swish* function applied to the input tensor `data`. A tensor of type `T` and the same shape as `data` input tensor.
 
 **Types**
 
-* *T*: arbitrary supported floating point type.
+* *T*: arbitrary supported floating-point type.
 
+**Examples**
 
-**Example**
-
+*Example: Second input `beta` provided*
 ```xml
@@ -41,13 +43,30 @@ The Swish operation is introduced in the [article](https://arxiv.org/pdf/1710.05
 256
 56
-
+
+
-
+
 256
 56
-```
\ No newline at end of file
+```
+
+*Example: Second input `beta` not provided*
+```xml
+
+
+
+ 128
+
+
+
+
+ 128
+
+
+
+```
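
The formula in the revised spec, Swish(x) = x · σ(βx), can be illustrated with a short plain-Python sketch. This is not part of the patch and not an OpenVINO API: the names `sigmoid` and `swish` are hypothetical, the tensor is modelled as a flat list of floats, and `beta` defaults to 1.0 on the assumption that this mirrors the behaviour the spec describes when the second input is not connected.

```python
import math


def sigmoid(z):
    """Logistic function, written so that large |z| does not overflow math.exp."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def swish(data, beta=1.0):
    """Apply Swish(x) = x * sigmoid(beta * x) to each element of a flat list.

    `data` stands in for the first input tensor and `beta` for the optional
    scalar second input; both names are illustrative only.
    """
    return [x * sigmoid(beta * x) for x in data]


if __name__ == "__main__":
    values = [-2.0, 0.0, 1.0, 3.0]
    print(swish(values))            # beta omitted, so 1.0 is used
    print(swish(values, beta=0.5))  # beta supplied explicitly
```

The output list has the same length as the input, matching the spec's statement that the result has the same shape as `data`. With `beta` set to 0 the sigmoid term is constant at 0.5, so the function reduces to x / 2.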