LSTMCell
Versioned name: LSTMCell-1
Category: Sequence processing
Short description: LSTMCell operation represents a single LSTM cell. It computes the output using the formula described in the original paper Long Short-Term Memory.
Detailed description: LSTMCell computes the hidden state Ht and the cell state Ct for the current time step using the following formulas:
Formula:
* - matrix multiplication
(.) - Hadamard product (element-wise)
[,] - concatenation
f, g, h - activation functions
it = f(Xt*(Wi^T) + Ht-1*(Ri^T) + Wbi + Rbi)
ft = f(Xt*(Wf^T) + Ht-1*(Rf^T) + Wbf + Rbf)
ct = g(Xt*(Wc^T) + Ht-1*(Rc^T) + Wbc + Rbc)
Ct = ft (.) Ct-1 + it (.) ct
ot = f(Xt*(Wo^T) + Ht-1*(Ro^T) + Wbo + Rbo)
Ht = ot (.) h(Ct)
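The formulas above can be sketched in NumPy as follows. This is a minimal illustration, not the reference implementation: the names sigmoid and lstm_cell_step and all argument names are hypothetical, the defaults assume sigmoid for f and tanh for g and h, W, R and B are assumed to hold the four gates stacked in f, i, c, o order with B already being the sum of the weight and recurrence biases, and clipping is omitted.

```python
import numpy as np


def sigmoid(x):
    """Logistic function, used here as the default activation f."""
    return 1.0 / (1.0 + np.exp(-x))


def lstm_cell_step(X, H_prev, C_prev, W, R, B, f=sigmoid, g=np.tanh, h=np.tanh):
    """One LSTMCell step (illustrative sketch, no clipping).

    X:      [batch_size, input_size]
    H_prev: [batch_size, hidden_size]
    C_prev: [batch_size, hidden_size]
    W:      [4 * hidden_size, input_size], gate blocks stacked as f, i, c, o
    R:      [4 * hidden_size, hidden_size], same gate order
    B:      [4 * hidden_size], assumed to be Wb + Rb
    """
    # Pre-activations of all four gates at once: [batch_size, 4 * hidden_size].
    gates = X @ W.T + H_prev @ R.T + B
    # Split into per-gate blocks following the assumed f, i, c, o order.
    f_pre, i_pre, c_pre, o_pre = np.split(gates, 4, axis=1)

    i_t = f(i_pre)                   # input gate
    f_t = f(f_pre)                   # forget gate
    c_t = g(c_pre)                   # candidate cell state
    C_t = f_t * C_prev + i_t * c_t   # new cell state (Hadamard products)
    o_t = f(o_pre)                   # output gate
    H_t = o_t * h(C_t)               # new hidden state
    return H_t, C_t
```

With all-zero W, R and B, every gate evaluates to f(0) = 0.5 and the candidate to g(0) = 0, so the new cell state is half of the previous one.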
Attributes
- hidden_size
  - Description: hidden_size specifies hidden state size.
  - Range of values: a positive integer
  - Type: int
  - Required: yes
- activations
  - Description: activations specifies the activation functions used by the gates. The formulas use three activation functions (f, g, h), so three values should be specified for this attribute.
  - Range of values: any combination of relu, sigmoid, tanh
  - Type: a list of strings
  - Default value: sigmoid for f, tanh for g, tanh for h
  - Required: no
- activations_alpha, activations_beta
  - Description: activations_alpha and activations_beta are parameters of the activation functions; the applicability and meaning of these attributes depend on the chosen activation functions.
  - Range of values: a list of floating-point numbers
  - Type: float[]
  - Default value: None
  - Required: no
- clip
  - Description: clip specifies the bound values [-C, C] for tensor clipping. Clipping is performed before the activations are applied.
  - Range of values: a positive floating-point number
  - Type: float
  - Default value: infinity, which means that clipping is not applied
  - Required: no
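As an illustration of the clip attribute, the sketch below clamps a gate pre-activation to [-C, C] and then applies the activation. The helper name clip_then_activate is hypothetical, and the sketch assumes that clipping acts on the pre-activation value that is passed to each activation function.

```python
import numpy as np


def clip_then_activate(pre_activation, activation, clip=np.inf):
    """Clamp values to [-clip, clip], then apply the activation.

    With the default clip of infinity the input passes through unchanged,
    which matches "clipping is not applied".
    """
    clipped = np.clip(pre_activation, -clip, clip)
    return activation(clipped)
```

For example, with clip set to 1.0 and tanh as the activation, the inputs -5 and 5 produce tanh(-1) and tanh(1).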
Inputs
- 1: X - 2D tensor of type T and shape [batch_size, input_size], input data. Required.
- 2: initial_hidden_state - 2D tensor of type T and shape [batch_size, hidden_size]. Required.
- 3: initial_cell_state - 2D tensor of type T and shape [batch_size, hidden_size]. Required.
- 4: W - 2D tensor of type T and shape [4 * hidden_size, input_size], the weights for matrix multiplication, gate order: fico. Required.
- 5: R - 2D tensor of type T and shape [4 * hidden_size, hidden_size], the recurrence weights for matrix multiplication, gate order: fico. Required.
- 6: B - 1D tensor of type T and shape [4 * hidden_size], the sum of biases (weights and recurrence weights). If not specified, it is assumed to be 0. Optional.
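The fico gate order of W, R and B can be illustrated by slicing the stacked tensor into four blocks of hidden_size rows each. The sketch below is only an illustration: the names GATE_ORDER and split_gates are hypothetical, and it assumes the blocks are laid out contiguously along the first axis in the order f, i, c, o.

```python
import numpy as np

# Assumed layout along axis 0: forget, input, candidate (cell), output.
GATE_ORDER = "fico"


def split_gates(tensor, hidden_size):
    """Return a dict mapping each gate letter to its block of `tensor`.

    Works for W, R (2D) and B (1D), since only the first axis is sliced.
    """
    if tensor.shape[0] != 4 * hidden_size:
        raise ValueError(
            f"first dimension must be 4 * hidden_size = {4 * hidden_size}, "
            f"got {tensor.shape[0]}"
        )
    return {
        gate: tensor[k * hidden_size:(k + 1) * hidden_size]
        for k, gate in enumerate(GATE_ORDER)
    }
```

Under this assumption, Wi in the formulas corresponds to split_gates(W, hidden_size)["i"], and likewise for the other gates.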
Outputs
- 1: Ho - 2D tensor of type T and shape [batch_size, hidden_size], the last output value of the hidden state.
- 2: Co - 2D tensor of type T and shape [batch_size, hidden_size], the last output value of the cell state.
Types
- T: any supported floating-point type.
Example
<layer ... type="LSTMCell" ...>
    <data hidden_size="128"/>
    <input>
        <port id="0">
            <dim>1</dim>
            <dim>16</dim>
        </port>
        <port id="1">
            <dim>1</dim>
            <dim>128</dim>
        </port>
        <port id="2">
            <dim>1</dim>
            <dim>128</dim>
        </port>
        <port id="3">
            <dim>512</dim>
            <dim>16</dim>
        </port>
        <port id="4">
            <dim>512</dim>
            <dim>128</dim>
        </port>
        <port id="5">
            <dim>512</dim>
        </port>
    </input>
    <output>
        <port id="6">
            <dim>1</dim>
            <dim>128</dim>
        </port>
        <port id="7">
            <dim>1</dim>
            <dim>128</dim>
        </port>
    </output>
</layer>
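The port dimensions in the example follow from hidden_size, batch_size and input_size as listed under Inputs and Outputs. The following sketch checks them in plain Python; the function name infer_lstm_cell_output_shapes is hypothetical and is not part of any API, and the checks simply restate the shapes given in this specification.

```python
def infer_lstm_cell_output_shapes(hidden_size, x, h0, c0, w, r, b=None):
    """Validate LSTMCell input shapes (tuples) and return (Ho, Co) shapes."""
    if hidden_size <= 0:
        raise ValueError("hidden_size must be a positive integer")
    if len(x) != 2:
        raise ValueError("X must be a 2D tensor")
    batch_size, input_size = x

    # Expected shapes, taken from the Inputs section.
    expected = {
        "initial_hidden_state": (h0, (batch_size, hidden_size)),
        "initial_cell_state": (c0, (batch_size, hidden_size)),
        "W": (w, (4 * hidden_size, input_size)),
        "R": (r, (4 * hidden_size, hidden_size)),
    }
    if b is not None:  # B is optional
        expected["B"] = (b, (4 * hidden_size,))

    for name, (actual, wanted) in expected.items():
        if tuple(actual) != wanted:
            raise ValueError(f"{name}: expected shape {wanted}, got {tuple(actual)}")

    # Both outputs, Ho and Co, have shape [batch_size, hidden_size].
    return (batch_size, hidden_size), (batch_size, hidden_size)


# Shapes of input ports 0-5 from the example, with hidden_size = 128.
ho_shape, co_shape = infer_lstm_cell_output_shapes(
    128, (1, 16), (1, 128), (1, 128), (512, 16), (512, 128), (512,)
)
```

Here 512 is 4 * hidden_size, and both resulting shapes match output ports 6 and 7 of the example.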