CTCGreedyDecoderSeqLen
@sphinxdirective
Versioned name: CTCGreedyDecoderSeqLen-6
Category: Sequence processing
Short description: CTCGreedyDecoderSeqLen performs greedy decoding of the logits provided as the first input. The sequence lengths are provided as the second input.
Detailed description:
This operation is similar to the `TensorFlow CTCGreedyDecoder <https://www.tensorflow.org/api_docs/python/tf/nn/ctc_greedy_decoder>`__.
The operation CTCGreedyDecoderSeqLen implements best path decoding. Decoding is done in two steps:
- Concatenate the most probable classes per time step, which yields the best path.
- Remove duplicate consecutive elements if the attribute ``merge_repeated`` is true, and then remove all blank elements.
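
The two steps can be sketched for a single sequence as follows. This is a minimal illustration in plain Python, not the OpenVINO implementation: the names ``greedy_decode`` and ``one_hot`` are hypothetical, and ties between equal scores are assumed to resolve to the lowest class index.

```python
def greedy_decode(logits, blank_index, merge_repeated=True):
    """Best path decoding of one sequence.

    logits: list of T rows, each a list of C scores.
    """
    # Step 1: the best path is the most probable class at every time step.
    best_path = [max(range(len(row)), key=row.__getitem__) for row in logits]

    decoded = []
    previous = None
    for cls in best_path:
        # Step 2a: optionally skip a class that repeats the previous time step.
        is_repeat = merge_repeated and cls == previous
        # Step 2b: blank elements are never emitted.
        if cls != blank_index and not is_repeat:
            decoded.append(cls)
        previous = cls
    return decoded


def one_hot(cls, num_classes):
    """Hypothetical helper: scores that make `cls` the most probable class."""
    return [1.0 if i == cls else 0.0 for i in range(num_classes)]


# Best path A B B * B * B with A=0, B=1 and the blank class '*'=2.
A, B, BLANK = 0, 1, 2
logits = [one_hot(c, 3) for c in [A, B, B, BLANK, B, BLANK, B]]
merged = greedy_decode(logits, BLANK, merge_repeated=True)
unmerged = greedy_decode(logits, BLANK, merge_repeated=False)
```

With merging enabled the result corresponds to ``ABBB``; with merging disabled only the blanks are dropped, giving ``ABBBB``.
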
Sequences in the batch can have different lengths. The lengths are provided in the second input, the integer tensor ``sequence_length``.
The main difference between :doc:`CTCGreedyDecoder <openvino_docs_ops_sequence_CTCGreedyDecoder_1>` and CTCGreedyDecoderSeqLen is in the second input. CTCGreedyDecoder uses a 2D floating-point input tensor with sequence masks for each sequence in the batch, while CTCGreedyDecoderSeqLen uses a 1D integer tensor with sequence lengths.
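
To illustrate how the two kinds of second input relate, the sketch below converts a sequence mask into sequence lengths. It assumes a mask laid out as ``[T, N]`` in which each sequence is marked by leading ones followed by zeros; that layout is an assumption made for this example and is not stated in this specification. The helper name ``mask_to_lengths`` is hypothetical.

```python
def mask_to_lengths(sequence_mask):
    """Convert a [T, N] mask of ones and zeros into N sequence lengths.

    Assumed layout: column n holds ones for the valid time steps of
    sequence n, followed by zeros.
    """
    num_steps = len(sequence_mask)
    batch_size = len(sequence_mask[0])
    lengths = []
    for n in range(batch_size):
        length = 0
        # Count the leading ones; the first zero ends the sequence.
        while length < num_steps and sequence_mask[length][n] != 0:
            length += 1
        lengths.append(length)
    return lengths


# T = 4 time steps, N = 2 sequences.
mask = [
    [1.0, 1.0],
    [1.0, 1.0],
    [1.0, 0.0],
    [0.0, 0.0],
]
lengths_from_mask = mask_to_lengths(mask)
```
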
Attributes
- merge_repeated

  - Description: ``merge_repeated`` is a flag for merging repeated labels during the CTC calculation. If the value is false, the sequence ``ABB*B*B`` (where ``*`` is the blank class) is decoded as ``ABBBB``. If the value is true, the sequence is decoded as ``ABBB``.
  - Range of values: true or false
  - Type: boolean
  - Default value: true
  - Required: no

- classes_index_type

  - Description: the type of the output tensor with class indices
  - Range of values: "i64" or "i32"
  - Type: string
  - Default value: "i32"
  - Required: no

- sequence_length_type

  - Description: the type of the output tensor with sequence lengths
  - Range of values: "i64" or "i32"
  - Type: string
  - Default value: "i32"
  - Required: no
Inputs
- 1: ``data`` - input tensor of type T_F of shape ``[N, T, C]`` with a batch of sequences, where ``T`` is the maximum sequence length, ``N`` is the batch size and ``C`` is the number of classes. Required.
- 2: ``sequence_length`` - input tensor of type T_I of shape ``[N]`` with sequence lengths. Each sequence length must be less than or equal to ``T``. Required.
- 3: ``blank_index`` - scalar or 1D tensor with 1 element of type T_I. Specifies the class index to use for the blank class. Regardless of the value of the ``merge_repeated`` attribute, if the output index for a given batch and time step corresponds to ``blank_index``, no new element is emitted. Default value is ``C-1``. Optional.
Output
- 1: Output tensor of type T_IND1 and shape ``[N, T]`` containing the decoded classes. All elements that do not code sequence classes are filled with -1.
- 2: Output tensor of type T_IND2 and shape ``[N]`` containing the length of the decoded class sequence for each sequence in the batch.
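
The relationship between the inputs and the two outputs can be sketched as below. This is a reference-style illustration using nested Python lists, not the OpenVINO kernel: the function name ``ctc_greedy_decoder_seq_len`` and the helper ``one_hot`` are hypothetical, and ties between equal scores are assumed to resolve to the lowest class index.

```python
def ctc_greedy_decoder_seq_len(data, sequence_length, blank_index=None,
                               merge_repeated=True):
    """data: nested lists of shape [N, T, C]; sequence_length: list of N ints.

    Returns (classes, lengths): classes has shape [N, T] padded with -1,
    lengths has shape [N].
    """
    num_steps = len(data[0])
    num_classes = len(data[0][0])
    if blank_index is None:
        blank_index = num_classes - 1  # default stated above: C-1

    classes, lengths = [], []
    for seq, seq_len in zip(data, sequence_length):
        row = [-1] * num_steps  # positions without a decoded class stay -1
        count = 0
        previous = None
        for t in range(seq_len):  # time steps past seq_len are ignored
            scores = seq[t]
            cls = max(range(num_classes), key=scores.__getitem__)
            is_repeat = merge_repeated and cls == previous
            if cls != blank_index and not is_repeat:
                row[count] = cls
                count += 1
            previous = cls
        classes.append(row)
        lengths.append(count)
    return classes, lengths


def one_hot(cls, num_classes):
    """Hypothetical helper: scores that make `cls` the most probable class."""
    return [1.0 if i == cls else 0.0 for i in range(num_classes)]


# N = 2 sequences, T = 4 time steps, C = 3 classes (blank defaults to 2).
data = [
    [one_hot(c, 3) for c in [0, 0, 2, 1]],
    [one_hot(c, 3) for c in [1, 2, 1, 0]],
]
classes, lengths = ctc_greedy_decoder_seq_len(data, sequence_length=[4, 2])
```

In this example the second sequence has length 2, so its last two time steps do not contribute to the output.
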
Types
- T_F: any supported floating-point type.
- T_I: ``int32`` or ``int64``.
- T_IND1: ``int32`` or ``int64``, depending on the ``classes_index_type`` attribute.
- T_IND2: ``int32`` or ``int64``, depending on the ``sequence_length_type`` attribute.
Example
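.. code-block:: xml

   <layer ... type="CTCGreedyDecoderSeqLen" version="opset6">
       <input>
           <port id="0">
               <dim>8</dim>
               <dim>20</dim>
               <dim>128</dim>
           </port>
           <port id="1">
               <dim>8</dim>
           </port>
           <port id="2"/>  <!-- blank_index = 120 -->
       </input>
       <output>
           <port id="0">
               <dim>8</dim>
               <dim>20</dim>
           </port>
           <port id="1">
               <dim>8</dim>
           </port>
       </output>
   </layer>
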
@endsphinxdirective