# CTCLoss {#openvino_docs_ops_sequence_CTCLoss_4}

**Versioned name**: *CTCLoss-4*

**Category**: *Sequence processing*

**Short description**: *CTCLoss* computes the CTC (Connectionist Temporal Classification) Loss.

**Detailed description**:

The *CTCLoss* operation is presented in [Connectionist Temporal Classification - Labeling Unsegmented Sequence Data with Recurrent Neural Networks: Graves et al., 2006](http://www.cs.toronto.edu/~graves/icml_2006.pdf).

*CTCLoss* estimates the likelihood that a target `labels[i,:]` can occur (or is real) for a given input sequence of logits `logits[i,:,:]`.
Briefly, the *CTCLoss* operation finds all sequences aligned with a target `labels[i,:]`, computes the probabilities of the aligned sequences using `logits[i,:,:]`,
and returns the negative log of the sum of these probabilities.

Input sequences of logits `logits` can have different lengths. The length of each sequence `logits[i,:,:]` equals `logit_length[i]`.
The length of the target sequence `labels[i,:]` equals `label_length[i]`. The length of the target sequence must not be greater than the length of the corresponding input sequence `logits[i,:,:]`.
Otherwise, the operation behavior is undefined.

*CTCLoss* calculation scheme:

1. Compute the probability of the `j`-th character at time step `t` for the `i`-th input sequence from `logits` using the softmax formula:
\f[
p_{i,t,j} = \frac{\exp(logits[i,t,j])}{\sum^{C-1}_{k=0}{\exp(logits[i,t,k])}}
\f]
where `C` is the number of classes including the blank.
2. For a given `i`-th target from `labels[i,:]`, find all aligned paths.
A path `S = (c1,c2,...,cT)` is aligned with a target `G=(g1,g2,...,gT)` if both chains are equal after decoding.
Decoding the target `G` extracts its first `label_length[i]` elements, merges repeated characters if *preprocess_collapse_repeated* is true, and
keeps only unique elements in the order of their occurrence if *unique* is true.
Decoding the path `S` merges repeated characters if *ctc_merge_repeated* is true and removes blank characters represented by `blank_index`.
By default, `blank_index` is equal to `C-1`, where `C` is the number of classes including the blank.
For example, with default *ctc_merge_repeated*, *preprocess_collapse_repeated*, *unique* and `blank_index`, a target sequence `G=(0,3,2,2,2,2,2,4,3)` of length `label_length[i]=4` is processed
to `(0,3,2,2)` and a path `S=(0,0,4,3,2,2,4,2,4)` of length `logit_length[i]=9` is also processed to `(0,3,2,2)`, where `C=5`.
There exist other paths that are also aligned with `G`, for instance, `0,4,3,3,2,4,2,2,2`. Paths checked for alignment with a target `labels[i,:]` must be of length `logit_length[i] = L_i`.
Compute the probabilities of these aligned paths (alignments) as follows:
\f[
p(S) = \prod_{t=1}^{L_i} p_{i,t,c_t}
\f]
3. Finally, compute negative log of summed up probabilities of all found alignments:
\f[
CTCLoss = - \ln \sum_{S} p(S)
\f]
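
The three steps above can be sketched as a brute-force reference in plain Python. This is only an illustration under stated assumptions, not how the operation is implemented: the helper names (`softmax`, `decode_path`, `brute_force_ctc_loss`) are hypothetical, the logits are assumed to be already cut to `logit_length[i]` rows, and the target is assumed to be already decoded. Enumerating all `C^T` paths is only feasible for tiny inputs.

```python
import itertools
import math


def softmax(row):
    """Step 1: softmax over the C logits of one time step."""
    m = max(row)  # subtracting the maximum keeps exp() from overflowing
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def decode_path(path, blank_index, ctc_merge_repeated=True):
    """Decode a path: optionally merge repeated characters, then drop blanks."""
    decoded = []
    prev = None
    for c in path:
        if ctc_merge_repeated and c == prev:
            continue  # repeated character, merged into the previous one
        prev = c
        if c != blank_index:
            decoded.append(c)
    return tuple(decoded)


def brute_force_ctc_loss(logits, target, blank_index, ctc_merge_repeated=True):
    """Steps 2 and 3: sum p(S) over all aligned paths and return -ln of the sum.

    logits: list of L_i rows, each with C values (one sequence of the batch).
    target: already decoded target, given as a sequence of class indices.
    """
    probs = [softmax(row) for row in logits]
    num_steps, num_classes = len(probs), len(probs[0])
    total = 0.0
    for path in itertools.product(range(num_classes), repeat=num_steps):
        if decode_path(path, blank_index, ctc_merge_repeated) == tuple(target):
            total += math.prod(probs[t][c] for t, c in enumerate(path))
    return -math.log(total)


# Two time steps, two classes (class 1 is the blank), uniform logits.
# Paths (0,0), (0,1) and (1,0) decode to (0,), each with probability 0.25.
loss = brute_force_ctc_loss([[0.0, 0.0], [0.0, 0.0]], target=(0,), blank_index=1)
```

In this sketch `math.log` raises `ValueError` when no path is aligned with the target, which corresponds to the case the specification leaves undefined.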

**Note 1**: This calculation scheme does not describe an optimal implementation; it serves primarily as an explanation.

**Note 2**: It is recommended to compute the log-probability \f$ \ln p(S)\f$ of an aligned path as a sum of log-softmax values of the input logits. This helps to avoid underflow and overflow during calculation.
Given the log-probabilities of aligned paths, the log of the summed probabilities of these paths can be computed as follows:
\f[
\ln(a + b) = \ln(a) + \ln(1 + \exp(\ln(b) - \ln(a)))
\f]
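
The identity from Note 2 can be sketched in a few lines of Python. The function name `log_add` is hypothetical, and taking the larger argument as `a` is a design choice of this sketch (the formula itself holds for either order) that keeps the exponent non-positive.

```python
import math


def log_add(log_a, log_b):
    """Return ln(a + b) given ln(a) and ln(b), staying in log space."""
    # ln(0) is represented as -inf; adding zero leaves the other term unchanged.
    if log_a == -math.inf:
        return log_b
    if log_b == -math.inf:
        return log_a
    # Use the larger value as ln(a) so that exp() never overflows.
    hi, lo = max(log_a, log_b), min(log_a, log_b)
    return hi + math.log1p(math.exp(lo - hi))


# Probabilities around exp(-1000) underflow to 0.0 in double precision,
# yet their log-sum is still computed accurately.
tiny = log_add(-1000.0, -1000.0)  # equals -1000 + ln(2)
```

Accumulating the alignments of one target then amounts to folding `log_add` over their log-probabilities, starting from `-math.inf`.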

**Attributes**

* *preprocess_collapse_repeated*

  * **Description**: *preprocess_collapse_repeated* is a flag for a preprocessing step before loss calculation, wherein repeated labels in `labels[i,:]` passed to the loss are merged into single labels.
  * **Range of values**: true or false
  * **Type**: `boolean`
  * **Default value**: false
  * **Required**: *no*

* *ctc_merge_repeated*

  * **Description**: *ctc_merge_repeated* is a flag for merging repeated characters in a potential alignment during the CTC loss calculation.
  * **Range of values**: true or false
  * **Type**: `boolean`
  * **Default value**: true
  * **Required**: *no*

* *unique*

  * **Description**: *unique* is a flag to find unique elements for a target `labels[i,:]` before matching with potential alignments. Unique elements in the processed `labels[i,:]` are kept in the order of their occurrence in the original `labels[i,:]`. For example, the processed sequence for `labels[i,:]=(0,1,1,0,1,3,3,2,2,3)` of length `label_length[i]=10` is `(0,1,3,2)` when *unique* is true.
  * **Range of values**: true or false
  * **Type**: `boolean`
  * **Default value**: false
  * **Required**: *no*
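
The effect of *preprocess_collapse_repeated* and *unique* on a target can be sketched as follows. The function `preprocess_target` is a hypothetical helper, and applying the collapse step before the unique step is an assumption taken from the order in which the detailed description lists them.

```python
def preprocess_target(labels_row, label_length,
                      preprocess_collapse_repeated=False, unique=False):
    """Return the decoded target for one row of `labels` as a tuple."""
    # Only the first label_length elements belong to the target.
    target = list(labels_row[:label_length])
    if preprocess_collapse_repeated:
        collapsed = []
        for c in target:
            # Keep a label only if it differs from the previous kept label.
            if not collapsed or collapsed[-1] != c:
                collapsed.append(c)
        target = collapsed
    if unique:
        # dict preserves insertion order, so first occurrences keep their order.
        target = list(dict.fromkeys(target))
    return tuple(target)


# The example from the *unique* attribute description.
unique_target = preprocess_target((0, 1, 1, 0, 1, 3, 3, 2, 2, 3), 10, unique=True)
```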

**Inputs**

* **1**: `logits` - Input tensor with a batch of sequences of logits. Type of elements is *T_F*. Shape of the tensor is `[N, T, C]`, where `N` is the batch size, `T` is the maximum sequence length and `C` is the number of classes including the blank. **Required.**
* **2**: `logit_length` - 1D input tensor of type *T1* and of shape `[N]` with the lengths of the input sequences of logits `logits[i,:,:]`. The tensor must consist of non-negative values not greater than `T`. **Required.**
* **3**: `labels` - 2D tensor with shape `[N, T]` of type *T2*. The length of a target sequence `labels[i,:]` is equal to `label_length[i]`, and the sequence must consist of integers from the range `[0; C-1]` except `blank_index`. **Required.**
* **4**: `label_length` - 1D tensor of type *T1* and of shape `[N]`. The tensor must consist of non-negative values not greater than `T` and satisfy `label_length[i] <= logit_length[i]` for all `i`. **Required.**
* **5**: `blank_index` - Scalar of type *T2*. Sets the class index to use for the blank label. Default value is `C-1`. **Optional.**
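
The constraints on the inputs can be collected into a small validation sketch. `check_ctc_inputs` is a hypothetical helper operating on nested Python lists, not part of any OpenVINO API. It assumes that only the first `label_length[i]` entries of each `labels` row need to satisfy the value constraint, and that `blank_index` itself must lie in `[0; C-1]`.

```python
def check_ctc_inputs(logits, logit_length, labels, label_length, blank_index=None):
    """Validate CTCLoss inputs and return the effective blank index."""
    n, t, c = len(logits), len(logits[0]), len(logits[0][0])
    if blank_index is None:
        blank_index = c - 1  # documented default
    if not 0 <= blank_index < c:
        raise ValueError("blank_index must be a class index in [0; C-1]")
    if not (len(logit_length) == len(labels) == len(label_length) == n):
        raise ValueError("logit_length, labels and label_length need N entries")
    for i in range(n):
        if len(labels[i]) != t:
            raise ValueError(f"labels[{i}] must have T={t} entries")
        if not 0 <= logit_length[i] <= t:
            raise ValueError(f"logit_length[{i}] must be in [0; T]")
        if not 0 <= label_length[i] <= logit_length[i]:
            raise ValueError(f"label_length[{i}] must be in [0; logit_length[{i}]]")
        for label in labels[i][:label_length[i]]:
            if not 0 <= label < c or label == blank_index:
                raise ValueError(f"labels[{i}] contains an invalid class {label}")
    return blank_index


# N=2 sequences, T=4 time steps, C=3 classes (class 2 is the default blank).
example_logits = [[[0.0] * 3 for _ in range(4)] for _ in range(2)]
example_labels = [[0, 1, 2, 2], [1, 0, 0, 0]]
blank = check_ctc_inputs(example_logits, [4, 3], example_labels, [2, 1])
```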

**Output**

* **1**: Output tensor with shape `[N]` holding, for each batch element, the negative log of the summed probabilities of all alignments. Type of elements is *T_F*.

**Types**

* *T_F*: any supported floating-point type.
* *T1*, *T2*: `int32` or `int64`.

**Example**

```xml
<layer ... type="CTCLoss" ...>
    <input>
        <port id="0">
            <dim>8</dim>
            <dim>20</dim>
            <dim>128</dim>
        </port>
        <port id="1">
            <dim>8</dim>
        </port>
        <port id="2">
            <dim>8</dim>
            <dim>20</dim>
        </port>
        <port id="3">
            <dim>8</dim>
        </port>
        <port id="4"/>  <!-- blank_index value is: 120 -->
    </input>
    <output>
        <port id="0">
            <dim>8</dim>
        </port>
    </output>
</layer>
```