added RNN-t conversion doc (#5140)

* added RNN-t conversion doc

* applied review comments

* a couple of corrections

* added pip3 everywhere

* fixed a typo

* applied review comments

* title name fix

* applied Tatiana's comments round 2

* fixed a typo for 'inference'

* fixed typo in MLCommons name

* moved to PyTorch* specific, applied comments

* pytorch_specific typo

* froze MLCommons revision to r1.0; fixed typo in MLCommons relative path
Pavel Esir
2021-05-31 13:14:05 +03:00
committed by GitHub
parent b33800a61c
commit cccff7fe0d
2 changed files with 111 additions and 3 deletions


@@ -0,0 +1,107 @@
# Convert PyTorch\* RNN-T Model to the Intermediate Representation (IR) {#openvino_docs_MO_DG_prepare_model_convert_model_pytorch_specific_Convert_RNNT}
This guide covers the conversion of the RNN-T model from the [MLCommons](https://github.com/mlcommons) repository. Follow
the steps below to export the PyTorch* model to ONNX* before converting it to the IR:
**Step 1**. Clone the RNN-T PyTorch implementation from the MLCommons repository (revision r1.0). Make a shallow clone to pull
only the RNN-T model without the full repository. If you already have a full repository, skip this step and go to **Step 2**:
```bash
git clone -b r1.0 -n https://github.com/mlcommons/inference rnnt_for_openvino --depth 1
cd rnnt_for_openvino
git checkout HEAD speech_recognition/rnnt
```
**Step 2**. If you already have a full clone of the MLCommons inference repository, create a folder for the
pretrained PyTorch model, where the conversion to IR will take place. You will also need to specify the path to
your full clone at **Step 5**. Skip this step if you have a shallow clone.
```bash
mkdir rnnt_for_openvino
cd rnnt_for_openvino
```
**Step 3**. Download the pretrained weights for the PyTorch implementation from https://zenodo.org/record/3662521#.YG21DugzZaQ.
On UNIX*-like systems, you can use `wget`:
```bash
wget https://zenodo.org/record/3662521/files/DistributedDataParallel_1576581068.9962234-epoch-100.pt
```
The link was taken from `setup.sh` in the `speech_recognition/rnnt` subfolder. You will get exactly the same weights as
if you had followed the steps from https://github.com/mlcommons/inference/tree/master/speech_recognition/rnnt.
**Step 4**. Install the required Python* packages:
```bash
pip3 install torch toml
```
**Step 5**. Export the RNN-T model to ONNX with the script below. Copy the code into a file named
`export_rnnt_to_onnx.py` and run it in the current directory, `rnnt_for_openvino`:
> **NOTE**: If you already have a full clone of the MLCommons inference repository, you need to
> specify the `mlcommons_inference_path` variable.
```python
import toml
import torch
import sys


def load_and_migrate_checkpoint(ckpt_path):
    # Load the MLCommons checkpoint and rename keys to match model_separable_rnnt
    checkpoint = torch.load(ckpt_path, map_location="cpu")
    migrated_state_dict = {}
    for key, value in checkpoint['state_dict'].items():
        key = key.replace("joint_net", "joint.net")
        migrated_state_dict[key] = value
    # The audio preprocessor buffers are not needed for the export
    del migrated_state_dict["audio_preprocessor.featurizer.fb"]
    del migrated_state_dict["audio_preprocessor.featurizer.window"]
    return migrated_state_dict


mlcommons_inference_path = './'  # specify the relative path to the MLCommons inference repository
checkpoint_path = 'DistributedDataParallel_1576581068.9962234-epoch-100.pt'
config_toml = 'speech_recognition/rnnt/pytorch/configs/rnnt.toml'
config = toml.load(config_toml)
rnnt_vocab = config['labels']['labels']
sys.path.insert(0, mlcommons_inference_path + 'speech_recognition/rnnt/pytorch')

from model_separable_rnnt import RNNT

model = RNNT(config['rnnt'], len(rnnt_vocab) + 1, feature_config=config['input_eval'])
model.load_state_dict(load_and_migrate_checkpoint(checkpoint_path))

seq_length, batch_size, feature_length = 157, 1, 240
inp = torch.randn([seq_length, batch_size, feature_length])
feature_length = torch.LongTensor([seq_length])

# Export the encoder
x_padded, x_lens = model.encoder(inp, feature_length)
torch.onnx.export(model.encoder, (inp, feature_length), "rnnt_encoder.onnx", opset_version=12,
                  input_names=['input.1', '1'], dynamic_axes={'input.1': {0: 'seq_len', 1: 'batch'}})

# Export the prediction network
symbol = torch.LongTensor([[20]])
hidden = torch.randn([2, batch_size, 320]), torch.randn([2, batch_size, 320])
g, hidden = model.prediction.forward(symbol, hidden)
torch.onnx.export(model.prediction, (symbol, hidden), "rnnt_prediction.onnx", opset_version=12,
                  input_names=['input.1', '1', '2'],
                  dynamic_axes={'input.1': {0: 'batch'}, '1': {1: 'batch'}, '2': {1: 'batch'}})

# Export the joint network
f = torch.randn([batch_size, 1, 1024])
model.joint.forward(f, g)
torch.onnx.export(model.joint, (f, g), "rnnt_joint.onnx", opset_version=12,
                  input_names=['0', '1'], dynamic_axes={'0': {0: 'batch'}, '1': {0: 'batch'}})
```
```bash
python3 export_rnnt_to_onnx.py
```
After completing this step, the files `rnnt_encoder.onnx`, `rnnt_prediction.onnx`, and `rnnt_joint.onnx` will be saved in
the current directory.
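Optionally, you can sanity-check the exported files before running the Model Optimizer. Below is a minimal sketch that assumes the `onnx` Python package is installed (for example, with `pip3 install onnx`); it is not required for the conversion itself:
```python
import onnx

# Load each exported model and run the ONNX structural checker
for onnx_file in ["rnnt_encoder.onnx", "rnnt_prediction.onnx", "rnnt_joint.onnx"]:
    model_proto = onnx.load(onnx_file)
    onnx.checker.check_model(model_proto)
    print(onnx_file, "passed the ONNX checker")
```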
**Step 6**. Run the Model Optimizer conversion commands:
```bash
python3 {path_to_openvino}/mo.py --input_model rnnt_encoder.onnx --input "input.1[157 1 240],1->157"
python3 {path_to_openvino}/mo.py --input_model rnnt_prediction.onnx --input "input.1[1 1],1[2 1 320],2[2 1 320]"
python3 {path_to_openvino}/mo.py --input_model rnnt_joint.onnx --input "0[1 1 1024],1[1 1 320]"
```
Please note that the hardcoded sequence length of 157 was taken from MLCommons, but the conversion to IR preserves
the network's [reshapeability](../../../../IE_DG/ShapeInference.md); this means you can change the input shapes manually to any value, either during conversion or
at inference time.
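As an illustration of reshaping at inference time, below is a minimal sketch using the Inference Engine Python API. It assumes the encoder IR files `rnnt_encoder.xml` and `rnnt_encoder.bin` produced at **Step 6**, and the new sequence length of 100 is an arbitrary example value:
```python
from openvino.inference_engine import IECore

ie = IECore()
# Read the encoder IR produced by the Model Optimizer
net = ie.read_network(model="rnnt_encoder.xml", weights="rnnt_encoder.bin")
# Change the sequence length from the default 157 to an arbitrary value, e.g. 100
net.reshape({'input.1': [100, 1, 240]})
print(net.input_info['input.1'].input_data.shape)  # [100, 1, 240]
exec_net = ie.load_network(network=net, device_name="CPU")
```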


@@ -55,9 +55,10 @@ limitations under the License.
 <tab type="user" title="Convert ONNX* GPT-2 Model to the Intermediate Representation" url="@ref openvino_docs_MO_DG_prepare_model_convert_model_onnx_specific_Convert_GPT2"/>
 <tab type="user" title="Convert DLRM ONNX* Model to the Intermediate Representation" url="@ref openvino_docs_MO_DG_prepare_model_convert_model_onnx_specific_Convert_DLRM"/>
 <tab type="usergroup" title="Converting Your PyTorch* Model" url="@ref openvino_docs_MO_DG_prepare_model_convert_model_Convert_Model_From_PyTorch">
-<tab type="user" title="Convert PyTorch* QuartzNet Model" url="@ref openvino_docs_MO_DG_prepare_model_convert_model_pytorch_specific_Convert_QuartzNet"/>
-<tab type="user" title="Convert PyTorch* YOLACT Model" url="@ref openvino_docs_MO_DG_prepare_model_convert_model_pytorch_specific_Convert_YOLACT"/>
-<tab type="user" title="Convert PyTorch* F3Net Model" url="@ref openvino_docs_MO_DG_prepare_model_convert_model_pytorch_specific_Convert_F3Net"/>
+<tab type="user" title="Convert PyTorch* QuartzNet Model" url="@ref openvino_docs_MO_DG_prepare_model_convert_model_pytorch_specific_Convert_QuartzNet"/>
+<tab type="user" title="Convert PyTorch* RNN-T Model " url="@ref openvino_docs_MO_DG_prepare_model_convert_model_pytorch_specific_Convert_RNNT"/>
+<tab type="user" title="Convert PyTorch* YOLACT Model" url="@ref openvino_docs_MO_DG_prepare_model_convert_model_pytorch_specific_Convert_YOLACT"/>
+<tab type="user" title="Convert PyTorch* F3Net Model" url="@ref openvino_docs_MO_DG_prepare_model_convert_model_pytorch_specific_Convert_F3Net"/>
 </tab>
 </tab>
 <tab type="user" title="Model Optimizations Techniques" url="@ref openvino_docs_MO_DG_prepare_model_Model_Optimization_Techniques"/>