remove deprecated cli arguments from MO (#14937)

* removed deprecated cli arguments from MO

* removed deprecated options from docs

* Update tools/legacy/benchmark_app/README.md

* removed ConcatOptimization.py, removed deprecated args from unit tests, renamed tensorflow_use_custom_operations_config -> transformations_config; replaced use of cmd_params.data_type with cmd_params.compress_to_fp16

* added comment why we set internal 'data_type' always to 'FP32' after introducing 'compress_to_fp16'

* removed openvino_docs_MO_DG_prepare_model_Model_Optimization_Techniques from MO_DevGuide

* returned back --data_type argument

* leftovers from recovering

* returned back to package_BOM.txt; fixed NHWC by default for TF

* recovered --disable_nhwc_to_nchw to MO args

* reverted removing --tensorflow_use_custom_operations_config

Co-authored-by: Roman Kazantsev <roman.kazantsev@intel.com>
Pavel Esir, 2023-01-13 00:12:56 +03:00, committed by GitHub
parent 7bb1933649
commit 9e3b52eee0
20 changed files with 28 additions and 2008 deletions


@@ -10,7 +10,6 @@
 openvino_docs_model_inputs_outputs
 openvino_docs_MO_DG_prepare_model_convert_model_Converting_Model
-openvino_docs_MO_DG_prepare_model_Model_Optimization_Techniques
 openvino_docs_MO_DG_prepare_model_convert_model_Cutting_Model
 openvino_docs_MO_DG_Additional_Optimization_Use_Cases
 openvino_docs_MO_DG_FP16_Compression

File diff suppressed because it is too large.

(Image diff omitted: deleted image file, 170 KiB)


@@ -1,65 +0,0 @@
# Model Optimization Techniques {#openvino_docs_MO_DG_prepare_model_Model_Optimization_Techniques}
Optimization offers methods to accelerate inference with convolutional neural networks (CNNs) that do not require model retraining.
* * *
## Linear Operations Fusing
Many convolutional neural networks (for example, ResNet\*, Inception\*) include `BatchNormalization` and `ScaleShift` layers that can be represented as a sequence of linear operations: additions and multiplications. For example, a ScaleShift layer can be represented as a Mul → Add sequence. These layers can be fused into preceding `Convolution` or `FullyConnected` layers, except when a Convolution comes after an Add operation (due to Convolution paddings).
### Usage
In the Model Optimizer, this optimization is turned on by default. To disable it, pass the `--disable_fusing` parameter to the Model Optimizer.
### Optimization Description
This optimization method consists of three stages:
1. `BatchNormalization` and `ScaleShift` decomposition: in this stage, the `BatchNormalization` layer is decomposed into a `Mul → Add → Mul → Add` sequence, and the `ScaleShift` layer is decomposed into a `Mul → Add` sequence.
2. Linear operations merge: in this stage, consecutive `Mul` and `Add` operations are merged into single `Mul → Add` instances.
For example, if there is a `BatchNormalization → ScaleShift` sequence in the topology, it is replaced with `Mul → Add` in the first stage. In the next stage, the latter is replaced with a `ScaleShift` layer if there is no `Convolution` or `FullyConnected` layer available to fuse into.
3. Linear operations fusion: in this stage, the tool fuses `Mul` and `Add` operations into `Convolution` or `FullyConnected` layers. Note that it searches for `Convolution` and `FullyConnected` layers both backward and forward in the graph (except that an `Add` operation cannot be fused into a `Convolution` layer in the forward direction); see the sketch after this list.
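For intuition, here is a minimal NumPy sketch of stage 3 (not Model Optimizer code; names and shapes are illustrative assumptions): folding a per-channel Mul → Add pair into the weights and bias of a preceding Convolution.
```python
import numpy as np

def fuse_mul_add_into_conv(weights, bias, scale, shift):
    """conv(x) * scale + shift == conv'(x) with the fused parameters below.

    weights: [C_out, C_in, kH, kW]; bias, scale, shift: per-output-channel [C_out].
    """
    fused_weights = weights * scale[:, None, None, None]  # scale every output channel
    fused_bias = bias * scale + shift                     # fold the Add into the bias
    return fused_weights, fused_bias

# Numeric check with a 1x1 convolution on a single pixel (a plain matmul).
rng = np.random.default_rng(0)
W, b = rng.standard_normal((8, 4, 1, 1)), rng.standard_normal(8)
s, t = rng.standard_normal(8), rng.standard_normal(8)    # the Mul -> Add pair
x = rng.standard_normal(4)                                # 4 input channels

conv = lambda W, b: W[:, :, 0, 0] @ x + b
Wf, bf = fuse_mul_add_into_conv(W, b, s, t)
assert np.allclose(conv(W, b) * s + t, conv(Wf, bf))
```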
### Usage Examples
The picture below shows the part of the Caffe ResNet269 topology where `BatchNorm` and `ScaleShift` layers will be fused into `Convolution` layers.
![Caffe ResNet269 block before and after optimization generated with Netscope*](../img/optimizations/resnet_269.svg)
* * *
## ResNet optimization (stride optimization)
ResNet optimization is a specific optimization that applies to Caffe ResNet topologies, such as ResNet50, ResNet101, and ResNet152, and to ResNet-based topologies. This optimization is turned on by default and can be disabled with the `--disable_resnet_optimization` key.
### Optimization Description
In the picture below, you can see the original and optimized parts of a Caffe ResNet50 model. The main idea of this optimization is to move a stride that is greater than 1 from Convolution layers with the kernel size = 1 to upper Convolution layers. In addition, the Model Optimizer adds a Pooling layer to align the input shape for an Eltwise layer, if it was changed during the optimization.
![ResNet50 blocks (original and optimized) from Netscope](../img/optimizations/resnet_optimization.svg)
In this example, the stride from the `res3a_branch1` and `res3a_branch2a` Convolution layers moves to the `res2c_branch2b` Convolution layer. In addition, to align the input shape for `res2c` Eltwise, the optimization inserts a Pooling layer with kernel size = 1 and stride = 2.
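The identity that makes this move legal is easy to check numerically: for a 1x1 kernel, convolving and then subsampling equals subsampling and then convolving. A small NumPy sketch (illustrative only, with the 1x1 convolution written as an einsum):
```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8, 8))       # [C, H, W] feature map
w = rng.standard_normal((2, 4))          # 1x1 kernel: [C_out, C_in]

conv1x1 = lambda t: np.einsum('oc,chw->ohw', w, t)

strided_conv = conv1x1(x)[:, ::2, ::2]   # stride 2 applied at the 1x1 Convolution
moved_stride = conv1x1(x[:, ::2, ::2])   # stride moved up to the producer
assert np.allclose(strided_conv, moved_stride)

# On the parallel branch, a Pooling with kernel 1 and stride 2 plays the
# x[:, ::2, ::2] role, which keeps the Eltwise input shapes aligned.
```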
* * *
## Grouped Convolution Fusing
Grouped convolution fusing is a specific optimization that applies to TensorFlow topologies. The main idea of this optimization is to combine the convolution results for the `Split` outputs and then recombine them using a `Concat` operation, in the same order as they were output from `Split`.
![Split→Convolutions→Concat block from TensorBoard*](../img/optimizations/groups.svg)
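A NumPy sketch of the equivalence behind this fusion (illustrative shapes and a 1x1 kernel per group; not MO source code):
```python
import numpy as np

rng = np.random.default_rng(0)
groups, cin, cout = 2, 4, 6
x = rng.standard_normal((groups * cin, 5, 5))                  # [C, H, W]
w = [rng.standard_normal((cout, cin)) for _ in range(groups)]  # one 1x1 kernel per group

def grouped_conv1x1(x, kernels):
    """What the single fused grouped Convolution computes."""
    cin = x.shape[0] // len(kernels)
    return np.concatenate([np.einsum('oc,chw->ohw', k, x[g * cin:(g + 1) * cin])
                           for g, k in enumerate(kernels)], axis=0)

# The Split -> Convolutions -> Concat pattern that Model Optimizer replaces:
parts = np.split(x, groups, axis=0)
split_concat = np.concatenate([np.einsum('oc,chw->ohw', w[g], parts[g])
                               for g in range(groups)], axis=0)

assert np.allclose(split_concat, grouped_conv1x1(x, w))
```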
* * *
## Disabling Fusing
Model Optimizer allows you to disable optimizations for specified nodes via `--finegrain_fusing <node_name1>,<node_name2>,...` (regular expressions are also supported). Using this key, you mark nodes that will not be touched by any optimizations.
### Examples of usage
In the picture below, you can see two visualized Intermediate Representations (IR) of the TensorFlow InceptionV4 topology.
The first one is the original IR produced by the Model Optimizer.
The second one is produced by the Model Optimizer with the key `--finegrain_fusing InceptionV4/InceptionV4/Conv2d_1a_3x3/Conv2D`; you can see that the `Convolution` was not fused with the `Mul1_3752` and `Mul1_4061/Fused_Mul_5096/FusedScaleShift_5987` operations.
![TF InceptionV4 block without/with key --finegrain_fusing (from IR visualizer)](../img/optimizations/inception_v4.svg)
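The marking idea can be sketched in a few lines (a hypothetical helper modeled on MO's `mark_unfused_nodes`; the real implementation differs):
```python
import re

def mark_unfused(node_names, finegrain_fusing):
    """Nodes whose names match any of the comma-separated regexes get
    'can_be_fused': False and are skipped by the fusing transformations."""
    patterns = [re.compile(p) for p in finegrain_fusing.split(',')]
    return {name for name in node_names if any(p.match(name) for p in patterns)}

nodes = ['Convolution1', 'InceptionV4/InceptionV4/Conv2d_1a_3x3/Conv2D', 'Mul1_3752']
print(mark_unfused(nodes, r'Convolution1,.*Conv2d_1a_3x3.*'))
# -> {'Convolution1', 'InceptionV4/InceptionV4/Conv2d_1a_3x3/Conv2D'}
```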


@@ -256,10 +256,8 @@ There are several middle transformations responsible for changing model layout f
 This layout change is disabled automatically if the model does not have operations that OpenVINO™ needs to execute in the NCHW layout, for example, Convolutions in NHWC layout.
-It is still possible to force Model Optimizer to do a layout change, using `--disable_nhwc_to_nchw` command-line parameter, although it is not advised.
-Layout change is a complex problem and will be addressed here very briefly. For more details on how it works, refer to the source code of the transformations mentioned in the below summary of the process:
 For more details on how it works, refer to the source code of the transformations mentioned in the below summary of the process:
 1. Model Optimizer changes output shapes of most of operations producing 4D and 5D (four dimensional and five
    dimensional) tensors as if they were in NHWC layout to NCHW layout: `nchw_shape = np.array(nhwc_shape)[[0, 3, 1, 2]]` for
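For illustration, here is the quoted permutation in standalone NumPy (assumed example shapes, not MO source):
```python
import numpy as np

nhwc_shape = [1, 224, 224, 3]
nchw_shape = np.array(nhwc_shape)[[0, 3, 1, 2]]        # -> [1, 3, 224, 224]

ndhwc_shape = [1, 8, 224, 224, 3]
ncdhw_shape = np.array(ndhwc_shape)[[0, 4, 1, 2, 3]]   # the 5D analogue
```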


@@ -142,7 +142,7 @@ This section provides step-by-step instructions on how to run the Benchmark Tool
    omz_downloader --name googlenet-v1 -o <models_dir>
 2. Convert the model to the Inference Engine IR format. Run the Model Optimizer using the `mo` command with the path to the model, model precision (which must be FP32 for CPU and FPGA) and output directory to generate the IR files:
    ```sh
-   mo --input_model <models_dir>/public/googlenet-v1/googlenet-v1.caffemodel --data_type FP32 --output_dir <ir_dir>
+   mo --input_model <models_dir>/public/googlenet-v1/googlenet-v1.caffemodel
    ```
 3. Run the tool, specifying the `<INSTALL_DIR>/samples/scripts/car.png` file as an input image, the IR of the `googlenet-v1` model and a device to perform inference on. The following commands demonstrate running the Benchmark Tool in the asynchronous mode on CPU and GPU devices:


@@ -14,7 +14,7 @@ class TransposeDFT(BackReplacementPattern):
     for operation DFT or IDFT.
     If the input rank in the TF model was greater than 2, we have [N_0, 2, N_1, ..., N_{r - 1}] as the input shape of
-    (I)DFT after the layout conversion, if the option '--disable_nhwc_to_nchw' is not specified.
+    (I)DFT after the layout conversion.
     But, generally speaking, according to DFT and IDFT specifications, the input shape [N_0, 2, N_1, ..., N_{r - 1}]
     is not correct input shape for DFT and IDFT. Hence, we need to insert Transpose operations before and after (I)DFT
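An illustrative NumPy sketch (not the transformation source) of the Transpose pair this replacer inserts, moving the size-2 "complex" dimension to the last axis and back:
```python
import numpy as np

x = np.zeros((8, 2, 16, 16))            # [N_0, 2, N_1, N_2] after layout conversion
perm = [0] + list(range(2, x.ndim)) + [1]
dft_input = np.transpose(x, perm)       # -> (8, 16, 16, 2): complex dim last, as (I)DFT expects
restored = np.transpose(dft_input, np.argsort(perm))
assert restored.shape == x.shape        # the Transpose after (I)DFT undoes the first one
```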


@@ -82,7 +82,6 @@ class CompressQuantizeWeights(BackReplacementPattern):
     """
     enabled = True
-    graph_condition = [lambda graph: not graph.graph['cmd_params'].disable_weights_compression]
     force_clean_up = True
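For context, a sketch of the idea behind `CompressQuantizeWeights` (plain NumPy with an assumed symmetric 8-bit scheme; the actual transformation works on FakeQuantize subgraphs in the IR):
```python
import numpy as np

def compress_weights(w, levels=256):
    scale = np.abs(w).max() / ((levels - 1) / 2)       # one per-tensor scale (assumption)
    q = np.clip(np.round(w / scale), -levels // 2, levels // 2 - 1).astype(np.int8)
    return q, scale                                    # int8 payload + dequantization scale

w = np.random.default_rng(0).standard_normal((16, 8)).astype(np.float32)
q, scale = compress_weights(w)
restored = q.astype(np.float32) * scale                # what the inserted dequantization reconstructs
assert np.abs(restored - w).max() <= scale             # error bounded by one quantization step
```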


@@ -1,86 +1,10 @@
 # Copyright (C) 2018-2022 Intel Corporation
 # SPDX-License-Identifier: Apache-2.0

-import logging as log
-
-from openvino.tools.mo.middle.fusings import Fusing
-from openvino.tools.mo.middle.pass_separator import PostMiddleStart
-from openvino.tools.mo.graph.graph import Node, Graph
+from openvino.tools.mo.graph.graph import Graph
 from openvino.tools.mo.middle.replacement import MiddleReplacementPattern


-class ConcatOptimization(MiddleReplacementPattern):
-    # This optimization reduces number of edges between Concat operations
-    # that significantly reduce memory consumption
-    enabled = True
-    graph_condition = [lambda graph: graph.graph['cmd_params'].enable_concat_optimization]
-
-    def run_after(self):
-        return [Fusing]
-
-    def run_before(self):
-        return [PostMiddleStart]
-
-    def find_and_replace_pattern(self, graph: Graph):
-        mp = {}
-        used = {}
-        for node in graph.get_op_nodes(type='Concat'):
-            in_nodes = tuple([node.in_node(idx).id for idx in range(len(node.in_nodes()))])
-            out_node = (node.id, node.out_node().id)
-            if in_nodes in mp:
-                log.warning("Something is weird! {} and {}".format(node.id, mp[in_nodes]))
-            else:
-                mp.update({in_nodes: out_node})
-                used.update({node.id: {x: False for x in in_nodes}})
-
-        for key in mp.keys():
-            replacers = []
-            for i in range(len(key)):
-                for j in range(i + 1, len(key)):
-                    arr = tuple(key[i:j + 1])
-                    if arr in mp.keys() and arr != key:
-                        replacers.append((len(arr), arr))
-
-            replacers.sort(reverse=True)
-
-            concat_id = mp[key][0]
-            for ln, arr in replacers:
-                # Check that we can do it!!!
-                we_can = True
-                for x in arr:
-                    if used[concat_id][x]:
-                        we_can = False
-                        break
-                if not we_can:
-                    continue
-                for x in arr:
-                    used[concat_id][x] = True
-                edge_attrs = graph.get_edge_data(arr[0], concat_id)[0]
-                for in_node in arr:
-                    graph.remove_edge(in_node, concat_id)
-                new_input = mp[arr][1]
-                out_port = len(Node(graph, new_input).out_nodes()) + 1
-                edge_attrs['out'] = out_port
-                graph.add_edge(new_input, concat_id, **edge_attrs)
-
-                # Renumber 'in' attrs
-                concat_node = Node(graph, concat_id)
-                ln = len(concat_node.in_nodes())
-                ports = [x for x in concat_node.in_nodes().keys()]
-                ports.sort()
-                p_id = 0
-                for p in ports:
-                    in_node = concat_node.in_nodes()[p]
-                    graph[in_node.id][concat_id][0]['in'] = p_id
-                    p_id += 1
-
-
 class ConcatOdInputEraserAndPortsReconnect(MiddleReplacementPattern):
     """
     The transformation performs two actions with Concat operations:


@@ -21,7 +21,6 @@ class ReluFakeQuantizeMark(MiddleReplacementPattern):
     """
     enabled = True
-    graph_condition = [lambda graph: not graph.graph['cmd_params'].disable_fusing]

     def run_after(self):
         return [BinarizeWeightsM1P1]


@@ -46,9 +46,6 @@ class Fusing(MiddleReplacementPattern):
         for_graph_and_each_sub_graph_recursively(graph, fuse_pad)
         for_graph_and_each_sub_graph_recursively(graph, lambda G: G.clean_up())

-        # Mark nodes with attr 'can_be_fused': False to disable fusing for specified nodes
-        for_graph_and_each_sub_graph_recursively(graph, lambda graph: mark_unfused_nodes(graph, argv.finegrain_fusing))
-
         # Converting FusedBatchNorm layer to Mul->Add->Mul->Add sequence
         # IE doesn't support batchNormInference with 4 inputs, so we have to split it to two ScaleShift
         for_graph_and_each_sub_graph_recursively(graph, convert_batch_norm)
@@ -61,42 +58,40 @@ class Fusing(MiddleReplacementPattern):
         for_graph_and_each_sub_graph_recursively(graph, Sub().find_and_replace_pattern)
         for_graph_and_each_sub_graph_recursively(graph, lambda G: G.clean_up())

-        if not argv.disable_fusing:
-            if fw != 'caffe':
-                # Converting ScaleShift layer to Mul->Add
-                for_graph_and_each_sub_graph_recursively(graph, convert_scale_shift_to_mul_add)
-                for_graph_and_each_sub_graph_recursively(graph, lambda G: G.clean_up())
-            # Fusing the sequences of Mul/Add operations
-            for_graph_and_each_sub_graph_recursively(graph, fuse_mul_add_sequence)
+        if fw != 'caffe':
+            # Converting ScaleShift layer to Mul->Add
+            for_graph_and_each_sub_graph_recursively(graph, convert_scale_shift_to_mul_add)
+            for_graph_and_each_sub_graph_recursively(graph, lambda G: G.clean_up())
+
+        normalize_eltwise_inputs(graph)
+        for_graph_and_each_sub_graph_recursively(graph, lambda G: G.clean_up())
+
+        # Fusing the sequences of Mul/Add operations
+        for_graph_and_each_sub_graph_recursively(graph, fuse_mul_add_sequence)
+        for_graph_and_each_sub_graph_recursively(graph, lambda G: G.clean_up())
+
+        # Fusing linear operation to Convolution
+        for_graph_and_each_sub_graph_recursively(graph, fuse_linear_ops)
+        for_graph_and_each_sub_graph_recursively(graph, lambda G: G.clean_up())

-            normalize_eltwise_inputs(graph)
-            for_graph_and_each_sub_graph_recursively(graph, lambda G: G.clean_up())
-            # Fusing linear operation to Convolution
-            for_graph_and_each_sub_graph_recursively(graph, fuse_linear_ops)
-            for_graph_and_each_sub_graph_recursively(graph, lambda G: G.clean_up())

         for_graph_and_each_sub_graph_recursively(graph, grouped_convolutions_fusing)
         for_graph_and_each_sub_graph_recursively(graph, lambda G: G.clean_up())

-        if not argv.disable_fusing:
-            for_graph_and_each_sub_graph_recursively(graph, fuse_linear_ops)
-            for_graph_and_each_sub_graph_recursively(graph, lambda G: G.clean_up())
+        for_graph_and_each_sub_graph_recursively(graph, fuse_linear_ops)
+        for_graph_and_each_sub_graph_recursively(graph, lambda G: G.clean_up())

         for_graph_and_each_sub_graph_recursively(graph, normalize_eltwise_inputs)
         for_graph_and_each_sub_graph_recursively(graph, lambda G: G.clean_up())

-        if not argv.disable_fusing:
-            MarkNodesToFuseUpToFakeQuantize().find_and_replace_pattern(graph)
-            FakeQuantizeFuse().find_and_replace_pattern(graph)
-            AddFakeQuantizeFuse().find_and_replace_pattern(graph)
-            MulFakeQuantizeFuse().find_and_replace_pattern(graph)
-            for_graph_and_each_sub_graph_recursively(graph, lambda G: G.clean_up())
+        MarkNodesToFuseUpToFakeQuantize().find_and_replace_pattern(graph)
+        FakeQuantizeFuse().find_and_replace_pattern(graph)
+        AddFakeQuantizeFuse().find_and_replace_pattern(graph)
+        MulFakeQuantizeFuse().find_and_replace_pattern(graph)
+        for_graph_and_each_sub_graph_recursively(graph, lambda G: G.clean_up())

         mark_shape_of_sugraph_as_unfusable(graph)
         for_graph_and_each_sub_graph_recursively(graph, fuse_pad)
         for_graph_and_each_sub_graph_recursively(graph, lambda G: G.clean_up())

-        if layout != 'NHWC' and not argv.disable_resnet_optimization:
+        if layout != 'NHWC':
             stride_optimization(graph)
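The `for_graph_and_each_sub_graph_recursively` helper used throughout this pass follows a simple pattern; here is a self-contained sketch with a hypothetical `Graph` class (not MO's actual API):
```python
class Graph:                       # hypothetical stand-in, not MO's Graph
    def __init__(self, sub_graphs=()):
        self.sub_graphs = list(sub_graphs)

def for_graph_and_each_sub_graph_recursively(graph, func):
    func(graph)                    # run the pass on this graph...
    for sub in graph.sub_graphs:   # ...then on every nested sub-graph (Loop/If bodies)
        for_graph_and_each_sub_graph_recursively(sub, func)

root = Graph([Graph(), Graph([Graph()])])
for_graph_and_each_sub_graph_recursively(root, lambda g: print('clean up', id(g)))  # visits 4 graphs
```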


@@ -20,7 +20,6 @@ class MarkNodesToFuseUpToFakeQuantize(MiddleReplacementPattern):
     """
     enabled = True
-    graph_condition = [lambda graph: not graph.graph['cmd_params'].disable_fusing]

     def run_after(self):
         return [DeleteControlFlowEdges]
@@ -73,7 +72,6 @@ class FakeQuantizeFuse(MiddleReplacementPattern):
     replacer duplicates node to fuse (duplicate connections of inputs of node to fuse to duplicates of it)
     """
     enabled = True
-    graph_condition = [lambda graph: not graph.graph['cmd_params'].disable_fusing]

     def run_after(self):
         return [QuantizeLinearResolver]


@@ -723,9 +723,9 @@ class DeprecatedStoreTrue(argparse.Action):

 class DeprecatedOptionCommon(argparse.Action):
     def __call__(self, parser, args, values, option_string):
-        dep_msg = "Use of deprecated cli option {} detected. Option use in the following releases will be fatal. ".format(option_string)
-        log.error(dep_msg, extra={'is_warning': True})
-        setattr(args, self.dest, values)
+        dep_msg = "Use of deprecated cli option {} detected. Option use in the following releases will be fatal. ".format(option_string)
+        log.error(dep_msg, extra={'is_warning': True})
+        setattr(args, self.dest, values)


 class IgnoredAction(argparse.Action):
@@ -1025,19 +1025,6 @@ def get_common_cli_parser(parser: argparse.ArgumentParser = None):
                               help=mo_convert_params_common['transform'].description.format(
                                   mo_convert_params_common['transform'].possible_types_command_line),
                               default="")
-    common_group.add_argument('--disable_fusing',
-                              help='[DEPRECATED] Turn off fusing of linear operations to Convolution.',
-                              action=DeprecatedStoreTrue)
-    common_group.add_argument('--disable_resnet_optimization',
-                              help='[DEPRECATED] Turn off ResNet optimization.',
-                              action=DeprecatedStoreTrue, default=False)
-    common_group.add_argument('--finegrain_fusing',
-                              help='[DEPRECATED] Regex for layers/operations that won\'t be fused. ' +
-                                   'Example: --finegrain_fusing Convolution1,.*Scale.*',
-                              action=DeprecatedOptionCommon)
-    common_group.add_argument('--enable_concat_optimization',
-                              help='[DEPRECATED] Turn on Concat optimization.',
-                              action=DeprecatedStoreTrue, default=False)
     # we use CanonicalizeDirCheckExistenceAction instead of readable_dirs to handle empty strings
     common_group.add_argument("--extensions",
                               help=mo_convert_params_common['extensions'].description.format(
@@ -1067,9 +1054,6 @@ def get_common_cli_parser(parser: argparse.ArgumentParser = None):
     common_group.add_argument('--static_shape',
                               help=mo_convert_params_common['static_shape'].description,
                               action='store_true', default=False)
-    common_group.add_argument('--disable_weights_compression',
-                              help='[DEPRECATED] Disable compression and store weights with original precision.',
-                              action=DeprecatedStoreTrue, default=False)
     common_group.add_argument('--progress',
                               help=mo_convert_params_common['progress'].description,
                               action='store_true', default=False)
@@ -1106,7 +1090,6 @@ def get_common_cli_options(model_name):
     d['scale_values'] = ['- Scale values', lambda x: x if x else 'Not specified']
     d['scale'] = ['- Scale factor', lambda x: x if x else 'Not specified']
     d['data_type'] = ['- Precision of IR', lambda x: 'FP32' if x == 'float' else 'FP16' if x == 'half' else x]
-    d['disable_fusing'] = ['- Enable fusing', lambda x: not x]
     d['transform'] = ['- User transformations', lambda x: x if x else 'Not specified']
     d['reverse_input_channels'] = '- Reverse input channels'
     d['static_shape'] = '- Enable IR generation for fixed input shape'
@@ -1126,7 +1109,6 @@ def get_caffe_cli_options():
         'input_proto': ['- Path to the Input prototxt', lambda x: x],
         'caffe_parser_path': ['- Path to Python Caffe* parser generated from caffe.proto', lambda x: x],
         'k': '- Path to CustomLayersMapping.xml',
-        'disable_resnet_optimization': ['- Enable resnet optimization', lambda x: not x],
     }

     return OrderedDict(sorted(d.items(), key=lambda t: t[0]))
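For reference, the deprecation pattern these actions implement can be reproduced with standard `argparse` alone (a self-contained sketch; MO's real actions route the message through its logger with `is_warning` rather than `warnings`):
```python
import argparse
import warnings

class DeprecatedStoreTrue(argparse.Action):
    def __init__(self, option_strings, dest, nargs=0, **kwargs):
        super().__init__(option_strings, dest, nargs=nargs, **kwargs)

    def __call__(self, parser, args, values, option_string=None):
        warnings.warn("Use of deprecated cli option {} detected. "
                      "Option use in the following releases will be fatal.".format(option_string))
        setattr(args, self.dest, True)

parser = argparse.ArgumentParser()
parser.add_argument('--disable_fusing', action=DeprecatedStoreTrue, default=False,
                    help='[DEPRECATED] Turn off fusing of linear operations to Convolution.')
print(parser.parse_args(['--disable_fusing']))   # warns, then Namespace(disable_fusing=True)
```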


@@ -45,13 +45,8 @@ def base_args_config():
     args.output_dir = os.getcwd()
     args.freeze_placeholder_with_value = None
     args.transformations_config = None
-    args.disable_fusing = None
-    args.finegrain_fusing = None
     args.disable_gfusing = None
-    args.disable_resnet_optimization = None
-    args.enable_concat_optimization = None
     args.static_shape = None
-    args.disable_weights_compression = None
     args.reverse_input_channels = None
     args.data_type = None
     args.layout = None


@@ -104,7 +104,6 @@ class TFFFTToDFTTest(unittest.TestCase):
             'fft': {'num_of_dimensions': num_of_dimensions, 'fft_kind': dft_type},
         })
         graph.stage = 'front'
-        setattr(graph.graph['cmd_params'], 'disable_nhwc_to_nchw', False)
         graph.graph['layout'] = 'NHWC'
         TFFFTToDFT().find_and_replace_pattern(graph)
         ref_graph = build_graph(nodes_attrs=ref_dft_graph_node_attrs,
@@ -147,7 +146,6 @@ class TFFFTToDFTTest(unittest.TestCase):
             'fft': {'num_of_dimensions': num_of_dims, 'fft_kind': fft_kind},
         })
         graph.stage = 'front'
-        setattr(graph.graph['cmd_params'], 'disable_nhwc_to_nchw', False)
         graph.graph['layout'] = 'NHWC'
         TFFFTToDFT().find_and_replace_pattern(graph)
         ref_graph = build_graph(nodes_attrs=ref_dft_graph_with_signal_size_node_attrs,


@@ -40,12 +40,7 @@ def base_args_config(use_legacy_fe: bool = None, use_new_fe: bool = None):
     args.output_dir = os.getcwd()
     args.freeze_placeholder_with_value = None
     args.transformations_config = None
-    args.disable_fusing = None
-    args.finegrain_fusing = None
-    args.disable_resnet_optimization = None
-    args.enable_concat_optimization = None
     args.static_shape = None
-    args.disable_weights_compression = None
     args.reverse_input_channels = None
     args.data_type = None
     args.layout = None


@@ -44,12 +44,7 @@ def base_args_config(use_legacy_fe:bool=None, use_new_fe:bool=None):
     args.output_dir=os.getcwd()
     args.freeze_placeholder_with_value = None
     args.transformations_config = None
-    args.disable_fusing = None
-    args.finegrain_fusing = None
-    args.disable_resnet_optimization = None
-    args.enable_concat_optimization = None
     args.static_shape = None
-    args.disable_weights_compression = None
     args.reverse_input_channels = None
     args.data_type = None
     args.layout = None


@@ -39,12 +39,7 @@ def base_args_config(use_legacy_fe: bool = None, use_new_fe: bool = None):
     args.output_dir = os.getcwd()
     args.freeze_placeholder_with_value = None
     args.transformations_config = None
-    args.disable_fusing = None
-    args.finegrain_fusing = None
-    args.disable_resnet_optimization = None
-    args.enable_concat_optimization = None
     args.static_shape = None
-    args.disable_weights_compression = None
     args.reverse_input_channels = None
     args.data_type = None
     args.layout = None
@@ -55,7 +50,6 @@ def base_args_config(use_legacy_fe: bool = None, use_new_fe: bool = None):
     args.saved_model_dir = None
     args.input_meta_graph = None
     args.saved_model_tags = None
-    args.tensorflow_use_custom_operations_config = None

     return args


@@ -39,12 +39,7 @@ def base_args_config():
     args.output_dir=os.getcwd()
     args.freeze_placeholder_with_value = None
     args.transformations_config = None
-    args.disable_fusing = None
-    args.finegrain_fusing = None
-    args.disable_resnet_optimization = None
-    args.enable_concat_optimization = None
     args.static_shape = None
-    args.disable_weights_compression = None
     args.reverse_input_channels = None
     args.data_type = None
     args.layout = None


@@ -46,7 +46,6 @@ def arg_parse_helper(input_model,
         source_layout={},
         target_layout={},
         freeze_placeholder_with_value=None,
-        tensorflow_use_custom_operations_config=None,
         data_type=None,
         tensorflow_custom_operations_config_update=None,
     )


@@ -43,14 +43,8 @@ def base_args_config():
     args.scale_values = None
     args.output_dir = os.getcwd()
     args.freeze_placeholder_with_value = None
-    args.tensorflow_use_custom_operations_config = None
     args.transformations_config = None
-    args.disable_fusing = None
-    args.finegrain_fusing = None
-    args.disable_resnet_optimization = None
-    args.enable_concat_optimization = None
     args.static_shape = None
-    args.disable_weights_compression = None
     args.reverse_input_channels = None
     args.data_type = None
     args.layout = None