openvino/docs/ops/quantization/FakeQuantize_1.md
Nikolay Tyukaev ef45b5da8d Doc Migration (master) (#1377)
Co-authored-by: domi2000 <domi2000@users.noreply.github.com>
2020-07-20 17:36:08 +03:00


FakeQuantize

Versioned name: FakeQuantize-1

Category: Quantization

Short description: FakeQuantize is element-wise linear quantization of floating-point input values into a discrete set of floating-point values.

Detailed description: Input and output ranges, as well as the number of quantization levels, are specified by dedicated inputs and attributes. The limits can differ per element or per group of elements (channels) of the input tensor, or a single limit can apply to all elements. Which case applies depends on the shapes of the inputs that specify the limits and on the regular broadcasting rules applied to the input tensors. The output of the operation is a floating-point tensor of the same type as the input tensor. In general, four values specify the quantization for each element: input_low, input_high, output_low, output_high. The input_low and input_high inputs specify the input range of quantization. All input values outside this range are clipped to the range before actual quantization. output_low and output_high specify the minimum and maximum quantized values at the output.

Fake in FakeQuantize means that the output tensor has the same floating-point type as the input tensor, not an integer type.

Each element of the output is defined as the result of the following expression:

if x <= min(input_low, input_high):
    output = output_low
elif x > max(input_low, input_high):
    output = output_high
else:
    # input_low < x <= input_high
    output = round((x - input_low) / (input_high - input_low) * (levels-1)) / (levels-1) * (output_high - output_low) + output_low
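
The expression above can be sketched for a single element in plain Python. This is an illustration only, not the reference implementation: the function name `fake_quantize_scalar` is hypothetical, and Python's built-in `round` (which rounds halves to the nearest even integer) is an assumption, since the text above does not state a rounding mode.

```python
def fake_quantize_scalar(x, input_low, input_high, output_low, output_high, levels):
    """Apply the FakeQuantize expression to one float (illustrative sketch)."""
    if x <= min(input_low, input_high):
        return output_low
    if x > max(input_low, input_high):
        return output_high
    # Reaching this point means min < x <= max, so input_low != input_high
    # and the division below is safe.
    steps = levels - 1
    # Assumption: built-in round() stands in for the unspecified rounding mode.
    q = round((x - input_low) / (input_high - input_low) * steps)
    return q / steps * (output_high - output_low) + output_low


# Five levels over [0, 1] map every input onto {0, 0.25, 0.5, 0.75, 1}.
values = [fake_quantize_scalar(v, 0.0, 1.0, 0.0, 1.0, levels=5)
          for v in (-1.0, 0.3, 0.6, 0.9, 2.0)]
print(values)  # [0.0, 0.25, 0.5, 1.0, 1.0]
```

When input_low equals input_high, only the first two branches can be taken, so the result is a two-valued (binarized) output.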

Attributes

  • levels

    • Description: levels is the number of quantization levels (e.g. 2 is for binarization, 255/256 is for int8 quantization)
    • Range of values: an integer greater than or equal to 2
    • Type: int
    • Default value: None
    • Required: yes
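
To make the role of levels concrete, the sketch below enumerates the values the output can take for given output limits. It follows from the quantization expression, where the rounded term is an integer between 0 and levels-1. The helper name `output_grid` is hypothetical.

```python
def output_grid(output_low, output_high, levels):
    """List the `levels` evenly spaced values between the output limits."""
    steps = levels - 1
    return [k / steps * (output_high - output_low) + output_low
            for k in range(levels)]


print(output_grid(-1.0, 1.0, 2))           # [-1.0, 1.0]
print(output_grid(0.0, 1.0, 5))            # [0.0, 0.25, 0.5, 0.75, 1.0]
print(len(output_grid(0.0, 255.0, 256)))   # 256
```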

Inputs:

  • 1: X - multidimensional input tensor of floating-point type to be quantized. Required.

  • 2: input_low - minimum limit for input value. The shape must be broadcastable to the shape of X. Required.

  • 3: input_high - maximum limit for input value. Can be the same as input_low for binarization. The shape must be broadcastable to the shape of X. Required.

  • 4: output_low - minimum quantized value. The shape must be broadcastable to the shape of X. Required.

  • 5: output_high - maximum quantized value. The shape must be broadcastable to the shape of X. Required.
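
The shape requirement on inputs 2 to 5 can be checked with a small helper. This sketch assumes that "broadcastable" means NumPy-style broadcasting (shapes aligned from the trailing dimension, with a limit dimension equal to 1 or to the matching dimension of X). The function name `broadcastable_to` is hypothetical.

```python
def broadcastable_to(limit_shape, x_shape):
    """Return True if limit_shape broadcasts to exactly x_shape.

    Assumption: NumPy-style rules, with dimensions aligned from the right.
    """
    if len(limit_shape) > len(x_shape):
        return False
    for limit_dim, x_dim in zip(reversed(limit_shape), reversed(x_shape)):
        if limit_dim != 1 and limit_dim != x_dim:
            return False
    return True


x_shape = (1, 64, 56, 56)
print(broadcastable_to((1, 64, 1, 1), x_shape))  # True: one limit per channel
print(broadcastable_to((1, 1, 1, 1), x_shape))   # True: one limit for all elements
print(broadcastable_to((1, 32, 1, 1), x_shape))  # False: 32 does not match 64
```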

Outputs:

  • 1: Y - resulting tensor with shape and type matching the 1st input tensor X.

Example

<layer  type="FakeQuantize">
    <data levels="2"/>
    <input>
        <port id="0">
            <dim>1</dim>
            <dim>64</dim>
            <dim>56</dim>
            <dim>56</dim>
        </port>
        <port id="1">
            <dim>1</dim>
            <dim>64</dim>
            <dim>1</dim>
            <dim>1</dim>
        </port>
        <port id="2">
            <dim>1</dim>
            <dim>64</dim>
            <dim>1</dim>
            <dim>1</dim>
        </port>
        <port id="3">
            <dim>1</dim>
            <dim>1</dim>
            <dim>1</dim>
            <dim>1</dim>
        </port>
        <port id="4">
            <dim>1</dim>
            <dim>1</dim>
            <dim>1</dim>
            <dim>1</dim>
        </port>
    </input>
    <output>
        <port id="5">
            <dim>1</dim>
            <dim>64</dim>
            <dim>56</dim>
            <dim>56</dim>
        </port>
    </output>
</layer>
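
The shapes in this layer can be reproduced with a NumPy sketch that applies the quantization expression to whole tensors. The input data and the limit values are made up for illustration, NumPy broadcasting is assumed to match the broadcasting rules of the operation, and `np.round` stands in for the unspecified rounding mode.

```python
import numpy as np

rng = np.random.default_rng(0)

# Shapes follow ports 0-4 of the layer above; all values are hypothetical.
x = rng.standard_normal((1, 64, 56, 56)).astype(np.float32)
input_low = rng.uniform(-0.5, 0.0, size=(1, 64, 1, 1)).astype(np.float32)
input_high = input_low + np.float32(1.0)            # per-channel input range
output_low = np.full((1, 1, 1, 1), -1.0, dtype=np.float32)
output_high = np.full((1, 1, 1, 1), 1.0, dtype=np.float32)
levels = 2

lo = np.minimum(input_low, input_high)
hi = np.maximum(input_low, input_high)
scaled = (x - input_low) / (input_high - input_low) * (levels - 1)
quantized = np.round(scaled) / (levels - 1) * (output_high - output_low) + output_low

# Elements at or below the range get output_low, elements above it get output_high.
y = np.where(x <= lo, output_low, np.where(x > hi, output_high, quantized))

print(y.shape, y.dtype)                 # (1, 64, 56, 56) float32
print(sorted(set(y.ravel().tolist())))  # [-1.0, 1.0]
```

With levels equal to 2, every output element is either output_low or output_high, and the output keeps the shape and type of X.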