MatMul
Versioned name: MatMul-1
Category: Matrix multiplication
Short description: Generalized matrix multiplication
Detailed description
The MatMul operation takes two tensors and performs matrix-matrix, matrix-vector, or vector-matrix multiplication, depending on the argument shapes. Input tensors can have any rank >= 1. The two right-most axes of each tensor are interpreted as the matrix row and column dimensions, while all axes to their left (if present) are interpreted as a multi-dimensional batch: [BATCH_DIM_1, BATCH_DIM_2,..., BATCH_DIM_K, ROW_INDEX_DIM, COL_INDEX_DIM]. The operation supports the usual broadcast semantics for batch dimensions, which enables multiplying a batch of matrix pairs in a single operation.
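To make the batch layout concrete, here is a minimal pure-Python sketch of multiplying a batch of matrix pairs with a single leading batch axis. The helper names `matmul_2d` and `batched_matmul` and the nested-list representation are assumptions made for illustration; they are not part of the specification, and broadcasting is deliberately left out.

```python
def matmul_2d(a, b):
    """Multiply an M x K matrix by a K x N matrix (both given as lists of rows)."""
    rows, inner, cols = len(a), len(b), len(b[0])
    return [
        [sum(a[i][k] * b[k][j] for k in range(inner)) for j in range(cols)]
        for i in range(rows)
    ]


def batched_matmul(a_batch, b_batch):
    """Multiply matrices pairwise along one leading batch axis.

    Layout of each argument: [BATCH_DIM, ROW_INDEX_DIM, COL_INDEX_DIM].
    """
    return [matmul_2d(a, b) for a, b in zip(a_batch, b_batch)]


a = [[[1, 2], [3, 4]], [[1, 0], [0, 1]]]  # shape [2, 2, 2]
b = [[[5], [6]], [[7], [8]]]              # shape [2, 2, 1]
result = batched_matmul(a, b)             # shape [2, 2, 1]
print(result)  # [[[17], [39]], [[7], [8]]]
```

Each batch index selects one pair of matrices; only the two right-most axes take part in the multiplication itself.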
Before matrix multiplication, there is an implicit shape alignment for input arguments. It consists of the following steps:
1. Transpositions specified by the optional transpose_a and transpose_b attributes are applied. Only the two right-most dimensions are transposed; other dimensions remain the same. Transpose attributes are ignored for 1D tensors.
2. One-dimensional tensors are unsqueezed, independently for each input. The axes inserted in this step are not included in the output shape.
   - If the rank of the first input is 1, it is always unsqueezed to a 2D row vector (regardless of transpose_a) by adding an axis of size 1 at ROW_INDEX_DIM, to the left of the shape. For example, [S] is reshaped to [1, S].
   - If the rank of the second input is 1, it is always unsqueezed to a 2D column vector (regardless of transpose_b) by adding an axis of size 1 at COL_INDEX_DIM, to the right of the shape. For example, [S] is reshaped to [S, 1].
3. If the ranks of the inputs differ after steps 1 and 2, the tensor with the smaller rank is unsqueezed from the left side of its shape by as many axes as needed to give both shapes the same rank.
4. The usual broadcasting rules are applied to the batch dimensions.
Temporary axes inserted in step 2 are removed from the final output shape after the multiplication.
After vector-matrix multiplication, the temporary axis inserted at ROW_INDEX_DIM is removed. After matrix-vector multiplication, the temporary axis inserted at COL_INDEX_DIM is removed.
The output shape of a multiplication of two 1D tensors, [S] x [S], is squeezed to a scalar.
Output shape inference examples (ND here means a rank greater than 1):
- 1D x 1D: [X] x [X] -> [1, X] x [X, 1] -> [1, 1] => [] (scalar)
- 1D x ND: [X] x [B, ..., X, Y] -> [1, X] x [B, ..., X, Y] -> [B, ..., 1, Y] => [B, ..., Y]
- ND x 1D: [B, ..., X, Y] x [Y] -> [B, ..., X, Y] x [Y, 1] -> [B, ..., X, 1] => [B, ..., X]
- ND x ND: [B, ..., X, Y] x [B, ..., Y, Z] => [B, ..., X, Z]
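The shape alignment and the examples above can be sketched as a small shape-inference function. This is an illustrative sketch, not the reference implementation: the function name `matmul_output_shape` is hypothetical, shapes are plain lists of integers, and batch broadcasting is assumed to follow NumPy-style rules (dimensions must be equal, or one of them must be 1).

```python
def matmul_output_shape(a_shape, b_shape, transpose_a=False, transpose_b=False):
    """Infer the MatMul output shape from two input shapes (lists of ints)."""
    a, b = list(a_shape), list(b_shape)
    if not a or not b:
        raise ValueError("both inputs must have rank >= 1")
    a_is_1d, b_is_1d = len(a) == 1, len(b) == 1

    # Transposition: swap the two right-most dims; ignored for 1D inputs.
    if transpose_a and not a_is_1d:
        a[-2], a[-1] = a[-1], a[-2]
    if transpose_b and not b_is_1d:
        b[-2], b[-1] = b[-1], b[-2]

    # Unsqueeze 1D inputs: first input -> row vector, second -> column vector.
    if a_is_1d:
        a = [1] + a
    if b_is_1d:
        b = b + [1]

    # Rank alignment: left-pad the lower-rank shape with axes of size 1.
    rank = max(len(a), len(b))
    a = [1] * (rank - len(a)) + a
    b = [1] * (rank - len(b)) + b

    if a[-1] != b[-2]:
        raise ValueError(f"inner dimensions differ: {a[-1]} vs {b[-2]}")

    # Batch broadcasting (assumed NumPy-style).
    batch = []
    for x, y in zip(a[:-2], b[:-2]):
        if x != y and 1 not in (x, y):
            raise ValueError(f"batch dimensions {x} and {y} are not broadcastable")
        batch.append(y if x == 1 else x)

    out = batch + [a[-2], b[-1]]

    # Remove the temporary axes inserted for 1D inputs.
    if a_is_1d:
        del out[-2]  # temporary ROW_INDEX_DIM
    if b_is_1d:
        del out[-1]  # temporary COL_INDEX_DIM
    return out


print(matmul_output_shape([7], [7]))                    # []
print(matmul_output_shape([7], [3, 7, 4]))              # [3, 4]
print(matmul_output_shape([3, 5, 7], [7]))              # [3, 5]
print(matmul_output_shape([5, 10, 1024], [1024, 1000])) # [5, 10, 1000]
```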
Two attributes, transpose_a and transpose_b, specify an embedded transposition of the two right-most dimensions of the first and the second input tensors, respectively. Transposition swaps ROW_INDEX_DIM and COL_INDEX_DIM in the corresponding input tensor. Batch dimensions and 1D tensors are not affected by these attributes.
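As a minimal illustration of the embedded transposition, the following pure-Python sketch multiplies two 2D matrices and swaps the row and column indices on access instead of building a transposed copy. The function name `matmul_with_transpose` and the nested-list representation are assumptions made for this example only.

```python
def matmul_with_transpose(a, b, transpose_a=False, transpose_b=False):
    """2D matrix product with optional swapping of ROW_INDEX_DIM and COL_INDEX_DIM."""

    def dims(m, transposed):
        rows, cols = len(m), len(m[0])
        return (cols, rows) if transposed else (rows, cols)

    def at(m, r, c, transposed):
        # Read element (r, c) of the logically transposed matrix.
        return m[c][r] if transposed else m[r][c]

    m_rows, inner_a = dims(a, transpose_a)
    inner_b, n_cols = dims(b, transpose_b)
    if inner_a != inner_b:
        raise ValueError(f"inner dimensions differ: {inner_a} vs {inner_b}")

    return [
        [
            sum(at(a, i, k, transpose_a) * at(b, k, j, transpose_b)
                for k in range(inner_a))
            for j in range(n_cols)
        ]
        for i in range(m_rows)
    ]


x = [[1, 2, 3]]               # shape [1, 3]
w = [[1, 0, 1], [0, 1, 1]]    # shape [2, 3], i.e. stored transposed
print(matmul_with_transpose(x, w, transpose_b=True))  # [[4, 5]]
```

With transpose_b set, the second input of shape [2, 3] is treated as a [3, 2] matrix, so the result has shape [1, 2].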
Attributes
- transpose_a
  - Description: transposes dimensions ROW_INDEX_DIM and COL_INDEX_DIM of the 1st input; false means no transpose, true means transpose. It is ignored if the first input is a 1D tensor.
  - Range of values: false or true
  - Type: boolean
  - Default value: false
  - Required: no
- transpose_b
  - Description: transposes dimensions ROW_INDEX_DIM and COL_INDEX_DIM of the 2nd input; false means no transpose, true means transpose. It is ignored if the second input is a 1D tensor.
  - Range of values: false or true
  - Type: boolean
  - Default value: false
  - Required: no
Inputs:
- 1: Tensor of type T with matrices A. Rank >= 1. Required.
- 2: Tensor of type T with matrices B. Rank >= 1. Required.
Outputs
- 1: Tensor of type T with results of the multiplication.
Types:
- T: any supported floating-point or integer type.
Example
Vector-matrix multiplication
<layer ... type="MatMul">
    <input>
        <port id="0">
            <dim>1024</dim>
        </port>
        <port id="1">
            <dim>1024</dim>
            <dim>1000</dim>
        </port>
    </input>
    <output>
        <port id="2">
            <dim>1000</dim>
        </port>
    </output>
</layer>
Matrix-vector multiplication
<layer ... type="MatMul">
    <input>
        <port id="0">
            <dim>1000</dim>
            <dim>1024</dim>
        </port>
        <port id="1">
            <dim>1024</dim>
        </port>
    </input>
    <output>
        <port id="2">
            <dim>1000</dim>
        </port>
    </output>
</layer>
Matrix-matrix multiplication (like FullyConnected with batch size 1)
<layer ... type="MatMul">
    <input>
        <port id="0">
            <dim>1</dim>
            <dim>1024</dim>
        </port>
        <port id="1">
            <dim>1024</dim>
            <dim>1000</dim>
        </port>
    </input>
    <output>
        <port id="2">
            <dim>1</dim>
            <dim>1000</dim>
        </port>
    </output>
</layer>
Vector-matrix multiplication with embedded transposition of the second matrix
<layer ... type="MatMul">
    <data transpose_b="true"/>
    <input>
        <port id="0">
            <dim>1024</dim>
        </port>
        <port id="1">
            <dim>1000</dim>
            <dim>1024</dim>
        </port>
    </input>
    <output>
        <port id="2">
            <dim>1000</dim>
        </port>
    </output>
</layer>
Matrix-matrix multiplication (like FullyConnected with batch size 10)
<layer ... type="MatMul">
    <input>
        <port id="0">
            <dim>10</dim>
            <dim>1024</dim>
        </port>
        <port id="1">
            <dim>1024</dim>
            <dim>1000</dim>
        </port>
    </input>
    <output>
        <port id="2">
            <dim>10</dim>
            <dim>1000</dim>
        </port>
    </output>
</layer>
Multiplication of a batch of 5 matrices by a single matrix with broadcasting
<layer ... type="MatMul">
    <input>
        <port id="0">
            <dim>5</dim>
            <dim>10</dim>
            <dim>1024</dim>
        </port>
        <port id="1">
            <dim>1024</dim>
            <dim>1000</dim>
        </port>
    </input>
    <output>
        <port id="2">
            <dim>5</dim>
            <dim>10</dim>
            <dim>1000</dim>
        </port>
    </output>
</layer>
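For readers who want to inspect such layer descriptions programmatically, the sketch below reads port shapes and the transpose_b attribute from a layer element using Python's standard `xml.etree.ElementTree` module. The `id` and `name` attribute values on `<layer>` are hypothetical placeholders standing in for the attributes elided as `...` in the examples, and the helper `port_shapes` is an assumption made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical layer: id="0" and name="matmul" are illustrative placeholders.
LAYER_XML = """
<layer id="0" name="matmul" type="MatMul">
    <data transpose_b="true"/>
    <input>
        <port id="0"><dim>1024</dim></port>
        <port id="1"><dim>1000</dim><dim>1024</dim></port>
    </input>
    <output>
        <port id="2"><dim>1000</dim></port>
    </output>
</layer>
"""


def port_shapes(section):
    """Map each port id to its list of dimensions."""
    return {
        port.get("id"): [int(dim.text) for dim in port.findall("dim")]
        for port in section.findall("port")
    }


layer = ET.fromstring(LAYER_XML.strip())
data = layer.find("data")
# A missing <data> element or attribute means the default value, false.
transpose_b = data is not None and data.get("transpose_b", "false") == "true"
inputs = port_shapes(layer.find("input"))
outputs = port_shapes(layer.find("output"))

print(transpose_b)  # True
print(inputs)       # {'0': [1024], '1': [1000, 1024]}
print(outputs)      # {'2': [1000]}
```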