Commit Graph

60 Commits

Egor Churaev
2caca604ca [IE CLDNN] Fix reshape for yxfb layout (#1632)
In one of the networks there was the following pipeline:
```
FullyConnected -> Reshape -> FullyConnected
```
The output of Reshape wasn't in the same element order as its input. The
problem turned out to be connected with the formats of the layers.
During optimization passes this pipeline was transformed to the
following:
```
FullyConnected -> Reorder -> Reshape -> Reorder -> FullyConnected
```
Both `FullyConnected` layers work with the `yxfb` format. This is why the
Reorder layer after the Reshape has an output layout with format `yxfb`, and
`reshape_in_layout.format` returns the `yxfb` format. But we have to convert
the Reshape to the `bfyx` format, because only then does the reshape leave
the order of elements unchanged.
I replaced `reshape_in_layout.format` (which returns `yxfb`) with an
explicitly set `bfyx` format.
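As a minimal Python sketch (not the clDNN code; the offset formulas are the usual interpretations of these layout names): traversing a `(b, f, y, x)` tensor in `bfyx` order walks memory contiguously, so a Reshape there is a pure metadata change, while the same traversal of a `yxfb` buffer jumps around, so a reshape would reorder elements:

```python
from itertools import product

def to_flat_index(b, f, y, x, B, F, Y, X, layout):
    # Linear memory offset of element (b, f, y, x) for two clDNN-style layouts.
    if layout == "bfyx":   # b outermost, x innermost (row-major)
        return ((b * F + f) * Y + y) * X + x
    if layout == "yxfb":   # y outermost, b innermost
        return ((y * X + x) * F + f) * B + b
    raise ValueError(layout)

B, F, Y, X = 2, 3, 2, 2
# In bfyx the logical (b, f, y, x) traversal is contiguous in memory, so a
# reshape such as (B, F, Y, X) -> (B, F*Y*X) is a zero-cost metadata change.
bfyx_offsets = [to_flat_index(b, f, y, x, B, F, Y, X, "bfyx")
                for b, f, y, x in product(range(B), range(F), range(Y), range(X))]
assert bfyx_offsets == list(range(B * F * Y * X))

# In yxfb the same traversal is scattered, so reshaping there would change
# the element order -- hence the explicitly set bfyx format.
yxfb_offsets = [to_flat_index(b, f, y, x, B, F, Y, X, "yxfb")
                for b, f, y, x in product(range(B), range(F), range(Y), range(X))]
assert yxfb_offsets != bfyx_offsets
```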

JIRA: 35288
2020-08-11 14:52:04 +03:00
Ilya Znamenskiy
6cccbcf28a [IE CLDNN] Gemm fp16/fp32 optimized kernel (#1646) 2020-08-11 09:54:00 +03:00
Konrad Dobros
caa38130b9 [IE CLDNN] Extend resample int8 packing optimization (#1662)
This extends the resample optimization for 8-bit types, which uses a
feature-packed mode to process multiple features in one work-item, to
feature counts that are not a multiple of the packing factor.

For nearest resampling it is safe to copy extra feature padding for
blocked formats, so this change only removes this condition.
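As a rough illustration (assuming a packing factor of 4 for the 8-bit case; the kernel's actual factor may differ), the extension only needs the dispatch to round the feature count up to whole packs, with the tail pack carrying the padding features that nearest resampling can safely copy:

```python
def packed_blocks(features, pack=4):
    # Number of feature packs dispatched; the last may be partly padding.
    return (features + pack - 1) // pack

def tail_padding(features, pack=4):
    # Padding features in the tail pack when `features` is not a multiple
    # of the packing factor (previously a rejected case).
    return packed_blocks(features, pack) * pack - features

assert packed_blocks(16) == 4 and tail_padding(16) == 0
assert packed_blocks(17) == 5 and tail_padding(17) == 3
```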
2020-08-07 16:08:40 +03:00
Marcin Penkowski
bb408f2ca9 Feature/ar24 int8 optimizations (#1208) 2020-08-04 12:09:23 +03:00
Roman Lyamin
15f91be168 [IE CLDNN] Added fsv16 and int8 support in BatchToSpace and SpaceToBatch (#1381) 2020-08-03 15:04:49 +03:00
Roman Lyamin
8245e5b6f4 [IE CLDNN] Added HSwish-4 operation (#1585) 2020-08-03 10:15:43 +03:00
Lukasz Debski
a17472fed0 [IE CLDNN] Gather 5d/6d support (#1553) 2020-08-03 10:05:53 +03:00
Vladimir Paramuzov
8f966887d7 [IE CLDNN] Prod mode support in eltwise fusings (#1491) 2020-07-30 18:16:37 +03:00
Vladimir Paramuzov
48f5f524b8 [IE CLDNN] Fixed gemm fusings with FP precision (#1490) 2020-07-27 18:49:54 +03:00
Konrad Dobros
0846f2050e [IE CLDNN] Add b_fs_fsv16 concat optimizations (#1452)
1. Add fsv16 int8 support to optimized kernel
2. Optimize fsv16 concat kernel
3. Add graph optimization to improve concat alignment

Issue: CVS-28494
2020-07-27 14:49:22 +03:00
Vladimir Paramuzov
3c99c13feb [IE CLDNN] Improvements for SpaceToDepth (#1454) 2020-07-27 11:52:18 +03:00
Vladimir Paramuzov
c69591c0b7 [IE CLDNN] Shrink reshapes (#1362) 2020-07-23 10:14:52 +03:00
Ilya Znamenskiy
c37f73334c [IE CLDNN] Gemm int8 with slm optimization. Fused ops fix (#1319) 2020-07-21 17:45:42 +03:00
Egor Churaev
668abbc5d9 [IE CLDNN] LRN int8 fsv16 optimizations (#814)
JIRA: 32367
2020-07-13 13:25:15 +03:00
Roman Lyamin
f3848b4454 [IE CLDNN] Added Mish operation (#1125) 2020-07-09 16:57:59 +03:00
Lukasz Debski
bd0aa6ac6d [IE CLDNN] Addition of eltwise support for different input sizes. (#640) 2020-07-06 15:26:14 +03:00
Jedrzej Hajduczenia
fd9ae15fdd [IE CLDNN] Fix input feature padding handling in dw conv fsv16 kernel (#1217) 2020-07-05 18:57:15 +03:00
Vladimir Paramuzov
c9d4e6b934 [IE CLDNN] Removed unused primitives and related structures (#1039) 2020-06-30 22:18:24 +03:00
Alexander Chaiko
f8b2627c3b [IE CLDNN] int8 batches optimization (#632) 2020-06-29 14:09:33 +03:00
Egor Churaev
08cd0f7779 [IE CLDNN] Implement ExtractImagePatches operation (#1127)
The ExtractImagePatches operation collects patches from the input
tensor, as if applying a convolution. All extracted patches are stacked
in the depth dimension of the output.
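A self-contained Python sketch of the semantics described above, for a single-channel 2D input with square patches and equal strides (an illustrative helper, not the clDNN kernel):

```python
def extract_image_patches(img, size, stride):
    # img: 2D nested list (H x W). Returns patches stacked along a depth
    # axis: out[d][oy][ox] is element (d // size, d % size) of the patch
    # whose top-left corner is (oy * stride, ox * stride).
    H, W = len(img), len(img[0])
    out_h = (H - size) // stride + 1
    out_w = (W - size) // stride + 1
    return [[[img[oy * stride + d // size][ox * stride + d % size]
              for ox in range(out_w)]
             for oy in range(out_h)]
            for d in range(size * size)]

img = [[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]]
patches = extract_image_patches(img, size=2, stride=1)
assert len(patches) == 4                          # depth = size * size
assert patches[0][0][0] == 1 and patches[3][1][1] == 9
```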

JIRA: 30055
2020-06-29 10:36:30 +03:00
Roman Lyamin
bc132056f9 [IE CLDNN] Added space_to_batch operation (#984) 2020-06-24 18:30:24 +03:00
Ilya Churaev
934e0c61eb Removed reference implementations for some data types (#1086) 2020-06-24 12:44:19 +03:00
Vladimir Paramuzov
0ec07b2c3b [IE CLDNN] fsv4 to fsv16 conv (#1030) 2020-06-22 17:09:39 +03:00
Evgeny Lazarev
970b1301b5 Cleanup IR v7 from the MO (#1008)
* Removed back phase transformations related to IR v7

* Fixed setting a value for the input port using the 'set_value' method

* Removed front and middle phase transformations related to IR v7

* Cleaned up the remaining IR v7 specific transformations in the Model Optimizer

* Final cleanup of the deprecated IR v7 related code

* Removed 'blobs_as_input' usage in the Model Optimizer

* Removed the function '_fuse_add' from the Model Optimizer since it is no longer used

* Removed the 'keep_in_IR' node attribute for FakeQuantize ops in the MO

* Disabled the failing gpu_engine.user_context test
2020-06-22 11:52:00 +03:00
Konrad Dobros
2a1a92d31a [IE CLDNN] Fix activation implementation for fsv16 format (#1037)
For the b_fs_yx_fsv16 format, the reference kernel rounds the number of
features for dispatch up to a multiple of 16. This change adds a correct
check in the kernel so that work-items that fall inside this dispatch
padding return early.
Previously those work-items could corrupt memory expected to be filled
with 0s, and for parametrized activation, due to bounds checking with the
modulo operator, they could have been corrupting the actual layer output.
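A toy Python model of the dispatch described above (block size 16 as in the message; `dispatch` is a hypothetical stand-in for the kernel, not its actual code):

```python
def round_up(n, block=16):
    # Dispatch size rounded up to a whole number of 16-feature blocks.
    return (n + block - 1) // block * block

def dispatch(features, use_modulo_bug):
    # Simulate which output feature each dispatched work-item writes.
    writes = []
    for f in range(round_up(features)):       # dispatch includes padding items
        if use_modulo_bug:
            writes.append(f % features)       # bug: padding items alias real features
        else:
            if f >= features:                 # fix: padding items return early
                continue
            writes.append(f)
    return writes

# With the fix each feature is written exactly once.
assert dispatch(20, use_modulo_bug=False) == list(range(20))
# With the modulo bounds check, padding work-items 20..31 rewrite features 0..11.
assert dispatch(20, use_modulo_bug=True).count(0) == 2
```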

Issue: CVS-27672
2020-06-19 21:41:08 +03:00
Vladimir Paramuzov
ba8226fcb4 [IE CLDNN] Fix strided slice (#950) 2020-06-18 19:55:17 +03:00
Jedrzej Hajduczenia
491173e01e [IE CLDNN] Add pooling b_fs_yx_fsv16 int8 (#565) 2020-06-18 16:40:52 +03:00
Konrad Dobros
ccbbdcf80d [IE CLDNN] Fix gather dimensions calculation (#959) 2020-06-17 15:07:18 +03:00
Konrad Dobros
db3dff36b9 [IE CLDNN] Add resample improvements (#933)
This change:
- extends concat in-place optimization for resample on input
- adds resample primitive int8 support for bilinear mode
- fixes some potential issues with offset calculations with int8
2020-06-16 09:07:05 +03:00
Konrad Dobros
e1c22196b4 [IE CLDNN] Fix fsv16 -> bfyx reorder removal (#872) 2020-06-12 15:44:14 +03:00
Vladimir Paramuzov
a3fce2d763 [IE CLDNN] Always use FP32 as intermediate type for fused quantize (#877) 2020-06-11 12:27:11 +03:00
Jedrzej Hajduczenia
85406c9768 [IE CLDNN] Add support for I64 data type in clDNN plugin (#555) 2020-06-10 09:34:29 +03:00
Roman Lyamin
3b4990ed30 [IE CLDNN] Added batch_to_space operation (#753) 2020-06-09 19:19:24 +03:00
Vladimir Paramuzov
fe198dd544 [IE CLDNN] Added 6d tensor support in eltwise/scale primitives (#826) 2020-06-09 14:29:36 +03:00
Vladimir Paramuzov
0022eebd71 [IE CLDNN] Enable DepthToSpace (#780)
Enabled the DepthToSpace ngraph transformation
Updated the implementation to support 5d inputs and the mode parameter
Added direct fsv16 support
Added functional tests for GPU
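A small Python sketch of DepthToSpace reference semantics for a single [C, H, W] sample in the depth-first mode (one of the variants the mode parameter selects; this is plain reference logic, not the fsv16 kernel):

```python
def depth_to_space(data, block):
    # data: [C, H, W] nested lists, C divisible by block*block.
    # Depth-first mode: input channel c lands at output channel c // block^2,
    # pixel (h*block + (c % block^2) // block, w*block + c % block).
    C, H, W = len(data), len(data[0]), len(data[0][0])
    bs = block * block
    out = [[[0] * (W * block) for _ in range(H * block)]
           for _ in range(C // bs)]
    for c in range(C):
        for h in range(H):
            for w in range(W):
                out[c // bs][h * block + (c % bs) // block][w * block + c % block] = data[c][h][w]
    return out

# 4 channels of a 1x1 image fold into one 2x2 channel.
assert depth_to_space([[[1]], [[2]], [[3]], [[4]]], 2) == [[[1, 2], [3, 4]]]
```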
2020-06-05 20:16:47 +03:00
Ilya Znamenskiy
4d3ddc1684 [IE CLDNN] GEMM int8 optimization using MMAD macro (#635) 2020-06-05 14:28:21 +03:00
Sergey Shlyapnikov
6e491a89ad [IE CLDNN] Improve Gather performance and add fusing support (#736) 2020-06-05 10:20:58 +03:00
Egor Churaev
2100521a14 [IE CLDNN] Implement NormalizeL2 int8 kernels (#720) 2020-06-05 10:16:27 +03:00
Lukasz Debski
698dfc4bf6 [IE CLDNN] Permute fused ops support (#642) 2020-06-04 17:01:21 +03:00
Vladimir Paramuzov
d7fad0109a [IE CLDNN] Disabled sporadic detection output tests (#740) 2020-06-04 11:14:05 +03:00
Egor Churaev
546377dc8e [IE CLDNN] Implement EmbeddingBag operations (#623)
Implemented three operations: EmbeddingBagPackedSum,
EmbeddingBagOffsetsSum and EmbeddingSegmentsSum. These operations do the
same work but accept their inputs in different formats.
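A minimal Python sketch of one of the three, EmbeddingBagOffsetsSum, under its usual semantics (each bag sums the embedding-table rows selected by a slice of `indices` between consecutive `offsets`; the names here are illustrative):

```python
def embedding_bag_offsets_sum(emb_table, indices, offsets):
    # Bag i sums the rows of emb_table selected by
    # indices[offsets[i] : offsets[i + 1]] (the last bag runs to the end).
    bags = []
    for i, start in enumerate(offsets):
        end = offsets[i + 1] if i + 1 < len(offsets) else len(indices)
        rows = [emb_table[j] for j in indices[start:end]]
        if rows:
            bags.append([sum(col) for col in zip(*rows)])
        else:
            bags.append([0] * len(emb_table[0]))  # empty bag -> zeros
    return bags

table = [[1, 2], [3, 4], [5, 6]]
# Bag 0 sums rows 0 and 2, bag 1 sums row 1.
assert embedding_bag_offsets_sum(table, [0, 2, 1], [0, 2]) == [[6, 8], [3, 4]]
```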
2020-06-04 10:25:28 +03:00
Mikołaj Życzyński
023344a317 [IE CLDNN] Added fusing support to all pooling kernels (#689)
Adds fusing support to all available pooling kernels
Tests all possible input type/output type configurations
Fixes a minor bug in max pooling in pooling_gpu_test.cpp
Fixes a minor bug with the yxbf format in the pooling_gpu_ref and pooling_gpu_int8_ref kernels
Fixes a bug with the b_fs_yx_fsv32 format in the pooling_gpu kernel
Resolves a max pooling accuracy mismatch in the case of a non-zero pad-end layer parameter
Resolves an average pooling accuracy mismatch in the case of a non-zero pad-end layer parameter
2020-06-03 19:44:27 +03:00
Mikołaj Życzyński
3ea1657e4f [IE CLDNN] Activation with fused quantize bug fix (#613)
Fixed a bug connected with quantize fusing into activation
Added scale and activation fusing support
Added corresponding tests
2020-06-03 09:30:49 +03:00
Vladimir Paramuzov
dbdaaa93dd [IE CLDNN] Quantized deeplabv3 optimizations (#646)
Enabled dilation for imad dw fsv16 kernel
Added argmax and mutable_data to fsv16 white list
Enabled byxf input for quantize scale_shift kernel
2020-06-02 09:17:39 +03:00
Mikhail Letavin
65f62945dd [IE CLDNN] Free up first copy of weights/biases that were transferred to USM device memory (#561) 2020-06-01 12:01:28 +03:00
Vladimir Paramuzov
f7052a107d [IE CLDNN] Optimized FQ kernel in fsv16 layout (#573)
- Optimized FQ kernel in fsv16 layout. Enabled scaleshift transform for FP16 precision
- Disabled activation_opt kernel with fused ops in some cases
2020-05-29 20:10:30 +03:00
Mikołaj Życzyński
e734377590 [IE CLDNN] Grouped convolution bug fix (#572)
Fixes a bug in grouped convolution connected with a wrong weights layout in the SetDefault() method
2020-05-27 21:19:49 +03:00
Egor Churaev
31fe146539 [IE CLDNN] Implement CumSum operation (#533)
CumSum performs cumulative summation of the input elements along the given axis.

Details:
By default the sum is inclusive, meaning the first element is copied as
is. Through the "exclusive" attribute this behavior can be changed to
exclude the first element. The operation can also perform the summation in
the opposite direction of the axis; for that, set the "reverse" attribute
to true.
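A minimal Python sketch of these semantics for a 1-D input (the real op applies along an arbitrary axis of an n-D tensor):

```python
def cumsum(values, exclusive=False, reverse=False):
    # Cumulative sum over a 1-D list; "exclusive" shifts the sums so the
    # first output becomes 0, "reverse" accumulates from the other end.
    if reverse:
        return cumsum(values[::-1], exclusive)[::-1]
    out, acc = [], 0
    for v in values:
        if not exclusive:
            acc += v
        out.append(acc)
        if exclusive:
            acc += v
    return out

assert cumsum([1, 2, 3]) == [1, 3, 6]                  # inclusive (default)
assert cumsum([1, 2, 3], exclusive=True) == [0, 1, 3]  # first element excluded
assert cumsum([1, 2, 3], reverse=True) == [6, 5, 3]    # opposite direction
```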

JIRA: 29994
2020-05-27 11:47:16 +03:00
Alexey Suhov
deb008a26f publish master branch snapshot, revision 8d31237e2c3f673cbb0f0ba110fc10f5cce1d2bb 2020-05-22 02:23:12 +03:00
Alexey Suhov
5b428f0655 publish master branch snapshot, revision 49482ae3bea0cbaa07474f86f36db11943142687 2020-05-13 21:12:22 +03:00