* Added -Wall for Clang and GCC
* Fixes
* Don't use /J
* Fixed warnings
* Fixed warnings
* More fixes
* Fixed for MSVC
* Fixed more warnings on Windows
* Suppressed some warnings in template plugin
* Update src/tests/functional/plugin/shared/include/behavior/plugin/caching_tests.hpp
* Added suppression for PT FE
* Suppressed warnings in TF FE
* Suppressed warnings on Core unit tests
* Suppress warnings in python
* Suppressed Windows warning for 3rd party modules
* Suppressed one more warning
* oneDNN only supports 2D/3D gemm, but the OpenVINO GPU plugin policy enforces 4D~6D.
This API mismatch causes problems with the post-op axis and would require massive code changes.
Therefore we decided to insert throw code for now (see the sketch below) and fix this issue later
if some models require non-(per tensor/full tensor) post-ops.
* Specifically, the per-channel (=f) axis in this test case becomes the y-axis
because oneDNN gemm merges the b and f axes into one batch axis.
* serialization of read_value and assign primitives
* lines should be <= 160 characters long
* added unit tests for read_value and assign
* updated to store is_output_event in primitive_inst
* removing _is_output_event in typed_primitive_impl_ocl
* added comments for mem_allocated and is_output_null
* 1. Correct the device list by priority order, from high to low.
2. Remove GNA, CUDA, HPU, HDDL, NVIDIA from the device list supported by AUTO/MULTI.
Signed-off-by: Wang, Yang <yang4.wang@intel.com>
* Filter out supported devices when no candidate devices are specified for the AUTO plugin.
* Add Debug MSG
* Update.
* Update AUTO mock test cases.
* Update.
* Update.
* Update code style.
---------
Signed-off-by: Wang, Yang <yang4.wang@intel.com>
Co-authored-by: Chen Peter <peter.chen@intel.com>
* [GPU] Shape-agnostic optimized gemm kernel
Signed-off-by: Andrew Park <andrew.park@intel.com>
* Fix CI failure
Signed-off-by: Andrew Park <andrew.park@intel.com>
* Apply code review
Signed-off-by: Andrew Park <andrew.park@intel.com>
* Fix dynamic shape accuracy drop on SQuAD v1.1
- F1: 91.81%, EM: 85.25% @bert-small-uncased-whole-word-masking-squad-0001
Signed-off-by: Andrew Park <andrew.park@intel.com>
* Apply code review
Signed-off-by: Andrew Park <andrew.park@intel.com>
---------
Signed-off-by: Andrew Park <andrew.park@intel.com>
* [GPU] Fix the functional issue using fc:onednn in the bert model.
* The issue occurred when the input has 3 dims with a post-op eltwise.
* oneDNN FC supports only a 2-dim output, so OV needs to update the output and post-ops too (see the sketch below).
* Fix accuracy issue in b16 oneDNN FC: cldnn switches to yxfb format for b16 in the opt kernel, but this is not needed for oneDNN.
* Remove the workaround code for running fc on cldnn.
* Support gemm primitive and multiple impl types in ForceImplTypes
* Change env name to OV_GPU_ForceImplTypes
* Do not change the eltwise post-op shape from the original node: it caused an accuracy issue when the node has multiple users.
Signed-off-by: hyunback <hyunback.kim@intel.com>
+ Bugfix of eltwise_b_fs_yx_fsv16 kernel for int saturation
+ Add optimization for fsv32, fsv16 using vload
+ Add optimization for double-blocked format eltwise
+ Support mixed format and broadcasting
+ Add test cases to eltwise_gpu_test
Signed-off-by: Min, Byungil <byungil.min@intel.com>
* Add shape_infer function for GatherND (output-shape rule sketched after this series of commits)
* GatherND shape infer improvements
* Align test to trigger correct error message
* Add new and improve GatherND type_prop tests
* Update tests to use ov namespace
* Add GatherND common shape_infer tests
* Init shape infer tests for not common cases
* Tests refactor
* Add default ctor tests
* Add more test cases
* Register shape_infer for GatherND V5 and V8
* Enable more tests and print params
* Move GatherNDTestParams
* Review ctc loss operator for
- partial shape and label propagation
- template implementation of shape_infer
- update/extend tests
* Use namespace ov in ctc loss operator
* [GPU] Optimize permute for acdb format
Target subgraphs to be optimized out:
- input(bfyx) - permute(byxf) - conv
- conv(byxf) - permute(bfyx) - output
+ Fix test_device_mem_usage_estimation unit test failure.
added 3-axis interpolation for linear-onnx mode
fixed resample_opt for onnx mode; it didn't work in the case of padding
added tests for the new implementation and fix
@OlehKravchyshyn
* [GPU] improved impl cache key (#14797)
- Add hash function for primitive and program_node
- Filter task before entering async compilation queue
* [GPU] improved impl cache key (#14797)
- Multiply a magic prime number into the input value of hash_combine to avoid hash collisions (see the sketch below)
* [GPU] Update codes to follow up review comments (#14797)
- Change func name from pop_front_task to erase_front_task
- Change func name from get_layout_key to get_impl_key
- Remove average_unpooling.hpp because the primitive was already removed
- Replace std::list with std::deque in compilation_context
- Modify layout::hash() to get hash of shape from partial shape
- Remove calculation code to get hash from static layout in program_node => layout hash is calculated outside of program_node
* [GPU] Update gpu functional test for improved impl key (#14797)
* [GPU] update compilation queue (#14797)
* [GPU] Move type_string hash to primitive (#14797)
- Add hash for num_outputs in program_node
* [GPU] update hash functions for program_node (#14797)
- add hash for number of inputs in program_node
- program_node::hash() was split into void program_node::calculate_hash() and size_t program_node::get_hash()
* [GPU] Fix gpu unit test failures (#14797)
- move the node hash calculation from compile_graph to the program ctor
* [GPU] Fix build issue after rebase (#14797)
* [GPU] Update impl if optimized kernel is in impl_cache even if the shape does not change. (#14797)
- Apply improved hash key to mem kernels cache in update_weight
- Add missing hash value for broadcast
- Add a simple unit test to check hash values for program_node, primitive, and primitive_inst
* [GPU] The draft for oneDNN 3.0 integration
Initial PR.
1. Support the oneDNN 3.0 API
2. Use a binary_mul post-op instead of the oscale channel-wise mask(2) (see the sketch below)
3. Disable some post-op fusings because there is no eltwise scale API:
eltw(non_linear)+eltw(linear), eltw+sum+eltw(linear)
Signed-off-by: hyunback <hyunback.kim@intel.com>
* Fix hardswish issue in 3.0
The hard-coded hardswish parameter (2.7) is replaced with alpha and beta taken from the user's input.
Signed-off-by: hyunback <hyunback.kim@intel.com>
* clean up code
Signed-off-by: hyunback <hyunback.kim@intel.com>
* Apply code review comment and fix ci issue
Signed-off-by: hyunback <hyunback.kim@intel.com>
* Remove setting dst scale
- Accuracy issue
- No perf gain compared to binary_mul
Signed-off-by: hyunback <hyunback.kim@intel.com>
* gpu serialization for onednn 3.0
* missed changes
* add onednn engine creator when loading model from cache
* fixed to use mem_dep index
* updated to save zero_point_mask for serialization
* fixed onednn fc serialization logic
* updated the logic to check if onednn is enabled
---------
Signed-off-by: hyunback <hyunback.kim@intel.com>
Co-authored-by: hyunback <hyunback.kim@intel.com>
* Optimize realloc for dynamic shape with:
- Pre-aligned alloc for bounded dynamic shape (sketched below)
- Reuse of the internal buffer
* Fix internal buffer of NMS kernel to be reused
- Fixed bug in NMS quick sort
* Additional fix for internal buffer reuse
* Fix legacy dynamic batch to be applied only for 0-th dim dynamic shape with upper bound
* Fix unittest error
* Apply nms fixes of padding -1 to all buffers only when internal buffer is reused
* Do not add a separate get_max_tensor, because currently there is no need for that separate API.
The max tensor is currently only needed for memory allocation, and there is no need for a minimum tensor size for now.
* Fix allocation of internal buffer to be done for each layout