* [GPU] Optimize stable_diffusion performance in iGPU.
Change the existing heuristic shape condition so that, in the transpose-gemm case, an explicit permute plus a non-transpose gemm is used instead.
Signed-off-by: hyunback <hyunback.kim@intel.com>
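The rewrite this commit relies on can be sketched numerically: a gemm that reads its input transposed is equivalent to materializing the transpose (the "permute") and calling a plain gemm. Minimal row-major helpers for illustration only, not the cldnn kernels:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Mat = std::vector<float>; // row-major storage

// Materialized transpose (the explicit "permute").
Mat transpose(const Mat& a, std::size_t rows, std::size_t cols) {
    Mat t(rows * cols);
    for (std::size_t r = 0; r < rows; ++r)
        for (std::size_t c = 0; c < cols; ++c)
            t[c * rows + r] = a[r * cols + c];
    return t;
}

// C[m x n] = A * B, where A is read transposed (stored k x m) if trans_a.
Mat gemm(const Mat& a, const Mat& b, std::size_t m, std::size_t n,
         std::size_t k, bool trans_a) {
    Mat c(m * n, 0.f);
    for (std::size_t i = 0; i < m; ++i)
        for (std::size_t j = 0; j < n; ++j)
            for (std::size_t p = 0; p < k; ++p) {
                float av = trans_a ? a[p * m + i] : a[i * k + p];
                c[i * n + j] += av * b[p * n + j];
            }
    return c;
}
```

Both paths produce the same result, which is what lets the heuristic choose whichever form is faster on the target GPU.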
* add dynamic shape support for dgpu in prepare_buffer_fusing
* add unit test
* add space between test cases
* update condition of impl create() for concat dynamic shape
* update unit test
* add comment and update unit test
* add impl_param.is_type() function
* [GPU] Impl cldnn::condition to support dynamic shape (#18051)
* Impl CreateIfOp
* Update calc_output_layouts and execute_impl
* Enable gpu unit test
* Create gpu functional test
* [GPU] Follow-up code review (#18051)
* remove redundant codes
* create custom execute method for condition_inst
* change name from update_loop_primitive_map to update_inner_program_io_map
* [GPU] Fix gpu func test failures for fp16
* Add more test-cases to support fp16 and nested if case
* [GPU] remove redundant codes
* refactoring var names
* fix windows build error
* [GPU] Fix windows build issue
* [GPU] update calc_output_layouts
* [GPU] remove custom condition_inst::execute
* Remove virtual keyword from primitive_inst::execute()
* [GPU] Share single task executor between main program and inner program
* [GPU] Fix input rank issue for const inner network in condition op
* [GPU] apply calc_output_layouts for roi_align
Co-authored-by: Vladimir Paramuzov <vladimir.paramuzov@intel.com>
* [GPU] avoid checking allow_new_shape_infer for inner program
---------
Co-authored-by: Vladimir Paramuzov <vladimir.paramuzov@intel.com>
* Fix get_partial_shape tensor API to access the correct index of dimensions
Signed-off-by: Andrew Park <andrew.park@intel.com>
* Update the rule specifying output_type to the legacy one by referring to calc_output_layout
Signed-off-by: Andrew Park <andrew.park@intel.com>
* Add reproducible TCs related to issues for ov_gpu_unit_tests
Signed-off-by: Andrew Park <andrew.park@intel.com>
* Fix failed fc dynamic i8 TCs for ov_gpu_unit_tests
Signed-off-by: Andrew Park <andrew.park@intel.com>
* Fix are_data_types_sutable_for_onednn not to invalidate output layout
Signed-off-by: Andrew Park <andrew.park@intel.com>
* Apply comment
Signed-off-by: Andrew Park <andrew.park@intel.com>
---------
Signed-off-by: Andrew Park <andrew.park@intel.com>
* Do not add a sync if the node is within a shape-of subgraph
Because the dependency has a CPU impl, its execution has already finished.
* Fixed as per review comment: skip clFinish only when the runtime dep is in a shape-of subgraph, not the current node
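The rule from the two commits above can be sketched as follows (names are illustrative, not the actual cldnn API): a device-side wait is skipped only for runtime dependencies that belong to a shape-of subgraph, since those are computed synchronously on the host by a CPU impl.

```cpp
#include <cassert>

// Hypothetical per-node flag: true when the node's value is produced
// on the host by a CPU impl inside a shape-of subgraph.
struct NodeInfo {
    bool in_shape_of_subgraph;
};

// A GPU node must synchronize with a runtime dependency unless that
// dependency is a shape-of subgraph node, whose result is already
// available by the time the GPU kernel is enqueued.
bool needs_sync_with(const NodeInfo& runtime_dep) {
    return !runtime_dep.in_shape_of_subgraph;
}
```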
* [IE TESTS] Add Global test config for Subgraph base test
* Replace using option by function redefinition
* fix build
* remove extra changes for gna/template
* code style
* add nvidia to devices
* Fix debian
* remove nvidia
* Fixed to use input shape rank when calculating output layout, added unit test case
* Fixed to use input shape rank when creating shape_of primitive, added functional tests
* [GPU] Fix skipped GemmBaseTests in iGPU.
Current GemmBaseTests on iGPU are skipped: they show as passing but do not actually run.
Signed-off-by: hyunback <hyunback.kim@intel.com>
* keep Const+DecompressionConvert pattern for CPU
* temporary disabled failing unit-tests
* disable CF by modifying bounds evaluate as well; minor corrections
* added TODOs with ticket numbers
* join const+decompression markings
* minimized convert_precision.cpp changes
* minor corrections
* refactor fp16 transformations: moved into separate fp16_compression folder
* style-fix
* minor fixes
* do not disable evaluate and CF in shape path
* safer disabling of Const conversion
* style-fix and minor corrections
* restore original placement of ConvertPrecision
* [GPU] Unique-10 operation implementation.
* Handled flattened case.
* Created results for all outputs in single layer test.
* Save total unique count as fifth output.
* Handled axis case.
* Added unique reshape kernel.
* Moved data types to unique primitive constructor.
* Added shape agnostic Unique ref kernel.
* Added blocked layout support to Unique-10.
* Use int in bubble sort.
* Added unit tests.
* Added support for blocked layouts to flattened mode.
* Fixed usage of shape_info in kernel.
* Use correct total data size for dynamic shapes.
* Commented some functional tests.
For some reason, big shapes cause std::bad_alloc.
* Initialize out_counts with zeros.
* Implemented new approach for reducing memory footprint.
Changed first kernel to only count unique values and changed second kernel to fill all outputs.
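The two-pass scheme described above can be illustrated on the host (simplified sketch; the real implementation is a pair of GPU kernels, and the function names here are hypothetical): pass 1 only counts the unique values so the outputs can be sized exactly, pass 2 actually fills them.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Pass 1: count unique values without producing any output buffers.
std::size_t unique_count(std::vector<int> data) {
    std::sort(data.begin(), data.end());
    return static_cast<std::size_t>(
        std::unique(data.begin(), data.end()) - data.begin());
}

// Pass 2: fill the output, whose size was determined by pass 1.
std::vector<int> unique_gather(std::vector<int> data, std::size_t count) {
    std::sort(data.begin(), data.end());
    data.erase(std::unique(data.begin(), data.end()), data.end());
    assert(data.size() == count); // the count from pass 1 sizes the output
    return data;
}
```

Allocating from the pass-1 count avoids reserving worst-case (input-sized) buffers for every output, which is the memory-footprint reduction the commit refers to.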
* Revert "Commented some functional tests."
This reverts commit a7f9763c575e71e14b85ee37adf1e98f10785c15.
* Fixed calc output layouts for flattened case when rank is greater than 4.
* Added temporary fix for axis case when rank is greater than 4.
* Revert "Added temporary fix for axis case when rank is greater than 4."
This reverts commit 236640d2f0e9d5b1f8dcbbf9482763badd7fde66.
* Renamed "unique" to "unique_count" and "unique_reshape" to "unique_gather" primitives.
* Quick fix for add_intermediate_node to consider dep_idx of multiple output
* Fix bug for multiple output:
1) get_reorder was getting reorder from cache regardless of the dep_idx.
2) remove_redundant_reorder was not considering original dep_idx
* Fixed conflicts.
* Fixed win build issue.
* Fixed build issue.
* Revert "Fix bug for multiple output:"
This reverts commit d4a2c4f32eabe9108df31d4837fed8995c93bd1c.
* Revert "Quick fix for add_intermediate_node to consider dep_idx of multiple output"
This reverts commit 2dfd2aaefdf32067a7469505b35f7096632ac5f2.
* Added some tests to skip config.
---------
Co-authored-by: Taylor Yeonbok Lee <taylor.lee@intel.com>
* Remove NV12 and I420 blobs and deprecate some legacy API
* Fixed some errors
* Remove NV12 blobs
* Remove NV12 conversion
* Fixed other warnings
* Suppress version
* Fix some warnings
* Fixed version
* Try to fix some warnings
* Suppress warnings in C header
* Suppress warnings in C
* Fixed Windows exceptions
* Try to fix warnings
* Try to fix C bindings build
* Suppress InferRequest
* Fixed some build issues
* Fixed some errors
* Fuse convert reorder to prev MVN/Concat node
Signed-off-by: Andrew Park <andrew.park@intel.com>
* Add dynamic TCs for ov_gpu_unit_test
Signed-off-by: Andrew Park <andrew.park@intel.com>
* Add descriptions for changes
Signed-off-by: Andrew Park <andrew.park@intel.com>
* Fix kernel selection failure
Signed-off-by: Andrew Park <andrew.park@intel.com>
* Add is_type_conversion_only function for reorder_node
Signed-off-by: Andrew Park <andrew.park@intel.com>
---------
Signed-off-by: Andrew Park <andrew.park@intel.com>
* [GPU] Add shape of subgraphs markup and initial cpu implementations for some of primitives
* Apply review comments
* Exclude eltwise with boolean mode types from shape of subgraphs and fix leftovers
* There were two issues in runtime buffer fusing:
1) Missing condition in the matcher for dynamic tensors
2) If the node is marked can_be_optimized = true at build time and then turns out to be false at runtime, kernel compilation was skipped because the check used node->can_be_optimized
=> To resolve this issue, added can_be_optimized to impl_param and let impl create() check can_be_optimized in impl_param instead of the one in the node.
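The fix described above amounts to the following (a minimal sketch with illustrative type names, not the real cldnn interfaces): the impl factory consults the runtime-updated flag carried in the impl params, rather than the stale build-time flag cached on the node.

```cpp
#include <cassert>

// Runtime-updated parameters handed to the impl factory.
struct ImplParams {
    bool can_be_optimized = false;
};

struct Impl {
    bool has_compiled_kernel;
};

// Compile a kernel whenever the runtime says the node is NOT optimized
// out, even if it looked optimizable at build time.
Impl create_impl(const ImplParams& params) {
    return Impl{!params.can_be_optimized};
}
```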
* Fixed primitive::can_be_optimized to be set through a function
* [GPU] Optimized out permute in permute-gemm(onednn) pattern.
The permute can be optimized out when its input and output layouts are compatible and the gemm uses oneDNN.
Signed-off-by: hyunback <hyunback.kim@intel.com>
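The "compatible input/output" condition can be sketched under a simplified row-major model (the real cldnn check also accounts for formats and padding): a permute leaves the physical element order unchanged exactly when the non-unit axes keep their relative order after permutation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// order[i] is the input axis placed at output position i.
bool permute_is_noop(const std::vector<std::size_t>& shape,
                     const std::vector<std::size_t>& order) {
    long prev = -1;
    for (std::size_t axis : order) {
        if (shape[axis] == 1)
            continue; // unit axes never affect the linear layout
        if (static_cast<long>(axis) < prev)
            return false; // two non-unit axes swapped -> data must move
        prev = static_cast<long>(axis);
    }
    return true;
}
```

For example, swapping a unit axis with its neighbor ({2, 1, 3} with order {0, 2, 1}) moves no data, while transposing a genuine 2x3 matrix does.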
* Initial impl for runtime buffer fusing
Passing unittest with static kernel
* pass unittest with dynamic impl
* Refactor allocate_output
* Separate header of buffer fusing
* Refactored buffer fusing :: matcher/optimize
* More cleanup
* Fix crash in dolly
* Reset can_be_optimized of primitive_inst when it no longer applies
* Fix empty tensor: a primitive with empty data should be skipped
* Fix issue in dynamic padding : Static kernel should not contain dynamic padding dims
Fix missing reset of update_shape_done_by_other flag
* Do not add an empty kernel to the cache for an optimized-out inst
* Fix corner case error in buffer fusing
- Shapes of some preds may be unchanged, but update_impl is still needed because 1) paddings have changed and 2) output memory should be updated
- An optimizable impl should not be added to the cache
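The update rule in the corner-case fix above can be condensed into one predicate (field names are hypothetical, for illustration): a shape check alone is not enough to decide whether the impl must be refreshed.

```cpp
#include <cassert>

// Per-predecessor state observed at runtime.
struct PredState {
    bool shape_changed;
    bool padding_changed;
    bool output_mem_changed;
};

// update_impl must run if ANY of these changed, not just the shape.
bool needs_update_impl(const PredState& s) {
    return s.shape_changed || s.padding_changed || s.output_mem_changed;
}
```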
* Allow reorder & permute_ref to be optimized as concat predecessors
* Some more fixes:
runtime buffer fusing is available only when all preds and the concat are dynamic
runtime buffer fusing is executed only if the node is dynamic
* Fix allocate_output parameter called by get_estimated_device_mem_usage according to the new change
* Fixed error in cascaded concat
* Need to reinterpret even though the size is the same