* [GPU] Optimize permute for acdb format
Target subgraphs to be optimized out (sketched below):
- input(bfyx) - permute(byxf) - conv
- conv(byxf) - permute(bfyx) - output
+ Fix failing test_device_mem_usage_estimation unit test.
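A rough illustration of why such permutes can be dropped (hypothetical helper, not the plugin's actual code): an order of {0, 2, 3, 1} maps bfyx to byxf and {0, 3, 1, 2} maps byxf back to bfyx, so when the adjacent convolution already runs in the byxf (acdb) format the permute is a pure layout change:

```cpp
#include <algorithm>
#include <array>
#include <cstdint>
#include <vector>

// Hypothetical check: does this 4D permute only toggle between bfyx and byxf?
// If so, and the neighbouring conv can consume/produce byxf (acdb) directly,
// the permute node can be optimized out instead of executing a real kernel.
inline bool is_bfyx_byxf_permute(const std::vector<uint16_t>& order) {
    if (order.size() != 4)
        return false;
    const std::array<uint16_t, 4> bfyx_to_byxf = {0, 2, 3, 1};
    const std::array<uint16_t, 4> byxf_to_bfyx = {0, 3, 1, 2};
    return std::equal(order.begin(), order.end(), bfyx_to_byxf.begin()) ||
           std::equal(order.begin(), order.end(), byxf_to_bfyx.begin());
}
```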
* Add test to verify add_extension with relative path
* Fix code style
* Use std::string::find instead of std::regex
* Remove unnecessary include
* Add comments about generating relative path
* Don't add empty tokens when splitting path
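A sketch of the path splitting referred to in the last two items, assuming a plain separator character; std::string::find replaces the regex, and the empty tokens produced by doubled or trailing separators are skipped:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Tokenize a path with std::string::find instead of std::regex,
// skipping empty tokens from leading, trailing, or doubled separators.
inline std::vector<std::string> split_path(const std::string& path, char sep = '/') {
    std::vector<std::string> tokens;
    std::size_t start = 0;
    while (start <= path.size()) {
        std::size_t end = path.find(sep, start);
        if (end == std::string::npos)
            end = path.size();
        if (end > start)  // don't add empty tokens
            tokens.emplace_back(path.substr(start, end - start));
        start = end + 1;
    }
    return tokens;
}
```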
* [TF FE] Refactor CropAndResize support
Make it more reshape-oriented. This allows converting the Mask R-CNN model without a config file.
Signed-off-by: Kazantsev, Roman <roman.kazantsev@intel.com>
* Update src/frontends/tensorflow_common/src/op/crop_and_resize.cpp
* Use Gather for coordinates swapping
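A minimal sketch of the Gather-based coordinate swap, assuming boxes of shape [num_boxes, 4] laid out as [y1, x1, y2, x2]; the helper name is illustrative, not the translator's actual code:

```cpp
#include <memory>

#include "openvino/op/constant.hpp"
#include "openvino/op/gather.hpp"

// Swap [y1, x1, y2, x2] -> [x1, y1, x2, y2] with a single Gather along the last axis,
// instead of a Split + Concat chain.
ov::Output<ov::Node> swap_xy(const ov::Output<ov::Node>& boxes) {
    auto indices = ov::op::v0::Constant::create(ov::element::i32, ov::Shape{4}, {1, 0, 3, 2});
    auto axis = ov::op::v0::Constant::create(ov::element::i32, ov::Shape{}, {1});
    return std::make_shared<ov::op::v8::Gather>(boxes, indices, axis);
}
```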
* Update src/frontends/tensorflow_common/src/op/crop_and_resize.cpp
* Update src/frontends/tensorflow_common/src/op/crop_and_resize.cpp
---------
Signed-off-by: Kazantsev, Roman <roman.kazantsev@intel.com>
* added 3-axis interpolation for linear-onnx mode
* fixed resample_opt for onnx mode; it did not work when padding was present
* added tests for the new implementation and the fix
@OlehKravchyshyn
* [GPU] improved impl cache key (#14797)
- Add hash function for primitive and program_node
- Filter task before entering async compilation queue
* [GPU] improved impl cache key (#14797)
- Multiply the input value of hash_combine by a magic prime number to avoid hash collisions
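For reference, a generic boost-style hash_combine with the extra prime multiplication on the incoming value; the exact constants used by the GPU plugin may differ:

```cpp
#include <cstddef>

// Illustrative hash_combine: multiplying the incoming value by a large prime
// spreads nearby inputs (e.g. small enum values) before mixing, which reduces
// collisions between similar primitives. The constants here are assumptions.
inline void hash_combine(std::size_t& seed, std::size_t value) {
    const std::size_t magic_prime = 0x9e3779b97f4a7c15ULL;  // 64-bit golden-ratio constant
    seed ^= value * magic_prime + 0x9e3779b9 + (seed << 6) + (seed >> 2);
}
```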
* [GPU] Update codes to follow up review comments (#14797)
- Change func name from pop_front_task to erase_front_task
- Change func name from get_layout_key to get_impl_key
- Remove average_unpooling.hpp because the primitive was already removed
- Replace std::list with std::deque in compilation_context
- Modify layout::hash() to get the hash of the shape from the partial shape
- Remove the calculation code that derived the hash from the static layout in program_node => the layout hash is now calculated outside of program_node
* [GPU] Update gpu functional test for improved impl key (#14797)
* [GPU] update compilation queue (#14797)
* [GPU] Move type_string hash to primitive (#14797)
- Add hash for num_outputs in program_node
* [GPU] update hash functions for program_node (#14797)
- add hash for number of inputs in program_node
- program_node::hash() was split into void program_node::calculate_hash() and size_t program_node::get_hash()
* [GPU] Fix gpu unit test failures (#14797)
- move the calculation over all nodes from compile_graph to the program ctor
* [GPU] Fix build issue after rebase (#14797)
* [GPU] Update impl if optimized kernel is in impl_cache even if the shape does not change. (#14797)
- Apply improved hash key to mem kernels cache in update_weight
- Add missing hash value for broadcast
- Add simple unit test to check hash value for program_node, primitive and program_inst
* [GPU] Draft for oneDNN 3.0 integration
Initial PR.
1. Support the oneDNN 3.0 API
2. Use a binary_mul post-op instead of the channel-wise output-scale mask (2); see the sketch below
3. Disable some post-op fusings because there is no eltwise scale API:
eltw(non_linear)+eltw(linear), eltw+sum+eltw(linear)
Signed-off-by: hyunback <hyunback.kim@intel.com>
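A sketch of the binary_mul replacement for channel-wise output scales under the oneDNN 3.0 API; the 64-channel shape is an arbitrary example and the helper is not the plugin's actual code:

```cpp
#include <oneapi/dnnl/dnnl.hpp>

// Per-output-channel factors applied as a binary_mul post-op, replacing the
// removed channel-wise output scales (mask = 2).
dnnl::primitive_attr make_attr_with_channel_scales() {
    dnnl::memory::desc scale_md({1, 64, 1, 1},  // one factor per output channel (example)
                                dnnl::memory::data_type::f32,
                                dnnl::memory::format_tag::nchw);

    dnnl::post_ops po;
    po.append_binary(dnnl::algorithm::binary_mul, scale_md);  // dst = dst * scale

    dnnl::primitive_attr attr;
    attr.set_post_ops(po);
    // At execution time the scale tensor is bound via
    // DNNL_ARG_ATTR_MULTIPLE_POST_OP(0) | DNNL_ARG_SRC_1.
    return attr;
}
```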
* Fix hardswish issue in 3.0
The hard-coded hardswish parameter (2.7) is replaced with alpha and beta taken from the user's input.
Signed-off-by: hyunback <hyunback.kim@intel.com>
* clean up code
Signed-off-by: hyunback <hyunback.kim@intel.com>
* Apply code review comment and fix ci issue
Signed-off-by: hyunback <hyunback.kim@intel.com>
* Remove setting dst scale
- Accuracy issue
- No perf gain compared to binary_mul
Signed-off-by: hyunback <hyunback.kim@intel.com>
* gpu serialization for onednn 3.0
* missed changes
* add onednn engine creator when loading model from cache
* fixed to use mem_dep index
* updated to save zero_point_mask for serialization
* fixed onednn fc serialization logic
* updated the logic to check if onednn is enabled
---------
Signed-off-by: hyunback <hyunback.kim@intel.com>
Co-authored-by: hyunback <hyunback.kim@intel.com>
* Adds base class and first test for tflite_layer tests
* adds layer tests for unary ops
* adds functionality to get tensors from ops
* 1. adds functionality to use custom funcs for input generation
2. removed UNIQUE op from testing ops
* adds functionality to use custom dtypes
* Cast operation support
* Enhanced tfl layer tests
* Trigger tfl layer tests in .ci
* Apply suggestions from code review
---------
Co-authored-by: Evgenya Stepyreva <evgenya.stepyreva@intel.com>
Co-authored-by: Evgenya Stepyreva <eva.my.link@gmail.com>
Co-authored-by: missjane <estepyreva@gmail.com>
* add pattern matching for MVN, Exp->ReduceSum, L2Normalize, and Div with eps for mixed precision inference
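A rough sketch of the Div-with-eps part of that marking, under the assumption that the real pass uses the standard MatcherPass machinery and the disable_fp16_compression rt_info helper; the class name and the simplified denominator check are illustrative only:

```cpp
#include "openvino/opsets/opset10.hpp"
#include "openvino/pass/graph_rewrite.hpp"
#include "openvino/pass/pattern/matcher.hpp"
#include "openvino/pass/pattern/op/wrap_type.hpp"
#include "transformations/rt_info/disable_fp16_compression.hpp"

// Illustrative pass: find Divide nodes whose denominator is "x + eps" and mark them
// to stay in fp32 during mixed-precision conversion (a small eps would flush to zero
// in fp16 and turn the division into inf/nan).
class MarkDivWithEpsSketch : public ov::pass::MatcherPass {
public:
    MarkDivWithEpsSketch() {
        auto div = ov::pass::pattern::wrap_type<ov::opset10::Divide>();
        auto callback = [](ov::pass::pattern::Matcher& m) {
            auto divide = m.get_match_root();
            auto add = ov::as_type_ptr<ov::opset10::Add>(divide->get_input_node_shared_ptr(1));
            if (!add)
                return false;
            // ...the real pass would also verify that one Add input is a constant
            // below the fp16 normal range before marking...
            ov::disable_fp16_compression(divide);
            return false;  // the graph is only annotated, never rewritten
        };
        register_matcher(std::make_shared<ov::pass::pattern::Matcher>(div, "MarkDivWithEpsSketch"), callback);
    }
};
```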
* added necessary includes
* clang_format_fix_all
* fix warning_as_error for unused variable
* fix warning_as_error for specifying float literals
* enable marking for fp32 IRs as well
* cosmetic improvements in unit-tests
* fix warnings as error
* added unit-tests for compress_float_constants.cpp for out of range values
* Update align_mixed_fp32_fp16_types.cpp
* Apply suggestions from code review
Co-authored-by: Maxim Vafin <maxim.vafin@intel.com>
* some grooming: mainly in imports
* build fix: replaced ngraph:: -> ov::
* collected all markings in a single file
* shortened pass names
* style fix
* made MarkNormalizationOps as a separate pass
* removed redundant comment, fixed description of MarkSugraphsToKeepInMixedPrecision pass
* comments on Up and Down marking in MarkSugraphsToKeepInMixedPrecision
* cleared info messages in compress_float_constants.cpp, removed threshold adjusting from ENV
* moved declarations of MarkNormalizationOps, MarkExpInReduceOpPath, MarkDivWithEps to hide them from outside users
* simplified pattern matching for max_or_add
* moved `reduceop_path` rt_info inside mark_subgraphs_to_keep_in_mixed_precision.cpp
* fix potential bug with Convert
* removed redundant check for Converts in `insert_converts_after_if_needed` as well
* set Convert types more safely
* corrections in opset10 namespaces; some minor corrections
---------
Co-authored-by: Maxim Vafin <maxim.vafin@intel.com>
* Add meshgrid listunpack transformation
* Add case when indexing is not specified
* Fix typos
* Fix problem with 1 input execution & missing runtime_info
* Fix issue with meshgrid placed in loop body
* Add tests to precommit
* Apply suggestions from review
* Fix input 0
* Improve indexing attribute read
* Remove download of models + move some methods to utils
* Separate constants
* filelist
* separate conformance utilities
* Update script according new utils
* Fix subgraphdumper crash
* Some small improvements for api conformance
* add warn_message
* One short fix
* fix master
* Optimize realloc for dynamic shapes (see the sketch below) with
- Pre-aligned allocation for bounded dynamic shapes
- Reuse of the internal buffer
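A simplified sketch of the reuse policy, using a plain byte vector as a stand-in for the GPU memory object; names and the allocation strategy are illustrative:

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Allocate once for the upper bound of a bounded dynamic shape,
// then keep reusing that buffer while the actual shape fits into it.
struct reusable_buffer {
    std::vector<uint8_t> data;

    // Returns true only when a (re)allocation actually happened.
    bool prepare(std::size_t required_bytes, std::size_t upper_bound_bytes) {
        if (required_bytes <= data.size())
            return false;  // reuse the existing allocation, no realloc
        // Pre-allocate up to the known upper bound so later shapes do not realloc again.
        data.resize(std::max(required_bytes, upper_bound_bytes));
        return true;
    }
};
```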
* Fix internal buffer of NMS kernel to be reused
- Fixed bug in NMS quick sort
* Additional fix for internal buffer reuse
* Fix legacy dynamic batch to be applied only for 0-th dim dynamic shape with upper bound
* Fix unittest error
* Apply the NMS fix of padding with -1 to all buffers only when the internal buffer is reused
* Do not add a separate get_max_tensor, because currently there is no need for that separate API.
Currently the max tensor is only needed for memory allocation, and there is no need for a minimum tensor size for now
* Fix allocation of internal buffer to be done for each layout
* add aten::topk
* remove commented lines
* remove white space
* move include to individual ops
* switch include statements
* fix style
* trim test cases
* Remove global ENABLE_INTEL_CPU macro definition. Add local definition for the source files where it is used
* Fix1
Co-authored-by: Ilya Churaev <ilya.churaev@intel.com>
* Review reverse sequence for:
- partial shapes and labels propagation
- template implementation of shape infer
- refactor shape_infer to use it when the op is created with the default ctor
* Remove friend shape_infer from reverse sequence op
* Infrastructure for tflite
* Removed submodule flatbuffers
* Added flatbuffers submodule. Fixed version to v22.12.06 aka acf39ff
* Move headers back
* Flatbuffers integration
* Small fixes
* Started parsing the Model
* flatbuffer changes
* decoder_flatbuffer changes
* Lite Input Model -- not needed as of now but looks cool
* Rolled back inheritance from ov::frontend::tensorflow::InputModel
* Results are not treated as outputs, but it's ok
* Fix misplaced input vs output
* Refactor
* Load model op-by-op. Frontend API finalized
* Debugging still, there are prints here and there. Decoder is not sane
* Convolution with all attributes is translated and quantization is applied to inputs and constants. TODO: quantize intermediate tensors, separate decoder-specific logic?
* Float SSD and PoseNet models are showing good accuracy
* Needs refactoring but works flawlessly
* Telemetry and lightweight model cutting
* Code style and test changes. Extensions supported
* Quantization and style
* Style refinements
* Move onednn back
* New portion of operations enabled
* TFLite FE doesn't inherit TF FE
* Moved files to another directory
* Rename header op_table.hpp to common_op_table.hpp for all files in src/frontends/tensorflow_common/src/op/
* Removed visibility macros
* CMake changes
* Unit-test execution in .ci
* Update labeler.yml
* Codeowners
* Style check and fix
* Static Build arrangement
* Addressing the comments
* install common headers to previous place
* New approach with public decoder and graph_iterator
* New approach with public decoder and graph_iterator
* Move GraphIterator back
* Comments addressed
* Comments addressed
* Preliminary TF FE README.md changes
* Added target_compile_definitions OPENVINO_STATIC_LIBRARY for static build
* Fixed conflicts and added TF to common places
* Frontends use only openvino::core::dev API
* Merged common tensorflow changes and made the code build and work on a selected set of models
* Style
* Rollback unnecessary changes from Tensorflow FE
* Rollback unnecessary changes from Tensorflow Common
* Minor refactor
* cmake minor refactoring
* Mixed commit
* Style and merge fix
* Low hanging fruit operations
* Fix windows build
* Refactor quantization parameters representation
* license compliance. approved by OS PDT
* copyrights in generic file
* dependabot
* labeler
* Unit Test to be triggered in CI
* cmake variables naming. corrected copyright years in copyrights/generic file
* library renamed in .ci/ calls
* Copyright year update
* Set openvino-tf-frontend-maintainers as owner of /src/frontends/tensorflow_lite/
* Fixed flatc cross-compilation
* Cleaned flatbuffers header usage
* Nitpicks solved
* Update cmake/templates/OpenVINOConfig.cmake.in
* Compile with flatbuffers headers
* Fixed "which is prefixed in the source directory"
* Fixed typo in flatbuffers cmake
* Removed flatbuffers submodule
* Added fork submodule
* Fixed static build
* Fixed cross-compilation
* Fixed -Wshadow warning
* Fixed warning on Windows
* Use only headers from flatbuffers library
* Added LTO and fixed compilation errors on Windows
* Fixed warnings in tensorflow_common
* Move ctors implementation to cpp file
* Added information about new frontends to common FEm part
* Temporarily disable warnings
* Fixed code style using clang-format
* Fixed Windows
* reverted changes in onnx
* Revert changes in onnx_common
* Removed pragma once from cpp
Co-authored-by: missjane <estepyreva@gmail.com>
Co-authored-by: Ilya Lavrenov <ilya.lavrenov@intel.com>