It uses CMake 3.16 built-in utilities to speed up build time:
* Unity builds
* Precompiled headers
The feature is controlled via the `ENABLE_FASTER_BUILD` CMake option (disabled by default).
The option is available only with CMake >= 3.16.
The feature is enabled per-target via the `ie_faster_build` function.
Some observations:
* No actual compile-time measurements yet, but there is a subjectively
noticeable speed-up locally with VS 2019.
* Unity builds give a much bigger effect, but impose some restrictions on source files,
so they are not used everywhere.
* [IE][VPU]: Fixes addCopyForOutputsInsideNetwork
When a dynamic output has a consumer, the pass tries
to connect the output's shape with the new intermediate data
twice: once at the moment of the duplicateData call (successful)
and once more manually at the end of the pass. The second
attempt leads to an error since the child data is already connected.
Signed-off-by: Gladilov, Gleb <gleb.gladilov@intel.com>
* [IE][VPU]: Introduces tests on addCopyForOutputsInsideNetwork
Signed-off-by: Gladilov, Gleb <gleb.gladilov@intel.com>
- introduced type_dispatch primitive
- refactored SplitX and MergeX kernels to use type_dispatch
- extended SplitX and MergeX to support 8S, 16U, 16S, 32S types
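The idea behind a type_dispatch primitive can be sketched roughly as follows. This is only an illustration and not the code from this change: the `Depth` enum, `depth_of`, `type_list`, `SplitTwoChannels` and `split2` are hypothetical names, and the real primitive may have a different signature. The sketch walks a compile-time list of element types and invokes a templated kernel for the one matching the runtime depth.

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>

// Hypothetical runtime depth tags (F32 is deliberately left unsupported below).
enum class Depth { U8, S8, U16, S16, S32, F32 };

// Maps a C++ element type to its runtime depth tag.
template <typename T> struct depth_of;
template <> struct depth_of<std::uint8_t>  { static constexpr Depth value = Depth::U8;  };
template <> struct depth_of<std::int8_t>   { static constexpr Depth value = Depth::S8;  };
template <> struct depth_of<std::uint16_t> { static constexpr Depth value = Depth::U16; };
template <> struct depth_of<std::int16_t>  { static constexpr Depth value = Depth::S16; };
template <> struct depth_of<std::int32_t>  { static constexpr Depth value = Depth::S32; };

template <typename... Ts> struct type_list {};

// Base case: the list is exhausted and no type matched.
template <typename F>
bool type_dispatch(type_list<>, Depth, F&&) { return false; }

// Recursive case: call f with a value of the first type whose tag equals d.
template <typename T, typename... Ts, typename F>
bool type_dispatch(type_list<T, Ts...>, Depth d, F&& f) {
    if (depth_of<T>::value == d) {
        f(T{});
        return true;
    }
    return type_dispatch(type_list<Ts...>{}, d, std::forward<F>(f));
}

// Hypothetical kernel: de-interleaves a 2-channel row into two planes.
struct SplitTwoChannels {
    const void* in;
    void* out0;
    void* out1;
    std::size_t length;  // number of pixels in the row

    template <typename T>
    void operator()(T) const {
        const T* src = static_cast<const T*>(in);
        T* a = static_cast<T*>(out0);
        T* b = static_cast<T*>(out1);
        for (std::size_t i = 0; i < length; ++i) {
            a[i] = src[2 * i];
            b[i] = src[2 * i + 1];
        }
    }
};

using supported_types =
    type_list<std::uint8_t, std::int8_t, std::uint16_t, std::int16_t, std::int32_t>;

// Returns false when the depth is not in supported_types.
bool split2(Depth d, const void* in, void* out0, void* out1, std::size_t length) {
    return type_dispatch(supported_types{}, d, SplitTwoChannels{in, out0, out1, length});
}
```

With this shape, supporting another element type means adding one `depth_of` specialization and one entry in `supported_types`, while the kernel body stays untouched.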
* fake quantize single layer test for GNA plugin
* implemented fakequantize for fp32 case as an activation function
* added proper seed randomisation within single test run
* [GNA] [FAKEQUANTIZE] fixed ref-fp32 implementation on GNA to use nearbyint instead of roundf
* [GNA] [FAKEQUANTIZE] restored random seed
* [GNA][FAKEQUANTIZE] disabled 4d and integer tests for FakeQuantize
* [GNA][FAKEQUANTIZE] updated ngraph FakeQuantize builder to accept seed
* [GNA][FAKEQUANTIZE] aligned FP calculations order on GNA with reference ngraph - this however gives more error
* [CPU] build of FakeQuantize tests restored
* [TESTS][FAKEQUANTIZE] ignore extra inferRequests for disabled tests
* [GNA] Fixed legacy unit test failures that appeared due to an extra check for a possible segfault in import frames
* [GNA] adopted fuse multiple identities for FakeQuantize layer
* [GNA] fp32 runtime code review
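For context on the nearbyint vs roundf fix above, here is a minimal per-element sketch of an fp32 FakeQuantize reference. It is an assumption-laden illustration, not the GNA plugin code: the function name `fake_quantize` and its scalar signature are hypothetical, and it assumes the default floating-point rounding mode (round to nearest, ties to even). `roundf` rounds halfway cases away from zero, so the two functions disagree exactly on ties.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical scalar FakeQuantize reference (one element, scalar limits).
// Uses std::nearbyint, which honours the current rounding mode
// (ties-to-even by default), instead of roundf (ties away from zero).
float fake_quantize(float x, float in_low, float in_high,
                    float out_low, float out_high, int levels) {
    assert(levels > 1 && in_high > in_low);
    if (x <= in_low) return out_low;
    if (x > in_high) return out_high;
    const float steps = static_cast<float>(levels - 1);
    const float q = std::nearbyint((x - in_low) / (in_high - in_low) * steps);
    return q / steps * (out_high - out_low) + out_low;
}
```

For example, with limits 0..4 and 5 levels, an input of 2.5 lands exactly on a tie: `nearbyint` picks 2 while `roundf` would pick 3.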
* Backport of FQ+Mul transform to master
* Accept any type of input to FQ in the transformation
* Test the fusion when all FQ inputs are non-const
* Fusion test when only one output limit is const
* Test passing the output of FQ to second input of Mul
* Specify in and out precisions separately, add layouts for convolution
* Align convolution layer tests instantiations with updated definition
* Align convolution layer tests instantiations with updated definition for template plugin
* net, in, out precisions
Co-authored-by: Mikhail Treskin <mikhail.treskin@intel.com>
* UWP fixes
* Commented code for compilation with UWP
* Current state: compiled for DESKTOP_APP
* Fixes
* Added toolchain
* Enabled ONNX importer for Windows Store
* Updated toolchain
* Fixes
* Disable ONNX in case of UWP
* Fix for Windows Driver
* Applied style check
* Dynamic loading of GetDLLDirectory symbols
* Clean-up in the toolchain
* Updated mkldnn plugin cmake
* ConvertPrecision - saturate Constant's value to std::numeric_limits<dst_type>::lowest() if it's below that limit.
* Remove clamping to std::numeric_limits<int32_t>::lowest() in U32/U64 case
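The saturation described above can be illustrated with a small sketch. This is not the ConvertPrecision code: `convert_saturated` is a hypothetical helper, it is restricted to an int64 source and a signed integer destination, and the upper clamp to `max()` is an assumption added for symmetry (the change itself only mentions `lowest()`). Restricting to signed destinations sidesteps the unsigned case, where comparing against a signed lower limit is not meaningful.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <type_traits>
#include <vector>

// Hypothetical helper: narrows int64 values to a signed integer type,
// clamping out-of-range values instead of letting them wrap around.
template <typename Dst>
std::vector<Dst> convert_saturated(const std::vector<std::int64_t>& src) {
    static_assert(std::is_integral<Dst>::value && std::is_signed<Dst>::value &&
                      sizeof(Dst) <= sizeof(std::int64_t),
                  "sketch covers signed integer destinations only");
    const std::int64_t lo = std::numeric_limits<Dst>::lowest();
    const std::int64_t hi = std::numeric_limits<Dst>::max();  // assumed upper clamp
    std::vector<Dst> dst;
    dst.reserve(src.size());
    for (std::int64_t v : src) {
        if (v < lo) v = lo;
        else if (v > hi) v = hi;
        dst.push_back(static_cast<Dst>(v));
    }
    assert(dst.size() == src.size());
    return dst;
}
```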
* fix bidirectional case in references of sequences ops, enable decomposition of bidirectional cases in CommonOptimizations
* introduce new opset5, include GRU/RNN/LSTM Sequences to opset5
* Revert "introduce new opset5, include GRU/RNN/LSTM Sequences to opset5"
This reverts commit 73c22a11db.
* Introduced a new way to test DSR+Op cases
* Enabled DSR_Reduce, DSR_VariadicSplit, DSR_TopK, DSR_Scatter, DSR_Unsqueeze tests
* Other disabled tests are still disabled until reference function is implemented. Added related comments
* Reduce DSR+Op tests execution time via reducing tensor shapes
* Now coordinate_transformation_mode is used for all axes in the 'nearest' mode.
* Temporarily added tests for Interpolate-4 evaluate().
* Deleted temporarily added tests.
* Fixed documentation for the 'nearest' mode.
* Small fixes.
* Disabled Interpolate-4 layer tests for CPU.
* Disabled some Interpolate-4 CPU tests.
* do not change index table when execute each time
* layout check added
* interpolate for no batch size even if scale is 1
* coordinate transformation divides by scale instead of multiplying by 1/scale, for higher accuracy
* temporarily disable tests
* test modification
* Some changes.
* Enabled some tests.