desired format
changed InferRequestInternal:
- added _deviceInputs member to store plugin desired preprocessing
targets
- added default argument to preProcessingRequired to describe plugin
specific desired preprocessing target
- SetBlob and GetBlob to deal with plugin desired preprocessing targets
(_deviceInputs)
- added addInputPreProcessingFor helper method to avoid code
duplication
changed TEMPLATE plugin to use new functionality:
- removed explicit precision conversion (to use built-in one of
InferRequestInternal)
- _networkInputBlobs to use InferRequestInternal::_deviceInputs
changed PreprocessingPrecisionConvertTest:
- to force output precision to be same as input (and not FP32 always)
changed TEMPLATE plugin to allow U8 outputs
Ticket - #-42237
Add Unsqueeze, Equal and Select operations to the StaticShapeBroadcast target shape evaluator, because they are present in the target shape subgraph of one of the networks we are currently enabling.
Ticket - #-44546
Changes:
* Support dynamic data as broadcast input in Broadcast DTS
* Update DTS tests to support both dynamic and static inputs
* Update inference tests:
a) Refactor tests to have only one testing class - NonZero_Broadcast
b) Make DSR_TestsCommon base class to reuse createInputSubgraphWithDSR and inputs generating utils.
c) Add possibility to add additional results in DSR_TestsCommon, because NonZero doesn't support cases when both its outputs are unused, so we need to add at least one of them to function results.
GFlags builds multiple targets, which requires aligning build
options on Windows builds.
FetchModule offloads project configuration to cmake. This also allows
aligning build configurations and targets across projects:
https://crascit.com/2015/07/25/cmake-gtest/
* Create new iterators which allow to iterate over coordinates.
Use new iterators to speedup StridedSlice reference implementation.
* Call memcpy if reverse ref impl has nothing to reverse.
* Add unit tests for coordinate range.
* Change coordinates::RangeIterator to template.
* Yet another slice and reverse implementation.
Remove all stuff connected with ranges.
* Apply review suggestions.
* Back to ranges which base on CoordinateTransform.
* try to fix x86_32 build
* try to fix x86_32 build
* Ranges which return start, no, stride, direction.
* add input validation to coordinate_index
enable coordinate_range validation tests
* add some doxygens
* fix range increment
* add empty range
* move SliceRange::get_value to cpp file
Co-authored-by: Patryk Elszkowski <patryk.elszkowki@intel.com>
Co-authored-by: ggalieroc <gabriele.galiero.casay@intel.com>
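
The "call memcpy if reverse ref impl has nothing to reverse" item above can be illustrated with a minimal sketch. The function name `reverse_ref`, its signature and the float-only element type are assumptions made for this example; they are not the actual ngraph reference implementation, which iterates with coordinate ranges.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <set>
#include <vector>

// Hypothetical helper (not the ngraph signature): reverses a row-major
// tensor `in` of the given `shape` along every axis listed in `axes`.
void reverse_ref(const float* in, float* out,
                 const std::vector<std::size_t>& shape,
                 const std::set<std::size_t>& axes) {
    std::size_t total = 1;
    for (std::size_t d : shape) total *= d;

    // Fast path: no axis is reversed, so the output is a plain copy.
    if (axes.empty()) {
        std::memcpy(out, in, total * sizeof(float));
        return;
    }

    // Row-major strides: the last dimension is contiguous.
    std::vector<std::size_t> strides(shape.size(), 1);
    for (std::size_t i = shape.size(); i-- > 1;)
        strides[i - 1] = strides[i] * shape[i];

    for (std::size_t flat = 0; flat < total; ++flat) {
        std::size_t rest = flat;
        std::size_t src = 0;
        for (std::size_t d = 0; d < shape.size(); ++d) {
            std::size_t c = rest / strides[d];  // coordinate along axis d
            rest %= strides[d];
            if (axes.count(d)) c = shape[d] - 1 - c;  // mirror reversed axes
            src += c * strides[d];
        }
        out[flat] = in[src];
    }
}
```

For a 2x3 tensor holding 0..5, reversing axis 1 yields 2, 1, 0, 5, 4, 3, while an empty axis set takes the memcpy path and returns the input unchanged.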
* remove gather op from layer creator
* remove floormod op from layer creator
* remove minimum op from layer creator
* remove spacetodepth op from layer creator
* remove redundant virtual function specifier
* [IE]: Allows plugins to disable Gather -> GatherIE conversion
Gather layer takes axis as 3rd input, not attribute and may
take indices as 0D scalar input
Signed-off-by: Gladilov, Gleb <gleb.gladilov@intel.com>
* [IE][VPU]: Disables Gather -> GatherIE conversion
Gather -> GatherIE conversion may introduce Gather
operation decomposition into Unsqueeze + Gather +
Squeeze in case if indices input is 0D scalar input.
In case of dynamic Gather such decomposition will
break dynamic path. Myriad plugin has to support
Gather operation natively without legacy conversion.
Signed-off-by: Gladilov, Gleb <gleb.gladilov@intel.com>
* [IE][VPU]: Enables native Gather support
Gather layer in contrast with GatherIE takes
axis as 3rd input, not attribute and may take
indices input as 0D scalar input.
0D -> 1D conversion happens automatically at
the beginning of frontend.
Axis as 3rd input is supported for single value
integral scalar only.
Signed-off-by: Gladilov, Gleb <gleb.gladilov@intel.com>
* [IE][VPU][Tests]: Enable new infra single layer Gather tests
* Removes corresponding tests from old infrastructure
* Enables test cases with 0D indices input
* Extracts base test fixture from shared tests fixture.
Unfortunately, Google Tests supports Combine generator
for tuples of size up to 10 only. Originally, shared
tests fixture already has 10 elements in tuple for
tests parameters. At the same time myriad plugin needs
to specify configuration option. Since configuration
option could not be test parameter we are forced to
use separate class, in order to get rid of code
duplication base class is used.
Signed-off-by: Gladilov, Gleb <gleb.gladilov@intel.com>
* [IE][VPU]: Updates firmware
Enables native Gather support on device side
* zero-copy (assuming deterministic app-level scheduling) for the multi-device, via "borrowing" the corresponding device-specific blobs and letting the app implicitly use these
* Optimized Infer Request Scheduling
* remoteblob checks in the conventional SetBlob
* correctly (with status) reporting NOT_IMPLEMENTED
* SetBlob to accommodate the RemoteBlobs
* Tests for remote blobs support via MULTI: creating the shared_test in case the other (closed source) plugins would want to use that (in the private shared_tests instantiations).
Also instantiating the remote blobs tests for some basic combinations to test that MULTI supports them
* macos compilation (and general plugin platform support) fix
* shuffled files, so that the MULTI tests are now part of the ieFuncTests (and need no separate target). Also brushed the macro that handles the NOT_IMPLEMENTED a bit
* further shuffled files, so that the initial MULTI tests are now part of the IE tests, yet specific instances do need separate targets
* Fixed misprint
* Brushing the code and comments a bit
* further brushing of the ScheduleToWorkerRequest: moving the task execution directly into the loop over devices (avoids pointers and 'else' clause)
* 1) zero-copy (assuming deterministic app-level scheduling) for the multi-device, via "borrowing" the corresponding device-specific blobs and letting the app implicitly use these
2) Initial MULTI section in the opt guide (primarily to document a tip on helping the MULTI to keep the zero-copy path)
* [MULTI] remote context support and associated scheduling (respecting the remote data affinity)
* fix CentOS (old) gcc issue: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81880
since the introduced thread_local string is a template, the bug manifests itself (and the string is not allocated/initialized).
the workaround is to wrap the std::string into a function
* further fix for the old gcc versions issue, now with a segfault in non-trivial thread_local destruction: switching from the std::string to the plain const char*
* additional tests for the MULTI and remote blobs (no remote context and multi GPUs cases)
* fix for the tests (that now can check for the more specific NotImplemented exception).
Also a couple of line endings
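
The two thread_local workarounds described above (wrapping the variable into a function, then switching from std::string to a plain const char*) can be sketched as follows. The `PreferredDevice` holder and its `name()` accessor are hypothetical names chosen for illustration; the commit's actual class and member names are not shown in this log.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical holder: stands in for the class template that owns the
// per-thread string in the real code.
template <typename T>
struct PreferredDevice {
    // A function-local thread_local is initialised on first use in each
    // thread, which sidesteps relying on a thread_local member of a template.
    // const char* is trivially destructible, so nothing has to run for it
    // at thread exit (unlike std::string).
    static const char*& name() {
        thread_local const char* value = "";
        return value;
    }
};
```

Each thread sees its own pointer, and each template instantiation has its own variable, so `PreferredDevice<int>::name() = "GPU";` leaves `PreferredDevice<float>::name()` untouched. The pointer should only refer to storage that outlives the thread, such as a string literal.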
* added check so that sample only supports networks with one input
* moved ngraph-related operations to related segment of the sample
* fix for output image not being saved correctly due
* Range: Align operator with spec and add unit tests
* Range: Remove output shape from range ref impl signature
* Range: Exclude backend unit tests for CPU and GPU due to unsupported dynamic ops
* Range: Add single layer test class for Range-4
* Range: Add unit test for shape inference
* Range: Add unit tests for i32 and f32
* Range: Refactor Range v0 backend test and added test for f32 type
* Range: Add floating point tolerance in unit tests to avoid failures due to precision
* Range: Add subgraph tests for Range add element-wise
* Range: Refactor Range class for single layer tests and add range add element-wise test with truncated inputs
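
As an illustration of the Range items above (reference implementation without an explicit output shape, plus a floating point tolerance in tests), here is a minimal sketch. The names `range_ref` and `almost_equal`, the element count formula max(ceil((stop - start) / step), 0) and the default tolerance are assumptions for this example, not the code from these commits.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical reference: the element count is derived from start/stop/step,
// so no output shape has to be passed in.
template <typename T>
std::vector<T> range_ref(T start, T stop, T step) {
    assert(step != T(0));
    const double span = (static_cast<double>(stop) - static_cast<double>(start)) /
                        static_cast<double>(step);
    const std::size_t count =
        span > 0 ? static_cast<std::size_t>(std::ceil(span)) : 0;

    std::vector<T> out(count);
    // start + i * step avoids the error accumulation of repeated addition.
    for (std::size_t i = 0; i < count; ++i)
        out[i] = start + static_cast<T>(i) * step;
    return out;
}

// Hypothetical comparison helper with an absolute tolerance for float tests.
inline bool almost_equal(float a, float b, float tol = 1e-5f) {
    return std::fabs(a - b) <= tol;
}
```

With these definitions, `range_ref<int>(0, 10, 3)` produces 0, 3, 6, 9 and `range_ref<int>(10, 0, -3)` produces 10, 7, 4, 1; a start that already lies past stop gives an empty result.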
* [VPU] Fix dynamic networks import
* [IE][GNA][TESTS] Move ImportExport tests from GNA to shared part
* [VPU][Tests] Add ExportImport test for dynamic network
* [VPU] Review fixes
* [VPU][Tests] Review and test fixes
* [VPU][Tests] Move TEST_P to shared part
* remove convert op from layer creator
* remove depthtospace op from layer creator
* remove mvn op from layer creator
* remove normalizel2 op from layer creator
* remove notequal op from layer creator
* remove subtract op from layer creator
* correct mvn op behavior when copied with new input
* fix trying to get precision from empty output of normalize layer
* fix normalize layer not setting output type
* remove trailing whitespace
* add fp64 to possible convert op precision types
* use a function to translate bool string representation
* merge emergency opset changes for mvn and roipooling ops
* Add reference implementation for PSROIPooling operator
* fix test_roi_pooling
* use std::roundf
* remove unnecessary copies in single layer tests
* Fixes after review
* fixes after review
* use element::Type_t instead of element::
* apply code format
* add PSROIPooling to evaluates_map
* apply code format
it is easy to capture when there are 2 app-level inference requests, but only single worker (MULTI) request
main thread | callback thread
___________________________________________________________________________
| <in the callback, the worker request>
| <the request returns itself to the "idle" queue>
| 1) idleGuard.Release()->try_push(workerRequestPtr)
2)<notified on vacant worker arrived via callback> |
3) starts another request with StartAsync | ...
4) <in the ThisRequestExecutor::run()> |
workerInferRequest->_task = std::move(task); | if (_inferPipelineTasks.try_pop(workerRequestPtr->task))
the last line introduces a DATA RACE (sporadically manifesting as a bad_function_call exception); the fix is in this commit
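
One way to avoid this kind of race is sketched below; it is an assumption about a possible approach and may differ from the actual fix in this commit. The idea is that a worker is either in the idle queue or owned by exactly one thread that writes its task, and that this decision is taken under a single lock. `Scheduler`, `Worker`, `submit` and `on_worker_done` are hypothetical names, and tasks run synchronously here to keep the example short.

```cpp
#include <cassert>
#include <functional>
#include <mutex>
#include <queue>
#include <utility>

using Task = std::function<void()>;

struct Worker {
    Task _task;  // written only by the thread that currently owns the worker
};

class Scheduler {
public:
    // Main-thread side: run on an idle worker, or queue the task.
    void submit(Task task) {
        Worker* worker = nullptr;
        {
            std::lock_guard<std::mutex> lock(_mutex);
            if (_idle.empty()) {
                _pending.push(std::move(task));
                return;
            }
            worker = _idle.front();
            _idle.pop();
        }
        worker->_task = std::move(task);  // worker is no longer in _idle
        worker->_task();
    }

    // Callback side: take a pending task first; only if there is none does
    // the worker become visible to other threads through the idle queue.
    void on_worker_done(Worker* worker) {
        Task next;
        {
            std::lock_guard<std::mutex> lock(_mutex);
            if (_pending.empty()) {
                _idle.push(worker);
                return;
            }
            next = std::move(_pending.front());
            _pending.pop();
        }
        worker->_task = std::move(next);  // still exclusively owned here
        worker->_task();
    }

private:
    std::mutex _mutex;
    std::queue<Worker*> _idle;
    std::queue<Task> _pending;
};
```

Because the callback never pops a pending task into a worker that it has already returned to the idle queue, the two writes to `_task` shown in the diagram cannot overlap in this sketch.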