The race is easy to reproduce when there are 2 app-level inference requests, but only a single worker (MULTI) request:

main thread                                         |       callback thread
____________________________________________________________________________________________________________
                                                    | <in the callback, the worker request>
                                                    |   <the request returns itself to the "idle" queue>
                                                    | 1) idleGuard.Release()->try_push(workerRequestPtr)
2) <notified of a vacant worker via the callback>   |
3) starts another request with StartAsync           | ...
4) <in ThisRequestExecutor::run()>                  |
   workerInferRequest->_task = std::move(task);     | if (_inferPipelineTasks.try_pop(workerRequestPtr->_task))

The last line introduces a DATA RACE on the worker request's _task member (sporadically manifested as a std::bad_function_call exception); the fix is in this commit.
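
For illustration, a minimal stand-alone C++ sketch of the racy pattern (simplified names, NOT the actual MULTI plugin code): two threads write the same std::function member of a worker request without synchronization, and invoking the torn/empty function is what surfaces as the sporadic std::bad_function_call.

// Illustrative only: deliberately reproduces the kind of race described above.
#include <functional>
#include <thread>

using Task = std::function<void()>;

struct WorkerInferRequest {
    Task _task;  // shared member, written by two threads without a lock -> data race
};

int main() {
    WorkerInferRequest worker;

    // "callback" thread: pops a queued pipeline task straight into worker._task
    std::thread callback([&worker] {
        worker._task = [] { /* app-level request #2 */ };   // races with the write below
    });

    // "main" thread: assigns its own task and the worker then runs it;
    // the concurrent write above can leave _task empty/torn at the call site
    worker._task = [] { /* app-level request #1 */ };
    try {
        worker._task();  // with the race, this can sporadically throw std::bad_function_call
    } catch (const std::bad_function_call&) {
        // the symptom mentioned in the commit message
    }

    callback.join();
}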
Author: Maxim Shevtsov, 2020-12-07 16:58:26 +03:00, committed by GitHub
parent 6bad345df9
commit 57fda7f2a8


@@ -95,10 +95,11 @@ MultiDeviceExecutableNetwork::MultiDeviceExecutableNetwork(const DeviceMap<Infer
                 }
                 // try to return the request to the idle list (fails if the overall object destruction has began)
                 if (idleGuard.Release()->try_push(workerRequestPtr)) {
+                    Task t;
                     // try pop the task, as we know there is at least one idle request
-                    if (_inferPipelineTasks.try_pop(workerRequestPtr->_task)) {
+                    if (_inferPipelineTasks.try_pop(t)) {
                         // if succeeded, let's schedule that
-                        ScheduleToWorkerInferRequest(std::move(workerRequestPtr->_task));
+                        ScheduleToWorkerInferRequest(std::move(t));
                     }
                 }
             });
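
Why the one-line change removes the race: the callback thread now pops into the stack-local Task t and never writes the shared workerRequestPtr->_task, so the idle queue, rather than the shared _task member, becomes the hand-off point between threads; the scheduling path assigns a _task only on a worker it acquires for itself. Below is a hedged sketch of the repaired callback-side logic, with simplified, hypothetical stand-ins (tryPop(), onWorkerBecameIdle()) for the real queue and callback.

// Illustrative sketch, not the real OpenVINO types.
#include <functional>
#include <mutex>
#include <queue>

using Task = std::function<void()>;

std::mutex queueMutex;                 // stand-in for the queue's internal lock
std::queue<Task> inferPipelineTasks;   // stand-in for _inferPipelineTasks

// hypothetical helper mirroring a lock-based try_pop(): pops under the lock into `out`
bool tryPop(Task& out) {
    std::lock_guard<std::mutex> lock(queueMutex);
    if (inferPipelineTasks.empty())
        return false;
    out = std::move(inferPipelineTasks.front());
    inferPipelineTasks.pop();
    return true;
}

void onWorkerBecameIdle() {            // callback-thread side after the fix
    Task t;                            // local: no other thread can write into it
    if (tryPop(t)) {
        // hand the task over for scheduling; no write into a shared _task member here
        // (in the real code: ScheduleToWorkerInferRequest(std::move(t)))
        t();                           // placeholder for scheduling in this sketch
    }
}

int main() {
    inferPipelineTasks.push([] { /* queued app-level request */ });
    onWorkerBecameIdle();
}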