* Optimized Infer Request Scheduling
* Fixed misprint
* Brushing the code and comments a bit
* further brushing of the ScheduleToWorkerRequest: moving the task execution directly into the loop over devices (avoids pointers and 'else' clause)
* 1) zero-copy (assuming determenistic app-level scheduling) for the multi-device, via "borrowing" the corresponding device-specific blobs and letting the app to implicitly use these
2) Initial MULTI section in the opt guide (primarily to document a tip on helping the MULTI to keep the zero-copy path)
Co-authored-by: apankratovantonp <anton.pankratov@intel.com>