Correct the GPU number that triggers the CPU removal in CTPUT (#13225)

Haiqi Pan 2022-10-12 15:00:04 +08:00 committed by GitHub
parent 844b0a2682
commit c0c4c41241
GPG Key ID: 4AEE18F83AFDEB23

@@ -220,7 +220,7 @@ While `LATENCY` and `THROUGHPUT` can select one target device with your preferre
CUMULATIVE_THROUGHPUT has similar behavior as [the Multi-Device execution mode (MULTI)](./multi_device.md). The only difference is that CUMULATIVE_THROUGHPUT uses the devices specified by AUTO, which means that it's not mandatory to add devices manually, while with MULTI, you need to specify the devices before inference.
With the CUMULATIVE_THROUGHPUT option:
-- If `AUTO` without any device names is specified, and the system has more than one GPU devices, AUTO will remove CPU from the device candidate list to keep GPU running at full capacity.
+- If `AUTO` without any device names is specified, and the system has more than two GPU devices, AUTO will remove CPU from the device candidate list to keep GPU running at full capacity.
- If device priority is specified, AUTO will run inference requests on devices based on the priority. In the following example, AUTO will always try to use GPU first, and then use CPU if GPU is busy:
```sh
ov::CompiledModel compiled_model = core.compile_model(model, "AUTO:GPU,CPU", ov::hint::performance_mode(ov::hint::PerformanceMode::CUMULATIVE_THROUGHPUT));