[EISW-89824] [master] Rename VPUX to NPU (#19004)

* Change `VPUX`/`VPU` occurrences to `NPU`

* Switch `HARDWARE_AWARE_IGNORED_PATTERNS` VPU to NPU

* Rename `MYRIAD plugin`

* Rename `vpu_patterns` to `npu_patterns` in tools/pot

* Rename `vpu.json` to `npu.json` in tools/pot

* Rename `restrict_for_vpu` to `restrict_for_npu` in tools/pot

* Change `keembayOptimalBatchNum` to `npuOptimalBatchNum`

---------

Co-authored-by: Dan <mircea-aurelian.dan@intel.com>
Author: Stefania Hergane
Date: 2023-08-09 23:20:07 +03:00
Committed by: GitHub
Parent: dafe437833
Commit: 24f8c4105e

42 changed files with 711 additions and 711 deletions


@@ -11,7 +11,7 @@
 The Automatic Batching Execution mode (or Auto-batching for short) performs automatic batching on the fly to improve device utilization by grouping inference requests together, without programming effort from the user.
 With Automatic Batching, gathering the input and scattering the output from the individual inference requests required for the batch happen transparently, without affecting the application code.
-Auto Batching can be used :ref:`directly as a virtual device <auto-batching-as-device>` or as an :ref:`option for inference on CPU/GPU/VPU <auto-batching-as-option>` (by means of a configuration/hint). These two ways are provided for the user to enable the BATCH device **explicitly** or **implicitly**, with the underlying logic remaining the same. One difference is that the CPU device does not support enabling the BATCH device implicitly: a command such as ``./benchmark_app -m <model> -d CPU -hint tput`` will not apply the BATCH device **implicitly**, while ``./benchmark_app -m <model> -d "BATCH:CPU(16)"`` loads the BATCH device **explicitly**.
+Auto Batching can be used :ref:`directly as a virtual device <auto-batching-as-device>` or as an :ref:`option for inference on CPU/GPU/NPU <auto-batching-as-option>` (by means of a configuration/hint). These two ways are provided for the user to enable the BATCH device **explicitly** or **implicitly**, with the underlying logic remaining the same. One difference is that the CPU device does not support enabling the BATCH device implicitly: a command such as ``./benchmark_app -m <model> -d CPU -hint tput`` will not apply the BATCH device **implicitly**, while ``./benchmark_app -m <model> -d "BATCH:CPU(16)"`` loads the BATCH device **explicitly**.
 Auto-batching primarily targets existing code written for inferencing many requests, each with a batch size of 1. To get the corresponding performance improvements, the application **must run multiple inference requests simultaneously**.
 Auto-batching can also be used via a particular *virtual* device.
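
For reference, a minimal sketch of the **explicit** path through the Python API, mirroring the ``benchmark_app -d "BATCH:CPU(16)"`` command in the hunk above; the model path is a placeholder:

```python
from openvino.runtime import Core

core = Core()
model = core.read_model("model.xml")  # placeholder IR path

# Wrap the underlying device in the virtual BATCH device with batch size 16;
# this loads the BATCH device explicitly, just like the benchmark_app option.
compiled_model = core.compile_model(model, "BATCH:CPU(16)")
```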


@@ -191,7 +191,7 @@ Tune quantization parameters
 regex = '.*layer_.*'
 nncf.quantize(model, dataset, ignored_scope=nncf.IgnoredScope(patterns=[regex]))
-* ``target_device`` - defines the target device, the specificity of which will be taken into account during optimization. The following values are supported: ``ANY`` (default), ``CPU``, ``CPU_SPR``, ``GPU``, and ``VPU``.
+* ``target_device`` - defines the target device, the specificity of which will be taken into account during optimization. The following values are supported: ``ANY`` (default), ``CPU``, ``CPU_SPR``, ``GPU``, and ``NPU``.
 .. code-block:: sh
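
A minimal sketch of the ``target_device`` option, reusing the ``model`` and ``dataset`` from the snippet above and assuming NNCF's public ``TargetDevice`` enum exposes an ``NPU`` member after this rename:

```python
import nncf

# Quantize with NPU-specific constraints taken into account during
# optimization; TargetDevice.ANY is the default.
quantized_model = nncf.quantize(
    model,
    dataset,
    target_device=nncf.TargetDevice.NPU,
)
```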