[docs] python snippets for devices master (#11176)

* Update CPU docs
* Update GPU docs
* Update with sphinxtab
* Fix docs
* Add preprocessing snippet
* Fix path

Co-authored-by: Anastasia Kuporosova <anastasia.kuporosova@intel.com>

This commit is contained in:
parent bc9f140bb4
commit 8f88889876
@@ -9,7 +9,17 @@ There are two options for using the custom operation configuration file:

* Include a section with your kernels into the automatically-loaded `<lib_path>/cldnn_global_custom_kernels/cldnn_global_custom_kernels.xml` file.
* Call the `ov::Core::set_property()` method from your application with the `"CONFIG_FILE"` key and the configuration file name as a value before loading the network that uses custom operations to the plugin:

-@snippet snippets/gpu/custom_kernels_api.cpp part0
+@sphinxtabset
+
+@sphinxtab{C++}
+@snippet docs/snippets/gpu/custom_kernels_api.cpp part0
+@endsphinxtab
+
+@sphinxtab{Python}
+@snippet docs/snippets/gpu/custom_kernels_api.py part0
+@endsphinxtab
+
+@endsphinxtabset

All OpenVINO samples, except the trivial `hello_classification`, and most Open Model Zoo demos
feature a dedicated command-line option `-c` to load custom kernels. For example, to load custom operations for the classification sample, run the command below:
@@ -15,7 +15,17 @@ For the CPU plugin `"CPU"` device name is used, and even though there can be mor

On multi-socket platforms, load balancing and memory usage distribution between NUMA nodes are handled automatically.
In order to use the CPU for inference, the device name should be passed to the `ov::Core::compile_model()` method:

-@snippet snippets/cpu/compile_model.cpp compile_model_default
+@sphinxtabset
+
+@sphinxtab{C++}
+@snippet docs/snippets/cpu/compile_model.cpp compile_model_default
+@endsphinxtab
+
+@sphinxtab{Python}
+@snippet docs/snippets/cpu/compile_model.py compile_model_default
+@endsphinxtab
+
+@endsphinxtabset

## Supported inference data types
CPU plugin supports the following data types as inference precision of internal primitives:
@@ -55,7 +65,17 @@ Using bf16 precision provides the following performance benefits:

To check whether the CPU device supports the bfloat16 data type, use the [query device properties interface](./config_properties.md) to query the ov::device::capabilities property, which should contain `BF16` in the list of CPU capabilities:

-@snippet snippets/cpu/Bfloat16Inference0.cpp part0
+@sphinxtabset
+
+@sphinxtab{C++}
+@snippet docs/snippets/cpu/Bfloat16Inference0.cpp part0
+@endsphinxtab
+
+@sphinxtab{Python}
+@snippet docs/snippets/cpu/Bfloat16Inference.py part0
+@endsphinxtab
+
+@endsphinxtabset
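For reference, a minimal Python sketch of the capability check described above (not part of this PR's snippets; it reuses the `OPTIMIZATION_CAPABILITIES` query from `Bfloat16Inference.py` further below):

```python
from openvino.runtime import Core

core = Core()
# The returned capability list should contain "BF16" on CPUs with native bfloat16 support.
cpu_capabilities = core.get_property("CPU", "OPTIMIZATION_CAPABILITIES")
if "BF16" in cpu_capabilities:
    print("Native bfloat16 inference is supported on this CPU")
```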
If the model was converted to bf16, ov::hint::inference_precision is set to ov::element::bf16 and can be checked via the ov::CompiledModel::get_property call. The code below demonstrates how to get the element type:
@@ -63,7 +83,17 @@ In case if the model was converted to bf16, ov::hint::inference_precision is set

To infer the model in f32 precision instead of bf16 on targets with native bf16 support, set ov::hint::inference_precision to ov::element::f32.

-@snippet snippets/cpu/Bfloat16Inference2.cpp part2
+@sphinxtabset
+
+@sphinxtab{C++}
+@snippet docs/snippets/cpu/Bfloat16Inference2.cpp part2
+@endsphinxtab
+
+@sphinxtab{Python}
+@snippet docs/snippets/cpu/Bfloat16Inference.py part2
+@endsphinxtab
+
+@endsphinxtabset

Bfloat16 software simulation mode is available on CPUs with the Intel® AVX-512 instruction set that do not support the native `avx512_bf16` instruction. This mode is intended for development purposes and does not guarantee good performance.
To enable the simulation, one has to explicitly set ov::hint::inference_precision to ov::element::bf16.
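For the Python API, a minimal sketch of enabling the simulation (an assumption that mirrors the committed `part2` snippet, with `"bf16"` substituted for `"f32"`):

```python
from openvino.runtime import Core

core = Core()
# Force bf16 inference precision; on AVX-512 CPUs without native avx512_bf16 this
# enables the (slow) software simulation mode described above.
core.set_property("CPU", {"INFERENCE_PRECISION_HINT": "bf16"})
```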
@@ -78,7 +108,17 @@ To enable the simulation, one have to explicitly set ov::hint::inference_precisi

If a machine has OpenVINO-supported devices other than the CPU (for example, an integrated GPU), then any supported model can be executed on the CPU and all the other devices simultaneously.
This can be achieved by specifying `"MULTI:CPU,GPU.0"` as a target device in case of simultaneous usage of CPU and GPU.

-@snippet snippets/cpu/compile_model.cpp compile_model_multi
+@sphinxtabset
+
+@sphinxtab{C++}
+@snippet docs/snippets/cpu/compile_model.cpp compile_model_multi
+@endsphinxtab
+
+@sphinxtab{Python}
+@snippet docs/snippets/cpu/compile_model.py compile_model_multi
+@endsphinxtab
+
+@endsphinxtabset

See [Multi-device execution page](../multi_device.md) for more details.
@@ -103,7 +143,17 @@ The most flexible configuration is the fully undefined shape, when we do not app

But reducing the level of uncertainty will bring performance gains.
We can reduce memory consumption through memory reuse, and as a result achieve better cache locality, which in turn leads to better inference performance, if we explicitly set dynamic shapes with defined upper bounds.

-@snippet snippets/cpu/dynamic_shape.cpp defined_upper_bound
+@sphinxtabset
+
+@sphinxtab{C++}
+@snippet docs/snippets/cpu/dynamic_shape.cpp defined_upper_bound
+@endsphinxtab
+
+@sphinxtab{Python}
+@snippet docs/snippets/cpu/dynamic_shape.py defined_upper_bound
+@endsphinxtab
+
+@endsphinxtabset

> **NOTE**: Using fully undefined shapes may result in significantly higher memory consumption compared to inferring the same model with static shapes.
> If the memory consumption is unacceptable but dynamic shapes are still required, one can reshape the model using shapes with a defined upper bound to reduce the memory footprint.
@@ -111,7 +161,17 @@ We can reduce memory consumption through memory reuse, and as a result achieve b

Some runtime optimizations work better if the model shapes are known in advance.
Therefore, if the input data shape is not changed between inference calls, it is recommended to use a model with static shapes or to reshape the existing model with a static input shape to get the best performance.

-@snippet snippets/cpu/dynamic_shape.cpp static_shape
+@sphinxtabset
+
+@sphinxtab{C++}
+@snippet docs/snippets/cpu/dynamic_shape.cpp static_shape
+@endsphinxtab
+
+@sphinxtab{Python}
+@snippet docs/snippets/cpu/dynamic_shape.py static_shape
+@endsphinxtab
+
+@endsphinxtabset

See [dynamic shapes guide](../ov_dynamic_shapes.md) for more details.
@@ -211,4 +271,3 @@ For some performance-critical DL operations, the CPU plugin uses optimized imple

* [Supported Devices](Supported_Devices.md)
* [Optimization guide](@ref openvino_docs_optimization_guide_dldt_optimization_guide)
* [CPU plugin developers documentation](https://github.com/openvinotoolkit/openvino/wiki/CPUPluginDevelopersDocs)
@@ -48,19 +48,49 @@ Then device name can be passed to `ov::Core::compile_model()` method:

@sphinxtab{Running on default device}

@sphinxtabset

@sphinxtab{C++}
@snippet docs/snippets/gpu/compile_model.cpp compile_model_default_gpu
@endsphinxtab

@sphinxtab{Python}
@snippet docs/snippets/gpu/compile_model.py compile_model_default_gpu
@endsphinxtab

@endsphinxtabset

@endsphinxtab

@sphinxtab{Running on specific GPU}

@sphinxtabset

@sphinxtab{C++}
@snippet docs/snippets/gpu/compile_model.cpp compile_model_gpu_with_id
@endsphinxtab

@sphinxtab{Python}
@snippet docs/snippets/gpu/compile_model.py compile_model_gpu_with_id
@endsphinxtab

@endsphinxtabset

@endsphinxtab

@sphinxtab{Running on specific tile}

@sphinxtabset

@sphinxtab{C++}
@snippet docs/snippets/gpu/compile_model.cpp compile_model_gpu_with_id_and_tile
@endsphinxtab

@sphinxtab{Python}
@snippet docs/snippets/gpu/compile_model.py compile_model_gpu_with_id_and_tile
@endsphinxtab

@endsphinxtabset

@endsphinxtab
@@ -93,7 +123,17 @@ Floating-point precision of a GPU primitive is selected based on operation preci

If a machine has multiple GPUs (for example, an integrated GPU and a discrete Intel GPU), then any supported model can be executed on all GPUs simultaneously.
This can be achieved by specifying `"MULTI:GPU.1,GPU.0"` as a target device.

-@snippet snippets/gpu/compile_model.cpp compile_model_multi
+@sphinxtabset
+
+@sphinxtab{C++}
+@snippet docs/snippets/gpu/compile_model.cpp compile_model_multi
+@endsphinxtab
+
+@sphinxtab{Python}
+@snippet docs/snippets/gpu/compile_model.py compile_model_multi
+@endsphinxtab
+
+@endsphinxtabset

See [Multi-device execution page](../multi_device.md) for more details.
@@ -106,13 +146,33 @@ Alternatively it can be enabled explicitly via the device notion, e.g. `"BATCH:G

@sphinxtab{Batching via BATCH plugin}

@sphinxtabset

@sphinxtab{C++}
@snippet docs/snippets/gpu/compile_model.cpp compile_model_batch_plugin
@endsphinxtab

@sphinxtab{Python}
@snippet docs/snippets/gpu/compile_model.py compile_model_batch_plugin
@endsphinxtab

@endsphinxtabset

@endsphinxtab

@sphinxtab{Batching via throughput hint}

@sphinxtabset

@sphinxtab{C++}
@snippet docs/snippets/gpu/compile_model.cpp compile_model_auto_batch
@endsphinxtab

@sphinxtab{Python}
@snippet docs/snippets/gpu/compile_model.py compile_model_auto_batch
@endsphinxtab

@endsphinxtabset

@endsphinxtab
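As a side note, the BATCH meta-device also accepts an explicit batch size in parentheses. The sketch below is an assumption based on the `BATCH:GPU(4)` device-string syntax and is not part of the committed snippets:

```python
from openvino.runtime import Core

core = Core()
model = core.read_model("model.xml")
# Assumed syntax: request auto-batching with a fixed batch size of 4 on the GPU.
compiled_model = core.compile_model(model, "BATCH:GPU(4)")
```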
@@ -141,7 +201,17 @@ For example, batch size 33 may be executed via 2 internal networks with batch si

The code snippet below demonstrates how to use dynamic batch in simple scenarios:

-@snippet snippets/gpu/dynamic_batch.cpp dynamic_batch
+@sphinxtabset
+
+@sphinxtab{C++}
+@snippet docs/snippets/gpu/dynamic_batch.cpp dynamic_batch
+@endsphinxtab
+
+@sphinxtab{Python}
+@snippet docs/snippets/gpu/dynamic_batch.py dynamic_batch
+@endsphinxtab
+
+@endsphinxtabset

See [dynamic shapes guide](../ov_dynamic_shapes.md) for more details.
@@ -149,7 +219,17 @@ See [dynamic shapes guide](../ov_dynamic_shapes.md) for more details.

The GPU plugin has the following additional preprocessing options:
- `ov::intel_gpu::memory_type::surface` and `ov::intel_gpu::memory_type::buffer` values for the `ov::preprocess::InputTensorInfo::set_memory_type()` preprocessing method. These values are intended to provide a hint to the plugin about the type of input Tensors that will be set at runtime, so that proper kernels are generated.

-@snippet snippets/gpu/preprocessing.cpp init_preproc
+@sphinxtabset
+
+@sphinxtab{C++}
+@snippet docs/snippets/gpu/preprocessing.cpp init_preproc
+@endsphinxtab
+
+@sphinxtab{Python}
+@snippet docs/snippets/gpu/preprocessing.py init_preproc
+@endsphinxtab
+
+@endsphinxtabset

With such preprocessing, the GPU plugin will expect an `ov::intel_gpu::ocl::ClImage2DTensor` (or derived) to be passed for each NV12 plane via the `ov::InferRequest::set_tensor()` or `ov::InferRequest::set_tensors()` methods.
docs/snippets/cpu/Bfloat16Inference.py (new file, 23 lines)
@@ -0,0 +1,23 @@

# Copyright (C) 2022 Intel Corporation
# SPDX-License-Identifier: Apache-2.0


from openvino.runtime import Core

#! [part0]
core = Core()
cpu_optimization_capabilities = core.get_property("CPU", "OPTIMIZATION_CAPABILITIES")
#! [part0]

# TODO: enable part1 when property api will be supported in python
#! [part1]
core = Core()
model = core.read_model("model.xml")
compiled_model = core.compile_model(model, "CPU")
inference_precision = core.get_property("CPU", "INFERENCE_PRECISION_HINT")
#! [part1]

#! [part2]
core = Core()
core.set_property("CPU", {"INFERENCE_PRECISION_HINT": "f32"})
#! [part2]
docs/snippets/cpu/compile_model.py (new file, 17 lines)
@@ -0,0 +1,17 @@

# Copyright (C) 2022 Intel Corporation
# SPDX-License-Identifier: Apache-2.0


#! [compile_model_default]
from openvino.runtime import Core

core = Core()
model = core.read_model("model.xml")
compiled_model = core.compile_model(model, "CPU")
#! [compile_model_default]

#! [compile_model_multi]
core = Core()
model = core.read_model("model.xml")
compiled_model = core.compile_model(model, "MULTI:CPU,GPU.0")
#! [compile_model_multi]
docs/snippets/cpu/dynamic_shape.py (new file, 17 lines)
@@ -0,0 +1,17 @@

# Copyright (C) 2022 Intel Corporation
# SPDX-License-Identifier: Apache-2.0


from openvino.runtime import Core

#! [defined_upper_bound]
core = Core()
model = core.read_model("model.xml")
model.reshape([(1, 10), (1, 20), (1, 30), (1, 40)])
#! [defined_upper_bound]

#! [static_shape]
core = Core()
model = core.read_model("model.xml")
model.reshape([10, 20, 30, 40])
#! [static_shape]
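A possible usage sketch (not part of the committed file; it assumes a 4-D f32 input) showing that, once the model is reshaped with bounded dynamic dimensions, inputs of different concrete shapes can be passed between inference calls:

```python
import numpy as np
from openvino.runtime import Core

core = Core()
model = core.read_model("model.xml")
model.reshape([(1, 10), (1, 20), (1, 30), (1, 40)])
compiled_model = core.compile_model(model, "CPU")

# Any concrete shape within the declared bounds is valid, and it may change per call.
compiled_model.infer_new_request({0: np.zeros((1, 16, 25, 32), dtype=np.float32)})
compiled_model.infer_new_request({0: np.zeros((4, 20, 30, 40), dtype=np.float32)})
```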
docs/snippets/gpu/compile_model.py (new file, 41 lines)
@@ -0,0 +1,41 @@

# Copyright (C) 2022 Intel Corporation
# SPDX-License-Identifier: Apache-2.0


from openvino.runtime import Core

#! [compile_model_default_gpu]
core = Core()
model = core.read_model("model.xml")
compiled_model = core.compile_model(model, "GPU")
#! [compile_model_default_gpu]

#! [compile_model_gpu_with_id]
core = Core()
model = core.read_model("model.xml")
compiled_model = core.compile_model(model, "GPU.1")
#! [compile_model_gpu_with_id]

#! [compile_model_gpu_with_id_and_tile]
core = Core()
model = core.read_model("model.xml")
compiled_model = core.compile_model(model, "GPU.1.0")
#! [compile_model_gpu_with_id_and_tile]

#! [compile_model_multi]
core = Core()
model = core.read_model("model.xml")
compiled_model = core.compile_model(model, "MULTI:GPU.1,GPU.0")
#! [compile_model_multi]

#! [compile_model_batch_plugin]
core = Core()
model = core.read_model("model.xml")
compiled_model = core.compile_model(model, "BATCH:GPU")
#! [compile_model_batch_plugin]

#! [compile_model_auto_batch]
core = Core()
model = core.read_model("model.xml")
compiled_model = core.compile_model(model, "GPU", {"PERFORMANCE_HINT": "THROUGHPUT"})
#! [compile_model_auto_batch]
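A small companion sketch (not part of the committed file): the `"GPU.0"` / `"GPU.1"` indices used above can be discovered by enumerating the available devices:

```python
from openvino.runtime import Core

core = Core()
# Prints something like ['CPU', 'GPU.0', 'GPU.1'] on a machine with two GPUs.
print(core.available_devices)
```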
docs/snippets/gpu/custom_kernels_api.py (new file, 10 lines)
@@ -0,0 +1,10 @@

# Copyright (C) 2022 Intel Corporation
# SPDX-License-Identifier: Apache-2.0


from openvino.runtime import Core

#! [part0]
core = Core()
core.set_property("GPU", {"CONFIG_FILE": "<path_to_the_xml_file>"})
#! [part0]
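A possible usage sketch (file names are hypothetical placeholders): the `"CONFIG_FILE"` property has to be set before the model that uses the custom operations is compiled for the GPU:

```python
from openvino.runtime import Core

core = Core()
# Register the custom-kernel configuration first...
core.set_property("GPU", {"CONFIG_FILE": "custom_kernels.xml"})  # hypothetical path
# ...then load and compile the model that relies on those custom operations.
model = core.read_model("model_with_custom_ops.xml")  # hypothetical model
compiled_model = core.compile_model(model, "GPU")
```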
docs/snippets/gpu/dynamic_batch.py (new file, 28 lines)
@@ -0,0 +1,28 @@

# Copyright (C) 2022 Intel Corporation
# SPDX-License-Identifier: Apache-2.0


import openvino.runtime as ov

#! [dynamic_batch]
core = ov.Core()

C = 3
H = 224
W = 224

model = core.read_model("model.xml")
model.reshape([(1, 10), C, H, W])

# compile model and create infer request
compiled_model = core.compile_model(model, "GPU")
infer_request = compiled_model.create_infer_request()

# create input tensor with specific batch size
input_tensor = ov.Tensor(model.input().element_type, [2, C, H, W])

# ...

infer_request.infer([input_tensor])

#! [dynamic_batch]
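A short follow-up sketch (not part of the committed file; it assumes the same NCHW model and `(1, 10)` batch bound as above) illustrating that the batch size may vary between inference calls:

```python
import openvino.runtime as ov

core = ov.Core()
model = core.read_model("model.xml")
model.reshape([(1, 10), 3, 224, 224])
compiled_model = core.compile_model(model, "GPU")
infer_request = compiled_model.create_infer_request()

# Any batch size within the (1, 10) bound can be used, and it may differ per call.
for batch in (2, 7):
    tensor = ov.Tensor(model.input().element_type, [batch, 3, 224, 224])
    infer_request.infer([tensor])
```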
docs/snippets/gpu/preprocessing.py (new file, 18 lines)
@@ -0,0 +1,18 @@

# Copyright (C) 2022 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

#! [init_preproc]
from openvino.runtime import Core, Type, Layout
from openvino.preprocess import PrePostProcessor, ColorFormat

core = Core()
model = core.read_model("model.xml")

p = PrePostProcessor(model)
p.input().tensor().set_element_type(Type.u8) \
                  .set_color_format(ColorFormat.NV12_TWO_PLANES, ["y", "uv"]) \
                  .set_memory_type("GPU_SURFACE")
p.input().preprocess().convert_color(ColorFormat.BGR)
p.input().model().set_layout(Layout("NCHW"))
model_with_preproc = p.build()
#! [init_preproc]