update NV12 (#15370)
* update NV12 docs and snippets: add single-plane input information, create single-plane cpp snippet, menu fix, update formatting for sphinx directives
* additional snippet fixes

Co-authored-by: Ilya Churaev <ilyachur@gmail.com>
Co-authored-by: Vladimir Paramuzov <vladimir.paramuzov@intel.com>
parent 4dff2d1c60
commit f9a8d9132d
@@ -222,11 +222,11 @@ The GPU plugin has the following additional preprocessing options:
 @sphinxtabset

 @sphinxtab{C++}

-@snippet docs/snippets/gpu/preprocessing.cpp init_preproc
+@snippet docs/snippets/gpu/preprocessing_nv12_two_planes.cpp init_preproc

 @endsphinxtab

 @sphinxtab{Python}

-@snippet docs/snippets/gpu/preprocessing.py init_preproc
+@snippet docs/snippets/gpu/preprocessing_nv12_two_planes.py init_preproc

 @endsphinxtab

 @endsphinxtabset
@@ -3,8 +3,11 @@
-The GPU plugin implementation of the `ov::RemoteContext` and `ov::RemoteTensor` interfaces supports GPU
-pipeline developers who need video memory sharing and interoperability with existing native APIs,
-such as OpenCL, Microsoft DirectX, or VAAPI.
-Using these interfaces allows you to avoid any memory copy overhead when plugging OpenVINO™ inference
-into an existing GPU pipeline. It also enables OpenCL kernels to participate in the pipeline to become
+The `ov::RemoteContext` and `ov::RemoteTensor` interface implementation targets the need for memory sharing and
+interoperability with existing native APIs, such as OpenCL, Microsoft DirectX, and VAAPI.
+They allow you to avoid any memory copy overhead when plugging OpenVINO™ inference
+into an existing GPU pipeline. They also enable OpenCL kernels to participate in the pipeline to become
 native buffer consumers or producers of the OpenVINO™ inference.

 There are two interoperability scenarios supported by the Remote Tensor API:
@@ -23,7 +26,7 @@ and functions that consume or produce native handles directly.
 ## Context Sharing Between Application and GPU Plugin

 GPU plugin classes that implement the `ov::RemoteContext` interface are responsible for context sharing.
-Obtaining a context object is the first step of sharing pipeline objects.
+Obtaining a context object is the first step in sharing pipeline objects.
 The context object of the GPU plugin directly wraps OpenCL context, setting a scope for sharing the
 `ov::CompiledModel` and `ov::RemoteTensor` objects. The `ov::RemoteContext` object can be either created on top of
 an existing handle from a native API or retrieved from the GPU plugin.
@@ -37,60 +40,49 @@ additional parameter.
 To create the `ov::RemoteContext` object for user context, explicitly provide the context to the plugin using constructor for one
 of `ov::RemoteContext` derived classes.

-@sphinxtabset
+@sphinxdirective

-@sphinxtab{Linux}
+.. tab:: Linux

-@sphinxtabset
+   .. tab:: Create from cl_context

+      .. doxygensnippet:: docs/snippets/gpu/remote_objects_creation.cpp
+         :language: cpp
+         :fragment: context_from_cl_context

-@sphinxtab{Create from cl_context}
+   .. tab:: Create from cl_queue

-@snippet docs/snippets/gpu/remote_objects_creation.cpp context_from_cl_context
+      .. doxygensnippet:: docs/snippets/gpu/remote_objects_creation.cpp
+         :language: cpp
+         :fragment: context_from_cl_queue

-@endsphinxtab
+   .. tab:: Create from VADisplay

-@sphinxtab{Create from cl_queue}
+      .. doxygensnippet:: docs/snippets/gpu/remote_objects_creation.cpp
+         :language: cpp
+         :fragment: context_from_va_display

-@snippet docs/snippets/gpu/remote_objects_creation.cpp context_from_cl_queue
+.. tab:: Windows

-@endsphinxtab
+   .. tab:: Create from cl_context

-@sphinxtab{Create from VADisplay}
+      .. doxygensnippet:: docs/snippets/gpu/remote_objects_creation.cpp
+         :language: cpp
+         :fragment: context_from_cl_context

-@snippet docs/snippets/gpu/remote_objects_creation.cpp context_from_va_display
+   .. tab:: Create from cl_queue

-@endsphinxtab
+      .. doxygensnippet:: docs/snippets/gpu/remote_objects_creation.cpp
+         :language: cpp
+         :fragment: context_from_cl_queue

-@endsphinxtabset
-
-@endsphinxtab
-
-@sphinxtab{Windows}
-
-@sphinxtabset
-
-@sphinxtab{Create from cl_context}
-
-@snippet docs/snippets/gpu/remote_objects_creation.cpp context_from_cl_context
-
-@endsphinxtab
-
-@sphinxtab{Create from cl_queue}
-
-@snippet docs/snippets/gpu/remote_objects_creation.cpp context_from_cl_queue
-
-@endsphinxtab
-
-@sphinxtab{Create from ID3D11Device}
-
-@snippet docs/snippets/gpu/remote_objects_creation.cpp context_from_d3d_device
-
-@endsphinxtab
-
-@endsphinxtabset
-
-@endsphinxtabset
+   .. tab:: Create from ID3D11Device

+      .. doxygensnippet:: docs/snippets/gpu/remote_objects_creation.cpp
+         :language: cpp
+         :fragment: context_from_d3d_device

+@endsphinxdirective

 ### Getting RemoteContext from the Plugin
 If you do not provide any user context, the plugin uses its default internal context.
@@ -100,21 +92,21 @@ Once the plugin options have been changed, the internal context is replaced by t
 To request the current default context of the plugin, use one of the following methods:

-@sphinxtabset
+@sphinxdirective

-@sphinxtab{Get context from Core}
+.. tab:: Get context from Core

-@snippet docs/snippets/gpu/remote_objects_creation.cpp default_context_from_core
+   .. doxygensnippet:: docs/snippets/gpu/remote_objects_creation.cpp
+      :language: cpp
+      :fragment: default_context_from_core

-@endsphinxtab
+.. tab:: Get context from compiled model

-@sphinxtab{Batching via throughput hint}
+   .. doxygensnippet:: docs/snippets/gpu/remote_objects_creation.cpp
+      :language: cpp
+      :fragment: default_context_from_model

-@snippet docs/snippets/gpu/remote_objects_creation.cpp default_context_from_model
-
-@endsphinxtab
-
-@endsphinxtabset
+@endsphinxdirective

 ## Memory Sharing Between Application and GPU Plugin
@@ -126,108 +118,153 @@ of the `ov::RemoteContext` sub-classes.
 `ov::intel_gpu::ocl::ClContext` has multiple overloads of `create_tensor` methods which allow to wrap pre-allocated native handles with the `ov::RemoteTensor`
 object or request plugin to allocate specific device memory. For more details, see the code snippets below:

-@sphinxtabset
+@sphinxdirective

-@sphinxtab{Wrap native handles}
+.. tab:: Wrap native handles

-@sphinxtabset
+   .. tab:: USM pointer

+      .. doxygensnippet:: docs/snippets/gpu/remote_objects_creation.cpp
+         :language: cpp
+         :fragment: wrap_usm_pointer

-@sphinxtab{USM pointer}
+   .. tab:: cl_mem

+      .. doxygensnippet:: docs/snippets/gpu/remote_objects_creation.cpp
+         :language: cpp
+         :fragment: wrap_cl_mem

-@snippet docs/snippets/gpu/remote_objects_creation.cpp wrap_usm_pointer
+   .. tab:: cl::Buffer

+      .. doxygensnippet:: docs/snippets/gpu/remote_objects_creation.cpp
+         :language: cpp
+         :fragment: wrap_cl_buffer

-@endsphinxtab
+   .. tab:: cl::Image2D

+      .. doxygensnippet:: docs/snippets/gpu/remote_objects_creation.cpp
+         :language: cpp
+         :fragment: wrap_cl_image

-@sphinxtab{cl_mem}
+   .. tab:: biplanar NV12 surface

+      .. doxygensnippet:: docs/snippets/gpu/remote_objects_creation.cpp
+         :language: cpp
+         :fragment: wrap_nv12_surface

-@snippet docs/snippets/gpu/remote_objects_creation.cpp wrap_cl_mem
+.. tab:: Allocate device memory

-@endsphinxtab
+   .. tab:: USM host memory

+      .. doxygensnippet:: docs/snippets/gpu/remote_objects_creation.cpp
+         :language: cpp
+         :fragment: allocate_usm_host

-@sphinxtab{cl::Buffer}
+   .. tab:: USM device memory

+      .. doxygensnippet:: docs/snippets/gpu/remote_objects_creation.cpp
+         :language: cpp
+         :fragment: allocate_usm_device

-@snippet docs/snippets/gpu/remote_objects_creation.cpp wrap_cl_buffer
+   .. tab:: cl::Buffer

+      .. doxygensnippet:: docs/snippets/gpu/remote_objects_creation.cpp
+         :language: cpp
+         :fragment: allocate_cl_buffer

-@endsphinxtab
-
-@sphinxtab{cl::Image2D}
-
-@snippet docs/snippets/gpu/remote_objects_creation.cpp wrap_cl_image
-
-@endsphinxtab
-
-@sphinxtab{biplanar NV12 surface}
-
-@snippet docs/snippets/gpu/remote_objects_creation.cpp wrap_nv12_surface
-
-@endsphinxtab
-
-@endsphinxtabset
-@endsphinxtab
-
-@sphinxtab{Allocate device memory}
-
-@sphinxtabset
-
-@sphinxtab{USM host memory}
-
-@snippet docs/snippets/gpu/remote_objects_creation.cpp allocate_usm_host
-
-@endsphinxtab
-
-@sphinxtab{USM device memory}
-
-@snippet docs/snippets/gpu/remote_objects_creation.cpp allocate_usm_device
-
-@endsphinxtab
-
-@sphinxtab{cl::Buffer}
-
-@snippet docs/snippets/gpu/remote_objects_creation.cpp allocate_cl_buffer
-
-@endsphinxtab
-
-@endsphinxtabset
-
-@endsphinxtab
-
-@endsphinxtabset
+@endsphinxdirective

 The `ov::intel_gpu::ocl::D3DContext` and `ov::intel_gpu::ocl::VAContext` classes are derived from `ov::intel_gpu::ocl::ClContext`.
 Therefore, they provide the functionality described above and extend it
 to allow creation of `ov::RemoteTensor` objects from `ID3D11Buffer`, `ID3D11Texture2D` pointers or the `VASurfaceID` handle respectively.

 ## Direct NV12 Video Surface Input

-To support the direct consumption of a hardware video decoder output, the plugin accepts two-plane video
-surfaces as arguments for the `create_tensor_nv12()` function, which creates a pair of `ov::RemoteTensor`
-objects which represent the Y and UV planes.
+To support the direct consumption of a hardware video decoder output, the GPU plugin accepts:

-To ensure that the plugin generates the correct execution graph for the NV12 dual-plane input, static preprocessing
+* Two-plane NV12 video surface input - calling the `create_tensor_nv12()` function creates
+  a pair of `ov::RemoteTensor` objects, representing the Y and UV planes.
+* Single-plane NV12 video surface input - calling the `create_tensor()` function creates one
+  `ov::RemoteTensor` object, representing the Y and UV planes at once (Y elements before UV elements).
+* NV12 to Grey video surface input conversion - calling the `create_tensor()` function creates one
+  `ov::RemoteTensor` object, representing only the Y plane.
+
+To ensure that the plugin generates a correct execution graph, static preprocessing
 should be added before model compilation:

-@snippet snippets/gpu/preprocessing.cpp init_preproc
+@sphinxdirective

-Since the `ov::intel_gpu::ocl::ClImage2DTensor` and its derived classes do not support batched surfaces, if batching and surface sharing are required
-at the same time, inputs need to be set via the `ov::InferRequest::set_tensors` method with vector of shared surfaces for each plane:
+.. tab:: two-plane

-@sphinxtabset
+   .. doxygensnippet:: docs/snippets/gpu/preprocessing_nv12_two_planes.cpp
+      :language: cpp
+      :fragment: [init_preproc]

-@sphinxtab{Single batch}
+.. tab:: single-plane

-@snippet docs/snippets/gpu/preprocessing.cpp single_batch
+   .. doxygensnippet:: docs/snippets/gpu/preprocessing_nv12_single_plane.cpp
+      :language: cpp
+      :fragment: [init_preproc]

-@endsphinxtab
+.. tab:: NV12 to Grey

-@sphinxtab{Multiple batches}
+   .. doxygensnippet:: docs/snippets/gpu/preprocessing_nv12_to_gray.cpp
+      :language: cpp
+      :fragment: [init_preproc]

-@snippet docs/snippets/gpu/preprocessing.cpp batched_case
+@endsphinxdirective

-@endsphinxtab
-
-@endsphinxtabset
+Since the `ov::intel_gpu::ocl::ClImage2DTensor` and its derived classes do not support batched surfaces,
+if batching and surface sharing are required at the same time,
+inputs need to be set via the `ov::InferRequest::set_tensors` method with vector of shared surfaces for each plane:
+
+@sphinxdirective
+
+.. tab:: Single Batch
+
+   .. tab:: two-plane
+
+      .. doxygensnippet:: docs/snippets/gpu/preprocessing_nv12_two_planes.cpp
+         :language: cpp
+         :fragment: single_batch
+
+   .. tab:: single-plane
+
+      .. doxygensnippet:: docs/snippets/gpu/preprocessing_nv12_single_plane.cpp
+         :language: cpp
+         :fragment: single_batch
+
+   .. tab:: NV12 to Grey
+
+      .. doxygensnippet:: docs/snippets/gpu/preprocessing_nv12_to_gray.cpp
+         :language: cpp
+         :fragment: single_batch
+
+.. tab:: Multiple Batches
+
+   .. tab:: two-plane
+
+      .. doxygensnippet:: docs/snippets/gpu/preprocessing_nv12_two_planes.cpp
+         :language: cpp
+         :fragment: batched_case
+
+   .. tab:: single-plane
+
+      .. doxygensnippet:: docs/snippets/gpu/preprocessing_nv12_single_plane.cpp
+         :language: cpp
+         :fragment: batched_case
+
+   .. tab:: NV12 to Grey
+
+      .. doxygensnippet:: docs/snippets/gpu/preprocessing_nv12_to_gray.cpp
+         :language: cpp
+         :fragment: batched_case
+
+@endsphinxdirective
+
+I420 color format can be processed in a similar way

 ## Context & Queue Sharing
@@ -242,18 +279,12 @@ This sharing mechanism allows performing pipeline synchronization on the app sid
 on waiting for the completion of inference. The pseudo-code may look as follows:

 @sphinxdirective
-.. raw:: html

-   <div class="collapsible-section" data-title="Queue and context sharing example">
+.. dropdown:: Queue and context sharing example

-@endsphinxdirective
-
-@snippet snippets/gpu/queue_sharing.cpp queue_sharing
-
-@sphinxdirective
-.. raw:: html
-
-   </div>
+   .. doxygensnippet:: docs/snippets/gpu/queue_sharing.cpp
+      :language: cpp
+      :fragment: queue_sharing

 @endsphinxdirective
@@ -282,60 +313,34 @@ For possible low-level properties and their description, refer to the `openvino/
 To see pseudo-code of usage examples, refer to the sections below.

-> **NOTE**: For low-level parameter usage examples, see the source code of user-side wrappers from the include files mentioned above.
-
 @sphinxdirective
-.. raw:: html
-
-   <div class="collapsible-section" data-title="OpenCL Kernel Execution on a Shared Buffer">
+.. NOTE::
+
+   For low-level parameter usage examples, see the source code of user-side wrappers from the include files mentioned above.
+
+.. dropdown:: OpenCL Kernel Execution on a Shared Buffer
+
+   This example uses the OpenCL context obtained from a compiled model object.
+
+   .. doxygensnippet:: docs/snippets/gpu/context_sharing.cpp
+      :language: cpp
+      :fragment: context_sharing_get_from_ov
+
+.. dropdown:: Running GPU Plugin Inference within User-Supplied Shared Context
+
+   .. doxygensnippet:: docs/snippets/gpu/context_sharing.cpp
+      :language: cpp
+      :fragment: context_sharing_user_handle
+
+.. dropdown:: Direct Consuming of the NV12 VAAPI Video Decoder Surface on Linux
+
+   .. doxygensnippet:: docs/snippets/gpu/context_sharing_va.cpp
+      :language: cpp
+      :fragment: context_sharing_va

 @endsphinxdirective

-This example uses the OpenCL context obtained from a compiled model object.
-
-@snippet snippets/gpu/context_sharing.cpp context_sharing_get_from_ov
-
-@sphinxdirective
-.. raw:: html
-
-   </div>
-
-@endsphinxdirective
-
-
-@sphinxdirective
-.. raw:: html
-
-   <div class="collapsible-section" data-title="Running GPU Plugin Inference within User-Supplied Shared Context">
-
-@endsphinxdirective
-
-@snippet snippets/gpu/context_sharing.cpp context_sharing_user_handle
-
-@sphinxdirective
-.. raw:: html
-
-   </div>
-
-@endsphinxdirective
-
-
-@sphinxdirective
-.. raw:: html
-
-   <div class="collapsible-section" data-title="Direct Consuming of the NV12 VAAPI Video Decoder Surface on Linux">
-
-@endsphinxdirective
-
-@snippet snippets/gpu/context_sharing_va.cpp context_sharing_va
-
-@sphinxdirective
-.. raw:: html
-
-   </div>
-
-@endsphinxdirective

 ## See Also
@@ -79,7 +79,7 @@ html_theme = "openvino_sphinx_theme"
 html_theme_path = ['_themes']

 html_theme_options = {
-    "navigation_depth": 6,
+    "navigation_depth": 8,
     "show_nav_level": 2,
     "use_edit_page_button": True,
     "github_url": "https://github.com/openvinotoolkit/openvino",
@@ -24,7 +24,9 @@ file(GLOB SOURCES "${CMAKE_CURRENT_SOURCE_DIR}/*.cpp"
 if (NOT TARGET OpenCL::OpenCL)
     list(REMOVE_ITEM SOURCES "${CMAKE_CURRENT_SOURCE_DIR}/gpu/context_sharing_va.cpp"
                              "${CMAKE_CURRENT_SOURCE_DIR}/gpu/context_sharing.cpp"
                              "${CMAKE_CURRENT_SOURCE_DIR}/gpu/preprocessing.cpp"
                              "${CMAKE_CURRENT_SOURCE_DIR}/gpu/preprocessing_nv12_two_planes.cpp"
+                             "${CMAKE_CURRENT_SOURCE_DIR}/gpu/preprocessing_nv12_single_plane.cpp"
+                             "${CMAKE_CURRENT_SOURCE_DIR}/gpu/preprocessing_nv12_to_gray.cpp"
                              "${CMAKE_CURRENT_SOURCE_DIR}/gpu/queue_sharing.cpp"
                              "${CMAKE_CURRENT_SOURCE_DIR}/gpu/remote_objects_creation.cpp")
 endif()
docs/snippets/gpu/preprocessing_nv12_single_plane.cpp (new file, 48 lines)
@@ -0,0 +1,48 @@
#include <openvino/runtime/core.hpp>
#define OV_GPU_USE_OPENCL_HPP
#include <openvino/runtime/intel_gpu/ocl/ocl.hpp>
#include <openvino/runtime/intel_gpu/properties.hpp>
#include <openvino/core/preprocess/pre_post_process.hpp>

ov::intel_gpu::ocl::ClImage2DTensor get_yuv_tensor();

int main() {
    ov::Core core;
    auto model = core.read_model("model.xml");

    //! [init_preproc]
    using namespace ov::preprocess;
    auto p = PrePostProcessor(model);
    p.input().tensor().set_element_type(ov::element::u8)
                      .set_color_format(ColorFormat::NV12_SINGLE_PLANE)
                      .set_memory_type(ov::intel_gpu::memory_type::surface);
    p.input().preprocess().convert_color(ov::preprocess::ColorFormat::BGR);
    p.input().model().set_layout("NCHW");
    auto model_with_preproc = p.build();
    //! [init_preproc]

    auto compiled_model = core.compile_model(model_with_preproc, "GPU");
    auto context = compiled_model.get_context().as<ov::intel_gpu::ocl::ClContext>();
    auto infer_request = compiled_model.create_infer_request();

    {
        //! [single_batch]
        auto input_yuv = model_with_preproc->input(0);
        ov::intel_gpu::ocl::ClImage2DTensor yuv_tensor = get_yuv_tensor();
        infer_request.set_tensor(input_yuv.get_any_name(), yuv_tensor);
        infer_request.infer();
        //! [single_batch]
    }

    {
        auto yuv_tensor_0 = get_yuv_tensor();
        auto yuv_tensor_1 = get_yuv_tensor();
        //! [batched_case]
        auto input_yuv = model_with_preproc->input(0);
        std::vector<ov::Tensor> yuv_tensors = {yuv_tensor_0, yuv_tensor_1};
        infer_request.set_tensors(input_yuv.get_any_name(), yuv_tensors);
        infer_request.infer();
        //! [batched_case]
    }
    return 0;
}
docs/snippets/gpu/preprocessing_nv12_to_gray.cpp (new file, 50 lines)
@@ -0,0 +1,50 @@
#define OV_GPU_USE_OPENCL_HPP
#include <openvino/runtime/intel_gpu/ocl/ocl.hpp>
#include <openvino/runtime/intel_gpu/properties.hpp>
#include <openvino/core/preprocess/pre_post_process.hpp>

ov::intel_gpu::ocl::ClImage2DTensor get_y_tensor();
ov::intel_gpu::ocl::ClImage2DTensor get_uv_tensor();

int main() {
    ov::Core core;
    auto model = core.read_model("model.xml");

    //! [init_preproc]
    using namespace ov::preprocess;
    auto p = PrePostProcessor(model);
    p.input().tensor().set_element_type(ov::element::u8)
                      .set_layout("NHWC")
                      .set_memory_type(ov::intel_gpu::memory_type::surface);
    p.input().model().set_layout("NCHW");
    auto model_with_preproc = p.build();
    //! [init_preproc]

    auto compiled_model = core.compile_model(model_with_preproc, "GPU");
    auto remote_context = compiled_model.get_context().as<ov::intel_gpu::ocl::ClContext>();
    auto input = model->input(0);
    auto infer_request = compiled_model.create_infer_request();

    {
        //! [single_batch]
        cl::Image2D img_y_plane;
        auto input_y = model_with_preproc->input(0);
        auto remote_y_tensor = remote_context.create_tensor(input_y.get_element_type(), input.get_shape(), img_y_plane);
        infer_request.set_tensor(input_y.get_any_name(), remote_y_tensor);
        infer_request.infer();
        //! [single_batch]
    }

    {
        //! [batched_case]
        cl::Image2D img_y_plane_0, img_y_plane_1;
        auto input_y = model_with_preproc->input(0);
        auto remote_y_tensor_0 = remote_context.create_tensor(input_y.get_element_type(), input.get_shape(), img_y_plane_0);
        auto remote_y_tensor_1 = remote_context.create_tensor(input_y.get_element_type(), input.get_shape(), img_y_plane_1);
        std::vector<ov::Tensor> y_tensors = {remote_y_tensor_0, remote_y_tensor_1};
        infer_request.set_tensors(input_y.get_any_name(), y_tensors);
        infer_request.infer();
        //! [batched_case]
    }
    return 0;
}