openvino/docs/snippets/ov_python_inference.py
Anastasia Kuporosova 2bf8d910f6 [Docs][PyOV] update python snippets (#19367)

Co-authored-by: p-wysocki <przemyslaw.wysocki@intel.com>
Co-authored-by: Jan Iwaszkiewicz <jan.iwaszkiewicz@intel.com>
Co-authored-by: Karol Blaszczak <karol.blaszczak@intel.com>
2023-09-13 21:05:24 +02:00


# Copyright (C) 2018-2023 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
import numpy as np
import openvino as ov
import openvino.runtime.opset12 as ops
INPUT_SIZE = 1_000_000  # Use bigger values if necessary, e.g. 300_000_000
input_0 = ops.parameter([INPUT_SIZE], name="input_0")
input_1 = ops.parameter([INPUT_SIZE], name="input_1")
add_inputs = ops.add(input_0, input_1)
res = ops.reduce_sum(add_inputs, reduction_axes=0, name="reduced")
model = ov.Model(res, [input_0, input_1], name="my_model")
model.outputs[0].tensor.set_names({"reduced_result"}) # Add name for Output
core = ov.Core()
compiled_model = core.compile_model(model, device_name="CPU")
data_0 = np.array([0.1] * INPUT_SIZE, dtype=np.float32)
data_1 = np.array([-0.1] * INPUT_SIZE, dtype=np.float32)
data_2 = np.array([0.2] * INPUT_SIZE, dtype=np.float32)
data_3 = np.array([-0.2] * INPUT_SIZE, dtype=np.float32)
#! [direct_inference]
# Calling the CompiledModel directly creates and caches an InferRequest object
results_0 = compiled_model({"input_0": data_0, "input_1": data_1})
# The second call reuses the previously created InferRequest object
results_1 = compiled_model({"input_0": data_2, "input_1": data_3})
#! [direct_inference]
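# Added cross-check sketch, not part of the original snippet: for this toy
# model the result can be verified with plain NumPy, since the model computes
# reduce_sum(input_0 + input_1, axis=0) -- a single scalar.
expected = np.sum(data_0 + data_1)
# 0.1 + (-0.1) cancels exactly in float32, so the reference value is 0.0
assert expected == 0.0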
request = compiled_model.create_infer_request()
#! [shared_memory_inference]
# Data can be shared on inputs only
_ = compiled_model({"input_0": data_0, "input_1": data_1}, share_inputs=True)
_ = request.infer({"input_0": data_0, "input_1": data_1}, share_inputs=True)
# Data can be shared on outputs only
_ = request.infer({"input_0": data_0, "input_1": data_1}, share_outputs=True)
# Both flags can also be combined to achieve the desired behavior
_ = compiled_model({"input_0": data_0, "input_1": data_1}, share_inputs=False, share_outputs=True)
#! [shared_memory_inference]
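# Added analogy sketch, not part of the original snippet: conceptually,
# share_inputs/share_outputs switch from copy to zero-copy semantics,
# much like a NumPy copy vs. a NumPy view (names below are illustrative).
buf = np.zeros(4, dtype=np.float32)
copied = np.array(buf, copy=True)  # copy semantics: owns its own buffer
shared = buf.view()                # zero-copy: aliases the source buffer
buf[0] = 1.0
assert copied[0] == 0.0 and shared[0] == 1.0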
time_in_sec = 2.0
#! [hiding_latency]
import time
# Long-running function simulating other host-side work
def run(time_in_sec):
    time.sleep(time_in_sec)
# No latency hiding
results = request.infer({"input_0": data_0, "input_1": data_1})[0]
run(time_in_sec)
# Hiding latency
request.start_async({"input_0": data_0, "input_1": data_1})
run(time_in_sec)
request.wait()
results = request.get_output_tensor(0).data # Gather data from InferRequest
#! [hiding_latency]
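# Added analogy sketch, not part of the original snippet: a pure-Python
# illustration of the latency hiding above. With start_async/wait, host work
# overlaps device work, so total time approaches the max rather than the sum.
import threading

def device_work():  # illustrative stand-in for start_async(...) + wait()
    time.sleep(0.2)

t0 = time.perf_counter()
device_work()
run(0.2)                        # sequential: roughly 0.4 s total
sequential = time.perf_counter() - t0

t0 = time.perf_counter()
worker = threading.Thread(target=device_work)
worker.start()                  # analogous to request.start_async(...)
run(0.2)                        # host work overlaps the "device" work
worker.join()                   # analogous to request.wait()
overlapped = time.perf_counter() - t0
assert overlapped < sequential  # roughly 0.2 s vs. 0.4 s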
#! [no_return_inference]
# Standard approach
results = request.infer({"input_0": data_0, "input_1": data_1})[0]
# "Postponed Return" approach
request.start_async({"input_0": data_0, "input_1": data_1})
request.wait()
results = request.get_output_tensor(0).data # Gather data "on demand" from InferRequest
#! [no_return_inference]