[PYTHON API] release GIL (#10810)

* AsyncInferQueue nogil update + refactoring
* nogil in compiled model
* nogil in Core
* fix refactoring
* nogil in infer_request
* add tests
* Fix code style
* update test with incrementing reference counting
* try to fix code style
* fix code style
* release gil in reshape and preprocessing
* make args optional in test
* fix code style
* add docs about GIL
* try to link doc string with docs
* Apply suggestions from code review

Co-authored-by: Jan Iwaszkiewicz <jan.iwaszkiewicz@intel.com>

* Fix docs
* docs refactoring
* Apply review comments
* Fix code style

Co-authored-by: Jan Iwaszkiewicz <jan.iwaszkiewicz@intel.com>
Co-authored-by: Anastasia Kuporosova <anastasia.kuporosova@intel.com>
2022-03-31 16:12:48 +03:00
parent 1d247815be
commit 1efb0a034f
9 changed files with 421 additions and 67 deletions
--- a/docs/OV_Runtime_UG/Python_API_exclusives.md
+++ b/docs/OV_Runtime_UG/Python_API_exclusives.md
@@ -75,3 +75,36 @@ Another feature of `AsyncInferQueue` is ability of setting callbacks. When callb
 The callback of `AsyncInferQueue` is uniform for every job. When executed, GIL is acquired to ensure safety of data manipulation inside the function.

@snippet docs/snippets/ov_python_exclusives.py asyncinferqueue_set_callback
+
+
+### Releasing the GIL
+
+Some functions in the Python API release the Global Interpreter Lock (GIL) while running work-intensive code. This can help you achieve more parallelism in your application when using Python threads. For more information about the GIL, please refer to the Python documentation.
+
+@snippet docs/snippets/ov_python_exclusives.py releasing_gil
+
+> **NOTE**: While the GIL is released, functions can still modify and/or operate on Python objects in C++. Hence, there is no reference counting. The user is responsible for thread safety if these objects are shared with another thread. This can affect your code only if multiple threads are spawned in Python.
+
+#### List of functions that release the GIL
+- openvino.runtime.AsyncInferQueue.start_async
+- openvino.runtime.AsyncInferQueue.is_ready
+- openvino.runtime.AsyncInferQueue.wait_all
+- openvino.runtime.AsyncInferQueue.get_idle_request_id
+- openvino.runtime.CompiledModel.create_infer_request
+- openvino.runtime.CompiledModel.infer_new_request
+- openvino.runtime.CompiledModel.__call__
+- openvino.runtime.CompiledModel.export
+- openvino.runtime.CompiledModel.get_runtime_model
+- openvino.runtime.Core.compile_model
+- openvino.runtime.Core.read_model
+- openvino.runtime.Core.import_model
+- openvino.runtime.Core.query_model
+- openvino.runtime.Core.get_available_devices
+- openvino.runtime.InferRequest.infer
+- openvino.runtime.InferRequest.start_async
+- openvino.runtime.InferRequest.wait
+- openvino.runtime.InferRequest.wait_for
+- openvino.runtime.InferRequest.get_profiling_info
+- openvino.runtime.InferRequest.query_state
+- openvino.runtime.Model.reshape
+- openvino.preprocess.PrePostProcessor.build
--- a/docs/snippets/ov_python_exclusives.py
+++ b/docs/snippets/ov_python_exclusives.py
@@ -131,3 +131,39 @@ infer_queue.wait_all()

 assert all(data_done)
 #! [asyncinferqueue_set_callback]
+
+#! [releasing_gil]
+import cv2 as cv
+import numpy as np
+import openvino.runtime as ov
+from threading import Thread
+
+input_data = []
+
+# Input data is prepared in a separate thread while the main thread
+# compiles the model and creates the infer request.
+def prepare_data(input, image_path):
+    image = cv.imread(image_path)
+    h, w = list(input.shape)[-2:]
+    image = cv.resize(image, (w, h))  # cv.resize expects (width, height)
+    image = image.transpose((2, 0, 1))
+    image = np.expand_dims(image, 0)
+    input_data.append(image)
+
+core = ov.Core()
+model = core.read_model("model.xml")
+# Create thread with prepare_data function as target and start it
+thread = Thread(target=prepare_data, args=[model.input(), "path/to/image"])
+thread.start()
+# The GIL will be released in compile_model.
+# This allows the thread above to start its job
+# while the main thread is busy compiling the model.
+compiled = core.compile_model(model, "GPU")
+# After returning from compile_model, the main thread acquires the GIL
+# and starts create_infer_request which releases it once again.
+request = compiled.create_infer_request()
+# Join the thread to make sure the input_data is ready
+thread.join()
+# Run the inference
+request.infer(input_data)
+#! [releasing_gil]
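
The note on thread safety in the docs above can be illustrated without OpenVINO installed. The following is a minimal sketch, not part of this commit: `gil_releasing_call` is a hypothetical stand-in for a call such as `compile_model` (it uses `time.sleep`, which also releases the GIL), and the lists put on the queue are placeholders for preprocessed images. Instead of appending to a shared list, the worker thread hands its results over through `queue.Queue`, which is thread-safe by design.

```python
import threading
import time
from queue import Queue


def prepare_data(out_queue, n_items):
    """Worker thread: produce inputs and hand them over via a thread-safe queue."""
    for i in range(n_items):
        # Placeholder for a preprocessed image (assumption for illustration).
        out_queue.put([i, i + 1])


def gil_releasing_call(duration=0.05):
    """Hypothetical stand-in for compile_model: time.sleep releases the GIL."""
    time.sleep(duration)
    return "compiled"


def run(n_items=3):
    handoff = Queue()
    worker = threading.Thread(target=prepare_data, args=(handoff, n_items))
    worker.start()
    # While this call blocks without holding the GIL, the worker can run.
    compiled = gil_releasing_call()
    # Join before reading, so that all items are guaranteed to be present.
    worker.join()
    items = [handoff.get() for _ in range(n_items)]
    return compiled, items


if __name__ == "__main__":
    print(run())
```

With a single producer the queue preserves insertion order, so the main thread receives the items in the order they were prepared. A plain list guarded by `threading.Lock` would work as well; the queue simply keeps the locking out of user code.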