from openvino.runtime import Core

core = Core()
model = core.read_model(model="sample.xml")

# [compile_model]
config = {"PERFORMANCE_HINT": "THROUGHPUT"}
compiled_model = core.compile_model(model, "GPU", config)
# [compile_model]
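
# Hedged addition (not part of the original snippet): the hint that was applied
# can be read back from the compiled model via get_property(), the same call
# used for OPTIMAL_NUMBER_OF_INFER_REQUESTS further below.
print(compiled_model.get_property("PERFORMANCE_HINT"))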

# [compile_model_no_auto_batching]
# disabling the automatic batching,
# while leaving intact the other configuration options that the device selects for the 'throughput' hint
config = {"PERFORMANCE_HINT": "THROUGHPUT",
          "ALLOW_AUTO_BATCHING": False}
compiled_model = core.compile_model(model, "GPU", config)
# [compile_model_no_auto_batching]
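
# Hedged alternative (not part of the original snippet): batching can also be
# requested explicitly through the virtual "BATCH" device with a fixed batch
# size, e.g. 4; the device string follows the Automatic Batching docs.
compiled_model_batched = core.compile_model(model, "BATCH:GPU(4)")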

# [query_optimal_num_requests]
# when the batch size is selected automatically by the implementation,
# it is important to query for, create, and run a sufficient number of requests
config = {"PERFORMANCE_HINT": "THROUGHPUT"}
compiled_model = core.compile_model(model, "GPU", config)
num_requests = compiled_model.get_property("OPTIMAL_NUMBER_OF_INFER_REQUESTS")
# [query_optimal_num_requests]
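
# Hedged usage sketch (not part of the original snippet): one way to create and
# run the queried number of requests is AsyncInferQueue; the zero-filled input
# below is a hypothetical stand-in and assumes a single static-shaped input.
import numpy as np
from openvino.runtime import AsyncInferQueue

input_data = np.zeros(tuple(compiled_model.input(0).shape), dtype=np.float32)
infer_queue = AsyncInferQueue(compiled_model, num_requests)
for _ in range(num_requests):
    infer_queue.start_async({0: input_data})
infer_queue.wait_all()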

# [hint_num_requests]
# limiting the available parallel slack for the 'throughput' hint,
# so that certain parameters (like the selected batch size) are adjusted accordingly
config = {"PERFORMANCE_HINT": "THROUGHPUT",
          "PERFORMANCE_HINT_NUM_REQUESTS": "4"}
compiled_model = core.compile_model(model, "GPU", config)
# [hint_num_requests]
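
# Hedged addition (not part of the original snippet): the effect of the cap can
# be observed by querying the optimal number of requests again; whether it
# matches the hinted value exactly is device-dependent (an assumption here).
print(compiled_model.get_property("OPTIMAL_NUMBER_OF_INFER_REQUESTS"))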

# [hint_plus_low_level]
# combining the hint with an explicit low-level setting:
# the number of CPU inference threads is capped, while the rest of the
# 'throughput' configuration is still derived automatically
config = {"PERFORMANCE_HINT": "THROUGHPUT",
          "INFERENCE_NUM_THREADS": "4"}
compiled_model = core.compile_model(model, "CPU", config)
# [hint_plus_low_level]
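
# Hedged end-to-end usage sketch (not part of the original snippet): a single
# synchronous inference with the compiled model; the zero-filled input is a
# hypothetical stand-in and assumes one static-shaped model input.
import numpy as np

request = compiled_model.create_infer_request()
input_data = np.zeros(tuple(compiled_model.input(0).shape), dtype=np.float32)
results = request.infer({0: input_data})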