Docs: model caching page update according to OpenVINO API 2.0 (#10981)

This commit is contained in:
Mikhail Nosov 2022-03-16 12:22:33 +03:00 committed by GitHub
parent 2687f6fb2e
commit 7cea7dd4e6
9 changed files with 184 additions and 158 deletions

View File

@ -1,59 +1,95 @@
# Model Caching Overview {#openvino_docs_IE_DG_Model_caching_overview}
## Introduction (C++)
## Introduction
@sphinxdirective
.. raw:: html
As described in the [Integrate OpenVINO™ with Your Application](integrate_with_your_application.md), a common application flow consists of the following steps:
<div id="switcher-cpp" class="switcher-anchor">C++</div>
@endsphinxdirective
1. **Create a Core object**: First step to manage available devices and read model objects
As described in the [OpenVINO™ Runtime User Guide](openvino_intro.md), a common application flow consists of the following steps:
1. **Create a Core object**: First step to manage available devices and read network objects
2. **Read the Intermediate Representation**: Read an Intermediate Representation file into an object of the `InferenceEngine::CNNNetwork`
2. **Read the Intermediate Representation**: Read an Intermediate Representation file into an object of the `ov::Model`
3. **Prepare inputs and outputs**: If needed, manipulate precision, memory layout, size or color format
4. **Set configuration**: Pass device-specific loading configurations to the device
5. **Compile and Load Network to device**: Use the `InferenceEngine::Core::LoadNetwork()` method with a specific device
5. **Compile and Load Network to device**: Use the `ov::Core::compile_model()` method with a specific device
6. **Set input data**: Specify input blob
6. **Set input data**: Specify input tensor
7. **Execute**: Carry out inference and process results
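For orientation, the flow above maps onto the Python API 2.0 roughly as follows. This is a minimal sketch: the model path, device name, and input handling are placeholders, and the exact inference calls can differ depending on the model and OpenVINO version.

```python
import numpy as np
from openvino.runtime import Core

core = Core()                                                        # Step 1: create a Core object
model = core.read_model(model='/tmp/myModel.xml')                    # Step 2: read the IR into a model object
# Steps 3-4: adjust inputs/outputs and prepare device-specific configuration here, if needed
compiled_model = core.compile_model(model=model, device_name='CPU')  # Step 5: compile the model for a device

infer_request = compiled_model.create_infer_request()
# Step 6: placeholder input, assuming a static input shape and float32 precision
input_data = np.zeros(list(compiled_model.input(0).shape), dtype=np.float32)
infer_request.infer({0: input_data})                                 # Step 7: run inference
result = infer_request.get_output_tensor(0).data
```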
Step 5 can potentially perform several time-consuming device-specific optimizations and network compilations,
and such delays can lead to a bad user experience on application startup. To avoid this, some devices offer
import/export network capability, and it is possible to either use the [Compile tool](../../tools/compile_tool/README.md)
or enable model caching to export compiled network automatically. Reusing cached networks can significantly reduce load network time.
or enable model caching to export the compiled model automatically. Reusing cached models can significantly reduce model compilation time.
### Set "CACHE_DIR" config option to enable model caching
### Set "cache_dir" config option to enable model caching
To enable model caching, the application must specify a folder to store cached blobs, which is done like this:
@snippet snippets/InferenceEngine_Caching0.cpp part0
@sphinxdirective
With this code, if the device specified by `LoadNetwork` supports import/export network capability, a cached blob is automatically created inside the `myCacheFolder` folder.
CACHE_DIR config is set to the Core object. If the device does not support import/export capability, cache is not created and no error is thrown.
.. tab:: C++
Depending on your device, total time for loading network on application startup can be significantly reduced.
Also note that the very first LoadNetwork (when cache is not yet created) takes slightly longer time to "export" the compiled blob into a cache file:
.. doxygensnippet:: docs/snippets/ov_caching.cpp
:language: cpp
:fragment: [ov:caching:part0]
.. tab:: Python
.. doxygensnippet:: docs/snippets/ov_caching.py
:language: python
:fragment: [ov:caching:part0]
@endsphinxdirective
With this code, if the device specified by `device_name` supports the import/export model capability, a cached blob is automatically created inside the `/path/to/cache/dir` folder.
If the device does not support the import/export capability, the cache is not created and no error is thrown.
Depending on your device, the total time for compiling a model on application startup can be significantly reduced.
Also note that the very first `compile_model` call (when the cache is not yet created) takes slightly longer, since it needs to "export" the compiled blob into a cache file:
![caching_enabled]
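The effect can be observed by timing the first and subsequent compilations. Below is a rough sketch: the cache directory, model path, and `GNA` device are placeholders, the timings depend on the device and model, and in a real application the benefit shows up across process runs rather than within one:

```python
import time
from openvino.runtime import Core

xml_path = '/tmp/myModel.xml'
device_name = 'GNA'  # placeholder; the device must support import/export for caching to take effect

core = Core()
core.set_property({'CACHE_DIR': '/path/to/cache/dir'})

start = time.perf_counter()
core.compile_model(model_path=xml_path, device_name=device_name)  # first call: compiles and exports to cache
cold = time.perf_counter() - start

start = time.perf_counter()
core.compile_model(model_path=xml_path, device_name=device_name)  # later calls/runs: imported from cache
warm = time.perf_counter() - start

print(f'first compile: {cold:.2f} s, cached compile: {warm:.2f} s')
```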
### Even faster: use LoadNetwork(modelPath)
### Even faster: use compile_model(modelPath)
In some cases, applications do not need to customize inputs and outputs every time. Such an application always
call `cnnNet = ie.ReadNetwork(...)`, then `ie.LoadNetwork(cnnNet, ..)` and it can be further optimized.
For these cases, the 2021.4 release introduces a more convenient API to load the network in a single call, skipping the export step:
In some cases, applications do not need to customize inputs and outputs every time. Such applications always
call `model = core.read_model(...)`, then `core.compile_model(model, ..)`, and this flow can be further optimized.
For these cases, there is a more convenient API that compiles the model in a single call, skipping the read step:
@snippet snippets/InferenceEngine_Caching1.cpp part1
@sphinxdirective
With model caching enabled, total load time is even smaller, if ReadNetwork is optimized as well.
.. tab:: C++
@snippet snippets/InferenceEngine_Caching2.cpp part2
.. doxygensnippet:: docs/snippets/ov_caching.cpp
:language: cpp
:fragment: [ov:caching:part1]
.. tab:: Python
.. doxygensnippet:: docs/snippets/ov_caching.py
:language: python
:fragment: [ov:caching:part1]
@endsphinxdirective
With model caching enabled, the total load time is even smaller, since the `read_model` step is optimized as well.
@sphinxdirective
.. tab:: C++
.. doxygensnippet:: docs/snippets/ov_caching.cpp
:language: cpp
:fragment: [ov:caching:part2]
.. tab:: Python
.. doxygensnippet:: docs/snippets/ov_caching.py
:language: python
:fragment: [ov:caching:part2]
@endsphinxdirective
![caching_times]
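As a usage sketch, the two settings shown above can be wrapped into a small startup helper. The helper name, default cache path, and device are illustrative, not part of the API:

```python
from openvino.runtime import Core

def load_compiled_model(xml_path, device_name, cache_dir='/path/to/cache/dir'):
    """Compile a model directly from its file path, reusing the cache when one exists."""
    core = Core()
    core.set_property({'CACHE_DIR': cache_dir})  # enable model caching
    # Single call: on a cache hit, both reading and compiling the model can be skipped.
    return core.compile_model(model_path=xml_path, device_name=device_name)

compiled_model = load_compiled_model('/tmp/myModel.xml', 'GNA')  # 'GNA' is a placeholder device
```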
@ -62,74 +98,23 @@ With model caching enabled, total load time is even smaller, if ReadNetwork is o
Not every device supports network import/export capability. For those that don't, enabling caching has no effect.
To check in advance if a particular device supports model caching, your application can use the following code:
@snippet snippets/InferenceEngine_Caching3.cpp part3
## Introduction (Python)
@sphinxdirective
.. raw:: html
<div id="switcher-python" class="switcher-anchor">Python</div>
.. tab:: C++
.. doxygensnippet:: docs/snippets/ov_caching.cpp
:language: cpp
:fragment: [ov:caching:part3]
.. tab:: Python
.. doxygensnippet:: docs/snippets/ov_caching.py
:language: python
:fragment: [ov:caching:part3]
@endsphinxdirective
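Building on this check, caching can be enabled only when the device reports support. A sketch with placeholder paths and device name is shown below; note that the check is optional, because setting `CACHE_DIR` for a device without import/export support is simply ignored:

```python
from openvino.runtime import Core

core = Core()
device_name = 'GNA'  # placeholder device

# 'EXPORT_IMPORT' among the device capabilities means compiled models can be cached
capabilities = core.get_property(device_name, 'OPTIMIZATION_CAPABILITIES')
if 'EXPORT_IMPORT' in capabilities:
    core.set_property({'CACHE_DIR': '/path/to/cache/dir'})

compiled_model = core.compile_model(model_path='/tmp/myModel.xml', device_name=device_name)
```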
As described in OpenVINO User Guide, a common application flow consists of the following steps:
1. **Create a Core Object**
2. **Read the Intermediate Representation** - Read an Intermediate Representation file into an object of the [ie_api.IENetwork](api/ie_python_api/_autosummary/openvino.inference_engine.IENetwork.html)
3. **Prepare inputs and outputs**
4. **Set configuration** - Pass device-specific loading configurations to the device
5. **Compile and Load Network to device** - Use the `IECore.load_network()` method and specify the target device
6. **Set input data**
7. **Execute the model** - Run inference
Step #5 can potentially perform several time-consuming device-specific optimizations and network compilations, and such delays can lead to bad user experience on application startup. To avoid this, some devices offer Import/Export network capability, and it is possible to either use the [Compile tool](../../tools/compile_tool/README.md) or enable model caching to export the compiled network automatically. Reusing cached networks can significantly reduce load network time.
### Set the “CACHE_DIR” config option to enable model caching
To enable model caching, the application must specify the folder where to store cached blobs. It can be done using [IECore.set_config](api/ie_python_api/_autosummary/openvino.inference_engine.IECore.html#openvino.inference_engine.IECore.set_config).
``` python
from openvino.inference_engine import IECore
ie = IECore()
ie.set_config(config={"CACHE_DIR": path_to_cache}, device_name=device)
net = ie.read_network(model=path_to_xml_file)
exec_net = ie.load_network(network=net, device_name=device)
```
With this code, if a device supports the Import/Export network capability, a cached blob is automatically created inside the path_to_cache directory when the `CACHE_DIR` config is set on the Core object. If the device does not support the Import/Export capability, the cache is simply not created and no error is thrown.
Depending on your device, total time for loading network on application startup can be significantly reduced. Please also note that very first [IECore.load_network](api/ie_python_api/_autosummary/openvino.inference_engine.IECore.html#openvino.inference_engine.IECore.load_network) (when the cache is not yet created) takes slightly longer time to export the compiled blob into a cache file.
![caching_enabled]
### Even Faster: Use IECore.load_network(path_to_xml_file)
In some cases, applications do not need to customize inputs and outputs every time. These applications always call [IECore.read_network](api/ie_python_api/_autosummary/openvino.inference_engine.IECore.html#openvino.inference_engine.IECore.read_network), then `IECore.load_network(model=path_to_xml_file)` and may be further optimized. For such cases, it's more convenient to load the network in a single call to `ie.load_network()`
A model can be loaded directly to the device, with model caching enabled:
``` python
from openvino.inference_engine import IECore
ie = IECore()
ie.set_config(config={"CACHE_DIR" : path_to_cache}, device_name=device)
ie.load_network(network=path_to_xml_file, device_name=device)
```
![caching_times]
### Advanced Examples
Not every device supports the network import/export capability; enabling caching for such devices has no effect. To check in advance if a particular device supports model caching, your application can use the following code:
```python
all_metrics = ie.get_metric(device_name=device, metric_name="SUPPORTED_METRICS")
# Find the 'IMPORT_EXPORT_SUPPORT' metric in supported metrics
allows_caching = "IMPORT_EXPORT_SUPPORT" in all_metrics
```
> **NOTE**: The GPU plugin does not have the IMPORT_EXPORT_SUPPORT capability, and does not support model caching yet. However, the GPU plugin supports caching kernels (see the [GPU plugin documentation](supported_plugins/GPU.md)). Kernel caching for the GPU plugin can be accessed the same way as model caching: by setting the `CACHE_DIR` configuration key to a folder where the cache should be stored.
> **NOTE**: The GPU plugin does not have the EXPORT_IMPORT capability, and does not support model caching yet. However, the GPU plugin supports caching kernels (see the [GPU plugin documentation](supported_plugins/GPU.md)). Kernel caching for the GPU plugin can be accessed the same way as model caching: by setting the `CACHE_DIR` configuration key to a folder where the cache should be stored.
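For the GPU case described in the note, the configuration looks the same; only the effect differs, since kernels rather than the whole compiled model are cached. Paths below are placeholders:

```python
from openvino.runtime import Core

core = Core()
core.set_property({'CACHE_DIR': '/path/to/cache/dir'})  # for GPU this enables kernel caching
compiled_model = core.compile_model(model_path='/tmp/myModel.xml', device_name='GPU')
```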
[caching_enabled]: ../img/caching_enabled.png

View File

@ -1,3 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:488a7a47e5086a6868c22219bc9d58a3508059e5a1dc470f2653a12552dea82f
size 36207
oid sha256:ecf560b08b921da29d59a3c1f6332d092a0575dd00cf59806dc801c32a10790f
size 120241

View File

@ -1,3 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:2eed189f9cb3d30fe13b4ba4515edd4e6da5d01545660e65fa8a33d945967281
size 28894
oid sha256:357483dd3460848e98489073cd9d58b5c8ada9ec3df4fbfd0956ba9e779f9c15
size 79843

View File

@ -1,17 +0,0 @@
#include <ie_core.hpp>

int main() {
    using namespace InferenceEngine;
    std::string modelPath = "/tmp/myModel.xml";
    std::string device = "GNA";
    std::map<std::string, std::string> deviceConfig;
    //! [part0]
    InferenceEngine::Core ie;                                  // Step 1: create Inference engine object
    ie.SetConfig({{CONFIG_KEY(CACHE_DIR), "myCacheFolder"}});  // Step 1b: Enable caching
    auto cnnNet = ie.ReadNetwork(modelPath);                   // Step 2: ReadNetwork
    //...                                                      // Step 3: Prepare inputs/outputs
    //...                                                      // Step 4: Set device configuration
    ie.LoadNetwork(cnnNet, device, deviceConfig);              // Step 5: LoadNetwork
    //! [part0]
    return 0;
}

View File

@ -1,13 +0,0 @@
#include <ie_core.hpp>

int main() {
    using namespace InferenceEngine;
    std::string modelPath = "/tmp/myModel.xml";
    std::string device = "GNA";
    std::map<std::string, std::string> deviceConfig;
    //! [part1]
    InferenceEngine::Core ie;                         // Step 1: create Inference engine object
    ie.LoadNetwork(modelPath, device, deviceConfig);  // Step 2: LoadNetwork by model file path
    //! [part1]
    return 0;
}

View File

@ -1,14 +0,0 @@
#include <ie_core.hpp>

int main() {
    using namespace InferenceEngine;
    std::string modelPath = "/tmp/myModel.xml";
    std::string device = "GNA";
    std::map<std::string, std::string> deviceConfig;
    //! [part2]
    InferenceEngine::Core ie;                                  // Step 1: create Inference engine object
    ie.SetConfig({{CONFIG_KEY(CACHE_DIR), "myCacheFolder"}});  // Step 1b: Enable caching
    ie.LoadNetwork(modelPath, device, deviceConfig);           // Step 2: LoadNetwork by model file path
    //! [part2]
    return 0;
}

View File

@ -1,20 +0,0 @@
#include <ie_core.hpp>

int main() {
    using namespace InferenceEngine;
    std::string modelPath = "/tmp/myModel.xml";
    std::string deviceName = "GNA";
    std::map<std::string, std::string> deviceConfig;
    InferenceEngine::Core ie;
    //! [part3]
    // Get list of supported metrics
    std::vector<std::string> keys = ie.GetMetric(deviceName, METRIC_KEY(SUPPORTED_METRICS));
    // Find 'IMPORT_EXPORT_SUPPORT' metric in supported metrics
    auto it = std::find(keys.begin(), keys.end(), METRIC_KEY(IMPORT_EXPORT_SUPPORT));
    // If the 'IMPORT_EXPORT_SUPPORT' metric exists, check its value
    auto cachingSupported = (it != keys.end()) && ie.GetMetric(deviceName, METRIC_KEY(IMPORT_EXPORT_SUPPORT)).as<bool>();
    //! [part3]
    return 0;
}

View File

@ -0,0 +1,69 @@
#include <algorithm>
#include <string>
#include <vector>

#include <openvino/runtime/core.hpp>

void part0() {
    std::string modelPath = "/tmp/myModel.xml";
    std::string device = "GNA";
    ov::AnyMap config;
    //! [ov:caching:part0]
    ov::Core core;                                              // Step 1: create ov::Core object
    core.set_property(ov::cache_dir("/path/to/cache/dir"));     // Step 1b: Enable caching
    auto model = core.read_model(modelPath);                    // Step 2: Read Model
    //...                                                       // Step 3: Prepare inputs/outputs
    //...                                                       // Step 4: Set device configuration
    auto compiled = core.compile_model(model, device, config);  // Step 5: Compile the model
    //! [ov:caching:part0]
    if (!compiled) {
        throw std::runtime_error("error");
    }
}

void part1() {
    std::string modelPath = "/tmp/myModel.xml";
    std::string device = "GNA";
    ov::AnyMap config;
    //! [ov:caching:part1]
    ov::Core core;                                                  // Step 1: create ov::Core object
    auto compiled = core.compile_model(modelPath, device, config);  // Step 2: Compile model by file path
    //! [ov:caching:part1]
    if (!compiled) {
        throw std::runtime_error("error");
    }
}

void part2() {
    std::string modelPath = "/tmp/myModel.xml";
    std::string device = "GNA";
    ov::AnyMap config;
    //! [ov:caching:part2]
    ov::Core core;                                                  // Step 1: create ov::Core object
    core.set_property(ov::cache_dir("/path/to/cache/dir"));         // Step 1b: Enable caching
    auto compiled = core.compile_model(modelPath, device, config);  // Step 2: Compile model by file path
    //! [ov:caching:part2]
    if (!compiled) {
        throw std::runtime_error("error");
    }
}

void part3() {
    std::string deviceName = "GNA";
    ov::AnyMap config;
    ov::Core core;
    //! [ov:caching:part3]
    // Get list of supported device capabilities
    std::vector<std::string> caps = core.get_property(deviceName, ov::device::capabilities);
    // Find 'EXPORT_IMPORT' capability in supported capabilities
    bool cachingSupported = std::find(caps.begin(), caps.end(), ov::device::capability::EXPORT_IMPORT) != caps.end();
    //! [ov:caching:part3]
    if (!cachingSupported) {
        throw std::runtime_error("GNA should support model caching");
    }
}

int main() {
    part0();
    part1();
    part2();
    part3();
    return 0;
}

View File

@ -0,0 +1,36 @@
# Copyright (C) 2018-2022 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
#
from openvino.runtime import Core
device_name = 'GNA'
xml_path = '/tmp/myModel.xml'
# ! [ov:caching:part0]
core = Core()
core.set_property({'CACHE_DIR': '/path/to/cache/dir'})
model = core.read_model(model=xml_path)
compiled_model = core.compile_model(model=model, device_name=device_name)
# ! [ov:caching:part0]
assert compiled_model
# ! [ov:caching:part1]
core = Core()
compiled_model = core.compile_model(model_path=xml_path, device_name=device_name)
# ! [ov:caching:part1]
assert compiled_model
# ! [ov:caching:part2]
core = Core()
core.set_property({'CACHE_DIR': '/path/to/cache/dir'})
compiled_model = core.compile_model(model_path=xml_path, device_name=device_name)
# ! [ov:caching:part2]
assert compiled_model
# ! [ov:caching:part3]
# Find 'EXPORT_IMPORT' capability in supported capabilities
caching_supported = 'EXPORT_IMPORT' in core.get_property(device_name, 'OPTIMIZATION_CAPABILITIES')
# ! [ov:caching:part3]