
Using GPU Kernels Tuning

GPU Kernels Tuning allows you to tune models so that their computationally heavy layers are configured to better fit the hardware the tuning was performed on. It is required to achieve the best performance on GPU.

Note

Currently, only convolution and fully connected layers undergo the tuning process. This means that the performance boost depends on the number of such layers in the model.

OpenVINO™ releases include the <INSTALL_DIR>/inference_engine/bin/intel64/Release/cache.json file with pretuned data for current state-of-the-art models. It is highly recommended to perform tuning for new kinds of models, hardware, or drivers.

Tuned data

GPU tuning data is saved in JSON format. The file content is composed of two types of attributes and one type of value:

  1. Execution units number - this attribute splits the content into sections, one per EU configuration.
  2. Hash - hashed tuned kernel data. Its key is an array containing the kernel name and the kernel's mode index.
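Based on the description above, a tuned-data file might look roughly like the following. This is a hypothetical sketch, not the exact schema: the EU count, the key layout, and the placeholder values are illustrative only.

```json
{
  "24": {
    "[\"<kernel name>\", <mode index>]": "<hashed tuned kernel data>"
  },
  "72": {
    "[\"<kernel name>\", <mode index>]": "<hashed tuned kernel data>"
  }
}
```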

Usage


You can activate the Kernels Tuning process by setting the KEY_TUNING_MODE flag to TUNING_CREATE and KEY_TUNING_FILE to <"filename"> in the configuration map passed to the plugin while loading a network. This configuration modifies the behavior of the ExecutableNetwork object: instead of performing standard network compilation, it runs the tuning process. Keep in mind that tuning can be very time-consuming; the bigger the network, the longer it takes. The file with tuned data is the result of this step.

Note

If the filename passed to KEY_TUNING_FILE points to existing tuned data and you are tuning a new model, the file is extended with the new data. This allows you to extend the existing cache.json provided in the OpenVINO™ release package.

The example below shows how to set and use the key files:

@snippet snippets/GPU_Kernels_Tuning.cpp part0


You can activate inference with tuned data by setting the KEY_TUNING_MODE flag to TUNING_USE_EXISTING and the KEY_TUNING_FILE flag to <"filename">.

The GPU backend processes the content of the file during network compilation to configure the OpenCL kernels for the best performance.