Using Dynamic Batching
The Dynamic Batching feature allows you to dynamically change the batch size for inference calls within a preset batch size limit. This feature can be useful when the batch size is unknown beforehand and using an extra-large batch size is undesired or impossible due to resource limitations. For example, face detection followed by age, gender, or mood recognition for each detected face is a typical usage scenario.
Usage
You can activate Dynamic Batching by setting the KEY_DYN_BATCH_ENABLED flag to YES in the configuration map that is passed to the plugin while loading a network.
This configuration creates an ExecutableNetwork object that allows you to set the batch size dynamically in all of its infer requests using the SetBatch() method.
The batch size set in the passed CNNNetwork object is used as the maximum batch size limit.
Here is a code example:
@snippet snippets/DynamicBatching.cpp part0
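For reference, the flow above can be sketched as follows. This is a minimal sketch, not the official snippet: the model path, device name, and batch values are placeholders, and it assumes the classic Inference Engine C++ API (`InferenceEngine::Core` from `inference_engine.hpp`).

```cpp
#include <inference_engine.hpp>

#include <map>
#include <string>

int main() {
    InferenceEngine::Core core;

    // Read the network. The batch size of the CNNNetwork object acts as
    // the maximum batch size limit for later SetBatch() calls.
    auto network = core.ReadNetwork("model.xml");  // placeholder path
    network.setBatchSize(8);                       // maximum batch size limit

    // Enable Dynamic Batching via the configuration map passed at load time.
    std::map<std::string, std::string> config = {
        {InferenceEngine::PluginConfigParams::KEY_DYN_BATCH_ENABLED,
         InferenceEngine::PluginConfigParams::YES}};

    auto executableNetwork = core.LoadNetwork(network, "CPU", config);
    auto inferRequest = executableNetwork.CreateInferRequest();

    // Process fewer items than the maximum on this particular call.
    inferRequest.SetBatch(4);  // must not exceed the limit set above
    inferRequest.Infer();
    return 0;
}
```

Note that SetBatch() affects only the given infer request; other requests created from the same ExecutableNetwork keep their own dynamic batch settings.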
Limitations
Currently, certain limitations for using Dynamic Batching exist:
- Use Dynamic Batching with CPU and GPU plugins only.
- Use Dynamic Batching on topologies that consist of certain layers only:
- Convolution
- Deconvolution
- Activation
- LRN
- Pooling
- FullyConnected
- SoftMax
- Split
- Concatenation
- Power
- Eltwise
- Crop
- BatchNormalization
- Copy
Do not use layers that might arbitrarily change the tensor shape (such as Flatten, Permute, Reshape), layers specific to object detection topologies (ROIPooling, PriorBox, DetectionOutput), or custom layers. Topology analysis is performed while the network is being loaded into the plugin, and if the topology is not applicable, an exception is generated.