
Using Dynamic Batching {#openvino_docs_IE_DG_DynamicBatching}
======================

The Dynamic Batching feature allows you to change the batch size for inference calls dynamically,
within a preset batch size limit.
This feature can be useful when the batch size is not known in advance and using an extra-large batch size is
undesirable or impossible due to resource limitations.
For example, face detection followed by age, gender, or mood recognition for each detected person is a typical usage scenario, where the number of detected faces varies from frame to frame.


## Usage

You can activate Dynamic Batching by setting the <code>KEY_DYN_BATCH_ENABLED</code> flag to <code>YES</code> in the configuration map that is
passed to the plugin when loading a network.
With this configuration, the plugin creates an <code>ExecutableNetwork</code> object that allows setting the batch size
dynamically in all of its infer requests using the <code>SetBatch()</code> method.
The batch size set in the passed <code>CNNNetwork</code> object is used as the maximum batch size limit.

Here is a code example:
```cpp
#include <inference_engine.hpp>
using namespace InferenceEngine;

// Take the dynamic batch limit from a command-line option
int dynBatchLimit = FLAGS_bl;

// Read the network model
Core core;
CNNNetwork network = core.ReadNetwork(modelFileName, weightFileName);

// Enable dynamic batching and set the maximum batch size limit
const std::map<std::string, std::string> dyn_config =
    { { PluginConfigParams::KEY_DYN_BATCH_ENABLED, PluginConfigParams::YES } };
network.setBatchSize(dynBatchLimit);

// Create the executable network and an infer request
auto executable_network = core.LoadNetwork(network, "CPU", dyn_config);
auto infer_request = executable_network.CreateInferRequest();


...


// process a set of images
// dynamically set batch size for subsequent Infer() calls of this request
size_t batchSize = imagesData.size();
infer_request.SetBatch(batchSize);
infer_request.Infer();

...

// process another set of images
batchSize = imagesData2.size();
infer_request.SetBatch(batchSize);
infer_request.Infer();
```
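
When a set of inputs is larger than the batch limit, it has to be processed in several inference calls. The following is a minimal sketch of one way to do that, not part of the Inference Engine API: `splitIntoBatches` is a hypothetical helper that splits a total item count into consecutive batch sizes, none of which exceeds the limit.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (an assumption for illustration, not an Inference Engine call):
// splits `total` items into consecutive batch sizes, each no larger than `limit`.
std::vector<std::size_t> splitIntoBatches(std::size_t total, std::size_t limit) {
    assert(limit > 0 && "batch limit must be positive");
    std::vector<std::size_t> sizes;
    while (total > 0) {
        // Take as many items as the limit allows, or whatever is left.
        const std::size_t current = std::min(total, limit);
        sizes.push_back(current);
        total -= current;
    }
    return sizes;
}
```

Each returned size could then be passed to <code>SetBatch()</code> before the corresponding <code>Infer()</code> call, with the input blob filled with that many items.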


## Limitations

Currently, the following limitations apply to Dynamic Batching:

* Use Dynamic Batching with CPU and GPU plugins only.

* Use Dynamic Batching only on topologies that consist of the following layers:

	* Convolution
	* Deconvolution
	* Activation
	* LRN
	* Pooling
	* FullyConnected
	* SoftMax
	* Split
	* Concatenation
	* Power
	* Eltwise
	* Crop
	* BatchNormalization
	* Copy
	
Do not use layers that might arbitrarily change the tensor shape (such as Flatten, Permute, Reshape),
layers specific to object detection topologies (ROIPooling, PriorBox, DetectionOutput), or
custom layers.
Topology analysis is performed while the network is loaded into the plugin, and if the topology is
not supported, an exception is thrown.
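
The layer restriction can be pictured as a simple allow-list check. The sketch below only illustrates that idea and is not the plugin's actual analysis: `supportedLayerTypes` and `findUnsupportedLayers` are hypothetical helpers, and the type names are copied from the list above, so they may not match the exact type strings used in a particular network representation.

```cpp
#include <set>
#include <string>
#include <vector>

// Layer types listed as supported in this document (names are assumptions
// taken from the list above, not from the plugin source).
const std::set<std::string>& supportedLayerTypes() {
    static const std::set<std::string> types = {
        "Convolution", "Deconvolution", "Activation", "LRN", "Pooling",
        "FullyConnected", "SoftMax", "Split", "Concatenation", "Power",
        "Eltwise", "Crop", "BatchNormalization", "Copy"};
    return types;
}

// Hypothetical pre-check: returns every layer type that is not on the allow-list,
// in the order it appears in the input.
std::vector<std::string> findUnsupportedLayers(const std::vector<std::string>& layerTypes) {
    std::vector<std::string> unsupported;
    for (const auto& type : layerTypes) {
        if (supportedLayerTypes().count(type) == 0) {
            unsupported.push_back(type);
        }
    }
    return unsupported;
}
```

An empty result means that every layer type in the input is on the list. A non-empty result names the layers that would be expected to make the topology unsuitable for Dynamic Batching.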