Convert Detectron2 Models to OpenVINO™
========================================

`Detectron2 <https://github.com/facebookresearch/detectron2>`__ is
Facebook AI Research's library that provides state-of-the-art detection
and segmentation algorithms. It is the successor of
`Detectron <https://github.com/facebookresearch/Detectron/>`__ and
`maskrcnn-benchmark <https://github.com/facebookresearch/maskrcnn-benchmark/>`__.
It supports a number of computer vision research projects and production
applications.

In this tutorial, we consider how to convert and run Detectron2 models
using OpenVINO™. We will use the ``Faster R-CNN FPN x1`` and
``Mask R-CNN FPN x3`` models pretrained on the
`COCO <https://cocodataset.org/#home>`__ dataset as examples for object
detection and instance segmentation, respectively.

**Table of contents:**

- `Prerequisites <#prerequisites>`__
- `Define helpers for PyTorch model initialization and conversion <#define-helpers-for-pytorch-model-initialization-and-conversion>`__
- `Prepare input data <#prepare-input-data>`__
- `Object Detection <#object-detection>`__

  - `Download PyTorch Detection model <#download-pytorch-detection-model>`__
  - `Convert Detection Model to OpenVINO Intermediate Representation <#convert-detection-model-to-openvino-intermediate-representation>`__
  - `Select inference device <#select-inference-device>`__
  - `Run Detection model inference <#run-detection-model-inference>`__

- `Instance Segmentation <#instance-segmentation>`__

  - `Download Instance Segmentation PyTorch model <#download-instance-segmentation-pytorch-model>`__
  - `Convert Instance Segmentation Model to OpenVINO Intermediate Representation <#convert-instance-segmentation-model-to-openvino-intermediate-representation>`__
  - `Select inference device <#select-inference-device>`__
  - `Run Instance Segmentation model inference <#run-instance-segmentation-model-inference>`__

Prerequisites
-------------

Install the required packages for running the model.

.. code:: ipython3

    %pip install -q --extra-index-url https://download.pytorch.org/whl/cpu torch torchvision
    %pip install -q "git+https://github.com/facebookresearch/detectron2.git" --extra-index-url https://download.pytorch.org/whl/cpu
    %pip install -q "openvino>=2023.1.0"

.. parsed-literal::

    Note: you may need to restart the kernel to use updated packages.
    Note: you may need to restart the kernel to use updated packages.
    Note: you may need to restart the kernel to use updated packages.

Define helpers for PyTorch model initialization and conversion
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Detectron2 provides a universal and configurable API for working with
models. This means that all steps required for model creation, conversion,
and inference are the same for every model, so it is enough to define the
helper functions once and then reuse them for different models. To obtain
models, we will use the `Detectron2 Model
Zoo <https://github.com/facebookresearch/detectron2/blob/main/MODEL_ZOO.md>`__
API. The ``detectron_zoo.get`` function downloads and instantiates a model
based on its config file. The configuration file plays a key role in
interacting with models in the Detectron2 project: it describes the model
architecture as well as the training and validation processes. The
``detectron_zoo.get_config`` function can be used to find and read a model
config.
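
For illustration, a minimal sketch of reading a config and inspecting a few
of its preprocessing-related fields (the same fields are used by the
``get_sample_inputs`` helper later in this notebook):

.. code:: ipython3

    import detectron2.model_zoo as detectron_zoo

    # read the config for a Model Zoo model without instantiating the model itself
    cfg = detectron_zoo.get_config('COCO-Detection/faster_rcnn_R_50_FPN_1x.yaml', trained=True)
    # the config describes, among other things, the expected input format and test-time resize limits
    print(cfg.INPUT.FORMAT, cfg.INPUT.MIN_SIZE_TEST, cfg.INPUT.MAX_SIZE_TEST)
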

.. code:: ipython3

    import detectron2.model_zoo as detectron_zoo


    def get_model_and_config(model_name:str):
        """
        Helper function for downloading PyTorch model and its configuration from Detectron2 Model Zoo

        Parameters:
          model_name (str): model_id from Detectron2 Model Zoo
        Returns:
          model (torch.nn.Module): Pretrained model instance
          cfg (Config): Configuration for model
        """
        cfg = detectron_zoo.get_config(model_name + '.yaml', trained=True)
        model = detectron_zoo.get(model_name + '.yaml', trained=True)
        return model, cfg

The Detectron2 library is based on PyTorch. Starting from the 2023.0
release, OpenVINO supports PyTorch model conversion directly via the Model
Conversion API. The ``ov.convert_model`` function converts a PyTorch model
to an OpenVINO Model object instance that is ready to be loaded on a device
for inference, or it can be saved on disk for later deployment using the
``ov.save_model`` function.
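
As a generic illustration of this API (a minimal sketch with a hypothetical
PyTorch module, not the Detectron2 model used below):

.. code:: ipython3

    import torch
    import openvino as ov

    # hypothetical module: any traceable torch.nn.Module can be converted the same way
    example_module = torch.nn.Sequential(torch.nn.Conv2d(3, 8, 3), torch.nn.ReLU())

    ov_model = ov.convert_model(example_module, example_input=torch.zeros(1, 3, 224, 224))
    ov.save_model(ov_model, "example_module.xml")  # serializes IR (.xml + .bin) for later deployment
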
Internally, Detectron2 models use custom complex data structures, which
complicates exporting them to different formats and frameworks, including
OpenVINO. To avoid these issues, ``detectron2.export.TracingAdapter`` is
provided as part of the Detectron2 deployment API. ``TracingAdapter`` is a
model wrapper class that simplifies the model's structure, making it more
export-friendly.

.. code:: ipython3

    from detectron2.modeling import GeneralizedRCNN
    from detectron2.export import TracingAdapter
    import torch
    import openvino as ov
    import warnings
    from typing import List, Dict


    def convert_detectron2_model(model:torch.nn.Module, sample_input:List[Dict[str, torch.Tensor]]):
        """
        Function for converting Detectron2 models, creates TracingAdapter for making model tracing-friendly,
        prepares inputs and converts model to OpenVINO Model

        Parameters:
          model (torch.nn.Module): Model object for conversion
          sample_input (List[Dict[str, torch.Tensor]]): sample input for tracing
        Returns:
          ov_model (ov.Model): OpenVINO Model
        """
        # prepare input for tracing adapter
        tracing_input = [{'image': sample_input[0]["image"]}]

        # override model forward and disable postprocessing if required
        if isinstance(model, GeneralizedRCNN):
            def inference(model, inputs):
                # use do_postprocess=False so it returns ROI mask
                inst = model.inference(inputs, do_postprocess=False)[0]
                return [{"instances": inst}]
        else:
            inference = None  # assume that we just call the model directly

        # create traceable model
        traceable_model = TracingAdapter(model, tracing_input, inference)
        warnings.filterwarnings("ignore")
        # convert PyTorch model to OpenVINO model
        ov_model = ov.convert_model(traceable_model, example_input=sample_input[0]["image"])
        return ov_model

Prepare input data
~~~~~~~~~~~~~~~~~~

To run model conversion and inference, we need to provide example input.
The cells below download a sample image and apply preprocessing steps based
on model-specific transformations defined in the model config.

.. code:: ipython3

    import requests
    from pathlib import Path
    from PIL import Image

    MODEL_DIR = Path("model")
    DATA_DIR = Path("data")

    MODEL_DIR.mkdir(exist_ok=True)
    DATA_DIR.mkdir(exist_ok=True)

    input_image_url = "https://farm9.staticflickr.com/8040/8017130856_1b46b5f5fc_z.jpg"
    image_file = DATA_DIR / "example_image.jpg"

    if not image_file.exists():
        image = Image.open(requests.get(input_image_url, stream=True).raw)
        image.save(image_file)
    else:
        image = Image.open(image_file)

    image

.. image:: 123-detectron2-to-openvino-with-output_files/123-detectron2-to-openvino-with-output_8_0.png

.. code:: ipython3

    import detectron2.data.transforms as T
    from detectron2.data import detection_utils
    import torch


    def get_sample_inputs(image_path, cfg):
        # get a sample data
        original_image = detection_utils.read_image(image_path, format=cfg.INPUT.FORMAT)
        # Do same preprocessing as DefaultPredictor
        aug = T.ResizeShortestEdge([cfg.INPUT.MIN_SIZE_TEST, cfg.INPUT.MIN_SIZE_TEST], cfg.INPUT.MAX_SIZE_TEST)
        height, width = original_image.shape[:2]
        image = aug.get_transform(original_image).apply_image(original_image)
        image = torch.as_tensor(image.astype("float32").transpose(2, 0, 1))

        inputs = {"image": image, "height": height, "width": width}

        # Sample ready
        sample_inputs = [inputs]
        return sample_inputs

Now that all components required for model conversion are prepared, we can
see how to use them in specific examples.

Object Detection
----------------

Download PyTorch Detection model
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Download ``faster_rcnn_R_50_FPN_1x`` from the Detectron2 Model Zoo.

.. code:: ipython3

    model_name = 'COCO-Detection/faster_rcnn_R_50_FPN_1x'
    model, cfg = get_model_and_config(model_name)
    sample_input = get_sample_inputs(image_file, cfg)

Convert Detection Model to OpenVINO Intermediate Representation
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Convert the model using the ``convert_detectron2_model`` function and the
``sample_input`` prepared above. After conversion, the model is saved on
disk using the ``ov.save_model`` function and can be found in the ``model``
directory.

.. code:: ipython3

    model_xml_path = MODEL_DIR / (model_name.split("/")[-1] + '.xml')

    if not model_xml_path.exists():
        ov_model = convert_detectron2_model(model, sample_input)
        ov.save_model(ov_model, MODEL_DIR / (model_name.split("/")[-1] + '.xml'))
    else:
        ov_model = model_xml_path
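
If the IR already exists on disk, the cell above keeps only the path to the
XML file, which ``compile_model`` accepts directly. Alternatively, the saved
model can be read back explicitly (a minimal sketch):

.. code:: ipython3

    core = ov.Core()
    # read_model restores the ov.Model from the saved IR; the weights (.bin) file is located automatically
    ov_model = core.read_model(model_xml_path)
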
Select inference device
~~~~~~~~~~~~~~~~~~~~~~~

Select a device from the dropdown list for running inference using OpenVINO.

.. code:: ipython3

    import ipywidgets as widgets

    core = ov.Core()

    device = widgets.Dropdown(
        options=core.available_devices + ["AUTO"],
        value='AUTO',
        description='Device:',
        disabled=False,
    )

    device

.. parsed-literal::

    Dropdown(description='Device:', index=1, options=('CPU', 'AUTO'), value='AUTO')

Run Detection model inference
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Load the converted model on the selected device and run inference on the
sample input.

.. code:: ipython3

    compiled_model = core.compile_model(ov_model, device.value)

.. code:: ipython3

    results = compiled_model(sample_input[0]["image"])

The tracing adapter simplifies the model's input and output format. After
conversion, the model has multiple outputs in the following format:

1. Predicted boxes - a floating-point tensor of shape [``N``, 4], where
   ``N`` is the number of detected boxes.
2. Predicted classes - an integer tensor of shape [``N``], where ``N`` is
   the number of predicted objects; it defines which label each object
   belongs to. The values of this tensor are in the range [0,
   ``num_labels``], where ``num_labels`` is the number of classes supported
   by the model (80 in our case).
3. Predicted scores - a floating-point tensor of shape [``N``], where ``N``
   is the number of predicted objects; it defines the confidence of each
   prediction.
4. Input image size - an integer tensor with values [``H``, ``W``], where
   ``H`` is the height and ``W`` is the width of the input data, used for
   rescaling predictions in the postprocessing step.
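
For illustration, a minimal sketch of unpacking these raw outputs by index
(the ``postprocess_detection_result`` helper below performs the same
indexing together with confidence filtering):

.. code:: ipython3

    boxes = results[0]       # [N, 4] box coordinates in the resized input space
    classes = results[1]     # [N] predicted class ids
    scores = results[2]      # [N] confidence scores
    input_size = results[3]  # [H, W] of the preprocessed input image
    print(f"{len(scores)} candidate boxes, top score: {scores.max():.2f}")
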
To reuse the Detectron2 API for postprocessing and visualization, we provide
helpers that wrap the output in the original Detectron2 format.

.. code:: ipython3

    from detectron2.structures import Instances, Boxes
    from detectron2.modeling.postprocessing import detector_postprocess
    from detectron2.utils.visualizer import ColorMode, Visualizer
    from detectron2.data import MetadataCatalog
    import numpy as np


    def postprocess_detection_result(outputs:Dict, orig_height:int, orig_width:int, conf_threshold:float = 0.0):
        """
        Helper function for postprocessing prediction results

        Parameters:
          outputs (Dict): OpenVINO model output dictionary
          orig_height (int): original image height before preprocessing
          orig_width (int): original image width before preprocessing
          conf_threshold (float, optional, defaults 0.0): confidence threshold for valid prediction
        Returns:
          prediction_result (instances): postprocessed predicted instances
        """
        boxes = outputs[0]
        classes = outputs[1]
        has_mask = len(outputs) >= 5
        masks = None if not has_mask else outputs[2]
        scores = outputs[2 if not has_mask else 3]
        model_input_size = (int(outputs[3 if not has_mask else 4][0]), int(outputs[3 if not has_mask else 4][1]))
        filtered_detections = scores >= conf_threshold
        boxes = Boxes(boxes[filtered_detections])
        scores = scores[filtered_detections]
        classes = classes[filtered_detections]
        out_dict = {"pred_boxes": boxes, "scores": scores, "pred_classes": classes}
        if masks is not None:
            masks = masks[filtered_detections]
            out_dict["pred_masks"] = torch.from_numpy(masks)
        instances = Instances(model_input_size, **out_dict)
        return detector_postprocess(instances, orig_height, orig_width)


    def draw_instance_prediction(img:np.ndarray, results:Instances, cfg:"Config"):
        """
        Helper function for visualization prediction results

        Parameters:
          img (np.ndarray): original image for drawing predictions
          results (instances): model predictions
          cfg (Config): model configuration
        Returns:
          img_with_res: image with results
        """
        metadata = MetadataCatalog.get(cfg.DATASETS.TEST[0])
        visualizer = Visualizer(img, metadata, instance_mode=ColorMode.IMAGE)
        img_with_res = visualizer.draw_instance_predictions(results)
        return img_with_res

.. code:: ipython3

    results = postprocess_detection_result(results, sample_input[0]["height"], sample_input[0]["width"], conf_threshold=0.05)
    img_with_res = draw_instance_prediction(np.array(image), results, cfg)
    Image.fromarray(img_with_res.get_image())

.. image:: 123-detectron2-to-openvino-with-output_files/123-detectron2-to-openvino-with-output_22_0.png

Instance Segmentation
---------------------

As discussed above, Detectron2 provides a generic approach to working with
models for different use cases. The steps required to convert and run models
pretrained for the Instance Segmentation use case are very similar to those
for Object Detection.

Download Instance Segmentation PyTorch model
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. code:: ipython3

    model_name = "COCO-InstanceSegmentation/mask_rcnn_R_101_FPN_3x"
    model, cfg = get_model_and_config(model_name)
    sample_input = get_sample_inputs(image_file, cfg)

Convert Instance Segmentation Model to OpenVINO Intermediate Representation
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. code:: ipython3

    model_xml_path = MODEL_DIR / (model_name.split("/")[-1] + '.xml')

    if not model_xml_path.exists():
        ov_model = convert_detectron2_model(model, sample_input)
        ov.save_model(ov_model, MODEL_DIR / (model_name.split("/")[-1] + '.xml'))
    else:
        ov_model = model_xml_path

Select inference device
~~~~~~~~~~~~~~~~~~~~~~~

Select a device from the dropdown list for running inference using OpenVINO.

.. code:: ipython3

    device

.. parsed-literal::

    Dropdown(description='Device:', index=1, options=('CPU', 'AUTO'), value='AUTO')

Run Instance Segmentation model inference
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

In comparison with Object Detection, Instance Segmentation models have an
additional output that represents instance masks for each object. Our
postprocessing function handles this difference (see the layout sketch
below).
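
For reference, a minimal sketch of the output index layout that
``postprocess_detection_result`` assumes when masks are present (indices
derived from the helper defined above, not from OpenVINO documentation):

.. code:: ipython3

    # output layout handled by postprocess_detection_result when len(outputs) >= 5:
    #   outputs[0] - predicted boxes   [N, 4]
    #   outputs[1] - predicted classes [N]
    #   outputs[2] - per-instance ROI masks
    #   outputs[3] - prediction scores [N]
    #   outputs[4] - input image size  [H, W]
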

.. code:: ipython3

    compiled_model = core.compile_model(ov_model, device.value)

.. code:: ipython3

    results = compiled_model(sample_input[0]["image"])
    results = postprocess_detection_result(results, sample_input[0]["height"], sample_input[0]["width"], conf_threshold=0.05)
    img_with_res = draw_instance_prediction(np.array(image), results, cfg)
    Image.fromarray(img_with_res.get_image())

.. image:: 123-detectron2-to-openvino-with-output_files/123-detectron2-to-openvino-with-output_32_0.png