# Convert Detectron2 Models to OpenVINO™ `Detectron2 `__ is Facebook AI Research’s library that provides state-of-the-art detection and segmentation algorithms. It is the successor of `Detectron `__ and `maskrcnn-benchmark `__. It supports a number of computer vision research projects and production applications. In this tutorial we consider how to convert and run Detectron2 models using OpenVINO™. We will use ``Faster R-CNN FPN x1`` model and ``Mask R-CNN FPN x3`` pretrained on `COCO `__ dataset as examples for object detection and instance segmentation respectively. **Table of contents:** - `Prerequisites <#prerequisites>`__ - `Define helpers for PyTorch model initialization and conversion <#define-helpers-for-pytorch-model-initialization-and-conversion>`__ - `Prepare input data <#prepare-input-data>`__ - `Object Detection <#object-detection>`__ - `Download PyTorch Detection model <#download-pytorch-detection-model>`__ - `Convert Detection Model to OpenVINO Intermediate Representation <#convert-detection-model-to-openvino-intermediate-representation>`__ - `Select inference device <#select-inference-device>`__ - `Run Detection model inference <#run-detection-model-inference>`__ - `Instance Segmentation <#instance-segmentation>`__ - `Download Instance Segmentation PyTorch model <#download-instance-segmentation-pytorch-model>`__ - `Convert Instance Segmentation Model to OpenVINO Intermediate Representation <#convert-instance-segmentation-model-to-openvino-intermediate-representation>`__ - `Select inference device <#select-inference-device>`__ - `Run Instance Segmentation model inference <#run-instance-segmentation-model-inference>`__ Prerequisites ------------- Install required packages for running model .. code:: ipython3 %pip install -q --extra-index-url https://download.pytorch.org/whl/cpu torch torchvision %pip install -q "git+https://github.com/facebookresearch/detectron2.git" --extra-index-url https://download.pytorch.org/whl/cpu %pip install -q "openvino>=2023.1.0" .. parsed-literal:: Note: you may need to restart the kernel to use updated packages. Note: you may need to restart the kernel to use updated packages. Note: you may need to restart the kernel to use updated packages. Define helpers for PyTorch model initialization and conversion ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Detectron2 provides universal and configurable API for working with models, it means that all steps required for model creation, conversion and inference will be common for all models, that is why it is enough to define helper functions once, then reuse them for different models. For obtaining models we will use `Detectron2 Model Zoo `__ API. ``detecton_zoo.get`` function allow to download and instantiate model based on its config file. Configuration file is playing key role in interaction with models in Detectron2 project and describes model architecture and training and validation processes. ``detectron_zoo.get_config`` function can be used for finding and reading model config. .. code:: ipython3 import detectron2.model_zoo as detectron_zoo def get_model_and_config(model_name:str): """ Helper function for downloading PyTorch model and its configuration from Detectron2 Model Zoo Parameters: model_name (str): model_id from Detectron2 Model Zoo Returns: model (torch.nn.Module): Pretrained model instance cfg (Config): Configuration for model """ cfg = detectron_zoo.get_config(model_name + '.yaml', trained=True) model = detectron_zoo.get(model_name + '.yaml', trained=True) return model, cfg Detectron2 library is based on PyTorch. Starting from 2023.0 release OpenVINO supports PyTorch models conversion directly via Model Conversion API. ``ov.convert_model`` function can be used for converting PyTorch model to OpenVINO Model object instance, that ready to use for loading on device and then running inference or can be saved on disk for next deployment using ``ov.save_model`` function. Detectron2 models use custom complex data structures inside that brings some difficulties for exporting models in different formats and frameworks including OpenVINO. For avoid these issues, ``detectron2.export.TracingAdapter`` provided as part of Detectron2 deployment API. ``TracingAdapter`` is a model wrapper class that simplify model’s structure making it more export-friendly. .. code:: ipython3 from detectron2.modeling import GeneralizedRCNN from detectron2.export import TracingAdapter import torch import openvino as ov import warnings from typing import List, Dict def convert_detectron2_model(model:torch.nn.Module, sample_input:List[Dict[str, torch.Tensor]]): """ Function for converting Detectron2 models, creates TracingAdapter for making model tracing-friendly, prepares inputs and converts model to OpenVINO Model Parameters: model (torch.nn.Module): Model object for conversion sample_input (List[Dict[str, torch.Tensor]]): sample input for tracing Returns: ov_model (ov.Model): OpenVINO Model """ # prepare input for tracing adapter tracing_input = [{'image': sample_input[0]["image"]}] # override model forward and disable postprocessing if required if isinstance(model, GeneralizedRCNN): def inference(model, inputs): # use do_postprocess=False so it returns ROI mask inst = model.inference(inputs, do_postprocess=False)[0] return [{"instances": inst}] else: inference = None # assume that we just call the model directly # create traceable model traceable_model = TracingAdapter(model, tracing_input, inference) warnings.filterwarnings("ignore") # convert PyTorch model to OpenVINO model ov_model = ov.convert_model(traceable_model, example_input=sample_input[0]["image"]) return ov_model Prepare input data ~~~~~~~~~~~~~~~~~~ For running model conversion and inference we need to provide example input. The cells below download sample image and apply preprocessing steps based on model specific transformations defined in model config. .. code:: ipython3 import requests from pathlib import Path from PIL import Image MODEL_DIR = Path("model") DATA_DIR = Path("data") MODEL_DIR.mkdir(exist_ok=True) DATA_DIR.mkdir(exist_ok=True) input_image_url = "https://farm9.staticflickr.com/8040/8017130856_1b46b5f5fc_z.jpg" image_file = DATA_DIR / "example_image.jpg" if not image_file.exists(): image = Image.open(requests.get(input_image_url, stream=True).raw) image.save(image_file) else: image = Image.open(image_file) image .. image:: 123-detectron2-to-openvino-with-output_files/123-detectron2-to-openvino-with-output_8_0.png .. code:: ipython3 import detectron2.data.transforms as T from detectron2.data import detection_utils import torch def get_sample_inputs(image_path, cfg): # get a sample data original_image = detection_utils.read_image(image_path, format=cfg.INPUT.FORMAT) # Do same preprocessing as DefaultPredictor aug = T.ResizeShortestEdge([cfg.INPUT.MIN_SIZE_TEST, cfg.INPUT.MIN_SIZE_TEST], cfg.INPUT.MAX_SIZE_TEST) height, width = original_image.shape[:2] image = aug.get_transform(original_image).apply_image(original_image) image = torch.as_tensor(image.astype("float32").transpose(2, 0, 1)) inputs = {"image": image, "height": height, "width": width} # Sample ready sample_inputs = [inputs] return sample_inputs Now, when all components required for model conversion are prepared, we can consider how to use them on specific examples. Object Detection ---------------- Download PyTorch Detection model ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Download faster_rcnn_R_50_FPN_1x from Detectron Model Zoo. .. code:: ipython3 model_name = 'COCO-Detection/faster_rcnn_R_50_FPN_1x' model, cfg = get_model_and_config(model_name) sample_input = get_sample_inputs(image_file, cfg) Convert Detection Model to OpenVINO Intermediate Representation ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Convert model using ``convert_detectron2_model`` function and ``sample_input`` prepared above. After conversion, model saved on disk using ``ov.save_model`` function and can be found in ``model`` directory. .. code:: ipython3 model_xml_path = MODEL_DIR / (model_name.split("/")[-1] + '.xml') if not model_xml_path.exists(): ov_model = convert_detectron2_model(model, sample_input) ov.save_model(ov_model, MODEL_DIR / (model_name.split("/")[-1] + '.xml')) else: ov_model = model_xml_path Select inference device ~~~~~~~~~~~~~~~~~~~~~~~ select device from dropdown list for running inference using OpenVINO .. code:: ipython3 import ipywidgets as widgets core = ov.Core() device = widgets.Dropdown( options=core.available_devices + ["AUTO"], value='AUTO', description='Device:', disabled=False, ) device .. parsed-literal:: Dropdown(description='Device:', index=1, options=('CPU', 'AUTO'), value='AUTO') Run Detection model inference ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Load our converted model on selected device and run inference on sample input. .. code:: ipython3 compiled_model = core.compile_model(ov_model, device.value) .. code:: ipython3 results = compiled_model(sample_input[0]["image"]) Tracing adapter simplifies model input and output format. After conversion, model has multiple outputs in following format: 1. Predicted boxes is floating-point tensor in format [``N``, 4], where N is number of detected boxes. 2. Predicted classes is integer tensor in format [``N``], where N is number of predicted objects that defines which label each object belongs. The values range of predicted classes tensor is [0, ``num_labels``], where ``num_labels`` is number of classes supported of model (in our case 80). 3. Predicted scores is floating-point tensor in format [``N``], where ``N`` is number of predicted objects that defines confidence of each prediction. 4. Input image size is integer tensor with values [``H``, ``W``], where ``H`` is height of input data and ``W`` is width of input data, used for rescaling predictions on postprocessing step. For reusing Detectron2 API for postprocessing and visualization, we provide helpers for wrapping output in original Detectron2 format. .. code:: ipython3 from detectron2.structures import Instances, Boxes from detectron2.modeling.postprocessing import detector_postprocess from detectron2.utils.visualizer import ColorMode, Visualizer from detectron2.data import MetadataCatalog import numpy as np def postprocess_detection_result(outputs:Dict, orig_height:int, orig_width:int, conf_threshold:float = 0.0): """ Helper function for postprocessing prediction results Parameters: outputs (Dict): OpenVINO model output dictionary orig_height (int): original image height before preprocessing orig_width (int): original image width before preprocessing conf_threshold (float, optional, defaults 0.0): confidence threshold for valid prediction Returns: prediction_result (instances): postprocessed predicted instances """ boxes = outputs[0] classes = outputs[1] has_mask = len(outputs) >= 5 masks = None if not has_mask else outputs[2] scores = outputs[2 if not has_mask else 3] model_input_size = (int(outputs[3 if not has_mask else 4][0]), int(outputs[3 if not has_mask else 4][1])) filtered_detections = scores >= conf_threshold boxes = Boxes(boxes[filtered_detections]) scores = scores[filtered_detections] classes = classes[filtered_detections] out_dict = {"pred_boxes": boxes, "scores": scores, "pred_classes": classes} if masks is not None: masks = masks[filtered_detections] out_dict["pred_masks"] = torch.from_numpy(masks) instances = Instances(model_input_size, **out_dict) return detector_postprocess(instances, orig_height, orig_width) def draw_instance_prediction(img:np.ndarray, results:Instances, cfg:"Config"): """ Helper function for visualization prediction results Parameters: img (np.ndarray): original image for drawing predictions results (instances): model predictions cfg (Config): model configuration Returns: img_with_res: image with results """ metadata = MetadataCatalog.get(cfg.DATASETS.TEST[0]) visualizer = Visualizer(img, metadata, instance_mode=ColorMode.IMAGE) img_with_res = visualizer.draw_instance_predictions(results) return img_with_res .. code:: ipython3 results = postprocess_detection_result(results, sample_input[0]["height"], sample_input[0]["width"], conf_threshold=0.05) img_with_res = draw_instance_prediction(np.array(image), results, cfg) Image.fromarray(img_with_res.get_image()) .. image:: 123-detectron2-to-openvino-with-output_files/123-detectron2-to-openvino-with-output_22_0.png Instance Segmentation --------------------- As it was discussed above, Detectron2 provides generic approach for working with models for different use cases. The steps that required to convert and run models pretrained for Instance Segmentation use case will be very similar to Object Detection. Download Instance Segmentation PyTorch model ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ .. code:: ipython3 model_name = "COCO-InstanceSegmentation/mask_rcnn_R_101_FPN_3x" model, cfg = get_model_and_config(model_name) sample_input = get_sample_inputs(image_file, cfg) Convert Instance Segmentation Model to OpenVINO Intermediate Representation ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ .. code:: ipython3 model_xml_path = MODEL_DIR / (model_name.split("/")[-1] + '.xml') if not model_xml_path.exists(): ov_model = convert_detectron2_model(model, sample_input) ov.save_model(ov_model, MODEL_DIR / (model_name.split("/")[-1] + '.xml')) else: ov_model = model_xml_path Select inference device ~~~~~~~~~~~~~~~~~~~~~~~ select device from dropdown list for running inference using OpenVINO .. code:: ipython3 device .. parsed-literal:: Dropdown(description='Device:', index=1, options=('CPU', 'AUTO'), value='AUTO') Run Instance Segmentation model inference ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ In comparison with Object Detection, Instance Segmentation models have additional output that represents instance masks for each object. Our postprocessing function handle this difference. .. code:: ipython3 compiled_model = core.compile_model(ov_model, device.value) .. code:: ipython3 results = compiled_model(sample_input[0]["image"]) results = postprocess_detection_result(results, sample_input[0]["height"], sample_input[0]["width"], conf_threshold=0.05) img_with_res = draw_instance_prediction(np.array(image), results, cfg) Image.fromarray(img_with_res.get_image()) .. image:: 123-detectron2-to-openvino-with-output_files/123-detectron2-to-openvino-with-output_32_0.png