Converting model to MyriadX blob

To allow DepthAI to use your custom trained models, you need to convert them into the MyriadX blob file format, so that they are optimized for inference on the MyriadX VPU processor.

There are two conversion steps required to obtain a blob file:

  • Use Model Optimizer to produce the OpenVINO Intermediate Representation (IR) of the model

  • Use Compile Tool to compile the IR model into a VPU blob

Model Optimizer

OpenVINO’s Model Optimizer converts the model from the original framework format into OpenVINO’s standard Intermediate Representation (IR) format (.xml and .bin). A model in this format can be deployed across multiple Intel devices: CPU, GPU, iGPU, VPU (which we are interested in), and FPGA.

Example Model Optimizer parameters when using the online Blobconverter:

--data_type=FP16 --mean_values=[0,0,0] --scale_values=[255,255,255]

Example for local conversion:

mo --input_model path/to/model.onnx --data_type=FP16 --mean_values=[0,0,0] --scale_values=[255,255,255]

All arguments below are also documented in OpenVINO’s docs here.

FP16 Data Type

Since we are converting for the VPU (which supports FP16), we need to use the parameter --data_type=FP16. More information here.

Mean and Scale parameters

OpenVINO’s documentation here. The --mean_values and --scale_values parameters normalize the input image to the model: new_value = (byte - mean) / scale. By default, frames from the ColorCamera/MonoCamera nodes are in the U8 data type ([0,255]).

Models are usually trained with normalized frames ([-1,1] or [0,1]), so we need to normalize frames before running inference. One (not ideal) option is to create a custom model that normalizes frames before inferencing (example here, and a minimal sketch below), but it’s better (more optimized) to bake the normalization into the model itself, which is what the mean/scale parameters do.
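As an illustration of the first option, here is a minimal sketch of such a normalization model, assuming PyTorch and a hypothetical 1x3x416x416 input (neither is prescribed by the original example):

import torch

class Normalize(torch.nn.Module):
    # Scales U8 input [0,255] down to [0,1] before it reaches the actual NN
    def forward(self, x):
        return x / 255.0

# Export to ONNX so it can be converted with Model Optimizer / blobconverter
torch.onnx.export(Normalize(), torch.zeros(1, 3, 416, 416), "normalize.onnx")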

Common options:

  • [0,1] values, mean=0 and scale=255 (([0,255] - 0) / 255 = [0,1])

  • [-1,1] values, mean=127.5 and scale=127.5 (([0,255] - 127.5) / 127.5 = [-1,1])

  • [-0.5,0.5] values, mean=127.5 and scale=255 (([0,255] - 127.5) / 255 = [-0.5,0.5])
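For example, to get [-1,1] input normalization on a 3-channel model, the local conversion command from above would become (same command, per-channel values):

mo --input_model path/to/model.onnx --data_type=FP16 --mean_values=[127.5,127.5,127.5] --scale_values=[127.5,127.5,127.5]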

Model layout parameter

OpenVINO’s documentation here. The model layout can be specified with the --layout parameter. We use the Planar / CHW layout convention. A DepthAI error message similar to the one below will be shown if the image layout does not match the model layout:

[NeuralNetwork(0)] [warning] Input image (416x416) does not match NN (3x416)

Note that by default, the ColorCamera node will output preview frames in the Interleaved / HWC layout (as it’s native to OpenCV). This can be changed to the Planar layout via the API:

import depthai as dai
pipeline = dai.Pipeline()
colorCam = pipeline.createColorCamera()
colorCam.setInterleaved(False) # False = Planar layout

Color order

OpenVINO’s documentation here. NN models can be trained on images that have either RGB or BGR color order. You can change from one to the other using the --reverse_input_channels parameter. We use the BGR color order. For an example, see Changing color order.
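If your model was trained on RGB frames, the flag can simply be appended to the local conversion command shown earlier:

mo --input_model path/to/model.onnx --data_type=FP16 --reverse_input_channels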

Note that by default, the ColorCamera node will output preview frames in the BGR color order (as it’s native to OpenCV). This can be changed to the RGB color order via the API:

import depthai as dai
pipeline = dai.Pipeline()
colorCam = pipeline.createColorCamera()
colorCam.setColorOrder(dai.ColorCameraProperties.ColorOrder.RGB) # RGB color order, BGR by default

Compile Tool

After converting the model to OpenVINO’s IR format (.bin/.xml), we need to use OpenVINO’s Compile Tool to compile the model into a .blob file, which can be deployed to the device (tutorial here).

Input layer precision: RVC2 only supports FP16 precision, so -ip U8 will add a conversion layer (U8 -> FP16) on all input layers of the model, which is usually what we want. In some cases (e.g. when we aren’t dealing with frames), we want to use FP16 precision directly, so we can use -ip FP16 (Cosine distance model example).

Shaves: RVC2 has a total of 16 SHAVE cores (see Hardware accelerators documentation). Compiling for more SHAVEs can make the model perform faster, but performance does not scale linearly with the number of SHAVE cores. The firmware will warn you about a potentially optimal number of SHAVE cores, which is available_cores/2, since by default each model runs on 2 threads.
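As a sketch, a typical Compile Tool invocation with both parameters might look as follows (the paths and the SHAVE/CMX counts of 6 are placeholder choices, not fixed recommendations):

compile_tool -m path/to/model.xml -d MYRIAD -ip U8 -VPU_NUMBER_OF_SHAVES 6 -VPU_NUMBER_OF_CMX_SLICES 6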

Converting and compiling models

There are a few options to perform these steps:

  1. Using our online blobconverter app

  2. Using our blobconverter library

  3. Converting & Compiling locally

1. Using online blobconverter

You can visit our online Blobconverter app, which allows you to convert and compile NN models from TensorFlow, Caffe, ONNX, OpenVINO IR, and OpenVINO Model Zoo.

BlobConverter Web

2. Using blobconverter package

For automated usage of our blobconverter tool, we have released a blobconverter PyPI package that allows converting & compiling models both from the command line and directly from a Python script. Example usage below.

Installation and usage instructions can be found here.

import blobconverter

# Convert & compile an ONNX model and return the path to the resulting .blob
blob_path = blobconverter.from_onnx(
    model="/path/to/model.onnx",
    data_type="FP16",
    shaves=5,
)
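The same conversion can also be run from the command line; a sketch, assuming the CLI flags mirror the Python API above:

python3 -m blobconverter --onnx-model /path/to/model.onnx --shaves 5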

3. Local compilation

If you want to perform model conversion and compilation locally, you will need a local OpenVINO installation; you can then run Model Optimizer and Compile Tool yourself, as sketched below.
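A minimal end-to-end local flow simply combines the two commands discussed earlier (again, paths and SHAVE/CMX counts are placeholders; mo writes the IR to the working directory by default):

mo --input_model path/to/model.onnx --data_type=FP16 --mean_values=[0,0,0] --scale_values=[255,255,255]
compile_tool -m model.xml -d MYRIAD -ip U8 -VPU_NUMBER_OF_SHAVES 6 -VPU_NUMBER_OF_CMX_SLICES 6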

Troubleshooting

When converting your model to the OpenVINO format or compiling it to a .blob, you might come across an issue. This usually means that a layer, or a connection between two layers, is not supported.

For visualizing NN models, we suggest using the Netron app.

Netron

Supported layers

When converting your model to OpenVINO’s IR format (.bin and .xml), you have to check whether OpenVINO supports the layers that were used. Here are the supported layers and their limitations for Caffe, MXNet, TensorFlow, TensorFlow 2 Keras, Kaldi, and ONNX.

Unsupported layer type “layer_type”

When using compile_tool to compile from IR (.xml/.bin) into .blob, you might get an error like this:

Failed to compile layer "Resize_230": unsupported layer type "Interpolate"

This means that the layer type is not supported by the VPU (Intel’s Myriad X). You can find the OpenVINO layers supported by the VPU here, under the Supported Layers header, in the third column (VPU). Refer to Intel’s official troubleshooting docs for more information.

Incorrect data types

If the compiler returns something along the lines of “check error: input #0 has type S32, but one of [FP16] is expected”, it means that you are using incorrect data types. In the case above, an INT32 layer is connected directly to an FP16 layer. There should be a conversion between these layers, which we can achieve by inserting OpenVINO’s Convert layer between the two. You can do that by editing your model’s .xml and adding the Convert layer, as sketched below. You can find additional information on this Discord thread.
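A rough, hypothetical sketch of what such an inserted layer could look like in the IR .xml; the layer id, name, and port dimensions below are made up and must match your model, while Convert and its destination_type attribute come from OpenVINO’s opset:

<layer id="231" name="Convert_230" type="Convert" version="opset1">
    <data destination_type="f16"/>
    <input>
        <port id="0" precision="I32">
            <dim>1</dim>
            <dim>100</dim>
        </port>
    </input>
    <output>
        <port id="1" precision="FP16">
            <dim>1</dim>
            <dim>100</dim>
        </port>
    </output>
</layer>

You would also need to update the model’s <edges> section so that the INT32 output feeds into the Convert layer and the Convert layer’s output feeds the layer that expects FP16.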

Got questions?

Head over to Discussion Forum for technical support or any other questions you might have.