Manual Conversion with OpenVINO (RVC2 & RVC3)
Overview
RVC2 and RVC3 conversion is based on the OpenVINO toolkit. The ModelConverter Docker images come with all the necessary tools pre-installed. Install the ModelConverter CLI with:
Command Line
pip install modelconv
Then open an interactive conversion shell for your target platform:
Command Line
modelconverter shell <platform>
platform stands for the target platform you aim to convert for, so either rvc2 or rvc3. This is equivalent to starting a new Docker container from the luxonis/modelconverter-<platform>:latest image and running it as an interactive terminal session (-it) with the --rm flag to ensure the container is automatically removed once the session is exited:
Command Line
docker run --rm -it \
    -v $(pwd)/shared_with_container:/app/shared_with_container/ \
    luxonis/modelconverter-<platform>:latest
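Note that the shared_with_container folder in your working directory is mounted into the container, so any files you need inside it (such as the model you want to convert) should be placed there first. A minimal sketch, where model.onnx is a placeholder for your own model file:
Command Line
mkdir -p shared_with_container
cp model.onnx shared_with_container/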
Alternatively, if you prefer a local setup instead of Docker, you can install the OpenVINO development tools (which provide the Model Optimizer used in the steps below):
Command Line
pip install openvino-dev==2022.3
Simplify model (Optional)
In order to obtain a model with optimal performance, we recommend running model simplification prior to the conversion steps. For an .onnx model, you can run the simplification as:
Command Line
pip install onnxsim
onnxsim <path to .onnx model> <path to simplified .onnx model>
Compile OpenVINO IR
The model is first converted from its original format to the OpenVINO Intermediate Representation (IR) format. It consists of two files: one encoding the network topology (.xml file) and one storing the model's weights and biases (.bin file). The OpenVINO Model Optimizer (v2022.3.0) is used for this conversion, and the following source model formats are supported:
- ONNX
- TensorFlow
- PyTorch (via export to ONNX)
- PaddlePaddle
- MXNet
- Kaldi
- Caffe
To convert the source model to IR, run:
Command Line
mo --input_model <path to the (un-)simplified source model> --compress_to_fp16
We recommend including the --compress_to_fp16 parameter to obtain optimal performance on our devices. You can find additional details here. Consult mo --help for the full list of conversion options. If you plan to use your model in the Luxonis ecosystem, we recommend setting the flags so that the model expects un-normalized BGR input. Therefore, be sure to set the --reverse_input_channels, --mean_values, and --scale_values flags if the model expects RGB input or normalization.
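For illustration, assuming a model.onnx that was trained on RGB images normalized to the [0,1] range, the full conversion command could look like this (the file name and values are placeholders for your own model):
Command Line
mo --input_model model.onnx \
   --reverse_input_channels \
   --mean_values [0,0,0] \
   --scale_values [255,255,255] \
   --compress_to_fp16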
Quantize (RVC3 only)
If converting for RVC3, you must perform model quantization using the OpenVINO Post-Training Optimization Toolkit (POT) prior to compiling to BLOB. See the following example for guidance.
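As a rough sketch (not the full example referenced above), POT's simplified mode can quantize an IR model using a folder of representative images; the paths below are placeholders and the exact flags may differ between OpenVINO versions:
Command Line
pot -q default -m model.xml -w model.bin \
    --engine simplified --data-source <path to calibration images> \
    --output-dir quantized/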
Compile BLOB
Once the model has been transformed into OpenVINO's IR format, the next step is to compile it for inference on a MYRIAD device and convert it to the BLOB format. The OpenVINO Compile Tool (v2022.3.0) is used for this job. To convert the model from IR to BLOB, run:
Command Line
compile_tool -d MYRIAD -m <path to .xml model (make sure the .bin file is in the same directory)>
If you installed OpenVINO locally, the Compile Tool is located in the .../tools/compile_tool directory of your OpenVINO installation. You can run it as:
Command Line
cd .../tools/compile_tool
./compile_tool -d MYRIAD -m ...
Consult compile_tool -h for the full list of conversion options.
Advanced
Model Optimizer
Mean and Scale Values
The normalization of input images for the model is achieved through the --mean_values and --scale_values flags. By default, frames from the Camera node are in the U8 data type, ranging from [0,255]. However, models are typically trained with normalized frames within the range of [-1,1] or [0,1]. To ensure accurate inference results, frames need to be normalized beforehand. Although creating a custom model that normalizes frames before inference is an option (example here), it is more efficient to include this normalization directly within the model itself using these flags during the Model Optimizer step. Here are some common normalization options (assuming that the initial input is in the range of [0,255]):
- For required input with values between 0 and 1, use mean=0 and scale=255, computed as ([0,255] - 0) / 255 = [0,1].
- For required input with values between -1 and 1, use mean=127.5 and scale=127.5, computed as ([0,255] - 127.5) / 127.5 = [-1,1].
- For required input with values between -0.5 and 0.5, use mean=127.5 and scale=255, computed as ([0,255] - 127.5) / 255 = [-0.5,0.5].
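Per-channel values are also accepted. As a sketch, a model trained with the common ImageNet normalization (mean [0.485, 0.456, 0.406] and std [0.229, 0.224, 0.225] applied after scaling to [0,1]) would use those values multiplied by 255:
Command Line
--mean_values [123.675,116.28,103.53] --scale_values [58.395,57.12,57.375]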
Model Layout
The model layout can be defined using the --layout parameter. For example:
Command Line
--layout NCHW
where:
- N - batch size
- C - channels
- H - height
- W - width
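For instance, to state explicitly that the converted model expects planar (channel-first) input, the flag can be combined with the conversion command from above (model.onnx is a placeholder):
Command Line
mo --input_model model.onnx --layout NCHW --compress_to_fp16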
If the layout of the input frames does not match the layout expected by the model, you will get a warning similar to:
[NeuralNetwork(0)] [warning] Input image (416x416) does not match NN (3x416)
You have the option to switch between the Interleaved / HWC and Planar / CHW layout through the API when requesting the output:
Python
import depthai as dai
pipeline = dai.Pipeline()
cam = pipeline.create(dai.node.Camera).build()
output = cam.requestOutput(
    size=SIZE, type=dai.ImgFrame.Type.BGR888i  # or BGR888p (i stands for interleaved, and p stands for planar)
)
Color Order
Neural network models are commonly trained using images in RGB color order. The Camera node, by default, outputs frames in the BGR format. Mismatching the color order between the input frames and the trained model can lead to inaccurate predictions. To address this, the --reverse_input_channels flag is used during conversion. Moreover, there is an option to switch the camera output to RGB via the API, eliminating the need for the flag:
Python
import depthai as dai
pipeline = dai.Pipeline()
cam = pipeline.create(dai.node.Camera).build()
output = cam.requestOutput(
    size=SIZE, type=dai.ImgFrame.Type.RGB888p
)
Compile Tool
Input Layer Precision
Using -ip U8 will incorporate a U8->FP16 conversion layer on all input layers of the model, which is typically the desired configuration. However, in specific scenarios, such as when working with data other than frames, using FP16 precision directly is necessary. In such cases, you can opt for -ip FP16.
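For example, compiling a model whose input is raw FP16 data rather than camera frames could look like this (a sketch; model.xml is a placeholder):
Command Line
compile_tool -d MYRIAD -m model.xml -ip FP16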