Benchmarking

Overview

The ModelConverter benchmark tool provides a way to measure the on-device performance of converted models on supported targets (RVC2 and RVC4). It is intended to produce the same kinds of performance results typically shown in HubAI Model Zoo model cards.

Installation

To use the benchmark tool, install ModelConverter with its benchmarking dependencies included. You can follow the installation steps in the ModelConverter repository or the corresponding ModelConverter documentation section. To install ModelConverter from PyPI with benchmarking support, use:
Command Line
pip install modelconv[bench]
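After installation, you can confirm the benchmark subcommand is available by printing its help output (a quick sanity check, assuming the modelconverter CLI is on your PATH):
Command Line
modelconverter benchmark --help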

Usage

You can benchmark a model as follows:
Command Line
modelconverter benchmark <target> --model-path <path_or_slug>
The command prints a results table to the console. If you pass --save, it also writes the results to a CSV file in the current working directory.

Basic Example

To benchmark the YOLOv6 Nano model on RVC4, run:
Command Line
modelconverter benchmark rvc4 --model-path yolov6-nano:r2-coco-512x288 --save
This benchmarks the model, prints the results to the console, and saves them as yolov6-nano_benchmark_results.csv. For more advanced examples and options, see the Advanced Examples section.

Supported Targets

The benchmark command supports multiple targets via the positional <target> argument:
  • RVC2: DepthAI (DAI) pipeline benchmarking.
  • RVC4: DepthAI (DAI) pipeline benchmarking (default) or SNPE-based benchmarking via ADB connection to the RVC4 device.
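For example, the same model source can be benchmarked on either target by changing only the positional argument (illustrative commands; substitute your own model path or slug):
Command Line
# RVC2: DepthAI (DAI) pipeline benchmarking
modelconverter benchmark rvc2 --model-path <path_or_slug>
# RVC4: DAI pipeline by default; add --no-dai-benchmark for SNPE-based benchmarking
modelconverter benchmark rvc4 --model-path <path_or_slug>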

Model Sources

The --model-path argument is required and accepts the following model sources:
  • Local model artifacts: .blob files (for RVC2) or .dlc files (for RVC4)
  • Local NN Archives: Luxonis NN Archives (.tar.xz files containing model executables and configuration)
  • HubAI Model Slugs: Model slugs from HubAI's Model Zoo, such as yolov6-nano:r2-coco-512x288
Note: When using HubAI slugs, ensure your HUBAI_API_KEY environment variable is configured with valid credentials.
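For example, to benchmark a HubAI model slug you can export the API key first (a minimal sketch, assuming a Unix-like shell):
Command Line
export HUBAI_API_KEY=<your_hubai_api_key>
modelconverter benchmark rvc4 --model-path yolov6-nano:r2-coco-512x288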

Output Results

By default, results are printed to the console as a table. If --save is provided, results are written to a CSV file. The file name is derived from the model name and follows the pattern <model_name>_benchmark_results.csv.

CLI Options

You can customize the benchmarking process using additional command-line options:
Parameter          Description                                     Default   Platforms
--full             Run full benchmark with all configurations.     False     RVC2, RVC4
--save             Save benchmark results to CSV file.             False     RVC2, RVC4
--repetitions      Number of repetitions for DAI benchmark.        10        RVC2, RVC4
--benchmark-time   Benchmark duration in seconds.                  20        RVC2, RVC4
--num-threads      Number of threads for inference (DAI only).     2         RVC2, RVC4
--num-messages     Messages to send per inference run (DAI only).  50        RVC2, RVC4
--profile          SNPE performance profile to use.                balanced  RVC4
--runtime          SNPE runtime to use for inference.              dsp       RVC4
--num-images       Number of images for SNPE inference.            1000      RVC4
--device-ip        IP address of target device.                    None      RVC4
--device-id        Unique ID of target device.                     None      RVC4
--dai-benchmark    Use DAI benchmark instead of SNPE tools.        True      RVC4
--power-benchmark  Measure power consumption (requires ADB).       False     RVC4
--dsp-benchmark    Measure DSP utilization (requires ADB).         False     RVC4
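For instance, a DAI benchmark run that tunes the thread and message counts might look like this (the values below are illustrative, not recommendations):
Command Line
modelconverter benchmark rvc4 \
  --model-path <path_or_slug> \
  --num-threads 4 \
  --num-messages 100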

Controlling Benchmark Duration

You can control how long benchmarking runs in two ways:
  • Time-based (recommended for stable averages): --benchmark-time <seconds>
  • Fixed repetitions: --repetitions <N>
By default, time-based benchmarking is enabled (--benchmark-time 20) and takes precedence over --repetitions for both RVC2 and RVC4 targets. To run a fixed number of repetitions instead, disable time-based benchmarking:
Command Line
modelconverter benchmark rvc4 \
  --model-path <path_or_slug> \
  --benchmark-time -1 \
  --repetitions 50

RVC4 Related Options

Selecting the Device

On RVC4, you can target a specific device with either:
  • --device-ip <ip>: the IP address of the target RVC4 device.
  • --device-id <id>: the unique ID of the target RVC4 device.
If neither is provided, a device is selected automatically from the available devices. If both are provided, --device-id takes precedence.
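For example, to pin the benchmark to a particular device by its ID (replace the placeholders with your own values):
Command Line
modelconverter benchmark rvc4 \
  --model-path <path_or_slug> \
  --device-id <device_id>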

Power and DSP Monitoring

On RVC4, the tool can optionally record additional metrics during the benchmark run:
  • --power-benchmark: When enabled, the benchmark collects power readings from device hwmon nodes (when available) and reports the average power consumption statistics for the system and the processor during the benchmark run.
  • --dsp-benchmark: When enabled, the benchmark collects DSP utilization data from the device and reports the average DSP utilization during the benchmark run.
These options require ADB access to the device with root privileges.

DAI vs SNPE Benchmarking

RVC4 supports two benchmarking approaches:
  • DAI benchmark (default): Enabled with --dai-benchmark.
  • SNPE tools over ADB: Enabled with --no-dai-benchmark.
When to use each approach:
  • Use the default DAI benchmark for most use cases.
  • Use the SNPE path when you need manual SNPE-style benchmarking or want to benchmark .dlc files directly.
SNPE benchmark requirements: The SNPE path requires ADB access to the device. The tool uses snpe-parallel-run internally with the following behavior:
  • Prepares and sends a set of images to the device for inference (configured via --num-images).
  • Supports runtime and performance customization via --runtime and --profile options.
  • Runs on two threads by default (to match DAI defaults).

Manual SNPE benchmarking

For advanced use cases or custom workflows, you can benchmark models using SNPE tools directly without modelconverter benchmark. This approach is useful when:
  • Building custom benchmarking scripts or automation
  • Integrating with existing SNPE pipelines
  • Requiring fine-grained control over benchmarking parameters

Workflow

  1. Inspect the DLC model: Extract input tensor metadata, including tensor names, shapes, and data types. Use snpe-dlc-info to retrieve this information.
  2. Prepare input data:
    • Create input .raw files matching the expected tensor shapes and types.
    • Generate an input_list.txt file that lists all input files for processing.
  3. Run SNPE benchmarking tools: Choose the appropriate tool based on your benchmarking goals.
For detailed usage and options for these tools, refer to the SNPE General Tools documentation.
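A rough sketch of this workflow on the device is shown below. It uses snpe-net-run as one example of a general SNPE tool; exact flag names and available options can vary between SNPE SDK versions, so treat this as a starting point rather than a definitive recipe.
Command Line
# 1. Inspect the DLC to get input tensor names, shapes, and data types
snpe-dlc-info -i model.dlc

# 2. Prepare input .raw files and list them (one file per line) in input_list.txt,
#    e.g. inputs/image_000.raw, inputs/image_001.raw, ...

# 3. Run inference on the DSP runtime with a chosen performance profile
snpe-net-run \
  --container model.dlc \
  --input_list input_list.txt \
  --use_dsp \
  --perf_profile balanced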

Advanced Examples

NN Archive with Custom Duration

Benchmark an exported NN Archive on RVC2 with a longer test duration and save results for later comparison.
Command Line
modelconverter benchmark rvc2 \
  --model-path path/to/model_archive.tar.xz \
  --benchmark-time 60 \
  --save
Key points:
  • Runs a 60-second time-based benchmark for more stable results
  • Prints results to console and saves to CSV for comparison

HubAI Model with Power Monitoring

Benchmark a Model Zoo model using its HubAI Model Slug on RVC4 with power consumption and DSP utilization metrics.
Command Line
modelconverter benchmark rvc4 \
  --model-path yolov6-nano:r2-coco-512x288 \
  --device-ip 192.168.1.50 \
  --power-benchmark \
  --dsp-benchmark
Key points:
  • Use --device-id <id> as an alternative to --device-ip <ip>
  • Power and DSP monitoring require ADB access and device support

Local DLC via SNPE

Benchmark a local .dlc file using SNPE tools with a controlled workload size.
Command Line
modelconverter benchmark rvc4 \
  --model-path path/to/model.dlc \
  --no-dai-benchmark \
  --device-ip 192.168.1.50 \
  --num-images 500
Key points:
  • Uses SNPE benchmarking path instead of DAI (via --no-dai-benchmark)
  • Processes 500 generated input images
  • Targets a specific device by IP address