# Benchmarking

## Overview

The ModelConverter [benchmark tool](https://github.com/luxonis/modelconverter#benchmarking) measures the on-device performance
of converted models on supported targets (RVC2 and RVC4). It is intended to produce the same kinds of performance results
typically shown in [HubAI Model Zoo](https://models.luxonis.com/) model cards.

## Installation

To use the benchmark tool, install ModelConverter and ensure its benchmarking dependencies are included. You can follow the
installation steps in the [ModelConverter repository](https://github.com/luxonis/modelconverter#installation) or the corresponding
[ModelConverter documentation
section](https://docs.luxonis.com/software-v3/ai-inference/conversion/rvc-conversion/offline/modelconverter.md).

To install ModelConverter from PyPI with benchmarking support, use:

```bash
pip install modelconv[bench]
```

## Usage

You can benchmark a model as follows:

```bash
modelconverter benchmark <target> --model-path <path_or_slug>
```

The command prints a results table to the console. If you pass `--save`, it also writes the results to a CSV file in the current
working directory.

### Basic Example

To benchmark the [YOLOv6 Nano](https://models.luxonis.com/luxonis/yolov6-nano/face58c4-45ab-42a0-bafc-19f9fee8a034?backTo=%2F)
model on RVC4, run:

```bash
modelconverter benchmark rvc4 --model-path yolov6-nano:r2-coco-512x288 --save
```

This benchmarks the model, prints the results to the console, and saves them as yolov6-nano_benchmark_results.csv.

For more advanced examples and options, see the [Advanced Examples](#Benchmarking-Advanced%20Examples) section.

### Supported Targets

The benchmark command supports multiple targets via the positional `<target>` argument:

 * RVC2: DepthAI (DAI) pipeline benchmarking.
 * RVC4: DepthAI (DAI) pipeline benchmarking (default) or SNPE-based benchmarking via ADB connection to the RVC4 device.

### Model Sources

The --model-path argument is required and accepts the following model sources:

 * Local model artifacts: .blob files (for RVC2) or .dlc files (for RVC4)
 * Local NN Archives: [Luxonis NN Archives](https://docs.luxonis.com/software-v3/ai-inference/nn-archive.md) (.tar.xz files
   containing model executables and configuration)
 * HubAI Model Slugs: Model slugs from [HubAI's Model Zoo](https://models.luxonis.com/), such as yolov6-nano:r2-coco-512x288

Note: When using HubAI slugs, ensure your HUBAI_API_KEY environment variable is configured with valid credentials.
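When benchmarking by slug from a script, it can help to fail early if the key is missing. The following is a minimal sketch: `require_hubai_key` is a hypothetical helper, not part of ModelConverter, and the key value shown in the comment is a placeholder.

```shell
# Hypothetical helper (not part of ModelConverter): fail early if the
# HUBAI_API_KEY environment variable is unset or empty.
require_hubai_key() {
  if [ -z "${HUBAI_API_KEY:-}" ]; then
    echo "HUBAI_API_KEY is not set; export it before using a HubAI slug." >&2
    return 1
  fi
}

# Typical use in a script (replace the placeholder with your own key):
#   export HUBAI_API_KEY="<your_api_key>"
#   require_hubai_key && modelconverter benchmark rvc4 --model-path yolov6-nano:r2-coco-512x288
```

The check only confirms that the variable is non-empty; it does not verify that the credentials are valid.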

### Output Results

By default, results are printed to the console as a table.

If `--save` is provided, results are written to a CSV file. The file name is derived from the model name and follows the pattern
`<model_name>_benchmark_results.csv`.
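Because each saved run is named after its model, several models can be benchmarked in one pass and compared afterwards. The sketch below assumes a hypothetical wrapper function `benchmark_all` and a hypothetical `DRY_RUN` variable for previewing commands; only the `modelconverter benchmark` invocation itself comes from this page.

```shell
# Hypothetical wrapper: benchmark several models on one target and save
# one CSV per model. Set DRY_RUN=1 to print the commands without running them.
benchmark_all() {
  local target="$1"
  shift
  local model
  for model in "$@"; do
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "modelconverter benchmark $target --model-path $model --save"
    else
      modelconverter benchmark "$target" --model-path "$model" --save
    fi
  done
}

# Example (the second entry is a placeholder path):
#   benchmark_all rvc4 yolov6-nano:r2-coco-512x288 path/to/model_archive.tar.xz
```

The CSV files land in the current working directory, so running the loop from a dedicated results directory keeps them together.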

### CLI Options

You can customize the benchmarking process using additional command-line options:

| Parameter | Description | Default | Platforms |
| --- | --- | --- | --- |
| `full` | Run full benchmark with all configurations. | `False` | RVC2, RVC4 |
| `save` | Save benchmark results to CSV file. | `False` | RVC2, RVC4 |
| `repetitions` | Number of repetitions for DAI benchmark. | `10` | RVC2, RVC4 |
| `benchmark-time` | Benchmark duration in seconds. | `20` | RVC2, RVC4 |
| `num-threads` | Number of threads for inference (DAI only). | `2` | RVC2, RVC4 |
| `num-messages` | Messages to send per inference run (DAI only). | `50` | RVC2, RVC4 |
| `profile` | SNPE performance profile to use. | `balanced` | RVC4 |
| `runtime` | SNPE runtime to use for inference. | `dsp` | RVC4 |
| `num-images` | Number of images for SNPE inference. | `1000` | RVC4 |
| `device-ip` | IP address of target device. | `None` | RVC4 |
| `device-id` | Unique ID of target device. | `None` | RVC4 |
| `dai-benchmark` | Use DAI benchmark instead of SNPE tools. | `True` | RVC4 |
| `power-benchmark` | Measure power consumption (requires ADB). | `False` | RVC4 |
| `dsp-benchmark` | Measure DSP utilization (requires ADB). | `False` | RVC4 |

### Controlling Benchmark Duration

You can control how long benchmarking runs in two ways:

 * Time-based (recommended for stable averages): `--benchmark-time <seconds>`
 * Fixed repetitions: `--repetitions <N>`

By default, time-based benchmarking is enabled (`--benchmark-time 20`) and takes precedence over `--repetitions` on both RVC2 and
RVC4 targets.

To run a fixed number of repetitions instead, disable time-based benchmarking by setting `--benchmark-time -1`:

```bash
modelconverter benchmark rvc4 \
  --model-path <path_or_slug> \
  --benchmark-time -1 \
  --repetitions 50
```

### RVC4 Related Options

#### Selecting the Device

On RVC4, you can target a specific device with one of the following options:

 * `--device-ip <ip>`: the IP address of the target RVC4 device.
 * `--device-id <id>`: the unique ID of the target RVC4 device.

If neither is provided, the device is selected automatically from the available devices. If both are provided, `--device-id` takes
precedence.

#### Power and DSP Monitoring

On RVC4, the tool can optionally record additional metrics during the benchmark run:

 * --power-benchmark: When enabled, the benchmark collects power readings from device hwmon nodes (when available) and reports the
   average power consumption statistics for the system and the processor during the benchmark run.
 * --dsp-benchmark: When enabled, the benchmark collects DSP utilization data from the device and reports the average DSP
   utilization during the benchmark run.

These options require (root) ADB access to the device.
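A quick pre-flight check can confirm ADB access before starting a long run. This is a sketch under assumptions: `check_adb_root` is a hypothetical helper, not part of ModelConverter, and it assumes that a single device is connected and that the device shell provides the `id` command.

```shell
# Hypothetical pre-flight check (not part of ModelConverter): confirm that
# adb can reach a device and that the adb shell runs as root.
check_adb_root() {
  # `adb get-state` prints "device" when a device is connected and ready.
  if [ "$(adb get-state 2>/dev/null)" != "device" ]; then
    echo "No ADB device available." >&2
    return 1
  fi
  # `id -u` prints 0 when the remote shell runs as root.
  if [ "$(adb shell id -u 2>/dev/null | tr -d '\r')" != "0" ]; then
    echo "ADB shell is not root; try 'adb root' first." >&2
    return 1
  fi
  echo "ADB root access OK"
}
```

With more than one device attached, adb needs an explicit serial (`adb -s <serial> ...`), so the helper would have to be adapted accordingly.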

#### DAI vs SNPE Benchmarking

RVC4 supports two benchmarking approaches:

 * DAI benchmark (default): Enabled with --dai-benchmark.
 * SNPE tools over ADB: Enabled with --no-dai-benchmark.

When to use each approach:

 * Use the default DAI benchmark for most use cases.
 * Use the SNPE path when you need manual SNPE-style benchmarking or want to benchmark .dlc files directly.

SNPE benchmark requirements:

The SNPE path requires ADB access to the device. The tool uses snpe-parallel-run internally with the following behavior:

 * Prepares and sends a set of images to the device for inference (configured via --num-images).
 * Supports runtime and performance customization via --runtime and --profile options.
 * Runs on two threads by default (to match DAI defaults).
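The SNPE-path options can be combined in a single invocation. The sketch below assembles such a command using only flags documented on this page; `snpe_benchmark_cmd` is a hypothetical helper, its fallback values mirror the documented defaults, and the model path is a placeholder.

```shell
# Hypothetical helper: print an SNPE-path benchmark command for review.
# Arguments: <model> [runtime] [profile] [num_images]
# Fallback values mirror the defaults listed in the CLI options table.
snpe_benchmark_cmd() {
  local model="$1"
  local runtime="${2:-dsp}"
  local profile="${3:-balanced}"
  local num_images="${4:-1000}"
  echo modelconverter benchmark rvc4 \
    --model-path "$model" \
    --no-dai-benchmark \
    --runtime "$runtime" \
    --profile "$profile" \
    --num-images "$num_images"
}

# Print the command with default settings (placeholder model path).
snpe_benchmark_cmd path/to/model.dlc
```

Printing the command first makes it easy to check the chosen runtime, profile, and image count before copying it into a terminal.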

## Manual SNPE Benchmarking

For advanced use cases or custom workflows, you can benchmark models using SNPE tools directly, without `modelconverter benchmark`.
This approach is useful when:

 * Building custom benchmarking scripts or automation
 * Integrating with existing SNPE pipelines
 * Requiring fine-grained control over benchmarking parameters

### Workflow

 1. Inspect the DLC model. Extract input tensor metadata, including tensor names, shapes, and data types. Use snpe-dlc-info to
    retrieve this information.
 2. Prepare input data.
    * Create input .raw files matching the expected tensor shapes and types.
    * Generate an input_list.txt file that lists all input files for processing.
 3. Run SNPE benchmarking tools. Choose the appropriate tool based on your benchmarking goals:
    * [snpe-parallel-run](https://docs.qualcomm.com/nav/home/SNPE_general_tools.html?product=1601111740010412#snpe-parallel-run)
    * [snpe-throughput-net-run](https://docs.qualcomm.com/nav/home/SNPE_general_tools.html?product=1601111740010412#snpe-throughput-net-run)

For detailed usage and options for these tools, refer to the [SNPE General Tools
documentation](https://docs.qualcomm.com/nav/home/SNPE_general_tools.html?product=1601111740010412).
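The input preparation step of the workflow above can be sketched as follows. This is an illustration under assumptions: `make_raw_inputs` is a hypothetical helper, the files are filled with random bytes (adequate for timing runs, not for accuracy checks), and the example shape and data type are invented; take the real values from the snpe-dlc-info output for your model.

```shell
# Hypothetical helper: create <count> random .raw files of <num_bytes> bytes
# each in <out_dir> and list them in <out_dir>/input_list.txt, one path per
# line (the layout used for single-input models).
make_raw_inputs() {
  local out_dir="$1" count="$2" num_bytes="$3"
  mkdir -p "$out_dir"
  : > "$out_dir/input_list.txt"   # start with an empty list file
  local i=1
  while [ "$i" -le "$count" ]; do
    head -c "$num_bytes" /dev/urandom > "$out_dir/input_$i.raw"
    echo "$out_dir/input_$i.raw" >> "$out_dir/input_list.txt"
    i=$((i + 1))
  done
}

# Example: an assumed float32 input of shape 1x288x512x3 needs
# 1*288*512*3*4 = 1769472 bytes per file. Check the real shape and type
# with `snpe-dlc-info -i model.dlc` before generating inputs.
#   make_raw_inputs inputs 10 $((1 * 288 * 512 * 3 * 4))
```

The paths written to input_list.txt must be valid in the location where the SNPE tool runs, so adjust them if the files are later copied to the device.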

## Advanced Examples

### NN Archive with Custom Duration

Benchmark an exported NN Archive on RVC2 with a longer test duration and save results for later comparison.

```bash
modelconverter benchmark rvc2 \
  --model-path path/to/model_archive.tar.xz \
  --benchmark-time 60 \
  --save
```

Key points:

 * Runs a 60-second time-based benchmark for more stable results
 * Prints results to console and saves to CSV for comparison

### HubAI Model with Power Monitoring

Benchmark a Model Zoo model using its HubAI Model Slug on RVC4 with power consumption and DSP utilization metrics.

```bash
modelconverter benchmark rvc4 \
  --model-path yolov6-nano:r2-coco-512x288 \
  --device-ip 192.168.1.50 \
  --power-benchmark \
  --dsp-benchmark
```

Key points:

 * Use `--device-id <id>` as an alternative to `--device-ip <ip>`
 * Power and DSP monitoring require ADB access and device support

### Local DLC via SNPE

Benchmark a local .dlc file using SNPE tools with a controlled workload size.

```bash
modelconverter benchmark rvc4 \
  --model-path path/to/model.dlc \
  --no-dai-benchmark \
  --device-ip 192.168.1.50 \
  --num-images 500
```

Key points:

 * Uses SNPE benchmarking path instead of DAI (via --no-dai-benchmark)
 * Processes 500 generated input images
 * Targets a specific device by IP address