NN Archive

Overview

NN Archive is our own format that packages the model executable(s) and configuration files together into a .tar.xz archive. The configuration file encodes the model's architecture, pre- and post-processing, and other information required for seamless use across our ecosystem. For example, it simplifies specifying conversion parameters when converting the model with the ModelConverter tool or through Detailed Conversion in Hub, and it simplifies handling of pre- and post-processing when deploying the model in a DepthAI pipeline.

An NN Archive always describes the model executable(s) stored inside it.

If the archive contains an ONNX model, the values in config.json should match that ONNX model.

If the archive contains a compiled RVC model, the values in config.json should describe that compiled model.

Model Executable(s)

The model executable(s) constitute the actual model used for inference. If the archive is destined for conversion, the file(s) need to be in one of the formats supported by ModelConverter:

ONNX (.onnx),
OpenVINO IR (.xml and .bin), or
TensorFlow Lite (.tflite).

If the archive is intended for inference on our devices, the file needs to be in an RVC-compiled format matching the requirements of the RVC Platform it is meant to run on:

Binary Large Object (.blob or .superblob) for RVC2 and RVC3 Platforms, or
Deep Learning Container (.dlc) for RVC4 Platform.
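
The two lists above can be captured in a small lookup. The following is a minimal sketch and not part of any Luxonis library: classify_executable and the two dictionaries are hypothetical names introduced for illustration, and the suffix-to-format mapping simply mirrors the lists above.

```python
from pathlib import Path

# Suffixes taken from the lists above; the dictionary names are illustrative.
CONVERSION_FORMATS = {
    ".onnx": "ONNX",
    ".xml": "OpenVINO IR",
    ".bin": "OpenVINO IR",
    ".tflite": "TensorFlow Lite",
}
INFERENCE_FORMATS = {
    ".blob": "RVC2/RVC3",
    ".superblob": "RVC2/RVC3",
    ".dlc": "RVC4",
}


def classify_executable(path: str) -> tuple[str, str]:
    """Return (purpose, format or platform) for a model file, based only on its suffix."""
    suffix = Path(path).suffix.lower()
    if suffix in CONVERSION_FORMATS:
        return "conversion", CONVERSION_FORMATS[suffix]
    if suffix in INFERENCE_FORMATS:
        return "inference", INFERENCE_FORMATS[suffix]
    raise ValueError(f"Unsupported model file suffix: {suffix!r}")
```

Because the check looks only at the file suffix, it cannot tell an OpenVINO .bin weights file from an unrelated .bin file; treat it as a first filter rather than a validation of the file contents.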

Configuration

The config.json file encodes the config schema version and a dictionary representing the model, with inputs, outputs, heads, and metadata sections.

Where to get the values

When building an ONNX NN Archive by hand, the most common sources of truth are:

the original training or inference code, for preprocessing and output semantics,
Netron, for tensor names, shapes, and layouts,
and onnxruntime, for input and output names, shapes, and dtypes.
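
As a rough illustration of the onnxruntime route, the sketch below collects the name, shape, and dtype of each input and output from a session object. describe_io and ORT_TO_DTYPE are hypothetical helpers, not part of onnxruntime or luxonis-ml. The dtype names on the right-hand side of the mapping follow NumPy naming, and whether config.json accepts each of them is an assumption. A stand-in session is used so the example runs without a model file.

```python
from types import SimpleNamespace

# onnxruntime reports element types as strings such as "tensor(float)".
# The target names follow NumPy conventions (an assumption for config.json).
ORT_TO_DTYPE = {
    "tensor(float)": "float32",
    "tensor(float16)": "float16",
    "tensor(double)": "float64",
    "tensor(uint8)": "uint8",
    "tensor(int8)": "int8",
    "tensor(int32)": "int32",
    "tensor(int64)": "int64",
}


def describe_io(session) -> dict:
    """Collect name, shape, and dtype of every model input and output.

    `session` is expected to behave like onnxruntime.InferenceSession:
    get_inputs() / get_outputs() return objects with .name, .shape, .type.
    """
    def describe(node):
        return {
            "name": node.name,
            "shape": list(node.shape),
            "dtype": ORT_TO_DTYPE.get(node.type, node.type),
        }

    return {
        "inputs": [describe(n) for n in session.get_inputs()],
        "outputs": [describe(n) for n in session.get_outputs()],
    }


# Stand-in session so the sketch runs without a model file.
fake_session = SimpleNamespace(
    get_inputs=lambda: [
        SimpleNamespace(name="input_layer_name", shape=[1, 3, 256, 256], type="tensor(float)")
    ],
    get_outputs=lambda: [
        SimpleNamespace(name="output", shape=[1, 3], type="tensor(float)")
    ],
)
io_info = describe_io(fake_session)
```

With a real model, the stand-in would be replaced by onnxruntime.InferenceSession("model.onnx"). Dynamic dimensions may be reported as strings or None rather than integers, in which case they need to be resolved to concrete values before being written to config.json.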

Inputs

This section configures the model's input stream(s). It's defined as a list of Input dicts. Each consists of the following fields:

name (str) - Name of the input layer. Copy this from the model stored in the archive.
dtype (str) - Input tensor data type (e.g. float32 or uint8). This should match the datatype expected by the model input tensor.
input_type (str) - Type of input data ('raw' or 'image').
shape (list of ints) - Shape of the input data as a list of integers (e.g. [H,W], [H,W,C], [N,H,W,C], ...). This should match the model input tensor shape exactly.
layout (str) - Letter-code interpretation of the input data dimensions (e.g. NCHW). This should match the order used by the model tensor shape.
preprocessing (dict) - Preprocessing applied to the input data.
- mean (list of floats) - Mean values in channel order. The channel order must match the one the model was trained with.
- scale (list of floats) - Standardization values in channel order. The channel order must match the one the model was trained with.
- reverse_channels (bool) - If True, input to the model is RGB; otherwise BGR. Deprecated; will be replaced by the dai_type flag in future versions.
- interleaved_to_planar (bool) - If True, input to the model is interleaved (NHWC); otherwise planar (NCHW). Deprecated; will be replaced by the dai_type flag in future versions.
- dai_type (str) - DepthAI input type, read by DepthAI to set up the pipeline automatically.

mean and scale should describe the preprocessing expected by the model stored in the NN Archive.

If your original preprocessing is input = (input / 255.0 - mean) / std, store 255 * mean as mean and 255 * std as scale.

Do not swap mean and scale values when changing between RGB and BGR; only reorder the channels to match the encoding expected by the model.
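
The conversion rule above can be checked numerically. The sketch below assumes, as the formula implies, that the archive values are applied to raw pixel values as (input - mean) / scale. The helper names and the ImageNet-style statistics are illustrative only.

```python
import math


def to_archive_preprocessing(mean, std):
    """Convert mean/std defined on [0, 1] inputs to values for raw [0, 255] pixels."""
    return [255.0 * m for m in mean], [255.0 * s for s in std]


def reorder_channels(values):
    """Reverse per-channel values, e.g. RGB order -> BGR order."""
    return list(reversed(values))


# Commonly used ImageNet statistics in RGB order (illustrative values).
mean_rgb = [0.485, 0.456, 0.406]
std_rgb = [0.229, 0.224, 0.225]

archive_mean, archive_scale = to_archive_preprocessing(mean_rgb, std_rgb)

# Check on one pixel that both formulations give the same result.
pixel = [128.0, 64.0, 255.0]
original = [(p / 255.0 - m) / s for p, m, s in zip(pixel, mean_rgb, std_rgb)]
archived = [(p - m) / s for p, m, s in zip(pixel, archive_mean, archive_scale)]
assert all(
    math.isclose(a, b, rel_tol=1e-9, abs_tol=1e-12)
    for a, b in zip(original, archived)
)

# For a model that expects BGR input, reorder each list to the model's
# channel order; the values themselves do not change.
archive_mean_bgr = reorder_channels(archive_mean)
archive_scale_bgr = reorder_channels(archive_scale)
```

The equivalence follows from (x / 255 - mean) / std = (x - 255 * mean) / (255 * std).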

Outputs

This section configures the model's output stream(s). It's defined as a list of Output dicts. Each consists of the following fields:

name (str) - Name of the output layer.
dtype (str) - Data type of the output data (e.g. float32).
shape (list of ints) - Shape of the output tensor.
layout (str) - Letter-code interpretation of the output tensor dimensions (e.g. NC or NCHW).
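
Since shape and layout describe the same tensor in both the Inputs and Outputs sections, a quick consistency check can catch copy-paste mistakes before an archive is built. The following is a minimal sketch: check_tensor_spec is a hypothetical helper, and the rule that mean and scale hold one value per channel is an assumption based on the "in channel order" wording.

```python
def check_tensor_spec(spec: dict) -> list[str]:
    """Return a list of problems found in one input or output dict."""
    problems = []
    name = spec.get("name", "<unnamed>")
    shape = spec.get("shape")
    layout = spec.get("layout")
    if shape is None or layout is None:
        return problems  # nothing to cross-check

    if len(shape) != len(layout):
        problems.append(
            f"{name}: layout {layout!r} has {len(layout)} letters, "
            f"shape has {len(shape)} dimensions"
        )
        return problems

    # Assumption: mean and scale carry one value per channel ("C" axis).
    preprocessing = spec.get("preprocessing") or {}
    if "C" in layout:
        channels = shape[layout.index("C")]
        for key in ("mean", "scale"):
            values = preprocessing.get(key)
            if values is not None and len(values) != channels:
                problems.append(
                    f"{name}: {key} has {len(values)} values, "
                    f"expected {channels} (size of the C axis)"
                )
    return problems


example_input = {
    "name": "input_layer_name",
    "shape": [1, 3, 256, 256],
    "layout": "NCHW",
    "preprocessing": {"mean": [0.0, 0.0, 0.0], "scale": [1.0, 1.0, 1.0]},
}
assert check_tensor_spec(example_input) == []
```

Declaring the same tensor with layout NHWC but shape [1, 3, 256, 256] would be flagged, because the C axis would then have size 256 while mean and scale hold three values.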

Heads

This section configures the post-processing steps applied to the model's output(s). It's defined as a list of Head dicts. Each consists of the following fields:

parser (str) - Name of the depthai-nodes parser responsible for post-processing the model's output(s).
outputs (list of str) - Names of the outputs that should be fed into the parser. If None, all outputs are fed.
metadata (dict) - Head-specific metadata. Consult the source code for more information.

The Heads section is optional. If not defined, we assume raw output (no post-processing required). Use heads when the model output needs semantic interpretation, for example class names for classification or parser-specific metadata for detection or segmentation.
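
A head refers to model outputs by name, so a typo in a head's outputs list is easy to make. The sketch below, using a hypothetical check_heads helper, reports head outputs that are not declared in the outputs section. It operates on the model dictionary of a config.json and treats a missing or None outputs field as "feed all outputs", as described above.

```python
def check_heads(model_cfg: dict) -> list[str]:
    """Report head outputs that do not match any declared model output."""
    declared = {out["name"] for out in model_cfg.get("outputs", [])}
    problems = []
    for index, head in enumerate(model_cfg.get("heads") or []):
        # None means "feed every output to the parser", so nothing to check.
        for name in head.get("outputs") or []:
            if name not in declared:
                problems.append(
                    f"head {index} ({head.get('parser')}): unknown output {name!r}"
                )
    return problems


model_cfg = {
    "outputs": [{"name": "output", "dtype": "float32", "shape": [1, 3], "layout": "NC"}],
    "heads": [
        {
            "parser": "ClassificationParser",
            "outputs": ["output"],
            "metadata": {"classes": ["Class1", "Class2", "Class3"], "n_classes": 3},
        }
    ],
}
assert check_heads(model_cfg) == []
```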

Metadata

This section represents the model's metadata. It's defined as a Metadata dict. It consists of the following fields:

name (str) - Name of the model.
path (str) - Path to the model executable (e.g. 'model.onnx'). The path is relative to the archive root. For an OpenVINO IR model, provide only the path to the .xml file and make sure that the .bin file is located in the same directory.
precision (str) - Precision of the model's weights (e.g. float32 or float16). This is the precision of the stored model file itself, not the datatype of individual input or output tensors.

You may also add arbitrary fields to this section that help describe the model.
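
The path rule for OpenVINO IR models can be sketched as a helper that lists which files a given metadata.path implies. required_files is a hypothetical name, and the assumption that the .bin file shares the base name of the .xml file follows the usual OpenVINO IR convention rather than anything stated above.

```python
from pathlib import PurePosixPath


def required_files(metadata: dict) -> list[str]:
    """List the files that metadata['path'] implies must be in the archive."""
    path = PurePosixPath(metadata["path"])
    if path.is_absolute():
        raise ValueError("metadata.path must be relative to the archive root")
    files = [str(path)]
    if path.suffix == ".xml":
        # OpenVINO IR: the weights file is expected next to the .xml file.
        # Assumption: it has the same base name with a .bin suffix.
        files.append(str(path.with_suffix(".bin")))
    return files
```

For example, required_files({"path": "model.xml"}) lists both model.xml and model.bin, while an ONNX path yields only the single file.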

Generation

To generate an NN Archive, follow the steps below:

Prepare the model executable(s), either by training a model from scratch or by acquiring one from an existing source (mind the model format).
Prepare a config.json. Example configuration for a simple single-input, single-output ONNX model without any pre-processing (mean=0, scale=1, reverse_channels=False) or post-processing requirements:

JSON

{
    "config_version": null,
    "model": {
        "metadata": {
            "name": "ModelName",
            "path": "model_name.onnx",
            "precision": "float32"
        },
        "inputs": [
            {
                "name": "input_layer_name",
                "dtype": "float32",
                "input_type": "image",
                "shape": [
                    1,
                    3,
                    256,
                    256
                ],
                "layout": "NCHW",
                "preprocessing": {
                    "mean": [
                        0.0,
                        0.0,
                        0.0
                    ],
                    "scale": [
                        1.0,
                        1.0,
                        1.0
                    ],
                    "reverse_channels": false,
                    "interleaved_to_planar": null
                }
            }
        ],
        "outputs": [
            {
                "name": "output",
                "dtype": "float32",
                "shape": [
                    1,
                    3
                ],
                "layout": "NC"
            }
        ],
        "heads": []
    }
}

We can add a post-processing step by expanding the heads section. Example head for a model with three classification categories:

JSON

{
    ...
    "model": {
        ...
        "heads": [
            {
                "parser": "ClassificationParser",
                "metadata": {
                    "postprocessor_path": null,
                    "classes": [
                        "Class1",
                        "Class2",
                        "Class3"
                    ],
                    "n_classes": 3,
                    "is_softmax": true
                },
                "outputs": [
                    "output"
                ]
            }
        ]
    }
}

Install luxonis-ml and run the ArchiveGenerator:

Python

import json

from luxonis_ml.nn_archive.archive_generator import ArchiveGenerator
from luxonis_ml.nn_archive.config import CONFIG_VERSION

cfg_path = ...  # string path to the configuration JSON file
with open(cfg_path, "r") as file:
    cfg_dict = json.load(file)
cfg_dict["config_version"] = CONFIG_VERSION  # set config version from luxonis-ml

generator = ArchiveGenerator(
    archive_name=...,  # string name of the generated archive
    save_path=...,  # string path to where you want to save the archive file
    cfg_dict=cfg_dict,
    executables_paths=...,  # list of string paths to relevant model executables
)

generator.make_archive()  # archive file is saved to the specified save_path
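
After generation, it can be useful to confirm what ended up in the archive. The sketch below uses only the Python standard library to read config.json from a .tar.xz file and list its members. read_archive_config is a hypothetical helper, and the location of config.json at the archive root is an assumption.

```python
import json
import tarfile


def read_archive_config(archive_path: str) -> tuple[dict, list[str]]:
    """Return (parsed config.json, names of regular files) from a .tar.xz archive."""
    with tarfile.open(archive_path, "r:xz") as tar:
        names = [member.name for member in tar.getmembers() if member.isfile()]
        # Assumption: config.json is stored at the archive root.
        cfg_name = None
        for candidate in ("config.json", "./config.json"):
            if candidate in names:
                cfg_name = candidate
                break
        if cfg_name is None:
            raise FileNotFoundError("config.json not found at the archive root")
        with tar.extractfile(cfg_name) as fh:
            config = json.load(fh)
    return config, names
```

A typical follow-up is to compare config["model"]["metadata"]["path"] against the returned names to verify that the referenced executable was actually packaged.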

Multi-Stage Models

Models sometimes consist of multiple stages. A common example is a two-stage model consisting of a main model (first stage) and a postprocessor (second stage), each with its own executable file(s). Such models can also be packaged into an NN Archive. Start by defining the configuration file for the first-stage model, following the steps described above. The second-stage model is defined simply by setting the postprocessor_path parameter in the Heads.Metadata section to the path of the second-stage executable (the path must be relative to the archive root). The NN Archive can then be constructed with the ArchiveGenerator. Make sure that the paths to the executables of both the first- and second-stage models are provided in the executables_paths argument.

Beware that the configuration file describes only the first-stage model; no information can be defined for the second-stage model.

If the second-stage model needs its own configuration, we recommend constructing two separate NN Archives.
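
To avoid forgetting the second-stage executable, the files referenced by a configuration can be collected programmatically. The following sketch uses a hypothetical collect_executables helper. It returns archive-relative names taken from metadata.path and from each head's postprocessor_path; mapping them to file-system paths for the executables_paths argument is left to the caller.

```python
def collect_executables(cfg: dict) -> list[str]:
    """Gather archive-relative paths of all executables referenced by a config."""
    model = cfg["model"]
    paths = [model["metadata"]["path"]]
    for head in model.get("heads") or []:
        post = (head.get("metadata") or {}).get("postprocessor_path")
        # A null postprocessor_path means the head has no second-stage model.
        if post and post not in paths:
            paths.append(post)
    return paths


two_stage_cfg = {
    "model": {
        "metadata": {"name": "ModelName", "path": "model_name.onnx"},
        "heads": [
            {
                "parser": "ClassificationParser",
                "metadata": {"postprocessor_path": "postprocessor.onnx"},
                "outputs": ["output"],
            }
        ],
    }
}
executables = collect_executables(two_stage_cfg)
```

The file name postprocessor.onnx is a placeholder chosen for this example.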
