# NN Archive

## Overview

NN Archive is our own format that packages the model executable(s) and a configuration file together into a .tar.xz archive.
The configuration file encodes the architecture, pre- and post-processing, and other relevant information about the model required for
seamless implementation across our ecosystem. For example, this simplifies specification of conversion parameters when converting
the model with the
[ModelConverter](https://docs.luxonis.com/software-v3/ai-inference/conversion/rvc-conversion/offline/modelconverter.md) tool or
through [Detailed Conversion in Hub](https://docs.luxonis.com/cloud/hubai/model-registry/detailed-conversion.md), or handling the
pre- and post-processing when deploying the model in a DepthAI pipeline.

> An NN Archive always describes the model executable(s) stored inside it. If the archive contains an `ONNX` model, the values in
> `config.json` should match that `ONNX` model. If the archive contains a compiled `RVC` model, the values in `config.json`
> should describe that compiled model.

## Model Executable(s)

The model executable(s) constitute the actual model used for inference. If the archive is destined for conversion, the file(s)
need to be in one of the formats supported by
[ModelConverter](https://docs.luxonis.com/software-v3/ai-inference/conversion/rvc-conversion/offline/modelconverter.md):

 * ONNX (.onnx),
 * OpenVINO IR (.xml and .bin), or
 * TensorFlow Lite (.tflite).

If the archive is intended for inference on our devices, the file needs to be in an RVC compiled format matching the requirements
of the RVC Platform it is meant to run on:

 * Binary Large Object (.blob or .superblob) for RVC2 and RVC3 Platforms, or
 * Deep Learning Container (.dlc) for RVC4 Platform.
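
The two lists above can be condensed into a small lookup. The following is only a sketch for illustration: the function name
`classify_executable` and the returned labels are hypothetical and not part of any Luxonis tool; only the file extensions come
from the lists above.

```python
from pathlib import Path

# Extensions copied from the lists above; the grouping labels are hypothetical.
CONVERSION_FORMATS = {".onnx", ".xml", ".bin", ".tflite"}
COMPILED_FORMATS = {
    ".blob": "RVC2/RVC3",
    ".superblob": "RVC2/RVC3",
    ".dlc": "RVC4",
}


def classify_executable(path: str) -> str:
    """Say what an executable is suited for, judging only by its extension."""
    suffix = Path(path).suffix.lower()
    if suffix in CONVERSION_FORMATS:
        return "conversion"
    if suffix in COMPILED_FORMATS:
        return f"inference on {COMPILED_FORMATS[suffix]}"
    raise ValueError(f"Unrecognized model executable extension: {suffix!r}")
```

For instance, `classify_executable("model.onnx")` falls into the conversion group, while a `.dlc` file is reported as an RVC4
inference target.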

## Configuration

The configuration is a config.json file that encodes the config scheme version and a dictionary representing the model, with
inputs, outputs, heads, and metadata sections.

### Where to get the values

When building an ONNX NN Archive by hand, the most common sources of truth are:

 * the original training or inference code, for preprocessing and output semantics,
 * [Netron](https://netron.app/), for tensor names, shapes, and layouts,
 * and onnxruntime, for input and output names, shapes, and dtypes.
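
As a rough illustration of the onnxruntime route, the sketch below reads tensor names, shapes, and dtypes from an ONNX file.
The helper names (`describe_tensors`, `describe_model`) and the `ORT_TO_DTYPE` mapping are assumptions made for this example;
the mapping is partial, and the dtype strings should be checked against what luxonis-ml accepts.

```python
# Partial mapping from onnxruntime type strings to dtype names (an assumption,
# not an official table).
ORT_TO_DTYPE = {
    "tensor(float)": "float32",
    "tensor(float16)": "float16",
    "tensor(uint8)": "uint8",
    "tensor(int8)": "int8",
    "tensor(int32)": "int32",
    "tensor(int64)": "int64",
}


def describe_tensors(node_args):
    """Convert onnxruntime NodeArg-like objects into name/dtype/shape dicts."""
    return [
        {
            "name": arg.name,
            # Unknown types are passed through unchanged so they stay visible.
            "dtype": ORT_TO_DTYPE.get(arg.type, arg.type),
            "shape": list(arg.shape),
        }
        for arg in node_args
    ]


def describe_model(onnx_path):
    """Collect input and output descriptions from an ONNX file."""
    import onnxruntime as ort  # imported lazily; needs `pip install onnxruntime`

    session = ort.InferenceSession(onnx_path, providers=["CPUExecutionProvider"])
    return {
        "inputs": describe_tensors(session.get_inputs()),
        "outputs": describe_tensors(session.get_outputs()),
    }
```

Dynamic dimensions are reported by onnxruntime as strings or `None` rather than integers, so they need to be replaced with
concrete values before being written to `config.json`. Layout and preprocessing still have to come from the training code or
Netron.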

### Inputs

This section configures the model's input stream(s). It's defined as a list of Input dicts. Each consists of the following fields:

 * name (str) - Name of the input layer. Copy this from the model stored in the archive.
 * dtype (str) - Input tensor data type (e.g. float32 or uint8). This should match the datatype expected by the model input
   tensor.
 * input_type (str) - Type of input data ('raw' or 'image').
 * shape (list of ints) - Shape of the input data as a list of integers (e.g. [H,W], [H,W,C], [N,H,W,C], ...). This should match
   the model input tensor shape exactly.
 * layout (str) - Lettercode interpretation of the input data dimensions (e.g. NCHW). This should match the order used by the
   model tensor shape.
 * preprocessing (dict) - Preprocessing applied to the input data.
   * mean (list of floats) - Mean values in channel order. The channel order must match the one the model was trained with.
   * scale (list of floats) - Standardization values in channel order. The channel order must match the one the model was
     trained with.
   * reverse_channels (bool) - If True, the input to the model is RGB; otherwise it is BGR. Deprecated, will be replaced by the
     dai_type flag in future versions.
   * interleaved_to_planar (bool) - If True, the input to the model is interleaved (NHWC); otherwise it is planar (NCHW).
     Deprecated, will be replaced by the dai_type flag in future versions.
   * dai_type (str) - DepthAI input type, which DepthAI reads to set up the pipeline automatically.

> `mean` and `scale` should describe the preprocessing expected by the model stored in the NN Archive. If your original
> preprocessing is `input = (input / 255.0 - mean) / std`, store `255 * mean` as `mean` and `255 * std` as `scale`. Do not swap
> mean and scale values when changing between `RGB` and `BGR`; only reorder the channels to match the encoding expected by the
> model.

### Outputs

This section configures the model's output stream(s). It's defined as a list of Output dicts. Each consists of the following
fields:

 * name (str) - Name of the output layer.
 * dtype (str) - Data type of the output data (e.g. float32).
 * shape (list of ints) - Shape of the output tensor.
 * layout (str) - Lettercode interpretation of the output tensor dimensions (e.g. NC or NCHW).

### Heads

This section configures the post-processing steps applied to the model's output(s). It's defined as a list of Head dicts. Each
consists of the following fields:

 * parser (str) - Name of the depthai-nodes parser responsible for post-processing the model's output(s).
 * outputs (list of str) - List the output names that should be fed into the parser. If None, all outputs are fed.
 * metadata (dict) - Head-specific metadata. Consult the
   [source code](https://github.com/luxonis/luxonis-ml/blob/main/luxonis_ml/nn_archive/config_building_blocks/base_models/head_metadata.py)
   for more information.

The Heads section is optional. If not defined, we assume raw output (no post-processing required). Use heads when the model output
needs semantic interpretation, for example class names for classification or parser-specific metadata for detection or
segmentation.

### Metadata

This section represents the model's metadata. It's defined as a Metadata dict. It consists of the following fields:

 * name (str) - Name of the model.
 * path (str) - Path to the model executable (e.g. 'model.onnx'). The path is relative to the archive root. In case of an OpenVINO
   IR model, provide here only the path to the .xml file and make sure that the .bin file is located in the same path.
 * precision (str) - Precision of the model's weights (e.g. float32 or float16). This is the precision of the stored model file
   itself, not the datatype of individual input or output tensors.

Additionally, feel free to add arbitrary fields to this section that help with understanding of the model.

## Generation

To generate an NN Archive, follow the steps below:

 * prepare the model executable(s) by either training the model from scratch or acquiring it from existing sources (beware of the
   model format).

 * prepare a config.json. Example configuration for a simple single input/output source ONNX model without any pre- (mean=0,
   scale=1, reverse_channels=False) or post-processing requirements:

```json
{
    "config_version": null,
    "model": {
        "metadata": {
            "name": "ModelName",
            "path": "model_name.onnx",
            "precision": "float32"
        },
        "inputs": [
            {
                "name": "input_layer_name",
                "dtype": "float32",
                "input_type": "image",
                "shape": [
                    1,
                    3,
                    256,
                    256
                ],
                "layout": "NCHW",
                "preprocessing": {
                    "mean": [
                        0.0,
                        0.0,
                        0.0
                    ],
                    "scale": [
                        1.0,
                        1.0,
                        1.0
                    ],
                    "reverse_channels": false,
                    "interleaved_to_planar": null
                }
            }
        ],
        "outputs": [
            {
                "name": "output",
                "dtype": "float32",
                "shape": [
                    1,
                    3
                ],
                "layout": "NC"
            }
        ],
        "heads": []
    }
}
```

We can add a post-processing step by expanding the heads section. Example head for a model with three classification categories:

```json
{
    ...
    "model": {
        ...
        "heads": [
            {
                "parser": "ClassificationParser",
                "metadata": {
                    "postprocessor_path": null,
                    "classes": [
                        "Class1",
                        "Class2",
                        "Class3"
                    ],
                    "n_classes": 3,
                    "is_softmax": true
                },
                "outputs": [
                    "output"
                ]
            }
        ]
    }
}
```
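
Before packaging, it can help to run a few consistency checks on the configuration dictionary. The sketch below is not the
validation performed by luxonis-ml; `check_config` is a hypothetical helper, and the rules it applies are assumptions derived
from the field descriptions in this document (layout length matches shape length, per-channel lists match the channel count,
head outputs exist, and `n_classes` matches the number of class names).

```python
def check_config(cfg):
    """Return a list of human-readable problems found in a config dict."""
    problems = []
    model = cfg["model"]

    # Every layout letter should correspond to exactly one shape dimension.
    for tensor in model["inputs"] + model["outputs"]:
        if len(tensor["layout"]) != len(tensor["shape"]):
            problems.append(f"{tensor['name']}: layout and shape differ in length")

    # mean and scale should provide one value per input channel.
    for inp in model["inputs"]:
        layout, shape = inp["layout"], inp["shape"]
        prep = inp.get("preprocessing") or {}
        if "C" in layout and len(layout) == len(shape):
            channels = shape[layout.index("C")]
            for key in ("mean", "scale"):
                values = prep.get(key)
                if values is not None and len(values) != channels:
                    problems.append(f"{inp['name']}: {key} needs {channels} values")

    # Heads may only reference outputs that exist; class counts should agree.
    output_names = {out["name"] for out in model["outputs"]}
    for head in model.get("heads", []):
        for name in head.get("outputs") or []:
            if name not in output_names:
                problems.append(f"{head['parser']}: unknown output {name!r}")
        meta = head.get("metadata") or {}
        classes, n_classes = meta.get("classes"), meta.get("n_classes")
        if classes is not None and n_classes is not None and len(classes) != n_classes:
            problems.append(f"{head['parser']}: n_classes does not match classes")

    return problems
```

An empty list means none of these particular checks failed; it does not guarantee that the archive is valid.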

 * install luxonis-ml and run the Archive Generator:

```python
from luxonis_ml.nn_archive.archive_generator import ArchiveGenerator
from luxonis_ml.nn_archive.config import CONFIG_VERSION
import json

cfg_path = ... # string path to configuration data JSON.
with open(cfg_path, "r") as file:
    cfg_dict = json.load(file)
cfg_dict["config_version"] = CONFIG_VERSION # set config version from luxonis-ml

generator = ArchiveGenerator(
    archive_name=...,  # string name of the generated archive.
    save_path=...,  # string path to where you want to save the archive file.
    cfg_dict=cfg_dict,
    executables_paths=...,  # list of string paths to relevant model executables.
)

generator.make_archive() # archive file is saved to the specified save_path
```
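
To confirm what ended up in the generated file, the archive can be opened with Python's standard `tarfile` module. This is a
minimal sketch: `read_archive` is a hypothetical helper, and it assumes the configuration is stored under the file name
`config.json`, which it looks up by base name in case the member names carry a leading directory.

```python
import json
import tarfile


def read_archive(archive_path):
    """Return (member names, parsed config) for a .tar.xz NN Archive."""
    with tarfile.open(archive_path, "r:xz") as tar:
        names = tar.getnames()
        # Assumption: the configuration file is named config.json.
        config_name = None
        for name in names:
            if name.split("/")[-1] == "config.json":
                config_name = name
                break
        if config_name is None:
            raise FileNotFoundError("config.json not found in archive")
        member = tar.extractfile(config_name)
        if member is None:
            raise FileNotFoundError("config.json is not a regular file")
        config = json.load(member)
    return names, config
```

Comparing the returned member names with `metadata.path` in the parsed configuration is a quick way to spot a missing or
misnamed executable.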

## Multi-Stage Models

Models sometimes consist of multiple stages. A common example is a two-stage model consisting of a main model (first stage) and a
postprocessor (second stage), each with its own executable file(s). We also support packaging such models into an NN Archive.
Start by defining the configuration file for the first-stage model, following the steps described above. The second-stage model
is defined simply by setting the postprocessor_path parameter in the Heads.Metadata section. It should point to the executable of
the second-stage model (the path must be relative to the archive root). The NN Archive can be constructed with the
ArchiveGenerator. Make sure that the paths to the model executables of both the first- and second-stage models are provided in
the executables_paths argument.

> Beware that the configuration file relates only to the first-stage model and no information can be defined for the second-stage
> model. If such information is needed, we recommend constructing two separate NN Archives.
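
As a sketch of the two-stage setup, the helper below collects every executable a configuration refers to, so the result can be
compared against the list passed as `executables_paths`. The function name `required_executables`, the model name, and the file
names are hypothetical; only `metadata.path` and `postprocessor_path` come from the sections above.

```python
def required_executables(cfg):
    """List the executables referenced by a config: main model plus postprocessors."""
    model = cfg["model"]
    paths = [model["metadata"]["path"]]
    for head in model.get("heads", []):
        postprocessor = (head.get("metadata") or {}).get("postprocessor_path")
        if postprocessor:
            paths.append(postprocessor)
    return paths


# Hypothetical two-stage configuration, reduced to the relevant fields.
cfg = {
    "model": {
        "metadata": {"name": "TwoStageModel", "path": "main.onnx", "precision": "float32"},
        "heads": [
            {
                "parser": "ClassificationParser",
                "metadata": {"postprocessor_path": "postprocessor.onnx"},
            }
        ],
    }
}

needed = required_executables(cfg)
```

Here `needed` contains both `main.onnx` and `postprocessor.onnx`, so both files should appear (by file name) in
`executables_paths` when the archive is built.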
