# LuxonisLoader

## Overview

LuxonisLoader offers a simple and efficient way to load and iterate over data stored in the Luxonis Data Format (LDF), with
support for on-the-fly data augmentation. It is also natively integrated with LuxonisTrain, which enables a seamless training
workflow.

## Dataset Loading

To load a dataset with LuxonisLoader, we first need an instance of LuxonisDataset. The loader is initialized with the dataset and
the dataset view (i.e. the split) we intend to load.

```python
from luxonis_ml.data.datasets import LuxonisDataset
from luxonis_ml.data.loaders import LuxonisLoader

dataset_name: str = ...  # name of an existing LDF dataset, e.g. "parking_lot"
dataset = LuxonisDataset(dataset_name)
loader = LuxonisLoader(dataset, view="train")
```

> The `view` can be either a single split or a list of splits.

The data can be iterated over using a simple for loop:

```python
for images, labels in loader:
    ...
```

For single-source datasets, `images` is typically a single image array. For multi-source datasets, it can also be a dictionary
keyed by source or component name. The `labels` output is grouped by task name.
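Because the shape of `images` depends on the dataset, downstream code may want to normalize it before use. The following is a minimal sketch and not part of the LuxonisLoader API: the helper name `as_source_dict` and the fallback key `"image"` are hypothetical, and plain lists stand in for real image arrays.

```python
from typing import Any, Dict


def as_source_dict(images: Any, default_key: str = "image") -> Dict[str, Any]:
    """Return `images` as a dict keyed by source name.

    A dict (multi-source case) is returned as a shallow copy. Anything else
    (single-source case) is wrapped under `default_key`, a hypothetical name.
    """
    if isinstance(images, dict):
        return dict(images)
    return {default_key: images}


# Stand-in values; a real loader would yield image arrays here.
single = as_source_dict([[0, 0], [0, 0]])
multi = as_source_dict({"left": [[1]], "right": [[2]]})
print(sorted(single), sorted(multi))
```

With a helper like this, the body of the training loop can treat single-source and multi-source datasets the same way.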

## Augmentation

Augmentations are transformations applied to the data to increase the diversity of the dataset, which typically improves model
training. We can define them by passing a list of Python dictionaries to the `augmentation_config` parameter of the
`LuxonisLoader` constructor, each representing an individual augmentation:

```python
{
    "name": str,  # name of the augmentation
    "params": dict  # parameters of the augmentation
}
```

By default, we support most augmentations from the albumentations library. You can find the full list of augmentations and their
parameters in the [Albumentations documentation](https://albumentations.ai/docs/api_reference/augmentations/). On top of that, we
provide a handful of custom batch augmentations:

 * Mosaic4 - Mosaic augmentation with 4 images. Combines crops of 4 images into a single image in a mosaic pattern.
 * MixUp - MixUp augmentation. Overlays two images with a random weight.

Each augmentation entry can also set `use_for_resizing: true` when that transform should handle resizing explicitly.
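Before handing a configuration to the loader, it can help to check that every entry has the shape described above. The sketch below is illustrative only: `check_augmentations` and `ALLOWED_KEYS` are hypothetical names, the key set is limited to the keys mentioned in this section, and treating a missing `params` as an empty dictionary is an assumption. The library performs its own validation, which may differ.

```python
from typing import Any, Dict, List

# Keys mentioned in this section; the real library may accept more.
ALLOWED_KEYS = {"name", "params", "use_for_resizing"}


def check_augmentations(config: List[Dict[str, Any]]) -> List[str]:
    """Return a list of readable problems; an empty list means the shape looks fine."""
    problems = []
    for i, entry in enumerate(config):
        if not isinstance(entry.get("name"), str):
            problems.append(f"entry {i}: 'name' must be a string")
        # Assumption: a missing 'params' is treated as an empty dict.
        if not isinstance(entry.get("params", {}), dict):
            problems.append(f"entry {i}: 'params' must be a dict")
        unknown = set(entry) - ALLOWED_KEYS
        if unknown:
            problems.append(f"entry {i}: unexpected keys {sorted(unknown)}")
    return problems


good = [
    {"name": "HueSaturationValue", "params": {"p": 0.5}},
    {"name": "Rotate", "params": {"p": 0.6, "limit": 30}},
]
bad = [{"params": [1, 2], "probability": 0.5}]

print(check_augmentations(good))
print(check_augmentations(bad))
```

A check like this only catches structural mistakes such as a misspelled key. Whether a given augmentation name or parameter is valid still depends on the installed augmentation engine.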

### Example

The following example demonstrates a simple augmentation pipeline:

```python
[
  {
    'name': 'HueSaturationValue',
    'params': {
      'p': 0.5,
      'hue_shift_limit': 3,
      'sat_shift_limit': 70,
      'val_shift_limit': 40,
    }
  },
  {
    'name': 'Rotate',
    'params': {
      'p': 0.6,
      'limit': 30,
      'border_mode': 0,
      'value': [0, 0, 0]
    }
  },
  {
    'name': 'Perspective',
    'params': {
      'p': 0.5,
      'scale': [0.04, 0.08],
      'keep_size': True,
      'pad_mode': 0,
      'pad_val': 0,
      'mask_pad_val': 0,
      'fit_output': False,
      'interpolation': 1,
      'always_apply': False,
    }
  },
  {
    'name': 'Affine',
    'params': {
      'p': 0.4,
      'scale': None,
      'translate_percent': None,
      'translate_px': None,
      'rotate': None,
      'shear': 10,
      'interpolation': 1,
      'mask_interpolation': 0,
      'cval': 0,
      'cval_mask': 0,
      'mode': 0,
      'fit_output': False,
      'keep_ratio': False,
      'rotate_method': 'largest_box',
      'always_apply': False,
    }
  },
]
```

Let's assume the augmentation list above is stored as a YAML file named `augmentations.yaml`. We can then use it to create a loader:

```python
from luxonis_ml.data.datasets import LuxonisDataset
from luxonis_ml.data.loaders import LuxonisLoader

dataset_name: str = ...
dataset = LuxonisDataset(dataset_name)
loader = LuxonisLoader(
    dataset,
    view="train",
    augmentation_config="augmentations.yaml",
    augmentation_engine="albumentations",  # default
    height=256,
    width=320,
    keep_aspect_ratio=True,  # default
    color_space="RGB",  # default, can also be BGR
)
for img, labels in loader:
    ...
```

> The augmentations are **not** necessarily applied in the same order as defined in the list. Instead, an optimal order is
> determined based on the type of the augmentations to minimize computational cost.

## Additional loader options

The loader also supports several options that are useful in larger training pipelines:

 * `exclude_empty_annotations=True` to drop empty label entries from the final label dictionary
 * `filter_task_names=[...]` to load only selected task groups from a multi-task dataset
 * `keep_categorical_as_strings=True` to keep categorical metadata values as strings instead of encoded integers
 * `color_space={...}` to set the color space per source when working with multi-source datasets
 * `update_mode="all"` or `update_mode="missing"` to control local synchronization behavior for remote datasets
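As a rough illustration, the options listed above could be collected in one dictionary and forwarded to the constructor. The option names come from the list. Every value is a placeholder: the task name `"vehicles"` and the source names `"rgb_cam"` and `"ir_cam"` are hypothetical. The constructor call is left commented out so that the snippet runs without `luxonis_ml` installed.

```python
from typing import Any, Dict

# Option names come from the list above; values are illustrative placeholders.
loader_options: Dict[str, Any] = {
    "exclude_empty_annotations": True,
    "filter_task_names": ["vehicles"],  # hypothetical task name
    "keep_categorical_as_strings": True,
    "color_space": {"rgb_cam": "RGB", "ir_cam": "BGR"},  # hypothetical source names
    "update_mode": "missing",
}

# With luxonis_ml installed and a dataset available, the options could be
# forwarded to the constructor like this:
# loader = LuxonisLoader(dataset, view="train", **loader_options)

for name, value in loader_options.items():
    print(f"{name} = {value!r}")
```

Keeping the options in a dictionary makes it easy to reuse the same settings for several views, for example one loader for `"train"` and another for a validation split.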

For the full constructor signature and return types, see the [API
reference](https://docs.luxonis.com/software-v3/ai-inference/model-source/training/luxonis-ml/api-reference.md).
