LuxonisParser

Overview

The LuxonisParser offers a simple API for creating datasets from several common dataset formats (all of which are also supported by Roboflow):

- COCO

We support the COCO JSON format in two variants:
  • FiftyOne format:
Plain Text
dataset_dir/
    ├── train/
    │   ├── data/
    │   │   ├── img1.jpg
    │   │   ├── img2.jpg
    │   │   └── ...
    │   └── labels.json
    ├── validation/
    │   ├── data/
    │   └── labels.json
    └── test/
        ├── data/
        └── labels.json
  • Roboflow format:
Plain Text
dataset_dir/
    ├── train/
    │   ├── img1.jpg
    │   ├── img2.jpg
    │   ├── ...
    │   └── _annotations.coco.json
    ├── valid/
    └── test/

- YOLOv8

We support the YOLOv8 format in two variants:
  • Roboflow format (supports YOLOv8-v12):
Plain Text
dataset_dir/
    ├── train/
    │   ├── images/
    │   │   ├── img1.jpg
    │   │   ├── img2.jpg
    │   │   └── ...
    │   ├── labels/
    │   │   ├── img1.txt
    │   │   ├── img2.txt
    │   │   └── ...
    ├── valid/
    ├── test/
    └── *.yaml
  • Ultralytics format:
Plain Text
dataset_dir/
    ├── images/
    │   ├── train/
    │   │   ├── img1.jpg
    │   │   ├── img2.jpg
    │   │   └── ...
    │   ├── val/
    │   └── test/
    ├── labels/
    │   ├── train/
    │   │   ├── img1.txt
    │   │   ├── img2.txt
    │   │   └── ...
    │   ├── val/
    │   └── test/
    └── *.yaml
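In both YOLO variants, each row of a labels/*.txt file describes one object as class x_center y_center width height, with all coordinates normalized to the image size. As a rough illustration (the decode_yolo_row helper below is hypothetical, not part of luxonis-ml), such a row can be converted to pixel coordinates like this:

```python
def decode_yolo_row(row: str, img_w: int, img_h: int):
    """Convert a 'class x_center y_center width height' row (normalized
    coordinates) into a class index and a pixel-space (x1, y1, x2, y2) box."""
    cls, xc, yc, w, h = row.split()
    xc, yc, w, h = (float(v) for v in (xc, yc, w, h))
    x1 = (xc - w / 2) * img_w  # left edge in pixels
    y1 = (yc - h / 2) * img_h  # top edge in pixels
    x2 = (xc + w / 2) * img_w  # right edge in pixels
    y2 = (yc + h / 2) * img_h  # bottom edge in pixels
    return int(cls), (x1, y1, x2, y2)

# The values below are exact binary fractions, so the result is exact:
print(decode_yolo_row("1 0.25 0.5 0.5 0.25", 640, 480))
# → (1, (0.0, 180.0, 320.0, 300.0))
```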

- VOC

Pascal VOC format with a per-image XML annotation file:
Plain Text
dataset_dir/
    ├── train/
    │   ├── img1.jpg
    │   ├── img1.xml
    │   └── ...
    ├── valid/
    └── test/

- Darknet

Darknet format with a per-image TXT annotation file and a _darknet.labels file listing the class names:
Plain Text
dataset_dir/
    ├── train/
    │   ├── img1.jpg
    │   ├── img1.txt
    │   ├── ...
    │   └── _darknet.labels
    ├── valid/
    └── test/

- YOLOv4

YOLOv4 format with a single _annotations.txt file per split and a _classes.txt file listing the class names:
Plain Text
dataset_dir/
    ├── train/
    │   ├── img1.jpg
    │   ├── img2.jpg
    │   ├── ...
    │   ├── _annotations.txt
    │   └── _classes.txt
    ├── valid/
    └── test/

- YOLOv6

YOLOv6 format with separate images/ and labels/ directories and a data.yaml file:
Plain Text
dataset_dir/
    ├── images/
    │   ├── train/
    │   │   ├── img1.jpg
    │   │   ├── img2.jpg
    │   │   └── ...
    │   ├── valid/
    │   └── test/
    ├── labels/
    │   ├── train/
    │   │   ├── img1.txt
    │   │   ├── img2.txt
    │   │   └── ...
    │   ├── valid/
    │   └── test/
    └── data.yaml

- CreateML

CreateML JSON format with an _annotations.createml.json file per split:
Plain Text
dataset_dir/
    ├── train/
    │   ├── img1.jpg
    │   ├── img2.jpg
    │   ├── ...
    │   └── _annotations.createml.json
    ├── valid/
    └── test/

- Tensorflow CSV

TensorFlow object detection CSV format with an _annotations.csv file per split:
Plain Text
dataset_dir/
    ├── train/
    │   ├── img1.jpg
    │   ├── img2.jpg
    │   ├── ...
    │   └── _annotations.csv
    ├── valid/
    └── test/

- SOLO

Unity Perception SOLO format:
Plain Text
dataset_dir/
    ├── train/
    │   ├── metadata.json
    │   ├── sensor_definitions.json
    │   ├── annotation_definitions.json
    │   ├── metric_definitions.json
    │   └── sequence.<SequenceNUM>/
    │       ├── step<StepNUM>.camera.jpg
    │       ├── step<StepNUM>.frame_data.json
    │       └── (OPTIONAL: step<StepNUM>.camera.semantic segmentation.jpg)
    ├── valid/
    └── test/

- Classification Directory

A directory with subdirectories for each class. Two structures are supported:
  • Split structure with train/valid/test subdirectories:
Plain Text
dataset_dir/
    ├── train/
    │   ├── class1/
    │   │   ├── img1.jpg
    │   │   ├── img2.jpg
    │   │   └── ...
    │   ├── class2/
    │   └── ...
    ├── valid/
    └── test/
  • Flat structure (class subdirectories directly in root, random splits applied at parse time):
Plain Text
dataset_dir/
    ├── class1/
    │   ├── img1.jpg
    │   └── ...
    ├── class2/
    │   └── ...
    └── info.json  (optional metadata file)
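In the flat structure, each image's class is implied entirely by the name of its parent directory. A minimal sketch of that mapping (the infer_classes helper is illustrative only, not the parser's actual implementation):

```python
from pathlib import Path
import tempfile

def infer_classes(dataset_dir: Path) -> dict:
    """Map every image in the class subdirectories to the class
    implied by its parent directory name."""
    return {img.name: img.parent.name
            for img in sorted(dataset_dir.glob("*/*.jpg"))}

# Build a tiny flat classification directory for demonstration
root = Path(tempfile.mkdtemp())
for cls, name in [("class1", "img1.jpg"), ("class2", "img2.jpg")]:
    (root / cls).mkdir(exist_ok=True)
    (root / cls / name).touch()

labels = infer_classes(root)
print(labels)  # → {'img1.jpg': 'class1', 'img2.jpg': 'class2'}
```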

- FiftyOne Image Classification

FiftyOneImageClassificationDataset format with images in a data/ folder and labels in labels.json. Two structures are supported:
  • Split structure with train/validation/test subdirectories:
Plain Text
dataset_dir/
    ├── train/
    │   ├── data/
    │   │   ├── img1.jpg
    │   │   └── ...
    │   └── labels.json
    ├── validation/
    │   ├── data/
    │   └── labels.json
    └── test/
        ├── data/
        └── labels.json
  • Flat structure (random splits applied at parse time):
Plain Text
dataset_dir/
    ├── data/
    │   ├── img1.jpg
    │   └── ...
    └── labels.json
The labels.json format:
JSON
{
    "classes": ["class1", "class2", ...],
    "labels": {
        "image_stem": class_index,
        ...
    }
}
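Each value in "labels" indexes into the "classes" list. As a sketch of how a consumer might resolve those indices to class names (resolve_labels is a hypothetical helper, not luxonis-ml API):

```python
import json

def resolve_labels(labels_json: str) -> dict:
    """Map each image stem to its class name per the labels.json schema."""
    data = json.loads(labels_json)
    classes = data["classes"]
    return {stem: classes[idx] for stem, idx in data["labels"].items()}

example = json.dumps({
    "classes": ["class1", "class2"],
    "labels": {"img1": 0, "img2": 1},
})
print(resolve_labels(example))  # → {'img1': 'class1', 'img2': 'class2'}
```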

- Segmentation Mask Directory

A directory with images and corresponding masks.
Plain Text
dataset_dir/
    ├── train/
    │   ├── img1.jpg
    │   ├── img1_mask.png
    │   ├── ...
    │   └── _classes.csv
    ├── valid/
    └── test/
The masks are stored as grayscale PNG images where each pixel value corresponds to a class. The mapping from pixel values to class is defined in the _classes.csv file.
Csv
Pixel Value, Class
0, background
1, class1
2, class2
3, class3
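To make the mapping concrete, here is a small sketch (parse_classes is an illustrative helper, not part of luxonis-ml) that reads a _classes.csv and uses it to translate mask pixel values into class names:

```python
import csv
import io

def parse_classes(csv_text: str) -> dict:
    """Parse the 'Pixel Value, Class' rows of a _classes.csv file."""
    reader = csv.reader(io.StringIO(csv_text))
    next(reader)  # skip the header row
    return {int(row[0]): row[1].strip() for row in reader if row}

csv_text = "Pixel Value, Class\n0, background\n1, class1\n2, class2\n"
mapping = parse_classes(csv_text)

# A tiny 2x3 grayscale mask; every pixel value looks up its class name
mask = [[0, 0, 1], [2, 1, 0]]
named = [[mapping[p] for p in row] for row in mask]
print(named[0])  # → ['background', 'background', 'class1']
```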

Dataset Parsing

Parsing starts by initializing the LuxonisParser object with the path to the dataset directory. Optionally, you can specify the name and the type (i.e. the format) of the dataset; by default, the name is set to the name of the provided dataset directory, and the type is inferred from the directory structure. The dataset directory can be either a path to a local directory or a URL to a directory stored with one of the supported cloud storage providers (in that case, the dataset is downloaded automatically). The dataset directory can also be provided as a .zip file.
Python
from luxonis_ml.data import LuxonisParser
from luxonis_ml.enums import DatasetType

dataset_dir = "local/path/to/dataset or URL"

parser = LuxonisParser(
    dataset_dir=dataset_dir,
    dataset_name="my_dataset",  # optional; defaults to the dataset directory name
    dataset_type=DatasetType.COCO,  # optional; if None, the format is auto-detected
)
After initializing the LuxonisParser object, run the parsing by calling its .parse() method:
Python
dataset = parser.parse()
This creates a LuxonisDataset instance containing the data from the provided dataset, keeping the original splits.

CLI Reference

The parsing functionality can also be invoked from the command line with the luxonis_ml data parse command.
Command Line
luxonis_ml data parse path/to/dataset --name my_dataset --type coco
For more detailed information, run luxonis_ml data parse --help.