LuxonisParser
Overview
LuxonisParser offers a simple API for creating datasets from several common dataset formats. These include popular Roboflow-exported layouts, Ultralytics-style datasets, native Luxonis LDF datasets, and a few specialized formats such as SOLO.
Note: When parsing ZIP files, do not include a top-level dataset_dir folder in the archive. The train, validation, and test directories (according to the selected format) should be placed directly at the root of the ZIP archive.
Supported formats and their expected directory layouts:
- COCO
- FiftyOne format
Plain Text
dataset_dir/
├── train/
│   ├── data/
│   │   ├── img1.jpg
│   │   ├── img2.jpg
│   │   └── ...
│   └── labels.json
├── validation/
│   ├── data/
│   └── labels.json
└── test/
    ├── data/
    └── labels.json
- Roboflow format
Plain Text
dataset_dir/
├── train/
│   ├── img1.jpg
│   ├── img2.jpg
│   ├── ...
│   └── _annotations.coco.json
├── valid/
└── test/
- YOLOv8-v12 and Ultralytics
- Roboflow format (supports YOLOv8-v12)
Plain Text
dataset_dir/
├── train/
│   ├── images/
│   │   ├── img1.jpg
│   │   ├── img2.jpg
│   │   └── ...
│   └── labels/
│       ├── img1.txt
│       ├── img2.txt
│       └── ...
├── valid/
├── test/
└── *.yaml
- Ultralytics format
Plain Text
dataset_dir/
├── images/
│   ├── train/
│   │   ├── img1.jpg
│   │   ├── img2.jpg
│   │   └── ...
│   ├── val/
│   └── test/
├── labels/
│   ├── train/
│   │   ├── img1.txt
│   │   ├── img2.txt
│   │   └── ...
│   ├── val/
│   └── test/
└── *.yaml
- Pascal VOC XML
Plain Text
dataset_dir/
├── train/
│   ├── img1.jpg
│   ├── img1.xml
│   └── ...
├── valid/
└── test/
- YOLO Darknet TXT
Plain Text
dataset_dir/
├── train/
│   ├── img1.jpg
│   ├── img1.txt
│   ├── ...
│   └── _darknet.labels
├── valid/
└── test/
- YOLOv4 PyTorch TXT
Plain Text
dataset_dir/
├── train/
│   ├── img1.jpg
│   ├── img2.jpg
│   ├── ...
│   ├── _annotations.txt
│   └── _classes.txt
├── valid/
└── test/
- MT YOLOv6
Plain Text
dataset_dir/
├── images/
│   ├── train/
│   │   ├── img1.jpg
│   │   ├── img2.jpg
│   │   └── ...
│   ├── valid/
│   └── test/
├── labels/
│   ├── train/
│   │   ├── img1.txt
│   │   ├── img2.txt
│   │   └── ...
│   ├── valid/
│   └── test/
└── data.yaml
- CreateML JSON
Plain Text
dataset_dir/
├── train/
│   ├── img1.jpg
│   ├── img2.jpg
│   ├── ...
│   └── _annotations.createml.json
├── valid/
└── test/
- TensorFlow Object Detection CSV
Plain Text
dataset_dir/
├── train/
│   ├── img1.jpg
│   ├── img2.jpg
│   ├── ...
│   └── _annotations.csv
├── valid/
└── test/
- SOLO
Plain Text
dataset_dir/
├── train/
│   ├── metadata.json
│   ├── sensor_definitions.json
│   ├── annotation_definitions.json
│   ├── metric_definitions.json
│   └── sequence.<SequenceNUM>/
│       ├── step<StepNUM>.camera.jpg
│       ├── step<StepNUM>.frame_data.json
│       └── (OPTIONAL: step<StepNUM>.camera.semantic segmentation.jpg)
├── valid/
└── test/
- Classification Directory
- Split structure with train/valid/test subdirectories:
Plain Text
dataset_dir/
├── train/
│   ├── class1/
│   │   ├── img1.jpg
│   │   ├── img2.jpg
│   │   └── ...
│   ├── class2/
│   └── ...
├── valid/
└── test/
- Flat structure (class subdirectories directly in root, random splits applied at parse time):
Plain Text
dataset_dir/
├── class1/
│   ├── img1.jpg
│   └── ...
├── class2/
│   └── ...
└── info.json (optional metadata file)
- FiftyOne Classification
Images are stored in a data/ folder and labels in labels.json. Two structures are supported:
- Split structure with train/validation/test subdirectories:
Plain Text
dataset_dir/
├── train/
│   ├── data/
│   │   ├── img1.jpg
│   │   └── ...
│   └── labels.json
├── validation/
│   ├── data/
│   └── labels.json
└── test/
    ├── data/
    └── labels.json
- Flat structure (random splits applied at parse time):
Plain Text
dataset_dir/
├── data/
│   ├── img1.jpg
│   └── ...
└── labels.json
labels.json format:
JSON
{
  "classes": ["class1", "class2", ...],
  "labels": {
    "image_stem": class_index,
    ...
  }
}
- Native LDF
A dataset in the native Luxonis format, described by annotations.json files.
Plain Text
dataset_dir/
├── train/
│   └── annotations.json
├── valid/
└── test/
- Segmentation Mask Directory
Plain Text
dataset_dir/
├── train/
│   ├── img1.jpg
│   ├── img1_mask.png
│   ├── ...
│   └── _classes.csv
├── valid/
└── test/
The mapping from mask pixel values to classes is defined in the _classes.csv file.
Csv
Pixel Value, Class
0, background
1, class1
2, class2
3, class3
Dataset Parsing
To parse a dataset, create a LuxonisParser object with the path to the dataset directory. Optionally, you can specify the dataset name, the task name, and the dataset type (that is, its format). By default, the name is taken from the name of the provided dataset directory, and the type is inferred from the directory structure. The dataset directory can be either a local path or a remote dataset identifier: the parser currently accepts local directories, .zip archives, gcs://..., s3://..., and roboflow://workspace/project/version/format identifiers.
Python
from luxonis_ml.data.parsers import LuxonisParser
from luxonis_ml.enums import DatasetType

# Remote Roboflow identifier; a local directory or a .zip path is also accepted.
dataset_dir = "roboflow://workspace/project/version/coco"

parser = LuxonisParser(
    dataset_dir=dataset_dir,
    dataset_name="my_dataset",  # optional; defaults to the directory name
    dataset_type=DatasetType.COCO,  # optional; inferred from the layout if omitted
    task_name="detection",  # optional
)
After initializing the LuxonisParser object, run the parsing by calling its .parse() method:
Python
dataset = parser.parse()
This returns a LuxonisDataset instance containing the data from the provided dataset, keeping the original splits whenever the source format defines them. If the dataset already exists in Luxonis format, parsing is skipped and the existing dataset is returned.
CLI Reference
The parser can also be invoked from the command line with the luxonis_ml data parse command.
Command Line
luxonis_ml data parse path/to/dataset --name my_dataset --type coco
For more information, run luxonis_ml data parse --help.
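The ZIP requirement from the overview (split directories at the archive root, with no top-level dataset_dir folder) is easy to get wrong when zipping a folder by hand. The following is a minimal sketch of one way to build such an archive with Python's standard zipfile module. The helper name zip_dataset is hypothetical and is not part of luxonis_ml.

```python
from __future__ import annotations

import zipfile
from pathlib import Path


def zip_dataset(dataset_dir: str | Path, zip_path: str | Path) -> Path:
    """Pack the contents of dataset_dir so split folders sit at the archive root.

    Hypothetical helper for illustration; not part of luxonis_ml.
    """
    dataset_dir = Path(dataset_dir)
    zip_path = Path(zip_path)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(dataset_dir.rglob("*")):
            if not path.is_file():
                continue
            # Skip the archive itself if it is being written inside dataset_dir.
            if path.resolve() == zip_path.resolve():
                continue
            # Store paths relative to dataset_dir, e.g. "train/img1.jpg"
            # rather than "dataset_dir/train/img1.jpg".
            zf.write(path, arcname=path.relative_to(dataset_dir).as_posix())
    return zip_path
```

The resulting archive path can then be given to the parser in place of a dataset directory, for example as the path argument of luxonis_ml data parse.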