LuxonisDataset
Overview
The LuxonisDataset class offers a simple API for creating and managing data in the Luxonis Data Format (LDF). It acts as an abstraction layer and provides methods for dataset:
- initialization,
- ingestion,
- splitting,
- merging, and
- deletion.
Dataset initialization
To initialize a dataset, create a LuxonisDataset object:
Python
from luxonis_ml.data import LuxonisDataset

dataset_name: str = ...  # e.g. "parking_lot"
dataset = LuxonisDataset(dataset_name)
Datasets can be stored locally or using one of the supported cloud storage providers (e.g. GCS or S3). By default, the initialized dataset is stored locally.
If a dataset with the provided dataset_name already exists, it is loaded automatically instead of a new one being initialized. Therefore, use a unique name for each new dataset, or pass delete_local=True to the LuxonisDataset constructor to overwrite an existing one.
Adding Data
Each data instance added to the dataset is a dictionary in the following format:
Python
{
    "file": str,  # path to the image file
    "annotation": Optional[dict]  # single image annotation
}
The content of the annotation field depends on the task type. Below we provide an example generator function for the parking lot dataset, yielding the data instances for bounding box annotations.
Python
import json
from pathlib import Path

# path to the dataset, replace it with the actual path on your system
dataset_root = Path("data/parking_lot")

def generator():
    for annotation_dir in dataset_root.iterdir():
        with open(annotation_dir / "annotations.json") as f:
            data = json.load(f)

        # get the width and height of the image
        W = data["dimensions"]["width"]
        H = data["dimensions"]["height"]

        image_path = annotation_dir / data["filename"]

        for instance_id, bbox in data["BoundingBoxAnnotation"].items():

            # get unnormalized bounding box coordinates
            x, y = bbox["origin"]
            w, h = bbox["dimension"]

            # get the class name of the bounding box
            class_ = bbox["labelName"]
            yield {
                "file": image_path,
                "annotation": {
                    "class": class_,
                    # normalized bounding box
                    "boundingbox": {
                        "x": x / W,
                        "y": y / H,
                        "w": w / W,
                        "h": h / H,
                    },
                },
            }
The generator is then passed to the add method of the dataset.
Python
dataset.add(generator())
The add method accepts any iterable, not only generators.
Defining Splits
The dataset can be split into train, val, and test sets. The splits are defined by calling the make_splits method on the LuxonisDataset object and passing the desired split ratios as its argument (by default, the data are split in an 80:10:10 ratio between the train, val, and test sets).
Python
dataset.make_splits({
    "train": 0.7,
    "val": 0.2,
    "test": 0.1,
})
Alternatively, the files belonging to each split can be listed explicitly:
Python
dataset.make_splits({
    "train": ["file1.jpg", "file2.jpg", ...],
    "val": ["file3.jpg", "file4.jpg", ...],
    "test": ["file5.jpg", "file6.jpg", ...],
})
Once the splits are defined, calling the make_splits method again will raise an error. If you wish to redefine them, pass redefine_splits=True to the method call.
Dataset Cloning
A dataset can be cloned by calling the clone method on the LuxonisDataset object and passing the desired name of the new dataset.
Python
dataset_clone = dataset.clone(new_dataset_name="dataset_clone")
Dataset Merging
Two datasets can be merged by calling the merge_with method on the first LuxonisDataset object and passing the second one as an argument. You can choose between two different merging modes:
- inplace: the first dataset is modified to include data from the second dataset
- out-of-place: a new dataset is created from the combination of two existing datasets
Python
# inplace merging
dataset1.merge_with(dataset2, inplace=True)
# OR out-of-place merging
dataset_merge = dataset1.merge_with(dataset2, inplace=False, new_dataset_name="dataset_merge")
CLI Reference
The luxonis_ml CLI provides a set of useful commands for managing datasets. These commands are accessible via the luxonis_ml data command.
The available commands are:
- luxonis_ml data ls - lists all datasets
- luxonis_ml data info <dataset_name> - prints information about the dataset
- luxonis_ml data inspect <dataset_name> - renders the data in the dataset on screen using cv2
- luxonis_ml data delete <dataset_name> - deletes the dataset
For more information, run luxonis_ml data --help or pass the --help flag to any of the above commands.
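If the CLI commands need to be called from a script, one option is a thin wrapper around the subprocess module. The following is only a sketch: build_data_command and run_data_command are hypothetical helpers, not part of luxonis_ml. It relies solely on the four subcommands listed above and assumes the luxonis_ml executable is available on the PATH.

```python
import shutil
import subprocess
from typing import List, Optional

# Subcommands listed in this section; no other commands or flags are assumed.
NEEDS_DATASET_NAME = ("info", "inspect", "delete")
KNOWN_SUBCOMMANDS = ("ls",) + NEEDS_DATASET_NAME


def build_data_command(subcommand: str, dataset_name: Optional[str] = None) -> List[str]:
    """Build the argument list for a `luxonis_ml data` subcommand (hypothetical helper)."""
    if subcommand not in KNOWN_SUBCOMMANDS:
        raise ValueError(f"unknown subcommand: {subcommand!r}")
    if subcommand in NEEDS_DATASET_NAME and dataset_name is None:
        raise ValueError(f"{subcommand!r} requires a dataset name")
    command = ["luxonis_ml", "data", subcommand]
    if dataset_name is not None:
        command.append(dataset_name)
    return command


def run_data_command(subcommand: str, dataset_name: Optional[str] = None) -> int:
    """Run the command and return its exit code (hypothetical helper)."""
    command = build_data_command(subcommand, dataset_name)
    # Fail early with a clear message if the CLI is not installed.
    if shutil.which(command[0]) is None:
        raise FileNotFoundError("luxonis_ml executable not found on PATH")
    return subprocess.run(command, check=False).returncode
```

With the package installed, run_data_command("info", "parking_lot") would execute luxonis_ml data info parking_lot. Passing the arguments as a list rather than a single string avoids shell quoting problems with dataset names.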