# Roboflow Universe

## Overview

[Roboflow](https://roboflow.com/) is a comprehensive platform designed to simplify the process of building, training, and
deploying computer vision models. One of its key offerings is [Roboflow Universe](https://universe.roboflow.com/), a
community-driven platform where users can share and explore annotated computer vision datasets across a wide range of use cases.

In this integration, we focus on using pre-annotated datasets from Roboflow Universe and leveraging them within your own custom
model development pipeline. We’ve built a seamless integration between Roboflow Universe and our
[LuxonisTrain](https://docs.luxonis.com/software-v3/ai-inference/model-source/training/luxonis-train.md) framework. This guide
walks through the main steps of the process. For a more hands-on example, check out [our Colab
tutorial](https://colab.research.google.com/github/luxonis/ai-tutorials/blob/main/training/train_roboflow_dataset.ipynb).

## Usage

### Dataset Selection

Start by visiting [Roboflow Universe](https://universe.roboflow.com/) and searching for a dataset relevant to your task. Entering
relevant keywords into the search bar returns numerous datasets for exploration.

When selecting a dataset, consider the following:

 * Annotation type must match your prediction target: For object detection models, the dataset must include bounding box
   annotations. For instance segmentation, annotations must also include per-object masks in addition to bounding boxes, and
   similarly for other task types.

 * Dataset quality matters: Ensure that annotations are accurate, the image content is diverse, and the dataset is sufficiently
   large. We generally recommend a minimum of 500–1000 images for achieving robust performance - but more is better.

 * Image domain should match your deployment environment: The training images should resemble what your model will see in
   production. For example, if your factory only uses blue PCBs but the dataset contains green ones, the model may not generalize well.
   You can try using color augmentation to help with this domain shift, or search for a dataset that better matches your
   deployment.
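
As a sketch of the color-augmentation idea mentioned above, LuxonisTrain configurations support augmentations in the training preprocessing section. The exact key names and the augmentation name below (`HueSaturationValue`, from Albumentations) are assumptions to verify against the LuxonisTrain configuration docs:

```yaml
# Illustrative sketch only - verify key names and the augmentation
# catalog against the LuxonisTrain configuration documentation.
trainer:
  preprocessing:
    augmentations:
      - name: HueSaturationValue   # randomly shifts hue/saturation to reduce color bias
        params:
          hue_shift_limit: 20
          sat_shift_limit: 30
          p: 0.5
```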

### Training

Once you've chosen a dataset, you can reference it in your LuxonisTrain [configuration
file](https://docs.luxonis.com/software-v3/ai-inference/model-source/training/luxonis-train/concepts.md). The key parameter to set
is `dataset_dir` under the `params` section of the `loader`.

We use a special URI format for Roboflow datasets:

```
roboflow://<TEAM_NAME>/<DATASET_NAME>/<DATASET_VERSION>/coco
```

You can extract the team and dataset names directly from the dataset URL. For example, for this dataset:

```
https://universe.roboflow.com/learn-uzoux/pcb-defect-0i1a7
```

The corresponding `dataset_dir`, here using dataset version 2, becomes:

```
roboflow://learn-uzoux/pcb-defect-0i1a7/2/coco
```
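
If you want to double-check that a URI you've assembled matches the expected `roboflow://<TEAM_NAME>/<DATASET_NAME>/<DATASET_VERSION>/<FORMAT>` shape before launching a training run, a small stand-alone check can help. This helper is purely illustrative and not part of LuxonisTrain:

```python
import re

def parse_roboflow_uri(uri: str) -> dict:
    """Hypothetical helper: split a roboflow:// dataset URI into its parts.

    Expected shape: roboflow://<team>/<dataset>/<version>/<format>
    """
    match = re.fullmatch(r"roboflow://([^/]+)/([^/]+)/(\d+)/([^/]+)", uri)
    if match is None:
        raise ValueError(f"Not a valid roboflow:// dataset URI: {uri}")
    team, dataset, version, fmt = match.groups()
    return {"team": team, "dataset": dataset, "version": int(version), "format": fmt}

parts = parse_roboflow_uri("roboflow://learn-uzoux/pcb-defect-0i1a7/2/coco")
```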

> If the dataset is private to your Roboflow team, be sure to export your Roboflow API key before starting the training process:
> `export ROBOFLOW_API_KEY="<ADD_YOUR_ROBOFLOW_API_KEY_HERE>"`
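
Putting it together, the relevant part of the configuration file might look like the sketch below. Only the `dataset_dir` parameter comes from this guide; consult the LuxonisTrain configuration docs linked above for the full schema around it:

```yaml
# Minimal sketch of the loader section - see the LuxonisTrain
# configuration documentation for the complete file structure.
loader:
  params:
    dataset_dir: roboflow://learn-uzoux/pcb-defect-0i1a7/2/coco
```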

With the dataset correctly specified, you're ready to start training. You can follow the remaining steps - training, validation,
and deployment - via our [Colab
tutorial](https://colab.research.google.com/github/luxonis/ai-tutorials/blob/main/training/train_roboflow_dataset.ipynb).

## Next steps

If your model's performance isn't satisfactory after initial training, consider tuning the configuration - e.g., training for more
epochs, using a larger architecture, or adjusting hyperparameters. Often, however, the best improvement comes from adding more
high-quality data.

Roboflow makes it easy to fork existing datasets, merge new images, and re-annotate if needed. If the dataset is private, remember
to re-export your API key before training (as mentioned above).
