DataDreamer
Quickstart
DataDreamer enables you to create annotated datasets from scratch using Generative AI and foundational Computer Vision models. This allows you to train your own models for edge AI applications, such as object detection, without the need for real-world data.
To generate your dataset with custom classes, you need to execute only two commands:
Command Line
pip install datadreamer
datadreamer --class_names person moon robot
- Google Colab notebook with instructions on how to generate a dataset, train a model, and export it for RVC2/RVC3: DataDreamer Quickstart
- Helmet detection example: Helmet detection
- More info in the DataDreamer GitHub repository
Overview
DataDreamer is an advanced toolkit engineered to facilitate the development of edge AI models, irrespective of initial data availability. Distinctive features of DataDreamer include:
- Synthetic Data Generation: Eliminate the dependency on extensive datasets for AI training. DataDreamer empowers users to generate synthetic datasets from the ground up, utilizing advanced AI algorithms capable of producing high-quality, diverse images.
- Knowledge Extraction from Foundational Models: DataDreamer leverages the latent knowledge embedded within sophisticated, pre-trained AI models. This capability allows for the transfer of expansive understanding from these "Foundation models" to smaller, custom-built models, enhancing their capabilities significantly.
- Efficient and Potent Models: The primary objective of DataDreamer is to enable the creation of compact models that are both size-efficient for integration into any device and robust in performance for specialized tasks.
Features
- Prompt Generation: Automate the creation of image prompts using powerful language models.
  Provided class names: ["horse", "robot"]
  Generated prompt: "A photo of a horse and a robot coexisting peacefully in the midst of a serene pasture."
- Image Generation: Generate synthetic datasets with state-of-the-art generative models.
- Dataset Annotation: Leverage foundation models to label datasets automatically.
- Edge Model Training: Train efficient small-scale neural networks for edge deployment. (not part of this library)
Installation
To install with pip:
Command Line
pip install datadreamer
Available models
Model Category | Model Names | Description/Notes |
---|---|---|
Prompt Generation | Mistral-7B-Instruct-v0.1 | Semantically rich prompts |
Prompt Generation | TinyLlama-1.1B-Chat-v1.0 | Tiny LM |
Prompt Generation | Simple random generator | Joins randomly chosen object names |
Image Generation | SDXL-1.0 | Slow and accurate (1024x1024 images) |
Image Generation | SDXL-Turbo | Fast and less accurate (512x512 images) |
Image Generation | SDXL-Lightning | Fast and accurate (1024x1024 images) |
Image Annotation | OWLv2 | Open-Vocabulary object detector |
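The table describes the simple random generator as joining randomly chosen object names. The Python sketch below illustrates that idea only; the function name `simple_prompt`, the "A photo of" prefix, and the clamping of the range are assumptions made for illustration, not DataDreamer's actual implementation. The `(low, high)` pair mirrors the --num_objects_range option.

```python
import random


def simple_prompt(class_names, num_objects_range=(1, 3), rng=None):
    """Pick a random subset of class names and join them into one prompt.

    Hypothetical helper: it mimics the behaviour described in the table,
    not the code shipped in the datadreamer package.
    """
    rng = rng or random.Random()
    low, high = num_objects_range
    # Never ask for more distinct objects than there are class names.
    high = min(high, len(class_names))
    low = min(low, high)
    count = rng.randint(low, high)  # inclusive on both ends
    chosen = rng.sample(class_names, count)  # no repeated names
    return chosen, "A photo of " + ", ".join(chosen)


if __name__ == "__main__":
    rng = random.Random(0)  # seeded only to make the demo repeatable
    for _ in range(3):
        objects, prompt = simple_prompt(["person", "moon", "robot"], (1, 3), rng)
        print(objects, "->", prompt)
```

Returning the chosen names alongside the prompt keeps the object list available for the annotation step, which needs to know which classes to look for in each image.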
Example
Command Line
datadreamer --save_dir path/to/save_directory --class_names person moon robot --prompts_number 20 --prompt_generator simple --num_objects_range 1 3 --image_generator sdxl-turbo
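If you launch DataDreamer from a script, it can help to assemble the same command as an argument list. The sketch below only rebuilds the example command from the flags it already uses; `build_command` is a hypothetical helper, not part of the datadreamer package, and nothing is executed unless you uncomment the last line.

```python
import shlex


def build_command(save_dir, class_names, prompts_number,
                  prompt_generator, num_objects_range, image_generator):
    """Assemble the datadreamer CLI call as a list of arguments.

    Hypothetical helper; it uses only flags that appear in the example.
    """
    low, high = num_objects_range
    return [
        "datadreamer",
        "--save_dir", str(save_dir),
        "--class_names", *class_names,
        "--prompts_number", str(prompts_number),
        "--prompt_generator", prompt_generator,
        "--num_objects_range", str(low), str(high),
        "--image_generator", image_generator,
    ]


cmd = build_command(
    save_dir="path/to/save_directory",
    class_names=["person", "moon", "robot"],
    prompts_number=20,
    prompt_generator="simple",
    num_objects_range=(1, 3),
    image_generator="sdxl-turbo",
)

# Print the command for review before running it.
print(shlex.join(cmd))

# To actually run it (requires datadreamer to be installed):
# import subprocess; subprocess.run(cmd, check=True)
```

Passing a list rather than a single string avoids shell quoting problems when a class name or path contains spaces.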
Useful tips
- Batched generation: To speed up the generation process, consider increasing the batch size with the --batch_size_prompt, --batch_size_image, and --batch_size_annotation parameters. If you are running out of memory, try reducing the batch size.
- Better image quality: For better image quality, consider tuning the following parameters:
  - --image_generator: Choose a model with higher image quality. SDXL-Turbo -> SDXL-Lightning -> SDXL (from fastest to slowest, and from lowest to highest quality).
  - --use_image_tester and --image_tester_patience: Enable iterative image generation and use the CLIP model to select the best images. Consider increasing the patience to get better results.
- Number of objects per image: To generate images with a different number of objects, use the --num_objects_range parameter. For example, --num_objects_range 1 3 generates images with 1, 2, or 3 objects. Values higher than 3 are not recommended due to the limited ability of the current models to generate complex scenes.
- Prompt generation: To generate more diverse prompts, consider using the --prompt_generator tiny generator, which uses a small language model to generate prompts.
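To make the image tester tip more concrete, here is one plausible reading of "patience" as a retry loop: generate a candidate, score it, keep the best so far, and stop early once a candidate is good enough. This is a sketch under assumptions, not DataDreamer's implementation; `generate_with_patience`, the `threshold` parameter, and the fake generator and scorer are all hypothetical stand-ins (a real scorer would be a CLIP-based check).

```python
def generate_with_patience(generate, score, patience, threshold):
    """Try up to `patience` candidates and return the best-scoring one.

    Hypothetical sketch: `generate(attempt)` returns a candidate image and
    `score(image)` returns a number where higher means a better match.
    """
    if patience < 1:
        raise ValueError("patience must be at least 1")
    best_image, best_score = None, float("-inf")
    for attempt in range(patience):
        image = generate(attempt)
        current = score(image)
        if current > best_score:
            best_image, best_score = image, current
        if best_score >= threshold:
            break  # good enough, no need to spend more attempts
    return best_image, best_score


# Stand-ins for the image generator and the CLIP-based tester.
fake_scores = {"image-0": 0.2, "image-1": 0.7, "image-2": 0.5}


def fake_generate(attempt):
    return f"image-{attempt}"


def fake_score(image):
    return fake_scores[image]


print(generate_with_patience(fake_generate, fake_score, patience=1, threshold=0.9))
# ('image-0', 0.2)
print(generate_with_patience(fake_generate, fake_score, patience=3, threshold=0.9))
# ('image-1', 0.7)
```

In this toy run, raising the patience from 1 to 3 lets the loop find a better candidate, at the cost of generating more images per prompt.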