Here we have examples of Google Colaboratory (aka Colab or simply colabs) notebooks that train models on various datasets. Colab provides free GPU instances, which makes these notebooks great for prototyping and even for simple production models.
The tutorials below cover MobileNetv2-SSD, tiny-YOLOv3, tiny-YOLOv4, and Deeplabv3+ (semantic segmentation). Many other object detectors and neural networks could be trained on Colab and run on DepthAI, so if you would like a different object detector or network backend supported, please feel free to open a GitHub issue!
And please feel free to work directly from our depthai-ml-training GitHub repository for the latest models we support.
The tutorial notebook Easy_Object_Detection_With_Custom_Data_Demo_Training.ipynb shows how to quickly train an object detector based on the MobileNetv2-SSD network.
Optionally, see our documentation for this module (here) for a guide/walk-through on how to use this notebook. You can also jump right into the notebook; with some experimentation it’s relatively straightforward to get a model trained.
After training is complete, the notebook also converts the model to a .blob file that runs on our DepthAI platform and modules. First, the model is converted to Intermediate Representation (IR), the format used by OpenVINO. The IR model is then compiled to a .blob file using a server we set up for that purpose. (The IR model can also be compiled to a blob locally.)
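The IR-to-blob step described above can be sketched in Python. This is a minimal sketch, not the notebook's own conversion cell: it assumes the separately installed `blobconverter` package (`pip install blobconverter`) and network access, and the helper names `check_ir_pair` and `compile_ir_to_blob`, the FP16 data type, and the default shave count are illustrative choices.

```python
from pathlib import Path
from typing import Tuple


def check_ir_pair(xml_path: str) -> Tuple[Path, Path]:
    """Return the (.xml, .bin) file pair that makes up an OpenVINO IR model."""
    xml = Path(xml_path)
    if xml.suffix != ".xml":
        raise ValueError(f"expected an IR .xml file, got {xml.name}")
    # The weights file sits next to the topology file with the same stem.
    return xml, xml.with_suffix(".bin")


def compile_ir_to_blob(xml_path: str, shaves: int = 6) -> str:
    """Send an IR model to the online converter and return the path of the .blob."""
    # Imported lazily: third-party package, assumed to be installed.
    import blobconverter

    xml, weights = check_ir_pair(xml_path)
    if not (xml.exists() and weights.exists()):
        raise FileNotFoundError(f"IR files not found: {xml}, {weights}")
    blob_path = blobconverter.from_openvino(
        xml=str(xml),
        bin=str(weights),
        data_type="FP16",  # assumed precision for this sketch
        shaves=shaves,     # assumed value; tune for your pipeline
    )
    return str(blob_path)
```

Calling `compile_ir_to_blob("model.xml")` with a hypothetical `model.xml` and `model.bin` in the working directory would return the location of the downloaded blob, which can then be loaded on the device.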
And that’s it: in less than a couple of hours, a fairly advanced proof-of-concept object detector can run on DepthAI to detect objects of your choice and their associated spatial information (i.e. xyz location). For example, this notebook was used to train DepthAI to locate strawberries in 3D space, as shown below:
The above example used a DepthAI Modular Cameras Edition (BW1098FFC).
The Medical Mask Detection Demo Training.ipynb training notebook shows another example of a more complex object detector. The training dataset consists of people wearing or not wearing masks for viral protection. There are almost 700 pictures with approximately 3600 bounding box annotations. The images are complex: they vary quite a lot in scale and composition. Nonetheless, the object detector does quite a good job with this relatively small dataset for such a task. Again, training takes around 2 hours: depending on which GPU the Colab lottery assigns to the notebook instance, training 10k steps can take anywhere from 1.5 to 2.5 hours. Either way, that is a short time for such a good-quality proof of concept on such a difficult task. We then performed the same steps as above to convert the model to a blob and run it on our DepthAI module.
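To put the quoted training times in perspective, the short calculation below converts them into seconds per step and projects other step counts. It is only a back-of-the-envelope sketch: the function names are illustrative, and it assumes training time scales linearly with the number of steps, which the notebook does not guarantee.

```python
def seconds_per_step(hours: float, steps: int) -> float:
    """Average wall-clock seconds spent on one training step."""
    return hours * 3600 / steps


def projected_hours(steps: int, sec_per_step: float) -> float:
    """Hours needed for `steps` steps, assuming time scales linearly."""
    return steps * sec_per_step / 3600


# Figures quoted above: 10k steps in 1.5 h (faster GPU) or 2.5 h (slower GPU).
fast = seconds_per_step(1.5, 10_000)
slow = seconds_per_step(2.5, 10_000)
print(f"{fast:.2f} to {slow:.2f} seconds per step")

# Projection for a hypothetical 20k-step run on the same GPUs.
print(f"{projected_hours(20_000, fast):.1f} to {projected_hours(20_000, slow):.1f} hours")
```

The two quoted durations correspond to roughly 0.54 and 0.90 seconds per step, so a run twice as long would take about 3 to 5 hours under the same assumption.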
Below is a quick test of the model produced with this notebook on the Luxonis DepthAI Onboard Cameras Edition (BW1098OBC):
This notebook operates on your set of images in Google Drive to resize them to the format needed by the training notebooks.
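As a rough illustration of this kind of preprocessing, the sketch below resizes every image in a folder. It is not the notebook's actual code: the helper names, the 800-pixel limit, and the choice to preserve aspect ratio are all assumptions made for the example, and it relies on the Pillow library (which Colab normally ships with).

```python
from pathlib import Path
from typing import Tuple


def scaled_size(width: int, height: int, max_side: int) -> Tuple[int, int]:
    """Shrink (width, height) so the longer side is at most max_side, keeping aspect ratio."""
    scale = max_side / max(width, height)
    if scale >= 1:
        return width, height  # already small enough, leave unchanged
    return max(1, round(width * scale)), max(1, round(height * scale))


def resize_folder(src_dir: str, dst_dir: str, max_side: int = 800) -> int:
    """Write resized copies of the images in src_dir to dst_dir; return how many were written."""
    from PIL import Image  # Pillow, imported lazily; assumed to be installed

    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    count = 0
    for path in sorted(Path(src_dir).iterdir()):
        if path.suffix.lower() not in {".jpg", ".jpeg", ".png"}:
            continue  # skip anything that is not a common image type
        with Image.open(path) as img:
            new_size = scaled_size(img.width, img.height, max_side)
            img.resize(new_size).save(dst / path.name)
        count += 1
    return count
```

In Colab, Google Drive would first be mounted with `from google.colab import drive; drive.mount('/content/drive')`, after which the source and destination folders are ordinary paths under `/content/drive`.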
We’re always happy to help with code or other questions you might have.