First steps with DepthAI

Hello DepthAI users!

In this guide, I assume you just got your DepthAI device (e.g. OAK-1 or OAK-D) and you want to give it a first try to explore what is possible with it and what you can achieve when working with it.

  • First, we will run a DepthAI demo script that will allow you to preview DepthAI functionalities.

  • Next, I will explain what the script does and describe basic terms used in the DepthAI world.

  • Following up, I will show which models you can run on the DepthAI out-of-the-box and how to run a custom model.

  • Last, you will receive useful links to expand your knowledge further and check open-sourced use-case implementations, code examples, and tutorials that you can use as a starting point for your projects.

Let’s start with the device setup below.

Connect the DepthAI device

After unboxing your DepthAI package, you will find your device together with a USB-C cable (and a power supply if you ordered an OAK-D).

Make sure that the device is connected to your host (which can be a PC, Raspberry Pi, or another capable device) either directly to a USB port or via a powered USB hub.

On Ubuntu, you can check if a new USB device was detected by running

$ lsusb | grep MyriadX
Bus 003 Device 002: ID 03e7:2485 Intel Movidius MyriadX

Note

If you are running an OS other than Ubuntu, or you think something has gone wrong, we have detailed OS-specific installation guides here, together with Discord support channels where you can chat with us live if you have any issues or questions.

Download demo script

Our goal is to make engineering with DepthAI more efficient. As a part of this effort, we created an all-in-one script that allows you to check DepthAI features using command line arguments - no coding required!

To download the demo script, you can either use git or directly download a zip file.

From zip file

First, download the repository package from here and unpack the archive to a directory of your preference. Next, open a terminal session in this directory.

From git

First, open a terminal session and go to the directory where you’d like to download the demo script. Then, run the following command to download it

$ git clone https://github.com/luxonis/depthai.git

After the repository is downloaded, make sure to enter the downloaded repository by running

$ cd depthai

Create Python virtualenv (optional)

To create and use a virtualenv, you can follow the official Python guide to virtual environments or follow OS-specific guides on the web, like “How to Create Python 3 Virtual Environment on Ubuntu 20.04”

Using a virtual environment ensures that you are working in a fresh environment with Python 3 as the default interpreter - this can help prevent potential issues.

I usually create and use virtualenvs by running

$ python3 -m venv myvenv
$ source myvenv/bin/activate
$ pip install -U pip

This may require installing the following packages first

$ apt-get install python3-pip python3-venv

Install requirements

Once the demo source code is downloaded and your terminal session is set up, the next step is to install all additional packages that this script requires (together with the depthai Python API itself).

To install these packages, run the install_requirements.py script

$ python3 install_requirements.py

Warning

If you are using a Linux system, in most cases you have to add a new udev rule so that our script can access the device correctly. You can add and apply the new rule by running

$ echo 'SUBSYSTEM=="usb", ATTRS{idVendor}=="03e7", MODE="0666"' | sudo tee /etc/udev/rules.d/80-movidius.rules
$ sudo udevadm control --reload-rules && sudo udevadm trigger
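
With the requirements installed and (on Linux) the udev rule in place, you can optionally confirm from Python that the library detects your device before launching the demo. The following is a minimal sketch assuming the depthai v2 API; the exact device-listing call may differ between library versions

import depthai as dai

# List all DepthAI devices visible from the host
devices = dai.Device.getAllAvailableDevices()
if not devices:
    print("No DepthAI device found - check the cable and udev rules")
for device in devices:
    print(f"Found device: {device.getMxId()} ({device.state.name})")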

Now you should be able to start using the demo script, which we will do next.

Run demo script

Having everything set up, we are now ready to use the demo script by running

$ python3 depthai_demo.py

This will compile and download a default mobilenet-ssd model, configure the DepthAI device, and then display an rgb window containing a scaled preview from your device’s RGB camera.

If you’re using an OAK-D, it will also display a depth window, showing the depth map that DepthAI calculates from the left and right camera images.

Default run

Change input camera to left/right (OAK-D only)

To run the demo script and get a preview from the left camera, run

$ python3 depthai_demo.py -cam left
Run from left

Similarly, to get a preview from the right camera, run

$ python3 depthai_demo.py -cam right
Run from right

Default model

While the demo was running, you could see the detection results - and if you were standing in front of the camera, you should have seen yourself detected as a person with a pretty high probability.

The model that is used by default is a MobileNetv2 SSD object detector trained on the PASCAL VOC 2007 classes, which are:

  • Person: person

  • Animal: bird, cat, cow, dog, horse, sheep

  • Vehicle: airplane, bicycle, boat, bus, car, motorbike, train

  • Indoor: bottle, chair, dining table, potted plant, sofa, TV/monitor

So give it a try and detect different objects, like bottles or apples.

bottles and apples

Or even cats.

cat

Using other models

We have prepared other models, which you can try and evaluate by simply changing one command line parameter. To run the demo script with a different model, e.g. face-detection-retail-0004, run the following command

$ python3 depthai_demo.py -cnn face-detection-retail-0004

This will allow you to detect human faces, like below.

face

You can use the -cnn <name> flag to change the model that is being run on the DepthAI. Below is a list of models you can use with just the demo script downloaded

  • face-detection-adas-0001 - Detects faces in the image (slower)

    $ python3 depthai_demo.py -cnn face-detection-adas-0001
    
    face-detection-adas-0001
  • face-detection-retail-0004 - Detects faces in the image (faster)

    $ python3 depthai_demo.py -cnn face-detection-retail-0004
    
    face-detection-retail-0004
  • mobilenet-ssd - Object detector that detects 20 different classes (default)

    $ python3 depthai_demo.py -cnn mobilenet-ssd
    
    mobilenet-ssd
  • pedestrian-detection-adas-0002 - Detects people in the image (slower)

    $ python3 depthai_demo.py -cnn pedestrian-detection-adas-0002
    
    pedestrian-detection-adas-0002
  • person-detection-retail-0013 - Detects people in the image (faster)

    $ python3 depthai_demo.py -cnn person-detection-retail-0013
    
    person-detection-retail-0013
  • person-vehicle-bike-detection-crossroad-1016 - Detects people, bikes, and vehicles in the image

    $ python3 depthai_demo.py -cnn person-vehicle-bike-detection-crossroad-1016
    
    person-vehicle-bike-detection-crossroad-1016
  • yolo-v3 - Object detector that detects 80 different classes (slower)

    $ python3 depthai_demo.py -cnn yolo-v3
    
    yolo-v3
  • tiny-yolo-v3 - Object detector that detects 80 different classes (faster)

    $ python3 depthai_demo.py -cnn tiny-yolo-v3
    
    tiny-yolo-v3
  • vehicle-detection-adas-0002 - Detects vehicles in the image

    $ python3 depthai_demo.py -cnn vehicle-detection-adas-0002
    
    vehicle-detection-adas-0002
  • vehicle-license-plate-detection-barrier-0106 - Detects vehicles and license plates in the image (Chinese license plates only)

    $ python3 depthai_demo.py -cnn vehicle-license-plate-detection-barrier-0106
    
    vehicle-license-plate-detection-barrier-0106

All of the data we use to download and compile a model can be found here.

Using custom models

Let’s assume you want to run a custom model which you downloaded from the model zoo or trained yourself (or both). In order to prepare your model to run on DepthAI, it has to be compiled into the MyriadX blob format - an optimized version of your model, capable of utilizing the MyriadX chip as a processing unit.

In our demo script, we support a few ways you can run your custom blob, which will be covered below. As an example, I’ll add a custom face detection network called custom_model (substitute with your preferred name) and run it with the demo script.

Compile MyriadX blob

To obtain a MyriadX blob, the network has to already be in the OpenVINO IR format (consisting of .xml and .bin files), which will be used for the compilation. We won’t focus here on how to obtain this representation for your model, but be sure to check the official OpenVINO conversion guide.

To convert custom_model.xml and custom_model.bin, we’ll use the blobconverter CLI - our tool that utilizes the online MyriadX blob converter to perform the conversion. No local OpenVINO installation is needed in this case, as all of the dependencies are already installed on the server. If your model is in TensorFlow or Caffe format, you can still use our tool for the conversion, just note that you’ll have to use different input flags and sometimes provide custom model optimizer args (Read more)

First, let’s install blobconverter from PyPI

$ python3 -m pip install -U blobconverter

Now, having the blobconverter installed, we can compile our IR files with the following command

$ python3 -m blobconverter --openvino-xml /path/to/custom_model.xml --openvino-bin /path/to/custom_model.bin

By running this command, blobconverter sends a request to the BlobConverter API to perform the model compilation on the provided files. After compilation, the API responds with a .blob file and deletes all source files that were sent with the request.
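
If you prefer to run the conversion from Python instead of the CLI, the blobconverter package also exposes helper functions. Below is a minimal sketch assuming the from_openvino helper and local IR files; check the blobconverter documentation for the exact parameters

import blobconverter

# Compile local OpenVINO IR files into a MyriadX blob (paths as in the CLI example above)
blob_path = blobconverter.from_openvino(
    xml="/path/to/custom_model.xml",
    bin="/path/to/custom_model.bin",
    data_type="FP16",
    shaves=6,
)
print(blob_path)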

After a successful compilation, blobconverter returns the path to the downloaded blob file. Since the demo script expects this blob inside the depthai repository, let’s move it there

$ mkdir <depthai_repo>/resources/nn/custom_model
$ mv <path_to_blob> <depthai_repo>/resources/nn/custom_model

Configuration

We need to provide some additional configuration for the demo script to run this blob. The demo script will look for a custom_model.json for details on how to configure the pipeline and parse the results.

If your model is based on MobileNetSSD or Yolo, you can use our detection output format. If it’s a different type of network, you can use raw (default) output format and provide a custom handler file to decode and display the NN results.

You can use the configuration examples below to customize your custom_model.json inside the resources/nn/custom_model directory

  • MobileNetSSD

{
    "nn_config":
    {
        "output_format" : "detection",
        "NN_family" : "mobilenet",
        "confidence_threshold" : 0.5,
        "input_size": "300x300"
    },
    "mappings":
    {
        "labels":
        [
            "unknown",
            "face"
        ]
    }
}
  • Yolo

{
    "nn_config":
    {
        "output_format" : "detection",
        "NN_family" : "YOLO",
        "input_size": "300x300",
        "NN_specific_metadata" :
        {
            "classes" : 80,
            "coordinates" : 4,
            "anchors" : [10,14, 23,27, 37,58, 81,82, 135,169, 344,319],
            "anchor_masks" :
            {
                "side26" : [1,2,3],
                "side13" : [3,4,5]
            },
            "iou_threshold" : 0.5,
            "confidence_threshold" : 0.5
        }
    },
    "mappings":
    {
        "labels":
        [
              "unknown",
              "face"
        ]
    }
}
  • Raw (with custom handler)

{
    "nn_config": {
        "output_format" : "raw",
        "input_size": "300x300"
    },
    "handler": "handler.py"
}

Run the demo script

Having the files in place, we can now run the demo with our custom model

$ python3 depthai_demo.py -cnn custom_model

And you should see the preview output with your NN results displayed (or printed in the console if the raw output format was selected and there is no handler file).

custom model

Be sure to check the advanced sections below or see Next steps.

Custom handler

A custom handler is a file that the demo script will load and execute to parse the NN results. We specify this file with the handler config value, which points to the file of your preference. It also requires the raw output format, since this prevents the script from handling the results itself.

The handler.py file should contain two methods - decode(nn_manager, packet) and draw(nn_manager, data, frames)

def decode(nn_manager, packet):
  pass

def draw(nn_manager, data, frames):
  pass

The first method, decode, is called whenever an NN packet arrives from the pipeline (passed as the packet param), along with an nn_manager object that contains all NN-related info used by the script (like input size, etc.). The goal of this function is to decode the received packets from the NN blob into meaningful results that can later be displayed.

The second method, draw, is called with the NN results (returned from decode), the nn_manager object, and a frames array containing [(<frame_name>, <frame>), (<frame_name>, <frame>), ...] items. This array will contain the frames that were specified with the -s/--show param. The goal of this function is to draw the decoded results onto the received frames.

Below, you can find an example handler.py file that decodes and displays MobilenetSSD-based results.

import cv2
import numpy as np
from depthai_helpers.utils import frame_norm


def decode(nn_manager, packet):
    # MobileNetSSD output is a flat FP16 array of 7-element detections:
    # [image_id, label, confidence, xmin, ymin, xmax, ymax]
    bboxes = np.array(packet.getFirstLayerFp16())
    bboxes = bboxes.reshape((bboxes.size // 7, 7))
    # Keep only detections with confidence above 0.5
    bboxes = bboxes[bboxes[:, 2] > 0.5]
    labels = bboxes[:, 1].astype(int)
    confidences = bboxes[:, 2]
    bboxes = bboxes[:, 3:7]
    return {
        "labels": labels,
        "confidences": confidences,
        "bboxes": bboxes
    }


decoded = ["unknown", "face"]


def draw(nn_manager, data, frames):
    for name, frame in frames:
        # Draw only on the frame that was used as the NN input
        if name == nn_manager.source:
            for label, conf, raw_bbox in zip(*data.values()):
                # Convert normalized bbox coords to pixel coords of this frame
                bbox = frame_norm(frame, raw_bbox)
                cv2.rectangle(frame, (bbox[0], bbox[1]), (bbox[2], bbox[3]), (255, 0, 0), 2)
                cv2.putText(frame, decoded[label], (bbox[0] + 10, bbox[1] + 20), cv2.FONT_HERSHEY_TRIPLEX, 0.5, 255)
                cv2.putText(frame, f"{int(conf * 100)}%", (bbox[0] + 10, bbox[1] + 40), cv2.FONT_HERSHEY_TRIPLEX, 0.5, 255)
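
For reference, the frame_norm helper used above maps the normalized (0..1) bounding box coordinates returned by the NN to pixel coordinates of the given frame. The following is a rough sketch of what it does; the actual helper ships with the demo repository in depthai_helpers/utils.py

import numpy as np

def frame_norm(frame, bbox):
    # Scale normalized (0..1) bbox values to pixel coordinates,
    # clipping out-of-range values to the frame boundaries
    norm_vals = np.full(len(bbox), frame.shape[0])
    norm_vals[::2] = frame.shape[1]
    return (np.clip(np.array(bbox), 0, 1) * norm_vals).astype(int)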

With the custom face detection model and this code, we receive the following output.

custom model custom handler

We already use this handler mechanism to decode deeplabv3p_person, which comes as one of the networks available to use in the demo script.

On-demand compilation

Since files in the IR format can be large, and converting them directly means both uploading the IR files to the server and downloading the blob, we have incorporated an OpenVINO-like model.yml file structure that the BlobConverter server also uses internally. You can check what this file looks like in the OpenVINO model zoo or in the models available in the demo script.

This file is used by the OpenVINO model downloader to download the files required for compilation. In our demo script, we use these files to provide a URL to the NN source files instead of shipping them along with the source code. On-demand compilation is also useful because it allows us to use the same configuration while requesting a different number of MyriadX SHAVE cores.

To download the blob using a model.yml file, run

$ python3 -m blobconverter --raw-config /path/to/model.yml --raw-name custom_model
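
On-demand compilation also makes it easy to request a different number of SHAVE cores for the same model; assuming the --shaves flag of the blobconverter CLI, that could look like

$ python3 -m blobconverter --raw-config /path/to/model.yml --raw-name custom_model --shaves 8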

You can also leave the model.yml file inside the resources/nn/<name> directory. This will make the demo script perform the conversion for you and run the resulting blob

$ python3 depthai_demo.py -cnn <name>

Next steps

In the previous sections, we learned how to preview basic DepthAI features. From this point, you can explore the DepthAI world further

  • Looking for inspiration?

    Check our Example Use Cases for ready-to-use applications that solve specific problems with DepthAI

  • Want to start coding?

    Be sure to check the hello world tutorial in the API section for a step-by-step introduction to the API

  • Want to train and deploy a custom model to DepthAI?

    Visit the Custom training page for ready-to-use Colab notebooks

Got questions?

We’re always happy to help with code or other questions you might have.

Community Discord

Chat live with the DepthAI team and devs like you.

Discussion Forum

Like chat, just asynchronous.

Email Support

Send a message to our support team.