Introduction
Let's dive into the basics using an example. We'll create a simple application that runs an object detection neural network and streams color video with the neural network detections visualized on it. We'll use the DepthAI Python API to create the application.
Creating a pipeline
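Every node we define below lives inside a Pipeline object, so we create an empty one first (this line comes from the complete example at the end of this section):

import depthai

pipeline = depthai.Pipeline()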
Sources
Now, the first node we will add is a ColorCamera. This node will automatically select the center camera (which on most devices is the color camera) and will provide the video stream to the next node in the pipeline. We will use the preview output, resized to 300x300 to fit the mobilenet-ssd input size (which we will define later).
For more information about the ColorCamera node, please refer to the ColorCamera documentation.
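In code, this step looks as follows (taken from the complete example below):

# Set up the color camera; its 'preview' output will produce 300x300 frames
cam_rgb = pipeline.createColorCamera()
cam_rgb.setPreviewSize(300, 300)
cam_rgb.setInterleaved(False)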
Detection network
Up next, let's define a MobileNetDetectionNetwork node with the mobilenet-ssd network. Each neural network we want to use must first be compiled to a blob file, a binary file containing the network weights and configuration that can be loaded onto the device.
More info about compiling the network can be found in the Compiling a neural network section.
The blob file for this example will be compiled and downloaded automatically using the blobconverter tool. The blobconverter.from_zoo() function returns the Path to the model, so we can pass its result directly to detection_nn.setBlobPath(). With this node, the output from the nn will be parsed on the device side and we'll receive ready-to-use detection objects. For this to work properly, we also need to set a confidence threshold to filter out incorrect results.
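The corresponding lines from the complete example:

detection_nn = pipeline.createMobileNetDetectionNetwork()
# blobconverter fetches and compiles the MobileNetSSD blob from the OpenVINO Model Zoo
detection_nn.setBlobPath(blobconverter.from_zoo(name='mobilenet-ssd', shaves=6))
# drop detections with confidence below 0.5 (confidence is in the <0..1> range)
detection_nn.setConfidenceThreshold(0.5)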
Sinks
If we wish to see the output of the neural network, we need to add an XLinkOut node. This node will send the data to the host, where we can process it further. Each XLinkOut node should only be connected to one node, so we need to add one for the detection network and one for the RGB stream.
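Again from the complete example, we create one XLinkOut per stream and give each a name that the host will later use to refer to it:

xout_rgb = pipeline.createXLinkOut()
xout_rgb.setStreamName("rgb")

xout_nn = pipeline.createXLinkOut()
xout_nn.setStreamName("nn")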
Linking nodes together
Now we need to connect the nodes, as shown in the snippet below. We will connect the ColorCamera to the MobileNetDetectionNetwork, and the MobileNetDetectionNetwork to one XLinkOut. We will also connect the ColorCamera to the other XLinkOut to get the video stream. Note that a node output can be connected to multiple node inputs at the same time.
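The three links from the complete example (note cam_rgb.preview feeding two inputs at once):

cam_rgb.preview.link(xout_rgb.input)
cam_rgb.preview.link(detection_nn.input)
detection_nn.out.link(xout_nn.input)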
Uploading and running the pipeline
Connect to device
In order to upload the constructed pipeline to the device, we must first initialize the device. The general practice is to initialize it using a context manager, so we don't have to worry about closing the device after we are done. The depthai.Device(pipeline) call will create a Device object and pass it the created Pipeline object. In the background, this call will check both the USB and NETWORK interfaces for a device that is available and ready to accept connections.

Initialize queues

To consume the results on the host, we get output queues from the device, referring to the stream names we assigned to the XLinkOut nodes earlier.
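From the complete example, the device is opened as a context manager and the two queues are requested by name:

with depthai.Device(pipeline) as device:
    # the device is now running the pipeline and sending results via XLink
    q_rgb = device.getOutputQueue("rgb")
    q_nn = device.getOutputQueue("nn")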
Consuming device messages
We create an infinite loop so the application keeps running and consuming device messages. In each iteration of the loop, we try to retrieve messages sent by the device to the queues we created above. The .tryGet() method will return either a data packet or None if there isn't one; the .get() method is similar, but will block until a packet is received.
- If a packet from the RGB camera is present, we retrieve the frame in OpenCV format using .getCvFrame().
- When data from the nn is received, we take the .detections array that contains the mobilenet-ssd results.
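The matching part of the complete example, including the default values for frame and detections:

frame = None
detections = []

while True:
    # tryGet returns the next packet from the queue, or None if there isn't one
    in_rgb = q_rgb.tryGet()
    in_nn = q_nn.tryGet()

    if in_rgb is not None:
        # retrieve the camera frame in OpenCV (numpy BGR) format
        frame = in_rgb.getCvFrame()

    if in_nn is not None:
        # store the latest mobilenet-ssd results
        detections = in_nn.detections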
Drawing frames
If we want to view the results, we can do so with OpenCV. We use the frame we received on the rgb queue and draw a rectangle over the image for every detection consumed from the nn queue. The bbox coordinates we receive are normalized to the [0, 1] range, so we need to denormalize them before we can draw over our RGB frame. This is done with a custom function, frameNorm(), shown below.
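frameNorm() (reproduced from the complete example) scales the x coordinates by the frame width and the y coordinates by the frame height. As a quick worked example with made-up numbers: on a 300x300 preview frame, a detection of (0.2, 0.3, 0.5, 0.6) maps to the pixel box (60, 90, 150, 180).

def frameNorm(frame, bbox):
    # frame.shape is (height, width, channels); even indices of bbox are x values
    normVals = np.full(len(bbox), frame.shape[0])
    normVals[::2] = frame.shape[1]  # x values are scaled by width, y values by height
    return (np.clip(np.array(bbox), 0, 1) * normVals).astype(int)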
Putting it all together, here is the complete code for this example:

# first, import all necessary modules
from pathlib import Path

import blobconverter
import cv2
import depthai
import numpy as np

pipeline = depthai.Pipeline()

# First, we want the Color camera as the output
cam_rgb = pipeline.createColorCamera()
cam_rgb.setPreviewSize(300, 300)  # 300x300 will be the preview frame size, available as 'preview' output of the node
cam_rgb.setInterleaved(False)

detection_nn = pipeline.createMobileNetDetectionNetwork()
# Blob is the Neural Network file, compiled for MyriadX. It contains both the definition and weights of the model
# We're using a blobconverter tool to retrieve the MobileNetSSD blob automatically from OpenVINO Model Zoo
detection_nn.setBlobPath(blobconverter.from_zoo(name='mobilenet-ssd', shaves=6))
# Next, we filter out the detections that are below a confidence threshold. Confidence can be anywhere between <0..1>
detection_nn.setConfidenceThreshold(0.5)

# XLinkOut is a "way out" from the device. Any data you want to transfer to host needs to be sent via XLink
xout_rgb = pipeline.createXLinkOut()
xout_rgb.setStreamName("rgb")

xout_nn = pipeline.createXLinkOut()
xout_nn.setStreamName("nn")

cam_rgb.preview.link(xout_rgb.input)
cam_rgb.preview.link(detection_nn.input)
detection_nn.out.link(xout_nn.input)

# Pipeline is now finished, and we need to find an available device to run our pipeline
# we are using context manager here that will dispose the device after we stop using it
with depthai.Device(pipeline) as device:
    # From this point, the Device will be in "running" mode and will start sending data via XLink

    # To consume the device results, we get two output queues from the device, with stream names we assigned earlier
    q_rgb = device.getOutputQueue("rgb")
    q_nn = device.getOutputQueue("nn")

    # Here, some of the default values are defined. Frame will be an image from "rgb" stream, detections will contain nn results
    frame = None
    detections = []

    # Since the detections returned by nn have values from <0..1> range, they need to be multiplied by frame width/height to
    # receive the actual position of the bounding box on the image
    def frameNorm(frame, bbox):
        normVals = np.full(len(bbox), frame.shape[0])
        normVals[::2] = frame.shape[1]
        return (np.clip(np.array(bbox), 0, 1) * normVals).astype(int)

    while True:
        # we try to fetch the data from nn/rgb queues. tryGet will return either the data packet or None if there isn't any
        in_rgb = q_rgb.tryGet()
        in_nn = q_nn.tryGet()

        if in_rgb is not None:
            # If the packet from RGB camera is present, we're retrieving the frame in OpenCV format using getCvFrame
            frame = in_rgb.getCvFrame()

        if in_nn is not None:
            # when data from nn is received, we take the detections array that contains mobilenet-ssd results
            detections = in_nn.detections

        if frame is not None:
            for detection in detections:
                # for each bounding box, we first normalize it to match the frame size
                bbox = frameNorm(frame, (detection.xmin, detection.ymin, detection.xmax, detection.ymax))
                # and then draw a rectangle on the frame to show the actual result
                cv2.rectangle(frame, (bbox[0], bbox[1]), (bbox[2], bbox[3]), (255, 0, 0), 2)
            # After all the drawing is finished, we show the frame on the screen
            cv2.imshow("preview", frame)

        # at any time, you can press "q" and exit the main loop, therefore exiting the program itself
        if cv2.waitKey(1) == ord('q'):
            break