YoloSpatialDetectionNetwork
Spatial detection for the Yolo NN. It is similar to a combination of the YoloDetectionNetwork and SpatialLocationCalculator.
How to place it
Python
import depthai as dai
pipeline = dai.Pipeline()
yoloSpatial = pipeline.create(dai.node.YoloSpatialDetectionNetwork)
Inputs and Outputs
Configuring Spatial Detection
The pipeline of the SpatialDetectionNetwork node is described in the schema below:
The Spatial Detection node is essentially an abstraction over a Detection Network (YoloDetectionNetwork or MobileNetDetectionNetwork) and the SpatialLocationCalculator.
It works by passing the bounding box of each detected object to the spatial location calculator. The process goes as follows:
1. Detection
The Detection Network is responsible for detecting objects in the input frame. It outputs a list of detected objects, each represented by a bounding box, label and a confidence score.
2. Alignment
The depth map is aligned with the input frame. This is necessary because the DetectionNetwork operates on the input frame, while the SpatialLocationCalculator operates on the depth map.
3. Scaling of BBOX
The bounding box from the network is sent to SpatialLocationCalculator and is scaled according to BoundingBoxScaleFactor. This is done to ensure it includes the entire object. The bounding box is then used along with depth to calculate the spatial coordinates of the object.
4. Calculation of spatials
- X and Y coordinates are taken from the bounding box center. They are calculated based on the offset from the center of the frame and the depth at that point.
- For depth (Z), each pixel inside the scaled bounding box (ROI) is taken into account. This gives us a set of depth values, which are then averaged to get the final depth value.
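The scaling and calculation steps above can be sketched on the host in plain Python. This is only an illustration under stated assumptions, not the on-device implementation: `scale_bbox` and `spatial_from_roi` are hypothetical helpers, the camera is treated as an ideal pinhole with square pixels and a known horizontal field of view, zero depth values are treated as invalid, and the sign of Y is flipped so that "up" is positive.

```python
import math

def scale_bbox(xmin, ymin, xmax, ymax, scale):
    """Scale a box around its own center (hypothetical helper)."""
    cx, cy = (xmin + xmax) / 2, (ymin + ymax) / 2
    half_w = (xmax - xmin) * scale / 2
    half_h = (ymax - ymin) * scale / 2
    return cx - half_w, cy - half_h, cx + half_w, cy + half_h

def spatial_from_roi(depth, bbox, scale, hfov_deg):
    """Estimate (x, y, z) in mm from a depth map given as a list of rows.

    Assumptions: pinhole model, square pixels, depth value 0 means invalid.
    """
    height, width = len(depth), len(depth[0])
    xmin, ymin, xmax, ymax = scale_bbox(*bbox, scale)

    # Clamp the scaled ROI to the frame and collect valid depth pixels.
    x0, x1 = max(0, int(xmin)), min(width, math.ceil(xmax))
    y0, y1 = max(0, int(ymin)), min(height, math.ceil(ymax))
    values = [depth[y][x] for y in range(y0, y1) for x in range(x0, x1)
              if depth[y][x] > 0]
    if not values:
        return None

    # Z: average of all valid depth values inside the ROI.
    z = sum(values) / len(values)

    # X and Y: offset of the box center from the frame center, projected
    # through a focal length (in pixels) derived from the horizontal FOV.
    dx = (xmin + xmax) / 2 - width / 2
    dy = (ymin + ymax) / 2 - height / 2
    focal_px = (width / 2) / math.tan(math.radians(hfov_deg) / 2)
    x = z * dx / focal_px
    y = -z * dy / focal_px  # image rows grow downward, so flip the sign
    return x, y, z

if __name__ == "__main__":
    flat_wall = [[1000] * 4 for _ in range(4)]  # every pixel is 1 m away
    print(spatial_from_roi(flat_wall, (2, 0, 4, 4), 1.0, 90.0))
```

A box centered in the frame yields X and Y close to zero, and the further the box center moves from the frame center, the larger the lateral offset becomes for the same depth.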
Averaging methods
- Average/mean: the average of ROI is used for calculation.
- Min: the minimum value inside ROI is used for calculation.
- Max: the maximum value inside ROI is used for calculation.
- Mode: the most frequent value inside ROI is used for calculation.
- Median: the median value inside ROI is used for calculation.
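To see how the choice of method changes the result, the sketch below applies each reduction to the same small set of ROI depth values using Python's `statistics` module. The function `reduce_depth` and its string method names are hypothetical and exist only for this illustration; they are not the node's configuration API.

```python
import statistics

def reduce_depth(values, method="average"):
    """Collapse ROI depth values (mm) into one number (hypothetical helper)."""
    reducers = {
        "average": statistics.mean,
        "min": min,
        "max": max,
        "mode": statistics.mode,
        "median": statistics.median,
    }
    return reducers[method](values)

if __name__ == "__main__":
    # Three pixels on the object, one background pixel far behind it.
    roi_values = [1000, 1000, 1020, 3000]
    for name in ("average", "min", "max", "mode", "median"):
        print(name, reduce_depth(roi_values, name))
```

With a single far background pixel in the ROI, the average is pulled noticeably away from the object, while min, mode and median stay close to it.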
Common mistakes
Most mistakes stem from incorrect bounding box overlap. The scaled bounding box may include parts of the background, which can skew the depth calculation.
- Thin objects (like a pole) may have inaccurate spatials, since only a small portion of the bounding box actually lies on the detected object. In such cases, it is best to use a smaller BoundingBoxScaleFactor if possible.
- Objects with holes (hoops, rings, etc.). To get the correct depth, the bounding box should include the entire object. Instead of the median depth, use the MIN depth method to exclude the background from the calculation. Alternatively, a depth threshold can be set to ignore the background in a static environment.
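The ring case can be reproduced with a handful of made-up depth values. In this sketch the ROI holds a few pixels on the ring and a larger number of background pixels seen through the hole. `filter_by_threshold` is a hypothetical helper that mimics a lower and upper depth threshold; treating the bounds as inclusive is an assumption.

```python
import statistics

def filter_by_threshold(values, lower_mm, upper_mm):
    """Keep only depth values inside [lower_mm, upper_mm] (hypothetical helper)."""
    return [v for v in values if lower_mm <= v <= upper_mm]

if __name__ == "__main__":
    ring = [790, 800, 810]      # pixels on the ring, about 0.8 m away
    background = [3000] * 5     # wall seen through the hole, 3 m away
    roi_values = ring + background

    # Median lands on the background because it dominates the ROI.
    print("median:", statistics.median(roi_values))
    # Min picks the closest surface, which is the ring itself.
    print("min:", min(roi_values))
    # In a static scene, an upper threshold removes the wall entirely.
    kept = filter_by_threshold(roi_values, 100, 2000)
    print("thresholded average:", statistics.mean(kept))
```

The MIN method is sensitive to a single noisy close pixel, so the threshold approach may be the more stable option when the background distance is known.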
Usage
Python
import depthai as dai

pipeline = dai.Pipeline()
yoloSpatial = pipeline.create(dai.node.YoloSpatialDetectionNetwork)
yoloSpatial.setBlobPath(nnBlobPath)  # nnBlobPath: path to the compiled model blob, defined elsewhere

# Spatial detection specific parameters
yoloSpatial.setConfidenceThreshold(0.5)
yoloSpatial.input.setBlocking(False)
yoloSpatial.setBoundingBoxScaleFactor(0.5)
yoloSpatial.setDepthLowerThreshold(100)   # Min 100 mm (10 centimeters)
yoloSpatial.setDepthUpperThreshold(5000)  # Max 5000 mm (5 meters)

# Yolo specific parameters
yoloSpatial.setNumClasses(80)
yoloSpatial.setCoordinateSize(4)
yoloSpatial.setAnchors([10,14, 23,27, 37,58, 81,82, 135,169, 344,319])
yoloSpatial.setAnchorMasks({ "side26": [1,2,3], "side13": [3,4,5] })
yoloSpatial.setIouThreshold(0.5)
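The relationship between the anchors and the anchor masks in the snippet above can be made explicit with a short sketch. It assumes the usual YOLO convention that the flat anchor list is read as (width, height) pairs and that each mask lists indices into those pairs; `anchors_per_output` is a hypothetical helper, not part of the depthai API.

```python
def anchors_per_output(anchors, anchor_masks):
    """Group a flat anchor list into (w, h) pairs and select them per mask."""
    pairs = [(anchors[i], anchors[i + 1]) for i in range(0, len(anchors), 2)]
    return {side: [pairs[i] for i in mask] for side, mask in anchor_masks.items()}

if __name__ == "__main__":
    anchors = [10, 14, 23, 27, 37, 58, 81, 82, 135, 169, 344, 319]
    masks = {"side26": [1, 2, 3], "side13": [3, 4, 5]}
    for side, selected in anchors_per_output(anchors, masks).items():
        print(side, selected)
```

Under this reading, the finer "side26" output uses the smaller anchor boxes and the coarser "side13" output uses the larger ones, with the (81, 82) anchor shared by both.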
Examples of functionality
Spatial coordinate system
OAK cameras use a left-handed (Cartesian) coordinate system for all spatial coordinates.
Reference
class
depthai.node.YoloSpatialDetectionNetwork(depthai.node.SpatialDetectionNetwork)
method
getAnchors(self) -> list[float]
Get anchors
method
getCoordinateSize(self) -> int
Get coordinate size
method
getIouThreshold(self) -> float
Get IoU threshold
method
getNumClasses(self) -> int
Get num classes
method
setAnchorMasks(self, anchorMasks: dict[str, list[int]])
Set anchor masks
method
setAnchors(self, anchors: list[float])
Set anchors
method
setCoordinateSize(self, coordinates: int)
Set coordinate size
method
setIouThreshold(self, thresh: float)
Set IoU threshold
method
setNumClasses(self, numClasses: int)
Set num classes