ON THIS PAGE

  • SpatialDetectionNetwork
  • How to place it
  • Inputs and Outputs
  • Configuring Spatial Detection
  • Detection
  • Alignment
  • Scaling of BBOX
  • Calculation of spatials
  • Averaging methods
  • Common mistakes
  • Usage
  • Examples of functionality
  • Spatial coordinate system
  • Reference

SpatialDetectionNetwork

The SpatialDetectionNetwork node is essentially a combination of the DetectionNetwork and the SpatialLocationCalculator.

How to place it

Python
modelDescription = dai.NNModelDescription("yolov6-nano")
pipeline = dai.Pipeline()
spatialDetectionNetwork = pipeline.create(dai.node.SpatialDetectionNetwork).build(camRgb, stereo, modelDescription)

Inputs and Outputs

Configuring Spatial Detection

The pipeline of the SpatialDetectionNetwork node is described in the schema below. The Spatial Detection node is essentially an abstraction over the DetectionNetwork and the SpatialLocationCalculator: it links the bounding box of each detected object to the spatial location calculator. The process goes as follows:

Detection

The Detection Network is responsible for detecting objects in the input frame. It outputs a list of detected objects, each represented by a bounding box, a label, and a confidence score.

Alignment

The depth map is aligned with the input frame. This is necessary because the DetectionNetwork operates on the input frame, while the SpatialLocationCalculator operates on the depth map.

Scaling of BBOX

The bounding box from the network is sent to the SpatialLocationCalculator and scaled according to BoundingBoxScaleFactor (a value in the interval (0,1], so the box shrinks toward its center). Scaling the box down keeps the ROI on the object itself rather than on the surrounding background. The scaled bounding box is then used together with the depth map to calculate the spatial coordinates of the object.
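The scaling step can be sketched in plain Python. The helper below is hypothetical (not part of the DepthAI API); it shows the geometric effect of a scale factor applied around the box center:

```python
def scale_bbox(xmin, ymin, xmax, ymax, scale_factor):
    """Scale a bounding box around its center by scale_factor.

    A factor in (0, 1] shrinks the box, which helps exclude
    background pixels from the depth ROI.
    """
    cx = (xmin + xmax) / 2
    cy = (ymin + ymax) / 2
    half_w = (xmax - xmin) / 2 * scale_factor
    half_h = (ymax - ymin) / 2 * scale_factor
    return (cx - half_w, cy - half_h, cx + half_w, cy + half_h)

# A 100x100 box scaled by 0.5 becomes a 50x50 box with the same center.
print(scale_bbox(0, 0, 100, 100, 0.5))  # (25.0, 25.0, 75.0, 75.0)
```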

Calculation of spatials

  • X and Y coordinates are taken from the bounding box center. They are calculated based on the offset from the center of the frame and the depth at that point.
  • For depth (Z), each pixel inside the scaled bounding box (ROI) is taken into account. This gives us a set of depth values, which are then averaged to get the final depth value.
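The X/Y computation above is standard pinhole-camera back-projection. A minimal sketch, assuming a hypothetical helper and illustrative intrinsics (focal lengths and principal point are made up, not read from a real calibration):

```python
def spatials_from_pixel(u, v, depth_mm, fx, fy, cx, cy):
    """Back-project a pixel (u, v) at depth Z into camera-space X/Y/Z (mm).

    (cx, cy) is the principal point (frame center for an ideal camera);
    fx, fy are the focal lengths in pixels.
    """
    x = (u - cx) * depth_mm / fx
    y = (v - cy) * depth_mm / fy
    return (x, y, depth_mm)

# A pixel 100 px right of the principal point, 1 m away, fx = 800:
# X = 100 * 1000 / 800 = 125 mm to the side of the optical axis.
print(spatials_from_pixel(740, 360, 1000, 800.0, 800.0, 640.0, 360.0))
```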

Averaging methods

  • Average/mean: the average of ROI is used for calculation.
  • Min: the minimum value inside ROI is used for calculation.
  • Max: the maximum value inside ROI is used for calculation.
  • Mode: the most frequent value inside ROI is used for calculation.
  • Median: the median value inside ROI is used for calculation.
The default method is Median.
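The methods differ most when the ROI contains background pixels. A quick sketch using only the Python standard library (the ROI values are made up for illustration):

```python
import statistics

# Depth ROI (mm): mostly object at ~1000 mm, plus background pixels at 4000 mm.
roi = [1000, 1010, 990, 1005, 4000, 4000]

print(statistics.mean(roi))    # mean is pulled up by the background (~2001)
print(min(roi))                # 990  - closest pixel wins
print(max(roi))                # 4000 - farthest pixel wins
print(statistics.mode(roi))    # most frequent value; here the repeated 4000
print(statistics.median(roi))  # 1007.5 - robust to the outliers (default)
```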

Common mistakes

Most mistakes stem from incorrect bounding box overlap. The scaled bounding box may include parts of the background, which can skew the depth calculation.
  • Thin objects (like a pole) will often have inaccurate spatials, since only a small portion of the bounding box actually lies on the detected object. In such cases it is best to use a smaller BoundingBoxScaleFactor if possible.
  • Objects with holes (hoops, rings, etc.): to get the correct depth, the bounding box should include the entire object. Instead of the median depth, use the MIN depth method to exclude the background from the calculation. Alternatively, a depth threshold can be set to ignore the background in a static environment.
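The threshold workaround can be sketched as a simple filter applied before averaging. This is an illustrative stand-in for the effect of setDepthLowerThreshold / setDepthUpperThreshold combined with the default Median method, not the node's internal code:

```python
import statistics

def roi_depth(values, lower=100, upper=5000):
    """Keep only depth values (mm) inside [lower, upper], then take the median.

    Values outside the thresholds (e.g. distant background seen through a
    hole in the object) are excluded from the calculation.
    """
    valid = [v for v in values if lower <= v <= upper]
    return statistics.median(valid) if valid else None

# Ring-shaped object at ~800 mm; background at 6000 mm visible through the hole:
roi = [800, 810, 790, 6000, 6000, 6000]
print(roi_depth(roi, upper=5000))  # background filtered out -> 800
```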

Usage

Python
import depthai as dai

FPS = 30  # example frame rate
modelDescription = dai.NNModelDescription("yolov6-nano")

with dai.Pipeline() as p:
    camRgb = p.create(dai.node.Camera).build(dai.CameraBoardSocket.CAM_A)
    monoLeft = p.create(dai.node.Camera).build(dai.CameraBoardSocket.CAM_B)
    monoRight = p.create(dai.node.Camera).build(dai.CameraBoardSocket.CAM_C)
    stereo = p.create(dai.node.StereoDepth)
    spatialDetectionNetwork = p.create(dai.node.SpatialDetectionNetwork).build(camRgb, stereo, modelDescription, fps=FPS)

    spatialDetectionNetwork.input.setBlocking(False)
    spatialDetectionNetwork.setBoundingBoxScaleFactor(0.5)
    spatialDetectionNetwork.setDepthLowerThreshold(100)   # mm
    spatialDetectionNetwork.setDepthUpperThreshold(5000)  # mm

    labelMap = spatialDetectionNetwork.getClasses()

Examples of functionality

Spatial coordinate system

The OAK camera uses a left-handed Cartesian coordinate system for all spatial coordinates.

Reference

class

dai::node::SpatialDetectionNetwork

#include "SpatialDetectionNetwork.hpp"
variable
Subnode< NeuralNetwork > neuralNetwork
variable
Subnode< DetectionParser > detectionParser
variable
std::unique_ptr< Subnode< ImageAlign > > depthAlign
variable
Input & input
Input message with data to be inferred upon. Default queue is blocking with size 5.
variable
Output & outNetwork
Outputs unparsed inference results.
variable
Output & passthrough
Passthrough message on which the inference was performed. Suitable for when input queue is set to non-blocking behavior.
variable
Input inputDepth
Input message with depth data used to retrieve spatial information about the detected object. Default queue is non-blocking with size 4.
variable
Input inputImg
Input message with image data used to retrieve the image transformation of the detected object. Default queue is blocking with size 1.
variable
Input inputDetections
Input message with the input detections object. Default queue is blocking with size 1.
variable
Output out
Outputs ImgDetections message that carries parsed detection results.
variable
Output boundingBoxMapping
Outputs mapping of detected bounding boxes relative to the depth map. Suitable for when displaying remapped bounding boxes on the depth frame.
variable
Output passthroughDepth
Passthrough message for depth frame on which the spatial location calculation was performed. Suitable for when input queue is set to non-blocking behavior.
variable
Output spatialLocationCalculatorOutput
Output of SpatialLocationCalculator node, which is used internally by SpatialDetectionNetwork. Suitable when extra information is required from SpatialLocationCalculator node, e.g. minimum, maximum distance.
inline explicit function
SpatialDetectionNetwork(const std::shared_ptr< Device > & device)
inline function
SpatialDetectionNetwork(std::unique_ptr< Properties > props)
inline function
SpatialDetectionNetwork(std::unique_ptr< Properties > props, bool confMode)
inline function
SpatialDetectionNetwork(const std::shared_ptr< Device > & device, std::unique_ptr< Properties > props, bool confMode)
function
std::shared_ptr< SpatialDetectionNetwork > build(const std::shared_ptr< Camera > & inputRgb, const std::shared_ptr< StereoDepth > & stereo, dai::NNModelDescription modelDesc, std::optional< float > fps)
function
std::shared_ptr< SpatialDetectionNetwork > build(const std::shared_ptr< Camera > & inputRgb, const std::shared_ptr< StereoDepth > & stereo, const dai::NNArchive & nnArchive, std::optional< float > fps)
function
void setNNArchive(const NNArchive & nnArchive)
function
void setFromModelZoo(NNModelDescription description, bool useCached)
function
void setNNArchive(const NNArchive & nnArchive, int numShaves)
function
void setBlobPath(const dai::Path & path)
Backwards compatibility interface. Load network blob into assets and use once pipeline is started.
Parameters
  • Error: if file doesn't exist or isn't a valid network blob.
Parameters
  • path: Path to network blob
function
void setBlob(OpenVINO::Blob blob)
Load network blob into assets and use once pipeline is started.
Parameters
  • blob: Network blob
function
void setBlob(const dai::Path & path)
Same functionality as the setBlobPath(). Load network blob into assets and use once pipeline is started.
Parameters
  • Error: if file doesn't exist or isn't a valid network blob.
Parameters
  • path: Path to network blob
function
void setModelPath(const dai::Path & modelPath)
Load network file into assets.
Parameters
  • modelPath: Path to the model file.
function
void setNumPoolFrames(int numFrames)
Specifies how many frames will be available in the pool
Parameters
  • numFrames: How many frames will pool have
function
void setNumInferenceThreads(int numThreads)
How many threads should the node use to run the network.
Parameters
  • numThreads: Number of threads to dedicate to this node
function
void setNumNCEPerInferenceThread(int numNCEPerThread)
How many Neural Compute Engines should a single thread use for inference
Parameters
  • numNCEPerThread: Number of NCE per thread
function
void setNumShavesPerInferenceThread(int numShavesPerThread)
How many Shaves should a single thread use for inference
Parameters
  • numShavesPerThread: Number of shaves per thread
function
void setBackend(std::string backend)
Specifies backend to use
Parameters
  • backend: String specifying backend to use
function
void setBackendProperties(std::map< std::string, std::string > properties)
Set backend properties
Parameters
  • backendProperties: backend properties map
function
int getNumInferenceThreads()
How many inference threads will be used to run the network
Returns
Number of threads, 0, 1 or 2. Zero means AUTO
function
void setConfidenceThreshold(float thresh)
Specifies confidence threshold at which to filter the rest of the detections.
Parameters
  • thresh: Detection confidence must be greater than specified threshold to be added to the list
function
float getConfidenceThreshold()
Retrieves threshold at which to filter the rest of the detections.
Returns
Detection confidence
function
void setBoundingBoxScaleFactor(float scaleFactor)
Custom interface. Specifies scale factor for detected bounding boxes.
Parameters
  • scaleFactor: Scale factor must be in the interval (0,1].
function
void setDepthLowerThreshold(uint32_t lowerThreshold)
Specifies lower threshold in depth units (millimeters by default) for depth values which will be used to calculate spatial data
Parameters
  • lowerThreshold: lowerThreshold must be in the interval [0, upperThreshold).
function
void setDepthUpperThreshold(uint32_t upperThreshold)
Specifies upper threshold in depth units (millimeters by default) for depth values which will be used to calculate spatial data
Parameters
  • upperThreshold: UpperThreshold must be in the interval (lowerThreshold,65535].
function
void setSpatialCalculationAlgorithm(dai::SpatialLocationCalculatorAlgorithm calculationAlgorithm)
Specifies spatial location calculator algorithm: Average/Min/Max/Mode/Median
Parameters
  • calculationAlgorithm: Calculation algorithm.
function
void setSpatialCalculationStepSize(int stepSize)
Specifies spatial location calculator step size for depth calculation. Step size 1 means that every pixel is taken into calculation, size 2 means every second etc.
Parameters
  • stepSize: Step size.
function
std::optional< std::vector< std::string > > getClasses()
function
void buildInternal()

Need assistance?

Head over to Discussion Forum for technical support or any other questions you might have.