ON THIS PAGE

  • SpatialDetectionNetwork
  • How to place it
  • Inputs and Outputs
  • Configuring Spatial Detection
  • Detection
  • Alignment
  • Scaling of BBOX
  • Calculation of spatials
  • Averaging methods
  • Common mistakes
  • Usage
  • Examples of functionality
  • Spatial coordinate system
  • Reference

SpatialDetectionNetwork

The SpatialDetectionNetwork node is essentially a combination of the DetectionNetwork and the SpatialLocationCalculator.

How to place it

Python
modelDescription = dai.NNModelDescription("yolov6-nano")
pipeline = dai.Pipeline()
spatialDetectionNetwork = pipeline.create(dai.node.SpatialDetectionNetwork).build(camRgb, stereo, modelDescription)

Inputs and Outputs

Configuring Spatial Detection

The pipeline of the SpatialDetectionNetwork node is described in the schema below. The Spatial Detection node is essentially an abstraction over the DetectionNetwork and the SpatialLocationCalculator: it links the bounding box of each detected object to the spatial location calculator. The process goes as follows:

Detection

The Detection Network is responsible for detecting objects in the input frame. It outputs a list of detected objects, each represented by a bounding box, a label, and a confidence score.

Alignment

The depth map is aligned with the input frame. This is necessary because the DetectionNetwork operates on the input frame, while the SpatialLocationCalculator operates on the depth map.

Scaling of BBOX

The bounding box from the network is sent to the SpatialLocationCalculator and scaled according to BoundingBoxScaleFactor (a value in the interval (0,1], so the box shrinks toward its center). Scaling the box down keeps the ROI on the object itself rather than on the surrounding background. The scaled bounding box is then used together with the depth map to calculate the spatial coordinates of the object.
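The scaling step can be sketched in plain Python. The helper below is hypothetical (not part of the DepthAI API); it shows the geometric effect of a scale factor applied around the box center:

```python
def scale_bbox(xmin, ymin, xmax, ymax, scale_factor):
    """Scale a bounding box around its center by scale_factor.

    A factor in (0, 1] shrinks the box, which helps exclude
    background pixels from the depth ROI.
    """
    cx = (xmin + xmax) / 2
    cy = (ymin + ymax) / 2
    half_w = (xmax - xmin) / 2 * scale_factor
    half_h = (ymax - ymin) / 2 * scale_factor
    return (cx - half_w, cy - half_h, cx + half_w, cy + half_h)

# A 100x100 box scaled by 0.5 becomes a 50x50 box with the same center.
print(scale_bbox(0, 0, 100, 100, 0.5))  # (25.0, 25.0, 75.0, 75.0)
```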

Calculation of spatials

  • X and Y coordinates are taken from the bounding box center. They are calculated based on the offset from the center of the frame and the depth at that point.
  • For depth (Z), each pixel inside the scaled bounding box (ROI) is taken into account. This gives us a set of depth values, which are then averaged to get the final depth value.
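The X/Y computation above is standard pinhole-camera back-projection. A minimal sketch, assuming a hypothetical helper and illustrative intrinsics (focal lengths and principal point are made up, not read from a real calibration):

```python
def spatials_from_pixel(u, v, depth_mm, fx, fy, cx, cy):
    """Back-project a pixel (u, v) at depth Z into camera-space X/Y/Z (mm).

    (cx, cy) is the principal point (frame center for an ideal camera);
    fx, fy are the focal lengths in pixels.
    """
    x = (u - cx) * depth_mm / fx
    y = (v - cy) * depth_mm / fy
    return (x, y, depth_mm)

# A pixel 100 px right of the principal point, 1 m away, fx = 800:
# X = 100 * 1000 / 800 = 125 mm to the side of the optical axis.
print(spatials_from_pixel(740, 360, 1000, 800.0, 800.0, 640.0, 360.0))
```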

Averaging methods

  • Average/mean: the average of ROI is used for calculation.
  • Min: the minimum value inside ROI is used for calculation.
  • Max: the maximum value inside ROI is used for calculation.
  • Mode: the most frequent value inside ROI is used for calculation.
  • Median: the median value inside ROI is used for calculation.
The default method is Median.
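The methods differ most when the ROI contains background pixels. A quick sketch using only the Python standard library (the ROI values are made up for illustration):

```python
import statistics

# Depth ROI (mm): mostly object at ~1000 mm, plus background pixels at 4000 mm.
roi = [1000, 1010, 990, 1005, 4000, 4000]

print(statistics.mean(roi))    # mean is pulled up by the background (~2001)
print(min(roi))                # 990  - closest pixel wins
print(max(roi))                # 4000 - farthest pixel wins
print(statistics.mode(roi))    # most frequent value; here the repeated 4000
print(statistics.median(roi))  # 1007.5 - robust to the outliers (default)
```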

Common mistakes

Most mistakes stem from incorrect bounding box overlap. The scaled bounding box may include parts of the background, which can skew the depth calculation.
  • Thin objects (like a pole) will often have inaccurate spatials, since only a small portion of the bounding box actually lies on the detected object. In such cases it is best to use a smaller BoundingBoxScaleFactor if possible.
  • Objects with holes (hoops, rings, etc.): to get the correct depth, the bounding box should include the entire object. Instead of the median depth, use the MIN depth method to exclude the background from the calculation. Alternatively, a depth threshold can be set to ignore the background in a static environment.
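The threshold workaround can be sketched as a simple filter applied before averaging. This is an illustrative stand-in for the effect of setDepthLowerThreshold / setDepthUpperThreshold combined with the default Median method, not the node's internal code:

```python
import statistics

def roi_depth(values, lower=100, upper=5000):
    """Keep only depth values (mm) inside [lower, upper], then take the median.

    Values outside the thresholds (e.g. distant background seen through a
    hole in the object) are excluded from the calculation.
    """
    valid = [v for v in values if lower <= v <= upper]
    return statistics.median(valid) if valid else None

# Ring-shaped object at ~800 mm; background at 6000 mm visible through the hole:
roi = [800, 810, 790, 6000, 6000, 6000]
print(roi_depth(roi, upper=5000))  # background filtered out -> 800
```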

Usage

Python
import depthai as dai

FPS = 30  # example frame rate
modelDescription = dai.NNModelDescription("yolov6-nano")

with dai.Pipeline() as p:
    camRgb = p.create(dai.node.Camera).build(dai.CameraBoardSocket.CAM_A)
    monoLeft = p.create(dai.node.Camera).build(dai.CameraBoardSocket.CAM_B)
    monoRight = p.create(dai.node.Camera).build(dai.CameraBoardSocket.CAM_C)
    stereo = p.create(dai.node.StereoDepth)
    spatialDetectionNetwork = p.create(dai.node.SpatialDetectionNetwork).build(camRgb, stereo, modelDescription, fps=FPS)

    spatialDetectionNetwork.input.setBlocking(False)
    spatialDetectionNetwork.setBoundingBoxScaleFactor(0.5)
    spatialDetectionNetwork.setDepthLowerThreshold(100)   # mm
    spatialDetectionNetwork.setDepthUpperThreshold(5000)  # mm

    labelMap = spatialDetectionNetwork.getClasses()

Examples of functionality

Spatial coordinate system

The OAK camera uses a left-handed Cartesian coordinate system for all spatial coordinates.

Reference

class

dai::node::SpatialDetectionNetwork

#include "SpatialDetectionNetwork.hpp"
variable
Subnode< NeuralNetwork > neuralNetwork
variable
Subnode< DetectionParser > detectionParser
variable
std::unique_ptr< Subnode< ImageAlign > > depthAlign
variable
Input & input
Input message with data to be inferred upon. Default queue is blocking with size 5.
variable
Output & outNetwork
Outputs unparsed inference results.
variable
Output & passthrough
Passthrough message on which the inference was performed. Suitable for when input queue is set to non-blocking behavior.
variable
Input inputDepth
Input message with depth data used to retrieve spatial information about the detected object. Default queue is non-blocking with size 4.
variable
Input inputImg
Input message with image data used to retrieve the image transformation of the detected object. Default queue is blocking with size 1.
variable
Input inputDetections
Input message with the input detections object. Default queue is blocking with size 1.
variable
Output out
Outputs ImgDetections message that carries parsed detection results.
variable
Output boundingBoxMapping
Outputs mapping of detected bounding boxes relative to the depth map. Suitable for when displaying remapped bounding boxes on the depth frame.
variable
Output passthroughDepth
Passthrough message for depth frame on which the spatial location calculation was performed. Suitable for when input queue is set to non-blocking behavior.
variable
Output spatialLocationCalculatorOutput
Output of SpatialLocationCalculator node, which is used internally by SpatialDetectionNetwork. Suitable when extra information is required from SpatialLocationCalculator node, e.g. minimum, maximum distance.
inline explicit function
SpatialDetectionNetwork(const std::shared_ptr< Device > & device)
inline function
SpatialDetectionNetwork(std::unique_ptr< Properties > props)
inline function
SpatialDetectionNetwork(std::unique_ptr< Properties > props, bool confMode)
inline function
SpatialDetectionNetwork(const std::shared_ptr< Device > & device, std::unique_ptr< Properties > props, bool confMode)
function
std::shared_ptr< SpatialDetectionNetwork > build(const std::shared_ptr< Camera > & inputRgb, const std::shared_ptr< StereoDepth > & stereo, dai::NNModelDescription modelDesc, std::optional< float > fps)
function
std::shared_ptr< SpatialDetectionNetwork > build(const std::shared_ptr< Camera > & inputRgb, const std::shared_ptr< StereoDepth > & stereo, const dai::NNArchive & nnArchive, std::optional< float > fps)
function
void setNNArchive(const NNArchive & nnArchive)
function
void setFromModelZoo(NNModelDescription description, bool useCached)
function
void setNNArchive(const NNArchive & nnArchive, int numShaves)
function
void setBlobPath(const dai::Path & path)
Backwards compatibility interface. Load network blob into assets and use once pipeline is started.
Parameters
  • Error: if file doesn't exist or isn't a valid network blob.
Parameters
  • path: Path to network blob
function
void setBlob(OpenVINO::Blob blob)
Load network blob into assets and use once pipeline is started.
Parameters
  • blob: Network blob
function
void setBlob(const dai::Path & path)
Same functionality as the setBlobPath(). Load network blob into assets and use once pipeline is started.
Parameters
  • Error: if file doesn't exist or isn't a valid network blob.
Parameters
  • path: Path to network blob
function
void setModelPath(const dai::Path & modelPath)
Load network file into assets.
Parameters
  • modelPath: Path to the model file.
function
void setNumPoolFrames(int numFrames)
Specifies how many frames will be available in the pool
Parameters
  • numFrames: How many frames will pool have
function
void setNumInferenceThreads(int numThreads)
How many threads should the node use to run the network.
Parameters
  • numThreads: Number of threads to dedicate to this node
function
void setNumNCEPerInferenceThread(int numNCEPerThread)
How many Neural Compute Engines should a single thread use for inference
Parameters
  • numNCEPerThread: Number of NCE per thread
function
void setNumShavesPerInferenceThread(int numShavesPerThread)
How many Shaves should a single thread use for inference
Parameters
  • numShavesPerThread: Number of shaves per thread
function
void setBackend(std::string backend)
Specifies backend to use
Parameters
  • backend: String specifying backend to use
function
void setBackendProperties(std::map< std::string, std::string > properties)
Set backend properties
Parameters
  • backendProperties: backend properties map
function
int getNumInferenceThreads()
How many inference threads will be used to run the network
Returns
Number of threads, 0, 1 or 2. Zero means AUTO
function
void setConfidenceThreshold(float thresh)
Specifies confidence threshold at which to filter the rest of the detections.
Parameters
  • thresh: Detection confidence must be greater than specified threshold to be added to the list
function
float getConfidenceThreshold()
Retrieves threshold at which to filter the rest of the detections.
Returns
Detection confidence
function
void setBoundingBoxScaleFactor(float scaleFactor)
Custom interface. Specifies scale factor for detected bounding boxes.
Parameters
  • scaleFactor: Scale factor must be in the interval (0,1].
function
void setDepthLowerThreshold(uint32_t lowerThreshold)
Specifies lower threshold in depth units (millimeters by default) for depth values which will be used to calculate spatial data
Parameters
  • lowerThreshold: lowerThreshold must be in the interval [0, upperThreshold).
function
void setDepthUpperThreshold(uint32_t upperThreshold)
Specifies upper threshold in depth units (millimeters by default) for depth values which will be used to calculate spatial data
Parameters
  • upperThreshold: UpperThreshold must be in the interval (lowerThreshold,65535].
function
void setSpatialCalculationAlgorithm(dai::SpatialLocationCalculatorAlgorithm calculationAlgorithm)
Specifies spatial location calculator algorithm: Average/Min/Max/Mode/Median
Parameters
  • calculationAlgorithm: Calculation algorithm.
function
void setSpatialCalculationStepSize(int stepSize)
Specifies spatial location calculator step size for depth calculation. Step size 1 means that every pixel is taken into calculation, size 2 means every second etc.
Parameters
  • stepSize: Step size.
function
std::optional< std::vector< std::string > > getClasses()
function
void buildInternal()

Need assistance?

Head over to Discussion Forum for technical support or any other questions you might have.